Does tavily-mcp send data, and where? — data-flow verdict

Integrity: 100/100 · Evidence coverage: 100% (evidence-backed). The score measures evidence support, not confidence.

Verdict (the facts)

Tool
npm/tavily-mcp
Integrity axis
undisclosed_processing — Observed behaviour matches the tool's stated function; the egress above is the tool doing its advertised job. 'honest' is the integrity axis — it does NOT imply the data flow is irrelevant; see the data-flow axis and jurisdiction.
Data-flow axis
Sends data to api.tavily.com (US, jurisdiction tier 2) as its core function; analytics metadata rides on the same functional request (disclosure: partial). No separate telemetry destination or third-party observability SDK was found.
Disclosure
partial — All traffic goes to a single host, api.tavily.com — the functional search/research backend; carrying the query and target URLs there is the tool's advertised purpose. There is no separate telemetry destination and no third-party analytics/error-reporting SDK. Riding on the functional request are analytics-metadata headers: X-Human-Id (opt-in, sent only if the operator sets TAVILY_HUMAN_ID, documented in the README as enabling per-user analytics) plus an always-on X-Session-Id (per-process random UUID, non-PII) and X-Client-Source attribution. The per-user analytics purpose is disclosed; the always-on session/attribution headers are not documented — hence partial.
Capture self-test
verified
Severity
low — integrity axis only (undeclared exfiltration). Functional egress and disclosed metadata are reported as neutral facts and are not graded here.
Version (pinned)
0.2.20 · commit fc09f6e5e76622987e0688ad061047cb240062db
Content hash
sha256:d13b0bd08a52986c0be20d86c37af0b6b17475f2869b14924feb9e34e02d2528
Signature
ed25519:RTgxTmND+U6YRZGNq+UjdTiWawz0gE69zVsZS2… · Ed25519 public key · sha256:49cf8457b42a7048
Scanned
2026-06-14T00:00:00Z — Pinned to tavily-mcp@0.2.20 (git fc09f6e5e76622987e0688ad061047cb240062db), published 2026-05-29. This verdict applies to that exact version; a newer release would require a re-scan.
Re-verified
2026-06-14 — pinned version current
Categories
search functional-partial US published
Observation history
1 scan; first seen 2026-06-14T00:00:00Z · latest 2026-06-14T00:00:00Z

Observed egress destinations

host | country | jurisdiction | class | disclosure | frequency | kind
api.tavily.com | US | tier 2 | functional | by purpose | on launch and on every tool call | search/research backend (carries the query + target URLs, which is the tool's function); piggybacked analytics headers ride on the same request (X-Human-Id opt-in disclosed; X-Session-Id/X-Client-Source always-on, undocumented)

Each destination is classified FUNCTIONAL (the tool's advertised job requires the call — a neutral fact about where your data goes), SESSION/AUTH (handshake with the same operator), or TELEMETRY/ERROR_REPORTING (an observability side-channel not required for the function). Disclosure is judged across the tool's full public doc surface, not just its README, and any 'undisclosed telemetry' finding is adversarially refuted before it is asserted.
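
The three-way classification above can be sketched as a small decision function. This is a minimal illustration, not the scanner's actual code: the `Destination` fields, the `classify` function and the `telemetry.example.invalid` host are hypothetical, and the real judgment also involves the disclosure review described above.

```python
from dataclasses import dataclass
from enum import Enum


class EgressClass(Enum):
    FUNCTIONAL = "functional"
    SESSION_AUTH = "session/auth"
    TELEMETRY = "telemetry/error_reporting"


@dataclass(frozen=True)
class Destination:
    """Hypothetical record for one observed egress destination."""
    host: str
    required_for_function: bool    # the advertised job needs this call
    same_operator_handshake: bool  # session/auth setup with the same operator


def classify(dest: Destination) -> EgressClass:
    """Apply the three classes in order of precedence."""
    if dest.required_for_function:
        return EgressClass.FUNCTIONAL
    if dest.same_operator_handshake:
        return EgressClass.SESSION_AUTH
    # Anything else is a side-channel not required for the function.
    return EgressClass.TELEMETRY


# The one destination observed in this verdict.
tavily = Destination("api.tavily.com", True, False)
# A made-up side-channel host, for contrast only.
side_channel = Destination("telemetry.example.invalid", False, False)
```

Under these assumptions, `classify(tavily)` yields `EgressClass.FUNCTIONAL`, matching the table row above.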

Jurisdiction context: Tier 2 = third country (e.g. US): transferring EU personal data to a third country requires a transfer basis under GDPR Art. 44-49 (e.g. SCCs / EU-US Data Privacy Framework) — an obligation on you, the deployer; the tool gives no control over this flow. This is the applicable framework, not a finding that the tool violates it.

Evidence — the captured request (verify, don't just trust)

Capture self-test: verified — a beacon decoy was emitted from the tool's network context; its presence in the intercept means a 'no egress' result would have been trustworthy.
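
The logic of the self-test can be sketched as follows. This is an illustration under stated assumptions: the function names, the beacon marker and the `beacon.example.invalid` host are hypothetical and do not come from the scanner.

```python
from typing import Iterable


def capture_is_trustworthy(intercepted_urls: Iterable[str], beacon_marker: str) -> bool:
    """The intercept is trusted only if the decoy beacon shows up in it."""
    return any(beacon_marker in url for url in intercepted_urls)


def interpret_run(intercepted_urls: list[str], beacon_marker: str) -> str:
    """Turn a raw intercept into one of three outcomes."""
    if not capture_is_trustworthy(intercepted_urls, beacon_marker):
        # Without the beacon, an empty intercept proves nothing.
        return "inconclusive"
    tool_requests = [u for u in intercepted_urls if beacon_marker not in u]
    return "egress observed" if tool_requests else "no egress"
```

The point of the design is the first branch: a silent capture is reported as "no egress" only when the beacon proves the interception path was working.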

Observed: POST https://api.tavily.com/search ×5 — intercepted (the tool's HTTPS was terminated against the sandbox CA; the egress was then blocked by strict-egress, but the full request was captured)

Payload fields actually sent:

Captured payload sample (one event):

{"query":"FILE-CONTENT::canary-edd5879f-file-95add22b7836::END","search_depth":"FILE-CONTENT::canary-edd5879f-file-95add22b7836::END","topic":"general","include_domains":["FILE-CONTENT::canary-edd5879f-file-95add22b7836::END"],"exclude_domains":["FILE-CONTENT::canary-edd5879f-file-95add22b7836::END"],"country":"FILE-CONTENT::canary-edd5879f-file-95add22b7836::END","start_date":"FILE-CONTENT::canary-edd5879f-file-95ad

Captured in the sandbox run. The distinct_id (a persistent machine identifier) and the write-only, public-by-design ingestion key are truncated above; payload_fields is the union observed across the run.
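
A reader checking the sample can look for the decoy tokens mechanically. The sketch below assumes the token shape visible in the sample (`FILE-CONTENT::canary-…-file-…::END`); the regular expression and the `fields_carrying_canary` helper are illustrative, not part of the scanner, and the example payload is a shortened, well-formed subset of the captured fields.

```python
import re
from typing import Any

# Token shape inferred from the captured sample above (an assumption).
CANARY = re.compile(r"FILE-CONTENT::(canary-[0-9a-f]+-file-[0-9a-f]+)::END")


def fields_carrying_canary(payload: dict[str, Any]) -> dict[str, set[str]]:
    """Map each top-level field to the canary tokens found in its value(s)."""
    hits: dict[str, set[str]] = {}
    for field, value in payload.items():
        items = value if isinstance(value, list) else [value]
        for item in items:
            if isinstance(item, str):
                for token in CANARY.findall(item):
                    hits.setdefault(field, set()).add(token)
    return hits


token = "FILE-CONTENT::canary-edd5879f-file-95add22b7836::END"
example_payload = {"query": token, "topic": "general", "include_domains": [token]}
example_hits = fields_carrying_canary(example_payload)
```

Here `example_hits` lists `query` and `include_domains` as carrying the decoy, while `topic` (the literal `"general"`) does not.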

Reproduce it yourself with canary-sandbox (open methodology; Docker backend):
python -m canary.cli scan <target> --backend docker # target: npm tavily-mcp@0.2.20
The scanner installs the pinned version, drives the tool over MCP, and intercepts all egress.

Full raw captured trace + verification: /verdict/tavily-mcp/evidence.json — every captured request (redacted), the verdict content-hash and the package checksum, for an AI or auditor that wants the underlying observation, not just the conclusion.
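
A `sha256:<hex>` value like the content hash above can be checked locally once the exact hashed bytes are known. The sketch below assumes the hash covers the raw bytes of a file as published; this page does not state how the hash is computed (for example, whether the JSON is canonicalised first), so treat that as an assumption. The `matches_content_hash` helper is hypothetical.

```python
import hashlib
import hmac


def matches_content_hash(data: bytes, expected: str) -> bool:
    """Compare raw bytes against a value of the form 'sha256:<hex digest>'."""
    algorithm, _, hex_digest = expected.partition(":")
    if algorithm != "sha256" or not hex_digest:
        raise ValueError(f"unsupported content hash: {expected!r}")
    actual = hashlib.sha256(data).hexdigest()
    # Constant-time comparison; hexdigest() is lowercase, so normalise the input.
    return hmac.compare_digest(actual, hex_digest.lower())
```

Usage would be `matches_content_hash(downloaded_bytes, "sha256:…")` with the full hash from the verdict; a mismatch may mean either tampering or a different canonical form than assumed here.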

Disclosure check (the §824 evidence)

Read
README (X-Human-Id / per-user analytics); package source (header construction)
Quoted from the tool's own docs
“X-Human-Id enables per-user analytics (set via TAVILY_HUMAN_ID; unset = off by default).”
Match
All traffic goes to a single host, api.tavily.com — the functional search/research backend; carrying the query and target URLs there is the tool's advertised purpose. There is no separate telemetry destination and no third-party analytics/error-reporting SDK. Riding on the functional request are analytics-metadata headers: X-Human-Id (opt-in, sent only if the operator sets TAVILY_HUMAN_ID, documented in the README as enabling per-user analytics) plus an always-on X-Session-Id (per-process random UUID, non-PII) and X-Client-Source attribution. The per-user analytics purpose is disclosed; the always-on session/attribution headers are not documented — hence partial.
Residual gap
X-Session-Id (random per-process UUID) and X-Client-Source are always-on attribution metadata not mentioned in any doc — low-sensitivity, non-PII; described as 'not mentioned in docs', not as user tracking.
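
The "partial" grade can be read as a set comparison between the metadata headers the tool can attach and the headers its docs mention. The sketch below is a simplified illustration: `grade_disclosure` is hypothetical, and the labels "full" and "none" are invented here for contrast (only "partial" appears in this verdict).

```python
from typing import Iterable


def grade_disclosure(attached: Iterable[str], documented: Iterable[str]) -> tuple[str, list[str]]:
    """Return a grade plus the undocumented headers (the residual gap)."""
    # HTTP header names are case-insensitive, so compare in lowercase.
    attached_norm = {h.lower() for h in attached}
    documented_norm = {h.lower() for h in documented}
    gap = sorted(attached_norm - documented_norm)
    if not gap:
        return "full", gap
    if attached_norm & documented_norm:
        return "partial", gap
    return "none", gap


# Header names taken from this verdict.
grade, gap = grade_disclosure(
    attached={"X-Human-Id", "X-Session-Id", "X-Client-Source"},
    documented={"X-Human-Id"},
)
```

With the headers named in this verdict, the grade is "partial" and the gap is the two always-on headers, which mirrors the residual gap stated above.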

How we know this — claims by basis

Observed — directly in the capture, reproducible

Inferred — our reasoning over the observation

Documented — the tool's own statement

Classified — our adversarially-reviewed judgment

Method

The tool was installed and run in an isolated container and fed traceable decoy data. All outbound traffic was intercepted (TLS terminated with the sandbox's own CA; iptables transparent redirect). Endpoints, resolved geo/jurisdiction and frequency are observed facts. The capture self-test passed.

Scope

Compares the tool's declared destinations against what was observed in one sandbox run. Checks transparency / integrity for a cooperative tool, NOT resistance to deliberate evasion. "honest"/"clean" means "observed without deviation within our reach", NOT "guaranteed no hidden egress". Out of scope: exfiltration split/chunked across requests; tool-side encryption of the payload before egress; input/time/state-triggered processing not triggered in the run.


Machine-readable verdict: /verdict/tavily-mcp.json. This page describes observed behaviour and its relation to the tool's own disclosures; it is not a legal judgment.