How to Monitor Your AI Tools for Network Leaks (And What to Do When You Find Them)

By PrivateAI Team · 11 min read

Most people who install a "local" AI tool assume the data stays local. Most of them are wrong — at least partially.

Local LLM runners, coding assistants, and even privacy-focused AI apps make network calls you didn't authorize. Telemetry pings, model update checks, license validations, crash reports. Some are benign. Some aren't. The only way to know is to watch the wire.

This guide shows you how to monitor exactly what your AI tools are sending, intercept the connections that concern you, and build a workflow where sensitive data never leaves your machine without your explicit say-so.

Why "Local" AI Isn't Always Local

The local LLM ecosystem has a trust problem. Tools marketed as privacy-first still reach out to external servers — sometimes for legitimate reasons, sometimes not.

Here's what gets transmitted without most users realizing:

  • Telemetry and analytics: Usage patterns, model selection, session duration. LM Studio, Jan, and even some Ollama-adjacent tools ship with analytics enabled by default.
  • Model metadata requests: Checking for updates, pulling manifests from registries, fetching model card info from Hugging Face.
  • License and auth validation: Paid local tools often ping a licensing server on startup.
  • Crash reports: When inference crashes (and it does), stack traces containing fragments of your last prompt can be sent to error-tracking services.
  • Plugin and extension callouts: If you've installed coding assistant plugins (Continue.dev, Codeium, Cursor), each has its own call-home behavior.

None of this means your actual prompts are being exfiltrated. But it does mean your threat model needs to account for metadata and context that can be almost as revealing as the content itself.

What You Need: The Monitoring Stack

You don't need expensive tools. Apart from the optional Little Snitch, everything in this stack is free:

For macOS:

  • Little Snitch (paid, best-in-class) or Lulu (free, open source)
  • Wireshark or tcpdump for deep packet inspection
  • nettop (built-in macOS) for quick process-level monitoring

For Linux:

  • nethogs — per-process bandwidth monitor
  • Wireshark or tshark
  • ss and lsof for socket inspection

For Windows:

  • Glasswire (free tier is sufficient)
  • Wireshark
  • Resource Monitor (built-in)

This guide focuses on macOS and Linux since that's where most privacy-conscious developers run local AI.

Method 1: Quick-and-Dirty with nettop (macOS)

Before you reach for Wireshark, start here. nettop shows real-time per-process network activity from the command line.

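```bash
# Watch all processes in delta mode, refreshing every 2 seconds
sudo nettop -P -d -s 2 -t wifi -t wired

# Filter to a specific process (e.g., Ollama); -L 0 switches to
# CSV logging mode so the output can be piped to grep
sudo nettop -P -d -L 0 | grep -i ollama
```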

Open your AI tool. Start a session. Watch for any process making outbound connections during inference — specifically anything that isn't your browser or OS services.

If you see ollama making calls to anything other than localhost:11434, that's worth investigating. Same goes for any Electron-based app (LM Studio, Jan) calling domains other than local model endpoints.
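On Linux, where nettop is not available, a rough equivalent can be sketched with ss from the tool list above. This is a minimal sketch rather than a finished monitor: the function name `non_local_peers` is made up for illustration, and it assumes the usual column layout of `ss -Htnp` (state, two queue counters, local address, peer address, process).

```bash
#!/usr/bin/env bash
# Hypothetical helper: reads `ss -Htnp` output on stdin and prints only the
# connections whose peer address is not loopback, as "peer process".
# Assumed columns: State Recv-Q Send-Q Local Peer Process
non_local_peers() {
  awk 'NF >= 5 {
         peer = $5
         if (peer !~ /^127\./ && peer !~ /^\[::1\]/ && peer !~ /^\[::ffff:127\./)
           print peer, $6
       }'
}

# Usage (needs root to see other users' process names):
#   sudo ss -Htnp | non_local_peers | grep -i ollama
```

Reading from stdin keeps the filter separate from the privileged ss call, so it can be tried against saved output before being pointed at a live system.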

Method 2: Wireshark for Full Packet Inspection

nettop tells you that connections are happening. Wireshark tells you what is being sent.

Install:

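```bash
# macOS (this formula provides the command-line tools such as tshark;
# the GUI app is distributed separately as a Homebrew cask)
brew install wireshark

# Ubuntu/Debian
sudo apt install wireshark
```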

Capture AI tool traffic:

  1. Open Wireshark, select your active network interface (usually en0 on Mac, eth0 or wlan0 on Linux)
  2. Apply a capture filter to exclude noise:

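```
not (host 127.0.0.1 or host ::1) and not arp
```

  3. Launch your AI tool and run a few prompts
  4. Stop capture, then apply a display filter:

```
http or http2 or dns or tls.handshake.type == 1
```

  5. Look for outbound connections. Right-click any suspicious packet → Follow → HTTP/TCP Stream to see the full payload. For TLS connections the payload is encrypted, but the Client Hello matched by the last filter term usually still shows the destination hostname.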

What you're looking for:

  • POST requests to analytics endpoints (Segment, Amplitude, Mixpanel domains are common)
  • DNS lookups for cloud AI provider domains (api.openai.com, api.anthropic.com, generativelanguage.googleapis.com) from tools claiming to be fully local
  • Any request that includes prompt-like strings in the payload
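The checklist above can be partly automated once a capture is saved. The following is a sketch under stated assumptions: `flag_suspect_hosts` is a hypothetical helper, the three API hostnames come from the list above, and the analytics domains are illustrative entries that you should replace with whatever your own capture shows.

```bash
#!/usr/bin/env bash
# Hypothetical helper: reads one hostname per line on stdin and prints the
# unique names that match a small watchlist (exact host or any subdomain).
flag_suspect_hosts() {
  LC_ALL=C sort -u |
    grep -E -i '(^|\.)(api\.openai\.com|api\.anthropic\.com|generativelanguage\.googleapis\.com|segment\.io|amplitude\.com|mixpanel\.com)$'
}

# Usage, assuming the capture was saved as capture.pcapng (made-up filename):
#   tshark -r capture.pcapng -Y 'dns.flags.response == 0' \
#     -T fields -e dns.qry.name | flag_suspect_hosts
```

An empty result only means none of the watchlisted names were looked up during the capture. It says nothing about hosts that are not on the list.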

Method 3: DNS-Level Blocking with Pi-hole or /etc/hosts

Even if you can't decrypt TLS traffic (which Wireshark can't do without the session keys), you can block domains at the DNS level.

For quick blocking without setting up Pi-hole:

```bash
# Add to /etc/hosts to null-route telemetry domains
echo "0.0.0.0 telemetry.lmstudio.ai" | sudo tee -a /etc/hosts
echo "0.0.0.0 analytics.jan.ai" | sudo tee -a /etc/hosts
echo "0.0.0.0 sentry.io" | sudo tee -a /etc/hosts # crash reporting
```

Verify it worked:

```bash
ping -c 1 telemetry.lmstudio.ai
# The name should now resolve to 0.0.0.0 instead of the real server
# (Linux ping may display this address as 127.0.0.1)
```

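Note: /etc/hosts matches exact hostnames only, so the sentry.io entry does not cover its subdomains. List each hostname you actually observe in your capture. Blanket-blocking Sentry may also break error reporting for legitimate local services. Be surgical.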
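Because `tee -a` appends a new line on every run, repeated use leaves duplicate entries. A minimal sketch of an idempotent variant follows. The function name `block_domain` and its optional second argument (an alternate hosts file, handy for a dry run) are assumptions made for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helper: add a "0.0.0.0 <domain>" entry to a hosts file only
# if that exact line is not already present.
block_domain() {
  local domain="$1"
  local hosts_file="${2:-/etc/hosts}"
  # -x: match the whole line, -F: treat the pattern as a fixed string
  if grep -qxF "0.0.0.0 $domain" "$hosts_file"; then
    return 0
  fi
  printf '0.0.0.0 %s\n' "$domain" >> "$hosts_file"
}

# Usage: save as a script that calls block_domain for each hostname, then run
# it with sudo so the append to /etc/hosts is permitted.
```

Pointing the second argument at a scratch file first is a cheap way to review the resulting entries before touching the real hosts file.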

What Ollama Actually Sends

Since Ollama is the most popular local LLM runner, let's be specific about what it does:

On model pull: Makes requests to registry.ollama.ai to download model manifests and layers. This is expected and necessary. Ollama sees which models you're pulling.

On startup: Checks for version updates via ollama.com. Metadata only — not your prompts.

During inference: All traffic is between your client and localhost:11434. Nothing leaves your machine.

Crash reports: Ollama does not send crash reports by default as of the current release.

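So Ollama's privacy posture is reasonably good: the main exposure is the model registry knowing what you're running. If that matters to you (e.g., you're evaluating a model associated with a specific use case), you can pre-download models on a non-sensitive network and then cut Ollama off from the network. One caveat: the built-in macOS application firewall used below filters incoming connections only. It stops other hosts from reaching Ollama, but blocking outbound traffic takes an outbound firewall such as Lulu or Little Snitch:

```bash
# Block incoming connections to Ollama with the built-in macOS firewall.
# Adjust the path if Ollama lives elsewhere (check with `which ollama`).
sudo /usr/libexec/ApplicationFirewall/socketfilterfw --add /usr/local/bin/ollama
sudo /usr/libexec/ApplicationFirewall/socketfilterfw --blockapp /usr/local/bin/ollama
```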
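The claim that inference traffic stays on localhost:11434 is easy to spot-check by looking at which address the server is listening on. The sketch below assumes the standard `lsof` listing format, where the last two fields of a listening socket are the address and `(LISTEN)`. The function name `listens_on_loopback_only` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: reads `lsof -nP -iTCP:11434 -sTCP:LISTEN` output on
# stdin. Succeeds only if at least one listener is found and every listener
# is bound to a loopback address.
listens_on_loopback_only() {
  awk '$NF == "(LISTEN)" {
         seen = 1
         addr = $(NF - 1)
         if (addr !~ /^127\./ && addr !~ /^\[::1\]/) bad = 1
       }
       END { if (seen && !bad) exit 0; exit 1 }'
}

# Usage:
#   if lsof -nP -iTCP:11434 -sTCP:LISTEN | listens_on_loopback_only; then
#     echo "port 11434 is bound to loopback only"
#   else
#     echo "port 11434 is exposed beyond loopback, or nothing is listening"
#   fi
```

A listener shown as `*:11434` means the API is reachable from other machines on the network, which is worth fixing before worrying about outbound traffic.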

The Harder Problem: Your Documents

Running a local LLM is one thing. Feeding it sensitive documents is another. This is where most privacy setups break down.

If you're using a RAG (Retrieval-Augmented Generation) workflow — giving your LLM access to PDFs, code repos, or internal docs — those documents are processed in memory locally. But they also exist on your filesystem, often in cleartext, accessible to every process on your machine including those that do phone home.

The principle: separate your document storage from your AI workspace.

Sensitive documents (client contracts, medical records, source code under NDA) should live in encrypted storage and only be decrypted into a sandboxed workspace for the duration of an AI session.
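The session-scoped part of that principle can be sketched in a few lines of shell. This is an illustration only: `run_in_scratch_workspace` is a made-up name, the decryption step is left out because it depends on your storage tool, and `rm -rf` is not a secure erase on SSDs.

```bash
#!/usr/bin/env bash
# Hypothetical helper: copy a source directory into a private temporary
# workspace, run a command there, then delete the workspace again.
# Usage: run_in_scratch_workspace <source-dir> <command> [args...]
run_in_scratch_workspace() {
  local src="$1"
  shift
  local ws status=0
  ws=$(mktemp -d) || return 1
  chmod 700 "$ws"                 # readable by the current user only
  cp -R "$src"/. "$ws"/
  ( cd "$ws" && "$@" ) || status=$?
  rm -rf "$ws"                    # remove the cleartext copy
  return "$status"
}

# Example, with a made-up path and ingestion script:
#   run_in_scratch_workspace ~/decrypted/contracts ./ingest.sh
```

The command's exit status is passed through, so a failed ingestion run is still visible to the caller after the workspace has been removed.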

Tresorit handles this well. It's zero-knowledge encrypted cloud storage — meaning even Tresorit's servers can't read your files. You can set up a sync folder that exists as a local mount for AI ingestion, then unmount it when you're done. No cleartext copy sits on disk long-term.

Affiliate Disclosure: This article may contain affiliate links. If you make a purchase through these links, we may earn a small commission at no extra cost to you. We only recommend products we genuinely believe in. This helps support our work and allows us to continue providing free content.

The rule of thumb: use local for anything that touches sensitive or proprietary data. Use trusted cloud AI for research, general knowledge, and tasks where the question itself isn't sensitive.

Building a Zero-Trust AI Network Policy

If you want a systematic approach rather than ad-hoc monitoring, set up an application firewall with explicit allow-lists.

On macOS with Lulu (free):

  1. Install Lulu from objective-see.org
  2. Enable "Block All" mode
  3. Launch your AI tools one at a time and approve only the connections you expect:
     - Ollama: allow localhost only, deny everything else
     - LM Studio: allow localhost, deny analytics domains
     - Cursor/Continue.dev: this will require more careful decisions, since these legitimately need some cloud connectivity
  4. Save your rules to version control

On Linux with ufw and per-user rules:

```bash
# Allow loopback traffic to Ollama's port
sudo ufw allow from 127.0.0.1 to 127.0.0.1 port 11434

# Block Ollama from making outbound connections (except loopback).
# The iptables owner match works per user, so this assumes Ollama
# runs as a dedicated "ollama" user. IPv6 needs a matching ip6tables rule.
sudo iptables -A OUTPUT -m owner --uid-owner "$(id -u ollama)" \
  ! -d 127.0.0.0/8 -j REJECT
```

Note: iptables has no true per-process match. The owner module matches by user ID, so the rule only isolates Ollama if it runs under its own account, and the module isn't available in all environments.
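Since firewall mistakes are easy to make, it can help to print the rules for review before applying them. The sketch below is one way to do that, assuming Ollama runs under a dedicated account. The function name `print_block_rules` is hypothetical, and the IPv6 rule is an addition that the commands above do not include.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print (not apply) the IPv4 and IPv6 rules that reject
# all non-loopback outbound traffic from one user account.
print_block_rules() {
  local user="${1:-ollama}"
  printf 'iptables -A OUTPUT -m owner --uid-owner %s ! -d 127.0.0.0/8 -j REJECT\n' "$user"
  printf 'ip6tables -A OUTPUT -m owner --uid-owner %s ! -d ::1/128 -j REJECT\n' "$user"
}

# Review first:
#   print_block_rules ollama
# Apply once the output looks right:
#   print_block_rules ollama | sudo sh
```

Rules added this way do not survive a reboot unless you persist them with whatever mechanism your distribution provides.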

The Email Vector: Exfiltration Through AI Plugins

One underappreciated risk: AI tools with email integration.

If you've connected Gmail, Outlook, or any IMAP-based email to an AI assistant for summarization or drafting, you've potentially granted that tool (and its cloud backend) access to your entire inbox. Even "local" tools that use an email plugin often proxy through a cloud service for OAuth token management.

For any AI workflow that touches email, use a properly isolated account. Proton Mail offers end-to-end encrypted email with a privacy architecture that's audited and open-source. Their bridge app lets you use Proton with IMAP clients and local AI tools while keeping server-side content encrypted.
