Running OSINT Locally with Ollama: No API Key Required

← Back to Blog

RUNNING OSINT LOCALLY WITH OLLAMA

Published: 2026-05-24

WHY LOCAL INFERENCE MATTERS FOR OSINT

Most AI-assisted OSINT tools send the target identifier — an email address, IP, domain, or person's name — to a remote API for processing. For many investigations this is fine. For sensitive ones it is not. Investigators working on insider threat cases, due diligence for acquisitions, or security research involving vulnerable individuals have reasonable concerns about routing investigation targets through a third-party cloud provider's infrastructure. Even with strong data retention policies, the operational security posture of a fully local system is categorically different from one that phones home with every query.

Ollama provides a clean solution: it runs large language models on your local hardware, exposing the same REST API surface that OpenOSINT uses. With a one-line flag change, OpenOSINT's agentic loop routes through your local Ollama instance instead of Anthropic's API. The tool execution — the actual OSINT calls to holehe, sherlock, ipinfo.io, and the rest — was always local. Now the reasoning layer is too.

There is a capability tradeoff, covered below. But for workflows where local execution is required, the tradeoff is often acceptable.

SETTING UP OLLAMA

Ollama installs as a single binary on macOS, Linux, and Windows. On macOS: brew install ollama. On Linux: the install script at ollama.com/install.sh. Once installed, the daemon starts automatically or via ollama serve. Pull a model:

ollama pull llama3.2
ollama pull mistral
ollama pull qwen2.5:14b

Models are stored locally in ~/.ollama/models/. After pulling, the REST API is available at http://localhost:11434 by default. You can verify: curl http://localhost:11434/api/tags returns a JSON list of available models.

For OSINT use with OpenOSINT, switch the provider at launch:

# One-time session flag
openosint --provider ollama

# Or set in .env
OPENOSINT_PROVIDER=ollama
OPENOSINT_OLLAMA_MODEL=llama3.2

OpenOSINT's --provider ollama path uses the Ollama Python library to send the user message and tool definitions to the local model, dispatches tool calls to the same Python functions used in the Anthropic path, and feeds results back. The REPL interface and slash commands (/history, /export, /model) remain identical.

MODEL SELECTION FOR TOOL-USE WORKFLOWS

Not all models handle structured tool calling well. The agentic loop in OpenOSINT requires the model to output valid JSON-formatted tool calls, parse structured tool results, and decide whether to chain further calls or return a final answer. Smaller models frequently fail at one or more of these steps.

Models that perform reliably for OSINT tool-use workflows as of mid-2026:

llama3.2 (3B/11B): Good balance of speed and tool-use accuracy. 3B runs on 8GB RAM; 11B needs 16GB. Recommended starting point.
qwen2.5:14b: Strong structured output fidelity. 16GB VRAM minimum for decent speed. Handles multi-hop tool chains reliably.
mistral:7b-instruct: Fast on CPU-only setups. Tool-use reliability is acceptable for single-tool queries; breaks down on complex chains.
codellama variants: Avoid for OSINT reasoning — optimized for code generation, not investigation synthesis.

The practical test: ask the model to investigate a domain and see whether it calls search_whois followed by search_ip without prompting. If it returns a text response without calling any tools, the model lacks sufficient instruction-following for the agentic workflow.

$ openosint --provider ollama --model qwen2.5:14b
openosint [ollama/qwen2.5:14b] > investigate 8.8.8.8

  → search_ip('8.8.8.8')
    Org: Google LLC | Country: US | ASN: AS15169
  → search_abuseipdb('8.8.8.8')
    Abuse Confidence: 0% | Reports: 2 (not significant)

  8.8.8.8 is Google's public DNS resolver operated from AS15169.
  Clean reputation. No threat indicators.

CAPABILITY TRADEOFFS VS. CLAUDE

The capability gap between local models and Claude is real and should be stated plainly. Claude has substantially stronger cross-tool reasoning, better handling of ambiguous targets, and more reliable synthesis of multi-source findings into coherent investigative summaries. On a complex investigation — email to account enumeration to username to cross-platform profiles, with a breach pivot in the middle — Claude makes better decisions about what to chain and produces more actionable summaries.

Concrete differences to expect with local models:

Missed pivots: Local models may fail to notice a GitHub username in holehe output and call search_username automatically. Claude does this reliably.
Weaker synthesis: The final report is shorter and less analytical. Local models describe what they found; Claude explains what it means.
Tool call formatting failures: Smaller local models occasionally produce malformed JSON tool calls. OpenOSINT catches and retries these, but it adds latency.
Speed: On CPU-only hardware, a 7B model running single-threaded is slower than the Anthropic API for complex chains. On a modern GPU, local inference is comparable.

For simple lookups — a single IP, a single email, a WHOIS check — the capability gap is narrow. The Ollama path is entirely adequate. For multi-pivot investigations that need synthesis, the Anthropic backend produces better results.

WHEN LOCAL INFERENCE IS THE RIGHT CALL

Use the Ollama provider when: the investigation targets are sensitive and should not leave your machine; you are in an air-gapped or restricted network environment; you need to run bulk automation without per-token API costs; or you are developing and testing OpenOSINT workflows and don't want to burn Anthropic credits on iteration cycles.

Use the Anthropic provider when: investigation quality matters more than privacy for the reasoning layer; you need reliable multi-hop chaining; or you are producing a final report where synthesis quality affects the output's value.

The provider flag makes switching trivial, so the decision can be per-investigation rather than a permanent configuration choice. A practical workflow: develop and iterate with Ollama, run the final investigation pass with Claude when quality is the priority.

For more on the agentic architecture both providers share, see How AI agents are changing OSINT. For tool reference including the --provider flag, see the tools reference. For automating multi-target workflows regardless of provider, see the parallel OSINT guide.