Generating Structured OSINT Reports: JSON, PDF, and Markdown

Published: 2026-05-24

WHY OUTPUT FORMAT MATTERS

An investigation that produces only terminal output has a short half-life. The findings exist in the scrollback buffer of a single session. They can't be searched programmatically, diffed against a previous investigation of the same target, ingested by a SIEM, or handed to a colleague who wasn't present. Structured output transforms a one-time lookup into a persistent record that can feed downstream systems, support a chain of custody, or serve as the basis for a client-facing report.

The right output format depends on the consumer. Use JSON when downstream automation will process the results: SIEM ingestion, database storage, diff analysis. Use Markdown when a human will read the output in a terminal, editor, or document system that renders it. Use PDF when the report needs to be attached to a ticket, sent to a client, or archived under a legal hold. OpenOSINT supports all three export paths, plus the REPL's /export command for on-the-fly Markdown export during interactive sessions.

JSON OUTPUT: STRUCTURED MACHINE-READABLE RESULTS

The --json flag on any OpenOSINT command produces a structured JSON object instead of Rich-rendered terminal output:

# Single target, JSON output
openosint email target@example.com --json > result.json

# Multi-target, JSON output
openosint multi targets.txt --json > bulk_results.json

The JSON schema for a single investigation result:

{
  "target": "target@example.com",
  "target_type": "email",
  "timestamp": "2026-05-24T14:23:01Z",
  "tools_run": ["search_email", "search_breach", "generate_dorks"],
  "results": {
    "search_email": {
      "accounts_found": ["GitHub", "Spotify", "WordPress"],
      "services_tested": 312
    },
    "search_breach": {
      "breaches": [
        {"name": "LinkedIn", "date": "2016-05-21", "data_classes": ["Emails", "Passwords"]},
        {"name": "Adobe", "date": "2013-10-04", "data_classes": ["Emails", "Passwords", "Usernames"]}
      ]
    }
  },
  "summary": "2 breach exposures found. GitHub account confirmed."
}

The timestamp field is ISO 8601 UTC. For multi-target output, results is an array of these objects. The schema is stable across OpenOSINT versions from v2.0 onward; breaking changes are versioned.
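As a rough sketch of consuming this output downstream, the snippet below parses the timestamp and lists breaches oldest first. It assumes the single-result shape shown above; the sample is embedded inline so the snippet runs on its own, and parse_timestamp and summarize_breaches are hypothetical helpers, not part of OpenOSINT.

```python
import json
from datetime import datetime, timezone

# Sample mirrors the single-result schema shown above; in practice it would be
# read from a file written by `openosint ... --json > result.json`.
SAMPLE = """
{
  "target": "target@example.com",
  "target_type": "email",
  "timestamp": "2026-05-24T14:23:01Z",
  "tools_run": ["search_email", "search_breach", "generate_dorks"],
  "results": {
    "search_email": {"accounts_found": ["GitHub", "Spotify", "WordPress"], "services_tested": 312},
    "search_breach": {
      "breaches": [
        {"name": "LinkedIn", "date": "2016-05-21", "data_classes": ["Emails", "Passwords"]},
        {"name": "Adobe", "date": "2013-10-04", "data_classes": ["Emails", "Passwords", "Usernames"]}
      ]
    }
  },
  "summary": "2 breach exposures found. GitHub account confirmed."
}
"""


def parse_timestamp(value: str) -> datetime:
    """Parse the ISO 8601 UTC timestamp into an aware datetime."""
    # Older Python versions reject a trailing 'Z', so rewrite it as an offset.
    return datetime.fromisoformat(value.replace("Z", "+00:00")).astimezone(timezone.utc)


def summarize_breaches(result: dict) -> list[str]:
    """Hypothetical helper: one line per breach, oldest first."""
    # Assumption: a tool's key may be missing from "results", so default to empty.
    breaches = result.get("results", {}).get("search_breach", {}).get("breaches", [])
    ordered = sorted(breaches, key=lambda b: b["date"])
    return [f'{b["date"]} {b["name"]}: {", ".join(b["data_classes"])}' for b in ordered]


result = json.loads(SAMPLE)
ran_at = parse_timestamp(result["timestamp"])
lines = summarize_breaches(result)

print(result["target"], ran_at.isoformat())
for line in lines:
    print(line)
```

Sorting on the date string works here because the dates are in YYYY-MM-DD form, which orders the same way lexically and chronologically.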

MARKDOWN EXPORT: THE /EXPORT SLASH COMMAND

During an interactive REPL session, /export writes the current session's findings to a Markdown file:

openosint > investigate target@example.com
[... investigation runs ...]
openosint > /export investigation_2026-05-24.md
[+] Exported 4 tool results to investigation_2026-05-24.md

The exported Markdown includes: the target, investigation timestamp, each tool's output as a formatted section with headers, and the agent's synthesis as a conclusion section. The file is readable in any Markdown renderer and can be committed to a git repository for version-controlled investigation records.

For programmatic Markdown generation outside the REPL, OpenOSINT's json_output.py module provides results_to_markdown(), which takes a structured result dict and returns a Markdown string. This is the same function the REPL uses internally:

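import json

from openosint.json_output import results_to_markdown

# Load a result previously saved with --json
with open('result.json', encoding='utf-8') as f:
    results = json.load(f)

# Render the result to Markdown and write the report
md = results_to_markdown(results)
with open('report.md', 'w', encoding='utf-8') as f:
    f.write(md)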

PDF REPORT GENERATION

PDF generation requires the optional reportlab dependency: pip install "openosint[pdf]" (quoted so that shells such as zsh do not try to expand the brackets) or pip install reportlab. Once installed, the --pdf flag generates a formatted PDF report:

openosint email target@example.com --pdf report.pdf

The PDF output includes: a cover page with the investigation target, timestamp, and tool inventory; one section per tool with findings formatted as tables or bullet lists; and a conclusions section containing the agent's synthesis. The page layout uses a monospace font and includes page numbers and a footer with the OpenOSINT version and generation timestamp.

For custom PDF layouts, pdf_report.py exposes the generate_pdf(results: dict, output_path: str) function directly. The results parameter is the same structured dict produced by --json output. To add custom sections or modify the layout, subclass the ReportBuilder class and override the section rendering methods.
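A minimal sketch of calling generate_pdf on a saved JSON result follows. The import path openosint.pdf_report is an assumption inferred from the module name above, and json_to_pdf with its writer parameter is a hypothetical wrapper added here so the PDF backend can be swapped out, for example in tests.

```python
import json
from pathlib import Path
from typing import Callable, Optional

# Any callable with the generate_pdf(results, output_path) shape described above.
PdfWriter = Callable[[dict, str], object]


def json_to_pdf(json_path: str, pdf_path: str, writer: Optional[PdfWriter] = None) -> str:
    """Load a saved --json result and pass it to a PDF writer (hypothetical wrapper)."""
    if writer is None:
        # Imported lazily so the optional PDF dependency is only needed when used.
        # Assumption: pdf_report.py is importable as openosint.pdf_report.
        from openosint.pdf_report import generate_pdf
        writer = generate_pdf

    with open(json_path, encoding="utf-8") as f:
        results = json.load(f)

    # Create the output directory first so the writer does not fail on a missing path.
    Path(pdf_path).parent.mkdir(parents=True, exist_ok=True)
    writer(results, pdf_path)
    return pdf_path
```

With OpenOSINT and its PDF extra installed, json_to_pdf("result.json", "reports/pdf/result.pdf") would be the typical call. Passing a different writer is one way to try a custom layout without touching the loading code.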

PDF reports are appropriate for any investigation that will be reviewed by someone outside the terminal environment: a security team reviewing a threat actor profile, a compliance team auditing contractor background checks, or a client receiving a deliverable from a security engagement.

BUILDING A REPORT PIPELINE

For recurring investigations, such as weekly threat intelligence reports on a watchlist of domains or monthly breach exposure audits of an employee email list, the recommended pattern is to run the investigation in JSON mode, commit the JSON to a git repository, and generate human-readable output from the JSON on demand.

#!/bin/bash
# Stop on the first failed command so a failed run is never committed
set -euo pipefail

DATE=$(date +%Y-%m-%d)
mkdir -p reports/raw reports/md

# Run investigation, save structured output
openosint multi watchlist.txt --json > "reports/raw/${DATE}.json"

# Generate Markdown report from JSON
python3 -c "
from openosint.json_output import results_to_markdown
import json, sys
with open(sys.argv[1], encoding='utf-8') as f:
    data = json.load(f)
print(results_to_markdown(data))
" "reports/raw/${DATE}.json" > "reports/md/${DATE}.md"

# Commit both
git add reports/ && git commit -m "Weekly OSINT run ${DATE}"

This gives you a queryable history of investigation results, diff-able to track changes in target state week over week. Combined with the parallel investigation patterns, this pipeline handles hundreds of targets in a single scheduled run. For the full tool output schemas that the JSON exporter relies on, see the tools reference. For AI-assisted investigation that writes findings directly to disk, see the Claude Code OSINT guide.
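The week-over-week comparison could be sketched as follows. This assumes a multi-target run loads as a list of single-result objects; if your version wraps that list under a key, unwrap it first. The helper names and the sample data are illustrative only and are not part of OpenOSINT.

```python
def breaches_by_target(run: list[dict]) -> dict[str, set[str]]:
    """Map each target to the set of breach names reported for it."""
    index = {}
    for item in run:
        # Assumption: the breach tool's key may be missing when nothing was found.
        breaches = item.get("results", {}).get("search_breach", {}).get("breaches", [])
        index[item["target"]] = {b["name"] for b in breaches}
    return index


def new_breaches(previous: list[dict], current: list[dict]) -> dict[str, list[str]]:
    """Return breach names present in `current` but not in `previous`, per target."""
    before = breaches_by_target(previous)
    after = breaches_by_target(current)
    changes = {}
    for target, names in after.items():
        # A target missing from the previous run is treated as having no breaches.
        added = names - before.get(target, set())
        if added:
            changes[target] = sorted(added)
    return changes


# Illustrative data only; real runs would be loaded from reports/raw/<date>.json.
last_week = [
    {"target": "a@example.com", "results": {"search_breach": {"breaches": [{"name": "LinkedIn"}]}}},
    {"target": "b@example.com", "results": {}},
]
this_week = [
    {"target": "a@example.com", "results": {"search_breach": {"breaches": [{"name": "LinkedIn"}, {"name": "Adobe"}]}}},
    {"target": "b@example.com", "results": {}},
]

changes = new_breaches(last_week, this_week)
print(changes)
```

Comparing sets of breach names rather than raw JSON text keeps the diff stable when fields are reordered or timestamps change between runs.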