// INVESTIGATION · v0.4.0

uap-analyzer

The analyst's tool for working through declassified UAP releases.

An MCP server that ingests the U.S. Department of War's PURSUE-era UAP corpus (videos, PDFs, photos) and exposes the heavy work (frame extraction, FLIR-aware vision description, PDF OCR, FTS5 search) as fast, typed tools your LLM session can call. No cloud APIs in the analysis path. The chat never has to load a 274 MB MP4 to know what's in it.

Hundreds of files per tranche

war.gov drops PDFs, FLIR videos, scanned 1940s memos, FBI photo packets. Opening them one at a time is the wrong workflow.

Chat-as-substrate fails

Loading a 274 MB MP4 into context to ask "what's in this?" burns tokens and crushes the session. The substrate should hold the bytes; the chat should hold the conclusions.

Cloud inference isn't an option

For analyst work on declassified-but-sensitive material, you want inference to stay on your LAN. The toolchain shouldn't fan out to a third party.

How it works

One MCP server, registered with Claude Code via claude mcp add --transport http. From there it's just a set of typed tools the LLM session can call by name.
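To make "call by name" concrete, here is a minimal sketch of the JSON-RPC 2.0 body an MCP client sends for a tool invocation. The tools/call method and the name/arguments params come from the MCP specification; the argument name query is an assumption for illustration, since the tool schemas are not documented here.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests need a unique id per call


def tool_call(name: str, **arguments) -> dict:
    """Build an MCP `tools/call` request body for the named tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# `query` is a hypothetical parameter name, not the server's confirmed schema.
request = tool_call("search_corpus", query="range fouler")
print(json.dumps(request, indent=2))
```

In practice the MCP client inside the LLM session builds this envelope for you; the sketch only shows what travels over the HTTP transport.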

PRIMITIVE 1 · CORPUS

Filesystem-rooted corpus + SQLite/FTS5 cache

Drop releases under a Release_N/ folder. On rescan the server walks the tree and runs ffprobe on videos, pdfplumber (with a Tesseract OCR fallback) on PDFs, and Pillow on photos. Everything gets indexed into a SQLite FTS5 virtual table, so a bm25 search across every PDF returns in milliseconds.
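The rescan walk can be sketched roughly as follows. This is an illustration under assumptions, not the server's actual code: the extension map, the record fields, and the scan_release name are all hypothetical, and the handler values are labels only (nothing here invokes ffprobe, pdfplumber, Tesseract, or Pillow).

```python
import tempfile
from pathlib import Path

# Hypothetical extension-to-handler map; the real server may recognise more types.
HANDLERS = {
    ".mp4": "ffprobe",
    ".mov": "ffprobe",
    ".pdf": "pdfplumber+tesseract",
    ".jpg": "pillow",
    ".jpeg": "pillow",
    ".png": "pillow",
}


def scan_release(root: Path) -> list[dict]:
    """Walk a corpus root and return one record per recognised media file."""
    records = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        handler = HANDLERS.get(path.suffix.lower())
        if handler is None:
            continue  # unknown file types are skipped, not indexed
        records.append({
            "path": path.relative_to(root).as_posix(),
            "handler": handler,
            "bytes": path.stat().st_size,
        })
    return records


# Demo against a throwaway tree with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    release = Path(tmp) / "Release_1"
    release.mkdir()
    for name in ("clip.mp4", "memo.pdf", "photo.JPG", "notes.txt"):
        (release / name).touch()
    records = scan_release(Path(tmp))

for record in records:
    print(record["path"], "->", record["handler"])
```

Keeping the walk separate from the per-type handlers means a rescan can cheaply decide what needs work before any heavy tool runs.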

PRIMITIVE 2 · INFERENCE

Local-only vision + text inference

Inference routes through a local Ollama instance: llama3.2-vision:11b describes frames, and qwq:32b summarizes text. FLIR-aware prompting suppresses the model's tendency to hallucinate HUD elements from training-data exposure. No cloud APIs in the analysis path; the corpus and your analyst notes stay on your LAN.
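A minimal sketch of what a local vision-describe call could look like against Ollama's /api/generate endpoint, which accepts base64-encoded images alongside a prompt. The prompt wording and the function names are assumptions; the project's actual FLIR-aware prompt is not shown here.

```python
import base64
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local port

# Hypothetical prompt text, written only to illustrate the FLIR-aware idea.
FLIR_PROMPT = (
    "This is a thermal infrared (FLIR) frame. Describe only the shapes and "
    "contrast you can actually see. Do not infer or invent HUD text, "
    "symbology, or readouts that are not clearly legible."
)


def build_describe_request(frame_bytes: bytes,
                           model: str = "llama3.2-vision:11b") -> dict:
    """Build the JSON body for a non-streaming Ollama generate call."""
    return {
        "model": model,
        "prompt": FLIR_PROMPT,
        "images": [base64.b64encode(frame_bytes).decode("ascii")],
        "stream": False,
    }


def describe_frame(frame_bytes: bytes, timeout: float = 120.0) -> str:
    """POST the frame to the local Ollama instance and return its description."""
    body = json.dumps(build_describe_request(frame_bytes)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)["response"]


# Building the payload needs no running server; describe_frame() does.
payload = build_describe_request(b"\x00\x01placeholder-frame-bytes")
print(payload["model"], len(payload["images"][0]))
```

Because the request only ever targets localhost, neither the frame bytes nor the resulting description leave the machine running Ollama.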

PRIMITIVE 3 · RETRIEVAL

Cross-corpus FTS5 search

Once index_corpus has run, search_corpus("range fouler") spans every PDF in the tranche in milliseconds — bm25-ranked, with snippets. Find the mission report that matches the FLIR clip. Track recurring patterns across releases.
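The retrieval step can be sketched with SQLite's built-in FTS5 functions. The table layout (a pages table with path and body columns) and the sample rows are made up for illustration and are not taken from the corpus; bm25() and snippet() are standard FTS5 auxiliary functions. This assumes the Python build's SQLite has FTS5 compiled in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per PDF page; `path` is stored but not searched.
conn.execute("CREATE VIRTUAL TABLE pages USING fts5(path UNINDEXED, body)")
conn.executemany(
    "INSERT INTO pages (path, body) VALUES (?, ?)",
    [
        # Placeholder text, not real corpus content.
        ("Release_1/sample_a.pdf",
         "placeholder report text mentioning a range fouler near the training area"),
        ("Release_1/sample_b.pdf",
         "placeholder memo text about weather balloons and radar calibration"),
    ],
)


def search_corpus(conn: sqlite3.Connection, query: str, limit: int = 10):
    """Return (path, snippet, score) rows, best match first."""
    # Quote the input so FTS5 treats it as a phrase rather than query syntax.
    phrase = '"' + query.replace('"', '""') + '"'
    return conn.execute(
        """
        SELECT path,
               snippet(pages, 1, '[', ']', '...', 12),
               bm25(pages)
        FROM pages
        WHERE pages MATCH ?
        ORDER BY bm25(pages)  -- more negative bm25 means a better match
        LIMIT ?
        """,
        (phrase, limit),
    ).fetchall()


results = search_corpus(conn, "range fouler")
for path, snip, score in results:
    print(path, snip)
```

Quoting the query as a phrase is a design choice for this sketch: it keeps stray punctuation in analyst queries from being parsed as FTS5 operators, at the cost of not supporting boolean search.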

v0.4.0 · 10 TOOLS LIVE

The tool surface

list_corpus, analyze_video, extract_frame, describe_image, analyze_pdf, search_corpus, index_corpus. Each returns text or structured JSON; raw media stays in the container.

Shipped across v0.2.x – v0.4.x: flir_hud_ocr (Tesseract, plus a vision mode via qwen2.5vl), transcribe_audio (faster-whisper, CPU int8), detect_objects (YOLOv8). All three corpus-aware primitives are now in place; the next wave is composition tools (cross-corpus threading, report synthesis).

WORKED FINDINGS

Four notes from sampling Release 1

A maritime FLIR clip with a mid-clip wake event between two vessels. A NASA-logo placeholder file revealing a redaction-by-substitution pattern in the release pipeline. A coastal-recon snippet with a sensor-format anomaly. AARO's Western US Event briefing (federal LE witnesses, restricted zone, four sightings). Full markdown in /findings.

"Disclosure doesn't end at the download button. It ends when someone has actually read the corpus and can tell you what's in it. The analyzer is the gap between."