// INVESTIGATION · v0.4.0
uap-analyzer
The analyst's tool for working through declassified UAP releases.
An MCP server that ingests the U.S. Department of War's PURSUE-era UAP corpus (videos, PDFs, photos) and exposes the heavy work (frame extraction, vision description of FLIR frames, PDF OCR, FTS5 search) as fast, typed tools your LLM session can call. No cloud APIs in the analysis path. The chat never has to load a 274 MB MP4 to know what's in it.
Hundreds of files per tranche
war.gov drops PDFs, FLIR videos, scanned 1940s memos, FBI photo packets. Opening them one at a time is the wrong workflow.
Chat-as-substrate fails
Loading a 274 MB MP4 into context to ask "what's in this?" burns tokens and crushes the session. The substrate should hold the bytes; the chat should hold the conclusions.
Cloud inference isn't an option
For analyst work on declassified-but-sensitive material, you want inference to stay on your LAN. The toolchain shouldn't fan out to a third party.
How it works
One MCP server, registered with Claude Code via claude mcp add --transport http. From there it's just a set of typed tools the LLM session can call by name.
PRIMITIVE 1 · CORPUS
Filesystem-rooted corpus + SQLite/FTS5 cache
Drop releases under a Release_N/ folder. On rescan the server walks the tree and runs ffprobe on videos, pdfplumber (with a Tesseract OCR fallback) on PDFs, and Pillow on photos. Everything gets indexed into a SQLite FTS5 virtual table, giving BM25-ranked search across every PDF in milliseconds.
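As a rough illustration of the rescan step, the sketch below walks Release_N/ folders and classifies files by extension. The names scan_corpus, ffprobe_command and KIND_BY_SUFFIX, and the extension list itself, are assumptions made for this example, not the server's actual code. The ffprobe flags shown are standard ones for dumping metadata as JSON.

```python
from pathlib import Path

# Hypothetical extension-to-kind mapping; the real server's list is not given here.
KIND_BY_SUFFIX = {
    ".mp4": "video", ".mov": "video",
    ".pdf": "pdf",
    ".jpg": "photo", ".jpeg": "photo", ".png": "photo",
}


def scan_corpus(root):
    """Walk root/Release_*/ and return (release, kind, relative_path) tuples."""
    root = Path(root)
    found = []
    for release_dir in sorted(root.glob("Release_*")):
        if not release_dir.is_dir():
            continue
        for path in sorted(release_dir.rglob("*")):
            # Match on lowercased suffix so "a.PDF" and "a.pdf" are treated alike.
            kind = KIND_BY_SUFFIX.get(path.suffix.lower())
            if path.is_file() and kind:
                rel = path.relative_to(root).as_posix()
                found.append((release_dir.name, kind, rel))
    return found


def ffprobe_command(path):
    """Build an ffprobe invocation that prints container and stream metadata as JSON."""
    return ["ffprobe", "-v", "error", "-print_format", "json",
            "-show_format", "-show_streams", str(path)]
```

A caller would pass each "video" path to ffprobe_command and run it with subprocess.run(..., capture_output=True), then parse stdout with json.loads. Files with unrecognized extensions are simply skipped in this sketch.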
PRIMITIVE 2 · INFERENCE
Local-only vision + text inference
Inference routes through a local ollama instance: llama3.2-vision:11b for describing frames, qwq:32b for text summarization. FLIR-aware prompting suppresses the model's tendency to hallucinate HUD elements from training-data exposure. No cloud APIs in the analysis path; the corpus + your analyst notes stay on your LAN.
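A minimal sketch of what a local vision-describe call could look like, assuming ollama is listening on its default port and using its /api/generate endpoint. The prompt wording and the function names (build_describe_request, describe_frame) are hypothetical; the project's actual FLIR-aware prompt is not reproduced here.

```python
import base64
import json
import urllib.request

# ollama's default local endpoint; adjust if your instance runs elsewhere on the LAN.
OLLAMA_URL = "http://localhost:11434/api/generate"

# Hypothetical prompt wording, written for this example only.
FLIR_PROMPT = (
    "This is a single frame from an infrared (FLIR) sensor video. "
    "Describe only what is visibly present. Do not infer or invent "
    "HUD text, symbology, or numeric readouts that you cannot read."
)


def build_describe_request(image_bytes, model="llama3.2-vision:11b", prompt=FLIR_PROMPT):
    """Build the JSON body for ollama's /api/generate with one base64-encoded image."""
    return {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # ask for one complete JSON response instead of a stream
    }


def describe_frame(image_bytes, url=OLLAMA_URL):
    """POST a frame to the local ollama instance and return the model's text."""
    body = json.dumps(build_describe_request(image_bytes)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=300) as response:
        return json.loads(response.read())["response"]
```

Keeping the payload builder separate from the network call makes the prompt and model choice easy to inspect without a running model. Only the frame bytes and the prompt leave the process, and only as far as the local ollama host.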
PRIMITIVE 3 · RETRIEVAL
Cross-corpus FTS5 search
Once index_corpus has run, search_corpus("range fouler") spans every PDF in the tranche in milliseconds — bm25-ranked, with snippets. Find the mission report that matches the FLIR clip. Track recurring patterns across releases.
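To make the retrieval idea concrete, here is a small sketch of BM25-ranked phrase search with snippets over an FTS5 table, using Python's built-in sqlite3 module. The table name pages, its columns, and the helper functions are assumptions for illustration; the analyzer's real schema may differ. It also assumes your SQLite build includes FTS5, which most Python distributions do.

```python
import sqlite3


def open_index(path=":memory:"):
    """Open (or create) an FTS5 index with one row per extracted page of text."""
    db = sqlite3.connect(path)
    # 'path' is stored but not tokenized; 'body' holds the searchable text.
    db.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS pages USING fts5(path UNINDEXED, body)"
    )
    return db


def index_page(db, path, body):
    """Add one page of extracted text to the index."""
    db.execute("INSERT INTO pages (path, body) VALUES (?, ?)", (path, body))


def search(db, query, limit=10):
    """Return (path, snippet, score) rows for a phrase query, best match first."""
    # Quote the query so FTS5 treats it as a phrase; double any embedded quotes.
    phrase = '"' + query.replace('"', '""') + '"'
    rows = db.execute(
        "SELECT path, snippet(pages, 1, '[', ']', '...', 12), bm25(pages) AS score "
        "FROM pages WHERE pages MATCH ? ORDER BY score LIMIT ?",
        (phrase, limit),
    )
    return rows.fetchall()
```

FTS5's bm25() returns lower values for better matches, which is why the query sorts ascending. The snippet() call targets column 1 (body) and wraps matched terms in square brackets.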
v0.4.0 · 10 TOOLS LIVE
The tool surface
list_corpus, analyze_video, extract_frame, describe_image, analyze_pdf, search_corpus, index_corpus. Each returns text or structured JSON; raw media stays in the container.
Shipped across v0.2.x – v0.4.x, bringing the total to ten: flir_hud_ocr (tesseract + vision-mode via qwen2.5vl), transcribe_audio (faster-whisper, CPU int8), detect_objects (YOLOv8). All three corpus-aware primitives are now in place; the next wave is composition tools (cross-corpus threading, report synthesis).
WORKED FINDINGS
Four notes from sampling Release 1
A maritime FLIR clip with a wake event between two vessels mid-clip. A NASA-logo placeholder file revealing a redaction-by-substitution pattern in the release pipeline. A coastal-recon snippet with a sensor-format anomaly. AARO's Western US Event briefing (federal LE witnesses, a restricted zone, four sightings). Full markdown in /findings.
"Disclosure doesn't end at the download button. It ends when someone has actually read the corpus and can tell you what's in it. The analyzer is the gap between."