Methodology
How we ingest, process, and analyze declassified records.
1. Document Ingestion
Our automated pipeline crawls four official government sources: the National Archives (NARA), the CIA FOIA Reading Room, the FBI Vault, and NSA FOIA releases. Each source is checked for new documents at least once a day.
Ingestion Steps
- 1. Discover new document listings from source indexes
- 2. Download PDF/media files to our secure storage
- 3. Generate SHA-256 hash for integrity verification
- 4. Record canonical source URL and mirrored location
- 5. Store first-seen and last-checked timestamps
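As an illustration of steps 3 to 5, the sketch below hashes a downloaded file and builds a provenance record. It is a minimal sketch, not our production code: the `IngestRecord` fields, the `record_document` helper, and its parameters are hypothetical names chosen for this example. Only `hashlib.sha256` is a real standard-library call.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class IngestRecord:
    # Hypothetical record layout mirroring steps 3-5 of the ingestion list.
    sha256: str             # hex digest of the file bytes (step 3)
    source_url: str         # canonical source URL (step 4)
    mirror_path: str        # where the mirrored copy is stored (step 4)
    first_seen: datetime    # set once, on first discovery (step 5)
    last_checked: datetime  # updated on every crawl (step 5)


def sha256_of_bytes(data: bytes) -> str:
    """Return the hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def record_document(
    data: bytes,
    source_url: str,
    mirror_path: str,
    existing: Optional[IngestRecord] = None,
    now: Optional[datetime] = None,
) -> IngestRecord:
    """Build a record, preserving first_seen if the document is already known."""
    now = now or datetime.now(timezone.utc)
    first_seen = existing.first_seen if existing else now
    return IngestRecord(
        sha256=sha256_of_bytes(data),
        source_url=source_url,
        mirror_path=mirror_path,
        first_seen=first_seen,
        last_checked=now,
    )
```

Passing the previous record as `existing` keeps the first-seen timestamp stable across crawls, while the last-checked timestamp advances each time.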
2. Dual-Layer Model
We maintain strict separation between original documents and our analysis layer:
Immutable Layer
- Original PDFs/media files
- Never altered or modified
- Cryptographic hash verification
- Full provenance chain
Intelligence Layer
- Extracted text and metadata
- Entity tags and classifications
- Correlation links
- Fully rebuildable from originals
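The relationship between the two layers can be sketched as follows: an original is trusted only if its bytes still match the recorded SHA-256 hash, and the analysis layer is derived from verified originals alone. This is an illustrative sketch under assumptions. The function names `verify_original` and `rebuild_intelligence`, and the shape of the `originals` mapping, are hypothetical and do not describe our actual storage schema.

```python
import hashlib
import hmac
from typing import Callable, Dict, Tuple, TypeVar

T = TypeVar("T")


def verify_original(data: bytes, expected_sha256: str) -> bool:
    """Check that stored bytes still match the hash recorded at ingestion."""
    actual = hashlib.sha256(data).hexdigest()
    # compare_digest performs a constant-time comparison of the two strings.
    return hmac.compare_digest(actual, expected_sha256.lower())


def rebuild_intelligence(
    originals: Dict[str, Tuple[bytes, str]],
    extract: Callable[[bytes], T],
) -> Dict[str, T]:
    """Re-derive the analysis layer from originals that pass verification.

    `originals` maps a document id to (file bytes, recorded hash).
    `extract` stands in for whatever extraction step is applied.
    """
    derived: Dict[str, T] = {}
    for doc_id, (data, expected) in originals.items():
        if verify_original(data, expected):
            derived[doc_id] = extract(data)
    return derived
```

Because the derived layer is a pure function of the originals in this sketch, it can be discarded and regenerated at any time without touching the immutable files.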
3. Data Extraction
Our extraction process identifies factual elements only. We do not interpret or draw conclusions.
Extracted Elements
- Dates and timestamps
- Geographic locations
- Agency/organization names
- Personnel names (as stated)
- Document classification markings
- Evidence type (memo, report, photo)
- Object descriptions (as quoted)
- Witness statements (verbatim)
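One way to keep extraction factual is to store each element as a verbatim span with its position in the source text, so any value can be checked against the original. The sketch below does this for classification markings only. The `ExtractedElement` structure, the `find_markings` function, and the short list of markings in the pattern are assumptions made for illustration, not a description of our extractor.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ExtractedElement:
    # Hypothetical structure: the text is copied verbatim, never paraphrased.
    kind: str   # e.g. "classification_marking"
    text: str   # exact characters as they appear in the source
    start: int  # character offset where the span begins
    end: int    # character offset just past the span


# Illustrative pattern only. "TOP SECRET" is listed before "SECRET" so the
# longer marking wins when both could match at the same position.
MARKING_RE = re.compile(r"\b(TOP SECRET|SECRET|CONFIDENTIAL|UNCLASSIFIED)\b")


def find_markings(page_text: str) -> List[ExtractedElement]:
    """Return every classification marking found, with its source offsets."""
    return [
        ExtractedElement("classification_marking", m.group(0), m.start(), m.end())
        for m in MARKING_RE.finditer(page_text)
    ]
```

Keeping the offsets means every extracted value can be traced back to the exact characters it came from, which matters when OCR output is imperfect.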
4. Correlation Logic
Cases are linked based on explicit, explainable attributes:
- Temporal proximity: Events occurring within the same time window
- Geographic proximity: Events in nearby locations
- Shared entities: Same agencies, personnel, or organizations mentioned
- Similar descriptors: Matching object descriptions or characteristics
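The four attributes above lend themselves to links that carry their own explanation. The following sketch returns a human-readable reason for each attribute that matches between two cases. It is a sketch under assumptions: the `Case` fields, the 30-day window, the 100 km radius, and the exact-match treatment of descriptors are placeholder choices for this example, not our actual thresholds or matching rules.

```python
import math
from dataclasses import dataclass
from datetime import date
from typing import FrozenSet, List


@dataclass(frozen=True)
class Case:
    # Hypothetical case record used only for this illustration.
    case_id: str
    event_date: date
    lat: float
    lon: float
    entities: FrozenSet[str] = frozenset()
    descriptors: FrozenSet[str] = frozenset()


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres on a sphere of radius 6371 km."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def link_reasons(a: Case, b: Case, max_days: int = 30, max_km: float = 100.0) -> List[str]:
    """List every attribute on which two cases match; empty means no link."""
    reasons: List[str] = []
    days = abs((a.event_date - b.event_date).days)
    if days <= max_days:
        reasons.append(f"temporal proximity: {days} days apart")
    km = haversine_km(a.lat, a.lon, b.lat, b.lon)
    if km <= max_km:
        reasons.append(f"geographic proximity: {km:.0f} km apart")
    shared = a.entities & b.entities
    if shared:
        reasons.append("shared entities: " + ", ".join(sorted(shared)))
    similar = a.descriptors & b.descriptors
    if similar:
        reasons.append("similar descriptors: " + ", ".join(sorted(similar)))
    return reasons
```

Returning the reasons rather than a single score keeps each link explainable: a reader can see which attributes produced it and judge them individually.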
5. Limitations
We acknowledge the following limitations:
- OCR extraction from scanned documents may contain errors
- Some documents are heavily redacted, limiting extractable data
- Location normalization may be imprecise for historical place names
- Correlation strength is algorithmic, not a measure of significance
- We cannot verify the accuracy of original government records
