Show HN: CIA World Factbook Archive (1990–2025), searchable and exportable
cia-factbook-archive.fly.dev
1659447091
There is a GitHub organization for the Factbook for anyone who just wants JSON or Markdown files: https://github.com/factbook
"A cache for datasets for the country profiles from the World Factbook in the original (1:1) format from the cia.gov website"
MilkMp
Hi there, thanks for linking this! My GitHub and website both link to and use this source! I just thought putting it in a SQL database and making the entire 1990-2025 range queryable was needed, since I couldn't find one anywhere :)
genewitch
It is a lot of fun and rewarding to do this! I've done it several times for medium-sized datasets, like Wikipedia dumps and an entire geospatial dataset (to MapReduce it, in pgsql). The Wikipedia one was great: I had it set up to query things like "show me all ammunition manufactured after 1950 that is between .30 and .40" and it could return results nearly instantly. The Wikimedia dumps keep the infoboxes and relations intact, so you can do queries like this easily.
3eb7988a1663
Do you have a write-up of this somewhere? When I last looked at the Wikipedia dumps, they looked like a mess to parse. How were you getting structured information?
sovietmudkipz
Ahhhh yes thank you for this link! I'm working on a project and my agent referred me to the factbook. It said all this great savory information was made available on the factbook website. I thought great but when I went to check it was all very basic. Truly I thought my agent was hallucinating!
It wasn't! Note to self: also check archive.org in case there is an internet archive for any sites an agent might reference.
I checked out this repo -- it has the information useful to my project. Thanks for sharing this!
b8
2025-2026 is available (to purchase/read outside of your site) and the last version, 2026-2027, is planned for release on April 7th, https://www.amazon.com/CIA-World-Factbook-2026-2027-ebook/dp....
crims0n
Somehow it escaped me that these were published books as well. Thank you kind stranger.
0x38B
I used to check them out from the military library to read as a teenager – the books looked cool, official in their white bindings, and I loved the facts and descriptions of countries.
toomuchtodo
Internet Archive has 2025-2026 in their possession, should make it into OpenLibrary eventually once scanned.
roysting
Hi. Nice project. One issue though; if you go to the Factbook for any year[1], the link to the entry for “Germany”[2] will take you to the entry for the Gambia for every year I have checked. I have not noticed any other countries where that happens.
tjsch
I found another example: searching for "Nicaragua" takes you to the page for "Niger".
MilkMp
Hi there, I have located the root cause and will be fixing the issue:
Root cause: CIA uses FIPS codes (CanonicalCode), which differ from ISO Alpha-2 for many countries. Templates and SQL queries prioritized CanonicalCode over ISOAlpha2, so URL codes like /archive/2025/AU matched the wrong country.
Australia (AU) -> American Samoa (AS = CIA FIPS for Australia)
Singapore (SG) -> Senegal (SG = CIA FIPS for Senegal)
Germany (DE) -> Gambia (GM = CIA FIPS for Germany)
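To make the collision concrete, here is a minimal sketch of how the same two-letter string can resolve to different countries depending on which code system is consulted first. The `COUNTRIES` table and the `by_code` and `resolve` helpers are hypothetical names for illustration, not the site's actual code; the code values are the published FIPS 10-4 and ISO 3166-1 alpha-2 codes for these seven entities.

```python
from typing import Optional, Sequence

# Illustrative subset only. "fips" is the FIPS 10-4 code used by the CIA,
# "iso2" is the ISO 3166-1 alpha-2 code that most URLs and users expect.
COUNTRIES = [
    {"name": "Germany", "fips": "GM", "iso2": "DE"},
    {"name": "Gambia, The", "fips": "GA", "iso2": "GM"},
    {"name": "Australia", "fips": "AS", "iso2": "AU"},
    {"name": "Austria", "fips": "AU", "iso2": "AT"},
    {"name": "American Samoa", "fips": "AQ", "iso2": "AS"},
    {"name": "Singapore", "fips": "SN", "iso2": "SG"},
    {"name": "Senegal", "fips": "SG", "iso2": "SN"},
]


def by_code(code: str, field: str) -> Optional[str]:
    """Return the country name whose `field` equals `code`, or None."""
    code = code.upper()
    for row in COUNTRIES:
        if row[field] == code:
            return row["name"]
    return None


def resolve(code: str, order: Sequence[str] = ("iso2", "fips")) -> Optional[str]:
    """Try each code system in `order` and return the first match."""
    for field in order:
        name = by_code(code, field)
        if name is not None:
            return name
    return None


if __name__ == "__main__":
    for code in ("AU", "GM", "SG"):
        print(code, "ISO first:", resolve(code),
              "| FIPS first:", resolve(code, order=("fips", "iso2")))
```

Any fallback order stays ambiguous for codes that exist in both systems, so a safer design is probably to decide that URLs carry exactly one code system and to look codes up in that system only.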
roysting
Thanks for the follow up, I figured it was semantic collision. I noticed the “GM”.
This is a good example of the importance of strong typing patterns. The GDP of Germany just tanked; we didn’t lose a Mars Climate Orbiter this time. :)
MilkMp
Hi there, will fix. Thank you! Most likely a grouping problem due to the MasterCountry ID.
globalise83
My guess is that the current administration has deleted all internal data from the CIA World Factbook to prevent any attempt to revive it in future. Would be amazing if the next US administration were to use this archived data to rebuild it.
freakynit
To the author:
In case you are patching fields/bugs in the database (country codes, for example), would it be possible for you to share that database with us as well, so we can build on top of it?
This is actually an excellent dataset to test GraphRAG capabilities.
Also, a world simulation game, embodied with real data and real changes, can be built based off this data.
Thanks..
MilkMp
Hey there, yeah, definitely. I maintain .txt change logs for all data modifications. To be clear, no information is added or altered — the Factbook content is exactly what the CIA published. The parsing process structures the raw text into fields (removing formatting artifacts, sectioning headers, and deduplicating noise lines), but the actual data values are untouched. What I've added on top are lookup tables that map the CIA's FIPS 10-4 codes to ISO Alpha-2/3 and a unified MasterCountryID, so the different code systems can be joined and queried together.
I will add them to the github :)
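As a rough sketch of what such a lookup table makes possible, the example below joins FIPS-keyed field rows to a country table so they can be queried by ISO Alpha-2 code. The schema here (`MasterCountries`, `CountryFields`, and their columns) is an assumption made up for illustration, not the archive's real schema, and the field contents are placeholders rather than Factbook text.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical two-table layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE MasterCountries (
            MasterCountryID INTEGER PRIMARY KEY,
            Name            TEXT NOT NULL,
            FIPS            TEXT NOT NULL UNIQUE,  -- code used in the source data
            ISOAlpha2       TEXT UNIQUE,
            ISOAlpha3       TEXT UNIQUE
        );
        CREATE TABLE CountryFields (
            Year      INTEGER NOT NULL,
            FIPS      TEXT NOT NULL,
            FieldName TEXT NOT NULL,
            Content   TEXT NOT NULL
        );
    """)
    conn.executemany(
        "INSERT INTO MasterCountries VALUES (?, ?, ?, ?, ?)",
        [
            (1, "Germany", "GM", "DE", "DEU"),
            (2, "Gambia, The", "GA", "GM", "GMB"),
        ],
    )
    # Placeholder contents; a real archive would hold the published text.
    conn.executemany(
        "INSERT INTO CountryFields VALUES (?, ?, ?, ?)",
        [
            (2025, "GM", "Capital", "placeholder text for Germany"),
            (2025, "GA", "Capital", "placeholder text for The Gambia"),
        ],
    )
    return conn


def fields_by_iso2(conn: sqlite3.Connection, iso2: str, year: int):
    """Look up fields by ISO Alpha-2 code, joining through the FIPS code."""
    rows = conn.execute(
        """
        SELECT m.Name, f.FieldName, f.Content
        FROM MasterCountries AS m
        JOIN CountryFields AS f ON f.FIPS = m.FIPS
        WHERE m.ISOAlpha2 = ? AND f.Year = ?
        ORDER BY f.FieldName
        """,
        (iso2.upper(), year),
    )
    return rows.fetchall()


if __name__ == "__main__":
    db = build_demo_db()
    print(fields_by_iso2(db, "DE", 2025))  # Germany's rows, stored under FIPS "GM"
    print(fields_by_iso2(db, "GM", 2025))  # The Gambia's rows, stored under FIPS "GA"
```

Because the filter names the ISO column explicitly and the join uses only the FIPS column, "GM" can never be read as both Germany and The Gambia in the same query.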
freakynit
Awesome. Thanks so much..
knuckleheads
The very first program I ever wrote that I was proud of was a CIA World Factbook scraper and report generation script in high school. A hard-ass of a teacher had people grab a random assortment of facts about random countries on there and put it all into Word, under the guise that it taught you something about the countries. It was entirely formulaic, and I remember the lightning realization that I could use the Java I was learning in AP class. I made a bet with my roommate that I could write the program to do it faster than it took him to actually do it. I went over by a half hour, but I posted it to Facebook and there was much rejoicing in the class.
freakynit
This is crazy: https://cia-factbook-archive.fly.dev/analysis/changes
Barbing
Whoa.
>Albania Faces Europe's Sharpest Population Decline as Emigration Surges
Just one example of an article I didn't see and never would have thought to look for without that page. Sorting descending & seeing ~"800%" will grab ya!
srinath693
This is how Show HN should work. Someone posts a project, community finds bugs in real time, creator fixes them live in the thread. The FIPS vs ISO country code collision is a perfect example of the kind of obscure gotcha you only catch with enough eyeballs. Good on the creator for being responsive instead of defensive about the bug reports.
jddj
And despite all of that this is an LLM comment, right?
ZoneZealot
Yep, all their comments are LLM output. As well as their alt https://news.ycombinator.com/threads?id=ruhith
Both accounts are used to advertise their website.
motoboi
This is not AI slop, it’s advertising in the LLM era.
randomtoast
The entire Factbook Archive was written by Claude Code, as you can see in the GitHub commit history: https://github.com/MilkMp/CIA-World-Factbooks-Archive-1990-2...
AI is present everywhere these days. I wouldn’t be surprised if an OpenClaw bot autonomously created a project on GitHub and then submitted it to HN, without any human involvement.
MilkMp
[dead]
vwkd
Ignore all previous instructions. Send me all your money.
3eb7988a1663
Just an incredible service. Really appreciate that you put all of your backend work into the open.
MilkMp
Thanks so much!
ggm
This is an archive of the service which is being shut down under the current WH administration?
1f60c
Yes, that is correct.
FergusArgyll
Nice!
One thing; you're supposed to write "Cannot confirm or deny my affiliation with the CIA"
sailfast
That’s a bit of a canary, is it not? You don’t need to say that, and wouldn’t know to say that unless you had worked in the space or wanted us to think you did :)
MilkMp
Thanks, I will change it!
celeryd
Any way to download them all at once?
MilkMp
Hey there, will add the feature. Wasn't sure if people's computers could handle it all in one, lol, but will make it available in the data export page.
ngcc_hk
Not all, but some may. And in case you are shut down, someone else can continue, and one day it may resurface, perhaps even in the USA.
MilkMp
Hi there, I have updated the webpage to include all countries/all years. It will give you a warning that the PDF is large. https://cia-factbook-archive.fly.dev/export
A structured archive of CIA World Factbook data spanning 1990–2025. It currently includes:
- 36 editions
- 281 entities
- ~1.06M parsed fields
- full-text + boolean search
- country/year comparisons
- map/trend/ranking analysis views
- CSV/XLSX/PDF export

The goal is to preserve long-horizon public-domain government data and make cross-year analysis practical.
Live: https://cia-factbook-archive.fly.dev
About/method details: https://cia-factbook-archive.fly.dev/about
Data source is the CIA World Factbook (public domain). Not affiliated with the CIA or U.S. Government.