City Atlas
The world's largest cities, placed by what an AI model reads in their name
("City, Country") — next to where they actually are. How much of the real
world map does a name alone recover? Below, every embedding model side by side.
Each map below places those same cities using only a model's reading of their name. The number on each map is its same-continent neighbours score: for a typical city, how many of its 10 closest name-twins share its continent (higher means a name alone recovers geography better). Maps are sorted best first; click any map to explore it.
▸ How this works — the picture, the scores & the setup
The picture. Each map flattens a model's name-embeddings to 2D. Map (PCA) is the honest "shadow" of the data — the same fixed math for every model, no tuning — rotated and scaled to line up with real longitude/latitude, so it's comparable to the reference map and across models. Clusters (UMAP) instead pulls each city's nearest neighbours into tight clumps: prettier local structure, but it's free to scramble the global map, so it's not a fair cross-model comparison — handy for seeing groupings, not for "is it a map?".
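As a rough illustration of the Map (PCA) view, the sketch below projects embeddings onto their top two principal components and then fits a rotation and uniform scale onto longitude/latitude. It is a minimal sketch under stated assumptions, not the page's actual pipeline: the function names `pca_2d` and `align_to_lonlat` are hypothetical, the least-squares (orthogonal Procrustes) fit is an assumed choice that also permits a mirror flip, and the embeddings are synthetic.

```python
import numpy as np

def pca_2d(embeddings):
    """Project each row onto the top two principal components."""
    centred = embeddings - embeddings.mean(axis=0)
    # Rows of vt are the principal directions, ordered by variance explained.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return centred @ vt[:2].T

def align_to_lonlat(points, lonlat):
    """Least-squares rotation (or reflection) plus uniform scale onto lon/lat."""
    a = points - points.mean(axis=0)
    b = lonlat - lonlat.mean(axis=0)
    u, s, vt = np.linalg.svd(a.T @ b)
    rotation = u @ vt                    # orthogonal; may include a mirror flip
    scale = s.sum() / (a ** 2).sum()
    return scale * (a @ rotation) + lonlat.mean(axis=0)

# Toy check: approximate (lon, lat) for five cities and SYNTHETIC embeddings.
# The coordinates are hidden inside an 8-dimensional space, shrunk and shifted.
lonlat = np.array([[139.69, 35.69], [77.21, 28.61], [-46.63, -23.55],
                   [31.24, 30.04], [-0.13, 51.51]])
rng = np.random.default_rng(0)
basis, _ = np.linalg.qr(rng.normal(size=(8, 2)))   # 8x2, orthonormal columns
embeddings = 0.01 * (lonlat - lonlat.mean(axis=0)) @ basis.T + rng.normal(size=8)

recovered = align_to_lonlat(pca_2d(embeddings), lonlat)
print(np.abs(recovered - lonlat).max())   # close to zero for this noiseless toy
```

Because the projection is the same fixed linear math for every model and the alignment only rotates and rescales, any remaining mismatch with the reference map comes from the embeddings themselves rather than from tuning.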
The scores are computed on the raw embeddings, so they describe the model — not the drawing:
- Same-continent neighbours (the headline) — for a typical city, how many of its 10 closest name-twins are on the same continent. Higher is better.
- Distance agreement — does a bigger gap in name-space mean a longer real-world distance? 1 = always, 0 = never. It's higher within a continent than overall for every model, so names recover local geography best.
- Continent separation — how sharply a model walls continents apart. This is not a quality score: strong separation breaks the layout into isolated blobs that look less like a real map.
- Map shape (the good / fair / poor badge) — how close a model's PCA map is to the real world (lower disparity is better; 0 = identical shape).
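Three of the scores above can be sketched in a few lines. This is an illustrative reading, not the page's confirmed method. It assumes cosine similarity for "closest", the mean as the "typical city" summary, a Spearman-style rank correlation against great-circle distance for distance agreement, and a Procrustes disparity for map shape. All function names are hypothetical, and continent separation is not covered.

```python
import numpy as np

def unit_rows(emb):
    """Scale each embedding to unit length so dot products are cosines."""
    return emb / np.linalg.norm(emb, axis=1, keepdims=True)

def same_continent_neighbours(emb, continents, k=10):
    """Mean number of each city's k nearest neighbours on its own continent."""
    unit = unit_rows(emb)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)            # a city is not its own neighbour
    nearest = np.argsort(-sim, axis=1)[:, :k]
    labels = np.asarray(continents)
    return (labels[nearest] == labels[:, None]).sum(axis=1).mean()

def great_circle_km(lonlat):
    """Pairwise haversine distances for rows of (lon, lat) in degrees."""
    lon, lat = np.radians(lonlat[:, 0]), np.radians(lonlat[:, 1])
    dlon = lon[:, None] - lon[None, :]
    dlat = lat[:, None] - lat[None, :]
    h = (np.sin(dlat / 2) ** 2
         + np.cos(lat)[:, None] * np.cos(lat)[None, :] * np.sin(dlon / 2) ** 2)
    return 2 * 6371.0 * np.arcsin(np.sqrt(np.clip(h, 0.0, 1.0)))

def distance_agreement(emb, lonlat):
    """Rank correlation of embedding distance vs real distance over city pairs."""
    unit = unit_rows(emb)
    pairs = np.triu_indices(len(emb), k=1)
    emb_dist = (1.0 - unit @ unit.T)[pairs]
    geo_dist = great_circle_km(lonlat)[pairs]
    ranks = lambda x: np.argsort(np.argsort(x))   # simple ranks, ties not averaged
    return np.corrcoef(ranks(emb_dist), ranks(geo_dist))[0, 1]

def map_disparity(points, lonlat):
    """Procrustes disparity between a 2D map and lon/lat (0 = identical shape)."""
    a = points - points.mean(axis=0)
    b = lonlat - lonlat.mean(axis=0)
    a, b = a / np.linalg.norm(a), b / np.linalg.norm(b)
    singular = np.linalg.svd(a.T @ b, compute_uv=False)
    return 1.0 - singular.sum() ** 2

# Toy data (SYNTHETIC): four points on the equator, embedded as unit vectors
# whose angles equal their longitudes, with made-up continent labels.
lons = np.array([0.0, 10.0, 30.0, 70.0])
toy_lonlat = np.column_stack([lons, np.zeros(4)])
toy_emb = np.column_stack([np.cos(np.radians(lons)), np.sin(np.radians(lons))])
toy_continents = ["X", "X", "Y", "Y"]

print(same_continent_neighbours(toy_emb, toy_continents, k=1))
print(distance_agreement(toy_emb, toy_lonlat))
```

The first two functions take the raw embeddings, matching the note that the scores describe the model and not the drawing; only `map_disparity` takes a 2D projection as input.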
Setup. Every model runs in its recommended clustering configuration with no geographic instruction — telling it to "think about regions" would leak the very thing we're testing. The country token does much of the geographic anchoring; the bare-name variants (which drop the country) show how much. Data: GeoNames cities15000. Embed + projection pipeline: scripts/city_atlas/.
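To make the two input variants concrete, here is a minimal sketch of building "City, Country" and bare-name strings from rows in the GeoNames tab-separated layout. It is not taken from scripts/city_atlas/: `build_inputs`, `placeholder_row`, and the `COUNTRY_NAMES` lookup are hypothetical, and the sample rows use approximate coordinates with placeholder populations, not real GeoNames records.

```python
import csv
import io

# Column positions in the GeoNames tab-separated "geoname" table.
NAME, LAT, LON, COUNTRY_CODE, POPULATION = 1, 4, 5, 8, 14

# Hypothetical lookup: GeoNames rows carry an ISO country code, not a name.
COUNTRY_NAMES = {"JP": "Japan", "IN": "India", "EG": "Egypt"}

def build_inputs(tsv_text, top_n, with_country=True):
    """Return (text to embed, lon, lat) for the top_n most populous rows."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t",
                        quoting=csv.QUOTE_NONE)
    rows = sorted(reader, key=lambda r: int(r[POPULATION]), reverse=True)
    inputs = []
    for row in rows[:top_n]:
        text = row[NAME]
        if with_country:
            text = f"{text}, {COUNTRY_NAMES[row[COUNTRY_CODE]]}"
        inputs.append((text, float(row[LON]), float(row[LAT])))
    return inputs

def placeholder_row(name, lat, lon, country_code, population):
    """Build a 19-field row with only the columns this sketch reads filled in."""
    fields = [""] * 19
    fields[NAME], fields[LAT], fields[LON] = name, str(lat), str(lon)
    fields[COUNTRY_CODE], fields[POPULATION] = country_code, str(population)
    return "\t".join(fields)

# Placeholder populations (3, 2, 1) only fix the sort order for the demo.
sample = "\n".join([
    placeholder_row("Delhi", 28.61, 77.21, "IN", 2),
    placeholder_row("Tokyo", 35.69, 139.69, "JP", 3),
    placeholder_row("Cairo", 30.04, 31.24, "EG", 1),
])

print(build_inputs(sample, top_n=2))
print(build_inputs(sample, top_n=2, with_country=False))
```

Neither variant adds any geographic instruction to the text; the only difference between them is whether the country token is present.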