City Atlas

The world's largest cities, placed by what an AI model reads in their name ("City, Country") — next to where they actually are. How much of the real world map does a name alone recover? Below, every embedding model side by side.

How this works — the picture, the scores & the setup

The picture. Each map flattens a model's name-embeddings to 2D. Map (PCA) is the honest "shadow" of the data — the same fixed math for every model, no tuning — rotated and scaled to line up with real longitude/latitude, so it's comparable to the reference map and across models. Clusters (UMAP) instead pulls each city's nearest neighbours into tight clumps: prettier local structure, but it's free to scramble the global map, so it's not a fair cross-model comparison — handy for seeing groupings, not for "is it a map?".
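As a rough illustration of the Map (PCA) view, the sketch below projects embeddings to 2D and then rotates and scales the result onto reference coordinates. It is not the pipeline's actual code: the function names are hypothetical, the alignment is an ordinary orthogonal Procrustes fit (which also permits a mirror flip), and longitude/latitude are treated as flat x/y, all of which are assumptions.

```python
import numpy as np

def pca_2d(X):
    """Project the rows of X onto their top two principal components."""
    Xc = X - X.mean(axis=0)
    # SVD of the centred data: rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:2].T

def align_to_reference(Y, ref):
    """Rotate (or mirror) and uniformly scale Y onto ref.

    Assumption: a standard orthogonal Procrustes fit. Returns the aligned
    points in ref's coordinates and a disparity in [0, 1] (0 = same shape).
    """
    ref_mean = ref.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Rc = ref - ref_mean
    # Best orthogonal map R = U V^T from the SVD of Yc^T Rc.
    U, S, Vt = np.linalg.svd(Yc.T @ Rc)
    R = U @ Vt
    scale = S.sum() / (Yc ** 2).sum()
    aligned = scale * Yc @ R + ref_mean
    # Residual sum of squares, normalised by the reference's spread.
    disparity = ((aligned - ref) ** 2).sum() / (Rc ** 2).sum()
    return aligned, disparity
```

Because the fit only uses a rotation, a possible flip and one global scale, it cannot rearrange cities relative to each other, which is why the same recipe can be applied to every model without tuning.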

The scores are computed on the raw embeddings, so they describe the model — not the drawing:

  • Same-continent neighbours (the headline): for a typical city, how many of its 10 nearest neighbours in name-space are on the same continent. Higher is better.
  • Distance agreement: does a bigger gap in name-space mean a longer real-world distance? 1 = always, 0 = never. It's higher within a continent than overall for every model, so names recover local geography best.
  • Continent separation: how sharply a model walls continents apart. This is not a quality score: strong separation breaks the layout into isolated blobs that look less like a real map.
  • Map shape (the good / fair / poor badge): how close a model's PCA map is to the real world, measured as a disparity after alignment (lower is better; 0 = identical shape).
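To make the first two scores concrete, here is a minimal sketch of how they could be computed from raw embeddings. The exact definitions used on this page are not stated, so several choices are assumptions: cosine distance in name-space, the mean fraction of same-continent neighbours as the "typical city" summary, great-circle (haversine) distance on the ground, and Spearman rank correlation as the agreement measure. All function names are hypothetical.

```python
import numpy as np

def cosine_distances(X):
    """Pairwise cosine distance between rows of X (assumed metric)."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    return 1.0 - Xn @ Xn.T

def same_continent_neighbours(X, continents, k=10):
    """Mean fraction of each city's k nearest neighbours on its continent.

    k should be smaller than the number of cities.
    """
    D = cosine_distances(X)
    np.fill_diagonal(D, np.inf)  # a city is not its own neighbour
    nn = np.argsort(D, axis=1)[:, :k]
    labels = np.asarray(continents)
    return float((labels[nn] == labels[:, None]).mean())

def haversine_km(lat, lon):
    """Pairwise great-circle distances in km (Earth radius 6371 km)."""
    lat, lon = np.radians(lat), np.radians(lon)
    dlat = lat[:, None] - lat[None, :]
    dlon = lon[:, None] - lon[None, :]
    a = (np.sin(dlat / 2) ** 2
         + np.cos(lat)[:, None] * np.cos(lat)[None, :] * np.sin(dlon / 2) ** 2)
    return 2 * 6371.0 * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))

def distance_agreement(X, lat, lon):
    """Spearman rank correlation between name-space and real distances.

    Uses each unordered city pair once; ties are not averaged (a
    simplification that is harmless when distances are distinct).
    """
    iu = np.triu_indices(len(X), k=1)
    d_emb = cosine_distances(X)[iu]
    d_geo = haversine_km(np.asarray(lat, float), np.asarray(lon, float))[iu]
    rank = lambda v: np.argsort(np.argsort(v))
    return float(np.corrcoef(rank(d_emb), rank(d_geo))[0, 1])
```

The within-continent variant of distance agreement would restrict the pairs to cities sharing a continent label before ranking; that filter is omitted here for brevity.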

Setup. Every model runs in its recommended clustering configuration with no geographic instruction, because telling it to "think about regions" would leak the very thing we're testing. The country token does much of the geographic anchoring; the bare-name variants (which drop the country) show how much. Data: GeoNames cities15000. Embedding and projection pipeline: scripts/city_atlas/.
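For readers who want to reproduce the inputs, the sketch below shows one way the two name variants could be built from a GeoNames dump row. The column positions follow the published GeoNames tab-separated layout (name, latitude, longitude, country code, population); everything else is an assumption for illustration, including the function names and the `country_names` lookup, which you would have to supply yourself (GeoNames rows carry a two-letter country code, not a country name). This is not the code in scripts/city_atlas/.

```python
from typing import Dict, NamedTuple

class City(NamedTuple):
    name: str
    country_code: str
    lat: float
    lon: float
    population: int

def parse_geonames_line(line: str) -> City:
    """Parse one tab-separated row of a GeoNames cities dump."""
    f = line.rstrip("\n").split("\t")
    return City(
        name=f[1],           # column 2: name
        country_code=f[8],   # column 9: ISO 3166 alpha-2 country code
        lat=float(f[4]),     # column 5: latitude
        lon=float(f[5]),     # column 6: longitude
        population=int(f[14]),  # column 15: population
    )

def model_inputs(city: City, country_names: Dict[str, str]) -> Dict[str, str]:
    """Build the "City, Country" string and the bare-name variant.

    country_names is a hypothetical code-to-name mapping supplied by the
    caller. No geographic instruction is added to either string.
    """
    country = country_names[city.country_code]
    return {
        "with_country": f"{city.name}, {country}",
        "bare_name": city.name,
    }
```

Embedding both strings for the same city and comparing the resulting scores is what isolates the contribution of the country token.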