An AI Solved a Geometry Problem with the Wrong Math

By Maxim Starkweather

On May 20, an OpenAI reasoning model produced a counterexample to the Erdős unit distance conjecture, a problem in discrete geometry that Paul Erdős posed in 1946 and whose lower bound no one had meaningfully improved since. The counterexample is genuine. Nine leading mathematicians, including Fields medalist Tim Gowers, co-signed a companion paper verifying it. Princeton number theorist Will Sawin extracted an explicit improvement exponent of δ = 0.014 from the AI’s proof. What the press has almost universally missed: the proof doesn’t use geometry.

The October Incident

Seven months earlier, Kevin Weil — then OpenAI’s VP of Product — posted that “GPT-5 found solutions to 10 (!) previously unsolved Erdős problems and made progress on 11 others.” Thomas Bloom, who maintains the authoritative Erdős Problems website, promptly explained what actually happened: “GPT-5 found references, which solved these problems, that I personally was unaware of.” The model had searched the existing literature and surfaced papers Bloom hadn’t encountered. It hadn’t proved anything new. Yann LeCun mocked the announcement publicly. Demis Hassabis called it embarrassing. Weil deleted the post.

That incident set a credibility bar for every subsequent OpenAI mathematics claim. The May 2026 result clears it — and the person who established the bar is now one of the nine mathematicians who verified the result. Thomas Bloom co-authored the companion paper confirming the unit distance proof. He’s not performing forgiveness. The proof is real, and he knows what distinguishes it from the October fiasco: this time the AI found something that wasn’t in the literature.

What the Conjecture Actually Was

The unit distance problem asks: for a set of n points placed anywhere in the plane, what is the maximum number of pairs that can be at distance exactly 1 from each other? Call this maximum u(n). Erdős proved in 1946 that u(n) = O(n^(3/2)); Spencer, Szemerédi, and Trotter sharpened this to u(n) = O(n^(4/3)) in 1984, so the count can’t exceed a constant multiple of n^(4/3). The lower bound Erdős established using a square grid construction was approximately n^(1 + c/log log n) for some constant c, which is barely superlinear: the exponent above 1 shrinks as n grows. His conjecture was that the grid was roughly optimal, that is, that u(n) is essentially n^(1+o(1)), linear up to subpolynomial factors.
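The grid construction can be checked by brute force on small cases. The sketch below is an illustration only and is not taken from the proof or the companion paper: it counts pairs of points in a k × k integer grid whose squared distance equals a chosen value d2 (shrinking the grid by a factor of sqrt(d2) turns those pairs into unit-distance pairs). The function name and its parameters are hypothetical, chosen for this example.

```python
from itertools import combinations


def count_pairs_at_squared_distance(k, d2):
    """Count unordered pairs in the k x k integer grid at squared distance d2.

    Brute force over all pairs of points, so only suitable for small k.
    """
    points = [(x, y) for x in range(k) for y in range(k)]
    return sum(
        1
        for (x1, y1), (x2, y2) in combinations(points, 2)
        if (x1 - x2) ** 2 + (y1 - y2) ** 2 == d2
    )


if __name__ == "__main__":
    k = 20  # n = k * k = 400 points
    for d2 in (1, 5, 25, 65):
        pairs = count_pairs_at_squared_distance(k, d2)
        print(f"d2={d2:>2}: {pairs} pairs among {k * k} points")
```

Squared distances that can be written as a sum of two squares in several ways (65 = 1 + 64 = 16 + 49) yield more pairs than d2 = 1 once the grid is large enough, which is the effect the grid construction exploits.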

[Image: The gap between OpenAI's claimed mathematical breakthroughs: the October 2025 false alarm and the May 2026 real proof]

For 80 years, no one produced a construction that beat the grid by a polynomial factor. The record stood not because no one tried, but because the problem resisted every approach that seemed natural. Discrete geometers tried generalizations of the grid, algebraic varieties, probabilistic arguments. The algebraic number theory direction had been visited and abandoned. Experts including Jacob Tsimerman, who co-authored the companion paper, had attempted counterexamples and failed to make progress.

The OpenAI model produced constructions achieving n^(1+δ) for some fixed δ > 0. Fixed. Not shrinking with n. A genuine polynomial improvement over the grid. Sawin’s subsequent paper makes δ explicit at 0.014, meaning sets of n points can contain more than n^1.014 unit-distance pairs for arbitrarily large n. Eighty years of geometry, and the answer came from class field towers.

Where the Proof Went

The proof doesn’t think in terms of points and distances. It thinks in terms of rings of integers in CM number fields — algebraic objects several layers of abstraction above anything Erdős was looking at in 1946. The construction works like this: build a lattice in high-dimensional space using the ring of integers in a carefully chosen CM field, then project that lattice down to the plane. Because of how the field is structured, many lattice vectors project to exactly unit length in ℝ². The count of unit-distance pairs then depends on how many elements of the ring have a particular norm — which is a question about how primes split in the field extension.
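The link between unit distances and norm counts is easiest to see in the simplest possible case, the Gaussian integers Z[i], where the norm of a + bi is a² + b². The sketch below covers only that classical setting, which corresponds to the square grid; it is not the CM-field construction from the AI's proof, and the function name is hypothetical.

```python
from math import isqrt


def norm_representations(m):
    """Count Gaussian integers a + bi whose norm a^2 + b^2 equals m.

    Each such element is a grid vector of length sqrt(m), so this is the
    number of directions in which a grid point can have a neighbour at
    that distance.
    """
    count = 0
    limit = isqrt(m)
    for a in range(-limit, limit + 1):
        rest = m - a * a
        b = isqrt(rest)
        if b * b == rest:
            # b and -b are distinct solutions unless b == 0
            count += 1 if b == 0 else 2
    return count


if __name__ == "__main__":
    for m in (3, 5, 25, 65, 1105):
        print(f"norm {m:>4}: {norm_representations(m)} elements")
```

A prime that is 1 mod 4 splits in Z[i] and contributes representations, while a prime that is 3 mod 4 stays inert and contributes none on its own. That is why 1105 = 5 · 13 · 17 has 32 elements of that norm and 3 has none, and it is the small-scale version of the statement that the count depends on how primes split in the field.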

The tool for guaranteeing enough of these small-norm elements is Golod-Shafarevich theory, a result from algebraic number theory that establishes the existence of infinite class field towers over certain base fields. The Golod-Shafarevich theorem dates to 1964. It was proved to answer questions about class groups and field extensions, not to solve combinatorics problems. The AI found it applicable here. The companion paper by Alon, Bloom, Gowers, and six others notes that the argument “relies crucially on ideas that may, at least in retrospect, be attributed to Ellenberg-Venkatesh, Golod-Shafarevich, and Hajir-Maire-Ramakrishna” — three separate research threads from algebra and number theory, combined in a way no one had attempted for a geometry problem.

Arul Shankar, a Princeton number theorist and co-author, said the work demonstrates that “AI models are capable of having original ingenious ideas.” Tsimerman was direct: “I actually briefly worked on this problem and tried to make a counterexample, but failed to make progress.” Thomas Bloom noted he had listed the unit distance problem in a recent “Top 10” problems survey and had not expected a solution within a month. The experts had looked at the algebraic number theory direction and turned back. The AI didn’t.

[Image: The AI's 125-page chain-of-thought proof, requiring 9 mathematicians to translate into a readable companion paper]

The 125-Page Translation Problem

The AI’s original proof is 125 pages of chain-of-thought. It is mathematically valid. It is also essentially unreadable in its original form. Nine leading mathematicians — Noga Alon, Thomas Bloom, Tim Gowers, Daniel Litt, Will Sawin, Arul Shankar, Jacob Tsimerman, Victor Wang, and Melanie Matchett Wood — produced a “short, digested, human-verified version” of it. The companion paper exists because the AI’s output, while correct, was not the kind of argument a human mathematician would write for other human mathematicians. The community needed a translation.

This is worth taking seriously as a data point about what AI mathematical reasoning currently produces. The AI found the right bridge between fields. It followed the logic to a valid conclusion. But the path it took was not structured around exposition, motivation, or the kind of conceptual economy that makes mathematical writing useful. Victor Wang, who is both a co-author of the companion paper and the mathematician who suggested simplifications to the original AI proof, did crucial work making the argument clean. The AI generated the existence proof; humans produced the paper.

There is also the matter of scale. The original OpenAI result proved n^(1+δ) for some δ > 0 — the actual value of δ was astronomically small, around 10^-38 in simplified versions. The proof established that a polynomial improvement over the grid exists, but the improvement was so tiny it would be irrelevant for any finite n a human would ever care about. Sawin’s explicit refinement — getting δ up to 0.014 — was a separate mathematical contribution. The AI found the connection; Sawin built the road.
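To put the two values of δ side by side, the sketch below computes n^δ − 1, the fractional gain of n^(1+δ) over a purely linear count n. Comparing against n rather than against the grid's n^(1 + c/log log n) is a simplification made here, since the constant c is not given; the function name is hypothetical.

```python
import math


def relative_gain(n, delta):
    """Return n**delta - 1, the fractional gain of n^(1+delta) over n.

    expm1 keeps precision when the gain is far below floating-point
    epsilon, where n**delta - 1 would simply round to 0.0.
    """
    return math.expm1(delta * math.log(n))


if __name__ == "__main__":
    for n in (10**6, 10**12, 10**100):
        tiny = relative_gain(n, 1e-38)
        explicit = relative_gain(n, 0.014)
        digits = len(str(n)) - 1
        print(f"n=10^{digits}: delta=1e-38 gain {tiny:.3e}, "
              f"delta=0.014 gain {explicit:.3e}")
```

Under this simplification, a δ near 10^-38 leaves the count indistinguishable from linear even at n = 10^100, while δ = 0.014 already gives a gain of roughly a fifth at a million points.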

What Gowers Said and What He Didn’t

Gowers’s statement from the companion paper has been quoted extensively: “if a human had written the paper and submitted it to the Annals of Mathematics, I would have recommended acceptance without any hesitation. No previous AI-generated proof has come close to that.” The press is reading this as a blanket endorsement of AI mathematics. Gowers is saying something more specific.

He also wrote that the AI “is good at finding surprising connections, and it can afford to try quite hard to prove statements that seem unlikely to be true.” The second half of that sentence is the interesting part. Human mathematicians develop strong intuitions about which directions are worth pursuing. Those intuitions are mostly right. They also filter out some directions that would have worked, because the filtering is imperfect. The AI doesn’t have that filter: it doesn’t share the expert consensus that the algebraic number theory approach was unlikely to yield a polynomial improvement. That is not a cognitive virtue in itself; it is simply a missing prior, and it happened to be useful here.

The open questions are real. Journal peer review of the full result is still in progress as of publication. Generalization is unresolved: OpenAI has not released a success rate on other open problems, and one well-verified result is not a track record. The true growth rate of u(n) also remains open: the model disproved the conjecture that the grid was optimal, but didn’t determine what the maximum actually is, which now lies somewhere between n^1.014 and the n^(4/3) upper bound. Thomas Bloom believes the construction may open new approaches to other discrete geometry problems previously thought disconnected from number theory, but that work lies ahead.

What happened on May 20 is not that AI solved mathematics. It is more specific and more interesting than that: a general-purpose reasoning model, not trained for this problem and not scaffolded to search for proofs, was willing to explore a direction that experts had classified as unlikely, produced a valid argument using tools from algebraic number theory that no geometer had combined this way before, and generated an output that required nine leading mathematicians to translate into something the community could read. TechCrunch called it “for real this time” — and they’re right. The question worth asking now is whether this was a single anomalously good result, or the first data point in a new kind of mathematical collaboration. The field has one counterexample to that conjecture, too.


AI-generated editorial illustration · TemperatureZero · May 25, 2026
