Roderick Fanou

The AI industry this week offered a data point that seemed impossible six months ago: a general-purpose language model cracked an unsolved conjecture in pure mathematics. The Erdős unit distance problem, posed in 1946 and untouched for eight decades, fell to an OpenAI reasoning model with no mathematics-specific training. Meanwhile, Google turned its developer conference into an agentic product announcement, OpenAI quietly built out a self-serve advertising platform, and Anthropic disclosed a model so capable of cyberattacks it declined to release it publicly. The field is accelerating. The restraint is selective.

A Theorem 80 Years in the Making

On May 20, OpenAI announced that a general-purpose reasoning model had disproved a central conjecture in discrete geometry: the unit distance problem Paul Erdős first posed in 1946.^[1] Mathematicians had long believed square grid constructions were optimal for maximizing unit-distance pairs among n points in a plane. The model found an infinite family of counterexamples by drawing on algebraic number theory, specifically infinite class field towers and the Golod-Shafarevich theorem from the 1960s. The connection between that theorem and this geometry problem had never been made before. Three external mathematicians independently verified the proof.^[2] The result came from a general reasoner, not a system tuned for mathematics or targeted at this specific problem. That is the point.
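To make the quantity in question concrete, here is a minimal Python sketch that counts pairs of points at one fixed distance in a small square grid. It only illustrates the counting problem and is not the model's counterexample construction. The function names `grid_points` and `count_pairs_at_squared_distance` are hypothetical, and comparing squared integer distances stands in for rescaling the grid so that the chosen distance becomes the unit.

```python
from itertools import combinations


def grid_points(k):
    """Return the k*k integer points (x, y) with 0 <= x, y < k."""
    return [(x, y) for x in range(k) for y in range(k)]


def count_pairs_at_squared_distance(points, d2):
    """Count unordered pairs of points whose squared distance equals d2.

    Working with squared integer distances keeps the comparison exact;
    picking d2 is equivalent to rescaling the grid so sqrt(d2) is the unit.
    """
    return sum(
        1
        for (x1, y1), (x2, y2) in combinations(points, 2)
        if (x1 - x2) ** 2 + (y1 - y2) ** 2 == d2
    )


points = grid_points(5)  # 25 points; a toy size chosen for illustration
for d2 in (1, 2, 5):
    print(d2, count_pairs_at_squared_distance(points, d2))
```

On this 5-by-5 grid, more pairs sit at squared distance 5 than at squared distance 1, which hints at why the choice of grid scale matters in the problem.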

Google's Agentic Pivot at I/O

At Google I/O 2026, the company released Gemini 3.5 Flash as the global default model for the Gemini app and AI Mode in Search.^[3] Flash gains improved safety characteristics: fewer harmful outputs and fewer false refusals. Google also announced Gemini Spark, a general-purpose agent that reasons across connected apps and takes actions on behalf of users.^[4] Spark is in beta, available first to trusted testers and Google AI Ultra subscribers. Google added Omni, a world model that predicts physical environments from user actions. Gemini can now generate Docs, Sheets, Slides, and PDFs directly from prompts, posing a direct challenge to the productivity software layer Microsoft has long owned. The breadth of the product surface suggests Google is betting on the assistant layer, not just the model.

ChatGPT Becomes an Ad Platform

In May, OpenAI eliminated the $50,000 minimum spend requirement for its self-serve advertising platform and expanded the ads pilot to the UK, Mexico, Brazil, Japan, and South Korea.^[5]^[6] Ads appear in labeled, tinted boxes below AI responses; OpenAI says they cannot influence the model's answers. The company targets $2.5 billion in ad revenue this year and $100 billion by 2030. ChatGPT launched ads for U.S. free users in February, less than four months ago. The move from "no ads, ever" to an international self-serve platform took approximately 18 months, amid public pressure and a need for revenue diversification. Whether answers remain uninfluenced as advertiser revenue scales is a structural question, not a policy one.

The Model Anthropic Won't Ship

Anthropic's Claude Mythos Preview identified more than 23,000 potential vulnerabilities in open-source projects, with 1,726 confirmed by external security firms, including over 1,000 rated high or critical severity.^[7] In controlled tests, Mythos developed 181 working exploits against Firefox's JavaScript engine; Claude Opus 4.6 produced two.^[8] Anthropic declined to release Mythos commercially. Instead, the company launched Project Glasswing, deploying Mythos to proactively patch the vulnerabilities it finds.^[9] The UK AI Safety Institute published an independent evaluation of Mythos's cyber capabilities, the first of its kind for a frontier model.^[10] Anthropic is simultaneously in talks to raise $30 billion at a reported $900 billion valuation,^[11] which would surpass OpenAI's $852 billion post-money figure from earlier this year. The combination of the most capable offensive-security model and the industry's highest valuation is not a coincidence.

Claude Goes to Work: Enterprise Deals, Coding Events, and New Features

Anthropic held Code with Claude in East London this week, its first developer-focused event in Europe, drawing mainstream coverage from Fortune, MIT Technology Review, and Time.^[15] The event was oversubscribed and the company's stated position was direct: Claude now performs at roughly the level of a midlevel software engineer for writing code, though senior engineers remain necessary for system design and harder debugging.^[16] On the partnership side, KPMG announced a global alliance to embed Claude inside Digital Gateway, its core business software platform used by 276,000 employees, starting with tools for tax and legal clients.^[17] SAP followed by announcing plans to make Claude a primary reasoning capability across its AI-enabled solution portfolio and the newly launched SAP Business AI Platform, extending Claude's reach into enterprise resource planning at global scale.^[18] On the product side, Anthropic announced a "dreaming" capability in which Claude Code agents write notes to themselves during tasks, with a consolidation system that synthesizes those notes across tasks to build persistent knowledge of a codebase. The company also shipped sandboxes, letting companies run Claude agents on their own infrastructure, and MCP tunnels, which allow agents to reach internal systems without touching the public internet.

Coding Agents and What AI Actually Costs

xAI launched Grok Build, a coding agent positioned against Anthropic's Claude Code and GitHub Copilot CLI.^[12] The beta is limited to SuperGrok Heavy subscribers at $300 per month. Grok 4.3, released May 4, added native video input and a 1-million-token context window at $1.25 per million input tokens.^[13] Simultaneously, Microsoft reportedly began canceling Claude Code licenses for engineers, six months after first enabling access, redirecting them toward Copilot CLI.^[14] A Fortune investigation found that deploying AI agents can cost more than paying human employees to do equivalent work. The industry does not yet have a public answer to this problem. Scaling laws work in both directions.
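The quoted input price makes for simple arithmetic. The rough sketch below uses only the $1.25 per million input tokens figure above. The function name and the 200-calls-per-day workload are hypothetical, and output-token pricing is left out because it is not stated here.

```python
# Input price quoted above for Grok 4.3, in USD per million input tokens.
PRICE_PER_MILLION_INPUT_TOKENS_USD = 1.25


def input_cost_usd(input_tokens, price_per_million=PRICE_PER_MILLION_INPUT_TOKENS_USD):
    """Cost in USD of sending `input_tokens` input tokens at a flat per-million price."""
    return input_tokens / 1_000_000 * price_per_million


# One call that fills the full 1-million-token context window.
full_context_call = input_cost_usd(1_000_000)

# Hypothetical workload: an agent that makes 200 such calls per day.
calls_per_day = 200
daily_cost = calls_per_day * full_context_call

print(f"one full-context call: ${full_context_call:.2f}")
print(f"{calls_per_day} calls per day: ${daily_cost:.2f}")
```

Under these assumptions, input tokens alone cost $250 a day, before any output tokens or subscription fees are counted.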

The Erdős proof and the Mythos evaluation represent opposite ends of the same trend: AI systems that produce outputs their builders did not fully anticipate. One resolved a problem mathematicians had given up on; the other found vulnerabilities faster than any human security team. Both results exceeded expectation, and neither company had a fully worked-out plan for what to do next. The industry needs a better answer to what comes next than a classified distribution program and a fundraising round.

Disclaimer: All information in this post was gathered through research from publicly available web sources. While every effort has been made to verify accuracy and link primary sources, readers are encouraged to check the references below before drawing conclusions.

References
