ALTITUDE

AI Signal – May 2026

The map matters more than the model. AI pricing is shifting at the edges, product surfaces keep sprawling, and the case for access tiering got sharper. The small operator's edge is not the model, it is the map.

24 May 202626 min readAIAI SignalPricing ShiftNavigationSmall PracticeSolopreneur2026
The model is good enough. The map matters more.
The model is good enough. The map matters more.

In late May, AI pricing started to shift at the edges. GitHub Copilot estimator screenshots circulated showing the new usage-based bill against the old flat-fee one. A monthly cost of $451 became $11,432. A $39 bill became $5,851. A $54 bill became $1,200. Twenty-five times, one hundred and fifty times, twenty-two times the previous rate, for work the same user was already doing.

The numbers are eye-catching. The honest reading is more nuanced. The shift hit the edges first: API users, out-of-harness consumption, enterprise customers. The in-harness subscription tier most small operators rely on (Claude Code, Codex CLI, Cursor) has not seen the same change yet, and a Claude Code Max subscriber today is still getting roughly the same value for money as six months ago. The direction of travel is set; the speed of travel depends on which side of the harness boundary a user sits. Four things actually mattered in May:

  • AI pricing is shifting at the edges. Anthropic enterprise pricing, OpenAI's new Guaranteed Capacity commitments, GitHub Copilot, and Google's Ultra plan all moved towards usage-based billing through April and May. The change is real for API and enterprise users; in-harness subscription users have not seen the same shift yet. The direction of travel is set, and prudent planning means anticipating token budgets that move.
  • Product surfaces sprawled out of any single map. Google's I/O confirmed the picture: a dozen new product names, an agent for your digital life with no release date, model tiers that nobody outside the company can keep straight. Navigation is now the unmet need.
  • Capability accelerated past where the labour market is priced. OpenAI's general-purpose model disproved an 80-year-old open geometry conjecture on a simple prompt. Ten months ago this would have been a major news event; today it already feels routine.
  • The case for access tiering got sharper. A clear-eyed essay by Antoine Licht set out the structural follow-on: as compute stays zero-sum and security tightens, the small operator's likely future may be mediated access through product layers rather than clean APIs. The argument is structural, not yet a fait accompli.

That's the digest. The rest is the unpacking.

At a Glance

Key takeaway: the map matters more than the model. The model is good enough. The wrapper is mostly sold pre-built. What is left for a small operator to build is the map: which tools, for what, at what cost, with which fallback, and inside which privacy boundary. The five signals below all point at this from different angles.

May 2026 – five signals across cost, sprawl, capability, access, and the role that holds the map

PRICING IS SHIFTING AT THE EDGES

Usage-based billing arriving for API and enterprise; in-harness subscriptions still subsidised

  • Anthropic enterprise pricing moved to $20 per seat plus usage (April); Claude Code subscription tier mostly unchanged
  • GitHub Copilot switched to usage-based billing on 1 June: top-model rates 25 to 150 times the previous flat-fee equivalent on circulating estimator screenshots
  • OpenAI launched Guaranteed Capacity, a one-to-three-year compute commit deal that looks more like cloud than SaaS
  • Google's Ultra plan dropped headline price from $250 to $200 but added usage-based billing inside the agentic surfaces (Anti Gravity)
  • Prudent planning: anticipate token budgets that move, even if your subscription still feels flat-fee today

PRODUCT SPRAWL IS REAL

Google I/O confirmed it; navigation is the unmet need

  • Twelve new product names in one keynote: Anti Gravity 2.0, Spark, Omni (Nano Banana for video), Gemini 3.5 Flash, Gemini Business, AI Pro, AI Ultra and more
  • Gemini app reached 900 million monthly active users; tokens processed jumped from 480 trillion to 3.2 quadrillion per month year-on-year
  • Most small operators will encounter agents first through Google Search, where a new persistent-query axis (your standing brief, kept fresh while you sleep) lands as a default mode
  • The honest reading: most readers cannot keep this straight, and that is not a personal failing

CAPABILITY ACCELERATED

An open conjecture fell on a simple prompt

  • OpenAI's general-purpose model disproved an 80-year-old Erdős geometry conjecture without specialised training
  • Fields medalist Tim Gowers: 'the first really clear example of AI solving not just an unsolved math problem, but a really well known unsolved math problem'
  • Resource cost (Ethan Mollick estimate): less than three almonds' worth of water; electricity equivalent of two to twenty miles of EV driving
  • OpenAI's Alexander Way: 'math is a leading indicator of what is to come'

THE CASE FOR ACCESS TIERING GOT SHARPER

Pricing shift may be the leading edge of something larger

  • Antoine Licht's 'Cut Off' essay sets out three compounding constraints: security restrictions (Mythos as precedent), compute as zero-sum, US Government strategic interest
  • The new equilibrium: frontier models first to US national security, then to trusted defenders and US firms, then to KYC-cleared customers, then everyone else
  • Licht names the small-operator audience explicitly: 'enthusiastic consumers, scrappy startups and nervous governments all over the world, might never get clean API access, but draw their access through fundamentally limited product layers. Maybe the chatbot and coding agent interfaces of today'
  • Practical implication: do not bet your stack on any single provider's mediated product layer remaining unchanged

NAVIGATION IS THE UNMET NEED

What to do about all of it

  • Build a small, durable, multi-tool harness; route work to the right tool, not the favourite tool
  • Use the new cost-efficient middle (Cursor Composer 2.5: comparable to Opus 4.7 and GPT-5.5 at ten to sixty times cheaper)
  • Maintain a single page that names how your business actually uses AI; treat it as a living asset
  • If you do not know who holds the map for your business, you are the navigator by default

Model Releases

May was not a frontier-model month. The releases that matter for a small practice are the ones that shift cost, surface, or what you can route work through. Listed in that order.

ANTHROPIC

May 21 (week)

First profitable quarter + Karpathy joins

Anthropic reported its first profitable quarter (Q2 projection: $10.9bn revenue, $44bn ARR, $559m operating profit), the first ever for any frontier lab. Karpathy joined the pre-training research team. SpaceX compute partnership scaled to Colossus 2 with a $45bn three-year commitment. The bubble narrative that the labs would never serve agentic workloads profitably lost ground in a single week. Practical effect for a small operator: stop pricing AI access as if next year's flat-rate generosity is coming back.

GOOGLE

May 20 (keynote)

I/O 2026 product wave

Twelve new product names. The ones to know about: Gemini 3.5 Flash (faster, but materially more expensive than 2.0 Flash); Anti Gravity 2.0 (agent system pulled out of the IDE and sold as the product); Spark (no release date); Omni (Nano Banana for video). The Gemini app sits at 900 million monthly active users. Search added a persistent-query axis: agents inside Search that monitor a standing brief and report back. Most small operators will encounter their first AI agents this way.

OPENAI

May (rolling)

Codex multi-surface + Guaranteed Capacity

Codex is now a multi-surface harness: CLI, IDE extension, Mac and Windows desktop, mobile via the ChatGPT app, side panel. Local computer use, browser use, and connectors (Slack, Gmail, GitHub, Notion, Vercel) all operate against the operator's own machine. Cloud-sandbox runs remain one mode among several. Guaranteed Capacity, OpenAI's new one-to-three-year commit deal, formalises the shift away from flat-fee AI for large customers.

CURSOR

May 19 (announcement)

Composer 2.5

A new in-house coding model from Cursor that benchmarks at or near Opus 4.7 and GPT-5.5 on coding tasks, at ten to sixty times lower cost on medium settings. This is the cost-efficient middle finding its place. For a small practice that does any coding-adjacent work (websites, automation scripts, data wrangling), routing routine work through Composer 2.5 and reserving the frontier models for the hard problems is now a defensible cost discipline.

ANTHROPIC

May (mid-month)

Claude Code /usage

A small but useful addition: type /usage inside Claude Code to see a breakdown of where your tokens are going by skill, agent, MCP, or plugin. The practical point is not the command itself; it is that the platforms have started shipping visibility into your own cost surface. Run it once this week; the result is usually surprising.

The pattern this month is convergence at the top and divergence at the price point. The model layer keeps consolidating; the wrapper layer keeps sprawling; the bill keeps moving. Small operators win by routing, not by loyalty.


The Landscape: what shipped

The most important release of the month was not a model. It was the realisation that frontier-lab economics had shifted underneath the conversation. Anthropic reported its first profitable quarter in late May (Q2 projection: $10.9 billion in revenue, $44 billion annualised, $559 million in operating profit). It is the first time any frontier lab has posted a profitable quarter. There are caveats worth carrying: it is a projection, not a closed quarter; revenue accounting differs from OpenAI's; the result is partly profitable because the lab is supply-constrained and rationing access. None of that changes the headline. The bubble narrative that the labs would never serve agentic workloads at scale profitably lost ground in a single week. The compute-as-kingmaker thesis moved from analyst commentary to literal balance-sheet line items: Anthropic and SpaceX deepened their partnership to a $45 billion, three-year compute deal, scaling onto Colossus 2 alongside Colossus 1.

The same week, Andrej Karpathy joined Anthropic's pre-training research team, with the explicit remit of using Claude to accelerate pre-training itself. Karpathy's own earlier framing was that anyone outside the labs drifts from the frontier within months; he has now moved back inside. Whether that compounds into the recursive-self-improvement story the labs hint at, or into something more modest, is genuinely uncertain. The signal-watch language is: keep an eye on it; do not bet the stack on it.

Google's I/O keynote dropped a dozen new product names into a single hour. Anti Gravity 2.0 (the agent system pulled out of the IDE and sold as the product), Spark (no release date), Omni (a Nano Banana moment for video, editing-first), Gemini 3.5 Flash (faster but materially more expensive than 2.0 Flash). The Gemini app reached 900 million monthly active users, having effectively closed the gap with ChatGPT. Tokens processed across Google's surfaces jumped from 480 trillion per month last May to 3.2 quadrillion per month this May, a sevenfold increase. The single most consequential change for the small-operator reader was buried inside the keynote: Google Search added a persistent-query axis. Agents inside Search can now hold a standing brief, monitor for new matches, and report back. The apartment search that used to be a one-off lookup becomes an open standing instruction. Most readers will encounter their first usable AI agent this way, inside a surface they already use, not by going looking for one.

The cost-efficient middle is now competitive. Cursor's Composer 2.5 benchmarks at or near Opus 4.7 and GPT-5.5 on coding tasks at ten to sixty times lower cost on medium settings. For any small practice doing coding-adjacent work (websites, automation scripts, data wrangling), routing the routine work through Composer 2.5 and reserving the frontier models for the hard problems is a defensible cost discipline.

The release that is worth holding lightly but watching is OpenAI's Erdős-problem result. An internal OpenAI model disproved an 80-year-old open conjecture in geometry on a simple prompt, with no specialised training. Fields medalist Tim Gowers, who wrote the companion paper, called it the first really clear example of AI solving not just an unsolved maths problem but a really well known unsolved maths problem. Ethan Mollick estimated the resource cost at less than three almonds' worth of water and the electricity equivalent of two to twenty miles of EV driving. OpenAI's Alexander Way wrote: "ten months ago I was ecstatic that AI could win international Math Olympiad gold. Today that excitement feels quaint." The signal is not the maths. It is that capability is now landing on tasks the labour market has not yet priced.

The Foundation: what is holding

Two foundation-level points carry from May. They are connected.

The first is the pricing shift visible at the edges through April and May. The moves were not synchronised, but the direction is convergent. Anthropic enterprise pricing went from a flat $200 per seat to $20 plus usage in April; the harness boundary now meters anything outside Claude Code and Claude Cowork, while in-harness subscription usage stays subsidised. GitHub Copilot moved to usage-based on 1 June (the Copilot estimator screenshots circulating in late May show new bills at twenty-five, one hundred and fifty, and twenty-two times the prior flat rate for the same workload). OpenAI launched Guaranteed Capacity, a one-to-three-year compute commit deal that looks more like a cloud agreement than a SaaS subscription. Google's Ultra plan dropped headline price from $250 to $200 but added usage-based billing inside the agentic surfaces. None of this means the flat-fee subscription has ended for most small operators today; it does mean the flat-fee assumption is no longer a safe baseline for planning the next twelve months. Anticipate token budgets that move; build a stack that can absorb a single provider's pricing change without breaking your work.

The second, less-discussed point is what the subsidy end is the leading edge of. Antoine Licht's essay "Cut Off", on the Threading the Needle blog, sets out the structural follow-on with the clearest language anyone has used on it so far. Three constraints are compounding: security restrictions (Anthropic's Mythos was the first signal that some capabilities will not be offered to every paying customer, and the US Government is moving towards more formal pre-release authority); compute as a genuinely zero-sum game (the marginal cost of providing access to another user of a frontier model is high enough that "Mythos 2 will not be cheaper than Mythos"); and US Government strategic interest in who gets access to American-built tokens of intelligence. Together these point at an equilibrium where the frontier model goes first to US national security, then to trusted defenders and US firms, then to KYC-cleared customers, and only later to everyone else, increasingly through pre-shaped product layers rather than clean APIs.

Licht names the audience this affects most directly with unusual precision: "enthusiastic consumers, scrappy startups and nervous governments all over the world might never get clean API access, but draw their access through fundamentally limited product layers. Maybe the chatbot and coding agent interfaces of today." That is the Pandion reader, named.

The practical Foundation-level move that holds these two points together is multi-tool routing. Chamath Palihapitiya's line on the consultancy-lab partnerships caught the shape of it: "controlling the tokens is controlling the spice." For a Fortune 500 the conclusion is to negotiate compute commits directly. For a small operator the conclusion is more modest and more useful: do not depend on a single provider's subsidy floor, and do not depend on a single provider's mediated product layer remaining unchanged. A working setup that runs two or three tools side by side, routes each task to the right one, and treats tool diversification as resilience is now the prudent default, not an enthusiast preference. The same lab-consulting pivot that has Anthropic certifying 30,000 PwC professionals in Claude is evidence that even the labs have accepted diffusion (not raw capability) as the binding constraint. Small operators doing what Accenture is being paid to teach Fortune 500 are on the right side of that curve.

A brief safety-adjacent note. Anthropic shipped a /usage command inside Claude Code in May that breaks down token consumption by skill, agent, MCP, or plugin. The practical point is not the command; it is that the platforms have started shipping visibility into your own cost surface. Run it once this week; the result is usually surprising.

The Practice: how to work

The most useful practitioner write-up of the month came from the Codex team itself: Jason Liu's "Codex Maxing" article, which set out nine working tips and bound them under a single integrating frame, don't break the loop. The vocabulary Liu reaches for (mono-thread, files-not-chat memory, voice as a way to brief richer context, side panel for parallel review, harness as a work system rather than a chat replacement) is the vocabulary Pandion has been using on the /ai/practice page for several months. The patterns are not new; what is new is that the practitioners shipping the harnesses are now using the same words. The AgentOS layers framework is no longer Pandion-specific vocabulary.

The practical translation for a small operator is simpler than the literature suggests. Three habits do most of the work.

Stay on one thread per workstream. Treat the AI conversation as a long, evolving brief that compounds over weeks, not as a series of fresh chats. The good capture, the saved instruction, the standing pattern: all of it lives on the thread or in files the thread reads. The cost of a thread that gets long is small; the cost of restarting context every session is large. The Codex team's framing for this is "don't break the loop", which is exactly right.

Keep memory in files, not in chat. Anything you would need a future AI session to know goes into a markdown file you can paste from. A one-page brief about your business. A list of preferred tools and when you use which. A house style note. A short library of templates. The harness reads the files; the files survive the model. This is the part of AI fluency that compounds, and the part that travels with you when the tools change.

Brief like you are delegating, not pair-programming. April's Opus 4.7 prompt-tightening lesson still applies. Lead with the goal. State the constraints. Define what "done" looks like explicitly. Tell the model what to verify before returning. Then let it run. Most disappointing AI output is a context failure, not a capability failure.

These three habits together are the small-operator version of what the harness-engineering literature is now naming. The wiring is sold pre-built. The discipline of writing your work down so the wiring can act on it is yours.

The Application: where it lands

Where the human premium holds

The mainstream labour-market discourse caught up with something the practitioner conversation has been circling for two years. Alex Imas's essay What Will Be Scarce and Ezra Klein's response in the New York Times both named the same point: where the provenance of human creation or human service is part of the economic value, the relational sector grows in proportion to savings elsewhere. AI eats tasks; it does not automatically eat demand for human involvement.

For the Pandion reader this is not new advice, but the external articulation matters. It gives the small practice a defensible answer to the AI-displacement worry: the work most worth doing is the work where relationship, trust, accountability, translation, behaviour change, or provenance are part of the service. Continuous support that previously could not be afforded becomes viable. Personalised guidance that previously had to be templated can be made specific. Human escalation becomes a more important role, not a less important one. The same labour-saving that compresses some tasks expands the affordable surface of others. Most small practices already operate in this zone by default; the May discourse just gave it a name.

The role families emerging

The other May thread worth carrying is the one about what new role shapes are appearing as agents handle more of the routine, mid-skilled, repeatable work. NLW spent several AIDB episodes through April and May testing role-family language: Agent Manager, Domain Operator, Context Engineer, Harness Engineer, Outcome Owner, AI Coach / Player-Coach, Navigator, Escalation Specialist. None of it is settled vocabulary. All of it is being used in the wild.

The Sequoia conversation with Jack Dorsey landed a simpler version: a small AI-native company has three roles, the builder/operator (a multi-functional individual contributor running the agents and the tools), the outcome owner (the human who carries accountability for what an agent or set of agents produces), and the player-coach (the senior who both does the work and teaches others how to do it; less management, more apprentice-style transfer). This maps cleanly onto the Pandion Capability framework's HR-equivalent layer, where the same shapes live as part of the team-design treatment on /capability/capability-talent.

The honest read is that nobody has these skills yet. Everyone is learning as they go. Universities are not yet shaped for it. Gen Z is entering a different graduate market than even five years ago. The small-business edge in this transition is agility: less hierarchy to unwind, faster experimentation cycles, the principal as the first navigator.

This thread is bigger than a single month's signal. A standalone Altitude piece in June, What Comes After the Org Chart, will take it on properly. For now, the line worth holding: the new question is not which jobs AI takes. It is which human roles become newly important when more work is routed through agents.

Framework Check

The four-tier framework (Landscape, Foundation, Practice, Application) held in May without strain. The four cross-cutting lenses we have been using since the March redesign (Scarcity, Staging, Interaction, Human Premium) are still the right ones. Scarcity sharpened materially this month with the pricing convergence and the access-tiering frame. Human Premium was named by external voices we can cite. Staging and Interaction continued to mature in the practitioner literature without requiring a structural change.

Navigation is not a fifth tier. It is a cross-cutting capability that lives across all four. The Landscape gives you what to navigate. The Foundation gives you what the cost and access constraints are. The Practice gives you the habits. The Application gives you where it lands. The map is the small operator's edge.

What to do this week

Three small things to do this week

  1. 1Run /usage on the AI tools you use most. The Anthropic /usage command inside Claude Code, equivalent surfaces on OpenAI and Cursor, your provider's billing dashboard, anything that shows where your tokens actually go. The result is almost always surprising; it gives you the routing brief for the next month, before the next pricing surprise.
  2. 2Test one tool you do not currently use against one you do. Same task, side by side. Composer 2.5 against Opus 4.7 on a routine refactor. Codex against Claude Code on a small website tweak. Gemini Search agentic mode against your usual lookup pattern for a standing query. Routing matters more than loyalty; you cannot route well without recent data.
  3. 3Write down what you would want a navigator to know about your business in one page. Who you serve. How you write. What you do. What you would never ask AI to do. Which tools are in the stack and why. Most readers find they want this page even when they did not think they did. If you do not have someone external to hand it to, you have just briefed the future version of yourself.

The hardest sentence to land in a month like this is the closing one. Google's Demis Hassabis tried for it at the close of his I/O keynote, and the framing is worth borrowing rather than competing with: "When we look back at this time, I think we will realise that we were standing in the foothills of the Singularity. This technology will be a force multiplier for human ingenuity and usher in a new golden age of scientific discovery and progress." As the AI commentator NLW put it the same week: it's a great message; it's our jobs to make sure it's true.

For a small operator, the way you make it true is more modest than a Singularity speech allows for. It is the routing habit. It is the saved-context file. It is the small, durable harness. It is knowing who holds the map for your business and naming them. The next twelve months will be more uneven than the past three. Some surfaces will get more priced; some will stay subsidised. Product sprawl will accelerate. Access will continue to tier where security and capacity drive it. The work that compounds is not picking the right model. It is keeping the map current, whichever way any single provider moves.

The map matters more than the model. June's AI Signal will track the early experience of working through the new pricing reality. A standalone Altitude piece, What Comes After the Org Chart, will take on the role-families thread properly. Both are coming.


AI Signal is published monthly by Pandion Studio for anyone using AI as a core operating tool: solopreneurs, micro-organisations, small landscape and professional practices, and individuals using AI to organise their own life and admin. We read the AI firehose so you don't have to.

If you want help building the map for your own business, that's what AI Sessions are for.

FAQs

Should I expect AI subscriptions to get more expensive?

Probably over time, but not yet for most subscription users. The pattern is starting at the edges. GitHub Copilot moved to usage-based on 1 June. Anthropic shifted enterprise pricing in April from a flat $200 per seat to $20 plus usage. OpenAI launched Guaranteed Capacity, a one-to-three-year compute commit deal that looks more like a cloud agreement than a SaaS subscription. Google's Ultra plan dropped its headline price but added usage-based billing for the agentic surfaces. Inside the platforms' own harnesses (Claude Code, Codex CLI, Cursor) subscription users have not seen the same shift yet, and your existing flat-fee plan is probably still working the way it did six months ago. The direction of travel is clear though. The prudent default is to anticipate token budgets that move rather than assume next year's costs will look like this year's. The fix is not to switch to a cheaper provider every time a price moves. It is to know which of your tasks actually need the top model, route everything else to a cheaper one, and build a stack that can absorb a single provider's pricing changes without breaking your work.

Should I switch tools every time prices change?

No, and that is the wrong frame. The pattern that works is multi-tool routing as a habit, not vendor loyalty followed by panic switching. The practitioner literature this month converged on a simple instruction: build a small, durable harness that runs two or three tools side by side, route each task to the right one, and treat tool diversification as a form of resilience. Claude down, lean on Codex. Codex down, lean on Claude. Pricing change on one, route more work through the other while you decide. The point is not to chase the cheapest option every week. It is to remove the assumption that any single tool will be the right home for all your AI work for the next year.

Do I need to track every new AI release to keep up?

No. That is a full-time job that almost nobody actually does well, and the people doing it are paid to. What a small operator needs is the opposite: a thin, durable filter that surfaces what changes your tools, your cost, your reliability, or your routine, and ignores the rest. That is what AI Signal exists for. Most months, the news you actually need to act on is a handful of items. This month, four: AI pricing shifting at the edges, product surfaces sprawling, model capability accelerating, and the case for access tiering getting sharper. None of those changes are a release announcement; they are structural shifts. The discipline is to read for structural shifts, not to chase product launches.

What is the difference between using AI and navigating it well?

Using AI is opening a tool when you have a task. Navigating it well is having a system for which tool you reach for, why, when you switch, what you keep in saved context, and where the next change is likely to come from. Most small operators who have been using AI for a while have most of the parts already; what they lack is a single page that names how they actually work, so the next person who joins (or the next model that releases) does not start from zero. Navigating well is making that page exist and keeping it current. It is also, increasingly, the part of AI fluency that compounds. Models change. Wrappers change. Your map of your own work does not.

What is a 'navigator' practically? A person, a service, or a tool?

All three are emerging shapes, and at small-operator scale the most common shape is none of the above; it is a discipline you hold yourself. A navigator is whoever maintains the small operator's map of the AI landscape: which tools to use for what, what the cost envelope is, what the privacy boundary is, when to switch, what to watch. For a solo or two-person business, this is usually the principal. For a five-to-twenty-person practice, it can be a part-time role, sometimes filled by an external advisor, sometimes by the most AI-fluent person on the team. As an external service it is still being defined; if you want help with it, that is what AI Sessions are for. The point is that the role is now real even when it is not formally filled. If you do not know who holds the map for your business, you are the navigator by default, and naming that is the first move.

AI Signal – May 2026 | Pandion Studio