Inference APIs are as dead as seat-based pricing.
Selling tokens is selling seat licenses with better marketing. The value in enterprise AI will accrue to the layer closest to the customer’s outcome.
Originally published as a Google Doc in early 2026.
This is an extension of a strategy memo I wrote at Salesforce in 2020, Shopify is to Salesforce as Fortnite is to Netflix, about value-aligned business models and the incumbency trap. That essay argued that Shopify’s success stemmed not from being a better SaaS product but from a revenue model perfectly aligned to its customers’ outcomes, and that Salesforce’s ACV-driven model left it structurally vulnerable to exactly this kind of disruptor. I think the same dynamics are applicable in enterprise AI, and the implications are arguably even more consequential. (This essay doesn’t really apply to consumer AI.)
The inference API is the new seat license
In the original essay, the core thesis was that SaaS companies that generate revenue directly from the success of their customers have a structural advantage over those that generate revenue from their ability to sell seat-based contracts. The thesis holds, and I'd argue it's going to define who wins the AI era. Right now, OpenAI and Anthropic (and the other labs) sit in a position remarkably similar to where Salesforce sat when I wrote the original memo. They are synonymous with AI. The average enterprise CIO will give them the first look, the first pilot, and (in many cases) the first contract. This incumbency advantage is very real, but it is also very temporary if the underlying business model doesn't evolve past selling inference. If all OpenAI and Anthropic ever are is an API for inference (a metered pipe to a foundation model), they have no durable advantage over open-source models that are comparable, or even slightly worse, in raw benchmark performance.
This is the analog to the ACV trap I described in the original essay. Selling API calls is selling seat licenses with better marketing. The customer pays for access, not for outcomes, and the vendor gets paid whether the customer succeeds or fails. The moment a cheaper, open-source alternative hits 85% of the performance at 20% of the cost, the enterprise will calculate the delta and switch. We’ve seen this movie before with Linux eating Unix, Postgres eating Oracle’s lunch at the startup layer and working its way up, and basically every commoditized infrastructure layer eventually getting arbitraged to zero margin.
The inference API, in isolation, is a commodity. It might not feel that way today because GPT and Claude (and Kimi, and Qwen, and so on) are legitimately differentiated on capability but the gap is closing quarter over quarter, and the structural dynamics of open source (massive distributed R&D investment, no margin to defend, permissive licensing) virtually guarantee that the gap will continue to close. Anthropic and OpenAI know this, which is why both are racing to build products on top of their models (chat interfaces, coding assistants, search products, agent frameworks). But building products on top of a model you also sell as an API creates the same misalignment I described in the original essay because you’re trying to be both the platform and the solution, and your revenue model (tokens consumed) doesn’t care which one wins. You get paid the same whether your customer builds something transformative on your API or burns tokens on a hallucinating chatbot that gets turned off after the pilot, which is the ACV trap reincarnated.
Where the real value accrues
The true enterprise winner in AI (and by winner I mean hundreds of billions in revenue, not tens) will not be the company that builds the best model but the company that builds end-to-end solutions specific to each industry, analogous to entire human workflows (“jobs”), and gets paid based on the outcomes those solutions produce. Let me be concrete. In mortgage origination, the loan officer job isn’t “respond to a lead.” The job is to qualify a borrower, look up their credit, shepherd them through underwriting scenarios, provide terms, collect documentation, coordinate with title, assist in closing logistics, and coordinate the loan to funding while maintaining compliance with a labyrinth of federal and state regulations across a multi-month timeline with dozens of decision points and handoffs. No amount of prompt engineering on top of a foundation model accomplishes this.
This is a systems problem that requires persistence, task decomposition, nondeterministic judgment, multi-channel communication, and the ability to operate autonomously over weeks and months. The same is true in healthcare where the job isn’t “answer a patient’s question” but rather to schedule a patient with the right specialist in the right network in the right geography, call their insurance company and sit on hold for 40 minutes to get a prior authorization, follow up three times when the authorization is denied, coordinate between the PCP and the specialist, and ensure the referral is documented correctly in the EHR, all at scale across thousands of patients simultaneously. These are the operations jobs that constitute the single largest cost center for most healthcare organizations. Enterprise agents are fundamentally an orchestration problem as opposed to an inference problem and the companies that solve it will have the same structural value alignment advantage that Shopify had over Salesforce. Revenue becomes a direct function of customer outcomes, not customer access. If the loan doesn’t fund, you don’t get paid. If the patient doesn’t get scheduled, you don’t get paid.
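The orchestration framing above can be sketched in miniature. This is a toy, not any real system: a job is an ordered list of tasks whose progress is checkpointed, so the workflow survives transient failures and can resume across a multi-week timeline, with hard failures surfaced for human escalation. Every name here (the job, the tasks, the state shape) is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    """A long-running workflow: an ordered list of tasks with persisted state.

    In a real system `state` would live in a database so the job can resume
    after a crash or a multi-week pause; a dict stands in for that here.
    """
    name: str
    tasks: list  # list of (task_name, callable) pairs, run in order
    state: dict = field(default_factory=dict)  # checkpointed progress

    def run(self, max_retries: int = 3) -> dict:
        for task_name, fn in self.tasks:
            if self.state.get(task_name) == "done":
                continue  # already completed in a previous run; skip it
            for attempt in range(max_retries):
                try:
                    fn()
                    self.state[task_name] = "done"
                    break
                except RuntimeError:
                    # Transient failure (busy phone line, flaky API): retry.
                    self.state[task_name] = f"retry_{attempt + 1}"
            else:
                # Retries exhausted: stop and surface for human escalation.
                self.state[task_name] = "failed"
                return self.state
        return self.state
```

The point of the sketch is what it does *not* contain: no model call appears anywhere. The inference step is one task among many; the persistence, sequencing, retry, and escalation machinery around it is the actual product.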
Programmable teammates, not smarter chatbots
The mental model that most of the market operates with today is that AI is a tool that makes humans more productive: copilots and assistants, drafters of emails and summarizers of meetings. This is the equivalent of Salesforce’s view that CRM is the center of the enterprise software universe, which was true for a while and then suddenly wasn’t as the underlying platform tailwind shifted to ecommerce for a large segment of the market. The real paradigm shift isn’t “AI makes humans faster,” it’s “AI can perform entire workflows that humans currently perform,” not necessarily because the AI is smarter than any individual human in any individual task, but because an agent can perform all the tasks, in sequence, at scale, without dropping the ball, without going on vacation, without churning, and without costing $85,000 a year in salary plus benefits plus management overhead plus training plus HR.
I think of this as the “programmable teammate” model where the customer doesn’t think about the underlying model, doesn’t configure agents, doesn’t write prompts. They make an API call: “originate this loan,” “schedule this patient,” “collect this debt,” “qualify this lead.” The system consumes the request, orchestrates a network of underlying agents to perform the work across whatever channels the task requires (voice, SMS, email, fax, yes fax, because healthcare still runs on fax), and returns the outcome via webhook. The customer gets a teammate that operates like a human but scales like software. This is the Shopify analog. Shopify didn’t win by being a better website builder, Shopify won by being the platform that made its merchants sell more stuff and then getting paid a percentage of every sale. The AI companies that win won’t win by having the best model, they’ll win by being the platform that replaces their customers’ most expensive cost centers and getting paid a percentage of every dollar saved or every dollar earned.
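The request/webhook shape described above might look something like the following sketch. The endpoint behavior, field names, and billing flag are all assumptions for illustration, not any vendor's actual API; the functions simulate the exchange in-process rather than over HTTP.

```python
import json


def submit_task(task: dict) -> dict:
    """Simulate POSTing a high-level, outcome-framed task.

    The caller sends a job description and a callback URL, not prompts or
    agent configuration. Field names are hypothetical.
    """
    assert "job" in task and "callback_url" in task
    return {"task_id": "t-123", "status": "accepted"}


def webhook_payload(task_id: str, funded: bool) -> str:
    """What the platform would POST back once the outcome is known."""
    return json.dumps({
        "task_id": task_id,
        "outcome": "loan_funded" if funded else "loan_withdrawn",
        # Outcome-based pricing: billing attaches to the result,
        # not to the tokens consumed along the way.
        "billable": funded,
    })
```

The design choice the sketch encodes is the thesis of the section: the customer's interface is "originate this loan" in and "loan funded" out, and the charge rides on the second message, not on anything in between.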
The incumbency trap, again
Someone reading this at Anthropic or OpenAI might say “Sure, but we’re building agents too. We have function calling, tool use, computer use, MCP. We’re building the primitives for exactly what you’re describing.” This is true and it’s also exactly what Salesforce was saying about ecommerce when I wrote the first essay. Salesforce had Demandware, they were building Commerce Cloud, they had partnerships with ecommerce ISVs, and none of it mattered because the underlying business model (ACV on CRM seats) meant that the org was structurally incapable of prioritizing the right things. The same dynamic applies here. When your primary KPI is tokens consumed or API revenue or monthly active users on a chat interface, you will optimize for engagement and consumption, not for customer outcomes.
You’ll build impressive demos and ship agent frameworks that developers can tinker with and announce partnerships with enterprises. And then the enterprises will run pilots that look promising but never scale to production. The gap between “impressive demo” and “can do what an entire human team in a regulated industry does” is approximately the same as the gap between a Salesforce demo org and a production Salesforce implementation at a Fortune 500, except with much higher stakes because you’re now automating decisions that have legal, financial, and top-line implications. Building the primitives is necessary but wildly insufficient. It’s the difference between AWS offering EC2 instances and Shopify offering a turnkey ecommerce platform. Yes, you could build Shopify on AWS, but AWS isn’t Shopify, and the value accrues to the company that does the hard, unglamorous, industry-specific work of turning raw compute into solved problems. Foundation model providers offering agent primitives is the equivalent of AWS saying “you can build anything on our cloud,” which is true, but the point I’m making (and the point of the original essay) is that the company closest to the customer’s actual outcome is the one that captures the value.
Credit where it’s due
To be fair, and I don’t want to undersell this, it’s clear that Anthropic and OpenAI have meaningfully realized at least part of this dynamic. The launch of Claude Code, followed by Claude for Excel, Claude in Chrome (and the analogous OpenAI products), and the broader push into agentic products that do real work inside real workflows is not an accident, it is a deliberate move up the stack from “inference provider” to “solution that directly contributes to a customer’s output.” Claude Code doesn’t charge you to chat about code, it writes the code, and Claude for Excel doesn’t summarize your spreadsheet, it builds the spreadsheet. These products are meaningfully closer to the customer’s actual job than a chat interface or a raw API endpoint will ever be. This is the same instinct, and it mirrors what I described in the original essay about Salesforce’s pockets of value alignment (Demandware’s “% of GMV” model, CDP exploring usage-based pricing).
The labs are clearly thinking about how to move from selling tokens to delivering outcomes, and the question is whether the orgs can push this instinct far enough, fast enough, and into domains specific enough to matter before the dynamic I’m describing plays out (i.e. the 800 lb gorillas, the biggest of which is Microsoft, beat everyone to the punch). The tension is that Claude Code and Claude for Excel are horizontal tools that make knowledge workers more productive across industries, which is valuable and is a meaningful step beyond raw inference, but it’s still a productivity layer (a better copilot), not an end-to-end replacement of a human workflow in a specific vertical.
The gap between “Claude helps a developer write code faster” and something like Devin, “the autonomous software engineer,” is wide (I’m not saying they are both equally real today, just pointing out a gap between the two visions). Similarly, the gap between a system of record with agents auto-filling fields and “an agent system that originates a mortgage from lead to funded, autonomously, across voice/SMS/email over 45 days while maintaining compliance with RESPA, TILA, ECOA, and fifty state-specific regulations” is equally wide. To use another analogy, it’s the difference between Squarespace and Shopify, where both help you build a website but only Shopify positions itself as directly responsible for whether you sell anything.
Why open source breaks the moat
There’s a nuance in the AI version of this thesis that makes the dynamic even more acute than it was in SaaS. In the original essay, Salesforce’s moat was its admin/ISV/consultant ecosystem (a genuine two-sided marketplace that created real switching costs) and foundation model providers don’t have an equivalent moat, or rather, their moat is temporary and purely capability-driven. The moment open-source models (Llama, Mistral, DeepSeek, and whatever comes next) achieve comparable performance on the specific tasks that matter for a given industry workflow, the cost advantage becomes overwhelming. An enterprise running a thousand concurrent voice agents doesn’t care if Claude scores 3% higher on MMLU, they care about cost per completed task, accuracy on their specific domain, and reliability at scale, and if they can get 95% of the performance at 30% of the cost by running an open-source model on their own infrastructure (or a commoditized cloud) they will.
This is the Linux parallel, the Postgres parallel, and frankly every infrastructure commoditization story ever told. The foundation model layer will be commoditized and the question is what happens above it. What happens above it is where the value-aligned companies win because the company that builds the end-to-end loan origination agent doesn’t care which model sits underneath it, they care about whether the loan funded. The model is an input, not the product. When the model layer commoditizes, the company that owns the outcome (the company whose revenue is directly tied to whether the loan funded, whether the patient was scheduled, whether the debt was collected) has a business that is structurally insulated from foundation model commoditization because they ride the commoditization wave rather than being drowned by it. Cheaper and more capable models make their agents better and cheaper to run, which makes them more competitive, which drives more volume, which drives more revenue. This is the exact same flywheel that Shopify rides as ecommerce infrastructure gets cheaper and better.
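The "model is an input, not the product" point has a concrete design consequence: the workflow layer should depend only on a completion function's signature, so backends can be swapped as the model layer commoditizes. A minimal sketch, with every name and the toy prompt purely hypothetical:

```python
from typing import Callable


def verify_income(complete: Callable[[str], str], paystub_text: str) -> bool:
    """One step of a hypothetical loan workflow.

    The step cares only that `complete` maps a prompt string to a reply
    string. Which model sits behind that signature (a frontier API, a
    self-hosted open-source model) is an operational choice, not a
    product decision.
    """
    prompt = (
        "Does this paystub show monthly income over $4,000? "
        f"Answer yes or no.\n{paystub_text}"
    )
    return "yes" in complete(prompt).lower()


def cheap_open_source_model(prompt: str) -> str:
    # Stub standing in for an inference call to a self-hosted model.
    return "yes"
```

Under this shape, a cheaper or better model is dropped in behind the same signature, and the savings flow straight into the workflow's cost per completed task, which is the flywheel the paragraph above describes.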
The experiment is running
The value-aligned version of enterprise AI is what we are building at Takeoff. Takeoff is an agent orchestration platform that builds what we refer to as “programmable teammates,” which are essentially harnesses for real-world tasks that operate like humans but scale like software. We don’t sell model access, we don’t sell agent frameworks, we don’t sell seats. Our customers make an API call or send in a webhook tied to an ambiguous request (e.g. “originate this loan,” “schedule this patient,” “call insurance for a prior authorization”) and Takeoff agents consume and perform that request via a network of underlying agents across voice, SMS, email, and whatever other channels the task requires, returning the outcome via webhook. The goal is that the customer never has to build, configure, or think about the agents themselves; they are simply programmable teammates.
For a top 10 mortgage company, Takeoff agents surpassed a human loan officer team control in a direct side-by-side split test. That is a production system that improved upon an entirely human workflow in a regulated industry across a multi-month origination cycle. For a major health tech pilot, Takeoff agents turned one of the largest cost centers (operations staff) into an API call for an engineering team: the client sends in an ambiguous task, Takeoff agents figure out how to do it, do it, and return the results. This is a new category of software.
The reason I wrote the original essay was not just to diagnose a problem at Salesforce but to articulate the principle that the companies whose revenue is directly and exclusively a function of their customers’ success will out-innovate, out-iterate, and ultimately out-grow the companies whose revenue is a function of their ability to sell contracts. Takeoff is built on that principle from the ground up. Our growth so far is a direct consequence of value alignment. When your agents result in more loans originated than a human team, the customer doesn’t need to be sold on expanding the contract, the results sell themselves.
The Shopify flywheel applied to AI is better outcomes generate more volume, more volume generates more data, more data generates better outcomes, and unlike a seat license (where revenue is disconnected from usage) every improvement in our system translates directly to more revenue for our customers and subsequently for us. The broader thesis of this essay (that the inference API is a commodity, that horizontal copilots are a bridge but not a destination, that the value accrues to the layer closest to the customer outcome, that open source commoditization actually benefits the outcome-aligned layer above it) is not something we arrived at theoretically, it’s what we see every day building Takeoff. We are model-agnostic not by choice but because we have to be. The mortgage workflow doesn’t care which LLM powers the income verification step, it cares whether the loan funds, and the healthcare coordination workflow doesn’t care which model generates the phone call to the insurance company, it cares whether the prior authorization gets approved. The model is an input. The outcome is the product.
The inevitable convergence
Here’s the part that I think matters most for the big labs. Just as Shopify’s dominance in ecommerce meant that marketing solutions were increasingly evaluated in the context of “how well does this work with Shopify” (Klaviyo bootstrapping to $100MM ARR as a Shopify ISV is the best example of this), the dominant vertical AI platforms will become the context in which foundation models are evaluated. Today, enterprises evaluate foundation models on benchmarks like MMLU, HumanEval, and reasoning tests. Tomorrow, they’ll evaluate foundation models based on which one works best inside the agent platform that originates their loans or which one works best inside the system that runs their healthcare operations. The vertical AI company becomes the source of truth while the foundation model becomes the interchangeable component.
This is the exact inversion of the current power dynamic and it’s why the big labs will attempt to build true vertical solutions themselves (likely but hard, industry-specific domain embedding, regulatory knowledge, and a fundamentally different org + product structure than a research lab) or deeply partner with the companies that are doing so (less likely, given that the value and the pricing power accrues to the layer closest to the customer outcome). As I wrote in the original essay, it’s important to realize that a competitive moat does not come undone slowly and predictably but instead collapses into itself almost instantly as power dynamics shift, rendering an incumbent helpless. The foundation model providers are in a strong position today but “today” in AI moves a lot faster than “today” in SaaS. Our bet is that our opinionated agent orchestration platform gives us a chance at being a meaningful contender in the race.
Thank you for taking the time to read this, I really appreciate it. If you’re interested in the ideas I discuss here, I would love to discuss them further. My email address is aakash@keepmovingforward.dev.