Volume XXVIII, Issue 2

In early 2024, Cursor, a fast-growing AI coding assistant, faced an uncomfortable reality: it was sending 100% of its revenue to Anthropic to cover API costs. Every dollar customers paid went straight to its infrastructure provider. Cursor wasn’t alone. Perplexity burned through 164% of its revenue on cloud and large language model (LLM) costs that same year.

Cursor and Perplexity show how AI workloads are driving exponential API usage, with each query triggering thousands of background calls across models and supporting services. But consider an AI copilot that summarizes deal notes from a customer relationship management system (CRM) or analyzes campaign data from a warehouse. Beneath those activities is another layer of APIs connecting to data sources such as CRMs, data warehouses, communication tools and content platforms — surfacing and moving the live data that powers AI responses, updates records and triggers automated actions.

As agents take over this work, machine-driven usage is outpacing the pricing models built for people. The scale of that shift is staggering, and it’s easiest to see in how AI workloads interact with infrastructure.

The core idea is simple. AI agents generate far more API calls than human users ever could.

A human user might log into a CRM once an hour and pull data. An autonomous agent does the same work hundreds of times faster and without breaks, generating thousands of calls per hour. Traditional seat-based pricing ignores this reality, leaving vendors with models built for people, not machines.

How AI is driving explosive API growth

A single query can trigger hundreds or even thousands of API calls to fetch context, check facts, format responses and validate output. Complex workflows compound that load, turning thousands of calls into hundreds of thousands per day.

This pattern holds across most AI workloads. The visible query is small, but the underlying computation that supports it is large, continuous and costly.

Google revealed at I/O 2025 that it’s processing over 480 trillion tokens monthly, a 50-fold increase from one year earlier. OpenAI’s research shows ChatGPT messages grew fivefold between July 2024 and July 2025, reaching 18 billion per week. A single response can require significant retrieval, embedding and model-processing work, meaning most infrastructure cost comes from the underlying computation that models perform rather than the final output they return.

The companies whose APIs fuel AI-driven products now face a choice: monetize usage or subsidize it. Reddit’s 2023 pivot shows the trade-off. After LLMs trained on its data at scale, Reddit began charging $0.24 per 1,000 API calls. Most vendors lack Reddit’s unique data advantage and have less leverage to recover costs.

How pricing is shifting from seats to API calls in SaaS

API-centric pricing represents the next evolution. APIs scale elastically in ways human users never could. Consider a sales team of 20 employees using a CRM system such as HubSpot. Through integrations with finance systems, marketing automation and data warehouses, their CRM generates steady API traffic from routine operations.

Now the company upgrades to add agentic AI capabilities to its HubSpot subscription. These agents can automate customer research, draft personalized outreach emails and update contact records based on prospect behavior. A single agent running these workflows can trigger tens of thousands of additional API calls each day.

Traditional pricing models can’t capture this reality. Seat-based pricing ignores machine consumption entirely, and unmetered APIs expose vendors to runaway costs as AI workloads multiply. The result is a full rethinking of how value is measured and priced.

For a deeper exploration of how consumption-based pricing reshapes unit economics and company valuation, see our analysis on consumption-based pricing models.

How to price APIs in the age of AI agents

API pricing has moved from an afterthought to a core strategic decision. Most vendors are still experimenting, adapting traditional models to a world where machine-driven consumption dwarfs human activity. Each approach offers benefits and risks when scaled under AI workloads.

Most vendors evolve across these API pricing models as usage scales — from per-call pricing toward tiered, hybrid or dynamic structures to manage agent-driven variability.

Pay-per-call

Twilio built its business on this model, charging per message or minute. It remains one of the clearest ways to monetize APIs, but it breaks down under AI-scale workloads. A single agent workflow that would take a human five minutes can now generate thousands of automated requests.

Tiered usage

Stripe and AWS offer predictable, volume-based pricing through usage tiers and overage fees. AWS, for example, includes 1 million free API calls per month, which once provided healthy buffers for human-driven workloads. But an AI agent debugging code or researching a customer question can exhaust that free tier in hours.
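To make the mechanics concrete, here is a minimal tiered-billing sketch in Python. The free-tier size echoes the AWS example above, but the band sizes and per-call rates are invented for illustration and reflect no vendor's actual rate card:

```python
# Illustrative tiered API billing: a free allotment followed by
# progressively cheaper paid bands. All rates are hypothetical.
FREE_CALLS = 1_000_000

# (calls covered by this band, price per call)
TIERS = [
    (9_000_000, 0.0000035),    # next 9M calls
    (float("inf"), 0.0000030), # everything beyond that
]

def tiered_bill(calls: int) -> float:
    """Return the monthly charge for a given call volume."""
    billable = max(0, calls - FREE_CALLS)
    total = 0.0
    for band_size, rate in TIERS:
        in_band = min(billable, band_size)
        total += in_band * rate
        billable -= in_band
        if billable <= 0:
            break
    return total

print(tiered_bill(800_000))     # a human-scale month: still free
print(tiered_bill(50_000_000))  # an agent-scale month
```

A human-scale workload never leaves the free tier, while an agent making tens of millions of calls lands deep in the paid bands, which is why free tiers sized for people erode so quickly.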

Hybrid (base + usage)

Hybrid is now the most common model for enterprise APIs. Customers pay a base fee for platform access plus incremental usage charges. This model balances predictability and scalability but introduces complexity: vendors need real-time dashboards, usage alerts and soft caps to prevent cost surprises and maintain trust with customers.
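As a rough sketch of how a base fee, bundled usage and a soft-cap alert fit together (every figure here is an assumption, not any vendor's price list):

```python
# Illustrative hybrid (base + usage) billing with a soft-cap alert.
BASE_FEE = 500.00           # monthly platform access
INCLUDED_CALLS = 2_000_000  # usage bundled into the base fee
OVERAGE_RATE = 0.000004     # per call beyond the bundle
SOFT_CAP = 1_000.00         # alert threshold, not a hard cutoff

def hybrid_bill(calls: int) -> tuple[float, bool]:
    """Return (monthly charge, whether the soft cap was crossed)."""
    overage = max(0, calls - INCLUDED_CALLS)
    total = BASE_FEE + overage * OVERAGE_RATE
    return total, total > SOFT_CAP

total, alert = hybrid_bill(150_000_000)  # one agent-heavy month
print(f"${total:,.2f} (soft-cap alert: {alert})")
```

The soft cap is what turns a surprise invoice into an early warning: the vendor can notify the customer mid-cycle instead of explaining the bill after the fact.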

Dynamic or off-peak pricing

This emerging method treats API capacity like airline seats: cheaper when idle, more expensive when demand surges. DeepSeek cut off-peak API rates by 75% to smooth traffic spikes, while OpenAI’s Batch API offers similar discounts for non-urgent jobs processed asynchronously.
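A minimal sketch of time-of-day rating in Python. The 75% discount mirrors the DeepSeek figure above; the peak window and per-call rate are invented for the example:

```python
# Illustrative off-peak pricing: a flat peak rate with a steep
# discount outside business hours (all times UTC).
from datetime import datetime, timezone

PEAK_RATE = 0.000010      # per call during peak hours
OFF_PEAK_DISCOUNT = 0.75  # 75% off outside them
PEAK_HOURS = range(8, 20) # 08:00-19:59 UTC counts as peak

def rate_at(ts: datetime) -> float:
    """Per-call rate in effect at the given timestamp."""
    if ts.hour in PEAK_HOURS:
        return PEAK_RATE
    return PEAK_RATE * (1 - OFF_PEAK_DISCOUNT)

# The same million-call batch, scheduled at different times of day.
peak = 1_000_000 * rate_at(datetime(2025, 6, 2, 14, tzinfo=timezone.utc))
off = 1_000_000 * rate_at(datetime(2025, 6, 2, 3, tzinfo=timezone.utc))
print(peak, off)  # the off-peak run costs a quarter of the peak run
```

For deferrable agent work (nightly enrichment, batch research), shifting the schedule is often the cheapest optimization available.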

Levers worth testing as agent traffic grows

As agentic usage grows, vendors are exploring new ways to differentiate between access types:
•    Human vs. agent access pricing
•    Off-peak vs. real-time rates
•    Per-agent identity or license fees
•    Tiered data-class pricing (e.g., compute-intensive or sensitive endpoints)

The next wave of innovation may come from assigning identity or licenses to AI agents themselves, charging per authorized agent rather than per human user. This shift blurs the line between API monetization and digital labor pricing; it will force vendors to decide whether to price by access, usage or even agent seats, and will require them to consider how each approach reshapes value capture.
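Priced per authorized agent, that model reduces to something like the following sketch (the seat and license fees are hypothetical):

```python
# Illustrative agent-seat pricing: human seats and licensed agent
# identities are metered separately. Figures are hypothetical.
HUMAN_SEAT = 50.00      # per human user per month
AGENT_LICENSE = 200.00  # per authorized agent identity per month

def monthly_access_fee(humans: int, agents: int) -> float:
    return humans * HUMAN_SEAT + agents * AGENT_LICENSE

print(monthly_access_fee(20, 3))  # 20 seats plus 3 licensed agents
```

The open question is whether an agent license should be flat, as here, or carry its own usage meter on top.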

Why API pricing models are failing under AI workloads

Companies are testing new models, but implementation is exposing critical gaps. The disconnect between how APIs are priced and how they’re consumed has never been wider. Four urgent problems have emerged:

  1. The profitability trap persists
    Cursor and Perplexity show how infrastructure costs can swallow revenue. xAI, Elon Musk’s AI startup, reportedly burns about $1 billion a month on infrastructure and operations — proof of how quickly back-end consumption can break even the strongest growth story.
     
  2. Credits confuse customers
    Many vendors defaulted to credit-based systems when launching AI features. As one head of product monetization told Metronome, “Our finance team likes it. Our customers don’t know what a credit does.” Salesforce’s Agentforce combines three pricing methods: per conversation, per lead and credits. Layer in required licenses and API allowances, and customers struggle to forecast total costs.
     
  3. Agentic workloads break assumptions
    An autonomous agent might make 100 API calls or 10,000. Carnegie Mellon research found that AI agents fail on roughly 70% of knowledge-work tasks, and Gartner predicts that more than 40% of agentic AI projects will be canceled by 2027 due to escalating costs.
     
  4. Infrastructure can’t keep up
    Most vendors maintain separate billing stacks for self-serve and enterprise sales. Neither handles dynamic usage well. A customer’s AI agent might burn through a month’s API allocation over a weekend, but billing systems can’t surface that in real time or trigger proactive alerts. Customers often need engineering support just to decode their bills.

Some companies are finding a path forward. Zapier bills for completed tasks rather than raw API calls, and Paid.ai raised $33 million to build outcome-based pricing infrastructure. Companies that tie price directly to customer value are adapting. Those that cling to legacy models face mounting pressure.

We help companies navigate API pricing transformation

Value is shifting from the number of people using a system to the scale of machine-driven activity it supports. API monetization now demands the same strategic attention as product development or go-to-market planning. Getting it right requires balancing technical implementation, customer expectations and unit economics.

L.E.K.’s B2B pricing practice works with software companies to manage these complex pricing shifts. If your organization is grappling with API monetization, agentic workloads or the transition from seat-based to consumption pricing, our team can help you design models that capture value without creating customer friction.

For more information, please contact us.

L.E.K. Consulting is a registered trademark of L.E.K. Consulting LLC. All other products and brands mentioned in this document are properties of their respective owners. © 2026 L.E.K. Consulting LLC.
