Why We Use Story Points for Outcome-Based Pricing

Hourly billing rewards slow work. Outcome-based pricing, built on story points and powered by AI estimation, rewards the engineers who actually deliver.

Praveen Ghanta, CEO, Hire Fraction · April 3, 2026 · 5 min read
Tags: story points, outcome-based pricing, software cost, agile estimation
What you’ll learn
  • Why hourly billing structurally misaligns vendor and client incentives
  • How story points measure complexity rather than time — and why that matters for pricing
  • How AI estimation removes the human bias that makes story points inconsistent
  • How outcome-based pricing directly links engineer pay to output delivered
  • Why the best engineers earn more under this model, not less

Here is how the model works and why the best developers prefer it.

Why does paying for time create the wrong incentives?

Most software development is still billed by the hour. The client pays for time spent, not results delivered. This creates a structural misalignment: the vendor benefits from work taking longer, and the client has no direct mechanism to tie cost to output.

McKinsey and the University of Oxford studied over 5,400 IT projects and found that large projects run 45% over budget on average while delivering 56% less value than predicted. Software projects carry the highest risk of cost and schedule overruns among all IT project types. That is not a coincidence. When you pay for hours, you optimize for hours, not outcomes.

Outcome-based pricing flips this. Instead of invoicing for time, you tie payment to measurable output. But to make that work, you need a reliable, consistent unit of output. That is where story points come in.

What do story points actually measure?

Story points are an agile estimation method that measures relative complexity rather than absolute time. Instead of asking “how many hours will this take?” you ask “how complex is this compared to other tasks we have done?”

Different organizations implement this differently. Some use simple categories: small, medium, and large. Others use the Fibonacci sequence (1, 2, 3, 5, 8, 13, and up) to reflect the fact that uncertainty grows as tasks get bigger. The specifics matter less than the principle: you are measuring the size and difficulty of work, not the clock.
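To make the Fibonacci idea concrete, here is a minimal Python sketch. The name snap_to_scale, the round-up rule, and the cap at 13 are illustrative assumptions rather than any standard; teams differ on how they round and how far the scale runs.

```python
# Illustrative sketch only: the round-up rule and the cap at 13 are assumptions.
FIBONACCI_SCALE = (1, 2, 3, 5, 8, 13)


def snap_to_scale(raw_estimate: float, scale: tuple = FIBONACCI_SCALE) -> int:
    """Round a raw complexity guess up to the next value on the scale."""
    if raw_estimate <= 0:
        raise ValueError("estimate must be positive")
    for value in scale:
        if raw_estimate <= value:
            return value
    # Anything beyond the top of the scale is a signal to split the task.
    raise ValueError("estimate exceeds the scale; split the task")


# The gap between neighboring values widens as tasks get bigger,
# which is how the scale encodes growing uncertainty.
gaps = [b - a for a, b in zip(FIBONACCI_SCALE, FIBONACCI_SCALE[1:])]
```

Under these assumptions a raw guess of 4 lands on 5, and the gaps between neighboring values run 1, 1, 2, 3, 5, so a large task can only be placed coarsely.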

Definition

Story point: a unit of measure for expressing an estimate of the overall effort required to fully implement a product backlog item or any other piece of work, based on relative complexity, risk, and uncertainty rather than absolute time.

Here is how complexity typically breaks down in practice. A small task might be a minor UI change, a label update, or a quick styling fix. A medium task could involve creating or modifying a simple API, adding a form, or changing validation logic. Complex tasks span multiple layers — UI, API, and backend data model changes all shipping together as a vertical feature. If each of those layers is itself complex, the feature gets broken into multiple story-pointed tasks rather than treated as one monolith.

The real value of story points is that they measure what matters to the buyer: how much functional capability was delivered. Not how many hours someone sat at a desk.

Why does human estimation break down in practice?

Story points have been a standard practice in agile development for years, and they remain the dominant estimation method across agile teams. But the process has a well-documented weakness: inconsistency. Estimation is done by humans, and humans bring bias. A developer who built the last feature might anchor their estimate on that experience. A team that has been burned by a late delivery might inflate estimates as a buffer. Two teams at the same company can look at the same ticket and come back with wildly different numbers.

Academic research on agile estimation has consistently found that techniques like Planning Poker, while collaborative, are still susceptible to cognitive bias, group dynamics, and individual expertise gaps. The result is unreliable estimates that undermine the whole point of measuring output.

This is the gap that AI fills.

How does AI-powered estimation remove the bias?

Using AI agents to estimate story points changes the dynamic entirely. An AI agent can examine every task in the backlog, review the associated source code and context, and produce a complexity forecast based on patterns across thousands of prior tasks. It does not anchor on personal experience. It does not pad estimates because it is worried about working late on a Friday. It has no skin in the game.

Recent research from multiple institutions, including work published in the Software Quality Journal, has demonstrated that machine learning approaches to story point estimation can reduce the bias and inconsistency inherent in human estimation. A 2026 study on LLM-based estimation found that large language models can predict story points more accurately than supervised deep learning models, even without project-specific training data.

This is the agentic AI piece. It is not just a calculator. The AI reviews the task description, looks at the codebase, considers dependencies, and produces a calibrated estimate. That makes story points a much more reliable unit of measurement — which is exactly what you need if you are going to tie pricing to output.

How does outcome-based pricing actually work?

Here is the model. An engineer’s monthly or periodic compensation is tied to story point output at defined breakpoints. If only half the planned work was completed in a period, only half the invoice gets paid. There is a direct, transparent relationship between cost and outcome.
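The half-the-work, half-the-invoice example can be sketched in a few lines of Python. This is a simplified illustration, not Fraction's actual billing logic: it assumes a straight linear payout, while the real breakpoints are not specified here, and the function and parameter names are hypothetical.

```python
def invoice_amount(planned_points: int, completed_points: int,
                   full_invoice: float) -> float:
    """Scale the period's invoice by the share of planned points completed.

    Assumption: payout is linear in completed points. The article's
    "defined breakpoints" are not specified, so none are modeled here.
    """
    if planned_points <= 0:
        raise ValueError("planned_points must be positive")
    if completed_points < 0:
        raise ValueError("completed_points cannot be negative")
    fraction_delivered = completed_points / planned_points
    return round(full_invoice * fraction_delivered, 2)


# Half the planned work completed, so half the invoice is due.
half_period = invoice_amount(planned_points=40, completed_points=20,
                             full_invoice=10_000.0)
```

Here half_period comes out to 5000.0, matching the half-invoice case described above.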

From the client’s perspective, this is a fundamentally better deal. You are paying for software that ships, not for hours that pass. If a project stalls, your costs reflect that. If the team delivers ahead of plan, you get your product faster without paying a premium for the speed.

The natural objection is: “Isn’t this just a way to underpay engineers?” Actually, it is precisely the opposite.

| Model | What you pay for | Client risk | Engineer incentive |
|---|---|---|---|
| Hourly | Time spent regardless of output | High: cost grows with delays | Work slowly, bill more hours |
| Fixed-bid | Deliverable at a set price | Scope creep and disputes | Cut scope to protect margin |
| Outcome-based (story points) | Functional software delivered | Low: cost mirrors output | Deliver more, earn more |

Why do top engineers earn more under this model?

When you pay by the hour, every engineer on the team earns roughly the same regardless of how productive they are. A developer who finishes a task in three hours and one who finishes the same task in eight hours earn the same daily rate. The faster developer’s efficiency is invisible to the compensation structure. Maybe they get promoted eventually. Maybe their manager notices. But there is no guaranteed, immediate financial upside to being faster and better.

Under outcome-based pricing, speed and quality translate directly to earnings. The best engineers at Fraction can earn 50% more on a dollars-per-hour basis because they simply deliver more output in less time. For the first time, a 10x developer can actually be compensated like one — in the here and now, not as a vague future promise.
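The arithmetic behind that dollars-per-hour claim is simple enough to sketch. The price per point and the hour counts below are made-up numbers chosen for illustration, and effective_hourly_rate is a hypothetical helper, not a Fraction formula.

```python
def effective_hourly_rate(points_delivered: int, price_per_point: float,
                          hours_worked: float) -> float:
    """Earnings per hour when pay depends on points delivered, not hours."""
    if hours_worked <= 0:
        raise ValueError("hours_worked must be positive")
    return points_delivered * price_per_point / hours_worked


# Hypothetical figures: both engineers deliver the same 40 points
# at the same price per point, but in different amounts of time.
fast_rate = effective_hourly_rate(40, 150.0, hours_worked=100)
slow_rate = effective_hourly_rate(40, 150.0, hours_worked=150)
uplift = fast_rate / slow_rate - 1
```

With these numbers the faster engineer earns $60 per hour against $40 per hour for the slower one, a 50% difference, even though both are paid the same total for the same output.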

This creates a powerful selection effect. The most productive engineers are naturally attracted to a model where their output determines their pay. Developers who thrive on efficiency, who take pride in clean, fast execution, see this model as an opportunity rather than a risk. Meanwhile, the model naturally filters out developers who rely on billable hours as a cushion.

How do quality controls still apply under this model?

The obvious concern with output-based compensation is quality. If you pay by the story point, what stops someone from “chucking AI slop over the wall”? This is a real concern, and it exists today even under hourly billing. Plenty of developers on hourly contracts submit AI-generated code without meaningful review.

The difference with outcome-based pricing is that quality is built into the measurement. Story points are not awarded for code that does not pass review, does not meet acceptance criteria, or introduces regressions. The completed story point is the unit, and “completed” means it works, it has been reviewed, and it meets the definition of done.
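One way to picture that rule is as a gate applied before any points are counted. The sketch below is an assumption about how such a gate might be modeled; the Story fields and function names are hypothetical and simply mirror the three conditions named above.

```python
from dataclasses import dataclass


@dataclass
class Story:
    points: int
    passed_review: bool
    meets_acceptance_criteria: bool
    introduces_regressions: bool


def is_done(story: Story) -> bool:
    """A story counts only if it clears every quality gate."""
    return (story.passed_review
            and story.meets_acceptance_criteria
            and not story.introduces_regressions)


def completed_points(stories: list) -> int:
    """Sum points for stories that meet the definition of done."""
    return sum(story.points for story in stories if is_done(story))


sprint = [
    Story(5, True, True, False),   # counts
    Story(8, False, True, False),  # failed review: no points
    Story(3, True, True, True),    # introduced a regression: no points
]
billable = completed_points(sprint)
```

Only the first story clears all three gates, so billable is 5 even though 16 points of work were submitted.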

Combined with AI-powered estimation that calibrates complexity accurately, this creates a system where productive, high-quality engineers thrive and low-effort work gets caught before it ships.

The software industry’s move away from pure hourly billing is accelerating. AI tools are making developers dramatically more productive, which is pushing companies to rethink time-based billing entirely. When a senior developer using AI-assisted tooling can accomplish in two hours what used to take eight, billing by the hour penalizes efficiency. L.E.K. Consulting describes outcome-based pricing as a significant emerging trend that aligns costs with measurable results. Story points, standardized through AI estimation, are the unit that makes this possible.

Frequently asked questions

How do AI agents estimate story points without human input?

AI agents analyze the task description, review the associated source code and dependencies, and compare the work against patterns from thousands of previously completed tasks. They produce a calibrated complexity score that maps to the story point scale. The process is automatic and does not require a planning meeting or manual review, though teams can override estimates when context warrants it.

Doesn't outcome-based pricing create an incentive to cut corners?

It can, if quality controls are weak. That is why the completed story point is the unit of measurement, not just code submitted. Work must pass code review, meet acceptance criteria, and satisfy the definition of done before it counts toward output. This is no different from the quality gates that should exist in any professional development process, but the financial alignment makes enforcement more natural.

How do you prevent story point inflation, where tasks are estimated higher than they should be?

AI-powered estimation is the primary control. Because the AI agent evaluates complexity based on the actual codebase and historical patterns, it produces estimates that are independent of developer motivation. Developers do not set their own story point values under this model, which removes the incentive to inflate.

What happens when an engineer is blocked by external dependencies?

Blocking dependencies are a normal part of software development and should be accounted for in sprint planning. Under outcome-based pricing, the project manager and the engineer work together to sequence tasks that avoid or minimize blocking. When blocks are genuinely outside the engineer’s control, the story point targets for that period are adjusted accordingly.
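How the adjustment is calculated is not spelled out here. One simple possibility, shown purely as an assumption, is to subtract the points of externally blocked stories from the period's target. The function name and the subtraction rule are hypothetical.

```python
def adjusted_target(planned_points: int, externally_blocked_points: int) -> int:
    """Lower the period's target by points blocked outside the engineer's control.

    Assumption: a one-for-one reduction. A real agreement might handle
    partial blocks or carry-over into the next period differently.
    """
    if externally_blocked_points < 0:
        raise ValueError("externally_blocked_points cannot be negative")
    if externally_blocked_points > planned_points:
        raise ValueError("blocked points cannot exceed the planned target")
    return planned_points - externally_blocked_points


# Example: 40 points planned, 8 of them waiting on a third-party API.
target = adjusted_target(planned_points=40, externally_blocked_points=8)
```

In this example the engineer is measured against 32 points for the period rather than 40.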

Is outcome-based pricing only for senior engineers, or can junior developers participate?

The model works best with experienced developers who have established, consistent output levels. Junior developers, whose productivity is still ramping and whose output is more variable, may find hourly or salaried arrangements more appropriate while they build their skills.

How do I transition from hourly billing to outcome-based pricing?

Start by establishing a baseline. Run a few sprints using AI-estimated story points alongside your existing hourly billing. Track how many story points each engineer delivers per period and what that translates to on a cost basis. Once you have reliable velocity data, you can set pricing breakpoints that reflect fair value for both the client and the engineer.
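The baseline step comes down to two numbers: average velocity and cost per story point under the current hourly arrangement. The sketch below uses invented sprint data and hypothetical names, and it assumes a single engineer billed at a flat hourly rate.

```python
def baseline(sprints: list) -> tuple:
    """Return (average points per sprint, cost per point) from hourly-billed sprints.

    Each sprint is a tuple: (hours_billed, hourly_rate, points_delivered).
    """
    total_cost = sum(hours * rate for hours, rate, _ in sprints)
    total_points = sum(points for _, _, points in sprints)
    if not sprints or total_points == 0:
        raise ValueError("need at least one sprint with delivered points")
    return total_points / len(sprints), total_cost / total_points


# Invented data: three sprints of 80 billed hours at $100 per hour.
history = [(80, 100.0, 20), (80, 100.0, 30), (80, 100.0, 30)]
velocity, cost_per_point = baseline(history)
```

For this sample history, 80 points cost $24,000 in hourly billing, or $300 per point. That figure is a starting point for negotiating breakpoints, not a price in itself.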

Sources
  1. McKinsey & Company. “Delivering Large-Scale IT Projects on Time, on Budget, and on Value.” https://www.mckinsey.com/capabilities/tech-and-ai/our-insights/delivering-large-scale-it-projects-on-time-on-budget-and-on-value
  2. Springer. “Improving Story Points Estimation Using Ensemble Machine Learning.” Software Quality Journal, 2025. https://link.springer.com/article/10.1007/s11219-025-09731-6
  3. Islam, M.R. et al. “Story Point Estimation Using Large Language Models.” arXiv, 2026. https://arxiv.org/html/2603.06276v1
  4. FullStack Labs. “2025 Software Development Price Guide & Hourly Rate.” https://www.fullstack.com/labs/resources/blog/software-development-price-guide-hourly-rate-comparison
  5. L.E.K. Consulting. “The Rise of Outcome-Based Pricing in SaaS.” https://www.lek.com/insights/tmt/us/ei/rise-outcome-based-pricing-saas-aligning-value-cost
Praveen Ghanta
CEO, Hire Fraction

Praveen Ghanta is a five-time founder and serial entrepreneur. He is the founder of DevHawk.ai, an AI-powered engineering management platform, and Fraction.work, which connects fast-growing companies with top fractional tech and growth marketing talent. Previously, he founded HiddenLevers, a risk analytics platform for wealth management that he bootstrapped from inception to acquisition by Orion Advisor Solutions in 2021, serving thousands of advisors and $600B in assets. He earlier founded SmartWorkGroups, acquired by Intralinks in 2000.
