Offering Metered Usage
Definition (short). Customers pay in proportion to what they consume—API calls, GB stored, kWh, messages, transactions. Revenue scales with usage, not (only) with time or seats; pricing can be linear or tiered.
Recent examples. AWS, Azure, and GCP bill per compute hour, GB, or request; AWS alone produced >$100 B in 2024 revenue. Utilities charge ~16.4¢ per kWh to U.S. residential customers on average (Apr 2025), and the average home uses ~10,500 kWh/year (≈875 kWh/mo).
Historical example. Edison’s Pearl Street Station (1882) billed electricity by the kilowatt‐hour—one of the earliest metered models. Taxi meters (1890s) and postage (per-letter) are other classic pay-per-use forebears.

KPI Definitions
Profit per Unit of Usage (PPUU) EN: Profit generated for each billable unit (after variable costs). Pseudo:
PPUU = Effective_Unit_Price - Unit_Cost
or, more commonly: (Revenue - Variable_Costs) / Total_Units
Why: Forces clarity on both monetization (price) and efficiency (cost). High volume without profit per unit is a trap; high margin without volume underutilizes the model. Benchmark: Digital infra aims 70–85% gross margin.
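As a minimal sketch of the PPUU formula above, the following Python computes it from period aggregates and checks that the per-unit form gives the same number. The function name and all sample figures are hypothetical, chosen only for illustration.

```python
def profit_per_unit(revenue: float, variable_costs: float, total_units: float) -> float:
    """PPUU = (Revenue - Variable_Costs) / Total_Units."""
    if total_units <= 0:
        raise ValueError("total_units must be positive")
    return (revenue - variable_costs) / total_units


# Hypothetical month: 2,000,000 API calls, $10,000 revenue, $2,500 variable cost.
revenue, variable_costs, total_units = 10_000.0, 2_500.0, 2_000_000

ppuu = profit_per_unit(revenue, variable_costs, total_units)

# Per-unit form: effective unit price minus unit cost.
effective_unit_price = revenue / total_units
unit_cost = variable_costs / total_units
ppuu_per_unit_form = effective_unit_price - unit_cost

print(f"PPUU (aggregate form): ${ppuu:.5f} per call")
print(f"PPUU (per-unit form):  ${ppuu_per_unit_form:.5f} per call")
```

Both forms agree up to floating-point rounding, so either can be used depending on whether aggregates or per-unit figures are easier to obtain.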
Total Usage Volume (UV) EN: Sum of consumed units (API calls, kWh, GB, etc.). Pseudo:
Σ usage_units_all_customers
Why: Primary revenue driver; shows engagement/demand intensity. Benchmark: Target >30–50% YoY usage growth early on.
Effective Unit Price (EUP) EN: Average realized price per unit after tiers/discounts. Pseudo:
Total_Usage_Revenue / Total_Usage_Volume
Why: Monetization efficiency; erosion may signal commoditization.
Gross Margin per Unit % (GMU) EN: % of unit price kept after variable cost. Pseudo:
(EUP - Unit_Cost) / EUP * 100
Why: Unit economics; determines scalability of volume growth.
Active Users / Accounts (AU) EN: Unique customers generating billable usage in period. Pseudo:
COUNT(DISTINCT user_id WHERE usage>0)
Why: Breadth of adoption; needed to diversify revenue and de-risk whale dependence.
Average Usage per User (AUPU) EN: Mean consumption per active user. Pseudo:
Total_Usage_Volume / Active_Users
Why: Depth of engagement; helps forecast infra needs.
Tiered / Overages Mix % (TIER) EN: Share of revenue from higher tiers or overage charges vs base rate. Pseudo:
Revenue_from_Tiers_>1 / Total_Usage_Revenue * 100
Why: Shows success of pricing design in capturing heavy users’ value.
Unit Cost (Variable) EN: Direct variable cost per unit (bandwidth, compute, fuel). Pseudo:
Variable_Costs / Total_Units
Why: Drives GMU; optimizing infra/vendor deals pushes PPUU up.
New Active Users (ACQ) EN: Fresh customers who generated usage this period. Pseudo:
COUNT(users with first_usage_date in period)
Why: Pipeline for future volume; complements expansion of existing users.
Expansion Usage % (EXPu) EN: Incremental usage from existing users vs last period. Pseudo:
(Usage_existing_t - Usage_existing_{t-1}) / Usage_existing_{t-1} * 100
Why: Land-and-expand health metric in usage world.
Growth Efficiency Factor (GEF) EN: How efficiently you convert capacity (supply) growth into profitable usage growth. Pseudo:
GEF = (Demand_Growth % / Capacity_Growth %) * (GMU / 100)
Why: If you scale infra faster than demand, or grow without holding margins, you destroy PPUU. Because GMU / 100 is at most 1, GEF > 1 means demand is outgrowing capacity by more than enough to offset the margin factor. Benchmark idea: Cloud/API teams often target GEF ≥ 1.2 in growth phases; utilities, constrained by regulation, hover near 1 (capacity expansions match load forecasts).
Demand (Usage) Growth % (DG) EN: Year-over-year change in total billed usage units. Pseudo:
(Usage_t - Usage_{t-1}) / Usage_{t-1} * 100
Why: Core top-line driver in metered models; usage growth lifts revenue faster than price hikes can, and it proves product value. Benchmark idea: Early-stage APIs: 30–50%+ YoY; mature utilities: 1–2% YoY.
Capacity Growth % (CG) EN: YoY change in maximum deliverable capacity (compute throughput, MW, TPS). Pseudo:
(Cap_t - Cap_{t-1}) / Cap_{t-1} * 100
Why: Overbuild wastes capex; underbuild throttles revenue (throttling, outages). Benchmark idea: hyperscalers expand capacity ~in line with forecasted demand + buffer (e.g., 20–40% YoY during high-growth years).
Capacity Utilization % (CI) EN: Average share of provisioned capacity actually used (often measured at peak window or averaged). Pseudo:
Avg_Usage / Provisioned_Capacity * 100
Why: Direct efficiency metric: too low means idle assets, too high means no headroom for spikes. Benchmark idea: Power plants target ~60–80% load factors; cloud infra teams often aim 40–60% average to leave burst room.
Provisioned Capacity (PC) EN: The maximum sustained throughput you can deliver (e.g., kWh/day, requests/sec). Pseudo:
Cap = Σ(node_capacity) (or grid MW installed)
Why: Sets the ceiling; also the denominator for utilization. Needed for capex planning.
Peak Usage / Peak Load (PU) EN: Highest instantaneous (or short-window) usage observed in the period. Pseudo:
max(usage_rate_t) over period
Why: Determines required headroom and auto-scaling needs; drives worst-case cost.
Benchmark: Peak multiples of 1.5–3× average are common; extreme bursty APIs can see 10×.
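The growth formulas defined earlier in this list (DG, CG, and GEF) can be sketched together as follows. This is an illustrative calculation only: the helper names and the usage, capacity, and margin figures are hypothetical assumptions, not data from any real provider.

```python
def pct_growth(current: float, previous: float) -> float:
    """Period-over-period change in percent: (x_t - x_{t-1}) / x_{t-1} * 100."""
    if previous == 0:
        raise ValueError("previous must be non-zero")
    return (current - previous) / previous * 100


def growth_efficiency_factor(demand_growth_pct: float,
                             capacity_growth_pct: float,
                             gmu_pct: float) -> float:
    """GEF = (Demand_Growth % / Capacity_Growth %) * (GMU / 100)."""
    if capacity_growth_pct == 0:
        raise ValueError("capacity_growth_pct must be non-zero")
    return demand_growth_pct / capacity_growth_pct * gmu_pct / 100


# Hypothetical year-over-year figures.
usage_prev, usage_now = 400.0, 600.0      # billed units (millions)
cap_prev, cap_now = 1_000.0, 1_250.0      # deliverable capacity (same units)
gmu = 80.0                                # gross margin per unit, in percent

dg = pct_growth(usage_now, usage_prev)    # demand growth %
cg = pct_growth(cap_now, cap_prev)        # capacity growth %
gef = growth_efficiency_factor(dg, cg, gmu)

print(f"DG = {dg:.1f}%, CG = {cg:.1f}%, GEF = {gef:.2f}")
```

With these assumed inputs, demand grows twice as fast as capacity, and the 80% margin scales that ratio down to the final GEF.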
Peak-to-Average Ratio (PAR) EN: Ratio of peak load to average load. Pseudo:
Peak_Usage / Average_Usage
Why: Quantifies burstiness; high PAR stresses infra cost and pricing design (need overage/tier pricing).
Benchmark: Utilities PAR ~1.3–1.6; consumer APIs (chat/LLM) can be >3–5 during viral events.
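Peak usage, PAR, and capacity utilization all come from the same load series, so a small sketch can compute them side by side. The sample request rates and the provisioned capacity below are hypothetical values picked for easy arithmetic.

```python
from statistics import mean

# Hypothetical request rates (requests/sec), one sample per 3-hour window over a day.
usage_rps = [200, 150, 100, 150, 300, 500, 600, 400]
provisioned_rps = 1_000  # assumed provisioned capacity, same unit as the samples

peak_usage = max(usage_rps)            # Peak Usage (PU)
average_usage = mean(usage_rps)        # average load over the period
par = peak_usage / average_usage       # Peak-to-Average Ratio (PAR)

# Capacity Utilization % = Avg_Usage / Provisioned_Capacity * 100
utilization_pct = average_usage / provisioned_rps * 100

print(f"peak = {peak_usage} rps, average = {average_usage} rps")
print(f"PAR = {par:.2f}, utilization = {utilization_pct:.1f}%")
```

The sampling window matters: coarser windows smooth out short spikes and understate both peak and PAR, so the window length should match how quickly capacity can actually be added.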
Capex per Unit Capacity (CAPU) EN: Capital required to add one unit of capacity (e.g., $ per kW, $ per 1k TPS). Pseudo:
Capex_added / Capacity_added
Why: Guides ROI on expansion; lower CAPU means cheaper scaling.
Benchmark: Data center build costs ≈ $7–12M per MW; GPU clusters vary wildly but trend ~$25–40 per deployable TFLOP.
Auto-Scaling Latency (ASL) EN: Time it takes to provision additional capacity after demand spike. Pseudo:
t(scale_complete) - t(threshold_trigger)
Why: Slow scaling means you must over-provision; fast scaling lets you run lean.
Benchmark: Best cloud-native infra targets seconds–minutes; regulated utilities can’t “auto-scale,” they plan years ahead.
Headroom % (HR) EN: Buffer capacity above expected peak. Pseudo:
(Provisioned_Capacity - Expected_Peak) / Provisioned_Capacity * 100
Why: Prevents outages and throttling; too much wastes capital.
Benchmark: Cloud SREs often keep 20–30% headroom; grid operators maintain N-1 redundancy (varies but ~15–25% capacity reserve).
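A minimal sketch of the headroom formula, plus its algebraic inverse for sizing capacity to a target headroom. The inverse is derived here by solving the formula for Provisioned_Capacity; it is not stated in the definition above, and the function names and figures are hypothetical.

```python
def headroom_pct(provisioned_capacity: float, expected_peak: float) -> float:
    """HR = (Provisioned_Capacity - Expected_Peak) / Provisioned_Capacity * 100."""
    if provisioned_capacity <= 0:
        raise ValueError("provisioned_capacity must be positive")
    return (provisioned_capacity - expected_peak) / provisioned_capacity * 100


def capacity_for_headroom(expected_peak: float, target_headroom_pct: float) -> float:
    """Solve the HR formula for capacity: Cap = Expected_Peak / (1 - HR / 100)."""
    if not 0 <= target_headroom_pct < 100:
        raise ValueError("target_headroom_pct must be in [0, 100)")
    return expected_peak / (1 - target_headroom_pct / 100)


# Hypothetical cluster: 1,000 rps provisioned, 750 rps expected peak.
hr = headroom_pct(1_000, 750)
needed = capacity_for_headroom(750, 25)

print(f"headroom = {hr:.1f}%")
print(f"capacity needed for 25% headroom at 750 rps peak = {needed:.0f} rps")
```

Note that headroom is expressed as a share of provisioned capacity, not of expected peak, so 25% headroom means provisioning one third more than the peak.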
Usage Concentration Risk % (auxiliary) EN: % of total usage (or revenue) coming from top X customers. Pseudo:
Usage_top_X / Total_Usage * 100
Why: Whales are great—until they churn. Tracks fragility of revenue base. Benchmark: Aim <30% from top 5; many infra/API firms start >50% and work it down over time.
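The customer-level metrics in this list (UV, AU, AUPU, EUP, and usage concentration) can all be derived from one table of per-customer totals. The sketch below assumes a simple list of tuples; the customer names, figures, and helper name are invented for illustration.

```python
# Hypothetical per-customer totals for one period: (customer_id, usage_units, revenue_usd).
records = [
    ("acme", 500_000, 2_000.0),
    ("globex", 300_000, 1_350.0),
    ("initech", 150_000, 750.0),
    ("umbrella", 50_000, 300.0),
    ("hooli", 0, 0.0),  # no billable usage, so not counted as active
]


def top_x_concentration_pct(usages: list[float], x: int) -> float:
    """Usage_top_X / Total_Usage * 100."""
    total = sum(usages)
    if total == 0:
        raise ValueError("total usage must be non-zero")
    top = sum(sorted(usages, reverse=True)[:x])
    return top / total * 100


usages = [units for _, units, _ in records]

total_usage = sum(usages)                                # Total Usage Volume (UV)
total_revenue = sum(rev for _, _, rev in records)
active_users = sum(1 for units in usages if units > 0)   # Active Users (AU)
aupu = total_usage / active_users                        # Average Usage per User
eup = total_revenue / total_usage                        # Effective Unit Price
top2_pct = top_x_concentration_pct(usages, 2)            # concentration, top 2

print(f"UV = {total_usage:,} units, AU = {active_users}, AUPU = {aupu:,.0f}")
print(f"EUP = ${eup:.4f} per unit, top-2 concentration = {top2_pct:.0f}%")
```

In this made-up data set the top two customers dominate usage, which is the kind of whale dependence the concentration metric is meant to surface.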