Offering Metered Usage
Definition (short). Customers pay in proportion to what they consume—API calls, GB stored, kWh, messages, transactions. Revenue scales with usage, not (only) with time or seats; pricing can be linear or tiered.
Recent examples. AWS, Azure, and GCP bill per compute hour, GB, or request; AWS alone produced >$100 B in 2024 revenue. Utilities charge ~16.4¢ per kWh to U.S. residential customers on average (Apr 2025), and the average home uses ~10,500 kWh/year (≈875 kWh/mo).
Historical example. Edison’s Pearl Street Station (1882) billed electricity by the kilowatt‐hour—one of the earliest metered models. Taxi meters (1890s) and postage (per-letter) are other classic pay-per-use forebears.
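
The linear-versus-tiered distinction in the definition above can be sketched in a few lines. This is a minimal illustration, not a reference billing implementation: the function names, the tier boundaries, and the per-unit prices are all hypothetical, and the tiered variant assumes graduated pricing (each tier's price applies only to the units that fall inside that tier).

```python
from typing import List, Optional, Tuple

# A tier is (upper_bound_in_units, price_per_unit).
# None as the upper bound marks the open-ended last tier.
Tier = Tuple[Optional[float], float]


def linear_charge(units: float, unit_price: float) -> float:
    """Linear pricing: every unit costs the same."""
    return units * unit_price


def tiered_charge(units: float, tiers: List[Tier]) -> float:
    """Graduated pricing: each tier's price applies only to units inside that tier."""
    total = 0.0
    lower = 0.0  # lower bound of the current tier
    for upper, price in tiers:
        if units <= lower:
            break  # no usage reaches this tier
        top = units if upper is None else min(units, upper)
        total += (top - lower) * price
        if upper is None:
            break
        lower = upper
    return total


# Hypothetical tier table: first 1,000 units at 0.10, next 9,000 at 0.08, the rest at 0.05.
EXAMPLE_TIERS: List[Tier] = [(1_000, 0.10), (10_000, 0.08), (None, 0.05)]

# Linear example using the figures quoted above: 875 kWh/month at 16.4 cents per kWh.
monthly_bill = linear_charge(875, 0.164)

# Tiered example: 12,000 units spread across all three hypothetical tiers.
tiered_bill = tiered_charge(12_000, EXAMPLE_TIERS)
```

With the residential figures quoted above, the linear case works out to roughly $143.50 per month. In the tiered case the realized average price per unit falls as usage grows.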

KPI Definitions 
- Profit per Unit of Usage (PPUU)
  - EN: Profit generated for each billable unit (after variable costs).
  - Pseudo: PPUU = Effective_Unit_Price - Unit_Cost, or more commonly (Revenue - Variable_Costs) / Total_Units
  - Why: Forces clarity on both monetization (price) and efficiency (cost). High volume without profit per unit is a trap; high margin without volume underutilizes the model.
  - Benchmark: Digital infra aims for 70–85% gross margin.
- Total Usage Volume (UV)
  - EN: Sum of consumed units (API calls, kWh, GB, etc.).
  - Pseudo: Σ usage_units_all_customers
  - Why: Primary revenue driver; shows engagement/demand intensity.
  - Benchmark: Target >30–50% YoY usage growth early on.
- Effective Unit Price (EUP)
  - EN: Average realized price per unit after tiers/discounts.
  - Pseudo: Total_Usage_Revenue / Total_Usage_Volume
  - Why: Monetization efficiency; erosion may signal commoditization.
- Gross Margin per Unit % (GMU)
  - EN: % of unit price kept after variable cost.
  - Pseudo: (EUP - Unit_Cost) / EUP * 100
  - Why: Unit economics; determines scalability of volume growth.
- Active Users / Accounts (AU)
  - EN: Unique customers generating billable usage in period.
  - Pseudo: COUNT(DISTINCT user_id WHERE usage>0)
  - Why: Breadth of adoption; needed to diversify revenue and de-risk whale dependence.
- Average Usage per User (AUPU)
  - EN: Mean consumption per active user.
  - Pseudo: Total_Usage_Volume / Active_Users
  - Why: Depth of engagement; helps forecast infra needs.
- Tiered / Overages Mix % (TIER)
  - EN: Share of revenue from higher tiers or overage charges vs base rate.
  - Pseudo: Revenue_from_Tiers_>1 / Total_Usage_Revenue * 100
  - Why: Shows success of pricing design in capturing heavy users’ value.
- Unit Cost (Variable)
  - EN: Direct variable cost per unit (bandwidth, compute, fuel).
  - Pseudo: Variable_Costs / Total_Units
  - Why: Drives GMU; optimizing infra/vendor deals pushes PPUU up.
- New Active Users (ACQ)
  - EN: Fresh customers who generated usage this period.
  - Pseudo: COUNT(users with first_usage_date in period)
  - Why: Pipeline for future volume; complements expansion of existing users.
- Expansion Usage % (EXPu)
  - EN: Incremental usage from existing users vs last period.
  - Pseudo: (Usage_existing_t - Usage_existing_{t-1}) / Usage_existing_{t-1} * 100
  - Why: The land-and-expand health metric for a usage-based model.
- Growth Efficiency Factor (GEF)
  - EN: How efficiently you convert capacity (supply) growth into profitable usage growth.
  - Pseudo: GEF = (Demand_Growth % / Capacity_Growth %) * (GMU / 100)
  - Why: If you scale infra faster than demand, or without keeping margins, you destroy PPUU. Because GMU / 100 is at most 1, GEF > 1 requires demand to grow faster than capacity by enough to offset the margin factor (e.g., at 75% GMU, demand must grow more than 1.33× as fast as capacity).
  - Benchmark idea: Cloud/API teams often target GEF ≥ 1.2 in growth phases; utilities, constrained by regulation, hover near 1 (capacity expansions match load forecasts).
- Demand (Usage) Growth % (DG)
  - EN: Year-over-year change in total billed usage units.
  - Pseudo: (Usage_t - Usage_{t-1}) / Usage_{t-1} * 100
  - Why: Core top-line driver in metered models; usage growth proves product value faster than price hikes do.
  - Benchmark idea: Early-stage APIs: 30–50%+ YoY; mature utilities: 1–2% YoY.
- Capacity Growth % (CG)
  - EN: YoY change in maximum deliverable capacity (compute throughput, MW, TPS).
  - Pseudo: (Cap_t - Cap_{t-1}) / Cap_{t-1} * 100
  - Why: Overbuild wastes capex; underbuild throttles revenue (throttling, outages).
  - Benchmark idea: Hyperscalers expand capacity roughly in line with forecasted demand + buffer (e.g., 20–40% YoY during high-growth years).
- Capacity Utilization % (CI)
  - EN: Average share of provisioned capacity actually used (often measured at peak window or averaged).
  - Pseudo: Avg_Usage / Provisioned_Capacity * 100
  - Why: Direct efficiency metric: too low means idle assets, too high means no headroom for spikes.
  - Benchmark idea: Power plants target ~60–80% load factors; cloud infra teams often aim for 40–60% average to leave burst room.
- Provisioned Capacity (PC)
  - EN: The maximum sustained throughput you can deliver (e.g., kWh/day, requests/sec).
  - Pseudo: Cap = Σ(node_capacity) (or grid MW installed)
  - Why: Sets the ceiling; also the denominator for utilization. Needed for capex planning.
- Peak Usage / Peak Load (PU)
  - EN: Highest instantaneous (or short-window) usage observed in the period.
  - Pseudo: max(usage_rate_t) over period
  - Why: Determines required headroom and auto-scaling needs; drives worst-case cost.
  - Benchmark: Peak multiples of 1.5–3× average are common; extreme bursty APIs can see 10×.
- Peak-to-Average Ratio (PAR)
  - EN: Ratio of peak load to average load.
  - Pseudo: Peak_Usage / Average_Usage
  - Why: Quantifies burstiness; high PAR stresses infra cost and pricing design (need overage/tier pricing).
  - Benchmark: Utilities PAR ~1.3–1.6; consumer APIs (chat/LLM) can be >3–5 during viral events.
- Capex per Unit Capacity (CAPU)
  - EN: Capital required to add one unit of capacity (e.g., $ per kW, $ per 1k TPS).
  - Pseudo: Capex_added / Capacity_added
  - Why: Guides ROI on expansion; lower CAPU means cheaper scaling.
  - Benchmark: Data center build costs ≈ $7–12M per MW; GPU clusters vary wildly but trend ~$25–40 per deployable TFLOP.
- Auto-Scaling Latency (ASL)
  - EN: Time it takes to provision additional capacity after a demand spike.
  - Pseudo: t(scale_complete) - t(threshold_trigger)
  - Why: Slow scaling means you must over-provision; fast scaling lets you run lean.
  - Benchmark: Best cloud-native infra targets seconds–minutes; regulated utilities can’t “auto-scale,” they plan years ahead.
- Headroom % (HR)
  - EN: Buffer capacity above expected peak.
  - Pseudo: (Provisioned_Capacity - Expected_Peak) / Provisioned_Capacity * 100
  - Why: Prevents outages and throttling; too much wastes capital.
  - Benchmark: Cloud SREs often keep 20–30% headroom; grid operators maintain N-1 redundancy (varies but ~15–25% capacity reserve).
- Usage Concentration Risk % (auxiliary)
  - EN: % of total usage (or revenue) coming from top X customers.
  - Pseudo: Usage_top_X / Total_Usage * 100
  - Why: Whales are great, until they churn. Tracks fragility of revenue base.
  - Benchmark: Aim <30% from top 5; many infra/API firms start >50% and work it down over time.
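
Several of the pseudo-formulas above can be combined into one small calculator. The sketch below is illustrative only: the PeriodInputs container, its field names, the helper function names, and every sample number are assumptions made for the example, not values from a real billing or capacity system. Each function mirrors the corresponding Pseudo line in the list.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class PeriodInputs:
    """Hypothetical per-period inputs; field names are illustrative."""
    usage_revenue: float         # Total_Usage_Revenue
    variable_costs: float        # Variable_Costs
    total_units: float           # Total_Usage_Volume (UV)
    active_users: int            # Active Users (AU)
    avg_usage_rate: float        # Avg_Usage, same unit as capacity
    peak_usage_rate: float       # Peak Usage (PU)
    provisioned_capacity: float  # Provisioned Capacity (PC)
    expected_peak: float         # forecast peak used for Headroom %


def unit_economics(p: PeriodInputs) -> Dict[str, float]:
    """EUP, Unit Cost, PPUU, GMU and AUPU as defined in the list above."""
    eup = p.usage_revenue / p.total_units
    unit_cost = p.variable_costs / p.total_units
    return {
        "EUP": eup,
        "Unit_Cost": unit_cost,
        "PPUU": eup - unit_cost,
        "GMU": (eup - unit_cost) / eup * 100,
        "AUPU": p.total_units / p.active_users,
    }


def capacity_metrics(p: PeriodInputs) -> Dict[str, float]:
    """Capacity Utilization %, Peak-to-Average Ratio and Headroom %."""
    return {
        "Utilization": p.avg_usage_rate / p.provisioned_capacity * 100,
        "PAR": p.peak_usage_rate / p.avg_usage_rate,
        "Headroom": (p.provisioned_capacity - p.expected_peak)
        / p.provisioned_capacity * 100,
    }


def pct_growth(current: float, previous: float) -> float:
    """Period-over-period growth in percent (used for both DG and CG)."""
    return (current - previous) / previous * 100


def growth_efficiency(dg_pct: float, cg_pct: float, gmu_pct: float) -> float:
    """GEF = (Demand_Growth % / Capacity_Growth %) * (GMU / 100)."""
    return (dg_pct / cg_pct) * (gmu_pct / 100)


def concentration_pct(usage_by_customer: List[float], top_x: int) -> float:
    """Share of total usage coming from the top_x largest customers."""
    top = sorted(usage_by_customer, reverse=True)[:top_x]
    return sum(top) / sum(usage_by_customer) * 100


# Made-up sample period.
sample = PeriodInputs(
    usage_revenue=50_000.0, variable_costs=12_500.0, total_units=1_000_000.0,
    active_users=200, avg_usage_rate=400.0, peak_usage_rate=1_000.0,
    provisioned_capacity=1_250.0, expected_peak=1_000.0,
)
econ = unit_economics(sample)
cap = capacity_metrics(sample)
dg = pct_growth(1_000_000.0, 800_000.0)  # usage this period vs previous
cg = pct_growth(1_250.0, 1_000.0)        # capacity this period vs previous
gef = growth_efficiency(dg, cg, econ["GMU"])
top2_share = concentration_pct([500.0, 300.0, 100.0, 60.0, 40.0], top_x=2)
```

In this made-up period demand and capacity both grow 25%, so GEF collapses to the margin factor (about 0.75 at 75% GMU). Under the formula as written, balanced growth alone cannot push GEF above 1.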
