Design the billing backend for a usage-metered SaaS — flat seats plus hourly metering, mid-cycle plan changes, overages, free trials, multi-currency. One prompt, two revenue streams of opposite shape, and a single decision that keeps the invoice honest: a complete working-through of data flow, schema, the append-only ledger, streaming Python, the MRR roll-up, and the dashboard that proves the books balance.
Every data engineer who touches a revenue system eventually meets this question. It sounds like a CRUD app for invoices. It is two billing systems running on different clocks, joined only at the moment money is owed.
"Design the data model for a SaaS with mixed pricing — flat-rate seats plus usage-based metering, per-query, per-GB, per-API-call. Handle mid-cycle plan changes, prorations, overages, free trials, and multi-currency invoicing. How would you keep the bill correct?"
The trap is to reach for one fact table and a column called amount. The two revenue streams the prompt names have opposite shapes and opposite failure modes. Subscription is predictable, low-volume, and changes a handful of times per customer per year — a Pro seat is a flat number until someone clicks upgrade. Usage is volatile, arrives by the billion, and is worthless to restate after the fact — a per-query meter ticks tens of thousands of times a second across the fleet. A model that stores them at the same grain either crushes the subscription logic under usage volume or smears usage into a coarse monthly figure that finance can never audit. Keep them apart at the fact layer; join them only at invoice time.
A weak answer treats the invoice as a mutable running total it edits as charges land. A strong answer notices that the invoice is the wrong place to hold truth at all — truth lives in an append-only log of charges, and the invoice is a SUM over a window. So before any boxes and arrows, the working frame for the whole session:
Scope is the first scored dimension, and most candidates skip it. State what you build, what you ignore, and the numbers that shape every later choice. Out of scope here, said explicitly: the payment gateway itself (treated as a callable service that returns success/failure with its own idempotency key), tax engines and tax jurisdiction logic (a separate line-item service), revenue recognition under ASC 606 (downstream of this ledger), and fraud. In scope: how a charge becomes a row, how a row becomes an invoice, how a mid-cycle change stays correct to the penny, and how MRR is read in one query.
Then the envelope math, volunteered rather than extracted. Snowflake-shaped numbers — a usage warehouse where compute is metered by the credit:
| Quantity | Estimate | Consequence |
|---|---|---|
| Active subscriptions | 2,000,000 | Sets the daily snapshot size: 2 M rows/day, trivial |
| Raw meter events/day | ≈ 5 B | Per-query, per-call — cannot live at raw grain forever |
| Hourly meter rows/day | ≈ 2 M subs × 4 meters × 24 ≈ 190 M | The grain that shapes the whole architecture |
| Billing events/day | ~3 M | One charge stream; the audit spine, append-only |
| Invoices/month | ~2 M | Derived rollup; never the source of a number |
| MRR query | SUM over 2 M rows, 1 day | Sub-second on a partitioned snapshot — a single SELECT |
| FX rates | ~30 pairs × hourly | Locked at usage time; never restated when rates drift |
Notice the asymmetry. The raw meter firehose is three orders of magnitude above everything else, and the subscription state is so small it would be silly to model it as a slowly-changing dimension. That spread is the whole design: roll the firehose to the hour at the edge, snapshot the slow stream daily, and let an immutable ledger be the one place a dollar figure is ever authoritative. The rest of this article follows the money.
One ledger, two tributaries. The spine of the design is an append-only billing-event log fed by a metering pipeline on one side and a subscription state machine on the other — and read, at month-end, by a roll-up that mints invoices without ever holding a balance of its own.
Three properties of this picture do most of the interview's work. First, the firehose never reaches the ledger at raw grain — five billion daily ticks collapse to the hour at the aggregator, and only one usage_charge per subscription per period ever becomes a billing event. Second, authority is deliberately one-directional: every dollar that will appear on an invoice is first an append-only row in fct_billing_events, written exactly once, and the invoice roll-up holds no balance it could disagree with. Third, the two streams are joined only at the rightmost box — usage and subscription stay separate facts, at separate grains, until the moment finance asks "what does this customer owe," and the answer is a SUM with a GROUP BY.
The meter may lose a tick; the ledger may never lose a charge. A dropped raw usage event costs a fraction of a cent and self-heals at the next reconciliation pass against the warehouse's own audit logs — so the metering path optimizes for throughput and tolerates at-least-once duplication, made safe by idempotent upserts on the hour bucket. A lost billing event — a proration, a refund — is never acceptable: it lands through a durable, exactly-once-keyed write, and if the producer retries, UNIQUE(source_event_id) swallows the duplicate silently. Approximate the usage; never approximate the bill.
The schema falls out of the grain question. Volatile, high-volume truth: the hourly meter. Bounded, slow truth: the daily subscription snapshot. Immutable money: the billing-event ledger. The invoice is a view, derived and re-derivable.
The meter is the integration point, so its grain is chosen to serve three masters at once: fine enough for product's adoption analytics, coarse enough that the table does not explode, and aligned to the unit finance bills on. One row per (subscription, meter, hour). The decisive columns are billable_quantity — already net of the free allowance, so no consumer re-derives the allowance and they all agree — and the FX trio, locked at the hour the usage occurred and never restated.
A subscription changes a handful of times a year, so the temptation is to model it as a slowly-changing dimension and reconstruct any day's state with an as-of join. Resist it. There is enough daily flux — trials converting, accounts pausing, MRR ticking on plan change — that a daily snapshot is both simpler and faster to query than an SCD2 walk. The snapshot is bounded at two million rows a day and answers the question finance asks most: what was true on this day.
This is the table everything else exists to feed. Every movement of money is a row, never an update: a subscription_charge at the cycle boundary, a usage_charge at period close, a proration credit and a proration charge on a plan change, a refund, a credit, a dunning step. The source_event_id is the idempotency key — the producer's stable identifier for the thing that happened — and UNIQUE on it is what makes the whole pipeline replay-safe. A retried plan-change webhook cannot double-charge, because the second insert is a no-op.
And the invoice — deliberately last, deliberately derived. It is a materialization of a SUM over the ledger for one subscription and one period, refreshed at close and re-derivable from the events at any time. It stores split totals (subscription_total, usage_total, overage, proration, credits) for presentation, but it is never the place a figure originates. If the invoice and the ledger ever disagree, the ledger is right and the invoice is rebuilt.
The whole correctness of this system lives in one rule about mutation: a plan change does not edit a charge, it appends two. Everything else — invoices, MRR, NRR — is arithmetic over an immutable log, and arithmetic over an immutable log is auditable by construction.
Consider the canonical hard case. An account on Pro at $100/month upgrades to Enterprise at $300/month on day 15 of a 30-day cycle. The naive instinct is to find the existing $100 charge and change it. That instinct is the bug: it destroys the history a dispute needs, and it makes the invoice a recompute that can drift from what was actually billed. The correct move is to leave the original $100 charge untouched and append two proration rows — a credit for the unused remainder of the old plan, and a charge for the remainder of the new plan.
Each transition of the subscription lifecycle is itself a fact: a trial converting to active emits a subscription_charge and bumps mrr_usd in tomorrow's snapshot; a plan change inside ACTIVE emits the two proration events; a PAST_DUE step emits a dunning row; a cancel freezes MRR to zero from the effective day forward. The state machine never overwrites — it appends, and the daily snapshot reads the latest state forward. Trials are the quiet edge case the prompt plants: they look exactly like active subscriptions in every column except that mrr_usd = 0, so they appear in adoption analytics and trial-conversion cohorts without ever inflating revenue.
The deterministic identifiers :sub||':chg42:credit' and :chg42:charge are doing the heavy lifting. They are derived from the change request, not generated fresh, so the same plan-change event arriving twice — a webhook retry, a replayed Kafka offset, a backfill — produces the same two keys and the second attempt collides harmlessly. This is the difference between a billing system that is merely correct on the happy path and one that is correct under the duplicate-delivery realities of every distributed queue.
Three programs carry the data. The aggregator that tames the firehose into the hour, the proration calculator that turns a plan change into exactly two events, and the gateway-webhook handler that records settlement idempotently. Each is small; the judgment is in what they refuse to do.
The reasoning is physical: at five billion raw events a day, storing raw grain is paying for resolution no one will ever query. The product team wants the hour; finance bills on the hour. So the aggregator folds raw ticks into hour buckets, computes billable_quantity against the plan's free allowance once, and — critically — locks the FX rate to the hour the usage occurred. A rate that moves tomorrow must never change a charge that already happened. The write is an idempotent upsert keyed on (subscription, meter, hour), so an at-least-once stream that replays a window simply re-converges to the same totals.
One carve-out, always stated: the aggregator never emits a billing event. It fills the meter and stops. Turning a month of meter rows into a single usage_charge is the roll-up job's responsibility at period close — separating "record the usage" from "bill the usage" is what lets product read the meter all month without any of those reads being mistaken for charges.
Month-end is where the two streams finally meet. The close job does three things in order — turn the meter into one usage charge, sum the ledger into an invoice, and hand a balance to the gateway — and it holds no money of its own at any step.
The first act collapses a subscription's entire month of meter rows into a single usage_charge billing event. This is the one place usage crosses into the ledger, and it crosses as a SUM: all the hourly amount_usd for the period, plus the overage split that customer success watches, condensed to one immutable row with a stable key derived from (subscription, period). Because the key is stable, re-running the close — after a late-arriving meter correction, say — does not double-bill; it either inserts the missing charge or collides on the one already there.
The second act is the invoice itself — and it is, satisfyingly, just a pivoted SUM over the ledger. Every event type that landed in the period folds into its column; the grand total is the sum of all of them, which is identically the sum of the raw ledger rows. There is no balance being maintained, no running total to reconcile. The invoice is a photograph of the ledger taken at close, and if the photograph is ever doubted, you re-take it from the same rows and get the same picture.
The snapshot and the ledger are where the business explains itself. Three queries an interviewer loves, because each one carries a classic SQL pattern on its back — the MRR scan, the upsell threshold, and the cohort retention join.
The reason the daily snapshot earns its place: monthly recurring revenue is a single filtered SUM over today's partition. No ledger replay, no SCD2 walk, no window functions — active subscriptions, today, summed. Trials contribute nothing because their mrr_usd is zero by construction, so the number is clean without a special case.
Customer success wants accounts about to blow past their plan tier — the expansion trigger. This is a rolling-window aggregation of the meter joined to the plan cap, filtered at the 85% line. The shape is the classic "current usage versus an allowance" pattern: aggregate the fast fact over a trailing window, join the slow fact for the limit, and surface the ratio.
Sales lives on NRR: does a signup cohort's revenue grow as it ages, expansion outrunning churn? The pattern is cohort retention — anchor each subscription to its first-seen quarter and its starting MRR with a window function, join to its current MRR, and ratio the cohort sums. Above 100% is the SaaS holy grail: the cohort is worth more today than the day it arrived.
A senior design ends with observability, because every guarantee above is invisible without it. The billing dashboard watches three things a healthy system keeps true: revenue is growing, money is flowing, and the ledger reconciles to the gateway to the cent.
Read the amber tiles together and the dashboard narrates a month-end close in progress: usage is being rolled to charges, so unbilled usage is draining while ingest lag breathes — and the reconciliation row stays at zero throughout, because the invoice is a SUM of an immutable ledger and there is nothing for it to disagree with. That is what a designed billing system looks like from the controller's chair: the numbers move, but they never stop tying out.
Strip the billing details away and the question was testing five judgments, each of which generalizes far beyond SaaS: