Design Stripe's payments ledger across charges, refunds, chargebacks, payouts and multi-currency FX — where every transaction's entries must sum to exactly zero, nothing is ever updated, and a regulator can ask for the books as they stood on any past day. One decision separates the senior answer: append-only is not a style, it is the invariant. A full working-through — data flow, the double-entry schema, the SUM=0 heart, replay-safe ingestion, the daily integrity check, and the reconciliation dashboard.
Most systems can afford a correction. Payments cannot. The instant money moves, the record of how it moved becomes evidence — and evidence you can edit is evidence you cannot defend. That single constraint dictates the entire model.
"Design the data model for Stripe's payment system — charges, refunds, chargebacks, payouts, multi-currency FX, reserves — with rigorous double-entry accounting where every transaction's entries sum to zero per currency. Append-only: never UPDATE, for regulatory audit compliance. How would you scope it?"
The prompt hands you the answer and then tests whether you understand why. "Double-entry" and "append-only" are not flourishes; they are the same idea viewed from two sides. A regulator can ask, of a system that moves a trillion dollars a year, "what was merchant X's balance on the fifteenth, as the books showed it then?" — and the answer must be correct regardless of every refund, chargeback, and correction posted since. You cannot reconstruct that from a mutable balance column. You can only reconstruct it from a history that was never allowed to change. The model is forced: an immutable log of signed movements, and a balance that is nothing but a sum over that log.
The weak answer is a charges table with a status column that flips to 'refunded'. It loses the past the moment it updates a row, it cannot survive a retry without double-charging, and it has no honest answer for FX. The strong answer notices that an accountant solved this in the fourteenth century: every event is a balanced set of debits and credits, recorded forever, and the truth is derived, not stored. So before any schema, the working frame for the session:
Scope, said before any boxes: in scope is the ledger spine, the SUM=0 invariant, idempotency, the reversal pattern for refunds and chargebacks, and multi-currency FX locked at event time. Explicitly out of scope, named aloud: card-network authorization and the issuer rails (we model the outcome of an auth, not the ISO 8583 handshake), fraud scoring (a separate system that reads the ledger), and the payout banking integration beyond the entries that represent it. One non-negotiable I will state up front: there is exactly one writer discipline — append a balanced set of entries, or append nothing. No partial transactions, ever.
Then the envelope math, volunteered. Stripe-shaped numbers:
| Quantity | Estimate | Consequence |
|---|---|---|
| Payment volume / year | ~$1 T | Sets the regulatory bar: SOX, PCI, banking audit — all assume immutability |
| Charges / day | ~10^8 | The business-event rate; modest writes, enormous stakes |
| Ledger entries / charge | 3–8 | The fan-out that makes the SUM=0 invariant meaningful |
| Chargeback arrival lag | 30–90 days | The past must stay open to new entries long after it "closed" |
| Currencies in flight | ~135 | Every cross-currency event needs an FX leg with a locked rate |
| Acceptable balance error | $0.00 | The invariant is exact, not approximate; any drift halts payments |
| Idempotency-key retries | routine | Networks fail mid-write; replay must be a no-op, not a double-charge |
Notice the row that is unlike any other system: the acceptable error is zero. Not "five nines," not "eventually consistent" — zero, exactly, per currency, per transaction, continuously. A recommender can be noisy and a search index can be stale, but a ledger that is off by a cent is not a degraded ledger, it is a broken one. That absolutism is the spine of the whole design, and it is enforceable precisely because the model is append-only.
One spine, two readers. Producers translate business events into balanced entry sets and append them through an idempotency gate. The ledger is the immutable record; balances and reports are derived views that can be thrown away and rebuilt at any time.
Three properties of this picture carry the interview. First, the entry builder asserts SUM=0 before anything is sent — an unbalanced transaction never reaches the log, so the invariant is a precondition of writing, not a hope checked afterward. Second, the idempotency gate sits between the builder and the log, so a producer that retries after a timeout collides on a unique key and the second attempt is silently absorbed; money is moved at-most-once even though the network guarantees at-least-once delivery. Third, everything below the dashed line is derived: balances, the merchant API, the regulator's as-of-date view. Drop the entire read layer and you have lost nothing but cache — the truth survives intact in the append-only log, and the balances rebuild from a single scan.
The log never lies, and the log never forgets. Under load, a balance read may serve a slightly stale snapshot — acceptable, because the merchant API is a convenience view. But a write degrades in exactly one direction: it either appends a fully balanced, idempotent set of entries, or it appends nothing and fails loudly. There is no half-posted transaction, no UPDATE to patch a mistake, no DELETE to make a problem disappear. Corrections move forward as new entries; the past is read-only by construction.
The schema is almost austere. A single append-only fact carries every movement. A small chart of accounts gives each party a balance. FX rates are a slowly-changing dimension. The balance and the business-event grouping are derived. The discipline is in the constraints, not the column count.
The ledger entry is the only table the system genuinely owns. Each row is one signed amount on one account in one currency, tied to a business event by balance_txn_id. The two load-bearing columns are source_event_id, the idempotency key with a UNIQUE constraint that turns retries into no-ops, and reversal_of_id, the self-reference that makes a refund or chargeback point back at what it undoes — without ever touching the original row.
Two timestamps, not one, and the distinction is the whole audit story. effective_ts is when the money moved in the world; posted_ts is when the row landed in the books. A chargeback that happened in March but posts in May has a March effective_ts and a May posted_ts — and "the balance as the books showed it on April 1" is a filter on posted_ts <= '2025-04-01', which correctly excludes a movement not yet recorded then. One column separates economic reality from recorded reality, and regulators care about both.
Every party in a transaction is an account with a balance — not just merchants and customers, but internal accounts: fees, reserve, tax, fx_holding, chargeback_reserve. This is what lets a single charge balance to zero: the money the merchant gains is exactly the money the customer's account loses plus the fee revenue Stripe recognizes. FX rates are SCD2 because the rate used must be the one that was in force at posting — frozen onto the entry, never re-derived at read time.
Two more tables, both derived, neither a source of truth. fct_balance_transactions groups the atoms of one business event for convenient query; snap_account_balance_daily is the running sum materialized for sub-second balance reads. The crucial property: both can be dropped and rebuilt from the log by replaying it. They exist for speed, and they are allowed to be temporarily stale, because correctness lives upstream of them.
Every correctness guarantee in this system collapses to one statement: for any business event, the signed entries sum to zero within each currency. It is not a report you run nightly to feel reassured. It is a gate the books are not allowed to pass through unbalanced.
Double-entry's genius is that it makes a whole class of bugs arithmetically impossible to hide. If a producer drops a leg, miscomputes a fee, or fat-fingers a sign, the transaction's currency sum is no longer zero, and a single query finds it. There is no way for money to silently appear or vanish, because every credit is born with its matching debit. The model does not trust the producer to be correct; it gives the producer a property that is cheap to verify and impossible to fake.
Consider a €100 charge with a €2.90 fee. It is one business event and four atoms, and the EUR sum is exactly zero:
The invariant runs continuously, not as reassurance but as a circuit breaker. An empty result set is health; a single returned row is a producer bug, and the response is not a Jira ticket — it is to halt the payments pipeline before the corruption propagates into a merchant payout. This is the one query the on-call engineer is woken for.
There is a second invariant hiding inside the first, and it is what makes the system replay-safe: the UNIQUE constraint on source_event_id. A producer that times out mid-write does not know whether its entries landed, so it retries with the identical event id. The retry's INSERT collides on the unique key; the application catches the violation and returns the original transaction. The money is moved exactly once even though the write was attempted twice — idempotency enforced at the database, not hoped for in application logic. SUM=0 keeps the books honest; UNIQUE keeps them from double-counting. Together they are the entire correctness story.
Three small programs carry the write path: the entry builder that turns an event into a balanced set, the FX builder that locks a rate into the ledger, and the idempotent appender that makes a retry a no-op. Each is defined as much by what it refuses to do as by what it does.
The builder's contract is absolute: it returns a set of entries whose per-currency sums are zero, or it raises. It never returns a partial transaction for the caller to "finish later," because a partial transaction is the one thing the ledger must never see. The SUM=0 assertion lives here, in the producer, so that an unbalanced set dies in application code and never reaches the durable log.
Note the use of Decimal, not floating point. Money in binary float is a bug waiting for the right two numbers; cents must be exact, and the invariant must hold to the cent, not to the nearest epsilon. This is a small choice that an interviewer will notice you making, and a large one to get wrong.
A cross-currency event is the canonical four-entry transaction, and its discipline is to record the conversion rather than compute it at read time. The rate is read from the SCD2 dimension as of the event, stamped onto the entries, and balanced per currency independently. "What did we earn in USD in March?" then sums the USD-denominated rows and never re-runs FX — because the rate that mattered is already frozen in the row.
The carve-out an interviewer listens for is in that docstring: this function has no UPDATE path. There is deliberately no code anywhere that mutates a posted entry, because the existence of such a path is itself the vulnerability. Append, or fail, or absorb a duplicate — those are the only three outcomes, by construction.
Two derived layers sit atop the log: balances, which are running sums, and corrections, which are new transactions. A refund is not an edit. A chargeback is not an edit. They are fresh, balanced sets of entries that point back at what they undo — which is the only way the past can stay both correct and immutable.
The junior instinct is to mark a charge "refunded" with an UPDATE. The double-entry answer appends the mirror image: every leg of the original, with the sign flipped, in a new transaction whose entries carry reversal_of_id back to the originals. Both transactions exist forever. The merchant's balance moves because the sum over the log moved — not because any historical row changed.
Balances, then, are pure derivation. The daily snapshot is a windowed running sum over the log per account and currency — materialized for speed, but defined entirely by the entries. Drop it and rebuild it from one scan and nothing is lost. This is the property that lets a finance team trust a sub-second balance read: it is not an independent number that could be wrong, it is a cache of a sum that is always recomputable.
That rule resolves the hardest tension in payments accounting. Card networks deliver chargebacks thirty to ninety days after the charge — long after the period that contained the charge has "closed." A model that treats a closed month as immutable-and-sealed has nowhere to put them. This model does: the close locks the past against edits, but the append-only log always accepts new entries. A late chargeback posts today, with an effective_ts back in the original month for economic reporting and a posted_ts of now for the audit trail, attributed to a post-close adjustment sub-period. The merchant's balance silently steps down sixty days later — correctly, traceably, and without anyone having reopened a single historical row.
The append-only log answers questions no mutable table can. Three queries an interviewer loves, each carrying a named pattern: the as-of-date audit reconstruction, a self-join for dispute lag, and conditional aggregation for realized FX.
The regulator's question, and the one the whole design exists to answer: merchant X's balance on a historical date, as it appeared then, regardless of corrections posted since. The pattern is the as-of-date reconstruction — a filter on posted_ts, summing the immutable log. Because rows are never edited, the answer is stable forever; re-run it next year and it returns the same number.
Risk's question: which merchants are bleeding chargebacks, and how long after the charge do disputes land? The pattern is a self-join over the log — charges joined to their reversing chargebacks via reversal_of_id — with conditional aggregation for the rate and a timestamp difference for the lag. Merchants over a threshold trigger a reserve increase, closing the loop back to the chart of accounts.
Finance's question: how much did currency movement actually cost or earn? Because every conversion is a recorded entry with a frozen rate, realized FX is just the sum of the dedicated fx_gain and fx_loss legs — no re-conversion, no guessing what rate applied. The pattern is conditional aggregation, slicing the same log by entry type and reporting currency.
A ledger's dashboard has one tile that matters more than all the others: the invariant. Everything else is context. The operator's first glance answers a single question — is every transaction still summing to zero? — and only then looks at volume, disputes, and FX.
Read the amber and red tiles together and the dashboard narrates the open past from §06. Chargebacks are landing on a 45-to-60-day lag, post-close adjustments are flowing into prior months as new entries, and net realized FX is mildly negative on a strong-dollar week. None of it touches a historical row; all of it moves balances forward. And the one green tile that governs everything — zero unbalanced transactions — is what lets finance trust every other number on the screen.
Strip away the payments specifics and the question was probing five judgments, each of which generalizes far past Stripe: