Model Amazon orders end to end — partial returns, replacements, marketplace sellers (1P vs FBA vs 3P), and a single SKU pulled from two warehouses. The decision that separates a senior answer: never update history. Append the refund, log the movement, recognize revenue through one column. A full working through — data flow, schema, the ledger invariant, streaming Python, nightly reconciliation, analytics SQL, and the dashboard.
A sale is the easy part. The hard part is everything that reshapes it afterward — a partial return a week later, a replacement that triple-counts at the gross level, a 3P seller where Amazon books commission instead of the whole price, and stock that has to reconcile to the ledger every single night.
"Design a model for Amazon orders that handles partial returns, replacements, marketplace seller orders — 1P retail, FBA, and 3P marketplace — and inventory drawn from multiple warehouses with FIFO/LIFO consumption rules. How would you scope this out?"
The instinct that fails this interview is the instinct to keep an order's row up to date. A customer returns three of six pens, so you decrement the quantity; a seller is marketplace, so you overwrite the revenue; stock ships, so you set the on-hand number. Every one of those is an UPDATE, and every UPDATE destroys the one thing finance and operations cannot live without — the ability to ask what was true last Tuesday and get an honest answer. Returns reshape revenue retroactively. The seller-of-record decides whether a dollar is GMV or commission. And the same physical pen lives at two grains at once: a financial event that must never be mutated, and a stock movement that must reconcile to a count. The whole question is how to hold both without lying about either.
So before any boxes and arrows, the working frame for the session — two ledgers and a snapshot, each immutable in its own way:
Scope is the first scored dimension, and most candidates skip it. Said out loud: in scope is the order-to-cash and inventory model — order lines, returns, replacements, movements, the daily snapshot, and the seller-recognition rule that ties them to finance. Out of scope, explicitly: the checkout and payments-capture flow (we receive an order, we don't authorize the card), the routing/carrier and last-mile logistics engine, pricing and promotions optimization (we record the discount, we don't compute it), and the warehouse-management WMS internals (we consume its movement events). The caveat: FIFO/LIFO consumption is in scope as a costing rule on movements, because it changes the dollar value of cost-of-goods, even though the physical pick path is the WMS's job.
Then the envelope math, volunteered rather than extracted. Amazon-shaped numbers:
| Quantity | Estimate | Consequence |
|---|---|---|
| Orders / day (peak) | ~70–80 M | Sets the write rate on the financial ledger |
| Avg lines per order | ~2.5 | Plus warehouse splits → more rows than orders |
| Inventory movements / day | ≈ 1–2 B | The number that shapes the whole architecture |
| Return rate | ~5–15 % (category-dep.) | Reshapes revenue weeks after the sale |
| 3P share of units | ~60 % | Most rows hit the commission carve-out |
| Active SKUs × warehouses | ~10⁹ snapshot rows | The nightly reconciliation surface |
| Reconciliation SLA | nightly · zero drift | Snapshot must equal SUM(movements) |
Notice where the weight sits. The financial ledger is large but tame — tens of millions of orders a day is gigabytes, and finance reads it in batch. The firehose is the physical ledger: a billion-plus signed movements a day, every one immutable, all of which must sum back to the snapshot to the unit. That single row of the table — movements as the source of truth, snapshot as a derived cache — dictates the storage layout, the reconciliation job, and the failure philosophy. The rest of the design follows the stock.
A sale forks into two streams the moment it is placed: a financial event onto the order ledger, and a physical event onto the movement log. Returns and replacements append to both. The daily snapshot is a fold of movements, and reconciliation is the proof the two ledgers still agree.
Three properties of this picture do most of the interview's work. First, nothing on the financial side is ever mutated — a return does not edit the order line it refunds, it appends a row that points back at it, so net revenue is a subtraction across two append-only tables and any past period can be recomputed exactly. Second, the snapshot is a cache, not a source: on-hand stock is a fold of the immutable movement log, so if the snapshot is lost or corrupted it rebuilds from movements rather than from a backup of a number nobody can trust. Third, one real-world event lands at two grains on purpose — a sale is both an order line and a sale-out movement; a return is both a refund row and a return-in movement — and keeping them as separate facts that reference a shared key is what lets finance and operations each have a complete, non-lossy view of the same pen.
The snapshot may lag; the log may not drop. If the nightly fold fails, yesterday's on-hand is stale but the movement log is intact, so the snapshot simply rebuilds on the next run — staleness self-heals. But a lost movement is unforgivable: it breaks the reconciliation invariant permanently and corrupts cost-of-goods, so movement ingestion is exactly-once-effective via idempotency keys, durable, and back-pressured rather than dropped. A snapshot you can recompute beats a movement you can never recover.
The schema falls out of the immutability question. A financial fact at the sale grain and a returns fact that only ever appends; a physical movement fact that is the system's source of truth; a daily snapshot derived from it; and a replacement bridge that links orders without collapsing them.
An order line is the financial atom: one row per (order × product × source_warehouse), so a three-item order whose middle line splits across two warehouses is three rows that share an order_id and a line_number. The customer still sees one line item; the warehouse split is queryable because it is a distinct row, not a buried attribute. Crucially there is no returned_qty column here — returns live in their own table, because writing them back would destroy the historical truth.
A return references the order line it reverses and carries its own partial quantity, refund amount, and the condition the goods came back in. Append-only is the contract: an unreturn (a customer who changes their mind, a reversed RMA) is another row, not an edit. This is what lets finance restate March's net revenue in June and get a number that matches what March actually closed at, because nothing about March was ever overwritten.
Every change to stock is a signed movement: a receipt is positive, a sale-out negative, a return-in positive again, transfers net to zero across two warehouses, damage and adjustments are explicit. The unit_cost_usd on each movement is what makes FIFO/LIFO costing possible — consuming oldest-cost or newest-cost layers is a function of how you read these rows, not a number you store. This table is never updated; an error is corrected by an offsetting adjustment movement, so the audit trail stays complete.
The dimensions are conventional and mostly SCD2 — dim_products (titles and dimensions drift), dim_warehouses, dim_customers with SCD2 dim_addresses, dim_promotions — but one carries the whole revenue policy on its back. dim_sellers holds seller_type (retail_1p, fba_2p, marketplace_3p), a commission_pct, and the column that does the work: recognized_as, either gmv or commission_only. That one SCD2 field is the difference between booking a $22 sale and booking $3.30 of commission on it — and because it is a dimension column, the rule is a join-and-CASE, not a separate pipeline.
This system's correctness lives in one equation that must hold every night, per product, per warehouse: opening plus receipts plus returns-in, minus sales, minus damage, plus net transfers, equals ending. If it ever fails to balance, the physical ledger and the snapshot have disagreed — and you trust the ledger.
The invariant is a conservation law for atoms. The snapshot is not allowed to hold a number that the movement log cannot reproduce, so the nightly job does not set the ending quantity — it computes it from the day's signed movements added to yesterday's close, and stores both the computed value and its inputs. Reconciliation then re-derives the same arithmetic and compares: a clean run returns nothing, a non-empty result is a row where stock has gone unaccounted, which is an operational incident, never a number to quietly overwrite.
There is a second invariant on the financial side, quieter but just as load-bearing: net revenue is always a subtraction, never an edit. A replacement deliberately triple-counts at the gross level — the original sale, the return that reverses it, and the new replacement line all post — and the only reason that is safe is that the three rows net to zero by construction. The original gross is cancelled by the refund; the replacement stands as the real revenue. Collapse those into one mutated row and you lose both the audit trail and the ability to explain a customer's history; keep them as three append-only facts and the arithmetic takes care of itself.
The forward path — placed, shipped, delivered, settled — is a sequence of timestamp columns on the order line, set as the events arrive; it is the only place the order row "changes," and even then it only fills nulls, never rewrites a value. The return loop is entirely separate: it does not reopen the order line, it appends to fct_returns and emits a return_in movement, so a refund three weeks after delivery reshapes revenue and restores stock without ever touching the historical sale. That separation — forward state as fill-only timestamps, reversals as new facts — is what keeps the two ledgers honest.
Three small programs carry the write path: the order splitter that fans one cart into warehouse-grained lines plus their sale-out movements, the movement consumer that keeps the physical log exactly-once, and the return handler that appends a refund and its return-in. Each is small; the judgment is in what they refuse to do.
The splitter's most important behavior is that a single customer line can become several rows when stock is sourced from more than one warehouse — and it emits the physical twin of each at the same time. One placed order yields N order lines and N matching sale-out movements, and they share the order_line_id so finance and operations can always rejoin them.
One carve-out, always stated: the splitter does not decide costing. It records the unit_cost_usd of the layer it consumed, but whether the business reports FIFO or LIFO cost-of-goods is a read-time decision over the movement history — the writer's job is only to never lose which cost layer left the shelf.
A return is two appends that must both land: a refund row that references the original order line, and a return-in movement that puts the goods back (if they came back saleable). The handler refuses to UPDATE the order line, and it prices the refund proportionally to the partial quantity — three of six pens returns half the line's revenue, not the whole thing.
Two derived layers turn logs into answers. A nightly job folds the movement log into the on-hand snapshot — computing, never setting, the ending quantity. And the net-revenue mart applies the seller carve-out, so 1P books GMV and 3P books commission, from one query instead of two pipelines.
The snapshot fold is where the reconciliation invariant is born. Yesterday's ending_qty becomes today's opening_qty; the day's signed movements are bucketed by type and summed; the new ending is the arithmetic sum, stored alongside the inputs that produced it. Because every input is kept, the reconciliation check in §04 can re-derive the same number and catch any disagreement. The weighted-average cost is folded the same way — a value-weighted blend of opening stock and the day's receipts — which is the default costing layer the FIFO/LIFO read-time logic can override per report.
The net-revenue mart is where the recognition column earns its keep. The carve-out is a single CASE: when recognized_as = 'gmv', Amazon books the whole gross; when 'commission_only', it books only gross × commission_pct. Returns are recognized through the same lens — a refund on a 3P order reverses only the commission, not the whole refund amount — so net revenue stays internally consistent across the 1P/3P boundary without a second code path.
The two append-only ledgers and the derived snapshot are where the system explains itself. Three queries an interviewer loves, because each carries a classic pattern on its back — a percentile funnel, a cohort net-revenue join across the carve-out, and a FIFO cost-layer walk.
The customer-facing question. The pattern is a duration funnel over the fill-only timestamps on the order line, split by the membership flag, reporting both the average and the tail — because Prime's promise is about the 95th percentile, not the mean.
The finance question, and the one that proves append-only pays off. Net lifetime value of a customer cohort is gross recognized minus refunds recognized, both passed through the seller carve-out — a join that would be impossible if returns had been written back into the order rows, because the historical gross would already be corrupted.
The operations-meets-finance question, and the payoff for keeping unit_cost_usd on every movement. FIFO cost-of-goods is a window walk: order the receipt layers oldest-first, take a running sum of their quantities, and consume against the day's sales — the classic running-balance allocation that values each unit sold at the cost of the oldest layer still on the shelf.
A senior design ends with observability, because the immutability discipline above is invisible without it. The fulfillment dashboard watches three concerns — the customer promise, the financial restatement, and the physical reconciliation — and the single most important tile is the one that should always read zero.
Read the tiles together and the dashboard narrates the discipline. The drift tile reads zero, which is the whole point: a billion-plus movements folded into the snapshot and the arithmetic still balanced to the unit, so finance and operations are looking at the same stock. The two revenue lines tell a strategy story — 3P commission climbing relative to 1P GMV — that is only legible because the carve-out is a column. And the amber return-rate tile is the kind of signal append-only was built to surface: because returns are their own facts, the spike can be sliced by category, reason, and cohort without ever having corrupted the original sales it reverses.
Strip the order details away and the question was testing five judgments, each of which generalizes far beyond e-commerce: