PaddySpeaks · Systems at the Whiteboard · Nº 02

The Stacked-Delivery Problem

Model the order lifecycle of a food-delivery marketplace — placement to prep to pickup to drop — for three stakeholders who want contradictory truths, while one courier carries two strangers' dinners on a single run. A complete working through: the three grains, the bridge that makes stacking honest, the SLA invariant, streaming ingestion, payout close, and the dashboard.

§ 01 — THE QUESTIONOne order, three stakeholders, two dinners in one bag

This prompt sounds like a CRUD app and is in fact a marketplace with three irreconcilable points of view — and the moment you allow one courier to carry two orders at once, the naïve schema lies to at least one of them.

Interview Prompt

"Model the order lifecycle for a food-delivery platform — placement, restaurant prep, courier pickup, delivery — serving customers (on-time SLA), couriers (earnings), and restaurants (operations), with native support for stacked orders where one courier delivers several orders on a single dispatch. How do you scope it?"

LEVEL · SENIOR / STAFFDURATION · 45 MINFORMAT · WHITEBOARD

The trap is the word order. It invites a single fact table — one fat row per order, with the customer columns, the restaurant columns, and the courier columns all sitting side by side — and for a while it works. Then the dispatch engine, hunting for efficiency, hands one courier two orders bound for the same block. Now there is no honest place to write the courier's pay. Split it across the two order rows and you have invented a fiction; the courier was never paid "per order," they were paid for a route. Write it once and double-count it on read. The single fact has no grain that fits all three readers, because there is no single event they are all watching.

A weak answer reaches for one table and patches the stacked case with a nullable batch_id and a prayer. A strong answer notices that the prompt names three actors who each live at a different grain — and that the only clean model is to give each their own fact and a bridge to reconcile them. So before any tables, the frame for the session:

THE CUSTOMER GRAIN
One row per order. Promised time versus delivered time, five lifecycle timestamps, fees and the editable tip. This reader never wants to know that their dinner shared a car. Their question is binary: was it on time, and if not, whose fault.
THE COURIER GRAIN
One row per dispatch. From "accept" to "all delivered" — a route, not an order. Carries the payout, the total distance, the batched flag. The courier is paid for the run; the run is the fact.
THE RESTAURANT GRAIN
One row per order, read sideways. Prep variance, cancellation source, GMV. The restaurant shares the customer's table but asks a different column of it — how long did the kitchen take, and how often did it bail.
A fact table has exactly one grain. Three readers asking three questions about three different events need three facts — and one bridge to make them agree on the money.

Scoping out loud

Scope is the first scored dimension, so name it. In scope: the order/dispatch/bridge core, the state-event log that powers stage attribution, mutable tips, and the analytics that serve each side. Out of scope, said explicitly: the real-time dispatch optimizer (the routing brain that decides which orders to stack — treated as an upstream service that hands us a dispatch plan), menu and catalog, payments rails, fraud, and the customer-facing map. The caveat: the model must not preclude a courier picking up a third order mid-route, so stacking is N-ary from day one, never hard-coded to two.

Then the envelope math, volunteered. Marketplace-shaped numbers at national scale:

QuantityEstimateConsequence
Orders per day25,000,000≈ 290/s average, ~1.5 K at the dinner peak
State events per order~7placed, confirmed, ready, picked_up, delivered + retries
State-event writes≈ 2,000/s peakThe append-only firehose that shapes the telemetry tier
Batch rate (stacked share)~35% of dispatchesThe bridge is the common case, not an edge case
Tip-mutation window24 h post-deliveryEconomics must be append-only; no row is final on delivery
Active couriers at peak~600,000Sizes the dispatch fact and the earnings rollup
On-time SLA target≥ 92% of ordersThe single number the customer org is graded on

Read the table and the architecture is half-decided. The volume is trivial — 290 orders a second is a rounding error against the ride-sharing firehose. The pressure here is not throughput; it is reconciliation. One in three runs is stacked, every tip can move for a day after the fact, and three orgs will write conflicting SQL against the same data unless the grains are kept clean. The rest of this article follows the money through the bridge.


§ 02 — DATA FLOWFollowing an order through the building

Two write paths, one reconciliation. Orders and dispatches are born in separate services at separate tempos; the bridge is stamped the instant the dispatcher commits a plan; the state-event log is the spine that lets anyone reconstruct what happened and when.

ORDER PLANE · CUSTOMER + RESTAURANT GRAIN DISPATCH PLANE · COURIER GRAIN + THE BRIDGE CUSTOMER APP place order · cart ORDER SERVICE writes fct_orders emits state events EVENT LOG key: order_id · ordered PLACED…READY…DELIVERED RESTAURANT TABLET confirm · mark ready STAGE-LATENCY JOB prep / pickup / drive WAREHOUSE (PARQUET) facts + dims · SCD2 DISPATCH OPTIMIZER stack plan (upstream) DISPATCH SERVICE writes fct_dispatches + brg_dispatch_orders BRIDGE WRITER one txn · SUM(share)=payout COURIER APP leg sequence · GPS legs EARNINGS / PAYOUT tip lock at +24h · close SLA / NOTIFICATIONS promised_ts watchdog bridge reads order_ids from log SOLID — durable writes · DASHED — the reconciliation join · The order is the customer's truth; the dispatch is the courier's; the bridge is the treaty.
FIG. 1 — Two planes, one bridge. Orders and dispatches are written independently; the bridge row is the single transaction that reconciles the courier's pay to the customers' orders.

Three properties of this picture do most of the interview's work. First, the order plane and the dispatch plane never share a row — the customer's experience and the courier's pay are written by different services and only ever joined, never merged, so neither reader can corrupt the other's grain. Second, the bridge is the only place stacking is expressed, and it is written in a single transaction whose invariant is arithmetic: the per-leg payout shares must sum to exactly the dispatch payout. Third, the event log is dashed into everything because it owns no truth of its own — it is an append-only narration that lets the stage-latency job, finance, and disputes each reconstruct the timeline without locking a hot row.

The Reconciliation Philosophy, In One Rule

Each grain owns its own numbers; the bridge owns only the relationship. A late delivery is a fact of fct_orders and never touches courier pay. A generous tip is a fact of fct_orders that flows to the dispatch through the bridge's leg_payout_share, but the order row remains the source of record. When the two disagree — and after a 24-hour tip edit they will — the order is authoritative for the customer, the dispatch is authoritative for the courier, and finance restates from the tip-lock boundary. No reader is ever forced to filter another reader's rows.


§ 03 — DATA MODELTwo facts, a bridge, and an append-only log

The schema falls out of the grain question. Customer truth: fct_orders. Courier truth: fct_dispatches. The treaty between them: brg_dispatch_orders. The narration that explains everything: fct_order_state_events. Around them, conformed dimensions, SCD2 where history has legal weight.

The customer fact

One row per order, born at placement and filled in as the order moves. The five lifecycle timestamps are nullable until reached — a ready_ts of NULL is not missing data, it is an order the kitchen has not finished. The two columns that earn their keep are tip_locked_ts (the immutability boundary; NULL means the tip can still change) and terminal_state, which records not just that an order ended but who ended it — the cancellation source is the difference between a refund and a restaurant warning.

DDL · CUSTOMER GRAIN
-- Truth #1: the customer's order. One row, five lifecycle stamps, -- fees, and a tip that stays mutable for 24h after delivery. CREATE TABLE fct_orders ( order_key BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY, order_id UUID NOT NULL UNIQUE, -- natural key customer_id BIGINT NOT NULL REFERENCES dim_customers, restaurant_id BIGINT NOT NULL REFERENCES dim_restaurants, delivery_h3 BIGINT NOT NULL, -- H3 res 8 drop cell -- the five timestamps: NULL until the stage is reached placed_ts TIMESTAMPTZ NOT NULL, confirmed_ts TIMESTAMPTZ, ready_ts TIMESTAMPTZ, picked_up_ts TIMESTAMPTZ, delivered_ts TIMESTAMPTZ, promised_ts TIMESTAMPTZ NOT NULL, -- the SLA promise, set at placement -- customer-facing economics subtotal_usd NUMERIC(8,2) NOT NULL, tax_usd NUMERIC(8,2) NOT NULL DEFAULT 0, delivery_fee_usd NUMERIC(8,2) NOT NULL DEFAULT 0, service_fee_usd NUMERIC(8,2) NOT NULL DEFAULT 0, promo_applied_usd NUMERIC(8,2) NOT NULL DEFAULT 0, tip_usd NUMERIC(8,2) NOT NULL DEFAULT 0, -- mutable up to tip_locked_ts tip_locked_ts TIMESTAMPTZ, -- NULL = tip still editable terminal_state TEXT CHECK (terminal_state IN ( 'delivered','cancelled_customer', 'cancelled_restaurant','cancelled_courier','abandoned')) ); CREATE INDEX idx_orders_rest ON fct_orders (restaurant_id, placed_ts DESC); CREATE INDEX idx_orders_cust ON fct_orders (customer_id, placed_ts DESC);

The courier fact and the bridge

The dispatch is the courier's shift-segment — accept to last drop — and its grain is deliberately blind to how many orders rode along. The count lives in total_orders as a convenience denormalization, but the truth of which orders, in which sequence, at what per-leg distance and pay, lives only in the bridge. A single dispatch carrying one order produces one bridge row with leg_sequence = 1; a stacked dispatch carrying three produces three. Stacking is not a flag bolted onto a 1:1 table — it is the cardinality of the bridge.

DDL · COURIER GRAIN + STACKING BRIDGE
-- Truth #2: the courier's dispatch. One row per route, not per order. -- Pay is earned for the run; the bridge attributes it back to orders. CREATE TABLE fct_dispatches ( dispatch_key BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY, dispatch_id UUID NOT NULL UNIQUE, courier_id BIGINT NOT NULL REFERENCES dim_couriers, dispatch_start_ts TIMESTAMPTZ NOT NULL, -- "accept" dispatch_end_ts TIMESTAMPTZ, -- "all delivered" is_batched BOOLEAN NOT NULL DEFAULT FALSE, total_orders SMALLINT NOT NULL DEFAULT 1, -- = COUNT(*) in the bridge total_distance_km NUMERIC(6,2) NOT NULL DEFAULT 0, base_pay_usd NUMERIC(8,2) NOT NULL DEFAULT 0, tips_distributed_usd NUMERIC(8,2) NOT NULL DEFAULT 0, payout_usd NUMERIC(8,2) NOT NULL DEFAULT 0 -- base + tips ); -- The treaty. One row per (dispatch x order). N=1 normal, N>1 stacked. -- leg_sequence is the delivery order on the route; the shares MUST -- sum to the dispatch payout (enforced at close — see § 04). CREATE TABLE brg_dispatch_orders ( dispatch_key BIGINT NOT NULL REFERENCES fct_dispatches, order_key BIGINT NOT NULL REFERENCES fct_orders, leg_sequence SMALLINT NOT NULL, -- 1, 2, 3 ... leg_distance_km NUMERIC(6,2) NOT NULL, -- prev drop (or pickup) -> this drop leg_payout_share_usd NUMERIC(8,2) NOT NULL, -- this leg's slice of payout PRIMARY KEY (dispatch_key, order_key), UNIQUE (dispatch_key, leg_sequence) -- no two legs share a position );

The telemetry log

Every state transition lands in an append-only table, one row per change, never updated. This is what turns "the order was 18 minutes late" into "the restaurant prep took 22 minutes" — the difference between a complaint and an action. It also keeps the order row small and contention-free: payments, the SLA watchdog, and analytics are all downstream consumers of this log, not writers of fct_orders.

DDL · TELEMETRY GRAIN (APPEND-ONLY)
-- One row per order state change. Immutable. The stage-latency -- job and dispute tooling reconstruct any timeline from here. CREATE TABLE fct_order_state_events ( event_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY, order_id UUID NOT NULL, state TEXT NOT NULL CHECK (state IN ( 'PLACED','CONFIRMED','READY', 'PICKED_UP','DELIVERED','CANCELLED')), event_ts TIMESTAMPTZ NOT NULL DEFAULT now(), actor TEXT NOT NULL, -- customer | restaurant | courier | system detail JSONB ); CREATE INDEX idx_ose_order ON fct_order_state_events (order_id, event_ts);

The dimensions are conformed and mostly SCD2, because history here has consequences. dim_restaurants versions its hours and menu so a dispute six weeks later reads against the menu that was live then. dim_addresses is SCD2 for the same reason a refund is a legal artifact — you must be able to prove where the order was going on the night it failed. dim_couriers versions vehicle type and rating tier; dim_h3_cell is SCD1 because a hexagon's geometry does not have a history worth keeping.


§ 04 — THE CORE INVARIANTWhere the money is conserved

The correctness of the whole model lives in one arithmetic identity and one immutability boundary. The identity: the bridge's per-leg shares sum to the dispatch payout. The boundary: tip_locked_ts. Everything else is reporting.

A stacked dispatch is dangerous precisely because the same dollars are visible from two grains. The courier sees payout_usd on the dispatch; the customer's order contributes a tip that flows into that payout; an analyst joining the two can double-count or lose money in the seams. The guard is a closing transaction that refuses to commit a dispatch whose legs do not reconcile — RULE Nº 1 made executable.

The dispatch lifecycle

OFFERED ACCEPTED EN_ROUTE_PICKUP PICKED_UP DELIVERING CLOSED

A stacked dispatch loops through PICKED_UP → DELIVERING once per leg, but reaches CLOSED exactly once — and CLOSED is the only state in which money is written. Closing is where the invariant is checked: the sum of the legs' leg_payout_share_usd must equal the dispatch's payout_usd to the cent, or the close aborts and the dispatch is held for review. This is the food-delivery equivalent of a ledger that must sum to zero: the route's pay is conserved across its legs.

SQL · THE CLOSING TRANSACTION — PAY IS CONSERVED
-- Close a dispatch atomically. The bridge shares must reconcile to -- the payout, or nothing commits. This is the system's correctness. BEGIN; -- 1. recompute payout from base + the tips of the orders on this run UPDATE fct_dispatches d SET tips_distributed_usd = sub.tips, payout_usd = d.base_pay_usd + sub.tips, dispatch_end_ts = now() FROM ( SELECT b.dispatch_key, sum(o.tip_usd) AS tips FROM brg_dispatch_orders b JOIN fct_orders o USING (order_key) WHERE b.dispatch_key = :dispatch_key GROUP BY b.dispatch_key ) sub WHERE d.dispatch_key = sub.dispatch_key; -- 2. the invariant: leg shares must equal payout, to the cent. DO $$ DECLARE drift NUMERIC; BEGIN SELECT d.payout_usd - coalesce(sum(b.leg_payout_share_usd), 0) INTO drift FROM fct_dispatches d LEFT JOIN brg_dispatch_orders b USING (dispatch_key) WHERE d.dispatch_key = :dispatch_key GROUP BY d.payout_usd; IF abs(drift) > 0.005 THEN RAISE EXCEPTION 'dispatch % unbalanced: drift=%', :dispatch_key, drift; END IF; END $$; COMMIT;
The sum of the legs equals the pay of the route. A dispatch that cannot prove it does not close.STACKING RULE Nº 1 — THE BRIDGE BALANCES

The second half of the invariant is temporal. A tip is mutable for 24 hours, which means a dispatch can close and its payout can still move. The model handles this without mutating closed facts: the tip edit is an append to the order's economic history, the dispatch's payout_usd is recomputed by the close job's idempotent re-run, and the moment the 24-hour window expires the order's tip_locked_ts is stamped. After that instant the order is frozen and finance restates the courier's earnings from the locked figure. Before the lock, every reader knows the number is provisional because the column tells them so.


§ 05 — INGESTION & STREAMSPython on the lifecycle

Three programs carry the write path: the state-event ingester that narrates every order, the bridge writer that stamps a stacked plan in one transaction, and the leg-share allocator that turns a route's pay into per-order credit. Each is small; the judgment is in what they refuse to do.

1 · The state-event ingester — append, never mutate

Every transition is an immutable row. The ingester is idempotent by construction — a redelivered event with the same (order_id, state) is dropped, not double-written — so the upstream bus can be at-least-once without polluting the timeline. It refuses two things on principle: it never updates a prior state row, and it never advances fct_orders directly. The order row is updated by a separate projection that reads this log; the log is the source of truth.

PYTHON · STATE-EVENT INGESTER (IDEMPOTENT)
import asyncpg VALID = {"PLACED","CONFIRMED","READY","PICKED_UP","DELIVERED","CANCELLED"} async def ingest_state_event(pool, ev: dict) -> None: """One transition -> one append. At-least-once safe: a duplicate (order_id, state) is ignored, not re-inserted. We NEVER update an earlier row -- a corrected timestamp is a new event with actor=system, so the audit trail keeps the mistake AND the fix.""" if ev["state"] not in VALID: raise ValueError(f"unknown state {ev['state']!r}") async with pool.acquire() as con: await con.execute(""" INSERT INTO fct_order_state_events (order_id, state, event_ts, actor, detail) VALUES ($1, $2, $3, $4, $5) ON CONFLICT DO NOTHING """, ev["order_id"], ev["state"], ev["event_ts"], ev["actor"], ev.get("detail")) # the projection that advances fct_orders.{confirmed,ready,...}_ts # is a separate consumer of this table -- ingestion stays pure append.

2 · The bridge writer — a stacked plan in one transaction

The dispatch optimizer (upstream, out of scope) hands us a plan: one courier, an ordered list of orders, a per-leg distance. The bridge writer's whole job is to commit the dispatch row and every bridge row together, so the system is never observed in a state where a dispatch exists without its legs. It refuses partial writes — a plan with a gap in leg_sequence, or a duplicated order, is rejected before any row is touched.

PYTHON · BRIDGE WRITER (ALL-OR-NOTHING)
async def write_dispatch(pool, plan: dict) -> None: """Commit a dispatch and its 1..N legs atomically. A stacked plan is the common case (~35%), so this is the hot path, not an edge case. Validate the shape BEFORE the transaction: legs must be 1..N with no gaps and no repeated order -- the bridge's UNIQUE(dispatch,seq) is the backstop, but we fail fast with a readable error first.""" legs = plan["legs"] seqs = [l["leg_sequence"] for l in legs] if sorted(seqs) != list(range(1, len(legs) + 1)): raise ValueError(f"leg_sequence must be 1..N contiguous, got {seqs}") if len({l["order_key"] for l in legs}) != len(legs): raise ValueError("an order may ride a dispatch at most once") async with pool.acquire() as con, con.transaction(): dk = await con.fetchval(""" INSERT INTO fct_dispatches (dispatch_id, courier_id, dispatch_start_ts, is_batched, total_orders, total_distance_km, base_pay_usd) VALUES ($1,$2,$3,$4,$5,$6,$7) RETURNING dispatch_key """, plan["dispatch_id"], plan["courier_id"], plan["start_ts"], len(legs) > 1, len(legs), sum(l["leg_distance_km"] for l in legs), plan["base_pay_usd"]) await con.executemany(""" INSERT INTO brg_dispatch_orders (dispatch_key, order_key, leg_sequence, leg_distance_km, leg_payout_share_usd) VALUES ($1,$2,$3,$4,$5) """, [(dk, l["order_key"], l["leg_sequence"], l["leg_distance_km"], 0) for l in legs]) # shares start at 0; the allocator fills them at close, then the # § 04 invariant proves they sum to payout. No partial dispatch is # ever visible: the transaction makes the route and its legs one write.

3 · The leg-share allocator — distance, then tip, then truth

When a dispatch closes, its payout must be split across the legs. The allocation is not arbitrary: base pay follows distance (the courier earned the miles), and each order's own tip is credited to its own leg (the tip belongs to the customer who gave it). The allocator's one discipline is that the parts must reconstitute the whole — it allocates by rounding down and assigns the residual cent to the longest leg, so the sum is exact and the § 04 guard always passes.

PYTHON · LEG-SHARE ALLOCATOR (EXACT TO THE CENT)
from decimal import Decimal, ROUND_DOWN def allocate_shares(base_pay: Decimal, legs: list[dict]) -> list[dict]: """Split a dispatch payout across its legs so the parts sum to the whole EXACTLY. Base pay is distance-weighted; each leg also keeps its own order's tip. Rounding losses are swept to the longest leg, so there is never a stray cent for the close invariant to trip on.""" total_km = sum(Decimal(str(l["leg_distance_km"])) for l in legs) or Decimal(1) cent = Decimal("0.01") out, allocated_base = [], Decimal(0) for l in legs: km = Decimal(str(l["leg_distance_km"])) share = (base_pay * km / total_km).quantize(cent, ROUND_DOWN) allocated_base += share out.append({**l, "base_share": share, "tip": Decimal(str(l["tip_usd"]))}) # sweep the rounding residual to the longest leg -- exact reconciliation residual = base_pay - allocated_base longest = max(out, key=lambda l: l["leg_distance_km"]) longest["base_share"] += residual for l in out: l["leg_payout_share_usd"] = l["base_share"] + l["tip"] return out # sum(leg_payout_share_usd) == base_pay + sum(tips), exactly

One carve-out, always stated: the allocator never decides whether to stack. Which orders share a route is the optimizer's call, made on live road conditions and SLA risk; the allocator only divides money after the fact. Keeping the economic split downstream of the routing decision is what lets the marketplace re-tune its batching policy without ever touching the payout code.


§ 06 — AGGREGATIONStage latency and the nightly payout close

Two derived layers carry the slow loop. Stage latency turns the event log into per-stage durations and a whose-fault verdict. The payout close, run nightly and again at the 24-hour tip-lock boundary, freezes earnings and proves the bridge still balances.

Stage latency is a streaming projection over fct_order_state_events. For each order it pairs adjacent transitions — CONFIRMED→READY is prep, READY→PICKED_UP is pickup wait, PICKED_UP→DELIVERED is the drive — and writes them to a wide latency row that the restaurant and customer dashboards both read. The craft is in the attribution: a late order is not interesting; a late order with a 22-minute prep stage is an action item for a specific kitchen. The job is windowed on event time with a watermark, because a tablet that reconnects after a dead zone will deliver a READY event minutes after the wall clock moved on, and processing-time windows would blame the courier for the restaurant's Wi-Fi.

PYTHON · STAGE-LATENCY PROJECTION (EVENT-TIME)
# Pair adjacent state transitions into per-stage durations, keyed by # order, windowed on event time so reconnect-jitter can't misattribute. stages = (events .assign_timestamps(watermark_delay_s=120) # tablets reconnect late .key_by(lambda e: e.order_id) .map_with_state(pair_adjacent)) # (from_state,to_state,dt) def stage_durations(transitions): # transitions arrive ordered per order; emit durations once both ends seen seen = {t.state: t.event_ts for t in transitions} def mins(a, b): return (seen[b] - seen[a]).total_seconds() / 60 if a in seen and b in seen else None return { "prep_min": mins("CONFIRMED", "READY"), "pickup_wait_min": mins("READY", "PICKED_UP"), "drive_min": mins("PICKED_UP", "DELIVERED"), "total_min": mins("PLACED", "DELIVERED"), }

The payout close is the slow loop's other half — an idempotent batch that can be run any number of times and converges to the same answer. It recomputes each open dispatch's payout from the current tips, reruns the allocator, re-asserts the § 04 invariant, and — for any order whose delivery is now more than 24 hours old — stamps tip_locked_ts. Idempotence is the whole game: the close runs nightly for fresh dispatches and a second time after the lock window, and because every step is a recomputation rather than an increment, a retried or double-fired close never moves a courier's pay twice.

The close may run a hundred times; the courier is paid once. Idempotence is the difference between a batch job and a liability.PAYOUT RULE Nº 1 — CONVERGENT CLOSE

§ 07 — ANALYTICS SQLInterrogating three stakeholders

The facts are where the system explains itself — once per grain. Three queries an interviewer loves, because each serves a different stakeholder and each carries a classic SQL pattern on its back.

Whose fault was the late order? — conditional bucketing

The customer org's defining query. Decompose the order's lifecycle into stages and let a CASE ladder name the bottleneck. The pattern is conditional aggregation's cousin — per-row classification — turning five timestamps into a single actionable verdict.

SQL · STAGE LATENCY + WHOSE-FAULT ATTRIBUTION
-- Per-order stage breakdown with a bottleneck verdict. Customer-facing -- SLA plus the one thing support actually needs: who to call. SELECT order_id, extract(epoch FROM confirmed_ts - placed_ts) / 60 AS confirm_min, extract(epoch FROM ready_ts - confirmed_ts) / 60 AS prep_min, extract(epoch FROM picked_up_ts - ready_ts) / 60 AS pickup_wait_min, extract(epoch FROM delivered_ts - picked_up_ts) / 60 AS drive_min, extract(epoch FROM delivered_ts - placed_ts) / 60 AS total_min, (delivered_ts <= promised_ts) AS on_time, CASE WHEN ready_ts - confirmed_ts > INTERVAL '20 min' THEN 'restaurant_slow' WHEN picked_up_ts - ready_ts > INTERVAL '10 min' THEN 'courier_late_pickup' WHEN delivered_ts - picked_up_ts > INTERVAL '20 min' THEN 'courier_slow_drive' ELSE 'on_track' END AS bottleneck FROM fct_orders WHERE placed_ts::date = CURRENT_DATE - 1 AND delivered_ts IS NOT NULL;

Is stacking worth it for couriers? — split aggregation over the bridge

The courier org's defining query, and the one that justifies the entire bridge. Compare earnings per hour for batched versus single dispatches. If stacking does not beat single by a clear margin, the optimizer should stop offering it — the whole economic case is in this GROUP BY is_batched.

SQL · EARNINGS PER HOUR — STACKED VS SINGLE
-- Stack uplift: the courier value of batching, quantified. If batched -- earns < ~1.3x single per hour, stacking isn't paying the courier back. SELECT is_batched, count(*) AS dispatches, round(sum(payout_usd) / sum(extract(epoch FROM dispatch_end_ts - dispatch_start_ts) / 3600), 2) AS earnings_per_hr, round(sum(payout_usd) / sum(total_orders), 2) AS payout_per_order, round(avg(total_distance_km / total_orders), 2) AS km_per_order FROM fct_dispatches WHERE dispatch_start_ts::date BETWEEN '2025-05-01' AND '2025-05-31' AND dispatch_end_ts IS NOT NULL GROUP BY is_batched;

Which kitchens are overloaded? — percentile aggregation with a filtered rate

The restaurant org's defining query. Prep-time p50/p95 per restaurant per day, beside the restaurant-sourced cancellation rate. The pattern is percentile aggregation paired with a FILTERed count — one pass yields both the typical kitchen and its tail, plus how often it bailed entirely.

SQL · PREP P50/P95 + CANCELLATION RATE PER RESTAURANT
-- Kitchen health. Spiking p95 prep = overload; high cancel rate = a -- restaurant accepting orders it can't make. Both warned, then throttled. SELECT restaurant_id, placed_ts::date AS d, count(*) AS orders, percentile_cont(0.50) WITHIN GROUP ( ORDER BY extract(epoch FROM ready_ts - confirmed_ts) / 60) AS prep_p50_min, percentile_cont(0.95) WITHIN GROUP ( ORDER BY extract(epoch FROM ready_ts - confirmed_ts) / 60) AS prep_p95_min, round(100.0 * count(*) FILTER ( WHERE terminal_state = 'cancelled_restaurant') / nullif(count(*), 0), 1) AS cancel_pct FROM fct_orders WHERE confirmed_ts IS NOT NULL GROUP BY restaurant_id, placed_ts::date HAVING count(*) >= 10 ORDER BY prep_p95_min DESC;

§ 08 — THE DASHBOARDProving three grains stay honest

A senior design ends with observability, because every clean grain above is invisible without it. The dashboard watches the three stakeholders separately — each has its own definition of "healthy" — and one tile watches the seam between them.

CUSTOMER
on-time % (delivered_ts ≤ promised_ts), total delivery time p50/p95, bottleneck mix (the share of late orders blamed on prep vs pickup vs drive) — a shift toward restaurant_slow means the kitchens, not the couriers, need attention.
COURIER
earnings per hour split batched vs single, stack uplift ratio (the multiplier that justifies stacking), idle time between dispatches, legs per stacked dispatch — uplift sliding toward 1.0 is the signal to dial batching back.
RESTAURANT
prep p95, cancelled_restaurant rate, GMV per restaurant — the throttle list is built from the top of the p95 distribution.
THE SEAM
bridge balance drift — the count of dispatches whose leg shares failed to reconcile at close. This should be exactly zero; any non-zero value is a money bug, not a metric.
Marketplace Ops — Bay Area FRI 19:10 PT · ALL GRAINS · 60s REFRESH
On-Time %
90.4%
Delivery Time p95
48min
Stack Uplift
1.56×
Bridge Drift
0
Late-Order Bottleneck Mix — dinner rush 18:30–19:10
restaurant_slow 61% pickup_wait 26% slow_drive 13%
Earnings / Hr (batched)
43.75$
Earnings / Hr (single)
28.0$
Prep p95 (worst kitchen)
34min
Restaurant Cancel %
3.4%
Legs / Stacked Disp
2.2
Tip Still Mutable
18%
FIG. 2 — The story a healthy overload tells: on-time slipping amber and delivery time stretching, but the bottleneck bar says the kitchens are the cause — couriers are fine, stack uplift holds at 1.56×, and bridge drift is flat zero, because the money still reconciles.

Read the amber tiles together and the dashboard narrates a Friday dinner rush from the operator's chair: demand spiked, kitchens fell behind (prep p95 at 34 minutes, restaurant_slow owning 61% of the late orders), so on-time dipped and delivery time stretched — while couriers kept earning, stacking kept paying its 1.56× premium, and the seam stayed honest. The single most important tile is the boring one: bridge drift at zero. Everything else is a tunable; that one is the law.


§ 09 — THE RUBRICWhat was actually being tested

Strip the dinners away and the question was testing five judgments, each of which generalizes far beyond food delivery:

GRAIN
Hearing three readers where the prompt says one entity — and giving each a fact at its own grain instead of one fat table that lies to two of them.
THE BRIDGE
Modeling a many-to-many relationship as a first-class table, so stacking is the cardinality of a join, not a flag and a special case. The bridge is what makes batching expressible at all.
CONSERVATION
Finding the one arithmetic invariant — the legs sum to the payout — and making it executable at close, so a money bug aborts a transaction instead of leaking into a report.
MUTABILITY
Handling a number that changes after the fact without mutating closed facts: append the correction, recompute idempotently, freeze at the lock boundary. The tip moves; the audit trail doesn't.
ATTRIBUTION
Turning "it was late" into "the kitchen took 22 minutes" — designing the telemetry so every metric points at a stakeholder who can fix it.
Three grains keep three stories honest. One bridge keeps the money whole. Everything else is reporting — careful, stakeholder-aware reporting.— CLOSING ARGUMENT