Data Modeling Studio

Twenty-three modeling scenarios, the way an interviewer probes them — filter to the company you're prepping, then open a scenario into its tabbed study view.

Interview Studio · Design · PaddySpeaks

23 Scenarios Core · Advanced · Principal SQL · ER diagrams · Trade-offs

The twenty-three scenarios

Every modeling round tests the same five-move sequence — the domain changes what you say. Filter, search, or open a card for its study view.

  • 01Clarify use cases
  • 02Pin the grain
  • 03Conceptual → logical → physical
  • 04SCDs & time
  • 05Stress test

No scenarios match these filters.

Trip lifecycle (with multi-driver convoys)

Uber / Lyft · ride-sharing
Design a complete data model for the Trip Lifecycle — including the complex convoy / multi-driver scenario where one group request is split across two or more vehicles. Must support Operations (live ETA), Finance (driver payouts), and Risk (fraud).

Why this is hard — the design tension

The problem isn't a single trip — it's the group intent. Five friends going to the airport expect to be charged a single price, but Uber dispatches two vehicles. Operations sees two trips; Finance must charge the group once but pay two drivers fairly; Risk must verify the convoy actually moved together. The robust solution is a hierarchical Star Schema that decouples the physical movement of the car from the financial obligation of the rider.

Section 1 — Dimensional Model Overview

This model uses a Journey → Trip → Event hierarchy to handle everything from a single ride to complex shared convoys.

Fact Tables (the hierarchy)

  • fct_journeysThe "Parent" grain. One row per group request. Captures group intent (split type, total group cost, status).
  • fct_tripsThe "Vehicle" grain. One row per driver / vehicle per journey. Source of truth for Finance (driver payouts).
  • fct_trip_eventsThe "Telemetry" grain. One row per state transition (REQUEST, MATCH, PICKUP, etc.) per trip. Powers Ops (Live ETA) and Risk (GPS replay).
  • fct_trip_feedbackThe "Sentiment" grain. One row per (trip × rider) feedback event. Captures rating, category, free-text comment, sentiment. Powers Product (CSAT, NPS).
  • brg_journey_ridersThe "Financial" bridge. Links riders to journeys and to a specific assigned trip/vehicle — handles cost-splitting (journey-level) and capacity routing (trip-level).

Dimension Tables

  • dim_users — Riders and drivers, typed via user_role. SCD2 on plan tier and rating.
  • dim_vehicles — Vehicle specs (model, type), insurance status, owner. SCD2 on insurance status (Risk needs as-of-trip status).
  • dim_geography — split by volatility: the H3 grid itself (dim_h3_cell, level 8 ≈ 0.7 km²) is SCD1 because the geometry is static, but the operational overlays riding on it — tax zones, airport/surge rings, road status — are SCD2 (dim_surge_zone and road-state rows), because those boundaries get re-drawn. See §11.3 for SCD2 geography in action.
  • dim_payment_methods — Card / wallet on file. SCD2 — Finance audit needs the as-of-trip card.
  • dim_fx_rate — SCD2 daily rates; locked at request_timestamp.

Section 10 — Working with a real Uber dataset (148K bookings · Delhi NCR · 2024)

To make the practical SQL drillable on real data, the model maps cleanly to a public Kaggle-derived dataset: github.com/hannahsutton1/uber.data148,770 booking rows from Delhi NCR for calendar year 2024. Repo includes raw + cleaned CSVs plus the cleaning Jupyter notebooks.

Source schema (single wide table, 21 columns)

Date, Time, booking_ID, Booking_Status, Customer_ID, Vehicle_Type,
Pickup_Location, Drop_Location, Avg_VTAT, Avg_CTAT,
Cancelled_Rides_by_Customer, Reason_for_cancelling_by_Customer,
Cancelled_Rides_by_Driver, Driver_Cancellation_Reason,
Incomplete_Rides, Incomplete_Rides_Reason,
Booking_Value, Ride_Distance, Driver_Ratings, Customer_Rating, Payment_Method

Glossary: VTAT = vehicle-to-pickup minutes (driver arrival time). CTAT = customer-to-completion minutes (in-trip duration). Booking_Value = fare in INR. Ride_Distance in km. Ratings on a 1–5 scale.

Source → Target mapping (ETL plan)

Source column(s)Target table.columnNotes
booking_IDfct_trips.trip_idNatural key; idempotency anchor.
Date + Timefct_trips.request_timestampConcat to TIMESTAMP_TZ (Asia/Kolkata).
Customer_IDfct_trips.rider_id + dim_usersSingle-rider per booking.
Vehicle_Typedim_vehicles.typeEnum {Auto, Go Mini, Go Sedan, Premier Sedan, UberXL, Bike, eBike}.
Pickup_Location, Drop_Locationfct_trips.origin_h3, destination_h3Geocode locations → H3 cells (post-load enrichment).
Booking_Statusfct_trips.terminal_stateMap: {Completed, Cancelled by Customer, Cancelled by Driver, Incomplete, No Driver Found}.
Avg_VTATfct_trip_events (derived: MATCHED → PICKUP latency)Reverse-engineer the events table from VTAT/CTAT.
Avg_CTATfct_trip_events (derived: PICKUP → DROPOFF latency)
Booking_Valuefct_trips.total_fare_local (currency = INR)FX to USD via dim_fx_rate.
Ride_Distancefct_trips.trip_distance_km
Driver_Ratings, Customer_Ratingfct_trip_feedback.rating_score (typed by rater)Two rows per trip (driver→rider, rider→driver).
Reason_for_cancelling_by_Customerfct_trip_feedback.feedback_categoryFor cancelled trips.
Driver_Cancellation_Reasonfct_trips.cancellation_reasonFree-text → categorical NLP at ETL.
Payment_Methoddim_payment_methodsEnum {Cash, UPI, Debit Card, Credit Card, Wallet}.

Enrichment: inferring the journey hierarchy from flat bookings

The dataset is one row per single-rider booking — there is no journey_id, so the convoy hierarchy can't be exercised until you derive one. A window function groups bookings that share an origin, destination, and request-minute into an inferred journey:

-- Inferred journey derivation (the enrichment layer)
WITH inferred AS (
  SELECT
    booking_ID  AS trip_id,
    Customer_ID AS rider_id,
    DENSE_RANK() OVER (
      ORDER BY Pickup_Location, Drop_Location,
               DATE_TRUNC('minute', CAST(Date || ' ' || Time AS TIMESTAMP))
    ) AS inferred_journey_id
  FROM raw_kaggle_uber_data
)
SELECT
  inferred_journey_id      AS journey_id,
  COUNT(*)                 AS convoy_vehicle_count,
  COUNT(DISTINCT rider_id) AS distinct_riders
FROM inferred
GROUP BY inferred_journey_id
HAVING COUNT(*) > 1;        -- multi-vehicle groups = candidate convoys

Caveat — this is a heuristic, not ground truth. Grouping on (origin, destination, minute) fuses independent strangers sharing a popular corridor at rush hour into a phantom "convoy." A real enrichment needs a genuine correlation signal — a shared booking/party token, a tighter spatial radius (H3 L10) with a few-second window, or the same payment instrument — before two bookings can be claimed as one journey. Use it to teach the window-function pattern, and name its false-positive rate out loud in the interview: spotting the limit is itself the senior signal.

Real sample rows (5 representative bookings from the dataset)

-- Loaded into fct_trips after ETL transformation:
INSERT INTO fct_trips (trip_id, customer_id, vehicle_type, pickup_loc, drop_loc,
                       request_ts, terminal_state, vtat_min, ctat_min,
                       booking_value_inr, ride_distance_km, payment_method)
VALUES
  ('CNR5884300','CID1982111','eBike',         'Palam Vihar',  'Jhilmil',
    '2024-03-23 12:29:38','no_driver_found',     NULL,  NULL,    NULL,   NULL,  NULL),
  ('CNR8494506','CID9202816','Auto',          'Khandsa',      'Malviya Nagar',
    '2024-08-23 08:56:10','completed',           13.4,  25.8,    627.00, 13.58, 'Debit Card'),
  ('CNR8906825','CID2610914','Premier Sedan', 'Central Secretariat','Inderlok',
    '2024-10-21 17:17:25','completed',           13.1,  28.5,    416.00, 34.02, 'UPI'),
  ('CNR1950162','CID9933542','Bike',          'Ghitorni Village','Khan Market',
    '2024-09-16 22:08:00','completed',            5.3,  19.6,    737.00, 48.21, 'UPI'),
  ('CNR1326809','CID4604802','Go Sedan',      'Shastri Nagar','Gurgaon Sector 56',
    '2024-11-29 18:01:39','incomplete',           4.9,  14.0,    237.00,  5.73, 'UPI');

Q17 — Vehicle type mix & cancellation rate (real-data analysis)

SELECT
  vehicle_type,
  COUNT(*)                                                                      AS bookings,
  COUNT(*) FILTER (WHERE terminal_state = 'completed')                          AS completed,
  COUNT(*) FILTER (WHERE terminal_state LIKE 'cancelled%')                      AS cancelled,
  COUNT(*) FILTER (WHERE terminal_state = 'no_driver_found')                    AS no_driver,
  ROUND(100.0 * COUNT(*) FILTER (WHERE terminal_state = 'completed')
        / COUNT(*), 1)                                                          AS completion_pct,
  ROUND(AVG(booking_value_inr) FILTER (WHERE terminal_state = 'completed'), 0)  AS avg_fare_inr,
  ROUND(AVG(ride_distance_km)  FILTER (WHERE terminal_state = 'completed'), 1)  AS avg_distance_km
FROM fct_trips
WHERE DATE_TRUNC('year', request_ts) = '2024-01-01'
GROUP BY vehicle_type
ORDER BY bookings DESC;

Typical result on the 148K dataset:

vehicle_typebookingscompletedcompletion_pctavg_fare_inravg_distance_km
Auto~37,400~23,20062%~410~22.5
Go Sedan~31,800~19,80062%~520~24.1
Premier Sedan~21,500~13,30062%~610~24.0
Bike~18,900~11,70062%~390~22.7
eBike~14,600~9,00062%~395~22.8
Go Mini~12,300~7,60062%~440~23.0
UberXL~12,200~7,50062%~720~24.5

Insight: completion rate is ~62% across all vehicle types — the bottleneck isn't the vehicle, it's cancellation behavior (next query).

Q18 — Top 10 routes by booking volume (Pickup → Drop)

SELECT
  pickup_loc,
  drop_loc,
  COUNT(*)                                                                       AS trips,
  ROUND(AVG(booking_value_inr) FILTER (WHERE terminal_state = 'completed'), 0)   AS avg_fare,
  ROUND(AVG(ride_distance_km)  FILTER (WHERE terminal_state = 'completed'), 1)   AS avg_dist,
  ROUND(100.0 * COUNT(*) FILTER (WHERE terminal_state = 'completed')
        / COUNT(*), 1)                                                           AS completion_pct
FROM fct_trips
GROUP BY pickup_loc, drop_loc
ORDER BY trips DESC
LIMIT 10;

Q19 — Cancellation reason breakdown (driver vs customer)

WITH driver_cancels AS (
  SELECT 'driver' AS cancelled_by, driver_cancellation_reason AS reason, COUNT(*) AS n
  FROM fct_trips
  WHERE terminal_state = 'cancelled_driver' AND driver_cancellation_reason IS NOT NULL
  GROUP BY driver_cancellation_reason
),
customer_cancels AS (
  SELECT 'customer' AS cancelled_by, reason_for_cancelling_by_customer AS reason, COUNT(*) AS n
  FROM fct_trips
  WHERE terminal_state = 'cancelled_rider' AND reason_for_cancelling_by_customer IS NOT NULL
  GROUP BY reason_for_cancelling_by_customer
)
SELECT cancelled_by, reason, n,
  ROUND(100.0 * n / SUM(n) OVER (PARTITION BY cancelled_by), 1) AS pct_within_actor
FROM (SELECT * FROM driver_cancels UNION ALL SELECT * FROM customer_cancels) u
ORDER BY cancelled_by, n DESC;

Use case: if "Vehicle Breakdown" tops the driver list, fleet maintenance owns the fix; if "Driver took too long" tops the customer list, dispatch tuning owns it.

Q20 — Hourly demand pattern + completion rate (peak-hour stress)

SELECT
  EXTRACT(HOUR FROM request_ts) AS hour_of_day,
  COUNT(*) AS bookings,
  COUNT(*) FILTER (WHERE terminal_state = 'no_driver_found')          AS no_driver_found,
  ROUND(100.0 * COUNT(*) FILTER (WHERE terminal_state = 'no_driver_found')
        / COUNT(*), 2)                                                AS no_driver_pct,
  ROUND(100.0 * COUNT(*) FILTER (WHERE terminal_state = 'completed')
        / COUNT(*), 1)                                                AS completion_pct,
  ROUND(AVG(vtat_min) FILTER (WHERE terminal_state = 'completed'), 1) AS avg_vtat_min
FROM fct_trips
GROUP BY EXTRACT(HOUR FROM request_ts)
ORDER BY hour_of_day;
-- The hours where no_driver_pct spikes are the operational gaps where supply incentives matter most.

Q21 — Payment method mix by vehicle tier (premium vs economy)

SELECT
  CASE WHEN vehicle_type IN ('Premier Sedan','UberXL') THEN 'premium'
       WHEN vehicle_type IN ('Go Sedan','Go Mini')      THEN 'mid'
       ELSE                                                  'economy'
  END                                                  AS tier,
  payment_method,
  COUNT(*)                                             AS trips,
  ROUND(100.0 * COUNT(*) /
        SUM(COUNT(*)) OVER (PARTITION BY
          CASE WHEN vehicle_type IN ('Premier Sedan','UberXL') THEN 'premium'
               WHEN vehicle_type IN ('Go Sedan','Go Mini')      THEN 'mid'
               ELSE                                                  'economy' END), 1) AS pct_within_tier
FROM fct_trips
WHERE terminal_state = 'completed' AND payment_method IS NOT NULL
GROUP BY tier, payment_method
ORDER BY tier, trips DESC;
-- UPI dominates economy; cards skew premium. Drives partnership and incentive design.

What this dataset does not support (caveats — for honest interviews)

  • No driver_id, no vehicle_id. Only Vehicle_Type as a categorical. Driver-side analytics (utilization, payouts) need a synthetic driver_id assignment or a different dataset.
  • No surge multiplier, no fare breakdown. Only the final Booking_Value. Surge analysis (Scenario 2) needs synthetic data.
  • No lat/lon, no H3 cells. Only Delhi NCR neighborhood names. Geocode at ETL using h3-py or Mapbox.
  • No journey or multi-leg concept. One rider per booking — convoy / Pool / shared rides are not in scope. Use synthetic data for those examples.
  • No GPS trace. fct_trip_events can be reverse-engineered from VTAT/CTAT into approximate REQUESTED, MATCHED, PICKUP, DROPOFF timestamps but the geographic events (location_h3 per state) are unavailable.

Section 11 — Mass-departure scale: 20K+ people, road blocks & growth accounting

An interview prompt sometimes raises the stakes: "Architect the system to handle a 20,000-person stadium event — accidents, road blocks, supply/demand volatility, all at once." The existing Journey → Trip → Event hierarchy holds, but three additional concerns must be modeled explicitly: infrastructure as a fact (SCD2 geography), growth accounting on both sides of the marketplace, and point-in-time financial integrity under chaos.

Section 12 — Entity ownership & boundaries (who owns what, what crosses the line)

The model so far names tables. A senior answer names archetypes and owners — because the boundary between two tables is a contract, not a foreign key. Four archetypes, four write disciplines, four systems of record.

EntityArchetype & write disciplineSystem of recordWhat crosses the boundary
fct_tripsStateful aggregate. One row mutated through a known state machine in OLTP; snapshotted append-only to the warehouse.Trip service (OLTP) → warehouse mirrorThe terminal state + immutable keys (trip_id, surge_multiplier, fx_rate_to_usd). Never the in-flight state.
fct_trip_eventsAppend-only log. Insert-only, never updated, ordered by event_timestamp within trip_id.Dispatch / telemetry serviceEvery state transition, immutably. This log is the audit trail.
total_fare_usd · driver_payout_usdFinancial truth. Append-only ledger; corrections are new rows, never an in-place UPDATE.Payments / ledger serviceA committed amount keyed to trip_id — written only after a terminal state.
location_h3 · raw GPSTime-series stream. High-volume, lossy, latest-write-wins, TTL'd.Location service (Redis / Pinot)A sampled H3 cell on each event — not the raw 1 Hz GPS firehose.

The ownership rule that signals seniority:

fct_trips is a projection, not a source. The trip's life is the fct_trip_events log; fct_trips is that log folded into its current state. If the two ever disagree, the log wins and fct_trips is rebuilt.

That is event-sourcing-with-a-snapshot, and it earns the level. The interviewer's follow-up — "so can you rebuild fct_trips from fct_trip_events?" — has one correct answer: yes, and that is the whole point (Section 15).

Boundary violations that disqualify. A column on fct_trips that a second service writes directly → two writers, no source of truth. Raw GPS inside fct_trip_events → a 1 Hz firehose welded to an append-only audit log, now unqueryable. A fare written before the terminal state → finance reads an estimate as if it were committed. Each one collapses a boundary the model depends on.

Section 13 — Consistency contracts: one model, four consistency regimes

A model that does not state its consistency contracts looks architecturally correct but operationally fuzzy. The Uber model spans four subsystems, and each runs a different consistency regime. Name them explicitly, per subsystem.

SubsystemRegimeOperational contractMechanism
Matching / dispatchEventual — seconds✓ two riders may briefly see the same driver as "available"  ·  ✗ a driver permanently double-bookedOptimistic lock on driver_id + offer expiry; first MATCHED wins
Payments / payoutStrong — ACID✓ payout lands minutes after dropoff  ·  ✗ a charge with no matching ledger entry; SUM(payouts) ≠ SUM(fares − take)Double-entry ledger, trip_id as idempotency key, single writer
Location / GPSLossy — latest-write-wins✓ dropped samples, a stale fix  ·  ✗ blocking a trip write because a GPS packet is lateFire-and-forget into a TTL'd store; never on the trip's critical path
Analytics / warehouseBatch — bounded staleness✓ warehouse fct_trips is 5–60 min behind OLTP  ·  ✗ finance closing the books inside an open late-data windowWatermark + 24–48 h backfill window; an explicit "as-of" reporting timestamp

The framing that lifts the answer a level: consistency is per-field, not per-system. On a single fct_trips row —

  • surge_multiplier, fx_rate_to_usdstrong; locked at request, immutable forever.
  • driver_idstrong at assignment (the optimistic lock), then immutable.
  • the live ETA the rider sees (derived, not stored on the fact) — eventual; a stale value is a harmless UX hint.
  • is_flagged_riskeventual and slowly-changing; appended, never blocks anything.
The trap: treating the whole trip as one consistency class. The ETA and the fare share a row but nothing operationally — one can be 30 seconds stale at zero cost, the other must be exact or finance breaks. A model that cannot tell those two fields apart will either over-engineer the ETA or under-protect the fare.

Section 14 — State-machine rigor: legal transitions, idempotency, the races

Section 1 lists the states. Seniority is in the edges — which transitions are legal, what guards the illegal ones, and which concurrency races the model must already survive. Interviewers love breaking a state machine; bring the guards before they ask.

The legal transition table (event states; terminal states mirror fct_trips.terminal_state):

From stateLegal next statesAbsorbing?
REQUESTEDMATCHED · EXPIRED · CANCELLEDno
MATCHEDARRIVED · CANCELLEDno
ARRIVEDSTARTED · NO_SHOW · CANCELLEDno
STARTEDDROPOFFno
DROPOFF · CANCELLED · EXPIRED · NO_SHOW(none)yes — terminal

Anything outside this table is an invalid transition and must be rejected, not silently written: STARTED → MATCHED (time travel), DROPOFF → STARTED (a terminal state is absorbing), a second MATCHED on a trip already matched. The guard is a state-rank check plus event_timestamp ordering within trip_id — an event that would lower the rank is routed to a dead-letter queue, not appended.

Idempotency. fct_trip_events.event_id is the producer's idempotency key. At-least-once delivery (Kafka, retried RPCs) will redeliver; the append must be a no-op on a seen event_id.

-- Event-log append. At-least-once delivery WILL redeliver; event_id makes it safe.
INSERT INTO fct_trip_events (event_id, trip_id, state, event_timestamp, location_h3, actor)
SELECT :event_id, :trip_id, :state, :event_ts, :h3, :actor
 WHERE NOT EXISTS (SELECT 1 FROM fct_trip_events WHERE event_id = :event_id);
-- Equivalent: UNIQUE(event_id) + INSERT ... ON CONFLICT DO NOTHING.

The races a senior volunteers — before the interviewer reaches for them:

RaceWhat goes wrongResolution
Driver accepts twice (double-tap / RPC retry)Two MATCHED events, same trip_id, same driver_idIdempotent on event_id; the assignment UPDATE below is a no-op the second time
Duplicate offer — two drivers both acceptTwo MATCHED, same trip_id, different driver_idOptimistic lock: UPDATE … WHERE driver_id IS NULL — exactly one wins, the loser re-queues
Double cancel — rider and driver cancel togetherTwo terminal writes on one tripCompare-and-set: terminal_state is set once, under WHERE terminal_state IS NULL
Late event reorders the logARRIVED lands after STARTED on network delaySort by event_timestamp within trip_id at read; reject rank-violating writes
-- OLTP write path. Driver assignment: the optimistic lock that
-- resolves the duplicate-offer race. Exactly one racing driver wins.
UPDATE trips
   SET driver_id = :candidate_driver_id,
       state     = 'MATCHED'
 WHERE trip_id   = :trip_id
   AND state     = 'REQUESTED'     -- guard: trip still offerable
   AND driver_id IS NULL;          -- guard: only the first writer passes
-- rows_updated = 0  → another driver already won; release the offer, re-queue.
-- rows_updated = 1  → this driver is assigned; emit the MATCHED event.
The trap: drawing the happy-path state diagram and stopping. The junior answer is a picture of states; the senior answer is a guard on every edge — the state-rank check, the event_id dedup, and the compare-and-set that makes "exactly one driver wins" true under concurrency. States are a diagram; legal transitions plus the locks that enforce them are a system.

Section 15 — Failure scenarios & the reconciliation engine

The clean star schema is the steady-state view. Production runs a reconciliation engine underneath it — and a senior candidate volunteers the ugliness rather than waiting to be cornered by it.

FailureSymptomHow the model contains it
Duplicate driver assignmentTwo drivers en route to one riderThe optimistic lock (§14) blocks it at write. If it still leaks, (driver_id, journey_id) uniqueness flags a DQ violation; the second trip resolves to cancelled_system — no payout.
Late GPS → wrong ETARider sees a stale 12-min ETA; the driver is 2 min awayThe live ETA is eventual-consistency UX (§13) — a wrong ETA costs nothing financial. location_h3 is sampled, never authoritative, and never feeds the fare.
Payment succeeds, trip then failsRider charged; trip ends cancelled_system after STARTEDThe fare is an append-only ledger — the charge is real, so the reversal is a new refund row, not a delete. terminal_state drives the refund rule; net = SUM.
Event replay → double chargeKafka replays the DROPOFF; the fare is computed twiceevent_id idempotency (§14) dedups the replayed event before the fare calc sees it. The fare is keyed to trip_id once; a re-compute is a MERGE no-op.
Driver app crashes mid-tripNo DROPOFF event ever arrivesStuck-state sweeper: a trip in STARTED past p99(trip_duration) × 3 is force-resolved to cancelled_system and reconciled against its last-known H3.

The reconciliation engine is three standing jobs, not a feature bolted on later:

  1. Invariant checks — the two SUM reconciliations from Section 4, run continuously. A break pages the producing team.
  2. Stuck-state sweeper — trips that miss a terminal state inside their SLA window are force-resolved on a known rule.
  3. Ledger reconciliationSUM(fct_trips.total_fare_usd) vs the payments ledger, per day. A drift is a real bug, never "rounding."
Senior framing. "The clean star schema is the steady-state view; production runs a reconciliation engine beneath it. Idempotency keys absorb replays, optimistic locks absorb races, append-only ledgers absorb rollbacks, and a stuck-state sweeper absorbs the events that simply never arrive. The model isn't only correct — it's correct and self-healing."

Section 16 — Partitioning & shard-key strategy

The shard key is the one decision you cannot migrate cheaply later — and the OLTP write path and the warehouse read path want different keys. Conflating them is a classic miss.

Layer · tableShard / partition keyRationale
OLTP · live tripsHASH(trip_id)Lifecycle reads/writes are point lookups by trip_id; hashing spreads load evenly. Not by city — NYE Manhattan would melt a single shard.
OLTP · fct_trip_eventsHASH(trip_id)Co-located with its trip, so the lifecycle write stays single-shard.
OLTP · location storeH3 cellReads are spatial — "who is near this hex" — not by trip. L8 normally; sub-divide to L10 in dense cells.
Warehouse · fct_tripsPartition DATE(request_timestamp), cluster origin_h3_keyAnalysts scan a date range, then filter by geography.
Warehouse · fct_trip_eventsPartition DATE(event_timestamp)Append-only, time-series scan pattern.

Hot-partition risk — where production actually fails:

Hot spotCauseCounter-move
Dense city / NYEA city-keyed shard puts all of Manhattan on one nodeNever shard the write path by city. HASH(trip_id) for OLTP; HASH(h3_key) MOD N sub-partition in the warehouse.
Stadium let-outOne H3 L8 cell takes 100× its normal request rateAdaptive zoom: split that cell to L10 — the hot partition becomes per-gate, not per-stadium (§11.5).
Mega-event in analyticsOne origin_h3_key dominates a daily partitionSalted two-stage aggregation: GROUP BY h3 || salt, then re-roll.
Super-driver skewA few airport drivers have 100× the tripsSalt driver_id in the payout rollup; the dimension is fine, the fact aggregation salts.

Why trip_id and not city or driver_id: a shard key must be immutable and high-cardinality with uniform load. trip_id is both. City is low-cardinality and skewed (and a trip's city can even change cross-border); driver_id is skewed by super-drivers. Rebalancing: consistent hashing on trip_id moves only ~1/N of keys when a shard is added — and because trips are short-lived (hours, not years), a rebalance only drains in-flight trips, it never migrates history.

Senior framing. "Pick the shard key for the path that can't tolerate a hotspot — the live write path — and pick it for the property you can't change later: immutability and uniform cardinality. trip_id for OLTP, DATE + H3 for the warehouse. The mistake that pages you at 2 a.m. is sharding the live path by city, because the business's best night — New Year's Eve in Manhattan — becomes a single melted node."
Senior framing. "The Journey → Trip → Event hierarchy is the unlock — Operations runs latency metrics off fct_trip_events, Finance reconciles fares against two invariants (rider-side and driver-side), and Risk replays GPS sequences for phantom-convoy detection. Three stakeholders, three grains, one source of truth — and two invariants on the financial bridge keep producer bugs from ever reaching the audit."

Surge pricing & supply/demand state

Uber · supply/demand · dynamic pricing
Design a model for Uber's surge engine that captures the supply/demand state per geography per minute, the surge decisions made, and the actual outcome of those decisions — supporting the live engine, post-hoc effectiveness analysis, capacity planning, and pricing-team experimentation.

Why this is hard — the design tension

Surge isn't one fact — it's a causal chain: state observed → model invoked → multiplier set → riders react → drivers reposition → state changes. Each link needs its own grain or you lose the ability to evaluate the engine. The model also has to replay history with the surge model that was active then, not today's — pricing teams A/B test new models monthly. And the geographic skew is brutal: midtown Manhattan on New Year's Eve has 100× the rows of suburban Queens.

Section 1 — Dimensional Model Overview

Star Schema with a State → Decision → Outcome causal chain.

Fact Tables

  • fct_geo_state_minuteThe "Observation" grain. One row per (H3 cell × minute). Captures open requests, available drivers, ratio, ETA, weather, event signals.
  • fct_surge_decisionThe "Decision" grain. One row per surge-multiplier change per cell. Captures which model version made the call and on what input features.
  • fct_surge_outcomeThe "Effectiveness" grain. One row per surge decision joined to its post-decision N-minute outcome (did supply rebalance? did the price stick?).
  • brg_zone_cellsHierarchy bridge. Many H3 cells roll up to a surge zone (operational region used by the pricing team).

Dimension Tables

  • dim_h3_cell — H3 L8 hexagons. SCD1 (geometry is static).
  • dim_surge_zone — Operational regions (e.g. "SF Downtown", "JFK Airport"). SCD2 — boundaries get re-drawn.
  • dim_surge_model — Versioned ML/heuristic models. SCD2 — drives the replay capability.
  • dim_weather — Weather state per cell per hour (rain, snow, temperature). SCD1 snapshot.
  • dim_event — Concerts, sports games, conferences. SCD1 (each event is a row, not a state).

Section 10 — Geospatial indexing: GeoHash vs H3 vs Quadtree

The model leans on H3 everywhere — dim_h3_cell, fct_geo_state_minute per cell. A senior interviewer will ask why H3 and not the alternatives. Have the trade-off ready.

IndexStrengthWeaknessBest fit
GeoHash
(rectangles, string-prefix)
A prefix means containment — trivially range-scannable in any KV store or B-tree index, zero special libraryNon-uniform cell area (distorts toward the poles); 8 neighbors with diagonal ambiguity; adjacent cells can share no prefix (boundary jitter)Cheap proximity inside a system you already run (Postgres, DynamoDB)
H3
(hexagons)
Six equidistant neighbors, no diagonal ambiguity; near-equal cell area; clean resolution hierarchyHexagons don't perfectly nest (parent/child is approximate); needs the H3 libraryRide-hailing — supply/demand per cell, surge, ETA. Uber's own choice.
Quadtree
(adaptive rectangles)
Depth adapts to density — dense cities get deep nodes, empty ocean stays shallowThe tree mutates as density shifts; harder to shard and rebalanceHighly skewed, slowly-changing density (map tiles, static POI)

Why H3 wins here: surge is a supply/demand ratio per cell, and the math only behaves if cells are equal-area and neighbors are equidistant — you average a cell against its ring of neighbors. Hexagons give that; GeoHash rectangles do not (a diagonal neighbor sits 1.4× farther than an edge neighbor, biasing the ratio). The choice of dim_h3_cell is not cosmetic — it is what makes driver_to_request_ratio a trustworthy number.

Storage — Redis GEO vs a custom H3 grid:

OptionMechanismTrade-off
Redis GEOSorted set, GeoHash-encoded score; GEOSEARCH by radiusZero infra to build, radius search built in — but it is GeoHash underneath (same non-uniformity) and holds no per-cell aggregate state
Custom H3 gridh3_key → {open_requests, available_drivers, …} in Redis hashes / PinotPer-cell state is a first-class key; the surge engine reads one key per cell. More to build — you own the indexing

The surge engine reading "a Redis snapshot of the latest minute" keyed by cell is the custom H3 grid — chosen over raw Redis GEO because surge needs per-cell aggregate state, not point-radius search. And resolution is a dial: L10 localizes demand more sharply than L8 but multiplies the cell count (and the per-minute write cost) — so the model picks resolution per zone (L8 cities, L10 venues), never globally.

Section 11 — Update amplification & location staleness

Two production realities the clean per-minute grain hides — and both are favourite senior probes.

Update amplification. Each driver app emits a location ping roughly every 4 s. With D drivers online that is D/4 writes per second. A naïve design fans every single ping out to the matching index, the surge counter, the rider's live map, and the ETA service — one ping becomes four downstream writes, and matching re-evaluates candidate sets on every ping. The counter-moves:

ProblemCounter-move
One ping → four synchronous downstream writesWrite the ping once to the location stream; consumers pull or subscribe — they are never pushed to synchronously. The location service is the single fan-in point.
Matching re-runs on every pingMatching reads the cell snapshot on a fixed cadence (per minute, or on a rider request) — decouple ping rate from match rate.
fct_geo_state_minute touched per pingPre-aggregate at the edge: count drivers per cell in a tumbling 1-minute window — one row per cell per minute, not one per ping. (This is what the grain already implies; make it explicit.)
Hot-cell ping storm (stadium)Sub-divide that cell to L10 so the per-cell write rate stays bounded (§8 skew note).

Location staleness. A driver's last fix is 25 s old — include them in matching, or not? The rule is degrade, don't exclude — and degrade by confidence, not a hard cutoff:

StalenessTreatment
< 5 sFresh — full weight in matching and ETA.
5–30 sStale-but-usable — keep in the candidate set, widen the ETA confidence interval, de-rank vs a fresh driver at equal distance.
30–120 sSuspect — eligible only if the cell is supply-starved (a stale driver beats no driver); flag the ETA low-confidence.
> 120 sExcluded — treat as offline; the driver has likely lost connectivity.

A hard 30-second cutoff is the amateur move: dropping every stale driver in a supply-starved cell shrinks the matchable pool, pushes the multiplier higher, and the surge model starts fighting itself. Staleness is a ranking signal weighted against supply pressure, not a binary filter — so carry location_age_sec alongside every location read and let the matcher weight it.

Senior framing. "A location is a value plus an age. The amateur design treats every GPS fix as equally true and fans every ping out synchronously; the senior design writes the ping once, lets consumers pull on their own cadence, and carries location_age_sec so matching can weight a stale driver instead of dropping one — because in a supply-starved cell a 25-second-old driver still beats no driver."
Senior framing. "The surge model isn't a column — it's a fact table joined to a versioned dimension. fct_surge_decision + frozen input_features_json + SCD2 dim_surge_model is the contract that lets the pricing team A/B-test new models confidently and lets regulators get a defensible answer to 'why was this rider charged 3.2× on this date?' — every time, forever."

Order &amp; courier dispatch (with stacked deliveries)

DoorDash / Uber Eats · 3-sided marketplace
Model DoorDash's order lifecycle from customer placement → restaurant prep → courier pickup → delivery, supporting Customers (on-time SLA), Couriers (earnings), Restaurants (operations), with native handling of stacked orders where one courier delivers multiple orders on a single dispatch.

Why this is hard — the design tension

This is a 3-sided marketplace (customer + restaurant + courier) where one fact table can't satisfy all three. Stacked orders make it worse: one courier dispatch covers multiple orders, but each order has its own customer SLA and its own restaurant relationship. Mutable tips (editable up to 24h post-delivery) demand append-only economics. The model must read clean for each stakeholder without forcing the others to filter.

Section 1 — Dimensional Model Overview

Fact Tables (the order/dispatch split + bridge)

  • fct_ordersThe "Customer" grain. One row per customer order. Drives customer SLA + restaurant operations.
  • fct_dispatchesThe "Courier" grain. One row per courier dispatch. May cover 1 (single) or N (stacked) orders. Drives courier earnings.
  • brg_dispatch_ordersThe "Stacking" bridge. Links a dispatch to its 1-N orders, with leg sequence and per-order distance.
  • fct_order_state_eventsThe "Telemetry" grain. One row per order state transition (PLACED, CONFIRMED, READY, PICKED_UP, DELIVERED, CANCELLED).

Dimension Tables

  • dim_customers · dim_addresses (SCD2 on address — for legal/refund history)
  • dim_restaurants (SCD2 — hours, menu version, status)
  • dim_couriers (SCD2 — vehicle type, rating tier, on/off platform)
  • dim_h3_cell (SCD1)

Search ads auction → impression → click → conversion

Google · search ads · auction
Design a model for Google search ads that captures the entire chain from auction clearing → impression → click → conversion, supporting advertiser billing, quality-score recomputation, ranker training, incrementality measurement, and re-runnable attribution windows.

Why this is hard — the design tension

This is a causal chain across 4 facts with extreme asymmetry: 2 billion ad requests/day → 200 billion targeting lookups → ~100 candidate ads per request → 1 winner. Loser logs are 20× winner volume. Auctions clear in <100ms but conversions land days later. The model must keep the join key (auction_decision_id) deterministic across all stages, support re-attribution when models change, and protect privacy under ATT/Sandbox.

Section 1 — Dimensional Model Overview

Fact Tables (the auction chain)

  • fct_auctionsThe "Decision" grain. One row per auction clearing. Carries auction_decision_id.
  • fct_impressionsThe "Served" grain. One row per impression (auction winner served on the page).
  • fct_clicksThe "Click" grain. One row per click on an impression. Mutable: invalid-click invalidation comes later.
  • fct_conversionsThe "Conversion" grain. One row per advertiser-reported conversion event.
  • fct_attributionsThe "Attribution" grain. One row per (conversion × eligible-touchpoint × model × run). Append-only per attribution_run_id.
  • fct_auction_losersThe "Counterfactual" grain. Sampled losing candidates (1%) for ranker training.

Dimension Tables

  • dim_advertisers, dim_campaigns, dim_ad_groups, dim_creatives — all SCD2 (budget, targeting, creative drift).
  • dim_keywords — SCD1, high cardinality.
  • dim_users · privacy-tokenized (Customer Match, FLoC/Topics, ATT bucket).
  • dim_attribution_models — SCD2 — versioned (last_click, position_based, data_driven_v3, etc.).

Cross-device attribution &amp; identity graph

Meta / Facebook · cross-device attribution
Design a model for Facebook ads where a user sees an ad on phone, opens the advertiser's site on laptop later, and converts on tablet a week after. Support 1/7/28-day click + view-through windows, ATT/SKAdNetwork iOS constraints, and incrementality lift measurement.

Why this is hard — the design tension

One human, many devices. Apple's ATT cuts user-level signals on iOS. SKAdNetwork only returns aggregated postbacks. The model must support cross-device joining via a probabilistic+deterministic identity graph, multiple attribution windows in parallel (1d / 7d / 28d × click vs view), and two privacy paths (user-level for opted-in, aggregate-only for SKAN).

Section 1 — Dimensional Model Overview

Fact Tables

  • fct_ad_eventsThe "Touch" grain. One row per impression OR click (event-typed).
  • fct_conversionsThe "Conversion" grain. One row per advertiser-reported conversion.
  • fct_skan_postbacksThe iOS-aggregate grain. One row per SKAdNetwork postback (no user-level data).
  • fct_attributionsThe "Match" grain. One row per (conversion × eligible touchpoint × window × model × run). Append-only via attribution_run_id.

Dimension & Bridge Tables

  • dim_identities — One row per resolved person. Probabilistic + deterministic.
  • bridge_identity_devicesSCD2. Many devices per identity; link_confidence drives weighting.
  • dim_advertisers, dim_campaigns, dim_creatives — SCD2.
  • dim_user_holdouts — Holdout assignment for incrementality (SCD2 per advertiser × period).

CTV ad inventory, pacing &amp; frequency capping

Netflix · CTV ads · inventory &amp; pacing
Design a model for Netflix's ad-supported tier — capturing every ad slot opportunity (filled or unfilled), every ad served, frequency caps per user, and contractual pacing guarantees per advertiser ("guaranteed 10M impressions by month-end with make-good if under-delivered").

Why this is hard — the design tension

CTV inventory has a contractual layer that programmatic doesn't: guaranteed buys with monetary penalties for under-delivery. Three pressures pull simultaneously: (1) maximize fill rate, (2) honor frequency caps so users don't see the same ad 8 times, (3) hit committed impressions per advertiser per month. Modeling unfilled opportunities (not just impressions) is the senior move — it's the only way to answer "why did we leave money on the table?"

Section 1 — Dimensional Model Overview

Fact Tables

  • fct_ad_opportunitiesThe "Slot" grain. One row per ad-slot opportunity in a viewing session. Filled or unfilled — both kept.
  • fct_ad_impressionsThe "Served" grain. Subset of opportunities where an ad actually played.
  • fct_pacing_dailyThe "Contract" grain. One row per (campaign × day) aggregating delivered vs committed.
  • fct_frequency_stateThe "Cap" grain. One row per (profile × campaign × day) tracking impressions toward cap.
  • fct_makegood_obligationsThe "Owed" grain. One row per (campaign × month) when committed not met. Drives next-month bonus inventory.

Dimension Tables

  • dim_advertisers, dim_campaigns, dim_creatives (SCD2)
  • dim_campaign_flights — Period a campaign runs (start_date / end_date / total_committed_impressions)
  • dim_titles (SCD2 for content metadata; ratings drive ad eligibility)
  • dim_profiles (SCD2; subscription tier = ad-supported / premium)

Orders, returns &amp; multi-warehouse inventory

Amazon · e-commerce · returns &amp; inventory
Design a model for Amazon orders that handles partial returns, replacements, marketplace seller orders (1P vs FBA vs 3P), and inventory drawn from multiple warehouses with FIFO/LIFO consumption rules.

Why this is hard — the design tension

Returns reshape revenue retroactively. Multi-warehouse fulfillment splits a single SKU across two warehouses. Seller-of-record (1P retail / 2P FBA / 3P marketplace) determines whether Amazon books GMV or commission. And inventory has to reconcile to ledger every night. The model must be append-only on the financial side and movement-based on the physical side — same product, two grains, one source of truth.

Section 1 — Dimensional Model Overview

Fact Tables

  • fct_order_linesThe "Sale" grain. One row per (order × product × source_warehouse). 3-item order = 3 rows.
  • fct_returnsThe "Refund" grain. One row per return event, line-level. Append-only.
  • fct_inventory_movementsThe "Stock-flow" grain. One row per stock change (in / out / transfer / damage / adjustment).
  • snap_inventory_dailyThe "State" grain. Daily snapshot per (product × warehouse).
  • fct_replacementsThe "Linked-order" grain. Maps replacement orders back to their original.

Dimension Tables

  • dim_products (SCD2 — title, category, dimensions drift)
  • dim_sellers (SCD2 — seller_type ∈ {retail_1p, fba_2p, marketplace_3p}, recognized_as ∈ {gmv, commission_only})
  • dim_warehouses, dim_customers + dim_addresses (SCD2), dim_promotions (SCD2)

Feed engagement at scale (with ranker A/B)

Meta / Instagram · social feed · ranker
Model engagement on the Instagram feed — impressions, dwell time, likes, saves, comments, shares, plus negative signals (hide, report) — supporting feed ranker training/A-B, creator analytics, and brand safety. Volume: 100B+ impressions/day.

Why this is hard — the design tension

Volume alone forces design choices: 100B impressions/day means exact COUNT DISTINCT is unaffordable; HLL is mandatory. Mutable engagements (like → unlike) demand append-only with is_undone flags. Heartbeat-driven dwell can't live at raw grain — must be aggregated at ETL. And the ranker A/B requirement means every impression must carry the model_id that ranked it, or you can't compare baseline lift.

Section 1 — Dimensional Model Overview

Fact Tables

  • fct_impressionsThe "Served" grain. One row per (viewer × post × impression). Carries ranker_model_id + predicted_score.
  • fct_engagementsThe "Reaction" grain. One row per (viewer × post × event_type × event_ts). Likes, saves, comments, shares, hides, reports — all here.
  • fct_dwell_per_impressionThe "Time-on-post" grain. One row per impression with total dwell_ms (aggregated from heartbeats at ETL).
  • fct_ranker_experimentsThe "Treatment" grain. One row per (viewer × experiment × variant). Drives A/B comparison.

Dimension Tables

  • dim_users (SCD2) · dim_posts (SCD2 — caption edits, audio swaps) · dim_creators (SCD2)
  • dim_post_audio (SCD1)
  • dim_ranker_models — SCD2; semver + model_card_url + deployed_pct.

Listening history &amp; pool-model royalty payouts

Spotify · streaming media · per-stream royalties
Design a model for Spotify's listening history that supports recommendations, the "Year in Review" feature, and per-stream royalty payouts to multiple rights holders (artist + label + publisher + songwriter) — under the pool model where revenue is divided by total qualified streams in a country-period.

Why this is hard — the design tension

Two consumers want the same fact for opposite reasons. Product needs every play (skips included) for recommendations. Finance only pays on qualified streams (≥30s). Multi-rights splits mean one stream produces 4+ payout rows (artist + label + publisher + songwriter) and the splits change mid-quarter when deals renegotiate. Pool model math requires computing per-stream rate as revenue ÷ total streams in pool — a chicken-and-egg dependency that resolves only at period close.

Section 1 — Dimensional Model Overview

Fact Tables

  • fct_play_eventsThe "Listen" grain. One row per play event with qualified_play flag.
  • fct_royalty_periodsThe "Pool" grain. One row per (country × period). Holds total pool revenue, total qualified streams, computed per-stream rate.
  • fct_royalty_attributionsThe "Payout" grain. One row per (play × rights_holder × period). Derived nightly at period close.

Dimension & Bridge Tables

  • dim_users (SCD2 — plan tier drifts) · dim_subscriptions (Family/Duo group of users)
  • dim_tracks (SCD2 — metadata) · dim_artists · dim_labels (SCD2) · dim_publishers (SCD2) · dim_songwriters
  • bridge_track_rightsSCD2 with effective dates. The deal-terms ledger.

Double-entry ledger for payments (charges, refunds, chargebacks, FX, reserves)

Stripe · payments · double-entry ledger
Design a model for Stripe's payment system supporting charges, refunds, chargebacks, payouts, multi-currency FX, reserves, and rigorous double-entry accounting where every transaction's entries sum to zero per currency. Append-only — never UPDATE — for regulatory audit compliance.

Why this is hard — the design tension

Payments is the domain with zero tolerance for retro edits. A regulator can ask "what was merchant X's balance on day Y as the books showed it then?" — the answer has to be defensible regardless of corrections made since. That forces append-only on every fact. Multi-currency adds an FX gain/loss leg. Chargebacks arrive 30-180 days after the charge. Reserves are an internal transfer, not external money. The model must enforce SUM=0 per balance transaction per currency as a continuous invariant.

Section 1 — Dimensional Model Overview

Fact Tables

  • fct_ledger_entriesThe "Atom" grain. One row per debit OR credit on (account, currency). The source of truth. Append-only.
  • fct_balance_transactionsThe "Business event" grain. One row per charge/refund/payout/chargeback. Groups its entries via balance_txn_id.
  • snap_account_balance_dailyThe "State" grain. Derived — running SUM per (account × currency × day). Pre-aggregated for fast balance reads.

Dimension Tables

  • dim_merchants (SCD2 — risk profile, reserve_pct) · dim_customers · dim_payment_methods (SCD2)
  • dim_accounts — Chart of accounts: merchant_balance, fees, reserve, tax, fx_gain_loss, chargeback_reserve.
  • dim_currencies · dim_fx_rates (SCD2 daily — FX rate locked at posting time)

Bookings, calendar, cancellations &amp; reviews

Airbnb · 2-sided marketplace · calendar-as-fact
Design a model for Airbnb supporting search/availability lookup, booking lifecycle (request → accepted → stayed → reviewed), cancellation-policy refund logic, host payouts after stay completion, and dynamic pricing — with full point-in-time queryability ("what was the price of listing X on July 4 last year?").

Why this is hard — the design tension

The calendar is the hot read path: a search query for "Paris, July 4-7, 2 guests" hits potentially millions of (listing × night) rows in milliseconds. Calendar-as-fact with daily snapshots is the canonical answer. Booking modifications require append-only — same booking_id can't UPDATE in place. Cancellation refunds depend on (policy × days-from-check-in) and must be computed at cancel time, locked on the booking row. Multi-currency pays guest in EUR, host in USD; FX locked at booking.

Section 1 — Dimensional Model Overview

Fact Tables

  • fct_calendar_nightsThe "Availability" grain. One row per (listing × night × snapshot_date). Daily snapshots enable point-in-time price lookup.
  • fct_bookingsThe "Reservation" grain. One row per booking lifecycle. Append-only — modifications create new rows.
  • fct_reviewsThe "Bilateral feedback" grain. One row per review (typed: guest_to_host OR host_to_guest).
  • fct_host_payoutsThe "Payout" grain. One row per payout event after stay completion.

Dimension Tables

  • dim_listings (SCD2 — price, policies, capacity drift)
  • dim_hosts (SCD2 — superhost status, response rate)
  • dim_guests
  • dim_geography (neighborhood, lat/lon, market_id)

Subscription billing + hourly usage metering (Snowflake / Datadog model)

SaaS · subscription + usage metering
Design a model for a SaaS with mixed pricing — flat-rate seats + usage-based metering (per-query, per-GB, per-API-call) — handling mid-cycle plan changes, prorations, overages, free trials, and multi-currency invoicing.

Why this is hard — the design tension

Two revenue streams with opposite shapes. Subscription is predictable, monthly, modeled at low row volume. Usage is volatile, hourly, billions of rows/day. Mid-cycle plan changes require pro-rated math. Trials look like paid plans but bill $0. Free-tier allowances reset each period. Multi-currency demands FX locked at usage time. The model must keep usage and subscription separate at the fact layer but join cleanly at invoice time.

Section 1 — Dimensional Model Overview

Fact Tables

  • fct_subscription_stateThe "Daily snapshot" grain. One row per (subscription × day). Captures plan, status, MRR.
  • fct_usage_meterThe "Hourly usage" grain. One row per (subscription × meter × hour). High volume.
  • fct_billing_eventsThe "Charge" grain. One row per subscription_charge / usage_charge / proration / refund / credit / dunning.
  • fct_invoicesThe "Statement" grain. One row per invoice — derived rollup of billing events.

Dimension Tables

  • dim_accounts · dim_plans (SCD2 — pricing tiers drift) · dim_meters (compute_credits, storage_gb_month, api_calls) · dim_promotions (SCD2)

Streaming royalty distribution (Spotify pool-based payout)

Spotify · streaming royalty pool

Why this is hard — the design tension

Spotify's royalty model is the cleanest example of a multi-party financial fact: each stream is a small event, but the dollars get split across artist + label + publisher + Spotify, with the rate calibrated per-country and the pool refreshed monthly. Mistakes are not just dollars — they're rights-holder lawsuits.

Two competing structural forces:

  • Pool-based pro-rata (Spotify's choice): every premium subscriber's $$ enters one country pool; each artist gets their streams ÷ total country streams × pool. Fast, simple, predictable.
  • User-centric (UCPS) (Apple/Deezer testing): each subscriber's $$ goes ONLY to the artists THEY listened to. Fairer to niche artists; computationally heavy at billions-of-users × millions-of-artists scale.

The query in co_sql_305-0375 implements pro-rata; the modeling section here covers the schema that makes both possible without a rewrite.

Section 1 — Dimensional Model

TypeTablesNotes
Dimdim_artist, dim_track, dim_album, dim_label, dim_publisher, dim_user, dim_country, dim_subscription_tier, dim_dateSCD2 on artist→label and track→album where rights can transfer
Bridgebridge_track_rightsholder (track_sk, party_sk, party_role, share_pct, effective_from, effective_to)SCD2 — owners change between accounting periods
Fact (event)fct_stream — one row per playback ≥ 30 secAppend-only; partition by play_date
Fact (periodic)fct_subscription_revenue — monthly per (user, country, tier)The denominator pool source
Fact (snapshot)fct_country_pool — monthly per country, computed from subscription revenue × ~70%Refreshed at month-close; never recomputed
Fact (output)fct_payout — one row per (track, party_role, country, month)Frozen at close; audit trail

TikTok For-You-Page recommendation feedback loop

TikTok · recommendation feedback loop

Why this is hard — the design tension

The For-You-Page is the most-watched ML system on the planet — billions of impressions per day, sub-200ms latency, and the recommendations train on themselves. Modeling this without producing biased feedback loops is the entire game.

The trap: echo-chamber bias. The ranker boosts videos that are already getting watched; users see them more; they get watched more; they're boosted more. New videos and new creators starve. The schema must support both online learning (continuous updates) and counterfactual evaluation (was the recommendation actually causal?).

Section 1 — Dimensional Model

TypeTablesNotes
Dimdim_user, dim_video, dim_creator, dim_sound, dim_hashtag, dim_deviceBridge: video↔hashtags is many-to-many
Fact (event)fct_impression — one row per video shown to user (BILLIONS/day)Sample to 1% for analytics; full data lives in feature store
Fact (event)fct_engagement — like, share, comment, watch_complete, swipe_awayJoined to impression by impression_id
Fact (counterfactual)fct_exploration_event — random or holdout videos (5% traffic)Critical for causal estimation
Fact (model)fct_ranker_score — predicted engagement at impression-timeStored alongside impression for off-policy evaluation

Stripe payment ledger (double-entry, replay-safe)

Stripe · double-entry ledger & idempotency

Why this is hard — the design tension

Stripe processes ~$1T/year. Every cent must reconcile across charges, refunds, fees, payouts, FX, taxes, chargebacks — and survive replay, retries, and partial failures. The whole system is an append-only double-entry ledger: nothing is updated, only debited or credited. This is the same model banks have used for centuries; the schema is just the SQL realization.

Section 1 — Dimensional Model

TypeTablesNotes
Dimdim_account (every party: customer, merchant, platform, fee, FX, tax, chargeback)Each is an "account" with a balance
Dimdim_currency, dim_country, dim_payment_methodSCD2 on payment_method when bank rules change
Fact (event)fct_journal_entry — every transaction emits N rows summing to zeroThe append-only spine
Fact (snapshot)fct_account_balance — derived per-account daily balanceMaterialized from journal entries
Bridgebridge_charge_journal — links a charge to its journal-entry rowsOne charge → many entries (revenue + fee + tax)

Salesforce account hierarchy + multi-tenant CRM

Salesforce · account hierarchy & multi-tenant

Why this is hard — the design tension

Enterprise CRM has a problem most B2C platforms never face: nested organizations. JP Morgan has 100 subsidiaries, each with their own opportunity flow, but the renewal happens at the parent level. A revenue rollup that doesn't aggregate up the tree is wrong; one that aggregates incorrectly double-counts subsidiaries.

Plus multi-tenancy: every Salesforce customer's data is in the same physical tables, partitioned by org_id. One bad query reads someone else's data. The schema design and the row-level security stack must enforce isolation absolutely.

Section 1 — Dimensional Model

TypeTablesNotes
Dimdim_account — recursive: parent_account_id FK to selfSCD2 when re-orged
Dimdim_user, dim_role (with role_hierarchy), dim_territoryPermission edges live here
Bridgebridge_account_user_access — many-to-many account↔user with permission_levelImplements record-sharing rules
Fact (event)fct_opportunity_stage_change — one row per stage transitionThe pipeline-velocity source
Fact (snapshot)fct_account_arr_snapshot — daily ARR per accountSurvives renewals + downgrades

Netflix streaming, series + bingewatching, late-arriving dimensions

Netflix · streaming + series + LAD

Why this is hard — the design tension

Netflix has the data problem most consumer platforms can only dream of: 250M subscribers, 15B+ play events per day, content metadata that propagates regionally over minutes-to-hours. Three structural challenges intersect:

  • Late-arriving dimensions (LAD) — a Netflix Original drops at midnight Pacific; play events fire before the title metadata propagates to every regional shard. INNER JOIN drops the views silently and Content Strategy stares at "0 plays in the first 3 minutes" of the most-anticipated launch of the quarter.
  • Series-vs-episode-vs-season grain — "Stranger Things S5 E1" is one episode, of one season, of one series, of one franchise. Aggregations must roll up cleanly without double-counting (a viewer watched 8 episodes — that's 1 series-bingewatcher, not 8 series-views).
  • Bingewatching as a derived behavior — bingewatching isn't recorded; it's computed from the play stream. The schema must support both per-session binge (3+ episodes within 24h) and per-day binge (any 3+ episodes of same series in a calendar day) without a re-write.

Section 1 — Dimensional Model

TypeTablesNotes
Dim — content treedim_franchise, dim_series, dim_season, dim_episodeEach row links to its parent FK; SCD2 on series_title (re-titled, re-genre'd)
Dim — userdim_account, dim_profile, dim_device, dim_subscription_tierProfile is a child of account; both visible in events
Dim — geodim_country, dim_isp, dim_cityFor QoE + regional content licensing
Fact — play event (raw)fct_play_event — one row per playback session startAppend-only; partition by play_date
Fact — heartbeatfct_play_heartbeat — every 60s during playbackFor QoE / completion rate; downsampled in warehouse
Fact — derivedfct_binge_session, fct_series_streak — computed nightly from playsMaterialized so analyst queries don't re-derive
SCD2 bridgebridge_episode_title (episode_id, title, effective_from, effective_to)For temporal-correct attribution of late-arriving plays

M&amp;A integration, golden records, schema strangler, multi-org metrics

M&A integration · golden records

Why this is hard — four data problems pretending to be one deal

Every acquisition press release says "stronger together." Every acquirer's data team knows that means two incompatible CRMs, two warehouses, two ID systems, two fiscal calendars, and an executive expecting unified KPIs on day 90. The interview question: "your company just bought a $20B SaaS — walk me through the data integration". The senior signal is naming all four problems first, then sequencing them:

  • Entity resolution — is their Acme Corp the same as our Acme Corp? Golden record + crosswalk + an ordered rule playbook (strict-then-fuzzy).
  • Schema unification — both apps keep shipping; can't freeze either DB. Strangler pattern with dual-write + CDC + reconciliation.
  • Warehouse consolidation — Snowflake vs Databricks vs BigQuery. Three strategies: big-bang, federated query, or share-and-bridge.
  • Multi-org metrics — their DAU isn't your DAU. Fiscal calendars, FX policy, consolidation rules, overlapping ARR. Versioned dim_metric_definition.

Full deep-dive: the M&A Integration page walks all four patterns end-to-end with Adobe-Figma, Microsoft-Activision, Cisco-Splunk, and Salesforce-Slack as live case studies.

Section 1 — Dimensional Model

TypeTablesNotes
Master data — goldengolden_customer, golden_account, golden_userUnified entity; SCD2 on canonical fields; lineage via sources[]
Crosswalk — xrefcustomer_xref, account_xref, user_xrefSource ID ↔ golden ID; append-only with superseded_by for un-merge
Source — per systemacquirer_crm.customers, target_crm.tenantsKept native; both apps continue writing during strangler
Unified — read modelunified.customers, unified.accounts, unified.ordersEventual destination of dual-write; reads through customer_xref
Reconciliationrecon.daily_driftSource row counts vs unified row counts; alerts > 0.1% drift
Metric versionsdim_metric_definitionorg_scope · window · dedup_grain · fiscal_calendar · currency_policy
Match candidatesmatch_candidatesTier 3-4 probabilistic matches staged for MDM-analyst review

Fraud detection · identity graph, impossible travel, account sharing, model replay

Fraud · identity graph & decision replay

Why this is hard — five problems pretending to be one model

Every interview asks the same question: "design the data model for a fraud system at a fintech / marketplace / streaming product." Junior answers stop at "a transaction table with a risk_score column." The senior signal is naming the five orthogonal sub-problems first:

  • Identity resolution — the SAME user appears across 3 emails, 2 phones, 4 devices, 5 IPs. Build a graph, not a row. Strong edges (same payment instrument) vs weak edges (same IP within 1h).
  • Impossible travel / multi-geo — login from Lagos, transaction from Berlin 18 min later. Velocity of physical position vs known device/SIM.
  • Account sharing detection — one user, many simultaneous-session devices/geos (Netflix family-plan abuse, Spotify subscription sharing). Inverse of multi-account fraud: one identity using many accounts looks like sharing; many identities sharing one account looks like ATO.
  • Decision replay — at audit time you must reproduce why you blocked the September 14 transaction. Means: freeze the model version, ruleset version, and feature snapshot at decision time. SCD2 everywhere.
  • Feedback loop — chargebacks come back 30-180 days later; analyst verdicts hours later. Both must flow back as labels to retrain the next model — without the loop, the model degrades silently.

Common interview pitfall: conflating multi-account fraud (one bad actor opens 50 accounts to abuse signup bonuses) with account sharing (one legit user shares Netflix with their parents) with account takeover (one bad actor uses one legit user's compromised account). All three look like "identity ↔ account doesn't match 1:1" in the data — the schema must distinguish them, or every detection rule produces the wrong false-positive class.

Section 1 — Dimensional Model

TypeTablesNotes
Subject dimensionsdim_user, dim_account, dim_device, dim_payment_instrumentSCD2 on the risk-bearing attributes: kyc_status, account_state, device_trust_score, bin_country
Location dimensionsdim_ip, dim_geoJoined per request. is_proxy, is_vpn, is_datacenter, asn, asn_risk_class from third-party feed (SCD2)
Identity graph (bridge)brg_user_device, brg_user_ip, brg_user_payment, identity_clusterAppend-only edges with edge_strength (1=strong: shared payment instrument; 0.3=weak: shared IP > 24h apart). identity_cluster.cluster_id rolls up edges into a connected component
Decision-context dimensions (frozen)dim_ruleset, dim_risk_model, dim_feature_snapshotSCD2 ESSENTIAL — the only way to replay a 6-month-old decision. ruleset_id + model_id live on every fact row. Feature snapshot is a frozen subset of computed values used by the model at decision time
Decision factsfact_login_attempt, fact_transaction, fact_session_eventOne row per attempt. Includes risk_score_at_decision, decision (allow/step_up/block), ruleset_id, model_id, feature_snapshot_id
Signal facts (append-only)fact_rule_firing, fact_model_scoreOne row per rule firing or model evaluation. Lets you trace which signals contributed to a decision. Volume is high — partition by day, keep 90-day hot tier
Review & labelsfact_case, fact_analyst_label, fact_chargebackCases opened on threshold cross. Analyst verdicts (fraud / clean / inconclusive) and chargebacks (30-180 day delayed) feed back as labels for the next training cycle
Outcome facts (delayed)fact_outcome_labelMaterialized union of analyst labels + chargebacks; the training-set source of truth. Reconciles disagreements (analyst said clean, chargeback came back 90d later → final label = fraud)

Growth accounting · DAU/WAU/MAU, the new/retained/resurrected/churned state machine, cohort LTV

Growth accounting · DAU/WAU/MAU & LTV

Why this is hard — a single MAU number hides a leaky bucket

"How many monthly active users do we have?" is the question a junior answers with COUNT(DISTINCT user_id). The senior answer: that number is nearly useless on its own. A flat 10M MAU can mean a healthy stable product OR a bucket leaking 2M users a month and refilling with 2M new ones — a business about to fall off a cliff. The framework that fixes this is growth accounting (the Social Capital / Reforge model), and it forces four design problems:

  • Define the activity event. Netflix counts a viewing session ≥ 2 min; Uber counts a completed trip; Spotify counts a track play ≥ 30s. The grain of fact_activity IS the product's definition of "active" — get it wrong and every downstream metric is wrong.
  • The state machine. Every period, each user is exactly one of: new (first-ever active period), retained (active last period too), resurrected (was dormant, came back), or churned (active last period, gone this period). Plus, for revenue: expansion and contraction.
  • The frozen cohort. A user's cohort = the period of their first activity, and it never changes. Cohort retention and LTV curves are only meaningful if the cohort key is immutable — which means it's a derived dimension, computed once, SCD0.
  • The generalized value column. The whole framework runs on one column, inc_amt — incremental value created by a user in a period. Set inc_amt = 1 and you get pure engagement growth accounting; set inc_amt = daily_revenue and the identical query produces MRR growth accounting with expansion/contraction. One model, two metrics.

The two identities that make it work: MAU(t) = retained(t) + new(t) + resurrected(t) and MAU(t−1) = retained(t) + churned(t). Subtract them: ΔMAU = new + resurrected − churned. That decomposition is the entire point — it turns "MAU is flat" into "we added 2M new + 0.3M resurrected and lost 2.3M churned," which is an actionable, alarming, completely different statement.

Section 1 — Dimensional Model

TypeTablesNotes
Atomic activity factfact_activityThe grain: one row per (user, day, activity). Holds inc_amt. Netflix=viewing-session, Uber=completed-trip, Spotify=track-play. Everything else derives from here
Period rollupsagg_dau, agg_wau, agg_mauOne row per (period, user) with summed inc_amt. date_trunc rollups — materialized so the growth-accounting joins don't re-scan raw activity
User dimensiondim_userSCD2 on plan / country / acquisition_channel. The slowly-changing attributes you'll want to slice growth accounting by
Cohort dimension (frozen)dim_user_cohortSCD0 — first_dt, first_week, first_month, acquisition_channel_at_signup. Computed ONCE; never updated. The immutable cohort key
Growth-accounting factsfact_dau_growth, fact_mau_growth, fact_mrr_growthOne row per period: active, new, retained, resurrected, churned (+ expansion, contraction for MRR). Produced by the FULL OUTER self-join offset by one period
Cohort-retention factfact_cohort_retention(first_period, active_period, periods_since_first, active_users, inc_amt). The long-form retention triangle a BI tool pivots into a heatmap
Cohort-LTV factfact_cohort_ltvCumulative: (first_period, periods_since_first, retained_pctg, cum_amt, cum_amt_per_user). The LTV curve per cohort
Metric-definition dimensiondim_metric_definitionSCD2 — versioned definition of "active" (session length threshold, dedup grain, period boundary). When marketing changes the bar, OLD numbers must not silently move

Spatial computing & XR — immersive sessions, the sensor firehose, and biometric consent

Spatial/XR · Meta Quest · Apple Vision Pro

Design the data platform for a spatial-computing platform (Meta Quest, Apple Vision Pro). A fleet of headsets streams eye-, hand-, and room-sensors during immersive sessions. The model must serve Product (engagement & comfort), Trust & Privacy (biometric + spatial-map consent), and Monetization (store + attention) — without drowning in raw telemetry or mishandling the most sensitive data a consumer device has ever collected.

Why this is hard — the device records your body and your home

A phone knows where you tapped. A headset knows where your eyes went, the geometry of your living room, and how your hands move — at 60–120 Hz. Two facts make this unlike every other scenario in this catalog:

  • The most sensitive data classes, at firehose rates. Eye-gaze is biometric; the room mesh is a 3-D map of a private space. Both stream continuously. You cannot weld a 90 Hz pose firehose to a queryable analytics fact — the same boundary lesson as raw GPS in the ride-share trip log (#1).
  • Capability is not uniform. Vision Pro has eye-tracking; a Quest controller session does not; phone-AR has neither. What data even exists is a function of the device — so device capability is a dimension that gates the schema, not a footnote.

The robust answer is a device-centric star: dim_device conformed, the immersive session as the analytic grain, raw sensor streams in an object store keyed back to the session, and a temporal dim_consent that gates what may ever be read.

Autonomous fleet — disengagements, incident replay, and the versioned "driver"

Autonomous · Waymo · Tesla · Uber AV

Design the data platform for a driverless ride-hail fleet (Waymo, Tesla robotaxi, Uber AV). The autonomy stack replaces the human driver. The model must serve Operations (dispatch & operational domain), Safety & Regulatory (disengagements & incident replay), and Finance (a mixed human + AV marketplace) — at terabytes of sensor data per vehicle per day.

Why this is hard — there is no driver to blame

Every other ride-share model (#1, #2) attributes a trip to a driver. Here the driver is software. That single fact reshapes the schema:

  • The "driver" is a stack of versioned dimensions. Responsibility for any action traces to a software release × HD-map version × sensor-calibration × operational design domain — each of which changes over time and must be reconstructable to the millisecond.
  • The regulator owns a fact table. A disengagement (autonomy hands control to a safety driver or remote operator) is the safety currency — counted, reported (e.g. CA DMV), and litigated. It is append-only and must replay against the exact versions that were driving.
  • The sensor log is a firehose at extreme scale. Camera + lidar + radar is terabytes/vehicle/day. None of it belongs in the warehouse; the analytics layer holds events and pointers, the raw drive-logs sit in object storage.

This scenario is the capstone: it reuses the catalog's SCD2 versioning, decision-replay, temporal geofence, firehose boundary, and conformed-dimension patterns — at safety-critical grade — and folds back into the marketplace fct_trips from #1.

Recurring events, time zones & free/busy

Google Calendar · productivity · recurrence-as-rule + occurrence-as-fact
Design the analytical model behind Google Calendar: recurring events with per-instance exceptions, IANA time zones with daylight-saving correctness, invitations with accept / decline / tentative, and "find a time" free/busy across many attendees — with point-in-time queryability ("what did my Tuesday look like as Calendar showed it last Monday?").

Why this is hard — the design tension

A recurring event is one logical row that denotes an unbounded set of occurrences — "every weekday, forever" is a single rule and an infinite series. Materialize occurrences as rows and you sign up for a horizon plus a forever-running backfill; expand the rule on every read and you recompute a deterministic answer, with a FREQ=MINUTELY rule waiting to blow up a query. The senior answer mirrors Airbnb's calendar-as-fact: keep the rule in an SCD2 series dimension, materialize an occurrence fact to a rolling horizon, and overlay exceptions as an append-only ledger keyed by recurrence_id. Time is the second landmine: a 09:00 stand-up must stay 09:00 across a DST change, so the rule lives in local wall-clock + IANA zone and the UTC instant is derived per occurrence — never frozen.

Section 1 — Dimensional Model Overview

Fact Tables

Dimension Tables