Model Netflix's ad-supported tier — every ad slot, frequency caps, contractual pacing, and the make-good penalty when a guaranteed buy comes up short. The single decision that separates a senior answer: store the empty slots too. A full working through — data flow, schema, the invariant, streaming Python, the nightly pacing close, the analytics SQL, and the dashboard.
Programmatic advertising sells what it can. Connected-TV advertising sells what it promised — and pays a penalty when it misses. That one difference turns an analytics warehouse into a finance system, and it is the whole question.
"Design a model for Netflix's ad-supported tier — capture every ad-slot opportunity, filled or unfilled, every ad served, frequency caps per user, and the contractual pacing guarantees per advertiser: ten million impressions by month-end, with a make-good if we under-deliver."
The trap in this prompt is that it sounds like a logging problem. Count the ads, sum the revenue, build a dashboard — a junior answer ships an fct_ad_impressions table and stops. But the prompt buried three forces in one sentence, and they pull in different directions. Maximize fill rate, so no slot goes to waste. Honor frequency caps, so a viewer does not see the same Ford spot eight times in one film. And hit the committed volume per advertiser per month, because the contract has teeth. A model tuned for any one of these starves the others, and the only artifact that lets you reason about all three at once is the one a logging mindset never builds: a row for the ad that didn't play.
So before any boxes and arrows, the working frame for the session — three grains, three stakeholders, three tempos:
Scope is the first scored dimension, and most candidates skip it. Said out loud: in scope is the inventory and contractual model — opportunities, impressions, caps, pacing, make-goods — and the analytics that hang off them. Out of scope, explicitly: the real-time bidding auction itself (treated as an upstream service that hands us a winner or a null), the creative-rendering and ad-stitching pipeline, billing and invoicing systems (we hand finance a liability, not an invoice), and identity resolution across devices (we assume a resolved profile_id). The caveat: the model must not preclude programmatic — guaranteed buys and auction wins share the same opportunity row, distinguished by a fill reason, not a separate table.
Then the envelope math, volunteered rather than extracted. Mid-2025-shaped numbers for the ad tier:
| Quantity | Estimate | Consequence |
|---|---|---|
| Ad-supported profiles | ~94 M | The audience the caps and pacing run against |
| Avg ad load | ~4–5 min / hr | Sets slots per viewing hour |
| Avg slots per session | ~6 | Pre-roll + a handful of mid-rolls |
| Opportunity rows / day | ≈ 1.5–2 B | The number that shapes the whole architecture |
| Fill rate | ~85–92 % | So ~10–15% of rows are the precious null-creative ones |
| Unfilled rows / day | ~150–250 M | Doubles nothing — it's the diagnostic surface |
| Active campaigns | ~10³–10⁴ | The cardinality of pacing & make-good math |
| Make-good exposure | $ millions / quarter | A balance-sheet liability, not a metric |
Notice where the weight sits. The opportunity table is the firehose — a couple of billion rows a day, partitioned by date so it stays queryable — but the row that earns the architecture its keep is the unfilled one. Two billion rows is an engineering problem you solve with partitioning and columnar storage. The hundred-and-fifty million empty rows are a modeling decision, and refusing to make it is what fails the interview.
One opportunity row is born at the ad server, enriched if it fills, rolled up nightly into pacing, and — at month close — distilled into a liability. The spine is an append-only opportunity log feeding three derived layers that each run at their own tempo.
Three properties of this picture do most of the interview's work. First, the ad decision server writes a row whether or not it fills the slot — the unfilled branch is not an error path, it is the diagnostic that later answers "why did we leave money on the table?" Second, the frequency-state store is the only hot, low-latency read on the decision path: a tiny counter keyed by (profile × campaign × day) that the server consults before every fill, and the only place in the design where staleness costs money in real time. Third, pacing and make-goods are derived, not live — a daily roll-up and a month-close job — because a contract is measured against a calendar, not a clock, and pre-aggregating the snapshot keeps the account-manager dashboard fast.
When in doubt, do not fill. The frequency-cap read is the one place a stale value is dangerous, so the cap check fails closed: if the counter store is unreachable or ambiguous, the server declines the impression and emits an unfilled row with reason cap_check_unavailable. Over-delivering against a cap means a viewer sees the same spot too many times and the campaign's frequency promise is broken; under-delivering means an empty slot we can always diagnose later. One of those is a degraded experience we logged; the other is a contract violation we can't take back.
The schema falls out of the grain question. A wide append-only opportunity fact at the slot grain; a thin impression fact for the slots that filled; a daily pacing snapshot; a hot cap-state table; and a make-good ledger that finance treats as a liability.
This is the spine. One row per ad-slot opportunity, written by the decision server. The served_creative_id is nullable by design: a null is not missing data, it is a fully-formed fact that the slot existed and went unfilled, qualified by an unfilled_reason. The mirror of that column, fill_reason, records how a filled slot was won — a guaranteed buy, an auction, or a house ad — so guaranteed and programmatic inventory live in one table, distinguished by a value rather than a schema.
The filled_xor_unfilled constraint is doing real work, and it is worth saying out loud in the room: it makes the table physically incapable of holding an ambiguous row. Every slot is either won-and-explained or empty-and-explained — never both, never neither. That is what turns fill rate from a fragile derived metric into a trivial COUNT(*) FILTER, and it is the difference between a model that lies quietly at quarter-end and one that can't.
Filling a slot is a promise to play; an impression is the proof it played. The impression fact is a thin child of the opportunity, keyed by opportunity_id, carrying the quartile-completion signal the beacons report and the recognized revenue — the money that actually counts toward delivery, which is not always the same as the money that was predicted.
Pacing is not computed on the fly. It is a daily snapshot per (campaign × day) carrying a pacing_index — where 1.0 means exactly on track — so an account manager's morning dashboard reads one small pre-aggregated table instead of scanning two billion opportunity rows. When the flight closes short, the make-good obligation is born: a row in a ledger that finance carries as a balance-sheet liability until it is fulfilled with bonus inventory next period.
The dimensions are conventional and mostly SCD2 — dim_advertisers, dim_campaigns, dim_creatives, dim_titles (content ratings drive ad eligibility), dim_profiles (the tier column gates the whole audience) — plus one that earns special mention: dim_campaign_flights, which holds the start_date, end_date, and total_committed_impressions that the entire pacing-and-make-good machinery measures itself against. The flight is the calendar the contract lives on.
This system's correctness lives in two places: a counter the ad server increments atomically so a frequency cap is never silently breached, and a structural promise that every opportunity is conserved — filled plus unfilled equals offered, always.
Walk the decision. An ad break opens; the server has a slot to fill. It pulls the candidate campaigns, and for each one it must answer a question the marketplace cannot get wrong: has this profile already hit its cap? That read-then-decide is a race — two ad breaks for the same viewer can open within milliseconds on two servers — so the increment has to be atomic, and it has to be conditional on staying under the cap. The guard is the lock.
In production this counter lives in a key-value store, where the same logic is an atomic INCR guarded by a compare against the cap; the SQL above is the mirror that makes the invariant legible. The point is identical either way: the cap is enforced by a single conditional write, not by reading the count and trusting it. Two simultaneous ad breaks racing for a viewer's last allowed Ford impression resolve with exactly one winner — and the loser does not error, it falls through to the next campaign or writes an honest empty row.
Every slot exits to exactly one terminal state, and both terminals are recorded. That is the second invariant, and it is a conservation law: for any campaign, day, or title, opportunities = filled + unfilled, with no remainder. Because the opportunity fact's filled_xor_unfilled constraint enforces it at write time, the accounting can never drift — there is no third bucket where lost slots hide. Fill rate, the most-watched number in the building, is therefore exact and cheap, and "money on the table" is just the unfilled side of the same ledger.
Three small programs carry the decision path: the slot resolver that fills or refuses, the impression-beacon consumer that promotes a fill into a verified play, and the cap-check primitive they both lean on. Each is small; the judgment is in what they refuse to do.
The resolver's most important behavior is the one a junior version omits: it always emits a row. Every branch — capped, no demand, unsuitable content, cap store down — ends in a write. There is no path where a slot quietly evaporates, because an evaporated slot is exactly the revenue leak the model exists to measure.
One carve-out, always stated: house ads are the floor, not a branch here. Some designs backfill an unfilled premium slot with an owned-and-operated promo so the screen is never black; that is a policy layer above this resolver, and it changes the fill_reason to house_ad rather than suppressing the unfilled accounting. The diagnostic still sees that no paying demand cleared — we just chose not to show a blank.
A filled opportunity is a promise; the player's quartile beacons are the receipt. The consumer folds 0/25/50/75/100 pings into one impression row, and — the senior move — it only recognizes revenue past the contractual completion threshold, because a viewer who skipped at the two-second mark did not deliver the impression the advertiser bought.
Two derived jobs turn rows into accountability. A nightly roll-up computes each campaign's pacing index against an even-delivery target; a month-close job reads the flights that ended and mints the make-good liability for the ones that fell short.
The pacing index is the contract's pulse. The even-pace target for a flight is simply total_committed ÷ flight_days; cumulative delivery divided by what should have been delivered by now gives an index where 1.0 is on track, below is behind, above is ahead of plan. The subtlety is that pacing must not over-correct: a campaign at 0.6 with twenty days left has time, while the same index with three days left is a five-alarm fire. So the job emits both the index and days_remaining, and the alerting threshold is a function of both — pacing is read with the calendar, never alone.
The month-close job runs once, the morning after a flight ends. For every campaign whose flight closed in the period, it compares cumulative delivery to the commitment; any shortfall becomes a make-good row, priced at the contracted CPM, and lands in finance's ledger as a pending liability. Nothing about this is approximate — a make-good is a real obligation to deliver bonus impressions next period or refund, and it sits on the balance sheet until status moves to fulfilled.
The opportunity fact and the contractual layer are where the system explains itself. Three queries an interviewer loves, because each carries a classic pattern on its back — conditional aggregation, a window-LAG onset detector, and a liability roll-up.
The flagship report, and the reason the empty rows exist. Fill rate is a single COUNT(*) FILTER, and the unfilled reasons fan out into a diagnosis: no_eligible_demand is the sales team's problem, frequency_capped is the cap-policy team's, content_unsuitable is eligibility rules. One scan tells you which lever to pull.
The account manager's morning alert. The pattern is a guarded read of the snapshot — index below a threshold and little runway left — that computes the catch-up rate now required. Note that "behind" alone is not actionable; "behind with no days left" is.
Finance wants the standing liability; the cap-policy team wants to know the moment a campaign started bleeding fills to frequency caps. The first is a straight roll-up of the obligation ledger. The second is the classic onset detector: LAG over an hourly cap-block rate to find where it first crossed a line — the same gaps-and-onsets shape that finds a surge or an incident.
A senior design ends with observability, because every careful accounting choice above is invisible without it. The inventory dashboard serves three audiences at once — ops watches fill, account managers watch pacing, finance watches the liability — and each reads a different definition of "healthy."
Read the tiles together and the dashboard narrates a specific tension. Fill rate looks healthy at 88.6%, which a logging-only model would call a win and stop. But the unfilled-reason bar says most of the empty slots are no_eligible_demand — a sales problem, not an ops one — and the cap-block rate is creeping up, which means the frequency policy is starting to suppress otherwise-fillable inventory. Meanwhile twelve campaigns are behind with the month nearly out, and the make-good liability has already grown to $6.8M. None of that is visible from impressions alone. It is visible only because the empty slots were stored as first-class rows.
Strip the ad-tier specifics away and the question was testing five judgments, each of which generalizes far beyond CTV advertising: