Model the engagement behind an Instagram-scale feed — a hundred billion impressions a day, every one stamped with the model that ranked it, mutable likes that may never be deleted, and a single celebrity post that can melt a partition by itself. One prompt, three stakeholders who want contradictory truths, and the one decision that separates a senior answer: the impression is the fact, and the model version rides on every row. A complete working-through of data flow, schema, streaming Python, the aggregation layer, the ranker-lift SQL, and the dashboard that proves it.
Every data engineer who works near a social product eventually meets this question. It sounds like "log the likes." It is really "design a fact table that a ranking team, a creator-payments team, and a brand-safety team can all query without lying to any of them" — and those three stakeholders do not want the same number.
"Model engagement on the Instagram feed — impressions, dwell time, likes, saves, comments, shares, plus the negative signals: hides and reports. It has to support feed-ranker training and A/B testing, creator analytics, and brand safety, at a hundred billion impressions a day. How do you grain it?"
The feed is the most-instrumented surface humans have built, and the trap is not the volume — it is that the cheapest mental model is wrong in a way that quietly poisons everything downstream. The naive design stores a counter: post 001 has 9,412 likes. That number cannot answer a single question the business actually asks. It cannot tell the ranking team whether the new model earned those likes or inherited them. It cannot tell finance which of them were unliked four minutes later — the canary for a low-quality recommendation. It cannot tell brand safety whether the hides clustered in one ugly hour. The moment you collapse engagement to a count, you have thrown away the dimension that every stakeholder is secretly asking about: which impression, ranked by which model, at what position, produced this reaction, and was it taken back?
A weak answer models the post and decrements a like counter on unlike. A strong answer notices that the unit of truth is not the post and not the like — it is the impression: one row every time a viewer is shown a post, stamped with the model that chose it. Engagements then hang off impressions, append-only, an unlike recorded as a second row rather than a deletion. Get that grain right and the ranker A/B becomes a GROUP BY; get it wrong and it becomes a separate ETL pipeline you will rebuild every quarter. So before any boxes and arrows, the working frame for the whole session:
Scope is the first scored dimension, and most candidates skip it. State what you build, what you ignore, and the numbers that shape every later choice. Out of scope here, said explicitly: the ranking model itself (treated as a callable service that returns scored posts), the media CDN and transcoding, the social graph and follow-fan-out service, and the feed UI. In scope: how an impression becomes a row at a hundred billion a day, how the model version is stamped so A/B lift is a query rather than a project, how a mutable like is recorded without ever deleting, how dwell collapses from heartbeats to a per-impression number, and — the part that separates the seniors — how the system survives the single post from a celebrity that arrives a million times a second.
Then the envelope math, volunteered rather than extracted. Instagram-shaped numbers:
| Quantity | Estimate | Consequence |
|---|---|---|
| Impressions / day | ≈ 100 B | ≈ 1.2 M/s average, multiples at peak — the firehose that defines the table |
| Raw dwell heartbeats | ≈ 12 T / day | 250 ms × ~30 s/impression — must die at ETL, never reaches the warehouse |
| Engagements / day | ≈ 5 B | ~5% of impressions react; append-only, so unlikes add rows, not subtract |
| Concurrent ranker versions | 5–10 | Forces ranker_model_id onto every impression as a first-class dimension |
| Hottest single post | ~10⁶ impressions/s | One celebrity melts any post-keyed partition — the row that shapes the architecture |
| Distinct viewers / post / day | up to 10⁸ | Exact COUNT DISTINCT is unaffordable; HLL sketches are mandatory |
| Raw event retention | 30 days | PII; aggregate to durable per-post and per-user rollups, expire the raw |
Notice which row does the architectural work. The hundred-billion firehose dictates the partitioning and the sampling and the bill — but the row that dictates the shape is the hottest post. A uniform hundred-billion stream is easy; you hash it and walk away. The cruelty of a feed is that the distribution is violently skewed: the median post is seen a handful of times and a single celebrity post is seen a million times a second, into the same partition, at the same instant. Any design that keys the write path on post_id has already lost — that post is a hot partition the moment it is published. The rest of this article follows that skew.
One feed, two tempos. The hot path logs an impression and freezes the model's prediction in under the serving budget; everything heavy — dwell aggregation, the per-post and per-user rollups, the ranker-lift join — happens offline, downstream of a single ordered log that is partitioned by viewer, never by post.
Three properties of this picture do most of the interview's work. First, the log is partitioned by viewer_id, never by post_id — because impressions of one celebrity post are produced by millions of different viewers, so keying on the viewer scatters that spike across every partition while a post-key would funnel it into one. The hottest object in the system is structurally prevented from being a hot partition. Second, the prediction is frozen on the hot path: the gateway stamps the serving ranker_model_id and its predicted_score onto the impression the instant the post is shown, before any reaction exists — the only moment that prediction is uncontaminated by what later happened. Third, the dashed lines are deliberate: raw dwell heartbeats are transient and collapse a hundred-to-one before any durable table sees them, and the ranker-lift join is offline against the append-only facts — the comparison that licenses promoting a model is a GROUP BY on a column, not a bespoke experiment harness.
Aggregate before you durably store; append rather than mutate; and never let one object key the write path. The twelve-trillion-row heartbeat stream is summarized to a hundred billion impression-dwell rows at ETL and the raw is thrown away — storing it would be a liability with no buyer. Engagements append: an unlike is a new row, because the warehouse's job is to remember what happened, not to depict the present. And the partition key is the viewer, so a post going viral degrades into more rows spread evenly, never one partition on fire. Each rule trades a tempting shortcut for the property that keeps the system queryable at the 99th percentile of skew.
The schema falls out of the grain question. The impression is the spine, stamped with the model that chose it. The engagement is the reaction, append-only. The dwell is the time-on-post, pre-aggregated. The dimensions are SCD2 because models, posts, and creators all change underneath the facts — and the per-post and per-user rollups exist because nobody can scan a hundred billion rows for a dashboard.
The impression fact is the most-written table in the design, so it is about discipline under skew. One row per (viewer × post × impression), stamped with the serving context the analytics layer will need: the feed_position where position-bias lives, the surface that separates feed from reels from explore, and — the column that makes A/B a query — the ranker_model_id with the predicted_score it chose on. The model id is a foreign key into a versioned dimension, not free text, because "which model" must survive a redeploy.
Engagement is a separate fact, joined to the impression by impression_id — nullable, because a comment can arrive on a post the viewer reached by deep link, with no prior feed impression. Every event type lives here: the positives the ranker optimizes and the negatives brand-safety polices. The non-negotiable design choice is that it is append-only. A like and its later unlike are two rows; the unlike is the original marked is_undone = TRUE with an undone_ts, never a deletion. That preserves the toggle time-series — which is the only way to answer "what fraction of likes are taken back within five minutes," the sharpest canary for a ranker pushing regret-bait.
Dwell is the third fact, and it exists only because something upstream refused to let the raw grain survive. The client emits a heartbeat every 250 milliseconds a post is on screen; at a hundred billion impressions of ~30 seconds each that is twelve trillion rows a day — a number with no warehouse and no purpose. The streaming job sums those heartbeats into one dwell_ms per impression and discards the rest. What lands is a hundred-billion-row table, the regression target the ranker trains on, with the watch-through signal for reels baked in.
Two structural pieces finish the model. The dimensions are SCD2 because the things they describe change while the facts keep flowing: a creator's follower count, a post's caption or swapped audio, and above all the ranker model — semver, deployed percentage, model-card URL — so that a query about a fact from last Tuesday joins to the model as it was last Tuesday. And the rollups exist because no dashboard can scan the spine: mart_post_engagement_daily pre-aggregates per (post × day) with an HLL sketch for distinct reach, and mart_user_affinity_daily rolls per (viewer × day) for personalization, both de-identified enough to keep long after the raw facts expire at thirty days.
One subtlety worth saying out loud at the whiteboard: the rollups do not store a count of likes — they store a rate, computed from the append-only facts with is_undone = FALSE applied at aggregation time. The raw facts remember every toggle; the marts remember only the settled truth. That separation is what lets the same engagement table serve a regret analysis (which needs the toggles) and a creator dashboard (which needs the net) without either one corrupting the other.
The whole correctness of this system lives in one rule joining two halves: engagement facts are never mutated — an unlike appends — and every impression carries, frozen at serving time, the model version and score that chose it. Together they make two otherwise-impossible questions answerable: "did the new ranker actually lift engagement?" and "how often does the feed make people regret a tap?"
Why this is the invariant and not a nice-to-have. Counters lie by forgetting. A like-count that decrements on unlike can tell you the present and nothing else — it cannot reconstruct that the like happened, that it survived four minutes, that it was then withdrawn. The ranker's most valuable training label is buried in exactly that history: a like taken back inside five minutes is a near-miss the model should learn from, indistinguishable from a like that never happened if you let the DELETE through. So the engagement table appends, always, and "active" is a query — the latest non-undone row — not a stored state. Symmetrically, the model id must be stamped at impression-time, because models redeploy daily and traffic ramps continuously; if you do not capture that this impression was served by v3.2 at the instant it happened, you can never reconstruct it, and the A/B comparison the whole feed team depends on evaporates.
Read the lifecycle left to right and notice that nothing is ever overwritten. The model id is frozen between RANKED and IMPRESSED, before any reaction exists. The like at T1 is a row; the unlike at T2 is a second row, not an edit of the first; the settled state is computed, not stored. Because every step is an append, the gap between the impression and the analysis can be arbitrarily long and the comparison stays honest — a post shown ten million times under v3.2 with a high predicted score and a wave of five-minute unlikes is not noise, it is the cleanest possible indictment of that model, and you only have it because nothing in the chain was allowed to mutate.
The ranker_model_id foreign key and the immutability of the facts are doing the quiet work together. Because the model id is on the row and the row never changes, the same impression can be attributed to exactly one served model forever, and a head-to-head between v3.1 and v3.2 is a GROUP BY over a column — no separate experiment store, no shadow pipeline, no reconciliation. That is the whole payoff: the experiment infrastructure is the schema.
Three programs carry the system. The gateway that stamps the model id and partitions by viewer so no post can melt a shard, the heartbeat collapser that turns twelve trillion rows into a hundred billion, and the engagement applier that appends an unlike instead of deleting a like. Each is small; the judgment is in what they refuse to do.
This runs on the serving hot path. It does the minimum and emits asynchronously: it freezes the ranker_model_id and predicted_score onto the impression at the instant the post is shown, and — the load-bearing decision — it chooses the log partition by hashing the viewer_id, never the post_id. A celebrity post is impressed by millions of distinct viewers, so a viewer key scatters that million-per-second spike across all partitions, while a post key would bury it in one. The hottest object in the product is, by construction, incapable of being a hot partition.
One carve-out, always stated: the gateway never resolves the model id to a name on the hot path. It writes the surrogate ranker_model_id only; the join to semver and deployed-percentage happens offline against the SCD2 dimension. Resolving on the hot path would couple a 1.2-million-per-second writer to a slowly-changing lookup, for a label nobody reads until analysis time. Stamp the key, defer the meaning.
The aggregation layer turns the append-only firehose into something a dashboard can read in milliseconds. Two ideas carry it: HLL sketches that count distinct viewers without storing them, and per-post / per-user rollups that materialize the settled rates so nobody ever scans the spine. The craft is in counting drivers, not pings — distinct viewers, not impressions — and in computing rates from the append-only truth.
The intuition first, because it is what an interviewer wants out loud. "How many people saw this post?" sounds trivial until you remember a single post can reach a hundred million distinct viewers in a day, and exact COUNT DISTINCT at that cardinality, across petabytes, is a sort the size of a small country. The answer is a HyperLogLog sketch: a few kilobytes that estimate distinct count to within a couple of percent and — the property that makes them indispensable — merge. A per-hour sketch unions into a per-day sketch with no re-scan; a per-post sketch unions across posts for a creator total. You never store the viewers; you store the sketch, and the sketch composes.
The per-user rollup is the mirror image and feeds personalization rather than reporting: per (viewer × day), how many impressions, how many settled engagements, the average dwell, the dominant audio — a compact affinity vector that survives long after the raw impressions expire at thirty days, because it is aggregate and de-identified where the raw is PII. The two marts are why the warehouse stays sane: the spine is written once and scanned by batch jobs, while every interactive query hits a table that is already at the grain the question asks. A dashboard never touches a hundred billion rows; it touches the few million rows of the rollup that already answered.
The three facts are where the system explains itself. Three queries an interviewer loves, because each one carries a classic pattern on its back — conditional aggregation for the creator funnel, a window LAG for the brand-safety velocity, and the model-versioned GROUP BY that is the entire point of the ranker A/B.
The bread-and-butter question: per post, what are the like / save / share rates, the unique reach, and the average dwell — counting only active engagements, with the undone ones filtered out. The pattern is conditional aggregation: a single scan with FILTER clauses fanning the one stream of engagements into per-type counts, joined to impressions for the denominator and to dwell for the time-on-post.
A post going wrong is not a level, it is a slope: the negative rate jumping hour-over-hour is the alarm, because a steady 0.3% hide-rate is background noise while a leap from 0.1% to 0.9% in one hour is a post turning toxic in real time. The pattern is a window LAG over the per-post-hour negative rate — compare each hour to the one before and surface the accelerations to the human review queue.
The query that justifies the entire schema, and the one that separates a senior answer. Because ranker_model_id rides every impression, comparing two model versions is a GROUP BY on that column — no separate experiment store, no bespoke pipeline. The senior framing is to compute both the positive and the negative rate per version, because a model that lifts likes while also lifting hides has not won: the new ranker ships only if it raises engagement and does not worsen the negatives.
A senior design ends with observability, because every safeguard above is invisible without it. The feed dashboard watches three readers at once: is the live ranker lifting without raising regret, is the firehose ingesting cleanly without a partition on fire, and is any post turning toxic in real time.
Read the amber tiles together and the dashboard narrates the central tension of the whole problem. The skew chart shows a single post arriving at a million impressions a second — the spike that would have melted a post-keyed partition — yet partition skew sits at 1.4×, because the viewer key scattered those writes across thousands of shards. Meanwhile the new ranker is winning on its own terms, likes and dwell up, negatives down; but the regret rate and the review queue climb, the canaries warning that the same viral content driving engagement is also driving complaints. That is exactly why regret and the safety velocity exist: they are the only instruments that see the feed straining underneath metrics that otherwise look excellent.
Strip the Instagram details away and the question was testing five judgments, each of which generalizes far beyond a social feed: