PaddySpeaks · Systems at the Whiteboard · Nº 17

The Growth Problem

"How many monthly active users do we have?" is a one-line question with a one-line answer that is almost always a lie. A flat MAU can be a stable product or a bucket leaking and refilling — and the single number cannot tell you which. A complete working through: the activity grain, the four-state machine, the FULL OUTER self-join that decomposes it, the frozen cohort, the retention triangle, and the dashboard that exposes the leak.

§ 01 — THE QUESTIONOne number, four destinies

Every growth-stage data team eventually meets this question, and it is a trap disguised as a warm-up. It sounds like a SELECT COUNT and it is actually an accounting system.

Interview Prompt

"How would you measure whether this product is actually growing? Leadership reports MAU every month and it's been flat at ten million. Are we healthy?"

LEVEL · SENIOR / STAFFDURATION · 45 MINFORMAT · WHITEBOARD

The junior answer is COUNT(DISTINCT user_id) over a trailing thirty-day window, and it is worse than useless because it radiates false confidence. Ten million active users this month and ten million last month is reported as "stable." But the same flat line describes a product that retained nine million and added one, and a product that retained three million, churned seven, and refilled the hole with seven million freshly-acquired strangers who will themselves be gone by summer. Those are not the same company. One has product-market fit; the other is renting its user base from the ad network and the lease is about to expire. A single scalar cannot distinguish a reservoir from a sieve.

The framework that fixes this is growth accounting — the Social Capital / Reforge decomposition — and it forces four design problems before a single query is written:

THE ACTIVITY GRAIN
What counts as "active"? Netflix counts a viewing session of two minutes or more; Uber counts a completed trip; Spotify counts a track play of thirty seconds. The grain of the activity fact is the product's definition of active — get it wrong and every downstream number is wrong, silently.
THE STATE MACHINE
Four mutually exclusive states. Each period, every user is exactly one of: new (first-ever active period), retained (active last period too), resurrected (was dormant, came back), or churned (active last period, gone this period).
THE FROZEN COHORT
A user's cohort never moves. Cohort = the period of first activity, computed once, immutable. Retention and LTV curves are only meaningful if the cohort key is frozen — which makes it a derived dimension, SCD0.
THE VALUE COLUMN
One column runs the whole engine. Set the incremental-value column to 1 and the identical machinery yields engagement growth; set it to daily revenue and it yields MRR growth with expansion and contraction. One model, two metrics.
A flat MAU is not a measurement. It is a sum that has destroyed its own evidence.

Scoping out loud

Scope is the first scored dimension, and the candidates who skip it are the ones who never name the leaky bucket. State what you are building: a growth-accounting warehouse that decomposes every period's active base into the four states, computes a quick-ratio, and renders a cohort-retention triangle. State what you are deliberately ignoring: attribution and channel ROI (treated as a slice, not the spine), the event-collection SDK itself, real-time activation, and anti-fraud (active here means a real human session — bot-filtering is upstream). And state the one definitional landmine you will defend with a versioned dimension: when marketing later moves the bar for "active," last quarter's published numbers must not silently move.

Then the envelope math, volunteered. A consumer product at ten-million-MAU scale:

QuantityEstimateConsequence
Monthly active users10,000,000The headline number — and the one that lies
Activity events / day~80–120 MThe atomic fact; rolls up to DAU/WAU/MAU
Monthly churned≈ 2.4 MThe hidden row that shapes the whole architecture
New + resurrected / month≈ 2.5 MJust barely covers the leak — quick-ratio ≈ 1.04
Distinct cohorts tracked~60 months × channelThe frozen SCD0 dimension; never rewritten
Retention-triangle cells~60² / 2 ≈ 1,800Long-form fact a BI tool pivots into a heatmap
Growth-fact rows / period1 per periodThe decomposition is tiny; the join that builds it is not

Notice the asymmetry the envelope exposes: the churned figure is invisible in the MAU headline yet nearly equal to everything the product acquired. That single row — the one the naive query cannot see — dictates the entire model. Everything below exists to make churn a first-class number instead of an inference.


§ 02 — DATA FLOWFrom a firehose of events to four numbers

One pipeline, three tempos. A raw activity firehose narrows to a per-period rollup, the rollup self-joins against its own previous period to produce the state machine, and the state machine fans out into cohort curves. The whole design is a funnel that turns volume into meaning.

RAW PLANE · ~100M EVENTS/DAY DERIVED PLANE · 1 ROW/PERIOD CLIENT EVENTS session · play · trip at-least-once fact_activity grain: user × day × activity holds inc_amt (=1 or =$) the definition of "active" agg_dau · wau · mau date_trunc rollups 1 row / (period, user) summed inc_amt dim_user_cohort · SCD0 first_dt / week / month computed once · frozen forever FULL OUTER SELF-JOIN tm (this period) ⟗ lm (last) ON tm.user = lm.user · offset 1 FULL OUTER or churn is invisible fact_mau_growth new · retained resurrected · churned quick_ratio cohort_retention + cohort_ltv retention triangle LTV : CAC gate this period + last period frozen key SOLID — materialized flow · DASHED — the join's two inputs · The funnel turns volume into a decomposition.
FIG. 1 — The growth funnel. Raw events narrow to a per-period rollup; the rollup self-joins against its own past to produce the four states; the states fan out into cohort curves.

Three properties of this picture do most of the interview's work. First, the activity fact is the only place the product's definition of "active" lives — change the session threshold and everything downstream re-derives, which is exactly why that threshold belongs in a versioned dimension and not in a hard-coded WHERE clause. Second, the growth fact is produced by a self-join of the rollup against itself, offset by one period — the present joined to its own immediate past. Third, the cohort key descends from a frozen dimension that touches the curves but is never recomputed; the moment a user's cohort can move, every retention number becomes a fiction.

The Decomposition Philosophy, In One Rule

Never report a level without its flows. A stock — MAU, MRR, headcount — is the running total of four flows: arrivals, survivors, returns, and departures. Report the stock alone and you have hidden three of the four numbers that explain it. The entire architecture exists to make the flows as cheap to query as the stock, so that "MAU is flat" can never again be said without "...because we added 2.5M and lost 2.4M" in the same breath.


§ 03 — DATA MODELAn atomic fact, a frozen cohort, and a versioned definition

The schema falls out of the four design problems. One atomic activity fact carries the value column. One frozen cohort dimension is computed once. One SCD2 metric-definition dimension keeps history stable when the bar moves. The growth facts are derived from these and stay tiny.

The atomic activity fact

Everything derives from fact_activity, whose grain is one row per (user, day, activity_type). The single most consequential column is inc_amt — incremental value created by a user in a period. It is the generalization that lets one model answer two questions: leave it 1 and the decomposition counts heads; set it to that day's revenue and the same decomposition counts dollars, with expansion and contraction appearing for free.

DDL · ATOMIC ACTIVITY + PERIOD ROLLUPS
-- The grain IS the product's definition of "active". One row per user per -- day per activity. inc_amt = 1 for engagement, = revenue for MRR. CREATE TABLE fact_activity ( user_id BIGINT NOT NULL, activity_dt DATE NOT NULL, activity_type TEXT NOT NULL, -- 'session' | 'trip' | 'track_play' inc_amt NUMERIC(12,4) NOT NULL DEFAULT 1, -- the value column metric_def_id BIGINT NOT NULL, -- which "active" rule qualified this row PRIMARY KEY (user_id, activity_dt, activity_type) ); CREATE INDEX idx_activity_dt ON fact_activity (activity_dt, user_id); -- Period rollups: one row per (period, user) with summed value. Materialized -- so the growth-accounting self-join never re-scans the raw firehose. CREATE TABLE agg_mau ( period_month DATE NOT NULL, -- date_trunc('month', activity_dt) user_id BIGINT NOT NULL, inc_amt NUMERIC(14,4) NOT NULL, -- SUM over the month PRIMARY KEY (period_month, user_id) ); -- agg_dau and agg_wau are identical with date_trunc('day'|'week', ...).

The frozen cohort (SCD0)

A cohort is the period of a user's first activity, and it must never change — which makes it a slowly-changing dimension of the simplest possible kind: type 0, write-once. The whole point of a retention curve is to compare January's signups at month three against February's signups at month three; if a backfill or a re-login could quietly re-stamp a user's cohort, the curves would shift under you and the comparison would be meaningless. So the cohort is computed exactly once, on first sighting, and is read-only thereafter.

DDL · FROZEN COHORT & VERSIONED DEFINITION
-- SCD0: computed once on a user's first-ever activity, never updated. -- This immutability is the precondition for meaningful cohort curves. CREATE TABLE dim_user_cohort ( user_id BIGINT PRIMARY KEY, first_dt DATE NOT NULL, first_week DATE NOT NULL, first_month DATE NOT NULL, acquisition_channel_at_signup TEXT NOT NULL, -- frozen at signup, not "current" cohort_locked_at TIMESTAMPTZ NOT NULL DEFAULT now() ); -- SCD2: the versioned definition of "active". When marketing moves the bar, -- OLD published numbers must not silently move — so the rule is temporal. CREATE TABLE dim_metric_definition ( metric_def_id BIGINT PRIMARY KEY, metric_name TEXT NOT NULL, -- 'active_user' active_threshold INTERVAL NOT NULL, -- e.g. session >= '2 minutes' dedup_grain TEXT NOT NULL, -- 'user_day' period_boundary TEXT NOT NULL, -- 'calendar' | 'rolling_28' valid_from TIMESTAMPTZ NOT NULL, valid_to TIMESTAMPTZ NOT NULL DEFAULT '9999-12-31' ); -- SCD2 on the attributes you want to SLICE growth accounting by. CREATE TABLE dim_user ( user_sk BIGINT PRIMARY KEY, -- surrogate; one row per version user_id BIGINT NOT NULL, plan TEXT NOT NULL, -- free | premium country TEXT NOT NULL, channel TEXT NOT NULL, valid_from TIMESTAMPTZ NOT NULL, valid_to TIMESTAMPTZ NOT NULL DEFAULT '9999-12-31', is_current BOOLEAN NOT NULL DEFAULT true );

The derived growth facts

The growth facts are small — one row per period — but their columns are load-bearing. The engagement and revenue variants share a shape; the revenue variant adds expansion and contraction, the two states that only exist when the value column is money. These tables are produced by the self-join in §04, and they exist precisely so that the dashboard and the analysts never re-run that join.

DDL · GROWTH-ACCOUNTING FACTS + RETENTION TRIANGLE
-- One row per period. Satisfies the two identities by construction. CREATE TABLE fact_mau_growth ( period_month DATE PRIMARY KEY, active BIGINT NOT NULL, new BIGINT NOT NULL, retained BIGINT NOT NULL, resurrected BIGINT NOT NULL, churned BIGINT NOT NULL, -- stored as a negative magnitude quick_ratio NUMERIC(6,3) -- (new + resurrected) / |churned| ); -- The MRR variant: same shape, plus the two money-only states. CREATE TABLE fact_mrr_growth ( period_month DATE PRIMARY KEY, mrr NUMERIC(16,2) NOT NULL, new_mrr NUMERIC(16,2) NOT NULL, retained_mrr NUMERIC(16,2) NOT NULL, resurr_mrr NUMERIC(16,2) NOT NULL, expansion NUMERIC(16,2) NOT NULL, -- retained user whose inc_amt rose contraction NUMERIC(16,2) NOT NULL, -- retained user whose inc_amt fell churned_mrr NUMERIC(16,2) NOT NULL ); -- The long-form retention triangle a BI tool pivots into a heatmap. CREATE TABLE fact_cohort_retention ( first_month DATE NOT NULL, active_month DATE NOT NULL, months_since_first INT NOT NULL, active_users BIGINT NOT NULL, inc_amt NUMERIC(16,4) NOT NULL, PRIMARY KEY (first_month, active_month) ); -- The cumulative LTV curve per cohort (semi-cartesian cumulative join). CREATE TABLE fact_cohort_ltv ( first_month DATE NOT NULL, months_since_first INT NOT NULL, retained_pctg NUMERIC(6,3) NOT NULL, cum_amt NUMERIC(18,2) NOT NULL, cum_amt_per_user NUMERIC(14,4) NOT NULL, PRIMARY KEY (first_month, months_since_first) );

§ 04 — THE CORE INVARIANTThe two identities, and the FULL OUTER that protects them

The correctness of the entire system lives in two algebraic identities and the one join clause that makes them hold. Get the join wrong — use an inner join, or a left join — and churn quietly reports as zero, which is the single most dangerous bug in growth analytics because it looks like good news.

Each user, each period, is in exactly one of four states, and the states are defined by a self-join of the rollup against its own previous period:

DORMANT NEW RETAINED CHURNED RESURRECTED RETAINED

Read the chain as a life: a user is dormant until their first active period (new); they persist (retained); they lapse (churned); they return (resurrected); they persist again. The distinction between new and resurrected is the cohort: a user present this period but not last is new if their first-ever activity equals this period, and resurrected otherwise. That is why the frozen cohort dimension is not optional — it is the only thing that tells arrivals from returns.

The two identities below hold by construction once the join is right. Subtracting one from the other yields the decomposition that turns a flat headline into an alarm:

The Two Identities

MAU(t) = retained(t) + new(t) + resurrected(t) — this period's active base is its survivors plus its arrivals plus its returns.
MAU(t−1) = retained(t) + churned(t) — last period's base is the same survivors plus whoever left.
Subtract: ΔMAU = new + resurrected − churned. The level is a consequence of the flows; the flows are the truth.

FULL OUTER, or churn is invisible. Churned users live only on the previous-period side of the join — an inner or left join silently drops them and reports zero churn.GROWTH RULE Nº 1

Here is the atomic heart of the framework: a single FULL OUTER self-join of the rollup against itself, offset by one period. The COALESCE on the join keys, and the CASE over which side is NULL, is the entire state machine.

SQL · THE GROWTH-ACCOUNTING SELF-JOIN (THE INVARIANT)
-- One period's decomposition. tm = this month, lm = last month, joined -- FULL OUTER on user. The cohort dim splits "tm only" into new vs resurrected. WITH tm AS ( SELECT user_id, inc_amt FROM agg_mau WHERE period_month = :p ), lm AS ( SELECT user_id, inc_amt FROM agg_mau WHERE period_month = (:p::date - INTERVAL '1 month') ), classified AS ( SELECT coalesce(tm.user_id, lm.user_id) AS user_id, CASE WHEN lm.user_id IS NULL AND c.first_month = :p THEN 'new' WHEN lm.user_id IS NULL THEN 'resurrected' WHEN tm.user_id IS NULL THEN 'churned' ELSE 'retained' END AS state, coalesce(tm.inc_amt, 0) - coalesce(lm.inc_amt, 0) AS delta_amt FROM tm FULL OUTER JOIN lm ON tm.user_id = lm.user_id -- the load-bearing clause LEFT JOIN dim_user_cohort c ON c.user_id = coalesce(tm.user_id, lm.user_id) ) SELECT :p AS period_month, count(*) FILTER (WHERE state <> 'churned') AS active, count(*) FILTER (WHERE state = 'new') AS new, count(*) FILTER (WHERE state = 'retained') AS retained, count(*) FILTER (WHERE state = 'resurrected') AS resurrected, -count(*) FILTER (WHERE state = 'churned') AS churned FROM classified;

The same query, with inc_amt as money instead of a head count and the delta_amt sign inspected on retained rows, splits retention into expansion (delta > 0) and contraction (delta < 0). That is the whole trick of the generalized value column: one join, two metrics, and the revenue identities — MRR(t) = retained + new + resurrected + expansion and MRR(t−1) = retained + churned + contraction — fall out of the same machinery.


§ 05 — INGESTION & STREAMSPython that refuses to double-count

Three programs carry the derived plane: the activity reducer that turns raw events into the dedup'd grain, the cohort stamper that writes once and never again, and the period roller that materializes the rollups. Each is small; the judgment is in what they refuse to do.

1 · The activity reducer — the grain is the definition

Raw events arrive at-least-once and at every imaginable granularity — a user opens the app forty times a day. The reducer collapses them to the qualifying grain: one row per user per day per activity, but only for events that clear the active threshold from the metric definition in force. The refusal here is the important part: a sub-threshold session is not "a small amount of active" — it is not active, and silently keeping it would inflate every number above.

PYTHON · ACTIVITY REDUCER → fact_activity
from datetime import date def reduce_to_activity(raw_events, metric_def): """Collapse a user's raw events for one day into at most one qualifying activity row per activity_type. The metric_def supplies the threshold; a session below it is NOT active and is dropped, not down-weighted. Idempotent by construction: re-reducing the same day converges to the same row, so at-least-once delivery upstream is harmless.""" threshold = metric_def["active_threshold_s"] qualifying = {} # (user_id, dt, kind) -> inc_amt for e in raw_events: if e["duration_s"] < threshold: continue # the refusal: sub-threshold is not active key = (e["user_id"], e["event_dt"], e["kind"]) # engagement: presence counts once. revenue: sum the money. if metric_def["mode"] == "revenue": qualifying[key] = qualifying.get(key, 0.0) + e["amount"] else: qualifying[key] = 1 # dedup to presence, never a tally return [ {"user_id": u, "activity_dt": d, "activity_type": k, "inc_amt": amt, "metric_def_id": metric_def["id"]} for (u, d, k), amt in qualifying.items() ]

2 · The cohort stamper — write once, never again

The cohort stamper is the guardian of SCD0. Its entire job is to compute a user's first-activity period exactly once and then to refuse every subsequent write. The discipline lives in the SQL: an INSERT ... ON CONFLICT DO NOTHING, so a re-login, a late event, or a replayed batch can never re-stamp a cohort. A user whose first event was in January is in the January cohort forever, even if the warehouse first sees them in March.

PYTHON · COHORT STAMPER (SCD0 — INSERT-ONCE)
def stamp_cohorts(conn, new_user_first_seen): """Freeze each user's cohort on first sighting. The ON CONFLICT DO NOTHING is the whole invariant: a cohort, once written, is immutable. Late-arriving history that predates the stored first_dt is the ONE allowed correction, handled explicitly below — everything else is a no-op.""" for u in new_user_first_seen: first_dt = u["first_dt"] conn.execute(""" INSERT INTO dim_user_cohort (user_id, first_dt, first_week, first_month, acquisition_channel_at_signup) VALUES (%s, %s, date_trunc('week', %s), date_trunc('month', %s), %s) ON CONFLICT (user_id) DO UPDATE SET first_dt = LEAST(dim_user_cohort.first_dt, EXCLUDED.first_dt), first_week = date_trunc('week', LEAST(dim_user_cohort.first_dt, EXCLUDED.first_dt)), first_month = date_trunc('month', LEAST(dim_user_cohort.first_dt, EXCLUDED.first_dt)) WHERE EXCLUDED.first_dt < dim_user_cohort.first_dt """, (u["user_id"], first_dt, first_dt, first_dt, u["channel"])) # The cohort only ever moves EARLIER (a genuinely older event surfaced), # never later. That asymmetry is what keeps retention curves honest.

3 · The period roller — late events land in the right period

The roller materializes agg_mau from fact_activity. The judgment it encodes is event-time, not processing-time: an event with an activity_dt of last month must land in last month's rollup even though it arrived today, and the roller therefore re-materializes a trailing window of recent periods on every run rather than only the current one. Without that, a late batch would understate a closed period and the self-join would manufacture phantom churn.

PYTHON · PERIOD ROLLER (EVENT-TIME, TRAILING RE-MATERIALIZE)
def roll_periods(conn, watermark_dt, lookback_months=2): """Re-materialize agg_mau for the trailing `lookback_months` periods, not just the current one. Calendars don't close instantly: a straggler event timestamped to a past month must still update that month's rollup, or the self-join sees a user vanish and reports churn that never happened.""" lo = month_floor(watermark_dt, lookback_months) conn.execute(""" INSERT INTO agg_mau (period_month, user_id, inc_amt) SELECT date_trunc('month', activity_dt) AS period_month, user_id, SUM(inc_amt) AS inc_amt FROM fact_activity WHERE activity_dt >= %s -- event-time, trailing window GROUP BY 1, 2 ON CONFLICT (period_month, user_id) DO UPDATE SET inc_amt = EXCLUDED.inc_amt -- recompute, not increment """, (lo,)) # Idempotent: the SUM is recomputed from the source, so replaying a batch # can never double-count. The rollup is a projection, not a ledger.

One carve-out, always stated: the rollups are recomputed, never incremented. A growth metric that adds deltas to a running total will, the first time a batch replays, drift — and the drift is invisible because the number still looks plausible. Recompute-from-source is slower and correct; increment-in-place is faster and a time bomb.


§ 06 — AGGREGATIONThe retention triangle and the cumulative LTV curve

The slow, derived layer is where cohorts become curves. The retention triangle is a self-join of cohort membership against later activity; the LTV curve is its cumulative cousin. Both are deterministic, so they are materialized once per period and read forever.

The retention triangle is the framework's most legible artifact: a matrix whose rows are cohorts (by first month) and whose columns are months_since_first, each cell holding the fraction of that cohort still active. Read down a column and you see whether successive cohorts retain better or worse; read across a row and you watch a single cohort's curve flatten — and the level at which it flattens is the retention floor, the population that genuinely found the product. A curve that decays to zero is a product nobody keeps; a curve that settles at forty per cent has forty per cent product-market fit.

SQL · MATERIALIZE THE RETENTION TRIANGLE
-- Long-form triangle: for each cohort and each later period, how many of -- that cohort's users were active. A cohort-membership join to activity. INSERT INTO fact_cohort_retention (first_month, active_month, months_since_first, active_users, inc_amt) SELECT c.first_month, m.period_month AS active_month, (extract(year FROM m.period_month) * 12 + extract(month FROM m.period_month)) - (extract(year FROM c.first_month) * 12 + extract(month FROM c.first_month)) AS months_since_first, count(DISTINCT m.user_id) AS active_users, sum(m.inc_amt) AS inc_amt FROM dim_user_cohort c JOIN agg_mau m ON m.user_id = c.user_id AND m.period_month >= c.first_month -- never before the cohort GROUP BY c.first_month, m.period_month ON CONFLICT (first_month, active_month) DO UPDATE SET active_users = EXCLUDED.active_users, inc_amt = EXCLUDED.inc_amt;

The LTV curve is the triangle's cumulative cousin. Where retention asks "what fraction remain at month N," LTV asks "how much value, per original cohort member, has accumulated by month N." It is a running sum of per-period value divided by the cohort's original size — a semi-cartesian cumulative join — and its asymptote, compared against the cost of acquiring that cohort, is the single number that decides whether growth spend is rational.

SQL · CUMULATIVE COHORT LTV (RUNNING SUM PER COHORT)
INSERT INTO fact_cohort_ltv (first_month, months_since_first, retained_pctg, cum_amt, cum_amt_per_user) SELECT r.first_month, r.months_since_first, round(100.0 * r.active_users / nullif(first_value(r.active_users) OVER w, 0), 3) AS retained_pctg, sum(r.inc_amt) OVER w AS cum_amt, round(sum(r.inc_amt) OVER w / nullif(first_value(r.active_users) OVER w, 0), 4) AS cum_amt_per_user FROM fact_cohort_retention r WINDOW w AS (PARTITION BY r.first_month ORDER BY r.months_since_first ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) ON CONFLICT (first_month, months_since_first) DO UPDATE SET retained_pctg = EXCLUDED.retained_pctg, cum_amt = EXCLUDED.cum_amt, cum_amt_per_user = EXCLUDED.cum_amt_per_user;
The decomposition is the alarm; the triangle is the diagnosis. Quick-ratio tells you the bucket leaks; the retention floor tells you whether anyone was ever going to stay.AGGREGATION RULE Nº 1

§ 07 — ANALYTICS SQLThree questions, three classic patterns

The growth facts are where the product explains itself. Three queries an interviewer loves, because each carries a named pattern on its back — the quick-ratio gauge, the as-of cohort slice, and the window-function retention curve.

Is the bucket leaking? — the quick-ratio gauge

The quick-ratio is the single number that exposes a flat-MAU treadmill: gains over losses, (new + resurrected) / |churned|. Above one, the product grows; at one, it runs to stand still; below one, it shrinks even while the headline MAU looks flat. This is conditional aggregation reduced to one expressive ratio.

SQL · QUICK-RATIO OVER TIME (THE LEAK GAUGE)
SELECT period_month, active, new, resurrected, churned, round((new + resurrected) / nullif(abs(churned), 0), 2) AS quick_ratio, CASE WHEN (new + resurrected) / nullif(abs(churned), 0) >= 1.10 THEN 'growing' WHEN (new + resurrected) / nullif(abs(churned), 0) >= 0.95 THEN 'treadmill' ELSE 'shrinking' END AS verdict FROM fact_mau_growth ORDER BY period_month; -- A flat MAU with quick_ratio ~1.0 is the leaky bucket: huge gross flows, -- near-zero net. The verdict column says out loud what the level hides.

Where is the leak coming from? — SCD2 as-of slice

"Is churn worse on free or premium, and in which country?" requires slicing the decomposition by attributes as they were when the user was active, not as they are now — a premium user who downgraded last week should not retroactively recolor their history. That is an SCD2 as-of join: bound each activity period to the dimension version valid at that time.

SQL · CHURN BY SEGMENT, AS-OF THE ACTIVE PERIOD
WITH tm AS (SELECT user_id FROM agg_mau WHERE period_month = :p), lm AS (SELECT user_id FROM agg_mau WHERE period_month = :p::date - INTERVAL '1 month') SELECT d.plan, d.country, count(*) FILTER (WHERE tm.user_id IS NULL) AS churned, count(*) FILTER (WHERE lm.user_id IS NOT NULL) AS base_last_period, round(100.0 * count(*) FILTER (WHERE tm.user_id IS NULL) / nullif(count(*) FILTER (WHERE lm.user_id IS NOT NULL), 0), 1) AS churn_pct FROM lm FULL OUTER JOIN tm ON tm.user_id = lm.user_id JOIN dim_user d ON d.user_id = coalesce(lm.user_id, tm.user_id) AND (:p::date - INTERVAL '1 month') >= d.valid_from -- as-of last period AND (:p::date - INTERVAL '1 month') < d.valid_to GROUP BY d.plan, d.country ORDER BY churn_pct DESC;

Does retention floor out? — the window-function curve

The heatmap source: each cohort's active count normalized to its own month-zero size with FIRST_VALUE over a cohort-partitioned window. A curve that flattens has found its floor; a curve still falling cohort-over-cohort is a widening leak the dashboard will paint red.

SQL · RETENTION CURVE (FIRST_VALUE NORMALIZATION)
SELECT first_month, months_since_first, active_users, round(100.0 * active_users / first_value(active_users) OVER (PARTITION BY first_month ORDER BY months_since_first), 1) AS retention_pct FROM fact_cohort_retention WHERE first_month >= :since ORDER BY first_month, months_since_first; -- A curve that settles (e.g. flat at 41%) means the floor is found — those -- users ARE the product-market fit. A curve still sliding cohort-over-cohort -- says the leak is widening; the fix is the product, not the ad budget.

§ 08 — THE DASHBOARDMaking the leak impossible to miss

A senior design ends with the dashboard, because the entire point of growth accounting is to make a comfortable number alarming. The board shows the level and the flows side by side, so that "MAU flat" can never be read without "and here is why."

THE LEVEL
MAU and its trend — deliberately small on the board, because it is the number that lies. It is here only so the flows can contradict it.
THE FLOWS
new, resurrected, and churned as a stacked bar; the gross magnitude is the story even when the net is zero. A flat MAU with tall bars is the treadmill made visible.
THE GAUGE
quick-ratio with its verdict band — the single number an executive can read in one second. Below one is red regardless of what MAU is doing.
THE TRIANGLE
the retention heatmap — cohorts down, months across, color = fraction surviving. A flattening row is a floor found; a darkening column is a regression shipping.
THE GATE
LTV : CAC by cohort — the asymptote of cumulative value per user against acquisition cost. Below the payback line, spend is destroying money.
Growth Accounting — April MAU 10.1M · QR 1.04 · MONTHLY REFRESH
MAU (the level)
10.1M
Quick Ratio
1.04
Churned
2.4M
New + Resurrected
2.5M
Flows — new (green) · resurrected (amber) · churned (rust) · Jan→Apr
JAN FEB MAR APR
Month-1 Retention
60%
LTV : CAC
3.1×
Retention triangle — cohort (row) × months since first (col)
2026-01 2026-02 2026-03 M0M1M2M3 100 62 48 41 100 58 44 100 64
FIG. 2 — The story the tiles tell: MAU sits flat and grey while the flow bars stand tall — huge gross churn barely covered by acquisition — the quick-ratio gauge drifts amber at 1.04, and the triangle's bottom rows are still filling, retention hovering at 60% month-one. A treadmill, painted.

Read the board top to bottom and it argues the §01 thesis without a word of prose: the level is stable, the flows are violent, the gauge says treadmill, and the only genuinely good news is the LTV:CAC of 3.1× — which means the acquired users who do stay are worth more than they cost, so the fix is to plug the leak, not to stop spending. That is the difference between a dashboard that reports a number and one that prescribes an action.


§ 09 — THE RUBRICWhat was actually being tested

Strip the users and sessions away and the question was testing five judgments, each of which generalizes far beyond growth analytics:

LEVELS vs FLOWS
Refusing to report a stock without its flows. Every flat top-line metric — MAU, MRR, headcount, inventory — is four numbers wearing one number's clothes, and the senior move is to decompose before you reassure.
THE JOIN THAT TELLS THE TRUTH
Knowing that FULL OUTER, not inner, is what makes departures visible. The most dangerous bug in analytics is the one that drops the bad news on the floor and reports zero.
IMMUTABLE KEYS
Freezing the cohort (SCD0) so comparisons stay honest, and versioning the definition (SCD2) so history stays stable when the bar moves. Yesterday's published number must not move because today's rule changed.
ONE MODEL, TWO METRICS
Generalizing through the value column — 1 for engagement, money for revenue — so a single pipeline answers two questions and expansion/contraction fall out for free. The best abstraction is the one that pays for itself twice.
FROM NUMBER TO ACTION
Ending at the retention floor and the LTV:CAC gate, not the MAU headline. The deliverable is not a metric — it is the sentence "plug the leak, don't cut the spend," with the evidence to defend it.
A single MAU is a vanity metric — a reservoir and a sieve draw the same flat line. Growth accounting is the act of refusing to be comforted by a number that has eaten its own evidence.— CLOSING ARGUMENT