Real interview experiences — round-by-round.
Anonymized write-ups from recent data and AI engineering loops. Each entry: company, level, outcome (offer / down-level / no-offer), the actual questions asked per round, what the interviewer was looking for, what worked, and what the candidate would do differently. Submit your own at the bottom.
Recent experiences
- Google · Staff Data Engineer (L6) · offer
- Netflix · Senior Data Engineer (L5) · offer
- Snowflake · Staff DE (L6) · down-level to L5
- Databricks · Staff Software Engineer (L6) · offer
- Amazon · Senior DE (L6) · no-offer
- Meta · E6 Production Engineer · offer
- OpenAI · Senior Member of Technical Staff · offer
Google · Staff Data Engineer (L6)
OFFER
Background
L6 at a fintech, 12 years in data. Targeting Google Search Quality data infra team. Recruiter reached out cold based on a conference talk.
The loop — 5 rounds, single day
- Coding 1 — sliding-window top-K problem. Implemented heap-based solution, optimized for memory.
- Coding 2 — SQL: cohort retention analysis on a 3-table schema. Asked about window-function edge cases (gap-filling).
- System design — design a global query-completion logging pipeline handling 10B queries/day. Drilled on hot-key handling, real-time vs batch trade-offs, and cost.
- Googliness — 30 min behavioral. Pushed hard on "tell me about a time you changed your mind." Spent half the round on follow-ups.
- Domain (data quality) — designed a metric-anomaly detection system for the query logs.
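For readers unfamiliar with the Coding 1 pattern, here is a minimal sketch of one plausible reading of "sliding-window top-K": report the K largest values in each full window. The write-up does not give the exact prompt, so the function name, signature, and sample data are assumptions. This is the simple version (recompute per window with a bounded heap), not a memory-optimized one.

```python
import heapq
from collections import deque
from typing import Deque, Iterable, Iterator, List


def sliding_top_k(values: Iterable[int], window: int, k: int) -> Iterator[List[int]]:
    """Yield the k largest values of each full window, largest first.

    Hypothetical helper: the real interview prompt is not known.
    """
    if window <= 0 or k <= 0:
        raise ValueError("window and k must be positive")
    buf: Deque[int] = deque(maxlen=window)  # oldest value drops out automatically
    for v in values:
        buf.append(v)
        if len(buf) == window:
            # heapq.nlargest keeps at most k items in its heap,
            # so each step costs O(window * log k).
            yield heapq.nlargest(k, buf)


if __name__ == "__main__":
    for top in sliding_top_k([5, 1, 9, 3, 7, 2], window=3, k=2):
        print(top)
```

Once this version is correct, a follow-up optimization (for example, a heap with lazy deletion) can be discussed with the interviewer instead of being written up front.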
What worked
- System design — drew the diagram FIRST, then iterated. Interviewer led the dive into specific components.
- Behavioral round used the "I had a strong opinion → peer brought data I hadn't seen → I changed my mind" archetype. Interviewer visibly engaged.
- Tied coding answers back to real production systems using Google-stack examples (Spanner, BigQuery, Dataflow).
What I'd do differently
- Coding 1 — I optimized for memory before I had a correct solution. Should have written the simple solution first, then optimized.
- Spent too long on the Situation in one behavioral story. The interviewer cut me off — embarrassing.
Netflix · Senior Data Engineer (L5)
OFFER
Background
L5 at a streaming-adjacent company, 7 years experience. Heavy Spark + Kafka background. Referred by an ex-coworker.
The loop — 4 rounds across 2 weeks
- Manager screen — 60 min. Mostly culture-fit. Detailed discussion of the Netflix culture memo. Asked "tell me about a time you made a decision with incomplete information."
- Technical deep-dive — 90 min. Walk-through of a recent project at L5 ownership level. Drilled on architecture choices, tradeoffs, what I'd do differently.
- System design — design a content telemetry pipeline. They wanted an Iceberg + Flink answer (not Spark Streaming). Cost discussion was 30% of the time.
- Bar-raiser / behavioral — explicitly asked about the "Keeper Test." Stories about decisions made with autonomy and high judgment.
What worked
- Re-read the culture memo the night before each round. Quoted specific phrases back ("informed captains").
- Cost discussion in system design — they care about $$ deeply.
- Every behavioral story explicitly named the decision I made WITHOUT asking for approval.
What I'd do differently
- In the deep-dive, I started with "we" and the interviewer asked me to redo it as "I". Don't make this mistake.
Snowflake · Staff DE (L6)
DOWN-LEVEL TO L5
Background
L6 at a SaaS company, 10 years in data. Strong SQL background, weaker on platform/internal systems.
The loop — 5 rounds, single day
- SQL screening (pre-onsite) — 45 min, 3 problems including a hierarchical CTE and a sessionization. Cleared.
- SQL deep-dive — query optimization, EXPLAIN walkthrough, micro-partition pruning. I struggled here.
- System design — design Snowflake's own bug-bashing platform. Drilled on multi-tenant isolation.
- Behavioral — Customer Obsession + Earn Trust archetype. Solid.
- Hiring manager — domain fit + leveling probe.
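If sessionization is new to you, the logic behind that screening problem can be sketched as follows. The interview itself was in SQL; this Python version only illustrates the rule usually expressed there with LAG and a running SUM: start a new session whenever the gap since the user's previous event exceeds a threshold. The 30-minute threshold, the event shape, and the function name are assumptions, not details from the write-up.

```python
from itertools import groupby
from operator import itemgetter
from typing import Iterable, List, Tuple

Event = Tuple[str, int]  # (user_id, epoch_seconds): an assumed shape


def sessionize(events: Iterable[Event], gap_seconds: int = 1800) -> List[Tuple[str, int, int]]:
    """Return (user_id, ts, session_no), numbering sessions per user from 1."""
    out: List[Tuple[str, int, int]] = []
    ordered = sorted(events, key=itemgetter(0, 1))  # by user, then time
    for user, rows in groupby(ordered, key=itemgetter(0)):
        session_no = 0
        prev_ts = None
        for _, ts in rows:
            # First event, or a gap larger than the threshold, opens a new session.
            if prev_ts is None or ts - prev_ts > gap_seconds:
                session_no += 1
            out.append((user, ts, session_no))
            prev_ts = ts
    return out


if __name__ == "__main__":
    sample = [("a", 0), ("a", 600), ("a", 4000), ("b", 100)]
    for row in sessionize(sample):
        print(row)
```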
What worked
- The customer-obsession story landed — drew it back to a specific Snowflake customer pain point.
- Behavioral was strong; the bar-raiser said so.
What didn't work — why the down-level
- SQL deep-dive — I couldn't speak to micro-partition pruning specifics. Recruiter feedback: "L6 expected to have hands-on Snowflake internals knowledge; L5 expected to know the concept."
- System design — I designed at L5 scope (one team, one quarter). L6 wanted multi-team, year-long horizon.
What I'd do differently
- Hands-on study of Snowflake-specific internals before the loop. Reading the docs counts.
- Pre-loop scope calibration with the recruiter — what's the L6 bar specifically.
Databricks · Staff Software Engineer (L6)
OFFER
Background
L5 at a hyperscaler, 9 years. Heavy Spark background. Applied for L6.
The loop — 5 rounds across 3 weeks
- HM screen — domain fit + Spark deep-dive (live questions on AQE, Photon, Delta).
- Live PySpark coding — 60 min, optimize a slow query. Walked through partition strategy, broadcast hint, AQE flags.
- System design — design a streaming ETL pipeline with Delta Live Tables. Drilled on schema evolution, late-arriving data.
- Founder-mindset / behavioral — "what would you build at Databricks?" Real question, expected real answer.
- Bar-raiser — out-of-org Staff. Pushed on cross-team influence stories.
What worked
- Live PySpark — I narrated my thinking. Named AQE flags by config key. They liked specificity.
- Founder round — I'd actually prepared a 1-page proposal for a Delta Lake feature gap. They asked for it.
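To make "named AQE flags by config key" concrete, here is a hedged sketch of the kind of keys a candidate might cite. The write-up does not say which flags were named, so this selection and the values are assumptions; the keys themselves are standard Spark SQL settings. The helper takes any object exposing `conf.set(key, value)`, as a `SparkSession` does, so the snippet runs without Spark installed.

```python
from typing import Dict

# Illustrative selection only; values are examples, not tuning advice.
AQE_SETTINGS: Dict[str, str] = {
    "spark.sql.adaptive.enabled": "true",                     # turn AQE on
    "spark.sql.adaptive.coalescePartitions.enabled": "true",  # merge small shuffle partitions
    "spark.sql.adaptive.skewJoin.enabled": "true",            # split skewed join partitions
    "spark.sql.autoBroadcastJoinThreshold": str(10 * 1024 * 1024),  # 10 MB, in bytes
}


def apply_settings(spark, settings: Dict[str, str] = AQE_SETTINGS) -> None:
    """Apply each key/value pair via spark.conf.set."""
    for key, value in settings.items():
        spark.conf.set(key, value)


class _FakeConf:
    """Stand-in for a session's runtime config, used for a dry run."""

    def __init__(self) -> None:
        self.values: Dict[str, str] = {}

    def set(self, key: str, value: str) -> None:
        self.values[key] = value


class FakeSession:
    """Hypothetical stub so the example runs without a cluster."""

    def __init__(self) -> None:
        self.conf = _FakeConf()


if __name__ == "__main__":
    session = FakeSession()
    apply_settings(session)
    for k, v in session.conf.values.items():
        print(k, "=", v)
```

Typing these keys from memory in a plain editor is good practice for a shared-document interview where autocomplete is unavailable.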
What I'd do differently
- Practice PySpark live without my IDE's autocomplete. The shared Google Doc had none, and I fumbled imports.
Amazon · Senior DE (L6)
NO-OFFER
Background
L6 at a non-FAANG. 11 years in DE. Strong SQL/Python. Targeting an Alexa data team.
The loop — 5 rounds, single day, all LP-driven
- Coding 1 — string manipulation + 2 LPs (Bias for Action, Dive Deep).
- SQL — windowed aggregation + 2 LPs (Ownership, Earn Trust).
- System design — design Alexa wake-word telemetry + 2 LPs (Think Big, Customer Obsession).
- Bar-raiser — 60 min, 4 LPs. They went DEEP on each.
- HM — 2 LPs (Earn Trust, Hire and Develop).
What didn't work
- Bar-raiser — they asked for a Hire and Develop story. I gave one about mentoring. They asked "tell me about a time you grew someone you didn't want to manage." I didn't have one. Visible damage.
- Two of my prepared stories overlapped on LPs — when asked for a SECOND example of Customer Obsession, I told a story I'd already used. Bar-raiser noted it.
- Coding was fine but I rushed the LP follow-ups. The 50/50 split caught me off guard.
What I'd do differently
- Story matrix BEFORE the loop. Map every story to 2-3 LPs. Practice mapping in real time.
- Prepare 2 distinct stories per LP — bar-raisers ask for a second.
- Practice 50/50 coding-and-LP rhythm. The technical round isn't purely technical.
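The story matrix can be kept as a small script rather than a spreadsheet. This is a minimal sketch under stated assumptions: the story names are hypothetical placeholders, the LP list is limited to the ones named in this loop, and the two-stories-per-LP threshold follows the advice above.

```python
from collections import defaultdict
from typing import Dict, List

REQUIRED_LPS = [
    "Bias for Action", "Dive Deep", "Ownership", "Earn Trust",
    "Think Big", "Customer Obsession", "Hire and Develop",
]

# Hypothetical stories, each mapped to 2-3 LPs.
STORIES: Dict[str, List[str]] = {
    "pipeline-outage": ["Ownership", "Dive Deep", "Earn Trust"],
    "vendor-migration": ["Bias for Action", "Think Big"],
    "mentoring-junior": ["Hire and Develop", "Earn Trust"],
    "sla-redesign": ["Customer Obsession", "Dive Deep"],
}


def coverage_gaps(
    story_to_lps: Dict[str, List[str]],
    required_lps: List[str],
    min_stories: int = 2,
) -> Dict[str, List[str]]:
    """Return LPs backed by fewer than min_stories distinct stories."""
    lp_to_stories: Dict[str, List[str]] = defaultdict(list)
    for story, lps in story_to_lps.items():
        for lp in lps:
            lp_to_stories[lp].append(story)
    return {
        lp: lp_to_stories.get(lp, [])
        for lp in required_lps
        if len(lp_to_stories.get(lp, [])) < min_stories
    }


if __name__ == "__main__":
    for lp, have in coverage_gaps(STORIES, REQUIRED_LPS).items():
        print(f"{lp}: only {len(have)} story ({', '.join(have) or 'none'})")
```

Any LP the script prints is one where a request for a second example would force you to reuse a story.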
Meta · E6 Production Engineer
OFFER
Background
E5 → E6 promotion-track, internal candidate. Prep time was 3 months.
The loop — 5 rounds, single day
- Coding 1 — 2 problems in 45 min. LRU cache + interval merging. Pace pressure was real.
- Coding 2 — 2 problems. String manipulation + tree traversal.
- System design — design a distributed log aggregation system. Heavy on hot-key handling and cost.
- Behavioral — leveled at E6 scope. Career story + cross-functional disagreement.
- Career story — 60 min. The whole hour was about scope evolution and ambiguity navigation.
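For pacing practice on the Coding 1 pair, these are minimal sketches of the two classic problems named there. The exact prompts and constraints from the interview are not known, so the class and function names and the closed-interval convention are assumptions.

```python
from collections import OrderedDict
from typing import Any, Hashable, Iterable, List, Tuple


class LRUCache:
    """Least-recently-used cache backed by an OrderedDict."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._data: "OrderedDict[Hashable, Any]" = OrderedDict()

    def get(self, key: Hashable, default: Any = None) -> Any:
        if key not in self._data:
            return default
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key: Hashable, value: Any) -> None:
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict least recently used


def merge_intervals(intervals: Iterable[Tuple[int, int]]) -> List[List[int]]:
    """Merge overlapping closed intervals; touching intervals are merged too."""
    merged: List[List[int]] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


if __name__ == "__main__":
    cache = LRUCache(2)
    cache.put("a", 1)
    cache.put("b", 2)
    cache.get("a")       # "a" is now the most recently used key
    cache.put("c", 3)    # evicts "b"
    print(cache.get("b"), cache.get("a"), cache.get("c"))
    print(merge_intervals([(1, 3), (8, 10), (2, 6), (15, 18)]))
```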
What worked
- Practiced 2-problems-in-45-min cadence for 2 months. Hit it in the real interviews.
- System design — every claim had a metric ("at scale X, P99 became Y, we fixed it with Z"). Meta wants numbers everywhere.
- Cross-functional story included a PM partner and a data scientist partner. Both were quoted in my story.
What I'd do differently
- The "second weakness" question in behavioral — I didn't have a strong second answer. Prepare two.
OpenAI · Senior Member of Technical Staff
OFFER
Background
Staff at a FAANG, 8 years. Working on agent infrastructure. Strong DE foundation, recent shift toward AI engineering.
The loop — 6 rounds across 3 weeks
- Recruiter screen — mission alignment probe. Specifically asked "why OpenAI, not Anthropic?"
- Technical screen — 90 min pair-programming. Build a small evals harness from scratch. They watched HOW I structured the code.
- Take-home — design + implement a small RAG demo with custom retrieval. 1 week.
- Onsite coding — extend the take-home in front of them. Live refactoring.
- System design — design a multi-tenant LLM inference platform. Heavy on cost, latency, and tenant isolation.
- Mission & values — 60 min on alignment, safety, and "why this lab."
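For a sense of scale on the technical screen, a "small evals harness" could start from something like the sketch below. This is an assumption about what such a harness might contain, not a record of what the candidate built: the dataclass names, the exact-match scorer, and the stub model are all hypothetical. The model is any callable from prompt to output, which keeps the structure easy to extend or refactor live.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class EvalCase:
    prompt: str
    expected: str


@dataclass(frozen=True)
class EvalResult:
    case: EvalCase
    output: str
    passed: bool


def exact_match(output: str, expected: str) -> bool:
    """Simplest possible scorer: case-insensitive, whitespace-trimmed equality."""
    return output.strip().lower() == expected.strip().lower()


def run_evals(
    model: Callable[[str], str],
    cases: Iterable[EvalCase],
    scorer: Callable[[str, str], bool] = exact_match,
) -> List[EvalResult]:
    """Run every case through the model and score the output."""
    results = []
    for case in cases:
        output = model(case.prompt)
        results.append(EvalResult(case, output, scorer(output, case.expected)))
    return results


def pass_rate(results: List[EvalResult]) -> float:
    """Fraction of passing cases; 0.0 for an empty run."""
    return sum(r.passed for r in results) / len(results) if results else 0.0


def stub_model(prompt: str) -> str:
    """Stand-in for a real model call, so the harness runs offline."""
    return {"2 + 2 = ?": "4", "Capital of France?": "paris"}.get(prompt, "")


if __name__ == "__main__":
    cases = [
        EvalCase("2 + 2 = ?", "4"),
        EvalCase("Capital of France?", "Paris"),
        EvalCase("Largest planet?", "Jupiter"),
    ]
    results = run_evals(stub_model, cases)
    for r in results:
        print("PASS" if r.passed else "FAIL", "|", r.case.prompt)
    print("pass rate:", pass_rate(results))
```

Keeping the scorer and the model as injected callables leaves obvious extension points (a fuzzy scorer, a real API client) for the interviewer to ask about.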
What worked
- The take-home was tested. The onsite extension was where they watched me debug + refactor. Don't over-engineer the take-home; leave clean refactor opportunities.
- Mission round — I'd specifically read 3 recent OpenAI papers and could discuss specifics. Generic "AGI is exciting" answers don't pass.
- Knew the AI Engineering vocabulary cold (RAG, evals, agents, MCP, cost-per-call attribution).
What I'd do differently
- Pair-programming screen — I optimized before being asked. Should have shipped the simple version and let them ask for the optimization.
Submit your experience
Help the next candidate. Email your write-up to experiences@paddyspeaks.com with:
- Company, level, quarter, outcome (offer / down-level / no-offer / rescinded)
- Loop structure — number of rounds, round types
- Per-round: what was asked, what the interviewer was looking for
- What worked, what didn't, what you'd do differently
- Anonymized — no company-specific project details, no team names
We publish entries within 1-2 weeks. Anonymous by default; opt in to bylines.
Related pages
- Company Tracks — per-employer prep playbooks
- Behavioral & Leadership — the round most of these candidates flagged as decisive
- Mock Interview Simulator — practice live