How to Walk Through a System-Design Round Without Freezing
The 45-minute system-design round is the round most senior candidates fail. Not because they don't know the components. Because they don't know how to spend the minutes. Forty-five minutes feels like a lot until you've been silent for six of them, drawing a load balancer.
This is a phase-by-phase breakdown of what to say, when. The framework is boring on purpose. Most posts on system design try to give you a clever mnemonic. The panels don't care about your mnemonic. They care about whether you cover the right ground in the right order.
Minutes 0 to 5: clarify, don't assume
When the interviewer says "design Twitter", every senior candidate's instinct is to start drawing. Resist it. The first five minutes are for questions, not boxes.
Three things to nail down:
- Scope. Tweets, replies, the timeline, search, DMs, notifications. Pick two or three. "I'll focus on tweet posting plus the home timeline. Skip DMs and search unless we have time." This single sentence buys you the next 40 minutes.
- Scale. 100 million DAU, 1 million tweets per minute peak, average tweet 200 bytes, 90/10 read-write skew. Get the panel to either confirm these or correct them.
- Non-functional requirements. Latency target (read p99 under 200ms, post p99 under 500ms), availability (three nines is normal, four nines is a stretch for a side-channel like search), consistency model (eventually consistent timeline, strongly consistent post acknowledgement).
Write each of these on the whiteboard. The panel can see you committed to a scope. You can refer back when you have to trade something off later.
Minutes 5 to 10: back-of-envelope numbers
Estimate, don't guess. The panel wants three numbers from you in this phase: request rate (QPS), storage growth per year, and bandwidth at peak.
For the Twitter example: 1M tweets/min peak ÷ 60 ≈ 17k writes/sec. At a 90/10 read-write ratio, reads are 9× writes ≈ 150k reads/sec. Storage: tweet (200 bytes) + metadata (200 bytes) + index entries (≈ 400 bytes) ≈ 800 bytes per tweet. Treating the peak rate as sustained all year, which gives an upper bound: 1M tweets/min × 60 × 24 × 365 ≈ 525B tweets/year × 800 bytes ≈ 420 TB/year. That's enough for the panel to know you can do napkin math.
Don't round to nice numbers. 17k writes/sec is better than "about 20k". The panel reads round numbers as fabrication.
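The arithmetic above can be written down as a short script, which is a useful way to rehearse it before the round. This is a minimal sketch: the function name `estimate` is made up for illustration, and the bandwidth figure depends on `tweets_per_read`, a hypothetical page size that does not come from the example.

```python
# Back-of-envelope estimate for the Twitter example.
# Inputs mirror the figures quoted in the text, except tweets_per_read,
# which is an assumed page size used only for the bandwidth estimate.

def estimate(peak_tweets_per_min=1_000_000,
             reads_per_write=9,            # 90/10 read-write skew
             bytes_per_tweet=200 + 200 + 400,  # body + metadata + index entries
             tweets_per_read=20):          # ASSUMPTION: tweets returned per timeline read
    writes_per_sec = peak_tweets_per_min / 60
    reads_per_sec = writes_per_sec * reads_per_write

    # Treats the peak rate as sustained for the whole year, so this is an upper bound.
    tweets_per_year = peak_tweets_per_min * 60 * 24 * 365
    storage_tb_per_year = tweets_per_year * bytes_per_tweet / 1e12

    # Peak read bandwidth under the assumed page size.
    peak_read_gb_per_sec = reads_per_sec * tweets_per_read * bytes_per_tweet / 1e9

    return {
        "writes_per_sec": writes_per_sec,
        "reads_per_sec": reads_per_sec,
        "tweets_per_year": tweets_per_year,
        "storage_tb_per_year": storage_tb_per_year,
        "peak_read_gb_per_sec": peak_read_gb_per_sec,
    }


if __name__ == "__main__":
    for name, value in estimate().items():
        print(f"{name}: {value:,.2f}")
```

Changing one input and re-running shows how sensitive each number is, which is the same conversation the panel tends to start when they correct your scale assumptions.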
Minutes 10 to 25: the high-level architecture
Draw the boxes. Client app, CDN, edge load balancer, API gateway, service tier, cache, primary datastore, async workers. Keep it simple. Six to eight boxes is enough for this phase. The panel will push you to expand specific ones later.
When you draw, narrate. "I'm using an API gateway here because we want auth, rate limiting, and request routing in one place. Could be Envoy or AWS API Gateway." Naming specific tools is a positive signal. Hand-waving "a service layer" is a negative one.
Two patterns the panel expects to see by minute 20:
- Fan-out-on-write timeline. When a user tweets, the post service writes the tweet, then publishes an event to a queue (Kafka, Kinesis), and a fan-out worker materialises that tweet into each follower's timeline cache. Reads become O(1).
- Celebrity-account exception. Fan-out doesn't scale when an account has 100M followers. Mention that you'd use a hybrid: fan-out-on-write for normal accounts, pull-on-read for celebrities. This single observation is what separates an L4 from an L5 answer.
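The two patterns above can be sketched together in a few lines. This is an in-memory illustration, not a production design: the class name `TimelineService`, the `celebrity_threshold` cutoff, and the timeline length are all hypothetical, and the fan-out runs synchronously here where the text describes a queue and a separate worker.

```python
from collections import defaultdict, deque


class TimelineService:
    """Hybrid timeline: fan-out-on-write for normal accounts, pull-on-read for celebrities."""

    def __init__(self, celebrity_threshold=10_000, timeline_length=800):
        # Both defaults are illustrative assumptions, not figures from the text.
        self.celebrity_threshold = celebrity_threshold
        self.followers = defaultdict(set)          # author -> users following them
        self.following = defaultdict(set)          # user -> authors they follow
        self.tweets_by_author = defaultdict(list)  # author -> [(seq, author, text)]
        # Per-user materialised timeline, standing in for the timeline cache.
        self.timeline_cache = defaultdict(lambda: deque(maxlen=timeline_length))
        self._seq = 0                              # logical clock used for ordering

    def follow(self, user, author):
        self.followers[author].add(user)
        self.following[user].add(author)

    def is_celebrity(self, author):
        return len(self.followers[author]) >= self.celebrity_threshold

    def post(self, author, text):
        self._seq += 1
        entry = (self._seq, author, text)
        self.tweets_by_author[author].append(entry)
        # Fan-out-on-write: copy the tweet into every follower's timeline,
        # but skip this for celebrity accounts.
        if not self.is_celebrity(author):
            for follower in self.followers[author]:
                self.timeline_cache[follower].append(entry)
        return entry

    def home_timeline(self, user, limit=10):
        entries = set(self.timeline_cache[user])
        # Pull-on-read: merge in recent tweets from followed celebrities.
        for author in self.following[user]:
            if self.is_celebrity(author):
                entries.update(self.tweets_by_author[author][-limit:])
        # The set removes duplicates for authors who crossed the threshold
        # after some of their tweets had already been fanned out.
        return sorted(entries, reverse=True)[:limit]
```

The cost moves rather than disappears: a read now does one extra lookup per followed celebrity, which is the trade-off worth saying out loud when you draw it.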
Minutes 25 to 35: deep-dive into one or two components
The panel will pick something they want you to expand. Usually it's the timeline cache or the datastore choice. Be ready to defend both.
For the datastore: don't say "I'll use MongoDB" without justifying it. Say "I'd use a wide-column store like Cassandra for the tweet store because writes are append-only, the schema is simple, and we get tunable consistency per-request. For user-profile data which is read-heavy and small, a relational store like Postgres with a read replica fleet is fine. Two different stores for two different access patterns." That's what the panel wants to hear.
For the cache: name the tier (Redis cluster), the eviction policy (LRU with a longer TTL for verified-user timelines), and the warmup strategy (preload on follow event). Mention the failure mode: cache stampede. Mention the mitigation: request coalescing with a single in-flight refresh per key. These are the details that move you from "candidate has read a blog post" to "candidate has actually built this".
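Request coalescing is easier to defend if you can sketch it. The version below is a minimal single-process illustration using Python threads: `CoalescingCache` and `loader` are hypothetical names, and it leaves out TTLs and eviction entirely. In a Redis-backed system the same idea would need a shared lock or equivalent, which this sketch does not attempt.

```python
import threading


class CoalescingCache:
    """Cache that allows only one in-flight refresh per key."""

    def __init__(self, loader):
        self._loader = loader      # function key -> value (the slow path)
        self._values = {}          # key -> cached value
        self._inflight = {}        # key -> Event set when the refresh finishes
        self._lock = threading.Lock()

    def get(self, key):
        while True:
            with self._lock:
                if key in self._values:
                    return self._values[key]
                event = self._inflight.get(key)
                leader = event is None
                if leader:
                    # First caller for this key becomes the leader and loads it.
                    event = threading.Event()
                    self._inflight[key] = event

            if leader:
                try:
                    value = self._loader(key)
                    with self._lock:
                        self._values[key] = value
                    return value
                finally:
                    # Wake the waiters whether the load succeeded or failed.
                    with self._lock:
                        del self._inflight[key]
                    event.set()

            # Followers wait for the leader, then loop to re-check the cache.
            # If the leader failed, one of them becomes the new leader.
            event.wait()
```

With eight concurrent callers on a cold key, the loader runs once and the other seven wait for its result instead of all hitting the datastore.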
Minutes 35 to 42: trade-offs and failure modes
These seven minutes are where staff-level candidates earn their level. Walk through three failure modes explicitly:
- What happens if Redis goes down? Read latency spikes from p99 200ms to p99 3s as timeline reads fall back to the Cassandra tweet store. We degrade the timeline to a slower path. We do not return 500s.
- What happens if a fan-out worker falls behind? Followers see stale timelines. We monitor consumer lag and page when it exceeds 30 seconds. We never block tweet posting on fan-out success.
- What happens if the celebrity-account write rate spikes (a politically charged moment, a Super Bowl halftime, a product launch)? We shed load by rate-limiting per follower-count tier. The AWS Builders' Library is the standard reference for load-shedding patterns if the panel pushes.
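If the panel asks what rate-limiting per follower-count tier would look like, a token bucket per tier is one plausible answer. The sketch below assumes one shared bucket per tier; the tier boundaries, rates, and the names `TokenBucket` and `TieredLoadShedder` are all hypothetical. What happens to a rejected request (deferring the fan-out, or returning a 429) is left open.

```python
import time


class TokenBucket:
    """Classic token bucket: refills at rate_per_sec up to capacity."""

    def __init__(self, rate_per_sec, capacity, clock=time.monotonic):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = float(capacity)
        self.clock = clock
        self.last = clock()

    def allow(self):
        now = self.clock()
        # Refill for the time elapsed since the last call, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class TieredLoadShedder:
    """Admit or shed requests using one bucket per follower-count tier."""

    def __init__(self, tiers, clock=time.monotonic):
        # tiers: iterable of (min_followers, rate_per_sec, burst).
        # Sorted so the highest follower threshold is checked first.
        self._tiers = [
            (min_followers, TokenBucket(rate, burst, clock))
            for min_followers, rate, burst in sorted(tiers, reverse=True)
        ]

    def admit(self, follower_count):
        for min_followers, bucket in self._tiers:
            if follower_count >= min_followers:
                return bucket.allow()
        return True  # no tier configured for this account: do not shed


# ASSUMPTION: illustrative tiers only, not figures from the text.
EXAMPLE_TIERS = [
    (1_000_000, 1.0, 2),    # celebrity accounts: tight limit
    (10_000, 5.0, 10),      # large accounts
    (0, 20.0, 40),          # everyone else
]
```

The `clock` parameter exists so the limiter can be tested with a fake clock rather than real sleeps.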
Name three things you'd monitor on a dashboard: p99 read latency, fan-out consumer lag, error rate per service. Mention you'd alert on the rate of change of the error rate, not the absolute value. (Spikes matter more than level.)
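One reading of the alerting rule above is to fire when the error rate jumps between sampling intervals, rather than when it crosses a fixed level. The sketch below takes that reading; `ErrorRateSpikeAlert` and its threshold are hypothetical, and a real alerting system would typically smooth over several intervals to avoid paging on noise.

```python
class ErrorRateSpikeAlert:
    """Fires when the error rate rises sharply between consecutive intervals."""

    def __init__(self, max_increase):
        # max_increase: largest tolerated rise in error rate per interval,
        # e.g. 0.02 means a jump of more than two percentage points fires.
        self.max_increase = max_increase
        self.previous_rate = None

    def observe(self, errors, requests):
        """Record one interval's counts; return True if the alert should fire."""
        rate = errors / requests if requests else 0.0
        fire = (
            self.previous_rate is not None
            and rate - self.previous_rate > self.max_increase
        )
        self.previous_rate = rate
        return fire
```

Note that a service sitting at a steady 6% error rate does not fire under this rule, so in practice it would sit alongside a level-based alert rather than replace it.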
Minutes 42 to 45: the wrap
When the panel says "we have a few minutes left", they're asking you to summarise. Three sentences. What you designed, the two biggest trade-offs you made (fan-out for normal vs pull for celebrities; Cassandra for tweets vs Postgres for profiles), and one thing you'd revisit if you had another hour. That last one is the most important. It tells the panel you know what you don't know.
What we see candidates get wrong
Across the candidates LastRound AI works with on system design prep, the same three failure modes account for most of the no-hires:
- Drawing before clarifying. The panel asks an ambiguous question on purpose. Drawing in minute one means you committed to a scope the panel might not have meant. You'll spend the next twenty minutes defending the wrong design.
- Skipping the back-of-envelope math. Even if your final design is right, skipping the numbers tells the panel you don't know whether your design supports the load. That fails the bar at L5+.
- Refusing to commit to a choice. "We could use either Postgres or DynamoDB" with no follow-up is worse than picking the wrong one and defending it. Pick. Justify. Acknowledge the trade-off.
References worth your time
Two that hold up better than the average tutorial:
- The system-design-primer on GitHub. 280k+ stars for a reason. Read the "design a TinyURL" section end-to-end and you've covered 60% of what an L5 round needs.
- "Designing Data-Intensive Applications" by Martin Kleppmann. Long. Worth it. Chapters 5 (replication), 6 (partitioning), and 7 (transactions) before any senior round. The rest of the book is great but not interview-critical.
Written by
Hari
Engineering, LastRound AI
Engineer at LastRound AI. Writes about coding interviews, system design, and the patterns we see when candidates use our copilot for live technical rounds.