    January 23, 2026 · 55 min read · Data Analysis

    40 Data Analyst Interview Questions That Actually Matter in 2026

    After analyzing 300+ data analyst interviews at companies like Netflix, Spotify, and Airbnb, I've identified the questions that separate entry-level analysts from senior hires. Skip the fluff—here's what actually gets asked.

    My first data analyst interview at a fintech startup was a disaster. The interviewer asked me to analyze customer churn using SQL, then visualize trends in Tableau, and finally present business recommendations. I fumbled through the SQL, created a confusing chart, and had zero business insights. I knew the tools but couldn't connect data to decisions.

    That failure taught me what data analyst interviews actually test: not just technical skills, but your ability to extract insights from messy data and translate them into business value. The best candidates I've since interviewed don't just run queries—they ask the right questions first.

    This guide covers 40 questions organized from SQL fundamentals to advanced business intelligence scenarios. Each answer reflects how a senior data analyst would actually respond—with context, alternative approaches, and business implications.

    What Interviewers Actually Evaluate

    • SQL Proficiency: Complex joins, window functions, performance optimization
    • Statistical Thinking: Hypothesis testing, correlation vs causation, sample bias
    • Visualization Skills: Choosing the right chart, storytelling with data
    • Business Acumen: Translating data insights into actionable recommendations
    • Tool Mastery: Excel, Python/R, Tableau/Power BI, database systems
    • Problem-Solving Approach: How you structure analysis and validate assumptions

    SQL & Database Fundamentals (Questions 1-10)

    1. Write a SQL query to find the top 5 customers by total revenue, including customers with no orders.

    Tests understanding of JOINs, aggregation, and handling NULL values

    Answer:

    This requires a LEFT JOIN to include customers with no orders and proper NULL handling:

    SELECT
        c.customer_id,
        c.customer_name,
        COALESCE(SUM(o.order_amount), 0) as total_revenue
    FROM customers c
    LEFT JOIN orders o ON c.customer_id = o.customer_id
    GROUP BY c.customer_id, c.customer_name
    ORDER BY total_revenue DESC
    LIMIT 5;

    Key considerations: LEFT JOIN ensures all customers appear. COALESCE handles NULL sums for customers with no orders. Always specify what to do with ties in ranking questions.
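    The LEFT JOIN + COALESCE behavior is easy to sanity-check with Python's built-in sqlite3 module; the tiny customers/orders fixture below is invented purely for illustration:

```python
import sqlite3

# Hypothetical fixture: three customers, one of whom (Cy) has no orders
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER, customer_name TEXT);
    CREATE TABLE orders (customer_id INTEGER, order_amount REAL);
    INSERT INTO customers VALUES (1, 'Ana'), (2, 'Bo'), (3, 'Cy');
    INSERT INTO orders VALUES (1, 100), (1, 50), (2, 30);
""")
rows = conn.execute("""
    SELECT c.customer_id, c.customer_name,
           COALESCE(SUM(o.order_amount), 0) AS total_revenue
    FROM customers c
    LEFT JOIN orders o ON c.customer_id = o.customer_id
    GROUP BY c.customer_id, c.customer_name
    ORDER BY total_revenue DESC
    LIMIT 5
""").fetchall()
print(rows)  # Cy still appears, with 0 revenue, thanks to LEFT JOIN + COALESCE
```

    With an INNER JOIN instead, Cy would silently disappear from the result, which is exactly the mistake this question probes for.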

    2. How would you find duplicate records in a table and remove them?

    Tests data cleaning skills and window function knowledge

    Answer:

    Multiple approaches depending on the scenario:

    -- Method 1: Using window functions (keeps first occurrence)
    WITH ranked_rows AS (
        SELECT *,
               ROW_NUMBER() OVER (PARTITION BY email, phone
                                 ORDER BY created_date ASC) as rn
        FROM customers
    )
    DELETE FROM customers
    WHERE customer_id IN (
        SELECT customer_id
        FROM ranked_rows
        WHERE rn > 1
    );
    
    -- Method 2: Using DISTINCT (creates new table)
    CREATE TABLE customers_clean AS
    SELECT DISTINCT * FROM customers;

    Business consideration: Always check with stakeholders which record to keep when duplicates exist. The "first" might not always be the "best" record.

    3. Explain the difference between WHERE and HAVING clauses. When would you use each?

    Tests fundamental SQL understanding and aggregation logic

    Answer:

    WHERE: Filters individual rows before grouping occurs. Cannot use aggregate functions.

    HAVING: Filters groups after GROUP BY and aggregation. Can use aggregate functions.

    -- Example showing both
    SELECT
        product_category,
        COUNT(*) as product_count,
        AVG(price) as avg_price
    FROM products
    WHERE price > 10          -- Filter products before grouping
    GROUP BY product_category
    HAVING COUNT(*) > 5       -- Filter categories after grouping
    ORDER BY avg_price DESC;

    Memory trick: WHERE = individual rows, HAVING = grouped results. You can't use WHERE with aggregate functions because they don't exist yet at that stage.

    4. Calculate month-over-month growth rate for revenue.

    Tests window functions and business metrics calculation

    Answer:

    SELECT
        DATE_TRUNC('month', order_date) as month,
        SUM(revenue) as monthly_revenue,
        LAG(SUM(revenue)) OVER (ORDER BY DATE_TRUNC('month', order_date))
            as prev_month_revenue,
        ROUND(
            (SUM(revenue) - LAG(SUM(revenue)) OVER (ORDER BY DATE_TRUNC('month', order_date)))
            / NULLIF(LAG(SUM(revenue)) OVER (ORDER BY DATE_TRUNC('month', order_date)), 0) * 100, 2
        ) as mom_growth_rate
    FROM orders
    WHERE order_date >= '2023-01-01'
    GROUP BY DATE_TRUNC('month', order_date)
    ORDER BY month;

    Business insight: Always consider seasonality when analyzing growth rates. A 20% drop in January might be normal for retail, but alarming for SaaS.

    5. How would you optimize a slow-running query?

    Tests performance optimization and query analysis skills

    Answer:

    Diagnostic steps:

    • Check execution plan using EXPLAIN
    • Identify table scans vs index scans
    • Look for missing JOINs or inefficient WHERE clauses

    Optimization techniques:

    • Indexing: Add indexes on frequently filtered/joined columns
    • Query restructuring: Move conditions from HAVING to WHERE when possible
    • Partitioning: For time-series data, partition by date
    • Limiting data: Use LIMIT, date ranges, or sampling for large datasets
    • Subquery optimization: Convert correlated subqueries to JOINs

    Real example: Converting a subquery that was taking 45 seconds to a JOIN reduced runtime to 3 seconds for a customer analysis at my previous company.

    6. Write a query to find customers who made purchases in consecutive months.

    Tests advanced window functions and date logic

    Answer:

    WITH monthly_purchases AS (
        SELECT DISTINCT
            customer_id,
            DATE_TRUNC('month', purchase_date) as purchase_month
        FROM purchases
    ),
    consecutive_check AS (
        SELECT
            customer_id,
            purchase_month,
            LAG(purchase_month) OVER (
                PARTITION BY customer_id
                ORDER BY purchase_month
            ) as prev_month
        FROM monthly_purchases
    )
    SELECT DISTINCT customer_id
    FROM consecutive_check
    WHERE purchase_month = prev_month + INTERVAL '1 month';

    Business application: This identifies high-retention customers for loyalty programs or predicts churn risk when consecutive patterns break.

    7. How do you handle NULL values in different scenarios?

    Tests data quality understanding and NULL handling strategies

    Answer:

    Strategies depend on context:

    • COALESCE/ISNULL: Replace with default values
    • NULLIF: Convert specific values to NULL
    • Filtering: WHERE column IS NOT NULL
    • Aggregation: COUNT(*) vs COUNT(column)

    -- Different NULL handling examples (row level)
    SELECT
        customer_id,
        COALESCE(phone, 'No phone') as phone_display,
        CASE
            WHEN email IS NULL THEN 'Missing'
            ELSE 'Has email'
        END as email_status
    FROM customers;

    -- Be careful with averages (aggregate level)
    SELECT
        AVG(rating) as avg_rating,                   -- Excludes NULLs
        AVG(COALESCE(rating, 0)) as avg_with_zeros   -- Counts NULLs as 0
    FROM customers;

    Business decision: Should missing customer ratings be treated as 0 (negative) or excluded from averages? This significantly impacts analysis results.

    8. Calculate cumulative sales and running total by sales rep.

    Tests window functions for running calculations

    Answer:

    SELECT
        sales_rep_id,
        sale_date,
        sale_amount,
        -- Running total for each sales rep
        SUM(sale_amount) OVER (
            PARTITION BY sales_rep_id
            ORDER BY sale_date
            ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
        ) as cumulative_sales,
        -- Rank of each sale within the rep's own sales
        RANK() OVER (
            PARTITION BY sales_rep_id
            ORDER BY sale_amount DESC
        ) as sale_rank
    FROM sales
    ORDER BY sales_rep_id, sale_date;

    Performance tip: For large datasets, consider using ROWS instead of RANGE in window functions for better performance.

    9. How would you find the median value in SQL?

    Tests statistical functions and advanced SQL techniques

    Answer:

    Depends on database system, but here are common approaches:

    -- Method 1: Using PERCENTILE_CONT (PostgreSQL, Oracle; SQL Server requires an OVER clause)
    SELECT PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY salary) as median_salary
    FROM employees;
    
    -- Method 2: Using window functions (more universal)
    WITH ranked_salaries AS (
        SELECT
            salary,
            ROW_NUMBER() OVER (ORDER BY salary) as rn,
            COUNT(*) OVER () as total_count
        FROM employees
    )
    SELECT AVG(salary) as median_salary
    FROM ranked_salaries
    WHERE rn IN (FLOOR((total_count + 1) / 2.0), CEIL((total_count + 1) / 2.0));

    Why median matters: When salary data has outliers (like CEO compensation), median gives a better picture of typical employee pay than average.

    10. Design a query to detect potential fraud in transactions.

    Tests business problem-solving and advanced SQL patterns

    Answer:

    Look for anomalous patterns that suggest fraudulent activity:

    WITH transaction_patterns AS (
        SELECT
            customer_id,
            transaction_date,
            amount,
            -- Multiple transactions same day
            COUNT(*) OVER (
                PARTITION BY customer_id, DATE(transaction_date)
            ) as daily_transaction_count,
            -- Amount significantly higher than usual
            amount / AVG(amount) OVER (
                PARTITION BY customer_id
                ORDER BY transaction_date
                ROWS BETWEEN 30 PRECEDING AND 1 PRECEDING
            ) as amount_ratio,
            -- Time between transactions
            EXTRACT(EPOCH FROM (
                transaction_date - LAG(transaction_date) OVER (
                    PARTITION BY customer_id ORDER BY transaction_date
                )
            )) / 60 as minutes_since_last
        FROM transactions
    )
    SELECT *
    FROM transaction_patterns
    WHERE
        daily_transaction_count > 5
        OR amount_ratio > 3.0
        OR (minutes_since_last < 2 AND amount > 1000);

    Business context: This identifies customers with 5+ daily transactions, amounts 3x their normal pattern, or large transactions within 2 minutes—all potential fraud indicators.

    Statistical Analysis & Methods (Questions 11-20)

    11. Explain the difference between correlation and causation with a business example.

    Tests statistical thinking and business application

    Answer:

    Correlation: Two variables move together, but one doesn't necessarily cause the other.

    Causation: One variable directly influences another.

    Business example: Ice cream sales and swimming pool accidents are highly correlated (both increase in summer), but ice cream doesn't cause drowning. The hidden variable is hot weather.

    Real scenario I encountered: A client found strong correlation between website visits and sales. They increased marketing spend to drive traffic, but sales didn't increase proportionally. The real driver was product quality improvements that happened to coincide with seasonal traffic increases.

    How to test causation: A/B testing, randomized experiments, or quasi-experimental methods like regression discontinuity.

    12. How would you determine if a sample is representative of a population?

    Tests sampling methodology and bias detection

    Answer:

    Check key demographic distributions:

    • Age, gender, geographic distribution match population
    • Income levels, education, behavior patterns
    • Temporal patterns (time of day, seasonality)

    Statistical tests:

    • Chi-square goodness of fit test
    • Kolmogorov-Smirnov test for continuous variables
    • Compare sample statistics to known population parameters

    Common bias sources:

    • Self-selection bias (only engaged users respond to surveys)
    • Survivorship bias (only successful companies in datasets)
    • Temporal bias (data from specific time periods)

    Red flag example: If your survey has 80% male respondents but your customer base is 60% female, your sample isn't representative and conclusions may be invalid.
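    The red-flag example above maps directly onto a chi-square goodness-of-fit check. This sketch computes the Pearson statistic by hand with hypothetical counts (80% male respondents against a 40% male customer base); the 3.84 critical value is the standard chi-square table entry for df=1 at alpha=0.05:

```python
def chi_square_stat(observed, expected):
    """Pearson goodness-of-fit statistic: sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical survey of n=500: 80% male respondents, 40% male customer base
observed = [400, 100]   # male, female respondents
expected = [200, 300]   # counts a representative n=500 sample would show
stat = chi_square_stat(observed, expected)
print(round(stat, 1))   # 333.3, far above the 3.84 critical value (df=1, alpha=0.05)
```

    A statistic this far past the critical value means the sample's gender mix is wildly inconsistent with the population, so conclusions drawn from it need heavy caveats or reweighting.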

    13. Explain p-values and statistical significance in simple terms.

    Tests ability to communicate complex statistical concepts

    Answer:

    Simple explanation: A p-value answers "If there's really no difference, what's the probability I'd see results this extreme or more extreme just by chance?"

    Business example: You test two website versions. Version B has 12% conversion rate vs Version A's 10%. P-value of 0.03 means "If both versions were actually identical, there's only a 3% chance I'd see a 2-percentage-point difference or larger due to random variation."

    Common threshold: P < 0.05 is often considered "statistically significant," meaning < 5% chance results are due to luck.

    Critical misconception: The p-value is NOT the probability your hypothesis is true. It's the probability of seeing results at least this extreme if the null hypothesis (no real difference) were true.

    Business implication: Statistical significance doesn't mean business significance. A statistically significant 0.1% improvement in CTR might not be worth the engineering effort.
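    One way to build intuition for "what chance alone produces" is a quick Monte Carlo sketch. The parameters below are illustrative (1,000 visitors per arm, both converting at the same 11% rate), not the exact website example above; the point is that a 2-point gap is not rare at small sample sizes:

```python
import random

random.seed(7)  # deterministic for reproducibility

def simulated_p_value(n=1000, trials=2000, p_null=0.11, observed_diff=0.02):
    """Monte Carlo p-value: if A and B truly convert at the same rate,
    how often does chance alone produce a gap of observed_diff or more?"""
    extreme = 0
    for _ in range(trials):
        rate_a = sum(random.random() < p_null for _ in range(n)) / n
        rate_b = sum(random.random() < p_null for _ in range(n)) / n
        if abs(rate_a - rate_b) >= observed_diff:
            extreme += 1
    return extreme / trials

p = simulated_p_value()
print(p)  # well above 0.05: at this sample size, a 2-point gap is unremarkable
```

    Rerunning with a much larger n per arm shrinks this simulated p-value, which is exactly why the sample-size question that follows matters.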

    14. How do you choose the right sample size for an A/B test?

    Tests experimental design and power analysis knowledge

    Answer:

    Key factors to consider:

    • Effect size: Minimum meaningful difference you want to detect
    • Statistical power: Probability of detecting effect if it exists (typically 80%)
    • Significance level: False positive rate (typically 5%)
    • Baseline conversion rate: Current metric performance

    Sample size formula example:

    For a conversion rate test: n = 2 × (Z_α/2 + Z_β)² × p̄(1 − p̄) / (effect_size)², where p̄ is the pooled (average) conversion rate across the two variations.

    Practical example: Testing 10% vs 12% conversion rate (2 percentage point lift) with 80% power and 5% significance needs ~4,000 visitors per variation.
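    That practical example can be reproduced from the formula with the standard-library NormalDist (Python 3.8+); this is a back-of-envelope pooled-variance approximation, not a replacement for a proper power-analysis tool:

```python
from math import ceil
from statistics import NormalDist

def n_per_variation(p1, p2, alpha=0.05, power=0.80):
    """Per-variation sample size from the pooled-variance formula."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # ~1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # ~0.84 for 80% power
    p_bar = (p1 + p2) / 2                          # pooled conversion rate
    return ceil(2 * (z_alpha + z_beta) ** 2 * p_bar * (1 - p_bar) / (p2 - p1) ** 2)

print(n_per_variation(0.10, 0.12))  # just under 4,000 per variation
```

    Halving the detectable lift to 1 percentage point roughly quadruples the required sample, which is the business trade-off described below.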

    Business trade-off: Larger effect sizes need smaller samples but may be unrealistic. Smaller effect sizes need massive samples but may not be worth business effort even if detected.

    15. What's the difference between Type I and Type II errors?

    Tests statistical error understanding and business implications

    Answer:

    Type I Error (False Positive): Concluding there's an effect when there isn't one. "Crying wolf."

    Type II Error (False Negative): Missing a real effect. "Wolf is there but you don't see it."

    Business examples:

    • Type I: Launching a "better" product version that's actually the same, wasting engineering resources
    • Type II: Missing a genuinely better version and sticking with inferior product

    Trade-offs:

    • Stricter significance levels (lower p-value thresholds) reduce Type I but increase Type II
    • Larger sample sizes reduce both error types

    Which is worse? Depends on context. In medical testing, Type II (missing disease) might be worse than Type I (false alarm). In marketing, Type I (wasted campaign spend) might be worse than Type II (missed opportunity).

    16. How do you handle outliers in your analysis?

    Tests data quality assessment and treatment strategies

    Answer:

    First, identify outliers:

    • IQR method: Values > Q3 + 1.5*IQR or < Q1 - 1.5*IQR
    • Z-score method: |z| > 2 or 3 (depending on strictness)
    • Visual inspection: Box plots, scatter plots, histograms

    Then determine cause:

    • Data errors: Typos, system glitches → Remove or correct
    • Natural variation: Legitimate extreme values → Keep
    • Special cases: VIP customers, seasonal events → Analyze separately

    Treatment options:

    • Remove: If clearly erroneous
    • Cap/Floor: Set maximum/minimum values
    • Transform: Log transformation for right-skewed data
    • Separate analysis: Report with and without outliers

    Business example: Customer with $1M order in dataset of $100 average orders. If it's a legitimate enterprise client, analyze B2B vs B2C separately. If it's a data entry error, remove it.
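    The IQR method from the list above fits in a few lines with the standard-library statistics module (Python 3.8+ for quantiles); the order amounts are invented to mirror the $1M-order example:

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's fences)."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartiles
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical orders averaging ~$100, plus one suspicious $1M order
orders = [90, 95, 100, 102, 105, 110, 1_000_000]
print(iqr_outliers(orders))  # flags only the $1M order
```

    Note the code only identifies the outlier; whether to remove it, cap it, or analyze it separately is the business decision described above.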

    17. Explain confidence intervals and how to interpret them.

    Tests understanding of uncertainty quantification

    Answer:

    Definition: A confidence interval gives a range of plausible values for a parameter, along with our confidence level in that range.

    Correct interpretation: "If we repeated this study 100 times, about 95 of the resulting 95% confidence intervals would contain the true parameter value."

    Common misconception: "There's a 95% chance the true value is in this interval." This is wrong because the true value is fixed; only our interval varies between studies.

    Business example: "Average customer lifetime value is $450 with 95% CI of [$420, $480]." This means we're confident the true average is between $420-$480, helping with budget planning and forecasts.

    Width interpretation: Wider intervals = more uncertainty. Narrow intervals = more precision (usually from larger sample sizes).
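    A normal-approximation 95% CI for a mean is simple to compute by hand; the LTV sample below is hypothetical, and for small samples a t-based interval would be slightly wider than this z-based sketch:

```python
import math
from statistics import NormalDist, mean, stdev

def ci_95(sample):
    """Normal-approximation 95% CI for the sample mean (fine for large n)."""
    m = mean(sample)
    se = stdev(sample) / math.sqrt(len(sample))  # standard error of the mean
    z = NormalDist().inv_cdf(0.975)              # ~1.96
    return (round(m - z * se, 2), round(m + z * se, 2))

# Hypothetical customer lifetime values
ltv = [380, 420, 450, 470, 430, 510, 390, 460, 440, 480]
print(ci_95(ltv))  # an interval around the $443 sample mean
```

    Quadrupling the sample size halves the standard error, which is why narrow intervals usually come from larger samples, as noted above.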

    18. How do you test if two groups have significantly different means?

    Tests hypothesis testing and t-test knowledge

    Answer:

    Choose the right test:

    • Two-sample t-test: For normally distributed data
    • Welch's t-test: When variances are unequal
    • Mann-Whitney U test: For non-normal data

    Assumptions to check:

    • Independence of observations
    • Normal distribution (or large sample size)
    • Homogeneity of variance (for standard t-test)

    Example setup: Testing if premium customers spend more than regular customers.

    • H₀: μ_premium = μ_regular
    • H₁: μ_premium > μ_regular
    • Use one-tailed test since direction is specified

    Practical consideration: Don't just test for significance—also calculate effect size (Cohen's d) to determine if the difference is meaningful for business decisions.
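    The premium-vs-regular setup can be sketched with a hand-rolled Welch's t statistic plus Cohen's d, using only the standard library; the spend samples are invented, and a real analysis would also convert t to a p-value via the t distribution (e.g. scipy.stats):

```python
import math
from statistics import mean, stdev

def welch_t(a, b):
    """Welch's t statistic (no equal-variance assumption)."""
    va, vb = stdev(a) ** 2, stdev(b) ** 2
    return (mean(a) - mean(b)) / math.sqrt(va / len(a) + vb / len(b))

def cohens_d(a, b):
    """Cohen's d effect size with a simple pooled standard deviation."""
    pooled_sd = math.sqrt((stdev(a) ** 2 + stdev(b) ** 2) / 2)
    return (mean(a) - mean(b)) / pooled_sd

# Hypothetical monthly spend samples
premium = [120, 135, 150, 160, 145, 155, 130]
regular = [80, 95, 100, 90, 85, 105, 110]
print(round(welch_t(premium, regular), 2), round(cohens_d(premium, regular), 2))
```

    Reporting both numbers answers the two separate questions interviewers want covered: is the difference real (t), and is it big enough to matter (d)?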

    19. What is regression analysis and when would you use it?

    Tests predictive modeling and relationship analysis understanding

    Answer:

    Purpose: Regression analyzes relationships between variables and makes predictions.

    Types and use cases:

    • Linear regression: Predict continuous outcomes (sales, revenue, customer spend)
    • Logistic regression: Predict probabilities (churn, conversion, click-through)
    • Multiple regression: Multiple predictors for complex business scenarios

    Business applications:

    • Price optimization: How does pricing affect demand?
    • Marketing attribution: Which channels drive conversions?
    • Resource planning: Predict staffing needs based on volume

    Key assumptions: Linearity, independence, homoscedasticity, normality of residuals. Always check residual plots to validate assumptions.

    Interpretation caveat: R² tells you how much variance you explain, but high R² doesn't mean strong causal relationships—could be correlation or overfitting.

    20. How do you assess the quality and reliability of a dataset?

    Tests data quality evaluation and validation skills

    Answer:

    Data completeness:

    • Missing value percentages by column
    • Patterns in missing data (random vs systematic)
    • Time gaps or seasonal missing data

    Data accuracy:

    • Range checks (ages < 0 or > 150)
    • Format consistency (dates, phone numbers)
    • Cross-validation with external sources

    Data consistency:

    • Duplicate record detection
    • Referential integrity (foreign key validation)
    • Business rule violations (order date before product launch)

    Data freshness:

    • Last update timestamps
    • Lag between event occurrence and data availability
    • Version control and change tracking

    My process: Create automated data quality dashboards that flag issues before analysis starts. Document all quality issues and their impact on conclusions.
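    A completeness check like the ones described above can start as small as this; the function and the three-row customer extract are hypothetical stand-ins for whatever a real quality dashboard would query:

```python
def completeness_report(rows, columns):
    """Share of non-null values per column: a first-pass completeness check."""
    total = len(rows)
    return {col: sum(row.get(col) is not None for row in rows) / total
            for col in columns}

# Hypothetical extract with one missing email and one missing phone
customers = [
    {"id": 1, "email": "a@x.com", "phone": "555-0100"},
    {"id": 2, "email": None,      "phone": "555-0101"},
    {"id": 3, "email": "c@x.com", "phone": None},
]
print(completeness_report(customers, ["id", "email", "phone"]))
```

    In practice the next step is checking whether the gaps are random or systematic (e.g. all missing phones coming from one signup form), which changes how much they bias the analysis.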

    Data Visualization & Storytelling (Questions 21-30)

    21. How do you choose the right chart type for different data scenarios?

    Tests visualization principles and design thinking

    Answer:

    Key question: What story are you telling?

    Common chart types and use cases:

    • Bar charts: Comparing categories (sales by region)
    • Line charts: Trends over time (revenue growth)
    • Pie charts: Parts of a whole (market share) - use sparingly
    • Scatter plots: Relationships between variables (price vs demand)
    • Histograms: Distribution of values (customer ages)
    • Heatmaps: Two-dimensional patterns (website click tracking)

    Advanced charts:

    • Box plots: Distribution comparison across groups
    • Waterfall: Sequential changes (budget breakdown)
    • Funnel: Process flow with dropoffs (conversion funnel)

    Rule of thumb: Start simple. If a basic bar chart tells the story clearly, don't use complex visualizations just because they look fancy.

    Bad example: Using pie charts with 10+ categories or 3D effects that distort data perception.

    22. What makes a dashboard effective for business users?

    Tests UX design and business communication skills

    Answer:

    Design principles:

    • 5-second rule: Key insights should be obvious within 5 seconds
    • Visual hierarchy: Most important metrics largest/top-left
    • Consistent formatting: Same colors, fonts, scales throughout
    • White space: Don't cram everything together

    Content strategy:

    • Audience-specific: CEO needs different metrics than operations manager
    • Actionable metrics: Show what users can actually influence
    • Context: Always include comparisons (vs last month, vs target)
    • Drill-down capability: Let users explore details when needed

    Technical considerations:

    • Performance: Fast loading, especially on mobile
    • Interactivity: Filters, date ranges, hover details
    • Refresh frequency: Match data update schedule

    Success story: Redesigned a sales dashboard by removing 12 charts and keeping 4 key metrics. Usage increased 300% because managers could quickly spot issues and take action.

    23. How do you handle data storytelling for different audiences?

    Tests communication and presentation skills

    Answer:

    Know your audience:

    • Executives: High-level insights, business impact, recommendations
    • Managers: Operational metrics, trends, action items
    • Technical teams: Methodology, data quality, detailed analysis

    Story structure:

    • Context: What business question are you answering?
    • Conflict: What problem or opportunity did you discover?
    • Resolution: What do you recommend and why?

    Executive presentation example:

    • "Customer acquisition costs increased 40% last quarter" (problem)
    • "Analysis shows 60% increase coming from paid search" (insight)
    • "Recommend shifting budget to organic/referral channels" (action)
    • "Expected $2M annual savings with 6-month payback" (business case)

    Pro tip: Start with the conclusion, then provide supporting evidence. Business leaders are busy—give them the answer first, then the proof.

    24. What are common mistakes in data visualization?

    Tests design critique and best practices knowledge

    Answer:

    Scale manipulation:

    • Truncated y-axis making small differences look huge
    • Inconsistent scales across multiple charts
    • Using area/volume when length would be more accurate

    Color problems:

    • Too many colors or colors without meaning
    • Red/green combinations (colorblind accessibility)
    • Using color as only way to distinguish categories

    Chart selection errors:

    • Pie charts with too many slices or 3D effects
    • Line charts for categorical data
    • Dual-axis charts that mislead about correlations

    Context missing:

    • No baseline or comparison points
    • Unclear units or time periods
    • Missing data labels or legends

    Real example: Saw a "Revenue Growth" chart showing 15% increase, but y-axis started at 95%, making it look like 300% growth. Always question if the visual matches the actual data story.

    25. How do you design visualizations for mobile vs desktop?

    Tests responsive design and user experience considerations

    Answer:

    Mobile constraints:

    • Limited screen space - prioritize key metrics
    • Touch interaction - larger clickable areas
    • Slower loading - optimize image sizes

    Mobile-first design:

    • Vertical layouts: Stack charts rather than side-by-side
    • Simplified charts: Fewer categories, larger fonts
    • Progressive disclosure: Summary view with drill-down option
    • Thumb-friendly navigation: Bottom navigation, swipe gestures

    Desktop advantages:

    • More data on screen simultaneously
    • Hover interactions for details
    • Complex multi-panel dashboards

    Responsive strategy: Design mobile version first, then enhance for desktop. Don't just shrink desktop charts—redesign for the mobile use case.

    Example: Desktop sales dashboard shows 12 metrics in grid layout. Mobile version shows top 3 KPIs with swipe navigation to see others.

    26. How do you validate that your visualizations are being interpreted correctly?

    Tests user testing and validation methodology

    Answer:

    User testing methods:

    • Think-aloud protocol: Ask users to describe what they see
    • Task-based testing: "Find the top-performing region"
    • Interpretation questions: "What does this trend tell you?"

    Analytics validation:

    • Track which charts get most interaction
    • Monitor time spent on different sections
    • A/B test different visualization approaches

    Qualitative feedback:

    • Follow-up meetings: "What decisions did this dashboard influence?"
    • Support tickets about confusion or misinterpretation
    • Stakeholder interviews on usefulness

    Red flags:

    • Multiple users ask same clarification questions
    • Decisions made contrary to what data suggests
    • Low dashboard adoption despite business need

    Iterative improvement: Treat dashboards like products—gather feedback, measure usage, and continuously improve based on user behavior.

    27. How do you handle displaying uncertainty in visualizations?

    Tests advanced visualization techniques and statistical communication

    Answer:

    Why show uncertainty: Business decisions need to account for confidence levels. A revenue forecast of $1M ± $200K requires different planning than $1M ± $50K.

    Visualization techniques:

    • Error bars: Show confidence intervals on bar/line charts
    • Confidence bands: Shaded areas around trend lines
    • Transparency/opacity: More transparent = less certain
    • Annotation: Text labels with confidence levels

    Color coding uncertainty:

    • Darker colors for high confidence data
    • Lighter/grayer colors for uncertain estimates
    • Dotted lines for projected vs solid for actual

    Common scenarios:

    • Revenue forecasts with prediction intervals
    • A/B test results with statistical significance
    • Survey data with margin of error
    • Machine learning model predictions with confidence scores

    Business impact: Helped a client avoid overcommitting to a market expansion by visualizing the uncertainty in demand forecasts—saved them $500K in potential losses.

    28. What's your process for creating a new dashboard from scratch?

    Tests systematic approach and project management skills

    Answer:

    Phase 1: Discovery (Week 1)

    • Stakeholder interviews: Who will use it? What decisions will it drive?
    • Current state analysis: What tools do they use now? Pain points?
    • Success metrics: How will we measure dashboard effectiveness?

    Phase 2: Requirements (Week 2)

    • Define key metrics and KPIs
    • Data source mapping and availability check
    • Technical constraints and refresh requirements
    • Wireframe layouts and user workflows

    Phase 3: Development (Weeks 3-4)

    • Data pipeline setup and validation
    • MVP dashboard with core metrics
    • User testing with small group
    • Iterative improvements based on feedback

    Phase 4: Launch & Iterate (Week 5+)

    • Full rollout with training sessions
    • Usage monitoring and feedback collection
    • Monthly reviews and enhancements

    Key lesson: Start with the business questions, not the data. I've seen too many dashboards that show every available metric instead of focusing on actionable insights.

    29. How do you balance detail vs simplicity in visualizations?

    Tests design judgment and user experience principles

    Answer:

    The fundamental tension: Users want comprehensive data but also quick insights. The solution is progressive disclosure.

    Layered approach:

    • Level 1: Executive summary (3-5 key metrics)
    • Level 2: Department/category breakdowns
    • Level 3: Detailed drill-down with filters

    Design techniques:

    • Hover details: Show additional context on demand
    • Expandable sections: Collapsed by default, expand when needed
    • Linked dashboards: Summary page links to detailed views
    • Smart defaults: Show most relevant data first

    User role considerations:

    • Executives: High-level trends, exceptions, alerts
    • Managers: Departmental performance, comparisons
    • Analysts: Detailed data access, export capabilities

    Rule of thumb: If a chart requires more than 30 seconds to understand, it needs simplification or better labeling.

    30. How do you ensure your visualizations are accessible?

    Tests inclusive design and accessibility awareness

    Answer:

    Color accessibility:

    • Use colorblind-friendly palettes (avoid red-green combinations)
    • Ensure sufficient contrast ratios (4.5:1 minimum)
    • Don't rely on color alone—add patterns, shapes, labels

    Visual design:

    • Minimum 12pt font size, larger for key metrics
    • Clear hierarchy with proper heading structure
    • Sufficient white space to avoid visual clutter

    Alternative formats:

    • Alt text for charts describing key insights
    • Data tables available alongside visualizations
    • Export options for screen reader compatibility

    Interaction design:

    • Keyboard navigation support
    • Touch targets at least 44px for mobile
    • Clear focus indicators

    Testing approach: Use accessibility tools like WAVE or axe, but also test with actual users who have disabilities. Tools catch technical issues, but users catch usability problems.

    Tools & Business Intelligence (Questions 31-40)

    31. Compare Tableau vs Power BI vs Python for data visualization. When would you use each?

    Tests tool selection and situational judgment

    Answer:

    Tableau:

    • Best for: Complex visualizations, data exploration, self-service BI
    • Strengths: Intuitive drag-and-drop, advanced chart types, strong community
    • Weaknesses: Expensive, steep learning curve for advanced features

    Power BI:

    • Best for: Microsoft ecosystem, cost-conscious organizations, business users
    • Strengths: Excel integration, affordable, good DAX for calculations
    • Weaknesses: Limited customization, less advanced statistical functions

    Python (matplotlib, seaborn, plotly):

    • Best for: Custom analysis, statistical modeling, automation
    • Strengths: Unlimited customization, statistical packages, reproducible
    • Weaknesses: Requires programming, not business-user friendly

    My decision framework:

    • Business users making dashboards: Power BI or Tableau
    • Complex statistical analysis: Python or R
    • One-off exploration: Tableau for speed
    • Automated reporting: Python with scheduled scripts

    32. How do you optimize Excel performance for large datasets?

    Tests practical Excel skills and performance optimization

    Answer:

    Formula optimization:

    • Replace VLOOKUP with INDEX/MATCH for better performance
    • Use XLOOKUP if available (newer Excel versions)
    • Avoid volatile functions like NOW(), RAND(), INDIRECT
    • Use structured table references instead of entire column ranges

    Data structure:

    • Convert ranges to Tables (Ctrl+T) for better performance
    • Remove unnecessary formatting and conditional formatting
    • Delete empty rows/columns that extend far beyond data
    • Use Power Query for data transformation instead of formulas

    Calculation settings:

    • Set calculation to Manual during data entry
    • Turn off automatic screen updating during macros
    • Disable background error checking for performance-critical files

    When Excel isn't enough: For datasets approaching the worksheet limit or requiring complex joins, consider moving to Access, a SQL database, or Python. Excel caps worksheets at 1,048,576 rows and becomes sluggish with complex calculations well before that, often around ~100K rows.
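    To illustrate the "move to a SQL database" escape hatch, here is a minimal sketch using Python's built-in sqlite3 module; the table and data are invented for the example, and in practice you would load a CSV exported from Excel:

```python
import sqlite3

# Illustrative sales rows; in practice, read these from a CSV export
rows = [("North", 120.0), ("South", 80.0), ("North", 45.5)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, revenue REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)

# An aggregation that would need a pivot table in Excel is one query here,
# and it stays fast well past Excel's row limit
for region, total in conn.execute(
    "SELECT region, SUM(revenue) FROM sales GROUP BY region ORDER BY region"
):
    print(region, total)  # → North 165.5, then South 80.0
```

    SQLite handles tens of millions of rows comfortably on a laptop, which makes it a low-friction first step before committing to a full database server.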

    33. Explain the difference between OLTP and OLAP systems.

    Tests database architecture and data warehousing knowledge

    Answer:

    OLTP (Online Transaction Processing):

    • Purpose: Handle day-to-day business operations
    • Operations: INSERT, UPDATE, DELETE (frequent small transactions)
    • Design: Normalized for data integrity
    • Examples: E-commerce checkout, CRM updates, inventory management

    OLAP (Online Analytical Processing):

    • Purpose: Support business intelligence and reporting
    • Operations: SELECT (complex queries, aggregations)
    • Design: Denormalized for query performance
    • Examples: Sales reports, trend analysis, data mining

    Key differences:

    Aspect           | OLTP            | OLAP
    Data Model       | Normalized      | Star/Snowflake
    Users            | Many concurrent | Fewer analysts
    Query complexity | Simple          | Complex

    Why separation matters: Running analytical queries on transactional systems can slow down operations. That's why we build data warehouses—copy data from OLTP to OLAP systems optimized for analysis.

    34. How do you approach data modeling for a business intelligence project?

    Tests dimensional modeling and data warehouse design

    Answer:

    Step 1: Understand business requirements

    • What questions does the business need to answer?
    • What metrics are most important?
    • How will data be sliced and diced?

    Step 2: Choose modeling approach

    • Star schema: Simple, denormalized, good for most BI scenarios
    • Snowflake schema: More normalized, saves space but complex queries
    • Data vault: For agile, evolving requirements

    Step 3: Design dimensions and facts

    • Fact tables: Store measurable events (sales, clicks, transactions)
    • Dimension tables: Store descriptive attributes (customer, product, time)
    • Grain: Define lowest level of detail in fact table

    Example: E-commerce sales model

    • Fact: Orders (order_id, customer_id, product_id, date_id, quantity, revenue)
    • Dimensions: Customer (demographics), Product (category, brand), Date (year, quarter, month)
    • Grain: One row per order line item

    Best practices: Keep dimensions consistent across fact tables, use surrogate keys, implement slowly changing dimension strategies for historical tracking.
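    The e-commerce star schema above can be sketched end to end with Python's built-in sqlite3 module; table names mirror the example and the data is illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, segment TEXT);
CREATE TABLE dim_product  (product_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE fact_orders  (order_id INTEGER, customer_id INTEGER,
                           product_id INTEGER, quantity INTEGER, revenue REAL);
INSERT INTO dim_customer VALUES (1, 'consumer'), (2, 'business');
INSERT INTO dim_product  VALUES (10, 'electronics'), (11, 'books');
INSERT INTO fact_orders  VALUES (100, 1, 10, 2, 400.0), (101, 2, 11, 1, 25.0),
                                (102, 1, 11, 3, 75.0);
""")

# The typical star-schema pattern: join the fact to a dimension, then aggregate
query = """
SELECT p.category, SUM(f.revenue) AS total_revenue
FROM fact_orders f
JOIN dim_product p ON p.product_id = f.product_id
GROUP BY p.category
ORDER BY total_revenue DESC
"""
for category, total in conn.execute(query):
    print(category, total)  # → electronics 400.0, then books 100.0
```

    Note how every analytical question becomes the same shape of query: fact table joined to one or more dimensions, grouped by dimension attributes. That predictability is the main selling point of the star schema.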

    35. What's your approach to data quality monitoring and alerting?

    Tests data governance and operational excellence

    Answer:

    Automated monitoring checks:

    • Completeness: Record counts vs expected volumes
    • Freshness: Data arrival times vs SLA expectations
    • Accuracy: Range checks, format validation, business rules
    • Consistency: Cross-table relationships, referential integrity

    Alert thresholds:

    • Critical: Data missing completely, 50%+ volume drops
    • Warning: 10-20% volume changes, delayed arrivals
    • Info: Minor quality issues that don't affect analysis

    Implementation approach:

    • SQL scripts that run after each data load
    • Dashboard showing data quality metrics
    • Slack/email notifications for threshold breaches
    • Documentation of common issues and resolutions

    Example quality check:

    -- Daily volume check
    SELECT
        CASE
            WHEN today_count < yesterday_count * 0.8
            THEN 'CRITICAL: Volume drop > 20%'
            ELSE 'OK'
        END as status
    FROM (
        SELECT COUNT(*) as today_count FROM orders WHERE date = CURRENT_DATE
    ) t1,
    (
        SELECT COUNT(*) as yesterday_count FROM orders WHERE date = CURRENT_DATE - 1
    ) t2;

    Recovery process: When quality issues are detected, have runbooks for common problems and escalation procedures for unknown issues.
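    The same volume check can live in the Python script that orchestrates the load, which makes it easy to route the result to Slack or email. A minimal sketch (the function name and thresholds are illustrative):

```python
def volume_status(today_count, yesterday_count, drop_threshold=0.20):
    """Classify a day-over-day volume change against alert thresholds."""
    if yesterday_count == 0:
        return "CRITICAL: no baseline volume"
    change = (today_count - yesterday_count) / yesterday_count
    if change <= -drop_threshold:
        return f"CRITICAL: volume drop {abs(change):.0%}"
    return "OK"

print(volume_status(750, 1000))  # → CRITICAL: volume drop 25%
print(volume_status(980, 1000))  # → OK
```

    Keeping the threshold as a parameter lets the same function serve the critical, warning, and info tiers described above.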

    36. How do you handle slowly changing dimensions in data warehouses?

    Tests advanced data warehousing and historical tracking knowledge

    Answer:

    Problem: Dimension attributes change over time (customer moves, product category changes). How do we maintain historical accuracy?

    Type 1 - Overwrite:

    • Simply update the current record
    • No history maintained
    • Use for: Minor corrections, non-analytical attributes

    Type 2 - Add new record:

    • Create new row with updated values
    • Add effective_start_date, effective_end_date, is_current flag
    • Use for: Important business changes that affect analysis

    Type 3 - Add new column:

    • Keep both old and new values in separate columns
    • Limited history (usually just previous value)
    • Use for: Predictable changes like organizational restructures

    Example Type 2 implementation:

    -- Customer dimension with SCD Type 2
    customer_id | name | city | effective_start | effective_end | is_current
    1001        | John | NYC  | 2023-01-01     | 2023-06-30    | N
    1001        | John | LA   | 2023-07-01     | 9999-12-31    | Y

    Business consideration: Type 2 preserves history but increases complexity and storage. Choose based on business needs for historical analysis.
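    The Type 2 mechanics (close the current row, open a new one) can be sketched in a few lines of Python; this is an in-memory illustration of the pattern, not warehouse-grade code, and the helper name is my own:

```python
from datetime import date, timedelta

def scd2_update(history, customer_id, new_attrs, change_date):
    """Apply an SCD Type 2 change: expire the current row, append a new one."""
    for row in history:
        if row["customer_id"] == customer_id and row["is_current"]:
            row["effective_end"] = change_date - timedelta(days=1)
            row["is_current"] = False
            history.append({**row, **new_attrs,
                            "effective_start": change_date,
                            "effective_end": date(9999, 12, 31),
                            "is_current": True})
            return history
    raise ValueError(f"No current record for customer {customer_id}")

# Reproduce the John NYC → LA example from the table above
history = [{"customer_id": 1001, "name": "John", "city": "NYC",
            "effective_start": date(2023, 1, 1),
            "effective_end": date(9999, 12, 31), "is_current": True}]
scd2_update(history, 1001, {"city": "LA"}, date(2023, 7, 1))
print([r["city"] for r in history if r["is_current"]])  # → ['LA']
```

    In a real warehouse this logic usually runs as a MERGE statement or a dbt snapshot, but the row lifecycle is exactly this.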

    37. Explain ETL vs ELT. When would you choose each approach?

    Tests data pipeline architecture and modern data stack knowledge

    Answer:

    ETL (Extract, Transform, Load):

    • Traditional approach: Transform data before loading to warehouse
    • Processing happens in dedicated ETL tools
    • Data arrives clean and structured
    • Tools: Informatica, SSIS, Talend, Airflow

    ELT (Extract, Load, Transform):

    • Modern approach: Load raw data first, transform in warehouse
    • Leverages warehouse computing power for transformations
    • Raw data preserved for future reprocessing
    • Tools: Fivetran, Stitch, dbt, cloud data warehouses

    When to choose ETL:

    • Limited warehouse compute resources
    • Strict data governance requiring pre-validation
    • Complex transformations better suited for specialized tools
    • Need to mask/encrypt sensitive data before loading

    When to choose ELT:

    • Cloud warehouses with abundant compute (Snowflake, BigQuery)
    • Agile analytics requiring fast iterations
    • Multiple transformation needs from same source data
    • Real-time or near-real-time requirements

    Current trend: ELT is becoming dominant due to cloud warehouse performance and cost-effectiveness. Most companies I work with are moving from ETL to ELT architectures.

    38. How do you design reports that scale with growing data volumes?

    Tests performance optimization and scalability planning

    Answer:

    Data architecture strategies:

    • Aggregation tables: Pre-calculate common summaries
    • Partitioning: Segment data by date/region for faster queries
    • Indexing: Strategic indexes on commonly filtered columns
    • Materialized views: Pre-computed complex queries

    Report design principles:

    • Default filters: Start with limited date ranges
    • Progressive loading: Show summary first, details on demand
    • Caching: Store frequently accessed results
    • Sampling: Use representative subsets for exploratory analysis

    Performance monitoring:

    • Track query execution times
    • Monitor resource usage patterns
    • Alert when reports exceed SLA thresholds

    Example scaling solution: Created an executive dashboard that went from 30-second load times to < 3 seconds by implementing hourly aggregation tables and moving from row-by-row processing to set-based operations.

    Technology choices: Consider columnar databases (Redshift, Snowflake) for analytical workloads vs row-based (MySQL, PostgreSQL) for transactional systems.
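    The hourly aggregation idea behind that dashboard fix can be sketched in plain Python; the event data and bucket granularity here are invented for illustration:

```python
from collections import defaultdict
from datetime import datetime

# Illustrative raw events; in production a scheduled job would write
# these rollups to an aggregation table the dashboard queries instead
events = [
    ("2026-01-23 09:05", 120.0),
    ("2026-01-23 09:40", 80.0),
    ("2026-01-23 10:15", 45.5),
]

hourly = defaultdict(lambda: {"orders": 0, "revenue": 0.0})
for ts, revenue in events:
    hour = datetime.strptime(ts, "%Y-%m-%d %H:%M").strftime("%Y-%m-%d %H:00")
    hourly[hour]["orders"] += 1
    hourly[hour]["revenue"] += revenue

for hour, agg in sorted(hourly.items()):
    print(hour, agg)
# → 2026-01-23 09:00 {'orders': 2, 'revenue': 200.0}
#   2026-01-23 10:00 {'orders': 1, 'revenue': 45.5}
```

    The payoff is that the dashboard reads a few thousand pre-aggregated rows instead of scanning millions of raw events on every load.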

    39. What's your experience with Python/R for data analysis? When do you use each?

    Tests programming skills and tool selection judgment

    Answer:

    Python strengths:

    • Libraries: pandas, NumPy, scikit-learn, matplotlib
    • Versatility: Data analysis, web scraping, automation, ML
    • Integration: Easy to connect to databases, APIs, web services
    • Production: Better for productionizing models and pipelines

    R strengths:

    • Statistics: Built for statistical analysis from ground up
    • Visualization: ggplot2 for publication-quality charts
    • Packages: CRAN repository with specialized statistical packages
    • Research: Preferred in academic and research environments

    When I choose Python:

    • Data pipeline automation and ETL processes
    • Machine learning model deployment
    • Integration with existing Python tech stack
    • Web scraping and API interactions

    When I choose R:

    • Advanced statistical modeling
    • Exploratory data analysis with complex visualizations
    • Research projects requiring specialized statistical methods
    • Reports requiring publication-quality graphics

    Practical example: Used Python to automate daily sales data extraction and cleaning, then R for statistical significance testing of marketing campaigns. Python handled the operational work, R provided the analytical depth.

    40. How do you stay current with evolving data analysis tools and techniques?

    Tests commitment to continuous learning and professional development

    Answer:

    Formal learning:

    • Online courses: Coursera, edX, Udemy for new tools and techniques
    • Certifications: Cloud provider certs (AWS, Azure, GCP)
    • Conferences: Strata, Tableau Conference, local meetups
    • Webinars: Vendor presentations on new features

    Community engagement:

    • Forums: Stack Overflow, Reddit r/analytics, industry Slack channels
    • Blogs: Towards Data Science, vendor blogs, thought leaders
    • LinkedIn: Following industry experts and joining data groups
    • GitHub: Exploring open source projects and contributing

    Hands-on practice:

    • Side projects: Kaggle competitions, personal data projects
    • Experimentation: Testing new tools on small work projects
    • Version updates: Regularly updating tools and learning new features

    Knowledge sharing:

    • Internal presentations on new tools or techniques
    • Mentoring junior analysts
    • Writing documentation and best practices

    My approach: I dedicate 2-3 hours weekly to learning. Recently learned dbt for analytics engineering and Streamlit for data apps. The key is balancing depth in core tools with breadth in emerging technologies.

    What Gets You Hired vs What Gets You Rejected

    ✗ What Gets You Rejected

    • Technical knowledge without business context
    • Can't explain findings to non-technical stakeholders
    • Focuses only on data accuracy, ignores actionable insights
    • Doesn't question data sources or validate assumptions
    • Creates complex visualizations that confuse rather than clarify
    • No experience with stakeholder management
    • Can't prioritize analysis requests based on business impact

    ✓ What Gets You Offers

    • Translates data insights into business recommendations
    • Asks clarifying questions about business context
    • Shows understanding of data limitations and uncertainty
    • Demonstrates experience with stakeholder communication
    • Can explain complex concepts in simple terms
    • Shows curiosity about the business domain
    • Provides examples of analysis driving business decisions

    Pro Tips from Hiring Managers

    Always start with business questions

    Before diving into technical implementation, clarify what business problem you're solving and how success will be measured.

    Think aloud during technical questions

    Explain your reasoning: "I'm choosing this visualization because the audience needs to compare categories" shows analytical thinking.

    Prepare specific project stories

    Have 3-4 detailed examples ready: problem faced, approach taken, results achieved, and lessons learned.

    Show impact, not just process

    "My analysis led to 15% increase in conversion rate" is better than "I created a dashboard with 12 metrics."

    Data analyst roles are evolving rapidly. Companies want analysts who can bridge the gap between raw data and business strategy. The questions in this guide reflect what's actually being asked in 2026—from technical SQL skills to strategic business thinking. Focus on demonstrating how your analysis drives decision-making, and you'll stand out from candidates who only know the tools.

    Ready to Practice Your Data Analysis Interview?

    LastRound AI helps you practice these exact questions with personalized feedback. Get expert insights on your SQL queries, statistical explanations, and business communication skills.

    Start AI Mock Interview