The Database Questions That Stumped Me at Oracle, Microsoft, and Amazon
After 15 years managing databases that handle terabytes of data and surviving 200+ DBA interviews, I've learned that great database administrators aren't just query optimizers—they're system guardians who prevent disasters before they happen.
My most embarrassing DBA interview was at a Fortune 500 company. They showed me their production database performance graphs—red lines everywhere, response times in seconds, not milliseconds. "What would you do?" they asked. I started talking about adding indexes. Wrong answer. They interrupted: "That's cute, but we need someone who can handle a 5TB database with 10,000 concurrent users where a 5-minute outage costs us $2 million."
That moment taught me that DBA interviews aren't about knowing SQL syntax. They're about understanding that databases are the beating heart of every business application. When a database goes down, everything stops. The best DBAs don't just maintain databases—they architect data ecosystems that never fail, scale infinitely, and recover from disasters in minutes.
This guide covers 35 questions that separate junior database administrators from senior database architects. Each answer reflects real-world experience managing mission-critical systems, complete with disaster recovery stories, performance optimization strategies, and the security practices that keep data safe.
What Database Admin Interviewers Evaluate
- SQL Mastery: Query optimization, indexing strategies, execution plan analysis
- Performance Tuning: Bottleneck identification, resource management, capacity planning
- Backup & Recovery: Disaster recovery planning, point-in-time recovery, business continuity
- Security & Compliance: Access controls, encryption, audit trails, regulatory compliance
- High Availability: Replication, clustering, failover strategies, monitoring
SQL Optimization & Indexing (Questions 1-8)
1. How do you identify and fix slow-running queries?
Tests systematic approach to query performance analysis
Answer:
- Identify slow queries: Use query logs, performance monitoring tools, or sys.dm_exec_query_stats
- Analyze execution plans: EXPLAIN ANALYZE in PostgreSQL, EXPLAIN PLAN in Oracle, or actual execution plans in SQL Server Management Studio
- Look for common issues: Missing indexes, full table scans, excessive joins, non-sargable predicates (e.g., functions wrapped around indexed columns)
- Optimize systematically: Add indexes, rewrite queries, update statistics, consider partitioning
- Test and measure: Compare before/after performance metrics
# PostgreSQL slow query analysis
-- Enable slow query logging (queries slower than 1000 ms)
ALTER SYSTEM SET log_min_duration_statement = 1000;
SELECT pg_reload_conf(); -- apply the setting without a restart
-- Analyze execution plan
EXPLAIN (ANALYZE, BUFFERS, VERBOSE) SELECT * FROM users WHERE email = 'user@example.com';
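The "test and measure" step can be automated with a small timing harness. This Python sketch reports median latency so a single outlier doesn't skew the before/after comparison; the lambdas are stand-in workloads, where a real harness would call your database driver instead.

```python
import statistics
import time

def time_query(run_query, iterations=5):
    """Time a query callable several times; return the median latency in ms."""
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        run_query()
        samples.append((time.perf_counter() - start) * 1000)
    return statistics.median(samples)

# Stand-in workloads for illustration; swap in real query executions
before = time_query(lambda: sum(i * i for i in range(200_000)))  # "unoptimized"
after = time_query(lambda: sum(i * i for i in range(20_000)))    # "optimized"
print(f"median before: {before:.1f} ms, after: {after:.1f} ms")
```

Using the median of several runs also smooths out cache warm-up effects, which otherwise make the first execution look artificially slow.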
2. Explain the different types of database indexes and when to use them.
Answer:
- B-Tree Index (Default): Best for equality and range queries, maintains sorted order
- Hash Index: Fast for equality lookups, not suitable for range queries
- Bitmap Index: Efficient for low-cardinality columns with few distinct values
- Partial Index: Index only rows meeting certain conditions to save space
- Composite Index: Multiple columns, order matters for query optimization
- Unique Index: Enforces uniqueness while improving query performance
- Full-text Index: For text search operations in large text fields
# Index examples
-- Composite index (order matters)
CREATE INDEX idx_user_email_status ON users(email, status);
-- Partial index
CREATE INDEX idx_active_users ON users(email) WHERE status = 'active';
3. How do you determine if an index is being used effectively?
Answer:
Methods to check index usage:
- Execution plans: Look for index seeks vs. table scans
- Index usage statistics: sys.dm_db_index_usage_stats in SQL Server
- pg_stat_user_indexes: PostgreSQL index usage statistics
- Query performance monitoring: Before/after query execution times
# PostgreSQL index usage analysis
SELECT schemaname, relname, indexrelname, idx_scan, idx_tup_read, idx_tup_fetch
FROM pg_stat_user_indexes
WHERE idx_scan = 0;
-- Shows indexes never used by any scan
Red flags: Indexes with zero usage, queries still doing table scans, high maintenance overhead
4. What is query execution plan analysis and how do you use it?
Answer:
Execution plan analysis: Understanding how the database engine executes your queries
Key metrics to analyze:
- Cost estimates: Relative cost of different operations
- Row estimates vs. actual: Large gaps suggest outdated statistics
- Join types: Nested loop, hash join, merge join efficiency
- Index usage: Index seeks vs. scans vs. table scans
- I/O operations: Logical and physical reads
# Reading execution plans
-- Look for these warning signs:
-- Table Scan (expensive for large tables)
-- Key Lookup (missing covering index)
-- Hash Match (potential missing join index)
-- High estimated vs actual row counts
5. How do you handle database deadlocks?
Answer:
Deadlock prevention strategies:
- Consistent lock ordering: Always acquire locks in the same order across transactions
- Keep transactions short: Minimize lock hold time
- Use appropriate isolation levels: READ COMMITTED instead of SERIALIZABLE when possible
- Add proper indexes: Reduce lock duration
- Batch operations: Process large updates in smaller chunks
# Deadlock detection and handling
-- SQL Server: pull deadlock graphs from the system_health Extended Events session
SELECT CAST(event_data AS XML) AS deadlock_graph
FROM sys.fn_xe_file_target_read_file('system_health*.xel', NULL, NULL, NULL)
WHERE object_name = 'xml_deadlock_report';
When deadlocks occur: Implement retry logic with exponential backoff in application code
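The retry logic mentioned above might look like this in application code. This is a sketch: `DeadlockError` stands in for whatever deadlock-victim exception your driver raises (e.g., SQLSTATE 40P01 in PostgreSQL, error 1205 in SQL Server).

```python
import random
import time

class DeadlockError(Exception):
    """Stand-in for a driver-specific deadlock-victim error."""

def run_with_retry(txn, max_attempts=4, base_delay=0.05):
    """Run a transaction callable, retrying deadlocks with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return txn()
        except DeadlockError:
            if attempt == max_attempts:
                raise
            # Back off 50ms, 100ms, 200ms... plus jitter to de-synchronize retries
            time.sleep(base_delay * 2 ** (attempt - 1) + random.uniform(0, 0.01))

# Example: a transaction that loses the deadlock twice, then succeeds
attempts = {"n": 0}
def flaky_txn():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise DeadlockError()
    return "committed"

print(run_with_retry(flaky_txn))  # committed
```

The jitter matters: if both deadlocked transactions retry on the same schedule, they can deadlock again on every attempt.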
6. What are database statistics and why are they important?
Answer:
Database statistics: Metadata about data distribution that helps the query optimizer make decisions
Types of statistics:
- Column statistics: Data distribution, min/max values, null count
- Index statistics: Key distribution, density, selectivity
- Table statistics: Row count, page count, modification count
# Updating statistics
-- PostgreSQL
ANALYZE users;
-- SQL Server
UPDATE STATISTICS users WITH FULLSCAN;
-- Oracle
EXEC DBMS_STATS.GATHER_TABLE_STATS('SCHEMA', 'USERS');
Best practices: Schedule regular statistics updates, monitor stale statistics, update after bulk data loads
7. How do you optimize queries with multiple joins?
Answer:
Join optimization strategies:
- Index foreign keys: Ensure all join columns are indexed
- Filter early: Apply WHERE clauses before joins when possible
- Choose optimal join order: Join smaller result sets first
- Use appropriate join types: INNER vs. LEFT JOIN based on requirements
- Consider denormalization: For frequently joined tables in read-heavy workloads
# Join optimization example
-- Optimized query structure
SELECT u.name, p.title FROM users u
INNER JOIN posts p ON u.user_id = p.user_id
WHERE u.status = 'active' -- Filter early
AND p.created_at > '2025-01-01';
-- Ensure indexes on user_id, status, created_at
8. What is query plan caching and how does it affect performance?
Answer:
Query plan caching: Database engines cache execution plans to avoid recompiling identical queries
Benefits:
- Reduced CPU usage (no recompilation)
- Faster query execution startup
- Better resource utilization
Potential issues:
- Parameter sniffing (plan optimized for specific parameters)
- Stale plans after statistics updates
- Memory pressure from excessive plan cache
# Managing plan cache
-- Force plan recompilation (SQL Server)
EXEC sp_recompile 'table_name';
-- Clear plan cache (use carefully)
DBCC FREEPROCCACHE;
Performance Tuning & Monitoring (Questions 9-16)
9. How do you identify and resolve database performance bottlenecks?
Tests systematic performance troubleshooting methodology
Answer:
Performance bottleneck identification process:
- Monitor key metrics: CPU, memory, disk I/O, network, wait events
- Identify resource contention: Lock waits, I/O waits, CPU bound operations
- Analyze slow queries: Top time-consuming queries and execution patterns
- Review system configuration: Buffer pools, connection limits, parallelism settings
- Check hardware resources: Disk performance, memory allocation, CPU utilization
# Key performance queries
-- PostgreSQL: Active queries and wait events
SELECT query, state, wait_event_type, wait_event
FROM pg_stat_activity WHERE state != 'idle';
-- SQL Server: Wait statistics
SELECT TOP 10 wait_type, wait_time_ms, waiting_tasks_count
FROM sys.dm_os_wait_stats ORDER BY wait_time_ms DESC;
10. What are the key database performance metrics you monitor?
Answer:
Critical database metrics:
Performance Metrics:
- Query response time (avg, p95, p99)
- Transactions per second (TPS)
- Queries per second (QPS)
- Buffer cache hit ratio
- Index usage statistics
Resource Metrics:
- CPU utilization
- Memory usage and allocation
- Disk I/O (IOPS, throughput, latency)
- Connection count vs. limits
- Lock waits and deadlocks
Alerting thresholds: CPU >80%, slow queries >5sec, deadlocks >0, connection usage >90%
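Those thresholds can be encoded directly in monitoring glue code. A minimal Python sketch, where the metric names are illustrative rather than taken from any particular monitoring tool:

```python
# Thresholds from the text: CPU >80%, slow queries >5s, deadlocks >0, connections >90%
THRESHOLDS = {
    "cpu_percent": 80,
    "slowest_query_seconds": 5,
    "deadlock_count": 0,
    "connection_usage_percent": 90,
}

def evaluate_alerts(metrics):
    """Return the metric names that breach their alert thresholds."""
    return sorted(name for name, limit in THRESHOLDS.items()
                  if metrics.get(name, 0) > limit)

sample = {"cpu_percent": 92, "slowest_query_seconds": 1.2,
          "deadlock_count": 1, "connection_usage_percent": 75}
print(evaluate_alerts(sample))  # ['cpu_percent', 'deadlock_count']
```

Keeping thresholds in one data structure rather than scattered `if` statements makes them easy to review and tune per environment.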
11. How do you handle database connection pooling and connection limits?
Answer:
Connection pooling benefits:
- Reduces connection establishment overhead
- Controls concurrent connections to prevent resource exhaustion
- Improves application performance and scalability
Configuration considerations:
- Pool size: Balance between resource usage and concurrency needs
- Connection timeout: How long to wait for available connection
- Idle timeout: Close unused connections to free resources
- Health checks: Validate connections before reuse
# Connection pool configuration example
-- PostgreSQL max_connections
ALTER SYSTEM SET max_connections = 200;
-- Application-side pool settings (illustrative)
pool_size = 20            # per application instance
connection_timeout = 30   # seconds to wait for a free connection
idle_timeout = 600        # seconds before closing an idle connection
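To make the pool semantics concrete, here is a minimal Python sketch of a fixed-size pool with an acquire timeout. A production pool would also add health checks and idle-connection reaping; the dict connection is a stand-in for a real driver connection.

```python
import queue

class ConnectionPool:
    """Minimal fixed-size pool sketch: blocks (up to a timeout) when exhausted."""
    def __init__(self, create_conn, size=5, timeout=30):
        self._pool = queue.Queue(maxsize=size)
        self._timeout = timeout
        for _ in range(size):
            self._pool.put(create_conn())

    def acquire(self):
        # Raises queue.Empty if no connection frees up within the timeout
        return self._pool.get(timeout=self._timeout)

    def release(self, conn):
        self._pool.put(conn)

# create_conn would normally open a real DB connection; a dict stands in here
pool = ConnectionPool(lambda: {"connected": True}, size=2, timeout=1)
c1, c2 = pool.acquire(), pool.acquire()
pool.release(c1)
c3 = pool.acquire()   # reuses the released connection instead of opening a new one
print(c3 is c1)       # True
```

The blocking `acquire` is what enforces the concurrency cap: when the pool is exhausted, callers wait rather than piling new connections onto the database.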
12. What is database partitioning and when would you implement it?
Answer:
Database partitioning: Dividing large tables into smaller, manageable pieces
Types of partitioning:
- Range partitioning: Based on value ranges (dates, IDs)
- Hash partitioning: Even distribution using hash function
- List partitioning: Based on specific values (regions, categories)
- Composite partitioning: Combination of multiple methods
When to partition:
- Very large tables (e.g., approaching available memory)
- Query performance degradation
- Maintenance window requirements
- Data archival needs
# PostgreSQL range partitioning example
CREATE TABLE sales (id SERIAL, sale_date DATE, amount DECIMAL)
PARTITION BY RANGE (sale_date);
CREATE TABLE sales_2025 PARTITION OF sales
FOR VALUES FROM ('2025-01-01') TO ('2026-01-01');
13. How do you manage database memory allocation and buffer pools?
Answer:
Memory management components:
- Buffer pool: Caches frequently accessed data pages in memory
- Sort memory: Memory allocated for sorting operations
- Hash memory: Memory for hash joins and aggregations
- Connection memory: Memory per database connection
# Memory configuration examples
-- PostgreSQL memory settings (postgresql.conf)
shared_buffers = '4GB'          # commonly ~25% of total RAM
work_mem = '4MB'                # per sort/hash operation, not per connection
maintenance_work_mem = '64MB'   # for VACUUM, CREATE INDEX, etc.
-- SQL Server buffer pool
EXEC sp_configure 'max server memory (MB)', 8192;
RECONFIGURE;
Best practices: Allocate 25% of RAM to buffer pool, monitor buffer hit ratios, adjust based on workload
14. What is database caching and how do you implement effective caching strategies?
Answer:
Database caching layers:
- Query result cache: Cache frequently executed query results
- Application-level cache: Redis/Memcached for computed values
- Database buffer cache: Built-in page caching
- Connection pool cache: Reuse prepared statements
Cache invalidation strategies:
- TTL-based: Time-based expiration
- Event-driven: Invalidate on data changes
- Manual: Application-controlled cache clearing
# Caching implementation example
-- PostgreSQL query result caching
-- Use materialized views for complex aggregations
CREATE MATERIALIZED VIEW user_stats AS
SELECT user_id, COUNT(*) as post_count FROM posts GROUP BY user_id;
REFRESH MATERIALIZED VIEW user_stats;
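A TTL-based cache with event-driven invalidation can be sketched in a few lines of Python. The injectable clock exists only to make expiry deterministic in tests, and the `user_stats:42` key is illustrative.

```python
import time

class TTLCache:
    """Minimal TTL cache sketch; `clock` is injectable so expiry is testable."""
    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}  # key -> (value, stored_at)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, stored_at = entry
        if self._clock() - stored_at > self._ttl:
            del self._store[key]   # expired: evict and report a miss
            return None
        return value

    def put(self, key, value):
        self._store[key] = (value, self._clock())

    def invalidate(self, key):
        # Event-driven invalidation: call this when the underlying row changes
        self._store.pop(key, None)

# Fake clock to demonstrate TTL expiry deterministically
now = [0.0]
cache = TTLCache(ttl_seconds=60, clock=lambda: now[0])
cache.put("user_stats:42", {"post_count": 17})
print(cache.get("user_stats:42"))  # {'post_count': 17}
now[0] = 61.0
print(cache.get("user_stats:42"))  # None (expired)
```

In practice you would combine both strategies: a short TTL as a safety net, plus explicit `invalidate` calls on writes so readers rarely see stale data.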
15. How do you handle database maintenance tasks like reindexing and statistics updates?
Answer:
Regular maintenance tasks:
- Index maintenance: Rebuild fragmented indexes, remove unused indexes
- Statistics updates: Keep query optimizer information current
- Database cleanup: VACUUM in PostgreSQL; DBCC SHRINKFILE in SQL Server (use sparingly, as shrinking fragments indexes)
- Log file management: Archive and truncate transaction logs
# Maintenance script examples
-- PostgreSQL maintenance
VACUUM ANALYZE users;
REINDEX TABLE users;
-- SQL Server maintenance
ALTER INDEX ALL ON users REBUILD;
UPDATE STATISTICS users WITH FULLSCAN;
Scheduling: Run during low-traffic periods, automate with cron jobs or SQL Server Agent, monitor execution times
16. What tools and techniques do you use for database capacity planning?
Answer:
Capacity planning metrics:
- Storage growth: Historical data growth rates and projections
- Performance trends: CPU, memory, and I/O utilization over time
- Connection usage: Peak concurrent connections and growth patterns
- Query complexity: Increasing complexity of analytical queries
Planning methodology:
- Collect baseline metrics for 3-6 months
- Analyze growth trends and seasonal patterns
- Project future requirements with business growth
- Plan for 2-3x capacity headroom
- Regular review and adjustment
Tools: Monitoring solutions (Datadog, New Relic), database-specific tools (pg_stat_statements, SQL Server DMVs)
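The projection step reduces to compound growth plus the 2-3x headroom multiplier. A quick Python sketch, where the 500 GB starting size and 5% monthly growth rate are illustrative numbers:

```python
def project_storage_gb(current_gb, monthly_growth_rate, months, headroom=3.0):
    """Project storage need with compound monthly growth, plus a headroom multiplier."""
    projected = current_gb * (1 + monthly_growth_rate) ** months
    return projected, projected * headroom

# 500 GB today, 5% monthly growth, 12-month horizon, 3x headroom as in the text
projected, provisioned = project_storage_gb(500, 0.05, 12)
print(f"projected: {projected:.0f} GB, provision for: {provisioned:.0f} GB")
```

Compounding matters here: 5% monthly is roughly 80% annually, far more than the 60% a linear extrapolation would suggest, which is exactly the kind of gap that causes surprise capacity crunches.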
Backup & Recovery (Questions 17-24)
17. Explain your backup strategy for a mission-critical database system.
Tests understanding of comprehensive backup and recovery planning
Answer:
Comprehensive backup strategy (3-2-1 Rule):
- 3 copies: Production database + 2 backup copies
- 2 different media: Local storage + cloud storage
- 1 offsite: Geographically separated backup location
Backup types and schedule:
- Full backups: Weekly complete database backup
- Differential backups: Daily changes since last full backup
- Transaction log backups: Every 15-30 minutes for point-in-time recovery
- Application-consistent backups: Coordinate with application quiesce
# PostgreSQL backup strategy
-- Full backup (weekly)
pg_dump -Fc database_name > weekly_backup.dump
-- Continuous archiving (WAL shipping)
archive_mode = on
archive_command = 'cp %p /backup/archive/%f'
18. How do you perform point-in-time recovery?
Answer:
Point-in-time recovery (PITR) process:
- Restore base backup: Latest full backup before target time
- Apply transaction logs: Replay WAL files up to specific timestamp
- Stop at target time: Use recovery_target_time parameter
- Verify data integrity: Check recovered data for consistency
# PostgreSQL PITR example
-- In postgresql.conf (PostgreSQL 12+; older versions used recovery.conf)
restore_command = 'cp /backup/archive/%f %p'
recovery_target_time = '2025-01-23 14:30:00'
recovery_target_action = 'pause'
-- Create an empty recovery.signal file, then start PostgreSQL to begin recovery
touch $PGDATA/recovery.signal
systemctl start postgresql
Prerequisites: WAL archiving enabled, regular base backups, documented recovery procedures
19. What is the difference between hot and cold backups?
Answer:
Hot Backups (Online):
- Database remains online and accessible
- Backup runs while transactions are in flight
- May have slight performance impact
- Requires transaction log backups for consistency
- Examples: pg_basebackup, SQL Server online backups
Cold Backups (Offline):
- Database must be shut down
- Guaranteed consistent backup
- No performance impact during backup
- Causes service downtime
- Examples: file system copy, volume snapshots
Recommendation: Use hot backups for production systems to avoid downtime, cold backups for maintenance windows
20. How do you test backup integrity and recovery procedures?
Answer:
Backup testing methodology:
- Regular restore tests: Monthly full restore to test environment
- Automated verification: RESTORE VERIFYONLY, DBCC CHECKDB against the restored copy, pg_restore --list checks
- Recovery time testing: Measure actual RTO vs requirements
- Disaster recovery drills: Full-scale recovery simulations
# Backup verification examples
-- PostgreSQL backup verification
pg_restore --list backup.dump | head -20
-- Restore to test database
pg_restore -d test_db backup.dump
-- SQL Server backup verification
RESTORE VERIFYONLY FROM DISK = 'backup.bak';
Documentation: Maintain recovery runbooks, update test results, track RTO/RPO metrics
21. What are RTO and RPO, and how do you achieve them?
Answer:
- RTO (Recovery Time Objective): Maximum acceptable downtime after a failure
- RPO (Recovery Point Objective): Maximum acceptable data loss measured in time
Strategies to achieve RTO/RPO targets:
Low RTO (< 1 hour):
- Database clustering/failover
- Hot standby servers
- Automated failover scripts
- Pre-staged recovery environments
Low RPO (< 15 minutes):
- Synchronous replication
- Frequent transaction log backups
- Real-time data mirroring
- Continuous WAL archiving
Business considerations: Balance cost vs. requirements, test regularly, document dependencies
22. How do you handle database corruption scenarios?
Answer:
Corruption detection:
- Scheduled integrity checks: DBCC CHECKDB, pg_checksums
- Monitoring alerts: I/O errors, checksum failures
- Application errors: Query failures, inconsistent results
Recovery procedures:
- Assess scope: Identify affected objects and data extent
- Stop further damage: Take database offline if necessary
- Recovery approach: Page-level restore, table restore, or full restore
- Verify integrity: Run full consistency checks after recovery
# Corruption handling examples
-- SQL Server corruption check and repair
DBCC CHECKDB('database_name') WITH NO_INFOMSGS;
-- PostgreSQL: checksum failure counts (requires data checksums enabled)
SELECT datname, checksum_failures FROM pg_stat_database WHERE datname = 'mydb';
23. What backup retention policies would you implement?
Answer:
Multi-tier retention strategy:
- Daily backups: Retain for 30 days (operational recovery)
- Weekly backups: Retain for 12 weeks (short-term archival)
- Monthly backups: Retain for 12 months (long-term archival)
- Yearly backups: Retain for 7 years (compliance/legal requirements)
Considerations for retention policy:
- Regulatory requirements (SOX, GDPR, HIPAA)
- Business recovery needs
- Storage costs and capacity
- Data lifecycle management
# Automated backup cleanup script
# Delete daily backups older than 30 days
find /backup/daily -name "*.dump" -mtime +30 -delete
# Archive monthly backups to long-term storage
aws s3 sync /backup/monthly s3://db-backups-archive/
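The tiered retention windows above translate directly into an age-out check. A Python sketch, with the day counts mirroring the policy in the text (12 weeks ≈ 84 days, 7 years ≈ 2555 days):

```python
import datetime

def retention_days(tier):
    """Retention windows from the policy above."""
    return {"daily": 30, "weekly": 84, "monthly": 365, "yearly": 7 * 365}[tier]

def should_delete(tier, backup_date, today):
    """True when a backup has aged out of its tier's retention window."""
    return (today - backup_date).days > retention_days(tier)

today = datetime.date(2025, 6, 1)
print(should_delete("daily", datetime.date(2025, 4, 1), today))    # True (61 days old)
print(should_delete("monthly", datetime.date(2025, 1, 1), today))  # False (151 days old)
```

Expressing the policy as data rather than hard-coded `find -mtime` arguments makes it auditable, which matters when retention is driven by compliance requirements.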
24. How do you backup and restore very large databases efficiently?
Answer:
Large database backup strategies:
- Parallel backups: Multiple streams to reduce backup time
- Incremental/differential: Only backup changed data
- Compressed backups: Reduce storage requirements and transfer time
- Snapshot backups: Use storage-level snapshots for consistency
- Partitioned backups: Backup table partitions independently
# Large database backup examples
-- PostgreSQL parallel backup
pg_dump -Fd -j 8 -f backup_directory database_name
-- SQL Server compressed backup
BACKUP DATABASE mydb TO DISK = 'backup.bak'
WITH COMPRESSION, MAXTRANSFERSIZE = 4194304;
Restore optimization: Parallel restore, pre-allocate file space, disable constraints during restore, rebuild indexes after restore
Security & Access Control (Questions 25-30)
25. How do you implement database security best practices?
Tests comprehensive understanding of database security principles
Answer:
Multi-layered security approach:
- Network security: Firewall rules, VPN access, port restrictions
- Authentication: Strong passwords, multi-factor authentication, certificate-based auth
- Authorization: Role-based access control, principle of least privilege
- Encryption: Data at rest and in transit encryption
- Auditing: Comprehensive logging of all database access
- Monitoring: Real-time security event detection and alerting
# Security configuration examples
-- PostgreSQL security settings
ssl = on
password_encryption = scram-sha-256
log_connections = on
log_statement = 'all'
-- Restrict network access
listen_addresses = 'localhost,10.0.0.100'
26. Explain database encryption: data at rest vs. data in transit.
Answer:
Data at Rest Encryption:
- Encrypts stored database files
- Transparent Data Encryption (TDE)
- File system level encryption
- Column-level encryption for sensitive data
- Key management and rotation
Data in Transit Encryption:
- SSL/TLS for client connections
- Certificate-based authentication
- Encrypted replication streams
- Secure backup transfers
- VPN tunnels for admin access
# Encryption implementation
-- PostgreSQL SSL configuration
ssl_cert_file = 'server.crt'
ssl_key_file = 'server.key'
ssl_ca_file = 'ca.crt'
ssl_ciphers = 'HIGH:MEDIUM:+3DES:!aNULL'
Key management: Use dedicated key management services (AWS KMS, Azure Key Vault), implement key rotation policies
27. How do you implement role-based access control in databases?
Answer:
RBAC implementation strategy:
- Define roles: Create roles based on job functions (developer, analyst, admin)
- Grant minimal permissions: Start with least privilege principle
- Use role hierarchies: Inherit permissions from parent roles
- Regular access reviews: Periodically audit and cleanup permissions
# PostgreSQL RBAC example
-- Create roles
CREATE ROLE readonly;
CREATE ROLE developer;
CREATE ROLE admin;
-- Grant permissions
GRANT SELECT ON ALL TABLES IN SCHEMA public TO readonly;
GRANT readonly TO developer;
GRANT INSERT, UPDATE, DELETE ON user_table TO developer;
-- Assign users to roles
GRANT developer TO john_doe;
Best practices: Use application roles instead of individual user accounts, document role definitions, implement approval workflow for role changes
28. What database auditing and logging practices do you implement?
Answer:
Comprehensive auditing strategy:
- Connection logging: Track all login attempts and connections
- Statement logging: Log DDL, DML operations on sensitive tables
- Privilege changes: Monitor role and permission modifications
- Data access: Track SELECT operations on confidential data
- Failed operations: Log security violations and access denials
# Audit configuration examples
-- PostgreSQL audit logging
log_statement = 'ddl' # Log all DDL statements
log_connections = on
log_disconnections = on
log_min_duration_statement = 1000 # Log slow queries
-- SQL Server audit example
CREATE SERVER AUDIT DatabaseAudit TO FILE (FILEPATH = 'C:\Audit\');
Log management: Centralized log collection, retention policies, real-time monitoring and alerting, tamper-proof storage
29. How do you secure database connections and prevent unauthorized access?
Answer:
Connection security measures:
- Network isolation: Database servers in private subnets, no direct internet access
- Firewall rules: Restrict access to specific IP addresses and ports
- SSL/TLS encryption: Mandatory encrypted connections
- Certificate authentication: Client certificates for administrative access
- Connection limits: Per-user and per-application connection limits
- IP whitelisting: Allow connections only from approved sources
# Connection security configuration
-- PostgreSQL pg_hba.conf
# Only allow SSL connections from specific IPs
hostssl all all 10.0.0.0/8 cert
hostssl all all 192.168.1.0/24 scram-sha-256
# Reject all other connections
host all all all reject
Monitoring: Track connection attempts, alert on failed authentications, monitor for unusual connection patterns
30. How do you handle database security compliance requirements (SOX, GDPR, HIPAA)?
Answer:
Compliance requirements mapping:
SOX (Sarbanes-Oxley):
- Audit trails for financial data
- Change management controls
- Segregation of duties
- Data retention policies
GDPR (Privacy):
- Data encryption requirements
- Right to erasure (data deletion)
- Data breach notification
- Privacy by design
HIPAA (Healthcare):
- PHI encryption
- Access controls and audit logs
- Business associate agreements
- Risk assessments
Implementation approach:
- Document security policies and procedures
- Implement technical safeguards (encryption, access controls)
- Regular compliance assessments and audits
- Staff training and awareness programs
- Incident response and breach notification procedures
High Availability & Replication (Questions 31-35)
31. Explain different database replication strategies and their use cases.
Tests understanding of high availability and disaster recovery architectures
Answer:
Replication types and use cases:
- Master-Slave (Read Replicas): Scale read operations, reporting workloads
- Master-Master: Active-active setup, geographic distribution
- Synchronous replication: Zero data loss, high availability (impact on performance)
- Asynchronous replication: Better performance, potential data loss during failures
- Logical replication: Cross-platform, selective table replication
# PostgreSQL streaming replication setup
-- On primary server (postgresql.conf)
wal_level = replica
max_wal_senders = 5
wal_keep_size = '512MB'   # wal_keep_segments on versions before PostgreSQL 13
-- On standby server (postgresql.conf, plus an empty standby.signal file)
primary_conninfo = 'host=primary port=5432 user=replicator'
Considerations: Network latency, failover complexity, data consistency requirements
32. How do you implement automatic failover for database systems?
Answer:
Automatic failover components:
- Health monitoring: Continuous monitoring of primary database health
- Failure detection: Network timeouts, service checks, application-level monitoring
- Failover orchestration: Automated promotion of standby to primary
- Connection redirection: Update DNS/load balancer to point to new primary
- Application notification: Inform applications of database role changes
# PostgreSQL automatic failover with Patroni
# Patroni configuration (patroni.yml excerpt)
postgresql:
  use_pg_rewind: true
  parameters:
    max_connections: 100
    wal_level: replica
# Health check endpoint exposed by Patroni's REST API
curl http://localhost:8008/health
Tools: Patroni (PostgreSQL), Always On (SQL Server), Oracle Data Guard, cloud-native solutions (AWS RDS Multi-AZ)
33. What is database clustering and when would you implement it?
Answer:
Database clustering types:
- Shared-disk clustering: Multiple nodes access shared storage (Oracle RAC)
- Shared-nothing clustering: Each node has dedicated storage (PostgreSQL-XL)
- Active-passive clustering: One active node, others as standby
- Active-active clustering: Multiple active nodes handling requests
When to implement:
- High availability requirements (99.9%+ uptime)
- Horizontal scaling needs
- Large concurrent user loads
- Zero-downtime maintenance requirements
Trade-offs: Increased complexity, higher costs, potential performance overhead, distributed lock management
Alternatives: Read replicas, connection pooling, application-level sharding, microservices architecture
34. How do you monitor database health and set up effective alerting?
Answer:
Database health monitoring dimensions:
Performance Metrics:
- Query response time percentiles
- Transactions per second
- Connection count and utilization
- Buffer cache hit ratio
- Disk I/O latency and IOPS
Health Indicators:
- Database availability/uptime
- Replication lag
- Disk space utilization
- Lock wait events
- Error log patterns
# Database monitoring queries
-- PostgreSQL health check
SELECT datname, numbackends, xact_commit, xact_rollback
FROM pg_stat_database;
-- Replication lag monitoring
SELECT client_addr, state, replay_lag
FROM pg_stat_replication;
Alerting strategy: Tiered alerting (warning/critical), alert fatigue prevention, escalation procedures, out-of-hours coverage
35. How do you plan for database disaster recovery across multiple data centers?
Answer:
Multi-data center DR strategy:
- Geographic distribution: Primary and DR sites in different regions
- Replication setup: Asynchronous replication to DR site for acceptable RPO
- Network considerations: Dedicated links, bandwidth planning, latency optimization
- Automated failover: DNS failover, application connection string updates
- Data validation: Regular consistency checks between sites
DR planning components:
- Risk assessment: Identify potential disaster scenarios
- Recovery procedures: Documented step-by-step recovery processes
- Regular testing: Quarterly DR drills and failover tests
- Communication plan: Stakeholder notification and escalation procedures
- Post-incident review: Lessons learned and plan improvements
Cloud considerations: Multi-region deployments, cross-region replication, managed disaster recovery services
The Database Administrator Mindset
After managing databases that never sleep and surviving hundreds of DBA interviews, I've learned that exceptional database administrators share these key characteristics:
✓ What Great DBAs Demonstrate:
- Proactive monitoring - preventing problems before they occur
- Performance obsession - every query matters at scale
- Security awareness - data protection is non-negotiable
- Disaster preparedness - tested backup and recovery procedures
- Automation mindset - scripting repetitive tasks for reliability
- Business understanding - balancing technical perfection with business needs
❌ Common DBA Interview Mistakes:
- Focusing only on query optimization, ignoring operational aspects
- Not understanding backup and recovery trade-offs
- Overlooking security and compliance requirements
- Failing to explain monitoring and alerting strategies
- Not considering high availability and disaster recovery
- Ignoring capacity planning and scalability concerns
The best database administrators understand that their role extends far beyond writing SQL queries. They are the guardians of business-critical data, the architects of reliable systems, and the first responders when disasters strike. Every database they manage is a foundation that entire businesses depend on—and that responsibility shapes every decision they make.
