

Data Science Interview Questions 2026: SQL, ML, and Case Studies

Last updated: March 25, 2026 | 9 min read | By InterviewMan Team

ok so Dev's kitchen counter. 11pm. tuesday. pad thai leaking grease onto the counter, that place on mission street, too many napkins as usual. Dev goes "explain Type I vs Type II errors in an A/B test context." nothing. my mouth opens and nothing comes out. i did a whole master's in stats. two years of this stuff. and i am standing in his kitchen unable to say one sentence about Type I errors lol. formulas, sure. english words, nope. Dev sat back and waited. three seconds that lasted a year. that kitchen counter moment is why i am writing this, because two weeks later a VP of product at a fintech company asked me the same kind of question in round three of an onsite and my brain did the same exact thing. four hours in. SQL window functions round one. stats grilling round two. churn prediction pitch round three. then this man wants me to justify sample sizes for an A/B test and i have nothing. my master's did not save me. Dev warned me. he is a data scientist at Spotify, three years there, i owe him nine beers at this point. he told me at a bar once "DS interviews are four interviews pretending to be one" and i was on my phone not listening. lol

Dev said something that night i hated. leaning on his fridge, beer in hand, the audacity. "if you cannot explain a concept to a PM in two sentences you do not understand it for the interview." then this lunatic made me explain Bayesian updating to his girlfriend Lena. kindergarten teacher. zero stats background. she sat there eating trail mix and squinting at me while i rambled and i could not get confidence intervals into normal words. four attempts. FOUR. i was visibly red. attempt three i almost gave up and she was still squinting and eating trail mix. attempt four she nodded and something unlocked in my brain. i think about Lena's squint face every time someone asks me to explain p-values without using the word probability, or when do you use t-test vs chi-squared, or walk me through A/B test sample size math. the questions that show up in every DS loop i have heard about from anyone. same five concepts wearing different company logos. hundred forty buck textbook taught me less than trail mix lady
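for what it is worth, the A/B sample size math that kept coming up is basically one formula. here is the version i scribble before every mock, the standard two-proportion z-test sizing, stdlib only. the 10% baseline and 12% target numbers are made up for illustration, not from any real test:

```python
# per-group sample size for a two-sided two-proportion z-test;
# the 10% -> 12% conversion numbers below are invented for illustration
from math import ceil
from statistics import NormalDist

def ab_sample_size(p_base, p_treat, alpha=0.05, power=0.80):
    """Users needed PER GROUP to detect p_base -> p_treat."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)   # ~1.96 for alpha=0.05
    z_b = NormalDist().inv_cdf(power)           # ~0.84 for power=0.80
    var = p_base * (1 - p_base) + p_treat * (1 - p_treat)
    return ceil((z_a + z_b) ** 2 * var / (p_treat - p_base) ** 2)

n = ab_sample_size(0.10, 0.12)  # a few thousand users per group
```

the two-sentence PM version Dev wanted falls out of it: smaller lifts need quadratically more users, and the alpha and power knobs are just how paranoid you are about false alarms vs missed wins.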

Meta's DS phone screen though, late 2025, Dev went through it. almost entirely SQL, hard SQL. window functions. CTEs. self-joins. date math. his question was find users from a logins table whose login frequency dropped more than fifty percent month over month for three consecutive months. LAG, window functions, date grouping, twenty minutes, shared editor, go. two other people i know confirmed Meta DS is like that. SQL is the gate, write it cold or the screen is done. Google DS coding is lighter than SWE but they want real Python, Pandas specifically, hand you messy data, clean it, compute metrics, then the interviewer squints at you and goes "what does this tell us." lol. people write working code and then cannot say in english what their own output means. dead air. i have been that person. staring at a dataframe. forgetting how numbers work
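here is roughly how i would sketch Dev's login-drop question now. the schema (a `logins` table with `user_id` and `login_date`) is my guess from how he described it, the toy data is invented, and i am running it through sqlite from Python so it actually executes. it also assumes no missing months per user, which a real answer should call out:

```python
# sketch of "users whose logins dropped >50% month-over-month for three
# consecutive months"; schema and data are invented, assumes no month gaps
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE logins (user_id INTEGER, login_date TEXT)")

rows = []
# user 1: 16 -> 7 -> 3 -> 1 logins per month, halving three months running
for month, count in [("01", 16), ("02", 7), ("03", 3), ("04", 1)]:
    rows += [(1, f"2025-{month}-{d:02d}") for d in range(1, count + 1)]
# user 2: flat at 3 logins per month, should not be flagged
for month in ("01", "02", "03", "04"):
    rows += [(2, f"2025-{month}-{d:02d}") for d in (1, 2, 3)]
con.executemany("INSERT INTO logins VALUES (?, ?)", rows)

query = """
WITH monthly AS (                    -- logins per user per month
    SELECT user_id,
           strftime('%Y-%m', login_date) AS month,
           COUNT(*) AS logins
    FROM logins
    GROUP BY user_id, month
),
flagged AS (                         -- 1 if this month is <50% of the prior one
    SELECT user_id, month,
           CASE WHEN logins < 0.5 * LAG(logins)
                    OVER (PARTITION BY user_id ORDER BY month)
                THEN 1 ELSE 0 END AS dropped
    FROM monthly
)
SELECT DISTINCT user_id              -- three flagged months in a row
FROM (
    SELECT user_id,
           SUM(dropped) OVER (PARTITION BY user_id ORDER BY month
                              ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS run
    FROM flagged
)
WHERE run = 3
"""
churned = [r[0] for r in con.execute(query)]
```

the LAG plus a rolling SUM over three rows is the whole trick; under pressure people try to write one giant self-join instead and run out the twenty minutes.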

Dev watched me do a mock Google round over FaceTime and muted himself so i would not hear him laughing. he texted me after "bro you wrote correct pandas and then said nothing for forty seconds." that forty seconds is the same silence that kills you in ML rounds honestly. they are not going to ask you to derive backpropagation. Dev got "when would you pick gradient boosting over a random forest" at Amazon and i got "imbalanced classes what do you do" on a mock with him at that same kitchen counter, same pad thai grease on the counter because he never cleans, and i froze again. he listed off SMOTE and class weights and threshold tuning and precision-recall over accuracy and the business case for why false positives cost different from false negatives and sat there looking at me like Lena with the trail mix. five things. i had two of them. two out of five sounds like you read a blog post. you need all five and you need to connect them and that is what passes. Amazon ML deep-dive round is where Dev almost died though, you pick one project off your resume and they drill you for thirty minutes. what model. why. what features. evaluation. what would you change. this interviewer spent FIFTEEN minutes on feature engineering for a rec system Dev built at Spotify. fifteen. on one project. i prepped him by interrogating him about that project for an hour at his kitchen counter and by minute twenty he was stumbling on choices he made eight months prior. rewrote all his notes that night. all of them. Google goes theoretical instead, bias-variance tradeoff, model complexity, regularization. "ok but WHY does L1 produce sparse weights." you say diamond constraint region and the followup is what does that mean geometrically and why it matters for feature selection in prod. Lena would have squinted so hard at that one lol. intuition over proofs always
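two of Dev's five imbalanced-class points, threshold tuning and precision-recall over accuracy, you can actually show in a few lines. the scores here are synthetic, not from any real model, just the shape of the argument:

```python
# why accuracy lies at 5% positives, and what moving the threshold buys;
# scores are synthetic (positives skew high, negatives skew low)
import random
random.seed(0)

y = [1] * 50 + [0] * 950                                  # 5% positive class
scores = [random.uniform(0.2, 0.9) for _ in range(50)] + \
         [random.uniform(0.0, 0.5) for _ in range(950)]

def evaluate(threshold):
    preds = [int(s >= threshold) for s in scores]
    tp = sum(p and t for p, t in zip(preds, y))
    fp = sum(p and not t for p, t in zip(preds, y))
    fn = sum(not p and t for p, t in zip(preds, y))
    acc = sum(p == t for p, t in zip(preds, y)) / len(y)
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    return acc, prec, rec

acc0, _, rec0 = evaluate(1.1)       # "predict nobody": 95% accuracy, 0 recall
acc5, prec5, rec5 = evaluate(0.5)   # default-ish threshold
acc3, prec3, rec3 = evaluate(0.3)   # lower threshold: recall up, precision down
```

the business-cost point is the last line: lowering the threshold trades precision for recall, and whether that trade is good depends entirely on what a false positive costs vs a false negative, which is the sentence that separates "read a blog post" from a pass.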

case studies are where Dev impressed me honestly. "metrics dropped twelve percent last week, what happened." you build an investigation live while the interviewer plays a PM who gives you vague answers to everything. his Meta case study was "Instagram Reels daily active users dropped eight percent in Brazil, walk me through it." five minutes of clarifying questions before he proposed a single thing. all users or a segment. app update timing. seasonal patterns in Brazil specifically. those five clarifying minutes impressed his interviewer more than the analysis framework that came after. i would have jumped straight to hypotheses and gotten dinged for it probably. same exact problem as the forty seconds of dead pandas silence lol. my brain wants to answer before it wants to ask. behavioral rounds use the STAR method, "tell me about a time your analysis changed a business decision." specific analysis, which stakeholder, what decision changed, measurable outcome. "did some analysis and it helped" is a no-hire answer. take-homes some companies still do, dataset, prompt, forty eight to seventy two hours, submit a notebook. Dev reviewed a friend's take-home once. XGBoost model tuned perfectly, zero explanation of why any feature engineering choice was made. not one sentence about why she picked those features. she did not advance. Dev looked at it and went "this is a kaggle dump not a memo." write it like you are convincing a VP to spend money. every person i talked to who got an offer wrote it that way. could you get away with a clean kaggle-style notebook? maybe. but i would not bet on it after watching that submission get rejected

company formats if you want them because Dev quizzed me on these too lol. Meta DS is SQL phone screen then onsite with product sense plus case study plus technical deep-dive plus behavioral, product sense being the Meta-specific one where you reason about metrics for their apps and it is the round Dev said felt most like being grilled by a PM who hates you. Google DS is phone screen with coding and stats then onsite with coding and ML concepts and case study and Googleyness round which is their version of "are you a person we want to eat lunch with." Amazon DS starts with an OA for SQL and basic stats then virtual loop with coding, ML deep-dive on past work, case study, behavioral with leadership principles and if you have not memorized sixteen leadership principles good luck. DS loops mix coding and talking which is why they mess people up. SQL and Python rounds feel like coding interviews where live help catches syntax blanks or reminds you of a window function your brain dropped (mine dropped LAG during my Meta screen, of all things lol). stats and case study rounds are more conversational, a nudge on a framework step keeps you from spiraling

i used InterviewMan during my second cycle. SQL round, it flagged me writing a correlated subquery when a window function would be cleaner. i KNOW window functions. pressure made me reach for whatever i learned first, which is the dumbest kind of mistake. case study round it surfaced "consider seasonality, product changes, data pipeline issues" when i got a metrics drop question, basically the same Brazil Reels thing Dev got but for a payments company. exactly what Dev drilled into me at that kitchen counter over pad thai. nerves wiped it all clean. mocks with it honestly taught me more than the live rounds because i saw exactly where my explanations turned into hand-waving and there was no pressure so i could actually fix them. twelve bucks a month, no caps on sessions, and after spending two years on a master's and getting wrecked by trail mix lady i was not about to be cheap about this lol. actually started looking forward to mock sessions at some point, which has never happened to me in my life. i looked around for something else first but everything was coding-only or stats-only, nothing that hit the full DS loop. stats and case study and behavioral and SQL in one tool. everything else was half a tool

here is my dumbest move. i studied SQL for a week. stats the next week. ML after that. separate little buckets like they are separate subjects. walked into my onsite and they threw all of it at me in one afternoon, four hours, everything at once, same as that fintech VP who wanted A/B test math after three rounds of everything else. Dev told me that would happen, at the bar, at his kitchen counter, probably in a text i ignored while eating pad thai. nine beers. i owe him nine beers and probably owe Lena an apology for butchering Bayesian updating in her kitchen while she was trying to eat trail mix. she still squints at me when i come over lol

