In October 2012, the Harvard Business Review described “Data Scientist” as the “sexiest” job of the 21st century. Well, as we approach 2020 the description still holds true! The world needs more data scientists than there are available for hire. All companies – from the smallest to the biggest – want to hire for a job role that has something “Data” in its name: “Data Scientists”, “Data Analysts”, “Data Engineers” etc.
On the other hand, there’s large number of people who are trying to get a break in the Data Science industry, including people with considerable experience in other functional domains such as marketing, finance, insurance, and software engineering. You might have already invested in learning data science (maybe even at a data science bootcamp), but how confident are you for your next Data Science interview?
This blog is intended to give you a nice tour of the questions asked in a Data Science interview. After thorough research, we have compiled a list of 101 actual data science interview questions that have been asked between 2016-2019 at some of the largest recruiters in the data science industry – Amazon, Microsoft, Facebook, Google, Netflix, Expedia, etc.
If you want to know more regarding the tips and tricks for acing the interviews, watch the data science interview AMA with some of our own Data Scientists.
Data Science is an interdisciplinary field and sits at the intersection of computer science, statistics/mathematics, and domain knowledge. To be able to perform well, one needs to have a good foundation in not one but multiple fields, and it reflects in the interview. We’ve divided the questions into 6 categories:
- Machine Learning
- Data Analysis
- Statistics, Probability, and Mathematics
- Experiential/Behavioral Questions
We’ve also provided brief answers and key concepts for each question. Once you’ve gone through all the questions, you’ll have a good understanding of how well you’re prepared for your next data science interview!
As one will expect, data science interviews focus heavily on questions that help the company test your concepts, applications, and experience on machine learning. Each question included in this category has been recently asked in one or more actual data science interviews at companies such as Amazon, Google, Microsoft, etc. These questions will give you a good sense of what sub-topics appear more often than others. You should also pay close attention to the way these questions are phrased in an interview.
Machine learning concepts are not the only area in which you’ll be tested in the interview. Data pre-processing and data exploration are other areas where you can always expect a few questions. We’re grouping all such questions under this category. Data analysis is the process of evaluating data using analytical and statistical tools to discover useful insights. Once again, all these questions have been recently asked in one or more actual data science interviews at the companies listed above.
Statistics, Probability and Mathematics
As we’ve already mentioned, data science builds its foundation on statistics and probability concepts. Having a strong foundation in statistics and probability concepts is a requirement for data science, and these topics are always brought up in data science interviews. Here is a list of statistics and probability questions that have been asked in actual data science interviews.
When you appear for a data science interview your interviewers are not expecting you to come up with a highly efficient code that takes the lowest resources on computer hardware and executes it quickly. However, they do expect you to be able to use R, Python, or SQL programming languages so that you can access the data sources and at least build prototypes for solutions.
You should expect a few programming/coding questions in your data science interviews. You interviewer might want you to write a short piece of code on a whiteboard to assess how comfortable you are with coding, as well as get a feel for how many lines of codes you typically write in a given week.
Here are some programming and coding questions that companies like Amazon, Google, and Microsoft have asked in their data science interviews.
Structured Query Language (SQL)
Real-world data is stored in databases and it ‘travels’ via queries. If there’s one language a data science professional must know, it’s SQL – or “Structured Query Language”. SQL is widely used across all job roles in data science and is often a ‘deal-breaker’. SQL questions are placed early on in the hiring process and used for screening. Here are some SQL questions that top companies have asked in their data science interviews.
Capabilities don’t necessarily guarantee performance. It’s for this reason employers ask you situational or behavioral questions in order to assess how you would perform in a given situation. In some cases, a situational or behavioral question would force you to reflect on how you behaved and performed in a past situation. A situational question can help interviewers in assessing your role in a project you might have included in your resume, can reveal whether or not you’re a team player, or how you deal with pressure and failure. Situational questions are no less important than any of the technical questions, and it will always help to do some homework beforehand. Recall your experience and be prepared!
Here are some situational/behavioral questions that large tech companies typically ask:
Thanks for reading! We hope this list is able to help you prepare and eventually ace the interview!
If you need help understanding the concepts above, check out Data Science Dojo’s online data science certificate!
Like the 101 machine learning algorithms blog post, the accordion drop down lists are available for you to embed on your own site/blog post. Simply click the ’embed’ button in the lower left-hand corner, copy the iframe, and paste it within the page.