Tell me about the biggest project you worked on in the last year.

A strong candidate will focus on the big picture here rather than getting bogged down in the nitty-gritty details. They should talk about both the success of the project itself and the role it played in helping the company achieve its overarching goals. If you want more detail on the technical specifications and procedural side of the project, dig deeper with follow-up questions.

Tell me about an original algorithm you’ve created. How did you go about doing that, and for what purpose?

This question digs into the mathematical know-how of your candidate. Listen for answers that discuss how they gathered the appropriate data, what skills or tools they used to build the algorithm—like database design, SQL, normal forms, table design, and indexing—and what that algorithm helped them discover. A great answer will also dive into the value their algorithm added to the business.

How do you keep your data clean, and why is clean data important?

“Dirty” data can wreak havoc on an organization and the people it serves. The candidate should recognize that cleaning is essential when data comes from multiple sources and that, despite the time-consuming nature of the process, it is important for improving the accuracy of machine learning models. Since data cleaning does take time, their answer should indicate that they factor it into their time estimates for analytic tasks and that they follow best practices for keeping data clean.
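
To make the multiple-sources point concrete, here is a minimal sketch of the kind of cleanup a candidate might describe. Everything in it is an assumption made for illustration: the `crm` and `billing` sources, the `name` and `email` fields, and the rule that email is the de-duplication key are all hypothetical, and real pipelines usually need far more validation than this.

```python
def clean_records(records):
    """Normalize, validate, and de-duplicate records (illustrative rules only)."""
    seen_emails = set()
    cleaned = []
    for rec in records:
        # Normalize: trim whitespace and lowercase the (hypothetical) key field.
        email = (rec.get("email") or "").strip().lower()
        # Collapse repeated spaces and standardize capitalization of names.
        name = " ".join((rec.get("name") or "").split()).title()
        # Validate: drop rows that have no usable key.
        if "@" not in email:
            continue
        # De-duplicate: keep only the first record seen for each email.
        if email in seen_emails:
            continue
        seen_emails.add(email)
        cleaned.append({"name": name, "email": email})
    return cleaned


# Two hypothetical sources that describe overlapping customers differently.
crm = [
    {"name": "  ada  lovelace", "email": "Ada@Example.com "},
    {"name": "Grace Hopper", "email": None},
]
billing = [
    {"name": "Ada Lovelace", "email": "ada@example.com"},
    {"name": "alan turing", "email": "alan@example.com"},
]

result = clean_records(crm + billing)
print(result)
```

In this toy input, the two Ada Lovelace rows collapse into one and the row with a missing email is dropped. A candidate who can explain choices like these (which field is the key, what counts as invalid) is showing the judgment this question is looking for.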

What analytics tools and coding languages have you used? Which are you the most proficient in?

Depending on the requirements of the role, you may be looking for different answers here. The candidate may mention coding languages like Python and Java, and may have experience using data visualization tools. But if they’re not proficient in the tools and languages the role requires, they should be able to demonstrate an aptitude for picking up new languages quickly.

What is a linear regression, and what are some alternatives—and their advantages or disadvantages?

This is just one example of a specific, technical question you can use to test whether a data scientist candidate knows their stuff. But in addition to the basic technical knowledge they display, listen to how the candidate answers the question. Can they communicate a complex topic clearly, without using a lot of jargon? If so, they may be better suited to explain data science initiatives to leaders and other staff, many of whom will have little to no understanding of data concepts.
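
If you want a reference point for judging answers, the sketch below fits a straight line two ways: ordinary least squares (the usual meaning of "linear regression") and the Theil-Sen estimator, one possible alternative that takes the median of pairwise slopes. The data is made up for illustration, and Theil-Sen is only one of many alternatives a candidate might name.

```python
from itertools import combinations
from statistics import mean, median


def ols_fit(xs, ys):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def theil_sen_fit(xs, ys):
    """Theil-Sen estimator: median of all pairwise slopes, then median intercept."""
    points = list(zip(xs, ys))
    slopes = [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in combinations(points, 2)
        if x2 != x1  # skip vertical pairs to avoid division by zero
    ]
    slope = median(slopes)
    intercept = median(y - slope * x for x, y in points)
    return slope, intercept


# Made-up data: y = 2x for the first five points, plus one large outlier.
xs = [1, 2, 3, 4, 5, 6]
ys = [2, 4, 6, 8, 10, 40]

ols_slope, ols_intercept = ols_fit(xs, ys)
ts_slope, ts_intercept = theil_sen_fit(xs, ys)
print("OLS:      ", ols_slope, ols_intercept)
print("Theil-Sen:", ts_slope, ts_intercept)
```

On this data the single outlier pulls the least-squares slope up to about 6, while the median-based fit stays at 2. That is the sort of trade-off a good answer covers: least squares is simple and fast but sensitive to outliers, while robust alternatives cost more computation (here, every pair of points) in exchange for stability.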

Tell me about a data project you’ve worked on where you encountered a challenging problem. How did you respond?

When you’re working with data, the question isn’t if problems will arise, it’s when. You want to know that your candidate understands how to deal with data-related problems or errors. How do they correct the mistake? How do they communicate the problem to leaders, customers, or other stakeholders? What precautions do they put in place to prevent the same issue from occurring in the future? Data science is about constant iteration, especially in the face of unforeseen problems, so a great answer will reveal the candidate’s nimbleness and ability to adapt, as well as their problem-solving skills.

How have you used data to elevate the experience of a customer or stakeholder?

Ultimately, data science is about improving decision-making and performance—whether for the end users or for your company overall. If the candidate doesn’t understand the ultimate impact of their work, they may lack the big-picture thinking that you’re looking for, so listen out for answers that draw a clear line from data to an objective business result.

Tell me about a time when you had to clean and organize a big data set.

The candidate may mention a variety of techniques and tools here, such as value correction methods and automated cleanup tools like Paxata. Can they describe why they chose particular methods over others? Did they break up the data set to make it easier to clean? The more specific the answer, the better.

Which data scientists or startups do you admire the most and why?

A great candidate is plugged into the trends of their industry, so they should be able to name a few key players in the field, like the staff at Stack Overflow. Why do they admire these people so much? Is it their technique, their contribution to the community, or maybe an insightful blog they run? And if they mention startups, do they specifically reference companies that are making an impact on the profession?

Do you contribute to any open-source projects?

This question screens for a continuous-learning mindset, but it also tells you whether a candidate is curious and collaborative. Data scientists are known to be a collaborative bunch, sharing new ideas, knowledge, and information with one another so they can keep pace with the rapidly changing field. A candidate who is active in the wider data science community is definitely one to watch.

Why should we hire you over other applicants? What makes you best suited for the role?

Answers to this question should help you gauge what a candidate brings to the role beyond the core skills and capabilities. They might talk about their communication skills, which make them a great asset to team projects, or their analytical mindset, which lets them approach problems from a different perspective. They may also draw on previous experience that they can apply to the role to help improve processes and boost company performance.