Data scientists are trained to handle uncertainty. The data we work with, no matter how “big” it may be, remains a finite sample riddled with potential biases. Our models tread the fine line between being too simple to be meaningful and too complex to be trusted. Armed with methodologies to control for noise in our data, we simulate, test and validate everything we can. A great data scientist develops a healthy skepticism of their data, their methods and their conclusions.
Then, one day, a data scientist is promoted and presented with an entirely new challenge: Evaluating a candidate to become a member of their team. The sample size drops fast, experimentation seems impractical, and the biases in interviewing are orders of magnitude more obvious than those we carefully control for in our work. We will try to outline the goals of a new process, describe its underlying principles, and walk through the implementation of hiring senior talent. And of course, this wouldn’t be complete without looking ahead at opportunities to adapt and improve the process even further.
In developing our recruiting process, we set out to improve the following measurable objectives:
- Accuracy: Maximize the chances that new hires will become exceptional employees.
- Loss: Minimize the chances that great prospects leave the hiring funnel early.
- Success: Maximize the chance that offers will be accepted.
- Effort: Minimize the long-term distraction to the hiring team.
At first glance, any experienced manager would think that it’s impossible to improve on all four of these goals simultaneously. The first three tend to work against each other in practice (e.g., the stronger the candidate, the harder it is to get them to accept an offer). Beyond that, improving them all would seem to dictate greater ongoing effort by the team.
In a traditional hiring process, most managers feel fortunate if their accuracy is as high as 50%. That is, no more than half of their hires turn out to be exceptional. Loss is hard to measure (after all, senior candidates who fall out of the process didn’t come to work for you), and most managers worry that they regularly lose amazing talent because their process is so long and cumbersome.
And the ongoing effort that hiring requires can easily consume 20% or more of a data science team’s time. After validating this experience with other data science leaders, I sought to implement a process that could achieve the following:
- Accuracy: 90% of hires should in fact be exceptional employees.
- Loss: We should make offers to 80% of the great senior candidates who enter our funnel.
- Success: 65% of offers extended should be accepted.
- Effort: Hiring should consume less than 10% of the team’s time.
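As a rough illustration, the four targets above can be written down as a small scorecard. This is only a sketch under stated assumptions: the `HiringPeriod` fields, the function names, and the example numbers are hypothetical, and it presumes you have some way of labelling hires as "exceptional" and candidates as "great", which the article notes is hard to do for candidates who leave the funnel.

```python
from dataclasses import dataclass

# Targets taken from the list above. The key names are illustrative.
TARGETS = {
    "accuracy": 0.90,  # share of hires who turn out to be exceptional
    "offer_rate": 0.80,  # share of great candidates who receive an offer
    "acceptance": 0.65,  # share of offers that are accepted
    "effort": 0.10,  # share of team time spent on hiring (upper bound)
}


@dataclass
class HiringPeriod:
    """Hypothetical counts collected over one review period."""

    hires: int
    exceptional_hires: int
    great_candidates: int  # assumes a "great" label exists for funnel entrants
    great_with_offer: int
    offers: int
    offers_accepted: int
    hiring_hours: float
    team_hours: float


def ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def scorecard(p):
    """Compute the four metrics for one period."""
    return {
        "accuracy": ratio(p.exceptional_hires, p.hires),
        "offer_rate": ratio(p.great_with_offer, p.great_candidates),
        "acceptance": ratio(p.offers_accepted, p.offers),
        "effort": ratio(p.hiring_hours, p.team_hours),
    }


def meets_targets(scores):
    """Effort must stay below its target; the other metrics must reach theirs."""
    result = {}
    for name, value in scores.items():
        if name == "effort":
            result[name] = value < TARGETS[name]
        else:
            result[name] = value >= TARGETS[name]
    return result


# Made-up numbers, purely for illustration.
example = HiringPeriod(
    hires=20, exceptional_hires=19,
    great_candidates=25, great_with_offer=21,
    offers=30, offers_accepted=21,
    hiring_hours=80.0, team_hours=1000.0,
)
print(scorecard(example))
print(meets_targets(scorecard(example)))
```

Keeping the targets in one place makes it easy to review them each period alongside the measured values.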
By designing a hiring process that is smarter, both in identifying great senior candidates and in reducing the risk of losing them, it’s possible to improve on the first three goals at once. And, by investing heavily upfront (an investment that pays off handsomely over time), the ongoing effort and distraction to the team can be managed.
To ensure that we met our objectives, we developed a set of core principles that can be applied to hiring for any function:
Ensure your hiring process is always on and continually improving
Investing in an always-on process will force you to treat hiring as a discipline. This will drive consistency in protocol and results, enable you to collect data about your successes and failures, and force you to manage your talent pipeline with the same care you manage your data pipelines.
Make your process mirror the reality of your hiring needs
Ask senior candidates about their prior experience, and you’ll discover whether they can articulate what happened around them at other jobs. Ask them technical questions, and you’ll uncover their ability to regurgitate knowledge. Make them solve a ‘toy’ problem on a whiteboard, and you’ll discover how quickly they solve toy problems. A candidate who passes all of these hurdles with flying colors may still be a completely ineffective data scientist in practice.
To address these flaws, you must first have a very clear understanding of how you want senior candidates to perform data science. At the highest level, you should be clear on the end product your team will produce. Next, you should have a clear understanding of what you want successful candidates to do. Identify five opportunities you would love to see a data scientist tackle. For each, ensure that you have (or could reasonably collect) the data required, and can envision a solution that would be effective even if you couldn’t design it yourself. Once you know how your team performs data science and which challenges you most want candidates to be able to handle, you can design a hiring process that closely reflects your working conditions. This means you should put senior candidates into an environment that closely resembles what their ‘day-to-day’ would be.
Run objective evaluations first to minimize your biases
Senior candidates who would be top performers may fail a traditional interview process. The culprit is interviewer bias. As soon as you enter the room with a candidate, you begin forming opinions (mostly unconscious) about their abilities. There is a wide array of such biases, but the most common one in interviewing is the tendency to prefer people who are similar to ourselves.
Great data scientists must have very strong quantitative and programming skills. That’s non-negotiable. So we designed our process to test these skills first, then move on to more subjective (yet still measurable) skills like problem solving and communication. Only at the end do we get to the most subjective of all — how the candidate works on a team and fits into the culture.
These later-stage, more subjective criteria are the most time-consuming to evaluate and are where biases are most likely to creep in. Moving them late in the funnel has the combined benefit of reducing the load on the team (we don’t evaluate culture fit until we’re confident they have the skills we need) and minimizing the risk of losing a great candidate prematurely.
Move faster than the market
The market for great data science talent is incredibly competitive, so your process should ensure that you move candidates through your funnel as quickly as possible, keeping momentum high and minimizing the chance that they accept a competing offer. Moving fast requires a streamlined process that allows you to build confidence as well as speed. Invest in tools and logistics to track how long candidates stay in each stage of your funnel, and aggressively change your system to keep your edge.
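One lightweight way to track how long candidates stay in each stage is to log when each candidate enters and leaves a stage and then summarize the durations. The sketch below is illustrative only: the stage names, the record layout, and the dates are made up, and the median is just one reasonable summary.

```python
from collections import defaultdict
from datetime import date
from statistics import median

# Hypothetical log: (candidate_id, stage, date_entered, date_exited).
# date_exited is None while the candidate is still in that stage.
STAGE_LOG = [
    ("c1", "take_home", date(2024, 1, 2), date(2024, 1, 6)),
    ("c2", "take_home", date(2024, 1, 3), date(2024, 1, 10)),
    ("c3", "take_home", date(2024, 1, 5), date(2024, 1, 7)),
    ("c1", "data_day", date(2024, 1, 8), date(2024, 1, 9)),
    ("c2", "data_day", date(2024, 1, 12), date(2024, 1, 15)),
    ("c3", "data_day", date(2024, 1, 9), None),
]


def median_days_per_stage(log):
    """Median number of days spent in each stage, ignoring open records."""
    durations = defaultdict(list)
    for _candidate, stage, entered, exited in log:
        if exited is None:
            continue  # candidate has not left this stage yet
        durations[stage].append((exited - entered).days)
    return {stage: median(days) for stage, days in durations.items()}


for stage, days in median_days_per_stage(STAGE_LOG).items():
    print(f"{stage}: median {days} days")
```

Reviewing this summary regularly shows which stage is slowing candidates down, which is where a process change is most likely to pay off.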
Cast the widest possible net for available talent, and attract candidates with a challenging problem and an intriguing offer of employment. At the highest level, this interviewing process has two key components:
- Take-home test: A short exercise that tests a candidate’s ability to solve a series of increasingly difficult challenges.
- Data Day: A full day spent working beside the team on a more open-ended challenge, concluding with a presentation of their work to a group.
The key levers to pull here are (A) the quality of the applicants in the funnel, (B) the success rates in submitting take-home tests and attending a Data Day, and (C) the accuracy of the take-home test and Data Day filters. By tracking your candidates through this funnel and examining the loss at each stage by channel (i.e., where they came from), you can begin to identify higher-performing channels and the stages in your funnel that are filtering too aggressively.
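The per-channel funnel analysis described above can be sketched in a few lines. This is a minimal illustration, not the process's actual tooling: the stage names, channel names, and candidate records are hypothetical, and each candidate is represented only by the furthest stage they reached.

```python
from collections import Counter, defaultdict

# Assumed funnel stages, in order. Rename to match your own process.
STAGES = [
    "applied",
    "take_home_submitted",
    "take_home_passed",
    "data_day_attended",
    "offer",
]

# Hypothetical records: (channel, furthest stage reached).
CANDIDATES = [
    ("referral", "offer"),
    ("referral", "data_day_attended"),
    ("referral", "take_home_passed"),
    ("referral", "applied"),
    ("job_board", "applied"),
    ("job_board", "applied"),
    ("job_board", "take_home_submitted"),
    ("job_board", "offer"),
]


def funnel_by_channel(candidates):
    """Count how many candidates from each channel reached each stage."""
    reached = defaultdict(Counter)
    for channel, furthest in candidates:
        # Reaching a stage implies having passed through all earlier ones.
        for stage in STAGES[: STAGES.index(furthest) + 1]:
            reached[channel][stage] += 1
    return reached


def conversion_rates(counts):
    """Stage-to-stage conversion for one channel, skipping empty stages."""
    rates = {}
    for prev, nxt in zip(STAGES, STAGES[1:]):
        if counts[prev]:
            rates[f"{prev}->{nxt}"] = counts[nxt] / counts[prev]
    return rates


for channel, counts in funnel_by_channel(CANDIDATES).items():
    print(channel, conversion_rates(counts))
```

A channel with a low conversion at a single stage points to a filter that may be too aggressive for that population, while a channel with low conversion everywhere is more likely a sourcing-quality problem.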