Writing a Good Job Description for Data Science/Machine Learning | by Stephanie Kirmer | Aug, 2024
Things to do and things to avoid in order to find the right candidates for your open position
I’ve probably been involved in the hiring process for data scientists a dozen times or more over my career, while never being the hiring manager myself, and I have been closely involved in writing the job description for several of these. It kind of seems like this should be easy — you’re just trying to convince people to apply for your job, so you can pick the one you like best, right?
Well, it’s actually more complicated than that. Most of the people out there in the world are not qualified for any given job, and even among those who are qualified, there may be reasons they wouldn’t like working in this role. It’s not a one-way street; you don’t want just anybody to apply, you want the best-suited people, for whom this job would work, to apply. So, how do you thread that needle? What should you write?
This column is only my opinion and does not represent the views of my employer. I have not been involved in writing any job descriptions my current employer has posted, for ML or anything else.
To figure out what to write, let’s break down what a good job description is supposed to do, for a DS/ML job or for any other kind.
- Explain to candidates what the job is, and what they would do in the job
- Explain to candidates what qualifications you’re looking for in applicants
These are the bare essential functions, although there are several other things your job description posting should also do:
- Make your organization seem like an attractive place to work for a diverse pool of qualified candidates
- Describe the compensation, work circumstances, and benefits, so candidates can decide whether to bother applying
With this, we’re starting to get into more subjective and complicated components, in some ways.
In some spots, I’m going to give advice for two different scenarios: first, for a small organization with few or zero existing DS/ML staff members, and second, for a medium-sized or large organization with some DS/ML staff. These two can be quite different situations, with different needs and challenges in certain areas.
You may notice I’m using “DS/ML” a lot in this article — I consider the advice here good for people hiring data scientists as well as those hiring machine learning engineers, so I want to be inclusive where possible. Sorry it’s a little clunky.
Firstly, for any organization, consider what kind of role you have open. I’ve written in the past about the different kinds of data scientist, and I’d strongly recommend taking a look and seeing what archetypes your role fits into. Think about how this person will fit into your organization, and be clear about that as you proceed.
The Small Organization
A challenge, especially for small organizations with limited or no existing DS/ML capability or expertise, is that you don’t really know what your ML Engineer or Data Scientist may end up doing. You know what general outcomes you’d like this person to produce, but you don’t know how they’ll achieve them, because this isn’t your area of expertise!
However, you’ll still need to figure out some way to describe the role’s responsibilities. I advise being honest and up-front about the level of data science sophistication at your company, and explaining the outcomes you’re hoping to see. Candidates with enough experience and skill to help you will be able to conceptualize how they’d attack the problem, and in the interview process you should ask them to do that. You should have some kind of project or goal in mind for this person; otherwise, why are you hiring in the first place?
The Larger Organization
In this case, you already have at least a couple of DS/ML staff members, so you can hopefully call on those folks to tell you what the job is like day to day for an individual contributor (IC). Ask them! It’s surprising how often you’ll find HR or management not actually taking advantage of the expertise they already have in-house in situations like this.
However, you should also determine whether this new hire is going to be doing mostly or entirely the same thing as someone already in place, or whether they may end up filling a different kind of gap. If your existing problem is just not having enough skilled hands to do all the work on your plate, then it’s probably reasonable to expect the new hire will be filling a role similar to what’s there. But if you are hiring someone for a very specific skillset (say, a new NLP problem came up and nobody on your team knows that stuff very well), then make sure you are clear in your job posting about the unique responsibilities this role will have to pioneer.
This brings us to an important point, as well — how much experience and which skills does your candidate need to have in order to successfully do the job?
The Small Organization
- Experience: If this person is your first or second DS/ML hire, do not hire someone without some substantial work experience. These folks will cost more, but in your situation, you need someone who can be very self-directed and who has seen data science and machine learning practice done well in other professional settings already. This might go without saying, but you have little or no in-house capacity to train this person on the job, so you need them to already have acquired training from other previous roles.
- Technical skills: But what skills do you really need to look for, then? What technical competencies, programming languages, etc. will someone need to have to pursue your goals effectively? Beyond making sure that they can use Python, I’d recommend seeking advice from practitioners already in the field, if you can, and asking them what the skillset for your needs should look like. This changes a lot, as this is a very fast-moving discipline, so I can’t tell you today what your Data Scientist or Machine Learning Engineer (MLE) will need to be able to do a year from now. (I can tell you that asking for a Ph.D. is almost definitely not the answer.)
If you do go looking for advice, make sure you’re consulting people who are practicing DS/ML on the ground, not just “thought leaders” or people who market themselves as recruiting whisperers. If you don’t know anybody directly who fits the bill, try looking through your network or reaching out to DS/ML professional organizations. Take a look at other job postings that sound like what you need, but be cautious, since these other postings may not be that good either.
Regardless, take this seriously — if you write unrealistic, unreasonable, or absurdly irrelevant/outdated skills in your job description, you will turn off qualified candidates because they’ll recognize “Oh, this company doesn’t know what they’re doing”, and that will defeat the whole point of this exercise.
Another option is finding a freelancer in data science/machine learning to get you started, instead of hiring someone yourself at all. There are a lot of fractional or freelance practitioners these days, as well as consulting firms that can take this whole problem off your plate. A quick Google search for “fractional data scientist” produces lots of options, but remember to do your due diligence.
The Larger Organization
- Experience: I’m a big believer in hiring less senior folks and training them up, if your organization can handle it. New entrants to the field have to learn somehow, and business experience is often the biggest gap in a new data scientist’s skillset. Consider whether you really need to hire a Senior Staff Machine Learning Engineer, or whether you could promote internally and backfill a junior person. There’s no right or wrong answer, but give it some thought instead of jumping right to hiring the most senior level. We senior folks are both expensive and rare!
- Technical skills: As with the job responsibilities, this is the time to ask your existing team for their advice. Don’t just ask them what tech they use; also ask them what they might like to learn if someone skilled were brought on who could share that knowledge. (These skills go in Optional or Nice to Have, not Requirements!) You already have a DS/ML tech stack in place, of course, so the new person will need to be able to work with that, but if there are adjacent or newer technologies that might benefit your team, this is a good time to find out and potentially bring them on board. Don’t fall into the trap of asking for only the same stuff everyone in your org already uses, without giving any consideration or value to additional competencies.
Also keep in mind what your candidates need to already have, in contrast with what they could learn on the job from your team. Don’t inflate your requirements to make the role sound more prestigious, or to artificially weed out candidates, especially if you’re not paying commensurate with those inflated requirements, because you’ll be shooting yourself in the foot. You’ll be deterring the candidates who might be a good fit for the level, and getting overqualified people in the pipeline who wouldn’t accept the pay you have available. And don’t ask for a Ph.D. if it’s not vital! (It’s almost never vital.)
It may seem insignificant, but once you’ve defined the role, picking a title to post really does send signals to candidates out there deciding what to apply for. I’ve talked in other pieces about the evolution of titles in data science, and this continues to change over time. But my shorthand advice, at least today, is:
- Data Scientist: Not responsible for data engineering, pipelining, or doing their own deployment, although they may be capable of it. May do BI or analytics as well as model development.
- Machine Learning Engineer: Responsible for any or all of data engineering, pipelining, and doing their own deployment. Does model development, but minimal or no BI or analytics work.
For leveling, I’d offer this as a very, very rough rule of thumb (your mileage may vary significantly):
- Junior or Associate: Fresh out of school. No work experience. Maybe an internship.
- No Level: May have had one or two professional jobs or 2–3 years of experience.
- Senior: 3+ years professional experience.
Beyond that, there are higher levels that some orgs have and some don’t:
- Staff: maybe 6–10 years professional experience.
- Principal, Senior Staff, etc: More than that. It varies so widely in different orgs it’s really hard to say.
So if you want someone who can do their own pipelines, deployment, and modeling, and don’t need them to do analytics, and you want them to have multiple years of experience, then Senior Machine Learning Engineer is what you should write. If you are looking for someone fresh out of school to do some modeling and analytics, while engineers handle the deployment and related work, you need an Associate Data Scientist.
This advice is subject to change as the field continues to evolve. If you really want to write something special like Machine Learning Scientist, I’d advise against it unless you have a really good explanation as to why. Clarity and findability are key here — use the terms your candidates will be familiar with and searching for.
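As a rough illustration, the title and leveling rules of thumb above could be written down as a small Python function. The function name, its parameters, and the exact year cutoffs are assumptions made for this sketch (the ranges above overlap and are deliberately loose), so treat it as a way to check your own reasoning rather than a fixed rule.

```python
def suggest_title(owns_pipelines_and_deployment: bool, years_experience: float) -> str:
    """Suggest a job title from the rough rules of thumb (hypothetical helper)."""
    # Base title: Machine Learning Engineers are responsible for data
    # engineering, pipelining, and deployment; Data Scientists are not.
    if owns_pipelines_and_deployment:
        base = "Machine Learning Engineer"
    else:
        base = "Data Scientist"

    # Level: these cutoffs are assumptions chosen to resolve the overlapping
    # ranges in the rule of thumb (for example, exactly 3 years maps to Senior).
    if years_experience < 1:
        level = "Associate"
    elif years_experience < 3:
        level = ""  # no level prefix
    elif years_experience < 6:
        level = "Senior"
    elif years_experience <= 10:
        level = "Staff"
    else:
        level = "Principal"

    return f"{level} {base}".strip()


if __name__ == "__main__":
    # The two examples from the text above.
    print(suggest_title(owns_pipelines_and_deployment=True, years_experience=4))
    print(suggest_title(owns_pipelines_and_deployment=False, years_experience=0))
```

If your organization uses different level names or year ranges, the cutoffs are the part to change; the split between the two base titles is the piece that follows the definitions above most directly.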
Now we can move on to your pitch: sell your organization as a good place to work, and share the compensation/benefits that you have to offer. We’ve spent a lot of time telling the candidates what they need to bring, and what they need to be prepared to do if they get this job, but that’s not all that a job description is about. You actually also need to be advertising your company and department as an appealing place to work, in order to get the best candidates on your radar. This advice is mostly generalizable to any organization size.
Don’t Lie
I have a few rules of thumb when it comes to describing a company to job candidates, in writing or in interviews. The main one is Don’t Lie. Don’t say you have a “fast-paced culture” when it takes three weeks to deploy. Don’t say you “value work-life balance” when no one on this team has taken a vacation in a year. And DON’T say “remote” when you mean “hybrid”, for Pete’s sake! You may think you’re just throwing in nice-sounding boilerplate, but these words mean something.
Feeling like you got bait-and-switched into joining an organization that is a bad fit is awful. Think of it like selling a product: if you overpromise and underdeliver, maybe you made that initial sale, but that customer is going to churn and be out the door with a bad taste in their mouth as soon as they realize their mistake. Then you have not only lost that customer, you also have someone out there in the world with a bad opinion of your company who may be telling their whole network about this experience.
If you can’t think of good selling points for your company that aren’t either lies or stretching the truth, then you need to take a cold hard look at your company’s operations.
Being honest will not only make your eventual hire better, but it will attract candidates who really do want to work in a company like yours. Everyone has different wants and needs from a job, and not everyone wants to work at a place that “works hard and plays hard”. There’s not one right culture for companies, and owning the culture your company has will get you the candidates who could be happy working there.
Value Diversity
One other important key is ensuring, and showing, that your company values and includes all the dimensions of diversity among your staff. Your job description is the candidate’s first introduction to how you take care of your people, regardless of protected class or general diversity of experience, background, ability, etc. In this case, that means you need to consider your choices of language very carefully. Unless you really mean it, don’t ask for an “expert” in a skill set. Don’t say your candidates must be “rock stars”. This both deters candidates with reasonable humility about their skills and makes your organization sound, well, kind of like jerks.
Note: the old saw that we’ve all heard a million times that “women don’t apply to a job unless they meet all the requirements” is very, very tired, and problematic for many reasons, but it does remind us to ask for skills you actually need, not just a laundry list of wishes.
Instead, use inclusive language. I advise writing your desired qualifications in the form of “Successful candidates can do ….” and then listing action-oriented items like “build machine learning models using Python” or “perform model evaluation using appropriate metrics such as recall, precision, MAE, RMSE, etc”. Be clear, and make it easy for someone to say “oh, I can do that” or “nope, I can’t do that”.
If you know your pool of potential candidates is very homogeneous, for example because not many people of color get college degrees in your field, consider whether you need to take extra steps to get your job in front of those candidates. Take the time to post jobs on diversity-oriented job boards, and share your posting with professional organizations for different kinds of people. If your posting never gets seen by varied individuals, you won’t get varied candidates applying.
Compensation and Benefits
Now this should really go without saying, but be transparent and clear about the benefits and compensation for the role. Give a compensation range even if you don’t have to by law. If you’re not hiring in a state that requires a compensation range, you may think that isn’t an issue for you, but it actually is, because candidates with choices will prefer to apply to postings where they can clearly see the pay is commensurate with their expectations. It makes you look exploitative to leave off a compensation range (or to give a range spanning $100k so the range is effectively useless). Get with the times and give a reasonable range.
Also, I already mentioned it but it bears repeating — be honest about the working circumstances. Don’t advertise a job as “remote” only to reveal in the interviews that it’s 3 days a week on site. That’s also really bad practice and a rude waste of everyone’s time. Give candidates the details they need to make an informed decision about applying.
Beyond that, remember that health insurance is important to anyone in America looking for a job, and be as clear as you can be about what you are offering. If you can list the insurance carrier, do that; it may help people know if their doctor or provider would be in network. It’s not a huge deal to every candidate, but many candidates, including those with disabilities or health concerns (or dependents with concerns), will appreciate it.
Hiring for technical roles, including DS/ML, is hard. This advice might all sound like a lot of work you’d rather avoid, but consider: the alternative is weeding through thousands of applications from terribly unqualified candidates, or candidates who would never accept the job. Do some work up front so you’re not wasting your own time and that of the applicants down the road. It’s not only more efficient, it’s also the ethical choice. Applicants are real people and deserve to be treated as such.
To recap:
- Figure out what the job would do (or what outcomes you want to see)
- Figure out what the experience level and technical skillset needs to be (not your dream wish list, but realistic needs)
- Write a job title that’s clear, accurate, and searchable
- Don’t lie about your organization or the job
- State the compensation range up front, and describe the benefits
Good luck out there!