Pivoting from science to data science
Exploring data science as a career path for folks with scientific training
One of the most common informational interview requests I get is from junior scientists (late undergrad, early grad school) who want to know about pivoting into data science. Since I often give them the same advice, I figured I'd write it down.
This post is tailored for folks who have a science background, meaning that they are studying something like physics or biology and their primary "work" experience has been in academic research. It is also relevant for folks who want to pivot from primarily wet-lab work to dry lab.
I'll caveat this post with the fact that I haven't actually tested this advice — I got my data science job because a labmate started a company that needed someone with a computational skillset and a broader title than computational biologist. That said, I did hire a handful of junior folks onto our data science team and I successfully pivoted my microbiome-focused PhD into a public health data job. So hopefully I’m not completely off base!
First off, it's important to understand the different flavors of "data science." As a computational scientist, you do data science to understand things about the world. In contrast, my impression is that most industry data science jobs are focused on analyzing data about the business. For example: where is most of the recurring revenue coming from? How well did a marketing campaign perform? How can we improve our customer support?
Of course there are parallels between the two flavors of data science and a huge overlap in necessary skillsets. But if you're looking to continue analyzing microbial networks or forestry coverage or whatever your passion is, then you'll have to find companies that are working on those problems and actively hiring scientists in those focus areas. The vast majority of data science jobs are generic "understand the business" roles, so you'll have to look harder for what you want.
Which brings me to my first piece of advice for leveraging your scientific background to pivot to data science: learn about what a career in data science might look like.
1. Sign up for email lists
More specifically, I recommend signing up for a million email lists. Part of why I feel semi-valid in talking about data science is that even though I haven't ever had a "generic" data science role, I have a good idea of what those roles entail based on the hundreds of articles I've read.
Some of my favorite data science email lists:
Data Science Weekly: a weekly roundup of data science links. A bit focused on ML these days, but still has enough variety that I find at least one interesting article each week.
Analytics Engineering Roundup: written by folks at dbt, gives opinions on the latest happenings in the analytics engineering world and also links to interesting pieces. This is a good email list to stay up to date on the modern data stack.
Data Elixir: another roundup of links with a lot of overlap with the other two, but enough unique content that I've stayed signed up for it.
Flowing Data is my favorite data viz newsletter. I’ve also recently been enjoying Storytelling with Data and Datawrapper’s blogs.
There are also a lot of data content producers on Substack; I recommend poking around and trying new things. In general, if you're trying to explore a new space, your barrier for signing up for a newsletter should be very low. You can always unsubscribe if you don't like it after two or three articles.
Signing up for email lists is also good advice if you're interested in exploring a new field. I heartily recommend signing up for email lists from the government if your interests overlap with any of the federal agencies. Basically, go to your agency of choice and poke around until you find somewhere to subscribe to email updates. Once you enter one agency's email subscription management service, it’ll give you options to sign up for other agencies' emails as well. It can be a pain to unsubscribe, but it's worth it to get notified of potential cool opportunities (including data science!), and also just to better understand the landscape of federal priorities and activities.
Finally, another great option is to join data science communities. I've been enjoying Locally Optimistic's Slack group, but there must be millions more out there.
2. Learn to code
If you want to pivot to data science, it'll help a lot if you know how to code. Some entry-level data analyst jobs may not actually need coding skills (relying instead on Excel and dashboarding tools like Tableau), but these will probably all be in the "analyze the business" category. That said, if you're pivoting from wet lab to dry lab, they could be a good place to start.
There are many ways to learn to code, for example:
Bootcamps: short, intensive experiences that teach you the basics of coding and data science and help you prep your resume for data science roles. At this point, I think you have to pay for most of them. That said, many programs have some sort of "if you don't get a job offer within X months of graduating, we'll refund you Y% of your tuition" deal. Bootcamps are a good option if you're shooting for the generic data science role, as they tend to work with bigger companies. If you're pivoting from a PhD, the Insight Data Science fellows program is specifically made for folks like you.
Online courses: if you're a motivated self-starter, you can learn coding and data science through free online courses. Rafa Irizarry and Stephanie Hicks create data-focused open courses; I haven’t taken the courses but I know both of these folks and they’re great educators. Johns Hopkins also has an open data science lab, with some additional courses.
Self-learning: of course, if you're extremely motivated there's always self-learning. Find a cool dataset, analyze it, and publish your results. One of my favorite places to look for data and inspiration is Data Is Plural (which also has a newsletter to sign up for!).
3. Build a portfolio and/or online presence
If you're trying to pivot or break into data science, I recommend having some sort of online profile and, ideally, a small portfolio. It doesn't have to be anything fancy: a couple of GitHub repos or a handful of blog posts on a static website are totally fine. If you're just sticking with GitHub, make sure that your repos are well organized and well documented. People won't be spending much time looking through your repo, so you want to make it easy for them to understand what's going on.
I view the portfolio as fulfilling three goals:
Giving recruiters and hiring managers a super easy way to understand more about you. If your resume comes through and they're intrigued but confused because you don't have any prior data science experience, you want to make it easy for them to figure out what your deal is. (This might be bad advice, especially in this market: it's highly unlikely that hiring managers will click through to your website. But I did it all the time as a hiring manager, and I often get frustrated when I google someone I've been connected with and can't find anything about them. So shrug, take it or leave it.)
Showing that you actually know how to code and have technical data chops.
Showcasing the other skills you have as a scientist: critical thinking and storytelling are the two that I often look for in portfolios. If someone just blasts through all the generic scikit-learn models on a dataset, I'm unimpressed. But if I see them applying basic stats and really thinking about what the data might mean, I'm intrigued!
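To make the "basic stats" point a bit more concrete, here is a minimal sketch of the kind of small, interpretable analysis I mean. Everything in it is an assumption for illustration: the measurements are made up, the function names are hypothetical, and the interval uses a rough normal approximation (with samples this small, a t-based interval would be wider). It only relies on Python's standard library.

```python
import math
import statistics


def summarize(values):
    """Return sample size, mean, sample standard deviation, and standard error."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    return {"n": n, "mean": mean, "sd": sd, "se": sd / math.sqrt(n)}


def difference_with_interval(group_a, group_b, z=1.96):
    """Difference in means (b - a) with an approximate 95% interval.

    Uses a normal approximation; this is a simplification for small samples.
    """
    a, b = summarize(group_a), summarize(group_b)
    diff = b["mean"] - a["mean"]
    se_diff = math.sqrt(a["se"] ** 2 + b["se"] ** 2)
    return diff, (diff - z * se_diff, diff + z * se_diff)


# Hypothetical, made-up measurements for two conditions.
control = [4.1, 3.8, 4.4, 4.0, 3.9, 4.2]
treated = [4.6, 4.9, 4.3, 5.1, 4.7, 4.8]

diff, (low, high) = difference_with_interval(control, treated)
print(f"difference in means: {diff:.2f} (approx. 95% interval {low:.2f} to {high:.2f})")
```

The code is the easy part. What I'd want to see in a portfolio is the paragraph that comes after it: how big the difference is in practical terms, what could be confounding it, and what you'd measure next.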
4. Volunteer
A final option that falls somewhere between all of these is volunteering. If you already have computational skills, want to practice applying them to problems outside your field, and have the time/money/space to work for free, you can consider volunteering. Some volunteering opportunities I know about are DataKind, the Data Liberation Project (spin-off of Data Is Plural), and Bluebonnet Data (for learning about data in progressive politics).