At The Data Incubator we run a free eight-week data science fellowship to help our Fellows land industry jobs. We love Fellows with diverse academic backgrounds that go beyond what companies traditionally think of when hiring data scientists. David was a Fellow in our winter cohort who landed a job with one of our hiring partners, Sotera, after finishing his PhD at Princeton. Here’s his story:
I did my PhD in computational chemistry, which, as the name suggests, focuses on computer simulations of chemicals. The difficulty in computational chemistry doesn’t come from a lack of knowledge of the underlying physics. Rather, the difficulty comes from computational complexity – 100-year computer simulations aren’t an effective way to obtain a PhD. Computational chemistry does a great job of training researchers to think about how their computer simulations really work, how to approximate the really hard parts, and when those approximations do (and don’t) work. Those are all really important skills for a data scientist confronted with a large data set distributed across multiple computers (aka “Big Data”).
What do you think you got out of The Data Incubator?
The other Fellows in my cohort and I loved to discuss this during the fellowship, and we settled on three important aspects of the Incubator. First, The Data Incubator gives you a nice overview of the current techniques in data science. That includes everything from what a random forest is to the latest technologies for distributed computing. Second, The Data Incubator exposes you to how data science is actually used by companies. By hearing from many different companies, you get a feeling for the different types of problems that they tackle and how data science offers real solutions in a variety of fields. Finally, The Data Incubator helps expand your network, both by introducing you to companies looking for data scientists and (just as importantly!) by introducing you to the other Fellows.
Could you tell us about the mini-projects you worked on? How did they help?
The mini-projects are a wonderful way to be exposed to the wide array of problems a data scientist might face. Our mini-projects covered diverse topics such as social networks, natural language processing, distributed computing, and more. Completing the mini-projects gives you a well-rounded data science background. I know that during my interviews the mini-projects gave me a great way to demonstrate that I had experience in multiple areas of data science.
It’s also important to note that the mini-projects are not trivial, and I learned a great deal by discussing them with the other Fellows. The Fellows came from diverse backgrounds, and having people with so many different ways of looking at a problem work together always produced interesting results.
What advice would you give to someone who is applying for The Data Incubator, particularly someone with a chemistry background?
Chemistry is, as far as I can tell, a less typical background in data science. That doesn’t mean chemists don’t have the right training! Computational chemists are taught both quantum mechanics and statistical mechanics. Statistical mechanics uses, as the name suggests, statistics, and don’t forget that in quantum mechanics the electron density is formally the electron probability density. Almost everything in quantum mechanics involves taking the expectation value of some quantity over a probability density.
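[Editor’s Note: As a rough illustration of the point above, the expectation values in question look just like the averages a data scientist computes. The sketch below assumes a single normalized one-electron wavefunction $\psi$ and an observable $A$ that depends only on position; the symbols $\psi$, $\rho$, $A$, $E_i$, and $\beta$ are our notation, not David’s.]

```latex
% Quantum mechanics: the electron density of a normalized one-electron
% wavefunction is a probability density over position.
\rho(\mathbf{r}) = |\psi(\mathbf{r})|^{2}, \qquad \int \rho(\mathbf{r})\, d\mathbf{r} = 1

% Expectation value of a position-dependent observable A.
\langle A \rangle = \int A(\mathbf{r})\, \rho(\mathbf{r})\, d\mathbf{r}

% Statistical mechanics: the same average, taken over Boltzmann
% probabilities p_i of discrete states with energies E_i.
p_i = \frac{e^{-\beta E_i}}{\sum_j e^{-\beta E_j}}, \qquad \langle A \rangle = \sum_i A_i\, p_i
```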
Chemists have the necessary math background. We also have the necessary computational background. If you have ever programmed a simulation that runs on more than one computer core, you already know some parallel computing. If anything, you might know too much; the message-passing idioms used in computational chemistry are a lot more complicated than MapReduce. Just brush up on analyzing algorithms and big-O notation – chemists have a tendency to abuse the notation a bit.
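[Editor’s Note: For readers who have not seen MapReduce, here is a minimal single-machine sketch of the idiom in plain Python. It is an illustration only, not code from the fellowship; the function names `map_chunk`, `merge_counts`, and `word_count` and the sample data are made up for this example, and a real framework would run the map step on many machines.]

```python
from collections import Counter
from functools import reduce


def map_chunk(chunk):
    """Map step: count the words in one chunk, independently of all others."""
    return Counter(chunk.lower().split())


def merge_counts(left, right):
    """Reduce step: combine two partial counts into one."""
    return left + right


def word_count(chunks):
    """Apply the map step to every chunk, then fold the partial results."""
    partial_counts = map(map_chunk, chunks)
    return reduce(merge_counts, partial_counts, Counter())


# Hypothetical data: each string stands in for a block stored on a different machine.
chunks = ["the cat sat", "the dog sat", "The end"]
totals = word_count(chunks)
```

Because each map call touches only its own chunk and the merge is associative, the work involves n map calls plus n - 1 merges for n chunks, and the workers never need to exchange messages with one another while mapping.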
So my advice to a chemist applying to The Data Incubator would be: don’t worry, you have the necessary background. It certainly wouldn’t hurt to brush up on your probability theory or algorithms 101, but you already know the basics. [Editor’s Note: For more information about how to prepare for The Data Incubator, check out this post.]
Learn more about The Data Incubator here.