At The Data Incubator we run a free eight-week Data Science Fellowship Program to help our Fellows land industry jobs. We love Fellows with diverse academic backgrounds that go beyond what companies traditionally think of when hiring Data Scientists. Andrew was a Fellow in our Fall 2015 cohort who landed a job with one of our hiring partners in Washington, DC.
Tell us about your background. How did it set you up to be a great Data Scientist?
There are two things that I think have helped me get to where I am:
1) Like most physicists, I think I have a natural propensity to tinker with things well outside my expertise. Taken too far, this can be a bad thing. But, applied appropriately, it’s exactly the kind of attitude needed to learn and keep up with the ever-changing field of data science.
2) Having focused on precision measurements in my research, I’ve seen time and time again how much the environment in which I performed my experiments impacted the data and informed my analysis. The parallel to data science: my training has taught me that a deep understanding of the problem and of how the data was collected is what allows you to ask the right questions and produce meaningful results.
What do you think you got out of The Data Incubator?
1) An appreciation for how the skills I’ve developed as a physicist are applicable to problems in industry.
2) The opportunity to take on challenging problems alongside Fellows from a range of academic and technical backgrounds.
3) The beginnings of a very solid professional network in data science composed of Fellows in my cohort, program alumni, Data Incubator staff, and the numerous hiring partners we interacted with.
What advice would you give to someone who is applying for The Data Incubator, particularly someone with your background?
I would recommend that you start working with Python, if you haven’t already. You’ll be using it daily at the Incubator, and you don’t want your understanding of the language to get in the way of all the techniques you’ll be learning.
What is your favorite thing you learned at The Data Incubator?
1) I learned to make the Jupyter Notebook a big part of my workflow. It’s a fantastic tool for things ranging from messy exploratory work to analyses that will be shared with others.
2) I learned that Slack is a wonderful tool for working with a distributed team. Just be careful not to abuse the secret commands too much!
Could you tell us about your Data Incubator Capstone project?
It turns out that Metro riders are a very vocal bunch on social media, especially when it comes to service issues. My project used this data to make predictions about delays. I gathered Tweets using a set of searches on hashtags and user account mentions and scraped the daily service report archive to generate a training set. From the Tweets, I constructed a feature array containing temporal information, Tweet volume, and quantities related to the appearance of certain keywords in the Tweets (e.g. “delay” or “offloading”). This data was used to train a Random Forest classifier, which was then deployed to a web application. You can check it out and learn more about how the classifier was trained here.
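To make the feature construction described above concrete, here is a minimal sketch in Python. It is not Andrew's actual code: the input format (timestamp and text pairs), the function name `build_features`, the time-window grouping, and the sample Tweets are all assumptions made for illustration. Only the keywords “delay” and “offloading” come from the interview.

```python
from collections import Counter
from datetime import datetime

# Keywords named in the interview; a real list would likely be longer (assumption).
KEYWORDS = ("delay", "offloading")


def build_features(tweets, window_start):
    """Turn the Tweets from one time window into a single feature row.

    tweets: list of (timestamp, text) pairs (assumed input format).
    window_start: datetime marking the start of the window.
    Returns [hour, weekday, volume, fraction of Tweets per keyword...].
    """
    volume = len(tweets)
    hits = Counter()
    for _, text in tweets:
        lowered = text.lower()
        for kw in KEYWORDS:
            # Substring match, so "delays" also counts as "delay".
            if kw in lowered:
                hits[kw] += 1
    # Fraction of Tweets mentioning each keyword; 0.0 for an empty window.
    keyword_rates = [hits[kw] / volume if volume else 0.0 for kw in KEYWORDS]
    return [window_start.hour, window_start.weekday(), volume] + keyword_rates


# Made-up sample Tweets for one morning window.
sample = [
    (datetime(2015, 11, 2, 8, 5), "Red line delay again #metro"),
    (datetime(2015, 11, 2, 8, 12), "Train offloading, expect major delays"),
    (datetime(2015, 11, 2, 8, 20), "Smooth ride this morning"),
]
row = build_features(sample, datetime(2015, 11, 2, 8, 0))
```

Rows like this, paired with labels derived from the scraped service reports, could then be passed to a Random Forest implementation such as scikit-learn's `RandomForestClassifier`.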