At The Data Incubator we run a free eight-week data science fellowship to help our Fellows land industry jobs. We love Fellows with diverse academic backgrounds that go beyond what companies traditionally think of when hiring data scientists. Michael was a Fellow in our Winter 2016 cohort who landed a job with one of our hiring partners, Schireson Associates.
My PhD work was in computational materials science, where I worked with reactive molecular dynamics simulations. The field is entirely simulation-based and typically requires high-performance computing resources. Running these simulations helped build my chops for working with parallel systems and command-line tools. The software required familiarity with some powerful languages and APIs like C and CUDA. Learning those definitely helped my understanding of Python once I switched to it.
Toward the third year of my PhD I got really interested in machine learning. I started using scikit-learn to predict different aspects of simulations I worked on. These projects became a large part of my thesis and contributed to choosing The Data Incubator as a next step in my career.
What do you think you got out of The Data Incubator?
The experience and camaraderie of trying to complete weekly projects with my peers was the highlight of my time at The Data Incubator. Everyone had expertise in different areas, and I learned a ton just from interacting with the other Fellows. The instructors and daily lessons exposed me to concepts and data tricks that hadn’t been on my radar. Honestly, most of the concepts we reviewed came up in interviews, and I’m seeing them again in my job.
Could you tell us about your Data Incubator project?
My project had two parts. First, I compared historical rolling averages of stock price movements with phrases published in New York Times articles to build a financial sentiment lexicon. Then I tried to use the lexicon to predict future price movements based on what was published in the New York Times. While I wasn’t able to perfectly predict the market (otherwise I wouldn’t even need a job), I really enjoyed making the project. It gave me some natural language processing experience along with the mathematical modeling needed to engineer features from the moving stock prices. It also gave me a lot to talk about during interviews.
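For readers curious what that kind of pipeline might look like, here is a minimal sketch in Python. It is not Michael's actual code. The function names (`rolling_mean`, `build_lexicon`), the three-day window, the scoring rule, and the toy prices and phrases are all assumptions made up for illustration. Each phrase is scored by how far the price sat from its trailing average on the days the phrase appeared.

```python
from collections import defaultdict


def rolling_mean(values, window):
    """Trailing mean over `window` points; None until enough data exists."""
    out = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            # Drop the point that just fell out of the window.
            running -= values[i - window]
        out.append(running / window if i >= window - 1 else None)
    return out


def build_lexicon(daily_phrases, prices, window=3):
    """Hypothetical scoring rule: average relative deviation of the price
    from its trailing mean on the days each phrase was published."""
    means = rolling_mean(prices, window)
    totals = defaultdict(float)
    counts = defaultdict(int)
    for phrases, price, mean in zip(daily_phrases, prices, means):
        if mean is None:
            continue  # not enough history for this day yet
        move = (price - mean) / mean
        for phrase in set(phrases):
            totals[phrase] += move
            counts[phrase] += 1
    return {phrase: totals[phrase] / counts[phrase] for phrase in totals}


# Toy data, invented purely for illustration.
prices = [10, 10, 10, 13, 7]
daily_phrases = [[], [], ["earnings"], ["rally"], ["selloff"]]
lexicon = build_lexicon(daily_phrases, prices, window=3)
```

In this toy run, "rally" ends up with a positive score and "selloff" with a negative one. A real version would need far more data and a proper out-of-sample test before anyone trusted it to predict anything.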
What advice would you give to someone who is applying for The Data Incubator, particularly someone with your background?
I would start putting together projects in your free time. There is quite a bit of free data out there, and it’s easier than you think! You’ll organically learn solutions to common problems that might otherwise seem esoteric. I had put quite a bit of work into my project before I even applied to The Data Incubator, and it probably helped my application. Maybe more importantly, doing this allowed me to have a beefy project by the end of the program. I spent most of my interviews going over my project. I think it’s similar to what you’ll be doing in a working environment.
Finally, I can’t really emphasize enough how using the right libraries can be a huge time saver and productivity booster. Using a language like Python with a simple package management system made data science way more fun for me. Seriously, don’t try to use C for this stuff; it’ll take forever. Actually, go ahead and try it. What doesn’t kill you makes you stronger.
What’s your favorite thing you learned while at The Data Incubator? This can be a technology, concept, or whatever you want!
My favorite software-based thing to learn was Spark. I had never really used a distributed file system, so I had a lot to learn, and it was pretty powerful. I also really dug the Jupyter notebooks. I think I’ll be using them a lot in the future.
Where are you going to be working? And tell us a little about your new job!
I’m working at Schireson Associates as a Data Scientist. I can’t delve into any of the specifics for client privacy reasons, but the work that I’ve been doing has been directly related to the data science skills I developed in my graduate work and at The Data Incubator. When I started, I was able to jump right in and start modeling systems without any hitches. I get to be part of a larger data science team, where I can learn from my co-workers and grow as a data scientist. Also, there’s a dog and a kegerator here, so I feel right at home!