Data are becoming the new raw material of business
The Economist


Analyzing the Language of Twitter: Alumni Spotlight on Marc Ettlinger

At The Data Incubator we run a free eight-week data science fellowship to help our Fellows land industry jobs. We love Fellows with diverse academic backgrounds that go beyond what companies traditionally think of when hiring data scientists. Marc was a Fellow in our Spring 2016 cohort in San Francisco who’s now working at Google as a Computational Linguist.

Tell us about your background. How did it set you up to be a great data scientist?

I started life as a programmer, then I went back to graduate school at UC Berkeley for linguistics where I actually didn’t use my programming skills for a while. From there I did a neuroscience postdoc, where I started to use my programming skills a bit more. The neuroscience endeavor is, in many ways, a big-data endeavor. You get a tremendous amount of data from doing neuroscience experiments, and figuring out how to interpret and make sense of that incredible amount of data requires techniques that are not common in the behavioral scientific world, but are quite typical of machine learning and related data science fields. The way I would use those techniques as a scientist was not particularly sophisticated, and while there is a lot of data when you’re analyzing neuroscience, it’s still orders of magnitude less than what people typically think of with big data.

So, when I started looking for job opportunities outside of academia, I realized that the way I talked about data analysis and the techniques that I used were not up to date. I wasn’t using the latest methodologies, tools, and terminologies that data scientists used even though the basic concepts were much the same.

What do you think you got out of The Data Incubator?

The key thing was being able to talk about data science intelligently during interviews, in a way I hadn’t before. I was able to update my knowledge to where the field and industry currently are, which helped tremendously when talking with prospective employers. I also learned about some ideas and concepts that helped make me a better data scientist, reflecting the latest research within the field.



Making the Switch from Network Physics to Data Science: Alumni Spotlight on Hernan Rozenfeld

At The Data Incubator we run a free eight-week data science fellowship to help our Fellows land industry jobs. We love Fellows with diverse academic backgrounds that go beyond what companies traditionally think of when hiring data scientists. Hernan was a Fellow in our Fall 2015 cohort in New York City who landed a job with our hiring partner, 1010data.


 

Tell us about your background. How did it set you up to be a great data scientist?

My background is in complex networks and statistical physics. My PhD studies focused mostly on theoretical modeling of networks and their topological properties. Later on, during my postdoc, I worked primarily on using those networks and graph theory techniques to analyze real-world data.

 

What do you think you got out of The Data Incubator?

I think the most important tool I learned is Machine Learning. Before coming to The Data Incubator I only knew conceptually what ML was. This fellowship gave me a much deeper understanding of the different ML techniques and, maybe more importantly, hands-on experience using the different ML tools on real-world data.

I also learned a large number of tech tools, such as Hadoop and MapReduce, which are essential for the analysis of very large amounts of data.

Last, but not least, the Incubator helped me think about problems in a more business-oriented way. In a business environment, conclusions must be concrete, translate into actionable items, and be easy to communicate. TDI helped me transition from an academic view of problems to a business/actionable approach.



An Indirect Route to Automotive Technology: Alumni Spotlight on Alex Thompson

At The Data Incubator we run a free eight-week data science fellowship to help our Fellows land industry jobs. We love Fellows with diverse academic backgrounds that go beyond what companies traditionally think of when hiring data scientists. Alex was a Fellow in our Fall 2015 cohort in Washington, DC who landed a job with our hiring partner, NAUTO, in Palo Alto, California.

Tell us about your background. How did it set you up to be a great data scientist?

I went in a straight line for 28 years, and then zig-zagged all over the place. I pursued and received a PhD in Math from UCLA, the culmination of two decades of focusing on math. However, during my grad studies I developed other interests, and following grad school I did a lot of political activism and founded a not-for-profit bicycle shop. After that I worked in K-12 Education for 3.5 years, first at Green Dot Public Schools, then at McGraw Hill Education. That gave me a lot of business experience that has proved to be useful in connecting the technical side of data science with the business side.

What do you think you got out of The Data Incubator?

It helped me get from the stage of unconscious incompetence – not knowing what you don’t know about data science – to conscious incompetence – knowing what you don’t know, and knowing how to fix that. After five hard weeks of homework, you have some pretty good skills, but more importantly, you have a good idea of where you need to spend time learning, and how to learn. If I were an employer, I would feel comfortable hiring people who have been through The Data Incubator, since they (a) are accomplished hard workers and (b) have shown a willingness and ability to learn a very new field.


The Iraq War by the Numbers: Extracting the Conflicts’ Staggering Costs

One of our fellows recently had a piece published about her unique and timely capstone project. The original piece is posted on Data Driven Journalism.

In her own words:

This war is not only important due to its staggering costs (both human and financial) but also on account of its publicly available and well-documented daily records from 2004 to 2010.

These documents provide a very high spatial and temporal resolution view of the conflict. For example, I extracted from these government memos the number of violent events per day in each county. Then, using latent factor analysis techniques, e.g. non-negative matrix factorization, I was able to cluster the top three principal war zones. Interestingly, these principal conflict zones were areas populated by the three main ethno-religious groups in Iraq.
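For readers curious how this kind of clustering can work, here is a minimal sketch in Python. It is not the Fellow's actual code or data: the toy event counts, the six regions, and the helper names (`toy_counts`, `nmf`, and so on) are all made up for illustration. It factorizes a days-by-regions matrix of counts into three non-negative components using the standard multiplicative-update rules, then assigns each region to the component on which it loads most heavily. It uses only the standard library so it runs anywhere; in practice a library implementation would be the more likely choice.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    """Swap rows and columns."""
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Euclidean distance between v and the product w @ h."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rwh in zip(v, wh) for x, y in zip(rv, rwh)) ** 0.5


def nmf(v, k, n_iter=200, seed=0, eps=1e-9):
    """Factorize v (n x m) into w (n x k) and h (k x m), all non-negative.

    Uses multiplicative updates; eps guards against division by zero.
    """
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)] for i in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)] for i in range(n)]
    return w, h


def toy_counts(n_days=12, n_regions=6):
    """Synthetic (made-up) counts: rows are days, columns are regions.

    Regions come in pairs, and each pair is active during its own 4-day window.
    """
    rows = []
    for day in range(n_days):
        zone = day // 4
        row = [0.0] * n_regions
        row[2 * zone] = 5.0 + day % 4
        row[2 * zone + 1] = 3.0 + day % 4
        rows.append(row)
    return rows


events = toy_counts()
w, h = nmf(events, k=3)

# Each column of h says how strongly a region loads on each of the 3 components;
# the strongest component serves as that region's cluster label.
labels = [max(range(3), key=lambda c: h[c][region]) for region in range(6)]
print("component per region:", labels)
print("reconstruction error:", round(frobenius_error(events, w, h), 3))
```

In this sketch the rows of `h` play the role of the "war zones" (groups of regions that tend to be active together), while the columns of `w` describe how active each zone is on each day. The component numbering is arbitrary, and results can depend on the random starting point.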

You can watch her explain it herself:

 

Editor’s Note: The Data Incubator is a data science education company. We offer a free eight-week fellowship helping candidates with PhDs and master’s degrees enter data science careers. Companies can hire talented data scientists or enroll employees in our data science corporate training.