Data are becoming the new raw material of business
The Economist

A Study Of Reddit Politics

This article was written for The Data Incubator by Jay Kaiser, a Fellow of our 2018 Winter cohort in Washington, DC who landed a job with our hiring partner, ZeniMax Online Studios, as a Big Data Engineer.

 

The Question

The 2016 Presidential Election was, in a single word, weird. So much happened during the months leading up to November that it became difficult to keep track of who said what, when, and why. However, the finale of the election that culminated with Republican candidate Donald J. Trump winning the majority of the Electoral College and hence becoming the 45th President of the United States was an outcome which at the time I had thought impossible, if solely due to the aforementioned eccentric series of events that had circulated around Trump for a majority of his candidacy.

Following the election, the prominent question that could not leave my mind was a simple one: how? How had the American people changed so much in only a couple of years to allow an outsider hit by a number of black marks during the election to be elected to the highest position in the United States government? How did so many pollsters and political scientists fail to predict this outcome? How can we best analyze the campaigns of each candidate, now given hindsight and knowledge of the eventual outcome? In an attempt to answer each of these, I have turned to a perhaps unlikely source.



Engineering a New Career in Data Science: Alumni Spotlight on Abhishek Mishra

At The Data Incubator we run a free eight-week Data Science Fellowship Program to help our Fellows land industry jobs. We love Fellows with diverse academic backgrounds that go beyond what companies traditionally think of when hiring Data Scientists. Abhishek was a Fellow in our Winter 2015 cohort in Washington, DC who landed a job as a Data Scientist at Samsung SDS.

 

Tell us about your background. How did it set you up to be a great Data Scientist?

I have a PhD in electrical and computer engineering from Lehigh University. I already had a good background in probability, statistics, and mathematical optimization that helped me in understanding the essence of data science. I was part of the Electrical and Computer Engineering Department and I was working on making the Internet more secure, by developing theoretical frameworks for timing attacks on anonymous networks such as TOR.

One of the key contributions of my PhD work was that I was able to find the closed form characterization of maximum achievable anonymity in a simple Chaum Mix (the basic building blocks of an anonymous network).

One important thing that I learnt from my PhD work was how to do research. This process consists of the following four parts:

  1. Find an interesting problem to work on.
  2. Formulate the problem in a concrete mathematical framework.
  3. Identify the mathematical tools required to solve the problem.
  4. Convince your adviser and the world that you solved an important problem worth publishing in a reputed conference or journal.

This whole process allowed me to work on an unstructured problem. I had to keep my eyes open to find the problem in the domain I was working on. It also helped me become a better data scientist by just changing the role a little bit. Now I keep an eye out for any interesting patterns in the data. Anything that is unexpected is interesting.

 

What do you think you got out of The Data Incubator?

There are tons of things that I got from The Data Incubator. First, The Data Incubator introduced me to a nice and comprehensive overview of the current techniques in data science. That includes everything from linear regression to Spark. Second, by hearing from many different companies, I got a feeling for the different types of problems that they tackle and how data science offers real solutions in a variety of fields. The Data Incubator exposed me to how data science is actually used by companies. Finally, The Data Incubator helped expand my network both by introducing me to companies looking for data scientists, and by introducing me to the other Fellows.



Asking the Right Questions: Alumni Spotlight on Suchandan Pal

At The Data Incubator we run a free eight-week Data Science Fellowship Program to help our Fellows land industry jobs. We love Fellows with diverse academic backgrounds that go beyond what companies traditionally think of when hiring Data Scientists. Suchandan was a Fellow in our Fall 2016 cohort in San Francisco who landed a job at our hiring partner, Argyle Data (now Mavenir).

 

Tell us about your background. How did it set you up to be a great Data Scientist?

I did my PhD in a part of mathematics known as number theory/arithmetic algebraic geometry. I’ve always been drawn to difficult and impactful problems, and my training has provided me with invaluable skills that I use in problem solving every day.

Knowing techniques and tools is important, but asking the right questions (and knowing which to avoid) is often what makes the difference between a problem you can solve, and one that remains intractable. For example, there have been many times where choosing the right strategy or perspective has made extremely difficult conjectures appear “natural” in number theory/arithmetic algebraic geometry. I have always found my experiences in mathematics to give me skills that guide me in problem solving outside of mathematics, and for that I am very appreciative.

 

What do you think you got out of The Data Incubator?

I enjoyed learning from Robert, the instructor of the San Francisco cohort. I also liked that the Fellowship program gave me exposure to different sectors of industry.

 


JUST Capital and The Data Incubator Challenge


 

Today, we’re excited to announce that we’re teaming up with JUST Capital to help crowd-source data science for social good. The Data Incubator offers a free eight-week data science fellowship for those with a PhD or a master's degree looking to transition into data science. As a part of the application process, students are asked to submit a data science capstone project, and the best students are invited to work on them during the fellowship. JUST Capital is helping by providing data and project prompts to harness the collective brainpower of The Data Incubator fellows to solve these high-impact social problems.

  • These projects focus on applied data science techniques with tangible impacts on JUST Capital’s mission.
  • The projects are open-ended and creativity is encouraged. The documents provided below are suitable for analysis, but one should not shy away from seeking out additional sources of data.

JUST Capital is a nonprofit that provides information and rankings on how large corporations perform on issues that matter most to the public. We give individuals a voice on what really matters to them, and evaluate how companies perform on those issues. By providing the right knowledge and making it easy to access and understand, we believe capital will flow to corporations that are more JUST, ultimately leading to a balanced business world that takes into account human needs that are so often neglected today. The meaning of JUST is defined by the American public as fair, equitable and balanced. In 2016, JUST Capital surveyed nearly 4,000 Americans from all regions and walks of life, in its second annual Poll on Corporate America. The issues identified by the public form the basis of our benchmark — it is against these Drivers and Components that we measure corporate performance. The most important factors broadly relate to employees, customers, company leadership, the environment, communities and investors.



How Employers Judge Data Science Projects

One of the more commonly used screening devices for data science is the portfolio project. Applicants submit a project showcasing a piece of data science that they’ve accomplished. At The Data Incubator, we run a free eight-week fellowship helping train and transition people with master's and PhD degrees for careers in data science. One of the key components of the program is completing a capstone data science project to present to our (hundreds of) hiring employers. In fact, a major part of the fellowship application process is proposing that very capstone project, with many successful candidates having projects that are substantially far along if not nearly completed. Based on conversations with partners, here’s our sense of priorities for what makes a good project, ranked roughly in order of importance:

  1. Completion: While their potential is important, projects are assessed primarily based on the success of analysis performed rather than the promise of future work. Working in any industry is about getting things done quickly, not perfectly, and projects with many gaps, “I wish I had time for” caveats, or “future steps” suggest the applicant may not be able to get things done at work.
  2. Practicality: High-impact problems of general interest are more interesting than theoretical discussions on academic research problems. If you solve the problem, will anyone care? Identifying interesting problems is half the challenge, especially for candidates leaving academia who must disprove an inherent “academic” bias.
  3. Creativity: Employers are looking for creative, original thinkers who can either (1) identify new datasets or (2) find novel questions to ask about a dataset. Employers do not want to see the tenth generic presentation on Citibike (or Chicago Crime, Yelp Restaurant Ratings data, NYC Restaurant Inspection Data, NYC Taxi, BTS Flight Delay, Amazon Review, Zillow home price, World Bank or other macroeconomic data, or beating the stock market) data. Similarly, projects that explain a non-obvious thesis supported by concise plots are more compelling than ones that present obvious conclusions (e.g. “more riders use Citibike during the day than at night”). Employers are looking for data scientists who can find trends in the data that they don’t already know.

Data Sources for Cool Data Science Projects: Part 6

Links to Part 1, Part 2, Part 3, Part 4, Part 5

At The Data Incubator, we run a free eight-week data science fellowship to help our Fellows land industry jobs. Our hiring partners love considering Fellows who don’t mind getting their hands dirty with data. That’s why our Fellows work on cool capstone projects that showcase those skills. One of the biggest obstacles to successful projects has been getting access to interesting data. Here are a few cool public data sources you can use for your next project:



Data Sources for Cool Data Science Projects: Part 5

Links to Part 1, Part 2, Part 3, Part 4

At The Data Incubator, we run a free eight-week data science fellowship to help our Fellows land industry jobs. Our hiring partners love considering Fellows who don’t mind getting their hands dirty with data. That’s why our Fellows work on cool capstone projects that showcase those skills. One of the biggest obstacles to successful projects has been getting access to interesting data. Here are some more cool public data sources you can use for your next project:



Data Sources for Cool Data Science Projects: Part 4

Links to Part 1, Part 2, Part 3

At The Data Incubator, we run a free eight-week data science fellowship to help our Fellows land industry jobs. Our hiring partners love considering Fellows who don’t mind getting their hands dirty with data. That’s why our Fellows work on cool capstone projects that showcase those skills. One of the biggest obstacles to successful projects has been getting access to interesting data. Here are some more cool public data sources you can use for your next project:


Data Sources for Cool Data Science Projects: Part 3

Links to Part 1, Part 2

At The Data Incubator, we run a free eight-week data science fellowship to help our Fellows land industry jobs. Our hiring partners love considering Fellows who don’t mind getting their hands dirty with data. That’s why our Fellows work on cool capstone projects that showcase those skills. One of the biggest obstacles to successful projects has been getting access to interesting data. Here are some more cool public data sources you can use for your next project:


The Data Incubator Featured in The Next Web

Today, The Data Incubator was featured in The Next Web. The article, “Data Incubator opens a West Coast campus to groom the next generation of data scientists,” can be found below and on The Next Web.

 

Data Incubator, an East Coast fellowship program, is expanding to the West Coast with a new office in San Francisco. Its goal is to prepare highly qualified scientists and engineers for work as quants or data scientists.

The Bay Area campus has already accepted 10 fellows for its inaugural class.

Data Incubator takes fellows with “most of the advanced education and foundational knowledge required to pursue a professional data science role.” Those with master's degrees or PhDs in computer science, mathematics, social science, statistics, and physics are best positioned to be accepted.