Data are becoming the new raw material of business
The Economist


Ranking Popular Distributed Computing Packages for Data Science

At The Data Incubator, we strive to provide the most up-to-date data science curriculum available. Using feedback from our corporate and government partners, we deliver training on the most sought-after data science tools and techniques in industry. We wanted to take a more data-driven approach to developing the curriculum for our corporate data science training and our free Data Science Fellowship program for PhD and master’s graduates looking to get hired as professional data scientists. To that end, we started by ranking popular deep learning libraries for data science. Next, we analyzed the popularity of distributed computing packages for data science. Here are the results.

The Rankings

Below is a ranking of the top 20 of 140 distributed computing packages that are useful for data science, based on GitHub and Stack Overflow activity, as well as Google Search results. The table shows standardized scores, where a value of 1 means one standard deviation above average (average = score of 0). For example, Apache Hadoop is 6.6 standard deviations above average in Stack Overflow activity, while Apache Flink is close to average. See below for methods.
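
As a rough illustration of what a standardized score means, here is a minimal Python sketch. It assumes a plain z-score computed with the population standard deviation; the method actually used for the rankings may differ. The package names and counts are invented for illustration and are not real data.

```python
from statistics import mean, pstdev


def standardize(values):
    """Return z-scores: (value - mean) / standard deviation.

    Assumption: population standard deviation (pstdev). A sample
    standard deviation would give slightly different scores.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical, so every score is exactly average.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical activity counts for four made-up packages (not real data).
counts = {"pkg_a": 2, "pkg_b": 4, "pkg_c": 4, "pkg_d": 6}
scores = dict(zip(counts, standardize(list(counts.values()))))

for name, score in scores.items():
    print(f"{name}: {score:+.2f}")
```

A score of 0 marks a package that sits exactly at the average, a positive score marks above-average activity, and a negative score marks below-average activity.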


Ranking Popular Deep Learning Libraries for Data Science

At The Data Incubator, we pride ourselves on having the most up-to-date data science curriculum available. Much of our curriculum is based on feedback from corporate and government partners about the technologies they are using and learning. In addition to their feedback, we wanted to develop a data-driven approach for determining what we should teach in our data science corporate training and our free fellowship for master’s and PhD graduates looking to enter data science careers in industry. Here are the results.
 

The Rankings

Below is a ranking of 23 open-source deep learning libraries that are useful for data science, based on GitHub and Stack Overflow activity, as well as Google Search results. The table shows standardized scores, where a value of 1 means one standard deviation above average (average = score of 0). For example, Caffe is one standard deviation above average in GitHub activity, while deeplearning4j is close to average. See below for methods.




Ranked: 15 Python Packages for Data Science


At The Data Incubator, we pride ourselves on having the latest data science curriculum. Much of our course material is based on feedback from corporate and government partners about the technologies they are looking to learn. However, we wanted to develop a more data-driven approach to what we teach in our data science corporate training and our free fellowship for data science master’s and PhD graduates looking to begin their careers in the industry.

This report is the second in a series analyzing data science topics. To see more, be sure to check out our R Packages for Machine Learning report. We thought it would be useful to the data science community to rank and analyze a variety of topics related to the profession in simple, easy-to-digest cheat sheets, rankings, or reports.


Ranked: 16 R Packages for Machine Learning

At The Data Incubator, we pride ourselves on having the latest data science curriculum. Much of our course material is based on feedback from corporate and government partners about the technologies they are looking to learn. However, we wanted to develop a more data-driven approach to what we teach in our data science corporate training and our free fellowship for data science master’s and PhD graduates looking to begin their careers in the industry.

This report is the first in a series analyzing data science topics. We thought it would be useful to the data science community to rank and analyze a variety of topics related to the profession in simple, easy-to-digest cheat sheets, rankings, or reports.
