Data are becoming the new raw material of business
The Economist


Data Sources for Cool Data Science Projects: Part 5

Links to Part 1, Part 2, Part 3, Part 4

At The Data Incubator, we run a free eight-week data science fellowship to help our Fellows land industry jobs. Our hiring partners love considering Fellows who don’t mind getting their hands dirty with data. That’s why our Fellows work on cool capstone projects that showcase those skills. One of the biggest obstacles to a successful project is getting access to interesting data. Here are some more cool public data sources you can use for your next project:

Environmental Data

  1. Climate change: Climate data is hot right now, and US Climate Data is a good starting point for a lot of great, up-to-date datasets. You can find more detailed sets from NOAA, and climate change sets from the US government.
  2. Sea Ice: University of Colorado’s National Snow and Ice Data Center publishes the Sea Ice Index, which records ice coverage in the Antarctic and Arctic Oceans. Their datasets include daily and monthly measures from 1978 to now!
  3. Forest Coverage: The World Bank maintains data on forest coverage per country and across the globe. Fun fact: Over 98% of land area in Suriname was forest in 2015.

Education

  1. Education Fulfillment: Researchers at the Wittgenstein Centre for Demography and Global Human Capital in Vienna have compiled a dataset of historical and projected education levels for over 150 countries, going back to 1970 and projecting forward to 2060. You can download the complete dataset through the Wittgenstein Centre Data Explorer.
  2. Student Loans: The US Department of Education publishes default rates for student loans, broken down by school, school type, and state. It recently published data covering students whose loans came due for repayment in 2013.

International Data

  1. New Zealand National Statistics: New Zealand has a rather impressive national statistics website. The small nation publishes data on everything from businesses and abortion to the Māori census.
  2. International Financial History: The Jordà-Schularick-Taylor Macrohistory Database contains data for 17 “advanced” economies dating back to 1870, updated on an annual basis. Its authors claim it is the “most extensive long run macro-financial dataset to date.”

 

While building your own project cannot replicate the experience of a fellowship at The Data Incubator (our Fellows get amazing access to hiring managers and to nonpublic data sources), we hope this will get you excited about working in data science. And when you are ready, you can apply to be a Fellow!

Got any more data sources?  Let us know and we’ll add them to the list!

 

 
