Data are becoming the new raw material of business
The Economist


Automatically Generating License Data from Python Dependencies

We all know how important keeping track of your open-source licensing is for the average startup.  While most people think of open-source licenses as all being the same, there are meaningful differences that could have potentially serious legal implications for your code base.  From permissive licenses like MIT or BSD to so-called “reciprocal” or “copyleft” licenses, keeping track of the alphabet soup of dependencies in your source code can be a pain.

Today, we’re releasing pylicense, a simple python module that will add license data as comments directly from your requirements.txt or environment.yml files.

Continue reading

Tweet about this on TwitterShare on FacebookShare on LinkedIn

Chef and Google App Engine

This post was written by our very own Christian Moscardi and originally posted here on his blog.

 

Anyone who’s ever tried to write a nontrivial application on Google App Engine has encountered at least seven* design decisions that have led to serious head-scratching moments. One of those happened to me about a month ago, while integrating Chef into our course at The Data Incubator. Our goal was to allow for one-click spinning up (on DigitalOcean’s cloud) and monitoring of our Fellows’ course machines, already under Chef management.

* No basis in fact – there are probably more than seven. It should be noted that the Google Cloud Platform is going to greatly improve this situation by allowing you to deploy Docker containers – woohoo!

 

A First Look

Chef servers have an HTTP API. Seems like it’d be an easy integration, right? While GAE doesn’t let you do many things (including making SMTP connections), one thing you, thankfully, can do with relative ease is make HTTP requests (although everyone’s favorite Python HTTP library, requests, is a total nightmare – but that’s for another blogpost). This was going to be a quick job – we’d spend a couple days coding, write some tests, and have one-click deployment, right? Right? As you probably guessed, that timeline was anything but right.  Continue reading

Tweet about this on TwitterShare on FacebookShare on LinkedIn

Painlessly Deploying Data Apps with Bokeh, Flask, and Heroku

Here at The Data Incubator, our Fellows deploy their own fully functional, public-facing web app to showcase their data science skills to employers. This not only gives them valuable experience dynamically fetching and displaying data, but also encourages them to think about end user interaction. To demo the process, we decided to marry together some of our favorite technologies:

  • Flask, a slick web framework for Python
  • Heroku for cloud-based app deployment
  • Bokeh for interactive, D3.js-style visualizations
  • Git for version control and distributing code

The goal is to create some distant ancestor of Google Finance: a form capable of accepting a stock ticker as input and producing a plot of the daily close price. Here’s the finished product. So how do we get there?

Continue reading

Tweet about this on TwitterShare on FacebookShare on LinkedIn