Automatically Generating License Data from Python Dependencies

We all know how important it is for the average startup to keep track of its open-source licensing. While many people treat open-source licenses as interchangeable, there are meaningful differences that could have serious legal implications for your code base. From permissive licenses like MIT or BSD to so-called “reciprocal” or “copyleft” licenses, keeping track of the alphabet soup of licenses across the dependencies in your source code can be a pain.

Today, we’re releasing pylicense, a simple Python module that adds license data as comments directly to your requirements.txt or environment.yml files. It can be run against a requirements.txt file or, with the -e option, against an environment.yml file.

Under the covers, it uses xmlrpclib to fetch package data from PyPI and looks for the “license” tag or the “License” classifier. The operation is also idempotent: if you’ve already commented the file with license data, it will not add the comments again.
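To make the lookup rule concrete, here is a minimal sketch of how a license could be picked out of package metadata once it has been fetched. It is not pylicense's actual code: the function name `extract_license`, the shape of the metadata dict (a `license` string plus a `classifiers` list), and the treatment of `UNKNOWN` as an empty value are all assumptions made for illustration.

```python
def extract_license(metadata):
    """Return a license string from a PyPI-style metadata dict, or None.

    Hypothetical helper: the dict layout is an assumption, not pylicense's API.
    """
    # Prefer the free-text "license" field when it is actually filled in.
    license_field = (metadata.get("license") or "").strip()
    if license_field and license_field.upper() != "UNKNOWN":
        return license_field

    # Otherwise fall back to trove classifiers such as
    # "License :: OSI Approved :: MIT License" and keep the last segment.
    names = [
        classifier.split("::")[-1].strip()
        for classifier in metadata.get("classifiers", [])
        if classifier.startswith("License ::")
    ]
    return ", ".join(names) or None
```

Keeping this step as a pure function over a dict means the network call can be swapped or mocked without touching the license logic.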

Starting with an environment.yml file like this:

name: website
- astroid=1.3.2=py27_0
- cairo=1.12.18
- cffi=1.1.2=py27_0
- dill=0.2.2=py27_0

will yield a file like this one:

name: website
- astroid=1.3.2=py27_0  # LGPL
- cairo=1.12.18  # LGPL 2.1, MPL 1.1
- cffi=1.1.2=py27_0  # MIT
- dill=0.2.2=py27_0  # 3-clause BSD
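The before-and-after files above can be reproduced with a short sketch of the annotation step. This is an illustration under stated assumptions, not pylicense's implementation: `annotate_line`, `annotate`, and the `lookup` callable (package name in, license string or None out) are hypothetical names, and "already has a `#` comment" is assumed to be the idempotency check.

```python
def annotate_line(line, lookup):
    """Append '  # <license>' to a dependency line unless it is already commented."""
    stripped = line.strip()
    # Only touch YAML list items such as "- cffi=1.1.2=py27_0".
    if not stripped.startswith("- "):
        return line
    # Idempotency (assumed rule): a line that already has a comment is left alone.
    if "#" in stripped:
        return line
    # Conda specs look like name=version[=build]; keep only the package name.
    name = stripped[2:].split("=")[0].strip()
    license_name = lookup(name)
    if license_name is None:
        return line
    return f"{line.rstrip()}  # {license_name}"


def annotate(text, lookup):
    """Annotate every dependency line in the text of an environment file."""
    return "\n".join(annotate_line(line, lookup) for line in text.splitlines())
```

For a quick offline check, `lookup` can be as simple as `{"cffi": "MIT"}.get`; running `annotate` a second time on its own output returns the text unchanged.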

The Python module is free and easily installable from GitHub. For more information (including installation and usage), check out the project’s GitHub page. For the record, pylicense itself is licensed under the MIT license. Enjoy!

Editor’s Note: The Data Incubator is a data science education company. We offer a free eight-week Fellowship helping candidates with PhDs and master’s degrees enter data science careers. Companies can hire talented data scientists or enroll employees in our data science corporate training.
