Python Libraries for Machine Learning

Machine learning, augmented by deep learning and the broader field of artificial intelligence, supports many practical applications. With the help of analytics and statistics, many people are navigating the vast universe of machine learning, deep learning, artificial intelligence, and big data. They do not need to qualify as data scientists to do so, because popular machine learning libraries are available in Python.

Machine learning is driving deep learning and AI in all kinds of machine assistance, including driverless cars, better preventive healthcare, and even better movie recommendations.


Theano

A machine learning group at the Université de Montréal developed and released Theano a decade ago. In the machine learning community, Theano is one of the most widely used mathematical compilers for CPUs and GPUs. A 2016 paper describes Theano as a “Python framework for fast computation of mathematical expressions,” and offers a thorough overview of the library.

According to the paper, several software packages have been developed to build on the strengths of Theano, offering higher-level user interfaces that make them more suitable for specific goals. For instance, Keras and Lasagne were developed to make it easier to express the architecture of deep learning models and their training algorithms as mathematical expressions that Theano evaluates.

Likewise, PyMC3, a probabilistic programming framework, uses Theano to derive expressions for gradients automatically and to generate C code for fast execution. Theano's popularity shows on GitHub, where it has been forked more than 2,000 times, has almost 300 contributors, and has more than 25,000 commits.
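To make the idea of deriving gradient expressions automatically more concrete, here is a minimal sketch in plain Python. It does not use Theano and does not reflect Theano's API. The classes `Var`, `Const`, `Add`, and `Mul` and the functions `grad` and `evaluate` are hypothetical names chosen for illustration. The sketch only shows the general principle: an expression is stored as data, so a program can build a new expression for its derivative.

```python
# Toy symbolic expressions in plain Python. This is NOT Theano's API;
# every class and function name here is a hypothetical illustration.
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Const:
    value: float


@dataclass(frozen=True)
class Add:
    left: object
    right: object


@dataclass(frozen=True)
class Mul:
    left: object
    right: object


def grad(expr, wrt):
    """Build a new expression for d(expr)/d(wrt) with the sum and product rules."""
    if isinstance(expr, Var):
        return Const(1.0 if expr == wrt else 0.0)
    if isinstance(expr, Const):
        return Const(0.0)
    if isinstance(expr, Add):
        return Add(grad(expr.left, wrt), grad(expr.right, wrt))
    if isinstance(expr, Mul):
        return Add(Mul(grad(expr.left, wrt), expr.right),
                   Mul(expr.left, grad(expr.right, wrt)))
    raise TypeError(f"unsupported expression: {expr!r}")


def evaluate(expr, env):
    """Evaluate an expression given a dict that maps variable names to numbers."""
    if isinstance(expr, Var):
        return env[expr.name]
    if isinstance(expr, Const):
        return expr.value
    if isinstance(expr, Add):
        return evaluate(expr.left, env) + evaluate(expr.right, env)
    if isinstance(expr, Mul):
        return evaluate(expr.left, env) * evaluate(expr.right, env)
    raise TypeError(f"unsupported expression: {expr!r}")


x = Var("x")
# f(x) = x*x + 3*x, so df/dx = 2*x + 3
f = Add(Mul(x, x), Mul(Const(3.0), x))
df_dx = grad(f, x)

value_at_2 = evaluate(f, {"x": 2.0})      # 4 + 6 = 10
slope_at_2 = evaluate(df_dx, {"x": 2.0})  # 2*2 + 3 = 7
```

A real library would also simplify the derived expression and compile it to fast native code, which this sketch leaves out.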


TensorFlow

TensorFlow, a relative newcomer to the world of open source, is a library for numerical computing that uses data flow graphs. In its first year alone, TensorFlow has helped students, artists, engineers, researchers, and many others. According to the Google Developers Blog, TensorFlow has helped with preventing blindness in diabetes, detecting skin cancer early, translating languages, and more.
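The phrase "data flow graph" can be illustrated with a small sketch in plain Python. It is not TensorFlow code and does not reflect TensorFlow's API. `Node`, `placeholder`, and `run` are hypothetical names used only for illustration. Operations are the nodes of a graph, the values passed between them are the edges, and nothing is computed until the graph is run with concrete inputs.

```python
# Toy data flow graph in plain Python. This is NOT TensorFlow's API;
# Node, placeholder, and run are hypothetical names for illustration.
import operator


class Node:
    """One operation in the graph; its inputs are the edges that carry data in."""

    def __init__(self, op=None, inputs=(), name=None):
        self.op = op
        self.inputs = tuple(inputs)
        self.name = name


def placeholder(name):
    """A node with no operation; its value is supplied when the graph is run."""
    return Node(name=name)


def run(node, feed, cache=None):
    """Evaluate a node after its dependencies, computing each node at most once."""
    if cache is None:
        cache = {}
    if node in cache:
        return cache[node]
    if node.op is None:
        value = feed[node.name]
    else:
        value = node.op(*(run(dep, feed, cache) for dep in node.inputs))
    cache[node] = value
    return value


# Building the graph performs no arithmetic yet.
a = placeholder("a")
b = placeholder("b")
total = Node(operator.add, [a, b])
result = Node(operator.mul, [total, total])  # (a + b) * (a + b)

# Data flows through the graph only when it is run with concrete inputs.
output = run(result, {"a": 2, "b": 3})  # (2 + 3) * (2 + 3) = 25
```

Because the graph is described separately from its execution, the same graph can be run again with different inputs, for example `run(result, {"a": 1, "b": 1})`.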

TensorFlow appears several times in the most recent Open Source Yearbook. It was included in the list of the top ten open source projects to watch in 2017. In a tour of Google’s 2016 open source releases, an article by Josh Simmons refers to Magenta, a TensorFlow-based project.

According to Simmons, Magenta advances the state of the art in machine intelligence for music and art generation. It also helps build a collaborative community of coders, artists, and researchers working with machine learning. Rachel Roumeliotis also lists TensorFlow as a language for powering AI in her roundup of hot programming trends of 2016.

Anyone can learn more about TensorFlow by watching the recorded live stream from the TensorFlow Dev Summit 2017, or by reading the DZone series TensorFlow on the Edge.


Scikit-Learn

Engineers at Spotify use Scikit-Learn to recommend music, at OkCupid to help evaluate and improve the matchmaking system, and at Birchbox to explore phases of new product development. Scikit-Learn is built on Matplotlib, SciPy, and NumPy. It has 800 contributors on GitHub and almost 22,000 commits.

The Scikit-Learn project website offers free tutorials on using Scikit-Learn for machine learning. Alternatively, readers can watch the PyData Chicago 2016 talk given by Sebastian Raschka.