The Data Science Machine

In this episode

The Data Science Machine is an end-to-end software system that automatically develops predictive models from relational data. The Machine was created by Max Kanter and Kalyan Veeramachaneni at the Computer Science and Artificial Intelligence Laboratory (CSAIL) at MIT. The system automates two of the most human-intensive components of a data science endeavor: feature engineering, and the selection and tuning of the machine learning methods that build predictive models from those features. First, an algorithm called Deep Feature Synthesis automatically engineers features. Next, through an approach called Deep Mining, the Machine composes a generalized machine learning pipeline that includes dimensionality reduction methods, feature selection methods, clustering, and classifier design. Finally, it tunes the pipeline's parameters through a Gaussian Copula Process.
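To give a rough sense of what engineering features from relational data can look like, here is a minimal sketch in Python. It is not the authors' Deep Feature Synthesis implementation. It only illustrates one idea in that spirit: summarizing the rows of a child table into features on the parent table. The tables (`customers`, `orders`), the column names, and the helper `aggregate_child_features` are all hypothetical and chosen purely for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy relational data: one parent table and one child table.
customers = [{"customer_id": 1}, {"customer_id": 2}]
orders = [
    {"order_id": 10, "customer_id": 1, "amount": 20.0},
    {"order_id": 11, "customer_id": 1, "amount": 40.0},
    {"order_id": 12, "customer_id": 2, "amount": 5.0},
]

# Aggregations applied across the one-to-many relationship (an assumed set).
AGGREGATIONS = {"count": len, "sum": sum, "mean": mean, "max": max}


def aggregate_child_features(parents, children, key, column):
    """Return one row per parent with one new feature per aggregation."""
    # Collect the child values that belong to each parent key.
    grouped = defaultdict(list)
    for row in children:
        grouped[row[key]].append(row[column])

    features = []
    for parent in parents:
        values = grouped.get(parent[key], [])
        feature_row = dict(parent)
        for name, func in AGGREGATIONS.items():
            if values:
                feature_row[f"{name}({column})"] = func(values)
            else:
                # No child rows: count is 0, other aggregates are undefined.
                feature_row[f"{name}({column})"] = 0 if name == "count" else None
        features.append(feature_row)
    return features


customer_features = aggregate_child_features(
    customers, orders, key="customer_id", column="amount"
)
```

Each customer row now carries summary features of its orders, which a downstream model could consume. A fuller system would apply such aggregations across many tables and relationships, which this sketch does not attempt.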

About the speakers

Principal Research Scientist, MIT Laboratory for Information and Decision Systems (LIDS)

Kalyan is a Principal Research Scientist at the MIT Laboratory for Information and Decision Systems (LIDS). Previously, he was a Research Scientist at MIT CSAIL. His primary research interests are machine learning and building large-scale statistical models that enable discovery from large amounts of data. His research sits at the intersection of big data, machine learning, and data science. He directs a research group called Data to AI in the new MIT Institute for Data, Systems, and Society (IDSS). The group is interested in big data science and machine learning, and focuses on solving the foundational issues that prevent artificial intelligence and machine learning solutions from reaching their full potential for societal applications.