Why Automation?

Manual vs Automated Feature Engineering

There are few certainties in data science — libraries, tools, and algorithms constantly change as better methods are developed. However, one trend that is not going away is the move towards increased levels of automation.

The traditional approach to feature engineering is to build features one at a time using domain knowledge, a tedious, time-consuming, and error-prone process known as manual feature engineering. The code for manual feature engineering is

  • problem-dependent, and

  • must be rewritten for each new dataset.

Automated feature engineering improves upon this standard workflow by automatically extracting useful and meaningful features from a set of related data tables with a framework that can be applied to any problem.
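The contrast between the two approaches can be illustrated with a small sketch in plain Python. This is not the Featuretools API: the tables, column names, and the helper `aggregate_child_features` are all hypothetical, and the sketch covers only one simple idea (applying a fixed set of aggregations to every numeric column of a child table), not the full range of features an automated framework can build.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy data: a parent table (clients) and a related child table (loans).
clients = [{"client_id": 1}, {"client_id": 2}]
loans = [
    {"loan_id": 10, "client_id": 1, "amount": 500.0, "rate": 0.05},
    {"loan_id": 11, "client_id": 1, "amount": 1500.0, "rate": 0.07},
    {"loan_id": 12, "client_id": 2, "amount": 2000.0, "rate": 0.04},
]


# Manual approach: one hand-written feature, tied to this dataset's column names.
def manual_mean_loan_amount(client_id):
    amounts = [row["amount"] for row in loans if row["client_id"] == client_id]
    return mean(amounts)


# Automated approach (sketch): a reusable set of aggregations that is applied
# to every numeric column of any child table, grouped by the parent key.
AGGREGATIONS = {"mean": mean, "max": max, "min": min, "sum": sum, "count": len}


def aggregate_child_features(parents, children, parent_key, child_name, skip=()):
    # Treat every int/float column as numeric, except the key and skipped columns.
    numeric_cols = [
        col
        for col, value in children[0].items()
        if isinstance(value, (int, float)) and col != parent_key and col not in skip
    ]

    # Group child rows by the parent they belong to.
    grouped = defaultdict(list)
    for row in children:
        grouped[row[parent_key]].append(row)

    feature_rows = []
    for parent in parents:
        rows = grouped[parent[parent_key]]
        features = {parent_key: parent[parent_key]}
        for col in numeric_cols:
            values = [r[col] for r in rows]
            for agg_name, agg in AGGREGATIONS.items():
                # A parent with no child rows gets None for every feature.
                features[f"{agg_name}({child_name}.{col})"] = agg(values) if values else None
        feature_rows.append(features)
    return feature_rows


feature_matrix = aggregate_child_features(
    clients, loans, parent_key="client_id", child_name="loans", skip=("loan_id",)
)
```

The manual function yields one feature and has to be rewritten whenever the column names change. The generic helper yields ten features here (five aggregations over two numeric columns) and can be pointed at a different pair of related tables without modification.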

Results

Featuretools has been used on numerous real-world datasets. In one example, which used a dataset of loan applications to predict future loan default, automation cut development time substantially while producing far more features and a slightly better score:

  • Development time (everything required to produce the final feature engineering code): 10 hours manual vs 1 hour automated

  • Number of features produced by the method: 30 manual vs 1820 automated

  • Improvement relative to baseline (the % gain over the baseline, compared to the top public leaderboard score, for a model trained on the features): 65% manual vs 66% automated

Learn more

To see the full study and the results on two other datasets, read the article "Why Automated Feature Engineering Will Change the Way You Do Machine Learning".