by Chee Yee Lim

Posted on 2021-08-28

Comparing autoML and one-stop shop machine learning Python packages for time-series analysis

I have tested and reviewed a few Python packages for time-series data analysis, mostly for forecasting. Most of them are one-stop shop machine learning packages, and some also include autoML functionality.

The main objective here is to review and explore Python packages that will shorten the time needed for time-series data analysis.

In summary, kats is the most promising one-stop shop machine learning package for time-series analysis. pycaret-ts-alpha is likely to be a strong contender once it matures out of alpha status and is officially integrated into pycaret.

These libraries tend to be a bit rough around the edges in terms of documentation and API implementation, especially the newer packages. Support for multivariate time-series forecasting is also on the weaker side, as most of them focus on univariate time-series forecasting.

pytorch-forecasting deserves a special mention as it is the only library with a deep learning focus. While I agree that deep learning is very sexy to play with, I am still quite reserved about applying deep learning to time-series problems. Compared to traditional statistical models that have tens of parameters, deep learning models often have millions or billions of parameters to be trained. Fitting an N-BEATS model that has 1.6 million parameters on the air passenger data, which has at most a few hundred data points, feels wrong.

Or as John von Neumann famously said, "With four parameters I can fit an elephant, and with five I can make him wiggle his trunk."
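
To put the parameter-count concern in concrete terms, the sketch below counts the trainable weights in a small stack of fully connected layers and compares the total with the size of the dataset. The layer widths are hypothetical and chosen only for illustration (they are not the actual N-BEATS architecture), and the figure of 144 observations assumes the classic monthly air passenger series covering 1949 to 1960.

```python
def dense_param_count(layer_sizes):
    """Count weights and biases in a stack of fully connected layers."""
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix plus bias vector
    return total


# Hypothetical network: 24 lagged inputs, four hidden layers of 512 units,
# and a 12-step forecast as output.
layer_sizes = [24, 512, 512, 512, 512, 12]
n_params = dense_param_count(layer_sizes)

# Assumption: the classic monthly air passenger series has 144 observations.
n_observations = 144

print(f"parameters: {n_params:,}")
print(f"parameters per observation: {n_params / n_observations:,.0f}")
```

Even this modest network ends up with thousands of parameters per observation, which is the imbalance the von Neumann quote pokes fun at.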

Package | kats | pmdarima | sktime | pytorch-forecasting | pycaret-ts-alpha | autots |
---|---|---|---|---|---|---|
Version | 0.1.0 | 1.8.2 | 0.5.3 | 0.9.0 | 3.0.0.dev1624743408 | 0.3.2 |
Recommended for exploration | No | No | No | No | No | No |
Recommended for production | Yes | Yes | No | No | No | No |
Ease of use | Yes | Yes | No | Yes | No | Yes |
Computation speed | Fast | Fast | Fast | Slow | Medium | Slow |
Installation complexity | Low | Low | Low | Medium | Medium | Low |
One-stop shop | Yes | No | Yes | No | Yes | Yes |
AutoML focus | No | No | No | No | Yes | Yes |
Deep learning focus | No | No | No | Yes | No | No |
Score | 4 | 3 | 2 | 2 | 1 | 1 |

- kats
  - Has a promising list of time-series models implemented, including a good selection of algorithms for change detection and time-series feature extraction.
  - Seems good for stable/production deployment due to stable implementation and good documentation.

- pmdarima
  - Provides the standard baseline model for time-series forecasting.
  - Modelled after its equivalent in R, `auto.arima`.
  - Easier to use as part of a larger one-stop shop library.

- sktime
  - Has a promising list of time-series models and time-series algorithms implemented.
  - Less complete documentation and non-standardised APIs make exploring them slightly trickier.

- pytorch-forecasting
  - Has a focus on using deep learning models for time-series forecasting.
  - Very interesting selection of DL models for time-series forecasting.
  - The non-conventional models implemented in PyTorch Forecasting were often developed very recently.
  - Each model has millions of parameters (so it requires a lot of data to train) and is slow to train.

- pycaret-ts-alpha
  - Has strong potential as it is based on the pycaret framework, but is currently in alpha.
  - Makes experimentation easy and standardised.

- autots
  - Uses a genetic algorithm to find an ensemble of the best models, with options to set up weighted metrics to evaluate model performance.
  - Very slow in building the model ensemble.
  - Documentation and tutorials are lacking, which makes AutoTS slightly more time-consuming to use than other similar packages.
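
The idea of evaluating models with weighted metrics, mentioned for autots above, can be illustrated without the library itself. The sketch below is not AutoTS's implementation or API. The function names, the metric weights, and the toy numbers are all hypothetical, and scaling each metric by the worst candidate is just one simple way to put the metrics on a comparable footing before weighting them.

```python
import math


def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Root mean squared error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def smape(actual, forecast):
    """Symmetric MAPE in percent; assumes |a| + |f| is never zero."""
    terms = (2 * abs(a - f) / (abs(a) + abs(f)) for a, f in zip(actual, forecast))
    return 100 * sum(terms) / len(actual)


METRICS = {"mae": mae, "rmse": rmse, "smape": smape}


def rank_models(actual, forecasts, weights):
    """Rank candidate forecasts by a weighted sum of scaled errors (lower is better)."""
    raw = {
        model: {name: METRICS[name](actual, pred) for name in weights}
        for model, pred in forecasts.items()
    }
    scores = {model: 0.0 for model in forecasts}
    for name, weight in weights.items():
        # Scale each metric by its worst value so the units are comparable.
        worst = max(values[name] for values in raw.values()) or 1.0
        for model in forecasts:
            scores[model] += weight * raw[model][name] / worst
    return sorted(scores.items(), key=lambda item: item[1])


# Toy hold-out values and two hypothetical candidate forecasts.
actual = [112, 118, 132, 129]
forecasts = {
    "naive": [110, 112, 118, 132],
    "drift": [113, 117, 130, 131],
}
weights = {"mae": 1, "rmse": 2, "smape": 3}  # hypothetical weighting

ranking = rank_models(actual, forecasts, weights)
for model, score in ranking:
    print(f"{model}: {score:.3f}")
```

Changing the weights shifts which kind of error is penalised most, which is what makes a weighted score useful when no single metric captures what matters for the forecast.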
