Forecasting Big Time Series: Old and New

Time: 11:00 - 12:30pm, Tuesday, August 28, 2018
Location: Segóvia IV, Windsor Barra Hotel & Congresses, Rio de Janeiro

Overview

Time series forecasting is a key ingredient in the automation and optimization of business processes: in retail, deciding which products to order and where to store them depends on the forecasts of future demand in different regions; in cloud computing, the estimated future usage of services and infrastructure components guides capacity planning; and workforce scheduling in warehouses and factories requires forecasts of the future workload. Recent years have witnessed a paradigm shift in forecasting techniques and applications, from computer-assisted model- and assumption-based to data-driven and fully automated. This shift can be attributed to the availability of large, rich, and diverse time series data sources. The challenges that need to be addressed are therefore the following. How can we build statistical models to efficiently and effectively learn to forecast from large and diverse data sources? How can we leverage the statistical power of “similar” time series to improve forecasts in the case of limited observations? What are the implications for building forecasting systems that can handle large data volumes?

The objective of this tutorial is to provide a concise and intuitive overview of the most important methods and tools available for solving large-scale forecasting problems. We review the state of the art in three related fields: (1) classical modeling of time series, (2) scalable tensor methods, and (3) deep learning for forecasting. Further, we share lessons learned from building scalable forecasting systems. While our focus is on providing an intuitive overview of the methods and practical issues, which we will illustrate via case studies, we also present some technical details underlying these powerful tools.

Presenters

Christos Faloutsos (CMU and Amazon)
Jan Gasthaus (AWS AI Labs)
Tim Januschowski (AWS AI Labs)
Yuyang (Bernie) Wang (AWS AI Labs)

Slides

[PDF] (The animations can be viewed only in Acrobat Reader; in Preview, they appear as static images.)

Cite [PDF]

@article{faloutsos2018forecasting,
  title={Forecasting Big Time Series: Old and New},
  author={Faloutsos, Christos and Gasthaus, Jan and Januschowski, Tim and Wang, Yuyang},
  journal={Proceedings of the VLDB Endowment},
  volume={11},
  number={12},
  year={2018}
}

Tentative Schedule

Introduction to Forecasting
- Basic (explanatory) analysis and decomposition of time series, i.e., trend, level, seasonality, etc.
- Point forecast vs. probabilistic forecast
- Forecast accuracy metrics and backtest scenarios
Classical approaches (local, learning one time series at a time)
- Naive baselines: mean, drift, seasonal naive, …
- Generalized Linear Models (GLM), Autoregressive GLM (AR models)
- Exponential smoothing, state-space models
Modern approaches (globally finding patterns)
- Large scale tensor analysis
- Deep learning for forecasting
  - Multi-layer perceptron (feedforward neural networks)
  - Recurrent neural networks (RNNs): canonical, sequence-to-sequence, and other architectures
  - Other structures: convolution, WaveNet, and all that
Lessons learned from building forecasting systems
- Building large scale forecasting systems
- Developing Deep Autoregressive Network (DeepAR) in AWS SageMaker
- Getting started with Forecasting

Regarding the last topic, we will hold DeepAR demos at the Amazon booth!
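
As a small illustration of the naive baselines listed in the schedule above (mean, drift, seasonal naive), here is a minimal Python sketch. It is not taken from the tutorial material: the function names, the toy series, and the choice of plain Python lists are assumptions made for illustration only.

```python
from statistics import mean


def mean_forecast(history, horizon):
    """Repeat the historical mean for every future step."""
    level = mean(history)
    return [level] * horizon


def drift_forecast(history, horizon):
    """Extend the straight line through the first and last observations."""
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return [history[-1] + slope * h for h in range(1, horizon + 1)]


def seasonal_naive_forecast(history, horizon, season_length):
    """Repeat the values of the last fully observed season."""
    last_season = history[-season_length:]
    return [last_season[h % season_length] for h in range(horizon)]


# Hypothetical toy series with a seasonal period of 3 and a mild upward trend.
history = [10, 12, 14, 11, 13, 15, 12, 14, 16]

print(mean_forecast(history, 3))
print(drift_forecast(history, 3))
print(seasonal_naive_forecast(history, 4, season_length=3))
```

Each baseline is local in the sense used in the schedule: it looks at a single time series and needs no training beyond that series' own history, which is why such methods are a common first point of comparison for more elaborate models.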

Presenters’ Bio

Christos Faloutsos is a Professor at Carnegie Mellon University. He has received the Presidential Young Investigator Award from the National Science Foundation (1989), the Research Contributions Award at ICDM 2006, the SIGKDD Innovations Award (2010), twenty “best paper” awards (including two test-of-time awards), and four teaching awards. Five of his advisees have received KDD or SCS dissertation awards. He is an ACM Fellow and has served as a member of the executive committee of SIGKDD; he has published over 300 refereed articles, 17 book chapters, and two monographs. He holds eight patents and has given over 40 tutorials and over 20 invited distinguished lectures. His research interests include data mining for graphs and streams, fractals, database performance, and indexing for multimedia and bioinformatics data.

Jan Gasthaus is a Senior Machine Learning Scientist in the Amazon AI Labs, working mainly on time series forecasting and large-scale probabilistic machine learning. He is passionate about developing novel machine learning solutions for addressing challenging business problems with scalable machine learning systems, all the way from scientific ideation to productization. Prior to joining Amazon, Jan obtained a BS in Cognitive Science from the University of Osnabrueck, an MS in Intelligent Systems from UCL, and pursued a PhD at the Gatsby Unit, UCL, focusing on Nonparametric Bayesian methods for sequence data.

Tim Januschowski is a Machine Learning Science Manager in Amazon AI Labs. He has worked on forecasting since starting his professional career. At Amazon, he has produced end-to-end solutions for a wide variety of forecasting problems, from demand forecasting to server capacity forecasting. Tim’s personal interests in forecasting span applications, systems, algorithms, and modeling aspects, as well as the downstream mathematical programming problems. He studied Mathematics at TU Berlin, IMPA, Rio de Janeiro, and Zuse-Institute Berlin and holds a PhD from University College Cork.

Yuyang (Bernie) Wang is a Senior Machine Learning Scientist in Amazon AI Labs, working mainly on large-scale probabilistic machine learning and its application to forecasting. He received his PhD in Computer Science from Tufts University, MA, US, and he holds an MS from the Department of Computer Science at Tsinghua University, Beijing, China. His research interests span statistical machine learning, numerical linear algebra, and random matrix theory. In forecasting, Yuyang has worked on all aspects ranging from practical applications to theoretical foundations.

Some recent tutorials by Christos and Co. on big time series mining:

Notebooks with MXNet Gluon

Several of the notebooks come from the time series chapter we are writing for Deep Learning – The Straight Dope, an interactive book on deep learning by our colleagues at Amazon: Zachary C. Lipton (@zackchase), Mu Li (@mli), Alex Smola (@smolix), Sheng Zha (@szha), Aston Zhang (@astonzhang), and others.

Prerequisite:
- A crash course on Gluon
- Gluon Tutorial at KDD18 and KDD18-Gluon repository
Introduction to Forecasting
- Basic elements of forecasting
- Forecast Evaluation: metrics, backtest scenarios
Classical Models
Neural Network Models
- Multi-layer Perceptron
- Recurrent Neural Networks (RNN)
- Bayesian RNN
- CNN-based models
Doing Forecasting with DeepAR in AWS SageMaker
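
For readers who want a feel for the forecast evaluation topic before opening the notebooks, the following is a minimal sketch of a quantile (pinball) loss combined with a rolling-origin backtest. It is an illustration under stated assumptions, not the notebooks' code: `quantile_loss`, `rolling_backtest`, `last_value`, and the toy series are hypothetical names and data chosen here.

```python
def quantile_loss(actual, predicted, q):
    """Pinball loss of a q-quantile forecast, averaged over the horizon."""
    total = 0.0
    for y, y_hat in zip(actual, predicted):
        diff = y - y_hat
        # Under-forecasts are weighted by q, over-forecasts by (1 - q).
        total += max(q * diff, (q - 1) * diff)
    return total / len(actual)


def rolling_backtest(series, horizon, n_windows, forecaster, metric):
    """Score a forecaster on n_windows consecutive test windows at the end of the series."""
    scores = []
    for w in range(n_windows, 0, -1):
        cutoff = len(series) - w * horizon
        train = series[:cutoff]                  # data available at forecast time
        test = series[cutoff:cutoff + horizon]   # held-out future values
        scores.append(metric(test, forecaster(train, horizon)))
    return scores


def last_value(history, horizon):
    """Hypothetical stand-in forecaster: repeat the last observation."""
    return [history[-1]] * horizon


# Hypothetical toy series; any list of numbers works.
series = [10, 12, 14, 11, 13, 15, 12, 14, 16, 13, 15, 17]

scores = rolling_backtest(
    series,
    horizon=3,
    n_windows=2,
    forecaster=last_value,
    metric=lambda actual, predicted: quantile_loss(actual, predicted, q=0.5),
)
print(scores)
```

With `q=0.5` the pinball loss is half the mean absolute error, so it scores a point (median) forecast; evaluating several values of `q` is one simple way to assess a probabilistic forecast. Each backtest window trains only on data before its cutoff, which mimics how the forecaster would be used in practice.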