A New Tiny AI Model Is Outsmarting the Big Guys With a Four-Part Brain
by @fewshot

Too Long; Didn't Read

Researchers have developed a practical, efficient alternative to massive AI models for time series forecasting.


Authors:

(1) Vijay Ekambaram, IBM Research;

(2) Arindam Jati, IBM Research;

(3) Nam H. Nguyen, IBM Research;

(4) Pankaj Dayama, IBM Research;

(5) Chandra Reddy, IBM Research;

(6) Wesley M. Gifford, IBM Research;

(7) Jayant Kalagnanam, IBM Research.

Editor's note: this is part 2 of 5 of a study detailing the development of a tiny, fast AI model that delivers excellent accuracy. Read the rest below.

2 TTM Components

2.1 Multi-level Modeling

TTM follows a multi-level architecture consisting of four key components (see Figure 1(a)): (1) The TTM Backbone is assembled using building blocks derived from the efficient TSMixer architecture [Ekambaram et al., 2023]. TSMixer is based on simple MLP blocks that enable mixing of features within patches, across patches, and across channels, surpassing existing transformer-based TS approaches with minimal computational requirements. Since TSMixer is not designed to handle multi-resolution data, we introduce various novel enhancements to it, as explained later. (2) The TTM Decoder follows the same architecture as the backbone but is considerably smaller, approximately 10-20% of the backbone's size. (3) The Forecast Head consists of a linear head designed to produce the forecast output. (4) The optional Exogenous Mixer fuses exogenous data into the model's forecasting process. This multi-level refactoring of the model is required so that the working behavior of the various components can change dynamically based on the workflow type, as explained in Section 3. In addition to these primary components, we also have a preprocessing component, as explained next.
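To make the four-part composition concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It is written under several simplifying assumptions: it handles a single channel with a flat context window (no patching, no channel mixing, none of the multi-resolution enhancements), it uses ReLU and random untrained weights, and every name in it (`MLPBlock`, `TTMSketch`, `forecast`, `future_exog`, the layer widths) is hypothetical. The widths are chosen only so that the decoder ends up within the 10-20% size range stated above.

```python
import random
from dataclasses import dataclass
from typing import List, Optional

Vector = List[float]
Matrix = List[Vector]


def make_linear(n_in: int, n_out: int, rng: random.Random) -> Matrix:
    """Random weight matrix of shape (n_out, n_in); biases omitted for brevity."""
    return [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]


def apply_linear(weights: Matrix, x: Vector) -> Vector:
    """Matrix-vector product: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


@dataclass
class MLPBlock:
    """Residual two-layer MLP, a simplified stand-in for a TSMixer-style block."""
    w_in: Matrix
    w_out: Matrix

    @classmethod
    def create(cls, width: int, hidden: int, rng: random.Random) -> "MLPBlock":
        return cls(make_linear(width, hidden, rng), make_linear(hidden, width, rng))

    def __call__(self, x: Vector) -> Vector:
        hidden = [max(0.0, v) for v in apply_linear(self.w_in, x)]  # ReLU (assumed)
        return [a + b for a, b in zip(x, apply_linear(self.w_out, hidden))]

    def num_params(self) -> int:
        return sum(len(r) for r in self.w_in) + sum(len(r) for r in self.w_out)


class TTMSketch:
    """Hypothetical composition: backbone -> decoder -> head -> optional exogenous mixer."""

    def __init__(self, context_len: int = 16, forecast_len: int = 4,
                 n_exog: int = 0, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.context_len = context_len
        self.n_exog = n_exog
        # (1) Backbone: the largest component, two wide mixing blocks here.
        self.backbone = [MLPBlock.create(context_len, 32, rng) for _ in range(2)]
        # (2) Decoder: same block type, but far fewer parameters.
        self.decoder = [MLPBlock.create(context_len, 8, rng)]
        # (3) Forecast head: a single linear map from context to horizon.
        self.head = make_linear(context_len, forecast_len, rng)
        # (4) Exogenous mixer: only built when exogenous inputs are declared.
        self.exog_mixer: Optional[Matrix] = (
            make_linear(forecast_len + n_exog, forecast_len, rng) if n_exog else None
        )

    def backbone_params(self) -> int:
        return sum(block.num_params() for block in self.backbone)

    def decoder_params(self) -> int:
        return sum(block.num_params() for block in self.decoder)

    def forecast(self, history: Vector, future_exog: Optional[Vector] = None) -> Vector:
        if len(history) != self.context_len:
            raise ValueError("history must have context_len values")
        x = list(history)
        for block in self.backbone + self.decoder:
            x = block(x)
        y = apply_linear(self.head, x)
        if self.exog_mixer is not None:
            if future_exog is None or len(future_exog) != self.n_exog:
                raise ValueError("model expects n_exog exogenous values")
            # Fuse the base forecast with the exogenous values.
            y = apply_linear(self.exog_mixer, y + list(future_exog))
        return y


if __name__ == "__main__":
    model = TTMSketch()
    print(len(model.forecast([0.1 * i for i in range(16)])))
    print(model.decoder_params() / model.backbone_params())
```

With these widths the backbone holds 2048 weights and the decoder 256, a ratio of 12.5%. Keeping the exogenous mixer as a separate, optional stage after the head means the first three components run unchanged whether or not exogenous data is supplied, which is one plausible reading of why the paper separates the components.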


Figure 1: Overview of Multilevel Tiny Time Mixers (TTM): (a) Refer to Sections 2 and 3, (b) Refer to Section 3.1, (c) Refer to Section 3.2

2.2 Pre-processing


This paper is available on arXiv under the CC BY-NC-ND 4.0 DEED license.