When forecasting imbalanced time series that contain rare but important events, attention mechanisms that explicitly model extreme patterns outperform approaches that treat all time points uniformly.
This paper introduces Exformer, a Transformer model for time series forecasting that explicitly handles rare extreme events. Unlike standard Transformers, which treat all data points equally, Exformer uses a specialized attention mechanism with three components (Local, Stride, and Extreme) to capture both normal patterns and critical outliers.
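The summary above names the three attention components but does not define them, so the following is only a minimal sketch of how such a combined pattern could look, not the paper's implementation. It assumes that Local means a fixed window around each query, Stride means positions at regular intervals from the query, and Extreme means keys whose deviation from the series mean exceeds a quantile threshold. The function names `build_attention_mask` and `masked_attention`, and the parameters `window`, `stride`, and `quantile`, are hypothetical.

```python
import numpy as np


def build_attention_mask(series, window=2, stride=4, quantile=0.95):
    """Return a boolean (T, T) mask; mask[i, j] is True if query i may attend to key j.

    All three component definitions below are illustrative assumptions.
    """
    series = np.asarray(series, dtype=float)
    T = series.shape[0]
    idx = np.arange(T)
    dist = np.abs(idx[:, None] - idx[None, :])

    # Local (assumed): keys within `window` steps of the query.
    local = dist <= window

    # Stride (assumed): keys at multiples of `stride` steps from the query.
    strided = dist % stride == 0

    # Extreme (assumed): keys whose absolute deviation from the mean is
    # strictly above the chosen quantile; visible to every query.
    deviation = np.abs(series - series.mean())
    is_extreme = deviation > np.quantile(deviation, quantile)
    extreme = np.broadcast_to(is_extreme[None, :], (T, T))

    return local | strided | extreme


def masked_attention(q, k, v, mask):
    """Scaled dot-product attention restricted to the positions allowed by `mask`."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = np.where(mask, scores, -np.inf)
    # Each row keeps its diagonal entry (distance 0 is Local), so the row max is finite.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v
```

Under these assumptions, a single spike in an otherwise flat series stays visible to every query position, while ordinary points are only attended to through the Local and Stride patterns. Uniform attention would instead give the spike the same structural access as any other time step.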