Link to paper
The full paper is available here.
You can also find the paper on PapersWithCode here.
Abstract
- Nonlinear dynamical systems (DS) are commonly accessed through time series measurements.
- Data modalities can include event counts and continuous signals.
- Sparse teacher forcing (TF) has been suggested as a control-theoretic method for training ML models on chaotic DS.
- A novel recurrent neural network (RNN) training framework has been developed for DS reconstruction based on multimodal variational autoencoders (MVAE).
- This training method achieves better reconstructions on multimodal datasets generated from chaotic DS benchmarks than alternative methods.
Paper Content
Introduction
- Inferring dynamical mechanisms from data is important for predicting changes in system dynamics.
- Recent methods for recovering dynamical systems from time series observations are mostly based on recurrent neural networks.
- One recent study considered non-Gaussian data for DS reconstruction.
- Discrete random processes are common in many areas, such as medicine, neuroscience, and climate science.
- The exploding and vanishing gradient problem (EVGP) is a challenge when training models to capture natural systems with chaotic dynamics.
- Control-theoretic methods based on sparse teacher forcing were suggested to address the EVGP.
- A novel formulation of the multimodal data integration problem for DS reconstruction was proposed.
Related work
- The goal of DS reconstruction is to learn a model of a nonlinear DS from observed quantities
- RNNs are popular machine learning tools for modeling DS
- Previous methods for DS reconstruction have focused on continuous data with Gaussian noise
- RNN models trained with classical backpropagation through time (BPTT) suffer from the EVGP
- EVGP is particularly severe when training on time series from chaotic systems
- Variational generative models are powerful methods for learning latent representations of joint distributions
- Longitudinal autoencoders have been proposed to model temporal correlations in latent space
- State space models have been applied for posterior inference of latent state paths
- Multimodal data integration can improve model inference
- DS reconstruction requires an approximation to the governing equations to capture temporal and geometrical structure
- Kramer et al. proposed a nonlinear state space model embedded within a sequential VAE for DS reconstruction from multimodal time series data
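To make the EVGP bullets above concrete, here is a minimal sketch that is not taken from the paper. It uses the logistic map as a stand-in chaotic system; the map, the parameter `r = 4`, and the function name are assumptions chosen purely for illustration. Gradients propagated back through time are products of Jacobians along the trajectory, and for a chaotic system the magnitude of that product grows exponentially at a rate set by the largest Lyapunov exponent.

```python
import math


def mean_log_jacobian(x0=0.3, steps=10_000, burn_in=100, r=4.0):
    """Average log |f'(x_t)| along a logistic-map orbit (its Lyapunov exponent).

    The gradient of x_T with respect to x_0 is the product of f'(x_t), so its
    log-magnitude grows roughly like steps * mean_log_jacobian.
    """
    x = x0
    for _ in range(burn_in):              # discard the initial transient
        x = r * x * (1.0 - x)
    total = 0.0
    for _ in range(steps):
        deriv = r * (1.0 - 2.0 * x)       # f'(x) for f(x) = r * x * (1 - x)
        total += math.log(max(abs(deriv), 1e-300))  # guard against log(0)
        x = r * x * (1.0 - x)
    return total / steps


rate = mean_log_jacobian()
print(f"mean log |f'| per step: {rate:.3f} (ln 2 = {math.log(2):.3f})")
```

A positive rate means the backpropagated gradient magnitude grows exponentially with sequence length, which is the regime in which plain BPTT on long chaotic sequences becomes difficult. For `r = 4` the exact value is ln 2.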
Method
- Our approach to DS reconstruction from multimodal time series rests on three components: the dendritic piecewise-linear RNN (dendPLRNN), sparse identity-TF, and the MVAE
- The dendPLRNN is defined by a diagonal matrix (linear term), an off-diagonal connectivity matrix, and a diagonal noise covariance matrix
- Sparse TF trades off loss and trajectory divergence against the need to capture relevant long time scales
- MVAE infers a joint latent representation over data and minimizes the negative Evidence Lower Bound
- MVAE uses convolutional neural networks for parameterizing the encoder model
- MVAE-TF connects the latent codes of the MVAE and the dendPLRNN
- The MVAE-TF loss combines the reconstruction loss of the MVAE, the latent loss, and the dendPLRNN loss from the likelihoods of the observed time series
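The first two components can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the noise-free latent update z_t = A z_{t-1} + W φ(z_{t-1}) + h_0 with φ(z) = Σ_b α_b max(0, z − h_b), and it assumes that identity-TF overwrites the first N latent states with the observation every `tau` steps. All function and parameter names (`dend_plrnn_step`, `rollout_sparse_tf`, `tau`, `thresholds`) are hypothetical.

```python
def dend_plrnn_step(z, A, W, h0, alphas, thresholds):
    """One noise-free latent step: A is the diagonal (as a list), W the coupling matrix."""
    m = len(z)
    # Basis expansion: phi(z)_i = sum_b alpha_b * max(0, z_i - h_b[i])
    phi = [sum(a * max(0.0, z[i] - h[i]) for a, h in zip(alphas, thresholds))
           for i in range(m)]
    return [A[i] * z[i] + sum(W[i][j] * phi[j] for j in range(m)) + h0[i]
            for i in range(m)]


def rollout_sparse_tf(x, z0, tau, step):
    """Roll the model along observations x, forcing every tau-th step.

    Returns the model's predictions of x_t, taken before any forcing at time t.
    """
    n_obs = len(x[0])
    z = list(z0)
    preds = []
    for t, x_t in enumerate(x):
        preds.append(z[:n_obs])      # prediction of x_t from the free-running state
        if t % tau == 0:
            z[:n_obs] = x_t          # identity-TF: overwrite the observed units
        z = step(z)
    return preds


def mse(preds, x):
    """Mean squared error between predicted and observed sequences."""
    n = sum(len(row) for row in x)
    return sum((p - v) ** 2 for pr, xr in zip(preds, x) for p, v in zip(pr, xr)) / n


# Toy usage with hand-picked (hypothetical) parameters.
params = dict(A=[0.5, 0.5], W=[[0.0, 1.0], [1.0, 0.0]], h0=[0.0, 0.0],
              alphas=[1.0], thresholds=[[0.0, 0.0]])
step = lambda z: dend_plrnn_step(z, **params)
x = [[1.0], [1.0], [1.0], [1.0]]
preds = rollout_sparse_tf(x, z0=[0.0, 0.0], tau=2, step=step)
print(preds, mse(preds, x))
```

A small `tau` keeps the model close to the data and limits divergence, while a large `tau` lets the model run freely for longer, so errors on longer time scales contribute to the loss. In a real training loop the forcing interval would be a hyperparameter and the parameters would be learned by gradient descent.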
Experiments
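- The focus is on capturing invariant properties of the underlying DS
- MSE is not suited for evaluating longer-term behavior, since trajectories of chaotic systems diverge exponentially even from nearly identical initial conditions
- A set of performance measures was compiled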
- Kullback-Leibler divergence in state space
- Hellinger distance between empirical and model-generated power spectra
- MSE between autocorrelation functions
- Spearman cross-correlation structure
- 10-step-ahead prediction error
- Transform data modalities to Gaussian
- Multiple shooting method
- Heavily distorted by Gaussian noise
- DS reconstruction possible from ordinal data
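One of the measures listed above, the Hellinger distance between power spectra, can be sketched as follows. This is a hedged illustration rather than the paper's evaluation code: it uses a naive DFT, applies no spectral smoothing, and the function names `power_spectrum` and `hellinger` are hypothetical. The series must not be constant, since the spectrum is normalized by its total power.

```python
import cmath
import math


def power_spectrum(x):
    """Normalized power spectrum of a real 1-D series via a naive O(n^2) DFT."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]          # remove the zero-frequency component
    power = []
    for k in range(1, n // 2 + 1):
        coef = sum(v * cmath.exp(-2j * math.pi * k * t / n)
                   for t, v in enumerate(centered))
        power.append(abs(coef) ** 2)
    total = sum(power)
    return [p / total for p in power]         # sums to 1, like a distribution


def hellinger(p, q):
    """Hellinger distance between two discrete distributions, in [0, 1]."""
    bc = sum(math.sqrt(a * b) for a, b in zip(p, q))   # Bhattacharyya coefficient
    return math.sqrt(max(0.0, 1.0 - bc))


n = 64
slow = [math.sin(2 * math.pi * 4 * t / n) for t in range(n)]
fast = [math.sin(2 * math.pi * 10 * t / n) for t in range(n)]
print(hellinger(power_spectrum(slow), power_spectrum(slow)))  # close to 0
print(hellinger(power_spectrum(slow), power_spectrum(fast)))  # close to 1
```

Unlike a pointwise MSE, this comparison ignores phase, so two trajectories that have drifted apart in time but share the same frequency content still score as similar.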
Conclusions
- Introduced novel training method for DS reconstruction from multimodal time series data
- Reconstruction based on multimodal, non-continuous/non-Gaussian data has hardly been addressed
- Utilized control signals to guide training process
- Training the dendPLRNN by MVAE-TF outperformed other model formulations
- Attractor geometries can be faithfully reconstructed from discrete random variables
- Modular algorithm, subcomponents can be replaced
- Assessed agreement between data distributions using binning, Hellinger distance, mean squared prediction error, ordinal prediction error, Spearman cross-correlation, and geometric reconstruction measure
- Used Lorenz-63 and Lewis-Glass network models
- Used generalized linear model to couple ordinal observations to latent states
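The last two bullets can be illustrated with a small data-generation sketch. It is an assumption-laden example, not the paper's setup: it uses the standard Lorenz-63 parameters (σ = 10, ρ = 28, β = 8/3), a fourth-order Runge-Kutta integrator, and a cumulative-logit link as one common form of a generalized linear model for ordinal data. The weights, cutpoints, and function names are hypothetical.

```python
import math
import random


def lorenz63(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz-63 equations."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)


def rk4_step(f, state, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(k, scale):
        return tuple(s + scale * ki for s, ki in zip(state, k))
    k1 = f(state)
    k2 = f(shifted(k1, 0.5 * dt))
    k3 = f(shifted(k2, 0.5 * dt))
    k4 = f(shifted(k3, dt))
    return tuple(s + dt / 6.0 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def ordinal_probs(eta, cutpoints):
    """Category probabilities with P(y <= k) = sigmoid(c_k - eta); cutpoints increasing."""
    cdf = [1.0 / (1.0 + math.exp(eta - c)) for c in cutpoints] + [1.0]
    return [cdf[0]] + [cdf[k] - cdf[k - 1] for k in range(1, len(cdf))]


def simulate(steps=1000, dt=0.01, weights=(0.2, 0.0, 0.0),
             cutpoints=(-1.0, 0.0, 1.0), seed=0):
    """Integrate Lorenz-63 and sample one ordinal label per time step."""
    rng = random.Random(seed)
    state = (1.0, 1.0, 1.0)
    states, labels = [], []
    for _ in range(steps):
        state = rk4_step(lorenz63, state, dt)
        eta = sum(w * s for w, s in zip(weights, state))   # linear predictor
        probs = ordinal_probs(eta, cutpoints)
        labels.append(rng.choices(range(len(probs)), weights=probs)[0])
        states.append(state)
    return states, labels


states, labels = simulate(steps=500)
print(labels[:20])
```

The ordinal labels are a coarse, noisy view of the first Lorenz coordinate. A reconstruction method would only see such labels (possibly alongside other modalities) and would have to recover the attractor geometry from them.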