Building: Cero Infinito
Room: 1403
Date: 2024-12-11 03:00 PM – 04:00 PM
Last modified: 2024-11-22
Abstract
Scientific Machine Learning (SciML) is an emerging field that integrates traditional mathematical modeling with modern machine learning techniques, offering increased interpretability, generalizability, and data efficiency [1, 2]. Within this framework of differentiable modeling, we present GOKU-UI [3], an evolution of the GOKU-net model [4]. GOKU-nets are continuous-time generative models that encode time series into a latent space governed by predefined differential equations, simultaneously learning the transformation from the original space to the latent space and the equations' parameters.
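The generative pass of a GOKU-style model can be sketched as follows. This is a deliberately simplified illustration, not the actual GOKU-net architecture: the real model uses neural encoders/decoders trained by variational inference (and the Julia SciML ecosystem for integration), whereas here the "encoder" is a fixed guess, the known ODE is a damped harmonic oscillator chosen for illustration, and the "decoder" is a plain linear map.

```python
import numpy as np

def latent_ode(state, params):
    """Known latent dynamics: a damped harmonic oscillator (illustrative choice)."""
    x, v = state
    omega, gamma = params
    return np.array([v, -omega**2 * x - gamma * v])

def integrate(z0, params, dt, n_steps):
    """Integrate the known ODE with classic RK4 steps."""
    traj = [np.asarray(z0, dtype=float)]
    for _ in range(n_steps - 1):
        s = traj[-1]
        k1 = latent_ode(s, params)
        k2 = latent_ode(s + 0.5 * dt * k1, params)
        k3 = latent_ode(s + 0.5 * dt * k2, params)
        k4 = latent_ode(s + dt * k3, params)
        traj.append(s + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4))
    return np.stack(traj)

def generative_pass(obs, W_dec):
    """Encode -> integrate known ODE in latent space -> decode.
    The fixed z0/params stand in for an inference network."""
    z0, params = np.array([obs[0, 0], 0.0]), (2.0, 0.1)
    z = integrate(z0, params, dt=0.05, n_steps=len(obs))
    return z @ W_dec.T  # linear stand-in for a neural decoder
```

The key structural point the sketch preserves is that the dynamics themselves are prescribed by a known differential equation; only the maps into and out of the latent space (and the equation's parameters) are learned.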
Leveraging the Julia programming language's SciML ecosystem, we not only broadened the original model's scope to incorporate other classes of differential equations, but also introduced two key enhancements in our GOKU-nets with Ubiquitous Inference (GOKU-UI): (1) attention mechanisms in the parameter inference component, and (2) a novel multiple shooting [5] training strategy in the latent space. These modifications significantly improve the model's performance in both reconstruction and forecasting tasks, as demonstrated on simulated and empirical brain data.
On synthetic datasets generated from stochastic Stuart-Landau oscillators, GOKU-UI outperforms baseline models even when trained on 16 times less data, showcasing its remarkable data efficiency. Furthermore, when applied to empirical human brain data, with stochastic Stuart-Landau oscillators as its dynamical core, our proposed enhancements markedly increased the model's effectiveness in capturing complex brain dynamics. GOKU-UI achieved a reconstruction error five times lower than that of the baselines, and the multiple shooting method reduced the GOKU-net's prediction error for future brain activity up to 15 seconds ahead.
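For readers unfamiliar with the dynamical core, a stochastic Stuart-Landau oscillator follows dz = ((λ + iω)z − |z|²z) dt + σ dW for a complex state z; a minimal Euler-Maruyama simulation is sketched below. The parameter values are arbitrary illustrations, not those used in the work.

```python
import numpy as np

def simulate_stuart_landau(lam=0.2, omega=2 * np.pi * 0.05, sigma=0.1,
                           dt=0.01, n_steps=5000, seed=0):
    """Euler-Maruyama integration of a noisy Stuart-Landau oscillator:
        dz = ((lam + i*omega) z - |z|^2 z) dt + sigma dW,   z complex.
    For lam > 0 the deterministic system settles on a limit cycle of
    radius sqrt(lam); noise perturbs amplitude and phase."""
    rng = np.random.default_rng(seed)
    z = np.empty(n_steps, dtype=complex)
    z[0] = 0.1 + 0.0j
    for t in range(1, n_steps):
        drift = (lam + 1j * omega) * z[t - 1] - np.abs(z[t - 1]) ** 2 * z[t - 1]
        noise = sigma * np.sqrt(dt) * (rng.standard_normal()
                                       + 1j * rng.standard_normal())
        z[t] = z[t - 1] + drift * dt + noise
    return z
```

Networks of such oscillators, coupled and driven by noise, are a common generative model for oscillatory brain activity, which is what makes them a natural choice for the latent dynamics here.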
By encoding high-dimensional brain dynamics into low-dimensional, interpretable representations, our work contributes to both fundamental neuroscientific understanding and the practical challenge of building accurate predictive models from neuroimaging data. Beyond neuroscience, this research advances the field of SciML, demonstrating the potential of integrating established scientific insights with modern machine learning techniques for studying complex dynamical systems.
[1] Baker, Nathan, et al. "Workshop Report on Basic Research Needs for Scientific Machine Learning: Core Technologies for Artificial Intelligence." USDOE Office of Science (SC), Washington, DC (United States), 2019.
[2] Rackauckas, Christopher, et al. "Universal Differential Equations for Scientific Machine Learning." arXiv preprint arXiv:2001.04385, 2020.
[3] Abrevaya, Germán, et al. "Effective Latent Differential Equation Models via Attention and Multiple Shooting." Transactions on Machine Learning Research, 2024.
[4] Linial, Ori, et al. "Generative ODE Modeling with Known Unknowns." Proceedings of the Conference on Health, Inference, and Learning, 2021.
[5] Turan, Evren Mert, and Johannes Jäschke. "Multiple Shooting for Training Neural Differential Equations on Time Series." IEEE Control Systems Letters, vol. 6, 2021, pp. 1897-1902.