Disentangling Stochastic PDE Dynamics for Unsupervised Video Prediction.

Publisher:
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Publication Type:
Journal Article
Citation:
IEEE Trans Neural Netw Learn Syst, 2023, PP, (99)
Issue Date:
2023-06-27
Filename Description Size
Disentangling_Stochastic_PDE_Dynamics_for_Unsupervised_Video_Prediction.pdfPublished version3.78 MB
Adobe PDF
Full metadata record
Unsupervised video prediction aims to predict future outcomes based on the observed video frames, thus removing the need for supervisory annotations. This research task has been argued as a key component of intelligent decision-making systems, as it presents the potential capacities of modeling the underlying patterns of videos. Essentially, the challenge of video prediction is to effectively model the complex spatiotemporal and often uncertain dynamics of high-dimensional video data. In this context, an appealing way of modeling spatiotemporal dynamics is to explore prior physical knowledge, such as partial differential equations (PDEs). In this article, considering real-world video data as a partly observed stochastic environment, we introduce a new stochastic PDE predictor (SPDE-predictor), which models the spatiotemporal dynamics by approximating a generalized form of PDEs while dealing with the stochasticity. A second contribution is that we disentangle the high-dimensional video prediction into low-level dimensional factors of variations: time-varying stochastic PDE dynamics and time-invariant content factors. Extensive experiments on four various video datasets show that SPDE video prediction model (SPDE-VP) outperforms both deterministic and stochastic state-of-the-art methods. Ablation studies highlight our superiority driven by both PDE dynamics modeling and disentangled representation learning and their relevance in long-term video prediction.
Please use this identifier to cite or link to this item: