Bayesian non-parametric models for time segmentation and regression

Publication Type: Thesis
Issue Date: 2015
Non-parametric Bayesian modelling offers a principled way to avoid model selection, such as pre-defining the number of modes in a mixture model or the optimal number of factors in factor analysis. Instead, Bayesian non-parametric methods allow the data to determine the complexity of the model. In particular, the hierarchical Dirichlet process (HDP) is used in a variety of applications to infer an arbitrary number of classes from a set of samples. Within the temporal modelling paradigm, Bayesian non-parametrics is used to model sequential data by integrating HDP priors into state-space models such as the hidden Markov model (HMM), yielding the HDP-HMM. In latent factor modelling and dimensionality reduction, the Indian buffet process (IBP) is a well-known method capable of sparse modelling and of selecting an arbitrary number of factors from among often high-dimensional features. In this PhD thesis, we apply these methods to propose novel solutions to two prominent problems. The first model, named ‘AdOn HDP-HMM’, is an adaptive online system based on the HDP-HMM. ‘AdOn HDP-HMM’ is capable of segmenting and classifying sequential data over an unbounded number of classes while meeting the memory and delay constraints of streaming contexts. The model is further enhanced by a number of learning rates, which tune its adaptability by determining the extent to which the model retains its previous parameters or adapts to the new data. Empirical results on several variants of synthetic and action recognition data show remarkable performance, particularly when adaptive learning rates are used for evolutionary sequences. The second proposed solution is an elaborate factor regression model, named non-parametric conditional factor regression (NCFR), which caters for multivariate prediction while preserving the correlations in the response layer. NCFR enhances factor regression by integrating the IBP to infer the optimal number of latent factors in a sparse model. Thanks to this data-driven approach, NCFR largely avoids over-fitting even when the ratio between the number of available samples and the number of dimensions is very low. Experimental results on three diverse datasets give evidence of its remarkable predictive performance, resilience to over-fitting, good mixing and computational efficiency.