Causal and causally-inspired learning

Publication Type: Thesis
Issue Date: 2017
A main goal of statistics and machine learning is to discover statistical dependencies between random variables and to use these dependencies to make predictions about future observations. Many scientific investigations, however, involve causal predictions, which aim to infer how the data-generating system behaves under changing conditions, for example, changes induced by external interventions. Causal predictions require not only statistical dependencies but also the causal structure that determines the behaviour of the system. The standard way to identify causal structure is a randomized controlled experiment, but such experiments are often expensive or even impossible to conduct. As a consequence, inferring cause-and-effect relationships from purely observational data, known as causal discovery or causal learning, has drawn much attention. Various causal discovery methods have been proposed in the past decades, including constraint-based methods, structural equation model-based methods, and time series-based methods. Among these, time series-based methods, e.g., Granger causality, are relatively well established because temporal ordering rules out effects that precede their causes. Many existing time series-based methods assume that the data are measured at the right frequency; in practice, however, the sampling frequency is often lower than the true causal frequency. In this thesis, we consider learning high-resolution causal relationships at the causal frequency from subsampled time series. Existing methods suffer from identifiability problems: under the assumption that the data are Gaussian, the solutions are generally not unique. We prove, however, that if the noise terms are non-Gaussian, the underlying model is identifiable from subsampled time series under mild conditions.
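The identifiability problem can be made concrete with a small sketch. It assumes a first-order vector autoregressive model x_t = A x_{t-1} + e_t at the causal frequency. This model form, the matrix `A`, and the helper names are illustrative assumptions and are not taken from the thesis. Observing every second time step yields a process whose transition matrix is A squared, so an indirect path can look like a direct effect, and both A and -A produce the same subsampled transition matrix.

```python
# Illustrative sketch only: the VAR(1) form and all names here are assumptions.

def matmul(p, q):
    """Multiply two square matrices given as nested lists."""
    n = len(p)
    return [[sum(p[i][m] * q[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]

def matpow(a, k):
    """Raise a square matrix to a non-negative integer power k."""
    n = len(a)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = matmul(result, a)
    return result

def negate(a):
    """Return the element-wise negation of a matrix."""
    return [[-v for v in row] for row in a]

# Hypothetical causal chain x1 -> x2 -> x3 at the causal frequency.
# A[i][j] is the effect of variable j at time t-1 on variable i at time t.
A = [[0.8, 0.0, 0.0],
     [0.5, 0.7, 0.0],
     [0.0, 0.4, 0.6]]

# Keeping every second observation gives x_t = A^2 x_{t-2} + (e_t + A e_{t-1}).
A2 = matpow(A, 2)
A2_from_neg = matpow(negate(A), 2)

print("direct effect x1 -> x3 at causal frequency:", A[2][0])
print("apparent effect x1 -> x3 after subsampling:", A2[2][0])
print("A and -A give the same subsampled matrix:",
      all(abs(A2[i][j] - A2_from_neg[i][j]) < 1e-12
          for i in range(3) for j in range(3)))
```

In this sketch the subsampled matrix has a nonzero entry from x1 to x3 even though the high-resolution model has none. For a subsampling factor of two, A and -A also give the same aggregated noise covariance, so with Gaussian noise second-order statistics cannot tell them apart.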
We then propose an Expectation-Maximization approach and a variational inference approach to recover causal relations from subsampled data. More recently, researchers have begun to explore the implications of causal models for machine learning tasks such as semi-supervised learning and domain adaptation. In this thesis, we develop causally-inspired learning methods for domain adaptation in both multi-source and single-source settings. In particular, we use causal models to represent the relationship between features and labels, and we consider the possible situations in which different modules of the causal model change with the domain. In each situation, we investigate what knowledge is appropriate to transfer and find the optimal target-domain hypothesis. Furthermore, we propose methods to correct distribution shift in the general situation where both the marginal distribution of the features and the conditional distribution of the labels given the features change, under the assumption that labels cause features. We provide theoretical analysis and empirical evaluation on both synthetic and real-world data to show the effectiveness of our methods.
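A rough illustration of why the causal direction matters for domain adaptation is the simplest special case, in which labels cause features and only the label distribution changes between domains while the distribution of features given labels stays fixed. This is not the thesis's method, which addresses more general situations. The function name `adjust_posterior` and the numbers below are hypothetical. Under these assumptions, Bayes' rule implies that a classifier's source-domain posterior can be corrected by reweighting each class with the ratio of target to source priors and renormalizing.

```python
# Illustrative sketch only: a standard prior-shift correction, assuming
# P(features | label) is identical in the source and target domains.

def adjust_posterior(source_posterior, source_prior, target_prior):
    """Convert P_source(y | x) into P_target(y | x) under pure label shift.

    All three arguments are lists indexed by class. The target prior is
    assumed to be known here; in practice it would have to be estimated.
    """
    weights = [t / s for t, s in zip(target_prior, source_prior)]
    unnormalized = [p * w for p, w in zip(source_posterior, weights)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

# Hypothetical two-class example: balanced source domain, imbalanced target.
source_posterior = [0.6, 0.4]   # classifier output for one input x
source_prior = [0.5, 0.5]
target_prior = [0.1, 0.9]

adjusted = adjust_posterior(source_posterior, source_prior, target_prior)
print(adjusted)
```

Here the source classifier slightly prefers the first class, but once the rarity of that class in the target domain is taken into account the corrected posterior favours the second class.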