Big data applications increasingly involve high-dimensional data with sophisticated dependence structures. Modelling high-dimensional dependence, that is, the dependence between a set of high-dimensional variables, is a critical but challenging issue in many applications, including social media analysis and financial markets. A typical example is the interplay of the financial variables that drive complex market movements. A particular problem is understanding the dependence between high-dimensional variables that exhibit tail dependence and asymmetry, characteristics which appear widely in financial markets. Existing methods, such as Bayesian logic programs, relational dependency networks and relational Markov networks, typically build a graph to represent the conditional dependence structure between random variables. These models target high-dimensional domains and have the advantage of learning latent relationships from data. However, they tend to force the local quantitative part of the model to take a simple form, such as a discretized form of the data, when a multivariate Gaussian or its mixtures cannot capture real-world data. As a result, the complex dependencies between high-dimensional variables remain difficult to capture.
In statistics and finance, the copula has been shown to be a powerful tool for modelling high-dimensional dependencies. A copula separates the marginal distributions of a multivariate distribution from its dependence structure, so that the dependence structure can be specified and investigated independently of the marginal distributions. It provides a flexible mechanism for modelling real-world distributions that cannot be handled well by graphical models. Researchers have therefore tried to combine copulas with probabilistic graphical models, as in the tree-structured copula model and copula Bayesian networks. These copula-based models aim to overcome the limitations of discretizing data, but they impose assumptions and restrictions on the dependence structure that are not appropriate for dependence modelling among financial variables.
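The separation of marginal distributions from the dependence structure described above is usually stated through Sklar's theorem. The block below is a standard textbook statement rather than a formula taken from this thesis; the symbols F, F_i, C, c and f_i are notation introduced here for illustration.

```latex
% Sklar's theorem: a joint distribution F with continuous margins F_1, ..., F_d
% can be written through a unique copula C on [0,1]^d.
F(x_1, \dots, x_d) = C\bigl(F_1(x_1), \dots, F_d(x_d)\bigr)

% When densities exist, the joint density factorises into the copula density c
% (dependence only) and the marginal densities f_i (margins only).
f(x_1, \dots, x_d) = c\bigl(F_1(x_1), \dots, F_d(x_d)\bigr) \prod_{i=1}^{d} f_i(x_i)
```

The factorisation makes the point of the paragraph concrete: the margins f_i can be chosen freely, and all dependence is carried by c.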
To address these limitations and challenges, this thesis proposes three models for the dependence of high-dimensional variables: the truncated partial correlation-based canonical vine copula, the partial correlation-based regular vine copula and the truncated partial correlation-based regular vine copula. Chapter 3 introduces a new partial correlation-based canonical vine to identify the asymmetric and non-linear dependence structures of asset returns without any prior dependence assumptions. To simplify the model while retaining its merits, a partial correlation-based truncation method is proposed to truncate the canonical vine. The truncated partial correlation-based canonical vine copula is then applied to construct and analyse the dependence structures of European stocks as a case study.
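The partial correlations that drive the vine construction can be computed directly from a correlation matrix. The following is a minimal sketch of that computation only, not the thesis's vine-building algorithm; the function names and the example correlation matrix are hypothetical and introduced here for illustration.

```python
import numpy as np

def partial_correlation_matrix(corr):
    """Partial correlation of each pair given all remaining variables.

    Uses the standard identity rho_ij.rest = -P_ij / sqrt(P_ii * P_jj),
    where P is the inverse of the correlation matrix.
    """
    precision = np.linalg.inv(corr)
    scale = np.sqrt(np.diag(precision))
    pcorr = -precision / np.outer(scale, scale)
    np.fill_diagonal(pcorr, 1.0)
    return pcorr

def partial_correlation_given_one(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y given a single variable z."""
    return (r_xy - r_xz * r_yz) / np.sqrt((1.0 - r_xz**2) * (1.0 - r_yz**2))

# Hypothetical correlation matrix for three asset returns (illustration only).
corr = np.array([
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
])
pcorr = partial_correlation_matrix(corr)
recursive = partial_correlation_given_one(0.6, 0.5, 0.4)
```

With three variables the matrix-inverse route and the recursive formula give the same value for the first pair, which is a convenient sanity check before moving to higher dimensions.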
Chapter 4 introduces the truncated partial correlation-based regular vine copula to explore the relations among multiple variables. Existing high-dimensional dependence models often impose strong restrictions on the dependence structure. These restrictions prevent the detection of sophisticated structures such as upper and lower tail dependence between multiple variables. A partial correlation-based regular vine copula model can relax these restrictions. The model uses partial correlations, which are algebraically independent, to construct the regular vine structure. It captures asymmetric characteristics among multiple variables by using a two-parameter copula with flexible lower and upper tail dependence. The method is tested on a cross-country stock market data set to analyse asymmetry and tail dependence over a dynamic period.
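To illustrate what a two-parameter copula with separate lower and upper tail dependence looks like, the sketch below uses the BB1 family. This choice is an assumption made for illustration: the abstract does not name the copula family used in the thesis. The function names and parameter values are likewise hypothetical.

```python
def bb1_cdf(u, v, theta, delta):
    """BB1 copula C(u, v), defined for theta > 0 and delta >= 1."""
    s = (u ** -theta - 1.0) ** delta + (v ** -theta - 1.0) ** delta
    return (1.0 + s ** (1.0 / delta)) ** (-1.0 / theta)

def bb1_tail_dependence(theta, delta):
    """Closed-form lower and upper tail dependence coefficients of BB1."""
    lower = 2.0 ** (-1.0 / (theta * delta))
    upper = 2.0 - 2.0 ** (1.0 / delta)
    return lower, upper

# Illustrative parameter values (assumed, not estimated from any data).
theta, delta = 0.5, 1.5
lower, upper = bb1_tail_dependence(theta, delta)

# Numerical check of the limit definitions:
# lower tail: C(u, u) / u as u -> 0
u = 1e-8
lower_numeric = bb1_cdf(u, u, theta, delta) / u
# upper tail: (1 - 2q + C(q, q)) / (1 - q) as q -> 1
q = 1.0 - 1e-6
upper_numeric = (1.0 - 2.0 * q + bb1_cdf(q, q, theta, delta)) / (1.0 - q)
```

Because the lower coefficient depends on both parameters and the upper one on delta alone, the two tails can be set to different strengths, which is the asymmetry the paragraph refers to.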
Chapter 5 proposes a novel truncated partial correlation-based regular vine copula model, which can capture more flexible dependence structures without making prior assumptions about the data. Specifically, the model employs a new partial correlation to build the dependence structures via a bottom-up strategy. It identifies the important dependencies among high-dimensional variables and truncates the irrelevant information, which significantly reduces the parameter estimation time. The in-sample and out-of-sample performance of the model is examined using currency market data covering a period of 17 years.
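The idea of truncating a vine can be sketched as follows. The abstract does not state the truncation criterion, so the rule used here (drop trailing trees whose partial correlations are all below a fixed threshold in absolute value) is an assumption, as are the function name, the threshold and the example values.

```python
def truncation_level(tree_pcorrs, threshold=0.1):
    """Return how many trees to keep in a vine.

    tree_pcorrs holds one list of edge partial correlations per tree,
    ordered from the first tree to the last. Trailing trees whose edges
    are all weaker than the threshold are dropped; their pair copulas
    would be replaced by the independence copula.
    """
    keep = len(tree_pcorrs)
    # Walk backwards from the last tree while every edge in it is weak.
    while keep > 0 and all(abs(r) < threshold for r in tree_pcorrs[keep - 1]):
        keep -= 1
    return keep

# Hypothetical partial correlations for a four-variable vine (three trees).
trees = [[0.72, -0.55, 0.40], [0.21, 0.05], [0.03]]
level = truncation_level(trees, threshold=0.1)

# Pair copulas that still need parameter estimation after truncation.
fitted_pairs = sum(len(tree) for tree in trees[:level])
total_pairs = sum(len(tree) for tree in trees)
```

Only the pair copulas in the retained trees need to be estimated, which is where the saving in estimation time comes from; in high dimensions the number of dropped pair copulas can be large.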
Chapter 6 discusses how to resolve the high-dimensional asset allocation problem through a partial correlation-based canonical vine. The mean-variance criterion, which is widely used in asset allocation, is not optimal because the joint distribution of asset returns is asymmetric rather than normal, as the criterion assumes. The partial correlation-based canonical vine resolves this issue by supplying the asymmetric joint distribution of asset returns to the utility function. The utility function is then used to determine the optimal allocation of the assets. The performance of the model is examined using data from both European and United States stock markets.
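Choosing an allocation by maximising expected utility over simulated return scenarios can be sketched as below. This is an illustration under stated assumptions, not the thesis's procedure: the CRRA utility, the risk-aversion value, the two-asset grid search and the placeholder scenario generator are all hypothetical. In the thesis the scenarios would come from the fitted canonical vine rather than from the simple skewed sampler used here.

```python
import numpy as np

def crra_utility(wealth, gamma=5.0):
    """Constant relative risk aversion utility (assumed form, gamma != 1)."""
    return wealth ** (1.0 - gamma) / (1.0 - gamma)

def best_weight(returns, gamma=5.0, grid=101):
    """Grid search for the weight on asset 0 that maximises expected utility.

    returns has shape (n_scenarios, 2); the remaining weight goes to asset 1.
    """
    weights = np.linspace(0.0, 1.0, grid)
    # End-of-period wealth for every scenario and every candidate weight.
    wealth = (1.0
              + np.outer(returns[:, 0], weights)
              + np.outer(returns[:, 1], 1.0 - weights))
    expected = crra_utility(wealth, gamma).mean(axis=0)
    return weights[int(np.argmax(expected))], expected

# Placeholder scenarios with a shared downside shock, giving left-skewed,
# jointly asymmetric returns (stand-in for draws from a fitted vine copula).
rng = np.random.default_rng(0)
n = 10_000
shock = rng.exponential(0.02, size=n)
asset0 = 0.010 + 0.03 * rng.standard_normal(n) - shock
asset1 = 0.008 + 0.02 * rng.standard_normal(n) - 0.5 * shock
returns = np.clip(np.column_stack([asset0, asset1]), -0.9, None)

w_star, expected = best_weight(returns)
```

Because the expectation is taken over the full set of scenarios, skewness and joint downside moves affect the chosen weight, which a mean-variance rule based only on the first two moments would not capture.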
In summary, this thesis proposes three dependence models: one canonical vine and two regular vines. These models do not impose any prior assumption on the dependence structure and can be used to model different kinds of high-dimensional dependence, such as asymmetry and tail dependence. All of the models are evaluated on real-world datasets from stock and currency markets. In addition, the partial correlation-based canonical vine is used to determine the optimal allocation of assets in stock markets. This thesis shows that there is great potential in applying copulas to model complex dependence, particularly in modelling time-varying parameters and in developing efficient vine copula simplification methods.