The aim of this research is to apply artificial intelligence to solving real world insurance regression problems.
Two types of problems are explored:
• The assessment of personal injury claims, where no systematic data is available from which to derive a predictive model, but experts exist who perform this function.
• The rating of insurance risk, in various portfolios, where systematic data does exist and actuarial models are already employed.
The problems are discussed in detail and a glossary of insurance terms that are used throughout this research is provided.
After a review of the relevant literature, each problem is addressed in turn.
Firstly, the use of expert systems technology to develop a regression solution to the personal injury assessment problem is presented, together with its results. The resultant solution, called Colossus, has achieved significant penetration in the insurance industry on three continents, and the implications of this success are discussed.
Secondly, machine learning techniques, namely artificial neural networks and induction, are applied to insurance rating problems. Measurement methods are developed for determining the relevance of any results obtained. The general experimental method employed is described, as well as the relevant characteristics of the insurance datasets used. Some of these characteristics make it difficult to detect any signal beyond that already discerned by actuarial techniques. Specifically, the insurance data exhibits:
• Sparsity, in relation to some variable values.
• A range of numeric, ordinal and categorical variables.
• The presence of massively categorical variables, which can have thousands of discrete values.
• Noise, arising from the variability of insurance claims.
Methods for dealing with these attributes of the insurance data are devised and implemented in the two approaches taken. The artificial neural network and induction techniques employed are discussed, along with how each deals specifically with insurance data.
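The abstract does not say which methods were devised, so the following is an illustration only and not the approach taken in this research. One common way to handle a massively categorical variable with sparsely populated levels is to replace each level with its mean claim, shrunk toward the overall mean so that rarely seen levels are not dominated by noise. The function name `shrunken_level_means` and the smoothing constant `k` are hypothetical, introduced here purely for the sketch.

```python
from collections import defaultdict


def shrunken_level_means(levels, claims, k=10.0):
    """Map each categorical level to a claim mean shrunk toward the overall mean.

    k is a hypothetical smoothing constant: a level observed n times puts
    weight n / (n + k) on its own mean and k / (n + k) on the overall mean.
    """
    if len(levels) != len(claims) or not levels:
        raise ValueError("levels and claims must be non-empty and of equal length")

    overall = sum(claims) / len(claims)

    # Accumulate the claim total and observation count for each level.
    totals = defaultdict(float)
    counts = defaultdict(int)
    for level, claim in zip(levels, claims):
        totals[level] += claim
        counts[level] += 1

    # Blend each level's total with k pseudo-observations at the overall mean.
    return {
        level: (totals[level] + k * overall) / (counts[level] + k)
        for level in counts
    }


# Toy data: level "A" is well populated, level "B" has a single noisy claim.
encoded = shrunken_level_means(
    ["A", "A", "A", "A", "B"],
    [100.0, 100.0, 100.0, 100.0, 600.0],
    k=1.0,
)
print(encoded)
```

With this toy data the overall mean is 200, so the single observation for level "B" is pulled from 600 to 400, while the four observations for level "A" move only from 100 to 120. A larger `k` would shrink sparse levels more strongly.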
The experimental results, and resultant conclusions, are presented for each of the methods separately. For each method, a series of four experiments was carried out, each on a different insurance dataset. Detailed experimental results are provided in the appendices.
A comparison of the two methods, and their results, follows. The implications of the use of these methods are discussed, and an estimate of how effective they are at the detection of the risk signal is outlined.
Finally, further research and potential improvements to the methods employed in this research are discussed.