Debt detection and debt recovery with advanced classification techniques

Wu, SS

Publication Type: Thesis
Issue Date: 2015

This item is open access.

Full metadata record

Field	Value	Language
dc.contributor.author	Wu, SS
dc.date.accessioned	2015-11-30T04:22:18Z
dc.date.available	2015-11-30T04:22:18Z
dc.date.issued	2015
dc.identifier.uri	http://hdl.handle.net/10453/39025
dc.description	University of Technology Sydney. Faculty of Engineering and Information Technology.	en_AU
dc.format	Thesis (PhD)
dc.language.iso	en_AU	en_AU
dc.relation	https://opus.lib.uts.edu.au/bitstream/10453/39025/2/02whole.pdf
dc.rights	info:eu-repo/semantics/openAccess
dc.rights	au.edu.uts.lib/ppc
dc.rights	The author owns the copyright in this thesis including all reproduction and reuse rights for the work. The work may not be altered without the permission of the copyright owner. Attribution is essential when quoting or paraphrasing from this thesis.
dc.subject	Debt detection.	en
dc.subject	Debt recovery.	en
dc.subject	Data mining techniques.	en
dc.subject	Centrelink Australia.	en
dc.subject	Sequential transaction data.	en
dc.title	Debt detection and debt recovery with advanced classification techniques	en_AU
dc.type	Thesis	en_AU
utslib.copyright.status	open_access

Abstract:

My study is part of an ARC linkage project between the University of Technology, Sydney and Centrelink Australia, which aims to apply data mining techniques to optimise debt detection and debt recovery. A debt is an overpayment made by the government to a customer who is not entitled to that payment. In social security, each interaction between a customer and the government department is recorded as an activity. A customer's activities occur in order over time and can therefore be regarded as a sequence. According to debt detection experts, the activity sequences of customers who incur debts usually contain recognisable patterns. Patterns that indicate a customer's intention to be overpaid can thus be used to discover or predict debt occurrence.

Developing debt detection and recovery over sequential transaction data is, however, challenging for the following reasons. (1) The volume of transaction data is vast, and new transaction data are generated continuously as the business goes on. (2) Transaction data are always time-stamped by the business system, and their temporal order is closely related to the business logic. (3) The patterns and relationships hidden in the transaction data may be affected by many factors: they depend not only on business domain knowledge but also on seasonal and social factors outside the business.

Based on a survey of existing methods for debt detection and recovery, this thesis studies data mining techniques that detect and recover debt in an adaptive and efficient fashion. Firstly, sequence data are used to model the evolution of customer activities, and sequential patterns generalise the trends of the sequences. In long-running sequence classification tasks, the sequential patterns may vary over time even when the sequences come from the same source. An adaptive sequence classification model is built so that the classification adapts to this variation in the sequential patterns. The model is applied to 15,931 activity sequences from Centrelink, which comprise 849,831 activity records. The experimental results show that the proposed adaptive sequence classification framework performs effectively on continuously arriving data.

Secondly, a new sequence classification technique that uses both positive and negative patterns is studied. It is able to find the relationship between activity sequences and debt occurrences, as well as the impact of oncoming activities on debt occurrence. The same dataset is used for the evaluation. The results show that, when built with the same number of rules, the classifier using both positive and negative rules outperforms traditional classifiers using only positive rules in terms of recall under most conditions.

Finally, decision trees are built to model debt recovery and to predict how customers will respond if contacted by phone. The customer contact strategy driven by the model aims to improve the efficiency of the debt recovery process. The model is used in a real-life pilot project for debt recovery at Centrelink, where it outperforms the traditional random selection of customers.

In summary, this thesis studies debt detection and debt recovery in social security using data mining techniques. The proposed models are novel and effective, and show potential for real business use.
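To make the idea of classifying activity sequences with both positive and negative patterns more concrete, the following is a minimal illustrative sketch in Python. It is not the thesis's algorithm: the activity codes, the rules, the confidence values and the scoring scheme (summing the confidence of every rule that fires) are all assumptions chosen for illustration. A positive rule fires when its pattern occurs in the sequence in order; a negative rule fires when its pattern does not occur.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Sequence, Tuple


def contains(seq: Sequence[str], pattern: Sequence[str]) -> bool:
    """Return True if `pattern` occurs in `seq` as an order-preserving subsequence."""
    it = iter(seq)
    # `item in it` advances the iterator, so later items must appear later in seq.
    return all(item in it for item in pattern)


@dataclass(frozen=True)
class Rule:
    pattern: Tuple[str, ...]  # ordered activity codes (hypothetical)
    present: bool             # True: positive rule; False: negative rule (pattern absent)
    label: str                # class predicted when the rule fires
    confidence: float         # illustrative weight, not taken from the thesis


def classify(seq: Sequence[str], rules: Iterable[Rule], default: str = "no_debt") -> str:
    """Sum the confidence of all firing rules per class and return the best class."""
    scores: Dict[str, float] = {}
    for rule in rules:
        if contains(seq, rule.pattern) == rule.present:
            scores[rule.label] = scores.get(rule.label, 0.0) + rule.confidence
    if not scores:
        return default
    return max(scores, key=lambda label: scores[label])


# Hypothetical rules; activity codes such as "ADDR_CHANGE" are invented for this sketch.
RULES = [
    Rule(("ADDR_CHANGE", "EARNINGS_REPORT"), True, "debt", 0.7),
    Rule(("EARNINGS_REPORT",), False, "debt", 0.6),  # negative: no report was lodged
    Rule(("REVIEW", "EARNINGS_REPORT"), True, "no_debt", 0.8),
]

if __name__ == "__main__":
    print(classify(["CLAIM", "PAYMENT"], RULES))
    print(classify(["REVIEW", "EARNINGS_REPORT"], RULES))
```

In this sketch, the negative rule lets the classifier flag a sequence because an expected activity is missing, which a classifier restricted to positive rules cannot express directly.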

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/39025