Efficient techniques for cost-sensitive learning with multiple cost considerations


dc.contributor.author Wang, Tao
dc.date.accessioned 2013-09-10T01:07:39Z
dc.date.available 2013-09-10T01:07:39Z
dc.date.issued 2013
dc.identifier.uri http://hdl.handle.net/10453/23546
dc.description University of Technology, Sydney. Faculty of Engineering and Information Technology. en_US
dc.description.abstract Cost-sensitive learning is an active research topic in data mining and machine learning, designed to deal with the non-uniform cost of misclassification errors. Over the last ten to fifteen years, diverse learning methods and techniques have been proposed to minimize the total cost of misclassification, tests and other cost types. This thesis studies the currently prevailing cost-sensitive learning methods and techniques, and proposes new and efficient cost-sensitive learning methods and techniques in the following three areas. First, we focus on the over-fitting issue. In an applied context of cost-sensitive learning, many existing data mining algorithms generate good results on training data but normally do not produce an optimal model when applied to unseen data in real-world applications. We deal with this issue by developing three simple and efficient strategies (feature selection, smoothing and threshold pruning) to overcome over-fitting in cost-sensitive learning. This work sets up a solid foundation for the further research and analysis in this thesis in the other areas of cost-sensitive learning. Second, we design and develop an innovative and practical objective-resource cost-sensitive learning framework for addressing a real-world issue where multiple cost units are involved. A lazy cost-sensitive decision tree is built to minimize the objective cost subject to given budgets for the other resource costs. Finally, we study a semi-supervised learning approach in the context of cost-sensitive learning. Two new classification algorithms are proposed to learn cost-sensitive classifiers from training datasets with a small amount of labelled data and plenty of unlabelled data. We also analyse the impact of the different input parameters on the performance of our new algorithms. en_US
dc.language.iso en en_US
dc.subject Data mining. en
dc.subject Machine learning. en
dc.subject Cost-sensitive learning. en
dc.title Efficient techniques for cost-sensitive learning with multiple cost considerations en_US
dc.type Thesis (Ph.D) en_US

