A novel consistent random forest framework: Bernoulli random forests

Wang, Y; Xia, ST; Tang, Q; Wu, J; Zhu, X


Publication Type: Journal Article
Citation: IEEE Transactions on Neural Networks and Learning Systems, 2018, 29 (8), pp. 3510-3523
Issue Date: 2018-08-01

This item is open access.


Full metadata record

Field	Value	Language
dc.contributor.author	Wang, Y	en_US
dc.contributor.author	Xia, ST	en_US
dc.contributor.author	Tang, Q	en_US
dc.contributor.author	Wu, J https://orcid.org/0000-0002-1371-5801	en_US
dc.contributor.author	Zhu, X	en_US
dc.date.issued	2018-08-01	en_US
dc.identifier.citation	IEEE Transactions on Neural Networks and Learning Systems, 2018, 29 (8), pp. 3510 - 3523	en_US
dc.identifier.issn	2162-237X	en_US
dc.identifier.uri	http://hdl.handle.net/10453/123486
dc.relation.ispartof	IEEE Transactions on Neural Networks and Learning Systems	en_US
dc.relation.isbasedon	10.1109/TNNLS.2017.2729778	en_US
dc.subject.classification	Artificial Intelligence & Image Processing	en_US
dc.title	A novel consistent random forest framework: Bernoulli random forests	en_US
dc.type	Journal Article
utslib.citation.volume	8	en_US
utslib.citation.volume	29	en_US
utslib.for	0802 Computation Theory and Mathematics	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Strength - AAI - Advanced Analytics Institute Research Centre
pubs.organisational-group	/University of Technology Sydney/Strength - CAI - Centre for Artificial Intelligence
utslib.copyright.status	open_access
pubs.issue	8	en_US
pubs.publication-status	Published	en_US
pubs.volume	29	en_US

Abstract:

© 2012 IEEE. Random forests (RFs) are recognized as one type of ensemble learning method and are effective for most classification and regression tasks. Despite their impressive empirical performance, the theory of RFs has yet to be fully proved. Several theoretically guaranteed RF variants have been presented, but their poor practical performance has been criticized. In this paper, a novel RF framework is proposed, named Bernoulli RFs (BRFs), with the aim of solving the RF dilemma between theoretical consistency and empirical performance. BRF uses two independent Bernoulli distributions to simplify the tree construction, in contrast to the RFs proposed by Breiman. The two Bernoulli distributions are used separately to control the splitting feature and splitting point selection processes of tree construction. Consequently, theoretical consistency is ensured in BRF, i.e., learning performance is guaranteed to converge to the optimum as the amount of data tends to infinity. Importantly, the proposed BRF is consistent for both classification and regression. Compared with state-of-the-art theoretically consistent RFs, BRF achieves the best empirical performance. This advance in RF research toward closing the gap between theory and practice is verified by the theoretical and experimental studies in this paper.
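
The abstract says only that two independent Bernoulli distributions control splitting feature selection and splitting point selection. The following is a minimal Python sketch of one way such a node-level split choice could look. It is not the authors' implementation. Everything specific in it is an assumption made for illustration: the names `choose_split`, `p_feature` and `p_point`, the use of a single random feature versus roughly sqrt(D) candidate features, the midpoint thresholds, and the Gini impurity criterion.

```python
import random


def gini(labels):
    """Gini impurity of a list of class labels (assumed criterion)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def impurity_decrease(xs, ys, feature, threshold):
    """Impurity reduction obtained by splitting on feature <= threshold."""
    left = [y for x, y in zip(xs, ys) if x[feature] <= threshold]
    right = [y for x, y in zip(xs, ys) if x[feature] > threshold]
    if not left or not right:
        return 0.0
    n = len(ys)
    return gini(ys) - len(left) / n * gini(left) - len(right) / n * gini(right)


def choose_split(xs, ys, p_feature, p_point, rng):
    """Pick (gain, feature, threshold) for one node, or None if no split exists.

    Hypothetical parameters, not taken from the source:
      p_feature: probability that the first Bernoulli trial selects a single
                 random candidate feature instead of about sqrt(D) candidates.
      p_point:   probability that the second Bernoulli trial selects a random
                 threshold instead of the impurity-optimal one.
    """
    n_features = len(xs[0])

    # Bernoulli trial 1: controls how the candidate features are selected.
    if rng.random() < p_feature:
        candidates = [rng.randrange(n_features)]
    else:
        k = max(1, int(n_features ** 0.5))
        candidates = rng.sample(range(n_features), k)

    # Bernoulli trial 2: controls how the splitting point is selected.
    use_random_point = rng.random() < p_point

    best = None
    for f in candidates:
        values = sorted({x[f] for x in xs})
        if len(values) < 2:
            continue  # a constant feature cannot be split
        # Candidate thresholds are midpoints between consecutive distinct values.
        thresholds = [(a + b) / 2 for a, b in zip(values, values[1:])]
        if use_random_point:
            thresholds = [rng.choice(thresholds)]
        for t in thresholds:
            gain = impurity_decrease(xs, ys, f, t)
            if best is None or gain > best[0]:
                best = (gain, f, t)
    return best


if __name__ == "__main__":
    rng = random.Random(0)
    xs = [[0.0], [1.0], [2.0], [3.0]]
    ys = [0, 0, 1, 1]
    # Both probabilities set to 0: always the optimized choice.
    print(choose_split(xs, ys, p_feature=0.0, p_point=0.0, rng=rng))
```

In this sketch, setting both probabilities to 0 always takes the optimized branch and setting both to 1 always takes the fully random branch. Values in between mix the two, which is one way to read the trade-off between theoretical consistency and empirical performance that the abstract describes.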

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/123486