Compressed learning

Zhou, T

Compressed learning

Zhou, T

Permalink

Publication Type:: Thesis
Issue Date:: 2013

Open Access

Copyright Clearance Process

Recently Added
In Progress
Open Access

This item is open access.

Adobe PDF

Download contents and abstractAdobe PDF (278.32 kB)

Adobe PDF

Download thesisAdobe PDF (6.86 MB)

View statistics

Full metadata record

Field	Value	Language
dc.contributor.author	Zhou, T
dc.date.accessioned	2014-03-25T00:35:25Z
dc.date.available	2014-03-25T00:35:25Z
dc.date.issued	2013
dc.identifier.uri	http://hdl.handle.net/10453/24180
dc.description	University of Technology, Sydney. Faculty of Engineering and Information Technology.	en_US
dc.description.abstract	There has been an explosion of data derived from the internet and other digital sources. These data are usually multi-dimensional, massive in volume, frequently incomplete, noisy, and complicated in structure. These "big data" bring new challenges to machine learning (ML), which has historically been designed for small volumes of clearly defined and structured data. In this thesis we propose new methods of "compressed learning", which explore the components and procedures in ML methods that are compressible, in order to improve their robustness, scalability, adaptivity, and performance for big data analysis. We will study novel methodologies that compress different components throughout the learning process, propose more interpretable general compressible structures for big data, and develop effective strategies to leverage these compressible structures to produce highly scalable learning algorithms. We present several new insights into popular learning problems in the context of compressed learning. The theoretical analyses are tested on real data in order to demonstrate the efficacy and efficiency of the methodologies in real-world scenarios. In particular, we propose "manifold elastic net (MEN)" and "double shrinking (DS)" as two fast frameworks extracting low-dimensional sparse features for dimension reduction and manifold learning. These methods compress the features on both their dimension and cardinality, and significantly improve their interpretation and performance in clustering and classification tasks. We study how to derive fewer "anchor points" for representing large datasets in their entirety by proposing "divide-and-conquer anchoring", in which the global solution is rapidly found for near-separable non-negative matrix factorization and completion in a distributed manner. This method represents a compression of the big data itself, rather than features, and the extracted anchors define the structure of the data. Two fast low-rank approximation methods, "bilateral random projections (BRP)" of fast computer closed-form and "greedy bilateral sketch (GreBske)", are proposed based on random projection and greedy augmenting update rules. They can be broadly applied to learning procedures that requires updates of a low-rank matrix variable and result in significant acceleration in performance. We study how to compress noisy data for learning by decomposing it into the sum mixture of low-rank part and sparse part. "GO decomposition (GoDec)" and the "greedy bilateral (GreB)" paradigm are proposed as two efficient approaches to this problem based on randomized and greedy strategies, respectively. Modifications of these two schemes result in novel models and extremely fast algorithms for matrix completion that aim to recover a low-rank matrix from a small number of its entries. In addition, we extend the GoDec problem in order to unmix more than two incoherent structures that are more complicated and expressive than low-rank or sparse matrices. The three proposed variants are not only novel and effective algorithms for motion segmentation in computer vision, multi-label learning, and scoring-function learning in recommendation systems, but also reveal new theoretical insights into these problems. Finally, a compressed learning method termed “compressed labelling (CL) on distilled label sets (DL)" is proposed for solving the three core problems in multi-label learning, namely high-dimensional labels, label correlation modeling, and sample imbalance for each label. By compressing the labels and the number of classifiers in multi-label learning, CL can generate an effective and efficient training algorithm from any single-label classifier.	en_US
dc.format	Thesis (PhD)	en_US
dc.language.iso	en	en_US
dc.relation	https://opus.lib.uts.edu.au/bitstream/10453/24180/12/02whole.pdf
dc.rights	The author owns the copyright in this thesis including all reproduction and reuse rights for the work. The work may not be altered without the permission of the copyright owner. Attribution is essential when quoting or paraphrasing from this thesis.
dc.rights	au.edu.uts.lib/ppc
dc.rights	info:eu-repo/semantics/openAccess
dc.subject	Compressed learning.	en
dc.subject	Sparse learning.	en
dc.subject	Machine learning.	en
dc.subject	Manifold learning.	en
dc.subject	Big data.	en
dc.title	Compressed learning	en_US
dc.type	Thesis
utslib.copyright.status	open_access

Abstract:

There has been an explosion of data derived from the internet and other digital sources. These data are usually multi-dimensional, massive in volume, frequently incomplete, noisy, and complicated in structure. These "big data" bring new challenges to machine learning (ML), which has historically been designed for small volumes of clearly defined and structured data. In this thesis we propose new methods of "compressed learning", which explore the components and procedures in ML methods that are compressible, in order to improve their robustness, scalability, adaptivity, and performance for big data analysis. We will study novel methodologies that compress different components throughout the learning process, propose more interpretable general compressible structures for big data, and develop effective strategies to leverage these compressible structures to produce highly scalable learning algorithms. We present several new insights into popular learning problems in the context of compressed learning. The theoretical analyses are tested on real data in order to demonstrate the efficacy and efficiency of the methodologies in real-world scenarios. In particular, we propose "manifold elastic net (MEN)" and "double shrinking (DS)" as two fast frameworks extracting low-dimensional sparse features for dimension reduction and manifold learning. These methods compress the features on both their dimension and cardinality, and significantly improve their interpretation and performance in clustering and classification tasks. We study how to derive fewer "anchor points" for representing large datasets in their entirety by proposing "divide-and-conquer anchoring", in which the global solution is rapidly found for near-separable non-negative matrix factorization and completion in a distributed manner. This method represents a compression of the big data itself, rather than features, and the extracted anchors define the structure of the data. Two fast low-rank approximation methods, "bilateral random projections (BRP)" of fast computer closed-form and "greedy bilateral sketch (GreBske)", are proposed based on random projection and greedy augmenting update rules. They can be broadly applied to learning procedures that requires updates of a low-rank matrix variable and result in significant acceleration in performance. We study how to compress noisy data for learning by decomposing it into the sum mixture of low-rank part and sparse part. "GO decomposition (GoDec)" and the "greedy bilateral (GreB)" paradigm are proposed as two efficient approaches to this problem based on randomized and greedy strategies, respectively. Modifications of these two schemes result in novel models and extremely fast algorithms for matrix completion that aim to recover a low-rank matrix from a small number of its entries. In addition, we extend the GoDec problem in order to unmix more than two incoherent structures that are more complicated and expressive than low-rank or sparse matrices. The three proposed variants are not only novel and effective algorithms for motion segmentation in computer vision, multi-label learning, and scoring-function learning in recommendation systems, but also reveal new theoretical insights into these problems. Finally, a compressed learning method termed “compressed labelling (CL) on distilled label sets (DL)" is proposed for solving the three core problems in multi-label learning, namely high-dimensional labels, label correlation modeling, and sample imbalance for each label. By compressing the labels and the number of classifiers in multi-label learning, CL can generate an effective and efficient training algorithm from any single-label classifier.

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/24180