META DISCOVERY: LEARNING TO DISCOVER NOVEL CLASSES GIVEN VERY LIMITED DATA

Chi, H; Liu, F; Han, B; Yang, W; Lan, L; Liu, T; Niu, G; Zhou, M; Sugiyama, M

Publication Type: Conference Proceeding
Citation: ICLR 2022 - 10th International Conference on Learning Representations, 2022
Issue Date: 2022-01-01

Filename	Description	Size
1474_meta_discovery_learning_to_dis.pdf	Published version	847.68 kB

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Chi, H
dc.contributor.author	Liu, F https://orcid.org/0000-0002-5005-9129
dc.contributor.author	Han, B
dc.contributor.author	Yang, W
dc.contributor.author	Lan, L
dc.contributor.author	Liu, T
dc.contributor.author	Niu, G
dc.contributor.author	Zhou, M
dc.contributor.author	Sugiyama, M
dc.date.accessioned	2023-07-17T01:51:48Z
dc.date.available	2023-07-17T01:51:48Z
dc.date.issued	2022-01-01
dc.identifier.citation	ICLR 2022 - 10th International Conference on Learning Representations, 2022
dc.identifier.uri	http://hdl.handle.net/10453/171534
dc.description.abstract	In novel class discovery (NCD), we are given labeled data from seen classes and unlabeled data from unseen classes, and we train clustering models for the unseen classes. However, the implicit assumptions behind NCD are still unclear. In this paper, we demystify assumptions behind NCD and find that high-level semantic features should be shared among the seen and unseen classes. Based on this finding, NCD is theoretically solvable under certain assumptions and can be naturally linked to meta-learning that has exactly the same assumption as NCD. Thus, we can empirically solve the NCD problem by meta-learning algorithms after slight modifications. This meta-learning-based methodology significantly reduces the amount of unlabeled data needed for training and makes it more practical, as demonstrated in experiments. The use of very limited data is also justified by the application scenario of NCD: since it is unnatural to label only seen-class data, NCD is sampling instead of labeling in causality. Therefore, unseen-class data should be collected on the way of collecting seen-class data, which is why they are novel and first need to be clustered.
dc.language	en
dc.relation.ispartof	ICLR 2022 - 10th International Conference on Learning Representations
dc.rights	info:eu-repo/semantics/closedAccess
dc.title	META DISCOVERY: LEARNING TO DISCOVER NOVEL CLASSES GIVEN VERY LIMITED DATA
dc.type	Conference Proceeding
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
utslib.copyright.status	closed_access	*
dc.date.updated	2023-07-17T01:51:46Z
pubs.publication-status	Published

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/171534