Weakly supervised photo cropping

Zhang, L; Song, M; Yang, Y; Zhao, Q; Zhao, C; Sebe, N

Publication Type: Journal Article
Citation: IEEE Transactions on Multimedia, 2014, 16 (1), pp. 94-107
Issue Date: 2014-01-01

Closed Access

Filename	Description	Size	Format
Weakly Supervised Photo Cropping.pdf	Published Version	3.08 MB	Adobe PDF

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Zhang, L	en_US
dc.contributor.author	Song, M	en_US
dc.contributor.author	Yang, Y https://orcid.org/0000-0001-5528-0546	en_US
dc.contributor.author	Zhao, Q	en_US
dc.contributor.author	Zhao, C	en_US
dc.contributor.author	Sebe, N	en_US
dc.date.issued	2014-01-01	en_US
dc.identifier.citation	IEEE Transactions on Multimedia, 2014, 16 (1), pp. 94 - 107	en_US
dc.identifier.issn	1520-9210	en_US
dc.identifier.uri	http://hdl.handle.net/10453/115895
dc.description.abstract	Photo cropping is widely used in the printing industry, photography, and cinematography. Conventional photo cropping methods suffer from three drawbacks: 1) the semantics used to describe photo aesthetics are determined by the experience of model designers and specific data sets, 2) image global configurations, an essential cue for capturing photo aesthetics, are not well preserved in the cropped photo, and 3) multi-channel visual features from an image region contribute differently to human aesthetics, but state-of-the-art photo cropping methods cannot automatically weight them. Owing to recent progress in the image retrieval community, image-level semantics, i.e., photo labels obtained without much human supervision, can be efficiently and effectively acquired. Thus, we propose weakly supervised photo cropping, where a manifold embedding algorithm is developed to incorporate image-level semantics and image global configurations with graphlets, i.e., small-sized connected subgraphs. After manifold embedding, a Bayesian Network (BN) is proposed. It incorporates the testing photo into the framework derived from the multi-channel post-embedding graphlets of the training data, the importance of which is determined automatically. Based on the BN, photo cropping can be cast as searching for the candidate cropped photo that maximally preserves graphlets from the training photos, and the optimal cropping parameter is inferred by Gibbs sampling. Subjective evaluations demonstrate that: 1) our approach outperforms several representative photo cropping methods, including our previous cropping model that is guided by semantics-free graphlets, and 2) the visualized graphlets explicitly capture photo semantics and global spatial configurations. © 1999-2012 IEEE.	en_US
dc.relation.ispartof	IEEE Transactions on Multimedia	en_US
dc.relation.isbasedon	10.1109/TMM.2013.2286817	en_US
dc.subject.classification	Artificial Intelligence & Image Processing	en_US
dc.title	Weakly supervised photo cropping	en_US
dc.type	Journal Article
utslib.citation.volume	1	en_US
utslib.citation.volume	16	en_US
utslib.for	0801 Artificial Intelligence and Image Processing	en_US
utslib.for	08 Information and Computing Sciences	en_US
utslib.for	09 Engineering	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Strength - CAI - Centre for Artificial Intelligence
utslib.copyright.status	closed_access
pubs.issue	1	en_US
pubs.publication-status	Published	en_US
pubs.volume	16	en_US

Abstract:

Photo cropping is widely used in the printing industry, photography, and cinematography. Conventional photo cropping methods suffer from three drawbacks: 1) the semantics used to describe photo aesthetics are determined by the experience of model designers and specific data sets, 2) image global configurations, an essential cue for capturing photo aesthetics, are not well preserved in the cropped photo, and 3) multi-channel visual features from an image region contribute differently to human aesthetics, but state-of-the-art photo cropping methods cannot automatically weight them. Owing to recent progress in the image retrieval community, image-level semantics, i.e., photo labels obtained without much human supervision, can be efficiently and effectively acquired. Thus, we propose weakly supervised photo cropping, where a manifold embedding algorithm is developed to incorporate image-level semantics and image global configurations with graphlets, i.e., small-sized connected subgraphs. After manifold embedding, a Bayesian Network (BN) is proposed. It incorporates the testing photo into the framework derived from the multi-channel post-embedding graphlets of the training data, the importance of which is determined automatically. Based on the BN, photo cropping can be cast as searching for the candidate cropped photo that maximally preserves graphlets from the training photos, and the optimal cropping parameter is inferred by Gibbs sampling. Subjective evaluations demonstrate that: 1) our approach outperforms several representative photo cropping methods, including our previous cropping model that is guided by semantics-free graphlets, and 2) the visualized graphlets explicitly capture photo semantics and global spatial configurations. © 1999-2012 IEEE.
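
Illustrative note: the abstract says the optimal cropping parameter is inferred by Gibbs sampling but gives no further detail in this record. The sketch below is not the authors' model. It only shows, under stated assumptions, how Gibbs sampling can search a crop rectangle (x0, y0, x1, y1) by resampling one coordinate at a time from its conditional distribution. The function toy_score is a hypothetical stand-in for the Bayesian Network's measure of how well a candidate crop preserves training graphlets; the target rectangle, temperature, and sweep count are likewise assumptions chosen for illustration.

```python
import math
import random


def toy_score(crop, target=(2, 1, 7, 6)):
    """Hypothetical crop quality: higher when the crop is nearer an assumed target."""
    return -sum(abs(c - t) for c, t in zip(crop, target))


def is_valid(crop, min_size=2):
    """A crop is valid when it is at least min_size wide and tall."""
    x0, y0, x1, y1 = crop
    return x1 - x0 >= min_size and y1 - y0 >= min_size


def gibbs_crop_search(score, width, height, sweeps=50, temperature=1.0, seed=0):
    """Gibbs-sample crop coordinates on an integer grid and keep the best crop seen."""
    rng = random.Random(seed)
    crop = [0, 0, width, height]  # start from the full image
    best, best_score = tuple(crop), score(crop)
    limits = [width, height, width, height]  # upper bound for each coordinate
    for _ in range(sweeps):
        for i in range(4):  # resample one coordinate, holding the other three fixed
            candidates, weights = [], []
            for v in range(limits[i] + 1):
                trial = crop[:i] + [v] + crop[i + 1:]
                if is_valid(trial):
                    candidates.append(v)
                    # Conditional probability proportional to exp(score / temperature).
                    weights.append(math.exp(score(trial) / temperature))
            crop[i] = rng.choices(candidates, weights=weights, k=1)[0]
            current = score(crop)
            if current > best_score:
                best, best_score = tuple(crop), current
    return best, best_score


if __name__ == "__main__":
    print(gibbs_crop_search(toy_score, width=10, height=8))
```

Because each coordinate is drawn from its full conditional rather than greedily maximized, the sampler can leave local optima; keeping the best crop seen turns the chain into a simple stochastic search.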

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/115895