UTS ISA submission at the TRECVID 2019 video to text description task

Rao, Q; Li, G; Yang, Y; Zhang, F; Wang, Z

Publication Type: Conference Proceeding
Citation: 2019 TREC Video Retrieval Evaluation, TRECVID 2019, 2020
Issue Date: 2020-01-01

This item is open access.

Full metadata record

Field	Value	Language
dc.contributor.author	Rao, Q
dc.contributor.author	Li, G
dc.contributor.author	Yang, Y https://orcid.org/0000-0002-0512-880X
dc.contributor.author	Zhang, F
dc.contributor.author	Wang, Z
dc.date.accessioned	2021-04-26T06:32:37Z
dc.date.available	2021-04-26T06:32:37Z
dc.date.issued	2020-01-01
dc.identifier.citation	2019 TREC Video Retrieval Evaluation, TRECVID 2019, 2020
dc.identifier.uri	http://hdl.handle.net/10453/148389
dc.description.abstract	In this paper, we summarize the technical details of our submission to the TRECVID 2019 [1] video to text task. The main improvements fall into three parts: several efficient and comprehensive high-level features that yield expressive visual feature encodings, algorithms for regulating and optimizing a robust language model, and an expandable strategy for ensembling well-trained single models. We also evaluated these techniques carefully, and a comprehensive comparison of the experiments indicates their effectiveness in video captioning.
dc.language	en
dc.relation.ispartof	2019 TREC Video Retrieval Evaluation, TRECVID 2019
dc.rights	info:eu-repo/semantics/openAccess
dc.title	UTS ISA submission at the TRECVID 2019 video to text description task
dc.type	Conference Proceeding
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Strength - AAII - Australian Artificial Intelligence Institute
utslib.copyright.status	open_access	*
dc.date.updated	2021-04-26T06:32:36Z
pubs.publication-status	Published

Abstract:

In this paper, we summarize the technical details of our submission to the TRECVID 2019 [1] video to text task. The main improvements fall into three parts: several efficient and comprehensive high-level features that yield expressive visual feature encodings, algorithms for regulating and optimizing a robust language model, and an expandable strategy for ensembling well-trained single models. We also evaluated these techniques carefully, and a comprehensive comparison of the experiments indicates their effectiveness in video captioning.

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/148389