UTS ISA submission at the TRECVID 2019 video to text description task

Publication Type:
Conference Proceeding
Citation:
2019 TREC Video Retrieval Evaluation, TRECVID 2019, 2020
Issue Date:
2020-01-01
In this paper, we summarize the technical details of our submission to the TRECVID 2019 [1] video to text task. The main improvements fall into three parts: several efficient and comprehensive high-level features that yield expressive visual feature encodings, algorithms for regulating and optimizing a robust language model, and an expandable strategy for ensembling the well-trained single models. We also evaluated these techniques in detail, and a comprehensive comparison of the experiments indicates their effectiveness in video captioning.