Computer Science Diagram Understanding with Topology Parsing

Publisher:
Association for Computing Machinery (ACM)
Publication Type:
Journal Article
Citation:
ACM Transactions on Knowledge Discovery from Data, 2022, 16, (6), pp. 1-20
Issue Date:
2022-07-30
Filename:
Computer Science Diagram Understanding with Topology.pdf
Description:
Published version
Size:
3.49 MB
Format:
Adobe PDF
Diagrams are a special form of visual expression for representing complex concepts, logic, and knowledge, and they appear widely in educational material such as textbooks, blogs, and encyclopedias. Current research on diagrams focuses primarily on natural-science disciplines such as Biology and Geography, whose diagrams still resemble natural images. In this article, we construct the first dataset of geometric-type diagrams in the Computer Science field, which have more abstract expressions and more complex logical relations. The dataset has exhaustive annotations of objects and relations for about 1,300 diagrams and 3,500 question-answer pairs. We introduce the tasks of diagram classification (DC) and diagram question answering (DQA) based on the new dataset, and propose the Diagram Parsing Net (DPN), which focuses on analyzing the topological structure and text information of diagrams. We use DPN-based models to solve the DC and DQA tasks, and compare their performance with well-known natural image classification models and visual question answering models. Our experiments show the effectiveness of the proposed DPN-based models on diagram understanding tasks, and also indicate that our dataset is more complex than previous natural image understanding datasets. The presented dataset opens new challenges for research in diagram understanding, and the DPN method provides a novel perspective for studying such data. The dataset is available at https://github.com/WayneWong97/CSDia.