Adapting GNNs for Document Understanding: A Flexible Framework with Multiview Global Graphs

Publisher:
Institute of Electrical and Electronics Engineers (IEEE)
Publication Type:
Journal Article
Citation:
IEEE Transactions on Computational Social Systems, 2024, PP, (99), pp. 1-14
Issue Date:
2024-01-01
Abstract:
Graph neural networks (GNNs) have recently gained attention for capturing complex relations, prompting researchers to explore their potential in document classification. Existing studies follow two directions: inductive learning, which focuses on the personalized context relations within each document, and transductive learning, which targets the global distribution relations among documents in a corpus. Both directions extract distinct types of beneficial structural information and yield encouraging outcomes. However, because their underlying graph structures and learning settings are incompatible, it is challenging to develop an enhanced model that effectively integrates local and global relational learning within existing frameworks. To address this issue, we propose a new GNN-based document representation learning framework that incorporates multiview global graphs at both the word and document levels, focusing on learning the diverse global distribution information of texts at different granularities. Additionally, a contextual encoder derives the initial representations of document nodes from the updated representations of word nodes, integrating personalized context relations into the document representations in the process. Finally, we tailor a node representation learning strategy for the multiview global graphs, called the multiview graph sampling and updating module, which allows our framework to train efficiently without being constrained by the scale of the global graph. Experiments indicate that our framework generally enhances performance by integrating both global and local relational learning. When combined with large-scale language models, our framework achieves state-of-the-art results for GNN-based models across multiple datasets.