COCLEP: Contrastive Learning-based Semi-Supervised Community Search

Publisher:
IEEE
Publication Type:
Conference Proceeding
Citation:
2023 IEEE 39th International Conference on Data Engineering (ICDE), 2023, 2023-April, pp. 2483-2495
Issue Date:
2023-07-26
Filename:
COCLEP_Contrastive_Learning-based_Semi-Supervised_Community_Search.pdf
Description:
Published version
Size:
1.81 MB
Format:
Adobe PDF
Abstract:
Community search is a fundamental graph processing task that aims to find a community containing the given query node. Recent studies show that machine learning (ML)-based community search can return higher-quality communities than the classic methods such as k-core and k-truss. However, the state-of-the-art ML-based models require a large number of labeled data (i.e., nodes in ground-truth communities) for training that are difficult to obtain in real applications, and incur unaffordable memory costs or query time for large datasets. To address these issues, in this paper, we present the community search based on contrastive learning with partition, namely COCLEP, which only requires a few labels and is both memory and query efficient. In particular, given a small collection of query nodes and a few (e.g., three) corresponding ground-truth community nodes for each query, COCLEP learns a query-dependent model through the proposed graph neural network and the designed label-aware contrastive learner. The former perceives query node information, low-order neighborhood information, and high-order hypergraph structure information; the latter contrasts low-order intra-view, high-order intra-view, and low-high-order inter-view representations of the nodes. Further, we theoretically prove that COCLEP can be scalable to large datasets with the min-cut over the graph. To the best of our knowledge, this is the first attempt to adopt contrastive learning for the community search task that is nontrivial. Extensive experiments on real-world datasets show that COCLEP simultaneously achieves better community effectiveness and comparably high query efficiency while using fewer labels compared with the state-of-the-art approaches, and is scalable for large datasets.
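For context on the classic baselines the abstract compares against, the following is a minimal sketch of k-core-based community search. It is not COCLEP and is not taken from the paper. The function name `k_core_community`, the adjacency-dictionary input, and the example graph are all assumptions made for illustration. The sketch peels away nodes with fewer than k neighbors, then returns the connected part of what remains that contains the query node.

```python
from collections import deque


def k_core_community(adj, query, k):
    """Return the connected component of the k-core that contains `query`.

    `adj` is assumed to be an undirected graph given as {node: set(neighbors)}
    with no self-loops. Returns an empty set if `query` is not in the k-core.
    """
    alive = set(adj)
    degree = {v: len(adj[v]) for v in adj}

    # Peeling step: repeatedly remove nodes whose remaining degree is below k.
    queue = deque(v for v in adj if degree[v] < k)
    while queue:
        v = queue.popleft()
        if v not in alive:
            continue  # already removed via an earlier queue entry
        alive.discard(v)
        for u in adj[v]:
            if u in alive:
                degree[u] -= 1
                if degree[u] < k:
                    queue.append(u)

    if query not in alive:
        return set()

    # Breadth-first search restricted to surviving nodes.
    community = {query}
    frontier = deque([query])
    while frontier:
        v = frontier.popleft()
        for u in adj[v]:
            if u in alive and u not in community:
                community.add(u)
                frontier.append(u)
    return community


# Hypothetical example: a 4-clique {0, 1, 2, 3} with a tail 3-4-5.
graph = {
    0: {1, 2, 3},
    1: {0, 2, 3},
    2: {0, 1, 3},
    3: {0, 1, 2, 4},
    4: {3, 5},
    5: {4},
}
print(sorted(k_core_community(graph, 0, 3)))  # [0, 1, 2, 3]
```

Structural baselines of this kind need no labels, but they return whatever the fixed cohesion rule dictates. The abstract's claim is that learned, query-dependent models can return higher-quality communities than such rules.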