HyperX: A Scalable Hypergraph Framework
- Publication Type:
- Journal Article
- IEEE Transactions on Knowledge and Data Engineering, 2019, 31 (5), pp. 909 - 922
- Issue Date:
Copyright Clearance Process
- Recently Added
- In Progress
- Closed Access
This item is closed access and not available.
© 1989-2012 IEEE. Hypergraphs are generalizations of graphs where the (hyper)edges can connect any number of vertices. They are powerful tools for representing complex and non-pairwise relationships. However, existing graph computation frameworks cannot accommodate hypergraphs without converting them into graphs, because they do not offer APIs that support (hyper)edges directly. This graph conversion may create excessive replicas and result in very large graphs, causing difficulties in workload balancing. A few tools have been developed for hypergraph partitioning, but they are not general-purpose frameworks for hypergraph processing. In this paper, we propose HyperX, a general-purpose distributed hypergraph processing framework built on top of Spark. HyperX is based on the computation paradigm Pregel, which is user-friendly and has been widely adopted by popular graph computation frameworks. To help create balanced workloads for distributed hypergraph processing, we further investigate the hypergraph partitioning problem and propose a novel label propagation partitioning (LPP) algorithm. We conduct extensive experiments using both real and synthetic data. The result shows that HyperX achieves an order of magnitude improvement for running hypergraph learning algorithms compared with graph conversion based approaches in terms of running time, network communication costs, and memory consumption. For hypergraph partitioning, LPP outperforms the baseline algorithms significantly in these measures as well.
Please use this identifier to cite or link to this item: