Path-based holistic detection plan for multiple patterns in distributed graph frameworks

Publication Type:
Journal Article
Citation:
VLDB Journal, 2017, 26 (3), pp. 327 - 345
Issue Date:
2017-06-01
Metrics:
Full metadata record
Files in This Item:
Filename Description Size
Gao2017_Article_Path-basedHolisticDetectionPla.pdfPublished Version1.23 MB
Adobe PDF
© 2017, Springer-Verlag Berlin Heidelberg. Multiple pattern detection is needed in applications like disease analysis over gene networks, bug detection in program flow networks. This paper takes pattern detection to investigate the evaluation and optimization of multiple jobs in existing distributed graph processing frameworks. The evaluation plan for multiple pattern detection should be parallelizable and can capture and reuse the shared parts among pattern queries easily. In this paper, we design a path-based holistic plan for multiple pattern queries. Specifically, (1) we design a path-based edge-covered plan for an individual pattern. The paths in the plan can be easily captured and reused among different queries. Additionally, the evaluation plan is fully parallelizable, in which each data vertex performs necessary join operations independently during exploring graph. (2) We extend the individual plan to a holistic evaluation plan for multiple queries, whose results are equivalent to those of individual queries. The plan reduces the overall cost by finding frequent paths among queries and reusing the shared part in the holistic plan. (3) We devise various optimization strategies over the holistic plan. The experimental studies, conducted on Giraph, illustrate the high effectiveness of our holistic approaches.
Please use this identifier to cite or link to this item: