RushMon: Real-time isolation anomalies monitoring

Publication Type:
Conference Proceeding
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2018, pp. 647 - 662
© 2018 Association for Computing Machinery. Motivated by the applicability of HogWild!-style algorithms, researchers have turned their attention to system architectures that provide ultra-high-throughput random access with very limited or no isolation guarantees, and have built inconsistency-tolerant applications (e.g., large-scale optimization algorithms) on top of them. Although some optimization algorithms have theoretical convergence guarantees, these systems sometimes fail to compute correct results when the assumptions behind those guarantees do not hold. Moreover, there is no practical way to tell whether a given result is accurate (without cross-validation) or to tune the isolation strength on the fly. To resolve these problems, these systems need an indicator that reports the number of "bad events" caused by "out-of-order" executions. In this paper, we tackle this problem. Based on transaction processing theory, we identify the number of cycles in the dependency graph as such an indicator and demonstrate that it is a good one. With this observation, we propose the first real-time isolation anomalies monitor. Our monitor is at least 1000x faster than naive implementations and reports accurate isolation anomaly levels with less than 1% extra overhead. Monitoring anomalies in real time efficiently protects the systems from excessive isolation anomalies, which could lead to incorrect results. We verify the performance and effectiveness of our monitor via extensive experimental studies.
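As a rough illustration of the indicator described in the abstract, the sketch below builds a dependency (conflict) graph from a small operation history and counts its simple cycles. It is a naive, offline toy and not the paper's monitor. The history format and the function names build_dependency_graph and count_simple_cycles are hypothetical, chosen only for this example, and the exhaustive cycle enumeration is practical only for very small graphs.

```python
from collections import defaultdict


def build_dependency_graph(history):
    """Build a conflict graph from (txn, op, key) tuples in execution order.

    op is 'r' (read) or 'w' (write). This input format is an assumption
    made for the sketch, not something specified by the paper.
    """
    edges = defaultdict(set)
    seen = defaultdict(list)  # key -> earlier (txn, op) pairs on that key
    for txn, op, key in history:
        for prev_txn, prev_op in seen[key]:
            # Two operations conflict if they belong to different
            # transactions and at least one of them is a write.
            if prev_txn != txn and "w" in (prev_op, op):
                edges[prev_txn].add(txn)
        seen[key].append((txn, op))
    return edges


def count_simple_cycles(edges):
    """Naively count directed simple cycles.

    Each cycle is counted once, rooted at its smallest node, by only
    walking through nodes larger than the starting node.
    """
    nodes = sorted(set(edges) | {v for vs in edges.values() for v in vs})
    count = 0

    def dfs(start, node, visited):
        nonlocal count
        for nxt in edges.get(node, ()):
            if nxt == start:
                count += 1
            elif nxt > start and nxt not in visited:
                dfs(start, nxt, visited | {nxt})

    for start in nodes:
        dfs(start, start, {start})
    return count


# Two transactions that both read x before either writes it (a lost update).
interleaved = [(1, "r", "x"), (2, "r", "x"), (1, "w", "x"), (2, "w", "x")]
# The same operations executed one transaction after the other.
serial = [(1, "r", "x"), (1, "w", "x"), (2, "r", "x"), (2, "w", "x")]

print(count_simple_cycles(build_dependency_graph(interleaved)))
print(count_simple_cycles(build_dependency_graph(serial)))
```

In this toy, the interleaved history yields a cycle between the two transactions while the serial history yields none, which is the intuition behind using the cycle count as an anomaly indicator.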