A scalable data chunk similarity based compression approach for efficient big sensing data processing on cloud
- Publication Type: Journal Article
- IEEE Transactions on Knowledge and Data Engineering, 2017, 29 (6), pp. 1144 - 1157
- Issue Date:
Files in This Item:
|A scalable data chunk similarity based compression approach for efficient big sensing data processing on cloud.pdf||Published Version||904.11 kB|
This item is closed access and not available.
© 1989-2012 IEEE. Big sensing data is prevalent in both industry and scientific research applications, where data is generated at high volume and velocity. Cloud computing provides a promising platform for processing and storing big sensing data because it offers a flexible stack of massive computing, storage, and software services in a scalable manner. Current big sensing data processing on the Cloud has adopted some data compression techniques. However, because of the high volume and velocity of big sensing data, traditional data compression techniques lack sufficient efficiency and scalability for data processing. Based on specific on-Cloud data compression requirements, we propose a novel scalable data compression approach based on calculating the similarity among partitioned data chunks. Instead of compressing basic data units, compression is conducted over partitioned data chunks. To restore the original data sets, restoration functions and predictions are designed. MapReduce is used for the algorithm implementation to achieve extra scalability on the Cloud. With experiments on real-world meteorological big sensing data on the U-Cloud platform, we demonstrate that the proposed scalable compression approach based on data chunk similarity can significantly improve data compression efficiency with an affordable loss of data accuracy.
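The idea of compressing over partitioned chunks rather than basic data units can be illustrated with a minimal single-machine sketch. It is not the authors' algorithm: the abstract does not specify the similarity model, the restoration functions, or the MapReduce decomposition, so the fixed-size partitioning, the RMS-based `similarity` score, the `threshold` parameter, and all function names below are assumptions made purely for illustration.

```python
from math import sqrt


def partition(values, chunk_size):
    """Split a flat list of readings into consecutive fixed-size chunks."""
    return [values[i:i + chunk_size] for i in range(0, len(values), chunk_size)]


def similarity(a, b):
    """Hypothetical similarity score: 1 / (1 + RMS difference).

    Returns 1.0 for identical chunks and 0.0 for chunks of different length.
    """
    if len(a) != len(b):
        return 0.0
    rms = sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))
    return 1.0 / (1.0 + rms)


def compress(values, chunk_size, threshold):
    """Replace each chunk by a reference to a sufficiently similar stored chunk.

    Returns (standards, refs): the chunks kept verbatim, and one index per
    input chunk pointing into `standards`. The result is lossy whenever a
    chunk is represented by a similar but not identical standard chunk.
    """
    standards = []
    refs = []
    for chunk in partition(values, chunk_size):
        best_idx, best_sim = -1, 0.0
        for idx, std in enumerate(standards):
            s = similarity(chunk, std)
            if s > best_sim:
                best_idx, best_sim = idx, s
        if best_idx >= 0 and best_sim >= threshold:
            refs.append(best_idx)           # reuse an existing standard chunk
        else:
            standards.append(chunk)         # keep this chunk verbatim
            refs.append(len(standards) - 1)
    return standards, refs


def restore(standards, refs):
    """Rebuild an approximation of the original sequence from the references."""
    restored = []
    for idx in refs:
        restored.extend(standards[idx])
    return restored
```

For example, calling `compress` on nine temperature readings with `chunk_size=3` and `threshold=0.9` stores only the chunks that differ noticeably from those already kept, and `restore` returns a sequence of the original length in which near-duplicate chunks are approximated by their stored counterpart. In a MapReduce setting, the per-chunk comparison would presumably be the part distributed across workers, but that mapping is not described in the abstract.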