A technical roadmap for achieving scalable big sensing data curation on Cloud

Publication Type:
Thesis
Issue Date:
2016
Big data refers to data sets so large and complex that they are difficult to process with traditional database management systems or data processing tools. Modern sensing systems are an important source of such data sets: they generate volumes of sensing data beyond the ability of commonly used software tools to capture, manage, and process within a tolerable time. Big sensing data is prevalent in both industrial and scientific research applications. Its massive size, extreme complexity and high speed pose new challenges for real-time data collection, storage, organization, analysis and publishing when real-world sensing systems are deployed. The Cloud environment, with its massive storage, scalability and powerful computing capability, is an ideal platform for big sensing data processing. A growing body of research and industry effort has been devoted to exploring ways to process big sensing data on Cloud in order to better address these challenges. In this thesis, we will concentrate on data curation and preparation within the overall theme of big sensing data processing. In particular, under the topic of big sensing data curation on Cloud, two important issues will be intensively investigated: scalable big sensing data cleaning and scalable big sensing data compression. For big sensing data cleaning, a systematic approach will be developed to solve the error detection and error recovery problems of big sensing data. For big sensing data compression, independent techniques will be developed to reduce the size of incoming big sensing data and hence reduce the cost of Cloud storage, avoid navigation of big data sets, and guarantee real-time reaction.
Unlike traditional data cleaning and compression techniques, the techniques developed in this thesis will be strongly influenced by the features of big sensing data, the real-time requirement and the scalability of Cloud. With these techniques, a detailed roadmap for achieving scalable big sensing data curation on Cloud will be proposed as our overall research outcome. Finally, the techniques in the proposed roadmap will be tested and verified with real-world big sensing data sets on Cloud to show their effectiveness, efficiency and other performance gains. We aim to demonstrate that, with the proposed roadmap, the typical challenges of big sensing data curation will be solved through the massive computational power and resource support of Cloud.