Harvesting multiple resources for software as a service offers: A big data study

Alkalbani, AM; Ghamry, AM; Hussain, FK; Hussain, OK


Publication Type: Conference Proceeding
Citation: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2016, 9947 LNCS, pp. 61-71
Issue Date: 2016-01-01

This item is open access.


Full metadata record

Field	Value	Language
dc.contributor.author	Alkalbani, AM	en_US
dc.contributor.author	Ghamry, AM	en_US
dc.contributor.author	Hussain, FK https://orcid.org/0000-0003-1513-8072	en_US
dc.contributor.author	Hussain, OK	en_US
dc.date.issued	2016-01-01	en_US
dc.identifier.citation	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2016, 9947 LNCS pp. 61 - 71	en_US
dc.identifier.isbn	9783319466866	en_US
dc.identifier.issn	0302-9743	en_US
dc.identifier.uri	http://hdl.handle.net/10453/103210
dc.relation.ispartof	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)	en_US
dc.relation.isbasedon	10.1007/978-3-319-46687-3_7	en_US
dc.subject.classification	Artificial Intelligence & Image Processing	en_US
dc.title	Harvesting multiple resources for software as a service offers: A big data study	en_US
dc.type	Conference Proceeding
utslib.citation.volume	9947 LNCS	en_US
utslib.for	080301 Bioinformatics Software	en_US
utslib.for	080109 Pattern Recognition and Data Mining	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology/School of Computer Science
pubs.organisational-group	/University of Technology Sydney/Strength - CAI - Centre for Artificial Intelligence
utslib.copyright.status	open_access
pubs.publication-status	Published	en_US
pubs.volume	9947 LNCS	en_US

Abstract:

© Springer International Publishing AG 2016. Currently, the World Wide Web (WWW) is the primary resource for cloud services information, including offers and providers. Cloud applications (Software as a Service, SaaS), such as Google App, are among the most popular and commonly used types of cloud services. Having access to a large amount of information on SaaS offers is critical for the potential cloud client to select and purchase an appropriate service. Web harvesting has become a primary tool for discovering knowledge from Web sources. This paper describes the design and development of a Web scraper to collect information on SaaS offers from target digital cloud services advertisement portals, namely www.getApp.com and www.cloudreviews.com. The collected data were used to establish two datasets: a SaaS provider’s dataset and a SaaS reviews/feedback dataset. Further, we applied sentiment analysis to the reviews dataset to establish a third dataset called the SaaS sentiment polarity dataset. The significance of this study is that it is the first work to focus on Web harvesting for the cloud computing domain, and it also establishes the first SaaS services datasets. Furthermore, we present statistical data that can be helpful in determining the current status of SaaS services and the number of services offered on the Web. In our conclusion, we provide further insight into improving Web scraping for SaaS service information. Our datasets are available online through www.bluepagesdataset.com.
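
The pipeline outlined in the abstract (scrape review text from a portal page, then label each review with a sentiment polarity) can be sketched roughly as follows. This record does not describe the authors' actual scraper or sentiment method, so everything here is an assumption made for illustration: the HTML markup in `SAMPLE_HTML`, the `review` class name, the tiny word lists, and the helper names `ReviewParser`, `polarity`, and `harvest` are all hypothetical. The sketch uses only the Python standard library and a naive lexicon count in place of whatever sentiment technique the paper applied.

```python
from html.parser import HTMLParser

# Hypothetical markup for illustration; real portal pages will differ.
SAMPLE_HTML = """
<div class="offer">
  <h2 class="name">ExampleCRM</h2>
  <p class="review">Great support and easy setup.</p>
  <p class="review">Slow and buggy interface.</p>
</div>
"""


class ReviewParser(HTMLParser):
    """Collects the text of every <p class="review"> element."""

    def __init__(self):
        super().__init__()
        self.reviews = []
        self._in_review = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs.
        if tag == "p" and ("class", "review") in attrs:
            self._in_review = True
            self.reviews.append("")

    def handle_endtag(self, tag):
        if tag == "p":
            self._in_review = False

    def handle_data(self, data):
        if self._in_review:
            self.reviews[-1] += data


# Assumed toy lexicon; a real study would use a proper sentiment resource.
POSITIVE = {"great", "easy", "good", "excellent"}
NEGATIVE = {"slow", "buggy", "bad", "poor"}


def polarity(text):
    """Label text by counting positive and negative lexicon words."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def harvest(html):
    """Return (review text, polarity label) pairs found in the HTML."""
    parser = ReviewParser()
    parser.feed(html)
    return [(r.strip(), polarity(r)) for r in parser.reviews]


if __name__ == "__main__":
    for review, label in harvest(SAMPLE_HTML):
        print(f"{label}: {review}")
```

Keeping extraction (`ReviewParser`) separate from labeling (`polarity`) mirrors the abstract's split between the reviews/feedback dataset and the derived sentiment polarity dataset, so either step could be swapped out independently.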

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/103210