Statistical strategies for the analysis of massive data sets.

Hwang, H; Ryan, L

Publisher: Wiley-VCH Verlag
Publication Type: Journal Article
Citation: Biometrical Journal: journal of mathematical methods in biosciences, 2020, 62, (2), pp. 270-281
Issue Date: 2020

Closed Access

Filename	Description	Size
bimj.201900034.pdf	Published version	811.87 kB

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Hwang, H https://orcid.org/0000-0002-5882-8068
dc.contributor.author	Ryan, L https://orcid.org/0000-0001-5957-2490
dc.date.accessioned	2020-10-28T04:18:17Z
dc.date.available	2019-06-24
dc.date.available	2020-10-28T04:18:17Z
dc.date.issued	2020
dc.identifier.citation	Biometrical Journal: journal of mathematical methods in biosciences, 2020, 62, (2), pp. 270-281
dc.identifier.issn	0323-3847
dc.identifier.issn	1521-4036
dc.identifier.uri	http://hdl.handle.net/10453/143552
dc.format	Print-Electronic
dc.language	eng
dc.publisher	Wiley-VCH Verlag
dc.relation	http://purl.org/au-research/grants/arc/CE140100049
dc.relation.ispartof	Biometrical Journal: journal of mathematical methods in biosciences
dc.relation.isbasedon	10.1002/bimj.201900034
dc.rights	info:eu-repo/semantics/restrictedAccess
dc.subject	0104 Statistics
dc.subject.classification	Statistics & Probability
dc.title	Statistical strategies for the analysis of massive data sets.
dc.type	Journal Article
utslib.citation.volume	62
utslib.location.activity	Germany
utslib.for	0104 Statistics
pubs.organisational-group	/University of Technology Sydney/Faculty of Science
pubs.organisational-group	/University of Technology Sydney/Faculty of Science/School of Mathematical and Physical Sciences
pubs.organisational-group	/University of Technology Sydney
utslib.copyright.status	closed_access	*
pubs.consider-herdc	true
dc.date.updated	2020-10-28T04:18:11Z
pubs.issue	2
pubs.publication-status	Published online
pubs.volume	62
utslib.citation.issue	2

Abstract:

The advent of the big data age has changed the landscape for statisticians. Public and private organizations alike these days are interested in capturing and analyzing complex customer data in order to improve their service and drive efficiency gains. However, the large volume of data involved often means that standard statistical methods fail and new ways of thinking are needed. Although great gains can be obtained through the use of more advanced computing environments or through developing sophisticated new statistical algorithms that handle data in a more efficient way, there are also many simpler things that can be done to handle large data sets in an efficient and intuitive manner. These include the use of distributed analysis methodologies, clever subsampling, data coarsening, and clever data reductions that exploit concepts such as sufficiency. These kinds of strategies represent exciting opportunities for statisticians to remain front and center in the data science world.
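
One of the strategies named in the abstract, data reduction that exploits sufficiency, can be illustrated with a small sketch. The example below is not taken from the paper; the function names (`chunk_stats`, `combine`, `fit_line`) and the toy data are assumptions made for illustration. It fits a simple linear regression by reducing each chunk of data to five running totals, so the full data set never has to be held in memory at once, and the chunks could equally be processed on separate machines.

```python
from typing import Iterable, List, Tuple

Point = Tuple[float, float]                       # one (x, y) observation
Stats = Tuple[float, float, float, float, float]  # (n, sum x, sum y, sum x^2, sum xy)


def chunk_stats(chunk: List[Point]) -> Stats:
    """Reduce one chunk to the totals needed for the least-squares line."""
    n = float(len(chunk))
    sx = sum(x for x, _ in chunk)
    sy = sum(y for _, y in chunk)
    sxx = sum(x * x for x, _ in chunk)
    sxy = sum(x * y for x, y in chunk)
    return (n, sx, sy, sxx, sxy)


def combine(all_stats: Iterable[Stats]) -> Stats:
    """Add per-chunk totals; the result equals the totals of the pooled data."""
    totals = [0.0, 0.0, 0.0, 0.0, 0.0]
    for stats in all_stats:
        for i, value in enumerate(stats):
            totals[i] += value
    return (totals[0], totals[1], totals[2], totals[3], totals[4])


def fit_line(stats: Stats) -> Tuple[float, float]:
    """Return (intercept, slope) of the least-squares line from the totals."""
    n, sx, sy, sxx, sxy = stats
    denom = n * sxx - sx * sx
    if denom == 0:
        raise ValueError("x values are all identical; slope is undefined")
    slope = (n * sxy - sx * sy) / denom
    intercept = (sy - slope * sx) / n
    return intercept, slope


# Toy data lying exactly on y = 2 + 3x, split into three chunks.
data = [(float(x), 2.0 + 3.0 * x) for x in range(10)]
chunks = [data[0:3], data[3:6], data[6:10]]

intercept, slope = fit_line(combine(chunk_stats(c) for c in chunks))
```

The five totals are enough to recover the regression coefficients; estimating the residual variance as well would additionally require the sum of squared `y` values.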

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/143552