Statistical strategies for the analysis of massive data sets.

Publisher:
Wiley-VCH Verlag
Publication Type:
Journal Article
Citation:
Biometrical Journal: journal of mathematical methods in biosciences, 2020, 62, (2), pp. 270-281
Issue Date:
2020
The advent of the big data age has changed the landscape for statisticians. These days, public and private organizations alike are interested in capturing and analyzing complex customer data in order to improve their services and drive efficiency gains. However, the large volume of data involved often means that standard statistical methods fail and new ways of thinking are needed. Although great gains can be obtained through the use of more advanced computing environments or through developing sophisticated new statistical algorithms that handle data more efficiently, there are also many simpler things that can be done to handle large data sets in an efficient and intuitive manner. These include distributed analysis methodologies, clever subsampling, data coarsening, and data reductions that exploit concepts such as sufficiency. These kinds of strategies represent exciting opportunities for statisticians to remain front and center in the data science world.
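To make the sufficiency-based reduction concrete, here is a minimal illustrative sketch (not taken from the paper itself) of one such strategy for ordinary least squares: for the linear model, the cross-product matrices X'X and X'y are sufficient for the coefficient estimates, so they can be accumulated chunk by chunk and the full data set never needs to be held in memory at once. All variable names and the chunk size are illustrative assumptions.

```python
import numpy as np

# Illustrative sketch: fit OLS by streaming over chunks of a large data
# set, accumulating the sufficient statistics X'X and X'y. Only a p x p
# matrix and a length-p vector are retained between chunks.

rng = np.random.default_rng(0)
n, p, chunk_size = 10_000, 3, 1_000  # assumed sizes for the sketch
X = rng.normal(size=(n, p))
beta_true = np.array([1.5, -2.0, 0.5])
y = X @ beta_true + rng.normal(scale=0.1, size=n)

# Single streaming pass: accumulate the sufficient statistics.
XtX = np.zeros((p, p))
Xty = np.zeros(p)
for start in range(0, n, chunk_size):
    Xc = X[start:start + chunk_size]
    yc = y[start:start + chunk_size]
    XtX += Xc.T @ Xc
    Xty += Xc.T @ yc

# Solve the normal equations from the accumulated statistics.
beta_hat = np.linalg.solve(XtX, Xty)

# For comparison: the all-at-once least-squares fit on the full data.
beta_full, *_ = np.linalg.lstsq(X, y, rcond=None)
```

Because the chunk-level statistics sum exactly to their full-data counterparts, the streamed estimate agrees with the all-at-once fit up to numerical precision; the same idea underlies distributed analysis, with each worker contributing its local cross-products.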