Statistical strategies for the analysis of massive data sets.
- Publisher:
- Wiley-VCH Verlag
- Publication Type:
- Journal Article
- Citation:
- Biometrical Journal: journal of mathematical methods in biosciences, 2020, 62, (2), pp. 270-281
- Issue Date:
- 2020
Closed Access
Filename | Description | Size
---|---|---
bimj.201900034.pdf | Published version | 811.87 kB
This item is closed access and not available.
The advent of the big data age has changed the landscape for statisticians. Public and private organizations alike are now interested in capturing and analyzing complex customer data in order to improve their services and drive efficiency gains. However, the large volume of data involved often means that standard statistical methods fail and new ways of thinking are needed. Although great gains can be obtained through more advanced computing environments or through sophisticated new statistical algorithms that handle data more efficiently, there are also many simpler things that can be done to handle large data sets in an efficient and intuitive manner. These include distributed analysis methodologies, clever subsampling, data coarsening, and data reductions that exploit concepts such as sufficiency. Such strategies represent exciting opportunities for statisticians to remain front and center in the data science world.
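To illustrate how distributed analysis and sufficiency can work together, here is a minimal sketch for ordinary least squares. It is not taken from the article: the function names, the simulated data, and the choice of 10 chunks are all illustrative assumptions. The idea is that each chunk of data only needs to report the small matrices X'X and X'y, which are sufficient for the regression coefficients, so the full data set never has to be held in one place.

```python
import numpy as np

def chunk_sufficient_stats(X, y):
    """Return (X'X, X'y) for one chunk; hypothetical helper for illustration."""
    return X.T @ X, X.T @ y

def combine_and_solve(stats):
    """Sum the per-chunk statistics and solve the normal equations."""
    xtx = sum(s[0] for s in stats)
    xty = sum(s[1] for s in stats)
    return np.linalg.solve(xtx, xty)

# Simulated data (an assumption; stands in for a data set too large for memory)
rng = np.random.default_rng(0)
n, p = 10_000, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta_true = np.array([1.0, 2.0, -0.5])
y = X @ beta_true + rng.normal(scale=0.1, size=n)

# Pretend the rows live in 10 separate chunks, e.g. on different machines
chunks = zip(np.array_split(X, 10), np.array_split(y, 10))
stats = [chunk_sufficient_stats(Xc, yc) for Xc, yc in chunks]
beta_chunked = combine_and_solve(stats)

# Reference fit on the full data, for comparison only
beta_full = np.linalg.lstsq(X, y, rcond=None)[0]
```

Under these assumptions the chunked estimate agrees with the full-data fit up to floating-point error, while each chunk transmits only a p-by-p matrix and a length-p vector rather than its raw rows.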