A clustering solution for analyzing residential water consumption patterns

Publisher:
Elsevier BV
Publication Type:
Journal Article
Citation:
Knowledge-Based Systems, 2021, 233, pp. 107522-107522
Issue Date:
2021-12-05
File:
1-s2.0-S095070512100784X-main.pdf (published version, Adobe PDF, 3.2 MB)
Water utility companies in urban areas face two major challenges: ensuring there is enough water for everyone during prolonged drought and maintaining adequate water pressure during the hours of peak demand. These issues can be addressed by applying data analytics and machine learning to the data gathered from digital water meters. For water conservation and demand management strategies to be effective, utility companies need a better understanding of consumer behaviours, habits and routines. To accomplish this goal, we adapted a clustering approach to reveal residential water consumption patterns within metered data. In the experiment, we used two data sets (an engineered-features data set and a times-of-use and weighted-probabilities-of-use data set) based on data collected over 10 months from 306 households in Melbourne, Australia. For the engineered-features data set, we first identified the optimal number of clusters. We then performed extensive experiments to find the best clustering approach in terms of performance evaluation and clustering quality. We chose the hierarchical agglomerative clustering technique based on the nature of the data and the objective of the study. We observed that, for the engineered-features data set, k-means is the best-performing clustering technique after considering performance metrics. For the other data set, we found that the number of clusters varies with the type of water-consumption event, the type of day (i.e., weekday or weekend), the profiling interval and the probability of use. In addition, we observed that insight into tap-water usage could be used to determine the population's adoption of hygiene practices during unprecedented times such as the COVID-19 pandemic. Finally, we recommend that future clustering studies also employ aligned socio-demographic data and other key features.
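The first step described in the abstract, choosing the number of clusters for a set of per-household features, can be illustrated with a minimal sketch. It is not the authors' pipeline. The abstract does not name the evaluation metric, so the silhouette coefficient is an assumption here. The two features (mean daily litres and a scaled evening-use share) and all data values are synthetic and hypothetical. The k-means routine uses a deterministic farthest-point initialization purely to keep the example reproducible.

```python
import math
import random


def kmeans(points, k, iters=100):
    """Plain Lloyd's k-means with farthest-point initialization (an assumption)."""
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = None
    for _ in range(iters):
        # Assign each point to its nearest center.
        new_labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Move each center to the mean of its members (keep it if empty).
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient; higher means tighter, better-separated clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        return -1.0
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the nearest other cluster.
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for key, other in clusters.items() if key != lab)
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Synthetic, hypothetical households: three usage profiles of 30 homes each.
rng = random.Random(42)
profile_centers = [(0.0, 0.0), (10.0, 0.0), (5.0, 10.0)]
households = [(rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
              for cx, cy in profile_centers for _ in range(30)]

# Score each candidate k and keep the one with the highest silhouette.
scores = {k: silhouette(households, kmeans(households, k)) for k in range(2, 6)}
best_k = max(scores, key=scores.get)
print(best_k, {k: round(s, 3) for k, s in scores.items()})
```

On this well-separated synthetic data the sketch selects three clusters, matching how the data was generated. Real meter data is far less cleanly separated, which is why a study of this kind compares several techniques and metrics before settling on one.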