Fine-Grained Powercap Allocation for Power-Constrained Systems Based on Multi-Objective Machine Learning

Institute of Electrical and Electronics Engineers
Publication Type:
Journal Article
IEEE Transactions on Parallel and Distributed Systems, 2021, 32, (7), pp. 1789-1801
Issue Date:
Full metadata record
Power capping is an important solution to keep the system within a fixed power constraint. However, for the over-provisioned and power-constrained systems, especially the future exascale supercomputers, powercap needs to be reasonably allocated according to the workloads of compute nodes to achieve trade-offs among performance, energy and powercap. Thus it is necessary to model performance and energy and to predict the optimal powercap allocation strategies. Existing power allocation approaches have insufficient granularity within nodes. Modeling approaches usually model performance and energy separately, ignoring the correlation between objectives, and do not expose the Pareto-optimal powercap configurations. Therefore, this article combines the powercap with uncore frequency scaling and proposes an approach to predict the Pareto-optimal powercap configurations on the power-constrained system for input MPI and OpenMP parallel applications. Our approach first uses the elaborately designed micro-benchmarks and a small number of existing benchmarks to build the training set, and then applies a multi-objective machine learning algorithm which combines the stacked single-target method with extreme gradient boosting to build multi-objective models of performance and energy. The models can be used to predict the optimal processor and memory powercap settings, helping compute nodes perform fine-grained powercap allocation. When the optimal powercap configuration is determined, the uncore frequency scaling is used to further optimize the energy consumption. Compared with the reference powercap configuration, the predicted optimal configurations predicted by our method can achieve an average powercap reduction of 31.35 percent, an average energy reduction of 12.32 percent, and average performance degradation of only 2.43 percent.
Please use this identifier to cite or link to this item: