Efficient Fault-Tolerant Path Embedding for 3D Torus Network Using Locally Faulty Blocks

Publisher:
Institute of Electrical and Electronics Engineers (IEEE)
Publication Type:
Journal Article
Citation:
IEEE Transactions on Computers, 2024, 73, (9), pp. 2305-2319
Issue Date:
2024-01-01
Filename Description Size
1735041.pdfPublished version1.63 MB
Adobe PDF
Full metadata record
3D tori are significant interconnection architectures in building supercomputers and parallel computing systems. Due to the rapid growth of edge faults and the crucial role of path structures in large-scale distributed systems, fault-tolerant path embedding and correlated issues have drawn widespread researches. However, existing path embedding methods are based on traditional fault models, allowing all faults to be near the same node, so they usually only focus on theoretical proof and generate linear fault-tolerance related to dimension nn. In order to improve the fault-tolerance of 3D torus, we first propose a novel conditional fault model called the Locally Faulty Block model (LFB model). On the basis of this model, the Hamiltonian paths with large-scale edge defects in torus are investigated. After that, we construct an Hamiltonian path embedding algorithm HP-LFB into torus with O(N)O(N) under the LFB model, where NN is the number of nodes in torus. Furthermore, we present an adaptive routing algorithm HoeFA, which is based on the method of distance vector to limit the use of virtual channels (VCs). We also make a comparison with state-of-the-art schemes, indicating that our scheme enhance other comprehensive results. The experiment indicated that HP-LFB can sustain the dynamic degradation of the batting average of establishing Hamiltonian paths, with the added faulty edges exceeding fault-tolerance.
Please use this identifier to cite or link to this item: