Efficient Fault-Tolerant Path Embedding for 3D Torus Network Using Locally Faulty Blocks

Fan, W; Xiao, F; Lv, M; Han, L; Yu, S

Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Publication Type: Journal Article
Citation: IEEE Transactions on Computers, 2024, 73, (9), pp. 2305-2319
Issue Date: 2024-01-01

Closed Access

	Filename	Description	Size	Format
	1735041.pdf	Published version	1.63 MB	Adobe PDF

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Fan, W
dc.contributor.author	Xiao, F
dc.contributor.author	Lv, M
dc.contributor.author	Han, L
dc.contributor.author	Yu, S https://orcid.org/0000-0003-4485-6743
dc.date.accessioned	2024-09-05T10:46:00Z
dc.date.available	2024-09-05T10:46:00Z
dc.date.issued	2024-01-01
dc.identifier.citation	IEEE Transactions on Computers, 2024, 73, (9), pp. 2305-2319
dc.identifier.issn	0018-9340
dc.identifier.issn	1557-9956
dc.identifier.uri	http://hdl.handle.net/10453/180674
dc.description.abstract	3D tori are important interconnection architectures for building supercomputers and parallel computing systems. Owing to the rapid growth of edge faults and the crucial role of path structures in large-scale distributed systems, fault-tolerant path embedding and related problems have attracted widespread research attention. However, existing path embedding methods are based on traditional fault models, which allow all faults to be adjacent to the same node, so they usually focus only on theoretical proofs and yield fault tolerance that is merely linear in the dimension n. To improve the fault tolerance of the 3D torus, we first propose a novel conditional fault model called the Locally Faulty Block (LFB) model. Based on this model, we investigate Hamiltonian paths in tori with large-scale edge faults. We then construct HP-LFB, an algorithm with O(N) complexity that embeds a Hamiltonian path into the torus under the LFB model, where N is the number of nodes in the torus. Furthermore, we present an adaptive routing algorithm, HoeFA, which is based on the distance-vector method to limit the use of virtual channels (VCs). We also compare our scheme with state-of-the-art schemes, and the comparison indicates that our scheme improves on them in overall results. The experiments indicate that HP-LFB can sustain a dynamic degradation of the batting average of establishing Hamiltonian paths even when the number of added faulty edges exceeds its fault tolerance.
dc.language	en
dc.publisher	Institute of Electrical and Electronics Engineers (IEEE)
dc.relation.ispartof	IEEE Transactions on Computers
dc.relation.isbasedon	10.1109/TC.2024.3416695
dc.rights	info:eu-repo/semantics/closedAccess
dc.subject	0803 Computer Software, 0805 Distributed Computing, 1006 Computer Hardware
dc.subject.classification	Computer Hardware & Architecture
dc.subject.classification	4009 Electronics, sensors and digital hardware
dc.subject.classification	4606 Distributed computing and systems software
dc.title	Efficient Fault-Tolerant Path Embedding for 3D Torus Network Using Locally Faulty Blocks
dc.type	Journal Article
utslib.citation.volume	73
utslib.for	0803 Computer Software
utslib.for	0805 Distributed Computing
utslib.for	1006 Computer Hardware
pubs.organisational-group	University of Technology Sydney
pubs.organisational-group	University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	University of Technology Sydney/Faculty of Engineering and Information Technology/School of Computer Science
pubs.organisational-group	University of Technology Sydney/Strength - CCSP - Centre for Cyber Security and Privacy
pubs.organisational-group	University of Technology Sydney/All Manual Groups
pubs.organisational-group	University of Technology Sydney/All Manual Groups/Centre for Cyber Security and Privacy (CCSP)
utslib.copyright.status	closed_access	*
dc.date.updated	2024-09-05T10:45:57Z
pubs.issue	9
pubs.publication-status	Published
pubs.volume	73
utslib.citation.issue	9

Abstract:

3D tori are important interconnection architectures for building supercomputers and parallel computing systems. Owing to the rapid growth of edge faults and the crucial role of path structures in large-scale distributed systems, fault-tolerant path embedding and related problems have attracted widespread research attention. However, existing path embedding methods are based on traditional fault models, which allow all faults to be adjacent to the same node, so they usually focus only on theoretical proofs and yield fault tolerance that is merely linear in the dimension n. To improve the fault tolerance of the 3D torus, we first propose a novel conditional fault model called the Locally Faulty Block (LFB) model. Based on this model, we investigate Hamiltonian paths in tori with large-scale edge faults. We then construct HP-LFB, an algorithm with O(N) complexity that embeds a Hamiltonian path into the torus under the LFB model, where N is the number of nodes in the torus. Furthermore, we present an adaptive routing algorithm, HoeFA, which is based on the distance-vector method to limit the use of virtual channels (VCs). We also compare our scheme with state-of-the-art schemes, and the comparison indicates that our scheme improves on them in overall results. The experiments indicate that HP-LFB can sustain a dynamic degradation of the batting average of establishing Hamiltonian paths even when the number of added faulty edges exceeds its fault tolerance.
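
For readers unfamiliar with the terminology, the following minimal Python sketch illustrates what "a Hamiltonian path in a 3D torus that avoids faulty edges" means. It is not the HP-LFB algorithm, which is not reproduced in this record; it builds a simple boustrophedon ("snake") ordering of the nodes and checks it against a set of faulty edges. The function names (torus_neighbors, snake_path, is_fault_free_hamiltonian_path) and the representation of a faulty edge as a frozenset of two node tuples are assumptions made for illustration only.

```python
from itertools import product


def torus_neighbors(node, dims):
    """Neighbors of `node` in a torus whose side lengths are `dims`."""
    result = set()
    for axis, size in enumerate(dims):
        for step in (-1, 1):
            nxt = list(node)
            nxt[axis] = (node[axis] + step) % size  # wraparound link
            result.add(tuple(nxt))
    result.discard(node)  # only relevant for a degenerate side length of 1
    return result


def snake_path(dims):
    """Boustrophedon ordering of all nodes of a 3D torus (illustrative only).

    Consecutive nodes differ by exactly 1 in one coordinate, so the
    ordering is a Hamiltonian path that never uses a wraparound link.
    """
    kx, ky, kz = dims
    path, rows_done = [], 0
    for x in range(kx):
        ys = range(ky) if x % 2 == 0 else range(ky - 1, -1, -1)
        for y in ys:
            # Reverse the z sweep on every other row so rows join end to end.
            zs = range(kz) if rows_done % 2 == 0 else range(kz - 1, -1, -1)
            path.extend((x, y, z) for z in zs)
            rows_done += 1
    return path


def is_fault_free_hamiltonian_path(path, dims, faulty_edges):
    """True if `path` visits every node once and uses no faulty edge."""
    all_nodes = set(product(*(range(k) for k in dims)))
    if len(path) != len(all_nodes) or set(path) != all_nodes:
        return False
    for u, v in zip(path, path[1:]):
        if v not in torus_neighbors(u, dims):
            return False  # consecutive nodes are not linked in the torus
        if frozenset((u, v)) in faulty_edges:
            return False  # the path crosses a faulty edge
    return True


dims = (3, 3, 3)
path = snake_path(dims)
on_path_fault = {frozenset({(0, 0, 0), (0, 0, 1)})}
wraparound_fault = {frozenset({(0, 0, 0), (0, 0, 2)})}
print(is_fault_free_hamiltonian_path(path, dims, set()))
print(is_fault_free_hamiltonian_path(path, dims, on_path_fault))
print(is_fault_free_hamiltonian_path(path, dims, wraparound_fault))
```

A fixed ordering like this fails as soon as a single edge on it becomes faulty, which is why a fault-tolerant embedding has to route around faults rather than rely on one predetermined path.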

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/180674