A highly robust distributed fault-tolerant routing algorithm for NoCs with localized rerouting

Arseniy Vitkovskiy, Vassos Soteriou, Chrysostomos Nicopoulos

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    Abstract

    Denser transistor integration has enabled the fabrication of multi-tile chips, but at the expense of higher susceptibility to defects and wear-out. Metal wires comprising the links of Networks-on-Chip (NoCs) are especially vulnerable to such defects, which can render some links disconnected. This paper presents a new fault-tolerant routing scheme to sustain on-chip communication. It uses a localized re-routing approach, whereby de-touring around faulty links, or complex regions of faults, is done locally at each node in a purely distributed and dynamic manner, while guaranteeing deadlock- and livelock-freedom. Results using synthetic traffic and real applications with full-system simulations demonstrate its efficacy in tolerating a large percentage of faulty NoC links, albeit in a gracefully degraded performance mode.

    Original language: English (US)
    Title of host publication: Proceedings of the 2012 Interconnection Network Architecture
    Subtitle of host publication: On-Chip, Multi-Chip Workshop, INA-OCMC'12
    Pages: 29-32
    Number of pages: 4
    DOIs: https://doi.org/10.1145/2107763.2107771
    State: Published - Feb 20 2012
    Event: 2012 Interconnection Network Architecture: On-Chip, Multi-Chip Workshop, INA-OCMC'12 - Paris, France
    Duration: Jan 25 2012 - Jan 25 2012

    Other

    Other: 2012 Interconnection Network Architecture: On-Chip, Multi-Chip Workshop, INA-OCMC'12
    Country: France
    City: Paris
    Period: 1/25/12 - 1/25/12

    Fingerprint

    Routing algorithms
    Defects
    Tile
    Transistors
    Wear of materials
    Wire
    Fabrication
    Communication
    Metals
    Network-on-chip

    Keywords

    • fault-tolerance
    • on-chip networks
    • routing algorithm

    ASJC Scopus subject areas

    • Software
    • Human-Computer Interaction
    • Computer Vision and Pattern Recognition
    • Computer Networks and Communications

    Cite this

    Vitkovskiy, A., Soteriou, V., & Nicopoulos, C. (2012). A highly robust distributed fault-tolerant routing algorithm for NoCs with localized rerouting. In Proceedings of the 2012 Interconnection Network Architecture: On-Chip, Multi-Chip Workshop, INA-OCMC'12 (pp. 29-32). https://doi.org/10.1145/2107763.2107771

    @inproceedings{8de5f5697d6246438e2720562ad0b1ae,
    title = "A highly robust distributed fault-tolerant routing algorithm for NoCs with localized rerouting",
    abstract = "Denser transistor integration has enabled the fabrication of multi-tile chips, however, at the expense of higher susceptibility to defects and wear-out. Metal wires comprising the links of Networks-on-Chip (NoCs) are especially vulnerable to such defects, which can render some links disconnected. This paper presents a new fault-tolerant routing scheme to sustain on-chip communication. It uses a localized re-routing approach, whereby de-touring around faulty links - or complex regions of faults - is done locally at each node in a purely distributed and dynamic manner, while guaranteeing deadlock- and livelock-freedom. Results using synthetic traffic and real applications with full-system simulations prove its efficacy in addressing a large percentage of NoC links being faulty albeit at a gracefully degraded performance mode.",
    keywords = "fault-tolerance, on-chip networks, routing algorithm",
    author = "Arseniy Vitkovskiy and Vassos Soteriou and Chrysostomos Nicopoulos",
    year = "2012",
    month = "2",
    day = "20",
    doi = "10.1145/2107763.2107771",
    language = "English (US)",
    isbn = "9781450310109",
    pages = "29--32",
    booktitle = "Proceedings of the 2012 Interconnection Network Architecture: On-Chip, Multi-Chip Workshop, INA-OCMC'12",
    }

    TY  - GEN
    T1  - A highly robust distributed fault-tolerant routing algorithm for NoCs with localized rerouting
    AU  - Vitkovskiy, Arseniy
    AU  - Soteriou, Vassos
    AU  - Nicopoulos, Chrysostomos
    PY  - 2012/2/20
    Y1  - 2012/2/20
    AB  - Denser transistor integration has enabled the fabrication of multi-tile chips, however, at the expense of higher susceptibility to defects and wear-out. Metal wires comprising the links of Networks-on-Chip (NoCs) are especially vulnerable to such defects, which can render some links disconnected. This paper presents a new fault-tolerant routing scheme to sustain on-chip communication. It uses a localized re-routing approach, whereby de-touring around faulty links - or complex regions of faults - is done locally at each node in a purely distributed and dynamic manner, while guaranteeing deadlock- and livelock-freedom. Results using synthetic traffic and real applications with full-system simulations prove its efficacy in addressing a large percentage of NoC links being faulty albeit at a gracefully degraded performance mode.
    KW  - fault-tolerance
    KW  - on-chip networks
    KW  - routing algorithm
    UR  - http://www.scopus.com/inward/record.url?scp=84856945737&partnerID=8YFLogxK
    UR  - http://www.scopus.com/inward/citedby.url?scp=84856945737&partnerID=8YFLogxK
    U2  - 10.1145/2107763.2107771
    DO  - 10.1145/2107763.2107771
    M3  - Conference contribution
    SN  - 9781450310109
    SP  - 29
    EP  - 32
    BT  - Proceedings of the 2012 Interconnection Network Architecture: On-Chip, Multi-Chip Workshop, INA-OCMC'12
    ER  -