Automatic synthesis of self-recovering VLSI systems

Alex Orailoglu, Ramesh Karri

Research output: Contribution to journal (Article)

Abstract

In this paper, we will describe an integrated system for synthesizing self-recovering microarchitectures called SYNCERE. In the SYNCERE model for self-recovery, transient faults are detected using duplication and comparison, while recovery from transient faults is accomplished via checkpointing and rollback. SYNCERE initially inserts checkpoints subject to designer specified recovery time constraints. Subsequently, SYNCERE incorporates detection constraints by ensuring that two copies of the computation are executed on disjoint hardware. Towards ameliorating the dedicated hardware required for the original and duplicate computations, SYNCERE imposes intercopy hardware disjointness at a sub-computation level instead of at the overall computation level. The overhead is further moderated by restructuring the pliable input representation of the computation. SYNCERE has successfully derived numerous self-recovering microarchitectures. Towards validating the methodology for designing fault-tolerant VLSI ICs, we carried out a physical design of a self-recovering 16-point FIR filter.
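The detect-and-recover scheme in the abstract (duplication and comparison for detection, checkpointing and rollback for recovery) can be illustrated in software. The following is only a minimal sketch under stated assumptions, not SYNCERE itself: the function names `fir_step` and `run_self_recovering`, the fault-injection set, and the fixed checkpoint interval are all hypothetical, and the hardware-disjointness and synthesis aspects of the paper are not modeled.

```python
from typing import List, Sequence, Set, Tuple


def fir_step(state: Tuple[int, ...], sample: int,
             taps: Sequence[int]) -> Tuple[Tuple[int, ...], int]:
    """Shift one sample into the delay line and return (new_state, output)."""
    new_state = (sample,) + state[:-1]
    output = sum(t * x for t, x in zip(taps, new_state))
    return new_state, output


def run_self_recovering(samples: Sequence[int], taps: Sequence[int],
                        checkpoint_interval: int,
                        faulty_steps: Set[int]) -> Tuple[List[int], int]:
    """Run two copies of an FIR filter, compare them at each checkpoint,
    and roll back to the last checkpoint when the copies disagree.

    faulty_steps is a hypothetical fault-injection hook: at each listed
    sample index, copy A produces one corrupted output (a transient fault
    that does not recur when the segment is re-executed).
    """
    state = (0,) * len(taps)           # last checkpointed delay-line state
    outputs: List[int] = []            # outputs committed so far
    rollbacks = 0
    pending_faults = set(faulty_steps)
    start = 0
    while start < len(samples):
        end = min(start + checkpoint_interval, len(samples))
        state_a = state_b = state      # both copies restart from the checkpoint
        out_a: List[int] = []
        out_b: List[int] = []
        for i in range(start, end):
            state_a, ya = fir_step(state_a, samples[i], taps)
            state_b, yb = fir_step(state_b, samples[i], taps)
            if i in pending_faults:
                pending_faults.discard(i)  # transient: gone on retry
                ya += 1                    # corrupt copy A's output
            out_a.append(ya)
            out_b.append(yb)
        if out_a == out_b and state_a == state_b:
            # Copies agree: commit the segment and take a new checkpoint.
            state = state_a
            outputs.extend(out_a)
            start = end
        else:
            # Mismatch detected: discard the segment and re-execute it.
            rollbacks += 1
    return outputs, rollbacks


if __name__ == "__main__":
    result, retries = run_self_recovering([1, 2, 3, 4, 5, 6], (1, 1), 2, {3})
    print(result, retries)
```

In this sketch a shorter checkpoint interval bounds how much work a rollback repeats, which loosely mirrors the recovery-time constraint mentioned in the abstract, at the cost of more frequent comparisons.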

Original language: English (US)
Pages (from-to): 131-142
Number of pages: 12
Journal: IEEE Transactions on Computers
Volume: 45
Issue number: 2
DOI: 10.1109/12.485368
State: Published - 1996

Keywords

  • Fault tolerance
  • High level synthesis
  • Self-recovery
  • Transient faults
  • VLSI design automation

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Hardware and Architecture

Cite this

Automatic synthesis of self-recovering VLSI systems. / Orailoglu, Alex; Karri, Ramesh.

In: IEEE Transactions on Computers, Vol. 45, No. 2, 1996, p. 131-142.

@article{82d5a5d5b29c4a68992fe75991c488a9,
title = "Automatic synthesis of self-recovering VLSI systems",
abstract = "In this paper, we will describe an integrated system for synthesizing self-recovering microarchitectures called SYNCERE. In the SYNCERE model for self-recovery, transient faults are detected using duplication and comparison, while recovery from transient faults is accomplished via checkpointing and rollback. SYNCERE initially inserts checkpoints subject to designer specified recovery time constraints. Subsequently, SYNCERE incorporates detection constraints by ensuring that two copies of the computation are executed on disjoint hardware. Towards ameliorating the dedicated hardware required for the original and duplicate computations, SYNCERE imposes intercopy hardware disjointness at a sub-computation level instead of at the overall computation level. The overhead is further moderated by restructuring the pliable input representation of the computation. SYNCERE has successfully derived numerous self-recovering microarchitectures. Towards validating the methodology for designing fault-tolerant VLSI ICs, we carried out a physical design of a self-recovering 16-point FIR filter.",
keywords = "Fault tolerance, High level synthesis, Self-recovery, Transient faults, VLSI design automation",
author = "Alex Orailoglu and Ramesh Karri",
year = "1996",
doi = "10.1109/12.485368",
language = "English (US)",
volume = "45",
pages = "131--142",
journal = "IEEE Transactions on Computers",
issn = "0018-9340",
publisher = "IEEE Computer Society",
number = "2",
}

TY  - JOUR
T1  - Automatic synthesis of self-recovering VLSI systems
AU  - Orailoglu, Alex
AU  - Karri, Ramesh
PY  - 1996
Y1  - 1996
AB  - In this paper, we will describe an integrated system for synthesizing self-recovering microarchitectures called SYNCERE. In the SYNCERE model for self-recovery, transient faults are detected using duplication and comparison, while recovery from transient faults is accomplished via checkpointing and rollback. SYNCERE initially inserts checkpoints subject to designer specified recovery time constraints. Subsequently, SYNCERE incorporates detection constraints by ensuring that two copies of the computation are executed on disjoint hardware. Towards ameliorating the dedicated hardware required for the original and duplicate computations, SYNCERE imposes intercopy hardware disjointness at a sub-computation level instead of at the overall computation level. The overhead is further moderated by restructuring the pliable input representation of the computation. SYNCERE has successfully derived numerous self-recovering microarchitectures. Towards validating the methodology for designing fault-tolerant VLSI ICs, we carried out a physical design of a self-recovering 16-point FIR filter.
KW  - Fault tolerance
KW  - High level synthesis
KW  - Self-recovery
KW  - Transient faults
KW  - VLSI design automation
UR  - http://www.scopus.com/inward/record.url?scp=0000761282&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=0000761282&partnerID=8YFLogxK
U2  - 10.1109/12.485368
DO  - 10.1109/12.485368
M3  - Article
VL  - 45
SP  - 131
EP  - 142
JO  - IEEE Transactions on Computers
JF  - IEEE Transactions on Computers
SN  - 0018-9340
IS  - 2
ER  -