Instruction-level impact analysis of low-level faults in a modern microprocessor controller

Mihalis Maniatakos, Naghmeh Karimi, Chandra Tirumurti, Abhijit Jas, Yiorgos Makris

Research output: Contribution to journal › Article

Abstract

We investigate the correlation between low-level faults in the control logic of a modern microprocessor and their instruction-level impact on the execution of typical workloads. Such information can prove immensely useful in accurately assessing and prioritizing faults with regard to their criticality, as well as commensurately allocating resources to enhance online testability and error/fault resilience through concurrent error detection/correction methods. To this end, we developed an extensive fault simulation infrastructure which allows injection of stuck-at faults and transient errors of arbitrary starting time and duration, as well as cost-effective simulation and classification of their repercussions into various instruction-level error types. As a test vehicle for our study, we employ a superscalar, dynamically-scheduled, out-of-order, Alpha-like microprocessor, on which we execute SPEC2000 integer benchmarks. Extensive fault injection campaigns in control modules of this microprocessor facilitate valuable observations regarding the distribution of low-level faults into the instruction-level error types that they cause. Experimentation with both Register Transfer (RT-) and Gate-Level faults, as well as with both stuck-at faults and transient errors, confirms the validity and corroborates the utility of these observations.
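The two fault models named in the abstract, stuck-at faults and transient errors with an arbitrary starting time and duration, can be illustrated with a small sketch. This is not the authors' simulation infrastructure: the `Fault` class, the `inject` and `corrupted_cycles` helpers, the single-bit signal trace, and the choice to model a transient error as a bit inversion are all assumptions made here for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class Fault:
    """Hypothetical fault on a single-bit control signal (illustrative only)."""
    stuck_value: Optional[int] = None  # 0 or 1 forces the signal; None inverts it
    start: int = 0                     # first cycle in which the fault is active
    duration: Optional[int] = None     # None = permanent; N = active for N cycles

    def apply(self, cycle: int, value: int) -> int:
        """Return the value observed on the signal in the given cycle."""
        active = cycle >= self.start and (
            self.duration is None or cycle < self.start + self.duration
        )
        if not active:
            return value
        return (value ^ 1) if self.stuck_value is None else self.stuck_value


def inject(golden: Sequence[int], fault: Fault) -> List[int]:
    """Produce the faulty trace obtained by applying `fault` to a fault-free trace."""
    return [fault.apply(cycle, value) for cycle, value in enumerate(golden)]


def corrupted_cycles(golden: Sequence[int], faulty: Sequence[int]) -> List[int]:
    """List the cycles in which the faulty trace differs from the fault-free one."""
    return [c for c, (g, f) in enumerate(zip(golden, faulty)) if g != f]


# Fault-free ("golden") values of one control signal over eight cycles (made-up data).
golden = [0, 1, 1, 0, 1, 0, 0, 1]

stuck_at_0 = Fault(stuck_value=0)       # permanent stuck-at-0
transient = Fault(start=3, duration=2)  # bit inversion during cycles 3 and 4

print(corrupted_cycles(golden, inject(golden, stuck_at_0)))  # [1, 2, 4, 7]
print(corrupted_cycles(golden, inject(golden, transient)))   # [3, 4]
```

A stuck-at-0 fault is only visible in cycles where the fault-free value is 1, whereas the inverting transient corrupts every cycle in its window. Mapping such signal-level differences to instruction-level error types, as the paper does, requires a full processor model and is beyond this sketch.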

Original language: English (US)
Article number: 5432157
Pages (from-to): 1260-1273
Number of pages: 14
Journal: IEEE Transactions on Computers
Volume: 60
Issue number: 9
DOIs: 10.1109/TC.2010.60
State: Published - Aug 8 2011

Fingerprint

Microprocessor
Microprocessor chips
Fault
Controller
Controllers
Error detection
Error correction
Fault Simulation
Superscalar
Fault Injection
Error Detection
Resilience
Criticality
Experimentation
Workload
Concurrent
Injection
Infrastructure
Logic
Benchmark

Keywords

  • concurrent error detection
  • Fault simulation
  • instruction-level error
  • microprocessor controller

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Software
  • Hardware and Architecture
  • Computational Theory and Mathematics

Cite this

Instruction-level impact analysis of low-level faults in a modern microprocessor controller. / Maniatakos, Mihalis; Karimi, Naghmeh; Tirumurti, Chandra; Jas, Abhijit; Makris, Yiorgos.

In: IEEE Transactions on Computers, Vol. 60, No. 9, 5432157, 08.08.2011, p. 1260-1273.

Maniatakos, Mihalis ; Karimi, Naghmeh ; Tirumurti, Chandra ; Jas, Abhijit ; Makris, Yiorgos. / Instruction-level impact analysis of low-level faults in a modern microprocessor controller. In: IEEE Transactions on Computers. 2011 ; Vol. 60, No. 9. pp. 1260-1273.
@article{d9d1afbe1ed84eeda24e1553af0018ea,
title = "Instruction-level impact analysis of low-level faults in a modern microprocessor controller",
abstract = "We investigate the correlation between low-level faults in the control logic of a modern microprocessor and their instruction-level impact on the execution of typical workload. Such information can prove immensely useful in accurately assessing and prioritizing faults with regards to their criticality, as well as commensurately allocating resources to enhance online testability and error/fault resilience through concurrent error detection/correction methods. To this end, we developed an extensive fault simulation infrastructure which allows injection of stuck-at faults and transient errors of arbitrary starting time and duration, as well as cost-effective simulation and classification of their repercussions into various instruction-level error types. As a test vehicle for our study, we employ a superscalar, dynamically-scheduled, out-of-order, Alpha-like microprocessor, on which we execute SPEC2000 integer benchmarks. Extensive fault injection campaigns in control modules of this microprocessor facilitate valuable observations regarding the distribution of low-level faults into the instruction-level error types that they cause. Experimentation with both Register Transfer (RT-) and Gate-Level faults, as well as with both stuck-at faults and transient errors, confirms the validity and corroborates the utility of these observations.",
keywords = "concurrent error detection, Fault simulation, instruction-level error, microprocessor controller",
author = "Mihalis Maniatakos and Naghmeh Karimi and Chandra Tirumurti and Abhijit Jas and Yiorgos Makris",
year = "2011",
month = "8",
day = "8",
doi = "10.1109/TC.2010.60",
language = "English (US)",
volume = "60",
pages = "1260--1273",
journal = "IEEE Transactions on Computers",
issn = "0018-9340",
publisher = "IEEE Computer Society",
number = "9",
}

TY  - JOUR
T1  - Instruction-level impact analysis of low-level faults in a modern microprocessor controller
AU  - Maniatakos, Mihalis
AU  - Karimi, Naghmeh
AU  - Tirumurti, Chandra
AU  - Jas, Abhijit
AU  - Makris, Yiorgos
PY  - 2011/8/8
Y1  - 2011/8/8
AB  - We investigate the correlation between low-level faults in the control logic of a modern microprocessor and their instruction-level impact on the execution of typical workload. Such information can prove immensely useful in accurately assessing and prioritizing faults with regards to their criticality, as well as commensurately allocating resources to enhance online testability and error/fault resilience through concurrent error detection/correction methods. To this end, we developed an extensive fault simulation infrastructure which allows injection of stuck-at faults and transient errors of arbitrary starting time and duration, as well as cost-effective simulation and classification of their repercussions into various instruction-level error types. As a test vehicle for our study, we employ a superscalar, dynamically-scheduled, out-of-order, Alpha-like microprocessor, on which we execute SPEC2000 integer benchmarks. Extensive fault injection campaigns in control modules of this microprocessor facilitate valuable observations regarding the distribution of low-level faults into the instruction-level error types that they cause. Experimentation with both Register Transfer (RT-) and Gate-Level faults, as well as with both stuck-at faults and transient errors, confirms the validity and corroborates the utility of these observations.
KW  - concurrent error detection
KW  - Fault simulation
KW  - instruction-level error
KW  - microprocessor controller
UR  - http://www.scopus.com/inward/record.url?scp=79961077712&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=79961077712&partnerID=8YFLogxK
U2  - 10.1109/TC.2010.60
DO  - 10.1109/TC.2010.60
M3  - Article
AN  - SCOPUS:79961077712
VL  - 60
SP  - 1260
EP  - 1273
JO  - IEEE Transactions on Computers
JF  - IEEE Transactions on Computers
SN  - 0018-9340
IS  - 9
M1  - 5432157
ER  -