Balancing performance and fault detection for GPGPU workloads

Jerry B. Backer, Ramesh Karri

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

GPUs are increasingly being used for processing highly parallel scientific and high performance workloads. Such applications require correctness and accuracy of the computation. GPUs lack adequate support for detecting hardware faults that may lead to computation errors. We present a tunable fault detection scheme that allows one to balance GPU performance and fault checking by configuring the amount of resources to allocate for detection and the frequency of checking for faults.
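
This record does not describe the mechanism itself, so the following is only an illustrative sketch of what two such tuning knobs could look like, not the authors' scheme. It is a CPU-side Python analogue that assumes detection is done by recomputing a share of the results with a trusted reference function. All names (`checked_map`, `check_every`, `check_fraction`, `flaky_square`) are hypothetical: `check_fraction` stands in for the amount of resources given to detection, and `check_every` stands in for how often checks run.

```python
import math
from typing import Callable, List, Sequence, Tuple


def checked_map(
    compute: Callable[[int], int],
    reference: Callable[[int], int],
    inputs: Sequence[int],
    batch_size: int,
    check_every: int,
    check_fraction: float,
) -> Tuple[List[int], List[int], int]:
    """Apply `compute` batch by batch and spot-check some results.

    Every `check_every`-th batch (starting with batch 0), the first
    ceil(check_fraction * len(batch)) results are recomputed with
    `reference` and compared. Returns the outputs, the input positions
    where a mismatch was seen, and the number of extra evaluations.
    """
    outputs: List[int] = []
    mismatches: List[int] = []
    extra_work = 0
    for batch_no, start in enumerate(range(0, len(inputs), batch_size)):
        batch = inputs[start:start + batch_size]
        results = [compute(x) for x in batch]
        outputs.extend(results)
        if check_every > 0 and batch_no % check_every == 0:
            n_checked = math.ceil(check_fraction * len(batch))
            for offset in range(n_checked):
                extra_work += 1  # one redundant evaluation per checked item
                if reference(batch[offset]) != results[offset]:
                    mismatches.append(start + offset)
    return outputs, mismatches, extra_work


def square(x: int) -> int:
    return x * x


def flaky_square(x: int) -> int:
    # Hypothetical injected fault: a wrong result for a single input.
    return x * x + 1 if x == 5 else x * x


inputs = list(range(16))
# Check every batch in full: highest overhead, the fault at position 5 is seen.
full = checked_map(flaky_square, square, inputs, 4, 1, 1.0)
# Check half of every second batch: a quarter of the overhead, fault missed.
light = checked_map(flaky_square, square, inputs, 4, 2, 0.5)
print(full[1], full[2])
print(light[1], light[2])
```

In this toy setting the full configuration spends 16 extra evaluations and flags position 5, while the light configuration spends 4 and misses it, which is the kind of performance versus coverage trade-off the abstract refers to.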

Original language: English (US)
Title of host publication: 2012 IEEE 30th International Conference on Computer Design, ICCD 2012
Pages: 518-519
Number of pages: 2
DOIs: https://doi.org/10.1109/ICCD.2012.6378702
State: Published - 2012
Event: 2012 IEEE 30th International Conference on Computer Design, ICCD 2012 - Montreal, QC, Canada
Duration: Sep 30 2012 → Oct 3 2012

Other

Other: 2012 IEEE 30th International Conference on Computer Design, ICCD 2012
Country: Canada
City: Montreal, QC
Period: 9/30/12 → 10/3/12

Fingerprint

Fault detection
Hardware
Processing
Graphics processing unit

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Hardware and Architecture

Cite this

Backer, J. B., & Karri, R. (2012). Balancing performance and fault detection for GPGPU workloads. In 2012 IEEE 30th International Conference on Computer Design, ICCD 2012 (pp. 518-519). [6378702] https://doi.org/10.1109/ICCD.2012.6378702

@inproceedings{265cbfa9410f4a0b857157a33d57c9a2,
title = "Balancing performance and fault detection for GPGPU workloads",
abstract = "GPUs are increasingly being used for processing highly parallel scientific and high performance workloads. Such applications require correctness and accuracy of the computation. GPUs lack adequate support for detecting hardware faults that may lead to computation errors. We present a tunable fault detection scheme that allows one to balance GPU performance and fault checking by configuring the amount of resources to allocate for detection and the frequency of checking for faults.",
author = "Backer, {Jerry B.} and Ramesh Karri",
year = "2012",
doi = "10.1109/ICCD.2012.6378702",
language = "English (US)",
isbn = "9781467330503",
pages = "518--519",
booktitle = "2012 IEEE 30th International Conference on Computer Design, ICCD 2012",
}

TY  - GEN
T1  - Balancing performance and fault detection for GPGPU workloads
AU  - Backer, Jerry B.
AU  - Karri, Ramesh
PY  - 2012
Y1  - 2012
N2  - GPUs are increasingly being used for processing highly parallel scientific and high performance workloads. Such applications require correctness and accuracy of the computation. GPUs lack adequate support for detecting hardware faults that may lead to computation errors. We present a tunable fault detection scheme that allows one to balance GPU performance and fault checking by configuring the amount of resources to allocate for detection and the frequency of checking for faults.
AB  - GPUs are increasingly being used for processing highly parallel scientific and high performance workloads. Such applications require correctness and accuracy of the computation. GPUs lack adequate support for detecting hardware faults that may lead to computation errors. We present a tunable fault detection scheme that allows one to balance GPU performance and fault checking by configuring the amount of resources to allocate for detection and the frequency of checking for faults.
UR  - http://www.scopus.com/inward/record.url?scp=84872070275&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84872070275&partnerID=8YFLogxK
U2  - 10.1109/ICCD.2012.6378702
DO  - 10.1109/ICCD.2012.6378702
M3  - Conference contribution
SN  - 9781467330503
SP  - 518
EP  - 519
BT  - 2012 IEEE 30th International Conference on Computer Design, ICCD 2012
ER  -