Resource oblivious sorting on multicores

Richard Cole, Vijaya Ramachandran

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We present a new deterministic sorting algorithm that interleaves the partitioning of a sample sort with merging. Sequentially, it sorts n elements in O(n log n) time cache-obliviously with an optimal number of cache misses. The parallel complexity (or critical path length) of the algorithm is O(log n log log n), which improves on previous bounds for deterministic sample sort. Given a multicore computing environment with a global shared memory and p cores, each having a cache of size M organized in blocks of size B, our algorithm can be scheduled effectively on these p cores in a cache-oblivious manner. We improve on the above cache-oblivious processor-aware parallel implementation by using the Priority Work Stealing Scheduler (PWS) that we presented recently in a companion paper [12]. The PWS scheduler is both processor- and cache-oblivious (i.e., resource-oblivious), and it tolerates asynchrony among the cores. Using PWS, we obtain a resource-oblivious scheduling of our sorting algorithm that matches the performance of the processor-aware version. Our analysis includes the delay incurred by false sharing. We also establish good bounds for our algorithm with the randomized work stealing scheduler.
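
For readers unfamiliar with the term, the following is a minimal sequential sketch of a plain deterministic sample sort. It is not the algorithm of the paper: it does not interleave partitioning with merging, it is not parallel, and it makes no claim about cache misses or critical path length. The function name `sample_sort`, the `threshold` parameter, and the choice of roughly sqrt(n) evenly spaced sample elements are assumptions made only for illustration.

```python
from bisect import bisect_left
from math import isqrt


def sample_sort(items, threshold=32):
    """Illustrative sequential sample sort (hypothetical helper, not the paper's method)."""
    items = list(items)
    n = len(items)
    if n <= threshold:
        # Small inputs: fall back to the built-in sort.
        return sorted(items)

    # Deterministic sample: about sqrt(n) evenly spaced elements.
    step = max(1, n // isqrt(n))
    pivots = sorted(set(items[::step]))

    # less[i] holds elements strictly between pivots[i-1] and pivots[i];
    # equal[i] holds elements equal to pivots[i] and needs no further sorting.
    less = [[] for _ in range(len(pivots) + 1)]
    equal = [[] for _ in pivots]
    for x in items:
        i = bisect_left(pivots, x)
        if i < len(pivots) and pivots[i] == x:
            equal[i].append(x)
        else:
            less[i].append(x)

    # Every pivot is an input element, so each `less` bucket is strictly
    # smaller than the input and the recursion terminates.
    out = []
    for i in range(len(pivots)):
        out.extend(sample_sort(less[i], threshold))
        out.extend(equal[i])
    out.extend(sample_sort(less[-1], threshold))
    return out
```

Separating elements equal to a pivot from the recursive buckets is a design choice of this sketch that keeps it safe on inputs with many duplicates; the buckets are independent of one another, which is the property a parallel sample sort exploits.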

Original language: English (US)
Title of host publication: Automata, Languages and Programming - 37th International Colloquium, ICALP 2010, Proceedings
Pages: 226-237
Number of pages: 12
Volume: 6198 LNCS
Edition: PART 1
DOIs: https://doi.org/10.1007/978-3-642-14165-2_20
State: Published - 2010
Event: 37th International Colloquium on Automata, Languages and Programming, ICALP 2010 - Bordeaux, France
Duration: Jul 6 2010 → Jul 10 2010

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 1
Volume: 6198 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 37th International Colloquium on Automata, Languages and Programming, ICALP 2010
Country: France
City: Bordeaux
Period: 7/6/10 → 7/10/10

Fingerprint

Sorting
Cache
Scheduler
Resources
Sort
Sorting algorithm
Critical Path
Merging
Deterministic Algorithm
Path Length
Parallel Implementation
Shared Memory
Scheduling
Partitioning
Sharing
Data storage equipment
Computing

ASJC Scopus subject areas

  • Computer Science(all)
  • Theoretical Computer Science

Cite this

Cole, R., & Ramachandran, V. (2010). Resource oblivious sorting on multicores. In Automata, Languages and Programming - 37th International Colloquium, ICALP 2010, Proceedings (PART 1 ed., Vol. 6198 LNCS, pp. 226-237). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 6198 LNCS, No. PART 1). https://doi.org/10.1007/978-3-642-14165-2_20

@inproceedings{e501104050894a51b410636ceb41690a,
title = "Resource oblivious sorting on multicores",
author = "Richard Cole and Vijaya Ramachandran",
year = "2010",
doi = "10.1007/978-3-642-14165-2_20",
language = "English (US)",
isbn = "3642141641",
volume = "6198 LNCS",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
number = "PART 1",
pages = "226--237",
booktitle = "Automata, Languages and Programming - 37th International Colloquium, ICALP 2010, Proceedings",
edition = "PART 1",

}

TY - GEN

T1 - Resource oblivious sorting on multicores

AU - Cole, Richard

AU - Ramachandran, Vijaya

PY - 2010

Y1 - 2010

UR - http://www.scopus.com/inward/record.url?scp=77955314108&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=77955314108&partnerID=8YFLogxK

U2 - 10.1007/978-3-642-14165-2_20

DO - 10.1007/978-3-642-14165-2_20

M3 - Conference contribution

SN - 3642141641

SN - 9783642141645

VL - 6198 LNCS

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 226

EP - 237

BT - Automata, Languages and Programming - 37th International Colloquium, ICALP 2010, Proceedings

ER -