Task-Graph Scheduling Extensions for Efficient Synchronization and Communication

Abstract

Task graphs have been studied for decades as a foundation for scheduling irregular parallel applications and have been incorporated into programming models such as OpenMP. Many high-performance parallel libraries are built on task graphs, but they impose additional scheduling requirements, such as synchronization from inner levels of data parallelism and internal blocking communication. In this paper, we extend task-graph scheduling to support efficient synchronization and communication within tasks. Our scheduler avoids deadlock and oversubscription of worker threads, and it refines victim selection to increase the overlap of sibling tasks. Our approach is the first to combine gang scheduling and work stealing in a single runtime, and we evaluate it on the SLATE high-performance linear algebra library. Relative to the LLVM OpenMP runtime, our runtime achieves performance improvements of up to 13.82%, 15.2%, and 36.94% for LU, QR, and Cholesky, respectively, across different configurations.
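
As a concrete illustration of the sibling-aware victim selection mentioned above, the C++ sketch below shows one way a work-stealing thief could prefer victims that are currently executing siblings of its own task (same parent id), falling back to a random victim otherwise. This is a minimal sketch under assumed names (`Worker`, `running_parent`, `pick_victim`); it is not the paper's actual runtime implementation.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <optional>
#include <random>
#include <vector>

// Per-worker state; a real runtime would also keep a deque of ready tasks here.
struct Worker {
    std::atomic<int64_t> running_parent{-1};  // parent id of the task this worker runs (-1 = idle)
};

// Pick a steal victim, preferring workers that are currently running a sibling
// of the thief's task; otherwise fall back to a uniformly random victim.
std::optional<std::size_t> pick_victim(const std::vector<Worker>& workers,
                                       std::size_t self, int64_t my_parent,
                                       std::mt19937& rng) {
    std::vector<std::size_t> siblings;
    for (std::size_t w = 0; w < workers.size(); ++w) {
        if (w != self &&
            workers[w].running_parent.load(std::memory_order_relaxed) == my_parent) {
            siblings.push_back(w);
        }
    }
    if (!siblings.empty()) {
        std::uniform_int_distribution<std::size_t> pick(0, siblings.size() - 1);
        return siblings[pick(rng)];
    }
    if (workers.size() < 2) return std::nullopt;
    std::uniform_int_distribution<std::size_t> pick(0, workers.size() - 2);
    std::size_t v = pick(rng);
    return v >= self ? v + 1 : v;  // shift to skip the thief itself
}

int main() {
    std::vector<Worker> workers(4);
    // Suppose workers 1 and 3 are running children of the same parent task, id 7.
    workers[1].running_parent.store(7);
    workers[3].running_parent.store(7);

    std::mt19937 rng(42);
    // Worker 0, whose task also has parent 7, looks for a victim and will
    // choose worker 1 or 3 rather than an arbitrary worker.
    if (auto victim = pick_victim(workers, /*self=*/0, /*my_parent=*/7, rng)) {
        std::printf("steal from worker %zu\n", *victim);
    }
    return 0;
}
```

Biasing theft toward sibling-running victims keeps tasks that share a parent (and therefore a gang or communicator) in flight at the same time, which is the kind of overlap the abstract refers to.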

Publication
In Proceedings of the 35th ACM International Conference on Supercomputing

