Chapel on Accelerators
People
Supervisor
Description
This project proposes to extend the Chapel language to support offloading of limited computational kernels to a range of accelerator devices, including NVIDIA, AMD, and Intel GPUs. This will be achieved through lossless translation from the Chapel compiler's generated LLVM IR to OpenCL-flavoured SPIR-V, together with appropriate library calls for kernel and data management. It will be necessary to define appropriate mappings between high-level Chapel constructs and OpenCL features, which may include restricting the set of constructs that can be used on an accelerator.
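As a purely illustrative sketch (not part of the project description), the following Chapel loop shows the kind of construct the offload path would target: a data-parallel `forall` with no cross-iteration dependences. Loop bodies containing I/O, task spawning, or other unsupported constructs would likely fall outside the offloadable subset.

```chapel
config const n = 1 << 20;

var a, b, c: [1..n] real;

// A data-parallel loop whose iterations are independent: a natural
// candidate for lowering through LLVM IR to a SPIR-V kernel.
forall i in 1..n do
  c[i] = a[i] + 2.0 * b[i];
```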
Background: The Chapel language is an asynchronous partitioned global address space (APGAS) language developed by Cray for use on high-performance computers. Asynchronous tasks can be scheduled for parallel execution by computational elements in different "locales", or shared-memory areas. Development of the language has so far focused mainly on clusters of multi-core CPU nodes.
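The locale model described above can be sketched in a few lines of Chapel; this example is illustrative only, using the language's built-in `Locales` array, `here`, and `numLocales`:

```chapel
// Spawn one task per locale (shared-memory area); the `on` clause
// moves each task's execution to its target locale.
coforall loc in Locales do
  on loc do
    writeln("hello from locale ", here.id, " of ", numLocales);
```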
SPIR-V is an open-standard intermediate representation for parallel computing defined by the Khronos Group, with wide support from vendors including NVIDIA, AMD, and Intel.
Research significance: While there have been three previous attempts to run Chapel on accelerators, only one [Chu et al. 2017] integrates well with the Chapel philosophy of 'multi-resolution design' within a single global-view programming model, and none have been successfully integrated into the Chapel project. One reason for this is the inherent complexity of adding an extra source-to-source translation to the existing compile chain. Our proposed approach uses Chapel's existing LLVM backend to target SPIR-V directly, and therefore introduces less additional complexity, and consequently less maintenance burden, than previous approaches. A seamless method for executing Chapel code on accelerators such as GPUs would enable exploration of the Chapel programming model on a wider range of high-performance computing architectures than is currently possible.
Goals
- Understand key considerations of performance and productivity in programming models for high-performance computing;
- Understand key features of HPC accelerator architectures including GPUs, and programming models for these architectures;
- Demonstrate systems engineering skills, including design, development, empirical evaluation, and communication, by contributing to a large and dynamic free software project; and
- Present research findings using accepted methods of academic presentation.
Requirements
Background Literature
[1] Albert Sidelnik, Saeed Maleki, Bradford L. Chamberlain, Maria J. Garzaran, David Padua (2012). Performance Portability with the Chapel Language. IEEE 26th International Parallel and Distributed Processing Symposium (IPDPS 2012).
[2] Michael L. Chu, Ashwin M. Aji, Daniel Lowell, Khaled Hamidouche (2017). GPGPU support in Chapel with the Radeon Open Compute Platform. Proceedings of the Chapel Implementers and Users Workshop (CHIUW 2017).
[3] Akihiro Hayashi, Sri Raj Paul, Vivek Sarkar (2019). GPUIterator: Bridging the Gap between Chapel and GPU Platforms. Proceedings of the Chapel Implementers and Users Workshop (CHIUW 2019).