A Synchronization Mechanism between CUDA Blocks for GPU
Bingru Wang, Changyou Zhang, Feng Wang, Jun Feng
Available Online June 2017.
- https://doi.org/10.2991/caai-17.2017.56
- GPU; synchronization mechanism; SSSP; parallel computing; delta-stepping; CUDA
- The GPU (Graphics Processing Unit) offers a promising platform for high-performance computing through massive multithreading, and the emergence of CUDA (Compute Unified Device Architecture) has made that computing power accessible. However, because of limitations in CUDA itself, direct communication between SMs (streaming multiprocessors) on the GPU is not supported, and synchronizing through atomic operations or barrier synchronization is time-consuming. This paper proposes a synchronization mechanism that, while still producing valid results, reduces the number of kernel launches: each launched kernel performs a sufficient amount of computation on the GPU before the results are returned to the CPU. The validity of the method is illustrated on the single-source shortest path (SSSP) problem using delta-stepping. On the Facebook dataset, the speedup over atomic operations is about 1.8. On the New York map dataset, the speedups over atomic operations and barrier synchronization are about 9.3 and 1.7, respectively.
- Open Access
- This is an open access article distributed under the CC BY-NC license.
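For readers unfamiliar with delta-stepping, the SSSP algorithm named in the abstract, the following is a minimal sequential sketch in Python. It is not the authors' CUDA implementation. The function name `delta_stepping`, the adjacency-list layout, and the `phases` counter are assumptions made for illustration. The counter tallies the batched relaxation rounds, which roughly correspond to the points where a parallel implementation would have to synchronize, for example at a kernel launch boundary.

```python
import math
from collections import defaultdict


def delta_stepping(graph, source, delta):
    """Sequential delta-stepping SSSP (illustrative sketch only).

    graph:  dict mapping every node to a list of (neighbor, weight) pairs,
            with non-negative weights; every node must appear as a key.
    delta:  positive bucket width.
    Returns (dist, phases), where phases counts batched relaxation rounds.
    """
    dist = {v: math.inf for v in graph}
    buckets = defaultdict(set)  # bucket index -> nodes with tentative distance in that range

    def relax(v, d):
        # Move v to the bucket matching its improved tentative distance.
        if d < dist[v]:
            if dist[v] != math.inf:
                buckets[int(dist[v] // delta)].discard(v)
            dist[v] = d
            buckets[int(d // delta)].add(v)

    relax(source, 0)
    phases = 0
    while buckets:
        i = min(buckets)
        if not buckets[i]:
            del buckets[i]  # drop an empty bucket left behind by a move
            continue
        settled = set()
        # Light edges (weight <= delta) may reinsert nodes into bucket i,
        # so keep relaxing until the bucket stays empty.
        while buckets[i]:
            frontier = buckets[i]
            buckets[i] = set()
            settled |= frontier
            requests = [(w, dist[v] + c)
                        for v in frontier for (w, c) in graph[v] if c <= delta]
            for w, d in requests:
                relax(w, d)
            phases += 1
        # Heavy edges (weight > delta) are relaxed once per bucket.
        requests = [(w, dist[v] + c)
                    for v in settled for (w, c) in graph[v] if c > delta]
        for w, d in requests:
            relax(w, d)
        phases += 1
        del buckets[i]
    return dist, phases
```

In this sketch, each pass over `requests` is a batch that could be processed in parallel. A larger `delta` generally means fewer buckets but more repeated relaxation work inside each bucket, so the choice of `delta` trades the number of rounds against redundant work.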
Cite this article
TY - CONF
AU - Bingru Wang
AU - Changyou Zhang
AU - Feng Wang
AU - Jun Feng
PY - 2017/06
DA - 2017/06
TI - A Synchronization Mechanism between CUDA Blocks for GPU
BT - 2017 2nd International Conference on Control, Automation and Artificial Intelligence (CAAI 2017)
PB - Atlantis Press
SP - 251
EP - 254
SN - 1951-6851
UR - https://doi.org/10.2991/caai-17.2017.56
DO - https://doi.org/10.2991/caai-17.2017.56
ID - Wang2017/06
ER -