oai:arXiv.org:2409.09874
Computer Science
2024
25/9/2024
In recent years, GPUs have become the preferred accelerators for HPC and ML applications due to their parallelism and high memory bandwidth.
While GPUs boost computation, inter-GPU communication can create scalability bottlenecks, especially as the number of GPUs per node and cluster grows.
Traditionally, the CPU managed multi-GPU communication, but advancements in GPU-centric communication now challenge this CPU dominance by reducing its involvement, granting GPUs more autonomy in communication tasks, and addressing mismatches in multi-GPU communication and computation.
This paper provides a landscape of GPU-centric communication, focusing on vendor mechanisms and user-level library support.
It aims to clarify the complexities and diverse options in this field, define the terminology, and categorize existing approaches within and across nodes.
The paper discusses vendor-provided mechanisms for communication and memory management in multi-GPU execution and reviews major communication libraries, their benefits, challenges, and performance insights.
It then explores key research paradigms, the future outlook, and open research questions.
By extensively describing GPU-centric communication techniques across the software and hardware stacks, we provide researchers, programmers, engineers, and library designers with insights into how best to exploit multi-GPU systems.
Unat, Didem; Turimbetov, Ilyas; Issa, Mohammed Kefah Taha; Sağbili, Doğan; Vella, Flavio; De Sensi, Daniele; Ismayilov, Ismayil, 2024, The Landscape of GPU-Centric Communication