Software development is as much about writing code fast as it is about writing fast code, and central to rapid development is software reuse and portability. When building heterogeneous applications, developers must be able to share code between projects, platforms, compilers, and target architectures. Ideally, libraries of domain-specific code should be easily retargetable. In this post I’ll talk about Hemi, a simple open-source C++ header library that simplifies writing portable CUDA C/C++ code.

In the screenshot below, both columns show a simple Black-Scholes code written to be compilable with either NVCC or a standard C++ host compiler, and also runnable on either the CPU or a CUDA GPU. The right column is written using Hemi’s macros and smart heterogeneous Array container class, hemi::Array. Using Hemi, the length and complexity of this code are reduced by half.

[Screenshot: Portable CUDA C++ code without Hemi (left) and with Hemi (right).]

CUDA C++ and the NVIDIA NVCC compiler tool chain provide a number of features designed to make it easier to write portable code, including language-level integration of host and device code and data, declaration specifiers (e.g. __host__ and __device__), and preprocessor definitions. Together, these features enable developers to write code that can be compiled and run on either the host, the device, or both. But as the left column above shows, using them directly can result in complicated code. One cause of this is the code duplication that is required to support multiple target platforms, and another cause is the verbose memory management incurred by heterogeneous memory spaces.

Hemi is inspired by real-world CUDA software projects like PhysX and OptiX, which use custom libraries of preprocessor macros and container classes that enable the definition of portable application-specific libraries, classes, and kernels. PhysX, for example, has a comprehensive 3D vector math library that is portable across multiple platforms, including CUDA GPUs, Intel and other CPUs, and game consoles. To make CUDA memory management and transfers robust and simple to implement, PhysX uses a smart generic array class that automatically copies data between the device and host only when necessary. The result is much like the right-hand side of the screenshot above, with a minimum of memory management code and no explicit memory copies.

In this post I’ll describe Hemi in depth, but first I want to cover the CUDA C/C++ language and compiler features on which Hemi is built.

CUDA C++ Language Integration and Portability Features

Host / Device Functions
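To make the idea of combining declaration specifiers with preprocessor definitions concrete, here is a minimal sketch of a Black-Scholes helper that should build with either NVCC or a standard C++ host compiler. It relies on `__CUDACC__`, which NVCC defines when compiling CUDA source. The macro name `PORTABLE_HD` and the functions `cnd`, `call_price`, and `price_calls_host` are hypothetical names invented for this illustration; they are not Hemi's API.

```cpp
#include <cassert>
#include <math.h>

// NVCC defines __CUDACC__ while compiling CUDA sources; other compilers do not.
// PORTABLE_HD is a hypothetical macro name chosen for this sketch.
#ifdef __CUDACC__
#define PORTABLE_HD __host__ __device__
#else
#define PORTABLE_HD
#endif

// Cumulative normal distribution. Callable from host code and, when built
// with NVCC, from device code as well.
PORTABLE_HD inline float cnd(float d)
{
    return 0.5f * erfcf(-d * 0.70710678f);  // 0.70710678 is about 1/sqrt(2)
}

// Price of a European call option under the Black-Scholes model.
// s: spot price, k: strike, t: years to expiry, r: risk-free rate, v: volatility.
PORTABLE_HD inline float call_price(float s, float k, float t, float r, float v)
{
    float sqrt_t = sqrtf(t);
    float d1 = (logf(s / k) + (r + 0.5f * v * v) * t) / (v * sqrt_t);
    float d2 = d1 - v * sqrt_t;
    return s * cnd(d1) - k * expf(-r * t) * cnd(d2);
}

// Host-side loop over an array of options. Under NVCC, the same call_price
// function could instead be called from a __global__ kernel.
inline void price_calls_host(const float* s, const float* k, float* out,
                             int n, float t, float r, float v)
{
    assert(s != nullptr && k != nullptr && out != nullptr);
    for (int i = 0; i < n; ++i) {
        out[i] = call_price(s[i], k[i], t, r, v);
    }
}
```

The point of the sketch is that the pricing math is written once. Only the macro definition changes between compilers, so there is no duplicated host and device version of `call_price` to keep in sync.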