This example measures the memory bandwidth of GPU devices. It performs memcpy operations from host to GPU device, from GPU device to host, and within a single GPU.
- User command-line arguments are parsed and the test parameters are initialized. If no command-line arguments are given, the test parameters are initialized with default values.
- Bandwidth tests are launched.
- If the memory type for the test is set to `-memory pageable`, the host-side data is instantiated in a `std::vector<unsigned char>`. If the memory type is set to `-memory pinned`, the host-side data is instantiated in an `unsigned char*` and allocated using `hipHostMalloc`.
- Device-side storage is allocated using `hipMalloc` in an `unsigned char*`.
- The memory transfer is repeated for the configured number of trials, using `hipMemcpy` for pageable memory or `hipMemcpyAsync` for host-allocated pinned memory.
- The time taken by the memory transfer operations is measured and then used to calculate the bandwidth.
- All device memory is freed using `hipFree`, and all host-allocated pinned memory is freed using `hipHostFree`.
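
The bandwidth calculation in the timing step can be sketched as follows. This is a minimal illustration in plain C++, not the example's actual code: the helper name `bandwidth_gb_per_s` and its parameters are hypothetical, and it assumes that bandwidth is reported as the total number of bytes moved divided by the total elapsed time, with 1 GB taken as 10^9 bytes.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: converts a measured transfer time into bandwidth.
// Assumes `trials` copies of `bytes_per_copy` bytes each took `elapsed_ms`
// milliseconds in total, and reports GB/s with 1 GB = 1e9 bytes.
double bandwidth_gb_per_s(std::size_t bytes_per_copy, std::size_t trials, double elapsed_ms)
{
    assert(elapsed_ms > 0.0 && "elapsed time must be positive");

    const double total_bytes = static_cast<double>(bytes_per_copy) * static_cast<double>(trials);
    const double seconds     = elapsed_ms / 1000.0;
    return total_bytes / seconds / 1e9;
}
```

For instance, two copies of 500,000,000 bytes that together take 500 ms correspond to 2 GB/s under these assumptions.
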
The program uses both pageable and pinned host memory. Pinned memory is allocated using `hipHostMalloc` and released using `hipHostFree`. The HIP memory transfer routine `hipMemcpyAsync` behaves synchronously if the host memory is not pinned, so the host buffer must be allocated with `hipHostMalloc` for `hipMemcpyAsync` to behave asynchronously.
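
The pinned-memory path can be sketched as below. This is a minimal host-to-device sketch under stated assumptions, not the example's own source: the transfer size, the trial count, the `HIP_CHECK` macro, and the use of a dedicated stream with HIP events for timing are all choices made for illustration. It needs the HIP runtime and a GPU, and would typically be built with `hipcc`.

```cpp
#include <hip/hip_runtime.h>

#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <iostream>

// Hypothetical error-checking macro used only in this sketch.
#define HIP_CHECK(expr)                                                  \
    do {                                                                 \
        const hipError_t status_ = (expr);                               \
        if (status_ != hipSuccess) {                                     \
            std::cerr << "HIP error: " << hipGetErrorString(status_)     \
                      << " at line " << __LINE__ << std::endl;           \
            std::exit(EXIT_FAILURE);                                     \
        }                                                                \
    } while (0)

int main()
{
    constexpr std::size_t size_bytes = 64 * 1024 * 1024; // assumed transfer size
    constexpr int         trials     = 10;               // assumed trial count

    // Pinned host buffer and device buffer.
    unsigned char* h_data = nullptr;
    unsigned char* d_data = nullptr;
    HIP_CHECK(hipHostMalloc(reinterpret_cast<void**>(&h_data), size_bytes, hipHostMallocDefault));
    HIP_CHECK(hipMalloc(reinterpret_cast<void**>(&d_data), size_bytes));
    std::memset(h_data, 0, size_bytes);

    hipStream_t stream;
    hipEvent_t  start, stop;
    HIP_CHECK(hipStreamCreate(&stream));
    HIP_CHECK(hipEventCreate(&start));
    HIP_CHECK(hipEventCreate(&stop));

    // Enqueue all copies between two events on the same stream.
    HIP_CHECK(hipEventRecord(start, stream));
    for (int i = 0; i < trials; ++i)
    {
        HIP_CHECK(hipMemcpyAsync(d_data, h_data, size_bytes, hipMemcpyHostToDevice, stream));
    }
    HIP_CHECK(hipEventRecord(stop, stream));
    HIP_CHECK(hipEventSynchronize(stop)); // wait until every copy has finished

    float elapsed_ms = 0.0f;
    HIP_CHECK(hipEventElapsedTime(&elapsed_ms, start, stop));

    // Total bytes moved divided by elapsed seconds, reported in GB/s (1 GB = 1e9 bytes).
    const double gb_per_s = (static_cast<double>(size_bytes) * trials) / (elapsed_ms / 1000.0) / 1e9;
    std::cout << "Host to device: " << gb_per_s << " GB/s" << std::endl;

    HIP_CHECK(hipEventDestroy(start));
    HIP_CHECK(hipEventDestroy(stop));
    HIP_CHECK(hipStreamDestroy(stream));
    HIP_CHECK(hipFree(d_data));
    HIP_CHECK(hipHostFree(h_data));
    return 0;
}
```

Because the host buffer is pinned, the loop only enqueues the copies and returns; the wait happens at `hipEventSynchronize`. Replacing the `hipHostMalloc` allocation with a pageable buffer would leave the code valid but make each `hipMemcpyAsync` call behave synchronously, as noted above.
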

HIP runtime API calls demonstrated:

- `hipMalloc`
- `hipMemcpy`
- `hipMemcpyAsync`
- `hipGetDeviceCount`
- `hipGetDeviceProperties`
- `hipFree`
- `hipHostFree`
- `hipHostMalloc`
- `hipSetDevice`