CUDA

If you have an Nvidia GPU, you can accelerate some CPU-intensive tasks of the EnsensoSDK with CUDA.

You can globally enable CUDA support with the Enabled node. After it is activated, all commands that support it will be executed on the GPU. The following commands can be accelerated with CUDA:

Note

Executing the ComputeDisparityMap command automatically rectifies the camera images. If you need a disparity map, you should therefore never execute the RectifyImages command explicitly, but call ComputeDisparityMap directly. This yields better performance, because it allows the CUDA implementation to interleave the rectification with the stereo matching and to avoid unnecessary copy operations to the GPU.

Multiple GPUs

You can see a list of all available GPUs in the Devices node. By default, all commands use the first GPU in that list. The Nvidia driver orders the devices such that the most powerful one becomes the first entry, so in most cases you won’t need to change anything.

If you want to use a different GPU for computations, you can select it globally with the Device node. Additionally, you can override the device in the parameters of every command that supports CUDA (see the parameter descriptions of the commands for more information). This way, you can also distribute the computations for different cameras across multiple CUDA devices.
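The selection order described above can be sketched as a small stand-alone model. This is not NxLib code: the function name resolve_cuda_device and its parameters are hypothetical, and it assumes devices are identified by their index in the Devices list, which this page does not confirm.

```python
from typing import Optional, Sequence


def resolve_cuda_device(
    devices: Sequence[str],
    global_device: Optional[int] = None,
    command_override: Optional[int] = None,
) -> int:
    """Return the index of the GPU a command would run on in this model.

    Hypothetical helper for illustration only; it does not call the NxLib.
    """
    if not devices:
        raise ValueError("no CUDA devices available")
    # Precedence: per-command parameter, then the global setting,
    # then the first entry of the device list.
    for choice in (command_override, global_device):
        if choice is not None:
            if not 0 <= choice < len(devices):
                raise IndexError(f"device index {choice} out of range")
            return choice
    return 0


gpus = ["GPU A", "GPU B"]  # placeholder names, not real device entries

# Distribute two cameras over two GPUs using per-command overrides.
first_camera_gpu = resolve_cuda_device(gpus)
second_camera_gpu = resolve_cuda_device(gpus, command_override=1)
```

In this model the first camera stays on the default device while the second one is moved to the other GPU, which mirrors the per-camera distribution mentioned above.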

Hints and Limitations

You need to have a GPU with at least Compute Capability 3.0. To check whether your GPU satisfies this requirement, you can use this overview on the Nvidia website.
By default, the NxLib keeps memory allocated on the GPU between commands that use CUDA, which improves performance. You can limit or disable this behavior with the StaticBuffers setting.
The CUDA implementation of Semi-Global-Matching needs the number of disparities to be a multiple of 32. For compatibility with the CPU implementation, you can still choose it in multiples of 16, but the computation will automatically use the next larger multiple of 32.
The MarkFilterRegions parameter of the ComputeDisparityMap command is not supported in the CUDA implementation and will be ignored. The corresponding setting in NxView has no effect when CUDA is enabled.
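As an illustration of the disparity rounding described above for Semi-Global-Matching, the following sketch computes the number of disparities the CUDA implementation would effectively use. The function name effective_cuda_disparities is hypothetical and not part of the NxLib; the sketch also assumes that a value that is already a multiple of 32 is left unchanged.

```python
def effective_cuda_disparities(requested: int) -> int:
    """Round a requested number of disparities up to a multiple of 32.

    Hypothetical helper for illustration only; it does not call the NxLib.
    """
    if requested <= 0 or requested % 16 != 0:
        raise ValueError("number of disparities must be a positive multiple of 16")
    # Ceiling division by 32, then scale back up.
    return -(-requested // 32) * 32


for requested in (16, 32, 48, 64):
    print(requested, "->", effective_cuda_disparities(requested))
```

A setting of 48 is therefore processed like 64 on the GPU, so choosing a multiple of 32 directly makes the requested and the effective value agree.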