An async CUDA operation on the default stream (NULL CUstream) offers little
benefit over a blocking operation, since all CUDA operations that belong
to the CUDA context are synchronized with the default stream's operations.
Note that a CUDA stream shares all resources of its CUDA context while
still allowing operations to run in parallel, similar to the relationship
between a thread and a process.
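As a rough illustration of the point above, the sketch below issues an async
upload on a dedicated non-blocking stream instead of the NULL stream. It is
not the plugin's code: the function name upload_on_own_stream, the local
type declarations, and the use of dlopen() on "libcuda.so.1" are assumptions
made so the example builds without the CUDA toolkit. Only the driver entry
points themselves (cuStreamCreate, cuMemcpyHtoDAsync, ...) are real.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Assumption: minimal local declarations mirroring the 64-bit driver API,
 * so the sketch compiles without the toolkit's cuda.h. */
typedef int CUresult;                    /* 0 means CUDA_SUCCESS */
typedef int CUdevice;
typedef unsigned long long CUdeviceptr;
typedef struct CUctx_st *CUcontext;
typedef struct CUstream_st *CUstream;
#define CU_STREAM_NON_BLOCKING 0x1       /* no implicit sync with NULL stream */

typedef CUresult (*InitFn) (unsigned int);
typedef CUresult (*DeviceGetFn) (CUdevice *, int);
typedef CUresult (*CtxCreateFn) (CUcontext *, unsigned int, CUdevice);
typedef CUresult (*CtxDestroyFn) (CUcontext);
typedef CUresult (*StreamCreateFn) (CUstream *, unsigned int);
typedef CUresult (*StreamFn) (CUstream);  /* synchronize and destroy */
typedef CUresult (*MemAllocFn) (CUdeviceptr *, size_t);
typedef CUresult (*MemFreeFn) (CUdeviceptr);
typedef CUresult (*CopyAsyncFn) (CUdeviceptr, const void *, size_t, CUstream);

/* Look up one driver symbol; give up with "CUDA unavailable" if missing. */
#define LOAD(var, type, name) \
  type var = (type) dlsym (lib, name); \
  if (!var) { dlclose (lib); return 1; }

/* Hypothetical helper: copies `size` bytes to the device on its own stream.
 * Returns 0 on success, 1 if CUDA is unavailable, -1 on a CUDA error. */
int
upload_on_own_stream (const void *src, size_t size)
{
  assert (src != NULL && size > 0);

  void *lib = dlopen ("libcuda.so.1", RTLD_NOW);
  if (!lib)
    return 1;

  LOAD (init, InitFn, "cuInit")
  LOAD (device_get, DeviceGetFn, "cuDeviceGet")
  LOAD (ctx_create, CtxCreateFn, "cuCtxCreate_v2")
  LOAD (ctx_destroy, CtxDestroyFn, "cuCtxDestroy_v2")
  LOAD (stream_create, StreamCreateFn, "cuStreamCreate")
  LOAD (stream_sync, StreamFn, "cuStreamSynchronize")
  LOAD (stream_destroy, StreamFn, "cuStreamDestroy_v2")
  LOAD (mem_alloc, MemAllocFn, "cuMemAlloc_v2")
  LOAD (mem_free, MemFreeFn, "cuMemFree_v2")
  LOAD (copy_async, CopyAsyncFn, "cuMemcpyHtoDAsync_v2")

  CUdevice dev = 0;
  CUcontext ctx = NULL;
  CUstream stream = NULL;
  CUdeviceptr dptr = 0;
  int ret = 1;

  /* No driver, no device, or no context: report "unavailable". */
  if (init (0) != 0 || device_get (&dev, 0) != 0
      || ctx_create (&ctx, 0, dev) != 0)
    goto done;

  ret = -1;
  /* The stream lives inside ctx and shares its resources, but work queued
   * on it is not serialized against the default stream. */
  if (stream_create (&stream, CU_STREAM_NON_BLOCKING) == 0
      && mem_alloc (&dptr, size) == 0
      && copy_async (dptr, src, size, stream) == 0
      && stream_sync (stream) == 0)   /* wait only for this stream */
    ret = 0;

  if (dptr)
    mem_free (dptr);
  if (stream)
    stream_destroy (stream);
  ctx_destroy (ctx);

done:
  dlclose (lib);
  return ret;
}
```

Calling upload_on_own_stream (buf, sizeof buf) returns 1 on a machine without
a usable CUDA driver. Note that a copy from pageable host memory may still
behave synchronously; pinned host memory is typically needed to overlap the
transfer with other work.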
By adding system memory support to nvdec, both the encoder and the decoder
in the nvcodec plugin become usable without any OpenGL dependency.
In addition, using system memory directly may incur less overhead than
OpenGL memory in some use cases (e.g., transcoding with a software encoder).
... and add our stub cuda header.
The newly introduced stub cuda.h file defines the minimal set of types needed
to build the nvcodec plugin without depending on a system-installed CUDA
toolkit. This makes cross-compilation possible.
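To make the idea concrete, here is a minimal sketch of how a stub header
and runtime symbol lookup can fit together. It is an illustration only, not
the contents of the actual stub cuda.h: the CudaApi struct, cuda_api_load(),
cuda_device_count(), and the choice of dlopen() on "libcuda.so.1" are all
assumptions. The build needs only these declarations, and the real driver
is looked up when the code runs.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* What a stub header might declare (illustrative subset): just enough
 * types for callers to compile without the toolkit's cuda.h. */
typedef enum { CUDA_SUCCESS = 0 } CUresult;
typedef int CUdevice;
typedef struct CUctx_st *CUcontext;

/* Hypothetical table of driver entry points resolved at runtime. */
typedef struct
{
  void *lib;
  CUresult (*Init) (unsigned int flags);
  CUresult (*DeviceGetCount) (int *count);
} CudaApi;

/* Returns 1 if the driver library and the needed symbols were found,
 * 0 otherwise. On success the caller owns api->lib and must dlclose() it. */
int
cuda_api_load (CudaApi * api)
{
  assert (api != NULL);

  api->lib = dlopen ("libcuda.so.1", RTLD_NOW);
  if (!api->lib)
    return 0;

  api->Init = (CUresult (*)(unsigned int)) dlsym (api->lib, "cuInit");
  api->DeviceGetCount =
      (CUresult (*)(int *)) dlsym (api->lib, "cuDeviceGetCount");

  if (!api->Init || !api->DeviceGetCount) {
    dlclose (api->lib);
    api->lib = NULL;
    return 0;
  }
  return 1;
}

/* Number of CUDA devices, or 0 when no usable driver is present. */
int
cuda_device_count (void)
{
  CudaApi api = { 0 };
  int count = 0;

  if (!cuda_api_load (&api))
    return 0;

  if (api.Init (0) != CUDA_SUCCESS
      || api.DeviceGetCount (&count) != CUDA_SUCCESS)
    count = 0;

  dlclose (api.lib);
  return count;
}
```

Because nothing here links against libcuda at build time, the same source
can be compiled on a host (or for a target) that has no CUDA toolkit
installed. cuda_device_count() simply reports 0 there.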