72 commits

Author SHA1 Message Date
Matthew Waters
dbf4915abd cuda/context: add gpu stack size property
Allows reducing the initial stack size of GPU threads. CUDA should
automatically increase this value if a kernel requires a larger stack.

Can save roughly 40MB of GPU memory for a single nvh264enc instance.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/8158>
2024-12-19 00:33:03 +00:00
Matthew Waters
d6563016ca cuda: add CuGet/SetCtxLimit()
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/8158>
2024-12-19 00:33:03 +00:00
Nirbheek Chauhan
0c17efafa3 meson: Improve NVMM CUDA detection
1. Add some comments explaining what headers and libs are expected on
   what systems
2. Only look in default incdirs if no incdir is specified
3. Require libnvbufsurface.so on Jetson when cuda-nvmm=enabled
4. Require libatomic on Jetson when cuda-nvmm=enabled

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/8021>
2024-12-16 14:47:23 +00:00
Seungha Yang
6f92807759 cuda: Load external resource interop symbols
Required for d3d12 interop

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7480>
2024-09-14 15:33:44 +00:00
Seungha Yang
ad02fae416 cuda: Add support for application cuda memory pool
Adding gst_cuda_register_allocator_need_pool_callback() method
to support memory allocation from application's CUmemoryPool

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7427>
2024-09-11 17:15:14 +00:00
Seungha Yang
3c3b8e79c2 cuda: Add CUDA memory pool object
Adding a wrapper object for CUmemoryPool handle to use the native
handle in a refcounted way

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7427>
2024-09-11 17:15:14 +00:00
Seungha Yang
cdaa798ac7 cuda: Add methods to enable stream ordered allocation
Adding prefer-stream-ordered-alloc property to GstCudaContext.
If stream ordered allocation buffer pool option is not configured
and this property is enabled, buffer pool will enable the stream
ordered allocation. Otherwise it will follow default behavior.

If GST_CUDA_ENABLE_STREAM_ORDERED_ALLOC env is set,
default behavior is enabling the stream ordered allocation.
Otherwise sync alloc/free method will be used.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7427>
2024-09-11 17:15:14 +00:00
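The selection rule described in cdaa798ac7 above can be sketched roughly as follows. This is only an illustration, not the GStreamer implementation: `PoolOption`, `resolve_stream_ordered_alloc()` and `env_default_from_environment()` are hypothetical names, and the assumption that an explicitly configured buffer pool option takes precedence is inferred from the commit message rather than stated in it.

```cpp
#include <cstdlib>

// Hypothetical tri-state for the buffer pool option (not GStreamer API).
enum class PoolOption { Unset, Disabled, Enabled };

// Resolves whether a buffer pool should use stream ordered allocation.
//   pool_option     : value configured on the buffer pool, if any
//   context_prefers : value of the context's prefer-stream-ordered-alloc property
//   env_default     : true when GST_CUDA_ENABLE_STREAM_ORDERED_ALLOC is set
bool resolve_stream_ordered_alloc(PoolOption pool_option,
                                  bool context_prefers,
                                  bool env_default) {
  // Assumption: an explicitly configured pool option always wins.
  if (pool_option == PoolOption::Enabled) return true;
  if (pool_option == PoolOption::Disabled) return false;
  // Option not configured: the context property can enable it.
  if (context_prefers) return true;
  // Otherwise follow the default behavior derived from the environment.
  return env_default;
}

// Reads the default from the environment variable named in the commit.
bool env_default_from_environment() {
  return std::getenv("GST_CUDA_ENABLE_STREAM_ORDERED_ALLOC") != nullptr;
}
```

Keeping the environment lookup separate from the decision function makes the rule easy to exercise without touching the process environment.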
Seungha Yang
b266aa5e65 cuda: Add support for stream ordered allocation
Default CUDA memory allocation will cause implicit global
synchronization. This stream ordered allocation can avoid it
since memory allocation and free operations are asynchronous
and executed in the associated cuda stream context

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7427>
2024-09-11 17:15:14 +00:00
Seungha Yang
174c9bfaa5 cuda: Load stream ordered allocation related symbols
Required to support async memory allocation

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7427>
2024-09-11 17:15:14 +00:00
Matthew Waters
8dac91537d cuda/nvcodec: Add support for importing and producing embedded NVMM memory
As produced on the Nvidia Jetson series of devices.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7274>
2024-08-02 01:59:07 +00:00
Seungha Yang
0b285fc1a1 cuda: Fix runtime compiler loading with old CUDA toolkit
Fallback to PTX if CUBIN symbol is unavailable

Fixes: https://gitlab.freedesktop.org/gstreamer/gstreamer/-/issues/3685
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7220>
2024-07-23 19:53:09 +00:00
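Taken together with abe1f5044d ("Prefer CUBIN over PTX"), the fallback in 0b285fc1a1 amounts to a small selection rule. The sketch below is a minimal illustration under assumptions: `ProgramKind` and `choose_program_kind()` are hypothetical names, and the two booleans stand in for whether the corresponding runtime compiler symbols were found when the library was loaded.

```cpp
// Hypothetical result type; not part of the GStreamer CUDA library.
enum class ProgramKind { Cubin, Ptx, None };

// Picks the output format to request from the runtime compiler.
//   have_cubin_symbol : the CUBIN entry point was resolved at load time
//   have_ptx_symbol   : the PTX entry point was resolved at load time
ProgramKind choose_program_kind(bool have_cubin_symbol, bool have_ptx_symbol) {
  if (have_cubin_symbol)
    return ProgramKind::Cubin;  // preferred: device code, no driver-side PTX JIT
  if (have_ptx_symbol)
    return ProgramKind::Ptx;    // fallback for older toolkits without CUBIN support
  return ProgramKind::None;     // runtime compilation is unavailable
}
```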
Seungha Yang
cee01d7fbd cuda: Load 1D memcpy method symbols
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6912>
2024-06-13 18:19:08 +00:00
Seungha Yang
afb62e98c7 cuda: Enable x86 NVMM support again
It has been broken since the memory copy helper function was moved to
gst-libs. Also add "cuda-nvmm" and "cuda-nvmm-include-path" build options
to enable or disable NVMM support in the gstcuda library

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6978>
2024-06-06 12:16:50 +00:00
Seungha Yang
e813ea8367 cudamemory: Fix offset of subsampled planar formats
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6903>
2024-05-23 11:47:16 +00:00
Seungha Yang
e6f496a240 cuda: Add support for VUYA format
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6417>
2024-04-02 13:07:29 +00:00
Seungha Yang
c9aaf39279 cuda,d3d11,d3d12bufferpool: Disable preallocation
Do not chain up to parent's GstBufferPool::start() which will do
preallocation. We don't want it to be preallocated
since there are various cases where negotiated downstream buffer pool is
not used at all (e.g., zero-copy decoding, IPC elements).

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6326>
2024-03-12 18:07:29 +00:00
Seungha Yang
f77f3e83ed cudamemory: Fix outstanding memory count tracing
Return memory that is being released to the queue even if the allocator
is flushing, so that the number of outstanding memory objects can be
counted. Also fix a double increment of the count

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6240>
2024-02-29 11:57:50 +00:00
Seungha Yang
51162acc31 cuda: Report device open error
Call gst_cuda_result() with CUDA_ERROR_NO_DEVICE error code if
we could not open device, so that application can catch the error

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6006>
2024-01-30 14:30:41 +00:00
Seungha Yang
cd6d62ddf0 cuda: Use cuStreamDestroy_v2 API
Sync up with CUDA 11.x/12.0 header

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6006>
2024-01-30 14:30:41 +00:00
Seungha Yang
abe1f5044d cuda: Prefer CUBIN over PTX
The system-installed NVRTC library might be newer than the driver,
in which case the generated PTX can be incompatible with the driver.
Instead of the intermediate PTX code, use the actual assembly code
directly.

Fixes: https://gitlab.freedesktop.org/gstreamer/gstreamer/-/issues/3108
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5639>
2024-01-02 10:10:09 +00:00
Seungha Yang
91e0c3aafa cuda: Use d3d11 token data for interop data
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5409>
2023-09-29 12:36:01 +00:00
Seungha Yang
c818906236 cuda: Add support for I420_12LE format
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5409>
2023-09-29 12:36:01 +00:00
Seungha Yang
a80f542f66 cuda: Add support for P012_LE and Y444/GBR high bitdepth formats
Adding P012, Y444_10, Y444_12, GBR_10, GBR_12 and GBR_16 formats support

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5375>
2023-09-23 13:12:55 +00:00
Seungha Yang
dc2fd997a2 cuda: Add workaround for gir build
ERROR:../girepository/girparser.c:343:state_switch:
  assertion failed: (ctx->state != newstate)
Bail out! ERROR:../girepository/girparser.c:343:state_switch:
  assertion failed: (ctx->state != newstate)

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/4510>
2023-08-14 13:41:01 +00:00
Seungha Yang
7b1e4d6051 cudabufferpool: Add support for virtual memory
Configure malloc or mmap allocator depending on config option

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/4510>
2023-08-14 13:41:01 +00:00
Seungha Yang
547b13c68f cudacontext: Add memory allocation related properties
Adding "virtual-memory" and "os-handle" properties. New properties
will be used to query device's capability

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/4510>
2023-08-14 13:41:01 +00:00
Seungha Yang
2f506e8ddc cudamemory: Add support for virtual memory in pool allocator
Adding new memory pool allocator for virtual memory

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/4510>
2023-08-14 13:41:01 +00:00
Seungha Yang
194fd8bb82 cudamemory: Add support for virtual memory management
Adding new CUDA memory allocation methods

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/4510>
2023-08-14 13:41:01 +00:00
Seungha Yang
a712a768a4 cuda: Load virtual memory management and IPC API symbols
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/4510>
2023-08-14 13:41:01 +00:00
Seungha Yang
de749fa356 cuda: Introduce GST_CUDA_CRITICAL_ERRORS env to abort on critical error
Adding GST_CUDA_CRITICAL_ERRORS env variable so that program can be
terminated on unrecoverable error.

Example)
GST_CUDA_CRITICAL_ERRORS=2,700 gst-launch-1.0 ...

In this example, CUDA_ERROR_OUT_OF_MEMORY(2) and
CUDA_ERROR_ILLEGAL_ADDRESS(700) are registered as critical error
and program will be aborted on those errors

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/4729>
2023-06-18 16:44:43 +00:00
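The value format shown in de749fa356 above (a comma-separated list of numeric CUDA error codes) could be handled along these lines. This is a sketch only: `parse_critical_errors()` and `is_critical()` are hypothetical helpers, and skipping tokens that are not plain decimal numbers is an assumption, not documented behavior.

```cpp
#include <cstdlib>
#include <set>
#include <sstream>
#include <string>

// Parses a list such as "2,700" into a set of error codes.
// Assumption: empty or non-numeric tokens are silently skipped.
std::set<int> parse_critical_errors(const std::string& value) {
  std::set<int> codes;
  std::istringstream stream(value);
  std::string token;
  while (std::getline(stream, token, ',')) {
    if (token.empty()) continue;
    char* end = nullptr;
    long code = std::strtol(token.c_str(), &end, 10);
    // Accept the token only if the whole string was consumed as a number.
    if (end != token.c_str() && *end == '\0')
      codes.insert(static_cast<int>(code));
  }
  return codes;
}

// Returns true if a CUDA result code was registered as critical.
bool is_critical(const std::set<int>& codes, int cuda_result) {
  return codes.count(cuda_result) != 0;
}
```

A caller would read the variable once (for example with `std::getenv("GST_CUDA_CRITICAL_ERRORS")`), parse it, and abort when `is_critical()` returns true for a failed CUDA call.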
Seungha Yang
58b166453d cuda: Move cuda debug helper function to .cpp
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/4729>
2023-06-18 16:44:43 +00:00
Seungha Yang
1aa9e74aaf cudadownload: Always download CUDA memory if it's bound to decoder
Decoder-bound CUDA memory is allocated by the driver and the pool size
is fixed. Since we don't know how many buffers would be held by a
downstream non-CUDA element, we should download such CUDA memory
and release it back to the decoder.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/4810>
2023-06-08 22:27:06 +00:00
Seungha Yang
4ed3c46de7 cudamemory: Fix for semi planar YUV memory size decision
The UV plane of a semi-planar format requires only half of the Y plane size

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/4502>
2023-04-27 20:55:53 +00:00
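As a worked example of the size rule in 4ed3c46de7 above, the total for a 4:2:0 semi-planar image such as NV12 is the Y plane plus an interleaved UV plane of half that size. The helper name `semi_planar_420_size()`, the shared row pitch, and the rounding of odd heights are assumptions made for illustration.

```cpp
#include <cstddef>

// Bytes needed for a 4:2:0 semi-planar image (e.g. NV12) where both
// planes share the same row pitch.
std::size_t semi_planar_420_size(std::size_t pitch, std::size_t height) {
  std::size_t y_plane = pitch * height;
  // The interleaved UV plane has half as many rows as the Y plane.
  // Assumption: odd heights are rounded up.
  std::size_t uv_plane = pitch * ((height + 1) / 2);
  return y_plane + uv_plane;
}
```

For a pitch of 2048 bytes and a height of 1080 rows this gives 2048 * 1080 + 2048 * 540 = 3317760 bytes, rather than the 4423680 bytes that two full-size planes would take.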
Thibault Saunier
b14e675a27 gir: Checkout all .gir files and check that they are updated on the CI
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/3010>
2023-04-22 09:32:32 -04:00
Alicia Boya García
85b6625150 cudaloader: Initialize logging category
gstcudaloader.cpp defines GST_DEBUG_CATEGORY (gst_cudaloader_debug);
but it wasn't initializing it anywhere.

This caused the following error to be logged by gst-plugin-scanner when
libcuda.so.1/nvcuda.dll couldn't be loaded, e.g. in systems without
CUDA:

(gst-plugin-scanner:39618): GStreamer-CRITICAL **: 14:40:22.346:
gst_debug_log_full_valist: assertion 'category != NULL' failed

This patch fixes the bug by initializing the category in
gst_cuda_load_library_once_func() before any logging occurs.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/4154>
2023-03-14 03:07:50 +00:00
Seungha Yang
319f5f0760 cuda: Link libatomic if needed
Looks like C++ does not pull it automatically

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/3997>
2023-02-21 16:00:32 +00:00
Seungha Yang
59f359eb99 cuda: Rename macro HAVE_NVCODEC_GST_GL -> HAVE_CUDA_GST_GL
... and always use #ifdef instead of #if

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/3992>
2023-02-20 01:49:31 +09:00
Seungha Yang
ff3120a38c cudamemory, d3d11memory: Add memory_{get,set}_token_data() methods
Similar to GstMiniObject qdata but new methods will use int64
token value and per object lock, instead of GQuark with global
mutex in qdata

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/3884>
2023-02-16 17:49:54 +00:00
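The idea behind ff3120a38c above, data keyed by an int64 token and protected by a lock that belongs to the object itself, can be sketched as below. `TokenDataStore` is a hypothetical stand-in rather than the GStreamer type, and treating a null pointer as "remove the entry" is an assumption of this sketch.

```cpp
#include <cstdint>
#include <map>
#include <mutex>

// Hypothetical per-object store: each instance owns its own mutex, so
// lookups on different objects never contend on a shared global lock.
class TokenDataStore {
 public:
  // Associates data with a token. Assumption: nullptr removes the entry.
  void set(std::int64_t token, void* data) {
    std::lock_guard<std::mutex> lock(mutex_);
    if (data == nullptr)
      entries_.erase(token);
    else
      entries_[token] = data;
  }

  // Returns the data stored for a token, or nullptr if there is none.
  void* get(std::int64_t token) {
    std::lock_guard<std::mutex> lock(mutex_);
    auto it = entries_.find(token);
    return it == entries_.end() ? nullptr : it->second;
  }

 private:
  std::mutex mutex_;
  std::map<std::int64_t, void*> entries_;
};
```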
Seungha Yang
f6defc0c5b cudamemory: Add gst_cuda_allocator_alloc_wrapped() method
... so that application can pass already allocated CUDA memory

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/3884>
2023-02-16 17:49:54 +00:00
Seungha Yang
e77e6fd4a7 cudamemory: Skip sync if no I/O operation happened on free()
Synchronization for unused memory is not required

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/3884>
2023-02-16 17:49:54 +00:00
Seungha Yang
f44cac1c9f cudamemory: Make CUtexObject object reusable
Create and hold CUtexObject objects in GstCudaMemory so that they can
be reused

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/3884>
2023-02-16 17:49:54 +00:00
Seungha Yang
992406cf4f cuda, nvcodec: Make GstD3D11 dependency mandatory
GstD3D11 build-time dependencies should always be available on Windows already,
and runtime dependencies as well, since the required external
(non-GStreamer) dependencies are all system DLLs

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/3884>
2023-02-16 17:49:54 +00:00
Seungha Yang
f212bd901b cuda: Port to C++
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/3884>
2023-02-16 17:49:53 +00:00
Seungha Yang
090d50e1a0 tests: Add CUDA memory allocator test
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/3629>
2023-02-03 15:27:43 +00:00
Seungha Yang
7a8bb85523 cudaupload, cudadownload: Update for shared CUDA stream
Use the memory's CUDA stream if it exists

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/3629>
2023-02-03 15:27:43 +00:00
Seungha Yang
d409c35367 cudabufferpool: Add support for CUDA stream use in memory
* Use GstCudaPoolAllocator
* Pass configured GstCudaStream object to allocator

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/3629>
2023-02-03 15:27:42 +00:00
Seungha Yang
30d06e03c2 cudamemory: Make GstCudaStream-aware
This will be used for CUDA stream sharing.

* Adding GstCudaPoolAllocator object. The pool allocator will
  control synchronization of allocated memory objects.
* Modify gst_cuda_allocator_alloc() API so that caller can specify/set
  GstCudaStream object for the newly allocated memory.
* GST_CUDA_MEMORY_TRANSFER_NEED_SYNC flag is added in addition to
  existing GST_CUDA_MEMORY_TRANSFER_NEED_{UPLOAD,DOWNLOAD}.
  The flag indicates that any GPU command queued in the CUDA stream
  may not be finished yet, and caller should take care of the
  synchronization.
  The flag is controlled by GstCudaMemory object if the memory holds
  GstCudaStream. (Otherwise, GstCudaMemory will do synchronization
  as before this commit). Specifically, GstCudaMemory object will set
  the new flag automatically when memory is mapped with
  (GST_MAP_CUDA | GST_MAP_WRITE) flags. Caller will need to unset
  the flag via GST_MEMORY_FLAG_UNSET() if it's already synchronized
  by client code.
* gst_cuda_memory_sync() helper function is added to perform synchronization
* Why not use CUevent object to keep track of synchronization status?
  CUDA provides fence-like interface already via CUevent object,
  but cuEventRecord/cuEventQuery APIs are not zero-cost operations.
  Instead, in this version, the status is tracked by using map and
  object flags.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/3629>
2023-02-03 15:27:42 +00:00
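The flag-based tracking described in 30d06e03c2 above can be modeled with a few lines of state handling. Everything here is a simplified stand-in: `FakeCudaMemory`, the constant values, and the function names are hypothetical, and `sync_calls` merely counts where a real implementation would wait on the CUDA stream.

```cpp
#include <cstdint>

// Hypothetical flag values; only the roles mirror the commit message.
constexpr std::uint32_t NEED_SYNC = 1u << 2;
constexpr std::uint32_t MAP_WRITE = 1u << 1;
constexpr std::uint32_t MAP_CUDA = 1u << 2;

struct FakeCudaMemory {
  bool has_stream;       // memory holds a stream object
  std::uint32_t flags;   // transfer flags, including NEED_SYNC
  int sync_calls;        // counts simulated stream synchronizations
};

// Mapping for device write access marks stream-backed memory as
// possibly having unfinished GPU work queued on its stream.
void map_memory(FakeCudaMemory& mem, std::uint32_t map_flags) {
  const std::uint32_t device_write = MAP_CUDA | MAP_WRITE;
  if (mem.has_stream && (map_flags & device_write) == device_write)
    mem.flags |= NEED_SYNC;
}

// Synchronizes only when the flag says work may be pending.
void sync_memory(FakeCudaMemory& mem) {
  if ((mem.flags & NEED_SYNC) == 0) return;  // nothing queued, skip the wait
  ++mem.sync_calls;                          // stand-in for a stream wait
  mem.flags &= ~NEED_SYNC;
}

// For callers that already synchronized the stream themselves.
void clear_need_sync(FakeCudaMemory& mem) { mem.flags &= ~NEED_SYNC; }
```

Tracking the state in a flag keeps the common path to a single bit test, which is the trade-off the commit message gives for not using event objects.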
Seungha Yang
9eaae61a44 cudamemory: Allow nullptr allocator object
The GstCudaAllocator object doesn't hold any device object.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/3629>
2023-02-03 15:27:42 +00:00
Seungha Yang
a7c54ebc06 cuda: Add GstCudaStream object
Wrap CUstream handle with GstCudaStream to make it ref-counted
object. This GstCudaStream object will be used later for
CUDA stream sharing

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/3629>
2023-02-03 15:27:42 +00:00
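Wrapping a native handle in a ref-counted object, as a7c54ebc06 above does for the CUDA stream handle, follows a familiar pattern. The sketch below uses `std::shared_ptr` with a custom deleter purely as an analogy; `FakeStreamHandle` and `wrap_stream()` are hypothetical, and the counter stands in for destroying the real stream.

```cpp
#include <memory>

// Stand-in for the native stream handle type.
using FakeStreamHandle = int;

// Wraps a handle so it is destroyed exactly once, when the last
// reference goes away. destroy_count records simulated destruction.
std::shared_ptr<FakeStreamHandle> wrap_stream(FakeStreamHandle handle,
                                              int* destroy_count) {
  return std::shared_ptr<FakeStreamHandle>(
      new FakeStreamHandle(handle),
      [destroy_count](FakeStreamHandle* owned) {
        ++*destroy_count;  // a real wrapper would destroy the stream here
        delete owned;
      });
}
```

Sharing the wrapper between several users means none of them has to know who releases the stream last.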
Seungha Yang
661b5f60c6 cuda: Provide single header include entry point
Add "gstcuda.h" header file

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/3629>
2023-02-03 15:27:42 +00:00