Commit graph

148 commits

Author SHA1 Message Date
Seungha Yang abe1f5044d cuda: Prefer CUBIN over PTX
System installed NVRTC library might be newer version than
driver, then generate PTX can be incompatible with the driver.
Instead of the intermediate code PTX, use actual assembly code
directly.

Fixes: https://gitlab.freedesktop.org/gstreamer/gstreamer/-/issues/3108
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5639>
2024-01-02 10:10:09 +00:00
Seungha Yang 012222bcb3 cudaipcsink: Fix deadlock on stop
Manually close connection if client does not hold any shared memory
on stop.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5747>
2023-12-06 16:09:27 +00:00
Seungha Yang b168647073 nvdec: Fix division by zero when calculating buffer duration
Don't try to calculate buffer duration from variable framerate

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5764>
2023-12-06 13:38:09 +00:00
Seungha Yang 0a05ba3f62 nvencoder: Add support for new preset/tune/multi-pass options
Adding new P1 ~ P7 presets and deprecate old preset values.
Also adding tune and multi-pass properties.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5428>
2023-10-14 11:03:40 +00:00
James Oliver aeef97d81b nvh265encoder: fix bounds for auto-select GPU enumeration
Fixes the bounds-check for encoder auto-select GPU enumeration to be
between 0-7 instead of 0-6. This should allow 8-GPU machines to work
with nvautogpuh265enc for the last enumerated GPU.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5438>
2023-10-05 13:37:36 +08:00
James Oliver 54074af8ec nvh264encoder: fix bounds for auto-select GPU enumeration
Fixes the bounds-check for encoder auto-select GPU enumeration to be
between 0-7 instead of 0-6. This should allow 8-GPU machines to work
with nvautogpuh264enc for the last enumerated GPU.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5438>
2023-10-05 13:37:32 +08:00
Seungha Yang 2cd81eb1ac nvdecoder: Handle output surface alignment in decoder helper object
Output resolution might not be an even number. Set output resolution
without round up but consider the alignment inside of decoder

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5409>
2023-09-29 12:36:01 +00:00
Seungha Yang 42560c24df nvdecoder: Add support for D3D11 output
Since DXVA does not support some profiles such as HEVC RExt,
vendor specific decoding API is still required.
When decoder is negotiated with d3d11 caps, decoder will convert
semi-planar frame to planar since semi-planar format (e.g.,
DXGI_FORMAT_NV12) is not supported by CUDA/D3D11 interop.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5409>
2023-09-29 12:36:01 +00:00
Seungha Yang 57e0a0bd61 nvdecoder: Handle GstContext in helper object
... and move common code to helper object

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5409>
2023-09-29 12:36:01 +00:00
Seungha Yang c818906236 cuda: Add support for I420_12LE format
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5409>
2023-09-29 12:36:01 +00:00
Seungha Yang c5bd0faee3 nvdecoder: Add support for HEVC GBR output
... and use P012 format for 12bits instead of P016

Fixes: https://gitlab.freedesktop.org/gstreamer/gstreamer/-/issues/2991
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5375>
2023-09-23 13:12:56 +00:00
Seungha Yang 907c507680 nvh265encoder: Add support for RGB encoding
Adding GBR format support to nv{autogpu,cuda,d3d11}h265enc.
Note that the only difference between GBR and Y444 encoding
is matrix_coeffs value written in VUI.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5375>
2023-09-23 13:12:56 +00:00
Seungha Yang a80f542f66 cuda: Add support for P012_LE and Y444/GBR high bitdepth formats
Adding P012, Y444_10, Y444_12, GBR_10, GBR_12 and GBR_16 formats support

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5375>
2023-09-23 13:12:55 +00:00
Seungha Yang bd25c2738e nvdecoder: Copy output frame if needed
Even if decoder is negotiated with CUDA memory feature, if downstream
proposed no buffer pool, assume that the pool size is unknown.
And disable zero-copy if there's no more free output surface.
Or, in case of reverse playback, always copy frames.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5338>
2023-09-17 00:15:47 +09:00
Seungha Yang c5a5dcdf18 nvh265dec: Reconfigure decoder on max-dpb-size change
Decoder should create new picture pool for larger DPB size

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5338>
2023-09-16 23:00:11 +09:00
Seungha Yang f9169c5431 nvdecoder: Move common logic to decoder helper object
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5285>
2023-09-08 11:51:23 +00:00
Seungha Yang 97fc02cfe3 av1decoder: Port to GstCodecPicture struct
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5285>
2023-09-08 11:51:23 +00:00
Seungha Yang a73c6d7fb6 vp9decoder: Port to GstCodecPicture struct
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5285>
2023-09-08 11:51:23 +00:00
Seungha Yang 4571fac946 vp8decoder: Port to GstCodecPicture struct
... and remove unused "pts" variable from GstVp8Picture struct

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5285>
2023-09-08 11:51:23 +00:00
Seungha Yang 6e7cab43be h265decoder: Port to GstCodecPicture struct
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5285>
2023-09-08 11:51:23 +00:00
Seungha Yang ea3dfadbed h264decoder: Port to GstCodecPicture struct
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5285>
2023-09-08 11:51:23 +00:00
Seungha Yang 6fa405c1e2 cudaipcclient: Protect IPC handle import/close with global lock
Protect import/close with signle lock to avoid importing a IPC handle
while it's being closed by another cudaipcsrc from other thread.
Also fixing cuda context leak

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5230>
2023-08-25 10:06:58 +00:00
Seungha Yang 315cfaf2d8 nvencoder: Fix negotiation error when interlace-mode is unspecified
Use GST_PAD_SET_ACCEPT_INTERSECT() to accept caps without interlace-mode
field

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5229>
2023-08-24 14:58:25 +00:00
L. E. Segovia b1a5707fcb windows: Fix mutexes leaking into the exports table
Translation unit-local variables must be marked static on Windows,
otherwise they're made available to the whole binary.

See:

https://learn.microsoft.com/en-us/cpp/cpp/program-and-linkage-cpp?view=msvc-170#external-vs-internal-linkage

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5185>
2023-08-16 08:17:05 +00:00
Seungha Yang 7b6023d9cf nvcodec: Add support for CUDA IPC
Adding cudaipc{src,sink} element for CUDA IPC support.

Implementation note:
* For the communication between end points, Win32 named-pipe
and unix domain socket will be used on Windows and Linux respectively.

* cudaipcsink behaves as a server, and all GPU resources will be owned by
the server process and exported for other processes, then cudaipcsrc
(client) will import each exported handle.

* User can select IPC mode via "ipc-mode" property of cudaipcsink.
There are two IPC mode, one is "legacy" which uses legacy CUDA IPC
method and the other is "mmap" which uses CUDA virtual memory API
with OS's resource handle sharing method such as DuplicateHandle()
on Windows. The "mmap" mode might be better than "legacy" in terms
of stability since it relies on OS's resource management but
it would consume more GPU memory than "legacy" mode.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/4510>
2023-08-14 13:41:01 +00:00
Seungha Yang cf2cd20ba0 nvdecoder: Add max-display-delay property
The same as in old nvdec implementation so that user can control
the number of delayed output frames.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/4841>
2023-08-10 11:58:42 +00:00
Seungha Yang 19ee5f6dc4 nvdecoder: Reduce DPB size
Decoder will copy decoded picture to downstream buffer or output CUDA
memory even in case of zero-copy mode. Additional margin should be unnecessary
unless baseclass passed wrong max DPB size.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/4841>
2023-08-10 11:58:42 +00:00
Seungha Yang 1aa9e74aaf cudadownload: Always download CUDA memory if it's bound to decoder
Decoder bounded CUDA memory is allocated by driver and the pool size
is fixed. Since we don't know how many buffers would be held by
downstream non-CUDA element, we should download such CUDA memory
and release it back to decoder.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/4810>
2023-06-08 22:27:06 +00:00
Thibault Saunier 4a4d7821a5 cudabasetransform: Handle video related meta as appropriate
This implements the same logic as GstVideoFilter

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/4731>
2023-05-31 20:09:42 +00:00
Thibault Saunier 8d3f90eb8d cuda: memory: Enhance debug when CU_GL_DEVICE_LIST_ALL fails
Ensuring error from cuda is logged

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/4731>
2023-05-31 20:09:42 +00:00
Seungha Yang 7a3be74b63 cudaconvertscale: Add support for flip/rotation
Similar to the d3d11convert element, colorspace conversion, resizing and
flip/rotation operations can be done in a single kernel function call

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/4640>
2023-05-16 19:24:36 +00:00
Seungha Yang 96555c7ee9 nvh264encoder: Fix template caps
It should include progressive as well

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/4236>
2023-03-22 23:40:58 +00:00
Seungha Yang fed252cabd nvencoder: Fix CQP option setting
... and zero initialize LUID and CUDA device list to address
coverity issue

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/4215>
2023-03-17 18:30:19 +00:00
Seungha Yang 8d7cab1f0d nvcodec: Remove stateful decoders
Use H.264, H.265, VP8, and VP9 stateless decoders unconditionally
and don't register its stateful counterpart

Fixes: https://gitlab.freedesktop.org/gstreamer/gstreamer/-/issues/1768
Fixes: https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/issues/1621
Fixes: https://gitlab.freedesktop.org/gstreamer/gstreamer/-/issues/1432
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/4015>
2023-02-28 18:42:17 +00:00
Seungha Yang 67764a1579 cudaconverter: Rename CUDA kernel function
Changing its name (was too generic) to help GPU tracing via Nsight tool

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/4029>
2023-02-21 20:07:31 +00:00
Seungha Yang b1c14b0357 nvencoder: Fix b-frame encoding on Linux
On Windows, Win32 event handle is used to wait for encoded output,
but it's not available on Linux. We should delay bitstream locking
if encoder returns "need-more-input"

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/4004>
2023-02-20 20:49:01 +00:00
Mengkejiergeli Ba b5b86a0c98 nvh264dec: Remove type casting
Have fixed type of second_chroma_qp_index_offset as gint8, so remove the type
casting here.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/3966>
2023-02-20 16:40:01 +00:00
Seungha Yang b88b323e04 nvencoder: Optimization for byte-stream to packetized format conversion
Allocate single memory instead of per NAL

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/3992>
2023-02-20 02:15:24 +09:00
Seungha Yang 070a80943d nvencoder: Add support for caption insert
Fixes: https://gitlab.freedesktop.org/gstreamer/gstreamer/-/issues/1406
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/3992>
2023-02-20 02:15:24 +09:00
Seungha Yang b6d371295a nvencoder: Add support for HDR10 static metadata
Insert HDR10 SEIs per IDR

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/3992>
2023-02-20 02:15:24 +09:00
Seungha Yang 84ec16c67a nvencoder: Add support for GL memory
preparation to deprecate old NVENC elements

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/3992>
2023-02-20 02:15:19 +09:00
Seungha Yang 58373e38f4 nvencoder: Fix critical warning in autogpu mode
If upstream memory is not CUDA, CUDA context will be null.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/3992>
2023-02-20 01:49:31 +09:00
Seungha Yang 59f359eb99 cuda: Rename macro HAVE_NVCODEC_GST_GL -> HAVE_CUDA_GST_GL
... and always use #ifdef instead of #if

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/3992>
2023-02-20 01:49:31 +09:00
Seungha Yang f831b92540 nvdecoder: Add support for reconfiguration
Instead of creating new decoder instance per new sequence,
re-use configured decoder instance via cuvidReconfigureDecoder()
API. It will make output surface reusable without re-allocation.
Also, in order for application to be able to reserve higher resolution
output surface, "init-max-width" and "init-max-height" properties are
added to each decoder.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/3884>
2023-02-16 17:49:54 +00:00
Seungha Yang eb0fca4180 nvencoder: Reuse input resource
Call input resource map functions (i.e., nvEncRegisterResource,
nvEncUnregisterResource, nvEncMapInputResource, and
nvEncUnmapInputResource) only once and reuse the mapped resources,
instead of per input frame map/unmap

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/3884>
2023-02-16 17:49:54 +00:00
Seungha Yang 03425bc702 nvdecoder: Add support for CUDA zero-copy in stateless decoder
Wrap mapped decoder output surface using GstCudaMemory and
output without any copy operation. Also, for application to be able to
control the number of zero-copyable output surfaces,
"num-output-surfaces" property is added.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/3884>
2023-02-16 17:49:54 +00:00
Seungha Yang a94af552f5 nvdecoder: Port to C++
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/3884>
2023-02-16 17:49:54 +00:00
Seungha Yang 6ddd1713f1 nvdecoder: Reduce render delay to 2 frames
4 frames delay seems to be too high

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/3884>
2023-02-16 17:49:54 +00:00
Seungha Yang fa2bb42fda cudaconverter: Use cached texture
... instead of per conversion texture alloc/free

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/3884>
2023-02-16 17:49:54 +00:00
Seungha Yang 992406cf4f cuda, nvcodec: Make GstD3D11 dependency mandatory
GstD3D11 build-time dependencies should be always available on Windows already
and runtime dependencies as well, since required external
(non-GStreamer) depends are all system DLLs

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/3884>
2023-02-16 17:49:54 +00:00