Commit graph

148 commits

Author SHA1 Message Date
Seungha Yang
9c21923f04 nvcodec: Add H264 stateless codec implementation
Introduce GstH264Decoder based Nvidia H.264 decoder element.
Similar to the element factory naming of the v4l2 stateless codecs,
this element can be configured with factory name "gstnvh264sldec".
Note that "sl" in the name stands for "stateless".

For now, the existing nvh264dec covers more profiles and formats
(e.g., interlaced stream) than this implementation.
However, this implementation allows us to control lower level
parameters such as decoded picture buffer management and therefore
we can get a chance to improve performance in terms of latency.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/1198>
2020-04-24 09:23:10 +00:00
Seungha Yang
b4efdeba11 nvdec: Don't hardcode DPB size
Too many decode surfaces would waste GPU memory. It also seems to
introduce additional latency depending on the stream. Since nvcodec
SDK version 9.0, the CUVID parser API has provided the minimum
required number of surfaces. By using it, we can save GPU memory
and reduce possible latency.
2020-04-09 16:30:58 +09:00
Seungha Yang
bd706edc52 nvh265enc: Update for video-hdr struct change
See the change of -base https://gitlab.freedesktop.org/gstreamer/gst-plugins-base/-/merge_requests/594
2020-04-01 05:18:11 +00:00
Seungha Yang
22eab50907 nvdec: Add fallback for CUDA/OpenGL interop failure
It happens when local OpenGL context belongs to non-nvidia GPU.
2020-03-19 13:58:09 +09:00
Nirbheek Chauhan
266dc41596 nvcodec: Mark class data as may-be-leaked to quiet the leaks tracer
The class data with the caps in it will be leaked if the element is
registered but never instantiated. There is no way around this. Mark
the caps as such so that the leaks tracer does not warn about it.

This is the same as pad template caps getting leaked, which are also
marked as may-be-leaked. These objects are initialized exactly once,
and are 'global' data.
2020-02-12 00:00:51 +05:30
Nirbheek Chauhan
3ca87d9988 nvcodec: Fix crash in decoder on 32-bit Windows
Same fix as 1a7ea45ffd, but I didn't
test the decoder so I missed that the function pointers here weren't
using the correct calling convention too.
2020-02-06 13:39:52 +00:00
Sebastian Dröge
5b8ff98f96 nvdec: Don't leak template caps when registering elements with old NVIDIA driver
2020-02-05 09:49:20 +00:00
Seungha Yang
3f4a84bd32 nvenc: Query maximum supported API version
We've been using NvEncodeAPICreateInstance method to find the supported API
version, but that seems to be insufficient since there is a case
where plugin failed in opening encoding session even if NvEncodeAPICreateInstance
succeeded. Asking driver about the version would be the most certain way.
2020-02-03 14:15:28 +00:00
Nicolas Dufresne
d393232bc2 nvdec: Do not map GStreamer discont to CUVid discont
Setting the CUVID_PKT_DISCONTINUITY implies clearing any past information
about the stream in the decoder. The GStreamer discont flag is used for
discontinuity caused by a seek, for first buffer and if a buffer was
dropped. In the first two cases, the parsers and demuxers should ensure we
start from a synchronization point, so it's unlikely that delta will be
matched against the wrong state.

For packet loss, the discontinuity flag will prevent the decoder from doing
any concealment, with a result that can be much worse visually, or freeze the
playback until an IDR is met. It's better to let the decoder handle that for
us.

Removing this flag also works around a bug in the NVidia parser that makes it
ignore our ENDOFFRAME flag and increase the latency by one frame.
2020-01-25 13:39:03 +00:00
Nicolas Dufresne
a28ce16b3f nvdec: Tell the parser we have complete pictures
This sets the CUVID_PKT_ENDOFPICTURE flag in order to inform the decoder that
we have a complete picture. This should remove the one frame of latency otherwise
introduced by the NVidia parser.
2020-01-25 13:39:03 +00:00
Seungha Yang
a10f26aa3a nvenc: Do not access to broken encode session
If an encode session failed to initialize, the encode
session would be broken and the next nvenc API call would cause a crash.

Fixes: https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/issues/1179
2020-01-21 16:34:41 +09:00
Nirbheek Chauhan
bda687344b nvcodec: Print debug info when initializing nvenc
We weren't printing the return value.
2020-01-20 17:15:55 +05:30
Nirbheek Chauhan
1a7ea45ffd nvcodec: Fix crash on 32-bit Windows
We weren't using the correct calling convention when calling CUDA and
CUVID APIs. `CUDAAPI` is `__stdcall` on Windows. This was working fine
on x64 because `__stdcall` is ignored and there's no special calling
convention. However, on x86, we need to use `__stdcall`.
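A minimal sketch of the idea, not the plugin's actual code: the calling convention is part of the function pointer type, so a pointer to a driver symbol has to be declared with the same convention as the callee. `SKETCH_CUDAAPI`, `SketchInitFn`, `sketch_init` and `sketch_call_init` are hypothetical names used only for illustration; `sketch_init` stands in for a symbol that would normally be resolved from the driver library at runtime.

```c
/* Hypothetical macro playing the role of CUDAAPI: __stdcall on Windows,
 * empty everywhere else. */
#if defined(_WIN32)
#define SKETCH_CUDAAPI __stdcall
#else
#define SKETCH_CUDAAPI
#endif

/* The pointer type carries the calling convention, so a call through the
 * pointer uses the same convention as the function it points to. */
typedef int (SKETCH_CUDAAPI * SketchInitFn) (unsigned int flags);

/* Stand-in for a driver entry point (assumption: returns 0 on success). */
static int SKETCH_CUDAAPI
sketch_init (unsigned int flags)
{
  return flags == 0 ? 0 : 1;
}

/* Stores the symbol in a correctly typed pointer and calls through it. */
static int
sketch_call_init (unsigned int flags)
{
  SketchInitFn fn = sketch_init;

  return fn (flags);
}
```

On x64 the macro has no effect on the generated call, which matches the observation above that the mismatch only showed up on x86.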
2020-01-20 17:15:55 +05:30
Nirbheek Chauhan
7e93ae0638 nvcodec: cuda.h only needs glib.h, not gst.h
Just a nitpick. Also, force the compiler to use our stub header
instead of searching for it in the include paths.
2020-01-20 15:10:51 +05:30
Seungha Yang
e8d527df93 nvenc: Query supported minimum resolution
Hard-coded 16x16 resolution is likely to differ from the device's support
in most cases. If we can use NV_ENC_CAPS_WIDTH_MIN and NV_ENC_CAPS_HEIGHT_MIN,
update pad template with returned value.
2020-01-16 15:24:03 +00:00
Seungha Yang
58a663c1e5 nvcodec: Bump SDK header to version 9.1
Update header to query minimum resolution of encoder and to control
the number of reference frames if it's supported
2020-01-16 15:24:03 +00:00
Seungha Yang
49bccf0433 nvcodec: Refactor plugin initialization
Create CUDA context per device, instead of per codec and encoder/decoder.
Allocating a CUDA context is a heavy operation so we should reuse it
as much as possible.

Fixes: https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/issues/1130
2019-12-24 08:10:14 +00:00
Seungha Yang
0cf67c3be7 nvenc: Fix crash when nvenc was reused then freed without encoding
GstNvBaseEnc::n_bufs was set from the previous encoding session
but it wasn't cleared after stop. That might result in invalid memory
access at the next start (no encoded data) and then stop sequence.
Instead of defining a variable for array length, use GArray::len
directly to avoid such confusion.
2019-11-22 03:02:57 +00:00
Seungha Yang
aef414375a nvenc: Remove unused code path
refilling queue would not happen
2019-11-22 03:02:57 +00:00
Aaron Boxer
6d3429af34 documentation: fixed a heap o' typos
2019-11-05 09:11:25 -05:00
Tim-Philipp Müller
f218ec2794 Remove autotools build system
2019-10-14 13:54:27 +01:00
Seungha Yang
52dfbbe5da nvenc: Early terminate handle_frame if the last flow was not GST_FLOW_OK
If the last flow was not GST_FLOW_OK, the encoding thread is not running
and there is nothing to pop from GAsyncQueue (this causes deadlock).

To prevent deadlock, just return the handle_frame without further encoding
process if the last flow was not GST_FLOW_OK. Note that the last flow
will be cleared per FLUSH_STOP and STREAM_START event.
2019-09-11 15:21:03 +00:00
Seungha Yang
68a51abdcd nvenc: Add support VUYA format
The addition is very simple. Map NV_ENC_BUFFER_FORMAT_AYUV format
to GST_VIDEO_FORMAT_VUYA and add a condition for the VUYA format.
2019-09-11 14:33:54 +00:00
Seungha Yang
14b9a1cffd nvdec: Add support for mpeg4 video decoding with codec_data
Decoder should handle codec_data of mpeg4 video which includes essential
config data.
2019-09-11 14:00:48 +00:00
Seungha Yang
af77988b9f nvenc: Reduce the number of pre-allocated device memory
The hard-coded upper bound 32 (or 48 depending on resolution) might waste
GPU memory and high resolution encoding causes OUT-OF-MEMORY allocation error
quite easily. This commit calculates the number of required pre-allocated
device memory based on encoding options and it can reduce the amount of device memory
used by nvenc.
2019-09-11 11:44:03 +00:00
Seungha Yang
e3508a4f26 nvdec: Update plugin description and fix typo
Use consistent description with nvenc, and fix typo s/devide/device/g
2019-09-11 15:16:45 +09:00
Seungha Yang
1cbb23cf79 nvenc: Adjust DTS when bframe is enabled
NVDEC driver always uses input timestamp without adjustment
even if bframe encoding was enabled.
So DTS can be larger than PTS when bframe was enabled.
To ensure PTS >= DTS, we should adjust the timestamp manually
based on the PTS difference between the first
encoded frame and the second one. That's also the maximum PTS/DTS
difference.
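One possible reading of the adjustment described above, as a sketch only: take the PTS difference between the first and second frames as a fixed offset and subtract it from each input timestamp to obtain the DTS. The function name `sketch_compute_dts` and the assumption that the i-th output DTS is derived from the i-th input timestamp are illustrative and are not taken from the nvenc source.

```c
#include <stddef.h>
#include <stdint.h>

/* Fills out_dts[i] = input_pts[i] - (input_pts[1] - input_pts[0]).
 * With fewer than two frames no offset can be derived, so DTS == PTS.
 * Signed 64-bit values are used because the first DTS becomes negative. */
static void
sketch_compute_dts (const int64_t * input_pts, size_t n, int64_t * out_dts)
{
  int64_t offset = 0;
  size_t i;

  if (n >= 2)
    offset = input_pts[1] - input_pts[0];

  for (i = 0; i < n; i++)
    out_dts[i] = input_pts[i] - offset;
}
```

For input timestamps 0, 33 and 66 this yields DTS values -33, 0 and 33, so with one B-frame (decode order PTS 0, 66, 33) every frame still satisfies PTS >= DTS.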
2019-09-11 13:18:12 +09:00
Seungha Yang
83a1c7a9a6 nvenc: Add qp-{min,max,const}-{i,p,b} properties
These new properties allow more detailed target QP value settings
2019-09-11 13:18:12 +09:00
Seungha Yang
d3a909ccdd nvenc: Add properties to support bframe encoding if device supports it
Note that bframe encoding capability varies with GPU architecture
2019-09-11 13:18:12 +09:00
Seungha Yang
94f2843774 nvenc: Refactoring internal buffer pool structure
To support rc-lookahead and bframe encoding, nvenc needs one more
staging queue, because NvEncEncodePicture can return NV_ENC_ERR_NEED_MORE_INPUT,
which was not considered so far.
As documented by the NVENC programming guide, pending buffers should wait for
other inputs until NvEncEncodePicture returns success.

New encoding flow is
- Submit raw picture buffer to encoder with NvEncEncodePicture
- The submitted input/output buffer pair will be queued to pending_queue
  - If NvEncEncodePicture returned success, then move all pairs in pending_queue
    to final stage
  - Otherwise, wait for more input raw pictures.
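The flow above can be sketched with plain arrays standing in for the queues. This is a simplified illustration, not the plugin's data structures: `SketchQueues`, `SketchStatus`, `sketch_submit` and the fixed capacity are hypothetical, and the encoder status is passed in by the caller instead of coming from NvEncEncodePicture.

```c
#include <stddef.h>

#define SKETCH_MAX_ITEMS 16     /* arbitrary capacity for the sketch */

typedef enum
{
  SKETCH_SUCCESS,
  SKETCH_NEED_MORE_INPUT
} SketchStatus;

typedef struct
{
  int pending[SKETCH_MAX_ITEMS];        /* submitted input/output pair ids */
  size_t n_pending;
  int final_stage[SKETCH_MAX_ITEMS];    /* pairs whose output can be fetched */
  size_t n_final;
} SketchQueues;

/* Queues the submitted pair as pending. If the encoder reported success,
 * every pending pair moves to the final stage in submission order.
 * Returns 0 on success, -1 if a queue is full (in the second check the
 * pair simply stays pending). */
static int
sketch_submit (SketchQueues * q, int pair_id, SketchStatus status)
{
  size_t i;

  if (q->n_pending >= SKETCH_MAX_ITEMS)
    return -1;
  q->pending[q->n_pending++] = pair_id;

  if (status != SKETCH_SUCCESS)
    return 0;                   /* wait for more input raw pictures */

  if (q->n_final + q->n_pending > SKETCH_MAX_ITEMS)
    return -1;

  for (i = 0; i < q->n_pending; i++)
    q->final_stage[q->n_final++] = q->pending[i];
  q->n_pending = 0;

  return 0;
}
```

Submitting pairs 1 and 2 with `SKETCH_NEED_MORE_INPUT` leaves both pending; submitting pair 3 with `SKETCH_SUCCESS` then moves all three to the final stage at once.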

Another change is dropping NV_ENC_LOCK_INPUT_BUFFER usage.
So now nvenc always uses CUDA memory input buffer. As a result,
both opengl and system memory handling are unified.
2019-09-11 13:18:12 +09:00
Seungha Yang
e73acbaa5c nvenc: Remove pointless iteration and cleanup some code
* The number of iterations is always one, so the iteration is useless
and it makes the code complicated.
* Also, defining a named structure can make the code more readable.
* g_free is null safe
2019-09-11 13:18:12 +09:00
Seungha Yang
81272eaa82 nvenc: Add more rate-control options
New rate-control modes are introduced (if device can support)
* cbr-ld-hr: CBR low-delay high quality
* cbr-hq: CBR high quality
* vbr-hq: VBR high quality

Also, various configurable rate-control related properties are added.
2019-09-11 13:18:12 +09:00
Seungha Yang
ea19a7c715 nvenc: Add support for weighted prediction option
Note that this property will be exposed only if the device
supports the weighted prediction.
2019-09-11 13:18:12 +09:00
Seungha Yang
d05cbdbd72 nvenc: Add property for AUD insertion
Make AUD insertion configurable option
2019-09-11 13:18:12 +09:00
Seungha Yang
b3b723462e nvenc: Refactor class hierarchy to handle device capability dependent options
Introducing new dynamic class between GstNvBaseEncClass and
each subclass to be able to access device specific properties and
capabilities from each subclass implementation side.
2019-09-11 13:18:09 +09:00
Marc Leeman
3ef503346a nvcodec: minor spell corrects in log messages
2019-09-10 23:13:17 +00:00
Seungha Yang
09fd34dbb0 nvenc: Add support runtime resolution change freely
Do not restrict allowed maximum resolution depending on the
initial resolution. If new resolution is larger than previous one,
just re-init encode session.
2019-09-02 10:59:03 +09:00
Seungha Yang
fa83f086be nvdec: Check flow return of the only current handle_frame() to fix seeking issue
Due to uncleared last flow, decoding after seek was never possible
(last_ret == GST_FLOW_FLUSHING).
nvdec does not need to keep track of the previous flow return,
and actually the interest is the data/event flow of the current handle_frame().
2019-08-30 11:27:27 +09:00
Seungha Yang
be3a3da829 nvdec: Fallback to system memory if OpenGL context could not support PBO memory
If the environment could not support OpenGL PBO memory, nvdec will do negotiation
with system memory as fallback.
2019-08-30 01:36:46 +09:00
Seungha Yang
069fe93452 nvdec: Add support dynamic output format change
Implementing ::negotiate() method to support runtime output format
change. If downstream was reconfigured, baseclass will invoke
::negotiate() method, and nvdec should update output memory
type depending on downstream caps.
2019-08-30 01:36:46 +09:00
Seungha Yang
39f800c449 nvdec: Re-negotiate whenever output format is changed
Input stream might be silently changed without ::set_format() call.
Since nvdec has internal parser, nvdec element can figure out the format change
by itself.
2019-08-30 01:36:41 +09:00
Seungha Yang
f4f8941a91 nvdec: Add support 4:4:4 and 4:2:0 12bit decoding
Depending on GPU architecture, HEVC decoder can support
4:4:4 format up to 12 bitdepth. This commit covers VP9 4:2:0 12 bits
decoding also.
2019-08-29 13:39:59 +00:00
Seungha Yang
ff9838fd3d nvenc: Add support for old drivers which could not understand SDK version 9.0
Add helper functions to support old drivers
with our previous SDK version 8.1
2019-08-29 13:39:59 +00:00
Seungha Yang
afebb15d99 nvenc: Use consistent snake case convention
2019-08-29 13:39:59 +00:00
Seungha Yang
1010b9f567 nvcodec: Bump SDK header to version 9.0
The latest Turing architecture (e.g., RTX series) can support
decoding HEVC 4:4:4 format up to 12bits.
2019-08-29 13:39:59 +00:00
Seungha Yang
338a32b672 nvenc: Port to GstCudaGraphicsResource
Register openGL resource only once per memory. Also if upstream
provides the registered information, reuse the information
instead of doing it again. This can improve performance dramatically
depending on system since the resource registration might cause
high overhead.
2019-08-29 18:45:25 +09:00
Seungha Yang
d0846f8eab nvdec: Port to GstCudaGraphicsResource
Make it possible to share registered graphics resource among nvidia encoders
and decoders.
2019-08-29 18:05:51 +09:00
Seungha Yang
da075b94a9 cudautils: Add GstCudaGraphicsResource structure for better openGL interoperability
Introduce GstCudaGraphicsResource structure to represent registered
CUDA graphics resources and to enable sharing the information among
nvdec and nvenc. This structure can reduce the number of resource
registration which cause high overhead.
2019-08-29 18:04:33 +09:00
Seungha Yang
8dc2b4a393 nvdec: Port to openGL PBO memory
For openGL interoperability, nvdec uses cuGraphicsGLRegisterImage API
which is to register openGL texture image.
Meanwhile nvenc uses cuGraphicsGLRegisterBuffer API to register openGL buffer object.
That means two kinds of graphics resources are registered per memory
when nvdec/nvenc are configured at the same time.
The graphics resource registration brings possibly high overhead
so the registration should be performed only once per resource
from optimization point of view.
2019-08-29 18:04:33 +09:00
Seungha Yang
9bfd6d13e6 nvdec: Filter openGL API version to use
To ensure PBO buffer, openGL API >= 3 is required.
2019-08-29 18:04:29 +09:00
Seungha Yang
807e311ae8 nvdec: Always response QUERY_CONTEXT even if openGL is unavailable on the system
nvdec can respond to the CUDA context type query regardless of openGL
availability.
2019-08-21 14:14:07 +09:00
Seungha Yang
4f60117db9 nvdec: Fix possible null object unref
gst_query_get_n_allocation_pools > 0 does not guarantee that
the N th internal array has GstBufferPool object. So users should
check the returned GstBufferPool object from
gst_query_parse_nth_allocation_pool.
2019-08-20 10:14:54 +09:00
Seungha Yang
eab564d857 nvcodec: Use default flag for CUDA stream creation
Since nvdec/nvenc engine is running on default stream,
non-default CUDA stream should be synchronized with default
stream eventually.
2019-08-19 07:13:26 +00:00
Seungha Yang
ca6657367c nvenc: Use non default CUDA stream and async operation
Use CUDA async operation if possible with non default CUDA stream
2019-08-19 01:18:52 +00:00
Seungha Yang
5615e9258f nvdec: Don't use default CUDA stream
An async CUDA operation with the default stream (NULL CUstream) is not much
more beneficial than a blocking operation, since all CUDA operations which belong
to the CUDA context will be synchronized with the default stream's operation.
Note that a CUDA stream shares all resources of the corresponding CUDA context,
but it can help parallel operation, similar to the relation between thread and process.
2019-08-19 01:18:52 +00:00
Seungha Yang
20d8f54e63 nvdec: Push/Pop CUDA context around library API call
2019-08-19 01:18:52 +00:00
Seungha Yang
f7b2b1b99d nvdec: Fix timestamp mismatch on draining frames
The internal decoding state must be GST_NVDEC_STATE_PARSE before
calling CuvidParseVideoData(). Otherwise, nvdec will be confused
in the decode callback, as if the frame were a decode-only frame, and
the input timestamp of the corresponding frame will be ignored.
Eventually one decoded frame will have a non-increasing PTS.
2019-08-18 15:52:32 +09:00
Seungha Yang
b64733972e nvdec: Do not access nvdec object from destroy function of qdata
The destroy callback can be called just before the finalization of
GstMiniObject. So the nvdec object might be destroyed already.
Instead, store the GstCudaContext with increased ref to safely
unregister the CUDA resource.
2019-08-16 19:40:31 +09:00
Seungha Yang
e6d21d048a nvenc: Add support YV12 format
YV12 format is supported by Nvidia NVENC without manual conversion.
So nvenc is exposing YV12 format at sinkpad template but there is some
missing point around uploading the memory to GPU.
2019-08-09 11:43:22 +09:00
Seungha Yang
8dbaed0af7 nvh265enc: Enable HDR related SEI nal insertion
If upstream provides the HDR related information, create SEI message
nals and pass them to NVENC.
2019-08-08 23:18:14 +09:00
Seungha Yang
f3e12a0b56 nvh265enc: Add support YUV 444 10bits encoding
Note that h264 encoder does not support the YUV 444 10bits format
2019-08-08 00:46:16 +09:00
Seungha Yang
fa5e6f546b nvenc: Remove unnecessary constraint from YUV420 10bits capability decision
YUV444 capability shouldn't be applied to YUV420 10 bits format
2019-08-08 00:46:12 +09:00
Seungha Yang
cc4d0e91e3 nvenc: Fix broken RGB format support
Add missing format check introduced by the commit 7de4dbdeb2
2019-08-07 07:27:36 +00:00
Seungha Yang
9d0545d1a2 nvcodec: Wrap CUDA API return check with gst_cuda_result
The gst_cuda_result macro function is more helpful for debugging
than previous cuda_OK because gst_cuda_result prints the function
and line number. If the CUDA API return was not CUDA_SUCCESS,
gst_cuda_result will print WARNING level debug message with
error name, error text strings.
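A sketch of how such a checking macro can capture the call site, assuming nothing about the real `gst_cuda_result` beyond what is described above. `SketchResult`, `sketch_result_check` and `sketch_cuda_result` are hypothetical names, and the warning goes to stderr instead of the GStreamer debug log.

```c
#include <stdio.h>

/* Hypothetical stand-ins for CUDA result codes. */
typedef enum
{
  SKETCH_CUDA_SUCCESS = 0,
  SKETCH_CUDA_ERROR = 1
} SketchResult;

/* Returns 1 on success. On failure, prints a warning containing the
 * calling function and line number, then returns 0. */
static int
sketch_result_check (SketchResult res, const char *func, int line)
{
  if (res != SKETCH_CUDA_SUCCESS) {
    fprintf (stderr, "WARNING: %s:%d: CUDA call failed, code %d\n",
        func, line, (int) res);
    return 0;
  }

  return 1;
}

/* The macro expands at the call site, so __func__ and __LINE__ refer to
 * the caller, not to the helper above. */
#define sketch_cuda_result(res) \
  sketch_result_check ((res), __func__, __LINE__)
```

Wrapping the check in a macro is what lets the location be recorded without every caller passing it explicitly.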
2019-08-07 00:59:36 +00:00
Seungha Yang
d69b590683 nvdec: Port to GstCUDAContext
... and drop CUvideoctxlock usage. The CUvideoctxlock basically
has the identical role of cuda context push/pop but nvdec specific
way. Since we can share the CUDA context among encoders and decoders,
use CUDA context directly for accessing GPU API.
2019-08-07 00:59:36 +00:00
Seungha Yang
5cf0351418 nvenc: Port to GstCudaContext
... and add support CUDA context sharing similar to glcontext sharing.
Multiple CUDA contexts per GPU are not the best practice. The context
sharing method is very similar to that of glcontext. The difference
is that there can be multiple context objects on a pipeline since
the CUDA context is created per GPU id. For example, if a pipeline
has nvh264dec (uses GPU #0) and nvh264device0dec (uses GPU #1),
then two CUDA contexts will be propagated to the whole pipeline.
2019-08-07 00:59:36 +00:00
Seungha Yang
094e4a9f5c nvcodec: Introduce NVIDIA CUDA helpers
New object and helper functions can remove duplicated code
from nvenc/nvdec. Also this is prework for CUDA device context sharing
among nvdec(s)/nvenc(s).
2019-08-07 00:59:36 +00:00
Seungha Yang
7de4dbdeb2 nvenc: Return profile compatible input formats from GstVideoEncoder::getcaps
Do not accept any input formats which could not be supported
by downstream requested codec profiles.
2019-08-06 15:03:22 +00:00
Seungha Yang
9e81f8e700 nvenc: Fix caps negotiation failure on unspecified interlace-mode
During GstVideoInfo conversion from GstCaps, interlace-mode is
inferred to progressive so unspecified interlace-mode should not cause any
negotiation issue. Simply set GST_PAD_FLAG_ACCEPT_INTERSECT flag
on sinkpad to fix issue.
2019-08-06 15:03:22 +00:00
Seungha Yang
b43d0f785c nvenc: Remove unused member variables
Supported interlace-mode and codec profiles are checked
during plugin init and those values are never used.
2019-08-06 15:03:22 +00:00
Seungha Yang
f7f9f327cd nvdec: Respect upstream provided timestamp
Decoder sometimes reports nonincreasing timestamp.
Use input frame's timestamp like other decoder elements.
2019-08-05 20:32:39 +00:00
Seungha Yang
e68bfd7566 nvenc: Add support RGB 8/10bits formats
BGRA/RGBA/RGB10A2/BGR10A2 formats can be supported by nvenc.
Depending on device, supported format can be different.

Fixes: https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/issues/1038
2019-08-05 18:55:28 +00:00
Seungha Yang
c99b160b50 nvdec: Use upstream framerate if possible
Encoded bitstream might not have valid framerate. If upstream
provided non-variable-framerate (i.e., fps_n > 0 and fps_d > 0)
use upstream framerate instead of parsed one.
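The selection rule described above, as a small sketch: prefer the upstream framerate when both its numerator and denominator are positive, otherwise fall back to the value parsed from the bitstream. `SketchFraction` and `sketch_pick_framerate` are hypothetical names; only the `fps_n > 0 and fps_d > 0` condition comes from the message.

```c
/* Framerate as a fraction: fps_n / fps_d. */
typedef struct
{
  int fps_n;
  int fps_d;
} SketchFraction;

/* Returns the upstream framerate if it is a fixed, valid fraction,
 * otherwise the framerate reported by the parser. */
static SketchFraction
sketch_pick_framerate (SketchFraction upstream, SketchFraction parsed)
{
  if (upstream.fps_n > 0 && upstream.fps_d > 0)
    return upstream;

  return parsed;
}
```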
2019-08-05 15:32:43 +00:00
Seungha Yang
158b4d8649 nvenc: Fix crash with unspecified framerate
Nvidia driver seems to be calculating floating point framerate
without validation. This causes a crash on both Linux and Windows.

Fixes: https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/issues/1012
2019-08-05 15:32:43 +00:00
Seungha Yang
2a76807c9a configure: Update for nvcodec dependency change
nvcodec is compilable without external dependency
2019-07-31 15:36:04 +00:00
Seungha Yang
f1cbab7cfd nvdec: Fix build warning error
gstnvdec.c:1222:3: error: implicit declaration of function ‘memset’ [-Werror=implicit-function-declaration]
   memset (&type_info, 0, sizeof (type_info));
   ^~~~~~
2019-07-31 15:36:04 +00:00
Seungha Yang
4fa5a82762 nvenc: Fix build error with x86 msvc
__stdcall is accepted or ignored by the compiler on x64, but that
is not the case on x86. So the function definition should be consistent
with declaration.

Fixes: https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/issues/1039
2019-07-30 19:12:46 +09:00
Seungha Yang
0445ed6ba5 nvenc: Fix deadlock when pad_push return was not GST_FLOW_OK
Encoding thread is terminated without any notification so
upstream streaming thread is locked because there is nothing
to pop from GAsyncQueue. If downstream returns an error,
we need to put SHUTDOWN_COOKIE into the GAsyncQueue so that the chain function
can wake up.
2019-07-30 17:49:25 +09:00
Seungha Yang
3faf439347 nvcodec: Fix broken ABI in cuda stub header to fix nvenc with opengl
Fix the broken ABI introduced by the commit 367e742e5d
From CUDA Toolkit 3.2, size_t has been used in CUDA_MEMCPY2D structure
instead of unsigned int.
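To illustrate why the field type matters for the ABI, here is a sketch with two heavily trimmed, hypothetical layouts. The field names follow CUDA_MEMCPY2D, but the real structure has many more members, and `SketchMemcpy2DOld`, `SketchMemcpy2DNew` and `sketch_layouts_differ` exist only for this example.

```c
#include <stddef.h>

/* Trimmed layout using unsigned int for the size fields. */
typedef struct
{
  unsigned int srcXInBytes;
  unsigned int srcY;
  const void *srcHost;
} SketchMemcpy2DOld;

/* Same fields declared with size_t. */
typedef struct
{
  size_t srcXInBytes;
  size_t srcY;
  const void *srcHost;
} SketchMemcpy2DNew;

/* Returns non-zero when the two declarations place srcHost at different
 * offsets, i.e. when code built against one layout would misread a
 * structure filled in using the other. */
static int
sketch_layouts_differ (void)
{
  return offsetof (SketchMemcpy2DOld, srcHost) !=
      offsetof (SketchMemcpy2DNew, srcHost);
}
```

On platforms where `size_t` is wider than `unsigned int` (typical 64-bit targets) the offsets differ, so a stub header with the wrong field type silently passes misplaced values to the driver.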
2019-07-30 11:13:18 +09:00
Seungha Yang
694f91da88 nvdec: Make OpenGL dependency optional
By adding system memory support for nvdec, both en/decoder
in the nvcodec plugin are usable regardless of
OpenGL dependency. Besides, the direct use of system memory
might have less overhead than OpenGL memory depending on use cases.
(e.g., transcoding using S/W encoder)
2019-07-26 00:01:23 +00:00
Seungha Yang
733c109ce9 nvcodec: Clean up pointless return values around plugin init
Any plugin which returned FALSE from plugin_init will be blacklisted,
so the plugin will be unusable even if a user installs the required runtime
dependency next time. That's the reason why nvcodec always returns TRUE.

This commit is to remove possible misreading code.
2019-07-25 08:47:50 +00:00
Seungha Yang
7b9045d846 nvcodec: Change log level for g_module_open failure
Since we build nvcodec plugin without external CUDA dependency,
CUDA and en/decoder library loading failure can be natural behavior.

Emit error only when the module was opened but required symbols are missing.
2019-07-25 08:47:50 +00:00
Seungha Yang
e5a98cf9d8 nvdec: Add support for 10bits 4:2:0 decoding
This commit includes h265 main-10 profile support if the device can
decode it.

Note that since h264 10bits decoding is not supported by nvidia GPU for now,
the additional code path for h264 high-10 profile is a preparation for
the future Nvidia's enhancement.
2019-07-25 08:06:26 +00:00
Seungha Yang
d692350fc3 nvdec: Specify supported profiles of h264/h265 codec
See more details about supported formats at
nvidia codec sdk document "NVDEC_VideoDecoder_API_ProgGuide.pdf"
Table 1. Hardware Video Decoder Capabilities.

Fixes: https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/issues/926
2019-07-25 08:06:26 +00:00
Seungha Yang
c8640e23f4 nvdec: Skip draining before creating internal parser
GstVideoDecoder::drain/flush can be called at very initial state
with stream-start and flush-stop event, respectively.
Draining with NULL CUvideoparser seems to be unsafe, and that eventually
failed to handle it.
2019-07-25 07:11:04 +00:00
Seungha Yang
367e742e5d nvcodec: Drop system installed cuda.h dependency
... and add our stub cuda header.

Newly introduced stub cuda.h file is defining minimal types in order to
build nvcodec plugin without system installed CUDA toolkit dependency.
This will make cross-compile possible.
2019-07-23 16:32:31 +09:00
Seungha Yang
a2ada54265 nvcodec: Keep requested rank for default device
Fix for default encoder and decoder element factory to make them have
higher rank than the others.
2019-07-23 10:28:52 +09:00
Seungha Yang
92afa74939 nvenc: Register elements per GPU device with capability check
* By this commit, if there are more than one device,
nvenc element factory will be created per
device like nvh264device{device-id}enc and nvh265device{device-id}enc
in addition to nvh264enc and nvh265enc, so that the element factory
can expose the exact capability of the device for the codec.

* Each element factory will have fixed cuda-device-id
which is determined during plugin initialization
depending on the capability of corresponding device.
(e.g., when only the second device can encode h265 among two GPUs,
then nvh265enc will choose "1" (zero-based numbering)
as its target cuda-device-id). As we have an element factory
per GPU device, the "cuda-device-id" property is changed to read-only.

* nvh265enc gains the ability to encode
4:4:4 8bits, 4:2:0 10 bits formats and up to 8K resolution
depending on device capability.
Additionally, I420 GLMemory input is supported by nvenc.
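The naming scheme above can be sketched as a small formatting helper. `sketch_factory_name` and its parameters are hypothetical and not part of the plugin; only the `nvh264enc` / `nvh264device{device-id}enc` patterns come from the message.

```c
#include <stdio.h>

/* Writes the element factory name for an encoder into buf.
 * codec is e.g. "h264" or "h265". When is_default is non-zero the plain
 * name is produced, otherwise the per-device name with the device id.
 * Returns the value of snprintf (length the full name requires). */
static int
sketch_factory_name (char *buf, size_t len, const char *codec,
    unsigned int device_id, int is_default)
{
  if (is_default)
    return snprintf (buf, len, "nv%senc", codec);

  return snprintf (buf, len, "nv%sdevice%uenc", codec, device_id);
}
```

For codec "h265" and device id 1 this produces `nvh265device1enc`, and the default variant produces `nvh265enc`.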
2019-07-22 21:01:41 +00:00
Seungha Yang
0239152bca nvdec: Create CUDA context with registered device id
Only the default device has been used by NVDEC so far.
This commit makes it possible to use registered device id.
To simplify device id selection, GstNvDecCudaContext usage is removed.
2019-07-22 17:39:45 +00:00
Seungha Yang
1df2f13d0c nvdec: Register elements per device/codec with capability check
By this commit, each codec has its own element factory so the
nvdec element factory is removed. Also, if there are more than one device,
additional nvdec element factory will be created per
device like nvh264device{device-id}dec, so that the element factory
can expose the exact capability of the device for the codec.
2019-07-22 17:39:45 +00:00
Seungha Yang
afe3c7e3ef nvcodec: Drop cudaGL.h dependency
nvcodec does not use any type/define/enum in cudaGL.h.
2019-07-22 23:11:14 +09:00
Seungha Yang
48a6641717 nvdec: Fix video stuttering issue with VP9
Address nvidia driver specific behavior to avoid unexpected frame mismatch
between GStreamer and NVDEC.
2019-07-19 18:44:32 +09:00
Seungha Yang
8018fa2526 nvdec: Drop async queue and handle data on callback of CUvideoparser
Callbacks of CUvideoparser are called on the streaming thread.
So the use of async queue has no benefit.

Make control flow straightforward instead of long while/switch loop.
2019-07-19 18:44:32 +09:00
Seungha Yang
8753561015 nvdec: Port to color_{primaries,transfer,matrix}_to_iso
... and update the color information only when upstream did not provide
the information.
2019-07-17 06:34:21 +00:00
Seungha Yang
e01c68524f nvenc: Specify colorimetry related VUI parameters
Set the colorimetry config for the information to be embedded in encoded bitstream.
2019-07-17 14:45:05 +09:00
Seungha Yang
8862abd7c6 nvdec: Fix possible frame drop on EOS
On eos, baseclass videoencoder calls finish() vfunc instead of drain()
2019-07-09 20:52:23 +09:00
Marc Leeman
489ff8604f nvcodec: do a generic cuda tests before going into version specifics
2019-07-08 10:37:46 +00:00
Seungha Yang
c18fda03d9 nvdec,nvenc: Port to dynamic library loading
... and put them into new nvcodec plugin.

* nvcodec plugin
Now each nvenc and nvdec element is moved to be a part of nvcodec plugin
for better interoperability.
Additionally, cuda runtime API header dependencies
(i.e., cuda_runtime_api.h and cuda_gl_interop.h) are removed.
Note that cuda runtime APIs have prefix "cuda". Since 1.16 release with
Windows support, only "cuda.h" and "cudaGL.h" dependent symbols have
been used except for some defined types. However, those types could be
replaced with other types which were defined by "cuda.h".

* dynamic library loading
CUDA library will be opened with g_module_open() instead of build-time linking.
On Windows, nvcuda.dll is installed to system path by CUDA Toolkit
installer, and on *nix, user should ensure that libcuda.so.1 is
loadable (i.e., via LD_LIBRARY_PATH or default dlopen path).
Therefore, NVIDIA_VIDEO_CODEC_SDK_PATH env build time dependency for Windows
is removed.
2019-07-08 10:37:46 +00:00