By passing NULL to `g_signal_new` instead of a marshaller, GLib will
actually internally optimize the signal (if the marshaller is available
in GLib itself) by also setting the valist marshaller. This makes the
signal emission a bit more performant than the regular marshalling,
which still needs to box into `GValue` and call libffi in case of a
generic marshaller.
Note that for custom marshallers, one would use
`g_signal_set_va_marshaller()` with the valist marshaller instead.
In future, a sub class of GstMsdkEncClass may decide a native format by
using this method, e.g. JPEG encoder may accept YUY2 input, however the
current implemation needs a conversion from YUY2 to NV12 before encoding.
In addtion, a sub class may choose a format for encoding if the input
format is not supported by MSDK, e.g. the current implemation does
UYVY->NV12 if the input format is UYVY. We may do UYVY->YUY2 for JPEG
encoder in future
MFX_FOURCC_BGR4 is mapped to VA_FOURCC_ABGR and JPEG encoder needs a
MFX_FOURCC_BGR4 frame for internal usage when the input format is
MFX_FOURCC_RGB4
This is a preparation for supporting native formats of JPEG encoder
Instead of using a proxy of `is_packetized` flag this patch
replaces it with the accessor to that flag in decoder base class,
avoiding probable mismatches.
commit 55c0d720 added the capability to handle non-packetized bitstream,
and there is a loop to handle multiple frames in a non-packetized buffer
in gst_msdkdec_handle_frame. However it is possible that a
non-packetized buffer still contains valid data but there is no long any
pending unfinished frame. Currently gst_video_decoder_decode_frame is
invoked to send a new frame with new input data, the situaltion is
repeated till an EOS is received. An application has to exit when
receiving an EOS, however there is still valid data in a
non-packetezied input buffer, hence some frames are dropped.
This fix adds a parse callback for non-packeteized input, a new frame
will be sent to the subclass as soon as the input buffer has valid data
This fixes https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/issues/665
Upon bitrate change, make sure to close the encoder otherwise
the encoder is not re-initialized and the target bitrate is
never reached, and the encoder was flushed at each frame
from this moment.
Regression introduced in f2b35abcab which replaced the call
that was closing the encoder by an early return to avoid
re-initialization.
gstwasapiutil.c(173) : warning C4715: 'gst_wasapi_device_role_to_erole': not all control paths return a value
gstwasapiutil.c(188) : warning C4715: 'gst_wasapi_erole_to_device_role': not all control paths return a value
The GstDeviceProvider isn't subclass of GstElement.
(gst-device-monitor-1.0:49356): GLib-GObject-WARNING **: 20:21:18.651:
invalid cast from 'GstWasapiDeviceProvider' to 'GstElement'
The first channel in memory for MFX_FOURCC_RGB4 (VA_FOURCC_ARGB or
GST_VIDEO_FORMAT_BGRA) is B, not A. In MSDK, channle B is used to access
data for RGB4 surface. In addition, the returned pointers for
MFX_FOURCC_AYUV and MFX_FOURCC_Y410 in gst_msdk_video_memory_map_full
were wrong too before this fix.
When the bitrate is changed in playing state the encoder issues a reconfig
that drains and recreates the underlaying hw encoder instance.
With this set of changes we ensure that all this work is only made when
the bitrate did actually change. It also tries to reuse the vpp buffer
pool and fixes the pool leak spotted when testing this feature.
When postpone_free_surface is TRUE, the output buffer is not writable,
however the base decoder needs a writable buffer as output buffer,
otherwise it will make a copy of the output buffer. As the underlying
memory is always lockable, so we may set the LOCKABLE flag for this buffer
to avoid buffer copy in the base class.
The refcount of the output buffer is 1 when postpone_free_surface is
FALSE, so needn't set the LOCKABLE flag for this case.
... instead of calculated display ratio from given PAR and DAR.
d3d11window calculates output display ratio
to decide padding area per window resize event. In the formula,
actual PAR is required to handle both 1:1 PAR and non-1:1 PAR.
Both MSDK and this plugin use mfxFrameAllocResponse for video and DMABuf
memory, it is possible that some GST buffers are still in use when calling
gst_msdk_frame_free, so add a reference count in the wrapper of
mfxFrameAllocResponse (GstMsdkAllocResponse) to make sure the underlying
mfx resources are still available if the corresponding buffer pool is in
use.
In addtion, currently all allocators for input or output share the same
mfxFrameAllocResponse pointer in an element, so it is possible that
the content of mfxFrameAllocResponse is updated for a new caps then all
GST buffers allocated from an old allocator will use this new content of
mfxFrameAllocResponse, which will result in unexpected behavior. In this
fix, we save the the content of mfxFrameAllocResponse in the corresponding
tructure to avoid such issue
Sample pipeline:
gst-launch-1.0 filesrc location=vp9_multi_resolutions.ivf ! ivfparse ! msdkvp9dec !
msdkvpp ! video/x-raw\(memory:DMABuf\),format=NV12 ! glimagesink
Otherwise it is possible that different wrappers share the same
mfxFrameAllocResponse pointer, so instead of caching the pointer, we may
cache the content of mfxFrameAllocResponse
For a skipped frame in VC1, MSDK returns the mfx surface of the reference
frame, so we have to make sure the corresponding surface for the
reference frame is not freed. In this fix, we postpone surface free because
we don't know whether a surface is referenced
Before this fix, the error is like as below:
New clock: GstSystemClock
0:00:00.181793130 23098 0x55f8a9d622d0 ERROR msdkdec
gstmsdkdec.c:622:gst_msdkdec_finish_task:<msdkvc1dec0> Couldn't find the
cached MSDK surface
Sample pipeline:
gst-launch-1.0 filesrc location=input_has_skipped_frame.wmv ! asfdemux !
vc1parse ! msdkvc1dec ! glimagesink
If the surface is not in use, we may release it even if GST_FLOW_OK is going
to be returned, which may avoid the issue of failing to get surface
available
This fixes the regression caused by commit c05acf4
GstAllocationParams::align is set to 31 in msdkdec/msdken/msdkvpp, hence
the stride align should be greater than or equal to 31, otherwise it
will result in issue
https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/issues/861
(msdk: "GStreamer-CRITICAL: gst_buffer_resize_range failed" SPAM),
In addition, the stride should match the pitch alignment in the media driver,
otherwise it will result in some issues when a buffer is shared between
different elements, e.g. the NV12 issue mentioned in commit 3f2314a, which
can be reproduced by `gst-launch-1.0 vidoetestsrc ! msdkvpp !
video/x-raw\(memory:DMABuf\),format=NV12 ! glimagesink`
Fixed https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/issues/861
For some hevc 10bit 4K encoding cases, the encoding process may be
slow, and MediaSDK surface can't be released in time before one other
available surface is needed. So add an extra surface for hevc encoding
to avoid this issue.
If the last flow was not GST_FLOW_OK, the encoding thread is not running
and there is nothing to pop from GAsyncQueue (this causes deadlock).
To prevent deadlock, just return the handle_frame without further encoding
process if the last flow was not GST_FLOW_OK. Note that the last flow
will be cleared per FLUSH_STOP and STREAM_START event.
The hard-coded upper bound 32 (or 48 depending on resolution) might waste
GPU memory and high resolution encoding causes OUT-OF-MEMORY allocation error
quite easily. This commit calculates the number of required pre-allocated
device memory based on encoding options and it can reduce the amount of device memory
used by nvenc.
NVDEC driver always uses input timestamp without adjustment
even if bframe encoding was enabled.
So DTS can be larger than PTS when bframe was enabled.
To ensure PTS >= DTS, we should adjust the timestamp manually
based on the PTS difference between the first
encoded frame and the second one. That's also the maximum PTS/DTS
difference.
To support rc-lookahead and bframe encoding, nvenc needs one more
staging queue, because NvEncEncodePicture can return NV_ENC_ERR_NEED_MORE_INPUT
but which was not considered so far.
As documented by NVENC programming guide, pending buffers should wait
other inputs until NvEncEncodePicture returns success.
New encoding flow is
- Submit raw picture buffer to encoder with NvEncEncodePicture
- The submitted input/output buffer pair will be queued to pending_queue
- If NvEncEncodePicture returned success, then move all pair in pending_queue
to final stage
- Otherwise, wait more input raw pictures.
Another change is dropping NV_ENC_LOCK_INPUT_BUFFER usage.
So now nvenc always uses CUDA memory input buffer. As a result,
both opengl and system memory handling are unified.
* The number of iteration is always one so the iteration is useless
and that makes code complicated.
* Also defining named structure can code mroe readable.
* g_free is null safe
New rate-control modes are introduced (if device can support)
* cbr-ld-hr: CBR low-delay high quality
* cbr-hq: CBR high quality
* vbr-hq: VBR high quality
Also, various configurable rate-control related properties are added.
Introducing new dynamic class between GstNvBaseEncClass and
each subclass to be able to access device specific properties and
capabilities from each subclass implementation side.
Add new macro for sink/src pad template to ensure no DMABuf caps
features are exposed on Windows. Some DMABuf caps features
were not handled by the commit 9ec62418c3
gst_buffer_make_writable() requires exclusive reference to the
GstMemory so the _make_writable() for the msdk buffer will result
to fallback system memory copy, because the msdk memory were initialized
with GST_MEMORY_FLAG_NO_SHARE flag.
Note that, disable sharing GstMemory brings high overhead but actually
the msdk memory objects can be shared over multiple buffers.
If the memory is not shareable, newly added GstAllocator::mem_copy will
create copied msdk memory.
Sometimes a HEVC/H265 stream doesn't have a valid profile but MSDK can
handle this stream. Like vaapih265dec, msdkh265dec may advertise the sink
caps without profile
DecodedOrder was deprecated in msdk-2017 version, but some customers
still use this for low-latency streaming of non-b-frame encoded streams,
which needs to output the frame at once
Asking decklink to render audio data seems to be based entirely on
the sample counts which completely disregards the timestamps
we pass to decklink. As a result, we need to explicitly check
for late buffers and drop them ourselves.
Do not restrict allowed maximum resolution depending on the
initial resolution. If new resolution is larger than previous one,
just re-init encode session.
Due to uncleared last flow, decoding after seek was never possible
(last_ret == GST_FLOW_FLUSHING).
nvdec dose not need to keep track of the previous flow return,
and actually the interest is data/even flow of the current handle_frame().
Implementing ::negotiate() method to support runtime output format
change. If downstream was reconfigured, baseclass will invoke
::negotiate() method, and nvdec should update output memory
type depending on downstream caps.
Input stream might be silently changed without ::set_format() call.
Since nvdec has internal parser, nvdec element can figure out the format change
by itself.
Register openGL resource only once per memory. Also if upstream
provides the registered information, reuse the information
instead of doing it again. This can improve performance dramatically
depending on system since the resource registration might cause
high overhead.
Introduce GstCudaGraphicsResource structure to represent registered
CUDA graphics resources and to enable sharing the information among
nvdec and nvenc. This structure can reduce the number of resource
registration which cause high overhead.
For openGL interoperability, nvdec uses cuGraphicsGLRegisterImage API
which is to register openGL texture image.
Meanwhile nvenc uses cuGraphicsGLRegisterBuffer API to registure openGL buffer object.
That means two kinds of graphics resources are registered per memory
when nvdec/nvenc are configured at the same time.
The graphics resource registration brings possibly high overhead
so the registration should be performed only once per resource
from optimization point of view.
Both g_list_delete_link and g_list_remove remove an element and free it,
so l->next is invalid (catched by valgrind) after calling g_list_delete_link
or g_list_remove
Returning MFX_WRN_INCOMPATIBLE_VIDEO_PARAM means MSDK detects some
incompatible parameters but it is resolved, and we may not regard
MFX_WRN_INCOMPATIBLE_VIDEO_PARAM as a fatal error. In this fix,
GST_FLOW_OK is returned but with a warning message so that a pipeline
may run to the end.
Fixes werror build:
In file included from ../sys/androidmedia/gstahcsrc.c:70:
../gst-libs/gst/interfaces/photography.h:27:2: error: "The GstPhotography interface is unstable API and may change in future." [-Werror,-W#warnings]
#warning "The GstPhotography interface is unstable API and may change in future."
^
../gst-libs/gst/interfaces/photography.h:28:2: error: "You can define GST_USE_UNSTABLE_API to avoid this warning." [-Werror,-W#warnings]
#warning "You can define GST_USE_UNSTABLE_API to avoid this warning."
^
video-direction property is common property in gstreamer. In addition,
both mirroring & rotation properties are marked as deprecated,
video-direction will override mirroring & rotation properties when they
are set explicitly
Fix https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/issues/1058