Requesting the GLTextureUpload meta on buffers in the bufferpool
prevents such metas from being de-allocated when buffers are released
in the sink.
This is particulary useful in terms of performance when using the
GLTextureUploadMeta API since the GstVaapiTexture associated with
the target texture is stored in the meta.
https://bugzilla.gnome.org/show_bug.cgi?id=712558
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Make GstVideoGLTextureUploadMeta::upload() implementation more robust
when the GstVaapiTexture associated with the supplied texture id could
not be created.
Clean public APIs up so that to better align with the decoder APIs.
Most importantly, gst_vaapi_encoder_get_buffer() is changed to only
return the VA coded buffer proxy. Also provide useful documentation
for the public APIs.
Refactor the GstVaapiCodedBuffer APIs so that to more clearly separate
public and private interfaces. Besides, the map/unmap APIs should not
be exposed as is but appropriate accessors should be provided instead.
* GstVaapiCodedBuffer: VA coded buffer abstraction
- gst_vaapi_coded_buffer_get_size(): get coded buffer size.
- gst_vaapi_coded_buffer_copy_into(): copy coded buffer into GstBuffer
* GstVaapiCodedBufferPool: pool of VA coded buffer objects
- gst_vaapi_coded_buffer_pool_new(): create a pool of coded buffers of
the specified max size, and bound to the supplied encoder
* GstVaapiCodedBufferProxy: pool-allocated VA coded buffer object proxy
- gst_vaapi_coded_buffer_proxy_new_from_pool(): create coded buf from pool
- gst_vaapi_coded_buffer_proxy_get_buffer(): get underlying coded buffer
- gst_vaapi_coded_buffer_proxy_get_buffer_size(): get coded buffer size
Rationale: more optimized transfer functions might be provided in the
future, thus rendering the map/unmap mechanism obsolete or sub-optimal.
https://bugzilla.gnome.org/show_bug.cgi?id=719775
Fix GstElement::set_context() implementation for all plug-in elements
to avoid leaking an extra reference to the VA display, thus preventing
correct cleanup of VA resources in GStreamer 1.2 builds.
Return earlier if the creation of a VA display failed. Likewise, simplify
gst_vaapi_video_context_propagate() now that we are guaranteed to have a
valid VA display.
When GstVideoMeta maps were used, the supporting functions incorrectly
used gst_buffer_get_memory() instead of gst_buffer_peek_memory(), thus
always increasing the associated GstMemory reference count and giving
zero chance to actually release that, and subsequently the VA display.
Simplify GstVaapiVideoMeta to only hold a surface proxy, which is
now allocated from a surface pool. This also means that the local
reference to the VA surface is also gone, as it could be extracted
from the associated surface proxy.
Drop the following functions that are not longer used:
- gst_vaapi_video_buffer_new_with_surface()
- gst_vaapi_video_meta_new_with_surface()
- gst_vaapi_video_meta_set_surface()
- gst_vaapi_video_meta_set_surface_from_pool()
Fix gst_vaapi_video_meta_new_from_pool() to allocate VA surface proxies
from surface pools instead of plain VA surfaces. This is to simplify
allocations now that surface proxies are created from a surface pool.
Optimize gst_vaapiencode_handle_frame() to avoid extra memory allocation,
and in particular the GstVaapiEncObjUserData object. i.e. directly use
the VA surface proxy from the source buffer. This also makes the user
data attached to the GstVideoCodecFrame more consistent between both
the decoder and encoder plug-in elements.
Simplify gst_vaapiencode_push_frame(), while also removing the call
to gst_video_encoder_negotiate() since this is implicit in _finish()
if caps changed. Also fixed memory leaks that occured on error.
Constify pointers wherever possible. Drop unused variables, and use
consistent variable names. Fix gst_vaapiencode_h264_allocate_buffer()
to correctly report errors, especially when in-place conversion from
bytestream to avcC format failed.
Move "rate-control" mode and "bitrate" properties to the GstVaapiEncode
base class. The actual range of supported rate control modes is currently
implemented as a plug-in element hook. This ought to be determined from
the GstVaapiEncoder object instead, i.e. from libgstvaapi.
Align the plug-in debug category to its actual name. i.e. enable debug
logs through vaapiencode_<CODEC> where <CODEC> is mpeg2, h264, etc. Fix
the plug-in element description to make it more consistent with other
VA-API plug-ins.
Add a GST_VAAPIENCODE_CAST() helper to avoid run-time checks against
the GObject type system. We are guaranteed to only deal with the same
plug-in element object.
Allow vaapiencode plug-in elements to encode from raw YUV buffers.
The most efficient way to do so is to let the vaapiencode elements
allocate a buffer pool, and subsequently buffers from it. This means
that upstream elements are expected to honour downstream pools.
If upstream elements insist on providing their own allocated buffers
to the vaapiencode elements, then it possibly would be more efficient
to insert a vaapipostproc element before the vaapiencode element.
This is because vaapipostproc currently has better support than other
elements for "foreign" raw YUV buffers.
Add GstVaapiEncodeMPEG2 element object. The actual plug-in element
is called "vaapiencode_mpeg2".
Valid properties:
- rate-control: rate control mode (default: cqp - constant QP)
- bitrate: desired bitrate in kbps (default: auto-calculated)
- key-period: maximal distance between two key frames (default: 30)
- max-bframes: number of B-frames between I and P (default: 2)
- quantizer: constant quantizer (default: 8)
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Add GstVaapiEncodeH264 element object. The actual plug-in element
is called "vaapiencode_h264".
Valid properties:
- rate-control: rate control mode (default: none)
- bitrate: desired bitrate in kbps (default: auto-calculated)
- key-period: maximal distance between two key frames (default: 30)
- num-slices: number of slices per frame (default: 1)
- max-bframes: number of B-frames between I and P (default: 0)
- min-qp: minimal quantizer (default: 1)
- init-qp: initial quantizer (default: 26)
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Fix build when Wayland headers don't live in plain system include dirs
like /usr/include but rather in /usr/include/wayland for instance.
Original patch written by Dominique Leuenberger <dimstar@opensuse.org>
https://bugzilla.gnome.org/show_bug.cgi?id=712282
Destroy VPP output surface pool on exit. Also avoid a possible crash
in double-free situation caused by insufficiently reference counted
array of formats returned during initialization.
Fix advanced deinterlacing modes with VPP to track only up to 2 past
reference buffers. This used to be 3 past reference buffers but this
doesn't fit with the existing decode pipeline that only has 4 extra
scratch surfaces.
Also optimize references tracking to be only enabled when needed, i.e.
when advanced deinterlacing mode is used. This means that we don't
need to track past references for basic bob or weave deinterlacing.
In "mixed" interlaced streams, the buffer contains additional flags that
specify whether the frame contained herein is interlaced or not. This means
that we can alternatively get progressive or interlaced frames. Make sure
to disable deinterlacing at the VPP level when the source buffer is no longer
interlaced.
Fix memory leaks with advanced deinterlacing, i.e. when we keep track
of past buffers. Completely reset the deinterlace state, thus destroying
any buffer currently held, on _start(), _stop() and _destroy().
Port vaapipostproc element to GStreamer 1.2. Support is quite minimal
right now so that to cope with auto-plugging issues/regressions. e.g.
this happens when the correct set of expected caps are being exposed.
This means that, currently, the proposed caps are not fully accurate.
Fix basic deinterlacing flags provided to gst_vaapi_set_deinterlacing()
for the first field. Render flags were supplied instead of the actual
deinterlacing flags (deint_flags).
Fix GstBaseTransform::transform_caps() implementation to always return
the complete set of allowed sink pad caps (unfixated) even if the src
pad caps we are getting are fixated. Rationale: there are just so many
possible combinations, and it was wrong to provide a unique set anyway.
As a side effect, this greatly simplifies the ability to derive src pad
caps from fixated sink pad caps.
Fix deinterlacing flags to make more sense. The TFF (top-field-first)
flag is meant to specify the organization of reference frames used in
advanced deinterlacing modes. Introduce the more explicit flag TOPFIELD
to specify that the top-field of the supplied input surface is to be
used for deinterlacing. Conversely, if not set, this means that the
bottom field of the supplied input surface will be used instead.
There are situations where gst_video_decoder_flush() is called, and
this subsequently produces a gst_video_decoder_reset() that kills the
currently active GstVideoCodecFrame. This means that it no longer
exists by the time we reach GstVideoDecoder::finish() callback, thus
possibly resulting in a crash if we assumed spare data was still
available for decode (current_frame_size > 0).
Try to honour GstVideoDecoder::reset() behaviour from GStreamer 1.0
that means a flush, thus performing the actual operations there like
calling gst_video_decoder_have_frame() if pending data is available.
Review all interactions between the main video decoder stream thread
and the decode task to derive a correct sequence of operations for
decoding. Also avoid extra atomic operations that become implicit under
the GstVideoDecoder stream lock.
Fix hard reset for seek cases by flushing the GstVaapiDecoder queue
and completely purge any decoded output frame that may come out from
it. At this stage, the GstVaapiDecoder shall be in a complete clean
state to start decoding over new buffers.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
vaapidecode used to wait up to one second past the expected time of
presentation for the last decoded frame. This is not realistic in
practice when it comes to video pause/resume. Changed behaviour to
unconditionnally wait for a free VA surface prior to continuing the
decoding. The decode task will continue pushing the output frames to
the downstream element while also reporting errors at the same time
to the main thread.
https://bugzilla.gnome.org/show_bug.cgi?id=707108
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
The srcpad caps exposed for GStreamer 1.2 were missing any useful info
like framerate, pixel-aspect-ratio, interlace-mode et al. Not to mention
that it relied on possibly un-initialized data. Fix srcpad caps to be
initialized from a sanitized copy of GstVideoDecoder output state caps.
Note: the correct way to expose the srcpad caps triggers an additional
issue in core GStreamer auto-plugging capabilities as the correct caps
to be exposed should be format=ENCODED with memory:VASurface caps feature
at the minimum. In some situations, we could determine the underlying
VA surface format, but this is not always possible. e.g. cases where it
is not allowed to expose the underlying VA surface data, or when the
VA driver implementation cannot actually provide such information.
Currently, the decoder only supports YUV 4:2:0 output. So, expose the
output formats for GStreamer 1.2 in caps to a realistic subset. This
means NV12, I420 or YV12 but also ENCODED if we cannot determine the
underlying VA surface format, or if it is actually not allowed to get
access to the surface contents.
Fix vaapidecode srcpad caps to only expose RGBA video format for the
meta:GstVideoGLTextureUploadMeta feature. That's only what is supported
so far. Besides, drop this meta from the vaapisink sinkpad caps since
we really don't support that for rendering.
https://bugzilla.gnome.org/show_bug.cgi?id=711828
Fix raw YUV data uploaded as in the following pipeline:
$ gst-launch-1.0 filesrc video.yuv ! videoparse ! vaapipostproc ! vaapisink
The main reason why it failed was that the videoparse element simply
allocates GstBuffer with raw data chunk'ed off the sink pad without
any prior knowledge of the actual frame info. i.e. it basically just
calls gst_adapter_take_buffer().
We could avoid the extra copy performed in vaapipostproc if the videoparse
element was aware of the downstream pool and bothers copying line by
line, for each plane. This means that, for a single frame per buffer,
the optimizatin will be to allocate the video buffer downstream, map
it, and copy each line that is coming through until we need to fills
in the successive planes.
Still, optimized raw YUV uploads already worked with the following:
$ gst-launch-1.0 videotestsrc ! vaapipostproc ! vaapisink
https://bugzilla.gnome.org/show_bug.cgi?id=711250
[clean-ups, fixed error cases to unmap and unref outbuf]
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
If the currently selected deinterlacing method is not supported by the
underlying hardware, then try to downgrade the method to a supported one.
At the minimum, basic bob-deinterlacing shall always be supported.