Before pushing the new frame, the render() method calls sync() to flush the
pending frames. Nonetheless, the last pushed frame never gets rendered, which
also leads to a memory leak.
This patch calls sync() in destroy() to flush the pending frames before
destroying the window.
An is_cancelled flag is also added. This flag indicates that the event queue
must not be flushed again because the method previously failed or was
cancelled by the user.
https://bugzilla.gnome.org/show_bug.cgi?id=749078
Otherwise wl_display_dispatch_queue() might prevent the pipeline from
shutting down. This can happen e.g. if the wayland compositor exits while
the pipeline is running.
Changes:
* renamed unlock()/unlock_stop() to unblock()/unblock_cancel() in gstvaapiwindow
* split the patch removing wl_display_dispatch_queue()
Signed-off-by: Víctor Manuel Jáquez Leal <victorx.jaquez@intel.com>
https://bugzilla.gnome.org/show_bug.cgi?id=747492
https://bugzilla.gnome.org/show_bug.cgi?id=749078
wl_display_dispatch_queue() might prevent the pipeline from shutting
down. This can happen e.g. if the wayland compositor exits while the
pipeline is running.
This patch replaces it with these steps:
- With wl_display_prepare_read() all threads announce their intention
to read.
- wl_display_read_events() is thread safe. One thread reads, the others
wait for it to finish.
- With wl_display_dispatch_queue_pending() each thread dispatches its
own events.
wl_display_dispatch_queue_pending() has been available since Wayland 1.0.2.
Original-patch-by: Michael Olbrich <m.olbrich@pengutronix.de>
* stripped out the unlock() unlock_stop() logic
* stripped out the poll handling
Signed-off-by: Víctor Manuel Jáquez Leal <victorx.jaquez@intel.com>
https://bugzilla.gnome.org/show_bug.cgi?id=749078
https://bugzilla.gnome.org/show_bug.cgi?id=747492
Since frame in the private data means the last frame sent, it would be
semantically better to call it last_frame.
Also, this patch makes use of g_atomic_pointer_{compare_and_exchange, set}()
functions.
https://bugzilla.gnome.org/show_bug.cgi?id=749078
The Wayland window has a pointer to the last pushed frame and uses it to set
the flag for stopping the queue dispatch loop. This may lead to memory leaks,
since we are not keeping track of all the queued frame structures.
This patch removes the last pushed frame pointer and replaces the binary flag
with an atomic counter that keeps track of the number of queued frames and is
used for the queue dispatch loop.
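
The counter-based bookkeeping can be sketched as follows. This is a minimal
illustration, not the gstreamer-vaapi code: it uses C11 <stdatomic.h> in place
of GLib's atomic helpers, and frame_queue, frame_queued(), frame_done() and
has_pending_frames() are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the window's private data. */
typedef struct {
    atomic_uint num_frames_pending;   /* frames queued but not yet done */
} frame_queue;

static void frame_queue_init(frame_queue *q)
{
    atomic_init(&q->num_frames_pending, 0);
}

/* Called when a frame is handed to the compositor. */
static void frame_queued(frame_queue *q)
{
    atomic_fetch_add(&q->num_frames_pending, 1);
}

/* Called from the frame's completion callback. */
static void frame_done(frame_queue *q)
{
    unsigned prev = atomic_fetch_sub(&q->num_frames_pending, 1);
    assert(prev > 0);   /* more completions than submissions is a bug */
    (void) prev;
}

/* The dispatch loop keeps running while this returns true. */
static bool has_pending_frames(frame_queue *q)
{
    return atomic_load(&q->num_frames_pending) > 0;
}
```

Because every queued frame is counted, the loop condition no longer depends on
which frame happened to be pushed last.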
https://bugzilla.gnome.org/show_bug.cgi?id=749078
This patch takes the Wayland buffer out of the frame structure. The
buffer is queued to wayland and destroyed in the "release" callback. The
frame is freed in the surface's "done" callback.
In this way a buffer may be leaked but not the whole frame structure.
- surface 'done' callback is used to throttle the rendering operation and to
deallocate the frame, but not the buffer.
- buffer 'release' callback is used to destroy wl_buffer.
Original-patch-by: Zhao Halley <halley.zhao@intel.com>
* code rebase
* kept the event_queue for buffer's proxy
Signed-off-by: Víctor Manuel Jáquez Leal <victorx.jaquez@intel.com>
https://bugzilla.gnome.org/show_bug.cgi?id=749078
This patch fixes several issues found when running the `make distcheck`
target:
- In commit c561b8da, the update of gstcompat.h in Makefile.am was
forgotten.
- In commit c5756a91, adding simple_encoder_source_h to EXTRA_DIST was
forgotten.
- vpx.build.stamp is not generated at all, only vpx.configure.stamp.
- The make target distcleancheck failed because some autogenerated files
were not handled with the DISTCLEANFILES variable.
Note: `make distcheck -jXX` is not currently supported.
GST_VAAPI_ENCODER_STATUS_NO_SURFACE and GST_VAAPI_ENCODER_STATUS_NO_BUFFER
are not errors, so they do not have the ERROR namespace.
This patch fixes this typo in documentation.
On s390x, guintptr and GstVaapiID are not compatible types. The
implementation of gst_vaapi_window_new_internal() and all its callers
seem to assume that its third argument is a GstVaapiID, while the
header gives it guintptr type.
https://bugzilla.gnome.org/show_bug.cgi?id=744559
Since bug #745728 was fixed the oldest supported version of GStreamer is
1.2. That GStreamer release requires glib 2.32, so we can upgrade our
requirement too.
This patch changes the required version of glib in configure.ac and removes
the hacks in glibcompat.h
https://bugzilla.gnome.org/show_bug.cgi?id=748698
This patch only intends to improve readability: in the method
gst_vaapi_window_wayland_sync() the if/do instructions are squashed into a
single while loop.
Also renames the frame_redraw_callback() callback into frame_done_callback(),
which is a bit more aligned to Wayland API.
The Wayland compositor may still use the buffer when the frame done
callback is called.
This patch defers the destruction of the frame (which contains the buffer)
until the release callback is called. The draw termination callback only
controls the display queue dispatching.
Signed-off-by: Víctor Manuel Jáquez Leal <vjaquez@igalia.com>
https://bugzilla.gnome.org/show_bug.cgi?id=747492
Based on the value of uniform_spacing_flag in the Picture Parameter Set,
the tile column width and tile row height should be calculated.
Equations: 6-1, 6-2
Tiled video Descriptions: 7.3.2.3, 7.4.3.3
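
For the uniform-spacing case the computation is plain integer arithmetic. The
sketch below follows the uniform tile spacing rule of the HEVC specification;
the function name and parameters are illustrative and not taken from the
decoder.

```c
#include <assert.h>
#include <stddef.h>

/* Split `size_in_ctbs` (picture width or height in coding tree blocks)
 * into `num_tiles` columns or rows of near-equal size.
 * `out` must have room for `num_tiles` entries. */
static void uniform_tile_sizes(unsigned size_in_ctbs, unsigned num_tiles,
                               unsigned *out)
{
    assert(num_tiles > 0 && out != NULL);
    for (unsigned i = 0; i < num_tiles; i++) {
        /* Integer division on both terms spreads the remainder
         * across the tiles instead of leaving it all in the last one. */
        out[i] = ((i + 1) * size_in_ctbs) / num_tiles
               - (i * size_in_ctbs) / num_tiles;
    }
}
```

For a picture 10 CTBs wide split into 3 tile columns this yields widths of
3, 3 and 4.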
-- Set NoRaslOutputFlag based on EOS and EOB Nal units
-- Fix PicOutputFlag setting for RASL picture
-- Fix prev_poc_lsb/prev_poc_msb calculation
-- Drop the RASL frames if NoRaslOutputFlag is TRUE for the associated IRAP picture
-- Fixed a couple of crashes and made cosmetic changes
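
The prev_poc_lsb/prev_poc_msb values feed the usual picture order count
derivation. A minimal sketch of that derivation, with illustrative names and
without the rules that decide which picture supplies the previous values:

```c
#include <assert.h>

/* Derive the full picture order count from the LSB signalled in the
 * slice header and the LSB/MSB remembered from the previous picture
 * used for the derivation. `max_poc_lsb` is the wrap-around period of
 * the LSB counter. */
static int derive_poc(int poc_lsb, int prev_poc_lsb, int prev_poc_msb,
                      int max_poc_lsb)
{
    int poc_msb;

    assert(max_poc_lsb > 0);
    if (poc_lsb < prev_poc_lsb &&
        (prev_poc_lsb - poc_lsb) >= max_poc_lsb / 2)
        poc_msb = prev_poc_msb + max_poc_lsb;   /* LSB wrapped forwards */
    else if (poc_lsb > prev_poc_lsb &&
             (poc_lsb - prev_poc_lsb) > max_poc_lsb / 2)
        poc_msb = prev_poc_msb - max_poc_lsb;   /* LSB wrapped backwards */
    else
        poc_msb = prev_poc_msb;

    return poc_msb + poc_lsb;
}
```

A stale prev_poc_lsb or prev_poc_msb shifts the result by a whole wrap-around
period, which is why their bookkeeping matters.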
There is a race condition where g_drm_device_type can be left set to
DRM_DEVICE_RENDERNODES when it shouldn't.
Suppose thread 1 comes in and falls into the last else statement, setting up
both the RENDERNODES and LEGACY types. When it begins to process the first
type (RENDERNODES), it sets g_drm_device_type = RENDERNODES.
Now when thread 2 comes in and sees g_drm_device_type is RENDERNODES, it queues
up that type to be tried but then encounters the lock and has to wait until the
first thread finishes. Once the lock is acquired it will then proceed to ONLY try
RENDERNODES and fail it. But it doesn't try LEGACY. And from then on, all future
attempts will only try RENDERNODES.
So to avoid this situation I have simply moved the acquisition of the lock higher
up in the attached patch.
https://bugzilla.gnome.org/show_bug.cgi?id=747914
The video pool can be accessed with the display lock held, for example,
when releasing a buffer from inside vaapisink_render, but allocating
a new object may also take the display lock, which means a possible
deadlock.
https://bugzilla.gnome.org/show_bug.cgi?id=747944
The support for buffer exports in VA-API was added in version 0.36. These
interfaces are for interop with EGL, OpenCL, etc.
GStreamer-VAAPI uses them for a dmabuf memory allocator. However,
gstreamer-vaapi has to support VA-API versions starting from 0.30.4, which do
not provide them.
This patch guards all the buffer export handling (and the dmabuf allocator) so
that it is disabled if the detected VA-API version is below 0.36.
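
A compile-time guard of this kind could look like the sketch below. The DEMO_*
macros and export_surface() are hypothetical stand-ins for whatever version
information the build provides; they are not libva or gstreamer-vaapi names.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical: normally supplied by configure or a version header. */
#ifndef DEMO_VA_MAJOR
#define DEMO_VA_MAJOR 0
#define DEMO_VA_MINOR 35
#endif

#define DEMO_VA_CHECK_VERSION(major, minor) \
    (DEMO_VA_MAJOR > (major) || \
     (DEMO_VA_MAJOR == (major) && DEMO_VA_MINOR >= (minor)))

#if DEMO_VA_CHECK_VERSION(0, 36)
/* Buffer export code is only compiled when the API provides it. */
static bool export_surface(void) { return true; }
#else
/* Older API: compile a stub so callers need no #ifdef of their own. */
static bool export_surface(void) { return false; }
#endif
```

Keeping the check in one macro means the rest of the code only ever sees a
function that either works or reports that the feature is unavailable.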
https://bugzilla.gnome.org/show_bug.cgi?id=746405
The member size in GstMpeg4Packet is a gsize, which is unsigned and therefore
can never be less than zero. Hence this pre-condition test is a no-op. This
patch removes that code.
https://bugzilla.gnome.org/show_bug.cgi?id=747312
slice_type in slice_param is defined as (char *), but it is compared against a
signed integer. clang complains about this comparison.
This patch casts the variable.
https://bugzilla.gnome.org/show_bug.cgi?id=747312
The symbol GstVaapiCodedBuffer is already defined in
gst-libs/gst/vaapi/gstvaapicodedbuffer.h which is loaded, at the end, by
gstvaapiencoder_objects.h. Clang complains about the symbol re-definition.
This patch removes that redefinition.
https://bugzilla.gnome.org/show_bug.cgi?id=747312
The member value in frame_rate_tab is float, the result of the abs() function
should be float too. But abs() only manages integers.
This patch replaces abs() with fabsf() to correctly handle the possible float
values.
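
The difference matters whenever the table is searched for the closest match.
A sketch with a hypothetical three-entry table (the values and names below are
illustrative, not the real frame_rate_tab):

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical frame-rate table. */
static const float demo_rates[] = { 23.976f, 24.0f, 25.0f };

/* Return the index of the entry closest to `fps`. */
static size_t closest_rate_index(float fps)
{
    size_t best = 0;
    float best_delta = fabsf(fps - demo_rates[0]);

    for (size_t i = 1; i < sizeof demo_rates / sizeof demo_rates[0]; i++) {
        /* fabsf() keeps the fractional part; abs() would convert the
         * difference to int first, so 0.02 and 0.9 would both become 0. */
        float delta = fabsf(fps - demo_rates[i]);
        if (delta < best_delta) {
            best_delta = delta;
            best = i;
        }
    }
    return best;
}
```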
https://bugzilla.gnome.org/show_bug.cgi?id=747312
The purpose of gstcompat.h is to bridge the API differences between
gstreamer-1.0 and gstreamer-0.10. Since gstreamer-0.10 is obsolete, the code
in this compatibility layer shall be removed.
Nevertheless, the gstcompat.h header should be kept in case new
incompatibilities appear in the future, but it shall live in gst/vaapi, not in
gst-libs.
This patch removes the crumbs defined in gstcompat.h and moves it to gst/vaapi.
In order to avoid layer violations, gstcompat.h includes sysdeps.h and all
the includes in gst/vaapi of sysdeps.h are replaced with gstcompat.h
https://bugzilla.gnome.org/show_bug.cgi?id=745728
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Signed-off-by: Sreerenj Balachandran <sreerenj.balachandran@intel.com>
This library was intended to add the base classes for video decoders which
were not included in gstreamer-0.10.
Since the support of gstreamer-0.10 is deprecated those classes are not
required, thus the whole library is removed.
https://bugzilla.gnome.org/show_bug.cgi?id=745728
https://bugzilla.gnome.org/show_bug.cgi?id=732666
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
This patch only removes the support of gstreamer-0.10 in the autotools
scripts. No other files are touched.
The configuration parameter --gstreamer-api was deleted since now it is always
auto-detected.
The verification of vmethod query in GstBaseSinkClass was removed since it was
added in gstreamer 0.10.35. The same case for GstVideoOverlayComposition and
its format flags.
The precious variable GST_PLUGIN_PATH was removed, while GST_PLUGIN_PATH_1_0
remained.
The automake files were changed accordingly.
Removed, in debian/control, the vaapiupload and vaapidownload descriptions.
https://bugzilla.gnome.org/show_bug.cgi?id=732666
https://bugzilla.gnome.org/show_bug.cgi?id=745728
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Add support for H.264 MVC Multiview High profile encoding with
more than 2 views. All views within the same access unit are
provided in increasing order of view order index (VOIdx).
Up to 10 views are supported for now.
A new property "view-ids" has been provided for the plugins to
set the view ids (which is an array of guint values) to be used
for mvc encoding.
https://bugzilla.gnome.org/show_bug.cgi?id=732453
Add support for GstVideoGLTextureOrientation modes. In particular,
add orientation flags to the GstVaapiTexture wrapper and the GLX
implementations. Default mode is that texture memory is laid out
with top lines first, left row first. Flags indicate whether the
X or Y axis need to be inverted.
Add GstVaapiTextureEGL abstraction that can create its own GL texture,
or import a foreign allocated one, while still allowing updates from a
VA surface.
Add helpers to import EGLImage objects into VA surfaces. There are
two operational modes: (i) gst_vaapi_surface_new_from_egl_image(),
which allows for implicit conversion from EGLImage to a VA surface
in native video format, and (ii) gst_vaapi_surface_new_with_egl_image(),
which exactly wraps the source EGLImage, typically in RGBA format
with linear storage.
Note: in case of (i), the EGLImage can be disposed right after the
VA surface creation call, unlike in (ii) where the user shall ensure
that the EGLImage is live until the associated VA surface is no longer
needed.
https://bugzilla.gnome.org/show_bug.cgi?id=743847
Add initial support for EGL to libgstvaapi core library. The target
display server and the desired OpenGL API can be programmatically
selected at run-time.
A comprehensive set of EGL utilities are provided to support those
dynamic selection needs, but also most importantly to ensure that
the GL command stream is executed from within a single thread.
https://bugzilla.gnome.org/show_bug.cgi?id=743846
Added rounding control handling for VC1 simple and Main profile
based on VC1 standard spec: section 8.3.7
https://bugzilla.gnome.org/show_bug.cgi?id=743958
Signed-off-by: Lim Siew Hoon <siew.hoon.lim@intel.com>
Signed-off-by: Sreerenj Balachandran <sreerenj.balachandran@intel.com>
In practice we should be able to support more formats, e.g. the
JPEG encoder can support YUV422, RGBA, etc.
But this causes more issues that need proper fixes here and there.
Otherwise the condition could become true before the lock
is taken and the g_cond_signal() could be called
before the g_cond_wait(), so the g_cond_wait() is never
awoken.
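
The safe pattern is to test and change the condition only while holding the
mutex. The sketch below uses POSIX threads rather than GLib's GMutex/GCond,
and all names in it are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            ready;   /* the condition being waited for */
} demo_gate;

static void demo_gate_init(demo_gate *g)
{
    pthread_mutex_init(&g->lock, NULL);
    pthread_cond_init(&g->cond, NULL);
    g->ready = false;
}

/* Waiter: the predicate is read under the lock, so a signal sent
 * earlier is seen as ready == true and the wait is skipped. */
static void demo_gate_wait(demo_gate *g)
{
    pthread_mutex_lock(&g->lock);
    while (!g->ready)
        pthread_cond_wait(&g->cond, &g->lock);
    pthread_mutex_unlock(&g->lock);
}

/* Signaller: the predicate is changed under the same lock. */
static void demo_gate_open(demo_gate *g)
{
    pthread_mutex_lock(&g->lock);
    g->ready = true;
    pthread_cond_signal(&g->cond);
    pthread_mutex_unlock(&g->lock);
}
```

With this ordering, calling demo_gate_open() before demo_gate_wait() is
harmless: the waiter returns immediately instead of sleeping forever.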
https://bugzilla.gnome.org/show_bug.cgi?id=740645
Add support for GEM buffer imports. This is useful for VA/EGL interop
with legacy Mesa implementations, or when it is desired or required to
support outbound textures for instance.
https://bugzilla.gnome.org/show_bug.cgi?id=736718
Add new gst_vaapi_surface_new_with_dma_buf_handle() helper function
to allow for creating VA surfaces from a foreign DRM PRIME fd. The
resulting VA surface owns the supplied buffer handle.
https://bugzilla.gnome.org/show_bug.cgi?id=735362
Add gst_vaapi_surface_new_from_buffer_proxy() helper function to
create a VA surface from an external buffer provided through the
new GstVaapiBufferProxy object.
Add support for GEM buffer exports. This will only work with VA drivers
based off libdrm, e.g. the Intel HD Graphics VA driver. This is needed
to support interop with EGL and the "Desktop" GL specification. Indeed,
the EXT_image_dma_buf_import extension is not going to be supported in
Desktop GL, due to the lack of support for GL_TEXTURE_EXTERNAL_OES targets
there.
This is useful for implementing VA/EGL interop with legacy Mesa stacks,
in Desktop OpenGL context.
https://bugzilla.gnome.org/show_bug.cgi?id=736717
Use the new VA buffer export APIs to allow for a VA surface to be
exposed as a plain PRIME fd. This is in view to simplifying interop
with EGL or OpenCL for instance.
https://bugzilla.gnome.org/show_bug.cgi?id=735364
The VA buffer export APIs work for a particular lifetime starting from
vaAcquireBufferHandle() and ending with vaReleaseBufferHandle(). As such,
it could be much more convenient to support implicit releases by simply
having a refcount reaching zero.
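
A refcounted wrapper gives that behaviour: the handle is released when the
last reference goes away. The sketch below is illustrative only; demo_buffer
and its release counter are hypothetical, and the comments mark where a real
wrapper would acquire and release the VA buffer handle. It is not thread-safe.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    int  ref_count;
    int *release_calls;   /* demo only: counts releases */
} demo_buffer;

static demo_buffer *demo_buffer_new(int *release_calls)
{
    demo_buffer *buf = malloc(sizeof *buf);
    if (buf == NULL)
        return NULL;
    /* A real wrapper would acquire the handle here. */
    buf->ref_count = 1;
    buf->release_calls = release_calls;
    return buf;
}

static demo_buffer *demo_buffer_ref(demo_buffer *buf)
{
    buf->ref_count++;
    return buf;
}

static void demo_buffer_unref(demo_buffer *buf)
{
    assert(buf->ref_count > 0);
    if (--buf->ref_count == 0) {
        /* Last reference gone: release the handle implicitly.
         * A real wrapper would call vaReleaseBufferHandle() here. */
        (*buf->release_calls)++;
        free(buf);
    }
}
```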
https://bugzilla.gnome.org/show_bug.cgi?id=736721
Rework surface pool allocation helpers so as to allow for a simple
form, e.g. gst_vaapi_surface_pool_new(format, width, height); and a
somewhat more elaborate/flexible form with optional allocation flags
and precise GstVideoInfo specification.
This is an API/ABI change, and SONAME version needs to be bumped.
Add GstVaapiDisplay::get_{visual_id,colormap}() helpers to help determine
the best suitable window visual id and colormap. This is an indirection in
view to supporting EGL and custom/generic replacements.
Add GstVaapiWindowClass::get_colormap() hook to help determine the
currently active colormap bound to the supplied window, or actually
create it if it does not already exist yet.
Add GstVaapiWindowClass::get_visual_id() function hook to help find
the best suitable visual id for the supplied window. While doing so,
also simplify the process by which an X11 window is created with a
desired Visual, i.e. now use a visual id instead of a Visual object.
Add a new generic helper function gst_vaapi_window_new() to create
a window without having the caller to check for the display type
himself. i.e. internally, there is now a GstVaapiDisplayClass hook
to create windows, and the actual backend implementation fills it in.
This is a simplification in view to supporting EGL.
Add gst_vaapi_display_has_opengl() helper function to help determining
whether the display can support OpenGL context to be bound to it, i.e.
if the class is of type GST_VAAPI_DISPLAY_TYPE_GLX.
Make gst_vaapi_display_get_display_type() return the actual VA display
type. Conversely, add a gst_vaapi_display_get_class_type() function to
return the type of the GstVaapiDisplay instance. The former is used to
identify the display server onto which the application is running, and
the latter to identify the original object class.
Record the underlying native display instance into the toplevel
GstVaapiDisplay object. This is useful for fast lookups to the
underlying native display, e.g. for creating an EGL display.
Add new generic helper functions gst_vaapi_texture_new_wrapped()
and gst_vaapi_texture_new() to create a texture without having
the caller to uselessly check for the display type himself. i.e.
internally, there is now a GstVaapiDisplayClass hook to create
textures, and the actual backend implementation fills it in.
This is a simplification in view to supporting EGL.
GstVaapiTexture is a generic abstraction that could be moved to the
core libgstvaapi library. While doing this, no extra dependency needs
to be added. This means that a GstVaapiTextureClass is now available
for any specific code that needs to be added, e.g. creation of the
underlying GL texture objects, or backend dependent ways to upload
a surface to the texture object.
Generic OpenGL data types (GLuint, GLenum) are also replaced with a
plain guint.
https://bugzilla.gnome.org/show_bug.cgi?id=736715
The VA/GLX interfaces are obsolete. They used to exist for XvBA, and
ease of use, but they had other caveats to deal with. It's now better
to move on to legacy mode, whereby VA/GLX interop is to be provided
through (i) X11 Pixmap, and (ii) other modern means of buffer sharing.
https://bugzilla.gnome.org/show_bug.cgi?id=736711
The gst_vaapi_texture_put_surface() function is missing a crop_rect
argument that would be used during transfer for cropping the source
surface to the desired dimensions.
Note: from a user point-of-view, he should create the GstVaapiTexture
object with the cropped size. That's the default behaviour in software
decoding pipelines that we need to cope with.
This is an API/ABI change, and SONAME version needs to be bumped.
https://bugzilla.gnome.org/show_bug.cgi?id=736712
Add new gst_vaapi_surface_new_full() helper function that allocates
VA surface from a GstVideoInfo template in argument. Additional flags
may include ways to
- allocate linear storage (GST_VAAPI_SURFACE_ALLOC_FLAG_LINEAR_STORAGE) ;
- allocate with fixed strides (GST_VAAPI_SURFACE_ALLOC_FLAG_FIXED_STRIDES) ;
- allocate with fixed offsets (GST_VAAPI_SURFACE_ALLOC_FLAG_FIXED_OFFSETS).
Add new gst_vaapi_surface_proxy_new() helper to wrap a surface into
a proxy. The main use case for that is to convey additional information
at the proxy level that would not be suitable to the plain surface.
Re-introduce a GST_VAAPI_ID_INVALID value that represents
a non-zero and invalid id. This is useful to have a value
that is still invalid for cases where zero could actually
be a valid value.
Make it possible to have all libgstvaapi backends (libs) access to a
common GstVaapiMiniObject API and implementation. This is a minor step
towards full exposure when needed, but restrict it to libgstvaapi at
this time.
Really report sample aspect ratio (SAR) as present, and make it match
what we have obtained from the user as pixel-aspect-ratio (PAR). i.e.
really make sure VUI parameter aspect_ratio_info_present_flag is set
to TRUE and that the indication from aspect_ratio_idc is Extended_SAR.
This is a leftover from git commit a12662f.
https://bugzilla.gnome.org/show_bug.cgi?id=740360
Fix gst_vaapi_decoder_mpeg4_parse() to initialize the packet type to
GST_MPEG4_USER_DATA so that a parse error would result in skipping
that packet. Also fix gst_vaapi_decoder_mpeg4_decode_codec_data() to
initialize status to GST_VAAPI_DECODER_STATUS_SUCCESS.
Use the SEI pic_timing() message to track and propagate down the repeat
first field (RFF) flag. This is only initial support as there is one
other condition that could induce the RFF flag, which is not handled
yet.
Fix the decoding process for picture order count type 0 when the previous
picture had a memory_management_control_operation = 5. In particular, fix
the actual variable type for prev_pic_structure to hold the full bits of
the picture structure.
In practice, this used to work though, due to the underlying type used to
express a gboolean.
Use the SEI pic_timing() message to track the pic_struct variable when
present, or infer it from the regular slice header flags field_pic_flag
and bottom_field_flag. This fixes temporal sequence ordering when the
output pictures are to be displayed.
https://bugzilla.gnome.org/show_bug.cgi?id=739291
Add support for DRM Render-Nodes. This is a new feature that appeared
in kernel 3.12 for experimentation purposes, but was later declared
stable enough in kernel 3.15 for getting enabled by default.
This allows headless usage without authentication at all, i.e. usage
through plain ssh connections is possible.
Fix gst_vaapi_surface_proxy_copy() to copy the view-id element, thus
fixing random frames skipped when vaapipostproc element is used in
passthrough mode. In that mode, GstMemory is copied, thus including
the underlying GstVaapiVideoMeta and associated GstVaapiSurfaceProxy.
Ensure the X11 implementation for GstVaapiWindow::get_geometry() is
thread-safe by default, so that upper layer users don't need to handle
that explicitly.
Add gst_vaapi_window_reconfigure() interface to force an update of
the GstVaapiWindow "soft" size, based on the current geometry of the
underlying native window.
This can be useful for instance to synchronize the window size when
the user changed it.
Thanks to Fabrice Bellet for rebasing the patch.
[changed interface to gst_vaapi_window_reconfigure()]
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Add gst_vaapi_display_get_display_name() helper function to determine
the name associated with the underlying native display. Note that for
raw DRM backends, the display name is actually the device path.
The timestamp generator in gstvaapidecoder_mpeg2.c always interpolated
frame timestamps within a GOP, even when it's been fed input PTS for
every frame.
That leads to incorrect output timestamps in some situations - for example
live playback where input timestamps have been scaled based on arrival time
from the network and don't exactly match the framerate.
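
One way to avoid this is sketched below, under the assumption that a valid
input PTS should always take precedence and that interpolation is only a
fallback. The names and the DEMO_PTS_NONE marker are hypothetical; this is not
the code from gstvaapidecoder_mpeg2.c.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PTS_NONE UINT64_MAX   /* hypothetical "no timestamp" marker */

typedef struct {
    uint64_t next_pts;        /* where interpolation would place the next frame */
    uint64_t frame_duration;  /* nominal duration of one frame */
} demo_pts_gen;

/* Return the timestamp to use for the current frame. */
static uint64_t demo_pts_eval(demo_pts_gen *gen, uint64_t input_pts)
{
    /* Trust the upstream timestamp when there is one; interpolate otherwise. */
    uint64_t pts = (input_pts != DEMO_PTS_NONE) ? input_pts : gen->next_pts;

    gen->next_pts = pts + gen->frame_duration;
    return pts;
}
```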
https://bugzilla.gnome.org/show_bug.cgi?id=732719
Forbid GstVaapiObject to be created without an associated klass spec.
It is mandatory that the subclass implements an adequate .finalize()
hook, so it shall provide a valid GstVaapiObjectClass.
https://bugzilla.gnome.org/show_bug.cgi?id=722757
[made non-NULL klass argument to gst_vaapi_object_new() a requirement]
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Call the subclass .init() function in gst_vaapi_object_new(), if
needed. The default behaviour is to zero initialize the subclass
object data, then the .init() function can be used to initialize
fields to non-default values, e.g. VA object ids to VA_INVALID_ID.
Also fix the gst_vaapi_object_new() description, which was merely
copied from GstVaapiMiniObject.
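
The construction sequence described above (zero-initialize, then call the
optional .init() hook) can be sketched as follows. All demo_* names and
DEMO_INVALID_ID are hypothetical and only mirror the idea, not the actual
GstVaapiObject API.

```c
#include <assert.h>
#include <stdlib.h>

#define DEMO_INVALID_ID 0xffffffffu   /* hypothetical "no id" marker */

typedef struct demo_class {
    size_t size;                 /* size of the subclass instance */
    void (*init)(void *object);  /* optional; may be NULL */
} demo_class;

typedef struct {
    unsigned id;
    int      flags;
} demo_surface;

static void demo_surface_init(void *object)
{
    ((demo_surface *) object)->id = DEMO_INVALID_ID;
}

static void *demo_object_new(const demo_class *klass)
{
    void *object;

    if (klass == NULL || klass->size == 0)
        return NULL;             /* a class description is mandatory */

    object = calloc(1, klass->size);   /* default: everything zero */
    if (object != NULL && klass->init != NULL)
        klass->init(object);           /* then non-default values */
    return object;
}
```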
https://bugzilla.gnome.org/show_bug.cgi?id=722757
[changed to always zero initialize the subclass]
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
When a DPB flush is required, e.g. at a natural end of stream or issued
explicitly through an IDR, try to detect any frame left in the DPB that
is interlaced but does not contain two decoded fields. In that case, mark
the picture as having a single field only.
This avoids a hang while decoding tv_cut.mkv.
Simplify the dpb_output() function to exclusively rely on the frame store
buffer to output, since this is now always provided. Besides, also fix
cases where split fields would not be displayed.
This is a regression from f48b1e0.
Cope with latest changes from codecparsers/h264. It is now required
to explicitly clear the GstH264PPS structure as it could contain
additional allocations (slice_group_ids).
Add new GstVaapiSurfaceProxy flag FFB, which means "first frame in
bundle", and really expresses the first view component of a multi
view coded frame. e.g. in H.264 MVC, the surface proxy has flag FFB
set if VOIdx = 0.
Likewise, new API is exposed to retrieve the associated "view-id".
Allow decoders to set the "one-field" attribute when the decoded frame
genuinely has a single field, or if the second field was mis-decoded but
we still want to display the first field.
Make sure to output the decoded picture, and push the associated
GstVideoCodecFrame, only once. The frame fully represents what needs
to be output, including for interlaced streams. Otherwise, the base
GstVideoDecoder class would release the frame twice.
Anyway, the general process is to output decoded frames only when
they are complete. By complete, we mean a full frame was decoded or
both fields of a frame were decoded.
Slightly optimize decoding process by submitting the current VA surface
for decoding earlier to the hardware, and perform the reference picture
marking process and DPB update process afterwards.
This is a minor optimization to let the video decode engine kick in work
earlier, thus improving parallel resources utilization.
Fix decoding of interlaced streams where a first field (e.g. B-slice)
was immediately output and the current decoded field is to be paired
with that former frame, which is no longer in DPB.
https://bugzilla.gnome.org/show_bug.cgi?id=701340
Optimize the process to detect new pictures or start of new access
units by checking if the previous NAL unit was the end of a picture,
or the end of the previous access unit.
Add support for MVC streams with multiple SPS and subset SPS headers
emitted regularly, e.g. at around every I-frame. Track the maximum
number of views in ensure_context() and really reset the DPB size to
the expected value, always. i.e. even if it decreased. dpb_reset()
only cares of ensuring the DPB allocation.
Fix the compaction process when the DPB is cleared for a specific
view, i.e. fix the process of filling in the holes resulting from
removing frame buffers matching the current picture.
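
Compaction of this kind amounts to an in-place filter. A minimal sketch,
assuming a flat array of frames tagged with a view id (demo_frame and
dpb_clear_view() are hypothetical names, not the decoder's own):

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int view_id;
    int poc;
} demo_frame;

/* Remove every frame belonging to `view_id` from `dpb` and close the
 * gaps so the surviving frames stay contiguous and keep their order.
 * Returns the new number of frames. */
static size_t dpb_clear_view(demo_frame *dpb, size_t count, int view_id)
{
    size_t kept = 0;

    assert(dpb != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (dpb[i].view_id == view_id)
            continue;               /* dropped: leaves a hole at i */
        dpb[kept++] = dpb[i];       /* slide survivor into the hole */
    }
    return kept;
}
```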
It is not necessary to periodically send SPS or subset SPS headers.
This is up to the upper layer (e.g. transport layer) to decide on
if/how to periodically submit those. For now, only generate new SPS
or subset SPS headers when the codec config changed.
Note: the upper layer could readily determine the config headers
(SPS/PPS) through the gst_vaapi_encoder_h264_get_codec_data() function.
https://bugzilla.gnome.org/show_bug.cgi?id=732083
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Report sample aspect ratio (SAR) as present, and make it match what
we have obtained from the user as pixel-aspect-ratio (PAR). i.e. the
VUI parameter aspect_ratio_info_present_flag now defaults to TRUE.
Set the value of num_anchor_refs_l0, num_anchor_refs_l1, num_non_anchor_refs_l0,
and num_non_anchor_refs_l1 to zero since the inter-view prediction is not yet
supported.
When the seq_parameter_set_data() syntax structure is present in a subset
sequence parameter set and vui_parameters_present_flag is equal to 1, then
timing_info_present_flag shall be equal to 0 (H.7.4.2.1.1).
Submit Prefix NAL headers (nal_unit_type = 14) before every packed
slice header (nal_unit_type = 1 or 5) only for the base view. In non
base views, a Coded Slice Extension NAL header (nal_unit_type = 20)
is required, with an appropriate nal_unit_header_mvc_extension() in
the NAL header bytes.
https://bugzilla.gnome.org/show_bug.cgi?id=732083
Fix search for a picture in the DPB that has a lower POC value than
the current picture. The dpb_find_lowest_poc() function will return
a picture with the lowest POC in DPB and that is marked as "needed
for output", but an additional check against the actual POC value
of the current picture is needed.
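
The extra check can be sketched as a thin wrapper around the lowest-POC
search. The structure and function names below are hypothetical and simplified
compared to the decoder's DPB.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    int  poc;
    bool output_needed;
} demo_picture;

/* Index of the lowest-POC picture still waiting for output, or -1. */
static int find_lowest_poc(const demo_picture *dpb, size_t count)
{
    int found = -1;

    for (size_t i = 0; i < count; i++) {
        if (!dpb[i].output_needed)
            continue;
        if (found < 0 || dpb[i].poc < dpb[found].poc)
            found = (int) i;
    }
    return found;
}

/* Same search, but only report a picture that really precedes the
 * current one in output order. */
static int find_lower_poc_than(const demo_picture *dpb, size_t count,
                               int current_poc)
{
    int found = find_lowest_poc(dpb, count);

    if (found >= 0 && dpb[found].poc >= current_poc)
        return -1;   /* the lowest POC is not lower than the current one */
    return found;
}
```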
This is a regression from 1c46990.
https://bugzilla.gnome.org/show_bug.cgi?id=732130
Fix dpb_clear() to clear previous frame buffers only if they actually
exist to begin with. If the decoder bailed out early, e.g. when it
does not support a specific profile, that array of previous frames
might not be allocated beforehand.
We can avoid scanning for start codes again if the bitstream is fed
in NALU chunks. Currently, we always scan for start codes, and keep
track of remaining bits in a GstAdapter, even if, in practice, we
are likely receiving one GstBuffer per NAL unit. i.e. h264parse with
"nal" alignment.
https://bugzilla.gnome.org/show_bug.cgi?id=723284
[use gst_adapter_available_fast() to determine the top buffer size]
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
The `vaapipostproc' element could never determine if the H.264 stream
was interlaced, and thus always assumed it to be progressive. Fix the
H.264 decoder to report interlace-mode accordingly, thus allowing the
vaapipostproc element to automatically enable deinterlacing.
The packed slice header and packed raw data need to be paired with
the submission of VAEncSliceHeaderParameterBuffer. So handle them
on a per-slice basis instead of a per-picture basis.
[removed useless initializer]
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Factor out the removal process of unused inter-view only reference
pictures from the DPB, prior to the possible insertion of the current
picture.
Ideally, the compiler could still opt for generating two loops. But
at least, the code is now clearer for maintenance.
Improve process for the removal of pictures from DPB before possible
insertion of the current picture (C.4.4) for H.264 MVC inter-view only
reference components. In particular, handle cases where picture to be
inserted is not the last one of the access unit and if it was already
output and is no longer marked as used for reference, including for
decoding next view components within the same access unit.
While invoking the DPB bumping process in presence of many views,
it could be necessary to output previous pictures that are ready,
in a whole. i.e. emitting all view components from the very first
view order index zero to the very last one in its original access
unit; and not starting from the view order index of the picture
that caused the DPB bumping process to be invoked.
As a reminder, the maximum number of frames in DPB for MultiView
High profile with more than 2 views is not necessarily a multiple
of the number of views.
This fixes decoding of MVCNV-4.264.
Let the utility layer handle dynamic growth of the inter-view pictures
array. By definition, setting a new size to the array will effectively
grow the array, but would also fill in the newly created elements with
empty entries (NULL), thus also increasing the reported length, which
is not correct.
When decoding Multiview High profile streams with a large number of
views, it is not possible to make the VAPictureParameterBufferH264.
ReferenceFrames[] array hold the complete DPB, with all possibly
active pictures to be used for inter-view prediction in the current
access unit.
So reduce the scope of the ReferenceFrames[] array to only include
the set of reference pictures that are going to be used for decoding
the current picture. Basically, this is a union of all RefPicListX[]
array, for all slices constituting the decoded picture.
The inter-view reference components and inter-view only reference
components that are included in the reference picture lists shall
be considered as not being marked as "used for short-term reference"
or "used for long-term reference". This means that reference flags
should all be removed from VAPictureH264.flags.
This fixes decoding of MVCNV-2.264.
If the VA driver exposes ad-hoc H.264 MVC profiles, then we have to
be careful to detect profiles changes and not reset the underlying
VA context erroneously. In MVC situations, we could indeed get a
profile_idc change for every SPS that gets activated, alternatively
(base-view -> non-base view -> base-view, etc.).
An improved fix would be to characterize the exact profile to use
once and for all when SPS NAL units are parsed. This would also
allow for fallbacks to a base-view decoding only mode.
Exclusively use VA drivers that support raw packed headers for encoding,
i.e. simply submit the Subset SPS and Prefix NAL units as packed headers.
This provides better compatibility across the various VA drivers and HW
generations since no particular API is needed beyond what readily exists.
Since we are encoding each view independently from each other, we
need a higher number of pre-allocated surfaces to be used as the
reconstructed frames. For Stereo High profile encoding, this
effectively doubles the number of frames to be stored in the DPB.
Add initial support for Subset SPS, Prefix NAL and Slice Extension NAL
for non-base-view streams encoding, and the usual SPS, PPS and Slice
NALs for base-view encoding.
The H.264 Stereo High profile encoding mode will be turned on when the
"num-views" parameter is set to 2. The source (raw) YUV frames will be
considered as Left/Right views, alternately.
Each of the two views has its own frames reordering pool and reference
frames list management system. Inter-view references are not supported
yet, so the views are encoded independently from each other.
Signed-off-by: Li Xiaowei <xiaowei.a.li@intel.com>
[limited to Stereo High profile per the definition of MAX_NUM_VIEWS]
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Create structures to maintain the reference frames list (RefPool) and
frames reordering (ReorderPool) logic.
This is a prerequisite for H.264 MVC support.
Signed-off-by: Li Xiaowei <xiaowei.a.li@intel.com>
Add provisions to write subset SPS headers to the bitstream, with a
view to supporting the H.264 MVC specification.
This assumes the libva "staging" branch is in use.
Signed-off-by: Li Xiaowei <xiaowei.a.li@intel.com>
Optimize lookups of view ids / view order indices by caching the result
of the calculations right into the GstVaapiParserInfoH264 struct. This
greatly simplifies the is_new_access_unit() and find_first_field() functions.
Add safe fallbacks for MVC profiles:
- all MultiView High profile streams with 2 views at most can be decoded
with a Stereo High profile compliant decoder;
- all Stereo High profile streams with only progressive views can be
decoded with a MultiView High profile compliant decoder;
- all drivers that support slice-level decoding could normally support
MVC profiles when the DPB holds at most 16 frames.
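The first two fallback rules above could be sketched as follows. All type
and function names here are hypothetical and only serve the illustration.
The third rule (slice-level decoding with a DPB of at most 16 frames) is
deliberately left out of this sketch.

```c
#include <stdbool.h>

typedef enum {
    PROFILE_UNKNOWN = 0,
    PROFILE_MULTIVIEW_HIGH,
    PROFILE_STEREO_HIGH
} MvcProfile;

/* Hypothetical summary of what the VA driver exposes. */
typedef struct {
    bool has_multiview_high;
    bool has_stereo_high;
} DriverCaps;

/* Hypothetical summary of the stream, as known from its (subset) SPS. */
typedef struct {
    MvcProfile profile;
    unsigned   num_views;
    bool       progressive_only;
} StreamInfo;

/* Return the profile to request from the driver, or PROFILE_UNKNOWN
 * if neither the exact profile nor a safe fallback is available. */
static MvcProfile
select_mvc_profile(const DriverCaps *caps, const StreamInfo *s)
{
    switch (s->profile) {
    case PROFILE_MULTIVIEW_HIGH:
        if (caps->has_multiview_high)
            return PROFILE_MULTIVIEW_HIGH;
        /* Fallback: at most 2 views fits a Stereo High decoder. */
        if (caps->has_stereo_high && s->num_views <= 2)
            return PROFILE_STEREO_HIGH;
        break;
    case PROFILE_STEREO_HIGH:
        if (caps->has_stereo_high)
            return PROFILE_STEREO_HIGH;
        /* Fallback: progressive-only views fit a MultiView High decoder. */
        if (caps->has_multiview_high && s->progressive_only)
            return PROFILE_MULTIVIEW_HIGH;
        break;
    default:
        break;
    }
    return PROFILE_UNKNOWN;
}
```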
In order to have a stricter conforming implementation, we need to carefully
detect access unit boundaries. Additional operations may need to be
performed at those boundaries.
Detect the first VCL NAL unit of a picture for MVC, based on the
view_id as per H.7.4.1.2.4. Note that we only need to detect new
view components.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Always cache the previous NAL unit so that we could check whether
there is a Prefix NAL unit immediately preceding the current slice
or IDR NAL unit. In that case, the NAL unit metadata is copied into
the current NAL unit. Otherwise, some default values are tentatively
inferred, e.g. view_id shall be set to 0 and inter_view_flag to 1.
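The copy-or-infer logic described above might look like the sketch below.
The NalInfo struct and fill_mvc_metadata() are hypothetical names. The NAL
unit type values (1 for a non-IDR slice, 5 for an IDR slice, 14 for a Prefix
NAL unit) come from the H.264 specification.

```c
#include <stdbool.h>
#include <stddef.h>

enum {
    NAL_SLICE       = 1,   /* coded slice of a non-IDR picture */
    NAL_SLICE_IDR   = 5,   /* coded slice of an IDR picture */
    NAL_PREFIX_UNIT = 14   /* Prefix NAL unit */
};

/* Hypothetical per-NAL metadata; only the fields discussed here. */
typedef struct {
    int  nal_unit_type;
    int  view_id;
    bool inter_view_flag;
} NalInfo;

/* For a slice or IDR NAL unit, copy the MVC metadata from the cached
 * previous NAL unit if it was a Prefix NAL unit; otherwise fall back
 * to the inferred defaults. Other NAL unit types are left untouched. */
static void
fill_mvc_metadata(NalInfo *cur, const NalInfo *prev)
{
    if (cur->nal_unit_type != NAL_SLICE &&
        cur->nal_unit_type != NAL_SLICE_IDR)
        return;

    if (prev && prev->nal_unit_type == NAL_PREFIX_UNIT) {
        cur->view_id = prev->view_id;
        cur->inter_view_flag = prev->inter_view_flag;
    } else {
        cur->view_id = 0;
        cur->inter_view_flag = true;
    }
}
```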
[infer default values for slice if previous NAL was not a Prefix]
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Allow decoding for base views of MVC encoded streams. For now, just skip
the slice extension and prefix NAL units, and skip non-base view frames.
Signed-off-by: Xiaowei Li <xiaowei.a.li@intel.com>
[fixed memory leak, improved check for MVC NAL units]
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Factor out the process by which the decoded picture with the lowest POC
is found, and possibly output. Likewise, the storage and marking of
a reference decoded, or non-reference decoded picture, into the DPB
could also be simplified as they mostly share the same operations.
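A factored-out "find the picture with the lowest POC" helper could be as
small as the sketch below. DpbEntry and dpb_find_lowest_poc() are
hypothetical names, and the DPB is reduced to a plain array for clarity.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced view of a DPB slot. */
typedef struct {
    int  poc;            /* picture order count */
    bool output_needed;  /* picture is still waiting to be output */
} DpbEntry;

/* Return the index of the entry with the lowest POC among those that
 * still need to be output, or -1 if there is no such entry. */
static int
dpb_find_lowest_poc(const DpbEntry *dpb, size_t n)
{
    int found = -1;
    size_t i;

    for (i = 0; i < n; i++) {
        if (!dpb[i].output_needed)
            continue;
        if (found < 0 || dpb[i].poc < dpb[found].poc)
            found = (int) i;
    }
    return found;
}
```

With a single lookup helper like this, both the bumping process and the
final flush can share the same search and only differ in what they do with
the result.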
Make init_picture_ref_lists() more consistent with other functions
related to the reference marking process by supplying the current
picture as argument.
Add gst_vaapi_display_get_vendor_string() helper function to query
the underlying VA driver name. The display object owns the resulting
string, so it shall not be deallocated.
That function is thread-safe. It could be used for debugging purposes,
for instance.
Make sure to initialize one GstVaapiDisplay at a time, even in threaded
environments. This makes sure the display cache is also consistent
during the whole display creation process. In the former implementation,
there was a risk that the display cache got updated in another thread.
Add support for dynamic growth of the VA surfaces pool. For decoding,
this implies the recreation of the underlying VA context, as per the
requirement from VA-API. Besides, only increases are supported, not
shrinks.
It is a requirement from the VA-API specification that the VA context
obtained from vaCreateContext(), for decoding purposes, binds the supplied set
of VA surfaces. This means that if the set of VA surfaces is to be
changed for the current decode session, then the VA context needs to
be recreated with the new set of VA surfaces.
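The grow-only policy and the resulting context recreation can be modelled
with the sketch below. DecoderState and ensure_surfaces() are hypothetical,
and the generation counter merely stands in for destroying the VA context
and creating it again with the larger set of surfaces.

```c
#include <stdbool.h>

/* Hypothetical, reduced decoder state. */
typedef struct {
    unsigned num_surfaces;        /* surfaces bound to the VA context */
    unsigned context_generation;  /* bumped on every context recreation */
} DecoderState;

/* Make sure at least `wanted` surfaces are bound to the context.
 * Returns true if the context had to be recreated. */
static bool
ensure_surfaces(DecoderState *st, unsigned wanted)
{
    /* Only increases are supported: a smaller or equal request
     * keeps the current pool and the current context. */
    if (wanted <= st->num_surfaces)
        return false;

    st->num_surfaces = wanted;
    /* Placeholder for vaDestroyContext() followed by vaCreateContext()
     * with the new set of surfaces. */
    st->context_generation++;
    return true;
}
```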
Complement fix committed as e95a42e.
The H.264 AVC standard states: if the field is part of a reference
frame or a complementary reference field pair, and the other field of
the same reference frame or complementary reference field pair is also
marked as "used for long-term reference", the reference frame or
complementary reference field pair is also marked as "used for long-term
reference" and assigned LongTermFrameIdx equal to long_term_frame_idx.
This fixes decoding of MR9_BT_B in strict mode.
https://bugs.freedesktop.org/show_bug.cgi?id=64624
https://bugzilla.gnome.org/show_bug.cgi?id=724518
Request the correct chroma format for decoding grayscale streams,
i.e. make lookups of the VA chroma format more generic, thus possibly
supporting more formats in the future.
This means that, if a VA driver doesn't support grayscale formats,
it is now going to fail. We cannot safely assume that maybe grayscale
was implemented on top of some YUV 4:2:0 with the chroma components
all set to 0x80.
Fix reference picture marking process with memory_management_control_op
set to 3 or 6, i.e. assign LongTermFrameIdx to a short-term reference
picture, or to the current picture, respectively.
This fixes decoding of FRExt_MMCO4_Sony_B.
https://bugs.freedesktop.org/show_bug.cgi?id=64624
https://bugzilla.gnome.org/show_bug.cgi?id=724518
[squashed, edited to use GST_VAAPI_PICTURE_IS_COMPLETE() macro]
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
The initialization of reference picture lists (8.2.4.2) applies to all
slices. So, the RefPicList0/1 lists need to be constructed prior to
each slice submission to the HW decoder.
This fixes decoding of video sequences where frames are encoded with
multiple slices of different types, e.g. 4 slices in this order I, P,
I, and P. More precisely, CABAST3_Sony_E and CABASTBR3_Sony_B.
https://bugzilla.gnome.org/show_bug.cgi?id=724518
When NAL units of type 13 (SPS extension) or type 19 (auxiliary slice)
are present in a video, decoders shall perform the (optional) decoding
process specified for these NAL units or shall ignore them (7.4.1).
Implement option 2 (skip) for now, as alpha composition is not
supported yet during the decoding process.
This fixes decoding of the primary coded video in alphaconformanceG.
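The "skip" option amounts to a filter on nal_unit_type, as in this minimal
sketch. The function name is hypothetical. The type values 13 and 19 are
those quoted above.

```c
#include <stdbool.h>

enum {
    NAL_SPS_EXT   = 13,  /* SPS extension */
    NAL_SLICE_AUX = 19   /* coded slice of an auxiliary coded picture */
};

/* Return true for NAL units that are ignored because the optional
 * alpha (auxiliary picture) decoding process is not implemented. */
static bool
nal_is_skipped(int nal_unit_type)
{
    return nal_unit_type == NAL_SPS_EXT ||
           nal_unit_type == NAL_SLICE_AUX;
}
```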
https://bugzilla.gnome.org/show_bug.cgi?id=703928
https://bugzilla.gnome.org/show_bug.cgi?id=728869
https://bugzilla.gnome.org/show_bug.cgi?id=724518
[skip NAL units earlier, i.e. at parsing time]
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
When MVC slice NAL units (coded slice extension and prefix NAL) are
present, the number of NAL header bytes is 3, not 1 as usual.
Signed-off-by: Li Xiaowei <xiaowei.a.li@intel.com>
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
At the time the first VCL NAL unit of a primary coded picture is found,
and if that NAL unit was parsed to be an SPS or PPS, then the entries
in the parser may have been overridden. This means that, when the picture
is to be decoded, slice_hdr->pps could point to an invalid (the next)
PPS entry.
So, one way to solve this problem is to not use the parser PPS and
SPS info but rather maintain our own activation chain in the decoder.
https://bugzilla.gnome.org/show_bug.cgi?id=724519
https://bugzilla.gnome.org/show_bug.cgi?id=724518
Retain the SEI messages that were parsed from the access unit until we
have completely decoded the current frame. This is done so that we can
peek at that data whenever necessary during decoding, e.g. for exposing
3D stereoscopic information at a later stage.
Fix support for grayscale encoded video clips, and possibly others if
the underlying driver supports the non-YUV 4:2:0 formats, i.e. defer
the decision that a surface with the desired chroma format is not
supported to the actual VA driver implementation.
https://bugzilla.gnome.org/show_bug.cgi?id=728144
Don't force allocation of VA surfaces in YUV 4:2:0 format. Rather, allow
for the upper layer to specify the desired chroma type. If the chroma
type field is not set (or yields zero), then YUV 4:2:0 format is used
by default.
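The defaulting rule can be expressed in a couple of lines. The ChromaType
enum and effective_chroma_type() below are hypothetical and do not mirror
the real chroma type definitions. The only assumption carried over is that
an unset field reads as zero.

```c
/* Hypothetical chroma type enumeration; zero means "not set". */
typedef enum {
    CHROMA_TYPE_UNSET = 0,
    CHROMA_TYPE_YUV420,
    CHROMA_TYPE_YUV422,
    CHROMA_TYPE_YUV444,
    CHROMA_TYPE_YUV400
} ChromaType;

/* Honour the chroma type requested by the upper layer, and fall back
 * to YUV 4:2:0 only when nothing was requested. */
static ChromaType
effective_chroma_type(ChromaType requested)
{
    return requested == CHROMA_TYPE_UNSET ? CHROMA_TYPE_YUV420 : requested;
}
```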
Fix possible bug when a per-segment deblocking filter level value
needs to be set in non-absolute mode, i.e. when the loop filter update
value is negative in delta mode.
Also clamp the resulting filter level value to 0..63 range.
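A minimal sketch of the per-segment filter level computation follows. The
function and parameter names are hypothetical. It assumes the usual VP8
semantics: in absolute mode the per-segment value replaces the frame-level
filter level, and in delta mode it is added to it.

```c
#include <stdbool.h>

/* Clamp v to the inclusive range [lo, hi]. */
static int
clamp_int(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Compute the loop filter level for one segment.
 *   base_level: filter_level from the frame header
 *   lf_update:  per-segment loop filter update value (may be negative)
 *   abs_mode:   true if segment values are absolute, false for deltas */
static int
segment_filter_level(int base_level, int lf_update, bool abs_mode)
{
    int level = abs_mode ? lf_update : base_level + lf_update;

    /* The filter level is a 6-bit quantity. */
    return clamp_int(level, 0, 63);
}
```

Clamping after the addition is what keeps a negative delta from producing
an out-of-range level.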
Improve condition to disable the loop filter. The previous heuristic
used to check all filter levels, for all segments. It turns out that
only the base filter_level value defined in the frame header needs
to be checked.
This fixes 00-comprehensive-013.
Fix generation of source tarballs when certain conditionals are not
met, e.g. always include all buildable codecparsers sources in the
distribution tarball, and fix the set of plug-in element sources to
include X11 and encoder bits.
The built-in libvpx serves multiple purposes, among which the most
important ones could be: track the most up-to-date, and optimized,
range decoder; allow for future hybrid implementations (non-VLD);
and have a completely independent range decoder implementation.
Apply correct patch from fd.o #722760 to fix several issues: update the
license terms to LGPLv2.1+, fix dependencies to built-in libvpx and fix
make dist.