We can avoid scanning for start codes again if the bitstream is fed
in NALU chunks. Currently, we always scan for start codes, and keep
track of remaining bits in a GstAdapter, even if, in practice, we
are likely receiving one GstBuffer per NAL unit. i.e. h264parse with
"nal" alignment.
https://bugzilla.gnome.org/show_bug.cgi?id=723284
[use gst_adapter_available_fast() to determine the top buffer size]
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
The `vaapipostproc' element could never determine if the H.264 stream
was interlaced, and thus always assumed it to be progressive. Fix the
H.264 decoder to report interlace-mode accordingly, thus allowing the
vaapipostproc element to automatically enable deinterlacing.
Factor out the removal process of unused inter-view only reference
pictures from the DPB, prior to the possible insertion of the current
picture.
Ideally, the compiler could still opt for generating two loops. But
at least, the code is now clearer for maintenance.
Improve process for the removal of pictures from DPB before possible
insertion of the current picture (C.4.4) for H.264 MVC inter-view only
reference components. In particular, handle cases where picture to be
inserted is not the last one of the access unit and if it was already
output and is no longer marked as used for reference, including for
decoding next view components within the same access unit.
While invoking the DPB bumping process in presence of many views,
it could be necessary to output previous pictures that are ready,
in a whole. i.e. emitting all view components from the very first
view order index zero to the very last one in its original access
unit; and not starting from the view order index of the picture
that caused the DPB bumping process to be invoked.
As a reminder, the maximum number of frames in DPB for MultiView
High profile with more than 2 views is not necessarily a multiple
of the number of views.
This fixes decoding of MVCNV-4.264.
Let the utility layer handle dynamic growth of the inter-view pictures
array. By definition, setting a new size to the array will effectively
grow the array, but would also fill in the newly created elements with
empty entries (NULL), thus also increasing the reported length, which
is not correct.
When decoding Multiview High profile streams with a large number of
views, it is not possible to make the VAPictureParameterBufferH264.
ReferenceFrames[] array hold the complete DPB, with all possibly
active pictures to be used for inter-view prediction in the current
access unit.
So reduce the scope of the ReferenceFrames[] array to only include
the set of reference pictures that are going to be used for decoding
the current picture. Basically, this is a union of all RefPicListX[]
array, for all slices constituting the decoded picture.
The inter-view reference components and inter-view only reference
components that are included in the reference picture lists shall
be considered as not being marked as "used for short-term reference"
or "used for long-term reference". This means that reference flags
should all be removed from VAPictureH264.flags.
This fixes decoding of MVCNV-2.264.
If the VA driver exposes ad-hoc H.264 MVC profiles, then we have to
be careful to detect profiles changes and not reset the underlying
VA context erroneously. In MVC situations, we could indeed get a
profile_idc change for every SPS that gets activated, alternatively
(base-view -> non-base view -> base-view, etc.).
An improved fix would be to characterize the exact profile to use
once and for all when SPS NAL units are parsed. This would also
allow for fallbacks to a base-view decoding only mode.
Optimize lookups of view ids / view order indices by caching the result
of the calculatiosn right into the GstVaapiParserInfoH264 struct. This
terribly simplifies is_new_access_unit() and find_first_field() functions.
Add safe fallbacks for MVC profiles:
- all MultiView High profile streams with 2 views at most can be decoded
with a Stereo High profile compliant decoder ;
- all Stereo High profile streams with only progressive views can be
decoded with a MultiView High profile compliant decoder ;
- all drivers that support slice-level decoding could normally support
MVC profiles when the DPB holds at most 16 frames.
In order to have a stricter conforming implementation, we need to carefully
detect access unit boundaries. Additional operations could be necessary to
perform at those boundaries.
Detect the first VCL NAL unit of a picture for MVC, based on the
view_id as per H.7.4.1.2.4. Note that we only need to detect new
view components.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Always cache the previous NAL unit so that we could check whether
there is a Prefix NAL unit immediately preceding the current slice
or IDR NAL unit. In that case, the NAL unit metadata is copied into
the current NAL unit. Otherwise, some default values are inferred,
tentatively. e.g. view_id shall be set to 0 and inter_view_flag to 1.
[infer default values for slice if previous NAL was not a Prefix]
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Allow decoding for base views of MVC encoded streams. For now, just skip
the slice extension and prefix NAL units, and skip non-base view frames.
Signed-off-by: Xiaowei Li <xiaowei.a.li@intel.com>
[fixed memory leak, improved check for MVC NAL units]
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Factor out process by which the decoded picture with the lowest POC
is found, and possibly output. Likewise, the storage and marking of
a reference decoded, or non-reference decoded picture, into the DPB
could also be simplified as they mostly share the same operations.
Make init_picture_ref_lists() more consistent with other functions
related to the reference marking process by supplying the current
picture as argument.
Complement fix committed as e95a42e.
The H.264 AVC standard has to say: if the field is part of a reference
frame or a complementary reference field pair, and the other field of
the same reference frame or complementary reference field pair is also
marked as "used for long-term reference", the reference frame or
complementary reference field pair is also marked as "used for long-term
reference" and assigned LongTermFrameIdx equal to long_term_frame_idx.
This fixes decoding of MR9_BT_B in strict mode.
https://bugs.freedesktop.org/show_bug.cgi?id=64624https://bugzilla.gnome.org/show_bug.cgi?id=724518
Request the correct chroma format for decoding grayscale streams.
i.e. make lookups of the VA chroma format more generic, thus possibly
supporting more formats in the future.
This means that, if a VA driver doesn't support grayscale formats,
it is now going to fail. We cannot safely assume that maybe grayscale
was implemented on top of some YUV 4:2:0 with the chroma components
all set to 0x80.
Fix reference picture marking process with memory_management_control_op
set to 3 and 6, i.e. assign LongTermFrameIdx to a short-term reference
picture, or the current picture.
This fixes decoding of FRExt_MMCO4_Sony_B.
https://bugs.freedesktop.org/show_bug.cgi?id=64624https://bugzilla.gnome.org/show_bug.cgi?id=724518
[squashed, edited to use GST_VAAPI_PICTURE_IS_COMPLETE() macro]
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
The initialization of reference picture lists (8.2.4.2) applies to all
slices. So, the RefPicList0/1 lists need to be constructed prior to
each slice submission to the HW decoder.
This fixes decoding of video sequences where frames are encoded with
multiple slices of different types, e.g. 4 slices in this order I, P,
I, and P. More precisely, CABAST3_Sony_E and CABASTBR3_Sony_B.
https://bugzilla.gnome.org/show_bug.cgi?id=724518
When NAL units of type 13 (SPS extension) or type 19 (auxiliary slice)
are present in a video, decoders shall perform the (optional) decoding
process specified for these NAL units or shall ignore them (7.4.1).
Implement option 2 (skip) for now, as alpha composition is not
supported yet during the decoding process.
This fixes decoding of the primary coded video in alphaconformanceG.
https://bugzilla.gnome.org/show_bug.cgi?id=703928https://bugzilla.gnome.org/show_bug.cgi?id=728869https://bugzilla.gnome.org/show_bug.cgi?id=724518
[skip NAL units earlier, i.e. at parsing time]
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
When MVC slice NAL units (coded slice extension and prefix NAL) are
present, the number of NAL header bytes is 3, not 1 as usual.
Signed-off-by: Li Xiaowei <xiaowei.a.li@intel.com>
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
At the time the first VCL NAL unit of a primary coded picture is found,
and if that NAL unit was parsed to be an SPS or PPS, then the entries
in the parser may have been overriden. This means that, when the picture
is to be decoded, slice_hdr->pps could point to an invalid (the next)
PPS entry.
So, one way to solve this problem is to not use the parser PPS and
SPS info but rather maintain our own activation chain in the decoder.
https://bugzilla.gnome.org/show_bug.cgi?id=724519https://bugzilla.gnome.org/show_bug.cgi?id=724518
Retain the SEI messages that were parsed from the access unit until we
have completely decoded the current frame. This is done so that we can
peek at that data whenever necessary during decoding. e.g. for exposing
3D stereoscopic information at a later stage.
Fix support for grayscale encoded video clips, and possibly others if
the underlying driver supports the non-YUV 4:2:0 formats. i.e. defer
the decision that a surface with the desired chroma format is not
supported to the actual VA driver implementation.
https://bugzilla.gnome.org/show_bug.cgi?id=728144
The gst_h264_parse_parse_sei() function now returns an array of SEI
messages, instead of a single SEI message. Reason: it is allowed to
have several SEI messages packed into a single SEI NAL unit, instead
of multiple NAL units.
Fix parser and decoder state to sync at the right locations. This is
because we could reset the parser state, while the decoder state was
not copied yet, e.g. when parsing several NAL units from multiple frames
whereas the current frame was not decoded yet.
This is a regression brought in by commit 6fe5496.
Instal <gst/vaapi/gstvaapiutils_h264.h> header but only expose the
H.264 levels in there. The additional helper functions are meant
to be private for now.
Improve robustness when some expected packets where not received yet
or that were not correctly decoded. For example, don't try to decode
a picture if there was no valid frame headers parsed so far.
https://bugs.freedesktop.org/show_bug.cgi?id=57902
Conformance test Base_Ext_Main_profiles/BA3_SVA_C.264 complys with
extended profile specifications. However, the SPS header has the
constraint_set1_flag syntax element set to 1. This means that, if
a Main profile compliant decoder is available, then it should be
able to decode this stream.
This changes makes it possible to fall-back from Extended profile
to Main profile if constraint_set1_flag is set to 1.
https://bugzilla.gnome.org/show_bug.cgi?id=720190
Recognize streams marked as conforming to the "Constrained Baseline
Profile". If VA driver supports that as is, fine. Otherwise, fallback
to baseline, main or high profile.
Constrained Baseline Profile conveys coding tools that are common
to baseline profile and main profile.
https://bugzilla.gnome.org/show_bug.cgi?id=719947
[Added fallbacks to main and high profiles]
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
The GStreamer codecparser layer now parses the scaling lists in zigzag
scan order, as expected, so that to match the original bitstream layout
and specification. However, further convert the scaling lists into
raster scan order to fit the existing practice in most VA drivers.
https://bugzilla.gnome.org/show_bug.cgi?id=706406
- gst_vaapi_utils_h264_get_level():
Returns GstVaapiLevelH264 from H.264 level_idc value
- gst_vaapi_utils_h264_get_level_idc():
Returns H.264 level_idc value from GstVaapiLevelH264
- gst_vaapi_utils_h264_get_level_limits():
Returns level limits as specified in Table A-1 of the H.264 standard
- gst_vaapi_utils_h264_get_level_limits_table():
Returns the Table A-1 specification
* Profiles:
- gst_vaapi_utils_h264_get_profile():
Returns GstVaapiProfile from H.264 profile_idc value
- gst_vaapi_utils_h264_get_profile_idc():
Returns H.264 profile_idc value from GstVaapiProfile
* Chroma formats:
- gst_vaapi_utils_h264_get_chroma_type():
Returns GstVaapiChromaType from H.264 chroma_format_idc value
- gst_vaapi_utils_h264_get_chroma_format_idc():
Returns H.264 chroma_format_idc value from GstVaapiChromaType
If the encoded stream has the frame_cropping_flag set, then associate
the cropping rectangle to GstVaapiPicture.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Port GstVaapiDecoder and GstVaapiDecoder{MPEG2,MPEG4,JPEG,H264,VC1} to
GstVaapiMiniObject. Add gst_vaapi_decoder_set_codec_state_changed_func()
helper function to let the user add a callback to a function triggered
whenever the codec state (e.g. caps) changes.
Drop support for user-defined data since this capability was not used
so far and GstVaapiMiniObject represents the smallest reference counted
object type. Add missing GST_VAAPI_MINI_OBJECT_CLASS() helper macro.
Besides, since GstVaapiMiniObject is a libgstvaapi internal object, it
is also possible to further simplify the layout of the object. i.e. merge
GstVaapiMiniObjectBase into GstVaapiMiniObject.
This integrates support for GStreamer API >= 1.0 only in the libgstvaapi
core decoding library. The changes are kept rather minimal here so that
the library retains as little dependency as possible on core GStreamer
functionality.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Add a new GstVaapiDecoder::decode_codec_data() hook to actually decode
codec-data in the decoder sub-class. Provide a common shared helper
function to do the actual work and delegating further to the sub-class.