Fix the compaction process when the DPB is cleared for a specific
view, i.e. fix the process of filling in the holes resulting from
removing frame buffers matching the current picture.
It is not necessary to periodically send SPS or subset SPS headers.
It is up to the upper layer (e.g. transport layer) to decide whether
and how to periodically submit those. For now, only generate new SPS
or subset SPS headers when the codec config changes.
Note: the upper layer could readily determine the config headers
(SPS/PPS) through the gst_vaapi_encoder_h264_get_codec_data() function.
https://bugzilla.gnome.org/show_bug.cgi?id=732083
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Report sample aspect ratio (SAR) as present, and make it match what
we have obtained from the user as pixel-aspect-ratio (PAR), i.e. the
VUI parameter aspect_ratio_info_present_flag now defaults to TRUE.
Set the value of num_anchor_refs_l0, num_anchor_refs_l1, num_non_anchor_refs_l0,
and num_non_anchor_refs_l1 to zero since the inter-view prediction is not yet
supported.
When the seq_parameter_set_data() syntax structure is present in a subset
sequence parameter set and vui_parameters_present_flag is equal to 1, then
timing_info_present_flag shall be equal to 0 (H.7.4.2.1.1).
Submit Prefix NAL headers (nal_unit_type = 14) before every packed
slice header (nal_unit_type = 1 or 5) only for the base view. In non
base views, a Coded Slice Extension NAL header (nal_unit_type = 20)
is required, with an appropriate nal_unit_header_mvc_extension() in
the NAL header bytes.
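The rule above can be sketched as follows. This is only an illustration:
slice_nal_types() and the enum names are hypothetical and are not taken
from the encoder; only the nal_unit_type values come from the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* nal_unit_type values quoted in the commit message. */
enum {
    NAL_SLICE     = 1,   /* coded slice of a non-IDR picture */
    NAL_SLICE_IDR = 5,   /* coded slice of an IDR picture */
    NAL_PREFIX    = 14,  /* Prefix NAL unit */
    NAL_SLICE_EXT = 20   /* Coded Slice Extension */
};

/* Hypothetical helper: fills `types` with the NAL unit types to submit
 * for one slice, in submission order, and returns the number of entries
 * written. View index 0 is assumed to be the base view. */
static size_t
slice_nal_types(unsigned view_idx, bool is_idr, int types[2])
{
    assert(types != NULL);

    if (view_idx == 0) {
        /* Base view: Prefix NAL header first, then the regular slice. */
        types[0] = NAL_PREFIX;
        types[1] = is_idr ? NAL_SLICE_IDR : NAL_SLICE;
        return 2;
    }

    /* Non-base view: a single Coded Slice Extension NAL header. */
    types[0] = NAL_SLICE_EXT;
    return 1;
}
```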
https://bugzilla.gnome.org/show_bug.cgi?id=732083
Fix search for a picture in the DPB that has a lower POC value than
the current picture. The dpb_find_lowest_poc() function will return
a picture with the lowest POC in DPB and that is marked as "needed
for output", but an additional check against the actual POC value
of the current picture is needed.
This is a regression from 1c46990.
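A minimal sketch of the intended search, assuming a simplified Picture
type. The names find_lowest_poc() and find_lower_poc_than() are
hypothetical stand-ins and do not mirror the decoder's actual
dpb_find_lowest_poc() signature.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    int  poc;            /* picture order count */
    bool output_needed;  /* marked as "needed for output" */
} Picture;

/* Returns the index of the picture with the lowest POC among those
 * still needed for output, or -1 if there is none. */
static int
find_lowest_poc(const Picture *dpb, size_t n)
{
    int found = -1;

    assert(dpb != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!dpb[i].output_needed)
            continue;
        if (found < 0 || dpb[i].poc < dpb[found].poc)
            found = (int) i;
    }
    return found;
}

/* Same search, plus the additional check described above: the
 * candidate only qualifies if its POC is strictly lower than the POC
 * of the current picture. */
static int
find_lower_poc_than(const Picture *dpb, size_t n, int current_poc)
{
    int idx = find_lowest_poc(dpb, n);

    if (idx >= 0 && dpb[idx].poc >= current_poc)
        return -1;
    return idx;
}
```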
https://bugzilla.gnome.org/show_bug.cgi?id=732130
Fix dpb_clear() to clear previous frame buffers only if they actually
exist to begin with. If the decoder bailed out early, e.g. when it
does not support a specific profile, that array of previous frames
might not be allocated beforehand.
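The guard amounts to something like the sketch below. The Decoder
layout and clear_prev_frames() are hypothetical and only illustrate
the NULL check; they are not the decoder's real data structures.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    void  **prev_frames;      /* NULL if the decoder bailed out early */
    size_t  prev_frames_len;
} Decoder;

/* Hypothetical helper: clears the previous frame slots and returns how
 * many were actually in use. Does nothing if the array was never
 * allocated. */
static size_t
clear_prev_frames(Decoder *dec)
{
    size_t cleared = 0;

    assert(dec != NULL);
    if (dec->prev_frames == NULL)
        return 0;

    for (size_t i = 0; i < dec->prev_frames_len; i++) {
        if (dec->prev_frames[i] != NULL) {
            dec->prev_frames[i] = NULL;
            cleared++;
        }
    }
    return cleared;
}
```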
We can avoid scanning for start codes again if the bitstream is fed
in NALU chunks. Currently, we always scan for start codes, and keep
track of remaining bits in a GstAdapter, even if, in practice, we
are likely receiving one GstBuffer per NAL unit, i.e. h264parse with
"nal" alignment.
https://bugzilla.gnome.org/show_bug.cgi?id=723284
[use gst_adapter_available_fast() to determine the top buffer size]
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
The `vaapipostproc' element could never determine if the H.264 stream
was interlaced, and thus always assumed it to be progressive. Fix the
H.264 decoder to report interlace-mode accordingly, thus allowing the
vaapipostproc element to automatically enable deinterlacing.
The packed slice header and packed raw data need to be paired with
the submission of VAEncSliceHeaderParameterBuffer. So handle them
on a per-slice basis instead of a per-picture basis.
[removed useless initializer]
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Factor out the removal process of unused inter-view only reference
pictures from the DPB, prior to the possible insertion of the current
picture.
Ideally, the compiler could still opt for generating two loops. But
at least, the code is now clearer for maintenance.
Improve process for the removal of pictures from DPB before possible
insertion of the current picture (C.4.4) for H.264 MVC inter-view only
reference components. In particular, handle cases where the picture to
be inserted is not the last one of the access unit, and where it was
already output and is no longer marked as used for reference, including
for decoding next view components within the same access unit.
While invoking the DPB bumping process in presence of many views,
it could be necessary to output previous pictures that are ready,
as a whole, i.e. emitting all view components from the very first
view order index zero to the very last one in its original access
unit, and not starting from the view order index of the picture
that caused the DPB bumping process to be invoked.
As a reminder, the maximum number of frames in DPB for MultiView
High profile with more than 2 views is not necessarily a multiple
of the number of views.
This fixes decoding of MVCNV-4.264.
Let the utility layer handle dynamic growth of the inter-view pictures
array. By definition, setting a new size on the array does grow it,
but it also fills the newly created elements with empty entries
(NULL), thus also increasing the reported length, which is not
correct.
When decoding Multiview High profile streams with a large number of
views, it is not possible to make the
VAPictureParameterBufferH264.ReferenceFrames[] array hold the complete
DPB, with all possibly active pictures to be used for inter-view
prediction in the current access unit.
So reduce the scope of the ReferenceFrames[] array to only include
the set of reference pictures that are going to be used for decoding
the current picture. Basically, this is the union of all RefPicListX[]
arrays, for all slices constituting the decoded picture.
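The union can be sketched as below, assuming pictures are identified
by plain integer ids. merge_ref_list() is a hypothetical helper, not
the decoder's code; the limit of 16 matches the size of the
ReferenceFrames[] array in VA-API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_REFERENCE_FRAMES 16  /* entries in ReferenceFrames[] */

/* Hypothetical helper: appends every picture id from `list` that is
 * not yet present in `refs`. Returns false if `refs` would overflow. */
static bool
merge_ref_list(int refs[], size_t *n_refs, const int list[], size_t n_list)
{
    assert(refs != NULL && n_refs != NULL);

    for (size_t i = 0; i < n_list; i++) {
        bool seen = false;

        for (size_t j = 0; j < *n_refs; j++) {
            if (refs[j] == list[i]) {
                seen = true;
                break;
            }
        }
        if (seen)
            continue;
        if (*n_refs == MAX_REFERENCE_FRAMES)
            return false;
        refs[(*n_refs)++] = list[i];
    }
    return true;
}
```

Calling it once per RefPicList0/RefPicList1 of every slice leaves in
`refs` only the pictures actually referenced by the current picture.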
The inter-view reference components and inter-view only reference
components that are included in the reference picture lists shall
be considered as not being marked as "used for short-term reference"
or "used for long-term reference". This means that reference flags
should all be removed from VAPictureH264.flags.
This fixes decoding of MVCNV-2.264.
If the VA driver exposes ad-hoc H.264 MVC profiles, then we have to
be careful to detect profile changes and not reset the underlying
VA context erroneously. In MVC situations, we could indeed get a
profile_idc change for every SPS that gets activated, alternately
(base-view -> non-base view -> base-view, etc.).
An improved fix would be to characterize the exact profile to use
once and for all when SPS NAL units are parsed. This would also
allow for fallbacks to a base-view decoding only mode.
Exclusively use VA drivers that support raw packed headers for encoding,
i.e. simply submit Subset SPS and Prefix NAL units as packed headers. This
provides for better compatibility across the various VA drivers and HW
generations since no particular API is needed beyond what readily exists.
Since we are encoding each view independently from each other, we
need a higher number of pre-allocated surfaces to be used as the
reconstructed frames. For Stereo High profile encoding, this
effectively doubles the number of frames to be stored in the DPB.
Add initial support for Subset SPS, Prefix NAL and Slice Extension NAL
for non-base-view streams encoding, and the usual SPS, PPS and Slice
NALs for base-view encoding.
The H.264 Stereo High profile encoding mode will be turned on when the
"num-views" parameter is set to 2. The source (raw) YUV frames will be
considered as Left/Right view, alternately.
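As an illustration of the alternating assignment, and assuming that
the first submitted frame is the Left view and maps to view index 0
(an assumption, as are both helper names), the mapping could look like:

```c
#include <assert.h>

/* Hypothetical helper: maps the n-th submitted raw frame to a view
 * index, assuming frames arrive as Left, Right, Left, Right, ... */
static unsigned
frame_to_view_index(unsigned long frame_num, unsigned num_views)
{
    assert(num_views > 0);
    return (unsigned) (frame_num % num_views);
}

/* Hypothetical helper: position of the frame within its own view,
 * i.e. the count a per-view reordering pool would work with. */
static unsigned long
frame_to_view_frame_num(unsigned long frame_num, unsigned num_views)
{
    assert(num_views > 0);
    return frame_num / num_views;
}
```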
Each of the two views has its own frames reordering pool and reference
frames list management system. Inter-view references are not supported
yet, so the views are encoded independently from each other.
Signed-off-by: Li Xiaowei <xiaowei.a.li@intel.com>
[limited to Stereo High profile per the definition of MAX_NUM_VIEWS]
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Create structures to maintain the reference frames list (RefPool) and
frames reordering (ReorderPool) logic.
This is a prerequisite for H.264 MVC support.
Signed-off-by: Li Xiaowei <xiaowei.a.li@intel.com>
Add provisions to write subset SPS headers to the bitstream with a
view to supporting the H.264 MVC specification.
This assumes the libva "staging" branch is in use.
Signed-off-by: Li Xiaowei <xiaowei.a.li@intel.com>
Optimize lookups of view ids / view order indices by caching the result
of the calculations right into the GstVaapiParserInfoH264 struct. This
greatly simplifies the is_new_access_unit() and find_first_field()
functions.
Add safe fallbacks for MVC profiles:
- all MultiView High profile streams with 2 views at most can be decoded
with a Stereo High profile compliant decoder;
- all Stereo High profile streams with only progressive views can be
decoded with a MultiView High profile compliant decoder;
- all drivers that support slice-level decoding could normally support
MVC profiles when the DPB holds at most 16 frames.
To obtain a more strictly conforming implementation, we need to
carefully detect access unit boundaries. Additional operations may need
to be performed at those boundaries.
Detect the first VCL NAL unit of a picture for MVC, based on the
view_id as per H.7.4.1.2.4. Note that we only need to detect new
view components.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Always cache the previous NAL unit so that we could check whether
there is a Prefix NAL unit immediately preceding the current slice
or IDR NAL unit. In that case, the NAL unit metadata is copied into
the current NAL unit. Otherwise, some default values are tentatively
inferred, e.g. view_id shall be set to 0 and inter_view_flag to 1.
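A minimal sketch of that inheritance step, assuming a reduced NalInfo
record. The struct and inherit_mvc_info() are hypothetical; only the
Prefix NAL type (14) and the inferred defaults come from the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NAL_PREFIX 14  /* Prefix NAL unit */

typedef struct {
    int      nal_unit_type;
    unsigned view_id;
    bool     inter_view_flag;
} NalInfo;

/* Hypothetical helper: fills in the MVC metadata of the current slice
 * or IDR NAL unit from the cached previous NAL unit (`prev` may be
 * NULL if nothing was cached yet). */
static void
inherit_mvc_info(NalInfo *cur, const NalInfo *prev)
{
    assert(cur != NULL);

    if (prev != NULL && prev->nal_unit_type == NAL_PREFIX) {
        /* A Prefix NAL unit immediately precedes: copy its metadata. */
        cur->view_id = prev->view_id;
        cur->inter_view_flag = prev->inter_view_flag;
    } else {
        /* No Prefix NAL unit: infer the defaults. */
        cur->view_id = 0;
        cur->inter_view_flag = true;
    }
}
```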
[infer default values for slice if previous NAL was not a Prefix]
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Allow decoding for base views of MVC encoded streams. For now, just skip
the slice extension and prefix NAL units, and skip non-base view frames.
Signed-off-by: Xiaowei Li <xiaowei.a.li@intel.com>
[fixed memory leak, improved check for MVC NAL units]
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Factor out the process by which the decoded picture with the lowest POC
is found, and possibly output. Likewise, the storage and marking of
a decoded reference or non-reference picture into the DPB could also
be simplified, as both mostly share the same operations.
Make init_picture_ref_lists() more consistent with other functions
related to the reference marking process by supplying the current
picture as argument.
Add gst_vaapi_display_get_vendor_string() helper function to query
the underlying VA driver name. The display object owns the resulting
string, so it shall not be deallocated.
That function is thread-safe. It could be used for debugging purposes,
for instance.
Make sure to initialize one GstVaapiDisplay at a time, even in threaded
environments. This makes sure the display cache is also consistent
during the whole display creation process. In the former implementation,
there was a risk that the display cache got updated in another thread.
Add support for dynamic growth of the VA surfaces pool. For decoding,
this implies the recreation of the underlying VA context, as per the
requirement from VA-API. Besides, only increases are supported, not
shrinks.