Port GstVaapiDecoder and GstVaapiDecoder{MPEG2,MPEG4,JPEG,H264,VC1} to
GstVaapiMiniObject. Add gst_vaapi_decoder_set_codec_state_changed_func()
helper function to let the user add a callback to a function triggered
whenever the codec state (e.g. caps) changes.
Drop support for user-defined data since this capability was not used
so far and GstVaapiMiniObject represents the smallest reference counted
object type. Add missing GST_VAAPI_MINI_OBJECT_CLASS() helper macro.
Besides, since GstVaapiMiniObject is a libgstvaapi internal object, it
is also possible to further simplify the layout of the object. i.e. merge
GstVaapiMiniObjectBase into GstVaapiMiniObject.
This integrates support for GStreamer API >= 1.0 only in the libgstvaapi
core decoding library. The changes are kept rather minimal here so that
the library retains as little dependency as possible on core GStreamer
functionality.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Add a new GstVaapiDecoder::decode_codec_data() hook to actually decode
codec-data in the decoder sub-class. Provide a common shared helper
function to do the actual work and delegating further to the sub-class.
Force luma_log2_weight_denom and chroma_log2_weight_denom to zero if
there is no pred_weight_table() that was parsed.
This is a workaround for the VA intel-driver on Ivy Bridge.
Allocate decoder unit earlier in the main parse() function and don't
delegate this task to derived classes. The ultimate purpose is to get
rid of dynamic allocation of decoder units.
The SPS, PPS and slice headers are not fully zero-initialized in the
codecparsers/ library. Rather, the standard upstream behaviour is to
initialize only certain syntax elements with some inferred values if
they are not present in the bitstream.
At the gstreamer-vaapi decoder level, we need to further initialize
certain syntax elements with some sensible default values so that to
not complicate VA drivers that just pass those verbatim to the HW,
and also avoid an memset() of the whole decoder unit.
Signed-off-by: Sreerenj Balachandran <sreerenj.balachandran@intel.com>
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Decoder units were zero-initialized, including the SPS/PPS/slice headers.
The latter don't require zero-initialization since the codecparsers/ lib
will do so for key variables already. This is not a great value per se but
at least it makes it possible to check whether the default initialization
decisions made in the codecparsers/ lib were right or not.
This can be reverted if this exposes too many issues.
Drop explicit initialization of most fields that are implicitly set to
zero. Drop helper macros for casting to GstVaapiPictureH264 or
GstVaapiFrameStore. Also remove some useless checks for NULL pointers.
Implement GstVaapiDecoder.start_frame() and end_frame() semantics so
that to create new VA context earlier and submit VA pictures to the
HW for decoding as soon as possible. i.e. don't wait for the next
frame to start decoding the previous one.
Introduce new GstVaapiDecoderUnitH264 object, which holds the standard
NAL unit header (GstH264NalUnit) and additional parsed header info.
Besides, we now parse headers as early as in the _parse() function so
that to avoid un-necessary creation of sub-buffers in _decode() for
NAL units that are not slices.
This is a performance win by ~+1.1% only.
Fix decode_slice() to ensure a VA context exists prior to creating a
new GstVaapiSliceH264, which invokes vaCreateBuffer() with some VA
context ID. i.e. the latter was not initialized, thus causing failures
on Cedar Trail for example.
The picture size signalled by sps->{width,height} is the actual size with
cropping applied, not the original size derived from pic_width_in_mbs_minus1
and pic_height_in_map_units_minus1. VA driver expects that original size,
uncropped.
There is another issue pending: frame cropping information needs to be
taken care of.
git am got confused somehow, though the end result doesn't change at
all since we require both SPS and PPS to be parsed prior to decoding
the first slice.
Only start decoding slices when at least one SPS and PPS got activated.
This fixes cases when a source represents a substream of another stream
and no SPS and PPS was inserted before the first slice of the generated
substream.
... for interlaced streams. The short_ref[] and long_ref[] arrays may
contain up to 32 fields but VA ReferenceFrames[] array expects up to
16 reference frames, thus including both fields.
Fix decoding of interlaced streams when adaptive_ref_pic_marking_mode_flag
is equal to 1, i.e. when memory management control operations are used. In
particular, when field_pic_flag is set to 0, the new reference flags shall
be applied to both fields.
Decoded frames are only output when they are complete, i.e. when both
fields are decoded. This also means that the "interlaced" caps is not
propagated to vaapipostproc or vaapisink elements. Another limitation
is that interlaced bitstreams with MMCO are unlikely to work.
Split remove_reference_at() into a function that actually removes the
specified entry from the short-term or long-term reference picture array,
and a function that sets reference flags to the desired value, possibly
zero. The latters marks the picture as "unused for reference".
Introduce new `structure' field to the H.264 specific picture structure
so that to simplify the reference picture marking process. That local
picture structure is derived from the original picture structure, as
defined by the syntax elements field_pic_flag and bottom_field_flag.
Move DPB flush up if the current picture to decode is an IDR. Besides,
don't bother to check for IDR pictures in dpb_add() function since an
explicit DPB flush was already performed in this case.
... to build the short_ref[] and long_ref[] lists from the DPB, instead
of maintaining them separately. This avoids refs/unrefs while making it
possible to generate the list based on the actual picture structure.
This also ensures that the list of generated ReferenceFrames[] actually
matches what reference frames are available in the DPB. i.e. short_ref[]
and long_ref[] entries are implied from the DPB, so there is no risk of
having "dangling" references.
Use the POC member available in the GstVaapiPicture base class and
get rid of the dependency on the local VAPictureH264 TopFieldOrderCnt
and BottomFieldOrderCnt. Rather, use a simple field_poc[] array
initialized to INT_MAX, so that to simplify picture POC calculation
for non frame pictures.
Further get rid of GstVaapiPictureH264-local VAPictureH264.flags for
reference bits, thus simplifying the reference picture marking process
to only track a single set of reference flags. Also introduce a new
long_term_frame_idx member.
Add vaapi_fill_picture() helper function to convert GstVaapiPictureH264
to VAPictureH264 structure. This is preparatory work to get rid of the
local VAPictureH264 member in GstVaapiPictureH264.
Delay ensure_context() until we actually need a VA context for allocating
new VA surfaces, and then GstVaapiPictures, but also when a real activation
of a new picture parameter set occurs, thus also implying an activation
of the related sequence parameter set.
The most important thing was to drop the global pps and sps pointers since
they may not have matched the currently activated picture parameter or
sequence parameter sets at the specified decode point.
Anoter positive side-effect is that this cleans up all occurrences of
decode_current_picture() to only keep those useful in decode_picture(),
before a new picture is allocated, or in decode_sequence_end() when
an end-of-stream or end-of-sequence condition occurred.
... aka fix regression from efaab79. In particular, ScalingList8x8[]
array was partially copied to the VAIQMatrixBufferH264. While we are
at it, also improve bounds checking and avoid copying 8x8 scaling
lists if transform_8x8_mode_flag is set to 0.
Don't copy scaling lists twice to an intermediate state. Rather, directly
use the scaling lists from GstH264PPS since they would match those provided
by SPS header, if necessary. i.e. if PPS-specific scaling lists are not
available in the bitstream.
Remove exit_picture() and exit_picture_poc() since PicOrderCnt(CurrPic)
is now updated accordingly to the standard. Besides, MMCO = 5 specific
operations are moved up to exec_ref_pic_marking_adaptive_mmco_5().
Fix adaptive memory control decoded reference picture marking process
implementation for operations 2 to 6, thus also fixing support for
long-term reference pictures.
This change only splits each individual MMCO handler into several functions
dedicated for each operation. This is needed to perform further work later
on.
Allow build with strict DPB ordering mode whereby evicted entries
are replaced by the next entries, in order instead of optimizing
it away with the last entry in the DPB.
This is only useful for debugging purpose, against a reference SW
decoder for example.