Split the input buffer data into decoder units that represent a JPEG
segment. Handle scan decoder unit specifically so that it can include
both the scan header (SOS) but also any other ECS or RSTi segment.
That way, we parse the input buffer stream only once at the gst-vaapi
level instead of (i) in gst_vaapi_decoder_jpeg_parse() to split the
stream into frames SOI .. EOI and (ii) in decode_buffer() to further
determine segment boundaries and decode them.
In practice, this is a +15 to +25% performance improvement.
Port GstVaapiDecoder and GstVaapiDecoder{MPEG2,MPEG4,JPEG,H264,VC1} to
GstVaapiMiniObject. Add gst_vaapi_decoder_set_codec_state_changed_func()
helper function to let the user add a callback to a function triggered
whenever the codec state (e.g. caps) changes.
Add gst_vaapi_decoder_get_frame_with_timeout() helper function that will
wait for a frame to be decoded, until the specified timeout in microseconds,
prior to returning to the caller.
This is a fix to performance regression from 851cc0, whereby the vaapidecode
loop executed on the srcpad task was called to often, thus starving all CPU
resources.
Rename GstVaapiDecoderFrame to GstVaapiParserFrame because this data
structure was only useful to parsing and a proper GstvaapiDecoderFrame
instance will be created instead.
Add a new GstVaapiDecoder::decode_codec_data() hook to actually decode
codec-data in the decoder sub-class. Provide a common shared helper
function to do the actual work and delegating further to the sub-class.
If the decoder was not able to decode a frame because insufficient
information was available, e.g. missing sequence or picture header,
then allow the frame to be gracefully dropped without generating
any error.
It is also possible that a frame is not meant to be displayed but
only used as a reference, so dropping that frame is also a valid
operation since GstVideoDecoder base class has extra references to
that GstVideoCodecFrame that needs to be released.
Use a GArray to hold decoder units in a frame, instead of a single-linked
list. This makes 'append' calls faster, but not that much. At least, this
makes things clearer.
Maintain decoded surfaces as GstVideoCodecFrame objects instead of
GstVaapiSurfaceProxy objects. The latter will tend to be reduced to
the strict minimum: a context and a surface.
Use standard GstVideoCodecState throughout GstVaapiDecoder and expose
it with a new gst_vaapi_decoder_get_codec_state() function. This makes
it possible to drop picture size (width, height) information, framerate
(fps_n, fps_d) information, pixel aspect ratio (par_n, par_d) information,
and interlace mode (is_interlaced field).
This is a new API with backwards compatibility maintained. In particular,
gst_vaapi_decoder_get_caps() is still available.
Signed-off-by: Sreerenj Balachandran <sreerenj.balachandran@intel.com>
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Split decoding process into two steps: (i) parse incoming bitstreams
into simple decoder-units until the frame or field is complete; and
(ii) decode the whole frame or field at once.
This is an ABI change.
Drop obsolete gst_vaapi_decoder_push_surface() that was no longer used.
Change gst_vaapi_decoder_push_surface_proxy() semantics to assume PTS
is already set correctly and reference count increased, if necessary.
The new API simplifies a lot reference counting and makes it more
flexible for future additions/changes. The GstVaapiCodecInfo is
also gone. Rather, new helper macros are provided to allocate
picture, slice and quantization matrix parameter buffers.
Under some circumstances, we could have leaked a surface, thus not
releasing it to the pool of available surfaces in the VA context.
The strategy is now to use a proxy earlier and automatically ref/unref
whenever necessary. In particular, during the lifetime needed for FFmpeg.