Implement GstVaapiDecoder.start_frame() and end_frame() semantics so
that to create new VA context earlier and submit VA pictures to the
HW for decoding as soon as possible. i.e. don't wait for the next
frame to start decoding the previous one.
Parse slice() header and first macroblock position earlier in _parse()
function instead of waiting for the _decode() stage. This doesn't change
anything but readability.
Introduce new GstVaapiDecoderUnitMpeg2 object, which holds the standard
GstMpegVideoPacket and additional parsed header info. Besides, we now
parse as early as in the _parse() function so that to avoid un-necessary
creation of sub-buffers in _decode() for video packets that are not slices.
Theory of operations: all units marked as "slice" are moved to the "units"
list. Since this list only contains slice data units, the prev_slice pointer
was removed. Besides, we now maintain two extra lists of units to be decoded
before or after slice data units.
In particular, all units in the "pre_units" list will be decoded before
GstVaapiDecoder::start_frame() is called and units in the "post_units"
list will be decoded after GstVaapiDecoder::end_frame() is called.
Make GstVaapiSurfaceProxy only a thin wrapper around a VA context and a
VA surface. i.e. drop any other attribute like timestamp, duration,
interlaced or top-field-first.
Maintain decoded surfaces as GstVideoCodecFrame objects instead of
GstVaapiSurfaceProxy objects. The latter will tend to be reduced to
the strict minimum: a context and a surface.
Add new gst_vaapi_decoder_get_frame() function meant to be used with
gst_vaapi_decoder_decode(). The purpose is to return the next decoded
frame as a GstVideoCodecFrame and the associated GstVaapiSurfaceProxy
as the user-data object.
Decoder units were zero-initialized, including the SPS/PPS/slice headers.
The latter don't require zero-initialization since the codecparsers/ lib
will do so for key variables already. This is not a great value per se but
at least it makes it possible to check whether the default initialization
decisions made in the codecparsers/ lib were right or not.
This can be reverted if this exposes too many issues.
Drop explicit initialization of most fields that are implicitly set to
zero. Drop helper macros for casting to GstVaapiPictureH264 or
GstVaapiFrameStore. Also remove some useless checks for NULL pointers.
Implement GstVaapiDecoder.start_frame() and end_frame() semantics so
that to create new VA context earlier and submit VA pictures to the
HW for decoding as soon as possible. i.e. don't wait for the next
frame to start decoding the previous one.
Introduce new GstVaapiDecoderUnitH264 object, which holds the standard
NAL unit header (GstH264NalUnit) and additional parsed header info.
Besides, we now parse headers as early as in the _parse() function so
that to avoid un-necessary creation of sub-buffers in _decode() for
NAL units that are not slices.
This is a performance win by ~+1.1% only.
Use standard GstVideoCodecState throughout GstVaapiDecoder and expose
it with a new gst_vaapi_decoder_get_codec_state() function. This makes
it possible to drop picture size (width, height) information, framerate
(fps_n, fps_d) information, pixel aspect ratio (par_n, par_d) information,
and interlace mode (is_interlaced field).
This is a new API with backwards compatibility maintained. In particular,
gst_vaapi_decoder_get_caps() is still available.
Signed-off-by: Sreerenj Balachandran <sreerenj.balachandran@intel.com>
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Align gst_vaapi_decoder_get_surface() semantics with the rest of the
API. That is, return a GstVaapiDecoderStatus and the decoded surface
as a handle to GstVaapiSurfaceProxy in parameter.
This is an API/ABI change.
Introduce new decoding process whereby a GstVideoCodecFrame is created
first. Next, input stream buffers are accumulated into a GstAdapter,
that is then passed to the _parse() function. The GstVaapiDecoder object
accumulates all parsed units and when a complete frame or field is
detected, that GstVideoCodecFrame is passed to the _decode() function.
Ultimately, the caller receives a GstVaapiSurfaceProxy if decoding
process was successful.
The start_frame() hook is called prior to traversing all decode-units
for decoding. The unit argument represents the first slice in the frame.
Some codecs (e.g. H.264) need to wait for the first slice in order to
determine the actual VA context parameters.
Split decoding process into two steps: (i) parse incoming bitstreams
into simple decoder-units until the frame or field is complete; and
(ii) decode the whole frame or field at once.
This is an ABI change.
Introduce a new GstVaapiDecoderFrame that is just a list of decoder units
(GstVaapiDecoderUnit objects) that constitute a frame. This object is just
an extension to GstVideoCodecFrame for VA decoder purposes. It is available
as the user-data member element.
This is a libgstvaapi internal object.
Introduce GstVaapiDecoderUnit which represents a fragment of the source
stream to be decoded. For instance, a decode-unit will be a NAL unit for
H.264 streams, an EBDU for VC-1 streams, and a video packet for MPEG-2
streams.
This is a libgstvaapi internal object.
GstVaapiSurfaceProxy does not use any particular functionality from
GObject. Actually, it only needs a basic object type with reference
counting.
This is an API and ABI change.
Introduce a new reference counted object that is very lightweight and
also provides flags and user-data functionalities. Initialization and
finalization times are reduced by up to a factor 5x vs GstMiniObject
from GStreamer 0.10 stack.
This is a libgstvaapi internal object.
Fix decode_slice() to ensure a VA context exists prior to creating a
new GstVaapiSliceH264, which invokes vaCreateBuffer() with some VA
context ID. i.e. the latter was not initialized, thus causing failures
on Cedar Trail for example.
Use glib >= 2.32 semantics for GMutex and GRecMutex wrt. initialization
and termination. Basically, the new mutex objects can be used as static
mutex objects from the deprecated APIs, e.g. GStaticMutex and GStaticRecMutex.
This fixes symbol clashes between the gst-vaapi built-in codecparsers/
library and the system-provided one, mainly used by videoparses/. Now,
only symbols with the gst_vaapi_* prefix will be exported, if they are
not marked as "hidden" to libgstvaapi.
Fix gst_vaapi_image_map() to return TRUE and the GstVaapiImageRaw
structure correctly filled in if the image was already mapped.
Likewise, make gst_vaapi_image_unmap() return TRUE if the image
was already unmapped.
Fix reference leak of surface and image in GstVaapiVideoBuffer wrapper,
thus resulting on actual memory leak of GstVaapiImage when using them
for downloads/uploads from VA surfaces and more specifically surfaces
when the pipeline is shutdown. i.e. vaTerminate() was never called
because the resources were not unreferenced, and thus not deallocated
in the end.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
The picture size signalled by sps->{width,height} is the actual size with
cropping applied, not the original size derived from pic_width_in_mbs_minus1
and pic_height_in_map_units_minus1. VA driver expects that original size,
uncropped.
There is another issue pending: frame cropping information needs to be
taken care of.
Invoke gst_mpeg_video_finalise_mpeg2_sequence_header() to get the
correct PAR values. While doing so, require a newer version of the
bitstream parser library.
Note: it may be necessary to also parse the Sequence_Display_Extension()
header.
Signed-off-by: Sreerenj Balachandran <sreerenj.balachandran@intel.com>
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
This patch updates to relect the 1.0 version of the protocol. The main
changes are the switch to wl_registry for global object notifications
and the way that the event queue and file descriptor is processed.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
git am got confused somehow, though the end result doesn't change at
all since we require both SPS and PPS to be parsed prior to decoding
the first slice.
Only start decoding slices when at least one SPS and PPS got activated.
This fixes cases when a source represents a substream of another stream
and no SPS and PPS was inserted before the first slice of the generated
substream.
... for interlaced streams. The short_ref[] and long_ref[] arrays may
contain up to 32 fields but VA ReferenceFrames[] array expects up to
16 reference frames, thus including both fields.
Fix decoding of interlaced streams when adaptive_ref_pic_marking_mode_flag
is equal to 1, i.e. when memory management control operations are used. In
particular, when field_pic_flag is set to 0, the new reference flags shall
be applied to both fields.
Decoded frames are only output when they are complete, i.e. when both
fields are decoded. This also means that the "interlaced" caps is not
propagated to vaapipostproc or vaapisink elements. Another limitation
is that interlaced bitstreams with MMCO are unlikely to work.