Reset, i.e. destroy then create, the decoder in the _setcaps() handler only
if the underlying codec type actually changed. This makes it possible
to be more tolerant of certain MPEG-2 streams that get parsed to
form caps that are compatible with the previous state but carry minor
changes to "codec-data".
Make it possible to specify the maximum number of references to use within
a single VA context. This helps reduce GPU memory allocations to the number
of references actually needed.
Forward declaring enums is not allowed by the C standard and aborts
compilation if the header file is included in a C++ project.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Some VA drivers (e.g. EMGD) can have completely random values for initial
display attributes. So, improve the discovery process by checking that the
initial display attribute values actually fall within valid bounds. If not,
try to reset them to some sensible value, like the default value reported
through vaQueryDisplayAttributes().
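
The bounds check described above can be sketched as follows. This is only
an illustration: DisplayAttr is a hypothetical stand-in for the min_value,
max_value and value members of libva's VADisplayAttribute, and
sanitize_display_attr() and its default_value argument are assumed names,
not the actual gstreamer-vaapi code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of the VADisplayAttribute members used here. */
typedef struct {
    int min_value;
    int max_value;
    int value;
} DisplayAttr;

/* Resets an out-of-range attribute to default_value.
 * Returns true if the attribute had to be reset. */
static bool
sanitize_display_attr(DisplayAttr *attr, int default_value)
{
    assert(attr != NULL);

    if (attr->value >= attr->min_value && attr->value <= attr->max_value)
        return false;           /* driver-reported value is plausible */

    attr->value = default_value;
    return true;
}
```

A caller would then push the corrected value back to the driver only when
the function returns true.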
Use g_object_class_install_properties() to install GstVaapiDisplay properties.
It is useful to maintain properties as GParamSpec so that "notify" signals
can be raised by id instead of by name in the future.
A rendering mode can be "overlay" or "texture" (i.e. textured blit).
The former mode implies that a VA surface used for rendering can't be
re-used right away for decoding, so the sink shall make provisions to
retain the associated surface proxy until the next surface is to be
displayed.
The latter mode implies that the VA surface is implicitly copied to an
intermediate backing store, or back buffer of a frame buffer, so the
associated surface proxy can be disposed right away.
The VA display attributes are mapped to properties in order to keep to the
GStreamer terminology. Properties are to be identified by name, but internal
functions are available to look up the property by the actual VA display
attribute type.
decode_current_picture() was converted to return a gboolean instead
of a GstVaapiDecoderStatus, so we were not getting out of the decode
loop as expected, or an error could be raised instead.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Integrate the start code prefix in the slice data buffer that is submitted
to the hardware. VA-API specifies that slice_data_offset is the offset to
the first byte of slice data. And, for MPEG-2, slice() data begins with
the slice_start_code. Some VA driver implementations (EMGD) expect this.
Use g_object_notify_by_pspec() instead of g_object_notify() to avoid
a property name lookup, i.e. this makes notifications to the
`vaapidecode' element faster.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Two elements in the luminance quantization table were wrong. So,
gst_jpeg_get_default_quantization_tables() now reconstructs tables
in zig-zag order from the standard ones (Tables K.1 and K.2).
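
As an illustration of the reconstruction, the sketch below reorders the
luminance table (Table K.1) into zig-zag order. The names zigzag_index,
k1_luminance and to_zigzag() are chosen for this example only and are not
the codecparsers API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* zigzag_index[i] is the natural (row-major) position of the i-th
 * coefficient in zig-zag scan order. */
static const uint8_t zigzag_index[64] = {
     0,  1,  8, 16,  9,  2,  3, 10,
    17, 24, 32, 25, 18, 11,  4,  5,
    12, 19, 26, 33, 40, 48, 41, 34,
    27, 20, 13,  6,  7, 14, 21, 28,
    35, 42, 49, 56, 57, 50, 43, 36,
    29, 22, 15, 23, 30, 37, 44, 51,
    58, 59, 52, 45, 38, 31, 39, 46,
    53, 60, 61, 54, 47, 55, 62, 63
};

/* Table K.1 (luminance quantization), in natural order. */
static const uint8_t k1_luminance[64] = {
    16, 11, 10, 16,  24,  40,  51,  61,
    12, 12, 14, 19,  26,  58,  60,  55,
    14, 13, 16, 24,  40,  57,  69,  56,
    14, 17, 22, 29,  51,  87,  80,  62,
    18, 22, 37, 56,  68, 109, 103,  77,
    24, 35, 55, 64,  81, 104, 113,  92,
    49, 64, 78, 87, 103, 121, 120, 101,
    72, 92, 95, 98, 112, 100, 103,  99
};

/* Copies a natural-order table into zig-zag order. */
static void
to_zigzag(const uint8_t natural[64], uint8_t out[64])
{
    assert(natural != NULL && out != NULL);

    for (size_t i = 0; i < 64; i++)
        out[i] = natural[zigzag_index[i]];
}
```

Deriving the zig-zag table from the standard one at run time means only a
single copy of each table has to be typed in and kept correct.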
... instead of having them pre-calculated. This saves around 1.5 KB
of data in the DSO but requires gst_jpeg_get_default_huffman_tables()
to do more work. However, the client application needs to call that
function at most once.
Move display types from gstvaapipluginutil.* to gstvaapidisplay.* so that
we can simplify the characterization of a GstVaapiDisplay. Also rename the
"auto" type to "any", and add a "display-type" attribute.
This improves display name comparisons by always allocating a valid display
name. This also helps to disambiguate lookups by name in the global display
cache, should a new backend be implemented.
The video buffer creation routines shall actually be internal to gstreamer-vaapi
plugin elements. So deprecate any explicit creation routines that are not the
new *_typed_new*() variants.
Introduce new typed constructors internal to gstreamer-vaapi plugin elements.
This avoids duplication of code, and makes it possible to further implement
generic video buffer creation routines that automatically map to base or GLX
variants.
If the GLX window was created from a foreign Display, then that same Display
shall be used for subsequent glXMakeCurrent(). This means that
gl_create_context() will now use the same Display as the parent, if available.
This fixes cluttersink with the Intel GenX VA driver.
This flag is obsolete. It was meant to explicitly enable/disable VA/GLX API
support, or fall back to TFP+FBO if this API is not found. Now, we check for
the VA/GLX API by default if --enable-glx is set. If this API is not found,
we now default to using TFP+FBO.
Note: TFP+FBO, i.e. using vaPutSurface(), is now also a deprecated usage and
will be removed in the future. If GLX rendering is requested, then the VA/GLX
API shall be used as it covers most usages, e.g. the AMD driver can't render
to an X pixmap yet.
GStreamer -base plugins >= 0.10.31 are now required, so the checks for
new APIs like GstXOverlay::set_window_handle() and ::set_render_rectangle()
are no longer necessary.
GStreamer codecparsers-based decoders are the only supported decoders now.
FFmpeg decoders remain available in the gstreamer-vaapi 0.3.x series, though.
This is a preferred thread-safe version. Also add an inline version of
g_clear_object() if compiling with glib < 2.28.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Add a valid flag to GstJpegQuantTable and GstJpegHuffmanTable to
determine whether a table actually changed since the last user
synchronization point. This makes it possible for a hardware
accelerated decoding solution to upload only those tables that
changed.
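
A consumer of the valid flag could look roughly like the sketch below. The
QuantTable layout and collect_changed_tables() are hypothetical and only
show the idea of clearing the flag at the synchronization point; they are
not the actual GstJpegQuantTable definition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_TABLES 4

/* Hypothetical table layout: payload plus a "changed since last sync" flag. */
typedef struct {
    uint8_t values[64];
    bool    valid;
} QuantTable;

/* Returns a bitmask of the tables that need uploading and clears their
 * valid flag, which marks the new synchronization point. */
static unsigned
collect_changed_tables(QuantTable *tables, size_t n)
{
    unsigned mask = 0;

    assert(tables != NULL && n <= MAX_TABLES);

    for (size_t i = 0; i < n; i++) {
        if (tables[i].valid) {
            mask |= 1u << i;
            tables[i].valid = false;
        }
    }
    return mask;
}
```

A second call without any new table being parsed in between returns 0, so
nothing is uploaded twice.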
Add new GstJpegHuffmanTables helper structure to hold all possible
AC/DC Huffman tables available to all components.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
gst_jpeg_parse() now gathers all scans available in the supplied
buffer. A scan comprises the scan header and any entropy-coded
segments or restart markers following it. The size and offset of
the associated data (ECS + RST segments) are appended to a new
GstJpegScanOffsetSize structure.
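
One way to find the extent of the data following a scan header is sketched
below. It relies on standard JPEG byte stuffing: inside entropy-coded data,
0xFF is followed by 0x00, and RSTn markers are 0xFFD0 to 0xFFD7. The
function name scan_data_size() is made up for this example and is not the
parser's actual helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the number of bytes of entropy-coded data and restart markers
 * at the start of data, i.e. the offset of the next real marker, or size
 * if no such marker is found. */
static size_t
scan_data_size(const uint8_t *data, size_t size)
{
    size_t i = 0;

    assert(data != NULL);

    while (i + 1 < size) {
        if (data[i] != 0xFF) {
            i++;
            continue;
        }
        uint8_t next = data[i + 1];
        if (next == 0x00 || (next >= 0xD0 && next <= 0xD7)) {
            i += 2;             /* stuffed byte or RSTn: still in the scan */
            continue;
        }
        if (next == 0xFF) {
            i++;                /* fill byte preceding a marker */
            continue;
        }
        return i;               /* any other marker ends the scan data */
    }
    return size;
}
```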
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Improve generation of presentation timestamps to be less sensitive
to input stream errors. In practice, GOP is also a synchronization
point for PTS calculation.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
TRD and TRB fields are not large enough to hold the difference of PTS
expressed with nanosecond resolution. So, compute them from the original
VOP info.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Allow MPEG-2 High profile streams only if the HW supports that profile
or no High profile specific bits are used, and thus Main profile could
be used instead. i.e. chroma_format is 4:2:0, intra_dc_precision is not
set to 11 and no sequence_scalable_extension() was parsed.
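
The condition above can be written down as a small predicate. This is a
sketch under assumptions: Mpeg2StreamInfo and both function names are
hypothetical, chroma_format uses the MPEG-2 code 1 for 4:2:0, and
"intra_dc_precision set to 11" is read as the 2-bit code 3 (binary 11,
i.e. 11-bit precision).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of the parsed stream headers. */
typedef struct {
    unsigned chroma_format;       /* MPEG-2 code: 1 = 4:2:0, 2 = 4:2:2, 3 = 4:4:4 */
    unsigned intra_dc_precision;  /* 2-bit code: 0..3 for 8..11 bits */
    bool     has_scalable_ext;    /* sequence_scalable_extension() was parsed */
} Mpeg2StreamInfo;

/* True if the stream uses no High profile specific bits. */
static bool
fits_main_profile(const Mpeg2StreamInfo *info)
{
    assert(info != NULL);

    return info->chroma_format == 1 &&
           info->intra_dc_precision != 3 &&
           !info->has_scalable_ext;
}

/* True if a High profile stream may be decoded at all. */
static bool
allow_high_profile(bool hw_supports_high, const Mpeg2StreamInfo *info)
{
    return hw_supports_high || fits_main_profile(info);
}
```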
In P-pictures, prediction shall be made from the two most recently
decoded reference fields. However, when the first I-frame is a field,
the next field of the current picture could be a P-picture but only a
single field was decoded so far. In this case, create a dummy picture
with POC = -1 that will be used as reference.
Some VA drivers would error out if P-pictures don't have a forward
reference picture. This is true in general but not in this very specific
initial case.
Allow fallback from simple to main profile when the HW decoder does
not support the former profile and no sequence_header_extension()
is available to indicate this.
decode_picture() could return an error when an MPEG-4 profile is not
supported for example. In this case, the underlying VA context is not
allocated and no other proper action can be taken. Likewise on exit
from decode_slice().
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Introduce a POC field in GstVaapiPicture to store simpler sequential
numbers. A signed 32-bit integer should be enough for 1 year of continuous
video streaming at 60 Hz.
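
The one-year estimate can be checked with a few lines of arithmetic. The
helper name pictures_in_days() is only for this example, and it assumes one
POC increment per displayed picture.

```c
#include <assert.h>
#include <stdint.h>

#define SECONDS_PER_DAY 86400

/* Number of pictures shown in the given number of days at rate_hz. */
static int64_t
pictures_in_days(int rate_hz, int days)
{
    assert(rate_hz > 0 && days >= 0);

    return (int64_t)rate_hz * SECONDS_PER_DAY * days;
}
```

At 60 Hz, 365 days amount to 1,892,160,000 pictures, which is below
INT32_MAX (2,147,483,647). Under the same assumption, the counter would
overflow after roughly 414 days.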
Use this new POC value to maintain the DPB, instead of 64-bit timestamps.
This also aligns with H.264, which will be migrated to the GstVaapiDpb
infrastructure.
Always prefer PTS from the demuxer layer for GOP times. If this is invalid,
i.e. demuxer could not determine the PTS or the generated PTS is lower than
max PTS from past pictures, then try to fix it up based on the duration of
a frame.
For picture PTS, simply use the previously computed GOP PTS, then use the
TSN to reconstruct the current time. Also now handle wrapped TSN correctly.
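
A simplified sketch of reconstructing a picture PTS from the GOP PTS and
the TSN follows. It assumes the TSN is a 10-bit counter (like the MPEG-2
temporal_reference) and treats any backwards jump of more than half the
range as a wrap; PtsState and picture_pts() are hypothetical names, and the
real decoder logic may well differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TSN_RANGE 1024u   /* assumed 10-bit TSN */

/* Hypothetical per-GOP timestamp state, times in nanoseconds. */
typedef struct {
    uint64_t gop_pts;         /* PTS of the GOP start */
    uint64_t frame_duration;  /* duration of one frame */
    uint32_t last_tsn;        /* TSN of the previous picture */
    uint32_t wraps;           /* number of TSN wrap-arounds seen so far */
} PtsState;

static uint64_t
picture_pts(PtsState *st, uint32_t tsn)
{
    assert(st != NULL && tsn < TSN_RANGE);

    /* Small backwards steps come from picture reordering; a large
     * backwards jump is taken to mean the counter wrapped. */
    if (tsn + TSN_RANGE / 2 < st->last_tsn)
        st->wraps++;
    st->last_tsn = tsn;

    return st->gop_pts +
           ((uint64_t)st->wraps * TSN_RANGE + tsn) * st->frame_duration;
}
```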
Some badly constructed streams could signal an interlaced frame
while the sequence was meant to be progressive. Warn and force the
frame to be progressive in this case.
Add first-field (FF) flag to GstVaapiPicture, thus not requiring an
is_first_field member in each decoder. Rather, when a GstVaapiPicture is
created, it is considered as the first field. Any subsequently allocated
field will become the second field.
Add gst_vaapi_picture_new_field() function that clones a picture, while
preserving the parent picture surface. i.e. the surface proxy reference
count is increased and other fields copied as is. Besides, the picture
is reset into a "non-output" mode.
Add top-field-first (TFF) and interlaced flags to GstVaapiPicture so they
could be propagated to the surface proxy when it is pushed for rendering.
Besides, top and bottom fields are now expressed with picture structure flags
from GstVaapiSurfaceRenderFlags.
If GstVaapiPicture has flag SKIPPED set, this means gst_vaapi_picture_output()
will not push the underlying surface for rendering. Besides, VC-1 skipped P-frame
has nothing to do with rendering. This only means that the currently decoded
picture is just a copy of its reference picture.
Add new "interlaced" attribute to GstVaapiSurfaceProxy. Use this in
vaapipostproc to handle cases where the bitstream is interlaced
but almost only frame pictures are generated. In this case, we should
not be alternating between top/bottom fields.
Allow rendering flags, as a combination of GstVaapiSurfaceRenderFlags,
to be set on the video buffer. This is mostly useful for basic
deinterlacing.
Some streams have incorrect GOP timestamps, or nothing set at all.
i.e. GOP time is 00:00:00 for all GOPs. Try to recover in this case
from demuxer timestamps, which are monotonic.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Skip all pictures prior to the first sequence_header(). Besides,
skip all picture_data() if there was no prior picture_header().
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
VA-API expects slice_vertical_position as the initial position from the
bitstream. i.e. the direct slice() information. VA drivers will be fixed
accordingly.
Unlike what VA-API documentation defines, the slice_data_bit_offset
represents the offset to the first macroblock in the slice data, minus
any emulation prevention bytes in the slice_header().
This fix copes with binary-only VA drivers that won't be fixed any
time soon. Besides, this aligns with the current FFmpeg behaviour
that was based on those proprietary drivers implementing the API
incorrectly.
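
The adjustment can be sketched as below, assuming an H.264 slice NAL unit
in which emulation prevention bytes follow the usual 00 00 03 pattern. The
function names and the raw_bit_offset parameter (bit position of the first
macroblock in the still-escaped byte stream) are assumptions made for this
example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Counts emulation prevention bytes (the 0x03 in 00 00 03) in buf[0..n). */
static unsigned
count_epb(const uint8_t *buf, size_t n)
{
    unsigned count = 0, zeros = 0;

    for (size_t i = 0; i < n; i++) {
        if (zeros >= 2 && buf[i] == 0x03) {
            count++;
            zeros = 0;
        } else if (buf[i] == 0x00) {
            zeros++;
        } else {
            zeros = 0;
        }
    }
    return count;
}

/* Bit offset to the first macroblock, minus 8 bits for every emulation
 * prevention byte found in the bytes covering the slice header. */
static unsigned
adjusted_bit_offset(const uint8_t *slice, size_t size, unsigned raw_bit_offset)
{
    size_t header_bytes = ((size_t)raw_bit_offset + 7) / 8;

    assert(slice != NULL && header_bytes <= size);

    return raw_bit_offset - 8 * count_epb(slice, header_bytes);
}
```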
Original values from sequence_header() are 12-bit and the remaining
2 most significant bits are coming from sequence_extension().
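
Combining the two parts is a shift and an OR, as sketched below. The
function and parameter names are illustrative; the fields in question are
presumably the picture sizes (e.g. horizontal_size_value together with
horizontal_size_extension), which the text above does not name.

```c
#include <assert.h>

/* Builds the 14-bit value from the 12 low bits carried in
 * sequence_header() and the 2 high bits from sequence_extension(). */
static unsigned
combine_size(unsigned value_12bit, unsigned extension_2bit)
{
    assert(value_12bit < (1u << 12) && extension_2bit < (1u << 2));

    return (extension_2bit << 12) | value_12bit;
}
```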
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
6.3.15 says that "some slices may have the same slice_vertical_position,
since slices may start and finish anywhere". So, we can't submit the current
picture to the HW right away since subsequent slices would be missing.
vaRenderPicture() implicitly disposes VA buffers. Some VA drivers would
push the VA buffer object into a list of free buffers to be re-used. However,
reference pictures (and data) that were kept would explicitly release the VA
buffer object later on, thus possibly destroying a valid (re-used) object.
Besides, some other VA drivers don't correctly support the vaRenderPicture()
semantics for VA buffers disposal and would leak memory if there is no explicit
vaDestroyBuffer(). The temporary workaround is to explicitly destroy VA buffers
right after vaRenderPicture(). All VA drivers need to be aligned.
This ensures the VA context is clear when the encoded resolution
changes, i.e. make sure the older picture is decoded with the older
VA context before it changes.
On sequence end, if the last decoded picture is not output for rendering,
then the proxy surface is not created. In this case, the original surface
must be released explicitly to the context.
VA drivers may have a faster means to transfer user buffers to GPU
buffers than using memcpy(). In particular, on Intel Gen graphics, we
can use pwrite(). This provides for faster upload of the bitstream and
can help with higher bitrates.
vaapi_create_buffer() helper function was also updated to allow for
un-mapped buffers and pre-initialized data for buffers.