Fix decoding of interlaced streams where a first field (e.g. B-slice)
was immediately output and the current decoded field is to be paired
with that former frame, which is no longer in DPB.
https://bugzilla.gnome.org/show_bug.cgi?id=701340
Optimize the process to detect new pictures or start of new access
units by checking if the previous NAL unit was the end of a picture,
or the end of the previous access unit.
Add support for MVC streams with multiple SPS and subset SPS headers
emitted regularly, e.g. at around every I-frame. Track the maximum
number of views in ensure_context() and really reset the DPB size to
the expected value, always. i.e. even if it decreased. dpb_reset()
only cares of ensuring the DPB allocation.
Fix the compaction process when the DPB is cleared for a specific
view, i.e. fix the process of filling in the holes resulting from
removing frame buffers matching the current picture.
It is not necessary to periodically send SPS or subset SPS headers.
This is up to the upper layer (e.g. transport layer) to decide on
if/how to periodically submit those. For now, only generate new SPS
or subset SPS headers when the codec config changed.
Note: the upper layer could readily determine the config headers
(SPS/PPS) through the gst_vaapi_encoder_h264_get_codec_data() function.
https://bugzilla.gnome.org/show_bug.cgi?id=732083
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Report sample aspect ratio (SAR) as present, and make it match what
we have obtained from the user as pixel-aspect-ratio (PAR). i.e. the
VUI parameter aspect_ratio_info_present_flag now defaults to TRUE.
Set the value of num_anchor_refs_l0, num_anchor_refs_l1, num_non_anchor_refs_l0,
and num_non_anchor_refs_l1 to zero since the inter-view prediction is not yet
supported.
When the seq_parameter_set_data() syntax structure is present in a subset
sequence parameter set and vui_parameters_present_flag is equal to 1, then
timing_info_present_flag shall be equal to 0 (H.7.4.2.1.1).
Submit Prefix NAL headers (nal_unit_type = 14) before every packed
slice header (nal_unit_type = 1 or 5) only for the base view. In non
base views, a Coded Slice Extension NAL header (nal_unit_type = 20)
is required, with an appropriate nal_unit_header_mvc_extension() in
the NAL header bytes.
https://bugzilla.gnome.org/show_bug.cgi?id=732083
Fix search for a picture in the DPB that has a lower POC value than
the current picture. The dpb_find_lowest_poc() function will return
a picture with the lowest POC in DPB and that is marked as "needed
for output", but an additional check against the actual POC value
of the current picture is needed.
This is a regression from 1c46990.
https://bugzilla.gnome.org/show_bug.cgi?id=732130
Fix dpb_clear() to clear previous frame buffers only if they actually
exist to begin with. If the decoder bailed out early, e.g. when it
does not support a specific profile, that array of previous frames
might not be allocated beforehand.
We can avoid scanning for start codes again if the bitstream is fed
in NALU chunks. Currently, we always scan for start codes, and keep
track of remaining bits in a GstAdapter, even if, in practice, we
are likely receiving one GstBuffer per NAL unit. i.e. h264parse with
"nal" alignment.
https://bugzilla.gnome.org/show_bug.cgi?id=723284
[use gst_adapter_available_fast() to determine the top buffer size]
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
The `vaapipostproc' element could never determine if the H.264 stream
was interlaced, and thus always assumed it to be progressive. Fix the
H.264 decoder to report interlace-mode accordingly, thus allowing the
vaapipostproc element to automatically enable deinterlacing.
The packed slice header and packed raw data need to be paired with
the submission of VAEncSliceHeaderParameterBuffer. So handle them
on a per-slice basis insted of a per-picture basis.
[removed useless initializer]
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Factor out the removal process of unused inter-view only reference
pictures from the DPB, prior to the possible insertion of the current
picture.
Ideally, the compiler could still opt for generating two loops. But
at least, the code is now clearer for maintenance.
Improve process for the removal of pictures from DPB before possible
insertion of the current picture (C.4.4) for H.264 MVC inter-view only
reference components. In particular, handle cases where picture to be
inserted is not the last one of the access unit and if it was already
output and is no longer marked as used for reference, including for
decoding next view components within the same access unit.
While invoking the DPB bumping process in presence of many views,
it could be necessary to output previous pictures that are ready,
in a whole. i.e. emitting all view components from the very first
view order index zero to the very last one in its original access
unit; and not starting from the view order index of the picture
that caused the DPB bumping process to be invoked.
As a reminder, the maximum number of frames in DPB for MultiView
High profile with more than 2 views is not necessarily a multiple
of the number of views.
This fixes decoding of MVCNV-4.264.
Let the utility layer handle dynamic growth of the inter-view pictures
array. By definition, setting a new size to the array will effectively
grow the array, but would also fill in the newly created elements with
empty entries (NULL), thus also increasing the reported length, which
is not correct.
When decoding Multiview High profile streams with a large number of
views, it is not possible to make the VAPictureParameterBufferH264.
ReferenceFrames[] array hold the complete DPB, with all possibly
active pictures to be used for inter-view prediction in the current
access unit.
So reduce the scope of the ReferenceFrames[] array to only include
the set of reference pictures that are going to be used for decoding
the current picture. Basically, this is a union of all RefPicListX[]
array, for all slices constituting the decoded picture.
The inter-view reference components and inter-view only reference
components that are included in the reference picture lists shall
be considered as not being marked as "used for short-term reference"
or "used for long-term reference". This means that reference flags
should all be removed from VAPictureH264.flags.
This fixes decoding of MVCNV-2.264.
If the VA driver exposes ad-hoc H.264 MVC profiles, then we have to
be careful to detect profiles changes and not reset the underlying
VA context erroneously. In MVC situations, we could indeed get a
profile_idc change for every SPS that gets activated, alternatively
(base-view -> non-base view -> base-view, etc.).
An improved fix would be to characterize the exact profile to use
once and for all when SPS NAL units are parsed. This would also
allow for fallbacks to a base-view decoding only mode.
Exclusively use VA drivers that support raw packed headers for encoding.
i.e. simply submit packed headers Subset SPS and Prefix NAL units. This
provides for better compatibility accross the various VA drivers and HW
generations since no particular API is needed beyond what readily exists.
Since we are encoding each view independently from each other, we
need a higher number of pre-allocated surfaces to be used as the
reconstructed frames. For Stereo High profile encoding, this means
to effectively double the number of frames to be stored in the DPB.
Add initial support for Subset SPS, Prefix NAL and Slice Extension NAL
for non-base-view streams encoding, and the usual SPS, PPS and Slice
NALs for base-view encoding.
The H.264 Stereo High profile encoding mode will be turned on when the
"num-views" parameter is set to 2. The source (raw) YUV frames will be
considered as Left/Right view, alternatively.
Each of the two views has its own frames reordering pool and reference
frames list management system. Inter-view references are not supported
yet, so the views are encoded independently from each other.
Signed-off-by: Li Xiaowei <xiaowei.a.li@intel.com>
[limited to Stereo High profile per the definition of MAX_NUM_VIEWS]
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Create structures to maintain the reference frames list (RefPool) and
frames reordering (ReorderPool) logic.
This is a prerequisite for H.264 MVC support.
Signed-off-by: Li Xiaowei <xiaowei.a.li@intel.com>
Add provisions to write subset SPS headers to the bitstream in view
to supporting the H.264 MVC specification.
This assumes the libva "staging" branch is in use.
Signed-off-by: Li Xiaowei <xiaowei.a.li@intel.com>
Optimize lookups of view ids / view order indices by caching the result
of the calculatiosn right into the GstVaapiParserInfoH264 struct. This
terribly simplifies is_new_access_unit() and find_first_field() functions.
Add safe fallbacks for MVC profiles:
- all MultiView High profile streams with 2 views at most can be decoded
with a Stereo High profile compliant decoder ;
- all Stereo High profile streams with only progressive views can be
decoded with a MultiView High profile compliant decoder ;
- all drivers that support slice-level decoding could normally support
MVC profiles when the DPB holds at most 16 frames.
In order to have a stricter conforming implementation, we need to carefully
detect access unit boundaries. Additional operations could be necessary to
perform at those boundaries.
Detect the first VCL NAL unit of a picture for MVC, based on the
view_id as per H.7.4.1.2.4. Note that we only need to detect new
view components.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Always cache the previous NAL unit so that we could check whether
there is a Prefix NAL unit immediately preceding the current slice
or IDR NAL unit. In that case, the NAL unit metadata is copied into
the current NAL unit. Otherwise, some default values are inferred,
tentatively. e.g. view_id shall be set to 0 and inter_view_flag to 1.
[infer default values for slice if previous NAL was not a Prefix]
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Allow decoding for base views of MVC encoded streams. For now, just skip
the slice extension and prefix NAL units, and skip non-base view frames.
Signed-off-by: Xiaowei Li <xiaowei.a.li@intel.com>
[fixed memory leak, improved check for MVC NAL units]
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Factor out process by which the decoded picture with the lowest POC
is found, and possibly output. Likewise, the storage and marking of
a reference decoded, or non-reference decoded picture, into the DPB
could also be simplified as they mostly share the same operations.
Make init_picture_ref_lists() more consistent with other functions
related to the reference marking process by supplying the current
picture as argument.
Add gst_vaapi_display_get_vendor_string() helper function to query
the underlying VA driver name. The display object owns the resulting
string, so it shall not be deallocated.
That function is thread-safe. It could be used for debugging purposes,
for instance.
Make sure to initialize one GstVaapiDisplay at a time, even in threaded
environments. This makes sure the display cache is also consistent
during the whole display creation process. In the former implementation,
there were risks that display cache got updated in another thread.
Add support for dynamic growth of the VA surfaces pool. For decoding,
this implies the recreation of the underlying VA context, as per the
requirement from VA-API. Besides, only increases are supported, not
shrinks.
It is a requirement from VA-API specification that the VA context got
from vaCreateContext(), for decoding purposes, binds the supplied set
of VA surfaces. This means that if the set of VA surfaces is to be
changed for the current decode session, then the VA context needs to
be recreated with the new set of VA surfaces.
Complement fix committed as e95a42e.
The H.264 AVC standard has to say: if the field is part of a reference
frame or a complementary reference field pair, and the other field of
the same reference frame or complementary reference field pair is also
marked as "used for long-term reference", the reference frame or
complementary reference field pair is also marked as "used for long-term
reference" and assigned LongTermFrameIdx equal to long_term_frame_idx.
This fixes decoding of MR9_BT_B in strict mode.
https://bugs.freedesktop.org/show_bug.cgi?id=64624https://bugzilla.gnome.org/show_bug.cgi?id=724518
Request the correct chroma format for decoding grayscale streams.
i.e. make lookups of the VA chroma format more generic, thus possibly
supporting more formats in the future.
This means that, if a VA driver doesn't support grayscale formats,
it is now going to fail. We cannot safely assume that maybe grayscale
was implemented on top of some YUV 4:2:0 with the chroma components
all set to 0x80.
Fix reference picture marking process with memory_management_control_op
set to 3 and 6, i.e. assign LongTermFrameIdx to a short-term reference
picture, or the current picture.
This fixes decoding of FRExt_MMCO4_Sony_B.
https://bugs.freedesktop.org/show_bug.cgi?id=64624https://bugzilla.gnome.org/show_bug.cgi?id=724518
[squashed, edited to use GST_VAAPI_PICTURE_IS_COMPLETE() macro]
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
The initialization of reference picture lists (8.2.4.2) applies to all
slices. So, the RefPicList0/1 lists need to be constructed prior to
each slice submission to the HW decoder.
This fixes decoding of video sequences where frames are encoded with
multiple slices of different types, e.g. 4 slices in this order I, P,
I, and P. More precisely, CABAST3_Sony_E and CABASTBR3_Sony_B.
https://bugzilla.gnome.org/show_bug.cgi?id=724518
When NAL units of type 13 (SPS extension) or type 19 (auxiliary slice)
are present in a video, decoders shall perform the (optional) decoding
process specified for these NAL units or shall ignore them (7.4.1).
Implement option 2 (skip) for now, as alpha composition is not
supported yet during the decoding process.
This fixes decoding of the primary coded video in alphaconformanceG.
https://bugzilla.gnome.org/show_bug.cgi?id=703928https://bugzilla.gnome.org/show_bug.cgi?id=728869https://bugzilla.gnome.org/show_bug.cgi?id=724518
[skip NAL units earlier, i.e. at parsing time]
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
When MVC slice NAL units (coded slice extension and prefix NAL) are
present, the number of NAL header bytes is 3, not 1 as usual.
Signed-off-by: Li Xiaowei <xiaowei.a.li@intel.com>
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
At the time the first VCL NAL unit of a primary coded picture is found,
and if that NAL unit was parsed to be an SPS or PPS, then the entries
in the parser may have been overriden. This means that, when the picture
is to be decoded, slice_hdr->pps could point to an invalid (the next)
PPS entry.
So, one way to solve this problem is to not use the parser PPS and
SPS info but rather maintain our own activation chain in the decoder.
https://bugzilla.gnome.org/show_bug.cgi?id=724519https://bugzilla.gnome.org/show_bug.cgi?id=724518
Retain the SEI messages that were parsed from the access unit until we
have completely decoded the current frame. This is done so that we can
peek at that data whenever necessary during decoding. e.g. for exposing
3D stereoscopic information at a later stage.
Fix support for grayscale encoded video clips, and possibly others if
the underlying driver supports the non-YUV 4:2:0 formats. i.e. defer
the decision that a surface with the desired chroma format is not
supported to the actual VA driver implementation.
https://bugzilla.gnome.org/show_bug.cgi?id=728144
Don't force allocation of VA surfaces in YUV 4:2:0 format. Rather, allow
for the upper layer to specify the desired chroma type. If the chroma
type field is not set (or yields zero), then YUV 4:2:0 format is used
by default.
Fix possible bug when a per-segment deblocking filter level value
needs to be set in non-absolute mode, i.e. when the loop filter update
value is negative in delta mode.
Also clamp the resulting filter level value to 0..63 range.
Improve condition to disable the loop filter. The previous heuristic
used to check all filter levels, for all segments. It turns out that
only the base filter_level value defined in the frame header needs
to be checked.
This fixes 00-comprehensive-013.
The gst_h264_parse_parse_sei() function now returns an array of SEI
messages, instead of a single SEI message. Reason: it is allowed to
have several SEI messages packed into a single SEI NAL unit, instead
of multiple NAL units.
Fix parser and decoder state to sync at the right locations. This is
because we could reset the parser state, while the decoder state was
not copied yet, e.g. when parsing several NAL units from multiple frames
whereas the current frame was not decoded yet.
This is a regression brought in by commit 6fe5496.
Fix gstreamer-vaapi includedir for GStreamer 1.2 setups. i.e. use
the pkgconfig version (1.0) instead of the intended API version (1.2).
libgstvaapi1.0-dev and libgstvaapi1.2-dev packages will now conflict,
as would core GStreamer 1.0 and GStreamer 1.2 dev packages anyway.
Add gst_vaapi_get_config_attribute() helper function that takes a
GstVaapiDisplay and the rest of the arguments with VA types. The aim
is to have thread-safe VA helpers by default.
Make sure to configure the encoder with the set of packed headers we
intend to generate and submit. i.e. make selection of packed headers
to submit more robust.
Cache the first compatible GstVaapiProfile found if the encoder is not
configured yet. Next, factor out the code to check for the supported
rate-control modes by moving out vaGetConfigAttributes() to a separate
function, while also making sure that the attribute type is actually
supported by the encoder.
Also fix the default set of supported rate control modes to not the
"none" variant. It's totally useless to expose it at this point.
Introduce GstVaapiContextUsage so that to explicitly determine the
usage of a VA context. This is useful in view to simplifying the
creation of VA context for VPP too.
Unknown attributes, or attributes that are not supported for the given
profile/entrypoint pair have a return value of VA_ATTRIB_NOT_SUPPORTED.
So, return failure in this case.
Move GstVideoOverlayComposition handling to separate source files.
This helps keeing GstVaapiContext core implementation to the bare
minimal, i.e. simpy helpers to create a VA context and handle pool
of associated VA surfaces.
Improve documentation and debug messages. Clean-up APIs, i.e. strip
them down to the minimal set of interfaces. They are private, so no
need expose getters for instance.
Make sure that libgstvaapi private headers remain internally used to
build libgstvaapi libraries only. All header dependencies were reviewed
and checks for IN_LIBGSTVAAPI definition were added accordingly.
Also rename GST_VAAPI_CORE definition to IN_LIBGSTVAAPI_CORE to keep
consistency.
Set a valid GstVideoCodecFrame.system_frame_number when decoding a
stream in standalone mode. While we are at it, improve the debugging
messages to also include that frame number.
When decoding failed, or that the frame was dropped, the associated
surface proxy is not guaranteed to be present. Thus, the GST_DEBUG()
message needs to check whether the proxy is actually present or not.
https://bugzilla.gnome.org/show_bug.cgi?id=722403
[fixed gst_vaapi_surface_proxy_get_surface_id() to return VA_INVALID_ID]
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Don't emit NAL HRD parameters for now in the SPS headers because the
SEI buffering_period() and picture_timing() messages are not handled
yet. Some additional changes are necessary to get it right.
https://bugzilla.gnome.org/show_bug.cgi?id=722734
Fix default CPB buffer size to something more reasonable (1500 ms)
and that still fits the level limits. This is a non configurable
property for now. The initial CPB removal delay is also fixed to
750 ms.
https://bugzilla.gnome.org/show_bug.cgi?id=722087
Round down the calculated, or supplied, bitrate (kbps) into a multiple
of the HRD bitrate scale factor. Use a bitrate scale factor of 64 so
that to have less losses in precision. Likewise, don't round up because
that could be a strict constraint imposed by the user.
Fix the level calculation involving bitrate limits. Since we are
targetting NAL HRD conformance, the check against MaxBR from the
Table A-1 limits shall involve cpbBrNalFactor depending on the
active profile.
Submit sequence parameter buffers only once, or when the bitstream
was reconfigured in a way that requires such. Always submit packed
sequence parameter buffers at I-frame period, if the VA driver needs
those.
https://bugzilla.gnome.org/show_bug.cgi?id=722737
Make sure to submit the packed headers only if the underlying VA driver
requires those. Currently, only handle packed sequence and picture
headers.
https://bugzilla.gnome.org/show_bug.cgi?id=722737
The VAEncSequenceParameterBuffer.ip_period value reprents the distance
between the I-frame and the next P-frame. So, this also accounts for
any additional B-frame in the middle of it.
This fixes rate control heuristics for certain VA drivers.
https://bugzilla.gnome.org/show_bug.cgi?id=722735
Fix level characterisation when the bitrate is automatically computed
from the active coding tools. i.e. ensure the bitrate once the profile
is completely characterized but before the level calculation process.
Document and rename a few functions here and there. Drop code that
caps num_bframes variable in reset_properties() since they shall
have been checked beforehand, during properties initialization.
Clean-up GstBitWriter related utility functions and simplify notations.
While we are at it, also make bitstream writing more robust should an
overflow occur. We could later optimize for writing headers capped to
their maximum possible size by using the _unchecked() helper variants.
Drop private header since it was originally used to expose internals
to the plugin element. The proper interface is now the properties API,
thus rendering private headers totally obsolete.
Add support for hue, saturation, brightness and constrat adjustments.
Also fix cap info local copy to match the really expected cap subtype
of interest.
https://bugzilla.gnome.org/show_bug.cgi?id=720376
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Change the submission order of VA objects so that to make that process
more logical. i.e. submit sequence parameter first, if any; next the
packed headers associated to sequece, picture or slices; and finally
the actual picture and associated slices.
Various clean-ups to improve consistency and readability: rename some
variables, drop unused macro definitions, drop initialization of vars
that are zero-initialized from the base class, drop un-necessary casts,
allocate GPtrArrays with a destroy function.
Fix frame cropping rectangle calculation to handle horizontal resolutions
that don't match a multiple of 16 pixels, but also the vertical resolution
that was incorrectly computed for progressive sequences too.
https://bugzilla.gnome.org/show_bug.cgi?id=722089
For non "Constant-QP" modes, we could provide more reasonable heuristics
for the target bitrate. In general, 48 bits per macroblock with all the
useful coding tools enable looks safe enough. Then, this rate is raised
by +10% to +15% for each coding tool that is disabled.
https://bugzilla.gnome.org/show_bug.cgi?id=719699
Add support for "high-compression" tuning option. First, determine the
largest supported profile by the hardware. Next, check any target limit
set by the user. Then, enable each individual coding tool based on the
resulting profile_idc value to use.
https://bugzilla.gnome.org/show_bug.cgi?id=719696
Allow user to precise the largest profile to use for encoding due
to target decoder constraints. For instance, if CABAC entropy coding
mode is requested by "constrained-baseline" profile only is desired,
then an error is returned during codec configuration.
Also make sure that the suitable profile we derived actually matches
what the HW can cope with.
https://bugzilla.gnome.org/show_bug.cgi?id=719694
Refine the heuristic to determine the maximum size of a coded buffer
to account for the exact number of slices. set_context_info() is the
last step during codec reconfiguration, no additional change is done
afterwards, so re-using the num_slices field here is fine.
https://bugzilla.gnome.org/show_bug.cgi?id=719953
Automatically derive the minimum profile and level to be used for
encoding, based on the activated coding tools. The encoder will
be trying to generate a bitstream that has the best chances to be
decoded on most platforms by default.
Also change the default profile to "constrained-baseline" so that
to ensure maximum compatibility when the stream is decoded.
https://bugzilla.gnome.org/show_bug.cgi?id=719691
Fix lookup for a suitable HW profile, as to be used by the underlying
hardware, based on heuristics that lead to characterize the SW profile,
i.e. the one used by the SW level encoding logic.
Also fix constraint_set0_flag (A.2.1) and constraint_set1_flag (A.2.2)
as they should respectively match the baseline and main profile.
https://bugzilla.gnome.org/show_bug.cgi?id=719827
The libgstvaapi core encoders are meant to support raw bitstreams only.
Henceforth, we are always producing a stream in "byte-stream" format.
However, the "codec-data" buffer which holds SPS and PPS headers is
always available. The "lengthSizeMinusOne" field is always set to 3
so that in-place "byte-stream" format to "avc" format conversion could
be performed.
Various clean-ups to improve consistency and readability: rename some
variables, drop unused macro definitions, drop initialization of vars
that are zero-initialized from the base class, drop un-necessary casts.
Fix lookup for a suitable HW profile, as to be used by the underlying
hardware, based on heuristics that lead to characterize the SW profile,
i.e. the one used by the SW level encoding logic.
Automatically derive the minimum profile and level to be used for
encoding, based on the activated coding tools. Improve lookup for
the best suitable level with the new MPEG-2 helper functions.
Also change the default profile to "simple" so that to ensure maximum
compatibility when the stream is decoded.
https://bugzilla.gnome.org/show_bug.cgi?id=719703
Various clean-ups to improve consistency and readability: drop unused
macro definitions, drop initialization of vars that are zero-initialized
from the base class, drop un-necessary casts.
Add encoder "tune" option to override the default behaviour that is to
favor maximum decoder compatibility at the expense of lower compression
ratios.
Expected tuning options to be developed are:
- "high-compression": improve compression, target best-in-class decoders;
- "low-latency": tune for low-latency decoding;
- "low-power": tune for encoding in low power / resources conditions.
Only expose the exact static set of supported rate-control properties
to the upper layer. For instance, if the GstVaapiEncoderXXX class does
only support CQP rate control, then only add it the the exposed enum
type.
Add helper macros and functions to build a GType for an enum subset.
Add gst_vaapi_encoder_set_keyframe_period() interface to allow the
user control the maximum distance between two keyframes. This new
property can only be set prior to gst_vaapi_encoder_set_codec_state().
A value of zero for "keyframe-period" gets it re-evaluated to the
actual framerate during encoder reconfiguration.
Improve codec reconfiguration to be performed only through a single
function. That is, remove the _set_context_info() hook as subclass
should not alter the parent GstVaapiContextInfo itself. Besides, the
VA context is constructed only at the final stages of reconfigure().
Add interface to communicate the encoder resolution and related info
like framerate, interlaced vs. progressive, etc. This new interface
supersedes gst_vaapi_encoder_set_format() and doesn't use any GstCaps
but rather use GstVideoCodecState.
Note that gst_vaapi_encoder_set_codec_state() is also a synchronization
point for codec config. This means that the encoder is reconfigured
there to match the latest properties.
Add interface to communicate configurable properties to the encoder.
This covers both the common ones (rate-control, bitrate), and the
codec specific properties.
https://bugzilla.gnome.org/show_bug.cgi?id=719529
Add gst_vaapi_encoder_set_bitrate() interface to allow the user control
the bitrate for encoding. Currently, changing this parameter is only
valid before the first frame is encoded. Should the value be modified
afterwards, then GST_VAAPI_ENCODER_STATUS_ERROR_OPERATION_FAILED is
returned.
https://bugzilla.gnome.org/show_bug.cgi?id=719529
Add gst_vaapi_encoder_set_rate_control() interface to request a new
rate control mode for encoding. Changing the rate control mode is
only valid prior to encoding the very first frame. Afterwards, an
error ("operation-failed") is issued.
https://bugzilla.gnome.org/show_bug.cgi?id=719529
Instal <gst/vaapi/gstvaapiutils_h264.h> header but only expose the
H.264 levels in there. The additional helper functions are meant
to be private for now.
Fix formts for various GST_DEBUG et al. invocations. More precisely,
make size_t arguments use the %zu format specifier accordingly; force
XID formats to be a 32-bit unsigned integer; and fix the format used
for gst_vaapi_create_surface_with_format() error cases since we have
been using strings nowadays.
The following helper functions are no longer used, thus are removed:
- gst_vaapi_video_format_from_structure()
- gst_vaapi_video_format_from_caps()
- gst_vaapi_video_format_to_caps()
Replace gst_vaapi_display_get_{decode,encode}_caps() APIs with more
more convenient APIs that return an array of GstVaapiProfile instead
of GstCaps: gst_vaapi_display_get_{decode,encode}_profiles().
Replace gst_vaapi_display_get_{image,subpicture}_caps() APIs, that
returned GstCaps, with more convenient APIs that return an array of
GstVideoFormat: gst_vaapi_display_get_{image,subpicture}_formats().
Fix display creation code to check that any display obtained from a
neighbour actually has the type we expect. Note: if display type is
set to "any", we can then accept any VA display type.
The GLTextureUploadMeta implementation assumed that for each upload()
sequence, the supplied texture id is always the same as the one that
was previously cached into the underlying GstVaapiTexture. Cope with
any texture id change the expense to recreate the underlying VA/GLX
resources.
https://bugzilla.gnome.org/show_bug.cgi?id=719643
Improve robustness when some expected packets where not received yet
or that were not correctly decoded. For example, don't try to decode
a picture if there was no valid frame headers parsed so far.
https://bugs.freedesktop.org/show_bug.cgi?id=57902
Conformance test Base_Ext_Main_profiles/BA3_SVA_C.264 complys with
extended profile specifications. However, the SPS header has the
constraint_set1_flag syntax element set to 1. This means that, if
a Main profile compliant decoder is available, then it should be
able to decode this stream.
This changes makes it possible to fall-back from Extended profile
to Main profile if constraint_set1_flag is set to 1.
https://bugzilla.gnome.org/show_bug.cgi?id=720190
Recognize streams marked as conforming to the "Constrained Baseline
Profile". If VA driver supports that as is, fine. Otherwise, fallback
to baseline, main or high profile.
Constrained Baseline Profile conveys coding tools that are common
to baseline profile and main profile.
https://bugzilla.gnome.org/show_bug.cgi?id=719947
[Added fallbacks to main and high profiles]
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
The GStreamer codecparser layer now parses the scaling lists in zigzag
scan order, as expected, so that to match the original bitstream layout
and specification. However, further convert the scaling lists into
raster scan order to fit the existing practice in most VA drivers.
https://bugzilla.gnome.org/show_bug.cgi?id=706406
- gst_vaapi_utils_h264_get_level():
Returns GstVaapiLevelH264 from H.264 level_idc value
- gst_vaapi_utils_h264_get_level_idc():
Returns H.264 level_idc value from GstVaapiLevelH264
- gst_vaapi_utils_h264_get_level_limits():
Returns level limits as specified in Table A-1 of the H.264 standard
- gst_vaapi_utils_h264_get_level_limits_table():
Returns the Table A-1 specification
* Profiles:
- gst_vaapi_utils_h264_get_profile():
Returns GstVaapiProfile from H.264 profile_idc value
- gst_vaapi_utils_h264_get_profile_idc():
Returns H.264 profile_idc value from GstVaapiProfile
* Chroma formats:
- gst_vaapi_utils_h264_get_chroma_type():
Returns GstVaapiChromaType from H.264 chroma_format_idc value
- gst_vaapi_utils_h264_get_chroma_format_idc():
Returns H.264 chroma_format_idc value from GstVaapiChromaType
The previous fix was only valid to express the maximum size of the
macroblock layer, i.e. without any headers. Now, also account for
the slice headers and top picture header, but also any other header
we might stuff into the VA coded buffer, e.g. sequence headers.
Fix coded buffer size for each codec. A generic issue was that the
number of macroblocks was incorrectly computed. The second issue was
specific to MPEG-2 were the max number of bits per macroblock, and
as defined by the standard, was incorrectly mapped to the (lower)
H.264 requirement. i.e. 4608 bits vs. 3200 bits limit.
Change get_context_info() into a set_context_info() function that
initializes common defaults into the base class, thus allowing the
subclasses to specialize the context info further on.
The set_context_info() hook is also the location where additional
context specific data could be initialized. At this point, we are
guaranteed to have valid video resolution size and framerate. i.e.
gst_vaapi_encoder_set_format() was called beforehand.
Clean public APIs up so that to better align with the decoder APIs.
Most importantly, gst_vaapi_encoder_get_buffer() is changed to only
return the VA coded buffer proxy. Also provide useful documentation
for the public APIs.
Kill GstVaapiEncoderSyncPic objects that are internally and temporarily
allocated. Rather, associate a GstVaapiEncPicture to a coded buffer
through GstVaapiCodedBufferProxy user-data facility.
Besides, use a GAsyncQueue to maintain a thread-safe queue object of
coded buffers.
Partial fix for the following report:
https://bugzilla.gnome.org/show_bug.cgi?id=719530
Fix the GstVaapiEncoderClass parent class type. Make sure to validate
subclass hooks as early as possible, i.e. in gst_vaapi_encoder_init(),
thus avoiding useless run-time checks. Also simplify the subclass
initialization process to be less error prone.
Refactor the GstVaapiCodedBuffer APIs so that to more clearly separate
public and private interfaces. Besides, the map/unmap APIs should not
be exposed as is but appropriate accessors should be provided instead.
* GstVaapiCodedBuffer: VA coded buffer abstraction
- gst_vaapi_coded_buffer_get_size(): get coded buffer size.
- gst_vaapi_coded_buffer_copy_into(): copy coded buffer into GstBuffer
* GstVaapiCodedBufferPool: pool of VA coded buffer objects
- gst_vaapi_coded_buffer_pool_new(): create a pool of coded buffers of
the specified max size, and bound to the supplied encoder
* GstVaapiCodedBufferProxy: pool-allocated VA coded buffer object proxy
- gst_vaapi_coded_buffer_proxy_new_from_pool(): create coded buf from pool
- gst_vaapi_coded_buffer_proxy_get_buffer(): get underlying coded buffer
- gst_vaapi_coded_buffer_proxy_get_buffer_size(): get coded buffer size
Rationale: more optimized transfer functions might be provided in the
future, thus rendering the map/unmap mechanism obsolete or sub-optimal.
https://bugzilla.gnome.org/show_bug.cgi?id=719775
Add gst_vaapi_surface_proxy_copy() function that creates a new surface
proxy with the same information from the parent proxy, except that the
user-defined destroy notify function is not copied over.
The underlying VA surface is pushed back to the video pool only when
the last reference to the parent surface proxy is released.
Optimize gst_vaapiencode_handle_frame() to avoid extra memory allocation,
and in particular the GstVaapiEncObjUserData object. i.e. directly use
the VA surface proxy from the source buffer. This also makes the user
data attached to the GstVideoCodecFrame more consistent between both
the decoder and encoder plug-in elements.
Add gst_buffer_new_allocate() and gst_buffer_fill() implementations.
Fix gst_buffer_new_wrapped_full() implementation to handle the destroy
notify function.
Add initial API for video encoding: only basic interfaces and small
encoder objects are implemented so far.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
GstBitWriter provides a bit writer that can write any number of bits
to a pre-allocated memory buffer. Helper functions are also provided
to write any number of bits from 8, 16, 32 and 64 bit variables.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Extend GstVaapiContextInfo structure to hold the desired rate control
mode for encoding purposes. For decoding purposes, this field is not
used and it is initialized to GST_VAAPI_RATECONTROL_NONE.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Add GstVaapiRateControl types and GType values in view to supporting
rate controls for encoding. This is meant to be used for instance in
GstVaapiContext.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
This library is using symbols that don't exist in GStreamer 0.10 so
it needs to link to built-in implementation (libgstvaapi-videoutils).
https://bugzilla.gnome.org/show_bug.cgi?id=712282
Signed-off-by: Ross Burton <ross.burton@intel.com>
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Destroy VPP output surface pool on exit. Also avoid a possible crash
in double-free situation caused by insufficiently reference counted
array of formats returned during initialization.
The <gst/vaapi/gstvaapicontext.h> header was removed from the public
set of APIs. So, don't make public headers (gstvaapidecoder.h) depend
on private files.
Add gst_vaapi_fitler_set_deinterlacing_references() API to submit the
list of surfaces used for forward or backward reference in advanced
deinterlacing mode, e.g. Motion-Adaptive, Motion-Compensated.
The list of surfaces used as deinterlacing references shall be live
until the next call to gst_vaapi_filter_process().
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Fix deinterlacing flags to make more sense. The TFF (top-field-first)
flag is meant to specify the organization of reference frames used in
advanced deinterlacing modes. Introduce the more explicit flag TOPFIELD
to specify that the top-field of the supplied input surface is to be
used for deinterlacing. Conversely, if not set, this means that the
bottom field of the supplied input surface will be used instead.
Add a couple of helper functions:
- gst_vaapi_filter_has_operation(): checks whether the VA driver
advertises support for the supplied operation ;
- gst_vaapi_filter_use_operation(): checks whether the supplied
operation was already enabled to its non-default value.
The user-defined destroy notify function is meant to be called only when
the surface proxy was fully released, i.e. once it actually released the
VA surface back to the underlying pool.
Fix GstVaapiPicture, GstVaapiSlice and GstVaapiSurfaceProxy initialization
sequences to have the expected default values set beforehand in case of an
error raising up further during creation. i.e. make it possible to cleanly
destroy those partially initialized objects.
https://bugzilla.gnome.org/show_bug.cgi?id=707108
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Fix ensure_operations() to release the VPP operations array if non
NULL, prior to returning to the caller. The former function was also
renamed to a more meaningful get_operations() since the caller owns
the returned array that needs to be released.
Fix first-time operation lookup through find_operation() if the set
of supported operations was not initially determined through the
gst_vaapi_filter_get_operations() helper function.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Fix intiialization of GstVaapiFilterOpData for colorbalance related
operations. In particular, fill in the va_subtype field accordingly.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
VASurfaceAttrib and VAProcFilterParameterBufferType are symbols
that need to be guarded for libva 0.34 and 0.33, respectively.
https://bugzilla.gnome.org/show_bug.cgi?id=709102
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Fix calculation of MCU count for image sizes that are not a multiple
of 8 pixels in either dimension, but also for non-common sampling
factors like 4:2:2 in non-interleaved mode.
Add support for images with multiple scans per frame. The Huffman table
can be updated before SOS, and thus possibly requiring multiple uploads
of Huffman tables to the VA driver. So, the latter must be able to cope
with multiple VA buffers of type 'huffman-table' and with the correct
sequential order.
Improve robustness when some expected packets where not received yet
or that were not correctly decoded. For example, don't try to decode
a picture if there was no valid frame headers.
Improve debugging and error messages. Rename a few variables to fit the
existing naming conventions. Change some fatal asserts to non-fatal
error codes.
Split the input buffer data into decoder units that represent a JPEG
segment. Handle scan decoder unit specifically so that it can include
both the scan header (SOS) but also any other ECS or RSTi segment.
That way, we parse the input buffer stream only once at the gst-vaapi
level instead of (i) in gst_vaapi_decoder_jpeg_parse() to split the
stream into frames SOI .. EOI and (ii) in decode_buffer() to further
determine segment boundaries and decode them.
In practice, this is a +15 to +25% performance improvement.
Fix decode_buffer() function to gracefully skip comment (COM) segments.
This fixes decoding of streams generated by certain cameras, e.g. like
the Logitech Pro C920.
https://bugzilla.gnome.org/show_bug.cgi?id=708208
Signed-off-by: Junfeng Xu <jun.feng.xu@intel.com>
Look for the exact image bounds characterised by the <SOI> and <EOI>
markers. Use the gst_jpeg_parse() codec parser utility function to
optimize the lookup for the next marker segment.
https://bugzilla.gnome.org/show_bug.cgi?id=707447
Fix calculation of the offset to the next marker segment since the
correction of the codecparser part to match the API specification.
i.e. the GstJpegMarkerSegment.size field represents the size in bytes
of the segment minus any marker prefix.
https://bugzilla.gnome.org/show_bug.cgi?id=707447
Signed-off-by: Junfeng Xu <jun.feng.xu@intel.com>
Fix detection of VA/JPEG decoding API with non-standard libva packages.
More precisely, some packages were shipping with a <va/va.h> header that
did not include <va/va_dec_jpeg.h>.
https://bugzilla.gnome.org/show_bug.cgi?id=706055
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
As a last resort, if video processing capabilities (VPP) are not available,
or they did not produce anything conclusive enough, then try to fallback to
the original rendering code path whereby the whole VA surface is rendered
as is, no matter of video cropping or deinterlacing requests.
Note: under those conditions, the visual outcome won't be correct but at
least, something gets displayed instead of bailing out.
Try to use VA/VPP processing capabilities to handle video cropping and
additional rendering flags that may not be directly supported by the
underlying hardware when exposing a suitable Wayland buffer for the
supplied VA surface. e.g. deinterlacing, different color primaries than
BT.601, etc.
Update the frame redraw infrastructure with a new FrameState stucture
holds all the necessary information used to display the next pending
surface.
While we are at it, delay the sync operation down to when it is actually
needed. That way, we keep performing additional tasks meanwhile.
Disable video cropping in MPEG-2 codec because it is partially implemented
and actually because nobody implements it that way, and the standard spec
does not specify the display process either anyway.
Most notably, there are two possible use cases for sequence_display_extension()
horizontal_display_size & vertical_display_size: (i) guesstimating the
pixel-aspect-ratio, or (ii) implement some kind of span & scan process
in conjunction with picture_display_extension() information.
https://bugzilla.gnome.org/show_bug.cgi?id=704848
This fixes the following issue:
CC libgstvaapi_0.10_la-gstvaapidecoder_mpeg4.lo
gstvaapidecoder_mpeg4.c:113: error: redefinition of typedef
'GstVaapiDecoderMpeg4Class'
gstvaapidecoder_mpeg4.c:44: note: previous declaration of
'GstVaapiDecoderMpeg4Class' was here
make[5]: *** [libgstvaapi_0.10_la-gstvaapidecoder_mpeg4.lo] Error 1
make[5]: Leaving directory
`/builddir/build/BUILD/gstreamer-vaapi-0.5.5.1/gst-libs/gst/vaapi'
https://bugzilla.gnome.org/show_bug.cgi?id=705148
Add basic deinterlacing support, i.e. bob-deinterlacing whereby only
the selected field from the input surface is kept for the target surface.
Setting gst_vaapi_filter_set_deinterlacing() method argument to
GST_VAAPI_DEINTERLACE_METHOD_NONE means to disable deinterlacing.
Also move GstVaapiDeinterlaceMethod definition from vaapipostproc plug-in
to libgstvaapi core library.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Add ProcAmp (color balance) adjustments for hue, saturation, brightness
and contrast. The respective range for each filter shall be the same as
for the VA display attributes.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Sharpening is configured with a float value. The supported range is
-1.0 .. 1.0 with 0.0 being the default, and that means no sharpening
operation at all.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Noise reduction is configured with a float value. The supported range
is 0.0 .. 1.0 with 0.0 being the default, and that means no denoise
operation at all.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Add helper functions to ensure an operation VA buffer is allocated to
the right size; that filter caps get parsed and assigned to the right
operation too; and that float parameters are correctly scaled to fit
the reported range from the VA driver.
Add gst_vaapi_video_format_from_string() helper function to convert from
a video format string representation to a suitable GstVideoFormat. This
is just an alias to gst_video_format_from_string() for GStreamer 1.0.x
builds, and a proper iteration over all GstVideoFormat string representations
otherwise for earlier GStreamer 0.10.x builds.
Add helper functions to describe GstVaapiPoint and GstVaapiRectangle
structures as a standard GType. This could be useful to have them
described as a GValue later on.
Don't expose GstVaapiContext APIs and make them totally private to
libgstvaapi core library. That API would also tend to disappear in
a future revision. Likewise, don't expose GstVaapiDisplayCache API
but keep symbols visible so that the various render backends could
share a common display cache implementation in libgstvaapi.
Try to clean-up the documentation from any stale entry too.
Don't expose functions that reference a GstVaapiImageRaw, those are
meant to be internal only for implementing subpictures sync. Also add
a few private definitions to avoid functions calls for retrieving
image size and format information.
Use hardware accelerated XRenderComposite() function, from the RENDER
extension, to blit a pixmap to screen. Besides, this can also support
cropping and scaling.
Implement the new render-to-pixmap API. The only supported pixmap format
that will work is xRGB, with native byte ordering. Others might work but
they were not tested.
Add API to transfer VA urfaces to native pixmaps. Also add an API to
render a native pixmap, for completeness. In general, rendering to
pixmap would only be useful to certain VA drivers and use cases on
X11 display servers. e.g. GLX_EXT_texture_from_pixmap (TFP) handled
in an upper layer.
Mark dummy pictures as output already so that we don't try to submit
them to the upper layer since this is purely internal / temporary
picture for helping the decoder.
Once the picture was output, it is no longer necessary to keep an extra
reference to the underlying GstVideoCodecFrame. So, we can release it
earlier, and maybe subsequently release the associate surface proxy
earlier.
Fix new internal video format API, based on GstVideoFormat, to not
clobber with system symbols. So replace the gst_video_format_* prefix
with gst_vaapi_video_format_ prefix, even if the format type remains
GstVideoFormat.
Fix memory leak when processing interlaced pictures and that occurs
because the first field, represented as a GstVideoCodecFrame, never
gets released. i.e. when the picture is completed, this is generally
the case when the second field is successfully decoded, we need to
propagate the GstVideoCodecFrame of the first field to the original
GstVideoDecoder so that it could reclaim memory.
Otherwise, we keep accumulating the first fields into GstVideoDecoder
private frames list until the end-of-stream is reached. The frames
are eventually released there, but too late, i.e. too much memory
may have been consumed.
https://bugzilla.gnome.org/show_bug.cgi?id=701257
The queue of free objects to used was deallocated with g_queue_free_full().
However, this convenience function shall only be used if the original queue
was allocated with g_queue_new(). This caused memory corruption, eventually
leading to a crash.
The correct solution is to pair the g_queue_init() with the corresponding
g_queue_clear(), while iterating over all free objects to deallocate them.
Fix creation of surface pool objects to honour explicit pixel format
specification. If this operation is not supported, then fallback to
the older interface with chroma format.
If a VA surface was allocated with the chroma-format interface, try to
determine the underlying pixel format on gst_vaapi_surface_get_format(),
or return GST_VIDEO_FORMAT_ENCODED if this is not a supported operation.
Make it possible to create VA surfaces with a specific pixel format.
This is a new capability brought in by VA-API >= 0.34.0. If that
capability is not built-in (e.g. using VA-API < 0.34.0), then
gst_vaapi_surface_new_with_format() will return NULL.
Add gst_video_format_get_chroma_type() helper function to determine
the GstVaapiChromaType from a standard GStreamer video format. It is
possible to reconstruct that from GstVideoFormatInfo but it is much
simpler (and faster?) to use the local GstVideoFormatMap table.
Add new chroma formats available with VA-API >= 0.34.0. In particular,
this includes "RGB" chroma formats, and more YUV subsampled formats.
Also add a new from_GstVaapiChromaType() helper function to convert
libgstvaapi chroma type to VA chroma format.
Make gst_vaapi_image_pool_new() succeed, and thus returning a valid
image pool object, only if the underlying VA display does support the
requested VA image format.
Get rid of GstCaps to create surface/image pool, and use GstVideoInfo
structures instead. Those are smaller, and allows for streamlining
libgstvaapi more.
Add new video format mappings to VA image formats:
- YUV: packed YUV (YUY2, UYVY), grayscale (Y800) ;
- RGB: 32-bit RGB without alpha channel (XRGB, XBGR, RGBX, BGRX).
Fix debug message string with image format expressed with GstVideoFormat
instead of the obsolete format that turned out to be a fourcc.
This is a regression from git commit e61c5fc.
In particular, use gst_video_info_from_caps() helper function in VA image
for implementating gst_vaapi_image_get_buffer() [vaapidownload] and
gst_vaapi_image_update_from_buffer() [subpictures] in GStreamer 0.10 builds.
Drop GstVaapiImageFormat helpers since everything was moved to the new
GstVideoFormat based API. Don't bother with backwards compatibility and
just bump the library major version afterwards.
If the stream has a sequence_display_extenion, then attach the
display_horizontal/display_vertical dimension as the cropping
rectangle width/height to the GstVaapiPicture.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
If the Advanced profile has display_extension fields, then set the display
width/height dimension as cropping rectangle to the GstVaapiPicture.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
If the encoded stream has the frame_cropping_flag set, then associate
the cropping rectangle to GstVaapiPicture.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Make it possible associate an empty cropping rectangle to the surface
proxy, thus resetting any cropping rectangle that was previously set.
This allows for returning plain NULL when no cropping rectangle was
initially set up to the surface proxy, or if it was reset to defaults.
Add helper macros to retrieve the VA surface information like size
(width, height) or chroma type. This is a micro-optimization to avoid
useless function calls and NULL pointer re-checks in internal routines.
Add gst_vaapi_picture_set_crop_rect() helper function to copy the video
cropping information from raw bitstreams to each picture being decoded.
Also add helper function to surface proxy to propagate that information
outside of libgstvaapi. e.g. plug-in elements or standalone applications.
Signed-off-by: Sreerenj Balachandran <sreerenj.balachandran@intel.com>
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
The MPEG-2 standard specifies (6.3.7) that all quantisation matrices
shall be reset to their default values when a Sequence_Header() is
decoded.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Break the circular references between GstVaapiContext and its children
GstVaapiSurfaces. Since the VA surfaces held an extra reference to the
context, which holds a reference to its VA surfaces, then none of those
were released.
How does this impact support for subpictures?
The only situation when the parent context needs to disappear is when
it is replaced with another one because of a resolution change in the
video stream for instance, or a normal destroy. In this case, it does
not really matter to apply subpictures to the peer surfaces since they
are either gone, or those that are left in the pipe can probably bear
a reinstantiation of the subpictures for it.
So, parent_context is set to NULL when the parent context is destroyed,
other VA surfaces can still get subpictures attached to them, individually
not as a whole. i.e. subpictures for surface S1 will be created from
active composition buffers and associated to S1, subpictures for S2 will
be created from the next active composition buffers, etc. We don't try
to cache the subpictures in those cases (pending surfaces until EOS
is reached, or pending surfaces until new surfaces matching new VA context
get to be used instead).
Make sure to create the GLX context once the window object has completed
its creation. Since gl_resize() relies on the newly created window size,
then we cannot simply overload the GstVaapiWindowClass::create() hook.
So, we just call into gst_vaapi_window_glx_ensure_context() once the
window object is created in the gst_vaapi_window_glx_new*() functions.
Make it possible to add extra an extra filter to most of display cache
lookup functions so that the GstVaapiDisplay instance can really match
a compatible and existing display by type, instead of relying on extra
string tags (e.g. "X11:" prefix, etc.).
Drop obsolete GST_VAAPI_IS_xxx() helper macros since we are no longer
deriving from GObject and so those were only checking for whether the
argument was NULL or not. This is now irrelevant, and even confusing
to some extent, because we no longer have type checking.
Note: this incurs more type checking (review) but the libgstvaapi is
rather small, so this is manageable.
Make GstVaapiID a gsize instead of guessing an underlying integer large
enough to hold all bits of a pointer. Also drop GST_VAAPI_ID_NONE since
this is plain zero and that it is no longer passed as varargs.
Port GstVaapiDecoder and GstVaapiDecoder{MPEG2,MPEG4,JPEG,H264,VC1} to
GstVaapiMiniObject. Add gst_vaapi_decoder_set_codec_state_changed_func()
helper function to let the user add a callback to a function triggered
whenever the codec state (e.g. caps) changes.
Port GstVaapiVideoPool, GstVaapiSurfacePool and GstVaapiImagePool to
GstVaapiMiniObject. Drop gst_vaapi_video_pool_get_caps() since it was
no longer used for a long time. Make object allocators static, i.e.
local to the shared library.
Drop support for user-defined data since this capability was not used
so far and GstVaapiMiniObject represents the smallest reference counted
object type. Add missing GST_VAAPI_MINI_OBJECT_CLASS() helper macro.
Besides, since GstVaapiMiniObject is a libgstvaapi internal object, it
is also possible to further simplify the layout of the object. i.e. merge
GstVaapiMiniObjectBase into GstVaapiMiniObject.
Propagate the picture size from the bitstream to the GstVaapiDecoder,
and subsequent user who installed a signal on notify::caps. This fixes
decoding of TS streams when the demuxer failed to extract the required
information.
Add gst_vaapi_decoder_get_frame_with_timeout() helper function that will
wait for a frame to be decoded, until the specified timeout in microseconds,
prior to returning to the caller.
This is a fix to performance regression from 851cc0, whereby the vaapidecode
loop executed on the srcpad task was called to often, thus starving all CPU
resources.
Rework GstVideoDecoder::handle_frame() to decode the current frame,
while possibly waiting for a free surface, and separately submit all
decoded frames from a task. This makes it possible to pop and render
decoded frames as soon as possible.
Add support for interlaced streams with GStreamer 1.0 too. Basically,
this enables vaapipostproc, though it is not auto-plugged yet. We also
make sure to reply to CAPS queries, and happily handle CAPS events.
Fix support for interlaced contents with GStreamer 0.10. In particular,
propagate GstVaapiSurfaceProxy frame flags to GstVideoCodecFrame flags
correctly.
This is a regression from commit 87e5717.
Rename GstVaapiDecoderFrame to GstVaapiParserFrame because this data
structure was only useful to parsing and a proper GstvaapiDecoderFrame
instance will be created instead.
Fix regression from 0.4-branch whereby GstVaapiSurfaceProxy no longer
held any information about the expected presentation timestamp, frame
duration or additional flags like interlaced or top-field-first.
Use new GstVaapiSurfaceProxy internal helper functions to propagate the
necessary GstVideoCodecFrame flags to vaapidecode (GStreamer 0.10).
Also make GstVaapiDecoder push_frame() operate similarly to drop_frame().
i.e. increase the GstVideoCodecFrame reference count in push_frame rather
than gst_vaapi_picture_output().
Add more attributes for raw decoding modes, i.e. directly through the
libgstvaapi helper library. In particular, add presentation timestamp,
duration and a couple of flags (interlaced, TFF, RFF, one-field).
Ensure libgstvaapi-glx*.so builds against libdl since dlsym() is used
to resolve glXGetProcAddress() from GLX libraries. This fix builds on
Fedora 17.
https://bugzilla.gnome.org/show_bug.cgi?id=698046
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Fix previous commit whereby gst_vaapi_decoder_get_codec_state() was
supposed to make GstVaapiDecoder own the return GstVideoCodecState
object. Only comment was updated, not the actual code.
Make gst_vaapi_decoder_get_codec_state() return the original codec state,
i.e. make the GstVaapiDecoder object own the return state so that callers
that want an extra reference to it would just gst_video_codec_state_ref()
it before usage. This aligns the behaviour with what we had before with
gst_vaapi_decoder_get_caps().
This is an ABI incompatible change, library major version was bumped from
previous release (0.5.2).
Fix the name of the plug-in element reported to gst-inspect-1.0. i.e. we
need an explicit definition for GStreamer >= 1.0 because the GST_PLUGIN_DEFINE
incorrectly uses #name for creating the plug-in name, instead of using macro
expansion (and let further expansion of macros) through e.g. G_STRINGIFY().
Fix make dist to allow build for either GStreamer 0.10 or 1.0. i.e. make
sure to include all source files in either case while generating source
tarballs.
Introduce a new configure option --with-gstreamer-api that determines
the desired GStreamer API to use. By default, GStreamer 1.0 is selected.
Also integrate more compatibility glue into gstcompat.h and plugins.
This integrates support for GStreamer API >= 1.0 only in the libgstvaapi
core decoding library. The changes are kept rather minimal here so that
the library retains as little dependency as possible on core GStreamer
functionality.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Drop the following functions that are now obsolete:
- gst_vaapi_context_get_surface()
- gst_vaapi_context_put_surface()
- gst_vaapi_context_find_surface_by_id()
- gst_vaapi_surface_proxy_new()
- gst_vaapi_surface_proxy_get_context()
- gst_vaapi_surface_proxy_set_context()
- gst_vaapi_surface_proxy_set_surface()
This is an API change.
Now that the surface pool is reference counted in the surface proxy wrapper,
we can safely ignore surface size checks in gst_vaapi_decoder_ensure_context().
Besides, this check is already performed in gst_vaapi_context_reset_full().
Introduce gst_vaapi_surface_proxy_new_from_pool() to allocate a new surface
proxy from the context surface pool. This change also makes sure to retain
the parent surface pool in the proxy.
Besides, it was also totally useless to attach/detach parent context to
VA surface each time we acquire/release it. Since the whole context owns
all associated VA surfaces, we can mark this as such only once and for all.
Move GstVaapiVideoMeta from core libgstvaapi decoding library to the
actual plugin elements. That's only useful there. Also inline reference
counting code from GstVaapiMiniObject.
Make sure libgstvaapi core decoding library doesn't include un-needed
dependencies. So, move out GstVaapiVideoConverterGLX to plugins instead.
Besides, even if the vaapisink element is not used, we are bound to have
a correctly populated GstSurfaceBuffer from vaapidecode.
Also clean-up the file along the way.
In decode_codec_data(), force initialization of format to zero so that
we can catch up cases where codec-data has neither "format" nor "wmvversion"
fields, thus making it possible to gracefully fail in this case.
Add a new GstVaapiDecoder::decode_codec_data() hook to actually decode
codec-data in the decoder sub-class. Provide a common shared helper
function to do the actual work and delegating further to the sub-class.
Drop GstVaapiDecoderUnit buffer field (GstBuffer) since it's totally
useless nowadays as creating sub-buffers doesn't bring any value. It
actually means more memory allocations. We can't do without that in
JPEG and MPEG-4:2 decoders.
Use gst_element_class_set_static_metadata() from GStreamer 1.0, which
basically is the same as gst_element_class_set_details_simple() in
GStreamer 0.10 context.
Use newer gst_video_overlay_rectangle_get_pixels_unscaled_raw() helper
function with GStreamer 0.10 compatible semantics, or that tries to
approach the current meaning. Basically, this is also just about moving
the helper to gstcompat.h.
Force luma_log2_weight_denom and chroma_log2_weight_denom to zero if
there is no pred_weight_table() that was parsed.
This is a workaround for the VA intel-driver on Ivy Bridge.
g_async_queue_timeout_pop() appeared in glib 2.31.18. Implement it as
g_async_queue_timed_pop() with a GTimeVal as the final time to wait for
new data to arrive in the queue.
There shall be only one place to call decode_current_picture(), and this
is in the end_frame() hook. The EOS unit is processed after end_frame()
so this means we cannot have a valid picture to decode/output at this
point.
Improve robustness when some expected packets where not received yet
or that were not correctly decoded. For example, don't try to decode
a picture if there was no valid sequence or picture headers.
Fix gst_vaapi_decoder_get_surface() to only return frames with a valid
surface proxy, i.e. with a valid VA surface. This means that any frame
marked as decode-only is simply skipped.
If the decoder was not able to decode a frame because insufficient
information was available, e.g. missing sequence or picture header,
then allow the frame to be gracefully dropped without generating
any error.
It is also possible that a frame is not meant to be displayed but
only used as a reference, so dropping that frame is also a valid
operation since GstVideoDecoder base class has extra references to
that GstVideoCodecFrame that needs to be released.
The Wayland API is not fully thread-safe and client applications shall
perform locking themselves on key functions. Besides, make sure to
release the lock if the _render() function fails.
Introduce gst_vaapi_window_wayland_sync() helper function to wait for
the completion of the redraw request. Use it in _render() function to
actually block until the previous draw request is completed.
The redraw callback needs to be attached to the surface prior to the
commit. Otherwise, the callback notifies the next surface repaint,
which is not the desired behaviour. i.e. we want to be notified for
the surface we have just filled.
Another isse was the redraw_pending was reset before the actual completion
of the frame redraw callback function, thus causing concurrency issues.
e.g. the callback could have been called again, but with a NULL buffer.
When the Wayland display is shared, we still have to create our own local
shell and compositor objects, since they are not propagated from the cache.
Likewise, we also need to determine the display size or vaapisink would
fail to account for the display aspect ratio, and will try to create a 0x0
window.
Fix build with newer VC-1 codecparser where dqsbedge was renamed to
dqbedge, and now represents either DQSBEDGE or DQDBEDGE depending on
the actual value of DQPROFILE.
Fix size of encapsulated BDUs since GstVC1BDU.size actually represents
the size of the BDU data, starting from offset, i.e. after any start
code is parsed.
This fixes a buffer overflow during the unescaping process.
The AVI demuxer (avidemux) does not set a proper "format" attribute
to the generated caps. So, try to recover the video codec format from
the "wmvversion" property instead.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Don't create temporary GstBuffers for all decoder units, even if they
are lightweight "sub-buffers", since it is not really necessary to keep
the buffer data around.
Implement GstVaapiDecoder.start_frame() and end_frame() semantics so
that to create new VA context earlier and submit VA pictures to the
HW for decoding as soon as possible. i.e. don't wait for the next
frame to start decoding the previous one.
Use GstVaapiDpb interface instead of maintaining our own prev and next
picture pointers. While doing so, try to derive a sensible POC value.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Avoid usage of goto. Simplify decode_step() process to first accumulate all
pending buffers into the GstAdapter, and then parse and decode units from
that input adapter. Stop the process once a frame is fully decoded or an
error occurred.
Make sure we always have a free surface left to use for decoding the
current frame. This means that decode_step() has to return once a frame
gets decoded. If the current adapter contains more buffers with valid
frames, they will get parsed and decoded on subsequent iterations.
Keep only one DPB interface and rename gst_vaapi_dpb2_get_references()
to gst_vaapi_dpb_get_neighbours() so that to retrieve pictures in DPB
around the specified picture POC.
Move GstVaapiDpbMpeg2 API to a more generic version that could also be
useful to other decoders that require 2 reference pictures, e.g. VC-1.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Fix support for global-alpha subpictures. The previous changes brought
the ability to check for GstVideoOverlayRectangle changes by comparing
the underlying pixel buffer pointers. If sequence number and pixel data
did not change, then this is an indication that only the global-alpha
value changed. Now, try to update the underlying VA subpicture global-alpha
value.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Don't re-upload VA subpicture if only the render rectangle changed.
Rather deassociate the subpicture and re-associate it with the new
render rectangle.
A GstVideoOverlayRectangle is created whenever the underlying pixels data
change. However, when global-alpha is supported, it is possible to re-use
the same GstVideoOverlayRectangle but with a change to the global-alpha
value. This process causes a change of sequence number, so we can no longer
check for that.
Still, if sequence numbers did not change, then there was no change in
global-alpha either. So, we need a way to compare the underlying GstBuffer
pointers. There is no API to retrieve the original pixels buffer from
a GstVideoOverlayRectangle. So, we use the following heuristics:
1. Use gst_video_overlay_rectangle_get_pixels_unscaled_argb() with the same
format flags from which the GstVideoOverlayRectangle was created. This
will work if there was no prior consumer of the GstVideoOverlayRectangle
with alternate (non-"native") format flags.
2. In overlay_rectangle_has_changed_pixels(), we have to use the same
gst_video_overlay_rectangle_get_pixels_unscaled_argb() function but
with flags that match the subpicture. This is needed to cope with
platforms that don't support global-alpha in HW, so the gst-video
layer takes care of that and fixes this up with a possibly new
GstBuffer, and hence pixels data (or) in-place by caching the current
global-alpha value applied. So we have to determine the rectangle
was previously used, based on what previous flags were used to
retrieve the ARGB pixels buffer.
We previously assumed that an overlay composition changed if the number
of overlay rectangles in there actually changed, or that the rectangle
was updated, and thus its seqnum was also updated.
Now, we can cope with cases where the GstVideoOverlayComposition grew
by one or a few more overlay rectangles, and the initial overlay rectangles
are kept as is.
Create the GPtrArray once in the _init() function and destroy it only
in the _finalize() function. Then use overlay_clear() to remove all
subpicture associations for intermediate updates, don't recreate the
GPtrArray.
Make GstVaapiOverlayRectangle a reference counted object. Also make
sure that overlay_rectangle_new() actually creates and associates the
VA subpicture.
Handle global-alpha from GstVideoOverlayComposition API. Likewise,
the same code path could also work for premultiplied-alpha but this
was not tested.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Add the necessary helpers in GstVaapiDisplay to determine whether subpictures
with global alpha are supported or not. Also add accessors in GstVaapiSubpicture
to address this feature.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Add premultiplied-alpha and global-alpha feature flags, along with converters
between VA-API and gstreamer-vaapi definitions. Another round of helpers is
also necessary for GstVideoOverlayComposition API.
Use GPOINTER_TO_SIZE() instead of GPOINTER_TO_UINT() while manipulating
pointers. The latter is meant to be 32-bit only, not uintptr_t like size.
Only a gsize can hold all bits of a pointer.
Thanks to Ouping Zhang for spotting this error.
Heuristic: if the second start-code is available, check whether that
one marks the start of a new frame because e.g. this is a sequence
or picture header. This doesn't save much, since we already cache the
results.
Accelerate scan for start codes by skipping up to 3 bytes per iteration.
A start code prefix is defined by the following bytes: 00 00 01. Thus,
for any group of 3 bytes (xx yy zz), we have the following possible cases:
1. If zz != 1, this cannot be a start code, then skip 3 bytes;
2. If yy != 0, this cannot be a start code, then skip 2 bytes;
3. If xx != 0 or zz != 1, this cannot be a start code, then skip 1 byte;
4. xx == 00, yy == 00, zz == 1, we have match!
This algorithm requires to peek bytes from the adapter. This increases the
amount of bytes copied to a temporary buffer, but this process is much faster
than scanning for all the bytes and using shift/masks. So, overall, this is
a win.
Move parsing back to decoding step, but keep functions separate for now.
This is needed for future optimizations that may introduce some meta data
for parsed info attached to codec frames.
Optimize pre-allocation of decoder units, thus avoiding un-necessary
memory reallocations. The heuristic used is that we could have around
one slice unit per macroblock line.
Use a GArray to hold decoder units in a frame, instead of a single-linked
list. This makes 'append' calls faster, but not that much. At least, this
makes things clearer.
Allocate decoder unit earlier in the main parse() function and don't
delegate this task to derived classes. The ultimate purpose is to get
rid of dynamic allocation of decoder units.
The SPS, PPS and slice headers are not fully zero-initialized in the
codecparsers/ library. Rather, the standard upstream behaviour is to
initialize only certain syntax elements with some inferred values if
they are not present in the bitstream.
At the gstreamer-vaapi decoder level, we need to further initialize
certain syntax elements with some sensible default values so that to
not complicate VA drivers that just pass those verbatim to the HW,
and also avoid an memset() of the whole decoder unit.
Signed-off-by: Sreerenj Balachandran <sreerenj.balachandran@intel.com>
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Make GstVaapiVideoBuffer a simple wrapper for video meta. This buffer is
no longer necessary but for compatibility with GStreamer 0.10 APIs or users
expecting a GstSurfaceBuffer like Clutter.
Use PTS value computed by the decoder, which could also be derived from
the GstVideoCodecFrame PTS. This makes it possible to fix up the PTS if
the original one was miscomputed or only represented a DTS instead.