Since the AV1 specification is not explicitly mentioning about
the array size bounds, array sizes in scalability structure
should be defined as possible maximum sizes that can have.
Also, this commit removes GST_AV1_MAX_SPATIAL_LAYERS define from
public header which is API break but the define is misleading
and this patch is introducing ABI break already
ZDI-CAN-22300
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5823>
When the output alignment is smaller than the input alignment, for
example, When the output alignment is "FRAME" and the parse is likely
connecting to a decoder, the current PTS setting for AV1 frames inside
a TU is not very correct.
For example, a TU may begin with non-displayed frames and end with a
displayed frame. The current way will assign the PTS to the first
non-displayed frame, which is a decode-only frame and the PTS will be
discarded in the video decoder. While the last displayed frame has
invalid PTS, and so the video decoder needs to guess its PTS based on
the frame rate and previous frame's PTS. This is not a decent and
robust way. And more important, when the previous frames provide DTS,
the video decoder will also guess the PTS based on the previous frames'
DTS and trigger the warning like:
gstvideodecoder.c:3147:gst_video_decoder_prepare_finish_frame: \
<vavp9dec0> decreasing timestame
It sets the reordered_output and makes the decoder in free run mode.
We should correct the PTS for a TU, let the non-displayed frames have
no PTS while set the correct PTS to the displayed one. Also, when the
AV1 stream has multi spatial layers, there are more than one displayed
frames inside one TU with the same PTS.
Note: If the input alignment is not TU aligned, we can not know the
exact PTS of this TU, and so we just clear the PTS of the decode only
frame and leave others unchanged.
We also correct all the PTS if the output is OBU aligned. All their
PTS and DTS are set to the input buffer's PTS.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/3182>
When the incoming data has big alignment than the output, we do not need to
call finish_frame() and exit the current handle_frame() for each splitted
frame. We can push them all at one shot with in one handle_frame(), whcih
may improve the performance and can help us to find the edge of TU.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/3182>
According to spec:
color range equal to 0 shall be referred to as the studio swing
representation and color range equal to 1 shall be referred to as
the full swing representation.
The current status is just the opposite.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/2288>
Found via an analyzed build for Clang. Specifically we had:
gstav1parse.c[1850,11] in gst_av1_parse_detect_stream_format: Logic error: The left operand of '==' is a garbage value
gstav1parse.c[1606,11] in gst_av1_parse_handle_to_small_and_equal_align: Logic error: The left operand of '==' is a garbage value
Also a couple of false-positives:
gstav1parse.c[1398,24] in gst_av1_parse_handle_one_obu: Logic error: Branch condition evaluates to a garbage value
gstav1parse.c[1440,37] in gst_av1_parse_handle_one_obu: Logic error: The left operand of '-' is a garbage value
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/2230>
When we negotiate with downstream, We should use the intersected
caps of input and output to decide the alignment and stream format.
The current code just uses the input caps which may lack the stream
format.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/1837>
The demux now outputs the AV1 stream in "tu" alignment, so we do not need
to detect the input alignment. But the annex b stream format is not recognized
by the demux, we still need to detect that stream format for the first input.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/1837>
The current manner for deciding the new temporal unit is based on
temporal delimiter(TD) OBU. We only start a new temporal unit when
the TD comes.
But some streams do not have TD at all, which makes the output "TU"
alignment fail to work. We now add check based on the relationship
between the different layers and it can successfully judge the TU edge.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/1634>
Some streams may have problematic OBUs at the beginning, which causes
the parse fail to detect the alignment and return error. For example,
there may be verbose OBUs before a valid sequence, which should be
discarded until we meet a valid sequence. We should let the parse
continue when we meet such cases, rather than just return error.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/1634>