When we negotiate with downstream, We should use the intersected
caps of input and output to decide the alignment and stream format.
The current code just uses the input caps which may lack the stream
format.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/1837>
The demux now outputs the AV1 stream in "tu" alignment, so we do not need
to detect the input alignment. But the annex b stream format is not recognized
by the demux, we still need to detect that stream format for the first input.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/1837>
The current manner for deciding the new temporal unit is based on
temporal delimiter(TD) OBU. We only start a new temporal unit when
the TD comes.
But some streams do not have TD at all, which makes the output "TU"
alignment fail to work. We now add check based on the relationship
between the different layers and it can successfully judge the TU edge.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/1634>
Some streams may have problematic OBUs at the beginning, which causes
the parse fail to detect the alignment and return error. For example,
there may be verbose OBUs before a valid sequence, which should be
discarded until we meet a valid sequence. We should let the parse
continue when we meet such cases, rather than just return error.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/1634>