The current manner for deciding the new temporal unit is based on
temporal delimiter(TD) OBU. We only start a new temporal unit when
the TD comes.
But some streams do not have TD at all, which makes the output "TU"
alignment fail to work. We now add check based on the relationship
between the different layers and it can successfully judge the TU edge.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/1634>