Add a new property "do-aggregate"* to the H.264 RTP payloader which
enables STAP-A aggregation as per [RFC-6184][1]. With aggregation enabled,
packets are bundled instead of sent immediately, up until the MTU size.
Bundles also end at access unit boundaries or when packets have to be
fragmented.
*: The property-name is kept generic since it might apply more widely,
e.g. STAP-B or MTAP.
[1]: https://tools.ietf.org/html/rfc6184#section-5.7
Closes https://gitlab.freedesktop.org/gstreamer/gst-plugins-good/issues/434
If an rtx packet arrives that hasn't been requested (it might
have been requested from prior to a reset), ignore it so that
it doesn't inadvertently trigger a clock skew.
In the case of reordered packets, calculating skew would cause
pts values to be off. Only calculate skew when packets come
in as expected. Also, late RTX packets should not trigger
clock skew adjustments.
Fixes#612
mpegaudioparse suggests MP3 needs 10 or 30 frames of lead-in (depending on
mpegaudioversion, which we don't know here), thus provide at least 30 frames
lead-in for such cases as a followup to commit cbfa4531ee.
The pre_push_frame default clipping behaviour was introduced in 2010
with commit 30be03004e and modified with commit 4163969a24 in 2011,
when most parsers didn't implement a pre_push_frame yet. Not having it
meant that clipping was done by default. Those that did implement a
pre_push_frame (flacparse and mpegaudioparse) at the time, had the flag
adjusted as part of the 2011 refactor work.
All other parsers got a pre_push_frame vfunc implementation only in
2013, but seem to have forgot to keep the clipping behaviour, as
was done automatically when a pre_push_frame implementation doesn't
exist for the parser. aacparse lost it with commit 91d4abcea in
July 2013; the others in Dec 2013 as part of AUDIO_CODEC tag posting
in commits 6f89b430e, d2ab5199b, 29f2cae12, 753d3c23a and 292780574.
Otherwise it can happen that we receive a caps event, then another caps
event and only then buffers. We would then send out the first caps event
in the stream but mark buffers with the caps version of the second caps
event.
Otherwise it can happen that we already collected 7 caps, miss the 8th
caps packet (packet loss) and then re-use the 1st caps for the following
buffers instead of the 8th caps which will likely cause errors further
downstream unless both caps are accidentally the same.
Keeping old caps around does not seem to have any value other than
potentially causing errors. We would always receive new caps whenever
they change (even if they were previous ones) and it's very unlikely
that they happen to be exactly the same as the previous ones.
Also after having received new caps or a buffer with a next caps
version, no buffers with old caps version will arrive anymore.
Make sure to clear any master clock on the media_clock
before unreffing it to release the timer callback that's
updating the clock and keeping it reffed.
Clear the mastering_display_info_present field explicitly
after reallocating the track context into a video context
to avoid uninitialised warnings in valgrind
If, say, a rtx-packet arrives really late, this can have a dramatic
effect on the jitterbuffer clock-skew logic, having it being reset
and losing track of the current dts-to-pts calculations, directly affecting
the packets that arrive later.
This is demonstrated in the test, where a RTX packet is pushed in really
late, and without this patch the last packet will have its PTS affected
by this, where as a late RTX packet should be redundant information, and
not affect anything.
This patch corrects the delay set on EXPECTED timers that are added when
processing gaps. Previously the delay could be too small so that
'timout + delay' was much less than 'now', causing the following retries
to be scheduled too early. (They were sent earlier than
rtx-retry-timeout after the previous timeout.)
Turns out that the "big-gap"-logic of the jitterbuffer has been horribly
broken.
For people using lost-events, an RTP-stream with a gap in sequencenumbers,
would produce exactly that many lost-events immediately.
So if your sequence-numbers jumped 20000, you would get 20000 lost-events
in your pipeline...
The test that looks after this logic "test_push_big_gap", basically
incremented the DTS of the buffer equal to the gap that was introduced,
so that in fact this would be more of a "large pause" test, than an
actual gap/discontinuity in the sequencenumbers.
Once the test was modified to not increment DTS (buffer arrival time) with
a similar gap, all sorts of crazy started happening, including adding
thousands of timers, and the logic that should have kicked in, the
"handle_big_gap_buffer"-logic, was not called at all, why?
Because the number max_dropout is calculated using the packet-rate, and
the packet-rate logic would, in this particular test, report that
the new packet rate was over 400000 packets per second!!!
I believe the right fix is to don't try and update the packet-rate if
there is any jumps in the sequence-numbers, and only do these calculations
for nice, sequential streams.
gst_splitmux_src_activate_part() configures the pad information
before starting the pad task, but occasionally the changes it makes
to the pad are not seen in the pad task because they're not
protected by the right locking. Use the pad's object lock to
protect those variables.
Fix a deadlock around the pads list by using an RW lock to
allow simultaneous readers. The pad list doesn't really changes
except at startup and shutdown.
Make the debug output less confusing by not mentioning a src
pad when doing calculations on the sink pad side.
Improve debug around why a GOP is considered overflowing a fragment
AAC and various other audio codecs need a couple frames of lead-in to
decode it properly. The parser elements like aacparse take care of it
via gst_base_parse_set_frame_rate, but when inside a container, the
demuxer is doing the seek segment handling and never gives lead-in
data downstream.
Handle this similar to going back to a keyframe with video, in the
same place. Without a lead-in, the start of the segment is silence,
when it shouldn't, which becomes especially evident in NLE use cases.