The problem was this:
Due to the highly irregular arrival of RTX-packet the max-misorder variable
could be pushed very low. (-10).
If you then at some point get a big in the sequence-numbers (62 in the
test) you end up sending RTX-requests for some of those packets, and then
if the sender answers those requests, you are going to get a bunch of
RTX-packets arriving. (-13 and then 5 more packets in the test)
Now, if max-misorder is pushed very low at this point, these RTX-packets
will trigger the handle_big_gap_buffer() logic, and because they arriving
so neatly in order, (as they would, since they have been requested like
that), the gst_rtp_jitter_buffer_reset() will be called, and two things
will happen:
1. priv->next_seqnum will be set to the first RTX packet
2. the 5 RTX-packet will be pushed into the chain() function
However, at this point, these RTX-packets are no longer valid, the
jitterbuffer has already pushed lost-events for these, so they will now
be dropped on the floor, and never make it to the waiting loop-function.
And, since we now have a priv->next_seqnum that will never arrive
in the loop-function, the jitterbuffer is now stalled forever, and will
not push out another buffer.
The proposed fixes:
1. Don't use RTX in calculation of the packet-rate.
2. Don't use RTX in large-gap logic, as they are likely to be dropped.
gst_buffer_map () results in memcopying when a GstBuffer contains
more than one GstMemory.
This has quite an impact on performance on systems with limited amount
of resources. With this patch the whole GstBuffer will not be mapped at
once, instead each individual GstMemory will be iterated and mapped
separately.
There is a use-case for a server to re-payload opus going through it.
Problem was that the payloader requires channels in the caps, but
this is not something the depayloader can parse out of the stream, meaning
caps-negotiation would fail.
Removing the requirement of channels in the template-caps fixes this.
Mainly generalize all the latest tests that have found various stalls
in the jitterbuffer, so that they only consist of a series of packets
with various seqnum/rtptime/rtx combinations, arriving at a specific time.
This means future tests can be more easily written to prove certain
behavior does not cause stalls.
Also fix the warning on windows:
warning C4244: 'initializing': conversion from 'double' to 'gint', possible loss of data
This is a concept that only applies when a buffer arrives in the chain
function, and it has already been scheduled as part of a "multi"-lost
timer.
However, "multi"-lost timers are now a thing of the past, making this
whole concept superflous, and this buffer is now simply counted as "late",
having already been pushed out (albeit as a lost-event).
There is a problem with the code today, where a single timer will
be scheduled for a series of lost packets, and then if the first packet
in that series arrives, it will cause a rescheduling of that timer, going
from a "multi"-timer to a single-timer, causing a lot of the packets
in that timer to be unaccounted for, and creating a situation in where
the jitterbuffer will never again push out another packet.
This patch solves the problem by instead of scheduling those lost packets
as another timer, it instead asks to have that lost-event pushed straight
out.
This very much goes with the intent of the code here: These packets are
so desperately late that no cure exists, and we might as well get the
lost-event out of the way and get on with it.
This change has some interesting knock-on effect being presented in
later commits. It completely removes the concept of "already-lost", so
that is why that test has been disabled in this commit, to be
removed later.
gst_buffer_map () results in memcopying when a GstBuffer contains
more than one GstMemory and when AVC (length-prefixed) alignment is used.
This has quite an impact on performance on systems with limited amount of
resources. With this patch the whole GstBuffer will not be mapped at once,
instead each individual GstMemory will be iterated and mapped separately.
When calling gst_rtp_jitter_buffer_reset you pass in a seqnum.
This is considered the starting-point for a new stream.
However, the old behavior would unref this buffer, basically lying to
the thread that is pushing out buffers saying that it can expect
this buffer, when it would never arrive. The resulting effect being no
more buffer pushed out of the jitterbuffer, and it would buffer
incoming data indefinitely.
By instead inserting the buffer in the gap_packets queue, the _reset()
function will take responsibility for using that as the first buffer
of the new stream.
Fixes#703
In order to concatenate fragments, splitmuxsrc offsets
the start of each fragment PTS to 0 to align it with the
previous file. This means that DTS can go negative for
the first fragment, with really bad results.
Add a fixed offset to outgoing timestamp ranges to
avoid that.
When using the testclock for determining clock in test, it is sometimes observed
that the clock entry is not registered in time by the aggregator. So deadlock occurs
between the aggregator and the test thread.
* Organize GstRtpFunnelPad and GstRtpFunnel separately
* Use G_GNUC_UNUSED instead of (void) casts
* Don't call an event "caps"
* Use semicolons after GST_END_TEST (helps gst-indent)
When used for processing bundled media streams within rtpbin the rtpssrcdemux element may
receive bad RTP and RTCP packets, these should not be treated as a fatal error.
The property is useful against atacks when the sender changes SSRC for
every RTP packet. The property with the same name introduced in rtpbin
was not enough, because we still can end up with thousands of pads
allocated in rtpssrcdemux.
When connected to an upstream rtpfunnel element, payload-type,
ssrc and clock-rate will not be present in the received caps.
rtprtxsend can already deal with only the clock rate being
present there, a new property is exposed to allow users to
provide a payload-type -> clock-rate map, this enables the
use of the max-size-time property for bundled streams.
The key is to make sure the jitterbuffer is set to NULL *before* the
ptdemux.
The race that existed would basically happen when ptdemux had reached
READY, and the jitterbuffer would then push a buffer, triggering a new
pad with a new payloadtype being added and ghosted to the rtpbin itself.
However, the srcpad of the ptdemux would now be inactive, and all the
sticky-event pushed on it would be swallowed, not allowing any to reach
the ghost-pad. Then the buffer in-flight would come to the ghostpad,
and we would assert that a buffer arrived before the necessary
events.
By simply re-ordering the state-changes, we ensure that there will be
no buffer racing into the ptdemux while its state is being changed,
and the problem disappears completely.
Notice also that there is not point in disconnecting the signals on the
ptdemux before this point, since we need the push-thread to settle
down before we can do this in a non-racy way.
One of the jitterbuffers functions is to try and make sense of weird
network behavior.
It is quite unhelpful for the jitterbuffer to start dropping packets
itself when what you are trying to achieve is better network resilience.
In the case of a skew, this could often mean the sender has restarted
in some fashion, and then dropping the very first buffer of this "new"
stream could often mean missing valuable information, like in the case
of video and I-frames.
This patch simply reverts back to the old behavior, prior to 8d955fc32b
and includes the simplest test I could write to demonstrate the behavior,
where a single packet arrives "perfectly", then a 50ms gap happens,
and then two more packets arrive in perfect order after that.
# Conflicts:
# tests/check/elements/rtpjitterbuffer.c
This commit adds custom element messages for when gstrtpjitterbuffer
drops an incoming rtp packets due to for example arriving too late.
Applications can listen to these messages on the bus which enables
actions to be taken when packets are dropped due to for example high
network jitter.
Two properties has been added, one to enable posting drop messages and
one to set a minimum time between each message to enable throttling the
posting of messages as high drop rates.
When the queue is full (and adding more packets would risk a seqnum
roll-over), the best approach is to just start pushing out packets
from the other side. Just pushing out the packets results in the
timers being left hanging with old seqnums, so it's safer to just
execute them immediately in this case. It does limit the timer space
to the time it takes to receiver about 32k packets, but without
extended sequence number, this is the best RTP can do.
This also results in the test no longer needed to have timeouts or
timers as pushing packets in drives everything.
Fixes#619
If the jitterbuffer head change, there is no need to systematically
wakeup the timer thread. The timer thread will be waken up on if
an earlier timeout has been pushed. This prevent some more spurious
wakeup when the system is loaded. As a side effect, cranking the clock
may set the clock at an earlier position.
Implement a single timer queue for all timers. The goal is to always use
ordered queues for storing timers. This way, extracting timers for
execution becomes O(1). This also allow separating the clock wait
scheduling from the timer itself and ensure that we only wake up the
timer thread when strictly needed.
The knew data structure is still O(n) on insertions and reschedule,
but we now use proximity optimization so that normal cases should be
really fast. The GList structure is also embeded intot he RtpTimer
structure to reduce the number of allocations.
Add a property which explicitly maps splitmuxsink pads to the
muxer pads they should connect to, overriding the implicit logic
that tries to match pads but yields arbitrary names.
RTP and RTCP packets can be muxed together on the same channel (see
RFC5761) and can arrive in the same buffer list.
The GStreamer rtpsession element support RFC5761, so add a test to cover
this case for buffer lists too.
Buffers with different timestamps (e.g. packets belonging to different
frames) can arrive together in the same buffer list,
Add a test to cover this case.
When a new source fails to pass the probation period (i.e. new packets
have non-consecutive sequence numbers), then no buffer shall be pushed
downstream. Add a test to validate this case.
Add a test to verify that stats about received packets are correct when
using buffer lists in the rtpsession receive path.
Split get_session_source_stats() in two to be able to get stats from
a GstRtpSession object directly.
The primary video stream is used to select fragment cut points
at keyframe boundaries. Auxilliary video streams may be
broken up at any packet - so fragments may not start with a keyframe
for those streams.
If the conflict is detected when sending a packet, then also send an
upstream event to tell the source to reconfigure itself.
Also ignore the collision if we see more than one collision from the same
remote source to avoid problems on loops.
Add a new property "do-aggregate"* to the H.264 RTP payloader which
enables STAP-A aggregation as per [RFC-6184][1]. With aggregation enabled,
packets are bundled instead of sent immediately, up until the MTU size.
Bundles also end at access unit boundaries or when packets have to be
fragmented.
*: The property-name is kept generic since it might apply more widely,
e.g. STAP-B or MTAP.
[1]: https://tools.ietf.org/html/rfc6184#section-5.7
Closes https://gitlab.freedesktop.org/gstreamer/gst-plugins-good/issues/434
In the case of reordered packets, calculating skew would cause
pts values to be off. Only calculate skew when packets come
in as expected. Also, late RTX packets should not trigger
clock skew adjustments.
Fixes#612
If, say, a rtx-packet arrives really late, this can have a dramatic
effect on the jitterbuffer clock-skew logic, having it being reset
and losing track of the current dts-to-pts calculations, directly affecting
the packets that arrive later.
This is demonstrated in the test, where a RTX packet is pushed in really
late, and without this patch the last packet will have its PTS affected
by this, where as a late RTX packet should be redundant information, and
not affect anything.
This patch corrects the delay set on EXPECTED timers that are added when
processing gaps. Previously the delay could be too small so that
'timout + delay' was much less than 'now', causing the following retries
to be scheduled too early. (They were sent earlier than
rtx-retry-timeout after the previous timeout.)
Turns out that the "big-gap"-logic of the jitterbuffer has been horribly
broken.
For people using lost-events, an RTP-stream with a gap in sequencenumbers,
would produce exactly that many lost-events immediately.
So if your sequence-numbers jumped 20000, you would get 20000 lost-events
in your pipeline...
The test that looks after this logic "test_push_big_gap", basically
incremented the DTS of the buffer equal to the gap that was introduced,
so that in fact this would be more of a "large pause" test, than an
actual gap/discontinuity in the sequencenumbers.
Once the test was modified to not increment DTS (buffer arrival time) with
a similar gap, all sorts of crazy started happening, including adding
thousands of timers, and the logic that should have kicked in, the
"handle_big_gap_buffer"-logic, was not called at all, why?
Because the number max_dropout is calculated using the packet-rate, and
the packet-rate logic would, in this particular test, report that
the new packet rate was over 400000 packets per second!!!
I believe the right fix is to don't try and update the packet-rate if
there is any jumps in the sequence-numbers, and only do these calculations
for nice, sequential streams.
sethostent() seems to be using a global state and we endup with leaks from
that API when called through shout_init(). We had the option to only
ignore the shout case, but the impression is that if we have shout and
another sethostend user, as it's a global state, we may endup with a
different stack trace for the same leak. So in the end, we just ignore
memory allocated by sethostent in general.
In this change we now protect the internal srcpads list using the
stream lock and limit usage of the internal stream lock to
preventing data flowing on the other src pad type while creating
and signalling the new pad.
This fixes a deadlock with RTPBin shutdown lock. These two locks would
end up being taken in two different order, which caused a deadlock. More
generally, we should not rely on a streamlock when handling out-of-band
data, so as a side effect, we should not take a stream lock when
iterating internal links.
We recently added code to remove outdate NACK to avoid using bandwidth
for packet that have no chance of arriving on time. Though, this had a
side effect, which is that it was to get an early RTCP packet with no
feedback into it. This was pretty useless but also had a side effect,
which is that the RTX RTT value would never be updated. So we we stared
having late RTX request due to high RTT, we'd never manage to recover.
This fixes the regression by making sure we keep at least one NACK in
this situation. This is really light on the bandwidth and allow for
quick recover after the RTT have spiked higher then the jitterbuffer
capacity.
Right now, we may call on-new-ssrc after we have processed the first
RTP packet. This prevents properly configuring the source as some
property like "probation" are copied internally for use as a
decreasing counter. For this specific property, it prevents the
application from disabling probation on auxiliary sparse stream.
Probation is harmful on sparse streams since the probation algorithm
assume frequent and contiguous RTP packets.
We used to split the NACK if a smaller seqnum of a range of seqnum was
submited. This test also make sure that the three operations (append,
prepend, update) works properly.
Calling rtp_session_send_rtcp before marking the source as requiring a
pli/fir/nack meant the rtcp_thread could be scheduled and start running
before the source was updated. This meant the request would not be sent
early but instead was transmitted with the next regular RTCP packet.
Add test for nack generation.
Add a test to verify that stats about sent and received packets are
correct even when using buffer lists.
NOTE: the newly introduced get_session_source_stats() selects the
desired source (sender or receiver) by filtering them by type (using the
get_sender parameter) rather than by ssrc because this simplifies the
code and it's good enough for testing purposes as there is usually one
source per type in the test setup.
Filtering by ssrc would have required handling asynchronous signals like
"on-new-sender-ssrc", with the relative locking, just to retrieve the
actual ssrc of the sender.
The tests create a buffer list and then use the chain_list callback to
verify that the correct packets have been pushed.
Move the creation and validation code next to each other so that the
reader can more easily understand what is going on.
While at it add some comments to introduce the two related functions.
Make it possible to differentiate between the position in the list and
the packet index in the global structures in check_packet, in some
future case the list may change, in case some element removes a buffer
from the list, and the two indices may not coincide.