In push mode (streaming), if the audio size is smaller than segment buffer size, it would be ignored.
This happens because when the plugin receives an EOS signal while a single audio chunk that is less than the segment buffer size is buffered, it does not
flush this chunk. The fix is to flush the data chunk when it receives an EOS signal and has a single (first) chunk buffered.
How to reproduce:
1. Run gst-launch with tcp source
```
gst-launch-1.0 tcpserversrc port=3000 ! wavparse ignore-length=0 ! audioconvert ! filesink location=bug.wav
```
2. Send a wav file with unspecified data chunk length (0). Attached a test file
```
cat test.wav | nc localhost 3000
```
3. Compare the length of the source file and output file
```
ls -l test.wav bug.wav
-rw-rw-r-- 1 amr amr 0 Aug 15 11:07 bug.wav
-rwxrwxr-x 1 amr amr 3564 Aug 15 11:06 test.wav
```
The expected length of the result of the gst-lauch pipeline should be the same as the test file minus the headers (44), which is ```3564 - 44 = 3520``` but the actual output length is ```0```
After the fix:
```
ls -l test.wav fix.wav
-rw-rw-r-- 1 amr amr 3520 Aug 15 11:09 fix.wav
-rwxrwxr-x 1 amr amr 3564 Aug 15 11:06 test.wav
```
If VP8 is not encoded with error resilience enabled then any packet loss
causes very bad artefacts when decoding and waiting for the next
keyframe instead improves user experience considerably.
Various audio formats require an audio lead-in to decode it properly.
Most parsers would take care of it, but when a container like matroska is
involved, the demuxer handles the seeking and without its own lead-in
handling would never even pass the lead-in data to the parser.
This commit provides an initial implementation of that for audio/mpeg,
audio/x-ac3 and audio/x-eac3 by calculating the worst case lead-in time
needed from known samplerate, potential lead-in frames need and the
maximum blocksize possible for the format (as we don't parse that out
exactly in matroskademux) and seeking that much earlier in case of
accurate seeks. This is especially important for NLE use-cases with GES.
If accurate seeking to a position that happens to have a video keyframe,
it'll go back to the previous keyframe than needed, but with typical
video files that's the best we can do anyway without falling back to
scanning the clusters, as typically only keyframes are indexed in
Cueing Data.
If the media doesn't have a CUE, then we bisect for the cluster to seek
to with the same modified time as well in case of accurate seeking,
ensuring sufficient lead-in. This code path is typically hit only with
(suboptimal) audio-only matroska files, e.g. when created with ffmpeg,
which doesn't add a CUE for audio-only mkv muxing.
The send path in rtpsession processes the buffer list along the way,
sharing info and stats between packets in the same list, because it
assumes that all packets in a buffer list are from the same frame.
However, in the receiving path packets can arrive in all sorts of
arrangements:
- different sources,
- different frames (different timestamps),
- different types (multiplexed RTP and RTCP, invalid RTP packets).
so a more general approach should be used to correctly support buffer
lists in the receive path.
It turns out that it's simpler and more robust to process buffers
individually inside the rtpsession element even if they come in a buffer
list, and then reassemble a new buffer list when pushing the buffers
downstream.
This avoids complicating the existing code to make all functions
buffer-list-aware with the risk of introducing regressions,
To support buffer lists in the receive path and reduce the "push
overhead" in the pipeline, a new private field named processed_list is
added to GstRtpSessionPrivate, it is set in the chain_list handler and
used in the process_rtp callback; this is to achieve the following:
- iterate over the incoming buffer list;
- process the packets one by one;
- add the valid ones to a new buffer list;
- push the new buffer list downstream.
The processed_list field is reset before pushing a buffer list to be on
the safe side in case a single buffer was to be pushed by upstream
at some later point.
NOTE:
The proposed modifications do not change the behavior of the send path.
The process_rtp callback is called in rtpsource.c by the push_rtp
callback (via source_push_rtp) only when the source is not internal.
So even though push_rtp is also called in the send path, it won't end up
using process_rtp in this case because the source would be internal in
the send path.
The reasoning from above may suggest a future refactoring: push_rtp
might be split to better differentiate the send and receive path.
In push mode (streaming), if the last audio payload chunk is less than the segment rate buffer size, it would be ignored since the plugin waits until it has at least segment rate bufer size of audio.
The fix is to introduce a flushing flag that indicates that no more audio will be available so that the plugin can recognize this condition and flush the data is has even if it is less
than the desired segment rate buffer size.
This is useful to support the ONVIF case: when is-live is set to
FALSE and onvif-rate-control is no, the client can control the
rate of delivery and arrange for the server to block and still
keep sending when unblocked, without requiring back and forth
PAUSE / PLAY requests. This enables, amongst other things, fast
frame stepping on the client side.
When is-live is FALSE, we don't use a manager at all. This case
was actually already pretty well handled by the current code. The
standard manager, rtpbin, is simply no longer needed in this case.
Applications can instantiate a downloadbuffer after rtspsrc if
needed.
Refactor the code for parsing and generating the Range, taking
advantage of existing API in GstRtspTimeRange.
Only use the TCP protocol in that mode, as per the specification.
Generate an accurate segment when in that mode, and signal to the
depayloader that it should not generate its own segment, through
the "onvif-mode" field in the caps, see
<https://gitlab.freedesktop.org/gstreamer/gst-plugins-base/merge_requests/328>
for more information.
Translate trickmode seek flags to their ONVIF representation
Expose an onvif-rate-control property
Forwarding a single segment event from the pad that first gets
chained is incorrect: when that first event was sent by an element
such as x264enc, with its offset start, we end pushing out of segment
buffers for the other pad(s).
Instead, everytime the active pad changes, forward the appropriate
segment event.
Fixes https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/issues/1028
When it is not clear yet if a packet relative to a source should be
pushed, the packet is put into a queue, this happens in two cases:
- the source is still in probation;
- there is a large jump in seqnum, and it is not clear what
the cause is, future packets will help making a guess.
In either case stats about received packets are not updated at all; and
even if they were, when init_seq() is called it resets all receiver
stats, effectively loosing any possible stat about previously received
packets.
Fix this by taking into account the queued packets and update the stats
when calling init_seq().
Since commit c971d1a9a (rtpsource: refactor bitrate estimation,
2010-03-02) bytes_received filed in RTPSourceStats is set but then never
used again, expose it so that it can be used by user code to verify how
many bytes have been received.
According to RFC3550 lower-level headers should be considered for
bandwidth calculation.
See https://tools.ietf.org/html/rfc3550#section-6.2 paragraph 4:
Bandwidth calculations for control and data traffic include
lower-layer transport and network protocols (e.g., UDP and IP) since
that is what the resource reservation system would need to know.
Fix the source data to accommodate that.
Assume UDPv4 over IP for now, this is a simplification but it's good
enough for now.
While at it define a constant and use that instead of a magic number.
NOTE: this change basically reverts the logic of commit 529f443a6
(rtpsource: use payload size to estimate bitrate, 2010-03-02)
adjust/port from rtph264pay and allow sending the configuration data at
every IDR
The payloader was stripping the configuration data when the
config-interval was set to 0. The code was written in such a way !(a >
0) that it stripped the config when it was set at -1 (send config_data
as soon as possible).
This resulted in some MPEG4 streams where no GOP/VOP-I was detected to
be sent out without configuration.
In reverse playback, we don't want to rely on the position of the current
keyframe to decide a stream is EOS: the last GOP we push will start with
a keyframe, which position is likely to be outside of the segment.
Instead, let the normal seek_to_previous_keyframe mechanism do its job,
it works just fine.
If a key unit seek is performed with a time position that matches
the offset of a keyframe, but not its actual PTS, we need to
adjust the segment nevertheless.
For example consider the following case:
* stream starts with a keyframe at 0 nanosecond, lasting 40 milliseconds
* user does a key unit seek at 20 milliseconds
* we don't adjust the segment as the time position is "over" a keyframe
* we push a segment that starts at 20 milliseconds
* we push a buffer with PTS == 0
* an element downstream (eg rtponviftimestamp) tries to calculate the
stream time of the buffer, fails to do so and drops it
When the seek event contains a (newly-added) trickmode interval,
and TRICKMODE_KEY_UNITS was requested, only let through keyframes
separated with the required interval
The primary video stream is used to select fragment cut points
at keyframe boundaries. Auxilliary video streams may be
broken up at any packet - so fragments may not start with a keyframe
for those streams.
The time_position field of the stream is offset by the media_start
of its QtDemuxSegment compared to the start of the GstSegment of
the demuxer, take it into account when making comparisons.
If the conflict is detected when sending a packet, then also send an
upstream event to tell the source to reconfigure itself.
Also ignore the collision if we see more than one collision from the same
remote source to avoid problems on loops.
Add a new property "do-aggregate"* to the H.264 RTP payloader which
enables STAP-A aggregation as per [RFC-6184][1]. With aggregation enabled,
packets are bundled instead of sent immediately, up until the MTU size.
Bundles also end at access unit boundaries or when packets have to be
fragmented.
*: The property-name is kept generic since it might apply more widely,
e.g. STAP-B or MTAP.
[1]: https://tools.ietf.org/html/rfc6184#section-5.7
Closes https://gitlab.freedesktop.org/gstreamer/gst-plugins-good/issues/434
If an rtx packet arrives that hasn't been requested (it might
have been requested from prior to a reset), ignore it so that
it doesn't inadvertently trigger a clock skew.
In the case of reordered packets, calculating skew would cause
pts values to be off. Only calculate skew when packets come
in as expected. Also, late RTX packets should not trigger
clock skew adjustments.
Fixes#612
mpegaudioparse suggests MP3 needs 10 or 30 frames of lead-in (depending on
mpegaudioversion, which we don't know here), thus provide at least 30 frames
lead-in for such cases as a followup to commit cbfa4531ee.
The pre_push_frame default clipping behaviour was introduced in 2010
with commit 30be03004e and modified with commit 4163969a24 in 2011,
when most parsers didn't implement a pre_push_frame yet. Not having it
meant that clipping was done by default. Those that did implement a
pre_push_frame (flacparse and mpegaudioparse) at the time, had the flag
adjusted as part of the 2011 refactor work.
All other parsers got a pre_push_frame vfunc implementation only in
2013, but seem to have forgot to keep the clipping behaviour, as
was done automatically when a pre_push_frame implementation doesn't
exist for the parser. aacparse lost it with commit 91d4abcea in
July 2013; the others in Dec 2013 as part of AUDIO_CODEC tag posting
in commits 6f89b430e, d2ab5199b, 29f2cae12, 753d3c23a and 292780574.
Otherwise it can happen that we receive a caps event, then another caps
event and only then buffers. We would then send out the first caps event
in the stream but mark buffers with the caps version of the second caps
event.
Otherwise it can happen that we already collected 7 caps, miss the 8th
caps packet (packet loss) and then re-use the 1st caps for the following
buffers instead of the 8th caps which will likely cause errors further
downstream unless both caps are accidentally the same.
Keeping old caps around does not seem to have any value other than
potentially causing errors. We would always receive new caps whenever
they change (even if they were previous ones) and it's very unlikely
that they happen to be exactly the same as the previous ones.
Also after having received new caps or a buffer with a next caps
version, no buffers with old caps version will arrive anymore.
Make sure to clear any master clock on the media_clock
before unreffing it to release the timer callback that's
updating the clock and keeping it reffed.
Clear the mastering_display_info_present field explicitly
after reallocating the track context into a video context
to avoid uninitialised warnings in valgrind
If, say, a rtx-packet arrives really late, this can have a dramatic
effect on the jitterbuffer clock-skew logic, having it being reset
and losing track of the current dts-to-pts calculations, directly affecting
the packets that arrive later.
This is demonstrated in the test, where a RTX packet is pushed in really
late, and without this patch the last packet will have its PTS affected
by this, where as a late RTX packet should be redundant information, and
not affect anything.
This patch corrects the delay set on EXPECTED timers that are added when
processing gaps. Previously the delay could be too small so that
'timout + delay' was much less than 'now', causing the following retries
to be scheduled too early. (They were sent earlier than
rtx-retry-timeout after the previous timeout.)
Turns out that the "big-gap"-logic of the jitterbuffer has been horribly
broken.
For people using lost-events, an RTP-stream with a gap in sequencenumbers,
would produce exactly that many lost-events immediately.
So if your sequence-numbers jumped 20000, you would get 20000 lost-events
in your pipeline...
The test that looks after this logic "test_push_big_gap", basically
incremented the DTS of the buffer equal to the gap that was introduced,
so that in fact this would be more of a "large pause" test, than an
actual gap/discontinuity in the sequencenumbers.
Once the test was modified to not increment DTS (buffer arrival time) with
a similar gap, all sorts of crazy started happening, including adding
thousands of timers, and the logic that should have kicked in, the
"handle_big_gap_buffer"-logic, was not called at all, why?
Because the number max_dropout is calculated using the packet-rate, and
the packet-rate logic would, in this particular test, report that
the new packet rate was over 400000 packets per second!!!
I believe the right fix is to don't try and update the packet-rate if
there is any jumps in the sequence-numbers, and only do these calculations
for nice, sequential streams.
gst_splitmux_src_activate_part() configures the pad information
before starting the pad task, but occasionally the changes it makes
to the pad are not seen in the pad task because they're not
protected by the right locking. Use the pad's object lock to
protect those variables.
Fix a deadlock around the pads list by using an RW lock to
allow simultaneous readers. The pad list doesn't really changes
except at startup and shutdown.
Make the debug output less confusing by not mentioning a src
pad when doing calculations on the sink pad side.
Improve debug around why a GOP is considered overflowing a fragment
AAC and various other audio codecs need a couple frames of lead-in to
decode it properly. The parser elements like aacparse take care of it
via gst_base_parse_set_frame_rate, but when inside a container, the
demuxer is doing the seek segment handling and never gives lead-in
data downstream.
Handle this similar to going back to a keyframe with video, in the
same place. Without a lead-in, the start of the segment is silence,
when it shouldn't, which becomes especially evident in NLE use cases.
In this change we now protect the internal srcpads list using the
stream lock and limit usage of the internal stream lock to
preventing data flowing on the other src pad type while creating
and signalling the new pad.
This fixes a deadlock with RTPBin shutdown lock. These two locks would
end up being taken in two different order, which caused a deadlock. More
generally, we should not rely on a streamlock when handling out-of-band
data, so as a side effect, we should not take a stream lock when
iterating internal links.
This means we can use some newer features and get rid of some
boilerplate code using the G_DECLARE_* macros.
As discussed on IRC, 2.44 is old enough by now to start depending on it.
It must be accurate for all samples to work in Final Cut properly, so
the best we can do is to assume that all samples are the same as the
first. Bigger samples are truncated, smaller samples are padded.
This takes the timestamp of the earliest stream and offsets it so that
it starts at 0. Some software (VLC, ffmpeg-based) does not properly
handle Matroska files that start at timestamps much bigger than zero.
Closes https://gitlab.freedesktop.org/gstreamer/gst-plugins-good/issues/449
There is only a single sink element in async-finalize mode, and we would
keep the running time from previous fragments set in that case. As we
don't ever set the running time for the very last fragment on EOS, this
would mean that the closing time reported for the very last fragment is
the same as the closing time of the previous fragment.
This is a tiny clarification as the storage was loosely named "storage".
This change clarify that the storage is specificaly used for received RTP
packets. This is unlike the storage found in rtprtxsend that stores a
backlog of sent RTP packets.
We recently added code to remove outdate NACK to avoid using bandwidth
for packet that have no chance of arriving on time. Though, this had a
side effect, which is that it was to get an early RTCP packet with no
feedback into it. This was pretty useless but also had a side effect,
which is that the RTX RTT value would never be updated. So we we stared
having late RTX request due to high RTT, we'd never manage to recover.
This fixes the regression by making sure we keep at least one NACK in
this situation. This is really light on the bandwidth and allow for
quick recover after the RTT have spiked higher then the jitterbuffer
capacity.
The second udpsrc (rtcp) might not have seen the segment event if it was
not enabled or if rtcp is not available on the server. So if the
application tries to send an EOS event it will try to set an invalid
seqnum to the event.
Right now, we may call on-new-ssrc after we have processed the first
RTP packet. This prevents properly configuring the source as some
property like "probation" are copied internally for use as a
decreasing counter. For this specific property, it prevents the
application from disabling probation on auxiliary sparse stream.
Probation is harmful on sparse streams since the probation algorithm
assume frequent and contiguous RTP packets.
Scaletempo doesn't support non-interleaved layout. Not explicitely stating this
would trigger critical warnings and a caps negotiation failure when scaletempo
is used as playbin audio-filter.
Patch suggested by George Kiagiadakis <george.kiagiadakis@collabora.com>.
Fixes#591
Fix doc chunks to not use that syntax for links that have the
url as description, it will be put verbatim into the xml/*.xml
file and then the expat parser will throw a syntax error like:
File "../../common/mangle-db.py", line 71, in <module>
main()
File "../../common/mangle-db.py", line 69, in main
patch (details.replace("-details", ""), os.path.basename(details))
File "../../common/mangle-db.py", line 20, in patch
doc = xml.dom.minidom.parse(related)
File "/usr/lib/python2.7/xml/dom/minidom.py", line 1918, in parse
return expatbuilder.parse(file)
File "/usr/lib/python2.7/xml/dom/expatbuilder.py", line 924, in parse
result = builder.parseFile(fp)
File "/usr/lib/python2.7/xml/dom/expatbuilder.py", line 207, in parseFile
parser.Parse(buffer, 0)
xml.parsers.expat.ExpatError: not well-formed (invalid token): line 84, column 7
If the incoming frame buffer has GST_BUFFER_FLAG_DISCONT set this should
be preserved and set for the first output buffer too, like other
payloaders do.
Spotted with gst-validate-1.0 when adding integration tests for
rtpsession, a minimal test to reproduce the issue is:
$ gst-validate-1.0 videotestsrc num-buffers=1 ! rtpvrawpay ! identity ! fakesink
Starting pipeline
Pipeline started
warning : Buffer didn't have expected DISCONT flag333 speed: 1.000000 />
Detected on <identity0:sink>
Detected on <identity0:src>
Detected on <fakesink0:sink>
Description : Buffers after SEGMENT and FLUSH must have a DISCONT flag
Issues found: 1
=======> Test PASSED (Return value: 0)
This introduce a new signal on RTSession, on-sending-nacks is emited
right before the list of seqnums to be nacked are processed and
transformed into FB Nack. This allow implementing custom nacks
handling through another mechanism with APP feedback.
In order to do that, we now split the nacks registration from the actual
FB nack packet construction. We then try and add as many FB Nacks as
possible into the active packets and leave the remaining seqnums in the
RTPSource. In order to avoid sending outdated NACK later on, we save the
seqnum calculated deadline and cleanup the outdated seqnums before the
next RTCP send.
Fixes#583
Calling rtp_session_send_rtcp before marking the source as requiring a
pli/fir/nack meant the rtcp_thread could be scheduled and start running
before the source was updated. This meant the request would not be sent
early but instead was transmitted with the next regular RTCP packet.
Add test for nack generation.
If the current time is equal to the early rtcp time deadline, there is
no need to schedule a timer. This ensure that immediate feedback is
really immediate and simplify implementing unit tests with the test
clock, which stops perfectly on the timeout time.
This fix has been extracted from Pexip feature patch called
"rtpsession: Allow instant transmission of RTCP packets"
When used in combination with a rtponviftimestamp element
downstream, forwarding this flag ensures it gets correctly
serialized in the ONVIF header extension.
A missing colon after G_DEFINE_TYPE declaration was confusing gst-indent
and causing problem in the pre-commit hook.
Add the missing colon and fix the following function declaration to
follow the normal GStreamer style.
One comments in gst_rtp_session_chain_send_rtp_common() is referring to
groups in a buffer list, however this concept of "group" comes from
GStreamer 0.10 and does not exist anymore in GStreamer 1.0, so update the
comment to refer to buffers instead.
The update_receiver_stats() function is called also when sending packets
in rtp_source_send_rtp(), and sending packets may happen using a buffer
list rather than individual buffers.
So update the stats using the actual number of packets sent.
NOTE: this is fine for the receive path too (rtp_process_send_rtp)
because the receive path does not support buffer lists and
pinfo->packets would always be equal to 1 in this case.