This works around a bug in various ONVIF cameras that implement the
attributes the wrong way around. They still won't work with a
backchannel but at least normal playback will work for the time being.
It restores pre-1.14 behaviour where we would fail to preroll on any SDP
that lists a recvonly stream. For 1.16 a better solution should be
found.
The problem here is that the ONVIF spec has the meaning of the two
attributes the wrong way around in the examples, compared to RFC4566.
https://bugzilla.gnome.org/show_bug.cgi?id=793715
Only up to timescale * G_MAXINT16 is possible as cluster duration, which
is already higher than our default value. Using higher values would
cause overflows and broken files.
Based on the investigation by Nicola Murino <nicola.murino@gmail.com>
https://bugzilla.gnome.org/show_bug.cgi?id=792775
Matroska does not support changing the stream type and stream properties
after the headers were started to be written, and for example H264
codec_data changes can't be supported.
https://bugzilla.gnome.org/show_bug.cgi?id=782949
rtpulpfeccommon.c:432:27: error: format ‘%lx’ expects argument of type
‘long unsigned int’, but argument 10 has type ‘guint64 {aka long long unsigned int}’
https://bugzilla.gnome.org/show_bug.cgi?id=793732
The ulpfecenc "mux-seq" and "ssrc" properties were initially added
because the element did more than implement ULPFEC. As it was
decided that FLEXFEC would be implemented in a separate element,
both properties are now unneeded and confusing.
Change the default for the ulpfecenc multi-packet property,
as it is expected that most users of this element will be protecting video
streams.
Change the default property for the rtpredenc allow-no-red-blocks
property, as it should also be its default mode of operation.
https://bugzilla.gnome.org/show_bug.cgi?id=793843
It is expected that when connecting to a stream that has
already started, the caps will only arrive at the interval
specified on rtpgstpay, we shouldn't be warning as this is
a normal mode of operation.
https://bugzilla.gnome.org/show_bug.cgi?id=793798
We expose a set of new elements:
* ULPFEC encoder / decoder
* A storage element, which should be placed before jitterbuffers,
and is used to store packets in order to attempt reconstruction
after the jitterbuffer has sent PacketLost events
* RED encoder / decoder (RFC 2198), these are necessary to
use FEC in webrtc, as browsers will propose and expect ulpfec
packets to be wrapped in red packets
With contributions from:
Mathieu Duponchelle <mathieu@centricular.com>
Sebastian Dröge <sebastian@centricular.com>
https://bugzilla.gnome.org/show_bug.cgi?id=792696
Packets with these payload types will be dropped. A use case
for this is FEC, where we want FEC packets to go through the
jitterbuffer, but not be output by rtpbin.
https://bugzilla.gnome.org/show_bug.cgi?id=792696
All received configurations are parsed and added to a list, this lead
to an unbounded memory usage. As the configuration is resent every
second this quickly lead to a large memory usage.
Add a check to only add the config if it is not already available in
the list. This fix only handle the typical case of a well behaved
stream, a malicious server could still send many useless
configurations to raise the client memory usage.
The smallest possible is 24 (and not 25) bytes.
The last "name" field can according to QTFF specifications not be present
at all. The parser will handle this fine and so will the rest of
the qtdemux code.
If codec_data is changed, the stream is no longer valid.
Rather than keeping running when refusing new caps,
this patch send a warning to the bus.
Also fix up splitmuxsink to ignore this warning while changing caps.
https://bugzilla.gnome.org/show_bug.cgi?id=790000
We would accidentally pass through the duration value from the
demuxer from a single fragment, which causes problems when
feeding the stream from splitmuxsrc to rtsp-server. Streaming
would stop after one fragment due to that.
https://bugzilla.gnome.org/show_bug.cgi?id=792861
total_duration is initialised to CLOCK_TIME_NONE, not 0, so check
for that as well in order not to return an invalid duration to
a duration query. Doesn't fix anything particular observed in
practice, just seemed inconsistent.
With this patch we can now provide a set of files
created by multifilesink as a source for uri elements.
e.g. gst-launch-1.0 playbin uri=multifile://img%25d.ppm
Note that for the %d pattern you need to replace % with %25.
This is to be compliant with URL naming standards.
https://bugzilla.gnome.org/show_bug.cgi?id=783581
It generally makes not much sense to configure it for all pads/traks at
once as this value is usually different for each of them. As such, add a
new property on the pads in addition to the existing property on the
whole muxer.
https://bugzilla.gnome.org/show_bug.cgi?id=792649
We can't handle recvonly streams, sendonly streams are perfectly fine.
The direction is the one from the point of view of the SDP offerer
(i.e. the RTSP server), and a recvonly stream would be one where the
server expects us to send media.
RFC 3264, section 5.1:
If the offerer wishes to only send media on a stream to its peer, it
MUST mark the stream as sendonly with the "a=sendonly" attribute.
This is mixed up in the ONVIF streaming specification examples, but
actual implementations and conformance tools seem to not care at all
about the attributes.
https://bugzilla.gnome.org/show_bug.cgi?id=792376
Raw AAC streams might have very small frames, e.g. 6 byte frames
when encoding silence. These frames are then smaller than aacparse's
default min_frame_size of 10 bytes (ADTS_MAX_SIZE).
When passthrough is disabled or aacparse has to output ADTS, GstBaseParse
will concatenate these short frames to the following frame before
handling them to aacparse, which processes each input buffer as a single
frame, producing bad output.
To avoid this problem, set the min_frame_size to 1 when receiving a raw
stream.
https://bugzilla.gnome.org/show_bug.cgi?id=792644
When the signal returns a floating reference, as its return type
is transfer full, we need to sink it ourselves before passing
it to gst_bin_add (which is transfer floating).
This allows us to unref it in bin_remove_element later on, and
thus to also release the reference we now own if the signal
returns a non-floating reference as well.
As we now still hold a reference to the element when removing it,
we also need to lock its state and setting it to NULL before
unreffing it
Also update the request_aux_sender test.
https://bugzilla.gnome.org/show_bug.cgi?id=792543
TOC support in mastroskamux has been deactivated for a couple of years. This commit updates it to recent GstToc evolutions and introduces toc unit tests for both matroska-mux and matroska-demux.
There are two UIDs for Chapters in Matroska's specifications:
- The ChapterUID is a mandatory unsigned integer which internally refers to a given chapter. Except for title & language which use dedicated fields, this UID can also be used to add tags to the Chapter. The tags come in a separate section of the container.
- The ChapterStringUID is an optional UTF-8 string which also uniquely refers to a chapter but from an external perspective. It can act as a "WebVTT cue identifier" which "can be used to reference a specific cue, for example from script or CSS".
During muxing, the ChapterUID is generated and checked for unicity, while the ChapterStringUID receives the user defined UID. In order to be able to refer to chapters from the tags section, we maintain an internal Toc tree with the generated ChapterUID.
When demuxing, the ChapterStringUIDs (if available) are assigned to the GstTocEntries UIDs and an internal toc mimicking the toc is used to keep track of the ChapterUIDs and match the tags with the appropriate GstTocEntries.
https://bugzilla.gnome.org/show_bug.cgi?id=790686
If we saw empty segments, we previously unconditionally pushed a
GAP event downstream regardless of the duration of that empty
segment.
In order to avoid issues with initial negotiation of downstream elements
(which would negotiate to something before receiving any data due to
that initial GAP event), check if there's at least a second of difference
(like we do for other GAP-related checks in qtdemux) before
deciding to push a GAP event downstream.
Otherwise baseparse will incrementally send us bigger buffers until the
full header size is reached, which is not only pointless but also means
that baseparse will reallocate and copy into a bigger buffer for every
input buffers. In pull mode that's done in 64kb increments, in push mode
usually in much smaller increments, causing a lot of overhead for
example when parsing high-quality coverart.
When receiving a seek event, check whether we can actually seek based
on the information the server provided.
Also add more documentation on what the seekable field means
If a reserved-max-duration is set, we should always track
and update the reserved-duration-remaining estimate, even
if we're not sending periodic moov updates downstream for
full robust muxing.
If the use-robust-muxing property is set, check if the
assigned muxer has reserved-max-duration and
reserved-duration-remaining properties, and if so set
the configured maximum duration to the reserved-max-duration
property, and monitor the remaining space to start
a new file if the reserved header space is about to run out -
even though it never ought to.
Switching to a new fragment because the input caps have
changed didn't properly end the previous file. Use the normal
EOS sequence to ensure that happens. Add a test that it works.
Only for byte-stream or hev1. For hvc1 the SPS/PPS are in the
caps as codec_data field and in this case they shouldn't be in
the stream data as well. The output caps should be updated with
the new codec_data if needed, for hvc1.
We keep the boolean byte_stream around since it's nicer for
readability and most of the code just cares about byte_stream
or not. This is useful for future-proofing the code for when
we add support for hev1 output as well.
This would happen if input is byte-stream with four-byte
sync markers instead of three-byte ones. The code that
scans for sync markers will place the start of the NALU
on the third-last byte of the NALU sync marker, which
means that any additional zeros may be counted as belonging
to the previous NALU instead of being part of the next sync
marker. Fix that so we don't send VPS/SPS/PPS with trailing
zeros in this case.
See https://bugzilla.gnome.org/show_bug.cgi?id=732758
There is no difference between pushing out a buffer directly
with gst_rtp_base_depayload_push() and returning it from the
process function. The base class will just call _depayload_push()
on the returned buffer as well.
So instead of marshalling buffers through three layers and back,
just push them from one place in handle_nal() and always return
NULL from the process vfunc. This simplifies the code a little.
Also rename _push_fragmentation_unit() to _finish_fragmentation_unit()
for clarity. Push sounds like it means being pushed out, whereas
it might just be pushed into an adapter.
This change has the side-effect that multiple NALs in a single STAP
(such as SPS/PPS) may no longer be pushed out as a single buffer if
we output NALs in byte-stream format (i.e. not aggregate AUs), but
that shouldn't really make any difference to anyone.
This would happen if input is byte-stream with four-byte
sync markers instead of three-byte ones. The code that
scans for sync markers will place the start of the NALU
on the third-last byte of the NALU sync marker, which
means that any additional zeros may be counted as belonging
to the previous NALU instead of being part of the next sync
marker. Fix that so we don't send SPS/PPS with trailing
zeros in this case.
https://bugzilla.gnome.org/show_bug.cgi?id=732758
Returning FALSE because we drop an event means that
internal sources like qtdemux might throw an error
and break the whole pipeline. The only time it can
happen is either flushing or shutdown, and those
will be handled anyway.
... and forward colorimetry to downstream. The Colour element describes
various color information (similar to 'colr' box in isobmff).
Note that, due to the comparatively limited syntax for color information
in vpx codecs, the color information in mkv/wemb container level
should be used for sophisticated color handling (e.g., HDR video).
https://bugzilla.gnome.org/show_bug.cgi?id=790023
The G722 payload only accepts G722 audio with channels=1, so it must
specify the encoding-params=1 in its src caps, otherwise it causes issues
with farstream which thinks it supports 2 channels G722 and when
confronted with a remote that has G722/8000/2, it will negotiate it
and error out with a not-negotiated when the caps don't intersect
at runtime.
https://bugzilla.gnome.org/show_bug.cgi?id=789878
When XR packet is detected, warning message leads to misunderstandings.
Until RFC3611 is implemented in gst-plugins-base, the level needs to
be downgraded to avoid confusion.
https://bugzilla.gnome.org/show_bug.cgi?id=789746
It is possible that the mdat has more data than what was stored in the
headers file. If we put that to the output the file will have bogus data
at the end and some players will complain.
https://bugzilla.gnome.org/show_bug.cgi?id=784258
qtdemux.c: In function ‘gst_qtdemux_configure_stream’:
qtdemux.c:7764:34: error: suggest parentheses around ‘&&’ within ‘||’ [-Werror=parentheses]
if ((stream->n_samples == 1) && (stream->first_duration == 0)
~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Avoid computing frame rate when a stream contain moof with only one
sample, to avoid an assert. The moof is considered as still picture.
The same is already done for one sample given in the moov.
https://bugzilla.gnome.org/show_bug.cgi?id=782217
Linear interpolation adds quite some noise, and it's unlikely that
anybody will ever need sub-sample accurate delays. Proper resampling
before that will lead to better results.
When a truncated FLV is provided and processed in pull mode, we
may endup trying to pull passed EOS, causing a rather confusing
warning as the pull offset is an integer overflow.
https://bugzilla.gnome.org/show_bug.cgi?id=787795
This code basically skip over codec_data with empty payload. In
this case, the codec_data variable is the size of the header for
the CODEC part of Video Tag. The remaining is supposed to be the
H.264 codec data, hence should not be empty.
https://bugzilla.gnome.org/show_bug.cgi?id=787795
Meaning that the interleave fields have to be updated as
if streams setup was working when using pipelined setup
request. Otherwise there is a mismatch between the server
channel count and our own.
This also makes RTSP 2.0 over HTTP working.
https://bugzilla.gnome.org/show_bug.cgi?id=781446
- Handle version negotation:
Added a `default-version` property so that the user can configure
what to use in case the server does not support version negotation
(which actually exist)
- Handle pipelined requests, which allow avoiding full round trip to
setup the RTP streams (request are sent in a raw, and response are
handled as they arrive).
- Handle the new Media-Properties header
- Handle the new Seek-Style header
- Handle the new Accept-Ranges header
Handling of IPV6 should already be OK.
We are still missing (at least) the following features (which do not
seem really mandatory as they require a "persistent connection between
server and client"):
- Server to Client TEARDOWN command (Not so usefull fmpov)
- PLAY_NOTIFY (not needed for our server yet)
- Support for the new REDIRECT features
and probably some more protocol changes might not be handled yet.
https://bugzilla.gnome.org/show_bug.cgi?id=781446
This then just counts samples and calculates the output timestamps based
on that and the very first observed timestamp. The timestamps on the
buffers are continued to be used to detect discontinuities that are too
big and reset the counter at that point.
When receiving data via Bluetooth, many devices put completely wrong
values into the RTP timestamp field. For example iOS seems to put a
timestamp in milliseconds in there, instead of something based on the
current sample offset (RTP clock-rate == sample rate).
https://bugzilla.gnome.org/show_bug.cgi?id=787297
Doesn't do anything fancy yet, but still avoids lots of
unnecessary locking/unlocking that would happen if the
default chain_list fallback function in GstPad got invoked.
Timestamp offsets needs to be checked to detect unrealistic values
caused for example by NTP clocks not in sync. The new parameter
max-ts-offset lets the user decide an upper offset limit. There
are two different cases for checking the offset based on if
ntp-sync is used or not:
1) ntp-sync enabled
Only negative offsest are allowed since a positive offset would
mean that the sender and receiver clocks are not in sync.
Default vaule of max-ts-offset = 0 (disabled)
2) ntp-sync disabled
Both positive and negative offsets are allowed.
Default vaule of max-ts-offset = 3000000000
The reason for different default values is to be backwards
compatible.
https://bugzilla.gnome.org/show_bug.cgi?id=785733
Instant large changes to ts_offset may cause timestamps to move
backwards and also cause visible effects in media playback. The new
option max-ts-offset-adjustment lets the application control the rate to
apply changes to ts_offset.
https://bugzilla.gnome.org/show_bug.cgi?id=784002
* use INFO/DEBUG/LOG/TRACE equaly and meaningfully;
previously rtprtxsend:LOG and rtprtxreceive:LOG would generate
a totally different amount of log traffic and sometimes it was
impossible to see the information you wanted without useless
spam being printed around
* improve the wording, give a reasonable and self-explanatory
amount of information
* print SSRCs in hex
* avoid G_FOO_FORMAT for readability (we are just printing integers)
If one requests the send_rtcp_src_%u pad before a recv_rtcp_sink_%u pad,
the session/pad would never be created and NULL was returned.
Switching the request order would work.
https://bugzilla.gnome.org/show_bug.cgi?id=786718
Fix chain function not handling not-linked from baseparse.
When an input data is separated into 2 buffers, the second buffer
would not be pushed into the adapter if baseparse returns not-linked
for first buffer.
This caused glitches when switching streams and selecting
a stream that was previously unselected.
https://bugzilla.gnome.org/show_bug.cgi?id=786268
Callers of the API (rtpsource, rtpjitterbuffer) pass clock_rate
as a signed integer, and the comparison "<= 0" is used against
it, leading me to think the intention was to have the field
be typed as gint32, not guint32.
This led to situations where we could call scale_int with
a MAX_UINT32 (-1) guint32 as the denom, thus raising an
assertion.
https://bugzilla.gnome.org/show_bug.cgi?id=785991
... which no longer worked due to unconditionally clearing sample info and
ending up in inconsistent state. Let's tread a bit more carefully and also
allow for the old seek handling that resorts to scanning if no mfra info
is available.
Do not allocate payload size outbuf if appending payload buffer.
The commit 137672ff18 attached payload
to the output buffer but forgot to remove payload allocation. That
effectively doubled payload size and add zero'ed or random bytes.
Makes the following pipeline work again:
gst-launch-1.0 -v audiotestsrc wave=2 ! gsmenc ! rtpgsmpay ! rtpgsmdepay ! gsmdec ! autoaudiosink
https://bugzilla.gnome.org/show_bug.cgi?id=784616
gst_util_uint64_scale_int takes a gint as denom parameter
whereas ctx->clock_rate is a guint32.
It happens when gst_rtp_packet_rate_ctx_reset set clock_rate
to -1.
So just define clock_rate as gint like it is done in rtpsource.h
https://bugzilla.gnome.org/show_bug.cgi?id=784250
When set this property will allow the jitterbuffer to start delivering
packets as soon as N most recent packets have consecutive seqnum. A
faststart-min-packets of zero disables this feature. This heuristic is
also used in rtpsource which implements the probation mechanism and a
similar heuristic is used to handle long gaps.
https://bugzilla.gnome.org/show_bug.cgi?id=769536
We currently send data to the RTSP connection from multiple threads:
whenever a command is to be handled and whenever RTCP is generated. This
can cause data corruption or worse if both happen at the same time.
As such, protect gst_rtsp_connection_send() and gst_rtsp_connection_receive()
calls with a mutex. While this means that we hold a mutex during the IO
operation, this is not actually a problem as the IO operation can be
interrupted (gst_rtsp_connection_flush()) at any time and is blocking by
itself anyway.
The last entry will most likely get new samples added to it in "robust"
muxing mode, changing the samples_per_chunk and thus making it wrong to
keep the last two entries merged. It will run into an assertion later
when adding a new sample to the chunk.
Thanks to gdiener@cardinalpeak.com for the analysis of the bug and
proposal for a solution.
There might be other chunks after the data chunk, so clipping the chunk
size with the data size can lead to a negative number and all following
calculations go wrong and cause crashes or worse.
This was introduced in 3ac119bbe2.
https://bugzilla.gnome.org/show_bug.cgi?id=783760
They can cause us to deadlock, while we're waiting for a new frame and
upstream is waiting for the allocation query to be answered before
sending a frame
https://bugzilla.gnome.org/show_bug.cgi?id=783753
There is no difference between pushing out a buffer directly
with gst_rtp_base_depayload_push() and returning it from the
process function. The base class will just call _depayload_push()
on the returned buffer as well.
So instead of marshalling buffers through three layers and back,
just push them from one place in handle_nal() and always return
NULL from the process vfunc. This simplifies the code a little.
Also rename _push_fragmentation_unit() to _finish_fragmentation_unit()
for clarity. Push sounds like it means being pushed out, whereas
it might just be pushed into an adapter.
This change has the side-effect that multiple NALs in a single STAP
(such as SPS/PPS) may no longer be pushed out as a single buffer if
we output NALs in byte-stream format (i.e. not aggregate AUs), but
that shouldn't really make any difference to anyone.
Use the ::process_rtp_packet() vfunc to avoid mapping the
RTP buffer twice.
gst_rtp_buffer_get_payload_buffer() returns a new sub-buffer
which will always be writable, so no need to make it writable.
Every g_quark_from_static_string() is a hash table lookup serialised
on the global quark lock in GLib. Let's just look up the two quarks
we need once and cache them locally for future use. While we're at it,
add new utility functions for the two most commonly used tags
(audio + video). Make first argument a gpointer so we don't have to
cast and make the code ugly. These are used for logging purposes
only anyway.
Since the move from CVS the property name of the documentation example
has been filename instead of location. Users trying the gst-launch
command as is will get:
no property name "filename" in element
Fixing it.
If a non-reference stream is behind the reference stream by an amount of
time smaller than the alignment threshold (in nsec), it counts as being
after it.
https://bugzilla.gnome.org/show_bug.cgi?id=782563
Timecode trak is only supported for mov right now, not for mp4. That
code would otherwise create an invalid trak if the muxed video contained
timecode metadata.
https://bugzilla.gnome.org/show_bug.cgi?id=782684
We only accept new caps if they are basically the same. We don't want to
reset anything as if the caps are new, otherwise various state could get
out of sync with the current run.
We have some padding added after the initial moov, so a bigger updated
moov can be handled to some degree and is expected. Previously we just
ignored the padding and errored out in cases when the padding would've
just been enough.
This sets up a moov with the correct sample positions beforehand and
only works with constant framerate, I-frame only streams.
Currently only support for ProRes and raw audio is implemented but
adding new codecs is just a matter of defining appropriate maximum frame
sizes.
https://bugzilla.gnome.org/show_bug.cgi?id=781447
When muxing raw audio, we have no way of storing timestamps but are just
storing a continuous stream of audio samples. If the difference between
the expected and the real timestamp becomes to big, we should error out
instead of silently creating files with wrong A/V sync.
https://bugzilla.gnome.org/show_bug.cgi?id=780679
Re-arrange order of index entry struct members to avoid padding
bytes in the middle of the struct, thus potentially reducing the
overall size of the struct and reducing memory used by the index.
On Linux x86_64 the size goes down from 32 bytes to 24 bytes for
each index entry.
If no clock was provided directly by rtspsrc. This behaviour was removed
by f8013487c9 and results in rtspsrc not
providing the system clock via the rtpjitterbuffer.
As a result, if another element like an audio sink, provides a clock,
the pipeline would select that (when going to PAUSED/PLAYING again later).
Audio clocks usually don't progress in PAUSED, and thus our live source
won't be able to use the clock to produce data, making the sink never
preroll and everything is stuck.
... unless the muxer uses the same audio pad template name as
splitmuxsink. We can't request a pad called "audio_0" on a muxer that
wants pads to be "sink_%d".
In push mode we process as much as possible in the adapter. When we receive
a DISCONT buffer which we can't match to an actual sample (based on the existing
sample table) and there is still data remaining in the incoming adapter,there is
one of two cases happening:
1) We are doing reverse playback, in which case we should flush out all pending
data
2) We have leftover data from the previous incoming buffer... which we can't do
anything about.
For the second case, make sure we flush out the remaining data so that we can start
parsing again from scratch.
https://bugzilla.gnome.org/show_bug.cgi?id=781319
They should have ideally the same timescale of the video track, which we
can't guarantee here as in theory timecode configuration and video
framerate could be different. However we should set a correct timescale
based on the framerate given in the timecode configuration, and not just
use the framerate numerator.
Make sure offset and neededbytes are properly resetted when all
streams are EOS in push-mode.
Avoids cases when some data might still be pushed by upstream (because
it didn't yet see the resulting GST_FLOW_EOS yet) and qtdemux gets
completely lost.
https://bugzilla.gnome.org/show_bug.cgi?id=781266
buf is the current pad->last_buf value. If ever it gets copied/unreffed,
we need to make sure to write back the new pointer to the last_buf
variable.
Fixes using wrong pointer values in the case of decrasing DTS value
Before pushing a sample, check if there was a change in the current
stsd entry. This patch also assumes that the first stsd entry is
used as default for the first sample. It might cause an uneeded
caps renegotiation when this isn't the case.
stsd can have multiple format entries, parse them all.
This is required to play DVB DASH profile that uses multiple entries
to identify the different available bitrates/options on dash streams
The stream format-specific data is not stored into QtDemuxStreamStsdEntry
Instead of using the stsd as a base pointer, use the actual stsd
entry as the stsd can have multiple entries. This is rarely used
for file playback but is a possible profile with in DVB DASH specs.
This still doesn't support stsd with multiple entries but makes it
easier to do so.
AudioSpecifigConfig is used in a variety of AAC streams but was
being parsed differently. Instead, make everyone use the same parsing.
* Remove unused 'bits' field (it was always set to 0 if present)
* Add proper GAConfig parsing (to know the number of samples per frame
if present).
Fixes wrong rate/channels configuration in streams coming from qtdemux
https://bugzilla.gnome.org/show_bug.cgi?id=780966
According to ISO/IEC:14496-2:2009 , in the case of HE-AACv2 (audioObjecType
29) parametric stereo is used (a single mono track is used and then
transformations are applied to it to provide a stereo output).
We therefore report two channels in the case where there is one reported
in the audioChannelConfiguration.
Fixes the various issues where a demuxer would report two channels, but
then the parser would say there's only one channel, and then the decoder
would output two channels.
last_buf is the one we're going to write next, not buf. As such we
should check timestamps against that one if there is one to select the
earliest pad.
Also remember the currently selected pad in the very beginning when
storing the first last_buf.
This both solves some edge cases where not the correct next pad was
selected corresponding to the target interleave.