Move the SVMI stereoscopic atom parsing out to a helper
function to shrink qtdemux_parse_trak a bit.
Add a bounds check that the received atom is large enough
before parsing it.
Add a note to the atom parser that svmi comes from the
MPEG-A spec 23000-11.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-good/-/merge_requests/634>
This is disabled by default as it unnecessarily creates bigger headers
but it is something that is required by some applications and most
notably the Apple ProRes spec.
Even if timecode trak is officially unsupported in non-mov flavors,
some software still supports it, e.g. Final Cut Pro X:
https://developer.apple.com/library/archive/technotes/tn2174/_index.html
The user might still expect to see the timecode information in the
non-mov file despite it being officially unsupported , because other
software e.g. QuickTime will create a timecode trak even in mp4 files.
Furthermore, software that supports timecode trak in non-mov flavors
will also display the file duration in "timecode units" instead of real
clock time, which is not necessarily the same for 29.97 fps and friends.
This might confuse users, who see a different duration for the same
framerate and amount of frames depending on whether the container is mp4
or mov.
Fixes https://gitlab.freedesktop.org/gstreamer/gst-plugins-good/issues/512
By the time sink_event is called, the pad's current caps have
already been updated. To address this, implement sink_event_pre_queue,
and check if the pad can be renegotiated there.
Fixes#707
Sometimes the headers contain useless, wrong or zero values for e.g. the
sample size with these formats. There's only a single valid value for
them so let's set these instead.
The VP Codec Configuration Box (vpcC) contains vp9 profile and
colorimetry information. Especially the profile information might
be useful for downstream to select capable decoder element.
Instead of having chunks with one sample per raw audio sample, have
chunks with a single sample that contains lots of raw audio samples. If
necessary these are still split again later when reading the stream.
With this we are allocating a lot less memory for the parsed sample
tables and can play files that previously triggered our limit of 200MB
for the sample table. For example, one file here would previously
allocate 3.5GB for the sample table and now only allocates 70KB.
Outputting 48000 buffers per second is not a good idea performance-wise.
If a container sample is less than 1024 raw audio frames, combine
multiple samples to get at least 1024 raw audio samples as long as
they're stored contiguous in the file.
For the other direction, if a container sample contains more than 4096
samples there is already code for splitting them up.
Fixes https://bugzilla.gnome.org/show_bug.cgi?id=692750
ffmpeg is doing the same and various files in the wild have bogus
information in the sample description if the same information is also
duplicated afterwards in the v1/v2 sound sample desription.
Previously we only did this for non-raw audio due to
https://bugzilla.gnome.org/show_bug.cgi?id=374914
but this specific file is already worked around differently. It still
works after this change.
Also remove ad-hoc GST_READ_DOUBLE_BE re-implementation and move the
switch for legacy audio formats after reading all the sample
descriptions as we want to override the values from there.
Elements emitting frames through several srcpads should use a
flow combiner to aggregate the chain returns and therefore only return
GST_FLOW_NOT_LINKED to upstream when all the downstream pads have
received GST_FLOW_NOT_LINKED.
In addition to that, in order to handle pads being relinked downstream,
the flow combiner should be reset in response to RECONFIGURE events.
This ensures that a both srcpads process a chain operation before a
GST_FLOW_NOT_LINKED can be propagated upstream (which would usually stop
the pipeline).
Otherwise, in a configuration with two srcpads, only one linked at a
time, after the relink the element could chain data through the now
unlinked pad and the flow combiner would resolve as GST_FLOW_NOT_LINKED
(stopping the pipeline) just because the now linked pad has not been
chained yet to update the flow combiner.
This patch adds handling of RECONFIGURE events to qtdemux. Also, since
this event handling causes the flow combiner to be used from a thread
other than the qtdemux streaming thread, usages of the flow combiner
has been guarded by the object lock.
The memory leak occurs in the case when the buffer has been
added to the fragment_buffers array of the current pad and
never been sent because of the push failure of the previous
buffers: moof or mdat header or fragmented buffer(s).
There are in the wild (mp4) streams that basically contain no tracks
but do have a redirect info[0], in which case, we won't be able
to expose any pad (there are no tracks) so we can't post anything but
an error on the bus, as:
- it can't send EOS downstream, it has no pad,
- posting an EOS message will be useless as PAUSED state can't be
reached and there is no sink in the pipeline meaning GstBin will
simply ignore it
The approach here is to to add details to the ERROR message with a
`redirect-location` field which elements like playbin handle and use right
away.
[0]: http://movietrailers.apple.com/movies/paramount/terminator-dark-fate/terminator-dark-fate-trailer-2_480p.mov
In file included from ../../../../dist/linux_x86_64/include/gstreamer-1.0/gst/gst.h:55,
from ../../../../dist/linux_x86_64/include/gstreamer-1.0/gst/tag/tag.h:25,
from ../gst/isomp4/qtdemux.c:56:
In function ‘qtdemux_inspect_transformation_matrix’,
inlined from ‘qtdemux_parse_trak’ at ../gst/isomp4/qtdemux.c:10676:5,
inlined from ‘qtdemux_parse_tree’ at ../gst/isomp4/qtdemux.c:14210:5:
../../../../dist/linux_x86_64/include/gstreamer-1.0/gst/gstinfo.h:645:5: error: ‘%s’ directive argument is null [-Werror=format-overflow=]
645 | gst_debug_log ((cat), (level), __FILE__, GST_FUNCTION, __LINE__, \
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
646 | (GObject *) (object), __VA_ARGS__); \
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../../../../dist/linux_x86_64/include/gstreamer-1.0/gst/gstinfo.h:1062:35: note: in expansion of macro ‘GST_CAT_LEVEL_LOG’
1062 | #define GST_DEBUG_OBJECT(obj,...) GST_CAT_LEVEL_LOG (GST_CAT_DEFAULT, GST_LEVEL_DEBUG, obj, __VA_ARGS__)
| ^~~~~~~~~~~~~~~~~
../gst/isomp4/qtdemux.c:10294:5: note: in expansion of macro ‘GST_DEBUG_OBJECT’
10294 | GST_DEBUG_OBJECT (qtdemux, "Transformation matrix rotation %s",
| ^~~~~~~~~~~~~~~~
../gst/isomp4/qtdemux.c: In function ‘qtdemux_parse_tree’:
../gst/isomp4/qtdemux.c:10294:64: note: format string is defined here
10294 | GST_DEBUG_OBJECT (qtdemux, "Transformation matrix rotation %s",
| ^~
In reverse playback, we don't want to rely on the position of the current
keyframe to decide a stream is EOS: the last GOP we push will start with
a keyframe, which position is likely to be outside of the segment.
Instead, let the normal seek_to_previous_keyframe mechanism do its job,
it works just fine.
If a key unit seek is performed with a time position that matches
the offset of a keyframe, but not its actual PTS, we need to
adjust the segment nevertheless.
For example consider the following case:
* stream starts with a keyframe at 0 nanosecond, lasting 40 milliseconds
* user does a key unit seek at 20 milliseconds
* we don't adjust the segment as the time position is "over" a keyframe
* we push a segment that starts at 20 milliseconds
* we push a buffer with PTS == 0
* an element downstream (eg rtponviftimestamp) tries to calculate the
stream time of the buffer, fails to do so and drops it
When the seek event contains a (newly-added) trickmode interval,
and TRICKMODE_KEY_UNITS was requested, only let through keyframes
separated with the required interval
The time_position field of the stream is offset by the media_start
of its QtDemuxSegment compared to the start of the GstSegment of
the demuxer, take it into account when making comparisons.
mpegaudioparse suggests MP3 needs 10 or 30 frames of lead-in (depending on
mpegaudioversion, which we don't know here), thus provide at least 30 frames
lead-in for such cases as a followup to commit cbfa4531ee.
AAC and various other audio codecs need a couple frames of lead-in to
decode it properly. The parser elements like aacparse take care of it
via gst_base_parse_set_frame_rate, but when inside a container, the
demuxer is doing the seek segment handling and never gives lead-in
data downstream.
Handle this similar to going back to a keyframe with video, in the
same place. Without a lead-in, the start of the segment is silence,
when it shouldn't, which becomes especially evident in NLE use cases.
It must be accurate for all samples to work in Final Cut properly, so
the best we can do is to assume that all samples are the same as the
first. Bigger samples are truncated, smaller samples are padded.