When we are fragmented, the edit list may only refer to the portion of
the media that is in the moov. Extend the edit list stop time when we
if there is only one qt segment and we are reading a fragmented file.
Fixes playback of some fragmented mp4 files generated by proprietary
programs.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-good/-/merge_requests/643>
This is modelled after the DASH Common Encryption scheme, but is somewhat
simpler as more parts are fixed, i.e. just one encryption scheme.
The output caps are fixed to 'application/x-aavd'. All information
required for decryption are part of the 'adrm' atom, which is passed
on as a property. The property is attached to the buffer.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-good/-/merge_requests/577>
The 'aavd' box is contained in the 'stsd' sample description. The 'aavd'
box follows the layout of an 'mp4a' entry, i.e. it contains a single
standard 'esds' extension box, and the two proprietary 'adrm' and 'aabd'
extension boxes.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-good/-/merge_requests/577>
This identifiers are registered in the MPEG-RA and defined
to be used by the Dolby Vision AVC/HEVC streams.
This is a first step to present the stream to the decoder.
Additional box parsing of DOVIConfigurationBox is necessary
to complete the media presentation with proper Dolby Vision
enhancements.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-good/-/merge_requests/658>
Previously, the user input for stsd entries is trusted completely, and
so a maliciously crafted file could choose the length of the stsd
entries arbitrarily and cause qtdemux to try to allocate up to 2GB of
memory (half of a 32 bit max int).
This patch fixes this by sanity checking the stsd input against the
size of the entire stsd atom.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-good/-/merge_requests/670>
During trak parsing, we need to check for the existence of stsd_entries,
otherwise, we end up with a NULL pointer to them. It is entirely
possible for the stsd to exist, but for it to have no entries, which the
previous checks did not take into account.
This patch adds a simply check to ensure that all files that do not
contain a stsd entry are deemed corrupt, and adds a test case to prevent
a regression.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-good/-/merge_requests/670>
Move the SVMI stereoscopic atom parsing out to a helper
function to shrink qtdemux_parse_trak a bit.
Add a bounds check that the received atom is large enough
before parsing it.
Add a note to the atom parser that svmi comes from the
MPEG-A spec 23000-11.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-good/-/merge_requests/634>
Sometimes the headers contain useless, wrong or zero values for e.g. the
sample size with these formats. There's only a single valid value for
them so let's set these instead.
The VP Codec Configuration Box (vpcC) contains vp9 profile and
colorimetry information. Especially the profile information might
be useful for downstream to select capable decoder element.
Instead of having chunks with one sample per raw audio sample, have
chunks with a single sample that contains lots of raw audio samples. If
necessary these are still split again later when reading the stream.
With this we are allocating a lot less memory for the parsed sample
tables and can play files that previously triggered our limit of 200MB
for the sample table. For example, one file here would previously
allocate 3.5GB for the sample table and now only allocates 70KB.
Outputting 48000 buffers per second is not a good idea performance-wise.
If a container sample is less than 1024 raw audio frames, combine
multiple samples to get at least 1024 raw audio samples as long as
they're stored contiguous in the file.
For the other direction, if a container sample contains more than 4096
samples there is already code for splitting them up.
Fixes https://bugzilla.gnome.org/show_bug.cgi?id=692750
ffmpeg is doing the same and various files in the wild have bogus
information in the sample description if the same information is also
duplicated afterwards in the v1/v2 sound sample desription.
Previously we only did this for non-raw audio due to
https://bugzilla.gnome.org/show_bug.cgi?id=374914
but this specific file is already worked around differently. It still
works after this change.
Also remove ad-hoc GST_READ_DOUBLE_BE re-implementation and move the
switch for legacy audio formats after reading all the sample
descriptions as we want to override the values from there.
Elements emitting frames through several srcpads should use a
flow combiner to aggregate the chain returns and therefore only return
GST_FLOW_NOT_LINKED to upstream when all the downstream pads have
received GST_FLOW_NOT_LINKED.
In addition to that, in order to handle pads being relinked downstream,
the flow combiner should be reset in response to RECONFIGURE events.
This ensures that a both srcpads process a chain operation before a
GST_FLOW_NOT_LINKED can be propagated upstream (which would usually stop
the pipeline).
Otherwise, in a configuration with two srcpads, only one linked at a
time, after the relink the element could chain data through the now
unlinked pad and the flow combiner would resolve as GST_FLOW_NOT_LINKED
(stopping the pipeline) just because the now linked pad has not been
chained yet to update the flow combiner.
This patch adds handling of RECONFIGURE events to qtdemux. Also, since
this event handling causes the flow combiner to be used from a thread
other than the qtdemux streaming thread, usages of the flow combiner
has been guarded by the object lock.
There are in the wild (mp4) streams that basically contain no tracks
but do have a redirect info[0], in which case, we won't be able
to expose any pad (there are no tracks) so we can't post anything but
an error on the bus, as:
- it can't send EOS downstream, it has no pad,
- posting an EOS message will be useless as PAUSED state can't be
reached and there is no sink in the pipeline meaning GstBin will
simply ignore it
The approach here is to to add details to the ERROR message with a
`redirect-location` field which elements like playbin handle and use right
away.
[0]: http://movietrailers.apple.com/movies/paramount/terminator-dark-fate/terminator-dark-fate-trailer-2_480p.mov
In file included from ../../../../dist/linux_x86_64/include/gstreamer-1.0/gst/gst.h:55,
from ../../../../dist/linux_x86_64/include/gstreamer-1.0/gst/tag/tag.h:25,
from ../gst/isomp4/qtdemux.c:56:
In function ‘qtdemux_inspect_transformation_matrix’,
inlined from ‘qtdemux_parse_trak’ at ../gst/isomp4/qtdemux.c:10676:5,
inlined from ‘qtdemux_parse_tree’ at ../gst/isomp4/qtdemux.c:14210:5:
../../../../dist/linux_x86_64/include/gstreamer-1.0/gst/gstinfo.h:645:5: error: ‘%s’ directive argument is null [-Werror=format-overflow=]
645 | gst_debug_log ((cat), (level), __FILE__, GST_FUNCTION, __LINE__, \
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
646 | (GObject *) (object), __VA_ARGS__); \
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../../../../dist/linux_x86_64/include/gstreamer-1.0/gst/gstinfo.h:1062:35: note: in expansion of macro ‘GST_CAT_LEVEL_LOG’
1062 | #define GST_DEBUG_OBJECT(obj,...) GST_CAT_LEVEL_LOG (GST_CAT_DEFAULT, GST_LEVEL_DEBUG, obj, __VA_ARGS__)
| ^~~~~~~~~~~~~~~~~
../gst/isomp4/qtdemux.c:10294:5: note: in expansion of macro ‘GST_DEBUG_OBJECT’
10294 | GST_DEBUG_OBJECT (qtdemux, "Transformation matrix rotation %s",
| ^~~~~~~~~~~~~~~~
../gst/isomp4/qtdemux.c: In function ‘qtdemux_parse_tree’:
../gst/isomp4/qtdemux.c:10294:64: note: format string is defined here
10294 | GST_DEBUG_OBJECT (qtdemux, "Transformation matrix rotation %s",
| ^~
In reverse playback, we don't want to rely on the position of the current
keyframe to decide a stream is EOS: the last GOP we push will start with
a keyframe, which position is likely to be outside of the segment.
Instead, let the normal seek_to_previous_keyframe mechanism do its job,
it works just fine.
If a key unit seek is performed with a time position that matches
the offset of a keyframe, but not its actual PTS, we need to
adjust the segment nevertheless.
For example consider the following case:
* stream starts with a keyframe at 0 nanosecond, lasting 40 milliseconds
* user does a key unit seek at 20 milliseconds
* we don't adjust the segment as the time position is "over" a keyframe
* we push a segment that starts at 20 milliseconds
* we push a buffer with PTS == 0
* an element downstream (eg rtponviftimestamp) tries to calculate the
stream time of the buffer, fails to do so and drops it
When the seek event contains a (newly-added) trickmode interval,
and TRICKMODE_KEY_UNITS was requested, only let through keyframes
separated with the required interval
The time_position field of the stream is offset by the media_start
of its QtDemuxSegment compared to the start of the GstSegment of
the demuxer, take it into account when making comparisons.
mpegaudioparse suggests MP3 needs 10 or 30 frames of lead-in (depending on
mpegaudioversion, which we don't know here), thus provide at least 30 frames
lead-in for such cases as a followup to commit cbfa4531ee.
AAC and various other audio codecs need a couple frames of lead-in to
decode it properly. The parser elements like aacparse take care of it
via gst_base_parse_set_frame_rate, but when inside a container, the
demuxer is doing the seek segment handling and never gives lead-in
data downstream.
Handle this similar to going back to a keyframe with video, in the
same place. Without a lead-in, the start of the segment is silence,
when it shouldn't, which becomes especially evident in NLE use cases.
Need to respect return of gst_video_guess_framerate() to ensure
non-zero denominator.
This patch is to fix below error with an abnormal (but has valid frame) file.
(gst-play-1.0:17940): GStreamer-CRITICAL **: passed '0' as denominator for `GstFraction'
This problem was found in Test. 2 of the YouTube 2018 EME
tests[1]. The code was accidentally not finding an mp4a's esds atom in
the sample description table when the stream was encrypted. It assumed
that if the stream is protected, then only an enca atom will be found
here. What happens with YouTube is they often provide protected
content with a few seconds of clear content, and then switch to the
encrypted stream.
The failure case here was an incorrect codec_data field being sent
into aacparse. The advertisement of stereo audio @ 44.1kHz for the
mp4a (unprotected) stream was incorrect. As usual, the esds contained
the real values here which were mono at 22050 Hz.
Here's what the MP4 tree looks like for these types of files,
demonstrating why the code was making a wrong assumption (or maybe
YouTube is being unusual),
[ftyp] size=8+16
...
[moov] size=8+1571
...
[trak] size=8+559
...
[stsd] size=12+234
entry-count = 2
[enca] size=8+147
channel_count = 2
sample_size = 16
sample_rate = 44100
[esds] size=12+27
...
...
[mp4a] size=8+67
channel_count = 2
sample_size = 16
sample_rate = 44100
[esds] size=12+27
...
In addition to fixing this, the checks for esds atoms in mp4a and mp4v
have been made symmetrical. While I haven't seen a test case for video
with the same problem, it seemed better to make the same checks. This
also fixes a crash reported from another user[2], they also noted the
asymmetry with mp4v and mp4a.
[1] https://yt-dash-mse-test.commondatastorage.googleapis.com/unit-tests/2018.html?test_type=encryptedmedia-test
[2] https://gitlab.freedesktop.org/gstreamer/gst-plugins-good/issues/398