Used by some proprietary software for their fragmented files.
Adds some support for multi-stream fragmented files
Flow is as follows.
1. The first 'fragment' is written as a self-contained fragmented
mdat+moov complete with an edit list and durations, tags, etc.
2. Subsequent fragments are written with a mdat+moof and each stream is
interleaved as data arrives (currently ignoring the interleave-*
properties). data-offsets in both the traf and the trun ensure
data is read from the correct place on demuxing. Data/chunk offsets
are also kept for writing out the final moov.
3. On finalisation, the initial moov is invalidated to a hoov and the
size of the first mdat is extended to cover the entire file contents.
Then a moov is written as regularly would in moov-at-end mode (the
default).
This results in a file that is playable throughout while leaving a
finalised file on completion for players that do not understand
fragmented mp4.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-good/-/merge_requests/643>
This is disabled by default as it unnecessarily creates bigger headers
but it is something that is required by some applications and most
notably the Apple ProRes spec.
Even if timecode trak is officially unsupported in non-mov flavors,
some software still supports it, e.g. Final Cut Pro X:
https://developer.apple.com/library/archive/technotes/tn2174/_index.html
The user might still expect to see the timecode information in the
non-mov file despite it being officially unsupported , because other
software e.g. QuickTime will create a timecode trak even in mp4 files.
Furthermore, software that supports timecode trak in non-mov flavors
will also display the file duration in "timecode units" instead of real
clock time, which is not necessarily the same for 29.97 fps and friends.
This might confuse users, who see a different duration for the same
framerate and amount of frames depending on whether the container is mp4
or mov.
Fixes https://gitlab.freedesktop.org/gstreamer/gst-plugins-good/issues/512
The memory leak occurs in the case when the buffer has been
added to the fragment_buffers array of the current pad and
never been sent because of the push failure of the previous
buffers: moof or mdat header or fragmented buffer(s).
It must be accurate for all samples to work in Final Cut properly, so
the best we can do is to assume that all samples are the same as the
first. Bigger samples are truncated, smaller samples are padded.
Supports CEA 608 and CEA 708 CC streams
Also supports usage in "Robust Prefill" mode if the incoming caption
stream is constant (i.e. there is one incoming CC buffer for each
video frame).
https://bugzilla.gnome.org/show_bug.cgi?id=606643
This sets up a moov with the correct sample positions beforehand and
only works with constant framerate, I-frame only streams.
Currently only support for ProRes and raw audio is implemented but
adding new codecs is just a matter of defining appropriate maximum frame
sizes.
https://bugzilla.gnome.org/show_bug.cgi?id=781447
When muxing raw audio, we have no way of storing timestamps but are just
storing a continuous stream of audio samples. If the difference between
the expected and the real timestamp becomes to big, we should error out
instead of silently creating files with wrong A/V sync.
https://bugzilla.gnome.org/show_bug.cgi?id=780679
Previously we were switching from one chunk to another on every single
buffer. This wastes some space in the headers and, depending on the
software, might depend in more reads (e.g. if the software is reading
multiple samples in one go if they're in the same chunk).
The ProRes guidelines suggest an interleave of 0.5s is common, but
specifies that for ProRes at most 2MB (for SD) and 4MB (for HD) should
be used per chunk. This will be handled in a follow-up commit.
https://bugzilla.gnome.org/show_bug.cgi?id=773217
The media start has nothing to do with the shift we have applied
but with the value of the first PTS. This is defined as:
Dt(0) = 0
Ct(0) = Dt(0) + CTTS(0)
So the media start is always the first CTTS.
https://bugzilla.gnome.org/show_bug.cgi?id=751361
Adds AC-3 muxing support. It is defined for mp4 and 3gp formats.
One extra feature that was added was the ability to add extension
atoms after set_caps as the AC-3 extension atom needs some data
that has to be extracted from the stream itself and is not
present on caps.
Implement a robust recording mode, where the output
file is always in a playable state, seeking and rewriting
the moov header at a configurable interval. Rewriting
moov is done using reserved space at the start of
the file, and a ping-pong strategy where the moov
is replaced atomically so it's never invalid.
Track when tags have actually changed, and don't write them into
the moov unless they've changed. Clear any existing tags when
re-writing them, so we can do progressive moov updating in robust
recording mode.
Write placeholder mdat as a free atom plus a 32-bit mdat
with '0' size, which means "rest of the file" in the spec.
Re-write it later to a full 64-bit extended size atom if needed.
Instead of checking various state variables around the muxer,
track the current muxing mode in a single 'mux_mode' enum.
Add some implementation notes about the different mux modes
When not in fast-start or fragmented mode, we need to be able
to rewrite the size of the mdat atom, or else the output just
won't be playable - the mdat placeholder with size == 0 will
cover the rest of the file, including any moov atom we write out.
https://bugzilla.gnome.org/show_bug.cgi?id=708808
Tags received via events, when marked as stream tags, will
be stored on that stream's trak atom instead of being stored
in the main tags atom. This allows the resulting file to have
global and stream tags stored.
https://bugzilla.gnome.org/show_bug.cgi?id=692473
It was used in the past in 0.10 when there was no explicit DTS
field in buffers, now we have it in 1.x series and we can
check it directly with GST_BUFFER_DTS_IS_VALID
Do not try to use subsequent buffer timestamps to calculate
sparse streams durations because the stream is sparse and
the buffers might not be 'time adjacent'. So rely on the
duration and give the option to the pad to provide
custom 'empty' buffers to represent the gaps in the
stream, this can vary on how the data is represented.
Right now, the only sparse stream supported is tx3g subtitles.