docs: add some text about parser/decoder autoplugging issues

Tim-Philipp Müller 2011-06-08 11:11:05 +01:00
parent 5ed90ffc2c
commit dbc04a27ec


@ -13,9 +13,9 @@ Description:
_ a GstTypeFindElement connected to the single sink pad
_ optionnaly a demuxer/parser
_ optionally a demuxer/parser
_ optionnaly one or more DecodeGroup
_ optionally one or more DecodeGroup
* Autoplugging
@ -203,3 +203,87 @@ differences:
controlled by the element. This means that a buffer cannot be pushed to a
non-linked pad any sooner than buffers in any other stream which were received
before it.
=====================================
Parsers, decoders and auto-plugging
=====================================
This section has DRAFT status.
Some media formats come in different "flavours" or "stream formats". These
formats differ in the way the setup data and media data are signalled and/or
packaged. An example of this is H.264 video, where there is a bytestream
format (with codec setup data signalled inline and units prefixed by a sync
code and packet length information) and a "raw" format where codec setup
data is signalled out of band (via the caps) and the chunking is implicit
in the way the buffers were muxed into a container, to mention just two of
the possible variants.
Especially on embedded platforms it is common that decoders can only
handle one particular stream format, and not all of them.
Where there are multiple stream formats, parsers are usually expected
to be able to convert between the different formats. This will, if
implemented correctly, work as expected in a static pipeline such as
... ! parser ! decoder ! sink
where the parser can query the decoder's capabilities even before
processing the first piece of data, and configure itself to convert
accordingly, if conversion is needed at all.
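The static case can be sketched with a toy model. This is not GStreamer code: plain Python sets stand in for caps, and the function names and the format names "avc" and "byte-stream" are illustrative assumptions, not taken from the text above.

```python
# Toy model only: sets of strings stand in for GStreamer caps.

# Formats the hypothetical parser can produce, in its own order of preference.
PARSER_OUTPUT_FORMATS = ["avc", "byte-stream"]


def choose_output_format(parser_formats, decoder_formats):
    """Return the first parser format the downstream decoder accepts, or None."""
    for fmt in parser_formats:
        if fmt in decoder_formats:
            return fmt
    return None


def negotiate_static(input_format, decoder_formats):
    """Return (output_format, needs_conversion) for a statically linked parser."""
    # No conversion is needed when the decoder already accepts the input.
    if input_format in decoder_formats:
        return input_format, False
    chosen = choose_output_format(PARSER_OUTPUT_FORMATS, decoder_formats)
    if chosen is None:
        raise ValueError("no common stream format between parser and decoder")
    return chosen, True


print(negotiate_static("avc", {"byte-stream"}))          # ('byte-stream', True)
print(negotiate_static("avc", {"avc", "byte-stream"}))   # ('avc', False)
```

The point of the sketch is only that the decoder's capabilities are known before any data is processed, so the parser never has to guess.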
In an auto-plugging context this is not so straightforward, because
elements are plugged incrementally: the next element is only plugged once
the previous element has processed some data and decided exactly what it
will output. (If the template caps are completely fixed, plugging can
continue right away, but that is not always the case here; see below.) A
parser will thus have to decide on *some* output format so auto-plugging
can continue. It doesn't know anything about the available decoders and
their capabilities though, so it's possible that it will choose a format
that is not supported by any of the available decoders, or by the preferred
decoder.
If the parser had sufficiently specific, fixed source pad template caps,
decodebin could continue to plug a decoder right away, allowing the
parser to configure itself in the same way as it would in a static
pipeline. This is not an option, unfortunately, because the parser often
needs to process some data to determine e.g. the format's profile or
other stream properties (resolution, sample rate, channel configuration, etc.),
and there may be different decoders for different profiles (e.g. a DSP codec
for baseline profile and a software fallback for main/high profile; or a DSP
codec supporting only certain resolutions, with a software fallback for
unusual resolutions). So if decodebin just plugged the highest-ranking
decoder, that decoder might not be able to handle the actual stream later
on, which would result in an error (a data flow error, which would be hard
to intercept and avoid in decodebin). In other words, we can't solve
this issue by plugging a decoder right away with the parser.
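The failure mode described above can be illustrated with a small toy model. The decoder names, ranks and profile sets are hypothetical, and the selection functions are a sketch of the reasoning, not of decodebin's actual implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decoder:
    name: str            # hypothetical element name
    rank: int            # higher rank is preferred
    profiles: frozenset  # profiles the decoder can handle


DECODERS = [
    Decoder("dspdec", rank=300, profiles=frozenset({"baseline"})),
    Decoder("softdec", rank=200, profiles=frozenset({"baseline", "main", "high"})),
]


def pick_by_rank(decoders):
    """Plug early: the stream profile is not known yet, so only rank counts."""
    return max(decoders, key=lambda d: d.rank)


def pick_for_profile(decoders, profile):
    """Plug late: choose the highest-ranking decoder that handles the profile."""
    candidates = [d for d in decoders if profile in d.profiles]
    return max(candidates, key=lambda d: d.rank) if candidates else None


early = pick_by_rank(DECODERS)
late = pick_for_profile(DECODERS, "high")
print(early.name, "high" in early.profiles)  # dspdec False
print(late.name)                             # softdec
```

Picking by rank alone selects the DSP decoder, which then cannot handle a high-profile stream; waiting until the parser has determined the profile selects the software fallback instead.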
So decodebin needs to communicate to the parser the set of available decoder
caps (which would contain the relevant capabilities/restrictions such as
supported profiles, resolutions, etc.), after the usual "autoplug-*" signal
filtering/sorting of course.
This could be done in multiple ways, e.g.
- plug a capsfilter element right after the parser, and construct
a set of filter caps from the list of available decoders (one
could append at the end just the name(s) of the caps structures
from the parser pad template caps to function as an 'ANY other'
caps equivalent). This would let the parser negotiate to a
supported stream format in the same way as with the static
pipeline mentioned above, but of course incur some overhead
through the additional capsfilter element.
- one could add a filter-caps equivalent property to the parsers
(and/or the GstBaseParse class) (e.g. "preferred-caps" or so).
- one could add some kind of "fixate-caps" or "fixate-format"
signal to such parsers
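The first option in the list above (a capsfilter with filter caps built from the available decoders) can be sketched as follows. This is a toy model under stated assumptions: caps are represented as lists of plain dictionaries, matching is a simple field comparison, and all function names and caps fields are illustrative, not GStreamer API.

```python
def build_filter_caps(decoder_caps_list, parser_template_caps):
    """Concatenate decoder sink caps, then append name-only catch-all structures."""
    filter_caps = []
    for caps in decoder_caps_list:  # assumed already filtered/sorted by preference
        for structure in caps:
            if structure not in filter_caps:
                filter_caps.append(structure)
    for structure in parser_template_caps:
        # A structure with only a name and no fields acts as "ANY other".
        catch_all = {"name": structure["name"]}
        if catch_all not in filter_caps:
            filter_caps.append(catch_all)
    return filter_caps


def first_match(filter_caps, producible):
    """Return the first producible structure matching the filter caps, in filter order."""
    for wanted in filter_caps:
        for candidate in producible:
            if all(candidate.get(key) == value for key, value in wanted.items()):
                return candidate
    return None


# Hypothetical data: one decoder that only accepts the byte-stream format.
decoder_caps = [[{"name": "video/x-h264", "stream-format": "byte-stream"}]]
template_caps = [{"name": "video/x-h264"}]
producible = [
    {"name": "video/x-h264", "stream-format": "avc"},
    {"name": "video/x-h264", "stream-format": "byte-stream"},
]

filter_caps = build_filter_caps(decoder_caps, template_caps)
print(first_match(filter_caps, producible)["stream-format"])  # byte-stream
```

Because the catch-all structures come last, the parser prefers a format some decoder supports, but can still produce output when no decoder matches, so auto-plugging can continue and report a missing decoder in the usual way.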
Alternatively, one could simply make all decoders incorporate parsers, so
that all formats are always supported. This is problematic for other reasons
though (e.g. we would then not be able to detect the profile in all cases
before plugging a decoder, which would make it hard to just play the audio
part of a stream and not the video if a suitable decoder was missing, for
example).