docs: add some text about parser/decoder autoplugging issues

2025-06-04 22:48:54 +00:00 · 2011-06-08 11:11:05 +01:00 · 2011-06-08 11:11:05 +01:00 · dbc04a27ec
commit dbc04a27ec
parent 5ed90ffc2c
1 changed files with 86 additions and 2 deletions
--- a/docs/design/design-decodebin.txt
+++ b/docs/design/design-decodebin.txt
@@ -13,9 +13,9 @@ Description:

  _ a GstTypeFindElement connected to the single sink pad

-  _ optionnaly a demuxer/parser
+  _ optionally a demuxer/parser

-  _ optionnaly one or more DecodeGroup
+  _ optionally one or more DecodeGroup

 * Autoplugging

@@ -203,3 +203,87 @@ differences:
  controlled by the element. This means that a buffer cannot be pushed to a
  non-linked pad any sooner than buffers in any other stream which were received
  before it.
+
+
+=====================================
+ Parsers, decoders and auto-plugging
+=====================================
+
+This section has DRAFT status.
+
+Some media formats come in different "flavours" or "stream formats". These
+formats differ in the way the setup data and media data are signalled and/or
+packaged. An example of this is H.264 video, where there is a bytestream
+format (with codec setup data signalled inline and units prefixed by a sync
+code and packet length information) and a "raw" format where codec setup
+data is signalled out of band (via the caps) and the chunking is implicit
+in the way the buffers were muxed into a container, to mention just two of
+the possible variants.
+
+Especially on embedded platforms it is common that decoders can only
+handle one particular stream format, and not all of them.
+
+Where there are multiple stream formats, parsers are usually expected
+to be able to convert between the different formats. This will, if
+implemented correctly, work as expected in a static pipeline such as
+
+   ... ! parser ! decoder ! sink
+
+where the parser can query the decoder's capabilities even before
+processing the first piece of data, and configure itself to convert
+accordingly, if conversion is needed at all.
+
+In an auto-plugging context this is not so straightforward though,
+because elements are plugged incrementally, and only after the previous
+element has processed some data and decided what exactly it will output
+(unless its template caps are completely fixed, in which case plugging can
+continue right away; that is not always the case here, see below). A
+parser will thus have to decide on *some* output format so auto-plugging
+can continue. It doesn't know anything about the available decoders and
+their capabilities though, so it's possible that it will choose a format
+that is not supported by any of the available decoders, or by the preferred
+decoder.
+
+If the parser had sufficiently concise but fixed source pad template caps,
+decodebin could continue to plug a decoder right away, allowing the
+parser to configure itself in the same way as it would with a static
+pipeline. This is not an option, unfortunately, because often the
+parser needs to process some data to determine e.g. the format's profile or
+other stream properties (resolution, sample rate, channel configuration, etc.),
+and there may be different decoders for different profiles (e.g. DSP codec
+for baseline profile, and software fallback for main/high profile; or a DSP
+codec only supporting certain resolutions, with a software fallback for
+unusual resolutions). So if decodebin just plugged the highest-ranking
+decoder, that decoder might not be able to handle the actual stream later
+on, which would result in an error (a data flow error, which would
+be hard to intercept and avoid in decodebin). In other words, we can't solve
+this issue by plugging a decoder right away with the parser.
+
+So decodebin needs to communicate to the parser the set of available decoder
+caps (which would contain the relevant capabilities/restrictions such as
+supported profiles, resolutions, etc.), after the usual "autoplug-*" signal
+filtering/sorting of course.
+
+This could be done in multiple ways, e.g.
+
+  - plug a capsfilter element right after the parser, and construct
+    a set of filter caps from the list of available decoders (one
+    could append at the end just the name(s) of the caps structures
+    from the parser pad template caps to function as an 'ANY other'
+    caps equivalent). This would let the parser negotiate to a
+    supported stream format in the same way as with the static
+    pipeline mentioned above, but of course incur some overhead
+    through the additional capsfilter element.
+
+  - one could add a filter-caps equivalent property to the parsers
+    (and/or GstBaseParse class) (e.g. "prefered-caps" or so).
+
+  - one could add some kind of "fixate-caps" or "fixate-format"
+    signal to such parsers
+
+Alternatively, one could simply make all decoders incorporate parsers, so
+that all formats are always supported. This is problematic for other reasons
+though (e.g. we would not be able to detect the profile in all cases then
+before plugging a decoder, which would make it hard to just play the audio
+part of a stream and not the video if a suitable decoder was missing, for
+example).
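
Note (not part of the commit): the capsfilter option described in the draft can be illustrated with a toy model. The sketch below is plain Python and does not use the GStreamer API; caps are modelled as simple dicts, and every function name, the decoder list and the field values are assumptions made up for illustration. It only shows the idea of building filter caps from the ranked decoder list, appending the bare structure names from the parser template as an "ANY other" fallback, and letting the parser pick the first output format that fits.

```python
# Toy model only: caps are plain dicts, not real GstCaps.
# All names and values here are hypothetical and chosen for illustration.

def build_filter_caps(decoder_caps, parser_template_names):
    """Decoder caps in rank order, then bare names as an 'ANY other' fallback."""
    filter_caps = [dict(caps) for caps in decoder_caps]
    filter_caps += [{"name": name} for name in parser_template_names]
    return filter_caps


def is_compatible(candidate, structure):
    """A candidate fits if it agrees with every field the structure constrains."""
    return all(candidate.get(key) == value for key, value in structure.items())


def choose_output(candidates, filter_caps):
    """Return the candidate matching the earliest filter structure, or None."""
    for structure in filter_caps:
        for candidate in candidates:
            if is_compatible(candidate, structure):
                return candidate
    return None


# What the parser could output once it has seen the stream (high profile),
# listed in the parser's own order of preference.
candidates = [
    {"name": "video/x-h264", "stream-format": "byte-stream", "profile": "high"},
    {"name": "video/x-h264", "stream-format": "avc", "profile": "high"},
]

# Ranked decoders: a DSP codec limited to baseline byte-stream input, then a
# software fallback that takes "avc" input in any profile.
decoders = [
    {"name": "video/x-h264", "stream-format": "byte-stream", "profile": "baseline"},
    {"name": "video/x-h264", "stream-format": "avc"},
]

filter_caps = build_filter_caps(decoders, ["video/x-h264"])
chosen = choose_output(candidates, filter_caps)
print(chosen)
```

In this model the DSP entry is skipped because the stream's profile does not match, so the parser settles on the format the software decoder accepts. With an empty decoder list only the fallback structure remains, and the parser keeps its own first preference.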