gstreamer/subprojects/gst-docs/markdown/additional/design/decodebin.md

# Decodebin design

## GstDecodeBin

### Description

 - Autoplug and decode to raw media

 - Input: single pad with ANY caps

 - Output: Dynamic pads

### Contents

 - a `GstTypeFindElement` connected to the single sink pad

 - optionally a demuxer/parser

 - optionally one or more DecodeGroup

### Autoplugging

The goal is to reach 'target' caps (by default raw media).

This is done by using the `GstCaps` of a source pad and finding the
available demuxers/decoders `GstElement` that can be linked to that pad.

The process starts with the source pad of typefind and stops when no
more non-target caps are left. It is commonly done while pre-rolling,
but can also happen whenever a new pad appears on any element.

Once a target caps has been found, that pad is ghosted and the
'pad-added' signal is emitted.

If no compatible elements can be found for a `GstCaps`, the pad is ghosted
and the 'unknown-type' signal is emitted.

### Assisted auto-plugging

When starting the auto-plugging process for a given `GstCaps`, two signals
are emitted in the following way in order to allow the application/user
to assist or fine-tune the process.

 - **'autoplug-continue'**:

```
    gboolean user_function (GstElement * decodebin, GstPad *pad, GstCaps * caps)
```

    This signal is fired at the very beginning with the source pad `GstCaps`. If
    the callback returns TRUE, the process continues normally. If the
    callback returns FALSE, then the `GstCaps` are considered as a target caps
    and the autoplugging process stops.

  - **'autoplug-factories'**:

```
    GValueArray user_function (GstElement* decodebin, GstPad* pad, GstCaps* caps);
```

    Get a list of elementfactories for `@pad` with `@caps`. This function is
    used to instruct decodebin2 of the elements it should try to
    autoplug. The default behaviour when this function is not overriden
    is to get all elements that can handle @caps from the registry
    sorted by rank.

  - **'autoplug-select'**:

```
    gint user_function (GstElement* decodebin, GstPad* pad, GstCaps*caps, GValueArray* factories);
```

    This signal is fired once autoplugging has got a list of compatible
    `GstElementFactory`. The signal is emitted with the `GstCaps` of the
    source pad and a pointer on the GValueArray of compatible factories.

    The callback should return the index of the elementfactory in
    @factories that should be tried next.

    If the callback returns -1, the autoplugging process will stop as if
    no compatible factories were found.

The default implementation of this function will try to autoplug the
first factory of the list.

### Target Caps

The target caps are a read/write `GObject` property of decodebin.

By default the target caps are:

 - Raw audio: audio/x-raw

 - Raw video: video/x-raw

 - Raw text: text/x-raw, format={utf8,pango-markup}

### Media chain/group handling

When autoplugging, all streams coming out of a demuxer will be grouped
in a `DecodeGroup`.

All new source pads created on that demuxer after it has emitted the
'no-more-pads' signal will be put in another `DecodeGroup`.

Only one decodegroup can be active at any given time. If a new
decodegroup is created while another one exists, that `DecodeGroup` will
be set as blocking until the existing one has drained.

## DecodeGroup

### Description

Streams belonging to the same group/chain of a media file.

### Contents

The DecodeGroup contains:

 - a `GstMultiQueue` to which all streams of the media group are connected.

 - the eventual decoders which are autoplugged in order to produce the
   requested target pads.

### Proper group draining

The `DecodeGroup` takes care that all the streams in the group are
completely drained (`EOS` has come through all source ghost pads).

### Pre-roll and block

The `DecodeGroup` has a global blocking feature. If enabled, all the
ghosted source pads for that group will be blocked.

A method is available to unblock all blocked pads for that group.

## GstMultiQueue

Multiple input-output data queue.

`multiqueue` achieves the same functionality as `queue`, with a
few differences:

  - Multiple streams handling.

    The element handles queueing data on more than one stream at once.
    To achieve such a feature it has request sink pads (`sink_%u`) and
    'sometimes' src pads (`src_%u`).

    When requesting a given sinkpad, the associated srcpad for that
    stream will be created. Ex: requesting `sink_1` will generate `src_1`.

  - Non-starvation on multiple streams.

    If more than one stream is used with the element, the streams'
    queues will be dynamically grown (up to a limit), in order to ensure
    that no stream is risking data starvation. This guarantees that at
    any given time there are at least N bytes queued and available for
    each individual stream.

    If an `EOS` event comes through a srcpad, the associated queue should
    be considered as 'not-empty' in the queue-size-growing algorithm.

  - Non-linked srcpads graceful handling.

    A `GstTask` is started for all srcpads when going to
    `GST_STATE_PAUSED`.

    The task are blocking against a GCondition which will be fired in
    two different cases:

    - When the associated queue has received a buffer.

    - When the associated queue was previously declared as 'not-linked'
      and the first buffer of the queue is scheduled to be pushed
      synchronously in relation to the order in which it arrived globally
      in the element (see 'Synchronous data pushing' below).

    When woken up by the `GCondition`, the `GstTask` will try to push the
    next `GstBuffer`/`GstEvent` on the queue. If pushing returns
    `GST_FLOW_NOT_LINKED`, the associated queue is marked as `not-linked`.
    If pushing succeeds, the queue will no longer be marked as `not-linked`.

    If pushing on all srcpads returns a `GstFlowReturn` different from
    `GST_FLOW_OK`, then all the srcpads' tasks are stopped and
    subsequent pushes on sinkpads will return `GST_FLOW_NOT_LINKED`.

  - Synchronous data pushing for non-linked pads.

    In order to better support dynamic switching between streams, the
    multiqueue (unlike the current GStreamer queue) continues to push
    buffers on non-linked pads rather than shutting down.

    In addition, to prevent a non-linked stream from very quickly
    consuming all available buffers and thus 'racing ahead' of the other
    streams, the element must ensure that buffers and inlined events for
    a non-linked stream are pushed in the same order as they were
    received, relative to the other streams controlled by the element.
    This means that a buffer cannot be pushed to a non-linked pad any
    sooner than buffers in any other stream which were received before
    it.

## Parsers, decoders and auto-plugging

This section has DRAFT status.

Some media formats come in different "flavours" or "stream formats".
These formats differ in the way the setup data and media data is
signalled and/or packaged. An example for this is H.264 video, where
there is a bytestream format (with codec setup data signalled inline and
units prefixed by a sync code and packet length information) and a "raw"
format where codec setup data is signalled out of band (via the caps)
and the chunking is implicit in the way the buffers were muxed into a
container, to mention just two of the possible variants.

Especially on embedded platforms it is common that decoders can only
handle one particular stream format, and not all of them.

Where there are multiple stream formats, parsers are usually expected to
be able to convert between the different formats. This will, if
implemented correctly, work as expected in a static pipeline such as

```
... ! parser ! decoder ! sink
```

where the parser can query the decoder's capabilities even before
processing the first piece of data, and configure itself to convert
accordingly, if conversion is needed at all.

In an auto-plugging context this is not so straight-forward though,
because elements are plugged incrementally and not before the previous
element has processed some data and decided what it will output exactly
(unless the template caps are completely fixed, then it can continue
right away, this is not always the case here though, see below). A
parser will thus have to decide on *some* output format so auto-plugging
can continue. It doesn't know anything about the available decoders and
their capabilities though, so it's possible that it will choose a format
that is not supported by any of the available decoders, or by the
preferred decoder.

If the parser had sufficiently concise but fixed source pad template
caps, decodebin could continue to plug a decoder right away, allowing
the parser to configure itself in the same way as it would with a static
pipeline. This is not an option, unfortunately, because often the parser
needs to process some data to determine e.g. the format's profile or
other stream properties (resolution, sample rate, channel configuration,
etc.), and there may be different decoders for different profiles (e.g.
DSP codec for baseline profile, and software fallback for main/high
profile; or a DSP codec only supporting certain resolutions, with a
software fallback for unusual resolutions). So if decodebin just plugged
the most highest-ranking decoder, that decoder might not be be able to
handle the actual stream later on, which would yield an error (this is a
data flow error then which would be hard to intercept and avoid in
decodebin). In other words, we can't solve this issue by plugging a
decoder right away with the parser.

So decodebin needs to communicate to the parser the set of available
decoder caps (which would contain the relevant capabilities/restrictions
such as supported profiles, resolutions, etc.), after the usual
"autoplug-\*" signal filtering/sorting of course.

This is done by plugging a capsfilter element right after the parser,
and constructing set of filter caps from the list of available decoders
(one appends at the end just the name(s) of the caps structures from the
parser pad template caps to function as an 'ANY other' caps equivalent).
This let the parser negotiate to a supported stream format in the same
way as with the static pipeline mentioned above, but of course incur
some overhead through the additional capsfilter element.