Latency
-------

The latency is the time it takes for a sample captured at timestamp 0 to reach
the sink. This time is measured against the clock in the pipeline. For
pipelines where the only elements that synchronize against the clock are the
sinks, the latency is always 0, since no other element delays the buffer.

For pipelines with live sources, a latency is introduced, mostly because of
the way a live source works. Consider an audio source: it starts capturing the
first sample at time 0. If the source pushes buffers of 44100 samples at a
time at 44100Hz, it will have collected a complete buffer only at second 1.
Since the timestamp of that buffer is 0 and the clock time is now >= 1 second,
the sink will drop the buffer because it is too late. Without any latency
compensation in the sink, all buffers would be dropped.

The situation becomes more complex in the presence of:

- 2 live sources connected to 2 live sinks with different latencies
  * audio/video capture with synchronized live preview.
  * added latencies due to effects (delays, resamplers, ...)
- 1 live source connected to 2 live sinks
  * firewire DV
  * RTP, with added latencies because of jitter buffers.
- mixed live source and non-live source scenarios.
  * synchronized audio capture with non-live playback (overdubs, ...).
- clock slaving in the sinks due to the live sources providing their own
  clocks.

To perform the needed latency corrections in the above scenarios, we must
develop an algorithm to calculate a global latency for the pipeline. The
algorithm must be extensible so that it can optimize the latency at runtime.
It must also be possible to disable or tune the algorithm based on specific
application needs (required minimal latency).


Pipelines without latency compensation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

We show some examples to demonstrate the problem of latency in typical
capture pipelines.

- Example 1

  An audio capture/playback pipeline.

  asrc:  audio source, provides a clock
  asink: audio sink, provides a clock

  .--------------------------.
  | pipeline                 |
  | .------.    .-------.    |
  | | asrc |    | asink |    |
  | |    src -> sink    |    |
  | '------'    '-------'    |
  '--------------------------'

  NULL->READY:
    asink: NULL->READY:  probes device, returns SUCCESS
    asrc:  NULL->READY:  probes device, returns SUCCESS

  READY->PAUSED:
    asink: READY->PAUSED: opens device, returns ASYNC
    asrc:  READY->PAUSED: opens device, returns NO_PREROLL

  * Since the source is a live source, it will only produce data in the
    PLAYING state. To note this fact, it returns NO_PREROLL from its state
    change function.
  * The sink returns ASYNC because it can only complete the state change to
    PAUSED when it receives the first buffer.

  At this point the pipeline is not processing data and the clock is not
  running. Unless a new action is performed on the pipeline, this situation
  will never change.

  PAUSED->PLAYING:
    The asrc clock is selected because it is the most upstream clock provider.
    asink can only provide a clock after it has received the first buffer and
    configured the device with the samplerate in the caps.

    asink: PAUSED->PLAYING: sets the pending state to PLAYING, returns ASYNC
           because it is not prerolled. The sink will commit its state to
           PLAYING when it prerolls.
    asrc:  PAUSED->PLAYING: starts pushing buffers.

  * Since the sink is still performing a state change from READY to PAUSED,
    it remains ASYNC. The pending state is set to PLAYING.
  * The clock starts running as soon as all the elements have been set to
    PLAYING.
  * The source is a live source with a latency. Since it is synchronized with
    the clock, it can only produce a buffer with timestamp 0 and duration D
    after time D, i.e. it can only produce the last sample of the buffer
    (with timestamp D) at time D. This latency depends on the size of the
    buffer.
  * The sink receives the buffer with timestamp 0 at a time >= D. At this
    point the buffer is already too late and might be dropped. This state of
    constantly dropping data will not change unless a constant latency
    correction is added to the incoming buffer timestamps.

  The problem is that the sink is set to (pending) PLAYING without being
  prerolled, which only happens in live pipelines.
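
  For reference, a minimal application sketch (assuming GStreamer 1.x and the
  autoaudiosrc/autoaudiosink elements are available) that builds the live
  pipeline of example 1. In practice such a pipeline plays correctly because
  the pipeline performs the latency compensation described in the rest of
  this document when going to PLAYING.

  #include <gst/gst.h>

  int
  main (int argc, char **argv)
  {
    GstElement *pipeline;
    GstBus *bus;
    GstMessage *msg;

    gst_init (&argc, &argv);

    /* live audio source -> live audio sink, as in example 1 */
    pipeline = gst_parse_launch ("autoaudiosrc ! autoaudiosink", NULL);

    /* during the internal READY->PAUSED transition the source returns
     * NO_PREROLL and the sink returns ASYNC */
    gst_element_set_state (pipeline, GST_STATE_PLAYING);

    /* run until an error occurs */
    bus = gst_element_get_bus (pipeline);
    msg = gst_bus_timed_pop_filtered (bus, GST_CLOCK_TIME_NONE,
        GST_MESSAGE_ERROR);
    if (msg)
      gst_message_unref (msg);

    gst_object_unref (bus);
    gst_element_set_state (pipeline, GST_STATE_NULL);
    gst_object_unref (pipeline);
    return 0;
  }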

- Example 2

  An audio/video capture/playback pipeline. We capture both audio and video
  and have them played back synchronized again.

  asrc:  audio source, provides a clock
  asink: audio sink, provides a clock
  vsrc:  video source
  vsink: video sink

  .--------------------------.
  | pipeline                 |
  | .------.    .-------.    |
  | | asrc |    | asink |    |
  | |    src -> sink    |    |
  | '------'    '-------'    |
  | .------.    .-------.    |
  | | vsrc |    | vsink |    |
  | |    src -> sink    |    |
  | '------'    '-------'    |
  '--------------------------'

  The state changes happen in the same way as in example 1. Both sinks end up
  with a pending state of PLAYING and a return value of ASYNC until they
  receive the first buffer.

  For audio and video to be played in sync, both sinks must compensate for
  the latency of their respective source, but they must also use exactly the
  same latency correction.

  Suppose asrc has a latency of 20ms and vsrc a latency of 33ms; the total
  latency in the pipeline has to be at least 33ms. This also means that the
  pipeline must have at least 33 - 20 = 13ms of buffering on the audio
  stream, or else the audio source will underrun while the audio sink waits
  for the previous sample to play.

- Example 3

  An example of the combination of a non-live source (file) and a live source
  (vsrc) connected to live sinks (sink, vsink).

  .--------------------------.
  | pipeline                 |
  | .------.    .-------.    |
  | | file |    | sink  |    |
  | |    src -> sink    |    |
  | '------'    '-------'    |
  | .------.    .-------.    |
  | | vsrc |    | vsink |    |
  | |    src -> sink    |    |
  | '------'    '-------'    |
  '--------------------------'

  The state changes happen in the same way as in example 1, except that sink
  will be able to preroll (commit its state to PAUSED).

  In this case sink will have no latency but vsink will. The total latency
  should be that of vsink.

  Note that because of the presence of a live source (vsrc), the pipeline can
  be set to PLAYING before sink is able to preroll. Without compensation for
  the live source, this might lead to synchronisation problems, because the
  latency should be configured in the element before it can go to PLAYING.

- Example 4

  An example of the combination of a non-live and a live source. The non-live
  source is connected to a live sink and the live source to a non-live sink.

  .--------------------------.
  | pipeline                 |
  | .------.    .-------.    |
  | | file |    | sink  |    |
  | |    src -> sink    |    |
  | '------'    '-------'    |
  | .------.    .-------.    |
  | | vsrc |    | files |    |
  | |    src -> sink    |    |
  | '------'    '-------'    |
  '--------------------------'

  The state changes happen in the same way as in example 3: sink will be able
  to preroll (commit its state to PAUSED), but files will not be able to
  preroll.

  sink will have no latency since it is not connected to a live source. files
  does not synchronize, so it does not care about latency.

  The total latency in the pipeline is 0. The vsrc element captures in sync
  with the playback in sink.

  As in example 3, sink can only be set to PLAYING after it has successfully
  prerolled.


State Changes
~~~~~~~~~~~~~

A sink is never set to PLAYING before it is prerolled. In order to do this,
the pipeline (at the GstBin level) keeps track of all elements that require
preroll (the ones that return ASYNC from the state change). These elements
posted an ASYNC_START message without a matching ASYNC_DONE message.

The pipeline will not change the state of elements that are still busy with
an ASYNC state change.

When an ASYNC element prerolls, it commits its state to PAUSED and posts an
ASYNC_DONE message. The pipeline notices this ASYNC_DONE message and matches
it with the ASYNC_START message it cached for the corresponding element.

When all ASYNC_START messages are matched with an ASYNC_DONE message, the
pipeline proceeds with setting the elements to the final state again.

The base time of the element was already set by the pipeline when it changed
the NO_PREROLL element to PLAYING. This operation has to be performed in the
separate async state change thread (like the one currently used for going
from PAUSED->PLAYING in a non-live pipeline).
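
A simplified sketch of the element side of this mechanism. This is not the
actual GstBaseSink code, which implements this behaviour for real sinks; it
only shows which messages are posted and when:

  #include <gst/gst.h>

  /* posted when the state change returns ASYNC: tells the bin that this
   * element needs to preroll before the state change can complete */
  static void
  my_sink_start_async (GstElement * sink)
  {
    gst_element_post_message (sink,
        gst_message_new_async_start (GST_OBJECT_CAST (sink)));
  }

  /* posted when the first buffer arrives: the element commits its state to
   * PAUSED and the bin matches this ASYNC_DONE with the earlier ASYNC_START */
  static void
  my_sink_preroll_done (GstElement * sink, GstClockTime running_time)
  {
    gst_element_post_message (sink,
        gst_message_new_async_done (GST_OBJECT_CAST (sink), running_time));
  }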


Query
~~~~~

The pipeline latency is queried with the LATENCY query.

  (out) "live", G_TYPE_BOOLEAN (default FALSE)
      - TRUE if a live element is found upstream

  (out) "min-latency", G_TYPE_UINT64 (default 0, must not be NONE)
      - the minimum latency in the pipeline, meaning the minimum time
        downstream elements synchronizing to the clock have to wait until
        they can be sure that all data for the current running time has
        been received.

        Elements answering the latency query and introducing latency must
        set this to the maximum time for which they will delay data, while
        considering upstream's minimum latency. As such, from an element's
        own perspective this is *not* its minimum latency but its maximum
        latency.

        Considering upstream's minimum latency in general means that the
        element's own value is added to upstream's value, as this gives
        the overall minimum latency of all elements from the source to the
        current element:

            min_latency = upstream_min_latency + own_min_latency

  (out) "max-latency", G_TYPE_UINT64 (default 0, NONE meaning infinity)
      - the maximum latency in the pipeline, meaning the maximum time an
        element synchronizing to the clock is allowed to wait for receiving
        all data for the current running time. Waiting for a longer time
        will result in data loss, overruns and underruns of buffers and in
        general breaks synchronized data flow in the pipeline.

        Elements answering the latency query should set this to the maximum
        time for which they can buffer upstream data without blocking or
        dropping further data. For an element this value will generally be
        its own minimum latency, but it might be bigger than that if it can
        buffer more data. As such, queue elements can be used to increase
        the maximum latency.

        The value set in the query should again consider upstream's maximum
        latency:

        - If the current element has blocking buffering, i.e. it does not
          drop data by itself when its internal buffer is full, it should
          just add its own maximum latency (i.e. the size of its internal
          buffer) to upstream's value. If upstream's maximum latency, or the
          element's internal maximum latency, is NONE (i.e. infinity), the
          result is infinity as well.

              if (upstream_max_latency == NONE || own_max_latency == NONE)
                max_latency = NONE;
              else
                max_latency = upstream_max_latency + own_max_latency;

          If the element has multiple sinkpads, the minimum upstream latency
          is the maximum of all live upstream minimum latencies.

        - If the current element has leaky buffering, i.e. it drops data by
          itself when its internal buffer is full, it should take the
          minimum of its own maximum latency and upstream's. Examples of
          such elements are audio sinks and sources with an internal
          ringbuffer, leaky queues and, in general, live sources with a
          limited number of internal buffers.

              max_latency = MIN (upstream_max_latency, own_max_latency)

          Note: many GStreamer base classes allow subclasses to set a
          minimum and maximum latency and handle the query themselves. These
          base classes assume non-leaky (i.e. blocking) buffering for the
          maximum latency. The base class' default query handler needs to be
          overridden to correctly handle leaky buffering.

        If the element has multiple sinkpads, the maximum upstream latency
        is the minimum of all live upstream maximum latencies.
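
A sketch of how an element with a single sinkpad and blocking (non-leaky)
buffering could answer the LATENCY query from its srcpad query function. The
OWN_MIN_LATENCY and OWN_MAX_LATENCY values are placeholders for whatever
latency the element itself introduces:

  #include <gst/gst.h>

  /* placeholder values for the latency this element itself introduces */
  #define OWN_MIN_LATENCY (20 * GST_MSECOND)
  #define OWN_MAX_LATENCY (20 * GST_MSECOND)

  static gboolean
  my_element_handle_latency_query (GstPad * sinkpad, GstQuery * query)
  {
    GstClockTime min, max;
    gboolean live;

    /* let upstream fill in its values first */
    if (!gst_pad_peer_query (sinkpad, query))
      return FALSE;

    gst_query_parse_latency (query, &live, &min, &max);

    /* min-latency: always add our own latency to upstream's minimum */
    min += OWN_MIN_LATENCY;

    /* max-latency, non-leaky case: add our own maximum latency, where NONE
     * means infinity. A leaky element would use
     * max = MIN (max, OWN_MAX_LATENCY) instead. */
    if (max != GST_CLOCK_TIME_NONE && OWN_MAX_LATENCY != GST_CLOCK_TIME_NONE)
      max += OWN_MAX_LATENCY;
    else
      max = GST_CLOCK_TIME_NONE;

    gst_query_set_latency (query, live, min, max);
    return TRUE;
  }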


Event
~~~~~

The latency in the pipeline is configured with the LATENCY event, which
contains the following field:

  "latency", G_TYPE_UINT64
      - the configured latency in the pipeline
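
A sketch of how an element could pick up the configured latency from the
LATENCY event as it travels upstream; what is done with the value is element
specific (sinks add it to their synchronization times, see below):

  #include <gst/gst.h>

  static gboolean
  my_element_src_event (GstPad * pad, GstObject * parent, GstEvent * event)
  {
    if (GST_EVENT_TYPE (event) == GST_EVENT_LATENCY) {
      GstClockTime latency;

      gst_event_parse_latency (event, &latency);
      /* remember/apply the configured latency here; a sink would add this
       * value to its synchronization times */
      GST_INFO_OBJECT (parent, "configured latency: %" GST_TIME_FORMAT,
          GST_TIME_ARGS (latency));
    }

    /* forward the event further upstream */
    return gst_pad_event_default (pad, parent, event);
  }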


Latency compensation
~~~~~~~~~~~~~~~~~~~~

Latency calculation and compensation is performed before the pipeline
proceeds to the PLAYING state.

When the pipeline has collected all ASYNC_DONE messages, it can calculate
the global latency as follows:

  - perform a LATENCY query on all sinks
  - sources set their minimum and maximum latency
  - other elements add their own values as described above
  - latency = MAX (all min latencies)
  - if MIN (all max latencies) < latency, we have an impossible situation
    and we must generate an error indicating that this pipeline cannot be
    played. This usually means that there is not enough buffering in some
    chain of the pipeline. A queue can be added to those chains.

The sinks gather this information with a LATENCY query upstream.
Intermediate elements pass the query upstream and add the amount of latency
they introduce to the result.

  ex1:
    sink1: [20 - 20]
    sink2: [33 - 40]

    MAX (20, 33) = 33
    MIN (20, 40) = 20 < 33 -> impossible

  ex2:
    sink1: [20 - 50]
    sink2: [33 - 40]

    MAX (20, 33) = 33
    MIN (50, 40) = 40 >= 33 -> latency = 33
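
A sketch of this reduction, assuming the per-sink query results have already
been collected into arrays of min/max values:

  #include <gst/gst.h>

  static gboolean
  compute_global_latency (const GstClockTime * mins,
      const GstClockTime * maxs, guint n_sinks, GstClockTime * latency)
  {
    GstClockTime min = 0, max = GST_CLOCK_TIME_NONE;
    guint i;

    for (i = 0; i < n_sinks; i++) {
      /* latency = MAX (all min latencies) */
      min = MAX (min, mins[i]);
      /* MIN (all max latencies), where NONE means infinity */
      if (maxs[i] != GST_CLOCK_TIME_NONE)
        max = (max == GST_CLOCK_TIME_NONE) ? maxs[i] : MIN (max, maxs[i]);
    }

    if (max != GST_CLOCK_TIME_NONE && max < min)
      return FALSE;   /* impossible: not enough buffering in some chain */

    *latency = min;
    return TRUE;
  }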

The latency is set on the pipeline by sending a LATENCY event to the sinks in
the pipeline. This event configures the total latency on the sinks. The sink
forwards this LATENCY event upstream so that intermediate elements can
configure themselves as well.

After this step, the pipeline continues setting the pending state on its
elements.

A sink adds the latency value, received in the LATENCY event, to the times
used for synchronizing against the clock. This effectively delays the
rendering of the buffer by the required latency. Since this delay is the same
for all sinks, all sinks render data relatively synchronised.
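
A simplified sketch (not the actual GstBaseSink code) of how a sink could
take the configured latency into account when waiting on the clock:

  #include <gst/gst.h>

  static void
  my_sink_wait_for_buffer (GstElement * sink, GstClockTime running_time,
      GstClockTime configured_latency)
  {
    GstClock *clock = gst_element_get_clock (sink);
    GstClockTime base_time = gst_element_get_base_time (sink);
    GstClockID id;

    if (clock == NULL)
      return;

    /* render time = base time + running time of the buffer + latency */
    id = gst_clock_new_single_shot_id (clock,
        base_time + running_time + configured_latency);
    gst_clock_id_wait (id, NULL);
    gst_clock_id_unref (id);
    gst_object_unref (clock);
  }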


Flushing a playing pipeline
~~~~~~~~~~~~~~~~~~~~~~~~~~~

We can implement resynchronisation after an uncontrolled FLUSH in (part of) a
pipeline in the same way. Indeed, when a flush is performed on a PLAYING live
element, a new base time must be distributed to this element.

A flush in a pipeline can happen in the following cases:

  - flushing seek in the pipeline
    - performed by the application on the pipeline (see the sketch below)
    - performed by the application on an element
  - flush performed by an element
    - after receiving a navigation event (DVD, ...)
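
A small sketch of the first case, a flushing seek performed by the
application on the pipeline:

  #include <gst/gst.h>

  static void
  do_flushing_seek (GstElement * pipeline, gint64 position)
  {
    /* the FLUSH_START/FLUSH_STOP events sent by this seek make the sinks
     * post ASYNC_START, as described below */
    gst_element_seek_simple (pipeline, GST_FORMAT_TIME,
        GST_SEEK_FLAG_FLUSH | GST_SEEK_FLAG_KEY_UNIT, position);
  }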

When a playing sink is flushed by a FLUSH_START event, an ASYNC_START message
is posted by the element. As part of the message, the fact that the element
got flushed is included. The element also goes to a pending PAUSED state and
has to be set to the PLAYING state again later.

The ASYNC_START message is kept by the parent bin. When the element prerolls,
it posts an ASYNC_DONE message.

When all ASYNC_START messages are matched with an ASYNC_DONE message, the bin
will capture a new base_time from the clock and will bring all the sinks back
to PLAYING after setting the new base time on them. It is also possible to
perform additional latency calculations and adjustments before doing this.


Dynamically adjusting latency
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

An element that wants to change the latency in the pipeline can do this by
posting a LATENCY message on the bus. This message instructs the pipeline to:

  - query the latency in the pipeline (which might now have changed) with a
    LATENCY query.
  - redistribute a new global latency to all elements with a LATENCY event.

A use case where the latency in a pipeline can change is a network element
that observes increased inter-packet arrival jitter or excessive packet loss
and decides to increase its internal buffering (and thus the latency). The
element must post a LATENCY message and perform the additional latency
adjustments when it receives the LATENCY event from the downstream peer
element.
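
A sketch of both halves of this mechanism, assuming the application listens
on the bus and triggers the recalculation itself; my_element_latency_changed()
stands in for whatever internal code detects the change:

  #include <gst/gst.h>

  /* element side: internal buffering, and therefore latency, changed */
  static void
  my_element_latency_changed (GstElement * self)
  {
    gst_element_post_message (self,
        gst_message_new_latency (GST_OBJECT_CAST (self)));
  }

  /* application side: react to the LATENCY message on the bus */
  static void
  on_bus_message (GstBus * bus, GstMessage * msg, gpointer pipeline)
  {
    if (GST_MESSAGE_TYPE (msg) == GST_MESSAGE_LATENCY)
      gst_bin_recalculate_latency (GST_BIN (pipeline));
  }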

In a similar way, the latency can be decreased again when network conditions
improve.

Latency adjustments will introduce glitches in playback at the sinks and must
only be performed in special conditions.