# Quality-of-Service

Quality of service is about measuring and adjusting the real-time
performance of a pipeline.

Real-time performance is always measured relative to the pipeline
clock. The measurement typically happens in the sinks when they
synchronize buffers against the clock.

The measurements result in QoS events that aim to adjust the datarate in
one or more upstream elements. Two types of adjustments can be made:

  - short-term "emergency" corrections based on the latest observation
    in the sinks.

  - long-term rate corrections based on trends observed in the sinks.

It is also possible for the application to artificially introduce delay
between synchronized buffers. This is called throttling. It can be used
to reduce the framerate, for example.

## Sources of quality problems

  - High CPU load

  - Network problems

  - Other resource problems such as disk load, memory bottlenecks etc.

  - Application-level throttling

## QoS event

The QoS event is generated by an element that synchronizes against the
clock. It travels upstream and contains the following fields:

* **`type`**: `GST_TYPE_QOS_TYPE`: The type of the QoS event. The following
types exist; the default type is `GST_QOS_TYPE_UNDERFLOW`:

    * `GST_QOS_TYPE_OVERFLOW`:  an element is receiving buffers too fast and can't
    keep up processing them. Upstream should reduce the rate.

    * `GST_QOS_TYPE_UNDERFLOW`: an element is receiving buffers too slowly
    and has to drop them because they are too late. Upstream should
    increase the processing rate.

    * `GST_QOS_TYPE_THROTTLE`:  the application is asking to add extra delay
    between buffers; upstream is allowed to drop buffers.

* **`timestamp`**: `G_TYPE_UINT64`: The timestamp on the buffer that
generated the QoS event. These timestamps are expressed in total
`running_time` in the sink so that the value is ever increasing.

* **`jitter`**: `G_TYPE_INT64`: The difference of that timestamp against the
current clock time. Negative values mean the timestamp was on time.
Positive values indicate the timestamp was late by that amount. When
buffers are received in time and throttling is not enabled, the QoS
type field is set to OVERFLOW. When throttling, the jitter contains
the throttling delay added by the application and the type is set to
THROTTLE.

* **`proportion`**: `G_TYPE_DOUBLE`: Long term prediction of the ideal rate
relative to normal rate to get optimal quality.
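As a rough illustration, the four fields can be modelled as a small record. The sketch below is plain Python, not the GStreamer API: `QosType`, `QosEvent` and `classify` are hypothetical names, and `classify` only encodes the three rules stated in the field descriptions above. A real sink may refine the overflow/underflow decision.

```python
from dataclasses import dataclass
from enum import Enum


class QosType(Enum):
    OVERFLOW = "overflow"
    UNDERFLOW = "underflow"
    THROTTLE = "throttle"


@dataclass(frozen=True)
class QosEvent:
    type: QosType
    timestamp: int     # running_time of the buffer (unsigned)
    jitter: int        # signed; negative means the buffer was on time
    proportion: float  # ideal rate relative to normal rate


def classify(jitter: int, throttling: bool) -> QosType:
    """Pick the event type from the rules in the field descriptions."""
    if throttling:
        return QosType.THROTTLE
    if jitter < 0:
        # Buffer was received in time and throttling is off.
        return QosType.OVERFLOW
    # Fall back to the documented default type.
    return QosType.UNDERFLOW


event = QosEvent(classify(30, throttling=False), timestamp=400,
                 jitter=30, proportion=1.5)
```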

The rest of this document deals with how these values can be calculated
in a sink and how the values can be used by other elements to adjust
their operations.

## QoS message

A QoS message is posted on the bus whenever an element decides to:

  - drop a buffer because of QoS reasons

  - change its processing strategy because of QoS reasons (quality)

It should be expected that creating and posting the QoS message is
reasonably fast and does not significantly contribute to the QoS
problems. Options to disable this feature could also be presented on
elements.

This message can be posted by a sink/src that performs synchronisation
against the clock (live) or it could be posted by an upstream element
that performs QoS because of QoS events received from a downstream
element (non-live).

The `GST_MESSAGE_QOS` contains at least the following info:

* **`live`**: `G_TYPE_BOOLEAN`: Whether the QoS message was generated by a
live element such as a sink or a live source. If the live field is
FALSE, the QoS message was generated as a response to a QoS event in
a non-live element.

* **`running-time`**: `G_TYPE_UINT64`: The `running_time` of the buffer that
generated the QoS message.

* **`stream-time`**: `G_TYPE_UINT64`: The `stream_time` of the buffer that
generated the QoS message.

* **`timestamp`**: `G_TYPE_UINT64`: The timestamp of the buffer that
generated the QoS message.

* **`duration`**: `G_TYPE_UINT64`: The duration of the buffer that generated
the QoS message.

* **`jitter`**: `G_TYPE_INT64`: The difference of the running-time against
the deadline. Negative values mean the timestamp was on time.
Positive values indicate the timestamp was late (and dropped) by
that amount. The deadline can be a realtime `running_time` or an
estimated `running_time`.

* **`proportion`**: `G_TYPE_DOUBLE`: Long term prediction of the ideal rate
relative to normal rate to get optimal quality.

* **`quality`**: `G_TYPE_INT`: An element dependent integer value that
specifies the current quality level of the element. The default
maximum quality is 1000000.

* **`format`**: `GST_TYPE_FORMAT`: Units of the *processed* and *dropped*
fields. Video sinks and video filters will use `GST_FORMAT_BUFFERS`
(frames). Audio sinks and audio filters will likely use
`GST_FORMAT_DEFAULT` (samples).

* **`processed`**: `G_TYPE_UINT64`: Total number of units correctly
processed since the last state change to READY or a flushing
operation.

* **`dropped`**: `G_TYPE_UINT64`: Total number of units dropped since the
last state change to READY or a flushing operation.

The *running-time* and *processed* fields can be used to estimate the
average processing rate (framerate for video).
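A minimal sketch of that estimate, assuming the running-time is expressed in nanoseconds and that both values are counted from the same starting point. The function name `average_rate` is hypothetical.

```python
NSEC_PER_SEC = 1_000_000_000


def average_rate(processed: int, running_time_ns: int) -> float:
    """Average number of processed units per second of running time."""
    if running_time_ns <= 0:
        return 0.0
    return processed * NSEC_PER_SEC / running_time_ns


# 250 frames processed after 10 seconds of running time.
fps = average_rate(250, 10 * NSEC_PER_SEC)
```

With `GST_FORMAT_BUFFERS` the result is a framerate; with `GST_FORMAT_DEFAULT` on audio it would be a sample rate.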

Elements might add additional fields in the message which are documented
in the relevant elements or baseclasses.

## Collecting statistics

A buffer with timestamp B1 arrives in the sink at time T1. The buffer
timestamp is then synchronized against the clock which yields a jitter
J1 return value from the clock. The jitter J1 is simply calculated as

    J1 = CT - B1

Where CT is the clock time when the entry arrives in the sink. This
value is calculated inside the clock when we perform
`gst_clock_id_wait()`.

If the jitter is negative, the entry arrived in time and can be rendered
after waiting for the clock to reach time B1 (which is also CT - J1).

If the jitter is positive however, the entry arrived too late in the
sink and should therefore be dropped. J1 is the amount of time the entry
was late.

Any buffer that arrives in the sink should generate a QoS event
upstream.

Using the jitter we can calculate the time when the buffer arrived in
the sink:

```
    T1 = B1 + J1.                                (1)
```

The time the buffer leaves the sink after synchronisation is measured
as:

```
    T2 = B1 + (J1 < 0 ? 0 : J1)                  (2)
```

For buffers that arrive in time (`J1 < 0`) the buffer leaves after
synchronisation, which is exactly at B1. Late buffers (`J1 >= 0`) leave
the sink when they arrive, without any synchronisation, which is
`T2 = T1 = B1 + J1`.
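The jitter, arrival time (1) and leave time (2) can be sketched as three small helpers. This is an illustrative model in plain Python with hypothetical function names; times are in milliseconds only for readability.

```python
def jitter(clock_time: int, buffer_ts: int) -> int:
    """J = CT - B: negative when the buffer is early, positive when late."""
    return clock_time - buffer_ts


def arrival_time(buffer_ts: int, j: int) -> int:
    """Equation (1): T1 = B1 + J1."""
    return buffer_ts + j


def leave_time(buffer_ts: int, j: int) -> int:
    """Equation (2): T2 = B1 + (J1 < 0 ? 0 : J1)."""
    return buffer_ts + (0 if j < 0 else j)


# A buffer stamped 400 ms that shows up at clock time 390 ms or 430 ms.
early = jitter(clock_time=390, buffer_ts=400)
late = jitter(clock_time=430, buffer_ts=400)
```

For the early buffer the leave time equals the timestamp, while for the late buffer the leave time equals the arrival time.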

Using a previous T0 and a new T1, we can calculate the time it took for
upstream to generate a buffer with timestamp B1.

```
    PT1 = T1 - T0                                (3)
```

We call PT1 the processing time needed to generate the buffer with
timestamp B1.

Moreover, given the duration of the buffer D1, the current data rate
(DR1) of the upstream element is given as:

```
      PT1   T1 - T0
DR1 = --- = -------                           (4)
      D1      D1
```

For values `0.0 < DR1 <= 1.0` the upstream element keeps up with
real-time, and for `DR1 < 1.0` it produces data faster than real-time.
If DR1 is exactly 1.0, the element is running at a perfect speed.

Values `DR1 > 1.0` mean that the upstream element cannot produce buffers
of duration D1 in real-time. DR1 is exactly the speedup factor we
require from upstream to regain real-time performance.
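Equations (3) and (4) translate directly into code. The sketch below is a plain Python illustration with a hypothetical function name; any consistent time unit works.

```python
def data_rate(t0: int, t1: int, duration: int) -> float:
    """Equation (4): DR1 = PT1 / D1, with PT1 = T1 - T0 from (3)."""
    if duration <= 0:
        raise ValueError("buffer duration must be positive")
    processing_time = t1 - t0
    return processing_time / duration


# 40 ms frames: one took 60 ms to arrive, another only 20 ms.
too_slow = data_rate(t0=0, t1=60, duration=40)
fast_enough = data_rate(t0=0, t1=20, duration=40)
```

Here `too_slow` is above 1.0, so upstream would have to speed up by that factor, while `fast_enough` is below 1.0.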

An element that is not receiving enough data is said to be underflowed.

## Element measurements

In addition to the measurements of the datarate of the upstream element,
a typical element must also measure its own performance. Global pipeline
performance problems can indeed also be caused by the element itself
when it receives too much data it cannot process in time. The element is
then said to be overflowed.

## Short term correction

The timestamp and jitter serve as short term correction information for
upstream elements. Given the arrival time T1 from (1), we can be certain
that buffers with a timestamp `B2 < T1` will be too late in the sink.

In case of a positive jitter we can therefore send a QoS event with a
timestamp B1, jitter J1 and proportion given by (4).

This allows an upstream element to not generate any data with timestamps
`B2 < T1`, where the element can derive T1 as `B1 + J1`.

This will effectively result in frame drops.

The element can even do a better estimation of the next valid timestamp
it should output.

Suppose the element generated a buffer with timestamp B0 that arrived in
time in the sink, but then received a QoS event stating that B1 arrived
J1 too late. This means generating B1 took
`(B1 + J1) - B0 = T1 - T0 = PT1`, as given in (3). Given the buffer B1
had a duration D1 and assuming that generating a new buffer B2 will take
the same amount of processing time, a better estimation for B2 would
then be:

```
    B2 = T1 + D2 * DR1
```

expanding gives:

```
    B2 = (B1 + J1) + D2 * (B1 + J1 - B0)
                          --------------
                               D1
```

assuming the durations of the frames are equal and thus D1 = D2:

```
    B2 = (B1 + J1) + (B1 + J1 - B0)

    B2 =  2 * (B1 + J1) - B0
```

also:

```
    B0 = B1 - D1
```

so:

```
    B2 =  2 * (B1 + J1) - (B1 - D1)
```

Which yields a more accurate prediction for the next buffer given as:

```
    B2 =  B1 + 2 * J1 + D1                          (5)
```
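The derivation can be checked numerically. The sketch below, in plain Python with hypothetical function names, computes the general form `B2 = T1 + D2 * DR1` and the simplified form (5), and relies on the same assumptions as the text: B0 arrived in time (so T0 = B0), D1 = D2 and B0 = B1 - D1.

```python
def predict_general(b0: int, b1: int, j1: int, d1: int, d2: int) -> float:
    """B2 = T1 + D2 * DR1, with T1 = B1 + J1 and DR1 = (T1 - B0) / D1."""
    t1 = b1 + j1
    dr1 = (t1 - b0) / d1
    return t1 + d2 * dr1


def predict_next(b1: int, j1: int, d1: int) -> int:
    """Equation (5): B2 = B1 + 2 * J1 + D1."""
    return b1 + 2 * j1 + d1


# 40 ms frames; the frame stamped 400 ms arrived 30 ms late.
general = predict_general(b0=360, b1=400, j1=30, d1=40, d2=40)
simplified = predict_next(b1=400, j1=30, d1=40)
```

Both forms give the same timestamp for these inputs, which is later than the plain `B1 + J1` lower bound because it also accounts for the time needed to produce the next buffer.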

## Long term correction

The datarate used to calculate (5) for the short term prediction is
based on a single observation. A more accurate datarate can be obtained
by creating a running average over multiple datarate observations.

This average is less susceptible to sudden changes that would only
influence the datarate for a very short period.

A running average is calculated over the observations given in (4) and
is used as the proportion member in the QoS event that is sent upstream.

Receivers of the QoS event should permanently reduce their datarate as
given by the proportion member. Failure to do so will certainly lead to
more dropped frames and a generally worse QoS.
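The text does not prescribe a particular averaging method. One simple option is a weighted running average in which each new observation contributes `1/window` of the result. The sketch below assumes that scheme; the function name and the window size of 8 are illustrative choices, not values taken from this document.

```python
from typing import Optional


def update_running_avg(avg: Optional[float], sample: float,
                       window: int = 8) -> float:
    """Blend a new datarate observation into the running average."""
    if window < 1:
        raise ValueError("window must be at least 1")
    if avg is None:
        # First observation: nothing to average against yet.
        return sample
    return (sample + (window - 1) * avg) / window


# Datarate observations from (4); the 3.0 is a short spike.
proportion: Optional[float] = None
for dr in (1.0, 1.0, 3.0, 1.0):
    proportion = update_running_avg(proportion, dr)
```

The single spike moves the average only slightly, which is the property the long term correction relies on.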

## Throttling

In throttle mode, the time distance between buffers is kept to a
configurable throttle interval. This means that effectively the buffer
rate is limited to 1 buffer per throttle interval. This can be used to
limit the framerate, for example.

When an element is configured in throttling mode (this is usually only
implemented on sinks) it should produce QoS events upstream with the
jitter field set to the throttle interval. This should instruct upstream
elements to skip or drop the remaining buffers in the configured
throttle interval.

The proportion field is set to the desired slowdown needed to get the
desired throttle interval. Implementations can use the QoS Throttle
type, the proportion and the jitter member to tune their
implementations.
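The effect of throttling on a stream can be simulated in a few lines. The sketch below is a simplified model, not GStreamer code: it assumes that each buffer that gets through produces a QoS event whose timestamp is the buffer time and whose jitter is the throttle interval, and that upstream drops every buffer earlier than `timestamp + jitter`. The function name is hypothetical.

```python
from typing import Iterable, List


def throttle_filter(buffer_times: Iterable[int], interval: int) -> List[int]:
    """Return the buffer times that survive a given throttle interval."""
    kept: List[int] = []
    next_allowed = None  # timestamp + jitter of the latest QoS event
    for ts in buffer_times:
        if next_allowed is None or ts >= next_allowed:
            kept.append(ts)
            next_allowed = ts + interval
        # Otherwise the buffer falls inside the interval and is dropped.
    return kept


# 25 fps input (one frame every 40 ms) throttled to a 100 ms interval.
frames = list(range(0, 400, 40))
kept = throttle_filter(frames, interval=100)
```

Because the input frames are 40 ms apart, the surviving frames end up 120 ms apart, the first frame boundary at or after each 100 ms interval.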

## QoS strategies

Several strategies exist to reduce processing delay that might affect
real time performance.

  - lowering quality

      - dropping frames (reduce CPU/bandwidth usage)

      - switch to a lower decoding/encoding quality (reduce algorithmic
        complexity)

      - switch to a lower quality source (reduce network usage)

  - increasing thread priorities

      - switch to real-time scheduling

      - assign more CPU cycles to critical pipeline parts

      - assign more CPU(s) to critical pipeline parts

## QoS implementations

Here follows a small overview of how QoS can be implemented in a range
of different types of elements.

### GstBaseSink

The primary implementor of QoS is GstBaseSink. It will calculate the
following values:

  - upstream running average of processing time (3) in stream time.

  - running average of buffer durations.

  - running average of render time (in system time)

  - rendered/dropped buffers

The processing time and the average buffer durations will be used to
calculate a proportion.

The processing time in system time is compared to render time to decide
if the majority of the time is spent upstream or in the sink itself.
This value is used to decide overflow or underflow.

The number of rendered and dropped buffers is used to query stats on the
sink.

A QoS event with the most current values is sent upstream for each
buffer that was received by the sink.

Normally QoS is only enabled for video pipelines, because drops in audio
are more disturbing than dropped video frames. Video also generally
requires more processing than audio.

Normally there is a threshold for when buffers get dropped in a video
sink. Frames that arrive up to 20 milliseconds late are still rendered,
as this is not noticeable to the human eye.

A QoS message is posted whenever a (part of a) buffer is dropped.

In throttle mode, the sink sends a QoS event upstream with the timestamp
set to the `running_time` of the latest buffer and the jitter set to the
throttle interval. If the throttled buffer is late, the lateness is
subtracted from the throttle interval in order to keep the desired
throttle interval.
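The jitter value for a throttle event could be computed as in the sketch below. It is plain Python with a hypothetical function name. Clamping the result at zero when the buffer is later than a full interval is an assumption made here, not something this document specifies.

```python
def throttle_event_jitter(throttle_interval: int, lateness: int) -> int:
    """Jitter field for a throttle QoS event.

    `lateness` is the jitter measured for the buffer: positive when it
    was late, negative or zero when it was on time.
    """
    late_by = max(lateness, 0)  # on-time buffers need no correction
    # Assumption: never report a negative remaining interval.
    return max(throttle_interval - late_by, 0)


on_time = throttle_event_jitter(100, lateness=-5)
slightly_late = throttle_event_jitter(100, lateness=30)
very_late = throttle_event_jitter(100, lateness=150)
```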

### GstBaseTransform

Transform elements can entirely skip the transform for buffers that,
based on the timestamp and jitter values of a recent QoS event, will
certainly arrive too late.
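A minimal sketch of this check, following the short term correction rule that buffers with a timestamp before `B1 + J1` will be too late. The `QosTracker` class and its method names are hypothetical and do not mirror the actual GstBaseTransform implementation.

```python
from typing import Optional


class QosTracker:
    """Remembers the latest QoS event and decides which buffers to skip."""

    def __init__(self) -> None:
        self.earliest_time: Optional[int] = None

    def on_qos_event(self, timestamp: int, jitter: int) -> None:
        # T1 = B1 + J1: anything stamped before this arrives too late.
        self.earliest_time = timestamp + jitter

    def should_skip(self, running_time: int) -> bool:
        if self.earliest_time is None:
            return False  # no QoS information yet
        return running_time < self.earliest_time


tracker = QosTracker()
tracker.on_qos_event(timestamp=400, jitter=30)
skip_420 = tracker.should_skip(420)
skip_440 = tracker.should_skip(440)
```

With a negative jitter the computed time lies before the reported timestamp, so later buffers are never skipped.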

As with any intermediate element, the element should measure its own
performance to decide whether it is responsible for the quality problems
or whether an upstream or downstream element is.

Some transforms can reduce the complexity of their algorithms. Depending
on the algorithm, the changes in quality may have disturbing visual or
audible effects that should be avoided.

A QoS message should be posted when a frame is dropped or when the
quality of the filter is reduced. The quality member in the QoS message
should reflect the quality setting of the filter.

### Video Decoders

A video decoder can, based on the codec in use, decide to not decode
intermediate frames. A typical codec can for example skip the decoding
of B-frames to reduce the CPU usage and framerate.

If each frame is independently decodable, any arbitrary frame can be
skipped based on the timestamp and jitter values of the latest QoS
event. In addition, the proportion member can be used to permanently
skip frames.

It is suggested to adjust the quality field of the QoS message with the
expected amount of dropped frames (skipping B and/or P frames). This
depends on the particular spacing of B and P frames in the stream. If
the quality control would result in half of the frames being dropped
(typical B frame skipping), the quality field would be set to
`1000000 * 1/2 = 500000`. If a typical I frame spacing of 18 frames is
used, skipping B and P frames would result in 17 dropped frames or 1
decoded frame every 18 frames. The quality member should be set to
`1000000 * 1/18 = 55555`.

  - skipping B frames: quality = 500000

  - skipping P/B frames: quality = 55555 (for I-frame spacing of 18
    frames)
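The two values above follow from scaling the fraction of decoded frames to the default maximum quality. A small sketch of that arithmetic, with a hypothetical helper name and integer truncation assumed:

```python
MAX_QUALITY = 1_000_000  # default maximum quality of the QoS message


def quality_for(decoded: int, total: int) -> int:
    """Quality value when `decoded` out of every `total` frames are kept."""
    if total <= 0 or not 0 <= decoded <= total:
        raise ValueError("need 0 <= decoded <= total and total > 0")
    return MAX_QUALITY * decoded // total


skip_b = quality_for(decoded=1, total=2)     # every second frame kept
skip_pb = quality_for(decoded=1, total=18)   # only I frames, spacing 18
```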

### Demuxers

Demuxers usually cannot do a lot regarding QoS except for skipping
frames to the next keyframe when a lateness QoS event arrives on a
source pad.

A demuxer can however measure if the performance problems are upstream
or downstream and forward an updated QoS event upstream.

Most demuxers that have multiple output pads might need to combine the
QoS events on all the pads and derive an aggregated QoS event for the
upstream element.

### Sources

The QoS events only apply to push based sources since pull based sources
are entirely controlled by another downstream element.

Sources can receive an overflow or underflow event that can be used to
switch to less demanding source material. In case of a network stream, a
switch could be done to a lower or higher quality stream or additional
enhancement layers could be used or ignored.

Live sources will automatically drop data when it takes too long to
process the data that the element pushes out.

Live sources should post a QoS message when data is dropped.