mirror of
https://gitlab.freedesktop.org/gstreamer/gstreamer.git
synced 2025-01-07 07:55:41 +00:00
# Quality-of-Service

Quality of service is about measuring and adjusting the real-time
performance of a pipeline.

The real-time performance is always measured relative to the pipeline
clock and typically happens in the sinks when they synchronize buffers
against the clock.

The measurements result in QoS events that aim to adjust the datarate in
one or more upstream elements. Two types of adjustments can be made:

- short-term "emergency" corrections based on the latest observation in
  the sinks.

- long-term rate corrections based on trends observed in the sinks.

It is also possible for the application to artificially introduce delay
between synchronized buffers; this is called throttling. It can be used
to reduce the framerate, for example.

## Sources of quality problems

- High CPU load

- Network problems

- Other resource problems such as disk load, memory bottlenecks, etc.

- Application-level throttling

## QoS event

The QoS event is generated by an element that synchronizes against the
clock. It travels upstream and contains the following fields:

* **`type`**: `GST_TYPE_QOS_TYPE`: The type of the QoS event. The
  following types exist and the default type is `GST_QOS_TYPE_UNDERFLOW`:

    * `GST_QOS_TYPE_OVERFLOW`: an element is receiving buffers too fast and can't
      keep up processing them. Upstream should reduce the rate.

    * `GST_QOS_TYPE_UNDERFLOW`: an element is receiving buffers too slowly
      and has to drop them because they are too late. Upstream should
      increase the processing rate.

    * `GST_QOS_TYPE_THROTTLE`: the application is asking to add extra delay
      between buffers; upstream is allowed to drop buffers.

* **`timestamp`**: `G_TYPE_UINT64`: The timestamp on the buffer that
  generated the QoS event. These timestamps are expressed in total
  `running_time` in the sink so that the value is ever increasing.

* **`jitter`**: `G_TYPE_INT64`: The difference of that timestamp against the
  current clock time. Negative values mean the timestamp was on time.
  Positive values indicate the timestamp was late by that amount. When
  buffers are received in time and throttling is not enabled, the QoS
  type field is set to OVERFLOW. When throttling, the jitter contains
  the throttling delay added by the application and the type is set to
  THROTTLE.

* **`proportion`**: `G_TYPE_DOUBLE`: Long term prediction of the ideal rate
  relative to normal rate to get optimal quality.

The rest of this document deals with how these values can be calculated
in a sink and how the values can be used by other elements to adjust
their operations.

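As a rough illustration, the four event fields and the type selection rules described above can be modelled in plain Python. This is only a sketch: `QosType`, `QosEvent` and `make_qos_event` are hypothetical names and not GStreamer API, times are assumed to be integer nanoseconds, and treating a jitter of exactly zero as late is an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class QosType(Enum):
    # Illustrative stand-ins for the GST_QOS_TYPE_* values.
    OVERFLOW = "overflow"
    UNDERFLOW = "underflow"
    THROTTLE = "throttle"


@dataclass(frozen=True)
class QosEvent:
    type: QosType
    timestamp: int     # running_time of the buffer, in nanoseconds
    jitter: int        # signed nanoseconds; negative means on time
    proportion: float  # ideal rate relative to the normal rate


def make_qos_event(timestamp, jitter, proportion, throttle_delay=0):
    """Pick the event type from the jitter and the throttle state."""
    if throttle_delay > 0:
        # When throttling, the jitter field carries the throttle delay.
        return QosEvent(QosType.THROTTLE, timestamp, throttle_delay, proportion)
    if jitter < 0:
        # Buffer was on time and throttling is not enabled.
        return QosEvent(QosType.OVERFLOW, timestamp, jitter, proportion)
    # Late buffer (zero counted as late, by assumption): the default type.
    return QosEvent(QosType.UNDERFLOW, timestamp, jitter, proportion)
```

For instance, `make_qos_event(1_000_000_000, 5_000_000, 1.2)` describes a buffer that was 5 ms late and yields an UNDERFLOW event.
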
## QoS message

A QoS message is posted on the bus whenever an element decides to:

- drop a buffer because of QoS reasons

- change its processing strategy because of QoS reasons (quality)

It should be expected that creating and posting the QoS message is
reasonably fast and does not significantly contribute to the QoS
problems. Options to disable this feature could also be presented on
elements.

This message can be posted by a sink/src that performs synchronisation
against the clock (live) or it could be posted by an upstream element
that performs QoS because of QoS events received from a downstream
element (!live).

The `GST_MESSAGE_QOS` contains at least the following info:

* **`live`**: `G_TYPE_BOOLEAN`: Whether the QoS message was generated by a live
  element such as a sink or a live source. If the live property is
  FALSE, the QoS message was generated as a response to a QoS event in
  a non-live element.

* **`running-time`**: `G_TYPE_UINT64`: The `running_time` of the buffer that
  generated the QoS message.

* **`stream-time`**: `G_TYPE_UINT64`: The `stream_time` of the buffer that
  generated the QoS message.

* **`timestamp`**: `G_TYPE_UINT64`: The timestamp of the buffer that
  generated the QoS message.

* **`duration`**: `G_TYPE_UINT64`: The duration of the buffer that generated
  the QoS message.

* **`jitter`**: `G_TYPE_INT64`: The difference of the running-time against
  the deadline. Negative values mean the timestamp was on time.
  Positive values indicate the timestamp was late (and dropped) by
  that amount. The deadline can be a realtime `running_time` or an
  estimated `running_time`.

* **`proportion`**: `G_TYPE_DOUBLE`: Long term prediction of the ideal rate
  relative to normal rate to get optimal quality.

* **`quality`**: `G_TYPE_INT`: An element dependent integer value that
  specifies the current quality level of the element. The default
  maximum quality is 1000000.

* **`format`**: `GST_TYPE_FORMAT`: Units of the *processed* and *dropped*
  fields. Video sinks and video filters will use `GST_FORMAT_BUFFERS`
  (frames). Audio sinks and audio filters will likely use
  `GST_FORMAT_DEFAULT` (samples).

* **`processed`**: `G_TYPE_UINT64`: Total number of units correctly
  processed since the last state change to READY or a flushing
  operation.

* **`dropped`**: `G_TYPE_UINT64`: Total number of units dropped since the
  last state change to READY or a flushing operation.

The *running-time* and *processed* fields can be used to estimate the
average processing rate (framerate for video).

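A minimal sketch of that estimate, assuming the running time is given in nanoseconds and started at zero when the counters were last reset. The function names are hypothetical, and the drop ratio is an extra illustration built from the *dropped* field rather than something the message provides directly.

```python
GST_SECOND = 1_000_000_000  # nanoseconds per second


def average_rate(processed, running_time_ns):
    """Units processed per second of running time (frames/s for video)."""
    if running_time_ns <= 0:
        return 0.0
    return processed * GST_SECOND / running_time_ns


def drop_ratio(processed, dropped):
    """Fraction of all units seen so far that were dropped."""
    total = processed + dropped
    return dropped / total if total else 0.0
```

With 250 processed frames after 10 seconds of running time, `average_rate(250, 10 * GST_SECOND)` gives 25.0 frames per second.
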
Elements might add additional fields in the message which are documented
in the relevant elements or baseclasses.

## Collecting statistics

A buffer with timestamp B1 arrives in the sink at time T1. The buffer
timestamp is then synchronized against the clock which yields a jitter
J1 return value from the clock. The jitter J1 is simply calculated as

    J1 = CT - B1

Where CT is the clock time when the entry arrives in the sink. This
value is calculated inside the clock when we perform
`gst_clock_id_wait()`.

If the jitter is negative, the entry arrived in time and can be rendered
after waiting for the clock to reach time B1 (which is also CT - J1).

If the jitter is positive however, the entry arrived too late in the
sink and should therefore be dropped. J1 is the amount of time the entry
was late.

Any buffer that arrives in the sink should generate a QoS event
upstream.

Using the jitter we can calculate the time when the buffer arrived in
the sink:

```
T1 = B1 + J1                                (1)
```

The time the buffer leaves the sink after synchronisation is measured
as:

```
T2 = B1 + (J1 < 0 ? 0 : J1)                 (2)
```

For buffers that arrive in time (J1 \< 0) the buffer leaves after
synchronisation which is exactly B1. Late buffers (J1 \>= 0) leave the
sink when they arrive, without any synchronisation, which is `T2 = T1 =
B1 + J1`.

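The jitter and equations (1) and (2) translate directly into code. The sketch below uses hypothetical function names and plain integers in any consistent time unit; it does not call the clock and only mirrors the arithmetic.

```python
def jitter(clock_time, timestamp):
    # J1 = CT - B1
    return clock_time - timestamp


def arrival_time(timestamp, j):
    # (1) T1 = B1 + J1
    return timestamp + j


def leave_time(timestamp, j):
    # (2) T2 = B1 + (J1 < 0 ? 0 : J1)
    return timestamp + max(j, 0)
```

For a buffer with timestamp 1000 that arrives at clock time 990, the jitter is -10, so it arrived at 990 and leaves the sink at 1000. If it arrives at 1015 instead, the jitter is 15 and both the arrival and leave time are 1015.
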
Using a previous T0 and a new T1, we can calculate the time it took for
upstream to generate a buffer with timestamp B1.

```
PT1 = T1 - T0                               (3)
```

We call PT1 the processing time needed to generate a buffer with timestamp
B1.

Moreover, given the duration of the buffer D1, the current data rate
(DR1) of the upstream element is given as:

```
      PT1   T1 - T0
DR1 = --- = -------                         (4)
      D1      D1
```

For values 0.0 \< DR1 \<= 1.0 the upstream element is producing at least
as fast as real-time. If DR1 is exactly 1.0, the element is running at a
perfect speed.

Values DR1 \> 1.0 mean that the upstream element cannot produce buffers
of duration D1 in real-time. It is exactly DR1 that tells the amount of
speedup we require from upstream to regain real-time performance.

An element that is not receiving enough data is said to be underflowed.

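Equations (3) and (4) can be sketched as follows. The function names are hypothetical and all times are assumed to share one unit; rejecting non-positive durations is a choice made for this example.

```python
def processing_time(t0, t1):
    # (3) PT1 = T1 - T0
    return t1 - t0


def data_rate(t0, t1, duration):
    # (4) DR1 = PT1 / D1
    if duration <= 0:
        raise ValueError("buffer duration must be positive")
    return processing_time(t0, t1) / duration
```

With 40 ms frames, two arrivals 50 ms apart give `data_rate(1000, 1050, 40) == 1.25`, meaning upstream needs to become 1.25 times faster. Arrivals 30 ms apart give 0.75, which is faster than real-time.
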
## Element measurements

In addition to the measurements of the datarate of the upstream element,
a typical element must also measure its own performance. Global pipeline
performance problems can indeed also be caused by the element itself
when it receives too much data it cannot process in time. The element is
then said to be overflowed.

## Short term correction

The timestamp and jitter serve as short term correction information for
upstream elements. Indeed, given arrival time T1 as given in (1) we can
be certain that buffers with a timestamp B2 \< T1 will be too late in
the sink.

In case of a positive jitter we can therefore send a QoS event with a
timestamp B1, jitter J1 and proportion given by (4).

This allows an upstream element to not generate any data with timestamps
B2 \< T1, where the element can derive T1 as B1 + J1.

This will effectively result in frame drops.

The element can even do a better estimation of the next valid timestamp
it should output.

Indeed, suppose the element generated a buffer with timestamp B0 that
arrived in time in the sink but then received a QoS event stating B1
arrived J1 too late. This means generating B1 took (B1 + J1) - B0 = T1 -
T0 = PT1, as given in (3). Given the buffer B1 had a duration D1 and
assuming that generating a new buffer B2 will take the same amount of
processing time, a better estimation for B2 would then be:

```
B2 = T1 + D2 * DR1
```

expanding gives:

```
B2 = (B1 + J1) + D2 * (B1 + J1 - B0)
                      --------------
                            D1
```

assuming the durations of the frames are equal and thus D1 = D2:

```
B2 = (B1 + J1) + (B1 + J1 - B0)

B2 = 2 * (B1 + J1) - B0
```

also:

```
B0 = B1 - D1
```

so:

```
B2 = 2 * (B1 + J1) - (B1 - D1)
```

Which yields a more accurate prediction for the next buffer given as:

```
B2 = B1 + 2 * J1 + D1                       (5)
```

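Both estimates can be sketched in a few lines. The function names are hypothetical; the second function relies on the same assumptions as the derivation above, namely equal frame durations and B0 = B1 - D1.

```python
def next_timestamp_simple(b1, j1):
    # Anything with a timestamp before T1 = B1 + J1 is already too late.
    return b1 + j1


def next_timestamp_predicted(b1, j1, d1):
    # (5) B2 = B1 + 2 * J1 + D1
    return b1 + 2 * j1 + d1
```

For B1 = 1000, J1 = 15 and D1 = 40 the simple bound is 1015 while (5) gives 1070. This agrees with the unexpanded form: T1 = 1015, PT1 = 1015 - 960 = 55, DR1 = 55 / 40, so B2 = 1015 + 40 * 55 / 40 = 1070.
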
## Long term correction

The datarate used to calculate (5) for the short term prediction is
based on a single observation. A more accurate datarate can be obtained
by creating a running average over multiple datarate observations.

This average is less susceptible to sudden changes that would only
influence the datarate for a very short period.

A running average is calculated over the observations given in (4) and
is used as the proportion member in the QoS event that is sent upstream.

Receivers of the QoS event should permanently reduce their datarate as
given by the proportion member. Failure to do so will certainly lead to
more dropped frames and a generally worse QoS.

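The text does not prescribe a particular averaging scheme, so the following is only one possible sketch: a simple weighted running average in which each new observation contributes `1 / window` of the result. The function names and the default `window` of 8 are assumptions made for illustration.

```python
def update_running_average(average, observation, window=8):
    """Blend a new datarate observation (4) into the running average."""
    if average is None:
        # First observation: nothing to blend with yet.
        return observation
    return (observation + (window - 1) * average) / window


def proportion_from_observations(observations, window=8):
    """Fold a sequence of datarate observations into one proportion."""
    average = None
    for datarate in observations:
        average = update_running_average(average, datarate, window)
    return average
```

A single spike is damped as intended: after the observations 1.0, 1.0 and 3.0 the average is 1.25 rather than 3.0.
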
## Throttling

In throttle mode, the time distance between buffers is kept to a
configurable throttle interval. This means that effectively the buffer
rate is limited to 1 buffer per throttle interval. This can be used to
limit the framerate, for example.

When an element is configured in throttling mode (this is usually only
implemented on sinks) it should produce QoS events upstream with the
jitter field set to the throttle interval. This should instruct upstream
elements to skip or drop the remaining buffers in the configured
throttle interval.

The proportion field is set to the desired slowdown needed to get the
desired throttle interval. Implementations can use the QoS Throttle
type, the proportion and the jitter member to tune their
implementations.

## QoS strategies

Several strategies exist to reduce processing delay that might affect
real-time performance.

- lowering quality

- dropping frames (reduce CPU/bandwidth usage)

- switch to a lower decoding/encoding quality (reduce algorithmic
  complexity)

- switch to a lower quality source (reduce network usage)

- increasing thread priorities

- switch to real-time scheduling

- assign more CPU cycles to critical pipeline parts

- assign more CPU(s) to critical pipeline parts

## QoS implementations

Here follows a small overview of how QoS can be implemented in a range
of different types of elements.

### GstBaseSink

The primary implementor of QoS is GstBaseSink. It will calculate the
following values:

- upstream running average of processing time (3) in stream time.

- running average of buffer durations.

- running average of render time (in system time)

- rendered/dropped buffers

The processing time and the average buffer durations will be used to
calculate a proportion.

The processing time in system time is compared to render time to decide
if the majority of the time is spent upstream or in the sink itself.
This value is used to decide overflow or underflow.

The number of rendered and dropped buffers is used to query stats on the
sink.

A QoS event with the most current values is sent upstream for each
buffer that was received by the sink.

Normally QoS is only enabled for video pipelines. The reason being that
drops in audio are more disturbing than dropping video frames. Also
video requires in general more processing than audio.

Normally there is a threshold for when buffers get dropped in a video
sink. Frames that arrive up to 20 milliseconds late are still rendered as it
is not noticeable for the human eye.

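The drop threshold can be sketched as a single comparison on the jitter. The function and parameter names are hypothetical, times are assumed to be nanoseconds, and rendering a frame that is exactly at the threshold is an assumption.

```python
GST_MSECOND = 1_000_000  # nanoseconds per millisecond


def should_drop(jitter_ns, max_lateness_ns=20 * GST_MSECOND):
    """Return True when a buffer is late by more than the threshold."""
    return jitter_ns > max_lateness_ns
```

A frame that is 15 ms late is still rendered, while one that is 25 ms late is dropped.
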
A QoS message is posted whenever a (part of a) buffer is dropped.

In throttle mode, the sink sends a QoS event upstream with the timestamp
set to the `running_time` of the latest buffer and the jitter set to the
throttle interval. If the throttled buffer is late, the lateness is
subtracted from the throttle interval in order to keep the desired
throttle interval.

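A minimal sketch of the throttle jitter computation, using a hypothetical function name. Clamping the result at zero when the buffer is later than a whole throttle interval is an assumption; the text does not say what happens in that case.

```python
def throttle_jitter(throttle_interval, lateness):
    """Jitter value to put in a throttle QoS event."""
    # Only positive lateness shortens the interval; early buffers do not extend it.
    late_by = max(lateness, 0)
    return max(throttle_interval - late_by, 0)
```

With a throttle interval of 100 ms, an on-time buffer yields a jitter of 100 ms and a buffer that is 30 ms late yields 70 ms.
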
### GstBaseTransform

Transform elements can entirely skip the transform based on the
timestamp and jitter values of a recent QoS event since these buffers will
certainly arrive too late.

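One way to picture this skip decision is a small tracker that remembers the earliest running time still worth processing, derived as timestamp + jitter from the latest QoS event. This is a sketch with hypothetical names and is not the GstBaseTransform implementation; using a strict comparison follows the B2 \< T1 condition from the short term correction.

```python
class QosTracker:
    """Remembers the earliest running time still worth transforming."""

    def __init__(self):
        self.earliest_time = None  # no QoS event received yet

    def on_qos_event(self, timestamp, jitter):
        # T1 = timestamp + jitter; with a negative jitter this lies in the
        # past of the stream, so later buffers are not skipped.
        self.earliest_time = timestamp + jitter

    def should_skip(self, running_time):
        if self.earliest_time is None:
            return False
        return running_time < self.earliest_time
```

After `on_qos_event(1000, 50)` a buffer with running time 1040 is skipped, while one with running time 1080 is transformed.
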
With any intermediate element, the element should measure its
performance to decide if it is responsible for the quality problems or
whether an upstream/downstream element is.

Some transforms can reduce the complexity of their algorithms. Depending
on the algorithm, the changes in quality may have disturbing visual or
audible effects that should be avoided.

A QoS message should be posted when a frame is dropped or when the
quality of the filter is reduced. The quality member in the QoS message
should reflect the quality setting of the filter.

### Video Decoders

A video decoder can, based on the codec in use, decide to not decode
intermediate frames. A typical codec can for example skip the decoding
of B-frames to reduce the CPU usage and framerate.

If each frame is independently decodable, any arbitrary frame can be
skipped based on the timestamp and jitter values of the latest QoS
event. In addition, the proportion member can be used to permanently skip
frames.

It is suggested to adjust the quality field of the QoS message with the
expected amount of dropped frames (skipping B and/or P frames). This
depends on the particular spacing of B and P frames in the stream. If
the quality control would result in half of the frames being dropped
(typical B frame skipping), the quality field would be set to `1000000 *
1/2 = 500000`. If a typical I frame spacing of 18 frames is used,
skipping B and P frames would result in 17 dropped frames or 1 decoded
frame every 18 frames. The quality member should be set to `1000000 *
1/18 = 55555`.

- skipping B frames: quality = 500000

- skipping P/B frames: quality = 55555 (for I-frame spacing of 18
  frames)

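The quality values above follow from scaling the maximum quality by the fraction of frames that are still decoded. A small sketch with a hypothetical function name, rounding down to match the 55555 figure:

```python
MAX_QUALITY = 1_000_000  # default maximum quality of the QoS message


def quality_level(decoded_frames, total_frames):
    """Scale the maximum quality by the fraction of frames still decoded."""
    if total_frames <= 0 or not 0 <= decoded_frames <= total_frames:
        raise ValueError("need 0 <= decoded_frames <= total_frames, total_frames > 0")
    # Integer division rounds down, e.g. 1000000 / 18 becomes 55555.
    return MAX_QUALITY * decoded_frames // total_frames
```

`quality_level(1, 2)` gives 500000 for B-frame skipping and `quality_level(1, 18)` gives 55555 for an I-frame spacing of 18 frames.
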
### Demuxers

Demuxers usually cannot do a lot regarding QoS except for skipping
frames to the next keyframe when a lateness QoS event arrives on a
source pad.

A demuxer can however measure if the performance problems are upstream
or downstream and forward an updated QoS event upstream.

Most demuxers that have multiple output pads might need to combine the
QoS events on all the pads and derive an aggregated QoS event for the
upstream element.

### Sources

The QoS events only apply to push based sources since pull based sources
are entirely controlled by another downstream element.

Sources can receive an overflow or underflow event that can be used to
switch to less demanding source material. In case of a network stream, a
switch could be done to a lower or higher quality stream or additional
enhancement layers could be used or ignored.

Live sources will automatically drop data when it takes too long to
process the data that the element pushes out.

Live sources should post a QoS message when data is dropped.