mirror of
https://gitlab.freedesktop.org/gstreamer/gstreamer.git
synced 2024-11-25 11:11:08 +00:00
Quality-of-Service
------------------

Quality of service is about measuring and adjusting the real-time
performance of a pipeline.

The real-time performance is always measured relative to the pipeline
clock and typically happens in the sinks when they synchronize buffers
against the clock.

The measurements result in QoS events that aim to adjust the datarate
in one or more upstream elements. Two types of adjustments can be
made:

- short-term "emergency" corrections based on the latest observation
  in the sinks.
- long-term rate corrections based on trends observed in the sinks.

It is also possible for the application to artificially introduce delay
between synchronized buffers; this is called throttling. It can be used
to reduce the framerate, for example.


Sources of quality problems
~~~~~~~~~~~~~~~~~~~~~~~~~~~

- High CPU load
- Network problems
- Other resource problems such as disk load, memory bottlenecks etc.
- Application-level throttling


QoS event
~~~~~~~~~

The QoS event is generated by an element that synchronizes against the clock. It
travels upstream and contains the following fields:

- type, GST_TYPE_QOS_TYPE:
  The type of the QoS event. The following types exist; the default type
  is GST_QOS_TYPE_UNDERFLOW:

    GST_QOS_TYPE_OVERFLOW: an element is receiving buffers too fast and can't
                           keep up processing them. Upstream should reduce the
                           rate.
    GST_QOS_TYPE_UNDERFLOW: an element is receiving buffers too slowly and has
                           to drop them because they are too late. Upstream should
                           increase the processing rate.
    GST_QOS_TYPE_THROTTLE: the application is asking to add extra delay between
                           buffers; upstream is allowed to drop buffers.

- timestamp, G_TYPE_UINT64:
  The timestamp on the buffer that generated the QoS event. These timestamps
  are expressed in total running_time in the sink so that the value is ever
  increasing.

- jitter, G_TYPE_INT64:
  The difference of that timestamp against the current clock time. Negative
  values mean the timestamp was on time. Positive values indicate the
  timestamp was late by that amount. When buffers are received in time and
  throttling is not enabled, the QoS type field is set to OVERFLOW.
  When throttling, the jitter contains the throttling delay added by the
  application and the type is set to THROTTLE.

- proportion, G_TYPE_DOUBLE:
  Long term prediction of the ideal rate relative to normal rate to get
  optimal quality.

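To illustrate how the four fields fit together, the following Python sketch
models the event payload and the type selection described in the field list.
The names QosType, QosEvent and make_qos_event are hypothetical and are not part
of the GStreamer API. Reporting late buffers outside throttle mode as UNDERFLOW
is an assumption derived from the UNDERFLOW description.

```python
from dataclasses import dataclass
from enum import Enum


class QosType(Enum):
    OVERFLOW = 0
    UNDERFLOW = 1
    THROTTLE = 2


@dataclass(frozen=True)
class QosEvent:
    type: QosType
    timestamp: int      # running_time of the buffer, in nanoseconds
    jitter: int         # signed nanoseconds; positive means the buffer was late
    proportion: float   # ideal rate relative to the normal rate


def make_qos_event(timestamp, jitter, proportion, throttle_interval=0):
    """Build the event a sink would send upstream for one buffer (sketch)."""
    if throttle_interval > 0:
        # When throttling, the jitter field carries the throttling delay.
        return QosEvent(QosType.THROTTLE, timestamp, throttle_interval, proportion)
    # In time and not throttling: OVERFLOW. Late: UNDERFLOW (assumption).
    qos_type = QosType.OVERFLOW if jitter < 0 else QosType.UNDERFLOW
    return QosEvent(qos_type, timestamp, jitter, proportion)
```

A buffer that arrives 5 ms early yields an OVERFLOW event with a negative
jitter, while the same call with a non-zero throttle_interval yields a THROTTLE
event whose jitter equals that interval.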
The rest of this document deals with how these values can be calculated
in a sink and how the values can be used by other elements to adjust their
operations.


QoS message
~~~~~~~~~~~

A QoS message is posted on the bus whenever an element decides to:

- drop a buffer because of QoS reasons
- change its processing strategy because of QoS reasons (quality)

Creating and posting the QoS message should be reasonably fast and should
not significantly contribute to the QoS problems. Options to disable this
feature could also be presented on elements.

This message can be posted by a sink/src that performs synchronisation against the
clock (live) or it could be posted by an upstream element that performs QoS
because of QoS events received from a downstream element (!live).

The GST_MESSAGE_QOS contains at least the following info:

- live, G_TYPE_BOOLEAN:
  TRUE if the QoS message was posted by a live element such as a sink or a live
  source. If the live property is FALSE, the QoS message was generated as a
  response to a QoS event in a non-live element.

- running-time, G_TYPE_UINT64:
  The running_time of the buffer that generated the QoS message.

- stream-time, G_TYPE_UINT64:
  The stream_time of the buffer that generated the QoS message.

- timestamp, G_TYPE_UINT64:
  The timestamp of the buffer that generated the QoS message.

- duration, G_TYPE_UINT64:
  The duration of the buffer that generated the QoS message.

- jitter, G_TYPE_INT64:
  The difference of the running-time against the deadline. Negative
  values mean the timestamp was on time. Positive values indicate the
  timestamp was late (and dropped) by that amount. The deadline can be
  a realtime running_time or an estimated running_time.

- proportion, G_TYPE_DOUBLE:
  Long term prediction of the ideal rate relative to normal rate to get
  optimal quality.

- quality, G_TYPE_INT:
  An element dependent integer value that specifies the current quality
  level of the element. The default maximum quality is 1000000.

- format, GST_TYPE_FORMAT:
  Units of the 'processed' and 'dropped' fields. Video sinks and video
  filters will use GST_FORMAT_BUFFERS (frames). Audio sinks and audio filters
  will likely use GST_FORMAT_DEFAULT (samples).

- processed, G_TYPE_UINT64:
  Total number of units correctly processed since the last state change to
  READY or a flushing operation.

- dropped, G_TYPE_UINT64:
  Total number of units dropped since the last state change to READY or a
  flushing operation.

The 'running-time' and 'processed' fields can be used to estimate the average
processing rate (framerate for video).

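The rate estimate mentioned above can be sketched as follows. This is a
minimal illustration, assuming running-time is expressed in nanoseconds and
that 'processed' counts frames; the function name average_rate is hypothetical.

```python
NSECONDS_PER_SECOND = 1_000_000_000


def average_rate(processed, running_time_ns):
    """Average number of processed units per second of running time."""
    if running_time_ns <= 0:
        # No time has elapsed yet, so no meaningful rate can be given.
        return 0.0
    return processed * NSECONDS_PER_SECOND / running_time_ns
```

For example, 250 frames processed after 10 seconds of running time give an
average of 25 frames per second.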
Elements might add additional fields in the message which are documented in the
relevant elements or baseclasses.


Collecting statistics
~~~~~~~~~~~~~~~~~~~~~

A buffer with timestamp B1 arrives in the sink at time T1. The buffer
timestamp is then synchronized against the clock, which returns a jitter
value J1. The jitter J1 is simply calculated as

  J1 = CT - B1

where CT is the clock time when the entry arrives in the sink. This value
is calculated inside the clock when we perform gst_clock_id_wait().

If the jitter is negative, the entry arrived in time and can be rendered
after waiting for the clock to reach time B1 (which is also CT - J1).

If the jitter is positive however, the entry arrived too late in the sink
and should therefore be dropped. J1 is the amount of time the entry was late.

Any buffer that arrives in the sink should generate a QoS event upstream.

Using the jitter we can calculate the time when the buffer arrived in the
sink:

  T1 = B1 + J1.                                (1)

The time the buffer leaves the sink after synchronisation is measured as:

  T2 = B1 + (J1 < 0 ? 0 : J1)                  (2)

For buffers that arrive in time (J1 < 0) the buffer leaves after synchronisation
which is exactly B1. Late buffers (J1 >= 0) leave the sink when they arrive,
without any synchronisation, which is T2 = T1 = B1 + J1.

Using a previous T0 and a new T1, we can calculate the time it took for
upstream to generate a buffer with timestamp B1.

  PT1 = T1 - T0                                (3)

We call PT1 the processing time needed to generate buffer with timestamp B1.

Moreover, given the duration of the buffer D1, the current data rate (DR1) of
the upstream element is given as:

        PT1   T1 - T0
  DR1 = --- = -------                          (4)
        D1      D1

For values 0.0 < DR1 <= 1.0 the upstream element keeps up with real-time:
below 1.0 it is producing faster than real-time, and at exactly 1.0 it is
running at a perfect speed.

Values DR1 > 1.0 mean that the upstream element cannot produce buffers of
duration D1 in real-time. DR1 is exactly the amount of speedup we require
from upstream to regain real-time performance.

An element that is not receiving enough data is said to be underflowed.

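Equations (1) to (4) can be written out as a small Python sketch. The function
names are hypothetical and all times are plain numbers in one common unit
(milliseconds in the example that follows).

```python
def jitter(clock_time, buffer_ts):
    """J1 = CT - B1; negative means the buffer arrived in time."""
    return clock_time - buffer_ts


def arrival_time(buffer_ts, j):
    """(1): T1 = B1 + J1."""
    return buffer_ts + j


def leave_time(buffer_ts, j):
    """(2): in-time buffers leave at B1, late buffers leave on arrival."""
    return buffer_ts + max(j, 0)


def processing_time(t1, t0):
    """(3): PT1 = T1 - T0."""
    return t1 - t0


def data_rate(t1, t0, duration):
    """(4): DR1 = (T1 - T0) / D1; values above 1.0 mean upstream is too slow."""
    return (t1 - t0) / duration
```

As a worked example, take B1 = 80, CT = 100, T0 = 40 and D1 = 40. Then J1 = 20,
so the buffer is late; T1 = T2 = 100, PT1 = 60 and DR1 = 1.5, meaning upstream
needs 1.5 times the real-time budget to produce a buffer.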
Element measurements
~~~~~~~~~~~~~~~~~~~~

In addition to the measurements of the datarate of the upstream element, a
typical element must also measure its own performance. Global pipeline
performance problems can indeed also be caused by the element itself when it
receives too much data it cannot process in time. The element is then said to
be overflowed.


Short term correction
---------------------

The timestamp and jitter serve as short term correction information
for upstream elements. Indeed, given arrival time T1 as given in (1)
we can be certain that buffers with a timestamp B2 < T1 will be too late
in the sink.

In case of a positive jitter we can therefore send a QoS event with
a timestamp B1, jitter J1 and proportion given by (4).

This allows an upstream element to not generate any data with timestamps
B2 < T1, where the element can derive T1 as B1 + J1.

This will effectively result in frame drops.

The element can even do a better estimation of the next valid timestamp it
should output.

Indeed, suppose the element generated a buffer with timestamp B0 that arrived
in time in the sink, but then received a QoS event stating that B1 arrived J1
too late. This means generating B1 took (B1 + J1) - B0 = T1 - T0 = PT1, as
given in (3). Given the buffer B1 had a duration D1 and assuming that
generating a new buffer B2 will take the same amount of processing time,
a better estimation for B2 would then be:

  B2 = T1 + D2 * DR1

expanding gives:

  B2 = (B1 + J1) + D2 * (B1 + J1 - B0)
                        --------------
                              D1

assuming the durations of the frames are equal and thus D1 = D2:

  B2 = (B1 + J1) + (B1 + J1 - B0)

  B2 = 2 * (B1 + J1) - B0

also:

  B0 = B1 - D1

so:

  B2 = 2 * (B1 + J1) - (B1 - D1)

Which yields a more accurate prediction for the next buffer given as:

  B2 = B1 + 2 * J1 + D1                        (5)

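The derivation of (5) can be checked numerically with a short Python sketch.
Both function names are hypothetical; the first follows the general form
B2 = T1 + D2 * DR1, the second is the simplified result (5), which assumes
D1 = D2 and B0 = B1 - D1.

```python
def next_timestamp_general(b0, b1, j1, d1, d2):
    """B2 = T1 + D2 * DR1 with T1 = B1 + J1 and DR1 = (T1 - B0) / D1."""
    t1 = b1 + j1
    dr1 = (t1 - b0) / d1
    return t1 + d2 * dr1


def next_timestamp(b1, j1, d1):
    """(5): B2 = B1 + 2 * J1 + D1, valid for equal frame durations."""
    return b1 + 2 * j1 + d1
```

With B0 = 40, B1 = 80, J1 = 20 and D1 = D2 = 40, both forms give B2 = 160, so
upstream should not bother producing anything with a timestamp before 160.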
Long term correction
--------------------

The datarate used to calculate (5) for the short term prediction is based
on a single observation. A more accurate datarate can be obtained by
creating a running average over multiple datarate observations.

This average is less susceptible to sudden changes that would only influence
the datarate for a very short period.

A running average is calculated over the observations given in (4) and is
used as the proportion member in the QoS event that is sent upstream.

Receivers of the QoS event should permanently reduce their datarate
as given by the proportion member. Failure to do so will certainly lead to
more dropped frames and a generally worse QoS.

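One common way to keep such a running average is an exponentially weighted
update, sketched below in Python. The class name, the weighting scheme and the
default window of 16 observations are assumptions chosen for illustration; the
text above does not prescribe a particular averaging method.

```python
class RunningAverage:
    """Exponentially weighted running average over datarate observations."""

    def __init__(self, window=16):
        self.window = window
        self.value = None

    def update(self, observation):
        if self.value is None:
            # The first observation seeds the average.
            self.value = observation
        else:
            # The new observation contributes 1/window of the result.
            self.value = (observation + (self.window - 1) * self.value) / self.window
        return self.value
```

With a window of 4, an average seeded at 1.0 moves only to 1.25 after a single
observation of 2.0, so one slow buffer does not dominate the proportion that is
sent upstream.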
Throttling
----------

In throttle mode, the time distance between buffers is kept to a configurable
throttle interval. This means that effectively the buffer rate is limited
to 1 buffer per throttle interval. This can be used to limit the framerate,
for example.

When an element is configured in throttling mode (this is usually only
implemented on sinks) it should produce QoS events upstream with the jitter
field set to the throttle interval. This should instruct upstream elements to
skip or drop the remaining buffers in the configured throttle interval.

The proportion field is set to the desired slowdown needed to get the
desired throttle interval. Implementations can use the QoS Throttle type,
the proportion and the jitter member to tune their implementations.

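The effect of throttling on a stream can be simulated with a few lines of
Python. This is a simplified sketch: it folds the sink (which would send the
THROTTLE event) and the upstream element (which would act on it) into a single
loop, ignores lateness, and uses a hypothetical function name.

```python
def throttle(timestamps, interval):
    """Return the timestamps that survive a given throttle interval."""
    earliest = 0
    kept = []
    for ts in timestamps:
        if ts < earliest:
            # Still inside the throttle interval of the last kept buffer.
            continue
        kept.append(ts)
        # Models a QoS event with timestamp=ts and jitter=interval.
        earliest = ts + interval
    return kept
```

For a 25 frames per second stream (one buffer every 40 ms) and a throttle
interval of 100 ms, throttle(range(0, 400, 40), 100) keeps the buffers at 0,
120, 240 and 360 ms, which is one buffer per throttle interval rounded up to
the next available frame.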
QoS strategies
--------------

Several strategies exist to reduce processing delay that might affect
real time performance.

- lowering quality
  - dropping frames (reduce CPU/bandwidth usage)
  - switch to a lower decoding/encoding quality (reduce algorithmic
    complexity)
  - switch to a lower quality source (reduce network usage)
- increasing thread priorities
  - switch to real-time scheduling
  - assign more CPU cycles to critical pipeline parts
  - assign more CPU(s) to critical pipeline parts


QoS implementations
-------------------

Here follows a small overview of how QoS can be implemented in a range of
different types of elements.


GstBaseSink
-----------

The primary implementor of QoS is GstBaseSink. It will calculate the following
values:

- upstream running average of processing time (3) in stream time.
- running average of buffer durations.
- running average of render time (in system time)
- rendered/dropped buffers

The processing time and the average buffer durations will be used to
calculate a proportion.

The processing time in system time is compared to render time to decide if
the majority of the time is spent upstream or in the sink itself. This value
is used to decide overflow or underflow.

The number of rendered and dropped buffers is used to query stats on the sink.

A QoS event with the most current values is sent upstream for each buffer
that was received by the sink.

Normally QoS is only enabled for video pipelines. The reason is that drops
in audio are more disturbing than dropped video frames. Also, video generally
requires more processing than audio.

Normally there is a threshold for when buffers get dropped in a video sink. Frames
that arrive up to 20 milliseconds late are still rendered, as the delay is not
noticeable to the human eye.

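The drop threshold amounts to a simple comparison of the jitter against a
maximum lateness. The Python sketch below uses the 20 millisecond figure from
the text; the constant and function names are hypothetical, and treating a
buffer that is exactly 20 ms late as still renderable is an assumption.

```python
MAX_LATENESS_NS = 20_000_000  # 20 milliseconds, expressed in nanoseconds


def should_render(jitter_ns, max_lateness_ns=MAX_LATENESS_NS):
    """Render buffers that are in time or late by at most the threshold."""
    return jitter_ns <= max_lateness_ns
```

A buffer with a jitter of 15 ms is rendered even though it is late, while one
with a jitter of 25 ms is dropped and reported in a QoS message.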
A QoS message is posted whenever a (part of a) buffer is dropped.

In throttle mode, the sink sends a QoS event upstream with the timestamp set to
the running_time of the latest buffer and the jitter set to the throttle interval.
If the throttled buffer is late, the lateness is subtracted from the throttle
interval in order to keep the desired throttle interval.


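The lateness compensation in throttle mode can be sketched as follows. The
function name is hypothetical, and clamping the result at zero when the
lateness exceeds the throttle interval is an assumption that the text does not
state.

```python
def throttle_jitter(throttle_interval, lateness):
    """Jitter value for a THROTTLE event, compensated for a late buffer."""
    # Only positive lateness shortens the interval; early buffers do not extend it.
    late = max(lateness, 0)
    # Assumption: never report a negative throttle delay.
    return max(throttle_interval - late, 0)
```

With a throttle interval of 100 ms, a buffer that was 30 ms late produces an
event with a jitter of 70 ms, so the next buffer is still expected 100 ms after
the intended time of the late one.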
GstBaseTransform
----------------

Transform elements can entirely skip the transform based on the timestamp and
jitter values of recent QoS events since these buffers will certainly arrive
too late.

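The skip decision can be sketched in Python as a small piece of state that is
updated from each QoS event and consulted for each buffer. The class and method
names are hypothetical; the earliest useful time is taken as timestamp + jitter,
which is the arrival time T1 from equation (1).

```python
class QosState:
    """Tracks the earliest running time that can still be in time downstream."""

    def __init__(self):
        self.earliest_time = None

    def on_qos_event(self, timestamp, jitter):
        # Buffers before T1 = timestamp + jitter will certainly be too late.
        self.earliest_time = timestamp + jitter

    def should_skip(self, buffer_running_time):
        if self.earliest_time is None:
            # No QoS event seen yet: process everything.
            return False
        return buffer_running_time < self.earliest_time
```

After an event with timestamp 80 and jitter 20, a buffer with running time 90
is skipped while buffers at 100 and later are transformed as usual.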
As with any intermediate element, the element should measure its performance to
decide if it is responsible for the quality problems or if an upstream/downstream
element is.

Some transforms can reduce the complexity of their algorithms. Depending on the
algorithm, the changes in quality may have disturbing visual or audible effects
that should be avoided.

A QoS message should be posted when a frame is dropped or when the quality
of the filter is reduced. The quality member in the QoS message should reflect
the quality setting of the filter.


Video Decoders
--------------

A video decoder can, based on the codec in use, decide to not decode intermediate
frames. A typical codec can for example skip the decoding of B-frames to reduce
the CPU usage and framerate.

If each frame is independently decodable, any arbitrary frame can be skipped based
on the timestamp and jitter values of the latest QoS event. In addition, the
proportion member can be used to permanently skip frames.

It is suggested to adjust the quality field of the QoS message with the expected
amount of dropped frames (skipping B and/or P frames). This depends on the
particular spacing of B and P frames in the stream. If the quality control would
result in half of the frames being dropped (typical B frame skipping), the
quality field would be set to 1000000 * 1/2 = 500000. If a typical I frame spacing
of 18 frames is used, skipping B and P frames would result in 17 dropped frames
or 1 decoded frame every 18 frames. The quality member should be set to
1000000 * 1/18 = 55555.

- skipping B frames: quality = 500000
- skipping P/B frames: quality = 55555 (for I-frame spacing of 18 frames)

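The two quality values above follow from scaling the fraction of decoded frames
by the maximum quality. A minimal Python sketch, with hypothetical names and
using integer division to match the truncated value 55555:

```python
MAX_QUALITY = 1_000_000


def quality(decoded_frames, total_frames):
    """Quality value for a decoder that decodes only part of the frames."""
    return MAX_QUALITY * decoded_frames // total_frames
```

Here quality(1, 2) gives 500000 for B frame skipping and quality(1, 18) gives
55555 for decoding only the I frame out of every 18 frames.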
Demuxers
--------

Demuxers usually cannot do a lot regarding QoS except for skipping frames to the next
keyframe when a lateness QoS event arrives on a source pad.

A demuxer can however measure if the performance problems are upstream or downstream
and forward an updated QoS event upstream.

Most demuxers that have multiple output pads might need to combine the QoS
events on all the pads and derive an aggregated QoS event for the upstream element.


Sources
-------

The QoS events only apply to push based sources since pull based sources are entirely
controlled by another downstream element.

Sources can receive an overflow or underflow event that can be used to switch to
less demanding source material. In case of a network stream, a switch could be done
to a lower or higher quality stream or additional enhancement layers could be used
or ignored.

Live sources will automatically drop data when it takes too long to process the data
that the element pushes out.

Live sources should post a QoS message when data is dropped.