gstreamer/markdown/design/trickmodes.md
Reynaldo H. Verdejo Pinochet c2eb75ee4d design: trickmodes: switch to conventional figure format
Follow figure guidelines from conventions.md.

Additionally: add some missing markup.
2017-03-20 13:58:25 -07:00

238 lines
9 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# Trickmodes
GStreamer provides API for performing various trickmode playback. This
includes:
- server side trickmodes
- client side fast/slow forward playback
- client side fast/slow backwards playback
Server side trickmodes mean that a source (network source) can provide a
stream with different playback speed and direction. The client does not
have to perform any special algorithms to decode this stream.
Client side trickmodes mean that the decoding client (GStreamer)
performs the needed algorithms to change the direction and speed of the
media file.
Seeking can both be done in a playback pipeline and a transcoding
pipeline.
## General seeking overview
Consider a typical playback pipeline:
```
+---------+ +------+
+-------+ | decoder |->| sink |
+--------+ | |-->+---------+ +------+
| source |->| demux |
+--------+ | |-->+---------+ +------+
+-------+ | decoder |->| sink |
+---------+ +------+
```
The pipeline is initially configured to play back at speed 1.0 starting
from position 0 and stopping at the total duration of the file.
When performing a seek, the following steps have to be taken by the
application:
### Create a seek event
The seek event contains:
- various `GstSeekFlags` flags describing:
- where to seek to (`KEY_UNIT`)
- how accurate the seek should be (`ACCURATE`)
- how to perform the seek (`FLUSH`)
- what to do when the stop position is reached (`SEGMENT`).
- extra playback options (`SKIP`)
- a format to seek in, this can be time, bytes, units (frames,
samples), …
- a playback rate, 1.0 is normal playback speed, positive values
bigger than 1.0 mean fast playback. negative values mean reverse
playback. A playback speed of 0.0 is not allowed (but is equivalent
to PAUSING the pipeline).
- a start position, this value has to be between 0 and the total
duration of the file. It can also be relative to the previously
configured start value.
- a stop position, this value has to be between 0 and the total
duration. It can also be relative to the previously configured stop
value.
See also `gst_event_new_seek()`.
### Send the seek event
Send the new seek event to the pipeline with
`gst_element_send_event()`.
By default the pipeline will send the event to all sink elements. By
default an element will forward the event upstream on all sinkpads.
Elements can modify the format of the seek event. The most common format
is `GST_FORMAT_TIME`.
One element will actually perform the seek, this is usually the demuxer
or source element. For more information on how to perform the different
seek types see [seeking](design/seeking.md).
For client side trickmode a `SEGMENT` event will be sent downstream with
the new rate and start/stop positions. All elements prepare themselves
to handle the rate (see below). The applied rate of the SEGMENT event
will be set to 1.0 to indicate that no rate adjustment has been done.
for server side trick mode a `SEGMENT` event is sent downstream with a
rate of 1.0 and the start/stop positions. The elements will configure
themselves for normal playback speed since the server will perform the
rate conversions. The applied rate will be set to the rate that will be
applied by the server. This is done to insure that the position
reporting performed in the sink is aware of the trick mode.
When the seek succeeds, the `_send_event()` function will return TRUE.
## Server side trickmode
The source element operates in push mode. It can reopen a server
connection requesting a new byte or time position and a new playback
speed. The capabilities can be queried from the server when the
connection is opened.
We assume the source element is derived from the `GstPushSrc` base class.
The base source should be configured with:
```c
gst_base_src_set_format (src, GST_FORMAT_TIME);
```
The `do_seek()` method will be called on the `GstPushSrc` subclass with the
seek information passed in the `GstSegment` argument.
The rate value in the segment should be used to reopen the connection to
the server requesting data at the new speed and possibly a new playback
position.
When the server connection was successfully reopened, set the rate of
the segment to 1.0 so that the client side trickmode is not enabled. The
applied rate in the segment is set to the rate transformation done by
the server.
Alternatively a combination of client side and serverside trickmode can
be used, for example if the server does not support certain rates, the
client can perform rate conversion for the remainder.
```
source server
do_seek | |
----------->| |
| reopen connection |
|-------------------->|
| .
| success .
|<--------------------|
modify | |
rate to 1.0 | |
| |
return | |
TRUE | |
| |
```
After performing the seek, the source will inform the downstream
elements of the new segment that is to be played back. Since the segment
will have a rate of 1.0, no client side trick modes are enabled. The
segment will have an applied rate different from 1.0 to indicate that
the media contains data with non-standard playback speed or direction.
## client side forward trickmodes
The seek happens as stated above. a `SEGMENT` event is sent downstream
with a rate different from 1.0. Plugins receiving the `SEGMENT` can decide
to perform the rate conversion of the media data (retimestamp video
frames, resample audio, …).
If a plugin decides to resample or retimestamp, it should modify the
`SEGMENT` with a rate of 1.0 and update the applied rate so that
downstream elements dont resample again but are aware that the media
has been modified.
The GStreamer base audio and video sinks will resample automatically if
they receive a SEGMENT event with a rate different from 1.0. The
position reporting in the base audio and video sinks will also depend on
the applied rate of the segment information.
When the `SKIP` flag is set, frames can be dropped in the elements. If S
is the speedup factor, a good algorithm for implementing frame skipping
is to send audio in chunks of Nms (usually 300ms is good) and then skip
((S-1) \* Nns) of audio data. For the video we send only the keyframes
in the (S \* Nns) interval. In this case, the demuxer would scale the
timestamps and would set an applied rate of S.
## client side backwards trickmode
For backwards playback the following rules apply:
- the rate in the `SEGMENT` is less than 0.0.
- the `SEGMENT` start position is less than the stop position, playback
will however happen from stop to start in reverse.
- the time member in the `SEGMENT` is set to the stream time of the
start position.
For plugins the following rules apply:
- A source plugin sends data in chunks starting from the last chunk of
the file. The actual bytes are not reversed. Each chunk that is not
forward continuous with the previous chunk is marked with a `DISCONT`
flag.
- A demuxer accumulates the chunks. As soon as a keyframe is found,
everything starting from the keyframe up to the accumulated data is
sent downstream. Timestamps on the buffers are set starting from the
stop position to start, effectively going backwards. Chunks are
marked with `DISCONT` when they are not forward continuous with the
previous buffer.
- A video decoder decodes and accumulates all decoded frames. If a
buffer with a `DISCONT`, `SEGMENT` or `EOS` is received, all accumulated
frames are sent downsteam in reverse.
- An audio decoder decodes and accumulates all decoded audio. If a
buffer with a `DISCONT`, `SEGMENT` or `EOS` is received, all accumulated
audio is sent downstream in reverse order. Some audio codecs need
the previous data buffer to decode the current one, in that case,
the previous DISCONT buffer needs to be combined with the last
non-DISCONT buffer to generate the last bit of output.
- A sink reverses (for audio) and retimestamps (audio, video) the
buffers before playing them back. Retimestamping occurs relative to
the stop position, making the timestamps increase again and suitable
for synchronizing against the clock. Audio sinks also have to
perform simple resampling before playing the samples.
- for transcoding, audio and video resamplers can be used to reverse,
resample and retimestamp the buffers. Any rate adjustments performed
on the media must be added to the `applied_rate` and subtracted from
the rate members in the `SEGMENT`
event.
In `SKIP` mode, the same algorithm as for forward `SKIP` mode can be used.
## Notes
- The clock/`running_time` keeps running forward.
- backwards playback potentially uses a lot of memory as frames and
undecoded data gets buffered.