Import all GStreamer design docs and convert them to markdown

https://bugzilla.gnome.org/show_bug.cgi?id=775667
Thibault Saunier 2016-12-05 18:12:24 -03:00
parent bae66ab278
commit 2fdd87e282
59 changed files with 12011 additions and 0 deletions

# Conventions for a thread-safe API
The GStreamer API is designed to be thread safe. This means that API functions
can be called from multiple threads at the same time. GStreamer internally uses
threads to perform the data passing and various asynchronous services such as
the clock can also use threads.
This design decision has implications for the usage of the API and its
objects, which this document explains.
## MT safety techniques
Several design patterns are used to guarantee object consistency in GStreamer.
This is an overview of the methods used in various GStreamer subsystems.
### Refcounting
All shared objects have a refcount associated with them. Each reference
obtained to the object should increase the refcount and each reference lost
should decrease the refcount.
The refcounting is used to make sure that when another thread destroys the
object, the ones which still hold a reference to the object do not read from
invalid memory when accessing the object.
Refcounting is also used to ensure that mutable data structures are only
modified when they are owned by the calling code.
It is a requirement that when two threads have a handle on an object, the
refcount must be more than one. This means that when one thread passes an
object to another thread it must increase the refcount. This requirement makes
sure that one thread cannot suddenly dispose the object making the other
thread crash when it tries to access the pointer to invalid memory.
### Shared data structures and writability
All objects have a refcount associated with them. Each reference obtained to
the object should increase the refcount and each reference lost should
decrease the refcount.
Each thread having a refcount to the object can safely read from the object,
but modifications made to the object should be preceded by a
`_get_writable()` function call. This function will check the refcount of the
object and if the object is referenced by more than one instance, a copy is
made of the object that is then by definition only referenced from the calling
thread. This new copy is then modifiable without being visible to other
refcount holders.
This technique is used for information objects that, once created, never
change their values. The lifetime of these objects is generally short, the
objects are usually simple and cheap to copy/create.
The advantage of this method is that no reader/writer locks are needed. All
threads can concurrently read, but writes happen locally on a new copy. In most
cases `_get_writable()` can avoid a real copy because the calling method is the
only one holding a reference, which makes read/write very cheap.
The drawback is that sometimes one needless copy can be made. This would happen
when N threads call `_get_writable()` at the same time, all seeing that N
references are held on the object. In this case one copy too many will be made.
This is not a problem in any practical situation because the copy operation is
fast.
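**Example**: a minimal copy-on-write sketch using GstCaps; `update_caps_width()`
is a hypothetical helper, not a GStreamer API.

``` c
#include <gst/gst.h>

/* gst_caps_make_writable() returns the caps unchanged when we hold the
 * only reference, or a copy otherwise; either way the result is safe
 * to modify. */
static GstCaps *
update_caps_width (GstCaps * caps, gint width)
{
  caps = gst_caps_make_writable (caps);
  gst_caps_set_simple (caps, "width", G_TYPE_INT, width, NULL);
  return caps;
}
```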
### Mutable substructures
Special techniques are necessary to ensure the consistency of compound shared
objects. As mentioned above, shared objects need to have a reference count of
1 if they are to be modified. Implicit in this assumption is that all parts of
the shared object belong only to the object. For example, a GstStructure in
one GstCaps object should not belong to any other GstCaps object. This
condition suggests a parent-child relationship: structures can only be added
to a parent object if they do not already have a parent.
In addition, these substructures must not be modified while more than one code
segment has a reference on the parent object. For example, if the user creates
a GstStructure, adds it to a GstCaps, and the GstCaps is then referenced by
other code segments, the GstStructure should then become immutable, so that
changes to that data structure do not affect other parts of the code. This
means that the child is only mutable when the parent's reference count is 1,
as well as when the child structure has no parent.
The general solution to this problem is to include a field in child structures
pointing to the parent's atomic reference count. When set to NULL, this
indicates that the child has no parent. Otherwise, procedures that modify the
child structure must check if the parent's refcount is 1, and otherwise must
cause an error to be signaled.
Note that this is an internal implementation detail; application or plugin
code that calls `_get_writable()` on an object is guaranteed to receive an
object of refcount 1, which must then be writable. The only trick is that a
pointer to a child structure of an object is only valid while the calling code
has a reference on the parent object, because the parent is the owner of the
child.
### Object locking
For objects that contain state information and generally have a longer
lifetime, object locking is used to update the information contained in the
object.
All readers and writers acquire the lock before accessing the object. Only one
thread is allowed to access the protected structures at a time.
Object locking is used for all objects extending from GstObject such as
GstElement, GstPad.
Object locking can be done with recursive locks or regular mutexes. Object
locks in GStreamer are implemented with mutexes which cause deadlocks when
locked recursively from the same thread. This is done because regular mutexes
are cheaper.
### Atomic operations
Atomic operations are operations that are performed as one consistent
operation even when executed by multiple threads. They do however not use the
conventional approach of using mutexes to protect the critical section but rely
on CPU features and instructions.
The advantages are mostly speed related since there are no heavyweight locks
involved. Most of these instructions also do not cause a context switch in case
of concurrent access but use a retry mechanism or spinlocking.
Disadvantages are that each of these instructions usually causes a cache flush
on multi-CPU machines when two processors perform concurrent access.
Atomic operations are generally used for refcounting and for the allocation of
small fixed size objects in a memchunk. They can also be used to implement a
lockfree list or stack.
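**Example**: a sketch of atomic refcounting with GLib primitives; `MyObject` is
a hypothetical type, not a GStreamer API.

``` c
#include <glib.h>

typedef struct {
  gint refcount;
  /* ... object payload ... */
} MyObject;

static MyObject *
my_object_ref (MyObject * obj)
{
  g_atomic_int_inc (&obj->refcount);
  return obj;
}

static void
my_object_unref (MyObject * obj)
{
  /* free the object when the last reference is dropped */
  if (g_atomic_int_dec_and_test (&obj->refcount))
    g_free (obj);
}
```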
### Compare and swap
As part of the atomic operations, compare-and-swap (CAS) can be used to access
or update a single property or pointer in an object without having to take a
lock.
This technique is currently not used in GStreamer but might be added in the
future in performance critical places.
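**Example**: for illustration only (as noted, GStreamer does not currently use
this), a lock-free pointer update with GLib's CAS primitive; `MyObject` and its
`data` field are hypothetical.

``` c
#include <glib.h>

typedef struct {
  gpointer data;
} MyObject;

static void
my_object_set_data (MyObject * obj, gpointer new_data)
{
  gpointer old;

  /* retry if another thread raced us between the read and the swap */
  do {
    old = g_atomic_pointer_get (&obj->data);
  } while (!g_atomic_pointer_compare_and_exchange (&obj->data, old, new_data));

  /* 'old' can now be released by this thread */
}
```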
## Objects
### Locking involved
- atomic operations for refcounting
- object locking
All objects should have a lock associated with them. This lock is used to keep
internal consistency when multiple threads call API functions on the object.
For objects that extend the GStreamer base object class, this lock can be
obtained with the macros `GST_OBJECT_LOCK()` and `GST_OBJECT_UNLOCK()`. For other objects that do
not extend from the base GstObject class, these macros can be different.
### Refcounting
All new objects created have the FLOATING flag set. This means that the object
is not owned or managed yet by anybody other than the one holding a reference
to the object. The object in this state has a reference count of 1.
Various object methods can take ownership of another object. This means that
after calling a method on object A with an object B as an argument, the object
B is made sole property of object A. This means that after the method call you
are not allowed to access the object anymore unless you keep an extra
reference to the object. An example of such a method is the `_bin_add()` method.
As soon as this function is called in a Bin, the element passed as an argument
is owned by the bin and you are not allowed to access it anymore without
taking a `_ref()` before adding it to the bin. The reason is that after the
`_bin_add()` call, disposing the bin also destroys the element.
Taking ownership of an object happens through the process of "sinking" the
object. The `_sink()` method on an object will decrease the refcount of the
object if the FLOATING flag is set. The act of taking ownership of an object
is then performed as a `_ref()` followed by a `_sink()` call on the object.
The float/sink process is very useful when initializing elements that will
then be placed under control of a parent. The floating ref keeps the object
alive until it is parented, and once the object is parented you can forget
about it.
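**Example**: a sketch of the float/sink pattern when adding an element to a
bin; error handling omitted.

``` c
GstElement *bin, *src;

bin = gst_bin_new ("mybin");
src = gst_element_factory_make ("fakesrc", "src");  /* floating reference */

gst_object_ref (src);              /* keep our own reference */
gst_bin_add (GST_BIN (bin), src);  /* the bin refs and sinks the element */

/* ... src can still be accessed here thanks to the extra reference ... */

gst_object_unref (src);
```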
See also [relations](design/relations.md).
### parent-child relations
One can create parent-child relationships with the `_object_set_parent()`
method. This method refs and sinks the object and sets its parent property
to the managing parent.
The child is said to have a weak link to the parent since the refcount of the
parent is not increased in this process. This means that if the parent is
disposed it has to unset itself as the parent of the object before disposing
itself, else the child object holds a parent pointer to invalid memory.
The responsibilities of an object that sinks other objects are summarised as:
- take ownership of the object
- call `_object_set_parent()` to set itself as the object's parent; this
  call will `_ref()` and `_sink()` the object
- keep a reference to the object in a data structure such as a list or
  array
- on dispose:
  - call `_object_unparent()` to reset the parent property and unref the
    object
  - remove the object from the list
See also [relations](design/relations.md).
### Properties
Most objects also expose state information with public properties in the
object. Two types of properties exist: those accessible with, and those
accessible without, holding the object lock. All properties should only be
accessed with their corresponding macros. The public object properties are
marked in the .h files with `/*< public >*/`. The public properties that
require a lock to be held are marked with `/*< public >*/` `/* with
<lock_type> */`, where `<lock_type>` can be `LOCK`, `STATE_LOCK` or any other
lock that marks the type(s) of lock to be held.
**Example**:
In GstPad there is a public property `direction`. It can be found in the
section marked as public and requiring the LOCK to be held. There is
also a macro to access the property.
``` c
struct _GstRealPad {
  ...
  /*< public >*/ /* with LOCK */
  ...
  GstPadDirection direction;
  ...
};

#define GST_RPAD_DIRECTION(pad) (GST_REAL_PAD_CAST(pad)->direction)
```
The property can therefore be accessed as in the following code example:
``` c
GST_OBJECT_LOCK (pad);
direction = GST_RPAD_DIRECTION (pad);
GST_OBJECT_UNLOCK (pad);
```
### Property lifetime
All properties requiring a lock can change after releasing the associated
lock. This means that as long as you hold the lock, the state of the
object regarding the locked properties is consistent with the information
obtained. As soon as the lock is released, any values acquired from the
properties might not be valid anymore and can best be described as a
snapshot of the state when the lock was held.
This means that all properties that require access beyond the scope of the
critical section should be copied or refcounted before releasing the lock.
Most objects provide a `_get_<property>()` method to get a copy or refcounted
instance of the property value. The caller should not worry about any locks
but should unref/free the object after usage.
**Example**:
The following example correctly gets the peer pad of a pad. It is
required to increase the refcount of the peer pad because as soon as the
lock is released, the peer could be unreffed and disposed, making the
pointer obtained in the critical section point to invalid memory.
``` c
GST_OBJECT_LOCK (pad);
peer = GST_RPAD_PEER (pad);
if (peer)
gst_object_ref (GST_OBJECT (peer));
GST_OBJECT_UNLOCK (pad);
... use peer ...
if (peer)
gst_object_unref (GST_OBJECT (peer));
```
Note that after releasing the lock, the peer might not actually be the pad's
peer anymore. If you need to be sure it is, you need to extend the
critical section to include the operations on the peer.
The following code is equivalent to the above but uses the functions
to access object properties.
``` c
peer = gst_pad_get_peer (pad);
if (peer) {
... use peer ...
gst_object_unref (GST_OBJECT (peer));
}
```
**Example**:
Accessing the name of an object makes a copy of the name. The caller of the
function should `g_free()` the name after usage.
``` c
GST_OBJECT_LOCK (object);
name = g_strdup (GST_OBJECT_NAME (object));
GST_OBJECT_UNLOCK (object);
... use name ...
g_free (name);
```
or:
``` c
name = gst_object_get_name (object);
... use name ...
g_free (name);
```
### Accessor methods
Applications are encouraged to use the public methods of the object. Most
useful operations can be performed with these methods, so it is seldom required
to access the public fields manually.
All accessor methods that return an object should increase the refcount of the
returned object. The caller should `_unref()` the object after usage. Each
method should state this refcounting policy in the documentation.
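**Example**: following this policy with `gst_element_get_static_pad()`, which
returns a reference; `element` is assumed to exist and have a static "src" pad.

``` c
GstPad *pad;

/* the accessor returns a reference that the caller must release */
pad = gst_element_get_static_pad (element, "src");
if (pad) {
  /* ... use pad ... */
  gst_object_unref (pad);
}
```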
### Accessing lists
If the object property is a list, care is needed when iterating it while other
threads can update it concurrently. GStreamer uses a cookie mechanism to mark the last
update of a list. The list and the cookie are protected by the same lock. Each
update to a list requires the following actions:
- acquire lock
- update list
- update cookie
- release lock
Updating the cookie is usually done by incrementing its value by one. Since
cookies are guint32 values, wraparound is, for all practical purposes, not a
problem.
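**Example**: a sketch of the writer side, using the same hypothetical `list`
and `list_cookie` fields as the iteration example below.

``` c
GST_OBJECT_LOCK (object);
/* update the list */
object->list = g_list_prepend (object->list, gst_object_ref (item));
/* mark the list as changed so concurrent iterators can detect it */
object->list_cookie++;
GST_OBJECT_UNLOCK (object);
```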
Iterating a list can safely be done by surrounding the list iteration with a
lock/unlock of the lock.
In some cases it is not a good idea to hold the lock for a long time while
iterating the list. The state change code for a bin in GStreamer, for example,
has to iterate over each element and perform a blocking call on each of them
potentially causing infinite bin locking. In this case the cookie can be used
to iterate a list.
**Example**:
The following algorithm iterates a list and rolls back its work when a
concurrent update was done to the list while iterating. The idea is
that whenever we reacquire the lock, we check for updates to the cookie to
decide if we are still iterating the right list.
``` c
GST_OBJECT_LOCK (object);
/* grab list and cookie */
cookie = object->list_cookie;
list = object->list;
while (list) {
  GstObject *item = GST_OBJECT (list->data);
  /* need to ref the item before releasing the lock */
  gst_object_ref (item);
  GST_OBJECT_UNLOCK (object);

  /* ... use/change item here ... */

  /* release item here */
  gst_object_unref (item);

  GST_OBJECT_LOCK (object);
  if (cookie != object->list_cookie) {
    /* handle rollback caused by concurrent modification
     * of the list here */

    /* ... rollback changes to items ... */

    /* grab new cookie and list */
    cookie = object->list_cookie;
    list = object->list;
  } else {
    list = g_list_next (list);
  }
}
GST_OBJECT_UNLOCK (object);
```
### GstIterator
GstIterator provides an easier way of retrieving elements in a concurrent
list. The following code example is equivalent to the previous example.
**Example**:
``` c
gboolean done = FALSE;

it = _get_iterator (object);
while (!done) {
  switch (gst_iterator_next (it, &item)) {
    case GST_ITERATOR_OK:
      /* ... use/change item here ... */

      /* release item here */
      gst_object_unref (item);
      break;
    case GST_ITERATOR_RESYNC:
      /* handle rollback caused by concurrent modification
       * of the list here */

      /* ... rollback changes to items ... */

      /* resync iterator to start again */
      gst_iterator_resync (it);
      break;
    case GST_ITERATOR_DONE:
      done = TRUE;
      break;
  }
}
gst_iterator_free (it);
```

# TODO - Future Development
## API/ABI
- implement return values from events in addition to the gboolean.
This should be done by making the event contain a GstStructure with
input/output values, similar to GstQuery. A typical use case is
performing a non-accurate seek to a keyframe, after the seek you
want to get the new stream time that will actually be used to update
the slider bar.
- make `gst_pad_push_event()` return a GstFlowReturn
- GstEvent, GstMessage register like GstFormat or GstQuery.
- query POSITION/DURATION return accuracy. Just a flag or accuracy
percentage.
- use | instead of + as divider in serialization of Flags
(gstvalue/gststructure)
- rethink how we handle dynamic replugging wrt segments and other
events that already got pushed and need to be pushed again. Might
need GstFlowReturn from `gst_pad_push_event()`. FIXED in 0.11 with
sticky events.
- Optimize negotiation. We currently do a `get_caps()` call when we
link pads, which could potentially generate a huge list of caps and
all their combinations. We need to avoid generating these huge lists
by generating them somewhat incrementally when needed. We can do this
with a `gst_pad_iterate_caps()` call. We also need to incrementally
return intersections etc. for this. FIXED in 0.11 with a filter on
getcaps functions.
- Elements in a bin have no clue about the final state of the parent
element since the bin sets the target state on its children in small
steps. This causes problems for elements that like to know the final
state (rtspsrc going to PAUSED or READY is different in that we can
avoid sending the useless PAUSED request).
- Make serialisation of structures more consistent, readable and nicer
code-wise.
- pad block has several issues:
  - can't block on selected things, like push, pull, `pad_alloc`,
    events, …
  - can't check why the block happened. We should also be able to
    get the item/reason that blocked the pad.
  - it only blocks on datapassing. When EOS, the block never happens,
    but ideally it should, because pad block should inform the app when
    there is no dataflow.
  - the same goes for segment seeks that don't push in-band EOS
    events. Maybe segment seeks should also send an EOS event when
    they're done.
  - blocking should only happen from one thread. If one thread does
    `pad_alloc` and another a push, the push might be busy while the
    block callback is done.
  - maybe this name is overloaded. We need to look at some more use
    cases before trying to fix this. FIXED in 0.11 with BLOCKING
    probes.
- rethink the way we do upstream renegotiation. Currently it's done
with `pad_alloc`, but this has many issues such as only being able to
suggest 1 format and the need to allocate a buffer of this suggested
format (some elements such as capsfilter only know about the format,
not the size). We would ideally like to let upstream renegotiate a
new format just like it did when it started. This could, for
example, easily be triggered with a RENEGOTIATE event. FIXED in 0.11
with RECONFIGURE events.
- Remove the result format value in queries. FIXED in 0.11
- Try to minimize the amount of acceptcaps calls when pushing buffers
around. The element pushing the buffer usually negotiated already
and decided on the format. The element receiving the buffer usually
has to accept the caps anyway.
## IMPLEMENTATION
- implement more QOS, [qos](design/qos.md).
- implement BUFFERSIZE.
## DESIGN
- unlinking pads in the PAUSED state needs to make sure the stream
thread is not executing code. Can this be done with a flush to
unlock all downstream chain functions? Do we do this automatically
or let the app handle this?

# Pad (de)activation
## Activation
When changing states, a bin will set the state on all of its children in
sink-to-source order. As elements undergo the READY→PAUSED transition,
their pads are activated so as to prepare for data flow. Some pads will
start tasks to drive the data flow.
An element activates its pads from sourcepads to sinkpads. This is to make
sure that when the sinkpads are activated and ready to accept data, the
sourcepads are already active to pass the data downstream.
Pads can be activated in one of two modes, PUSH and PULL. PUSH pads are
the normal case, where the source pad in a link sends data to the sink
pad via `gst_pad_push()`. PULL pads instead have sink pads request data
from the source pads via `gst_pad_pull_range()`.
To activate a pad, the core will call `gst_pad_set_active()` with a
TRUE argument, indicating that the pad should be active. If the pad is
already active, be it in a PUSH or PULL mode, `gst_pad_set_active()`
will return without doing anything. Otherwise it will call the
activation function of the pad.
Because the core does not know in which mode to activate a pad (PUSH or
PULL), it delegates that choice to a method on the pad, activate(). The
activate() function of a pad should choose whether to operate in PUSH or
PULL mode. Once the choice is made, it should call `activate_mode()` with
the selected activation mode. The default activate() function will call
`activate_mode()` with `GST_PAD_MODE_PUSH`, as it is the default
mechanism for data flow. A sink pad that supports either mode of
operation might call `activate_mode(PULL)` if the SCHEDULING query
upstream contains the `GST_PAD_MODE_PULL` scheduling mode, and
`activate_mode(PUSH)` otherwise.
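As a sketch (one plausible implementation of such an activate function, not
mandated by the core), a sink pad could choose its mode like this:

``` c
static gboolean
my_sink_pad_activate (GstPad * pad, GstObject * parent)
{
  GstQuery *query;
  gboolean pull_mode;

  /* ask upstream which scheduling modes it supports */
  query = gst_query_new_scheduling ();
  if (!gst_pad_peer_query (pad, query)) {
    gst_query_unref (query);
    return gst_pad_activate_mode (pad, GST_PAD_MODE_PUSH, TRUE);
  }
  pull_mode = gst_query_has_scheduling_mode (query, GST_PAD_MODE_PULL);
  gst_query_unref (query);

  if (pull_mode)
    return gst_pad_activate_mode (pad, GST_PAD_MODE_PULL, TRUE);

  /* fall back to the default PUSH mode */
  return gst_pad_activate_mode (pad, GST_PAD_MODE_PUSH, TRUE);
}
```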
Consider the case `fakesrc ! fakesink`, where fakesink is configured to
operate in PULL mode. State changes in the pipeline will start with
fakesink, which is the most downstream element. The core will call
`activate()` on fakesink's sink pad. For fakesink to go into PULL mode, it
needs to implement a custom activate() function that will call
`activate_mode(PULL)` on its sink pad (because the default is to use PUSH
mode). `activate_mode(PULL)` is then responsible for starting the task
that pulls from fakesrc:src. Clearly, fakesrc needs to be notified that
fakesink is about to pull on its src pad, even though the pipeline has
not yet changed fakesrc's state. For this reason, GStreamer will first
call `activate_mode(PULL)` on fakesink:sink's peer before calling
`activate_mode(PULL)` on fakesink:sink.
In short, upstream elements operating in PULL mode must be ready to
produce data in READY, after having `activate_mode(PULL)` called on their
source pad. Also, a call to `activate_mode(PULL)` needs to propagate
through the pipeline to every pad that a `gst_pad_pull_range()` will reach. In
the case `fakesrc ! identity ! fakesink`, calling `activate_mode(PULL)`
on identity's source pad would need to activate its sink pad in pull
mode as well, which should propagate all the way to fakesrc.
If, on the other hand, `fakesrc ! fakesink` is operating in PUSH mode,
the activation sequence is different. First, activate() on fakesink:sink
calls `activate_mode(PUSH)` on fakesink:sink. Then fakesrc's pads are
activated: sources first, then sinks (of which fakesrc has none).
fakesrc:src's activation function is then called.
Note that it does not make sense to set an activation function on a
source pad. The peer of a source pad is downstream, meaning it should
have been activated first. If it was activated in PULL mode, the source
pad should have already had `activate_mode(PULL)` called on it, and thus
needs no further activation. Otherwise it should be in PUSH mode, which
is the choice of the default activation function.
So, in the PUSH case, the default activation function chooses PUSH mode,
which calls `activate_mode(PUSH)`, which will then start a task on the
source pad and begin pushing. In this way PUSH scheduling is a bit
easier, because it follows the order of state changes in a pipeline.
fakesink is already in PAUSED with an active sink pad by the time
fakesrc starts pushing data.
## Deactivation
Pad deactivation occurs when its parent goes into the READY state or
when the pad is deactivated explicitly by the application or element.
`gst_pad_set_active()` is called with a FALSE argument, which then
calls `activate_mode(PUSH)` or `activate_mode(PULL)` with a FALSE
argument, depending on the current activation mode of the pad.
## Mode switching
Changing from push to pull modes needs a bit of thought. This is
actually possible and implemented but not yet documented here.

# GstBuffer
This document describes the design for buffers.
A GstBuffer is the object that is passed from an upstream element to a
downstream element and contains memory and metadata information.
## Requirements
- It must be fast
  - allocation, free, low fragmentation
- Must be able to attach multiple memory blocks to the buffer
- Must be able to attach arbitrary metadata to buffers
- efficient handling of subbuffer, copy, span, trim
## Lifecycle
GstBuffer extends from GstMiniObject and therefore uses its lifecycle
management (See [miniobject](design/miniobject.md)).
## Writability
When a buffer is writable, as returned from `gst_buffer_is_writable()`:
- metadata can be added/removed and the metadata can be changed
- GstMemory blocks can be added/removed
The individual memory blocks have their own locking and READONLY flags
that might influence their writability.
Buffers can be made writable with `gst_buffer_make_writable()`. This
will copy the buffer with the metadata and will ref the memory in the
buffer. This means that the memory is not automatically copied when
copying buffers.
## Managing GstMemory
A GstBuffer contains an array of pointers to GstMemory objects.
When the buffer is writable, `gst_buffer_insert_memory()` can be used
to add a new GstMemory object to the buffer. When the array of memory is
full, memory will be merged to make room for the new memory object.
`gst_buffer_n_memory()` is used to get the amount of memory blocks on
the `GstBuffer`.
With `gst_buffer_peek_memory()`, memory can be retrieved from the
memory array. The desired access pattern for the memory block should be
specified so that appropriate checks can be made and, in case of
`GST_MAP_WRITE`, a writable copy can be constructed when needed.
`gst_buffer_remove_memory_range()` and `gst_buffer_remove_memory()`
can be used to remove memory from the GstBuffer.
## Subbuffers
Subbuffers are made by copying only a region of the memory blocks and
copying all of the metadata.
## Span
Spanning will merge the data of two buffers together into a new
buffer.
## Data access
Accessing the data of the buffer can happen by retrieving the individual
GstMemory objects in the GstBuffer or by using the `gst_buffer_map()` and
`gst_buffer_unmap()` functions.
The `_map` and `_unmap` functions will always return the memory of all blocks as
one large contiguous region of memory. Using the `_map` and `_unmap` functions
might be more convenient than accessing the individual memory blocks at the
expense of being more expensive because it might perform memcpy operations.
For buffers with only one GstMemory object (the most common case), `_map` and
`_unmap` have no performance penalty at all.
- **Read access with 1 memory block**: The memory block is accessed and mapped
for read access. The memory block is unmapped after usage.
- **Write access with 1 memory block**: The buffer should be writable or this
operation will fail. The memory block is accessed. If the memory block is
readonly, a copy is made and the original memory block is replaced with this
copy. Then the memory block is mapped in write mode and unmapped after usage.
- **Read access with multiple memory blocks**: The memory blocks are combined
into one large memory block. If the buffer is writable, the memory blocks are
replaced with this new combined block. If the buffer is not writable, the
memory is returned as is. The memory block is then mapped in read mode.
When the memory is unmapped after usage and the buffer has multiple memory
blocks, this means that the map operation was not able to store the combined
buffer and it thus returned memory that should be freed. Otherwise, the memory
is unmapped.
- **Write access with multiple memory blocks**: The buffer should be writable
or the operation fails. The memory blocks are combined into one large memory
block and the existing blocks are replaced with this new block. The memory is
then mapped in write mode and unmapped after usage.
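**Example**: a sketch of read access through the map API; `process()` is a
hypothetical consumer of the data.

``` c
GstMapInfo info;

if (gst_buffer_map (buffer, &info, GST_MAP_READ)) {
  /* info.data and info.size expose all memory blocks as one
   * contiguous region */
  process (info.data, info.size);
  gst_buffer_unmap (buffer, &info);
}
```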
## Use cases
### Generating RTP packets from h264 video
We receive as input a GstBuffer with an encoded h264 image and we need
to create RTP packets containing this h264 data as the payload. We
typically need to fragment the h264 data into multiple packets, each
with their own RTP and payload specific
header.
```
+-------+-------+---------------------------+--------+
input H264 buffer: | NALU1 | NALU2 | ..... | NALUx |
+-------+-------+---------------------------+--------+
|
V
array of +-+ +-------+ +-+ +-------+ +-+ +-------+
output buffers: | | | NALU1 | | | | NALU2 | .... | | | NALUx |
+-+ +-------+ +-+ +-------+ +-+ +-------+
: : : :
\-----------/ \-----------/
buffer 1 buffer 2
```
The output buffer array consists of x buffers consisting of an RTP
payload header and a subbuffer of the original input H264 buffer. Since
the RTP headers and the h264 data don't need to be contiguous in memory,
they are added to the buffer as separate GstMemory blocks and we can
avoid copying the h264 data into contiguous memory.
A typical udpsink will then use something like sendmsg to send the
memory regions on the network inside one UDP packet. This will further
avoid having to memcpy data into contiguous memory.
Using bufferlists, the complete array of output buffers can be pushed in
one operation to the peer element.

# Buffering
This document outlines the buffering policy used in the GStreamer core
that can be used by plugins and applications.
The purpose of buffering is to accumulate enough data in a pipeline so
that playback can occur smoothly and without interruptions. It is
typically done when reading from a (slow) non-live network source but
can also be used for live sources.
We want to be able to implement the following features:
- buffering up to a specific amount of data, in memory, before
starting playback so that network fluctuations are minimized.
- download of the network file to a local disk with fast seeking in
the downloaded data. This is similar to the quicktime/youtube
players.
- caching of semi-live streams to a local, on disk, ringbuffer with
seeking in the cached area. This is similar to tivo-like
timeshifting.
- progress report about the buffering operations
- the possibility for the application to do more complex buffering
Some use cases:
- Stream buffering:
```
+---------+     +---------+     +-------+
| httpsrc |     | buffer  |     | demux |
|        src - sink      src - sink     ....
+---------+     +---------+     +-------+
```
In this case we are reading from a slow network source into a buffer element
(such as queue2).
The buffer element has a low and high watermark expressed in bytes. The
buffer uses the watermarks as follows:
- The buffer element will post `BUFFERING` messages until the high
watermark is hit. This instructs the application to keep the
pipeline PAUSED, which will eventually block the srcpad from
pushing while data is prerolled in the sinks.
- When the high watermark is hit, a `BUFFERING` message with 100%
will be posted, which instructs the application to continue
playback.
- When the low watermark is hit during playback, the queue will
start posting `BUFFERING` messages again, making the application
PAUSE the pipeline again until the high watermark is hit again.
This is called the rebuffering stage.
- During playback, the queue level will fluctuate between the high
and low watermarks as a way to compensate for network
irregularities.
This buffering method is usable when the demuxer operates in push mode.
Seeking in the stream requires the seek to happen in the network source.
It is mostly desirable when the total duration of the file is not known, such
as in live streaming or when efficient seeking is not possible/required.
- Incremental download
```
+---------+     +---------+     +-------+
| httpsrc |     | buffer  |     | demux |
|        src - sink      src - sink     ....
+---------+     +----|----+     +-------+
                     V
                    file
```
In this case, we know the server is streaming a fixed length file to the
client. The application can choose to download the file to disk. The buffer
element will provide a push or pull based srcpad to the demuxer to navigate in
the downloaded file.
This mode is only suitable when the client can determine the length of the
file on the server.
In this case, buffering messages will be emitted as usual when the requested
range is not within the downloaded area + buffersize. The buffering message
will also contain an indication that incremental download is being performed.
This flag can be used to let the application control the buffering in a more
intelligent way, using the `BUFFERING` query, for example.
The application can use the `BUFFERING` query to get the estimated download time
and match this time to the current/remaining playback time to control when
playback should start to have a non-interrupted playback experience.
- Timeshifting
```
+---------+     +---------+     +-------+
| httpsrc |     | buffer  |     | demux |
|        src - sink      src - sink     ....
+---------+     +----|----+     +-------+
                     V
              file-ringbuffer
```
In this mode, a fixed size ringbuffer is kept to download the server content.
This allows for seeking in the buffered data. Depending on the size of the
buffer one can seek further back in time.
This mode is suitable for all live streams.
As with the incremental download mode, buffering messages are emitted along
with an indication that timeshifting download is in progress.
- Live buffering
In live pipelines we usually introduce some latency between the capture and
the playback elements. This latency can be introduced by a queue (such as a
jitterbuffer) or by other means (in the audiosink).
Buffering messages can be emitted in those live pipelines as well and serve as
an indication to the user of the latency buffering. The application usually
does not react to these buffering messages with a state change.
## Messages
A `GST_MESSAGE_BUFFERING` must be posted on the bus when playback
temporarily stops to buffer and when buffering finishes. When the
percentage field in the `BUFFERING` message is 100, buffering is done.
Values less than 100 mean that buffering is in progress.
The `BUFFERING` message should be intercepted and acted upon by the
application. The message contains at least one field that is sufficient
for basic functionality:
* **`buffer-percent`**, G_TYPE_INT: between 0 and 100
Several more clever ways of dealing with the buffering messages can be
used when in incremental or timeshifting download mode. For this purpose
additional fields are added to the buffering message:
* **`buffering-mode`**, `GST_TYPE_BUFFERING_MODE`: `enum { "stream", "download",
"timeshift", "live" }`: Buffering mode in use. See above for an explanation of the different
alternatives. This field can be used to let the application have more control
over the buffering process.
* **`avg-in-rate`**, G_TYPE_INT: Average input buffering speed in bytes/second.
-1 is unknown. This is the average number of bytes per second that is received
on the buffering element input (sink) pads. It is a measurement of the network
speed in most cases.
* **`avg-out-rate`**, G_TYPE_INT: Average consumption speed in bytes/second. -1
is unknown. This is the average number of bytes per second that is consumed by
the downstream element of the buffering element.
* **`buffering-left`**, G_TYPE_INT64: Estimated time that buffering will take
in milliseconds. -1 is unknown. This is measured based on the avg-in-rate and
the filled level of the queue. The application can use this hint to update the
GUI about the estimated remaining time that buffering will take.
## Application
While data is buffered the pipeline should remain in the PAUSED state.
It is also possible that more data should be buffered while the pipeline
is PLAYING, in which case the pipeline should be PAUSED until the
buffering finishes.
`BUFFERING` messages can be posted while the pipeline is prerolling. The
application should not set the pipeline to PLAYING before a `BUFFERING`
message with a 100 percent value is received, which might only happen
after the pipeline prerolls.
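A minimal sketch of this logic in a bus "message" signal handler, assuming a
non-live pipeline:

``` c
static void
on_message (GstBus * bus, GstMessage * msg, gpointer user_data)
{
  GstElement *pipeline = GST_ELEMENT (user_data);
  gint percent;

  if (GST_MESSAGE_TYPE (msg) != GST_MESSAGE_BUFFERING)
    return;

  gst_message_parse_buffering (msg, &percent);

  /* pause while buffering, resume when buffering is done */
  if (percent < 100)
    gst_element_set_state (pipeline, GST_STATE_PAUSED);
  else
    gst_element_set_state (pipeline, GST_STATE_PLAYING);
}
```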
An exception is made for live pipelines. The application may not change
the state of a live pipeline when a buffering message is received.
Usually these buffering messages contain the "buffering-mode" = "live".
The buffering message can also instruct the application to switch to a
periodical `BUFFERING` query instead, so it can more precisely control the
buffering process. The application can, for example, choose not to act
on the `BUFFERING` complete message (buffer-percent = 100) to resume
playback but use the estimated download time instead, resuming playback
when it has determined that it should be able to provide uninterrupted
playback.
## Buffering Query
In addition to the `BUFFERING` messages posted by the buffering elements,
we want to be able to query the same information from the application.
We also want to be able to present the user with information about the
downloaded range in the file so that the GUI can react on it.
In addition to all the fields present in the buffering message, the
`BUFFERING` query contains the following field, which indicates the
available downloaded range in a specific format and the estimated time
to complete:
* **`busy`**, G_TYPE_BOOLEAN: if buffering was busy. This flag allows the
application to pause the pipeline by using the query only.
* **`format`**, GST_TYPE_FORMAT: the format of the "start" and "stop" values
below
* **`start`**, G_TYPE_INT64, -1 unknown: the start position of the available
data. If there are multiple ranges, this field contains the start position of
the currently downloading range.
* **`stop`**, G_TYPE_INT64, -1 unknown: the stop position of the available
data. If there are multiple ranges, this field contains the stop position of
the currently downloading range.
* **`estimated-total`**, G_TYPE_INT64: gives the estimated download time in
milliseconds. -1 unknown. When the size of the downloaded file is known, this
value will contain the latest estimate of the remaining download time of the
currently downloading range. This value is usually only filled for the
"download" buffering mode. The application can use this information to estimate
the amount of remaining time to download till the end of the file.
* **`buffering-ranges`**, G_TYPE_ARRAY of GstQueryBufferingRange: contains
optionally the downloaded areas in the format given above. One of the ranges
contains the same start/stop position as above:
``` c
typedef struct
{
  gint64 start;
  gint64 stop;
} GstQueryBufferingRange;
```
For the `download` and `timeshift` buffering-modes, the start and stop
positions specify the ranges where efficient seeking in the downloaded
media is possible. Seeking outside of these ranges might be slow or not
at all possible.
For the `stream` and `live` mode the start and stop values describe the
oldest and newest item (expressed in `format`) in the buffer.
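A sketch of the application periodically querying buffering state, using the
fields described above:

``` c
GstQuery *query;
gboolean busy;
gint percent;
gint64 estimated_total;

query = gst_query_new_buffering (GST_FORMAT_TIME);
if (gst_element_query (pipeline, query)) {
  gst_query_parse_buffering_percent (query, &busy, &percent);
  gst_query_parse_buffering_range (query, NULL, NULL, NULL,
      &estimated_total);
  /* compare estimated_total with the remaining playback time to
   * decide whether playback can resume without rebuffering */
}
gst_query_unref (query);
```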
## Defaults
Some defaults for common elements:
A GstBaseSrc with random access replies to the `BUFFERING` query with:
"buffer-percent" = 100
"buffering-mode" = "stream"
"avg-in-rate" = -1
"avg-out-rate" = -1
"buffering-left" = 0
"format" = GST_FORMAT_BYTES
"start" = 0
"stop" = the total filesize
"estimated-total" = 0
"buffering-ranges" = NULL
A GstBaseSrc in push mode replies to the `BUFFERING` query with:
"buffer-percent" = 100
"buffering-mode" = "stream"
"avg-in-rate" = -1
"avg-out-rate" = -1
"buffering-left" = 0
"format" = a valid GST_TYPE_FORMAT
"start" = current position
"stop" = current position
"estimated-total" = -1
"buffering-ranges" = NULL
## Buffering strategies
Buffering strategies are specific implementations based on the buffering
message and query described above.
Most strategies have to balance buffering time versus maximal playback
experience.
### Simple buffering
NON-live pipelines are kept in the paused state while buffering messages with
a percent < 100% are received.
This buffering strategy relies on the buffer size and low/high watermarks of
the element. It can work with a fixed size buffer in memory or on disk.
The size of the buffer is usually expressed in a fixed amount of time units
and the estimated bitrate of the upstream source is used to convert this time
to bytes.
All GStreamer applications must implement this strategy. Failure to do so
will result in starvation at the sink.
### No-rebuffer strategy
This strategy tries to buffer as much data as possible so that playback can
continue without any further rebuffering.
This strategy is initially similar to simple buffering, the difference is in
deciding on the condition to continue playback. When a 100% buffering message
has been received, the application will not yet start the playback but it will
start a periodic buffering query, which will return the estimated amount of
buffering time left. When the estimated time left is less than the remaining
playback time, playback can continue.
This strategy requires an unlimited buffer size in memory or on disk, such as
provided by elements that implement the incremental download buffering mode.
Usually, the application can choose to start playback even before the
remaining buffer time has elapsed, in order to start playback more quickly at
the expense of a possible rebuffering phase.
### Incremental rebuffering
The application implements the simple buffering strategy but with each
rebuffering phase, it increases the size of the buffer.
This strategy has quick, fixed time startup times but incrementally longer
rebuffering times if the network is slower than the media bitrate.

# Bufferpool
This document details the design of how buffers are allocated and
managed in pools.
Bufferpools increase performance by reducing allocation overhead and
improving possibilities to implement zero-copy memory transfer.
Together with the ALLOCATION query, elements can negotiate allocation
properties and bufferpools between themselves. This also allows elements
to negotiate buffer metadata between themselves.
## Requirements
- Provide a GstBufferPool base class to help the efficient
implementation of a list of reusable GstBuffer objects.
- Let upstream elements initiate the negotiation of a bufferpool and
its configuration. Allow downstream elements to provide bufferpool
properties and/or a bufferpool. This includes the following
properties:
  - have minimum and maximum amount of buffers with the option of
    preallocating buffers.
  - allocator, alignment and padding support
  - buffer metadata
  - arbitrary extra options
- Integrate with dynamic caps renegotiation.
- Notify upstream elements of new bufferpool availability. This is
important when a new element that can provide a bufferpool is
dynamically linked downstream.
## GstBufferPool
The bufferpool object manages a list of buffers with the same properties such
as size, padding and alignment.
The bufferpool has two states: active and inactive. In the inactive
state, the bufferpool can be configured with the required allocation
preferences. In the active state, buffers can be retrieved from and
returned to the pool.
The default implementation of the bufferpool is able to allocate buffers
from any allocator with arbitrary alignment and padding/prefix.
Custom implementations of the bufferpool can override the allocation and
free algorithms of the buffers from the pool. This should allow for
different allocation strategies such as using shared memory or hardware
mapped memory.
## Negotiation
After a particular media format has been negotiated between two pads (using the
CAPS event), they must agree on how to allocate buffers.
The srcpad will always take the initiative to negotiate the allocation
properties. It starts with creating a GST_QUERY_ALLOCATION with the negotiated
caps.
The srcpad can set the need-pool flag to TRUE in the query to optionally make the
peer pad allocate a bufferpool. It should only do this if it is able to use
the peer provided bufferpool.
It will then inspect the returned results and configure the returned pool or
create a new pool with the returned properties when needed.
Buffers are then allocated by the srcpad from the negotiated pool and pushed to
the peer pad as usual.
The allocation query can also return an allocator object when the buffers are of
different sizes and can't be allocated from a pool.
## Allocation query
The allocation query has the following fields:
* (in) **`caps`**, GST_TYPE_CAPS: the caps that was negotiated
* (in) **`need-pool`**, G_TYPE_BOOLEAN: if a GstBufferPool is requested
* (out) **`pool`**, G_TYPE_ARRAY of structure: an array of pool configurations:
``` c
struct {
  GstBufferPool *pool;
  guint size;
  guint min_buffers;
  guint max_buffers;
}
```
Use `gst_query_parse_nth_allocation_pool()` to get the values.
The query can contain multiple pool configurations. If need-pool
was TRUE, the pool member might contain a GstBufferPool when the
downstream element can provide one.
Size contains the size of the bufferpool's buffers and is never 0.
min_buffers and max_buffers contain the suggested min and max amount of
buffers that should be managed by the pool.
The upstream element can choose to use the provided pool or make its own
pool when none was provided or when the suggested pool was not
acceptable.
The pool can then be configured with the suggested min and max amount of
buffers or a downstream element might choose different values.
* (out) **`allocator`**, G_TYPE_ARRAY of structure: an array of allocator parameters that can be used.
``` c
struct {
GstAllocator *allocator;
GstAllocationParams params;
}
```
Use `gst_query_parse_nth_allocation_param()` to get the values.
The element performing the query can use the allocators and their
parameters to allocate memory for the downstream element.
It is also possible to configure the allocator in a provided pool.
* (out) **`metadata`**, G_TYPE_ARRAY of structure: an array of metadata params that can be accepted.
``` c
struct {
GType api;
GstStructure *params;
}
```
Use `gst_query_parse_nth_allocation_meta()` to get the values.
These metadata items can be accepted by the downstream element when
placed on buffers. There is also an arbitrary `GstStructure` associated
with the metadata that contains metadata-specific options.
Some bufferpools have options to enable metadata on the buffers
allocated by the pool.
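Putting the query fields together, a hedged sketch of what a srcpad might do
after sending the CAPS event (`caps` and `srcpad` are assumed to exist):

``` c
GstQuery *query;
GstBufferPool *pool = NULL;
guint size = 0, min = 0, max = 0;
GstStructure *config;

/* ask downstream for allocation properties, requesting a pool */
query = gst_query_new_allocation (caps, TRUE);
gst_pad_peer_query (srcpad, query);

if (gst_query_get_n_allocation_pools (query) > 0)
  gst_query_parse_nth_allocation_pool (query, 0, &pool, &size, &min, &max);
gst_query_unref (query);

if (pool == NULL)
  pool = gst_buffer_pool_new ();  /* downstream provided none */

/* in a real element, size/min/max would be derived from the caps
 * when downstream did not suggest values */
config = gst_buffer_pool_get_config (pool);
gst_buffer_pool_config_set_params (config, caps, size, min, max);
gst_buffer_pool_set_config (pool, config);

/* preallocates buffers if the pool is configured to do so */
gst_buffer_pool_set_active (pool, TRUE);
```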
## Allocating from pool
Buffers are allocated from the pool of a pad:
``` c
res = gst_buffer_pool_acquire_buffer (pool, &buffer, &params);
```
A `GstBuffer` that is allocated from the pool will always be writable (have a
refcount of 1) and it will also have its pool member point to the `GstBufferPool`
that created the buffer.
Buffers are refcounted in the usual way. When the refcount of the buffer
reaches 0, the buffer is automatically returned to the pool.
Since all the buffers allocated from the pool keep a reference to the pool,
when nothing else is holding a refcount to the pool, it will be finalized
when all the buffers from the pool are unreffed. By setting the pool to
the inactive state we can drain all buffers from the pool.
When the pool is in the inactive state, `gst_buffer_pool_acquire_buffer()` will
return `GST_FLOW_FLUSHING` immediately.
Extra parameters can be given to the `gst_buffer_pool_acquire_buffer()` method to
influence the allocation decision. `GST_BUFFER_POOL_FLAG_KEY_UNIT` and
`GST_BUFFER_POOL_FLAG_DISCONT` serve as hints.
When the bufferpool is configured with a maximum number of buffers, allocation
will block when all buffers are outstanding until a buffer is returned to the
pool. This behaviour can be changed by specifying the
`GST_BUFFER_POOL_FLAG_DONTWAIT` flag in the parameters. With this flag set,
allocation will return `GST_FLOW_EOS` when the pool is empty.
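A sketch of a non-blocking acquire; note that in the released API the flag is
spelled `GST_BUFFER_POOL_ACQUIRE_FLAG_DONTWAIT`:

``` c
GstBufferPoolAcquireParams params = { 0, };
GstBuffer *buffer;
GstFlowReturn res;

params.flags = GST_BUFFER_POOL_ACQUIRE_FLAG_DONTWAIT;
res = gst_buffer_pool_acquire_buffer (pool, &buffer, &params);
if (res == GST_FLOW_EOS) {
  /* all buffers are outstanding; try again later */
}
```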
## Renegotiation
Renegotiation of the bufferpool might need to be performed when the
configuration of the pool changes. Changes can be in the buffer size
(because of a caps change), alignment or number of
buffers.
### Downstream
When the upstream element wants to negotiate a new format, it might need
to renegotiate a new bufferpool configuration with the downstream element.
This can, for example, happen when the buffer size changes.
We can not just reconfigure the existing bufferpool because there might
still be outstanding buffers from the pool in the pipeline. Therefore we
need to create a new bufferpool for the new configuration while we let the
old pool drain.
Implementations can choose to reuse the same bufferpool object and wait for
the drain to finish before reconfiguring the pool.
The element that wants to renegotiate a new bufferpool uses exactly the same
algorithm as when it first started. It will negotiate caps first then use the
ALLOCATION query to get and configure the new pool.
### Upstream
When a downstream element wants to negotiate a new format, it will send a
RECONFIGURE event upstream. This instructs upstream to renegotiate both
the format and the bufferpool when needed.
A pipeline reconfiguration happens when new elements are added or removed from
the pipeline or when the topology of the pipeline changes. Pipeline
reconfiguration also triggers possible renegotiation of the bufferpool and
caps.
A RECONFIGURE event tags each pad it travels on as needing reconfiguration.
The next buffer allocation will then require the renegotiation or
reconfiguration of a pool.
## Shutting down
In push mode, a source pad is responsible for setting the pool to the
inactive state when streaming stops. The inactive state will unblock any pending
allocations so that the element can shut down.
In pull mode, the sink element should set the pool to the inactive state when
shutting down so that the peer `_get_range()` function can unblock.
In the inactive state, all the buffers that are returned to the pool will
automatically be freed by the pool and new allocations will fail.
## Use cases
### `videotestsrc ! xvimagesink`
* Before videotestsrc can output a buffer, it needs to negotiate caps and
a bufferpool with the downstream peer pad.
* First it will negotiate a suitable format with downstream according to the
normal rules. It will send a CAPS event downstream with the negotiated
configuration.
* Then it does an ALLOCATION query. It will use the returned bufferpool or
configures its own bufferpool with the returned parameters. The bufferpool is
initially in the inactive state.
* The ALLOCATION query lists the desired configuration of the downstream
xvimagesink, which can have specific alignment and/or min/max amount of
buffers.
* videotestsrc updates the configuration of the bufferpool: it will likely set
the min buffers to 1 and the size of the desired buffers. It then updates the
bufferpool configuration with the new properties.
* When the configuration is successfully updated, videotestsrc sets the
bufferpool to the active state. This preallocates the buffers in the pool (if
needed). This operation can fail when there is not enough memory available.
Since the bufferpool is provided by xvimagesink, it will allocate buffers
backed by an XvImage and pointing to shared memory with the X server.
* If the bufferpool is successfully activated, videotestsrc can acquire
a buffer from the pool, fill in the data and push it out to xvimagesink.
* xvimagesink can know that the buffer originated from its pool by following
the pool member.
* When shutting down, videotestsrc will set the pool to the inactive state,
this will cause further allocations to fail and currently allocated buffers to
be freed. videotestsrc will then free the pool and stop streaming.
### `videotestsrc ! queue ! myvideosink`
* In this second use case we have a videosink that can at most allocate 3 video
buffers.
* Again videotestsrc will have to negotiate a bufferpool with the peer element.
For this it will perform the ALLOCATION query which queue will proxy to its
downstream peer element.
* The bufferpool returned from myvideosink will have a max_buffers set to 3.
queue and videotestsrc can operate with this upper limit because none of those
elements require more than that amount of buffers for temporary storage.
* myvideosink's bufferpool will then be configured with the size of the buffers
for the negotiated format and according to the padding and alignment rules.
When videotestsrc sets the pool to active, the 3 video buffers will be
preallocated in the pool.
* videotestsrc acquires a buffer from the configured pool on its srcpad and
pushes this into the queue. When videotestsrc has acquired and pushed 3 frames,
the next call to `gst_buffer_pool_acquire_buffer()` will block (assuming the
`GST_BUFFER_POOL_FLAG_DONTWAIT` flag is not specified).
* When the queue has pushed out a buffer and the sink has rendered it, the
refcount of the buffer reaches 0 and the buffer is recycled in the pool. This
will wake up the videotestsrc that was blocked, waiting for more buffers and
will make it produce the next buffer.
* In this setup, there are at most 3 buffers active in the pipeline and the
videotestsrc is rate limited by the rate at which buffers are recycled in the
bufferpool.
* When shutting down, videotestsrc will first set the bufferpool on the srcpad
to inactive. This causes any pending (blocked) acquire to return with
a FLUSHING result and causes the streaming thread to pause.
### `.. ! myvideodecoder ! queue ! fakesink`
* In this case, the myvideodecoder requires buffers to be aligned to 128 bytes
and padded with 4096 bytes. The pipeline starts out with the decoder linked to
a fakesink but we will then dynamically change the sink to one that can provide
a bufferpool.
* When myvideodecoder negotiates the size with the downstream fakesink element,
it will receive a NULL bufferpool because fakesink does not provide
a bufferpool. It will then select its own custom bufferpool to start the data
transfer.
* At some point we block the queue srcpad, unlink the queue from the fakesink,
link a new sink and set the new sink to the PLAYING state. Linking the new sink
would automatically send a RECONFIGURE event upstream and, through queue,
inform myvideodecoder that it should renegotiate its bufferpool because
downstream has been reconfigured.
* Before pushing the next buffer, myvideodecoder has to renegotiate a new
bufferpool. To do this, it performs the usual bufferpool negotiation algorithm.
If it can obtain and configure a new bufferpool from downstream, it sets its
own (old) pool to inactive and unrefs it. This will eventually drain and unref
the old bufferpool.
* The new bufferpool is set as the new bufferpool for the srcpad and sinkpad of
the queue and set to the active state.
### `.. ! myvideodecoder ! queue ! myvideosink`
* myvideodecoder has negotiated a bufferpool with the downstream myvideosink to
handle buffers of size 320x240. It has now detected a change in the video
format and needs to renegotiate to a resolution of 640x480. This requires it to
negotiate a new bufferpool with a larger buffer size.
* When myvideodecoder needs to get the bigger buffer, it starts the negotiation
of a new bufferpool. It queries a bufferpool from downstream, reconfigures it
with the new configuration (which includes the bigger buffer size) and sets the
bufferpool to active. The old pool is inactivated and unreffed, which causes
the old format to drain.
* It then uses the new bufferpool for allocating new buffers of the new
dimension.
* If at some point, the decoder wants to switch to a lower resolution again, it
can choose to use the current pool (which has buffers that are larger than the
required size) or it can choose to renegotiate a new bufferpool.
### `.. ! myvideodecoder ! videoscale ! myvideosink`
* myvideosink is providing a bufferpool for upstream elements and wants to
change the resolution.
* myvideosink sends a RECONFIGURE event upstream to notify upstream that a new
format is desirable. Upstream elements try to negotiate a new format and
bufferpool before pushing out a new buffer. The old bufferpools are drained in
the regular way.

# Caps
Caps are lightweight refcounted objects describing media types. They are
composed of an array of GstStructures plus, optionally, a
GstCapsFeatures set for each GstStructure.
Caps are exposed on GstPadTemplates to describe all possible types a
given pad can handle. They are also stored in the registry along with a
description of the element.
Caps are exposed on the element pads via CAPS and `ACCEPT_CAPS` queries.
These queries describe the possible types that the pad can handle or
produce ([negotiation](design/negotiation.md)).
Various methods exist to work with the media types such as subtracting
or intersecting.
## Operations
### Fixating
Caps are fixed if they only contain a single structure and this
structure is fixed. A structure is fixed if none of the fields of the
structure is an unfixed type, for example a range, list or array.
When fixating caps, only the first structure is kept, as the order of
structures is meant to express the preference for the different
structures. Afterwards, each unfixed field of this structure is set by
the element or pad implementation to the value that makes most sense for
the media format, and then every remaining unfixed field is set to an
arbitrary value that is a subset of that field's possible values.
EMPTY caps are fixed caps and ANY caps are not. Caps with ANY caps
features are not fixed.
### Subset
One caps "A" is a subset of another caps "B" if for each structure in
"A" there exists a structure in "B" that is a superset of the structure
in "A".
A structure "a" is the subset of a structure "b" if it has the same
structure name, the same caps features and each field in "b" exists in
"a" and the value of the field in "a" is a subset of the value of the
field in "b". "a" can have additional fields that are not in "b".
EMPTY caps are a subset of every other caps. Every caps are a subset of
ANY caps.
### Equality
Caps "A" and "B" are equal if "A" is a subset of "B" and "B" is a subset
of "A". This means that both caps are expressing the same possibilities
but their structures can still be different if they contain unfixed
fields.
### Intersection
The intersection of caps "A" and caps "B" are the caps that contain the
intersection of all their structures with each other.
The intersection of structure "a" and structure "b" is empty if their
structure name or their caps features are not equal, or if "a" and "b"
contain the same field but the intersection of both field values is
empty. If one structure contains a field that is not existing in the
other structure it will be copied over to the intersection with the same
value.
The intersection with ANY caps is always the other caps and the
intersection with EMPTY caps is always EMPTY.
### Union
The union of caps "A" and caps "B" are the caps that contain the union
of all their structures with each other.
The union of structure "a" and structure "b" are the two structures "a"
and "b" if the structure names or caps features are not equal.
Otherwise, the union is the structure that contains the union of each
fields value. If a field is only in one of the two structures it is not
contained in the union.
The union with ANY caps is always ANY and the union with EMPTY caps is
always the other caps.
### Subtraction
The subtraction of caps "A" from caps "B" is the most generic subset of
"B" that has an empty intersection with "A" but only contains structures
with names and caps features that exist in "B".
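To make these operations concrete, here is a small sketch using the public
GstCaps API; the caps strings are made up for the example.

``` c
#include <gst/gst.h>

int
main (int argc, char *argv[])
{
  gst_init (&argc, &argv);

  GstCaps *a = gst_caps_from_string ("video/x-raw, width=(int)[16,4096]");
  GstCaps *b =
      gst_caps_from_string ("video/x-raw, width=(int)640, height=(int)480");

  /* "b" is a subset of "a": same name, 640 lies inside [16,4096], and the
   * extra height field is allowed in a subset */
  g_print ("subset: %d\n", gst_caps_is_subset (b, a));

  /* the intersection contains the possibilities common to both caps */
  GstCaps *inter = gst_caps_intersect (a, b);
  gchar *str = gst_caps_to_string (inter);
  g_print ("intersection: %s\n", str);
  g_free (str);

  /* fixating keeps the first structure and picks a value for the range */
  GstCaps *fixed = gst_caps_fixate (gst_caps_ref (a));
  g_print ("fixed: %d\n", gst_caps_is_fixed (fixed));

  gst_caps_unref (fixed);
  gst_caps_unref (inter);
  gst_caps_unref (b);
  gst_caps_unref (a);
  return 0;
}
```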
## Basic Rules
### Semantics of caps and their usage
A caps can contain multiple structures, in which case any of the
structures would be acceptable. The structures are in the preferred
order of the creator of the caps, with the preferred structure being
first and during negotiation of caps this order should be considered to
select the most optimal structure.
Each of these structures has a name that specifies the media type, e.g.
"video/x-theora" to specify Theora video. Additional fields in the
structure add additional constraints and/or information about the media
type, like the width and height of a video frame, or the codec profile
that is used. These fields can be non-fixed (e.g. ranges) for non-fixed
caps but must be fixated to a fixed value during negotiation. If a field
is included in the caps returned by a pad via the CAPS query, it imposes
an additional constraint during negotiation. The caps in the end must
have this field with a value that is a subset of the non-fixed value.
Additional fields that are added in the negotiated caps give additional
information about the media but are treated as optional. Information
that can change for every buffer and is not relevant during negotiation
must not be stored inside the caps.
For each of the structures in caps it is possible to store caps
features. The caps features are expressing additional requirements for a
specific structure, and only structures with the same name *and* equal
caps features are considered compatible. Caps features can be used to
require a specific memory representation or a specific meta to be set on
buffers, for example a pad could require for a specific structure that
it is passed EGLImage memory or buffers with the video meta. If no caps
features are provided for a structure, it is assumed that system memory
is required unless later negotiation steps (e.g. the ALLOCATION query)
detect that something else can be used. The special ANY caps features
can be used to specify that any caps feature would be accepted, for
example if the buffer memory is not touched at all.
### Compatibility of caps
Pads can be linked when the caps of both pads are compatible. This is
the case when their intersection is not empty.
For checking whether a pad actually supports some fixed caps, an
intersection is not enough. Instead, the fixed caps must be at least a
subset of the pad's caps, and pads can introduce additional constraints
which would be checked in the `ACCEPT_CAPS` query handler.
Data flow can only happen after pads have decided on common fixed caps.
These caps are distributed to both pads with the CAPS event.
# Clocks
The GstClock returns a monotonically increasing time with the method
`_get_time()`. Its accuracy and base time depends on the specific clock
implementation but time is always expressed in nanoseconds. Since the
baseline of the clock is undefined, the clock time returned is not
meaningful in itself, what matters are the deltas between two clock
times. The time reported by the clock is called the `absolute_time`.
## Clock Selection
To synchronize the different elements, the GstPipeline is responsible
for selecting and distributing a global GstClock for all the elements in
it.
This selection happens whenever the pipeline goes to PLAYING. Whenever
an element is added/removed from the pipeline, this selection will be
redone in the next state change to PLAYING. Adding an element that can
provide a clock will post a `GST_MESSAGE_CLOCK_PROVIDE` message on the
bus to inform parent bins of the fact that a clock recalculation is
needed.
When a clock is selected, a `NEW_CLOCK` message is posted on the bus
signaling the clock to the application.
When the element that provided the clock is removed from the pipeline, a
`CLOCK_LOST` message is posted. The application must then set the
pipeline to PAUSED and PLAYING again in order to let the pipeline select
a new clock and distribute a new base time.
The clock selection is performed as part of the state change from PAUSED
to PLAYING and is described in [states](design/states.md).
## Clock features
The clock supports periodic and single shot clock notifications both
synchronous and asynchronous.
One first needs to create a GstClockID for the periodic or single shot
notification using `_clock_new_single_shot_id()` or
`_clock_new_periodic_id()`.
To perform a blocking wait for the specific time of the GstClockID use
`gst_clock_id_wait()`. To receive a callback when the specific time is
reached in the clock use `gst_clock_id_wait_async()`. Both these calls
can be interrupted with the `gst_clock_id_unschedule()` call. If the
blocking wait is unscheduled, a value of `GST_CLOCK_UNSCHEDULED` is
returned.
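For example, a blocking single-shot wait 100 milliseconds into the future on
the system clock could be sketched as follows (error and NULL handling
omitted).

``` c
GstClock *clock = gst_system_clock_obtain ();
GstClockTime now = gst_clock_get_time (clock);

/* single-shot notification 100ms from now */
GstClockID id = gst_clock_new_single_shot_id (clock, now + 100 * GST_MSECOND);

/* blocks until the time is reached or the id is unscheduled */
GstClockReturn ret = gst_clock_id_wait (id, NULL);
if (ret == GST_CLOCK_UNSCHEDULED)
  g_print ("wait was unscheduled\n");

/* waits do not consume the reference, unref it ourselves */
gst_clock_id_unref (id);
gst_object_unref (clock);
```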
The async callbacks can happen from any thread, either provided by the
core or from a streaming thread. The application should be prepared for
this.
A GstClockID that has been unscheduled cannot be used again for any wait
operation.
It is possible to perform a blocking wait on the same ID from multiple
threads. However, registering the same ID for multiple async
notifications is not possible, the callback will only be called once.
None of the wait operations unref the GstClockID, the owner is
responsible for unreffing the ids itself. This holds true for both
periodic and single shot notifications. The reason being that the owner
of the ClockID has to keep a handle to the ID to unblock the wait on
FLUSHING events or state changes and if we unref it automatically, the
handle might be invalid.
These clock operations do not operate on the stream time, so the
callbacks will also occur when not in PLAYING state as if the clock just
keeps on running. Some clocks however do not progress when the element
that provided the clock is not PLAYING.
## Clock implementations
The GStreamer core provides a GstSystemClock based on the system time.
Asynchronous callbacks are scheduled from an internal thread.
Clock implementers are encouraged to subclass this systemclock as it
implements the async notification.
Subclasses can however override all of the important methods for sync
and async notifications to implement their own callback methods or
blocking wait operations.
# Context
GstContext is a container object, containing a type string and a generic
GstStructure. It is used to store and propagate context information in a
pipeline, like device handles, display server connections and other
information that should be shared between multiple elements in a
pipeline.
For sharing context objects and distributing them between application
and elements in a pipeline, there are downstream queries, upstream
queries, messages and functions to set a context on a complete pipeline.
## Context types
Context type names should be unique and be put in appropriate
namespaces, to prevent name conflicts, e.g. "gst.egl.EGLDisplay". Only
one specific type is allowed per context type name.
## Elements
Elements that need a specific context for their operation would do the
following steps until one succeeds:
1) Check if the element already has a context of the specific type,
i.e. it was previously set via gst_element_set_context().
2) Query downstream with GST_QUERY_CONTEXT for the context and check if
downstream already has a context of the specific type
3) Query upstream with GST_QUERY_CONTEXT for the context and check if
upstream already has a context of the specific type
4) Post a GST_MESSAGE_NEED_CONTEXT message on the bus with the required
context types and afterwards check if a usable context was set now
as in 1). The message could be handled by the parent bins of the
element and the application.
5) Create a context by itself and post a GST_MESSAGE_HAVE_CONTEXT message
on the bus.
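A condensed sketch of steps 2) to 5) for a hypothetical element that needs a
"gst.egl.EGLDisplay" context; the element and pad variables are assumptions
for the example.

``` c
/* Illustrative only: `element`, `srcpad` and `sinkpad` stand in for the
 * element instance and its pads. */
static GstContext *
lookup_context (GstElement * element, GstPad * srcpad, GstPad * sinkpad)
{
  GstContext *ctx = NULL;
  GstQuery *query;

  /* 2) + 3): query downstream first, then upstream */
  query = gst_query_new_context ("gst.egl.EGLDisplay");
  if (gst_pad_peer_query (srcpad, query) ||
      gst_pad_peer_query (sinkpad, query)) {
    gst_query_parse_context (query, &ctx);      /* transfer none */
    if (ctx != NULL)
      gst_context_ref (ctx);
  }
  gst_query_unref (query);
  if (ctx != NULL)
    return ctx;

  /* 4): ask parent bins and the application via the bus, then re-check
   * whether gst_element_set_context() was called in the meantime */
  gst_element_post_message (element,
      gst_message_new_need_context (GST_OBJECT_CAST (element),
          "gst.egl.EGLDisplay"));

  /* 5): create one ourselves and announce it; filling in the handle via
   * gst_context_writable_structure() is left out here */
  ctx = gst_context_new ("gst.egl.EGLDisplay", TRUE);
  gst_element_post_message (element,
      gst_message_new_have_context (GST_OBJECT_CAST (element),
          gst_context_ref (ctx)));

  return ctx;
}
```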
Bins will propagate any context that is set on them to their child
elements via gst\_element\_set\_context(), even to elements added after
a given context has been set.
Bins can handle the GST\_MESSAGE\_NEED\_CONTEXT message, can filter both
messages and can also set different contexts for different pipeline
parts.
## Applications
Applications can set a specific context on a pipeline or elements inside
a pipeline with gst\_element\_set\_context().
If an element inside the pipeline needs a specific context, it will post
a GST\_MESSAGE\_NEED\_CONTEXT message on the bus. The application can
now create a context of the requested type or pass an already existing
context to the element (or to the complete pipeline).
Whenever an element creates a context internally it will post a
GST\_MESSAGE\_HAVE\_CONTEXT message on the bus. Bins will cache these
contexts and pass them to any future element that requests them.
# Controller
The controller subsystem allows automating element property changes. All
parameter changes are time based and elements request property updates
at processing time.
## Element view
Elements don't need to do much. They need to:

- mark object properties that can be changed while processing with
GST\_PARAM\_CONTROLLABLE
- call gst\_object\_sync\_values (self, timestamp) in the processing
function before accessing the parameters.
All ordered property types can be automated (int, double, boolean,
enum). Other property types can also be automated by using special
control bindings. One can e.g. write a control-binding that updates a
text property based on timestamps.
## Application view
Applications need to setup the property automation. For that they need
to create a GstControlSource and attach it to a property using
GstControlBinding. Various control-sources and control-bindings exist.
All control sources produce control value sequences in the form of
gdouble values. The control bindings map them to the value range and
type of the bound property.
One control-source can be attached to one or more properties at the same
time. If it is attached multiple times, then each control-binding will
scale and convert the control values to the target property type and
range.
One can create complex control-curves by using a
GstInterpolationControlSource. This allows the classic user editable
control-curve (often seen in audio/video editors). Another way is to use
computed control curves. GstLFOControlSource can generate various
repetitive signals. Those can be made more complex by chaining the
control sources. One can attach another control-source to e.g. modulate
the frequency of the first GstLFOControlSource.
In most cases GstControlBindingDirect will be the binding to be used.
Other control bindings are there to handle special cases, such as having
1-4 control-sources and combining their values into a single guint to
control an rgba-color property.
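As an illustration of the application view, the following sketch attaches a
linear interpolation control curve to the `volume` property of a volume
element; the element variable and the timestamps are assumptions for the
example.

``` c
#include <gst/gst.h>
#include <gst/controller/gstinterpolationcontrolsource.h>
#include <gst/controller/gstdirectcontrolbinding.h>

static void
setup_volume_fade (GstElement * volume)
{
  GstControlSource *cs = gst_interpolation_control_source_new ();
  g_object_set (cs, "mode", GST_INTERPOLATION_MODE_LINEAR, NULL);

  /* map the 0.0-1.0 control values onto the range of "volume" */
  gst_object_add_control_binding (GST_OBJECT_CAST (volume),
      gst_direct_control_binding_new (GST_OBJECT_CAST (volume),
          "volume", cs));

  /* fade in over the first five seconds */
  GstTimedValueControlSource *tv = GST_TIMED_VALUE_CONTROL_SOURCE (cs);
  gst_timed_value_control_source_set (tv, 0 * GST_SECOND, 0.0);
  gst_timed_value_control_source_set (tv, 5 * GST_SECOND, 1.0);

  gst_object_unref (cs);
}
```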
## TODO
* control-source value ranges
  - control sources should ideally emit values between \[0.0 and 1.0\]
  - right now lfo-control-sources emit values between \[-1.0 and 1.0\]
  - we can make control-sources announce that or fix it in a
    lfo2-control-source
* ranged-control-binding
  - it might be a nice thing to have a control-binding that has scale and
    offset properties
  - when attaching a control-source to e.g. volume, one needs to be aware
    that the values go from \[0.0 to 4.0\]
  - we can also have a "mapping-mode"={AS\_IS, TRANSFORMED} on
    direct-control-binding and two extra properties that are used in
    TRANSFORMED mode
* control-setup descriptions
  - it would be nice to have a way to parse a textual control-setup
    description. This could be used in gst-launch and in presets. It needs
    to be complemented with a formatter (for the preset storage or e.g. for
    debug logging).
  - this could be function-style:
    direct(control-source=lfo(waveform=*sine*,offset=0.5)) or gst-launch
    style (looks weird): lfo wave=sine offset=0.5 \! direct .control-source
# Documentation conventions
Due to the potential for exponential growth, several abbreviating
conventions will be used throughout this documentation. These
conventions have grown primarily from extremely in-depth discussions of
the architecture in IRC. This has verified the safety of these
conventions, if used properly. There are no known namespace conflicts as
long as context is rigorously observed.
## Object classes
Since everything starts with Gst, we will generally refer to objects by
the shorter name, i.e. Element or Pad. These names will always have
their first letter capitalized.
## Function names
Within the context of a given object, functions defined in that object's
header and/or source file will have their object-specific prefix
stripped. For instance, gst\_element\_add\_pad() would be referred to as
simply \_add\_pad(). Note that the trailing parentheses should always be
present, but sometimes may not be. A prefixed underscore (\_) will
always tell you it's a function, however, regardless of the presence or
absence of the trailing parentheses.
## defines and enums
Values and macros defined as enums and preprocessor macros will be
referred to in all capitals, as per their definition. This includes
object flags and element states, as well as general enums. Examples are
the states NULL, READY, PLAYING, and PAUSED; the element flag
LOCKED\_STATE; and the state change return values SUCCESS, FAILURE, and
ASYNC. Where there is a prefix, as in the element flags, it is usually
dropped and implied. Note however that element flags should be
cross-checked with the header, as there are currently two conventions in
use: with and without FLAGS in the middle.
## Drawing conventions
When drawing pictures the following conventions apply:
### objects
Objects are drawn with a box like:
```
+------+
|      |
+------+
```
### pointers
a pointer to an object.
```
+-----+
*--->| |
+-----+
```
an invalid pointer, this is a pointer that should not be used.
```
*-//->
```
### elements
```
+----------+
| name |
sink src
+----------+
```
### pad links
```
-----+    +---
     |    |
    src--sink
-----+    +---
```
# Element Klass definition
## Purpose
Applications should be able to retrieve elements from the registry of
existing elements based on specific capabilities or features of the
element.
A playback application might want to retrieve all the elements that can
be used for visualisation, for example, or a video editor might want to
select all video effect filters.
The topic of defining the klass of elements should be based on use
cases.
A list of classes that are used in an installation can be generated
using:

    gst-inspect-1.0 -a | grep -ho "Class:.*" | cut -c8- | sed "s/\//\n/g" | sort | uniq
## Proposal
The GstElementDetails contains a field named klass that is a pointer to
a string describing the element type.
In this document we describe the format and contents of the string.
Elements should adhere to this specification although that is not
enforced to allow for wild (application specific) customisation.
### string format

    <keyword>['/'<keyword>]*
The string consists of an _unordered_ list of keywords separated with a '/'
character. While the / suggests a hierarchy, this is _not_ the case.
### keyword categories
- functional
Categories are based on the _intended usage_ of the element. Some elements
might have other side-effects (especially for filters/effects). The purpose
is to list enough keywords so that applications can do meaningful filtering,
not to completely describe the functionality, which is expressed in caps etc.
- Source : produces data
- Sink : consumes data
- Filter : filters/transforms data, no modification on the data is
intended (although it might be unavoidable). The filter can
decide on input and output caps independently of the stream
contents (GstBaseTransform).
- Effect : applies an effect to some data, changes to data are
intended. Examples are colorbalance, volume. These elements can
also be implemented with GstBaseTransform.
- Demuxer : splits audio, video, … from a stream
- Muxer : interleave audio, video, … into one stream, this is like
mixing but without losing or degrading each separate input
stream. The reverse operation is possible with a Demuxer that
reproduces the exact same input streams.
- Decoder : decodes encoded data into a raw format, there is
typically no relation between input caps and output caps. The
output caps are defined in the stream data. This separates the
Decoder from the Filter and Effect.
- Encoder : encodes raw data into an encoded format.
- Mixer : combine audio, video, .. this is like muxing but with
applying some algorithm so that the individual streams are not
extractable anymore, there is therefore no reverse operation to
mixing. (audio mixer, video mixer, …)
- Converter : convert audio into video, text to audio, … The
converter typically works on raw types only. The source media
type is listed first.
- Analyzer : reports about the stream contents.
- Control : controls some aspect of a hardware device
- Extracter : extracts tags/headers from a stream
- Formatter : adds tags/headers to a stream
- Connector : allows for new connections in the pipeline. (tee, …)
- …
- Based on media type
Purpose is to make a selection for elements operating on the different
types of media. An audio application must be able to filter out the
elements operating on audio, for example.
- Audio : operates on audio data
- Video : operates on video data
- Image : operates on image data. Usually this media type can also
be used to make a video stream in which case it is added
together with the Video media type.
- Text : operates on text data
- Metadata : operates on metadata
- …
- Extra features
The purpose is to further specialize the element, mostly for
application specific needs.
- Network : element is used in networked situations
- Protocol : implements some protocol (RTSP, HTTP, …)
- Payloader : encapsulate as payload (RTP, RDT,.. )
- Depayloader : strip a payload (RTP, RDT,.. )
- RTP : intended to be used in RTP applications
- Device : operates on some hardware device (disk, network, audio
card, video card, usb, …)
- Visualisation : intended to be used for audio visualisation
- Debug : intended usage is more for debugging purposes.
- Categories found, but not yet in one of the above lists
- Bin : playbin, decodebin, bin, pipeline
- Codec : lots of decoders, encoder, demuxers should be removed?
- Generic : should be removed?
- File : like network, should go to Extra?
- Editor : gnonlin, textoverlays
- DVD, GDP, LADSPA, Parser, Player, Subtitle, Testing, …
### suggested order

    <functional>[/<media type>]*[/<extra...>]*

### examples
apedemux : Extracter/Metadata
audiotestsrc : Source/Audio
autoaudiosink : Sink/Audio/Device
cairotimeoverlay : Mixer/Video/Text
dvdec : Decoder/Video
dvdemux : Demuxer
goom : Converter/Audio/Video
id3demux : Extracter/Metadata
udpsrc : Source/Network/Protocol/Device
videomixer : Mixer/Video
videoconvert : Filter/Video (intended use to convert video with as little
visible change as possible)
vertigotv : Effect/Video (intended use is to change the video)
volume : Effect/Audio (intended use is to change the audio data)
vorbisdec : Decoder/Audio
vorbisenc : Encoder/Audio
oggmux : Muxer
adder : Mixer/Audio
videobox : Effect/Video
alsamixer : Control/Audio/Device
audioconvert : Filter/Audio
audioresample : Filter/Audio
xvimagesink : Sink/Video/Device
navseek : Filter/Debug
decodebin : Decoder/Demuxer
level : Filter/Analyzer/Audio
tee : Connector/Debug
### open issues:
- how to differentiate physical devices from logical ones?
  autoaudiosink : Sink/Audio/Device
  alsasink : Sink/Audio/Device
## Use cases
- get a list of all elements implementing a video effect (pitivi):
klass.contains (Effect & Video)
- get list of muxers (pitivi):
klass.contains (Muxer)
- get list of video encoders (pitivi):
klass.contains (Encoder & video)
- Get a list of all audio/video visualisations (totem):
klass.contains (Visualisation)
- Get a list of all decoders/demuxer/metadata parsers/vis (playbin):
klass.contains (Visualisation | Demuxer | Decoder | (Extractor & Metadata))
- Get a list of elements that can capture from an audio device
(gst-properties):
klass.contains (Source & Audio & Device)
- filters out audiotestsrc, since it is not a device
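The first use case could, for instance, be approximated by matching the klass
metadata string in the registry; this is only a sketch with naive substring
matching.

``` c
#include <gst/gst.h>
#include <string.h>

/* list all element factories whose klass string matches "Effect" and
 * "Video", i.e. klass.contains (Effect & Video) */
static GList *
find_video_effects (void)
{
  GList *features, *l, *result = NULL;

  features = gst_registry_get_feature_list (gst_registry_get (),
      GST_TYPE_ELEMENT_FACTORY);
  for (l = features; l != NULL; l = l->next) {
    GstElementFactory *factory = GST_ELEMENT_FACTORY (l->data);
    const gchar *klass = gst_element_factory_get_metadata (factory,
        GST_ELEMENT_METADATA_KLASS);

    if (klass != NULL && strstr (klass, "Effect") != NULL &&
        strstr (klass, "Video") != NULL)
      result = g_list_prepend (result, gst_object_ref (factory));
  }
  gst_plugin_feature_list_free (features);

  return result;
}
```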
# Metadata
This draft recaps the current metadata handling in GStreamer and
proposes some additions.
## Supported Metadata standards
The paragraphs below list supported native metadata standards sorted by
type and then in alphabetical order. Some standards have been extended
to support additional metadata. GStreamer already supports all of those
to some extent. This is shown in the table below as either \[--\],
\[r-\], \[-w\] or \[rw\] depending on read/write support (08.Feb.2010).
### Audio
- mp3
* ID3v2: \[rw]
* http://www.id3.org/Developer_Information
* ID3v1: [rw]
* http://www.id3.org/ID3v1
* XMP: \[--] (inside ID3v2 PRIV tag of owner XMP)
* http://www.adobe.com/devnet/xmp/
- ogg/vorbis
* vorbiscomment: \[rw]
* http://www.xiph.org/vorbis/doc/v-comment.html
* http://wiki.xiph.org/VorbisComment
- wav
* LIST/INFO chunk: \[rw]
* http://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/RIFF.html#Info
* http://www.kk.iij4u.or.jp/~kondo/wave/mpidata.txt
* XMP: \[--]
* http://www.adobe.com/devnet/xmp/
### Video
- 3gp
* {moov,trak}.udta: \[rw]
* http://www.3gpp.org/ftp/Specs/html-info/26244.htm
* ID3V2: \[--]
* http://www.3gpp.org/ftp/Specs/html-info/26244.htm
* http://www.mp4ra.org/specs.html#id3v2
- avi
* LIST/INFO chunk: \[rw]
* http://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/RIFF.html#Info
* http://www.kk.iij4u.or.jp/~kondo/wave/mpidata.txt
* XMP: \[--] (inside "_PMX" chunk)
* http://www.adobe.com/devnet/xmp/
- asf
* ??:
* XMP: \[--]
* http://www.adobe.com/devnet/xmp/
- flv \[--]
* XMP: (inside onXMPData script data tag)
* http://www.adobe.com/devnet/xmp/
- mkv
* tags: \[rw]
* http://www.matroska.org/technical/specs/tagging/index.html
- mov
* XMP: \[--] (inside moov.udta.XMP_ box)
* http://www.adobe.com/devnet/xmp/
- mp4
* {moov,trak}.udta: \[rw]
* http://standards.iso.org/ittf/PubliclyAvailableStandards/c051533_ISO_IEC_14496-12_2008.zip
* moov.udta.meta.ilst: \[rw]
* http://atomicparsley.sourceforge.net/
* http://atomicparsley.sourceforge.net/mpeg-4files.html
* ID3v2: \[--]
* http://www.mp4ra.org/specs.html#id3v2
* XMP: \[--] (inside UUID box)
* http://www.adobe.com/devnet/xmp/
- mxf
* ??
### Images
- gif
* XMP: \[--]
* http://www.adobe.com/devnet/xmp/
- jpg
* jif: \[rw] (only comments)
* EXIF: \[rw] (via metadata plugin)
* http://www.exif.org/specifications.html
* IPTC: \[rw] (via metadata plugin)
* http://www.iptc.org/IPTC4XMP/
* XMP: \[rw] (via metadata plugin)
* http://www.adobe.com/devnet/xmp/
- png
* XMP: \[--]
* http://www.adobe.com/devnet/xmp/
### further Links:
http://age.hobba.nl/audio/tag_frame_reference.html
http://wiki.creativecommons.org/Tracker_CC_Indexing
## Current Metadata handling
When reading files, demuxers or parsers extract the metadata. It will be
sent as a GST\_EVENT\_TAG to downstream elements. When a sink element
receives a tag event, it will post a GST\_MESSAGE\_TAG message on the
bus with the contents of the tag event.
Elements receiving GST\_EVENT\_TAG events can mangle them, mux them into
the buffers they send or just pass them through. Usually it is muxers
that will format the tag data into the form required by the format they mux.
Such elements would also implement the GstTagSetter interface to receive
tags from the application.
```
+----------+
| demux |
sink src --> GstEvent(tag) over GstPad to downstream element
+----------+
method call over GstTagSetter interface from application
|
v
+----------+
| mux |
GstEvent(tag) over GstPad from upstream element --> sink src
+----------+
```
The data used in all those interfaces is GstTagList. It is based on a
GstStructure which is like a hash table with differently typed entries.
The key is always a string/GQuark. Many keys are predefined in GStreamer
core. More keys are defined in gst-plugins-base/gst-libs/gst/tag/tag.h.
If elements and applications use predefined types, it is possible to
transcode a file from one format into another while preserving all known
and mapped metadata.
## Issues
### Unknown/Unmapped metadata
Right now GStreamer can lose metadata when transcoding or remuxing
content. This can happen as we don't map all metadata fields to generic
ones.
We should probably also add the whole metadata blob to the GstTagList.
We would need a GST\_TAG\_SYSTEM\_xxx define (e.g.
GST\_TAG\_SYSTEM\_ID3V2) for each standard. The content is not printable
and should be treated as binary if not known. The tag is not mergeable -
call gst\_tag\_register() with GstTagMergeFunc=NULL. Also the tag data
is only useful for upstream elements, not for the application.
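Registering such a system tag could look like the sketch below; the
GST\_TAG\_SYSTEM\_ID3V2 name, nick and blurb are made up, as the define does
not exist yet.

``` c
/* hypothetical define as proposed above */
#define GST_TAG_SYSTEM_ID3V2 "system-id3v2"

static void
register_id3v2_system_tag (void)
{
  /* GstTagMergeFunc == NULL makes the tag non-mergeable; the blob is
   * treated as opaque binary data */
  gst_tag_register (GST_TAG_SYSTEM_ID3V2, GST_TAG_FLAG_META,
      GST_TYPE_BUFFER, "ID3v2 blob", "complete ID3v2 tag as binary data",
      NULL);
}
```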
A muxer would first scan a taglist for known system tags. Unknown tags
are ignored, as they are now. It would first populate its own metadata
store with the entries from the system tag and then update the entries
with the data in the normal tags.
Below is an initial list of tag systems:

- ID3V1 - GST\_TAG\_SYSTEM\_ID3V1
- ID3V2 - GST\_TAG\_SYSTEM\_ID3V2
- RIFF\_INFO - GST\_TAG\_SYSTEM\_RIFF\_INFO
- XMP - GST\_TAG\_SYSTEM\_XMP
We would basically need this for each container format.
See also <https://bugzilla.gnome.org/show_bug.cgi?id=345352>
### Lost metadata
A case slightly different from the previous one is when an application
sets a GstTagList on a pipeline. Right now elements consuming tags do
not report which tags have been consumed. Especially when using elements
that make metadata persistent, we have no means of knowing which of the
tags made it into the target stream and which were not serialized.
Ideally the application would like to know which kind of metadata is
accepted by a pipeline to reflect that in the UI.
Although in practice the elements implementing GstTagSetter are the
ones that serialize, this does not have to be the case. Otherwise, we
could add a means to that interface, where elements add the tags they
have serialized. The application could build one list from all the tag
messages and then query all the serialized tags from tag-setters. The
delta tells what has not been serialized.
A different approach would be to query the list of supported tags in
advance. This could be a query (GST\_QUERY\_TAG\_SUPPORT). The query
result could be a list of elements and their tags. As a convenience we
could flatten the list of tags for the top-level element (if the query
was sent to a bin) and add that.
### Tags are per Element
In many cases we want tags per stream. Even metadata standards like
mp4/3gp metadata supports that. Right now GST\_MESSAGE\_SRC(tags) is the
element. We tried changing that to the pad, but that broke applications.
Also we miss the symmetric functionality in GstTagSetter. This interface
is usually implemented by
elements.
### Open bugs
<https://bugzilla.gnome.org/buglist.cgi?query_format=advanced;short_desc=tag;bug_status=UNCONFIRMED;bug_status=NEW;bug_status=ASSIGNED;bug_status=REOPENED;bug_status=NEEDINFO;short_desc_type=allwordssubstr;product=GStreamer>
Add GST\_TAG\_MERGE\_REMOVE
<https://bugzilla.gnome.org/show_bug.cgi?id=560302>
# DRAFT push-pull scheduling
Status: DRAFT. DEPRECATED by the better current implementation.
Observations:
- The main scheduling mode is chain based scheduling, where the source
element pushes buffers through the pipeline to the sinks. This is
called the push model.
- In the pull model, some plugin pulls buffers from an upstream peer
element before consuming and/or pushing them further downstream.
Usages of pull based scheduling:
- sinks that pull in data, possibly at fixed intervals driven by some
hardware device (audiocard, videodevice, …).
- Efficient random access to resources. Especially useful for certain
types of demuxers.
API for pull-based scheduling:
- an element that wants to pull data from a peer element needs to call
the pull\_range() method. This method requires an offset and a
size. It is possible to leave the offset and size at -1, indicating
that any offset or size is acceptable; this of course removes the
advantages of getrange based scheduling.
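In the current API this corresponds to gst_pad_pull_range(); a minimal sketch
of pulling one block from the peer of a sink pad:

``` c
/* sketch: pull 4096 bytes at offset 0 from the peer of `sinkpad` */
static GstFlowReturn
pull_block (GstPad * sinkpad)
{
  GstBuffer *buf = NULL;
  GstFlowReturn ret = gst_pad_pull_range (sinkpad, 0, 4096, &buf);

  if (ret == GST_FLOW_OK) {
    /* ... consume the buffer ... */
    gst_buffer_unref (buf);
  }
  return ret;
}
```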
Types of pull based scheduling:
- some sources can do random access (file source, …)
- some sources can read a random number of bytes but not at a random
offset. (audio cards, …) Audio cards using a ringbuffer can however
do random access in the ringbuffer.
- some sources can do random access in a range of bytes but not in
another range. (a caching network source).
- some sources can only provide data of a fixed size and without an
offset. (video sources, …)
Current scheduling decision:
- core selects scheduling type starting on sinks by looking at
existence of loop function on sinkpad and calling
\_check\_pull\_range() on the source pad to activate the pads in
push/pull mode.
- element proxies pull mode pad activation to peer pad.
Problems:
- core makes a tough decision without knowing anything about the
element. Some elements are able to deal with a pull\_range() without
offset while others need full random access.
Requirements:
- element should be able to select scheduling method itself based on
how it can use the peer element pull\_range. This includes if the
peer can operate with or without offset/size. This also means that
the core does not need to select the scheduling method anymore and
allows for more efficient scheduling methods adjusted for the
particular element.
Proposition:
- pads are activated without the core selecting a method.
- the pad queries the scheduling mode of the peer pad. This query is
rather fine-grained and allows the element to know if the peer supports
offsets and sizes in the get\_range function. A proposition for the
query is outlined in draft-query.txt.
- pad selects scheduling mode and informs the peer pad of this
decision.
Things to query:
- pad can do real random access (downstream peer can ask for offset
\!= -1)
- min offset
- suggest sequential access
- max offset
- align: all offsets should be aligned with this value.
- pad can give ranges from A to B length (peer can ask for A <= length
<= B)
- min length
- suggested length
- max length
Use cases:
- An audio source can provide random access to the samples queued in
its DMA buffer, it however suggests sequential access method. An
audio source can provide a random number of samples but prefers
reading from the hardware using a fixed segment size.
- A caching network source would suggest sequential access but is
seekable in the cached region. Applications can query for the
already downloaded portion and update the GUI, a seek can be done in
that area.
- a live video source can only provide buffers sequentially. It exposes
offsets as -1; lengths are also -1.
# Tagreading
The tagreading (metadata reading) use case for mediacenter applications
is not too well supported by the current GStreamer architecture. It uses
demuxers on the files, which, generally speaking, takes too long (building a
seek-index, prerolling). What we want are specialized elements / parsing
modes that just do the tag-reading.
The idea is to define a TagReadIFace. Tag-demuxers, classic demuxers and
decoder plugins can just implement the interface or provide a separate
element that implements the interface.
In addition we need a tagreadbin that, similar to decodebin, does a
typefind and then plugs the right tagread element(s). It will only look
at elements that implement the interface. It can plug several if
possible.
For optimal performance typefind and tagread could share the list of
already peeked buffers (a queue element after sink, but that would
change pull to push).
## Design
The plan is that applications can do the following:

```
pipeline = "filesrc ! tagbin"
for (file_path in list_of_files) {
  filesrc.location = file_path
  pipeline.set_state(PAUSED)
  // wait for TAGS & EOS
  pipeline.set_state(READY)
}
```
- it should have one sinkpad of type ANY
- it should send EOS when all metadata has been read (the "done" signal
was received from all tagread-elements)
- special tagread-elements should have RANK\_NONE so they are not
autoplugged by decodebin
## Interface
- gboolean iface property "tag-reading": switches the element to
tagreading mode. Needed if normal elements implement that behaviour.
Elements will skip parsing unneeded data, won't build a seeking
index, etc.
- signal "done": the equivalent of EOS.
## Use Cases
- mp3 with id3- and apetags
- plug id3demux \! apedemux
- avi with vorbis audio
- plug avidemux
- new pad → audio/vorbis
- plug vorbisdec or special vorbiscomment reader
## Additional Thoughts
- would it make sense to have 2-phase tag-reading (property on tagbin
and/or tagread elements)
- 1st phase: get tag-data that are directly embedded in the data
- 2nd phase: get tag-data that has to be generated
- e.g. album-art via web, video-thumbnails
- what about caching backends
- it would be good to allow applications to supply tagbin with a
tagcache object instance. Whenever tagbin gets a *location* to
tagread, it consults the cache first. Whenever there is a cache-miss,
it will tag-read and then store the result in the
cache
``` c
GstTagList *gst_tag_cache_load_tag_data (GstTagCache *self, const gchar *uri);
void gst_tag_cache_store_tag_data (GstTagCache *self, const gchar *uri, GstTagList *tags);
```
## Tests
- write a generic test for parsers/demuxers to ensure they send tags
before they reach PAUSED (elements need to parse the file for
prerolling anyway): set the pipeline to paused, check for tags, set to
playing, error out if tags come after paused
## Code Locations
- tagreadbin → gst-plugins-base/gst/tagread
- tagreaderiface → gst-plugins-base/gst-libs/gst/tag
## Reuse
- ogg : gst-plugins-base/ext/ogg
- avi : gst-plugins-good/gst/avi
- mp3 : gst-plugins-good/gst/id3demux
- wav : gst-plugins-good/gst/wavparse
- qt : gst-plugins-bad/gst/qtdemux
# Dynamic pipelines
This document describes many use cases for dynamically constructing and
manipulating a running or paused pipeline and the features provided by
GStreamer.
When constructing dynamic pipelines it is important to understand the
following features of GStreamer:

- pad blocking
- playback segments
- streaming vs. application threads
# Sink elements
Sink elements consume data and normally have no source pads.
Typical sink elements include:
- audio/video renderers
- network sinks
- filesinks
Sinks are harder to construct than other element types as they are
treated specially by the GStreamer core.
## state changes
A sink always returns ASYNC from the state change to PAUSED; this
includes a state change from READY→PAUSED and PLAYING→PAUSED. The reason
for this is that this way we can detect when the first buffer or event
arrives in the sink, at which point the state change completes.
A sink should block on the first EOS event or buffer received in the
READY→PAUSED state before committing the state to PAUSED.
FLUSHING events have to be handled out of sync with the buffer flow and
take no part in the preroll procedure.
Events other than EOS do not complete the preroll stage.
## sink overview
- TODO: PREROLL\_LOCK can be removed and we can safely use the STREAM\_LOCK.
```
# Commit the state. We return TRUE if we can continue
# streaming, FALSE in the case we go to a READY or NULL state.
# if we go to PLAYING, we don't need to block on preroll.
commit
{
LOCK
switch (pending)
case PLAYING:
need_preroll = FALSE
break
case PAUSED:
break
case READY:
case NULL:
return FALSE
case VOID:
return TRUE
# update state
state = pending
next = VOID
pending = VOID
UNLOCK
return TRUE
}
# Sync an object. We have to wait for the element to reach
# the PLAYING state before we can wait on the clock.
# Some items do not need synchronisation (most events) so the
# get_times method returns FALSE (not syncable)
# need_preroll indicates that we are not in the PLAYING state
# and therefore need to commit and potentially block on preroll
# if our clock_wait got interrupted we commit and block again.
# The reason for this is that the current item being rendered is
# not yet finished and we can use that item to finish preroll.
do_sync (obj)
{
# get timing information for this object
syncable = get_times (obj, &start, &stop)
if (!syncable)
return OK;
again:
while (need_preroll)
if (need_commit)
need_commit = FALSE
if (!commit)
return FLUSHING
if (need_preroll)
# release PREROLL_LOCK and wait. prerolled can be observed
# and will be TRUE
prerolled = TRUE
PREROLL_WAIT (releasing PREROLL_LOCK)
prerolled = FALSE
if (flushing)
return FLUSHING
if (valid (start || stop))
PREROLL_UNLOCK
end_time = stop
ret = wait_clock (obj,start)
PREROLL_LOCK
if (flushing)
return FLUSHING
# if the clock was unscheduled, we redo the
# preroll
if (ret == UNSCHEDULED)
goto again
}
# render a prerollable item (EOS or buffer). It is
# always called with the PREROLL_LOCK held.
render_object (obj)
{
ret = do_sync (obj)
if (ret != OK)
return ret;
# preroll and syncing done, now we can render
render(obj)
}
| # sinks that sync on buffer contents do like this
| while (more_to_render)
| ret = render
| if (ret == interrupted)
| prerolled = TRUE
render (buffer) ----->| PREROLL_WAIT (releasing PREROLL_LOCK)
| prerolled = FALSE
| if (flushing)
| return FLUSHING
|
# queue a prerollable item (EOS or buffer). It is
# always called with the PREROLL_LOCK held.
# This function will commit the state when receiving the
# first prerollable item.
# items are then added to the rendering queue or rendered
# right away if no preroll is needed.
queue (obj, prerollable)
{
if (need_preroll)
if (prerollable)
queuelen++
# first item in the queue while we need preroll
# will complete state change and call preroll
if (queuelen == 1)
preroll (obj)
if (need_commit)
need_commit = FALSE
if (!commit)
return FLUSHING
# then see if we need more preroll items before we
# can block
if (need_preroll)
if (queuelen <= maxqueue)
queue.add (obj)
return OK
# now clear the queue and render each item before
# rendering the current item.
while (queue.hasItem)
render_object (queue.remove())
render_object (obj)
queuelen = 0
}
# various event functions
event
EOS:
# events must complete preroll too
STREAM_LOCK
PREROLL_LOCK
if (flushing)
return FALSE
ret = queue (event, TRUE)
if (ret == FLUSHING)
return FALSE
PREROLL_UNLOCK
STREAM_UNLOCK
break
SEGMENT:
# the segment must be used to clip incoming
# buffers. They then go into the queue as non-prerollable
# items used for syncing the buffers
STREAM_LOCK
PREROLL_LOCK
if (flushing)
return FALSE
set_clip
ret = queue (event, FALSE)
if (ret == FLUSHING)
return FALSE
PREROLL_UNLOCK
STREAM_UNLOCK
break
FLUSH_START:
# set flushing and unblock all that is waiting
event ----> subclasses can interrupt render
PREROLL_LOCK
flushing = TRUE
unlock_clock
PREROLL_SIGNAL
PREROLL_UNLOCK
STREAM_LOCK
lost_state
STREAM_UNLOCK
break
FLUSH_END:
# unset flushing and clear all data and eos
STREAM_LOCK
event
PREROLL_LOCK
queue.clear
queuelen = 0
flushing = FALSE
eos = FALSE
PREROLL_UNLOCK
STREAM_UNLOCK
break
# the chain function checks the buffer falls within the
# configured segment and queues the buffer for preroll and
# rendering
chain
STREAM_LOCK
PREROLL_LOCK
if (flushing)
return FLUSHING
if (clip)
queue (buffer, TRUE)
PREROLL_UNLOCK
STREAM_UNLOCK
state
switch (transition)
READY_PAUSED:
# no datapassing is going on so we always return ASYNC
ret = ASYNC
need_commit = TRUE
eos = FALSE
flushing = FALSE
need_preroll = TRUE
prerolled = FALSE
break
PAUSED_PLAYING:
# we grab the preroll lock. This we can only do if the
# chain function is either doing some clock sync, we are
# waiting for preroll or the chain function is not being called.
PREROLL_LOCK
if (prerolled || eos)
ret = OK
need_commit = FALSE
need_preroll = FALSE
if (eos)
post_eos
else
PREROLL_SIGNAL
else
need_preroll = TRUE
need_commit = TRUE
ret = ASYNC
PREROLL_UNLOCK
break
PLAYING_PAUSED:
---> subclass can interrupt render
# we grab the preroll lock. This we can only do if the
# chain function is either doing some clock sync
# or the chain function is not being called.
PREROLL_LOCK
need_preroll = TRUE
unlock_clock
if (prerolled || eos)
ret = OK
else
ret = ASYNC
PREROLL_UNLOCK
break
PAUSED_READY:
---> subclass can interrupt render
# we grab the preroll lock. Set to flushing and unlock
# everything. This should exit the chain functions and stop
# streaming.
PREROLL_LOCK
flushing = TRUE
unlock_clock
queue.clear
queuelen = 0
PREROLL_SIGNAL
ret = OK
PREROLL_UNLOCK
break
```
# Source elements
A source element is an element that provides data to the pipeline. It
typically does not have any sink (input) pads.
Typical source elements include:
- file readers
- network elements (live or not)
- capture elements (video/audio/…)
- generators (signals/video/audio/…)
## Live sources
A source is said to be a live source when it has the following property:
- temporarily stopping reading from the source causes data to be lost.
In general when this property holds, the source also produces data at a
fixed rate. Most sources have a limit on the rate at which they can
deliver data, which might be faster or slower than the consumption rate.
This property however does not make them a live source.
Let's look at some example sources.
- file readers: you can PAUSE without losing data. There is however a
limit to how fast you can read from this source. This limit is
usually much higher than the consumption rate. In some cases it
might be slower (an NFS share, for example) in which case you might
need to use some buffering (see [buffering](design/buffering.md)).
- HTTP network element: you can PAUSE without data loss. Depending on
the available network bandwidth, consumption rate might be higher
than production rate in which case buffering should be used (see
[buffering](design/buffering.md)).
- audio source: pausing the audio capture will lead to lost data. This
source is therefore definitely live. In addition, an audio source
will produce data at a fixed rate (the samplerate). Also, depending
on the buffersize, this source will introduce a latency (see
[latency](design/latency.md)).
- udp network source: Pausing the receiving part will lead to lost
data. This source is therefore a live source. Also in a typical case
the udp packets will be received at a certain rate, which might be
difficult to guess because of network jitter. This source does not
necessarily introduce latency on its own.
- dvb source: PAUSING this element will lead to data loss; it's a live
source similar to a UDP source.
## Source types
A source element can operate in three ways:
- it is fully seekable, this means that random access can be performed
on it in an efficient way. (a file reader,…). This also typically
means that the source is not live.
- data can be obtained from it with a variable size. This means that
the source can give N bytes of data. An example is an audio source.
A video source always provides the same amount of data (one video
frame). Note that this is not a fully seekable source.
- it is a live source, see above.
When writing a source, one has to look at how the source can operate to
decide on the scheduling methods to implement on the source.
- fully seekable sources implement a getrange function on the source
pad.
- sources that can give N bytes but cannot do seeking also implement a
getrange function but state that they cannot do random access.
- sources that are purely live sources implement a task to push out
data.
Any source that has a getrange function must also implement a push based
scheduling mode. In this mode the source starts a task that gets N bytes
and pushes them out. Whenever possible, the peer element will select the
getrange based scheduling method of the source, though.
A source with a getrange function must activate itself in the pad
activate function. This is needed because the downstream peer element
will decide on and activate the source element in its state change
function, before the source's state change function is called.
## Source base classes
GstBaseSrc:
This base class provides an implementation of a random access source and
is very well suited for file reader like sources.
GstPushSrc:
Base class for block-based sources. This class is mostly useful for
elements that cannot do random access, or at least very slowly. The
source usually prefers to push out a fixed size buffer.
Classes extending this base class will usually be scheduled in a push
based mode. If the peer accepts to operate without offsets and within
the limits of the allowed block size, this class can operate in getrange
based mode automatically.
The subclass should extend the methods from the baseclass in addition to
the create method. If the source is seekable, it needs to override
GstBaseSrc::event() in addition to GstBaseSrc::is\_seekable() in order
to retrieve the seek offset, which is the offset of the next buffer to
be requested.
Flushing, scheduling and sync is all handled by this base class.
## Timestamps
A non-live source should timestamp the buffers it produces starting from
0. If it is not possible to timestamp every buffer (filesrc), the source
is allowed to only timestamp the first buffer (as 0).
Live sources only produce data in the PLAYING state, when the clock is
running. They should timestamp each buffer they produce with the current
running\_time of the pipeline, which is expressed as:
absolute_time - base_time
With absolute\_time the time obtained from the global pipeline with
gst\_clock\_get\_time() and base\_time being the time of that clock when
the pipeline was last set to PLAYING.
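A sketch of this timestamping for a live source; `element` and `buf` stand in
for the source instance and the buffer it just captured.

``` c
static void
timestamp_buffer (GstElement * element, GstBuffer * buf)
{
  /* only valid while PLAYING: the clock is distributed and running */
  GstClock *clock = gst_element_get_clock (element);
  GstClockTime base_time = gst_element_get_base_time (element);

  /* running_time = absolute_time - base_time */
  GST_BUFFER_PTS (buf) = gst_clock_get_time (clock) - base_time;

  gst_object_unref (clock);
}
```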
# Transform elements
Transform elements transform input buffers to output buffers based on
the sink and source caps.
An important requirement for a transform is that the output caps are
completely defined by the input caps and vice versa. This means that a
typical decoder element can NOT be implemented with a transform element,
because the output caps, like the width and height of the decompressed
video frame, for example, are encoded in the stream and thus not defined
by the input caps.
Typical transform elements include:
- audio convertors (audioconvert, audioresample,…)
- video convertors (colorspace, videoscale, …)
- filters (capsfilter, volume, colorbalance, …)
The implementation of the transform element has to take care of the
following things:
- efficient negotiation both up and downstream
- efficient buffer alloc and other buffer management
Some transform elements can operate in different modes:
- passthrough (no changes are done on the input buffers)
- in-place (changes made directly to the incoming buffers without
requiring a copy or new buffer allocation)
- metadata changes only
Depending on the mode of operation the buffer allocation strategy might
change.
The transform element should at any point be able to renegotiate sink
and src caps as well as change the operation mode.
In addition, the transform element will typically take care of the
following things as well:
- flushing, seeking
- state changes
- timestamping, this is typically done by copying the input timestamps
to the output buffers but subclasses should be able to override
this.
- QoS, avoiding calls to the subclass transform function
- handle scheduling issues such as push and pull based operation.
In the next sections, we will describe the behaviour of the transform
element in each of the above use cases. We focus mostly on the buffer
allocation strategies and caps negotiation.
## Processing
A transform has 2 main processing functions:
- **`transform()`**: Transform the input buffer to the output buffer. The
output buffer is guaranteed to be writable and different from the input buffer.
- **`transform_ip()`**: Transform the input buffer in-place. The input buffer
is writable and of bigger or equal size than the output buffer.
A transform can operate in the following modes:
- *passthrough*: The element will not make changes to the buffers, buffers are
pushed straight through, caps on both sides need to be the same. The element
can optionally implement a transform_ip() function to take a look at the data,
the buffer does not have to be writable.
- *in-place*: Changes can be made to the input buffer directly to obtain the
output buffer. The transform must implement a transform_ip() function.
- *copy-transform*: The transform is performed by copying and transforming the
input buffer to a new output buffer. The transform must implement a transform()
function.
When no `transform()` function is provided, only in-place and passthrough
operation is allowed, this means that source and destination caps must
be equal or that the source buffer size is bigger or equal than the
destination buffer.
When no `transform_ip()` function is provided, only passthrough and
copy-transforms are supported. Providing this function is an
optimisation that can avoid a buffer copy.
When no functions are provided, we can only process in passthrough mode.
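As a sketch, a hypothetical GstBaseTransform subclass that supports only
passthrough and in-place operation would wire up its processing function
roughly like this (GObject boilerplate omitted).

``` c
#include <gst/base/gstbasetransform.h>

/* MyFilter is hypothetical; G_DEFINE_TYPE and the instance struct omitted */
typedef struct _MyFilterClass
{
  GstBaseTransformClass parent_class;
} MyFilterClass;

static GstFlowReturn
my_filter_transform_ip (GstBaseTransform * trans, GstBuffer * buf)
{
  GstMapInfo map;

  /* in passthrough mode this may still be called, but the buffer is not
   * necessarily writable and must not be modified */
  if (gst_base_transform_is_passthrough (trans))
    return GST_FLOW_OK;

  if (!gst_buffer_map (buf, &map, GST_MAP_READWRITE))
    return GST_FLOW_ERROR;
  /* ... modify map.data in place ... */
  gst_buffer_unmap (buf, &map);

  return GST_FLOW_OK;
}

static void
my_filter_class_init (MyFilterClass * klass)
{
  GstBaseTransformClass *bt_class = GST_BASE_TRANSFORM_CLASS (klass);

  /* only transform_ip() is set: passthrough and in-place are available;
   * a transform() vmethod would additionally enable copy-transforms */
  bt_class->transform_ip = GST_DEBUG_FUNCPTR (my_filter_transform_ip);
}
```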
## Negotiation
Typical (re)negotiation of the transform element in push mode always
goes from sink to src; this means it triggers the following sequence:
- the sinkpad receives a new caps event.
- the transform function figures out what it can convert these caps
to.
- try to see if we can configure the caps unmodified on the peer. We
need to do this because we prefer to not do anything.
- the transform configures itself to transform from the new sink caps
to the target src caps
- the transform processes and sets the output caps on the src pad
We call this downstream negotiation (DN) and it goes roughly like this:
```
sinkpad transform srcpad
CAPS event | | |
------------>| find_transform() | |
|------------------->| |
| | CAPS event |
| |--------------------->|
| <configure caps> <-| |
```
These steps configure the element for a transformation from the input
caps to the output caps.
The transform has 3 functions to perform the negotiation:
- **`transform_caps()`**: Transform the caps on a certain pad to all the
possible supported caps on the other pad. The input caps are guaranteed to be
a simple caps with just one structure. The caps do not have to be fixed.
- **`fixate_caps()`**: Given a caps on one pad, fixate the caps on the other
pad. The target caps are writable.
- **`set_caps()`**: Configure the transform for a transformation between src
caps and dest caps. Both caps are guaranteed to be fixed caps.
If no `transform_caps()` is defined, we can only perform the identity
transform, by default.
If no `set_caps()` is defined, we don't care about caps. In that case we
also assume nothing is going to write to the buffer and we don't enforce
a writable buffer for the `transform_ip` function, when present.
One common function that we need for the transform element is to find
the best transform from one format (src) to another (dest). Some
requirements of this function are:
- has a fixed src caps
- finds a fixed dest caps that the transform element can transform to
- the dest caps are compatible and can be accepted by peer elements
- the transform function prefers to make src caps == dest caps
- the transform function can optionally fixate dest caps.
The `find_transform()` function goes like this:
- start from the src caps; these caps are fixed.
- check if the caps are acceptable for us as src caps. This is usually
enforced by the padtemplate of the element.
- calculate all caps we can transform to with `transform_caps()`
- if the original caps are a subset of the transforms, try to see if
the caps are acceptable for the peer. If this is possible, we
can perform passthrough and make src == dest. This is performed by
simply calling gst\_pad\_peer\_accept\_caps().
- if the caps are not fixed, we need to fixate it, start by taking the
peer caps and intersect with them.
- for each of the transformed caps retrieved with transform\_caps():
- try to fixate the caps with fixate\_caps()
- if the caps are fixated, check if the peer accepts them with
`_peer_accept_caps()`, if the peer accepts, we have found a dest caps.
- if we run out of caps, we fail to find a transform.
- if we found a destination caps, configure the transform with
set\_caps().
After this negotiation process, the transform element is usually in a
steady state. We can identify these steady states:
- src and sink pads both have the same caps. Note that when the caps
are equal on both pads, the input and output buffers automatically
have the same size. The element can operate on the buffers in the
following ways: (Same caps, SC)
- passthrough: buffers are inspected but no metadata or buffer data is
changed. The input buffers dont need to be writable. The input
buffer is simply pushed out again without modifications. (SCP)
```
sinkpad transform srcpad
chain() | | |
------------>| handle_buffer() | |
|------------------->| pad_push() |
| |--------------------->|
| | |
```
- in-place: buffers are modified in-place, this means that the input
buffer is modified to produce a new output buffer. This requires the
input buffer to be writable. If the input buffer is not writable, a
new buffer has to be allocated from the bufferpool. (SCI)
```
sinkpad transform srcpad
chain() | | |
------------>| handle_buffer() | |
|------------------->| |
| | [!writable] |
| | alloc buffer |
| .-| |
| <transform_ip> | | |
| '>| |
| | pad_push() |
| |--------------------->|
| | |
```
- copy transform: a new output buffer is allocate from the bufferpool
and data from the input buffer is transformed into the output
buffer. (SCC)
```
sinkpad transform srcpad
chain() | | |
------------>| handle_buffer() | |
|------------------->| |
| | alloc buffer |
| .-| |
| <transform> | | |
| '>| |
| | pad_push() |
| |--------------------->|
| | |
```
- src and sink pads have different caps. The element can operate on
the buffers in the following way: (Different Caps, DC)
- in-place: input buffers are modified in-place. This means that the
input buffer has a size that is larger or equal to the output size.
The input buffer will be resized to the size of the output buffer.
If the input buffer is not writable or the output size is bigger
than the input size, we need to pad-alloc a new buffer. (DCI)
```
sinkpad transform srcpad
chain() | | |
------------>| handle_buffer() | |
|------------------->| |
| | [!writable || !size] |
| | alloc buffer |
| .-| |
| <transform_ip> | | |
| '>| |
| | pad_push() |
| |--------------------->|
| | |
```
- copy transform: a new output buffer is allocated and the data from
the input buffer is transformed into the output buffer. The flow is
exactly the same as the case with the same-caps negotiation. (DCC)
We can immediately observe that the copy transform states will need to
allocate a new buffer from the bufferpool. When the transform element is
receiving a non-writable buffer in the in-place state, it will also need
to perform an allocation. There is no reason why the passthrough state
would perform an allocation.
This steady state changes when one of the following actions occur:
- the sink pad receives new caps, this triggers the above downstream
renegotiation process; see above for the flow.
- the transform element wants to renegotiate (because of changed
properties, for example). This essentially clears the current steady
state and triggers the downstream and upstream renegotiation
process. This situation also happens when a RECONFIGURE event was
received on the transform srcpad.
## Allocation
After the transform element is configured with caps, a bufferpool needs
to be negotiated to perform the allocation of buffers. We have 2 cases:
- The element is operating in passthrough; we don't need to allocate a
buffer in the transform element.
- The element is not operating in passthrough and needs to allocate
an output buffer.
In case 1, we don't query and configure a pool. We let upstream decide
if it wants to use a bufferpool and then we will proxy the bufferpool
from downstream to upstream.
In case 2, we query and set a bufferpool on the srcpad that will be used
for doing the allocations.
In order to perform allocation, we need to be able to get the size of
the output buffer after the transform. We need additional functions to
retrieve the size. There are two:
- `transform_size()`: Given a caps and a size on one pad, and a caps on the
other pad, calculate the size of the other buffer. This function is able to
perform all size transforms and is the preferred method of transforming
a size.
- `get_unit_size()`: When the input size and output size are always
a multiple of each other (audio conversion, …) we can define a simpler
`get_unit_size()` function. The transform will use this function to get the
same amount of units in the source and destination buffers. For performance
reasons, the mapping between caps and size is kept in a cache.
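As an illustration, a unit-size based size transform for a hypothetical S16-to-F32 audio converter; the unit sizes here are assumptions for the example, not values from any real element:

```
/* Sketch: compute the output size from unit sizes, as a
 * get_unit_size()-based transform_size() would. The unit sizes
 * (2 bytes in, 4 bytes out) are illustrative. */
static gsize
transform_size (gsize insize)
{
  const gsize in_unit = 2;      /* e.g. one S16 mono sample */
  const gsize out_unit = 4;     /* e.g. one F32 mono sample */
  gsize units = insize / in_unit;

  g_assert (insize % in_unit == 0);
  /* the same number of units in source and destination buffers */
  return units * out_unit;
}
```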

# Events
Events are objects passed around in parallel to the buffer dataflow to
notify elements of various events.
Events are received on pads using the event function. Some events should
be interleaved with the data stream so they require taking the
STREAM_LOCK, others don't.
Different types of events exist to implement various functionalities.
* `GST_EVENT_FLUSH_START`: data is to be discarded
* `GST_EVENT_FLUSH_STOP`: data is allowed again
* `GST_EVENT_CAPS`: Format information about the following buffers
* `GST_EVENT_SEGMENT`: Timing information for the following buffers
* `GST_EVENT_TAG`: Stream metadata.
* `GST_EVENT_BUFFERSIZE`: Buffer size requirements
* `GST_EVENT_SINK_MESSAGE`: An event turned into a message by sinks
* `GST_EVENT_EOS`: no more data is to be expected on a pad.
* `GST_EVENT_QOS`: A notification of the quality of service of the stream
* `GST_EVENT_SEEK`: A seek should be performed to a new position in the stream
* `GST_EVENT_NAVIGATION`: A navigation event.
* `GST_EVENT_LATENCY`: Configure the latency in a pipeline
* `GST_EVENT_STEP`: Stepping event
* `GST_EVENT_RECONFIGURE`: stream reconfigure event
* `GST_EVENT_DRAIN`: Play all data downstream before returning.
> not yet implemented, under investigation, might be needed to do
still frames in DVD.
## src pads
A `gst_pad_push_event()` on a srcpad will first store the sticky event
in the sticky array before sending the event to the peer pad. If there
is no peer pad and the event was not stored in the sticky array, FALSE
is returned.
Flushing pads will refuse the events and will not store the sticky
events.
## sink pads
A `gst_pad_send_event()` on a sinkpad will call the event function on
the pad. If the event function returns success, the sticky event is
stored in the sticky event array and the event is marked for update.
When the pad is flushing, the `_send_event()` function returns FALSE
immediately.
When the next data item is pushed, the pending events are pushed first.
This ensures that the event function is never called for flushing pads
and that the sticky array only contains events for which the event
function returned success.
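As a minimal illustration of sticky-event storage (assuming a `srcpad` variable; the caps are purely for demonstration):

```
/* Push a sticky CAPS event; it is stored on the pad. */
GstCaps *caps = gst_caps_new_empty_simple ("audio/x-raw");
gst_pad_push_event (srcpad, gst_event_new_caps (caps));
gst_caps_unref (caps);

/* Later, the stored sticky event can be retrieved again. */
GstEvent *ev = gst_pad_get_sticky_event (srcpad, GST_EVENT_CAPS, 0);
if (ev != NULL) {
  GstCaps *stored;

  gst_event_parse_caps (ev, &stored);  /* stored is valid while ev lives */
  gst_event_unref (ev);
}
```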
## pad link
When linking pads, the srcpad sticky events are marked for update when
they are different from the sinkpad events. The next buffer push will
push the events to the sinkpad.
## FLUSH_START/STOP
A flush event is sent both downstream and upstream to clear any pending
data from the pipeline. This might be needed to make the graph more
responsive when the normal dataflow gets interrupted by for example a
seek event.
Flushing happens in two stages.
1) a source element sends the FLUSH_START event to the downstream peer element.
The downstream element starts rejecting buffers from the upstream elements. It
sends the flush event further downstream and discards any buffers it is
holding, as well as returning from the chain function as soon as possible.
This makes sure that all upstream elements get unblocked.
This event is not synchronized with the STREAM_LOCK and can be done in the
application thread.
2) a source element sends the FLUSH_STOP event to indicate
that the downstream element can accept buffers again. The downstream
element sends the flush event to its peer elements. After this step dataflow
continues. The FLUSH_STOP call is synchronized with the STREAM_LOCK so any
data used by the chain function can safely be freed here if needed. Any
pending EOS events should be discarded too.
After the flush completes the second stage, data is flowing again in the
pipeline and all buffers are more recent than those before the flush.
Elements that use the pullrange function send both flush
events to the upstream pads in the same way, to make sure that the
pullrange function unlocks and any pending buffers are cleared in the
upstream elements.
A `FLUSH_START` may instruct the pipeline to distribute a new base_time
to elements so that the running_time is reset to 0. (see
[clocks](design/clocks.md) and [synchronisation](design/synchronisation.md)).
## EOS
The EOS event can only be sent on a sinkpad. It is typically emitted by
the source element when it has finished sending data. This event is
mainly sent in the streaming thread but can also be sent from the
application thread.
The downstream element should forward the EOS event to its downstream
peer elements. This way the event will eventually reach the sinks which
should then post an EOS message on the bus when in PLAYING.
An element might want to flush its internally queued data before
forwarding the EOS event downstream. This flushing can be done in the
same thread as the one handling the EOS event.
For elements with multiple sink pads it might be possible to wait for
EOS on all the pads before forwarding the event.
The EOS event should always be interleaved with the data flow, therefore
the GStreamer core will take the `STREAM_LOCK`.
Sometimes the EOS event is generated by an element other than the source;
for example, a demuxer element can generate an EOS event before the
source element. This is not a problem, the demuxer does not send an EOS
event to the upstream element but returns `GST_FLOW_EOS`, causing the
source element to stop sending data.
An element that sends EOS on a pad should stop sending data on that pad.
Source elements typically pause() their task for that purpose.
By default, a GstBin collects all EOS messages from all its sinks before
posting the EOS message to its parent.
The EOS is only posted on the bus by the sink elements in the PLAYING
state. If the EOS event is received in the PAUSED state, it is queued
until the element goes to PLAYING.
A `FLUSH_STOP` event on an element flushes the EOS state and all pending
EOS messages.
## SEGMENT
A segment event is sent downstream by an element to indicate that the
following group of buffers start and end at the specified positions. The
segment event also contains the playback speed and the applied rate
of the stream.
Since the stream time is always set to 0 at start and after a seek, a 0
point for all following buffer timestamps has to be propagated through the
pipeline using the SEGMENT event.
Before sending buffers, an element must send a SEGMENT event. An element
is free to refuse buffers if they were not preceded by a SEGMENT event.
Elements that sync to the clock should store the SEGMENT start and end
values and subtract the start value from the buffer timestamp before
comparing it against the stream time (see [clocks](design/clocks.md)).
An element is allowed to send out buffers with the SEGMENT start time
already subtracted from the timestamp. If it does so, it needs to send a
corrected SEGMENT downstream, ie, one with start time 0.
A SEGMENT event should be generated as soon as possible in the pipeline
and is usually generated by a demuxer or source. The event is generated
before pushing the first buffer and after a seek, right before pushing
the new buffer.
The SEGMENT event should be sent from the streaming thread and should be
serialized with the buffers.
Buffers should be clipped within the range indicated by the segment
event start and stop values. Sinks must drop buffers with timestamps out
of the indicated segment range.
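A minimal sketch of producing such an event, as a source or demuxer would right before pushing the first buffer (`srcpad` is an assumed variable):

```
/* Send a time segment starting at 0 before pushing buffers. */
GstSegment segment;

gst_segment_init (&segment, GST_FORMAT_TIME);
segment.start = 0;
segment.stop = GST_CLOCK_TIME_NONE;   /* play to the end */
segment.rate = 1.0;

gst_pad_push_event (srcpad, gst_event_new_segment (&segment));
```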
## TAG
The tag event is sent downstream when an element has discovered metadata
tags in a media file. Encoders can use this event to adjust their
tagging system. A tag is serialized with buffers.
## BUFFERSIZE
> **Note**
>
> This event is not yet implemented.
An element can suggest a buffersize for downstream elements. This is
typically done by elements that produce data on multiple source pads
such as demuxers.
## QOS
A QOS (quality of service) event is generated in an element to
report to the upstream elements about the current quality of real-time
performance of the stream. This is typically done by the sinks that
measure the number of frames they have dropped (see [qos](design/qos.md)).
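A sink-side sketch of such a report; the values and the `sinkpad` variable are assumptions for illustration:

```
/* Illustrative values: the buffer at this timestamp was 5 ms late. */
GstClockTime timestamp = 1 * GST_SECOND;
GstClockTimeDiff jitter = 5 * GST_MSECOND;   /* > 0 means late */

GstEvent *qos = gst_event_new_qos (GST_QOS_TYPE_UNDERFLOW,
    0.8 /* proportion of data processed in time */, jitter, timestamp);
gst_pad_push_event (sinkpad, qos);   /* QOS events travel upstream */
```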
## SEEK
A seek event is issued by the application to configure the playback
range of a stream. It is called from the application thread and travels
upstream.
The seek event contains the new start and stop position of playback
after the seek is performed. Optionally the stop position can be left at
-1 to continue playback to the end of the stream. The seek event also
contains the new playback rate of the stream: 1.0 is normal playback,
2.0 is double speed, and negative values mean backwards playback.
A seek usually flushes the graph to minimize latency after the seek.
This behaviour is triggered by using the `SEEK_FLUSH` flag on the seek
event.
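From the application side, such a flushing seek could be issued like this (a sketch; `pipeline` and the target position are assumed):

```
/* Flushing seek to 30 seconds, normal rate, play to the end. */
GstEvent *seek = gst_event_new_seek (1.0, GST_FORMAT_TIME,
    GST_SEEK_FLAG_FLUSH,
    GST_SEEK_TYPE_SET, 30 * GST_SECOND,
    GST_SEEK_TYPE_NONE, -1);

if (!gst_element_send_event (pipeline, seek))
  g_warning ("seek failed");
```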
The seek event usually starts from the sink elements and travels
upstream from element to element until it reaches an element that can
perform the seek. No intermediate element is allowed to assume that a
seek to this location will happen. It is allowed to modify the start and
stop times if it needs to do so. This is typically the case if a seek is
requested for a non-time position.
The actual seek is performed in the application thread so that success
or failure can be reported as a return value of the seek event. It is
therefore important that before executing the seek, the element acquires
the `STREAM_LOCK` so that the streaming thread and the seek get
serialized.
The general flow of executing the seek with FLUSH is as follows:
1) unblock the streaming threads, they could be blocked in a chain
function. This is done by sending a FLUSH_START on all srcpads or by pausing
the streaming task, depending on the seek FLUSH flag.
The flush will make sure that all downstream elements unlock and
that control will return to this element chain/loop function.
We cannot lock the STREAM_LOCK before doing this since it might
cause a deadlock.
2) acquire the STREAM_LOCK. This will work since the chain/loop function
was unlocked/paused in step 1).
3) perform the seek. Since the STREAM_LOCK is held, the streaming thread
will wait for the seek to complete. Most likely, the stream thread
will pause because the peer elements are flushing.
4) send a FLUSH_STOP event to all peer elements to allow streaming again.
5) create a SEGMENT event to signal the new buffer timestamp base time.
This event must be queued to be sent by the streaming thread.
6) start stopped tasks and unlock the STREAM_LOCK, dataflow will continue
now from the new position.
More information about the different seek types can be found in
[seeking](design/seeking.md).
## NAVIGATION
A navigation event is generated by a sink element to signal the elements
of a navigation event such as a mouse movement or button click.
Navigation events travel upstream.
## LATENCY
A latency event is used to configure a certain latency in the pipeline.
It contains a single GstClockTime with the required latency. The latency
value is calculated by the pipeline and distributed to all sink elements
before they are set to PLAYING. The sinks will add the configured
latency value to the timestamps of the buffer in order to delay their
presentation. (See also [latency](design/latency.md)).
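A sketch of the query-and-distribute step with illustrative variables (`pipeline`, `sink`); note that GstPipeline normally performs this internally:

```
/* Query the pipeline latency and distribute it to a sink. */
GstQuery *q = gst_query_new_latency ();

if (gst_element_query (pipeline, q)) {
  gboolean live;
  GstClockTime min_latency, max_latency;

  gst_query_parse_latency (q, &live, &min_latency, &max_latency);
  gst_element_send_event (sink, gst_event_new_latency (min_latency));
}
gst_query_unref (q);
```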
## DRAIN
> **Note**
>
> This event is not yet implemented.
The drain event indicates that upstream is about to perform a real-time
event, such as pausing to present an interactive menu or such, and needs
to wait for all data it has sent to be played-out in the sink.
Drain should only be used by live elements, as it may otherwise occur
during prerolling.
Usually after draining the pipeline, an element either needs to modify
timestamps, or FLUSH to prevent subsequent data being discarded at the
sinks for arriving late (only applies during playback scenarios).

# Frame step
This document outlines the details of the frame stepping functionality
in GStreamer.
The stepping functionality operates on the current playback segment,
position and rate as it was configured with a regular seek event. In
contrast to the seek event, it operates very closely to the sink and
thus has a very low latency and is not slowed down by queues and does
not actually perform any seeking logic. For this reason we want to
include a new API instead of reusing the seek API.
The following requirements are needed:
- The ability to walk forwards and backwards in the stream.
- Arbitrary increments in any supported format (time, frames, bytes …)
- High speed, minimal overhead. This mechanism is not more expensive
than simple playback.
- Switching between forwards and backwards stepping should be fast.
- Maintain synchronisation between streams.
- Get feedback of the amount of skipped data.
- Ability to play a certain amount of data at an arbitrary speed.
We want a system where we can step frames in PAUSED as well as play
short segments of data in PLAYING.
## Use Cases
* frame stepping in video only pipeline in PAUSED
```
.-----. .-------. .------. .-------.
| src | | demux | .-----. | vdec | | vsink |
| src->sink src1->|queue|->sink src->sink |
'-----' '-------' '-----' '------' '-------'
```
- app sets the pipeline to PAUSED to block on the preroll picture
- app seeks to required position in the stream. This can be done
with a positive or negative rate depending on the required frame
stepping direction.
- app steps frames (in `GST_FORMAT_DEFAULT` or `GST_FORMAT_BUFFER`). The
pipeline loses its PAUSED state until the required number of frames have been
skipped, it then prerolls again. This skipping is purely done in the sink.
- sink posts `STEP_DONE` with amount of frames stepped and
corresponding time interval.
* frame stepping in audio/video pipeline in PAUSED
```
.-----. .-------. .------. .-------.
| src | | demux | .-----. | vdec | | vsink |
| src->sink src1->|queue|->sink src->sink |
'-----' | | '-----' '------' '-------'
| | .------. .-------.
| | .-----. | adec | | asink |
| src2->|queue|->sink src->sink |
'-------' '-----' '------' '-------'
```
- app sets the pipeline to PAUSED to block on the preroll picture
- app seeks to required position in the stream. This can be done
with a positive or negative rate depending on the required frame
stepping direction.
- app steps frames (in `GST_FORMAT_DEFAULT` or `GST_FORMAT_BUFFER`) or an
amount of time on the video sink. The pipeline loses its PAUSED state until
the required number of frames have been skipped, it then prerolls again. This
skipping is purely done in the sink.
- sink posts `STEP_DONE` with amount of frames stepped and
corresponding time interval.
- the app skips the same amount of time on the audiosink to align
the streams again. When a huge amount of video frames is skipped,
there needs to be enough queueing in the pipeline to compensate
for the accumulated audio.
* frame stepping in audio/video pipeline in PLAYING
- app sets the pipeline to PAUSED to block on the preroll picture
- app seeks to required position in the stream. This can be done
with a positive or negative rate depending on the required frame
stepping direction.
- app configures frame steps (in `GST_FORMAT_DEFAULT` or
`GST_FORMAT_BUFFER`) or an amount of time on the sink. The step event has
a flag indicating live stepping so that the stepping will only happen in
PLAYING.
- app sets pipeline to PLAYING. The pipeline continues PLAYING
until it has consumed the configured amount of data or time.
- sink posts `STEP_DONE` with amount of frames stepped and
corresponding time interval. The sink will then wait for another
step event. Since the `STEP_DONE` message was emitted by the sink
when it handed off the buffer to the device, there is usually
sufficient time to queue a new STEP event so that one can
seamlessly continue stepping.
## Events
A new `GST_EVENT_STEP` event is introduced to start the step operation.
The step event is created with the following fields in the structure:
* **`format`** GST_TYPE_FORMAT: The format of the step units
* **`amount`** G_TYPE_UINT64: The amount of units to step. A 0 amount
immediately completes and can be used to cancel the current step and resume
normal non-stepping behaviour to the end of the segment. A -1 amount steps
until the end of the segment.
* **`rate`** G_TYPE_DOUBLE: The rate at which the frames should be stepped in
PLAYING mode. 1.0 is the normal playback speed and direction of the segment,
2.0 is double speed. A speed of 0.0 is not allowed. When performing a flushing
step, the speed is not relevant. Note that we don't allow negative rates here,
use a seek with a negative rate first to reverse the playback direction.
* **`flush`** G_TYPE_BOOLEAN: when flushing is TRUE, the step is performed
immediately:
- In the PAUSED state the pipeline loses the PAUSED state, the
requested amount of data is skipped and the pipeline prerolls again
when a non-intermediate step completes. When the pipeline was
stepping while the event is sent, the current step operation is
updated with the new amount and format. The sink will make a best
effort to comply with the new amount.
- In the PLAYING state, the pipeline loses the PLAYING state, the
requested amount of data is skipped (not rendered) from the previous
STEP request or from the position of the last PAUSED if no previous
STEP operation was performed. The pipeline goes back to the PLAYING
state when a non-intermediate step completes.
- When flushing is FALSE, the step will be performed later.
- In the PAUSED state the step will be done when going to PLAYING. Any
previous step operation will be overridden with the new STEP event.
- In the PLAYING state the step operation will be performed after the
current step operation completes. If there was no previous step
operation, the step operation will be performed from the position of
the last PAUSED state.
* **`intermediate`** G_TYPE_BOOLEAN: Signal that this step operation is an
intermediate step, part of a series of step operations. It is mostly
interesting for stepping in the PAUSED state because the sink will only perform
a preroll after a non-intermediate step operation completes. Intermediate steps
are useful to flush out data from other sinks in order to not cause excessive
queueing. In the PLAYING state the intermediate flag has no visual effect. In
all states, the intermediate flag is passed to the corresponding
GST_MESSAGE_STEP_DONE.
The application will create a STEP event to start or stop the stepping
operation. Both stepping in PAUSED and PLAYING can be performed by means
of the flush flag.
The event is usually sent to the pipeline, which will typically
distribute the event to all of its sinks. For some use cases, like frame
stepping on video frames only, the event should only be sent to the
video sink and upon reception of the `STEP_DONE` message, one can step
the other sinks to align the streams again.
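For example, stepping a single video frame in PAUSED and waiting for the resulting message might look like this (a sketch; `videosink` and `bus` are assumed variables):

```
/* Step one video frame while PAUSED. */
GstEvent *step = gst_event_new_step (GST_FORMAT_BUFFERS, 1,
    1.0 /* rate */, TRUE /* flush */, FALSE /* intermediate */);
gst_element_send_event (videosink, step);

/* Wait for the sink to report the completed step. */
GstMessage *msg = gst_bus_timed_pop_filtered (bus, GST_CLOCK_TIME_NONE,
    GST_MESSAGE_STEP_DONE);
if (msg != NULL) {
  GstFormat format;
  guint64 amount, duration;
  gdouble rate;
  gboolean flush, intermediate, eos;

  gst_message_parse_step_done (msg, &format, &amount, &rate,
      &flush, &intermediate, &duration, &eos);
  /* duration (GST_FORMAT_TIME) can be used to step the other sinks */
  gst_message_unref (msg);
}
```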
For large stepping amounts, there needs to be enough queueing in front
of all the sinks. If large steps need to be performed, they can be split
up into smaller step operations using the "intermediate" flag on the
step.
Since the step event does not update the `base_time` of any of the
elements, the sinks should keep track of the amount of stepped data in
order to remain synchronized against the clock.
## Messages
A `GST_MESSAGE_STEP_START` is created. It contains the following
fields.
* **`active`**: If the step was queued or activated.
* **`format`** GST_TYPE_FORMAT: The format of the step units that queued/activated.
* **`amount`** G_TYPE_UINT64: The amount of units that were queued/activated.
* **`rate`** G_TYPE_DOUBLE: The rate and direction at which the frames were queued/activated.
* **`flush`** G_TYPE_BOOLEAN: If the queued/activated frames will be flushed.
* **`intermediate`** G_TYPE_BOOLEAN: If this is an intermediate step operation
that queued/activated.
The `STEP_START` message is emitted 2 times:
- first when an element received the STEP event and queued it. The
"active" field will be FALSE in this case.
- second when the step operation started in the streaming thread. The
"active" field is TRUE in this case. After this message is emitted,
the application can queue a new step operation.
The purpose of this message is to find out how many elements participate
in the step operation and to queue new step operations at the earliest
possible moment.
A new `GST_MESSAGE_STEP_DONE` message is created. It contains the
following fields:
* **`format`** GST_TYPE_FORMAT: The format of the step units that completed.
* **`amount`** G_TYPE_UINT64: The amount of units that were stepped.
* **`rate`** G_TYPE_DOUBLE: The rate and direction at which the frames were stepped.
* **`flush`** G_TYPE_BOOLEAN: If the stepped frames were flushed.
* **`intermediate`** G_TYPE_BOOLEAN: If this is an intermediate step operation that completed.
* **`duration`** G_TYPE_UINT64: The total duration of the stepped units in `GST_FORMAT_TIME`.
* **`eos`** G_TYPE_BOOLEAN: The step ended because of EOS.
The message is emitted by the element that performs the step operation.
The purpose is to return the duration in `GST_FORMAT_TIME` of the
stepped media. This is especially interesting to align other streams in case
of stepping frames on the video sink element.
## Direction switch
When quickly switching between a forwards and a backwards step of, for
example, one video frame, we need either:
1) issue a new seek to change the direction from the current position.
2) cache a certain number of stepped frames and walk the cache.
Option 1) might be very slow. For option 2) we would ideally like to
offload this caching functionality to a separate element, which means
that we need to forward the STEP event upstream. It's unclear how this
could work in a generic way. What is a demuxer supposed to do when it
receives a step event? A flushing seek to what stream position?

# GstBin
GstBin is a container element for other GstElements. This makes it
possible to group elements together so that they can be treated as one
single GstElement. A GstBin provides a GstBus for the children and
collates messages from them.
## Adding/removing elements
The basic functionality of a bin is to add and remove GstElements
to/from it. `gst_bin_add()` and `gst_bin_remove()` perform these
operations respectively.
The bin maintains a parent-child relationship with its elements (see
[relations](design/relations.md)).
## Retrieving elements
GstBin provides a number of functions to retrieve one or more children
from itself. A few examples of the provided functions:
* `gst_bin_get_by_name()` retrieves an element by name.
* `gst_bin_iterate_elements()` returns an iterator to all the children.
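For example (a sketch; the bin and the child name "decoder" are assumptions):

```
/* Look up a child by name; the returned element is reffed. */
GstElement *dec = gst_bin_get_by_name (GST_BIN (bin), "decoder");
if (dec != NULL) {
  /* ... use the element ... */
  gst_object_unref (dec);
}

/* Or walk all children with an iterator (a full loop would also
 * handle GST_ITERATOR_RESYNC). */
GstIterator *it = gst_bin_iterate_elements (GST_BIN (bin));
GValue item = G_VALUE_INIT;

while (gst_iterator_next (it, &item) == GST_ITERATOR_OK) {
  GstElement *child = GST_ELEMENT (g_value_get_object (&item));

  g_print ("child: %s\n", GST_ELEMENT_NAME (child));
  g_value_reset (&item);
}
g_value_unset (&item);
gst_iterator_free (it);
```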
## Element management
The most important function of the GstBin is to distribute all
GstElement operations on itself to all of its children. This includes:
- state changes
- index get/set
- clock get/set
The state change distribution is the most complex and is explained in
[states](design/states.md).
## GstBus
The GstBin creates a GstBus for its children and distributes it when
child elements are added to the bin. The bin attaches a sync handler to
receive messages from children. The bus for receiving messages from
children is distinct from the bin's own externally-visible GstBus.
Messages received from children are forwarded intact onto the bin's
external message bus, except for EOS and SEGMENT_START/DONE which are
handled specially.
ASYNC_START/ASYNC_STOP messages received from the children are used to
trigger a recalculation of the current state of the bin, as described in
[states](design/states.md).
The application can retrieve the external GstBus and integrate it in the
mainloop or it can just `pop()` messages off in its own thread.
When a bin goes to READY it will clear all cached messages.
## EOS
The sink elements will post an EOS message on the bus when they reach
EOS. The EOS message is only posted to the bus when the sink element is
in PLAYING.
The bin collects all EOS messages and forwards one to the application as
soon as all the sinks have posted an EOS.
The list of queued EOS messages is cleared when the bin goes to PAUSED
again. This means that all elements should repost the EOS message when
going to PLAYING again.
## SEGMENT_START/DONE
A bin collects `SEGMENT_START` messages but does not post them to the
application. It counts the number of `SEGMENT_START` messages and posts a
`SEGMENT_DONE` message to the application when an equal number of
`SEGMENT_DONE` messages were received.
The cached SEGMENT_START/STOP messages are cleared when going to READY.
## DURATION
When a DURATION query is performed on a bin, it will forward the query
to all its sink elements. The bin will calculate the total duration as
the MAX of all returned durations and will then cache the result so that
any further query can use the cached version. The reason for caching the
result is because the duration of a stream typically does not change
that often.
A `GST_MESSAGE_DURATION_CHANGED` posted by an element will clear the
cached duration value so that the bin will query the sinks again. This
message is typically posted by elements that calculate the duration of
the stream based on some average bitrate, which might change while
playing the stream. The `DURATION_CHANGED` message is posted to the
application, which can then fetch the updated DURATION.
## Subclassing
Subclasses of GstBin are free to implement their own add/remove
implementations. It is a good idea to update the GList of children so
that the `_iterate()` functions can still be used if the custom bin
allows access to its children.
Any bin subclass can also implement a custom message handler by
overriding the default message handler.

# GstBus
The GstBus is an object responsible for delivering GstMessages in a
first-in first-out way from the streaming threads to the application.
Since the application typically only wants to deal with delivery of
these messages from one thread, the GstBus will marshall the messages
between different threads. This is important since the actual streaming
of media is done in other threads (the streaming threads) than the
application thread. It is also important to not block the streaming threads
while the application deals with the message.
The GstBus provides support for GSource based notifications. This makes
it possible to handle the delivery in the glib mainloop. Different
GSources can be added to the same bus provided they listen to different
message types.
A message is posted on the bus with the `gst_bus_post()` method. With
the `gst_bus_peek()` and `_pop()` methods one can look at or retrieve a
previously posted message.
The bus can be polled with the `gst_bus_poll()` method. This method
blocks up to the specified timeout value until one of the specified
message types is posted on the bus. The application can then `_pop()`
the messages from the bus to handle them.
It is also possible to get messages from the bus without any thread
marshalling with the `gst_bus_set_sync_handler()` method. This makes
it possible to react to a message in the same thread that posted the
message on the bus. This should only be used if the application is able
to deal with messages from different threads.
If no messages are popped from the bus with either a GSource or
`gst_bus_pop()`, they remain on the bus.
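A typical application-side sketch, blocking until an error or EOS message arrives (`pipeline` is an assumed variable):

```
/* Block until an ERROR or EOS message is posted, then handle it. */
GstBus *bus = gst_pipeline_get_bus (GST_PIPELINE (pipeline));
GstMessage *msg = gst_bus_timed_pop_filtered (bus, GST_CLOCK_TIME_NONE,
    GST_MESSAGE_ERROR | GST_MESSAGE_EOS);

if (msg != NULL) {
  if (GST_MESSAGE_TYPE (msg) == GST_MESSAGE_ERROR) {
    GError *err = NULL;
    gchar *dbg = NULL;

    gst_message_parse_error (msg, &err, &dbg);
    g_printerr ("error: %s\n", err->message);
    g_error_free (err);
    g_free (dbg);
  }
  gst_message_unref (msg);
}
gst_object_unref (bus);
```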
When a pipeline or bin goes from READY into NULL state, it will set its
bus to flushing, i.e. the bus will drop all existing and new messages on
the bus. This is necessary because bus messages hold references to the
bin/pipeline or its elements, so there are circular references that need
to be broken if one ever wants to be able to destroy a bin or pipeline
properly.

# GstElement
The Element is the most important object in the entire GStreamer system,
as it defines the structure of the pipeline. Elements include sources,
filters, sinks, and containers (Bins). They may be an intrinsic part of
the core GStreamer library, or may be loaded from a plugin. In some
cases they're even fabricated from completely different systems (see the
LADSPA plugin). They are generally created from a GstElementFactory,
which will be covered in another chapter, but for the intrinsic types
they can be created with specific functions.
Elements contain GstPads (also covered in another chapter), which are
subsequently used to connect the Elements together to form a pipeline
capable of passing and processing data. They have a parent, which must
be another Element. This allows deeply nested pipelines, and the
possibility of "black-box" meta-elements.
## Name
All elements are named, and while they should ideally be unique in any
given pipeline, they do not have to be. The only guaranteed unique name
for an element is its complete path in the object hierarchy. In other
words, an element's name is unique inside its parent. (This follows from
the GstObject name explanation.)
This uniqueness is guaranteed through all functions where either
parentage or name of an element is changed.
## Pads
GstPads are the property of a given GstElement. They provide the
connection capability, allowing arbitrary structure in the graph.
For any Element but a source or sink, there will be at least 2 Pads
owned by the Element. These pads are stored in a single GList within the
Element. Several counters are kept in order to allow quicker
determination of the type and properties of a given Element.
Pads may be added to an element with `_add_pad()`. Retrieval is via
`_get_static_pad()`, which operates on the name of the Pad (the unique
key). This means that all Pads owned by a given Element must have unique
names. A pointer to the GList of pads may be obtained with
`_iterate_pads`.
`gst_element_add_pad(element, pad)`: Sets the element as the parent of
the pad, then adds the pad to the element's list of pads, keeping the
counts of total, src, and sink pads up to date. Emits the `new_pad`
signal with the pad as argument. Fails if either the element or pad are
either NULL or not what they claim to be. Should fail if the pad already
has a parent. Should fail if the pad is already owned by the element.
Should fail if there's already a pad by that name in the list of pads.
`pad = gst_element_get_pad(element, "padname")`: Searches through the
list of pads owned by the element for a pad with the given name and
returns it.
## Ghost Pads
More info in [ghostpad](design/gstghostpad.md).
## State
An element has a state. More info in [state](design/states.md).

# Ghostpads
GhostPads are used to build complex compound elements out of existing
elements. They are used to expose internal element pads on the complex
element.
## Some design requirements
- Must look like a real GstPad on both sides.
- target of Ghostpad must be changeable
- target can be initially NULL
- a GhostPad is implemented using a private GstProxyPad class:
```
GstProxyPad
(------------------)
| GstPad |
|------------------|
| GstPad *target |
(------------------)
| GstPad *internal |
(------------------)
GstGhostPad
(------------------) -\
| GstPad | |
|------------------| |
| GstPad *target | > GstProxyPad
|------------------| |
| GstPad *internal | |
|------------------| -/
| <private data> |
(------------------)
```
A GstGhostPad (X) is _always_ created together with a GstProxyPad (Y).
The internal pad pointers are set to point to each other. The
GstProxyPad pairs have opposite directions, the GstGhostPad has the same
direction as the (future) ghosted pad (target).
```
(- X --------)
|            |
| target *   |
|------------|
| internal *----+
(------------)  |
      ^         V
      |   (- Y --------)
      |   |            |
      |   | target *   |
      |   |------------|
      +----* internal  |
          (------------)
```
Which we will abbreviate to:
```
(- X --------)
|            |
| target *--------->//
(------------)
      |
(- Y --------)
| target *----->//
(------------)
```
The GstGhostPad (X) is also set as the parent of the GstProxyPad (Y).
The target is a pointer to the internal pad's peer. It is an optimisation to
quickly get to the peer of a ghostpad without having to dereference
internal->peer.
Some use cases follow with a description of how the datastructure
is modified.
## Creating a ghostpad with a target:
```
gst_ghost_pad_new (char *name, GstPad *target)
```
1) create new GstGhostPad X + GstProxyPad Y
2) X name set to @name
3) X direction is the same as the target, Y is opposite.
4) the target of X is set to @target
5) Y is linked to @target
6) link/unlink and activate functions are set up
on GstGhostPad.
```
(--------------
(- X --------) |
| | |------)
| target *------------------> | sink |
(------------) -------> |------)
| / (--------------
(- Y --------) / (pad link)
//<-----* target |/
(------------)
```
- Automatically takes same direction as target.
- target is filled in automatically.
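In practice, a bin exposes an internal pad roughly like this (a sketch; the bin and its "decoder" child are assumptions):

```
/* Expose the decoder's src pad on the bin as "src". */
GstElement *dec = gst_bin_get_by_name (GST_BIN (bin), "decoder");
GstPad *target = gst_element_get_static_pad (dec, "src");
GstPad *ghost = gst_ghost_pad_new ("src", target);

gst_pad_set_active (ghost, TRUE);   /* needed if the bin is not in NULL */
gst_element_add_pad (GST_ELEMENT (bin), ghost);

gst_object_unref (target);
gst_object_unref (dec);
```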
## Creating a ghostpad without a target
```
gst_ghost_pad_new_no_target (char *name, GstPadDirection dir)
```
1) create new GstGhostPad X + GstProxyPad Y
2) X name set to @name
3) X direction is @dir
4) link/unlink and activate functions are set up on GstGhostPad.
```
(- X --------)
| |
| target *--------->//
(------------)
|
(- Y --------)
| target *----->//
(------------)
```
- allows for setting the target later
## Setting target on an untargetted unlinked ghostpad
```
gst_ghost_pad_set_target (char *name, GstPad *newtarget)
(- X --------)
| |
| target *--------->//
(------------)
|
(- Y --------)
| target *----->//
(------------)
```
1) assert direction of newtarget == X direction
2) target is set to newtarget
3) internal pad Y is linked to newtarget
```
(--------------
(- X --------) |
| | |------)
| target *------------------> | sink |
(------------) -------> |------)
| / (--------------
(- Y --------) / (pad link)
//<-----* target |/
(------------)
```
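The same flow with a deferred target (a sketch; `bin` and `sink1` are assumed variables):

```
/* Create the ghost pad first, attach the real pad later. */
GstPad *ghost = gst_ghost_pad_new_no_target ("sink", GST_PAD_SINK);
gst_element_add_pad (GST_ELEMENT (bin), ghost);

/* ... later, once the real sink pad exists ... */
GstPad *target = gst_element_get_static_pad (sink1, "sink");
gst_ghost_pad_set_target (GST_GHOST_PAD (ghost), target);
gst_object_unref (target);
```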
## Setting target on a targetted unlinked ghostpad
```
gst_ghost_pad_set_target (char *name, GstPad *newtarget)
(--------------
(- X --------) |
| | |-------)
| target *------------------> | sink1 |
(------------) -------> |-------)
| / (--------------
(- Y --------) / (pad link)
//<-----* target |/
(------------)
```
1) assert direction of newtarget (sink2) == X direction
2) unlink internal pad Y and oldtarget
3) target is set to newtarget (sink2)
4) internal pad Y is linked to newtarget
```
(--------------
(- X --------) |
| | |-------)
| target *------------------> | sink2 |
(------------) -------> |-------)
| / (--------------
(- Y --------) / (pad link)
//<-----* target |/
(------------)
```
## Linking a pad to an untargetted ghostpad:
```
gst_pad_link (src, X)
(- X --------)
| |
| target *--------->//
(------------)
|
(- Y --------)
| target *----->//
(------------)
-------)
|
(-----|
| src |
(-----|
-------)
```
X is a sink GstGhostPad without a target. The internal GstProxyPad Y has
the same direction as the src pad (peer).
1) link function is called
- Y direction is same as @src
- Y target is set to @src
- Y is activated in the same mode as X
- core makes link from @src to X
```
(- X --------)
| |
| target *----->//
>(------------)
(real pad link) / |
/ (- Y ------)
/ -----* target |
-------) / / (----------)
| / /
(-----|/ /
| src |<----
(-----|
-------)
```
## Linking a pad to a targetted ghostpad:
```
gst_pad_link (src, X)
(--------
(- X --------) |
| | |------)
| target *------------->| sink |
(------------) >|------)
| / (--------
| /
| /
-------) | / (real pad link)
| (- Y ------) /
(-----| | |/
| src | //<----* target |
(-----| (----------)
-------)
```
1) link function is called
- Y direction is same as @src
- Y target is set to @src
- Y is activated in the same mode as X
- core makes link from @src to X
```
(--------
(- X --------) |
| | |------)
| target *------------->| sink |
>(------------) >|------)
(real pad link) / | / (--------
/ | /
/ | /
-------) / | / (real pad link)
| / (- Y ------) /
(-----|/ | |/
| src |<-------------* target |
(-----| (----------)
-------)
```
## Setting target on untargetted linked ghostpad:
```
gst_ghost_pad_set_target (char *name, GstPad *newtarget)
(- X --------)
| |
| target *------>//
>(------------)
(real pad link) / |
/ |
/ |
-------) / |
| / (- Y ------)
(-----|/ | |
| src |<-------------* target |
(-----| (----------)
-------)
```
1) assert direction of @newtarget == X direction
2) X target is set to @newtarget
3) Y is linked to @newtarget
```
(--------
(- X --------) |
| | |------)
| target *------------->| sink |
>(------------) >|------)
(real pad link) / | / (--------
/ | /
/ | /
-------) / | / (real pad link)
| / (- Y ------) /
(-----|/ | |/
| src |<-------------* target |
(-----| (----------)
-------)
```
## Setting target on targetted linked ghostpad:
```
gst_ghost_pad_set_target (char *name, GstPad *newtarget)
(--------
(- X --------) |
| | |-------)
| target *------------->| sink1 |
>(------------) >|-------)
(real pad link) / | / (--------
/ | /
/ | /
-------) / | / (real pad link)
| / (- Y ------) /
(-----|/ | |/
| src |<-------------* target |
(-----| (----------)
-------)
```
1) assert direction of @newtarget == X direction
2) Y and X target are unlinked
3) X target is set to @newtarget
4) Y is linked to @newtarget
```
(--------
(- X --------) |
| | |-------)
| target *------------->| sink2 |
>(------------) >|-------)
(real pad link) / | / (--------
/ | /
/ | /
-------) / | / (real pad link)
| / (- Y ------) /
(-----|/ | |/
| src |<-------------* target |
(-----| (----------)
-------)
```
## Activation
Sometimes ghost pads should proxy activation functions. This section
attempts to explain how it should work in the different cases.
```
+---+ +----+ +----+ +----+
| A +-----+ B | | C |-------+ D |
+---+ +---=+ +=---+ +----+
+--=-----------------------------=-+
| +=---+ +----+ +----+ +---=+ |
| | a +---+ b ==== c +--+ d | |
| +----+ +----+ +----+ +----+ |
| |
+----------------------------------+
state change goes from right to left
<-----------------------------------------------------------
```
All of the labeled boxes are pads. The dashes (---) show pad links, and
the double-lines (===) are internal connections. The box around a, b, c,
and d is a bin. B and C are ghost pads, and a and d are proxy pads. The
arrow represents the direction of a state change algorithm. Not counting
the bin, there are three elements involved here: the parent of D, the
parent of A, and the parent of b and c.
Now, in the state change from READY to PAUSED, assuming the pipeline
does not have a live source, all of the pads will end up activated at
the end. There are 4 possible activation modes:
1) AD and ab in PUSH, cd and CD in PUSH
2) AD and ab in PUSH, cd and CD in PULL
3) AD and ab in PULL, cd and CD in PUSH
4) AD and ab in PULL, cd and CD in PULL
When activating (1), the state change algorithm will first visit the
parent of D and activate D in push mode. Then it visits the bin. The bin
will first change the state of its child before activating its pads.
That means c will be activated in push mode. [*] At this point, d and
C should also be active in push mode, because it could be that
activating c in push mode starts a thread, which starts pushing to pads
which aren't ready yet. Then b is activated in push mode. Then, the bin
activates C in push mode, which should already be in push mode, so
nothing is done. It then activates B in push mode, which activates b in
push mode, but it's already there, then activates a in push mode as
well. The order of activating a and b does not matter in this case.
Then, finally, the state change algorithm moves to the parent of A,
activates A in push mode, and dataflow begins.
[*] Not yet implemented.
Activation mode (2) is implausible, so we can ignore it for now. That
leaves us with the rest.
(3) is the same as (1) until you get to activating b. Activating b will
proxy directly to activating a, which will activate B and A as well.
Then when the state change algorithm gets to B and A it sees that they
are already active, so it ignores them.
Similarly in (4), activating D will cause the activation of all of the
rest of the pads, in this order: C d c b a B A. Then when the state
change gets to the other elements they are already active, and in fact
data flow is already occurring.
So, from these scenarios, we can distill how ghost pad activation
functions should work:
Ghost source pads (e.g. C):
  - push: called by the element state change handler. Behavior: just
    return TRUE.
  - pull: called by the peer's activatepull. Behavior: change the
    internal pad, which proxies to its peer (e.g. C changes d, which
    changes c).

Internal sink pads (e.g. d):
  - push: called by nobody (doesn't seem possible). Behavior: n/a.
  - pull: called by the ghost pad. Behavior: proxy to peer first.

Internal src pads (e.g. a):
  - push: called by the ghost pad. Behavior: activate peer in push mode.
  - pull: called by the peer's activatepull. Behavior: proxy to the
    ghost pad, which proxies to its peer (e.g. a calls B, which calls A).

Ghost sink pads (e.g. B):
  - push: called by the element state change handler. Behavior: change
    the internal pad, which proxies to peer (e.g. B changes a, which
    changes b).
  - pull: called by the internal pad. Behavior: proxy to peer.

It doesn't really make sense to have activation functions on proxy pads
that aren't part of a ghost pad arrangement.

# GstObject
The base class for the entire GStreamer hierarchy is the GstObject.
## Parentage
A pointer is available to store the current parent of the object. This
is one of the two fundamental requirements for a hierarchical system
such as GStreamer (for the other, read up on GstBin). Three functions
are provided: `_set_parent()`, `_get_parent()`, and `_unparent()`. The
third is required because there is an explicit check in `_set_parent()`:
an object must not already have a parent if you wish to set one. You
must unparent the object first. This allows for new additions later.
- GstObjects that can be parented: GstElement (inside a bin), GstPad (inside
an element)
## Naming
- names of objects cannot be changed when they are parented
- names of objects should be unique within their parent
- set_name() can fail because of this
- as can gst_element_add_pad()/gst_bin_add_element()
- gst_object_set_name() only changes the object's name
- objects also have a name_prefix that is used to prefix the object
name during debugging and identification
- there are object-specific set_names() which also set the
name_prefix on the object. This is useful for debugging purposes to
give the object a more identifiable name. Typically a parent will
call _set_name_prefix on children, taking a lock on them to do
so.
## Locking
The GstObject contains the necessary primitives to lock the object in a
thread-safe manner. This will be used to provide general thread-safety
as needed. However, this lock is generic, i.e. it covers the whole
object.
The object LOCK is a very low-level lock that should only be held to
access the object properties for short sections of code.
All members of the GstObject structure marked as `/**< public >**/ /*
with LOCK */` are protected by this lock. These members can only be
accessed for reading or writing while the lock is held. All members
should be copied or reffed if they are used after releasing the LOCK.
Note that this does **not** mean that no other thread can modify the
object at the same time that the lock is held. It only means that any
two sections of code that obey the lock are guaranteed to not be running
simultaneously. "The lock is voluntary and cooperative".
This lock will ideally be used for parentage, flags and naming, which is
reasonable, since they are the only possible things to protect in the
GstObject.
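For instance, keeping a protected member past the unlock requires copying it while the lock is held (a sketch; `obj` stands for any GstObject):

```
/* Copy a protected member while holding the object lock. */
gchar *name;

GST_OBJECT_LOCK (obj);
name = g_strdup (GST_OBJECT_NAME (obj));  /* copy, don't keep the pointer */
GST_OBJECT_UNLOCK (obj);

g_print ("object: %s\n", name);   /* safe to use after unlock */
g_free (name);
```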
## Locking order
In parent-child situations the lock of the parent must always be taken
first before taking the lock of the child. It is NOT allowed to hold the
child lock before taking the parent lock.
This policy allows parents to iterate their children and set
properties on them.
Whenever a nested lock needs to be taken on objects not involved in a
parent-child relation (e.g. pads), an explicit locking order has to be
defined.
## Path Generation
Due to the base nature of the GstObject, it becomes the only reasonable
place to put this particular function (`_get_path_string()`). It will
generate a string describing the parent hierarchy of a given GstObject.
## Flags
Each object in the GStreamer object hierarchy can have flags associated
with it, which are used to describe a state or a feature of the object.

# GstPipeline
A GstPipeline is usually a toplevel bin and provides all of its children
with a clock.
A GstPipeline also provides a toplevel GstBus (see [gstbus](design/gstbus.md))
The pipeline also calculates the running_time based on the selected
clock (see also [clocks](design/clocks.md) and [synchronisation](design/synchronisation.md)).
The pipeline will calculate a global latency for the elements in the
pipeline. (See also [latency](design/latency.md)).
## State changes
In addition to the normal state change procedure of its parent class
GstBin, the pipeline performs the following actions during a state
change:
- NULL → READY:
- set the bus to non-flushing
- READY → PAUSED:
- reset the running_time to 0
- PAUSED → PLAYING:
- Select and distribute a clock.
- calculate base_time using the running_time.
- calculate and distribute latency.
- set clock and base_time on all elements before performing the state
change.
- PLAYING → PAUSED:
- calculate the running_time when the pipeline was PAUSED.
- READY → NULL:
- set the bus to flushing (when auto-flushing is enabled)
The running_time represents the total elapsed time, measured in clock
units, that the pipeline spent in the PLAYING state (see
[synchronisation](design/synchronisation.md)). The running_time is set to 0 after a
flushing seek.
## Clock selection
Since all of the children of a GstPipeline must use the same clock, the
pipeline must select a clock. This clock selection happens when the
pipeline goes to the PLAYING state.
The default clock selection algorithm works as follows:
- If the application selected a clock, use that clock. (see below)
- Use the clock of the most upstream element that can provide a clock.
This selection is performed by iterating the elements starting from
the sinks going upstream.
- since this selection procedure happens in the PAUSED→PLAYING
state change, all the sinks are prerolled and we can thus be
sure that each sink is linked to some upstream element.
- in the case of a live pipeline (`NO_PREROLL`), the sink will not
yet be prerolled and the selection process will select the clock
of a more upstream element.
- Use the GstSystemClock; this only happens when no element provides a
usable clock.
The application can influence this clock selection with two methods:
`gst_pipeline_use_clock()` and `gst_pipeline_auto_clock()`.
The `_use_clock()` method forces the use of a specific clock on the
pipeline regardless of what clock providers are children of the
pipeline. Setting NULL disables the clock completely and makes the
pipeline run as fast as possible.
The `_auto_clock()` method removes the fixed clock and reactivates the
automatic clock selection algorithm described above.
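For example (a sketch; `pipeline` is an assumed variable):

```
/* Force the system clock, regardless of clock providers... */
GstClock *clock = gst_system_clock_obtain ();
gst_pipeline_use_clock (GST_PIPELINE (pipeline), clock);
gst_object_unref (clock);

/* ...or return to the automatic selection described above. */
gst_pipeline_auto_clock (GST_PIPELINE (pipeline));
```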
## GstBus
A GstPipeline provides a GstBus to the application. The bus can be
retrieved with `gst_pipeline_get_bus()` and can then be used to
retrieve messages posted by the elements in the pipeline (see
[gstbus](design/gstbus.md)).

# GStreamer design documents
This section gathers the various GStreamer design documents.
These are the technical documents that were produced while
developing or refactoring parts of GStreamer, to explain the
problems and the design solutions we came up with to solve them.

# Latency
The latency is the time it takes for a sample captured at timestamp 0 to
reach the sink. This time is measured against the clock in the pipeline.
For pipelines where the only elements that synchronize against the clock
are the sinks, the latency is always 0 since no other element is
delaying the buffer.
For pipelines with live sources, a latency is introduced, mostly because
of the way a live source works. Consider an audio source: it will start
capturing the first sample at time 0. If the source pushes buffers with
44100 samples at a time at 44100Hz, it will have collected the buffer at
second 1. Since the timestamp of the buffer is 0 and the time of the
clock is now >= 1 second, the sink will drop this buffer because it is
too late. Without any latency compensation in the sink, all buffers will
be dropped.
The situation becomes more complex in the presence of:
- 2 live sources connected to 2 live sinks with different latencies
- audio/video capture with synchronized live preview.
- added latencies due to effects (delays, resamplers…)
- 1 live source connected to 2 live sinks
- firewire DV
- RTP, with added latencies because of jitter buffers.
- mixed live source and non-live source scenarios.
- synchronized audio capture with non-live playback. (overdubs,..)
- clock slaving in the sinks due to the live sources providing their
own clocks.
To perform the needed latency corrections in the above scenarios, we
must develop an algorithm to calculate a global latency for the
pipeline. The algorithm must be extensible so that it can optimize the
latency at runtime. It must also be possible to disable or tune the
algorithm based on specific application needs (required minimal
latency).
## Pipelines without latency compensation
We show some examples to demonstrate the problem of latency in typical
capture pipelines.
### Example 1
An audio capture/playback pipeline.
* asrc: audio source, provides a clock
* asink: audio sink, provides a clock
```
.--------------------------.
| pipeline                 |
| .------.      .-------.  |
| | asrc |      | asink |  |
| |     src -> sink      | |
| '------'      '-------'  |
'--------------------------'
```
* *NULL→READY*:
* asink: *NULL→READY*: probes device, returns `SUCCESS`
* asrc: *NULL→READY*: probes device, returns `SUCCESS`
* *READY→PAUSED*:
* asink: *READY→PAUSED*: open device, returns `ASYNC`
* asrc: *READY→PAUSED*: open device, returns `NO_PREROLL`
- Since the source is a live source, it will only produce data in
the `PLAYING` state. To note this fact, it returns `NO_PREROLL`
from the state change function.
- This sink returns `ASYNC` because it can only complete the state
change to `PAUSED` when it receives the first buffer.
At this point the pipeline is not processing data and the clock is not
running. Unless a new action is performed on the pipeline, this situation will
never change.
* *PAUSED→PLAYING*: asrc clock selected because it is the most upstream clock
provider. asink can only provide a clock when it received the first buffer and
configured the device with the samplerate in the caps.
* sink: *PAUSED→PLAYING*: sets pending state to `PLAYING`, returns `ASYNC` because it
is not prerolled. The sink will commit state to `PLAYING` when it prerolls.
* src: *PAUSED→PLAYING*: starts pushing buffers.
- since the sink is still performing a state change from `READY→PAUSED`, it remains ASYNC. The pending state will be set to
PLAYING.
- The clock starts running as soon as all the elements have been
set to PLAYING.
- the source is a live source with a latency. Since it is
synchronized with the clock, it will produce a buffer with
timestamp 0 and duration D after time D, ie. it will only be
able to produce the last sample of the buffer (with timestamp D)
at time D. This latency depends on the size of the buffer.
- the sink will receive the buffer with timestamp 0 at time >= D.
At this point the buffer is too late already and might be
dropped. This state of constantly dropping data will not change
unless a constant latency correction is added to the incoming
buffer timestamps.
The problem is due to the fact that the sink is set to (pending) PLAYING
without being prerolled, which only happens in live pipelines.
### Example 2
An audio/video capture/playback pipeline. We capture both audio and video and
have them played back synchronized again.
* asrc: audio source, provides a clock
* asink: audio sink, provides a clock
* vsrc: video source
* vsink: video sink
```
.--------------------------.
| pipeline                 |
| .------.      .-------.  |
| | asrc |      | asink |  |
| |     src -> sink      | |
| '------'      '-------'  |
| .------.      .-------.  |
| | vsrc |      | vsink |  |
| |     src -> sink      | |
| '------'      '-------'  |
'--------------------------'
```
The state changes happen in the same way as example 1. Both sinks end up with
a pending state of `PLAYING` and a return value of ASYNC until they receive the
first buffer.
For audio and video to be played in sync, both sinks must compensate for the
latency of their sources but must also use exactly the same latency correction.
Suppose asrc has a latency of 20ms and vsrc a latency of 33ms, the total
latency in the pipeline has to be at least 33ms. This also means that the
pipeline must have at least a 33 - 20 = 13ms buffering on the audio stream or
else the audio src will underrun while the audiosink waits for the previous
sample to play.
### Example 3
An example of the combination of a non-live (file) and a live source (vsrc)
connected to live sinks (vsink, sink).
```
.--------------------------.
| pipeline                 |
| .------.      .-------.  |
| | file |      | sink  |  |
| |     src -> sink      | |
| '------'      '-------'  |
| .------.      .-------.  |
| | vsrc |      | vsink |  |
| |     src -> sink      | |
| '------'      '-------'  |
'--------------------------'
```
The state changes happen in the same way as example 1, except that sink will be
able to preroll (commit its state to PAUSED).
In this case sink will have no latency but vsink will. The total latency
should be that of vsink.
Note that because of the presence of a live source (vsrc), the pipeline can be
set to playing before sink is able to preroll. Without compensation for the
live source, this might lead to synchronisation problems because the latency
should be configured in the element before it can go to PLAYING.
### Example 4
An example of the combination of a non-live and a live source. The non-live
source is connected to a live sink and the live source to a non-live sink.
```
.--------------------------.
| pipeline                 |
| .------.      .-------.  |
| | file |      | sink  |  |
| |     src -> sink      | |
| '------'      '-------'  |
| .------.      .-------.  |
| | vsrc |      | files |  |
| |     src -> sink      | |
| '------'      '-------'  |
'--------------------------'
```
The state changes happen in the same way as example 3. Sink will be
able to preroll (commit its state to PAUSED). files will not be able to
preroll.
sink will have no latency since it is not connected to a live source. files
does not do synchronisation so it does not care about latency.
The total latency in the pipeline is 0. The vsrc captures in sync with the
playback in sink.
As in example 3, sink can only be set to `PLAYING` after it successfully
prerolled.
## State Changes
A sink is never set to `PLAYING` before it is prerolled. In order to do
this, the pipeline (at the GstBin level) keeps track of all elements
that require preroll (the ones that return ASYNC from the state change).
These elements posted an `ASYNC_START` message without a matching
`ASYNC_DONE` message.
The pipeline will not change the state of the elements that are still
doing an ASYNC state change.
When an ASYNC element prerolls, it commits its state to PAUSED and posts
an `ASYNC_DONE` message. The pipeline notices this `ASYNC_DONE` message
and matches it with the `ASYNC_START` message it cached for the
corresponding element.
When all `ASYNC_START` messages are matched with an `ASYNC_DONE` message,
the pipeline proceeds with setting the elements to the final state
again.
The base time of the element was already set by the pipeline when it
changed the `NO_PREROLL` element to `PLAYING`. This operation has to be
performed in the separate async state change thread (like the one
currently used for going from `PAUSED→PLAYING` in a non-live pipeline).
## Query
The pipeline latency is queried with the LATENCY query.
* **`live`** G_TYPE_BOOLEAN (default FALSE): - if a live element is found upstream
* **`min-latency`** G_TYPE_UINT64 (default 0, must not be NONE): - the minimum
latency in the pipeline, meaning the minimum time downstream elements
synchronizing to the clock have to wait until they can be sure that all data
for the current running time has been received.
Elements answering the latency query and introducing latency must
set this to the maximum time for which they will delay data, while
considering upstream's minimum latency. As such, from an element's
perspective this is *not* its own minimum latency but its own
maximum latency.
Considering upstream's minimum latency in general means that the
element's own value is added to upstream's value, as this will give
the overall minimum latency of all elements from the source to the
current element:
    ```
    min_latency = upstream_min_latency + own_min_latency
    ```
* **`max-latency`** G_TYPE_UINT64 (default 0, NONE meaning infinity): - the
maximum latency in the pipeline, meaning the maximum time an element
synchronizing to the clock is allowed to wait for receiving all data for the
current running time. Waiting for a longer time will result in data loss,
overruns and underruns of buffers and in general breaks synchronized data flow
in the pipeline.
Elements answering the latency query should set this to the maximum
time for which they can buffer upstream data without blocking or
dropping further data. For an element this value will generally be
its own minimum latency, but might be bigger than that if it can
buffer more data. As such, queue elements can be used to increase
the maximum latency.
The value set in the query should again consider upstream's maximum
latency:
- If the current element has blocking buffering, i.e. it does not drop data by
itself when its internal buffer is full, it should just add its own maximum
latency (i.e. the size of its internal buffer) to upstream's value. If
upstream's maximum latency, or the element's internal maximum latency, was NONE
(i.e. infinity), it will be set to infinity.
    ```
    if (upstream_max_latency == NONE || own_max_latency == NONE)
      max_latency = NONE;
    else
      max_latency = upstream_max_latency + own_max_latency
    ```
If the element has multiple sinkpads, the minimum upstream latency is
the maximum of all live upstream minimum latencies.
If the current element has leaky buffering, i.e. it drops data by itself
when its internal buffer is full, it should take the minimum of its own
maximum latency and upstream's. Examples of such elements are audio sinks
and sources with an internal ringbuffer, leaky queues and in general live
sources with a limited amount of internal buffers that can be used.
    ```
    max_latency = MIN (upstream_max_latency, own_max_latency)
    ```
> Note: many GStreamer base classes allow subclasses to set a
> minimum and maximum latency and handle the query themselves. These
> base classes assume non-leaky (i.e. blocking) buffering for the
> maximum latency. The base class' default query handler needs to be
> overridden to correctly handle leaky buffering.
If the element has multiple sinkpads, the maximum upstream latency is the
minimum of all live upstream maximum latencies.
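As an illustration, here is a minimal sketch of a LATENCY query handler for an
element with blocking buffering; `OWN_MIN_LATENCY` and `OWN_MAX_LATENCY` are
hypothetical constants standing in for the element's own values:

``` c
static gboolean
my_element_src_query (GstPad * pad, GstObject * parent, GstQuery * query)
{
  if (GST_QUERY_TYPE (query) == GST_QUERY_LATENCY) {
    GstClockTime min, max;
    gboolean live;

    /* let upstream fill in its values first */
    if (!gst_pad_query_default (pad, parent, query))
      return FALSE;

    gst_query_parse_latency (query, &live, &min, &max);
    /* add our own (hypothetical) latency; blocking buffering,
     * so the maximum latency is added as well */
    min += OWN_MIN_LATENCY;
    if (max != GST_CLOCK_TIME_NONE)
      max += OWN_MAX_LATENCY;
    gst_query_set_latency (query, live, min, max);
    return TRUE;
  }
  return gst_pad_query_default (pad, parent, query);
}
```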
## Event
The latency in the pipeline is configured with the LATENCY event, which
contains the following fields:
* **`latency`** G_TYPE_UINT64: the configured latency in the pipeline
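For example, a hedged sketch of configuring a total latency of 33ms on a sink
element (the value is purely illustrative):

``` c
/* send a LATENCY event to a sink element; the sink adds this value to
 * its synchronization times and forwards the event upstream */
gst_element_send_event (sink, gst_event_new_latency (33 * GST_MSECOND));
```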
## Latency compensation
Latency calculation and compensation is performed before the pipeline
proceeds to the `PLAYING` state.
When the pipeline collected all `ASYNC_DONE` messages it can calculate
the global latency as follows:
- perform a latency query on all sinks
- sources set their minimum and maximum latency
- other elements add their own values as described above
- latency = MAX (all min latencies)
- if MIN (all max latencies) \< latency we have an impossible
situation and we must generate an error indicating that this
pipeline cannot be played. This usually means that there is not
enough buffering in some chain of the pipeline. A queue can be added
to those chains.
The sinks gather this information with a LATENCY query upstream.
Intermediate elements pass the query upstream and add the amount of
latency they add to the result.
```
ex1: sink1: [20 - 20] sink2: [33 - 40]

     MAX (20, 33) = 33
     MIN (20, 40) = 20 < 33 -> impossible

ex2: sink1: [20 - 50] sink2: [33 - 40]

     MAX (20, 33) = 33
     MIN (50, 40) = 40 >= 33 -> latency = 33
```
The latency is set on the pipeline by sending a LATENCY event to the
sinks in the pipeline. This event configures the total latency on the
sinks. The sink forwards this LATENCY event upstream so that
intermediate elements can configure themselves as well.
After this step, the pipeline continues setting the pending state on its
elements.
A sink adds the latency value, received in the LATENCY event, to the
times used for synchronizing against the clock. This will effectively
delay the rendering of the buffer with the required latency. Since this
delay is the same for all sinks, all sinks will render data relatively
synchronised.
## Flushing a playing pipeline
We can implement resynchronisation after an uncontrolled FLUSH in (part
of) a pipeline in the same way. Indeed, when a flush is performed on a
PLAYING live element, a new base time must be distributed to this
element.
A flush in a pipeline can happen in the following cases:
- flushing seek in the pipeline
- performed by the application on the pipeline
- performed by the application on an element
- flush performed by an element
- after receiving a navigation event (DVD, …)
When a playing sink is flushed by a `FLUSH_START` event, an `ASYNC_START`
message is posted by the element. As part of the message, the fact that
the element got flushed is included. The element also goes to a pending
PAUSED state and has to be set to the `PLAYING` state again later.
The `ASYNC_START` message is kept by the parent bin. When the element
prerolls, it posts an `ASYNC_DONE` message.
When all `ASYNC_START` messages are matched with an `ASYNC_DONE` message,
the bin will capture a new base\_time from the clock and will bring all
the sinks back to `PLAYING` after setting the new base time on them. It's
also possible to perform additional latency calculations and adjustments
before doing this.
## Dynamically adjusting latency
An element that wants to change the latency in the pipeline can do this
by posting a LATENCY message on the bus. This message instructs the
pipeline to:
- query the latency in the pipeline (which might now have changed)
with a LATENCY query.
- redistribute a new global latency to all elements with a LATENCY
event.
A use case where the latency in a pipeline can change could be a network
element that observes an increased inter packet arrival jitter or
excessive packet loss and decides to increase its internal buffering
(and thus the latency). The element must post a LATENCY message and
perform the additional latency adjustments when it receives the LATENCY
event from the downstream peer element.
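A minimal sketch of posting such a message; `self` stands in for the element
and is an assumption of this example:

``` c
/* ask the pipeline to recalculate and redistribute the latency */
gst_element_post_message (GST_ELEMENT_CAST (self),
    gst_message_new_latency (GST_OBJECT_CAST (self)));
```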
In a similar way, the latency can be decreased when network conditions
improve again.
Latency adjustments will introduce glitches in playback in the sinks and
must only be performed in special conditions.
# Live sources
A live source is a source that cannot be arbitrarily `PAUSED` without
losing data.
A live source such as an element capturing audio or video needs to be
handled in a special way. It does not make sense to start the dataflow
in the `PAUSED` state for those devices as the user might wait a long time
between going from `PAUSED` to `PLAYING`, making the previously captured
buffers irrelevant.
A live source therefore only produces buffers in the PLAYING state. This
has implications for sinks waiting for a buffer to complete the preroll
state since such a buffer might never arrive.
Live sources return `NO_PREROLL` when going to the `PAUSED` state to inform
the bin/pipeline that this element will not be able to produce data in
the `PAUSED` state. `NO_PREROLL` should be returned for both `READY→PAUSED`
and `PLAYING→PAUSED`.
When performing a `get_state()` on a bin with a non-zero timeout value,
the bin must be sure that there are no live sources in the pipeline
because else the `get_state()` function would block on the sinks.
A GstBin therefore always performs a zero timeout `get_state()` on its
elements to discover the `NO_PREROLL` (and ERROR) elements before
performing a blocking wait.
## Scheduling
Live sources will not produce data in the paused state. They block in
the getrange function or in the loop function until they go to PLAYING.
## Latency
The live source timestamps its data with the time of the clock at the
time the data was captured. Normally it will take some time to capture
the first sample of data and the last sample. This means that when the
buffer arrives at the sink, it will already be late and will be dropped.
The latency is the time it takes to construct one buffer of data. This
latency is exposed with a `LATENCY` query.
See [latency](design/latency.md)
## Timestamps
Live sources always timestamp their buffers with the `running_time` of
the pipeline. This is needed to be able to match the timestamps of
different live sources in order to synchronize them.
This is in contrast to non-live sources, which timestamp their buffers
starting from `running_time` 0.
# GstMemory
This document describes the design of the memory objects.
GstMemory objects are usually added to GstBuffer objects and contain the
multimedia data passed around in the pipeline.
## Requirements
- It must be possible to have different memory allocators
- It must be possible to efficiently share memory objects, copy, span and trim.
## Memory layout
`GstMemory` manages a memory region. The accessible part of the managed region is
defined by an offset relative to the start of the region and a size. This
means that the managed region can be larger than what is visible to the user of
GstMemory API.
Schematically, GstMemory has a pointer to a memory region of _maxsize_. The area
starting at `offset` with the given `size` is accessible.
```
          memory
GstMemory ->*----------------------------------------------------*
            ^----------------------------------------------------^
                                 maxsize
                 ^--------------------------------------^
                 offset            size
```
The current properties of the accessible memory can be retrieved with:
``` c
gsize gst_memory_get_sizes (GstMemory *mem, gsize *offset, gsize *maxsize);
```
The offset and size can be changed with:
``` c
void gst_memory_resize (GstMemory *mem, gssize offset, gsize size);
```
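As a small usage sketch, assuming `mem` is a writable GstMemory, trimming 16
bytes off the front of the accessible region could look like this:

``` c
gsize offset, maxsize, size;

size = gst_memory_get_sizes (mem, &offset, &maxsize);
/* the offset argument of gst_memory_resize() is relative
 * to the current offset */
gst_memory_resize (mem, 16, size - 16);
```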
## Allocators
GstMemory objects are created by allocators. Allocators are a subclass
of GstObject and can be subclassed to make custom allocators.
``` c
struct _GstAllocator {
  GstObject  object;

  const gchar              *mem_type;

  GstMemoryMapFunction      mem_map;
  GstMemoryUnmapFunction    mem_unmap;
  GstMemoryCopyFunction     mem_copy;
  GstMemoryShareFunction    mem_share;
  GstMemoryIsSpanFunction   mem_is_span;
};
```
The allocator class has 2 virtual methods. One to create a GstMemory,
another to free it again.
``` c
struct _GstAllocatorClass {
  GstObjectClass object_class;

  GstMemory * (*alloc) (GstAllocator *allocator, gsize size,
                        GstAllocationParams *params);
  void        (*free)  (GstAllocator *allocator, GstMemory *memory);
};
```
Allocators are refcounted. It is also possible to register the allocator to the
GStreamer system. This way, the allocator can be retrieved by name.
After an allocator is created, new GstMemory can be created with
``` c
GstMemory * gst_allocator_alloc (const GstAllocator *allocator,
                                 gsize size,
                                 GstAllocationParams *params);
```
GstAllocationParams contain extra info such as flags, alignment, prefix and
padding.
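A short sketch of allocating 16-byte aligned memory from the default
allocator (passing NULL as the allocator selects the default one):

``` c
GstAllocationParams params;
GstMemory *mem;

gst_allocation_params_init (&params);
params.align = 15;   /* alignment mask: align + 1 = 16 byte alignment */

mem = gst_allocator_alloc (NULL, 4096, &params);

/* ... use the memory ... */

gst_memory_unref (mem);
```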
The GstMemory object is a refcounted object that must be freed with
`gst_memory_unref()`.
The GstMemory keeps a ref to the allocator that allocated it. Inside the
allocator are the most common GstMemory operations listed. Custom
GstAllocator implementations must implement the various operations on
the memory they allocate.
It is also possible to create a new GstMemory object that wraps existing
memory with:
``` c
GstMemory * gst_memory_new_wrapped (GstMemoryFlags flags,
                                    gpointer data, gsize maxsize,
                                    gsize offset, gsize size,
                                    gpointer user_data,
                                    GDestroyNotify notify);
```
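For instance, a sketch that wraps a malloc'd block so that it is freed
automatically when the GstMemory is no longer used:

``` c
gsize size = 1024;
gpointer data = g_malloc (size);
GstMemory *mem;

/* the whole region is accessible: offset 0 and size == maxsize */
mem = gst_memory_new_wrapped (0, data, size, 0, size, data, g_free);
```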
## Lifecycle
GstMemory extends from GstMiniObject and therefore uses its lifecycle
management (See [miniobject](design/miniobject.md)).
## Data Access
Access to the memory region is always controlled with a map and unmap method
call. This allows the implementation to monitor the access patterns or set up
the required memory mappings when needed.
The access of the memory object is controlled with the locking mechanism on
GstMiniObject (See [miniobject](design/miniobject.md)).
Mapping a memory region requires the caller to specify the access method: READ
and/or WRITE. Mapping a memory region will first try to get a lock on the
memory in the requested access mode. This means that the map operation can
fail when WRITE access is requested on a non-writable memory object (it has
an exclusive counter > 1, the memory is already locked in an incompatible
access mode or the memory is marked readonly).
After the data has been accessed in the object, the unmap call must be
performed, which will unlock the memory again.
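A minimal sketch of read access through the map/unmap API, where
`process_data()` is a hypothetical consumer:

``` c
GstMapInfo info;

if (gst_memory_map (mem, &info, GST_MAP_READ)) {
  /* info.data points to info.size readable bytes */
  process_data (info.data, info.size);   /* hypothetical consumer */
  gst_memory_unmap (mem, &info);
}
```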
It is allowed to recursively map multiple times with the same or narrower
access modes. For each of the map calls, a corresponding unmap call needs to
be made. WRITE-only memory cannot be mapped in READ mode and READ-only memory
cannot be mapped in WRITE mode.
The memory pointer returned from the map call is guaranteed to remain valid in
the requested mapping mode until the corresponding unmap call is performed on
the pointer.
When multiple map operations are nested and return the same pointer, the pointer
is valid until the last unmap call is done.
When the final reference on a memory object is dropped, all outstanding
mappings should have been unmapped.
Resizing a GstMemory does not influence any current mappings in any way.
## Copy
A GstMemory copy can be made with the `gst_memory_copy()` call. Normally,
allocators will implement a custom version of this function to make a copy of
the same kind of memory as the original one.
This is what the fallback version of the copy function does, albeit slower
than what a custom implementation could do.
The copy operation is only required to copy the visible range of the memory
block.
## Share
A memory region can be shared between GstMemory objects with the
`gst_memory_share()` operation.
# Messages
Messages are refcounted lightweight objects to signal the application of
pipeline events.
Messages are implemented as a subclass of GstMiniObject with a generic
GstStructure as the content. This allows for writing custom messages
without requiring an API change while allowing a wide range of different
types of messages.
Messages are posted by objects in the pipeline and are passed to the
application using the GstBus (See also [gstbus](design/gstbus.md)
and [gstpipeline](design/gstpipeline.md)).
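As a hedged illustration, an application could block on the bus and react to
errors or EOS like this:

``` c
GstBus *bus = gst_element_get_bus (pipeline);
GstMessage *msg = gst_bus_timed_pop_filtered (bus, GST_CLOCK_TIME_NONE,
    GST_MESSAGE_ERROR | GST_MESSAGE_EOS);

if (msg != NULL) {
  if (GST_MESSAGE_TYPE (msg) == GST_MESSAGE_ERROR) {
    GError *err = NULL;
    gchar *dbg = NULL;

    gst_message_parse_error (msg, &err, &dbg);
    g_printerr ("error: %s\n", err->message);
    g_clear_error (&err);
    g_free (dbg);
  }
  gst_message_unref (msg);
}
gst_object_unref (bus);
```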
## Message types
**`GST_MESSAGE_EOS`**: Posted by sink elements. This message is posted to the
application when all the sinks in a pipeline have posted an EOS message. When
performing a flushing seek, the EOS state of the pipeline and sinks is reset.
**`GST_MESSAGE_ERROR`**: An element in the pipeline got into an error state.
The message carries a GError and a debug string describing the error. This
usually means that part of the pipeline is not streaming anymore.
**`GST_MESSAGE_WARNING`**: An element in the pipeline encountered a condition
that made it produce a warning. This could be a recoverable decoding error or
some other non fatal event. The pipeline continues streaming after a warning.
**`GST_MESSAGE_INFO`**: An element produced an informational message.
**`GST_MESSAGE_TAG`**: An element decoded metadata about the stream. The
message carries a GstTagList with the tag information.
**`GST_MESSAGE_BUFFERING`**: An element is buffering data and that could
potentially take some time. This message is typically emitted by elements that
perform some sort of network buffering. While the pipeline is buffering it
should remain in the PAUSED state. When the buffering is finished, it can
resume PLAYING.
**`GST_MESSAGE_STATE_CHANGED`**: An element changed state in the pipeline.
The message carries the old, new and pending state of the element.
**`GST_MESSAGE_STATE_DIRTY`**: An internal message used to instruct
a pipeline hierarchy that a state recalculation must be performed because an
ASYNC state change completed. This message is not used anymore.
**`GST_MESSAGE_STEP_DONE`**: An element stepping frames has finished. This is
currently not used.
**`GST_MESSAGE_CLOCK_PROVIDE`**: An element notifies its capability of
providing a clock for the pipeline.
**`GST_MESSAGE_CLOCK_LOST`**: The current clock, as selected by the pipeline,
became unusable. The pipeline will select a new clock on the next PLAYING state
change.
**`GST_MESSAGE_NEW_CLOCK`**: A new clock was selected for the pipeline.
**`GST_MESSAGE_STRUCTURE_CHANGE`**: The pipeline changed its structure. This
means that elements were added or removed or that pads were linked or unlinked.
This message is not yet used.
**`GST_MESSAGE_STREAM_STATUS`**: Posted by an element when it
starts/stops/pauses a streaming task. It contains information about the reason
why the stream state changed along with the thread id. The application can use
this information to detect failures in streaming threads and/or to adjust
streaming thread priorities.
**`GST_MESSAGE_APPLICATION`**: The application posted a message. This message
must be used when the application posts a message on the bus.
**`GST_MESSAGE_ELEMENT`**: Element-specific message. See the specific
element's documentation
**`GST_MESSAGE_SEGMENT_START`**: An element started playback of a new
segment. This message is not forwarded to applications but is used internally
to schedule SEGMENT_DONE messages.
**`GST_MESSAGE_SEGMENT_DONE`**: An element or bin completed playback of
a segment. This message is only posted on the bus if a SEGMENT seek is
performed on a pipeline.
**`GST_MESSAGE_DURATION_CHANGED`**: An element posts this message when it has
detected or updated the stream duration.
**`GST_MESSAGE_ASYNC_START`**: Posted by sinks when they start an
asynchronous state change.
**`GST_MESSAGE_ASYNC_DONE`**: Posted by sinks when they receive the first
data buffer and complete the asynchronous state change.
**`GST_MESSAGE_LATENCY`**: Posted by elements when the latency in a pipeline
changed and a new global latency should be calculated by the pipeline or
application.
**`GST_MESSAGE_REQUEST_STATE`**: Posted by elements when they want to change
the state of the pipeline they are in. A typical use case would be an audio
sink that requests the pipeline to pause in order to play a higher priority
stream.
**`GST_MESSAGE_STEP_START`**: A stepping operation has started.
**`GST_MESSAGE_QOS`**: A buffer was dropped or an element changed its
processing strategy for Quality of Service reasons.
**`GST_MESSAGE_PROGRESS`**: A progress message was posted. Progress messages
inform the application about the state of asynchronous operations.
# GstMeta
This document describes the design for arbitrary per-buffer metadata.
Buffer metadata typically describes the low level properties of the
buffer content. These properties are commonly not negotiated with caps
but they are negotiated in the bufferpools.
Some examples of metadata:
- interlacing information
- video alignment, cropping, panning information
- extra container information such as granulepos, …
- extra global buffer properties
## Requirements
- It must be fast
- allocation, free, low fragmentation
- access to the metadata fields, preferably not much slower than
directly accessing a C structure field
- It must be extensible. Elements should be able to add new arbitrary
metadata without requiring much effort. Also new metadata fields
should not break API or ABI.
- It plays nice with subbuffers. When a subbuffer is created, the
various buffer metadata should be copied/updated correctly.
- We should be able to negotiate metadata between elements
## Use cases
- **Video planes**: Video data is sometimes allocated in non-contiguous planes
for the Y and the UV data. We need to be able to specify the data on a buffer
using multiple pointers in memory. We also need to be able to specify the
stride for these planes.
- **Extra buffer data**: Some elements might need to store extra data for
a buffer. This is typically done when the resources are allocated from another
subsystem such as OMX or X11.
- **Processing information**: Pan and crop information can be added to the
buffer data when the downstream element can understand and use this metadata.
An imagesink can, for example, use the pan and cropping information when
blitting the image on the screen with little overhead.
## GstMeta
A GstMeta is a structure as follows:
``` c
struct _GstMeta {
  GstMetaFlags       flags;
  const GstMetaInfo *info;    /* tag and info for the meta item */
};
```
The purpose of this structure is to serve as a common header for all
metadata information that we can attach to a buffer. Specific metadata,
such as timing metadata, will have this structure as the first field.
For example:
``` c
struct _GstMetaTiming {
  GstMeta      meta;          /* common meta header */

  GstClockTime dts;           /* decoding timestamp */
  GstClockTime pts;           /* presentation timestamp */
  GstClockTime duration;      /* duration of the data */
  GstClockTime clock_rate;    /* clock rate for the above values */
};
```
Another example, for video memory regions, consists of both
fields and methods:
``` c
#define GST_VIDEO_MAX_PLANES 4
struct GstMetaVideo {
GstMeta meta;
GstBuffer *buffer;
GstVideoFlags flags;
GstVideoFormat format;
guint id
guint width;
guint height;
guint n_planes;
gsize offset[GST_VIDEO_MAX_PLANES]; /* offset in the buffer memory region of the
* first pixel. */
gint stride[GST_VIDEO_MAX_PLANES]; /* stride of the image lines. Can be negative when
* the image is upside-down */
gpointer (*map) (GstMetaVideo *meta, guint plane, gpointer * data, gint *stride,
GstMapFlags flags);
gboolean (*unmap) (GstMetaVideo *meta, guint plane, gpointer data);
};
gpointer gst_meta_video_map (GstMetaVideo *meta, guint plane, gpointer * data,
gint *stride, GstMapflags flags);
gboolean gst_meta_video_unmap (GstMetaVideo *meta, guint plane, gpointer data);
```
GstMeta derived structures define the API of the metadata. The API can
consist of fields and/or methods. It is possible to have different
implementations for the same GstMeta structure.
The implementation of the GstMeta API would typically add more fields to
the public structure that allow it to implement the API.
GstMetaInfo will point to more information about the metadata and looks
like this:
``` c
struct _GstMetaInfo {
  GType                    api;    /* api type */
  GType                    type;   /* implementation type */
  gsize                    size;   /* size of the structure */

  GstMetaInitFunction      init_func;
  GstMetaFreeFunction      free_func;
  GstMetaTransformFunction transform_func;
};
```
api will contain a GType of the metadata API. A repository of registered
MetaInfo will be maintained by the core. We will register some common
metadata structures in core and some media specific info for
audio/video/text in -base. Plugins can register additional custom
metadata.
For each implementation of api, there will thus be a unique GstMetaInfo.
In the case of metadata with a well defined API, the implementation
specific init function will setup the methods in the metadata structure.
A unique GType will be made for each implementation and stored in the
type field.
Along with the metadata description we will have functions to
initialize/free (and/or refcount) a specific GstMeta instance. We also
have the possibility to add a custom transform function that can be used
to modify the metadata when a transformation happens.
There are no explicit methods to serialize and deserialize the metadata.
Since each type has a GType, we can reuse the GValue transform functions
for this.
The purpose of the separate MetaInfo is to not have to carry the
free/init functions in each buffer instance but to define them globally.
We still want quick access to the info so we need to make the buffer
metadata point to the info.
Technically we could also specify the field and types in the MetaInfo
and provide a generic API to retrieve the metadata fields without the
need for a header file. We will not do this yet.
Allocation of the GstBuffer structure will result in the allocation of a
memory region of a customizable size (512 bytes). Only the first sizeof
(GstBuffer) bytes of this region will initially be used. The remaining
bytes will be part of the free metadata region of the buffer. Different
implementations are possible and are invisible in the API or ABI.
The complete buffer with metadata could, for example, look as follows:
```
                      +-------------------------------------+
GstMiniObject         | GType (GstBuffer)                   |
                      | refcount, flags, copy/disp/free     |
                      +-------------------------------------+
GstBuffer             | pool,pts,dts,duration,offsets       |
                      | <private data>                      |
                      +.....................................+
                      | next                               ---+
                   +- | info                               ------> GstMetaInfo
GstMetaTiming      |  |                                     |  |
                   |  | dts                                 |  |
                   |  | pts                                 |  |
                   |  | duration                            |  |
                   +- | clock_rate                          |  |
                      + . . . . . . . . . . . . . . . . . . +  |
                      | next                               <--+
GstMetaVideo    +- +- | info                               ------> GstMetaInfo
                |  |  |                                     |
                |  |  | flags                               |
                |  |  | n_planes                            |
                |  |  | planes[]                            |
                |  |  | map                                 |
                |  |  | unmap                               |
                +- |  |                                     |
                   |  | private fields                      |
GstMetaVideoImpl   |  | ...                                 |
                   |  | ...                                 |
                   +- |                                     |
                      + . . . . . . . . . . . . . . . . . . + .
                      .                                     .
```
## API examples
Buffers are created using the normal `gst_buffer_new` functions. The
standard fields are initialized as usual. A memory area that is bigger
than the structure size is allocated for the buffer metadata.
``` c
gst_buffer_new ();
```
After creating a buffer, the application can set caps and add metadata
information.
To add or retrieve metadata, a handle to a GstMetaInfo structure needs
to be obtained. This defines the implementation and API of the metadata.
Usually, a handle to this info structure can be obtained by calling a
public `_get_info()` method from a shared library (for shared metadata).
The following defines can usually be found in the shared .h file.
``` c
GstMetaInfo * gst_meta_timing_get_info();
#define GST_META_TIMING_INFO (gst_meta_timing_get_info())
```
Adding metadata to a buffer can be done with the
`gst_buffer_add_meta()` call. This function will create new metadata
based on the implementation specified by the GstMetaInfo. It is also
possible to pass a generic pointer to the `add_meta()` function that can
contain parameters to initialize the new metadata fields.
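A sketch, reusing the GstMetaTiming example from above (passing NULL as the
parameters because the fields are set manually afterwards):

``` c
GstMetaTiming *timing;

timing = (GstMetaTiming *) gst_buffer_add_meta (buffer,
    GST_META_TIMING_INFO, NULL);
```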
Retrieving the metadata on a buffer can be done with the
`gst_buffer_get_meta()` method. This function retrieves an existing
metadata conforming to the API specified in the given info. When no such
metadata exists, the function will return NULL.
``` c
GstMetaTiming *timing;
timing = gst_buffer_get_meta (buffer, GST_META_TIMING_INFO);
```
Once a reference to the info has been obtained, the associated metadata
can be added or modified on a buffer.
``` c
timing->timestamp = 0;
timing->duration = 20 * GST_MSECOND;
```
Other convenience macros can be made to simplify the above code:
``` c
#define gst_buffer_get_meta_timing(b) \
  ((GstMetaTiming *) gst_buffer_get_meta ((b), GST_META_TIMING_INFO))
```
This makes the code look like this:
``` c
GstMetaTiming *timing;
timing = gst_buffer_get_meta_timing (buffer);
timing->timestamp = 0;
timing->duration = 20 * GST_MSECOND;
```
To iterate the different metainfo structures, one can use the
`gst_buffer_meta_get_next()` methods.
``` c
GstMeta *current = NULL;
/* passing NULL gives the first entry */
current = gst_buffer_meta_get_next (buffer, current);
/* passing a GstMeta returns the next */
current = gst_buffer_meta_get_next (buffer, current);
```
## Memory management
### allocation
We initially allocate a reasonable sized GstBuffer structure (say 512 bytes).
Since the complete buffer structure, including a large area for metadata, is
allocated in one go, we can reduce the number of memory allocations while still
providing dynamic metadata.
When adding metadata, we need to call the init function of the associated
metadata info structure. Since adding the metadata requires the caller to pass
a handle to the info, this operation does not require table lookups.
Per-metadata memory initialisation is needed because not all metadata is
initialized in the same way. We need to, for example, set the timestamps to
NONE in the MetaTiming structures.
The init/free functions can also be used to implement refcounting for a metadata
structure. This can be useful when a structure is shared between buffers.
When the free_size of the GstBuffer is exhausted, we will allocate new memory
for each newly added Meta and use the next pointers to point to this. It
is expected that this does not occur often and we might be able to optimize
this transparently in the future.
### free
When a GstBuffer is freed, we potentially might have to call a custom free
function on the metadata info. In the case of the Memory metadata, we need to
call the associated free function to free the memory.
When freeing a GstBuffer, the custom buffer free function will iterate all of
the metadata in the buffer and call the associated free functions in the
MetaInfo associated with the entries. Usually, this function will be NULL.
## Serialization
When a buffer should be sent over the wire or be serialized in GDP, we
need a way to perform custom serialization and deserialization on the
metadata.
For this we can use the GValue transform functions.
## Transformations
After certain transformations, the metadata on a buffer might not be
relevant anymore.
Consider, for example, metadata that lists certain regions of interest
on the video data. If the video is scaled or rotated, the coordinates
might not make sense anymore. A transform element should be able to
adjust or remove the associated metadata when it becomes invalid.
We can make the transform element aware of the metadata so that it can
adjust or remove in an intelligent way. Since we allow arbitrary
metadata, we can't do this for all metadata and thus we need some other
way.
One proposition is to tag the metadata type with keywords that specify
what it functionally refers to. We could, for example, tag the metadata
for the regions of interest with a tag that notes that the metadata
refers to absolute pixel positions. A transform could then know that the
metadata is not valid anymore when the position of the pixels changed
(due to rotation, flipping, scaling and so on).
## Subbuffers
Subbuffers are implemented with a generic copy. Parameters to the copy
are the offset and size. This allows each metadata structure to
implement the actions needed to update the metadata of the subbuffer.
It might not make sense for some metadata to work with subbuffers. For
example when we take a subbuffer of a buffer with a video frame, the
GstMetaVideo simply becomes invalid and is removed from the new
subbuffer.
## Relationship with GstCaps
The difference between GstCaps, used in negotiation, and the metadata is
not clearly defined.
We would like to think of the GstCaps containing the information needed
to functionally negotiate the format between two elements. The Metadata
should then only contain variables that can change between each buffer.
For example, for video we would have width/height/framerate in the caps
but then have the more technical details, such as stride, data pointers,
pan/crop/zoom etc in the metadata.
A scheme like this would still allow us to functionally specify the
desired video resolution while the implementation details would be
inside the metadata.
## Relationship with GstMiniObject qdata
qdata on a miniobject is element private and is not visible to other
elements. Therefore qdata never contains essential information that
describes the buffer content.
## Compatibility
We need to make sure that elements exchange metadata that they both
understand. This is particularly important when the metadata describes
the data layout in memory (such as strides).
The ALLOCATION query is used to let upstream know what metadata we can
support.
It is also possible to have a bufferpool add certain metadata to the
buffers from the pool. This feature is activated by enabling a buffer
option when configuring the pool.
## Notes
Some structures that we need to be able to add to buffers.
- Clean Aperture
- Arbitrary Matrix Transform
- Aspect ratio
- Pan/crop/zoom
- Video strides
Some of these overlap; we need to find a minimal set of metadata
structures that allows us to define all use cases.
# GstMiniObject
This document describes the design of the miniobject base class.
The miniobject abstract base class is used to construct lightweight
refcounted and boxed types that are frequently created and destroyed.
## Requirements
- Be lightweight
- Refcounted
- It must be possible to control access to the object, i.e. when the
object is readable and writable.
- Subclasses must be able to use their own allocator for the memory.
## Usage
Users of the GstMiniObject infrastructure will need to define a
structure that includes the GstMiniObject structure as the first field.
``` c
typedef struct {
  GstMiniObject mini_object;

  /* my fields */
  ...
} MyObject;
```
The subclass should then implement a constructor method where it
allocates the memory for its structure and initializes the miniobject
structure with `gst_mini_object_init()`. Copy and Free functions are
provided to the `gst_mini_object_init()` function.
``` c
MyObject *
my_object_new (void)
{
  MyObject *res = g_slice_new (MyObject);

  gst_mini_object_init (GST_MINI_OBJECT_CAST (res), 0,
      MY_TYPE_OBJECT,
      (GstMiniObjectCopyFunction) _my_object_copy,
      (GstMiniObjectDisposeFunction) NULL,
      (GstMiniObjectFreeFunction) _my_object_free);

  /* other init */
  .....

  return res;
}
```
The Free function is responsible for freeing the allocated memory for
the structure.
``` c
static void
_my_object_free (MyObject *obj)
{
  /* other cleanup */
  ...

  g_slice_free (MyObject, obj);
}
```
## Lifecycle
GstMiniObject is refcounted. When a GstMiniObject is first created, it
has a refcount of 1.
Each variable holding a reference to a GstMiniObject is responsible for
updating the refcount. This includes incrementing the refcount with
`gst_mini_object_ref()` when a reference is kept to a miniobject or
`gst_mini_object_unref()` when a reference is released.
When the refcount reaches 0, and thus no objects hold a reference to the
miniobject anymore, we can free the miniobject.
When freeing the miniobject, first the GstMiniObjectDisposeFunction is
called. This function is allowed to revive the object again by
incrementing the refcount, in which case it should return FALSE from the
dispose function. The dispose function is used by GstBuffer to revive
the buffer back into the GstBufferPool when needed.
When the dispose function returns TRUE, the GstMiniObjectFreeFunction
will be called and the miniobject will be freed.
## Copy
A miniobject can be copied with `gst_mini_object_copy()`. This function
will call the custom copy function that was provided when registering
the new GstMiniObject subclass.
The copy function should try to preserve as much info from the original
object as possible.
The new copy should be writable.
## Access management
GstMiniObject can be shared between multiple threads. It is important
that when a thread writes to a GstMiniObject, the other threads
do not see the changes.
To avoid exposing changes from one thread to another thread, the
miniobjects are managed in a Copy-On-Write way. A copy is only made when
it is known that the object is shared between multiple objects or
threads.
There are 2 methods implemented for controlling access to the
miniobject.
- A first method relies on the refcount of the object to control
writability. Objects using this method have the LOCKABLE flag unset.
- A second method relies on a separate counter for controlling the
access to the object. Objects using this method have the LOCKABLE
flag set.
You can check if an object is writable with `gst_mini_object_is_writable()` and
you can make any miniobject writable with `gst_mini_object_make_writable()`.
This will create a writable copy when the object was not writable.
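For example:

``` c
/* obj may be shared; after this call it is guaranteed to be
 * writable, possibly at the cost of a copy */
obj = gst_mini_object_make_writable (obj);
```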
### non-LOCKABLE GstMiniObjects
These GstMiniObjects have the LOCKABLE flag unset. They use the refcount value
to control writability of the object.
When the refcount of the miniobject is > 1, the object is referenced by at
least 2 objects and is thus considered unwritable. A copy must be made before a
modification to the object can be done.
Using the refcount to control writability is problematic for many language
bindings that can keep additional references to the objects. This method is
mainly for historical reasons until all users of the miniobjects are
converted to use the LOCKABLE flag.
### LOCKABLE GstMiniObjects
These GstMiniObjects have the LOCKABLE flag set. They use a separate counter
for controlling writability and access to the object.
It consists of 2 components:
#### exclusive counter
Each object that wants to keep a reference to a GstMiniObject and doesn't want to
see the changes from other owners of the same GstMiniObject needs to lock the
GstMiniObject in EXCLUSIVE mode, which will increase the exclusive counter.
The exclusive counter counts the amount of objects that share this
GstMiniObject. The counter is initially 0, meaning that the object is not shared with
any object.
When a reference to a GstMiniObject is released, both the refcount and the
exclusive counter are decreased with `gst_mini_object_unref()` and
`gst_mini_object_unlock()` respectively.
#### locking
All read and write access must be performed between a `gst_mini_object_lock()`
and `gst_mini_object_unlock()` pair with the requested access method.
A `gst_mini_object_lock()` can fail when a `WRITE` lock is requested and the
exclusive counter is > 1. Indeed a GstMiniObject object with an exclusive
counter > 1 is locked EXCLUSIVELY by at least 2 objects and is therefore not
writable.
Once the GstMiniObject is locked with a certain access mode, it can be
recursively locked with the same or narrower access mode. For example, first
locking the GstMiniObject in READWRITE mode allows you to recursively lock the
GstMiniObject in READWRITE, READ and WRITE mode. Memory locked in READ mode
cannot be locked recursively in WRITE or READWRITE mode.
Note that multiple threads can READ lock the GstMiniObject concurrently, but
none of them can lock it in WRITE mode: sharing the object between threads
requires an exclusive counter > 1, which makes WRITE locks fail.
All calls to `gst_mini_object_lock()` need to be paired with one
`gst_mini_object_unlock()` call with the same access mode. When the last
refcount of the object is removed, there should be no more outstanding locks.
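A minimal sketch of guarded write access to a LOCKABLE miniobject:

``` c
if (gst_mini_object_lock (obj, GST_LOCK_FLAG_WRITE)) {
  /* ... modify the object ... */
  gst_mini_object_unlock (obj, GST_LOCK_FLAG_WRITE);
} else {
  /* locked EXCLUSIVELY by more than one owner: not writable */
}
```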
Note that a shared counter of both 0 and 1 leaves the GstMiniObject writable.
The reason is to make it easy to create and pass ownership of the GstMiniObject
to another object while keeping it writable. When the GstMiniObject is created
with a shared count of 0, it is writable. When the GstMiniObject is then added
to another object, the shared count is incremented to 1 and the GstMiniObject
remains writable. The 0 share counter has a similar purpose as the floating
reference in GObject.
## Weak references
GstMiniObject has support for weak references. A callback will be called
when the object is freed for all registered weak references.
## QData
Extra data can be associated with a GstMiniObject by using the QData
API.
# What to do when a plugin is missing
The mechanism and API described in this document requires GStreamer core
and gst-plugins-base versions \>= 0.10.12. Further information on some
aspects of this document can be found in the libgstbaseutils API
reference.
We only discuss playback pipelines for now.
A three step process:

# GStreamer level

Elements will use a "missing-plugin" element message to report
missing plugins, with the following fields set:
* **`type`**: (string) { "urisource", "urisink", "decoder", "encoder",
"element" } (we do not distinguish between demuxer/decoders/parsers etc.)
* **`detail`**: (string) or (caps) depending on the type { ANY } ex: "mms",
"mmsh", "audio/x-mp3,rate=48000,…"
* **`name`**: (string) { ANY } ex: "MMS protocol handler",..
## missing uri handler
ex. mms://foo.bar/file.asf
When no protocol handler is installed for mms://, the application will not be
able to instantiate an element for that uri (gst_element_make_from_uri()
returns NULL).
Playbin will post a "missing-plugin" element message with the type set to
"urisource", detail set to "mms". Optionally the friendly name can be filled
in as well.
## missing typefind function
We don't recognize the type of the file, this should normally not happen
because all the typefinders are in the basic GStreamer installation.
There is not much useful information we can give about how to resolve this
issue. It is possible to use the first N bytes of the data to determine the
type (and needed plugin) on the server. We don't explore this option in this
document yet, but the proposal is flexible enough to accommodate this in the
future should the need arise.
## missing demuxer
Typically after running typefind on the data we determine the type of the
file. If there is no plugin found for the type, a "missing-plugin" element
message is posted by decodebin with the following fields: Type set to
"decoder", detail set to the caps for witch no plugin was found. Optionally
the friendly name can be filled in as well.
## missing decoder
The demuxer will dynamically create new pads with specific caps while it
figures out the contents of the container format. Decodebin tries to find the
decoders for these formats in the registry. If there is no decoder found, a
"missing-plugin" element message is posted by decodebin with the following
fields: Type set to "decoder", detail set to the caps for which no plugin
was found. Optionally the friendly name can be filled in as well. There is
no distinction made between the missing demuxer and decoder at the
application level.
## missing element
Decodebin and playbin will create a set of helper elements when they set up
their decoding pipeline. These elements are typically colorspace, sample rate,
audio sinks,... Their presence on the system is required for the functionality
of decodebin. It is typically a package dependency error if they are not
present but in case of a corrupted system the following "missing-plugin"
element message will be emitted: type set to "element", detail set to the
element factory name and the friendly name optionally set to a description
of the element's functionality in the decoding pipeline.
Except for reporting the missing plugins, no further policy is enforced at the
GStreamer level. It is up to the application to decide whether a missing
plugin constitutes a problem or not.
# Application level
The application's job is to listen for the "missing-plugin" element messages
and to decide on a policy to handle them. Following cases exist:
## partially missing plugins
The application will be able to complete a state change to PAUSED but there
will be a "missing-plugin" element message on the GstBus.
This means that it will be possible to play back part of the media file but not
all of it.
For example: suppose we have an .avi file with mp3 audio and divx video. If we
have the mp3 audio decoder but not the divx video decoder, it will be possible
to play only the audio part but not the video part. For an audio playback
application, this is not a problem but a video player might want to decide on:
- require the user to install the additionally required plugins.
- inform the user that only the audio will be played back
- ask the user if it should download the additional codec or only play
the audio part.
- …
## completely unplayable stream
The application will receive an ERROR message from GStreamer informing it that
playback stopped (before it could reach PAUSED). This happens because none of
the streams is connected to a decoder. The error code and domain should be one
of the following in this case:
- `GST_CORE_ERROR_MISSING_PLUGIN` (domain: GST_CORE_ERROR)
- `GST_STREAM_ERROR_CODEC_NOT_FOUND` (domain: GST_STREAM_ERROR)
The application can then see that there are a set of "missing-plugin" element
messages on the GstBus and can decide to trigger the download procedure. It
does that as described in the following section.
"missing-plugin" element messages can be identified using the function
gst_is_missing_plugin_message().
# Plugin download stage
At this point the application has
- collected one or more "missing-plugin" element messages
- made a decision that additional plugins should be installed
It will call a GStreamer utility function to convert each "missing-plugin"
message into an identifier string describing the missing capability. This is
done using the function `gst_missing_plugin_message_get_installer_detail()`.
The application will then pass these strings to `gst_install_plugins_async()`
or `gst_install_plugins_sync()` to initiate the download. See the API
documentation there (`libgstbaseutils`, part of `gst-plugins-base`) for more
details.
When new plugins have been installed, the application will have to initiate
a re-scan of the GStreamer plugin registry using gst_update_registry().
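Sketched in code, assuming the pbutils API from gst-plugins-base described
above (error handling omitted):

``` c
#include <gst/pbutils/pbutils.h>

static void
install_done_cb (GstInstallPluginsReturn result, gpointer user_data)
{
  if (result == GST_INSTALL_PLUGINS_SUCCESS)
    gst_update_registry ();   /* rescan for the newly installed plugins */
}

static void
handle_missing_plugin (GstMessage *msg)
{
  if (gst_is_missing_plugin_message (msg)) {
    gchar *detail = gst_missing_plugin_message_get_installer_detail (msg);
    const gchar *details[] = { detail, NULL };

    gst_install_plugins_async ((const gchar * const *) details, NULL,
        install_done_cb, NULL);
    g_free (detail);
  }
}
```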
# Format of the (UTF-8) string ID passed to the external installer system
The string is made up of several fields, separated by '|' characters.
The fields are:
- plugin system identifier, ie. "gstreamer" This identifier determines
the format of the rest of the detail string. Automatic plugin
installers should not process detail strings with unknown
identifiers. This allows other plugin-based libraries to use the
same mechanism for their automatic plugin installation needs, or for
the format to be changed should it turn out to be insufficient.
- plugin system version, e.g. "1.0" This is required so that when
there is a GStreamer-2.0 or GStreamer-3.0 at some point in future,
the different major versions can still co-exist and use the same
plugin install mechanism in the same way.
- application identifier, e.g. "totem" This may also be in the form of
"pid/12345" if the program name cant be obtained for some reason.
- human-readable localised description of the required component, e.g.
"Vorbis audio decoder"
- identifier string for the required component, e.g.
- urisource-(PROTOCOL_REQUIRED) e.g. `urisource-http` or `urisource-mms`
- element-(ELEMENT_REQUIRED), e.g. `element-videoconvert`
- decoder-(CAPS_REQUIRED) e.g. `decoder-audio/x-vorbis` or
`decoder-application/ogg` or `decoder-audio/mpeg, mpegversion=(int)4` or
`decoder-video/mpeg, systemstream=(boolean)true, mpegversion=(int)2`
- encoder-(CAPS_REQUIRED) e.g. `encoder-audio/x-vorbis`
- optional further fields not yet specified
* An entire ID string might then look like this, for example:
`gstreamer|0.10|totem|Vorbis audio decoder|decoder-audio/x-vorbis`
* Plugin installers parsing this ID string should expect further fields also
separated by '|' symbols and either ignore them, warn the user, or error
out when encountering them.
* The human-readable description string is provided by the libgstbaseutils
library that can be found in gst-plugins-base versions >= 0.10.12 and can
also be used by demuxers to find out the codec names for taglists from given
caps in a unified and consistent way.
* Applications can create these detail strings using the function
`gst_missing_plugin_message_get_installer_detail()` on a given missing-plugin
message.
# Using missing-plugin messages for error reporting:
Missing-plugin messages are also useful for error reporting purposes, either in
the case where the application does not support libgimme-codec, or the external
installer is not available or not able to install the required plugins.
When creating error messages, applications may use the function
gst_missing_plugin_message_get_description() to obtain a possibly translated
description from each missing-plugin message (e.g. "Matroska demuxer" or
"Theora video depayloader"). This can be used to report to the user exactly
what it is that is missing.
# Notes for packagers
An easy way to introspect plugin .so files is:
```
$ gst-inspect --print-plugin-auto-install-info /path/to/libgstfoo.so
```
The output will be something like:
```
decoder-audio/x-vorbis
element-vorbisdec
element-vorbisenc
element-vorbisparse
element-vorbistag
encoder-audio/x-vorbis
```
BUT could also be like this (from the faad element in this case):
```
decoder-audio/mpeg, mpegversion=(int){ 2, 4 }
```
NOTE that this does not exactly match the caps string that the installer
will get from the application. The application will only ever ask for
one of
```
decoder-audio/mpeg, mpegversion=(int)2
decoder-audio/mpeg, mpegversion=(int)4
```
When introspecting, keep in mind that there are GStreamer plugins
that in turn load external plugins. Examples of these are pitfdll,
ladspa, or the GStreamer libvisual plugin. Those plugins will only
announce elements for the currently installed external plugins at
the time of introspection! With the exception of pitfdll, this is
not really relevant to the playback case, but may become an issue in
the future when applications like buzztard, jokosher or pitivi start
requesting elements by name, for example ladspa effect elements or
so.
This case could be handled if those wrapper plugins would also provide a
`gst-install-xxx-plugins-helper`, where xxx={ladspa|visual|...}. Thus if the
distro specific `gst-install-plugins-helper` can't resolve a request for e.g.
`element-bml-sonicverb` it can forward the request to
`gst-install-bml-plugins-helper` (bml is the buzz machine loader).
# Further references:
<http://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-base-libs/html/gstreamer-base-utils.html>
# Negotiation
Capabilities negotiation is the process of deciding on an adequate
format for dataflow within a GStreamer pipeline. Ideally, negotiation
(also known as "capsnego") transfers information from those parts of the
pipeline that have information to those parts of the pipeline that are
flexible, constrained by those parts of the pipeline that are not
flexible.
## Basic rules
These simple rules must be followed:
1) downstream suggests formats
2) upstream decides on format
There are 4 queries/events used in caps negotiation:
1) `GST_QUERY_CAPS`: get possible formats
2) `GST_QUERY_ACCEPT_CAPS`: check if format is possible
3) `GST_EVENT_CAPS`: configure format (downstream)
4) `GST_EVENT_RECONFIGURE`: inform upstream of possibly new caps
## Queries
A pad can ask the peer pad for its supported GstCaps. It does this with
the CAPS query. The list of supported caps can be used to choose an
appropriate GstCaps for the data transfer. The CAPS query works
recursively, elements should take their peers into consideration when
constructing the possible caps. Because the result caps can be very
large, the filter can be used to restrict the caps. Only the caps that
match the filter will be returned as the result caps. The order of the
filter caps gives the order of preference of the caller and should be
taken into account for the returned caps.
* **`filter`** (in) GST_TYPE_CAPS (default NULL): - a GstCaps to filter the results against
* **`caps`** (out) GST_TYPE_CAPS (default NULL): - the result caps
A pad can ask the peer pad if it supports a given caps. It does this
with the `ACCEPT_CAPS` query. The caps must be fixed. The `ACCEPT_CAPS`
query is not required to work recursively, it can simply return TRUE if
a subsequent CAPS event with those caps would return success.
* **`caps`** (in) GST_TYPE_CAPS: - a GstCaps to check, must be fixed
* **`result`** (out) G_TYPE_BOOLEAN (default FALSE): - TRUE if the caps are accepted
## Events
When a media format is negotiated, peer elements are notified of the
GstCaps with the CAPS event. The caps must be fixed.
* **`caps`** GST_TYPE_CAPS: - the negotiated GstCaps, must be fixed
## Operation
GStreamer's two scheduling modes, push mode and pull mode, lend
themselves to different mechanisms to achieve this goal. As it is more
common, we describe push-mode negotiation first.
## Push-mode negotiation
Push-mode negotiation happens when elements want to push buffers and
need to decide on the format. This is called downstream negotiation
because the upstream element decides the format for the downstream
element. This is the most common case.
Negotiation can also happen when a downstream element wants to receive
another data format from an upstream element. This is called upstream
negotiation.
The basics of negotiation are as follows:
- GstCaps (see [caps](design/caps.md)) are refcounted before they are pushed as
an event to describe the contents of the following buffer.
- An element should reconfigure itself to the new format received as a
CAPS event before processing the following buffers. If the data type
in the caps event is not acceptable, the element should refuse the
event. The element should also refuse the next buffers by returning
an appropriate `GST_FLOW_NOT_NEGOTIATED` return value from the
chain function.
- Downstream elements can request a format change of the stream by
sending a RECONFIGURE event upstream. Upstream elements will
renegotiate a new format when they receive a RECONFIGURE event.
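As a sketch, a downstream element can trigger such a renegotiation with:

``` c
/* pushed on a sink pad, the RECONFIGURE event travels upstream */
gst_pad_push_event (sinkpad, gst_event_new_reconfigure ());
```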
The general flow for a source pad starting the negotiation.
```
            src              sink
             |                 |
             |  querycaps?     |
             |---------------->|
             |      caps       |
select caps  |< - - - - - - - -|
from the     |                 |
candidates   |                 |
             |                 |-.
             |    accepts?     | |
 type A      |---------------->| | optional
             |       yes       | |
             |< - - - - - - - -| |
             |                 |-'
             |  send_event()   |
 send CAPS   |---------------->| Receive type A, reconfigure to
 event A     |                 | process type A.
             |                 |
             |      push       |
 push buffer |---------------->| Process buffer of type A
             |                 |
```
One possible implementation in pseudo code:
```
[element wants to create a buffer]

if not format
  # see what we can do
  ourcaps = gst_pad_query_caps (srcpad)
  # see what the peer can do filtered against our caps
  candidates = gst_pad_peer_query_caps (srcpad, ourcaps)

  foreach candidate in candidates
    # make sure the caps is fixed
    fixedcaps = gst_pad_fixate_caps (srcpad, candidate)

    # see if the peer accepts it
    if gst_pad_peer_accept_caps (srcpad, fixedcaps)
      # store the caps as the negotiated caps, this will
      # call the setcaps function on the pad
      gst_pad_push_event (srcpad, gst_event_new_caps (fixedcaps))
      break
    endif
  done
endif

# Negotiate allocator/bufferpool with the ALLOCATION query

buffer = gst_buffer_new_allocate (NULL, size, 0)

# fill buffer and push
```
The general flow for a sink pad starting a renegotiation.
```
             src              sink
              |                 |
              |    accepts?     |
              |<----------------| type B
              |       yes       |
              |- - - - - - - - >|-.
              |                 | | suggest B caps next
              |                 |<'
              |                 |
              |  push_event()   |
      mark  .-|<----------------| send RECONFIGURE event
 renegotiate| |                 |
            '>|                 |
              |  querycaps()    |
 renegotiate  |---------------->|
              |    suggest B    |
              |< - - - - - - - -|
              |                 |
              |  send_event()   |
  send CAPS   |---------------->| Receive type B, reconfigure to
  event B     |                 | process type B.
              |                 |
              |      push       |
  push buffer |---------------->| Process buffer of type B
              |                 |
```
## Use cases
### `videotestsrc ! xvimagesink`
* Who decides what format to use?
- src pad always decides, by convention. sinkpad can suggest a format
by putting it high in the caps query result GstCaps.
- since the src decides, it can always choose something that it can do,
so this step can only fail if the sinkpad stated it could accept
something while later on it couldn't.
* When does negotiation happen?
- before srcpad does a push, it figures out a type as stated in 1), then
it pushes a caps event with the type. The sink checks the media type and
configures itself for this type.
- the source then usually does an ALLOCATION query to negotiate a bufferpool
with the sink. It then allocates a buffer from the pool and pushes it to
the sink. Since the sink accepted the caps, it can create a pool for the
format.
- since the sink stated in 1) it could accept the type, it will be able to
handle it.
* How can sink request another format?
- sink asks if new format is possible for the source.
- sink pushes RECONFIGURE event upstream
- src receives the RECONFIGURE event and marks renegotiation
- On the next buffer push, the source renegotiates the caps and the
bufferpool. The sink will put the new preferred format high in the list
of caps it returns from its caps query.
### `videotestsrc ! queue ! xvimagesink`
- queue proxies all accept and caps queries to the other peer pad.
- queue proxies the bufferpool
- queue proxies the RECONFIGURE event
- queue stores CAPS event in the queue. This means that the queue can
contain buffers with different types.
## Pull-mode negotiation
### Rationale
A pipeline in pull mode has different negotiation needs than one
activated in push mode. Push mode is optimized for two use cases:
- Playback of media files, in which the demuxers and the decoders are
the points from which format information should disseminate to the
rest of the pipeline; and
- Recording from live sources, in which users are accustomed to
putting a capsfilter directly after the source element; thus the
caps information flow proceeds from the user, through the potential
caps of the source, to the sinks of the pipeline.
In contrast, pull mode has other typical use cases:
- Playback from a lossy source, such as RTP, in which more knowledge
about the latency of the pipeline can increase quality; or
- Audio synthesis, in which audio APIs are tuned to produce only the
necessary number of samples, typically driven by a hardware
interrupt to fill a DMA buffer or a Jack[0] port buffer.
- Low-latency effects processing, whereby filters should be applied as
data is transferred from a ring buffer to a sink instead of
beforehand. For example, instead of using the internal alsasink
ringbuffer thread in push-mode wavsrc \! volume \! alsasink, placing
the volume inside the sound card writer thread via wavsrc \!
audioringbuffer \! volume \! alsasink.
[0] <http://jackit.sf.net>
The problem with pull mode is that the sink has to know the format in
order to know how many bytes to pull via `gst_pad_pull_range()`. This
means that before pulling, the sink must initiate negotiation to decide
on a format.
Recalling the principles of capsnego, whereby information must flow from
those that have it to those that do not, we see that the three named use
cases have different negotiation requirements:
- RTP and low-latency playback are both like the normal playback case,
in which information flows downstream.
- In audio synthesis, the part of the pipeline that has the most
information is the sink, constrained by the capabilities of the
graph that feeds it. However the caps are not completely specified;
at some point the user has to intervene to choose the sample rate,
at least. This can be done externally to GStreamer, as in the jack
elements, or internally via a capsfilter, as is customary with live
sources.
Given that sinks potentially need the input of sources, as in the RTP
case and at least as a filter in the synthesis case, there must be a
negotiation phase before the pull thread is activated. Also, given the
low latency offered by pull mode, we want to avoid capsnego from within
the pulling thread, in case it causes us to miss our scheduling
deadlines.
The pull thread is usually started in the PAUSED→PLAYING state change.
We must be able to complete the negotiation before this state change
happens.
The time to do capsnego, then, is after the SCHEDULING query has
succeeded, but before the sink has spawned the pulling thread.
### Mechanism
The sink determines that the upstream elements support pull based
scheduling by doing a SCHEDULING query.
The sink initiates the negotiation process by intersecting the results
of `gst_pad_query_caps()` on its sink pad and its peer src pad. This is
the operation performed by `gst_pad_get_allowed_caps()`. In the simple
passthrough case, the peer pad's caps query should return the
intersection of calling `get_allowed_caps()` on all of its sink pads. In
this way the sink element knows the capabilities of the entire pipeline.
The sink element then fixates the resulting caps, if necessary,
resulting in the flow caps. From now on, the caps query of the sinkpad
will only return these fixed caps meaning that upstream elements will
only be able to produce this format.
If the sink element could not set caps on its sink pad, it should post
an error message on the bus indicating that negotiation was not
possible.
When negotiation succeeds, the sinkpad and all upstream internally
linked pads are activated in pull mode. Typically, this operation will
trigger negotiation on the downstream elements, which will now be forced
to negotiate to the final fixed desired caps of the sinkpad.
After these steps, the sink element returns ASYNC from the state change
function. The state will commit to PAUSED when the first buffer is
received in the sink. This is needed to provide a consistent API to the
applications that expect ASYNC return values from sinks but it also
allows us to perform the remainder of the negotiation outside of the
context of the pulling thread.
## Patterns
We can identify 3 patterns in negotiation:
* Fixed : Can't choose the output format
- Caps encoded in the stream
- A video/audio decoder
- usually uses gst_pad_use_fixed_caps() (see the sketch after this list)
* Transform
- Caps not modified (passthrough)
- can do caps transform based on element property
- fixed caps get transformed into fixed caps
- videobox
* Dynamic : can choose output format
- A converter element
- depends on downstream caps, needs to do a CAPS query to find
transform.
- usually prefers to use the identity transform
- fixed caps can be transformed into unfixed caps.
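A minimal sketch of the *fixed* pattern above (the `srcpad` variable and the
caps values are illustrative assumptions):

``` c
/* A decoder cannot choose its output format: the caps follow from
 * the encoded stream. Mark the pad fixed and send the caps event. */
GstCaps *caps;

gst_pad_use_fixed_caps (srcpad);

caps = gst_caps_new_simple ("audio/x-raw",
    "format", G_TYPE_STRING, "S16LE",
    "rate", G_TYPE_INT, 44100,
    "channels", G_TYPE_INT, 2,
    NULL);
gst_pad_push_event (srcpad, gst_event_new_caps (caps));
gst_caps_unref (caps);
```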
markdown/design/overview.md
# Overview
This part gives an overview of the design of GStreamer with references
to the more detailed explanations of the different topics.
This document is intended for people who want to have a global overview
of the inner workings of GStreamer.
## Introduction
GStreamer is a set of libraries and plugins that can be used to
implement various multimedia applications, ranging from desktop players
and audio/video recorders to multimedia servers and transcoders.
Applications are built by constructing a pipeline composed of elements.
An element is an object that performs some action on a multimedia stream
such as:
- read a file
- decode or encode between formats
- capture from a hardware device
- render to a hardware device
- mix or multiplex multiple streams
Elements have input and output pads called sink and source pads in
GStreamer. An application links elements together on pads to construct a
pipeline. Below is an example of an ogg/vorbis playback pipeline.
```
+-----------------------------------------------------------+
| ----------> downstream -------------------> |
| |
| pipeline |
| +---------+ +----------+ +-----------+ +----------+ |
| | filesrc | | oggdemux | | vorbisdec | | alsasink | |
| | src-sink src-sink src-sink | |
| +---------+ +----------+ +-----------+ +----------+ |
| |
| <---------< upstream <-------------------< |
+-----------------------------------------------------------+
```
The filesrc element reads data from a file on disk. The oggdemux element
parses the data and sends the compressed audio data to the vorbisdec
element. The vorbisdec element decodes the compressed data and sends it
to the alsasink element. The alsasink element sends the samples to the
audio card for playback.
Downstream and upstream are the terms used to describe the direction in
the Pipeline. From source to sink is called "downstream" and "upstream"
is from sink to source. Dataflow always happens downstream.
The task of the application is to construct a pipeline as above using
existing elements. This is further explained in the pipeline building
topic.
The application does not have to manage any of the complexities of the
actual dataflow/decoding/conversions/synchronisation etc. but only calls
high level functions on the pipeline object such as PLAY/PAUSE/STOP.
The application also receives messages and notifications from the
pipeline such as metadata, warning, error and EOS messages.
If the application needs more control over the graph it is possible to
directly access the elements and pads in the pipeline.
## Design overview
GStreamer design goals include:
- Process large amounts of data quickly
- Allow fully multithreaded processing
- Ability to deal with multiple formats
- Synchronize different dataflows
- Ability to deal with multiple devices
The capabilities presented to the application depend on the number of
elements installed on the system and their functionality.
The GStreamer core is designed to be media agnostic but provides many
features to elements to describe media formats.
## Elements
The smallest building blocks in a pipeline are elements. An element
provides a number of pads which can be source or sinkpads. Sourcepads
provide data and sinkpads consume data. Below is an example of an ogg
demuxer element that has one pad that takes (sinks) data and two source
pads that produce data.
```
+-----------+
| oggdemux |
| src0
sink src1
+-----------+
```
An element can be in four different states: NULL, READY, PAUSED,
PLAYING. In the NULL and READY state, the element is not processing any
data. In the PLAYING state it is processing data. The intermediate
PAUSED state is used to preroll data in the pipeline. A state change can
be performed with `gst_element_set_state()`.
An element always goes through all the intermediate state changes. This
means that when an element is in the READY state and is put to PLAYING,
it will first go through the intermediate PAUSED state.
An element state change to PAUSED will activate the pads of the element.
First the source pads are activated, then the sinkpads. When the pads
are activated, the pad activate function is called. Some pads will start
a thread (GstTask) or some other mechanism to start producing or
consuming data.
The PAUSED state is special as it is used to preroll data in the
pipeline. The purpose is to fill all connected elements in the pipeline
with data so that the subsequent PLAYING state change happens very
quickly. Some elements will therefore not complete the state change to
PAUSED before they have received enough data. Sink elements are required
to only complete the state change to PAUSED after receiving the first
data.
Normally the state changes of elements are coordinated by the pipeline
as explained in [states](design/states.md).
Different categories of elements exist:
- *source elements*: these are elements that do not consume data but
only provide data for the pipeline.
- *sink elements*: these are elements that do not produce data but
render data to an output device.
- *transform elements*: these elements transform an input stream in a
certain format into a stream of another format.
Encoder/decoder/converters are examples.
- *demuxer elements*: these elements parse a stream and produce several
output streams.
- *mixer/muxer elements*: combine several input streams into one output
stream.
Other categories of elements can be constructed (see [klass](design/draft-klass.md)).
## Bins
A bin is an element subclass and acts as a container for other elements
so that multiple elements can be combined into one element.
A bin coordinates its children's state changes as explained later. It
also distributes events and various other functionality to elements.
A bin can have its own source and sinkpads by ghostpadding one or more
of its children's pads to itself.
Below is a picture of a bin with two elements. The sinkpad of one
element is ghostpadded to the bin.
```
+---------------------------+
| bin |
| +--------+ +-------+ |
| | | | | |
| /sink src-sink | |
sink +--------+ +-------+ |
+---------------------------+
```
## Pipeline
A pipeline is a special bin subclass that provides the following
features to its children:
- Select and manage a global clock for all its children.
- Manage running\_time based on the selected clock. Running\_time is
the elapsed time the pipeline spent in the PLAYING state and is used
for synchronisation.
- Manage latency in the pipeline.
- Provide means for elements to communicate with the application via the
GstBus.
- Manage the global state of the elements such as Errors and
end-of-stream.
Normally the application creates one pipeline that will manage all the
elements in the application.
## Dataflow and buffers
GStreamer supports two possible types of dataflow, the push and pull
model. In the push model, an upstream element sends data to a downstream
element by calling a method on a sinkpad. In the pull model, a
downstream element requests data from an upstream element by calling a
method on a source pad.
The most common dataflow is the push model. The pull model can be used
in specific circumstances by demuxer elements. The pull model can also
be used by low latency audio applications.
The data passed between pads is encapsulated in Buffers. The buffer
contains pointers to the actual memory and also metadata describing the
memory. This metadata includes:
- timestamp of the data: this is the time instant at which the data
was captured or the time at which the data should be played back.
- offset of the data: a media specific offset, this could be samples
for audio or frames for video.
- the duration of the data in time.
- additional flags describing special properties of the data such as
discontinuities or delta units.
- additional arbitrary metadata
When an element wishes to send a buffer to another element, it does this
using one of the pads that is linked to a pad of the other element. In
the push model, a buffer is pushed to the peer pad with
`gst_pad_push()`. In the pull model, a buffer is pulled from the peer
with the `gst_pad_pull_range()` function.
Before an element pushes out a buffer, it should make sure that the peer
element can understand the buffer contents. It does this by querying the
peer element for the supported formats and by selecting a suitable
common format. The selected format is then first sent to the peer
element with a CAPS event before pushing the buffer (see
[negotiation](design/negotiation.md)).
When an element pad receives a CAPS event, it has to check if it
understands the media type. The element must refuse the following
buffers if it did not accept the preceding media type.
Both `gst_pad_push()` and `gst_pad_pull_range()` have a return value
indicating whether the operation succeeded. An error code means that no
more data should be sent to that pad. A source element that initiates
the data flow in a thread typically pauses the producing thread when
this happens.
A buffer can be created with `gst_buffer_new()` or by requesting a
usable buffer from a buffer pool using
`gst_buffer_pool_acquire_buffer()`. Using the second method, it is
possible for the peer element to implement a custom buffer allocation
algorithm.
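A short sketch of both allocation paths (the buffer size and the negotiated
`pool` are assumptions):

``` c
GstBuffer *buffer;

/* plain allocation from the default allocator */
buffer = gst_buffer_new_allocate (NULL, 4096, NULL);
gst_buffer_unref (buffer);

/* allocation from a bufferpool, typically negotiated with the
 * ALLOCATION query; the peer may have provided a custom pool */
if (gst_buffer_pool_acquire_buffer (pool, &buffer, NULL) == GST_FLOW_OK) {
  /* ... fill the buffer and push it ... */
  gst_buffer_unref (buffer);
}
```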
The process of selecting a media type is called caps negotiation.
## Caps
A media type (Caps) is described using a generic list of key/value
pairs. The key is a string and the value can be a single/list/range of
int/float/string.
Caps that have no ranges/lists or other variable parts are said to be
fixed and can be put on a buffer.
Caps with variables in them are used to describe possible media types
that can be handled by a pad.
## Dataflow and events
Parallel to the dataflow is a flow of events. Unlike the buffers, events
can pass both upstream and downstream. Some events only travel upstream,
others only downstream.
The events are used to denote special conditions in the dataflow such as
EOS or to inform plugins of special events such as flushing or seeking.
Some events must be serialized with the buffer flow, others don't.
Serialized events are inserted between the buffers. Non-serialized
events jump in front of any buffers currently being processed.
An example of a serialized event is a TAG event that is inserted between
buffers to mark metadata for those buffers.
An example of a non serialized event is the FLUSH event.
## Pipeline construction
The application starts by creating a Pipeline element using
`gst_pipeline_new ()`. Elements are added to and removed from the
pipeline with `gst_bin_add()` and `gst_bin_remove()`.
After adding the elements, the pads of an element can be retrieved with
`gst_element_get_pad()`. Pads can then be linked together with
`gst_pad_link()`.
Some elements create new pads when actual dataflow is happening in the
pipeline. With `g_signal_connect()` one can receive a notification when
an element has created a pad. These new pads can then be linked to other
unlinked pads.
Some elements cannot be linked together because they operate on
different incompatible data types. The possible datatypes a pad can
provide or consume can be retrieved with `gst_pad_get_caps()`.
Below is a simple mp3 playback pipeline that we constructed. We will use
this pipeline in further examples.
```
+-------------------------------------------+
| pipeline |
| +---------+ +----------+ +----------+ |
| | filesrc | | mp3dec | | alsasink | |
| | src-sink src-sink | |
| +---------+ +----------+ +----------+ |
+-------------------------------------------+
```
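A sketch of how this pipeline could be constructed with the API above. Note
that mp3dec is the document's placeholder decoder name; a real application
would use an installed decoder element:

``` c
#include <gst/gst.h>

int
main (int argc, char *argv[])
{
  GstElement *pipeline, *filesrc, *mp3dec, *alsasink;

  gst_init (&argc, &argv);

  pipeline = gst_pipeline_new ("pipeline");
  filesrc = gst_element_factory_make ("filesrc", NULL);
  mp3dec = gst_element_factory_make ("mp3dec", NULL);   /* placeholder name */
  alsasink = gst_element_factory_make ("alsasink", NULL);

  g_object_set (filesrc, "location", "song.mp3", NULL);

  gst_bin_add_many (GST_BIN (pipeline), filesrc, mp3dec, alsasink, NULL);
  gst_element_link_many (filesrc, mp3dec, alsasink, NULL);

  gst_element_set_state (pipeline, GST_STATE_PAUSED);
  /* ... wait for preroll, set to PLAYING, run a mainloop ... */

  gst_element_set_state (pipeline, GST_STATE_NULL);
  gst_object_unref (pipeline);
  return 0;
}
```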
## Pipeline clock
One of the important functions of the pipeline is to select a global
clock for all the elements in the pipeline.
The purpose of the clock is to provide a strictly increasing value at the
rate of one `GST_SECOND` per second. Clock values are expressed in
nanoseconds. Elements use the clock time to synchronize the playback of
data.
Before the pipeline is set to PLAYING, the pipeline asks each element if
they can provide a clock. The clock is selected in the following order:
- If the application selected a clock, use that one.
- If a source element provides a clock, use that clock.
- Select a clock from any other element that provides a clock, start
with the sinks.
- If no element provides a clock a default system clock is used for
the pipeline.
In a typical playback pipeline this algorithm will select the clock
provided by a sink element such as an audio sink.
In capture pipelines, this will typically select the clock of the data
producer, which in most cases can not control the rate at which it
produces data.
## Pipeline states
When all the pads are linked and signals have been connected, the
pipeline can be put in the PAUSED state to start dataflow.
When a bin (and hence a pipeline) performs a state change, it will
change the state of all its children. The pipeline will change the state
of its children from the sink elements to the source elements; this
makes sure that no upstream element produces data for an element that is
not yet ready to accept it.
In the mp3 playback pipeline, the state of the elements is changed in
the order alsasink, mp3dec, filesrc.
All intermediate states are traversed for each element resulting in the
following chain of state changes:
* alsasink to READY: the audio device is probed
* mp3dec to READY: nothing happens.
* filesrc to READY: the file is probed
* alsasink to PAUSED: the audio device is opened. alsasink is a sink and
returns ASYNC because it did not receive data yet.
* mp3dec to PAUSED: the decoding library is initialized
* filesrc to PAUSED: the file is opened and a thread is started to push data to mp3dec
At this point data flows from filesrc to mp3dec and alsasink. Since
mp3dec is PAUSED, it accepts the data from filesrc on the sinkpad and
starts decoding the compressed data to raw audio samples.
The mp3 decoder figures out the samplerate, the number of channels and
other audio properties of the raw audio samples and sends out a caps
event with the media type.
Alsasink then receives the caps event, inspects the caps and
reconfigures itself to process the media type.
mp3dec then puts the decoded samples into a Buffer and pushes this
buffer to the next element.
Alsasink receives the buffer with samples. Since it received the first
buffer of samples, it completes the state change to the PAUSED state. At
this point the pipeline is prerolled and all elements have samples.
Alsasink is now also capable of providing a clock to the pipeline.
Since alsasink is now in the PAUSED state it blocks while receiving the
first buffer. This effectively blocks both mp3dec and filesrc in their
gst\_pad\_push().
Since all elements now return SUCCESS from the
gst\_element\_get\_state() function, the pipeline can be put in the
PLAYING state.
Before going to PLAYING, the pipeline selects a clock and samples the
current time of the clock. This is the base\_time. It then distributes
this time to all elements. Elements can then synchronize against the
clock using the buffer running\_time +
base\_time (See also [synchronisation](design/synchronisation.md)).
The following chain of state changes then takes place:
* alsasink to PLAYING: the samples are played to the audio device
* mp3dec to PLAYING: nothing happens
* filesrc to PLAYING: nothing happens
## Pipeline status
The pipeline informs the application of any special events that occur in
the pipeline with the bus. The bus is an object that the pipeline
provides and that can be retrieved with `gst_pipeline_get_bus()`.
The bus can be polled or added to the glib mainloop.
The bus is distributed to all elements added to the pipeline. The
elements use the bus to post messages on. Various message types exist
such as ERRORS, WARNINGS, EOS, `STATE_CHANGED`, etc.
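A minimal sketch of retrieving the bus and waiting for an error or EOS
(assuming `pipeline` from the earlier example):

``` c
GstBus *bus = gst_pipeline_get_bus (GST_PIPELINE (pipeline));
GstMessage *msg;

/* block until an ERROR or EOS message is posted */
msg = gst_bus_timed_pop_filtered (bus, GST_CLOCK_TIME_NONE,
    GST_MESSAGE_ERROR | GST_MESSAGE_EOS);
if (msg != NULL) {
  /* inspect GST_MESSAGE_TYPE (msg) and its contents here */
  gst_message_unref (msg);
}
gst_object_unref (bus);
```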
The pipeline handles EOS messages received from elements in a special
way. It will only forward the message to the application when all sink
elements have posted an EOS message.
Other methods for obtaining the pipeline status include the Query
functionality that can be performed with `gst_element_query()` on the
pipeline. This type of query is useful for obtaining information about
the current position and total time of the pipeline. It can also be used
to query for the supported seeking formats and ranges.
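For example, a position/duration query on the pipeline might look like this
(a sketch; the `pipeline` variable is assumed):

``` c
gint64 position, duration;

if (gst_element_query_position (pipeline, GST_FORMAT_TIME, &position) &&
    gst_element_query_duration (pipeline, GST_FORMAT_TIME, &duration)) {
  g_print ("position %" GST_TIME_FORMAT " of %" GST_TIME_FORMAT "\n",
      GST_TIME_ARGS (position), GST_TIME_ARGS (duration));
}
```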
## Pipeline EOS
When the source element encounters the end of the stream, it sends an EOS
event to the peer element. This event will then travel downstream to all
of the connected elements to inform them of the EOS. The element is not
supposed to accept any more data after receiving an EOS event on a
sinkpad.
The element providing the streaming thread stops sending data after
sending the EOS event.
The EOS event will eventually arrive in the sink element. The sink will
then post an EOS message on the bus to inform the pipeline that a
particular stream has finished. When all sinks have reported EOS, the
pipeline forwards the EOS message to the application. The EOS message is
only forwarded to the application in the PLAYING state.
When in EOS, the pipeline remains in the PLAYING state; it is the
application's responsibility to set the pipeline to PAUSED or READY. The
application can also issue a seek, for example.
## Pipeline READY
When a running pipeline is set from the PLAYING to READY state, the
following actions occur in the pipeline:
* alsasink to PAUSED: alsasink blocks and completes the state change on the
next sample. If the element was EOS, it does not wait for a sample to complete
the state change.
* mp3dec to PAUSED: nothing
* filesrc to PAUSED: nothing
Going to the intermediate PAUSED state will block all elements in the
`_push()` functions. This happens because the sink element blocks on the
first buffer it receives.
Some elements might be performing blocking operations in the PLAYING
state that must be unblocked when they go into the PAUSED state. This
makes sure that the state change happens very fast.
In the next PAUSED to READY state change the pipeline has to shut down
and all streaming threads must stop sending data. This happens in the
following sequence:
* alsasink to READY: alsasink unblocks from the `_chain()` function and returns
a FLUSHING return value to the peer element. The sinkpad is deactivated and
becomes unusable for sending more data.
* mp3dec to READY: the pads are deactivated and the state change completes
when mp3dec leaves its `_chain()` function.
* filesrc to READY: the pads are deactivated and the thread is paused.
The upstream elements finish their chain() function because the
downstream element returned an error code (FLUSHING) from the `_push()`
functions. These error codes are eventually returned to the element that
started the streaming thread (filesrc), which pauses the thread and
completes the state change.
This sequence of events ensures that all elements are unblocked and all
streaming threads stopped.
## Pipeline seeking
Seeking in the pipeline requires a very specific order of operations to
make sure that the elements remain synchronized and that the seek is
performed with a minimal amount of latency.
An application issues a seek event on the pipeline using
`gst_element_send_event()` on the pipeline element. The event can be a
seek event in any of the formats supported by the elements.
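As a sketch, a flushing seek to 30 seconds in time format could be issued like
this (the flags and position are illustrative):

``` c
if (!gst_element_seek_simple (pipeline, GST_FORMAT_TIME,
        GST_SEEK_FLAG_FLUSH | GST_SEEK_FLAG_KEY_UNIT, 30 * GST_SECOND))
  g_warning ("seek failed");
```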
The pipeline first pauses itself to speed up the seek operations.
The pipeline then issues the seek event to all sink elements. The sink
then forwards the seek event upstream until some element can perform the
seek operation, which is typically the source or demuxer element. All
intermediate elements can transform the requested seek offset to another
format, this way a decoder element can transform a seek to a frame
number to a timestamp, for example.
When the seek event reaches an element that will perform the seek
operation, that element performs the following steps.
1) send a FLUSH_START event to all downstream and upstream peer elements.
2) make sure the streaming thread is not running. The streaming thread will
always stop because of step 1).
3) perform the seek operation
4) send a FLUSH_STOP event to all downstream and upstream peer elements.
5) send SEGMENT event to inform all elements of the new position and to complete
the seek.
In step 1) all downstream elements have to return from any blocking
operations and have to refuse any further buffers or events other than a
FLUSH_STOP.
The first step ensures that the streaming thread eventually unblocks and
that step 2) can be performed. At this point, dataflow is completely
stopped in the pipeline.
In step 3) the element performs the seek to the requested position.
In step 4) all peer elements are allowed to accept data again and
streaming can continue from the new position. A FLUSH_STOP event is sent
to all the peer elements so that they accept new data again and restart
their streaming threads.
Step 5) informs all elements of the new position in the stream. After
that, the event function returns to the application and the streaming
threads start to produce new data.
Since the pipeline is still PAUSED, this will preroll the next media
sample in the sinks. The application can wait for this preroll to
complete by performing a `_get_state()` on the pipeline.
The last step in the seek operation is then to adjust the stream
running_time of the pipeline to 0 and to set the pipeline back to
PLAYING.
The sequence of events in our mp3 playback example.
```
| a) seek on pipeline
| b) PAUSE pipeline
+----------------------------------V--------+
| pipeline | c) seek on sink
| +---------+ +----------+ +---V------+ |
| | filesrc | | mp3dec | | alsasink | |
| | src-sink src-sink | |
| +---------+ +----------+ +----|-----+ |
+-----------------------------------|-------+
<------------------------+
d) seek travels upstream
--------------------------> 1) FLUSH_START event
| 2) stop streaming
| 3) perform seek
--------------------------> 4) FLUSH_STOP event
--------------------------> 5) SEGMENT event
| e) update running_time to 0
| f) PLAY pipeline
```
# Preroll
A sink element can only complete the state change to `PAUSED` after a
buffer has been queued on the input pad or pads. This process is called
prerolling and is needed to fill the pipeline with buffers so that the
transition to `PLAYING` goes as fast as possible with no visual delay for
the user.
Preroll is also crucial in maintaining correct audio and video
synchronisation and ensuring that no buffers are dropped in the sinks.
After receiving a buffer (or EOS) on a pad, the chain/event function
should wait to render the buffer or, in the EOS case, wait to post the
EOS message. While waiting, the sink will wait for the preroll cond to
be signalled.
Several things can happen that require the preroll cond to be signalled.
These include state changes or flush events. The prerolling is
implemented in sinks (see [element-sink](design/element-sink.md)).
## Committing the state
When going to `PAUSED` and `PLAYING` a buffer should be queued in the pad.
We also make this requirement for going to `PLAYING` since a flush event
in the `PAUSED` state could unqueue the buffer again.
The state is committed in the following conditions:
- a buffer is received on a sinkpad.
- a GAP event is received on a sinkpad.
- an EOS event is received on a sinkpad.
We require the state change to be committed on EOS as well since an EOS
means by definition that no buffer is going to arrive anymore.
After the state is committed, a blocking wait should be performed for the
next event. Some sinks might render the preroll buffer before starting
this blocking wait.
## Unlocking the preroll
The following conditions unlock the preroll:
- a state change
- a flush event
When the preroll is unlocked by a flush event, a return value of
`GST_FLOW_FLUSHING` is to be returned to the peer pad.
When preroll is unlocked by a state change to `PLAYING`, playback and
rendering of the buffers shall start.
When preroll is unlocked by a state change to READY, the buffer is to be
discarded and a `GST_FLOW_FLUSHING` shall be returned to the peer
element.
markdown/design/probes.md
# Probes
Probes are callbacks that can be installed by the application and will notify
the application about the states of the dataflow.
# Requirements
Applications should be able to monitor and control the dataflow on pads.
We identify the following types:
- be notified when the pad is/becomes idle and make sure the pad stays
idle. This is essential to be able to implement dynamic relinking of
elements without breaking the dataflow.
- be notified when data, events or queries are pushed or sent on a
pad. It should also be possible to inspect and modify the data.
- be able to drop, pass and block on data based on the result of the
callback.
- be able to drop or pass data on blocking pads based on methods
performed by the application thread.
# Overview
The function gst_pad_add_probe() is used to add a probe to a pad. It accepts a
probe type mask and a callback.
``` c
gulong gst_pad_add_probe (GstPad *pad,
GstPadProbeType mask,
GstPadProbeCallback callback,
gpointer user_data,
GDestroyNotify destroy_data);
```
The function returns a gulong that uniquely identifies the probe and that can
be used to remove the probe with gst_pad_remove_probe():
``` c
void gst_pad_remove_probe (GstPad *pad, gulong id);
```
The mask parameter is a bitwise OR of the following flags:
``` c
typedef enum
{
GST_PAD_PROBE_TYPE_INVALID = 0,
/* flags to control blocking */
GST_PAD_PROBE_TYPE_IDLE = (1 << 0),
GST_PAD_PROBE_TYPE_BLOCK = (1 << 1),
/* flags to select datatypes */
GST_PAD_PROBE_TYPE_BUFFER = (1 << 4),
GST_PAD_PROBE_TYPE_BUFFER_LIST = (1 << 5),
GST_PAD_PROBE_TYPE_EVENT_DOWNSTREAM = (1 << 6),
GST_PAD_PROBE_TYPE_EVENT_UPSTREAM = (1 << 7),
GST_PAD_PROBE_TYPE_EVENT_FLUSH = (1 << 8),
GST_PAD_PROBE_TYPE_QUERY_DOWNSTREAM = (1 << 9),
GST_PAD_PROBE_TYPE_QUERY_UPSTREAM = (1 << 10),
/* flags to select scheduling mode */
GST_PAD_PROBE_TYPE_PUSH = (1 << 12),
GST_PAD_PROBE_TYPE_PULL = (1 << 13),
} GstPadProbeType;
```
When adding a probe with the IDLE or BLOCK flag, the probe will become a
blocking probe (see below). Otherwise the probe will be a DATA probe.
The datatype and scheduling selector flags are used to select what kind of
datatypes and scheduling modes should be allowed in the callback.
The blocking flags must match the triggered probe exactly.
The probe callback is defined as:
``` c
GstPadProbeReturn (*GstPadProbeCallback) (GstPad *pad, GstPadProbeInfo *info,
gpointer user_data);
```
A probe info structure is passed as an argument and its type is guaranteed
to match the mask that was used to register the callback. The data item in the
info contains type specific data, which is usually the data item that is blocked
or NULL when no data item is present.
The probe can return any of the following return values:
``` c
typedef enum
{
GST_PAD_PROBE_DROP,
GST_PAD_PROBE_OK,
GST_PAD_PROBE_REMOVE,
GST_PAD_PROBE_PASS,
} GstPadProbeReturn;
```
`GST_PAD_PROBE_OK` is the normal return value. `GST_PAD_PROBE_DROP` will drop
the item that is currently being probed. `GST_PAD_PROBE_REMOVE` removes the
currently executing probe from the list of probes.
`GST_PAD_PROBE_PASS` is relevant for blocking probes and will temporarily
unblock the pad and let the item through; it will then block again on the
next item.
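Putting these pieces together, a sketch of a DATA probe that drops every
second buffer (the pad variable and the counter handling are illustrative):

``` c
static GstPadProbeReturn
drop_odd_cb (GstPad * pad, GstPadProbeInfo * info, gpointer user_data)
{
  guint *count = user_data;

  /* info->data is a GstBuffer because only BUFFER probes were requested */
  if ((*count)++ % 2 == 1)
    return GST_PAD_PROBE_DROP;

  return GST_PAD_PROBE_OK;
}

static void
install_probe (GstPad * pad)
{
  guint *counter = g_new0 (guint, 1);

  gst_pad_add_probe (pad,
      GST_PAD_PROBE_TYPE_BUFFER | GST_PAD_PROBE_TYPE_PUSH,
      drop_odd_cb, counter, g_free);
}
```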
# Blocking probes
Blocking probes are probes with BLOCK or IDLE flags set. They will always
block the dataflow and trigger the callback according to the following rules:
When the IDLE flag is set, the probe callback is called as soon as no data is
flowing over the pad. If at the time of probe registration, the pad is idle,
the callback will be called immediately from the current thread. Otherwise,
the callback will be called as soon as the pad becomes idle in the streaming
thread.
The IDLE probe is useful to perform dynamic linking: it allows waiting for
a safe moment when an unlink/link operation can be done. Since the probe is a
blocking probe, it will also make sure that the pad stays idle until the probe
is removed.
When the BLOCK flag is set, the probe callback will be called when new data
arrives on the pad and right before the pad goes into the blocking state. This
callback is thus only called when there is new data on the pad.
The blocking probe is removed with gst_pad_remove_probe() or when the probe
callback returns GST_PAD_PROBE_REMOVE. In both cases, and if this was the last
blocking probe on the pad, the pad is unblocked and dataflow can continue.
# Non-Blocking probes
Non-blocking probes or DATA probes are probes triggered when data is flowing
over the pad. They are called after the blocking probes are run and always
with data.
# Push dataflow
Push probes have the GST\_PAD\_PROBE\_TYPE\_PUSH flag set in the
callbacks.
In push based scheduling, the blocking probe is called first with the
data item. Then the data probes are called before the peer pad chain or
event function is called.
The data probes are called before the peer pad is checked. This allows
for linking the pad in either the BLOCK or DATA probes on the pad.
Before the peerpad chain or event function is called, the peer pad block
and data probes are called.
Finally, the IDLE probe is called on the pad after the data was sent to
the peer pad.
The push dataflow probe behavior is the same for buffers and
bidirectional events.
```
pad peerpad
| |
gst_pad_push() / | |
gst_pad_push_event() | |
-------------------->O |
O |
flushing? O |
FLUSHING O |
< - - - - - - O |
O-> do BLOCK probes |
O |
O-> do DATA probes |
no peer? O |
NOT_LINKED O |
< - - - - - - O |
O gst_pad_chain() / |
O gst_pad_send_event() |
O------------------------------>O
O flushing? O
O FLUSHING O
O< - - - - - - - - - - - - - - -O
O O-> do BLOCK probes
O O
O O-> do DATA probes
O O
O O---> chainfunc /
O O eventfunc
O< - - - - - - - - - - - - - - -O
O |
O-> do IDLE probes |
O |
< - - - - - - O |
| |
```
# Pull dataflow
Pull probes have the `GST_PAD_PROBE_TYPE_PULL` flag set in the
callbacks.
The `gst_pad_pull_range()` call will first trigger the BLOCK probes
without a DATA item. This allows the pad to be linked before the peer
pad is resolved. It also allows the callback to set a data item in the
probe info.
After the blocking probe, the getrange function is called on the peer
pad; if there is a data item, the DATA probes are called.
When control returns to the sinkpad, the IDLE callbacks are called. The
IDLE callback is called without a data item so that it will also be
called when there was an error.
If there is a valid DATA item, the DATA probes are called for the item.
```
srcpad sinkpad
| |
| | gst_pad_pull_range()
| O<---------------------
| O
| O flushing?
| O FLUSHING
| O - - - - - - - - - - >
| do BLOCK probes <-O
| O no peer?
| O NOT_LINKED
| O - - - - - - - - - - >
| gst_pad_get_range() O
O<------------------------------O
O O
O flushing? O
O FLUSHING O
O- - - - - - - - - - - - - - - >O
do BLOCK probes <-O O
O O
getrangefunc <---O O
O flow error? O
O- - - - - - - - - - - - - - - >O
O O
do DATA probes <-O O
O- - - - - - - - - - - - - - - >O
| O
| do IDLE probes <-O
| O flow error?
| O - - - - - - - - - - >
| O
| do DATA probes <-O
| O - - - - - - - - - - >
| |
```
# Queries
Query probes have the GST_PAD_PROBE_TYPE_QUERY_* flag set in the
callbacks.
```
pad peerpad
| |
gst_pad_peer_query() | |
-------------------->O |
O |
O-> do BLOCK probes |
O |
O-> do QUERY | PUSH probes |
no peer? O |
FALSE O |
< - - - - - - O |
O gst_pad_query() |
O------------------------------>O
O O-> do BLOCK probes
O O
O O-> do QUERY | PUSH probes
O O
O O---> queryfunc
O error O
<- - - - - - - - - - - - - - - - - - - - - - -O
O O
O O-> do QUERY | PULL probes
O< - - - - - - - - - - - - - - -O
O |
O-> do QUERY | PULL probes |
O |
< - - - - - - O |
| |
```
For queries, the PUSH ProbeType is set when the query is traveling to
the object that will answer the query and the PULL type is set when the
query contains the answer.
# Use-cases
## Prerolling a partial pipeline
```
.---------. .---------. .----------.
| filesrc | | demuxer | .-----. | decoder1 |
| src -> sink src1 ->|queue|-> sink src
'---------' | | '-----' '----------' X
| | .----------.
| | .-----. | decoder2 |
| src2 ->|queue|-> sink src
'---------' '-----' '----------' X
```
The purpose is to create the pipeline dynamically up to the decoders but
not yet connect them to a sink and without losing any data.
To do this, the source pads of the decoders are blocked so that no events
or buffers can escape and we don't interrupt the stream.
When all of the dynamic pads are created (no-more-pads emitted by the
branching point, i.e. the demuxer, or the queues filled) and the pads are
blocked (blocked callback received), the pipeline is completely
prerolled.
It should then be possible to perform the following actions on the
prerolled pipeline:
- query duration/position
- perform a flushing seek to preroll a new position
- connect other elements and unblock the blocked pads.
## Dynamically switching an element in a PLAYING pipeline
```
.----------. .----------. .----------.
| element1 | | element2 | | element3 |
... src -> sink src -> sink ...
'----------' '----------' '----------'
.----------.
| element4 |
sink src
'----------'
```
The purpose is to replace element2 with element4 in the PLAYING
pipeline.
1) block element1 src pad.
2) inside the block callback nothing is flowing between
element1 and element2 and nothing will flow until unblocked.
3) unlink element1 and element2
4) optional step: make sure data is flushed out of element2:
4a) pad event probe on element2 src
4b) send EOS to element2, this makes sure that element2 flushes out the last bits of data it holds.
4c) wait for EOS to appear in the probe, drop the EOS.
4d) remove the EOS pad event probe.
5) unlink element2 and element3
5a) optionally element2 can now be set to NULL and/or removed from the pipeline.
6) link element4 and element3
7) link element1 and element4
8) make sure element4 is in the same state as the rest of the elements. The
element should at least be PAUSED.
9) unblock element1 src
The same flow can be used to replace an element in a PAUSED pipeline. Of
course in a PAUSED pipeline there might not be dataflow so the block
might not immediately happen.
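A sketch of the simple (non-flushing) variant of this flow, built on an IDLE
probe; `pipeline` and `element1`..`element4` are assumed to be accessible, for
example as globals:

``` c
static GstPadProbeReturn
switch_cb (GstPad * pad, GstPadProbeInfo * info, gpointer user_data)
{
  /* nothing flows over element1's src pad while we are here */
  gst_element_unlink (element1, element2);          /* steps 3 and 5 */
  gst_element_unlink (element2, element3);
  gst_element_set_state (element2, GST_STATE_NULL); /* step 5a */
  gst_bin_remove (GST_BIN (pipeline), element2);

  gst_bin_add (GST_BIN (pipeline), element4);
  gst_element_link (element4, element3);            /* step 6 */
  gst_element_link (element1, element4);            /* step 7 */
  gst_element_sync_state_with_parent (element4);    /* step 8 */

  /* step 9: removing the probe unblocks element1's src pad */
  return GST_PAD_PROBE_REMOVE;
}

static void
start_switch (void)
{
  /* steps 1 and 2: the callback runs when the pad is idle */
  GstPad *srcpad = gst_element_get_static_pad (element1, "src");

  gst_pad_add_probe (srcpad, GST_PAD_PROBE_TYPE_IDLE, switch_cb, NULL, NULL);
  gst_object_unref (srcpad);
}
```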
markdown/design/progress.md
# Progress Reporting
This document describes the design and use cases for the progress
reporting messages.
PROGRESS messages are posted on the bus to inform the application about
the progress of asynchronous operations in the pipeline. This should not
be confused with asynchronous state changes.
We accommodate the following requirements:
- Application is informed when an async operation starts and
completes.
- It should be possible for the application to generically detect
common operations and incorporate their progress into the GUI.
- Applications can cancel pending operations by doing regular state
changes.
- Applications should be able to wait for completion of async
operations.
We allow for the following scenarios:
- Elements want to inform the application about asynchronous DNS
lookups and pending network requests. This includes starting and
completing the lookup.
- Elements opening devices and resources asynchronously.
- Applications having more freedom to implement timeouts and
cancellation of operations that currently block the state changes or
happen invisibly behind the scenes.
## Rationale
The main reason for adding these extra progress notifications is
twofold:
### To give the application more information about what is going on
When there are well defined progress information codes, applications
can let the user know about the status of the progress. We anticipate
that at least DNS resolution, server connections and requests will be
well defined.
### To make the state changes non-blocking and cancellable.
Currently state changes such as going to the READY or PAUSED state often do
blocking calls such as resolving DNS or connecting to a remote server. These
operations often block the main thread and are often not cancellable, causing
application lockups.
We would like to make the state change function, instead, start a separate
thread that performs the blocking operations in a cancellable way. When going
back to the NULL state, all pending operations would be canceled immediately.
For downward state changes, we want to let the application implement its own
timeout mechanism. For example: when stopping an RTSP stream, the client
needs to send a TEARDOWN request to the server. This can however take an
unlimited amount of time in case of network problems. We want to give the
application an opportunity to wait (and timeout) for the completion of the
async operation before setting the element to the final NULL state.
Progress updates are very similar to buffering messages in the same way
that the application can decide to wait for the completion of the
buffering process before performing the next state change. It might make
sense to implement buffering with the progress messages in the future.
## Async state changes
GStreamer currently has a `GST_STATE_CHANGE_ASYNC` return value to note
to the application that a state change is happening asynchronously.
The main purpose of this return value is to make the pipeline wait for
preroll and delay future (upwards) state changes until the sinks are
prerolled.
In the case of async operations on source, this will automatically force
sinks to stay async because they will not preroll before the source can
produce data.
The fact that other asynchronous operations happen behind the scenes is
irrelevant for the prerolling process so it is not implemented with the
ASYNC state change return value in order to not complicate the state
changes and mix concepts.
## Use cases
### RTSP client (but also HTTP, MMS, …)
When the client goes from the READY to the PAUSED state, it opens a socket,
performs a DNS lookup, retrieves the SDP and negotiates the streams. All these
operations currently block the state change function for an indefinite amount
of time and while they are blocking cannot be canceled.
Instead, a thread would be started to perform these operations asynchronously
and the state change would complete with the usual NO_PREROLL return value.
Before starting the thread a PROGRESS message would be posted to mark the
start of the async operation.
As the DNS lookup completes and the connection is established, PROGRESS
messages are posted on the bus to inform the application of the progress. When
something fails, an error is posted and a PROGRESS CANCELED message is posted.
The application can then stop the pipeline.
If there are no errors and the setup of the streams completed successfully, a
PROGRESS COMPLETED is posted on the bus. The thread then goes to sleep and
the asynchronous operation is completed.
The RTSP protocol requires sending a TEARDOWN request to the server
before closing the connection and destroying the socket. A state change to the
READY state will issue the TEARDOWN request in the background and notify the
application of this pending request with a PROGRESS message.
The application might want to go to the NULL state only after it got
confirmation that the TEARDOWN request completed, or it might choose to go to
NULL after a timeout. It might also be possible that the application just
wants to close the socket as fast as possible without waiting for completion
of the TEARDOWN request.
### Network performance measuring
DNS lookup and connection times can be measured by calculating the elapsed
time between the various PROGRESS messages.
## Messages
A new `PROGRESS` message will be created.
The following fields will be contained in the message:
- **`type`**, GST_TYPE_PROGRESS_TYPE: a set of types to define the type of progress
* GST_PROGRESS_TYPE_START: A new task is started in the background
* GST_PROGRESS_TYPE_CONTINUE: The previous tasks completed and a new one
continues. This is done so that the application can follow a set of
continuous tasks and react to COMPLETE only when the element completely
finished.
* GST_PROGRESS_TYPE_CANCELED: A task is canceled by the user.
* GST_PROGRESS_TYPE_ERROR: A task stopped because of an error. In case of
an error, an error message will have been posted before.
* GST_PROGRESS_TYPE_COMPLETE: A task completed successfully.
- **`code`**, G_TYPE_STRING: A generic extensible string that can be used to
programmatically determine the action that is in progress. Some standard
predefined codes will be defined.
- **`text`**, G_TYPE_STRING: A user visible string detailing the action.
- **`percent`**, G_TYPE_INT (between 0 and 100): Progress of the action as
a percentage. The following values are allowed:
- GST_PROGRESS_TYPE_START always has a 0% value.
- GST_PROGRESS_TYPE_CONTINUE has a value between 0 and 100.
- GST_PROGRESS_TYPE_CANCELED, GST_PROGRESS_TYPE_ERROR and
GST_PROGRESS_TYPE_COMPLETE always have a 100% value.
- **`timeout`**, G_TYPE_INT in milliseconds: The timeout of the async
operation. -1 if unknown/unlimited. This field can be interesting to the
application when it wants to display some sort of progress indication.
- ….
Depending on the code, more fields can be put here.
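As a sketch, an element could post such a message before starting its connect
thread (the element pointer and the code/text values are illustrative):

``` c
GstMessage *msg;

msg = gst_message_new_progress (GST_OBJECT (element),
    GST_PROGRESS_TYPE_START, "connect", "Connecting to server");
gst_element_post_message (element, msg);
```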
## Implementation
Elements should not do blocking operations from the state change
function. Instead, elements should post an appropriate progress message
with the right code and of type `GST_PROGRESS_TYPE_START` and then
start a thread to perform the blocking calls in a cancellable manner.
It is highly recommended to only start async operations from the READY
to PAUSED state and onwards and not from the NULL to READY state. The
reason for this is that streaming threads are usually started in the
READY to PAUSED state and that the current NULL to READY state change is
used to perform a blocking check for the presence of devices.
The progress message needs to be posted from the state change function
so that the application can immediately take appropriate action after
setting the state.
The threads will usually perform many blocking calls with different
codes in a row; a client might first do a DNS query and then continue
with establishing a connection to the server. For this purpose the
`GST_PROGRESS_TYPE_CONTINUE` must be used.
Usually, the thread used to perform the blocking operations can be used
to implement the streaming threads when needed.
Upon downward state changes, operations that are busy in the thread are
canceled and `GST_PROGRESS_TYPE_CANCELED` is posted.
The application knows about pending tasks because it received
`GST_PROGRESS_TYPE_START` messages that didn't complete with a
`GST_PROGRESS_TYPE_COMPLETE` message, got canceled with a
`GST_PROGRESS_TYPE_CANCELED` message or errored with
`GST_PROGRESS_TYPE_ERROR`. Applications should be able to choose if
they wait for the pending operations or cancel them.
If an async operation fails, an error message is posted first before the
`GST_PROGRESS_TYPE_ERROR` progress message.
## Categories
We want to propose some standard codes here:
* "open" : A resource is being opened
* "close" : A resource is being closed
* "name-lookup" : A DNS lookup.
* "connect" : A socket connection is established
* "disconnect" : a socket connection is closed
* "request" : A request is sent to a server and we are waiting for a reply.
This message is posted right before the request is sent and completed when the
reply has arrived completely. * "mount" : A volume is being mounted
* "unmount" : A volume is being unmounted
More codes can be posted by elements and can be made official later.
# push-pull
Normally a source element will push data to the downstream element using
the `gst_pad_push()` method. The downstream peer pad will receive the
buffer in the Chain function. In the push mode, the source element is
the driving force in the pipeline as it initiates data transport.
It is also possible for an element to pull data from an upstream
element. The downstream element does this by calling
`gst_pad_pull_range()` on one of its sinkpads. In this mode, the
downstream element is the driving force in the pipeline as it initiates
data transfer.
It is important that the elements are in the correct state to handle a
push() or a `pull_range()` from the peer element. For push() based
elements this means that all downstream elements should be in the
correct state and for `pull_range()` based elements this means the
upstream elements should be in the correct state.
Most sinkpads implement a chain function. This is the most common case.
Sinkpads implementing a loop function are the exception; likewise,
srcpads implementing a getrange function are the exception.
## state changes
The GstBin sets the state of all the sink elements. These are the
elements without source pads.
Setting the state on an element will first activate all the srcpads and
then the sinkpads. For each of the sinkpads,
`gst_pad_check_pull_range()` is performed. If the sinkpad supports a
loopfunction and the peer pad returns TRUE from the GstPadCheckPullRange
function, then the peer pad is activated first as it must be in the
right state to handle a `_pull_range()`. Note that the state change of
the element is not yet performed, just the activate function is called
on the source pad. This means that elements that implement a getrange
function must be prepared to get their activate function called before
their state change function.
Elements that have multiple sinkpads that require all of them to operate
in the same mode (push/pull) can use the `_check_pull_range()` on all
their pads and can then remove the loop functions if one of the pads
does not support pull based mode.
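In the GStreamer 1.x API this pull-range check is expressed with the
SCHEDULING query mentioned in the negotiation section. A hedged sketch of how
a sink pad might probe its peer (assuming `sinkpad` is linked):

``` c
GstQuery *query = gst_query_new_scheduling ();
gboolean pull_mode = FALSE;

if (gst_pad_peer_query (sinkpad, query))
  pull_mode = gst_query_has_scheduling_mode_with_flags (query,
      GST_PAD_MODE_PULL, GST_SCHEDULING_FLAG_SEEKABLE);
gst_query_unref (query);

if (pull_mode)
  gst_pad_activate_mode (sinkpad, GST_PAD_MODE_PULL, TRUE);
```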
markdown/design/qos.md
# Quality-of-Service
Quality of service is about measuring and adjusting the real-time
performance of a pipeline.
The real-time performance is always measured relative to the pipeline
clock and typically happens in the sinks when they synchronize buffers
against the clock.
The measurements result in QOS events that aim to adjust the datarate in
one or more upstream elements. Two types of adjustments can be made:
- short time "emergency" corrections based on latest observation in
the sinks.
- long term rate corrections based on trends observed in the sinks.
It is also possible for the application to artificially introduce delay
between synchronized buffers, this is called throttling. It can be used
to reduce the framerate, for example.
## Sources of quality problems
- High CPU load
- Network problems
- Other resource problems such as disk load, memory bottlenecks etc.
- application level throttling
## QoS event
The QoS event is generated by an element that synchronizes against the
clock. It travels upstream and contains the following fields:
* **`type`**: GST\_TYPE\_QOS\_TYPE: The type of the QoS event. We have the
following types, and the default type is GST\_QOS\_TYPE\_UNDERFLOW:
* GST_QOS_TYPE_OVERFLOW: an element is receiving buffers too fast and can't
keep up processing them. Upstream should reduce the rate.
* GST_QOS_TYPE_UNDERFLOW: an element is receiving buffers too slowly
and has to drop them because they are too late. Upstream should
increase the processing rate.
* GST_QOS_TYPE_THROTTLE: the application is asking to add extra delay
between buffers, upstream is allowed to drop buffers
* **`timestamp`**: G\_TYPE\_UINT64: The timestamp on the buffer that
generated the QoS event. These timestamps are expressed in total
running\_time in the sink so that the value is ever increasing.
* **`jitter`**: G\_TYPE\_INT64: The difference of that timestamp against the
current clock time. Negative values mean the timestamp was on time.
Positive values indicate the timestamp was late by that amount. When
buffers are received in time and throttling is not enabled, the QoS
type field is set to OVERFLOW. When throttling, the jitter contains
the throttling delay added by the application and the type is set to
THROTTLE.
* **`proportion`**: G\_TYPE\_DOUBLE: Long term prediction of the ideal rate
relative to normal rate to get optimal quality.
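A sketch of a sink sending such an event upstream after observing a late
buffer (the numbers are illustrative; real values come from the measurements
described below, and `buffer_running_time` and `sinkpad` are assumed
variables):

``` c
gdouble proportion = 1.5;                      /* upstream should speed up 1.5x */
GstClockTimeDiff jitter = 25 * GST_MSECOND;    /* buffer was 25 ms late */
GstClockTime timestamp = buffer_running_time;  /* assumed variable */
GstEvent *event;

event = gst_event_new_qos (GST_QOS_TYPE_UNDERFLOW,
    proportion, jitter, timestamp);
gst_pad_push_event (sinkpad, event);
```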
The rest of this document deals with how these values can be calculated
in a sink and how the values can be used by other elements to adjust
their operations.
## QoS message
A QOS message is posted on the bus whenever an element decides to:
- drop a buffer because of QoS reasons
- change its processing strategy because of QoS reasons (quality)
It should be expected that creating and posting the QoS message is
reasonably fast and does not significantly contribute to the QoS
problems. Options to disable this feature could also be presented on
elements.
This message can be posted by a sink/src that performs synchronisation
against the clock (live) or it could be posted by an upstream element
that performs QoS because of QOS events received from a downstream
element (\!live).
The GST\_MESSAGE\_QOS message contains at least the following info:
* **`live`**: G\_TYPE\_BOOLEAN: If the QoS message was dropped by a live
element such as a sink or a live source. If the live property is
FALSE, the QoS message was generated as a response to a QoS event in
a non-live element.
* **`running-time`**: G\_TYPE\_UINT64: The running\_time of the buffer that
generated the QoS message.
* **`stream-time`**: G\_TYPE\_UINT64: The stream\_time of the buffer that
generated the QoS message.
* **`timestamp`**: G\_TYPE\_UINT64: The timestamp of the buffer that
generated the QoS message.
* **`duration`**: G\_TYPE\_UINT64: The duration of the buffer that generated
the QoS message.
* **`jitter`**: G\_TYPE\_INT64: The difference of the running-time against
the deadline. Negative values mean the timestamp was on time.
Positive values indicate the timestamp was late (and dropped) by
that amount. The deadline can be a realtime running\_time or an
estimated running\_time.
* **`proportion`**: G\_TYPE\_DOUBLE: Long term prediction of the ideal rate
relative to normal rate to get optimal quality.
* **`quality`**: G\_TYPE\_INT: An element dependent integer value that
specifies the current quality level of the element. The default
maximum quality is 1000000.
* **`format`**: GST\_TYPE\_FORMAT: Units of the *processed* and *dropped*
fields. Video sinks and video filters will use GST\_FORMAT\_BUFFERS
(frames). Audio sinks and audio filters will likely use
GST\_FORMAT\_DEFAULT (samples).
* **`processed`**: G\_TYPE\_UINT64: Total number of units correctly
processed since the last state change to READY or a flushing
operation.
* **`dropped`**: G\_TYPE\_UINT64: Total number of units dropped since the
last state change to READY or a flushing operation.
The *running-time* and *processed* fields can be used to estimate the
average processing rate (framerate for video).
Elements might add additional fields in the message which are documented
in the relevant elements or baseclasses.
## Collecting statistics
A buffer with timestamp B1 arrives in the sink at time T1. The buffer
timestamp is then synchronized against the clock which yields a jitter
J1 return value from the clock. The jitter J1 is simply calculated as
J1 = CT - B1
Where CT is the clock time when the entry arrives in the sink. This
value is calculated inside the clock when we perform
`gst\_clock\_id\_wait()`.
If the jitter is negative, the entry arrived in time and can be rendered
after waiting for the clock to reach time B1 (which is also CT - J1).
If the jitter is positive however, the entry arrived too late in the
sink and should therefore be dropped. J1 is the amount of time the entry
was late.
Any buffer that arrives in the sink should generate a QoS event
upstream.
Using the jitter we can calculate the time when the buffer arrived in
the sink:
```
T1 = B1 + J1        (1)
```
The time the buffer leaves the sink after synchronisation is measured
as:

```
T2 = B1 + (J1 < 0 ? 0 : J1)     (2)
```

For buffers that arrive in time (J1 < 0) the buffer leaves after
synchronisation, which is exactly B1. Late buffers (J1 >= 0) leave the
sink when they arrive, without any synchronisation, which is T2 = T1 =
B1 + J1.
Using a previous T0 and a new T1, we can calculate the time it took for
upstream to generate a buffer with timestamp B1.
```
PT1 = T1 - T0       (3)
```
We call PT1 the processing time needed to generate buffer with timestamp
B1.
Moreover, given the duration of the buffer D1, the current data rate
(DR1) of the upstream element is given as:
```
      PT1   T1 - T0
DR1 = --- = -------      (4)
      D1      D1
```
For values 0.0 < DR1 <= 1.0 the upstream element is producing faster
than real-time. If DR1 is exactly 1.0, the element is running at a
perfect speed.
Values DR1 > 1.0 mean that the upstream element cannot produce buffers
of duration D1 in real-time. It is exactly DR1 that tells the amount of
speedup we require from upstream to regain real-time performance.
An element that is not receiving enough data is said to be underflowed.
## Element measurements
In addition to the measurements of the datarate of the upstream element,
a typical element must also measure its own performance. Global pipeline
performance problems can indeed also be caused by the element itself
when it receives more data than it can process in time. The element is
then said to be overflowed.
## Short term correction
The timestamp and jitter serve as short term correction information for
upstream elements. Indeed, given arrival time T1 as given in (1), we can
be certain that buffers with a timestamp B2 < T1 will be too late in
the sink.
In case of a positive jitter we can therefore send a QoS event with a
timestamp B1, jitter J1 and proportion given by (4).
This allows an upstream element to not generate any data with timestamps
B2 < T1, where the element can derive T1 as B1 + J1.
This will effectively result in frame drops.
The element can even do a better estimation of the next valid timestamp
it should output.
Indeed, suppose the element generated a buffer with timestamp B0 that
arrived in time in the sink, but then received a QoS event stating that B1
arrived J1 too late. This means that generating B1 took (B1 + J1) - B0 = T1 -
T0 = PT1, as given in (3). Given that buffer B1 had a duration D1 and
assuming that generating a new buffer B2 will take the same amount of
processing time, a better estimation for B2 would then be:
```
B2 = T1 + D2 * DR1
```
expanding gives:
```
                      B1 + J1 - B0
B2 = (B1 + J1) + D2 * ------------
                           D1
```
assuming the durations of the frames are equal and thus D1 = D2:
```
B2 = (B1 + J1) + (B1 + J1 - B0)
B2 = 2 * (B1 + J1) - B0
```
also:
```
B0 = B1 - D1
```
so:
```
B2 = 2 * (B1 + J1) - (B1 - D1)
```
Which yields a more accurate prediction for the next buffer given as:
```
B2 = B1 + 2 * J1 + D1 (5)
```
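As a sketch of how an upstream element might apply this short term
correction (the element type and its earliest_time/proportion fields are
hypothetical):

``` c
static gboolean
my_src_event (GstPad * pad, GstObject * parent, GstEvent * event)
{
  MyElement *self = MY_ELEMENT (parent);   /* hypothetical element */

  if (GST_EVENT_TYPE (event) == GST_EVENT_QOS) {
    GstQOSType type;
    gdouble proportion;
    GstClockTimeDiff jitter;
    GstClockTime timestamp;

    gst_event_parse_qos (event, &type, &proportion, &jitter, &timestamp);

    GST_OBJECT_LOCK (self);
    /* cf. (5): timestamps earlier than B1 + 2 * J1 + D1 will also be
     * late; the buffer duration D1 is left out here for brevity */
    self->earliest_time = timestamp + 2 * jitter;
    self->proportion = proportion;
    GST_OBJECT_UNLOCK (self);

    gst_event_unref (event);
    return TRUE;
  }
  return gst_pad_event_default (pad, parent, event);
}
```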
## Long term correction
The datarate used to calculate (5) for the short term prediction is
based on a single observation. A more accurate datarate can be obtained
by creating a running average over multiple datarate observations.
This average is less susceptible to sudden changes that would only
influence the datarate for a very short period.
A running average is calculated over the observations given in (4) and
is used as the proportion member in the QoS event that is sent upstream.
Receivers of the QoS event should permanently reduce their datarate as
given by the proportion member. Failure to do so will certainly lead to
more dropped frames and a generally worse QoS.
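One possible (not mandated) way to maintain such a running average, updated
for every new observation DR1 from (4), is sketched below; the element type,
its avg_rate field and the 7/8 weighting are all assumptions:

``` c
static void
update_proportion (MyElement * self, gdouble dr1)
{
  /* weight history against the newest sample; 7/8 is arbitrary */
  self->avg_rate = (7.0 * self->avg_rate + dr1) / 8.0;
}
```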
## Throttling
In throttle mode, the time distance between buffers is kept to a
configurable throttle interval. This means that effectively the buffer
rate is limited to 1 buffer per throttle interval. This can be used to
limit the framerate, for example.
When an element is configured in throttling mode (this is usually only
implemented on sinks) it should produce QoS events upstream with the
jitter field set to the throttle interval. This should instruct upstream
elements to skip or drop the remaining buffers in the configured
throttle interval.
The proportion field is set to the desired slowdown needed to get the
desired throttle interval. Implementations can use the QoS Throttle
type, the proportion and the jitter member to tune their
implementations.
## QoS strategies
Several strategies exist to reduce processing delay that might affect
real time performance.
- lowering quality:
  - dropping frames (reduce CPU/bandwidth usage)
  - switch to a lower decoding/encoding quality (reduce algorithmic
    complexity)
  - switch to a lower quality source (reduce network usage)
- increasing thread priorities:
  - switch to real-time scheduling
  - assign more CPU cycles to critical pipeline parts
  - assign more CPU(s) to critical pipeline parts
## QoS implementations
Here follows a small overview of how QoS can be implemented in a range
of different types of elements.
### GstBaseSink
The primary implementor of QoS is GstBaseSink. It will calculate the
following values:
- upstream running average of processing time (5) in stream time.
- running average of buffer durations.
- running average of render time (in system time)
- rendered/dropped buffers
The processing time and the average buffer durations will be used to
calculate a proportion.
The processing time in system time is compared to the render time to decide
if the majority of the time is spent upstream or in the sink itself.
This value is used to decide overflow or underflow.
The number of rendered and dropped buffers is used to query stats on the
sink.
A QoS event with the most current values is sent upstream for each
buffer that was received by the sink.
Normally QoS is only enabled for video pipelines, the reason being that
drops in audio are more disturbing than dropped video frames. Also, video
generally requires more processing than audio.
Normally there is a threshold for when buffers get dropped in a video
sink. Frames that arrive 20 milliseconds late are still rendered as it
is not noticeable for the human eye.
A QoS message is posted whenever a (part of a) buffer is dropped.
In throttle mode, the sink sends a QoS event upstream with the timestamp
set to the running_time of the latest buffer and the jitter set to the
throttle interval. If the throttled buffer is late, the lateness is
subtracted from the throttle interval in order to keep the desired
throttle interval.
### GstBaseTransform
Transform elements can entirely skip the transform based on the
timestamp and jitter values of the most recent QoS event, since these
buffers will certainly arrive too late.
With any intermediate element, the element should measure its
performance to decide if it is responsible for the quality problems or
any upstream/downstream element.
Some transforms can reduce the complexity of their algorithms. Depending
on the algorithm, the changes in quality may have disturbing visual or
audible effects that should be avoided.
A QoS message should be posted when a frame is dropped or when the
quality of the filter is reduced. The quality member in the QOS message
should reflect the quality setting of the filter.
### Video Decoders
A video decoder can, based on the codec in use, decide to not decode
intermediate frames. A typical codec can for example skip the decoding
of B-frames to reduce the CPU usage and framerate.
If each frame is independently decodable, any arbitrary frame can be
skipped based on the timestamp and jitter values of the latest QoS
event. In addition, the proportion member can be used to permanently skip
frames.
It is suggested to adjust the quality field of the QoS message with the
expected amount of dropped frames (skipping B and/or P frames). This
depends on the particular spacing of B and P frames in the stream. If
the quality control results in half of the frames being dropped
(typical B frame skipping), the quality field would be set to
`1000000 * 1/2 = 500000`. If a typical I frame spacing of 18 frames is
used, skipping B and P frames would result in 17 dropped frames or 1
decoded frame every 18 frames. The quality member should be set to
`1000000 * 1/18 = 55555`.
- skipping B frames: quality = 500000
- skipping P/B frames: quality = 55555 (for I-frame spacing of 18
frames)
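A hedged sketch of this computation (the helper function is hypothetical;
the scaling matches the examples above):

``` c
static gint
compute_quality (guint64 decoded, guint64 dropped)
{
  if (decoded + dropped == 0)
    return 1000000;
  /* e.g. decoding 1 frame out of every 18: 1000000 * 1/18 = 55555 */
  return (gint) gst_util_uint64_scale (1000000, decoded, decoded + dropped);
}
```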
### Demuxers
Demuxers usually cannot do a lot regarding QoS except for skipping
frames to the next keyframe when a lateness QoS event arrives on a
source pad.
A demuxer can however measure if the performance problems are upstream
or downstream and forward an updated QoS event upstream.
Demuxers with multiple output pads might need to combine the QoS events
received on all their pads and derive an aggregated QoS event for the
upstream element.
### Sources
The QoS events only apply to push based sources since pull based sources
are entirely controlled by another downstream element.
Sources can receive an overflow or underflow event that can be used to
switch to less demanding source material. In case of a network stream, a
switch could be done to a lower or higher quality stream or additional
enhancement layers could be used or ignored.
Live sources will automatically drop data when it takes too long to
process the data that the element pushes out.
Live sources should post a QoS message when data is dropped.
# Query
## Purpose
Queries are used to get information about the stream. A query is started
on a specific pad and travels up or downstream.
## Requirements
- multiple return values, grouped together when they make sense.
- one pad function to perform the query
- extensible queries.
## Implementation
- GstQuery extends GstMiniObject and contains a GstStructure (see
GstMessage)
- some standard query types are defined below
- methods to create and parse the results in the GstQuery.
- define pad method:
``` c
gboolean (*GstPadQueryFunction) (GstPad    *pad,
                                 GstObject *parent,
                                 GstQuery  *query);
```
The pad stores the result in the query structure and returns TRUE, or
returns FALSE when the query is not supported.
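A hedged sketch of such a pad query function (the element type and its
current_position field are hypothetical):

``` c
static gboolean
my_query (GstPad * pad, GstObject * parent, GstQuery * query)
{
  MyElement *self = MY_ELEMENT (parent);   /* hypothetical */

  switch (GST_QUERY_TYPE (query)) {
    case GST_QUERY_POSITION:{
      GstFormat format;

      gst_query_parse_position (query, &format, NULL);
      if (format != GST_FORMAT_TIME)
        return FALSE;           /* this format is not supported here */
      gst_query_set_position (query, GST_FORMAT_TIME, self->current_position);
      return TRUE;
    }
    default:
      /* forward other queries to the default handler */
      return gst_pad_query_default (pad, parent, query);
  }
}
```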
## Query types
**`GST_QUERY_POSITION`**: get info on current position of the stream in stream_time.
**`GST_QUERY_DURATION`**: get info on the total duration of the stream.
**`GST_QUERY_LATENCY`**: get amount of latency introduced in the pipeline. (See [latency](design/latency.md))
**`GST_QUERY_RATE`**: get the current playback rate of the pipeline
**`GST_QUERY_SEEKING`**: get info on how seeking can be done
- getrange, with/without offset/size
- ranges where seeking is efficient (for caching network sources)
- flags describing seeking behaviour (forward, backward, segments,
play backwards, ...)
**`GST_QUERY_SEGMENT`**: get info about the currently configured playback segment.
**`GST_QUERY_CONVERT`**: convert format/value to another format/value pair.
**`GST_QUERY_FORMATS`**: return list of supported formats that can be used for GST_QUERY_CONVERT.
**`GST_QUERY_BUFFERING`**: query available media for efficient seeking (See [buffering](design/buffering.md))
**`GST_QUERY_CUSTOM`**: a custom query, the name of the query defines the properties of the query.
**`GST_QUERY_URI`**: query the uri of the source or sink element
**`GST_QUERY_ALLOCATION`**: the buffer allocation properties (See [bufferpool](design/bufferpool.md))
**`GST_QUERY_SCHEDULING`**: the scheduling properties (See [scheduling](design/scheduling.md))
**`GST_QUERY_ACCEPT_CAPS`**: check if caps are supported (See [negotiation](design/negotiation.md))
**`GST_QUERY_CAPS`**: get the possible caps (See [negotiation](design/negotiation.md))
# Object relation types
This document describes the relations between objects that exist in
GStreamer. It will also describe the way of handling the relation wrt
locking and refcounting.
## parent-child relation
```
     +---------+      +-------+
     | parent  |      | child |
*--->|        *------>|       |
     |       F1|<-----*      1|
     +---------+      +-------+
```
### properties
- parent has references to multiple children
- child has reference to parent
- reference fields protected with LOCK
- the reference held by each child to the parent is NOT reflected in
the refcount of the parent.
- the parent removes the floating flag of the child when taking
ownership.
- the application has valid reference to parent
- creation/destruction requires two unnested locks and 1 refcount.
### usage in GStreamer
* GstBin -> GstElement
* GstElement -> GstRealPad
### lifecycle
#### object creation
The application creates two object and holds a pointer
to them. The objects are initially FLOATING with a refcount of 1.
```
     +---------+       +-------+
*--->| parent  |  *--->| child |
     |    *    |       |       |
     |       F1|       | *   F1|
     +---------+       +-------+
```
#### establishing the parent-child relationship
The application then calls a method on the parent object to take ownership of
the child object. The parent performs the following actions:
```
result = _set_parent (child, parent);
if (result) {
  lock (parent);
  ref_pointer = child;
  .. update other data structures ..
  unlock (parent);
} else {
  .. child had parent ..
}
```
The `_set_parent()` method performs the following actions:
```
lock (child);
if (child->parent != null) {
  unlock (child);
  return false;
}
if (is_floating (child)) {
  unset (child, floating);
} else {
  _ref (child);
}
child->parent = parent;
unlock (child);
_signal (parent_set, child, parent);
return true;
```
The function atomically checks if the child has no parent yet
and will set the parent if not. It will also sink the child, meaning
all floating references to the child are invalid now as it takes
over the refcount of the object.
Visually:
after `_set_parent()` returns TRUE:
```
      +---------+        +-------+
 *--->| parent  | *-//-->| child |
      |    *    |        |       |
      |       F1|<-------*      1|
      +---------+        +-------+
```
after parent updates ref_pointer to child.
```
      +---------+        +-------+
 *--->| parent  | *-//-->| child |
      |        *-------->|       |
      |       F1|<-------*      1|
      +---------+        +-------+
```
- only one parent is able to _sink the same object because the
`_set_parent()` method is atomic.
- since only one parent is able to `_set_parent()` the object, only
one will add a reference to the object.
- since the parent can hold multiple references to children, we don't
need to lock the parent when locking the child. Many threads can
call `_set_parent()` on the children with the same parent, the
parent can then add all those to its lists.
> Note: the signal is emitted before the parent has added the
> element to its internal data structures. This is not a problem
> since the parent usually has its own signal to inform the app that
> the child was reffed. One possible solution would be to update the
> internal structure first and then perform a rollback if the `_set_parent()`
> failed. This is not a good solution as iterators might grab the
> 'half-added' child too soon.
#### using the parent-child relationship
- since the initial floating reference to the child object became
invalid after giving it to the parent, any reference to a child has
at least a refcount > 1.
- this means that unreffing a child object cannot decrease the
refcount to 0. In fact, only the parent can destroy and dispose the
child object.
- given a reference to the child object, the parent pointer is only
valid when holding the child LOCK. Indeed, after unlocking the child
LOCK, the parent can unparent the child or the parent could even
become disposed. To avoid the parent dispose problem, when obtaining
the parent pointer, it should be reffed before releasing the child
LOCK.
* getting a reference to the parent.
- a reference is held to the child, so it cannot be disposed.
``` c
LOCK (child);
parent = _ref (child->parent);
UNLOCK (child);
.. use parent ..
_unref (parent);
```
* getting a reference to a child
- a reference to a child can be obtained by reffing it before adding
it to the parent or by querying the parent.
- when requesting a child from the parent, a reference is held to the
parent so it cannot be disposed. The parent will use its internal
data structures to locate the child element and will return a
reference to it with an incremented refcount. The requester should
_unref() the child after usage.
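A small sketch of this pattern with the public API (the element name
"src" is assumed):

``` c
GstElement *child;

/* the parent returns the child with an extra ref */
child = gst_bin_get_by_name (GST_BIN (pipeline), "src");
if (child) {
  /* ... use child ... */
  gst_object_unref (child);     /* drop the ref obtained from the parent */
}
```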
* destroying the parent-child relationship
- only the parent can actively destroy the parent-child relationship
this typically happens when a method is called on the parent to
release ownership of the child.
- a child shall never remove itself from the parent.
- since calling a method on the parent with the child as an argument
requires the caller to obtain a valid reference to the child, the
child refcount is at least \> 1.
- the parent will perform the following actions:
``` c
LOCK (parent);
if (ref_pointer == child) {
ref_pointer = NULL;
..update other data structures ..
UNLOCK (parent);
_unparent (child);
} else {
UNLOCK (parent);
.. not our child ..
}
```
The `_unparent()` method performs the following actions:
``` c
LOCK (child);
if (child->parent != NULL) {
child->parent = NULL;
UNLOCK (child);
_signal (PARENT_UNSET, child, parent);
_unref (child);
} else {
UNLOCK (child);
}
```
Since the `_unparent()` method unrefs the child object, it is possible that
the child pointer is invalid after this function. If the parent wants to
perform other actions on the child (such as signal emission) it should
`_ref()` the child first.
## single-reffed relation
```
     +---------+      +---------+
*--->| object1 | *--->| object2 |
     |        *------>|         |
     |        1|      |        2|
     +---------+      +---------+
```
### properties
- one object has a reference to another
- reference field protected with LOCK
- the reference held by the object is reflected in the refcount of the
other object.
- typically the other object can be shared among multiple other
objects where each ref is counted for in the refcount.
- no object has ownership of the other.
- either shared state or copy-on-write.
- creation/destruction requires one lock and one refcount.
### usage
- GstRealPad -> GstCaps
- GstBuffer -> GstCaps
- GstEvent -> GstCaps
- GstEvent -> GstObject
- GstMessage -> GstCaps
- GstMessage -> GstObject
### lifecycle
#### Two objects exist unlinked.
```
     +---------+      +---------+
*--->| object1 | *--->| object2 |
     |    *    |      |         |
     |        1|      |        1|
     +---------+      +---------+
```
#### establishing the single-reffed relationship
The second object is attached to the first one using a method
on the first object. The second object is reffed and a pointer
is updated in the first object using the following algorithm:
``` c
LOCK (object1);
if (object1->pointer)
_unref (object1->pointer);
object1->pointer = _ref (object2);
UNLOCK (object1);
```
After releasing the lock on the first object, it is not guaranteed that
object2 is still reffed from object1.
```
     +---------+      +---------+
*--->| object1 | *--->| object2 |
     |        *------>|         |
     |        1|      |        2|
     +---------+      +---------+
```
#### using the single-reffed relationship
The only way to access object2 is by holding a ref to it or by
getting the reference from object1.
Reading the object pointed to by object1 can be done like this:
``` c
LOCK (object1);
object2 = object1->pointer;
_ref (object2);
UNLOCK (object1);
… use object2 …
_unref (object2);
```
Depending on the type of the object, modifications can be done either with
copy-on-write or directly into the object.
Copy on write can practically only be done like this:
``` c
LOCK (object1);
object2 = object1->pointer;
object2 = _copy_on_write (object2);
... make modifications to object2 ...
UNLOCK (object1);
```

Releasing the lock has only a very small window where the copy_on_write
actually does not perform a copy:

``` c
LOCK (object1);
object2 = object1->pointer;
_ref (object2);
UNLOCK (object1);

/* object2 now has at least 2 refcounts making the next
 * copy-on-write make a real copy, unless some other thread writes
 * another object2 to object1 here ... */

object2 = _copy_on_write (object2);
/* make modifications to object2 ... */

LOCK (object1);
if (object1->pointer != object2) {
  if (object1->pointer)
    _unref (object1->pointer);
  object1->pointer = gst_object_ref (object2);
}
UNLOCK (object1);
```
#### destroying the single-reffed relationship
The following algorithm removes the single-reffed link between
object1 and object2.
``` c
LOCK (object1);
_unref (object1->pointer);
object1->pointer = NULL;
UNLOCK (object1);
```
Which yields the following initial state again:
```
     +---------+      +---------+
*--->| object1 | *--->| object2 |
     |    *    |      |         |
     |        1|      |        1|
     +---------+      +---------+
```
## unreffed relation
```
     +---------+      +---------+
*--->| object1 | *--->| object2 |
     |        *------>|         |
     |        1|<-----*        1|
     +---------+      +---------+
```
### properties
- two objects have references to each other
- both objects can only have 1 reference to another object.
- reference fields protected with LOCK
- the references held by each object are NOT reflected in the refcount
of the other object.
- no object has ownership of the other.
- typically each object is owned by a different parent.
- creation/destruction requires two nested locks and no refcounts.
### usage
- This type of link is used when the link is less important than the
existence of the objects. If one of the objects is disposed, so is
the link.
- GstRealPad <-> GstRealPad (srcpad lock taken first)
### lifecycle
#### Two objects exist unlinked.
```
     +---------+      +---------+
*--->| object1 | *--->| object2 |
     |    *    |      |         |
     |        1|      | *      1|
     +---------+      +---------+
```
#### establishing the unreffed relationship
Since we need to take two locks, the order in which these locks are
taken is very important or we might cause deadlocks. This lock order
must be defined for all unreffed relations. In these examples we always
lock object1 first and then object2.
``` c
LOCK (object1);
LOCK (object2);
object2->refpointer = object1;
object1->refpointer = object2;
UNLOCK (object2);
UNLOCK (object1);
```
#### using the unreffed relationship
Reading requires taking one of the locks and reading the corresponding
object. Again, we need to ref the object before releasing the lock.
``` c
LOCK (object1);
object2 = _ref (object1->refpointer);
UNLOCK (object1);
.. use object2 ..
_unref (object2);
```
#### destroying the unreffed relationship
Because of the lock order, we need to be careful when destroying this
relation.
When only a reference to object1 is held:
``` c
LOCK (object1);
LOCK (object2);
object1->refpointer->refpointer = NULL;
object1->refpointer = NULL;
UNLOCK (object2);
UNLOCK (object1);
```
When only a reference to object2 is held, we need to get a handle to the
other object first so that we can lock it first. There is a window where
we need to release all locks and the relation could be invalid. To solve
this, we check the relation after grabbing both locks and retry if the
relation changed.
``` c
retry:
LOCK (object2);
object1 = _ref (object2->refpointer);
UNLOCK (object2);

.. things can change here ..

LOCK (object1);
LOCK (object2);
if (object1 == object2->refpointer) {
  /* relation unchanged */
  object1->refpointer->refpointer = NULL;
  object1->refpointer = NULL;
} else {
  /* relation changed.. retry */
  UNLOCK (object2);
  UNLOCK (object1);
  _unref (object1);
  goto retry;
}
UNLOCK (object2);
UNLOCK (object1);
_unref (object1);
```

When references are held to both objects, the relation can be destroyed
directly. Note that it is not possible to get references to both objects
with the locks released, since when the references are taken and the locks
are released, a concurrent update might have changed the link, making the
references not point to linked objects.

``` c
LOCK (object1);
LOCK (object2);
if (object1->refpointer == object2) {
  object2->refpointer = NULL;
  object1->refpointer = NULL;
} else {
  .. objects are not linked ..
}
UNLOCK (object2);
UNLOCK (object1);
```
## double-reffed relation
```
     +---------+      +---------+
*--->| object1 | *--->| object2 |
     |        *------>|         |
     |        2|<-----*        2|
     +---------+      +---------+
```
### properties
- two objects have references to each other
- reference fields protected with LOCK
- the references held by each object are reflected in the refcount of
the other object.
- no object has ownership of the other.
- typically each object is owned by a different parent.
- creation/destruction requires two locks and two refcounts.
### usage
Not used in GStreamer.
### lifecycle
# Scheduling
The scheduling in GStreamer is based on pads actively pushing
(producing) data, or pads pulling in data (consuming) from other pads.
## Pushing
A pad can produce data and push it to the next pad. A pad that behaves
this way exposes a loop function that will be called repeatedly until it
returns false. The loop function is allowed to block whenever it wants.
When the pad is deactivated the loop function should unblock though.
A pad operating in the push mode can only produce data to a pad that
exposes a chain function. This chain function will be called with the
buffer produced by the pushing pad.
This method of producing data is called the streaming mode since the
producer produces a constant stream of data.
## Pulling
Pads that operate in pulling mode can only pull data from a pad that
exposes the pull\_range function. In this case, the sink pad exposes a
loop function that will be called repeatedly until the task is stopped.
After pulling data from the peer pad, the loop function will typically
call the push function to push the result to the peer sinkpad.
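A hedged sketch of such a sink pad loop function (the element type, its
offset field and the fixed 4096 byte request size are assumptions):

``` c
static void
my_sink_loop (GstPad * pad)
{
  MySink *self = MY_SINK (GST_PAD_PARENT (pad));   /* hypothetical */
  GstBuffer *buf = NULL;
  GstFlowReturn ret;

  ret = gst_pad_pull_range (pad, self->offset, 4096, &buf);
  if (ret != GST_FLOW_OK) {
    /* EOS, flushing or error: stop the task */
    gst_pad_pause_task (pad);
    return;
  }

  self->offset += gst_buffer_get_size (buf);
  /* ... process or push the buffer downstream ... */
  gst_buffer_unref (buf);
}
```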
## Deciding the scheduling mode
When a pad is activated, the \_activate() function is called. The pad
can then choose to activate itself in push or pull mode depending on
upstream capabilities.
The GStreamer core will by default activate pads in push mode when there
is no activate function for the pad.
## The chain function
The chain function will be called when an upstream element performs a
`_push()` on the pad. The upstream element can be another chain based
element or a pushing source.
## The getrange function
The getrange function is called when a peer pad performs a
`_pull_range()` on the pad. This downstream pad can be a pulling element
or another `_pull_range()` based element.
## Scheduling Query
A sinkpad can ask the upstream srcpad for its scheduling attributes. It
does this with the SCHEDULING query.
* (out) **`modes`**: G_TYPE_ARRAY (default NULL): an array of GST_TYPE_PAD_MODE enums. Contains all the supported scheduling modes.
* (out) **`flags`**: GST_TYPE_SCHEDULING_FLAGS (default 0):
```c
typedef enum {
GST_SCHEDULING_FLAG_SEEKABLE = (1 << 0),
GST_SCHEDULING_FLAG_SEQUENTIAL = (1 << 1),
GST_SCHEDULING_FLAG_BANDWIDTH_LIMITED = (1 << 2)
} GstSchedulingFlags;
```
* **`_SEEKABLE`**: the offset of a pull operation can be specified; if this
flag is false, the offset should be -1.
* **`_SEQUENTIAL`**: suggest sequential access to the data. If `_SEEKABLE` is
specified, seeks are allowed but should be avoided. This is common for network
streams.
* **`_BANDWIDTH_LIMITED`**: suggest the element supports buffering data for
downstream to cope with bandwidth limitations. If this flag is on the
downstream element might ask for more data than necessary for normal playback.
This use-case is interesting for on-disk buffering scenarios for instance. Seek
operations might be slow as well so downstream elements should take this into
consideration.
* (out) **`minsize`**: G_TYPE_INT (default 1): the suggested minimum size of pull requests
* (out) **`maxsize`**: G_TYPE_INT (default -1, unlimited): the suggested maximum size of pull requests
* (out) **`align`**: G_TYPE_INT (default 0): the suggested alignment for the pull requests.
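A hedged sketch of a sink pad issuing this query to decide whether its
peer supports seekable pull mode:

``` c
static gboolean
peer_supports_pull (GstPad * sinkpad)
{
  GstQuery *query = gst_query_new_scheduling ();
  gboolean pull_ok = FALSE;

  if (gst_pad_peer_query (sinkpad, query))
    pull_ok = gst_query_has_scheduling_mode_with_flags (query,
        GST_PAD_MODE_PULL, GST_SCHEDULING_FLAG_SEEKABLE);

  gst_query_unref (query);
  return pull_ok;
}
```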
## Plug-in techniques
### Multi-sink elements
Elements with multiple sinks can either expose a loop function on each
of the pads to actively pull\_range data or they can expose a chain
function on each pad.
Implementing a chain function is usually easy and allows for all
possible scheduling methods.
#### Pad select
If the chain based sink wants to wait for one of the pads to receive a buffer,
just implement the action to perform in the chain function. Be aware that the
action could be performed in different threads and possibly simultaneously, so
grab the STREAM_LOCK.
#### Collect pads
If the chain based sink pads all require one buffer before the element can
operate on the data, collect all the buffers in the chain function and perform
the action when all chainpads have received a buffer.
In this case you probably also don't want to accept more data on a pad that has a buffer
queued. This can easily be done with the following code snippet:
``` c
static GstFlowReturn
_chain (GstPad * pad, GstBuffer * buffer)
{
  LOCK (mylock);
  /* wait until the previously queued buffer has been collected */
  while (pad->store != NULL) {
    WAIT (mycond, mylock);
  }
  pad->store = buffer;
  SIGNAL (mycond);
  UNLOCK (mylock);

  return GST_FLOW_OK;
}

static void
_pull (GstPad * pad, GstBuffer ** buffer)
{
  LOCK (mylock);
  /* wait until a buffer has been queued on this pad */
  while (pad->store == NULL) {
    WAIT (mycond, mylock);
  }
  *buffer = pad->store;
  pad->store = NULL;
  SIGNAL (mycond);
  UNLOCK (mylock);
}
```
## Cases
The letters between braces below the pads state which functions the
pads support:
* l: exposes a loop function, so it can act as a pushing source.
* g: exposes a getrange function
* c: exposes a chain function
The following scheduling decisions are made based on the scheduling methods
exposed by the pads:
* (g) - (l): sinkpad will pull data from src
* (l) - (c): srcpad actively pushes data to sinkpad
* () - (c): srcpad will push data to sinkpad.
* () - () : not schedulable.
* () - (l): not schedulable.
* (g) - () : not schedulable.
* (g) - (c): not schedulable.
* (l) - () : not schedulable.
* (l) - (l): not schedulable
* () - (g): impossible
* (g) - (g): impossible.
* (l) - (g): impossible
* (c) - () : impossible
* (c) - (g): impossible
* (c) - (l): impossible
* (c) - (c): impossible
```
+---------+    +------------+    +-----------+
| filesrc |    | mp3decoder |    | audiosink |
|       src--sink         src--sink          |
+---------+    +------------+    +-----------+
      (l-g) (c)          () (c)
```
When activating the pads:
- audiosink has a chain function and the peer pad has no loop
function, no scheduling is done.
- mp3decoder and filesrc expose an (l) - (c) connection, a thread is
created to call the srcpad loop function.
```
+---------+    +------------+    +----------+
| filesrc |    | avidemuxer |    | fakesink |
|       src--sink         src--sink         |
+---------+    +------------+    +----------+
      (l-g) (l)          () (c)
```
- fakesink has a chain function and the peer pad has no loop function,
no scheduling is done.
- avidemuxer and filesrc expose a (g) - (l) connection, a thread is
created to call the sinkpad loop function.
```
+---------+    +----------+    +------------+    +----------+
| filesrc |    | identity |    | avidemuxer |    | fakesink |
|       src--sink       src--sink         src--sink         |
+---------+    +----------+    +------------+    +----------+
      (l-g) (c)        ()   (l)          ()   (c)
```
- fakesink has a chain function and the peer pad has no loop function,
no scheduling is done.
- avidemuxer and identity expose no schedulable connection so this
pipeline is not schedulable.
```
+---------+    +----------+    +------------+    +----------+
| filesrc |    | identity |    | avidemuxer |    | fakesink |
|       src--sink       src--sink         src--sink         |
+---------+    +----------+    +------------+    +----------+
      (l-g) (c-l)      (g)  (l)          ()   (c)
```
- fakesink has a chain function and the peer pad has no loop function,
no scheduling is done.
- avidemuxer and identity expose a (g) - (l) connection, a thread is
created to call the sinkpad loop function.
- identity knows the srcpad is getrange based and uses the thread from
avidemuxer to getrange data from filesrc.
```
+---------+    +----------+    +------------+    +----------+
| filesrc |    | identity |    | oggdemuxer |    | fakesink |
|       src--sink       src--sink         src--sink         |
+---------+    +----------+    +------------+    +----------+
      (l-g) (c)        ()   (l-c)        ()   (c)
```
- fakesink has a chain function and the peer pad has no loop function,
no scheduling is done.
- oggdemuxer and identity expose an () - (l-c) connection, oggdemuxer
has to operate in chain mode.
- identity can only work chain based and so filesrc creates a thread
to push data to identity.
# Seeking
Seeking in GStreamer means configuring the pipeline for playback of the
media between a certain start and stop time, called the playback
segment. By default a pipeline will play from position 0 to the total
duration of the media at a rate of 1.0.
A seek is performed by sending a seek event to the sink elements of a
pipeline. Sending the seek event to a bin will by default forward the
event to all sinks in the bin.
When performing a seek, the start and stop values of the segment can be
specified as absolute positions or relative to the currently configured
playback segment. Note that it is not possible to seek relative to the
current playback position. To seek relative to the current playback
position, one must query the position first and then perform an absolute
seek to the desired position.
Feedback of the seek operation can be made immediate by using the
`GST_SEEK_FLAG_FLUSH` flag. With this flag, all pending data in the
pipeline is discarded and playback starts from the new position
immediately.
When the FLUSH flag is not set, the seek will be queued and executed as
soon as possible, which might be after all queues are emptied.
Seeking can be performed in different formats such as time, frames or
samples.
The seeking can be performed to a nearby key unit or to the exact
(estimated) unit in the media (`GST_SEEK_FLAG_KEY_UNIT`). See below
for more details on this.
The seeking can be performed by using an estimated target position or in
an accurate way (`GST_SEEK_FLAG_ACCURATE`). For some formats this can
result in having to scan the complete file in order to accurately find
the target unit. See below for more details on this.
Non-segment seeking will make the pipeline emit EOS when the configured
segment has been played.
Segment seeking (using the `GST_SEEK_FLAG_SEGMENT`) will not emit an
EOS at the end of the playback segment but will post a SEGMENT_DONE
message on the bus. This message is posted by the element driving the
playback in the pipeline, typically a demuxer. After receiving the
message, the application can reconnect the pipeline or issue other seek
events in the pipeline. Since the message is posted as early as possible
in the pipeline, the application has some time to issue a new seek to
make the transition seamless. Typically the allowed delay is defined by
the buffer sizes of the sinks as well as the size of any queues in the
pipeline.
The seek can also change the playback speed of the configured segment. A
speed of 1.0 is normal speed, 2.0 is double speed. Negative values mean
backward playback.
When performing a seek with a playback rate different from 1.0, the
`GST_SEEK_FLAG_SKIP` flag can be used to instruct decoders and demuxers
that they are allowed to skip decoding. This can be useful when resource
consumption is more important than accurately producing all frames.
<!-- FIXME # Seeking in push based elements-->
## Generating seeking events
A seek event is created with `gst_event_new_seek ()`.
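For example, a sketch of a flushing, accurate seek to the 1s-5s range in
time format (the pipeline variable is assumed to exist):

``` c
GstEvent *seek;

seek = gst_event_new_seek (1.0, GST_FORMAT_TIME,
    GST_SEEK_FLAG_FLUSH | GST_SEEK_FLAG_ACCURATE,
    GST_SEEK_TYPE_SET, 1 * GST_SECOND,
    GST_SEEK_TYPE_SET, 5 * GST_SECOND);
/* gst_element_send_event () takes ownership of the seek event */
gst_element_send_event (pipeline, seek);
```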
## Seeking variants
The different kinds of seeking methods and their internal workings are
described below.
### FLUSH seeking
This is the most common way of performing a seek in a playback
application. The application issues a seek on the pipeline and the new
media is immediately played after the seek call returns.
### seeking without FLUSH
This seek type is typically performed after issuing segment seeks to
finish the playback of the pipeline.
Performing a non-flushing seek in a PAUSED pipeline blocks until the
pipeline is set to playing again since all data passing is blocked in
the prerolled sinks.
### segment seeking with FLUSH
This seek is typically performed when starting seamless looping.
### segment seeking without FLUSH
This seek is typically performed when continuing seamless looping.
### Demuxer/parser behaviour and `SEEK_FLAG_KEY_UNIT` and `SEEK_FLAG_ACCURATE`
This section aims to explain the behaviour expected by an element with
regard to the KEY_UNIT and ACCURATE seek flags, using the example of a
parser or demuxer.
#### DEFAULT BEHAVIOUR:
When a seek to a certain position is requested, the demuxer/parser will
do two things (ignoring flushing and segment seeks, and simplified for
illustration purposes):
- send a segment event with a new start position
- start pushing data/buffers again
To ensure that the data corresponding to the requested seek position can
actually be decoded, a demuxer or parser needs to start pushing data
from a keyframe/keyunit at or before the requested seek position.
Unless requested differently (via the KEY_UNIT flag), the start of the
segment event should be the requested seek position.
So by default a demuxer/parser will then start pushing data from
position DATA and send a segment event with start position SEG_START,
where DATA <= SEG_START.
If DATA < SEG_START, a well-behaved video decoder will start decoding
frames from DATA, but take into account the segment configured by the
demuxer via the segment event, and only actually output decoded video
frames from SEG_START onwards, dropping all decoded frames that are
before the segment start and adjusting the timestamp/duration of the
buffer that overlaps the segment start ("clipping"). A
not-so-well-behaved video decoder will start decoding frames from DATA
and push decoded video frames out starting from position DATA, in which
case the frames that are before the configured segment start will
usually be dropped/clipped downstream (e.g. by the video sink).
#### GST_SEEK_FLAG_KEY_UNIT:
If the KEY_UNIT flag is specified, the demuxer/parser should adjust the
segment start to the position of the key frame closest to the requested
seek position and then start pushing out data from there. The nearest
key frame may be before or after the requested seek position, but many
implementations will only look for the closest keyframe before the
requested position.
Most media players and thumbnailers do (and should be doing) KEY_UNIT
seeks by default, for performance reasons, to ensure almost-instant
responsiveness when scrubbing (dragging the seek slider in PAUSED or
PLAYING mode). This works well for most media, but results in suboptimal
behaviour for a small number of *odd* files (e.g. files that only have
one keyframe at the very beginning, or only a few keyframes throughout
the entire stream). At the time of writing, a solution for this still
needs to be found, but could be implemented demuxer/parser-side, e.g.
make demuxers/parsers ignore the KEY_UNIT flag if the position
adjustment would be larger than 1/10th of the duration or somesuch.
Flags can be used to influence the snapping direction for those cases where
it matters. SNAP_BEFORE will select the preceding position to the seek
target, and SNAP_AFTER will select the following one. If both flags are
set, the nearest one to the seek target will be used. If none of these
flags are set, the seeking implementation is free to select whichever it
wants.
#### Summary:
- if the KEY_UNIT flag is **not** specified, the demuxer/parser
should start pushing data from a key unit preceding the seek
position (or from the seek position if that falls on a key unit),
and the start of the new segment should be the requested seek
position.
- if the KEY_UNIT flag is specified, the demuxer/parser should start
pushing data from the key unit nearest the seek position (or from
the seek position if that falls on a key unit), and the start of the
new segment should be adjusted to the position of that key unit
which was nearest the requested seek position (ie. the new segment
start should be the position from which data is pushed).
#### GST_SEEK_FLAG_ACCURATE:
If the ACCURATE flag is specified in a seek request, the demuxer/parser
is asked to do whatever it takes (!) to make sure that the position
seeked to is accurate in relation to the beginning of the stream. This
means that it is not acceptable to just approximate the position (e.g.
using an average bitrate). The achieved position must be exact. In the
worst case, the demuxer or parser needs to push data from the beginning
of the file and let downstream clip everything before the requested
segment start.
The ACCURATE flag does not affect what the segment start should be in
relation to the requested seek position. Only the KEY_UNIT flag (or its
absence) has any effect on that.
Video editors and frame-stepping applications usually use the ACCURATE
flag.
#### Summary:
- if the ACCURATE flag is **not** specified, it is up to the
demuxer/parser to decide how exact the seek should be. If the flag
is not specified, the expectation is that the demuxer/parser does a
reasonable best effort attempt, trading accuracy for speed. In the
absence of an index, the seek position may be approximated.
- if the ACCURATE flag is specified, absolute accuracy is required,
and speed is of no concern. It is not acceptable to just approximate
the seek position in that case.
- the ACCURATE flag does not imply that the segment starts at the
requested seek position or should be adjusted to the nearest
keyframe, only the KEY_UNIT flag determines that.
#### ACCURATE and KEY_UNIT combinations:
All combinations of these two flags are valid:
- neither flag specified: segment starts at seek position, send data
from preceding key frame (or earlier), feel free to approximate the
seek position
- only KEY_UNIT specified: segment starts from position of nearest
keyframe, send data from nearest keyframe, feel free to approximate
the seek position
- only ACCURATE specified: segment starts at seek position, send data
from preceding key frame (or earlier), do not approximate the seek
position under any circumstances
- ACCURATE | KEY_UNIT specified: segment starts from position of
nearest keyframe, send data from nearest key frame, do not
approximate the seek position under any circumstances
# Segments
A segment in GStreamer denotes a set of media samples that must be
processed. A segment has a start time, a stop time and a processing
rate.
A media stream has a start and a stop time. The start time is always 0
and the stop time is the total duration (or -1 if unknown, for example a
live stream). We call this the complete media stream.
The segment of the complete media stream can be played by issuing a seek
on the stream. The seek has a start time, a stop time and a processing
rate.
```
                 complete stream
+------------------------------------------------+
0                                        duration

                  segment
         |--------------------------|
       start                      stop
```
The playback of a segment starts with a source or demuxer element
pushing a segment event containing the start time, stop time and rate of
the segment. The purpose of this segment is to inform downstream
elements of the requested segment positions. Some elements might produce
buffers that fall outside of the segment and that might therefore be
discarded or clipped.
## Use case: FLUSHING seek
ex. `filesrc ! avidemux ! videodecoder ! videosink`
When doing a seek in this pipeline for a segment 1 to 5 seconds, avidemux
will perform the seek.
Avidemux starts by sending a FLUSH_START event downstream and upstream. This
will cause its streaming task to pause because `_pad_pull_range()` and
`_pad_push()` will return FLUSHING. It then waits for the STREAM_LOCK,
which will be unlocked when the streaming task pauses. At this point no
streaming is happening anymore in the pipeline and a FLUSH_STOP is sent
upstream and downstream.
When avidemux starts playback of the segment from second 1 to 5, it pushes
out a segment with 1 and 5 as start and stop times. The stream_time in
the segment is also 1 as this is the position we seek to.
The video decoder stores these values internally and forwards them to the
next downstream element (videosink, which also stores the values)
Since second 1 does not contain a keyframe, the avi demuxer starts sending
data from the previous keyframe which is at timestamp 0.
The video decoder decodes the keyframe but knows it should not push the
video frame yet as it falls outside of the configured segment.
When the video decoder receives the frame with timestamp 1, it is able to
decode this frame as it received and decoded the data up to the previous
keyframe. It then continues to decode and push frames with timestamps >= 1.
When it reaches timestamp 5, it does not decode and push frames anymore.
The video sink receives a frame with timestamp 1. It takes the start value of
the previous segment and applies the following (simplified) formula:
```
render_time = BUFFER_TIMESTAMP - segment_start + element->base_time
```
It then syncs against the clock with this render_time. Note that
BUFFER_TIMESTAMP is always >= segment_start or else it would fall outside of
the configured segment.
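As a worked example (with numbers chosen for illustration): after the seek
to the 1 to 5 second segment, a buffer with timestamp 1 yields render_time =
1 - 1 + base_time = base_time, so it is rendered as soon as the clock
reaches the base_time, i.e. immediately; a buffer with timestamp 2 is
rendered one second later.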
Videosink reports its current position as (simplified):
```
current_position = clock_time - element->base_time + segment_time
```
See [synchronisation](design/synchronisation.md) for a more detailed and accurate explanation of
synchronisation and position reporting.
Since after a flushing seek the stream_time is reset to 0, the new buffer
will be rendered immediately after the seek and the current_position will be
the stream_time of the seek that was performed.
The stop time is important when the video format contains B frames. The
video decoder receives a P frame first, which it can decode but not push yet.
When it receives a B frame, it can decode the B frame and push the B frame
followed by the previously decoded P frame. If the P frame is outside of the
segment, the decoder knows it should not send the P frame.
Avidemux stops sending data after pushing a frame with timestamp 5 and
returns GST_FLOW_EOS from the chain function to make the upstream
elements perform the EOS logic.
## Use case: live stream
## Use case: segment looping
Consider the case of a wav file with raw audio.
```
filesrc ! wavparse ! alsasink
```
FIXME!
# Seqnums (Sequence numbers)
Seqnums are integers associated to events and messages. They are used to
identify a group of events and messages as being part of the same
*operation* over the pipeline.
Whenever a new event or message is created, a seqnum is set on it.
This seqnum is created from an ever increasing source (starting from 0,
and it might wrap around), so each new event and message gets a new and
hopefully unique seqnum.
Suppose an element receives an event A and, as part of the logic of
handling event A, creates a new event B. B should have its seqnum set to
the same value as A, because they are part of the same operation. The same
logic applies if this element had to create multiple events or messages:
all of those should have the seqnum set to the value on the received
event. For example, when a sink element receives an EOS event and
creates a new EOS message to post, it should copy the seqnum from the
event to the message because the EOS message is a consequence of the EOS
event being received.
Preserving the seqnums across related events and messages allows the
elements and applications to identify a set of events/messages as being
part of a single operation on the pipeline. For example, flushes,
segments and EOS that are related to a seek event started by the
application.
Seqnums are also useful for elements to discard duplicated events,
avoiding handling them again.
Below are some scenarios as examples of how to handle seqnums when
receiving events:
## Forcing EOS on the pipeline
The application has a pipeline running and does a
`gst_element_send_event` to the pipeline with an EOS event. All the
sources in the pipeline will have their `send_event` handlers called and
will receive the event from the application.
When handling this event, the sources will either push the same EOS
downstream or create their own EOS event and push it. In the latter case,
the source should copy the seqnum from the original EOS to the newly
created one. This same logic applies to all elements that receive the EOS
downstream: either push the same event or, if creating a new one, copy
the seqnum.
When the EOS reaches the sink, it will create an EOS message, copy the
seqnum to the message and post to the bus. The application receives the
message and can compare the seqnum of the message with the one from the
original event sent to the pipeline. If they match, it knows that this
EOS message was caused by the event it pushed and not from other reason
(input finished or configured segment was over).
## Seeking
A seek event sent to the pipeline is forwarded to all sinks in it. Those
sinks, then, push the seek event upstream until they reach an element
that is capable of handling it. If the element handling the seek has
multiple source pads (typically a demuxer is handling the seek) it might
receive the same seek event on all pads. To prevent handling the same
seek event multiple times, the seqnum can be used to identify those
events as being the same and only handle the first one received.
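A hedged sketch of this deduplication (the demuxer type and its
last_seek_seqnum field are hypothetical):

``` c
static gboolean
handle_seek (MyDemux * demux, GstEvent * event)
{
  guint32 seqnum = gst_event_get_seqnum (event);

  if (seqnum == demux->last_seek_seqnum)
    return TRUE;                /* duplicate of a seek already handled */
  demux->last_seek_seqnum = seqnum;

  /* ... perform the seek; give the resulting flush and segment events
   * the same seqnum via gst_event_set_seqnum () ... */
  return TRUE;
}
```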
Also, when handling the seek, the element might push flush-start,
flush-stop and a segment event. All those events should have the same
seqnum of the seek event received. When this segment is over and an
EOS/Segment-done event is going to be pushed, it also should have the
same seqnum of the seek that originated the segment to be played.
Having the same seqnum as the seek on the segment-done or EOS events is
important for the application to identify that the segment requested by
its seek has finished playing.
## Questions
What happens if the application has sent a seek to the pipeline and,
while the segment relative to this seek is playing, it sends an EOS
event? Should the EOS pushed by the source have the seqnum of the
segment or the EOS from the application?
If the EOS was received from the application before the segment ended,
it should have the seqnum of the application's EOS event. If the segment
ends before the application event is received/handled, it should have the
seek/segment seqnum.
# DRAFT Sparse Streams
## Introduction
In 0.8, there was some support for Sparse Streams through the use of
FILLER events. These were used to mark gaps between buffers so that
downstream elements could know not to expect any more data for that gap.
In 0.10, segment information conveyed through SEGMENT events can be used
for the same purpose.
In 1.0, there is a GAP event that works in a similar fashion as the
FILLER event in 0.8.
## Use cases
1) Sub-title streams
Sub-title information from muxed formats such as
Matroska or MPEG consist of irregular buffers spaced far apart compared
to the other streams (audio and video). Since these usually only appear
when someone speaks or some other action in the video/audio needs
describing, they can be anywhere from 1-2 seconds to several minutes
apart. Downstream elements that want to mix sub-titles and video (and muxers)
have no way of knowing whether to process a video packet or wait a moment
for a corresponding sub-title to be delivered on another pad.
2) Still frame/menu support
In DVDs (and other formats), there are
still-frame regions where the current video frame should be retained and
no audio played for a period. In DVD, these are described either as a
fixed duration, or infinite duration still frame.
3) Avoiding processing silence from audio generators
Imagine a source
that from time to time produces empty buffers (silence or blank images).
If the pipeline contains many downstream elements, it is better to avoid
needlessly processing this empty data. Examples of such sources are
sound-generators (simsyn in gst-buzztard) or a source in a voip
application that uses noise-gating (to save bandwidth).
## Details
### Sub-title streams
The main requirement here is to avoid stalling the
pipeline between sub-title packets; this is effectively done by updating
the minimum timestamp for that stream.
A demuxer can do this by sending an 'update' SEGMENT with a new start time
to the subtitle pad. For example, every time the SCR in MPEG data
advances more than 0.5 seconds, the MPEG demuxer can issue a SEGMENT with
(update=TRUE, start=SCR). Downstream elements can then be aware not to
expect any data older than the new start time.
The same holds true for any element that knows the current position in the
stream - once the element knows that there is no more data to be presented
until time 'n' it can advance the start time of the current segment to 'n'.
This technique can also be used, for example, to represent a stream of
MIDI events spaced to a clock period. When there is no event present for
a clock time, a SEGMENT update can be sent in its place.
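In 1.0, the same information can be conveyed with the GAP event mentioned
in the introduction; a minimal sketch (current_position is an assumed
variable):

``` c
/* announce that no data will arrive before current_position */
GstEvent *gap = gst_event_new_gap (current_position, GST_CLOCK_TIME_NONE);
gst_pad_push_event (srcpad, gap);
```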
### Still frame/menu support
Still frames in DVD menus are not the same,
in that they do not introduce a gap in the timestamps of the data.
Instead, they represent a pause in the presentation of a stream.
Correctly performing the wait requires some synchronisation with
downstream elements.
In this scenario, an upstream element that wants to execute a still frame
performs the following steps:
- Send all data before the still frame wait
- Send a DRAIN event to ensure that all data has been played
downstream.
- wait on the clock for the required duration, possibly interrupting
if necessary due to an intervening activity (such as a user
navigation)
- FLUSH the pipeline using a normal flush sequence (FLUSH\_START,
chain-lock, FLUSH\_STOP)
- Send a SEGMENT to restart playback with the next timestamp in the
stream.
The upstream element performing the wait must only do so when in the PLAYING
state. During PAUSED, the clock will not be running, and may not even have
been distributed to the element yet.
DRAIN is a new event that will block on a src pad until all data downstream
has been played out.
Flushing after completing the still wait is to ensure that data after the wait
is played correctly. Without it, sinks will consider the first buffers to
arrive x seconds late (where x is the duration of the wait that occurred),
and they will be discarded instead of played.
### For audio
It is the same case as the first one - there is a *gap* in the audio
data that needs to be presented, and this can be done by sending a
SEGMENT update that moves the start time of the segment to the next
timestamp when data will be sent.
For video, however, it is slightly different. Video frames are typically
treated at the moment as continuing to be displayed after their indicated
duration if no new frame arrives. Here, it is desired to display a blank
frame instead, in which case at least one blank frame should be sent before
updating the start time of the segment.
# Ownership of dynamic objects
Any object-oriented system or language that doesn't have automatic
garbage collection has many potential pitfalls as far as the pointers
go. Therefore, some standards must be adhered to as far as who owns
what.
## Strings
Arguments passed into a function are owned by the caller, and the
function will make a copy of the string for its own internal use. The
string should be `const gchar *`. Strings returned from a function are
always a copy of the original and should be freed after usage by the
caller.
ex:
``` c
name = gst_element_get_name (element); /* copy of name is made */
.. use name ..
g_free (name); /* free after usage */
```
## Objects
Objects passed into a function are owned by the caller, any additional
reference held to the object after leaving the function should increase
the refcount of that object.
Objects returned from a function are owned by the caller. This means
that the caller should _free() or _unref() the object after usage.
ex:
``` c
peer = gst_pad_get_peer (pad); /* peer with increased refcount */
if (peer) {
.. use peer ..
gst_object_unref (GST_OBJECT (peer)); /* unref peer after usage */
}
```
## Iterators
When retrieving multiple objects from an object, an iterator should be
used. The iterator allows you to access the objects one after another
while making sure that the set of objects retrieved remains consistent.
Each object retrieved from an iterator has its refcount increased or is
a copy of the original. In any case the object should be unreffed or
freed after usage.
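For example (a sketch, assuming a GstElement pointer `element`; in the
GStreamer 1.x GstIterator API, items are returned in a GValue that holds
its own reference):

``` c
GstIterator *it;
GValue item = G_VALUE_INIT;
gboolean done = FALSE;

it = gst_element_iterate_pads (element);
while (!done) {
  switch (gst_iterator_next (it, &item)) {
    case GST_ITERATOR_OK: {
      GstPad *pad = g_value_get_object (&item);
      g_print ("pad: %s\n", GST_PAD_NAME (pad));
      g_value_reset (&item); /* releases the reference held by the GValue */
      break;
    }
    case GST_ITERATOR_RESYNC:
      /* the set of objects changed concurrently, start again */
      gst_iterator_resync (it);
      break;
    case GST_ITERATOR_ERROR:
    case GST_ITERATOR_DONE:
      done = TRUE;
      break;
  }
}
g_value_unset (&item);
gst_iterator_free (it);
```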
404 markdown/design/states.md Normal file
@@ -0,0 +1,404 @@
# States
Both elements and pads can be in different states. The states of the
pads are linked to the state of the element so the design of the states
is mainly focused around the element states.
An element can be in 4 states. NULL, READY, PAUSED and PLAYING. When an
element is initially instantiated, it is in the NULL state.
## State definitions
- NULL: This is the initial state of an element.
- READY: The element should be prepared to go to PAUSED.
- PAUSED: The element should be ready to accept and process data. Sink
elements however only accept one buffer and then block.
- PLAYING: The same as PAUSED except for live sources and sinks. Sinks
accept and render data. Live sources produce data.
We call the sequence NULL→PLAYING an upwards state change and
PLAYING→NULL a downwards state change.
## State transitions
The following state changes are possible:
* *NULL -> READY*:
- The element must check if the resources it needs are available.
Device sinks and sources typically try to probe the device to constrain
their caps.
- The element opens the device, this is needed if the previous step requires
the device to be opened.
* *READY -> PAUSED*:
- The element pads are activated in order to receive data in PAUSED.
Streaming threads are started.
- Some elements might need to return `ASYNC` and complete the state change
when they have enough information. It is a requirement for sinks to
return `ASYNC` and complete the state change when they receive the first
buffer or EOS event (preroll). Sinks also block the dataflow when in PAUSED.
- A pipeline resets the running_time to 0.
- Live sources return NO_PREROLL and don't generate data.
* *PAUSED -> PLAYING*:
- Most elements ignore this state change.
- The pipeline selects a clock and distributes this to all the children
before setting them to PLAYING. This means that it is only allowed to
synchronize on the clock in the PLAYING state.
- The pipeline uses the clock and the running_time to calculate the base_time.
The base_time is distributed to all children when performing the state
change.
- Sink elements stop blocking on the preroll buffer or event and start
rendering the data.
- Sinks can post the EOS message in the PLAYING state. It is not allowed to
post EOS when not in the PLAYING state.
- While streaming in PAUSED or PLAYING, elements can add and remove
  sometimes pads.
- Live sources start generating data and return SUCCESS.
* *PLAYING -> PAUSED*:
- Most elements ignore this state change.
- The pipeline calculates the running_time based on the last selected clock
and the base_time. It stores this information to continue playback when
going back to the PLAYING state.
- Sinks unblock any clock wait calls.
- When a sink does not have a pending buffer to play, it returns `ASYNC` from
this state change and completes the state change when it receives a new
buffer or an EOS event.
- Any queued EOS messages are removed since they will be reposted when going
back to the PLAYING state. The EOS messages are queued in GstBins.
- Live sources stop generating data and return NO_PREROLL.
* *PAUSED -> READY*:
- Sinks unblock any waits in the preroll.
- Elements unblock any waits on devices
- Chain or get_range functions return FLUSHING.
- The element pads are deactivated so that streaming becomes impossible and
all streaming threads are stopped.
- The sink forgets all negotiated formats
- Elements remove all sometimes pads
* *READY -> NULL*:
- Elements close devices
- Elements reset any internal state.
## State variables
An element has 4 state variables that are protected with the object LOCK:
- *STATE*
- *STATE_NEXT*
- *STATE_PENDING*
- *STATE_RETURN*
The STATE always reflects the current state of the element. The
STATE\_NEXT reflects the next state the element will go to. The
STATE\_PENDING always reflects the required state of the element. The
STATE\_RETURN reflects the last return value of a state change.
The STATE\_NEXT and STATE\_PENDING can be VOID\_PENDING if the element
is in the right state.
An element has a special lock to protect against concurrent invocations
of set\_state(), called the STATE\_LOCK.
## Setting state on elements
The state of an element can be changed with \_element\_set\_state().
When changing the state of an element all intermediate states will also
be set on the element until the final desired state is set.
The `set_state()` function can return 4 possible values:
* *GST_STATE_FAILURE*: The state change failed for some reason. The plugin should
have posted an error message on the bus with information.
* *GST_STATE_SUCCESS*: The state change is completed successfully.
* *GST_STATE_ASYNC*: The state change will complete later on. This can happen
when the element needs a long time to perform the state change or for sinks
that need to receive the first buffer before they can complete the state change
(preroll).
* *GST_STATE_NO_PREROLL*: The state change is completed successfully but the
element will not be able to produce data in the PAUSED state.
In the case of an `ASYNC` state change, it is possible to proceed to the
next state before the current state change completed; however, the
element will only get to this next state after completing the previous
`ASYNC` state change. After receiving an `ASYNC` return value, you can use
`gst_element_get_state()` to poll the status of the element. If the
polling returns `SUCCESS`, the element completed the state change to the
last requested state with `set_state()`.
When setting the state of an element, the STATE\_PENDING is set to the
required state. Then the state change function of the element is called
and the result of that function is used to update the STATE,
STATE\_NEXT, STATE\_PENDING and STATE\_RETURN fields. If the function
returned `ASYNC`, this result is immediately returned to the caller.
## Getting state of elements
The get\_state() function takes 3 arguments, two pointers that will
hold the current and pending state and one GstClockTime that holds a
timeout value. The function returns a GstElementStateReturn.
- If the element returned `SUCCESS` to the previous \_set\_state()
function, this function will return the last state set on the
element and VOID\_PENDING in the pending state value. The function
returns GST\_STATE\_SUCCESS.
- If the element returned NO\_PREROLL to the previous \_set\_state()
function, this function will return the last state set on the
element and VOID\_PENDING in the pending state value. The function
returns GST\_STATE\_NO\_PREROLL.
- If the element returned FAILURE to the previous \_set\_state() call,
this function will return FAILURE with the state set to the current
state of the element and the pending state set to the value used in
the last call of \_set\_state().
- If the element returned `ASYNC` to the previous \_set\_state() call,
this function will wait for the element to complete its state change
up to the amount of time specified in the GstClockTime.
- If the element does not complete the state change in the
specified amount of time, this function will return `ASYNC` with
the state set to the current state and the pending state set to
the pending state.
- If the element completes the state change within the specified
timeout, this function returns the updated state and
VOID\_PENDING as the pending state.
- If the element aborts the `ASYNC` state change due to an error
within the specified timeout, this function returns FAILURE with
the state set to last successful state and pending set to the
last attempt. The element should also post an error message on
the bus with more information about the problem.
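As an illustration, a typical application-side sequence could look like
this (a sketch; note that in the current core API the return values are
named GST_STATE_CHANGE_FAILURE/SUCCESS/ASYNC/NO_PREROLL):

``` c
GstState current, pending;
GstStateChangeReturn ret;

ret = gst_element_set_state (element, GST_STATE_PAUSED);
if (ret == GST_STATE_CHANGE_ASYNC) {
  /* wait up to one second for the ASYNC state change to complete */
  ret = gst_element_get_state (element, &current, &pending, GST_SECOND);
  if (ret == GST_STATE_CHANGE_SUCCESS)
    g_print ("reached %s\n", gst_element_state_get_name (current));
  else if (ret == GST_STATE_CHANGE_ASYNC)
    g_print ("still changing to %s\n", gst_element_state_get_name (pending));
}
```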
## States in GstBin
A GstBin manages the state of its children. It does this by propagating
the state changes performed on it to all of its children. The
\_set\_state() function on a bin will call the \_set\_state() function
on all of its children that are not already in the target state or
changing to the target state.
The children are iterated from the sink elements to the source elements.
This makes sure that when changing the state of an element, the
downstream elements are in the correct state to process the eventual
buffers. In the case of a downwards state change, the sink elements will
shut down first which makes the upstream elements shut down as well
since the \_push() function returns a GST\_FLOW\_FLUSHING error.
If all the children return `SUCCESS`, the function returns `SUCCESS` as
well.
If one of the children returns FAILURE, the function returns FAILURE as
well. In this state it is possible that some elements successfully
changed state. The application can check which elements have a changed
state, which were in error and which were not affected by iterating the
elements and calling \_get\_state() on the elements.
If after calling the state function on all children, one of the children
returned `ASYNC`, the function returns `ASYNC` as well.
If after calling the state function on all children, one of the children
returned NO\_PREROLL, the function returns NO\_PREROLL as well.
If both NO\_PREROLL and `ASYNC` children are present, NO\_PREROLL is
returned.
The current state of the bin can be retrieved with \_get\_state().
If the bin is performing an `ASYNC` state change, it will automatically
update its current state fields when it receives state messages from the
children.
## Implementing states in elements
### READY

### Upward state change

Upward state changes always return `ASYNC`, whether the STATE\_PENDING
state is reached or not.
Element:
* A -> B => `SUCCESS`
- commit state
* A -> B => `ASYNC`
- no commit state
- element commits state `ASYNC`
* A -> B while `ASYNC`
- update STATE_PENDING state
- no commit state
- no change_state called on element
Bin:
* A->B: all elements `SUCCESS`
- commit state
* A->B: some elements `ASYNC`
- no commit state
- listen for commit messages on bus
- for each commit message, poll elements, this happens in another
thread.
- if no `ASYNC` elements, commit state, continue state change
to STATE_PENDING
### Downward state change

Downward state changes only return `ASYNC` if the final state is `ASYNC`.
This makes sure that it's not needed to wait for an element to
complete the preroll or other `ASYNC` state changes when one only wants to
shut down an element.
Element:

* A -> B => `SUCCESS`
  - commit state
* A -> B => `ASYNC` not final state
  - commit state on behalf of element
* A -> B => `ASYNC` final state
  - element will commit `ASYNC`
Bin:

* A -> B => `SUCCESS`
  - commit state
* A -> B => `ASYNC` not final state
  - commit state on behalf of element, continue state change
* A -> B => `ASYNC` final state
  - no commit state
  - listen for commit messages on bus
  - for each commit message, poll elements
  - if no `ASYNC` elements, commit state
## Locking overview (element)
- Element committing `SUCCESS`
  - STATE\_LOCK is taken in set\_state
  - change\_state is called; if `SUCCESS`, commit\_state is called
  - commit\_state calls change\_state for the next state change.
  - if the final state is reached, the stack unwinds and the result is
    returned to set\_state and the caller.
```
set_state(element) change_state (element) commit_state
| | |
| | |
STATE_LOCK | |
| | |
|------------------------>| |
| | |
| | |
| | (do state change) |
| | |
| | |
| | if `SUCCESS` |
| |---------------------->|
| | | post message
| | |
| |<----------------------| if (!final) change_state (next)
| | | else SIGNAL
| | |
| | |
| | |
|<------------------------| |
| `SUCCESS`
|
STATE_UNLOCK
|
`SUCCESS`
```
- Element committing `ASYNC`
  - STATE\_LOCK is taken in set\_state
  - change\_state is called and returns `ASYNC`
  - `ASYNC` is returned to the caller.
  - element takes LOCK in the streaming thread.
  - element calls commit\_state in the streaming thread.
  - commit\_state calls change\_state for the next state change.
```
set_state(element) change_state (element) stream_thread commit_state (element)
| | | |
| | | |
STATE_LOCK | | |
| | | |
|------------------------>| | |
| | | |
| | | |
| | (start_task) | |
| | | |
| | STREAM_LOCK |
| | |... |
|<------------------------| | |
| ASYNC STREAM_UNLOCK |
STATE_UNLOCK | |
| .....sync........ STATE_LOCK |
ASYNC |----------------->|
| |
| |---> post_message()
| |---> if (!final) change_state (next)
| | else SIGNAL
|<-----------------|
STATE_UNLOCK
|
STREAM_LOCK
| ...
STREAM_UNLOCK
```
## Remarks
set\_state cannot be called from multiple threads at the same time. The
STATE\_LOCK prevents this.

State variables are protected with the LOCK.

Calling set\_state while get\_state is in progress should unblock the
get\_state with an error. The cookie will do that.
``` c
set_state(element)
STATE_LOCK
LOCK
update current, next, pending state
cookie++
UNLOCK
change_state
STATE_UNLOCK
```
@@ -0,0 +1,580 @@
# Stream selection
History
```
v0.1: Jun 11th 2015
Initial Draft
v0.2: Sep 18th 2015
Update to reflect design changes
v1.0: Jun 28th 2016
Pre-commit revision
```
This document describes the events and objects involved in stream
selection in GStreamer pipelines, elements and applications.
## Background
This new API is intended to address the use cases described in
this section:
1) As a user/app I want an overview and control of the media streams
that can be configured within a pipeline for processing, even
when some streams are mutually exclusive or logical constructs only.
2) The user/app can entirely disable streams it's not interested
in, so they don't occupy memory or processing power - they are
discarded as early as possible in the pipeline. The user/app can also
(re-)enable them at a later time.
3) If the set of possible stream configurations is changing,
the user/app should be aware of the pending change and
be able to make configuration choices for the new set of streams,
as well as possibly still reconfiguring the old set
4) Elements that have some other internal mechanism for triggering
stream selections (DVD, or maybe some scripted playback
playlist) should be able to trigger 'selection' of some particular
stream.
5) Indicate known relationships between streams - for example that
2 separate video feeds represent the 2 views of a stereoscopic
view, or that certain streams are mutually exclusive.
> Note: the streams that are "available" are not automatically
> the ones active, or present in the pipeline as pads. Think HLS/DASH
> alternate streams.
Use case examples:
1) Playing an MPEG-TS multi-program stream, we want to tell the
app that there are multiple programs that could be extracted
from the incoming feed. Further, we want to provide a mechanism
for the app to select which program(s) to decode, and once
that is known to further tell the app which elementary streams
are then available within those program(s) so the app/user can
choose which audio track(s) to decode and/or use.
2) A new PMT arrives for an MPEG-TS stream, due to a codec or
channel change. The pipeline will need to reconfigure to
play the desired streams from new program. Equally, there
may be multiple seconds of content buffered from the old
program and it should still be possible to switch (for example)
subtitle tracks responsively in the draining out data, as
well as selecting which subs track to play from the new feed.
This same scenario applies when doing gapless transition to a
new source file/URL, except that likely the element providing
the list of streams also changes as a new demuxer is installed.
3) When playing a multi-angle DVD, the DVD Virtual Machine needs to
extract 1 angle from the data for presentation. It can publish
the available angles as logical streams, even though only one
stream can be chosen.
4) When playing a DVD, the user can make stream selections from the
DVD menu to choose audio or sub-picture tracks, or the DVD VM
can trigger automatic selections. In addition, the player UI
should be able to show which audio/subtitle tracks are available
and allow direct selection in a GUI the same as for normal
files with subtitle tracks in them.
5) Playing a SCHC (3DTV) feed, where one view is MPEG-2 and the other
is H.264 and they should be combined for 3D presentation, or
not bother decoding 1 stream if displaying 2D.
(bug https://bugzilla.gnome.org/show_bug.cgi?id=719333)
FIXME - need some use cases indicating what alternate streams in
HLS might require - what are the possibilities?
## Design Overview
Stream selection in GStreamer is implemented in several parts:
1) Objects describing streams : GstStream
2) Objects describing a collection of streams : GstStreamCollection
3) Events from the app allowing selection and activation of some streams:
GST_EVENT_SELECT_STREAMS
4) Messages informing the user/application about the available
streams and current status:
GST_MESSAGE_STREAM_COLLECTION
GST_MESSAGE_STREAMS_SELECTED
## GstStream objects
* API: GstStream
* API: gst_stream_new(..)
* API: gst_stream_get_\*(...)
* API: gst_stream_set_\*()
* API: gst_event_set_stream(...)
* API: gst_event_parse_stream(...)
GstStream objects are high-level convenience objects containing
information regarding a possible data stream that can be exposed by
GStreamer elements.
They are mostly the aggregation of information present in other
GStreamer components (STREAM_START, CAPS, TAGS event) but are not
tied to the presence of a GstPad, and for some use-cases provide
information that the existing components don't provide.
The various properties of a GstStream object are:
- stream_id (from the STREAM_START event)
- flags (from the STREAM_START event)
- caps
- tags
- type (high-level type of stream: Audio, Video, Container,...)
GstStream objects can be subclassed so that they can be re-used by
elements already using the notion of stream (which is common for
example in demuxers).
Elements that create a GstStream should also set it on the
GST_EVENT_STREAM_START event of the relevant pad. This helps
downstream elements to have all information in one location.
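A minimal sketch of what this could look like in a demuxer (the
stream-id, caps and `srcpad` are made up for the example):

``` c
GstCaps *caps = gst_caps_new_empty_simple ("audio/mpeg");
GstStream *stream = gst_stream_new ("demux-stream-audio-1", caps,
    GST_STREAM_TYPE_AUDIO, GST_STREAM_FLAG_NONE);
GstEvent *event = gst_event_new_stream_start ("demux-stream-audio-1");

/* attach the GstStream to the STREAM_START event of the pad */
gst_event_set_stream (event, stream);
gst_pad_push_event (srcpad, event);

gst_caps_unref (caps);
gst_object_unref (stream);
```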
## Exposing collections of streams
* API: GstStreamCollection
* API: gst_stream_collection_new(...)
* API: gst_stream_collection_add_stream(...)
* API: gst_stream_collection_get_size(...)
* API: gst_stream_collection_get_stream(...)
* API: GST_MESSAGE_STREAM_COLLECTION
* API: gst_message_new_stream_collection(...)
* API: gst_message_parse_stream_collection(...)
* API: GST_EVENT_STREAM_COLLECTION
* API: gst_event_new_stream_collection(...)
* API: gst_event_parse_stream_collection(...)
Elements that create new streams (such as demuxers) or can create
new streams (like the HLS/DASH alternative streams) can list the
streams they can make available with the GstStreamCollection object.
Other elements that might generate GstStreamCollections are the
DVD-VM, which handles internal switching of tracks, or parsebin and
decodebin3 when it aggregates and presents multiple internal stream
sources as a single configurable collection.
The GstStreamCollection object is a flat listing of GstStream objects.
The various properties of a GstStreamCollection are:
- 'identifier'
- the identifier of the collection (unique name)
- Generated from the 'upstream stream id' (or stream ids, plural)
- the list of GstStreams in the collection.
- (Not implemented) : Flags -
For now, the only flag is 'INFORMATIONAL' - used by container parsers to
publish information about detected streams without allowing selection of
the streams.
- (Not implemented yet) : The relationship between the various streams
This specifies which streams are exclusive (can not be selected at the
same time), are related (such as LINKED_VIEW or ENHANCEMENT), or need to
be selected together.
An element will inform outside components about that collection via:
* a GST_MESSAGE_STREAM_COLLECTION message on the bus.
* a GST_EVENT_STREAM_COLLECTION event on each source pad.
Applications and container bin elements can listen and collect the
various stream collections to know the full range of streams
available within a bin/pipeline.
Once posted on the bus, a GstStreamCollection is immutable; it is
superseded by subsequent messages with a matching identifier.
If the element that provided the collection goes away, there is no way
to know that the streams are no longer valid (without having the
user/app track that element). The exception to that is if the bin
containing that element (such as parsebin or decodebin3) informs that
the next collection is a replacement of the former one.
The mutual exclusion and relationship lists use stream-ids
rather than GstStream references in order to avoid circular
referencing problems.
### Usage from elements
When a demuxer knows the list of streams it can expose, it
creates a new GstStream for each stream it can provide with the
appropriate information (stream id, flag, tags, caps, ...).
The demuxer then creates a GstStreamCollection object in which it
will put the list of GstStream it can expose. That collection is
then both posted on the bus (via a GST_MESSAGE_STREAM_COLLECTION) and on
each pad (via a GST_EVENT_STREAM_COLLECTION).
That new collection must be posted on the bus *before* the changes
are made available, i.e. before pads corresponding to that selection
are added/removed.
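A sketch of that flow, assuming `demux`, its `srcpad` and the GstStream
objects `audio_stream` and `video_stream` already exist
(gst_stream_collection_add_stream() takes ownership of the stream, hence
the extra refs):

``` c
GstStreamCollection *collection;

collection = gst_stream_collection_new ("upstream-stream-id");
gst_stream_collection_add_stream (collection,
    (GstStream *) gst_object_ref (audio_stream));
gst_stream_collection_add_stream (collection,
    (GstStream *) gst_object_ref (video_stream));

/* post on the bus before adding/removing pads ... */
gst_element_post_message (GST_ELEMENT (demux),
    gst_message_new_stream_collection (GST_OBJECT (demux), collection));

/* ... and send on each source pad */
gst_pad_push_event (srcpad, gst_event_new_stream_collection (collection));
```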
In order to be backwards-compatible and support elements that don't
create streams/collection yet, the new 'parsebin' element used by
decodebin3 will automatically create those if not provided.
### Usage from application
Applications can know what streams are available by listening to the
GST_MESSAGE_STREAM_COLLECTION messages posted on the bus.
The application can list the available streams per-type (such as all
the audio streams, or all the video streams) by iterating the
streams available in the collection by GST_STREAM_TYPE.
The application will also be able to use this stream information to
decide which streams should be activated or not (see the stream
selection event below).
### Backwards compatibility
Not all demuxers will create the various GstStream and
GstStreamCollection objects. In order to remain backwards
compatible, a parent bin (parsebin in decodebin3) will create the
GstStream and GstStreamCollection based on the pads being
added/removed from an element.
This allows providing stream listing/selection for any demuxer-like
element even if it doesn't implement the GstStreamCollection usage.
## Stream selection event
* API: GST_EVENT_SELECT_STREAMS
* API: gst_event_new_select_streams(...)
* API: gst_event_parse_select_streams(...)
Stream selection events are generated by the application and
sent into the pipeline to configure the streams.
The event carries:
* List of GstStreams to activate - a subset of the GstStreamCollection
* (Not implemented) - List of GstStreams to be kept discarded - a
subset of streams for which hot-swapping will not be desired,
allowing elements (such as decodebin3, demuxers, ...) to not parse or
buffer those streams at all.
### Usage from application
There are two use-cases where an application needs to specify in a
generic fashion which streams it wants in output:
1) When there are several streams present of which it only wants a
subset (such as one audio, one video and one subtitle
stream). Those streams are demuxed and present in the pipeline.
2) When the stream the user wants requires some element to undertake
some action to expose that stream in the pipeline (such as
DASH/HLS alternative streams).
From the point of view of the application, those two use-cases are
treated identically. The streams are all available through the
GstStreamCollection posted on the bus, and it will select a subset.
The application can select the streams it wants by creating a
GST_EVENT_SELECT_STREAMS event with the list of stream-ids of the
streams it wants. That event is then sent on the pipeline,
eventually traveling all the way upstream from each sink.
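For instance (a sketch; the stream-ids would be taken from a previously
received GstStreamCollection):

``` c
GList *streams = NULL;

streams = g_list_append (streams, (gpointer) "video-main");
streams = g_list_append (streams, (gpointer) "audio-french");

/* the event makes its own copy of the list contents */
gst_element_send_event (pipeline,
    gst_event_new_select_streams (streams));
g_list_free (streams);
```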
In some cases, selecting one stream may trigger the availability of
other dependent streams, resulting in new GstStreamCollection
messages. This can happen in the case where choosing a different DVB
channel would create a new single-program collection.
### Usage in elements
Elements that receive the GST_EVENT_SELECT_STREAMS event and that
can activate/deactivate streams need to look at the list of
stream-id contained in the event and decide if they need to do some
action.
In the standard demuxer case (demuxing and exposing all streams),
there is nothing to do by default.
In decodebin3, activating or deactivating streams is taken care of by
linking only the streams present in the event to decoders and output
ghostpads.
In the case of elements that can expose alternate streams that are
not present in the pipeline as pads, they will take the appropriate
action to add/remove those streams.
Containers that receive the event should pass it to any elements
with no downstream peers, so that streams can be configured during
pre-roll before a pipeline is completely linked down to sinks.
## decodebin3 usage and example
This is an example of how decodebin3 works by using the
above-mentioned objects/events/messages.
For clarity/completeness, we will consider an mpeg-ts stream that has
multiple audio streams. Furthermore, that stream might have changes
at some point (switching video codec, or adding/removing audio
streams).
### Initial differences
decodebin3 is different, compared to decodebin2, in the sense that, by
default:
* it will only expose one stream of each type (one audio, one video,
  ...) as output ghost source pads.
* It will only decode the exposed streams
The multiqueue element is still used and takes in all elementary
(non-decoded) streams. If parsers are needed/present they are placed
before the multiqueue. This is needed in order for multiqueue to
work only with packetized and properly timestamped streams.
Note that the whole typefinding of streams, and optional depayloading,
demuxing and parsing are done in a new 'parsebin' element.
Just like the current implementation, demuxers will expose all
streams present within a program as source pads. They will connect
to parsers and multiqueue.
Initial setup: 1 video stream, 2 audio streams.
```
+-----------------------+
| parsebin              |
|  -------              |   +------------+
| | demux |--[parser]---+---| multiqueue |--[videodec]--[ghostpad]
| |       |--[parser]---+---|            |
| |       |--[parser]---+---|            |--[audiodec]--[ghostpad]
|  -------              |   +------------+
+-----------------------+
```
### GstStreamCollection
When parsing the initial PAT/PMT, the demuxer will:
1) create the various GstStream objects for each stream.
2) create the GstStreamCollection for that initial PMT
3) post the GST_MESSAGE_STREAM_COLLECTION. Decodebin will intercept that
   message and know what the demuxer will be exposing.
4) The demuxer creates the various pads and sends the corresponding
STREAM_START event (with the same stream-id as the corresponding
GstStream objects), CAPS event, and TAGS event.
* parsebin will add all relevant parsers and expose those streams.
* Decodebin will be able to correlate, based on the STREAM_START event
  stream-id, which pad corresponds to which stream. It links each stream
from parsebin to multiqueue.
* Decodebin knows all the streams that will be available. Since by
default it is configured to only expose a stream of each type, it
will pick a stream of each type for which it will complete the
auto-plugging (finding a decoder and then exposing that stream as a
source ghostpad).
> Note: If the demuxer doesn't create/post the GstStreamCollection,
> parsebin will create it itself, as explained in section 2.3
> above.
### Changing the active selection from the application
The user wants to change the audio track. The application received
the GST_MESSAGE_STREAM_COLLECTION containing the list of available
streams. For clarity, we will assume those stream-ids are
"video-main", "audio-english" and "audio-french".
The user prefers to use the French soundtrack (which it knows based
on the language tag contained in the GstStream objects).
The application will create and send a GST_EVENT_SELECT_STREAMS event
containing the list of streams: "video-main", "audio-french".
That event gets sent on the pipeline; the sinks send it upstream and
it eventually reaches decodebin.
Decodebin compares:
* The currently active selection ("video-main", "audio-english")
* The available stream collection ("video-main", "audio-english",
"audio-french")
* The list of streams in the event ("video-main", "audio-french")
Decodebin determines that no change is required for "video-main",
but sees that it needs to deactivate "audio-english" and activate
"audio-french".
It unlinks the multiqueue source pad connected to the audiodec. Then
it queries audiodec, using GST_QUERY_ACCEPT_CAPS, whether it can
accept the caps from the "audio-french" stream as-is.
1) If it does, the multiqueue source pad corresponding to
"audio-french" is linked to the decoder.
2) If it does not, the existing audio decoder is removed,
a new decoder is selected (like during initial
auto-plugging), and replaces the old audio decoder element.
The newly selected stream gets decoded and output through the same
pad as the previous audio stream.
Note:

The default behaviour would be to only expose one stream of each
type. But nothing prevents decodebin from outputting more/fewer of
each type if the GST_EVENT_SELECT_STREAMS event specifies that. This
allows covering more use-cases than the simple playback one.

Such examples could be:

* Wanting just a video stream or just an audio stream
* Wanting all decoded streams
* Wanting all audio streams
...
### Changes coming from upstream
At some point in time, a PMT change happens. Let's assume a change
in video-codec and/or PID.
The demuxer creates a new GstStream for the changed/new stream,
creates a new GstStreamCollection for the updated PMT and posts it.
Decodebin sees the new GstStreamCollection message.
The demuxer (and parsebin) then adds and removes pads.
1) decodebin will match the new pads to GstStream in the "new"
GstStreamCollection the same way it did for the initial pads in
section 4.2 above.
2) decodebin will see whether the new stream can re-use a multiqueue
slot used by a stream of the same type no longer present (it
compares the old collection to the new collection).
In this case, decodebin sees that the new video stream can re-use
the same slot as the previous video stream.
3) If the new stream is going to be active by default (in this case
   it is, because we are replacing the only video stream, which was
active), it will check whether the caps are compatible with the
existing videodec (in the same way it was done for the audio
decoder switch in section 4.3).
Eventually, the stream that switched will be decoded and output
through the same pad as the previous video stream in a gapless fashion.
### Further examples
#### HLS alternates
There is a main (multi-bitrate or not) stream with audio and
video interleaved in mpeg-ts. The manifest also indicates the
presence of alternate language audio-only streams.
HLS would expose one collection containing:
1) The main A+V CONTAINER stream (mpeg-ts), initially active,
downloaded and exposed as a pad
2) The alternate A-only streams, initially inactive and not exposed as pads
The tsdemux element connected to the first stream will also expose
a collection containing:

1.1) A video stream
1.2) An audio stream
```
[ Collection 1 ]         [ Collection 2 ]
[  (hlsdemux)  ]         [  (tsdemux)   ]
[ upstream:nil ]    /----[ upstream:main]
[              ]   /     [              ]
[ "main" (A+V) ]<-/      [ "video" (V)  ]   viddec1 : "video"
[ "fre"  (A)   ]         [ "eng"  (A)   ]   auddec1 : "eng"
[ "kor"  (A)   ]         [              ]
```
The user might want to use the korean audio track instead of the
default english one.
=> SELECT_STREAMS ("video", "kor")
1) decodebin3 receives and sends the event further upstream
2) tsdemux sees that "video" is part of its current upstream,
so adds the corresponding stream-id ("main") to the event
and sends it upstream ("main", "video", "kor")
3) hlsdemux receives the event
=> It activates "kor" in addition to "main"
4) The event travels back to decodebin3 which will remember the
requested selection. If "kor" is already present it will switch
the "eng" stream from the audio decoder to the "kor" stream.
If it appears a bit later, it will wait until that "kor" stream
is available before switching
#### multi-program MPEG-TS
Assuming the case of an mpeg-ts stream which contains multiple
programs.
There would be three "levels" of collection:
1) The collection of programs present in the stream
2) The collection of elementary streams present in a stream
3) The collection of streams decodebin can expose
Initially tsdemux exposes the first program present (default)
```
[ Collection 1 ]       [ Collection 2      ]       [ Collection 3      ]
[  (tsdemux)   ]       [  (tsdemux)        ]       [  (decodebin)      ]
[ id:Programs  ]<-\    [ id:BBC1           ]<-\    [ id:BBC1-decoded   ]
[ upstream:nil ]   \---[ upstream:Programs ]   \---[ upstream:BBC1     ]
[              ]       [                   ]       [                   ]
[ "BBC1" (C)   ]       [ id:"bbcvideo"(V)  ]       [ id:"bbcvideo"(V)  ]
[ "ITV"  (C)   ]       [ id:"bbcaudio"(A)  ]       [ id:"bbcaudio"(A)  ]
[ "NBC"  (C)   ]       [                   ]       [                   ]
```
At some point the user wants to switch to ITV (of which we do not
know the topology at this point in time). A SELECT_STREAMS event
is sent with "ITV" in it and a pointer to Collection 1.
1) The event travels up the pipeline until tsdemux receives it
and begins the switch.
2) tsdemux publishes a new 'Collection 2a/ITV' and marks 'Collection 2/BBC'
as replaced.
2a) App may send a SELECT_STREAMS event configuring which demuxer output
streams should be selected (parsed)
3) tsdemux adds/removes pads as needed (flushing pads as it removes them?)
4) Decodebin feeds new pad streams through existing parsers/decoders as
needed. As data from the new collection arrives out of each decoder,
decodebin sends new GstStreamCollection messages to the app so it
can know that the new streams are now switchable at that level.
4a) As new GstStreamCollections are published, the app may override
the default decodebin stream selection to expose more/fewer streams.
The default is to decode and output 1 stream of each type.
Final state:
```
[ Collection 1 ]       [ Collection 4      ]       [ Collection 5      ]
[  (tsdemux)   ]       [  (tsdemux)        ]       [  (decodebin)      ]
[ id:Programs  ]<-\    [ id:ITV            ]<-\    [ id:ITV-decoded    ]
[ upstream:nil ]   \---[ upstream:Programs ]   \---[ upstream:ITV      ]
[              ]       [                   ]       [                   ]
[ "BBC1" (C)   ]       [ id:"itvvideo"(V)  ]       [ id:"itvvideo"(V)  ]
[ "ITV"  (C)   ]       [ id:"itvaudio"(A)  ]       [ id:"itvaudio"(A)  ]
[ "NBC"  (C)   ]       [                   ]       [                   ]
```
### TODO
- Add missing implementation
- Add flags to GstStreamCollection
- Add mutual-exclusion and relationship API to GstStreamCollection
- Add helper API to figure out whether a collection is a replacement
of another or a completely new one. This will require a more generic
system to know whether a certain stream-id is a replacement of
another or not.
### OPEN QUESTIONS
- Is a FLUSHING flag for stream-selection required or not? This would
  make the handler of the SELECT\_STREAMS event send FLUSH START/STOP
  before switching to the other streams. This is tricky when dealing
  with situations where we keep some streams and only switch some
  others. Do we flush all streams? Do we only flush the new streams,
  potentially resulting in a delay to fully switch? Furthermore, due to
  efficient buffering in decodebin3, the switching time has been
  minimized extensively, to the point where flushing might not bring a
  noticeable improvement.
- Store the stream collection in bins/pipelines? A Bin/Pipeline could
  store all active collections internally, so that they could be queried
  later on. This could be useful to then get, on any pipeline, at any
  point in time, the full list of collections available without having
  to listen to all COLLECTION messages on the bus. This would require
  fixing the "is a collection a replacement or not" issue first.
- When switching to new collections, should decodebin3 make any effort
  to *map* corresponding streams from the old to the new PMT - that is,
  try and stick to the *English* language audio track, for example?
  Alternatively, rely on the app to do such smarts with stream-select
  messages?
@@ -0,0 +1,106 @@
# Stream Status
This document describes the design and use cases for the stream status
messages.
STREAM_STATUS messages are posted on the bus when the state of a
streaming thread changes. The purpose of this message is to allow the
application to interact with the streaming thread properties, such as
the thread priority or the threadpool to use.
We accommodate the following requirements:
- Application is informed when a streaming thread is about to be
created. It should be possible for the application to suggest a
custom GstTaskPool.
- Application is informed when the status of a streaming thread is
changed. This can be interesting for GUI applications that want to
visualize the status of the streaming threads
(playing/paused/stopped)
- Application is informed when a streaming thread is destroyed.
We allow for the following scenarios:
- Elements require a specific (internal) streaming thread to operate
or the application can create/specify a thread for the element.
- Elements allow the application to configure a priority on the
threads.
## Use cases
- boost the priority of the udp receiver streaming thread
```
.--------.    .-------.    .------.    .-------.
| udpsrc |    | depay |    | adec |    | asink |
|      src->sink    src->sink   src->sink     |
'--------'    '-------'    '------'    '-------'
```
- when going from READY to PAUSED state, udpsrc will require a
streaming thread for pushing data into the depayloader. It will
post a STREAM_STATUS message indicating its requirement for a
streaming thread.
- The application will usually react to the STREAM_STATUS
messages with a sync bus handler.
- The application can configure the GstTask with a custom
GstTaskPool to manage the streaming thread or it can ignore the
message which will make the element use its default GstTaskPool.
- The application can react to the ENTER/LEAVE stream status
message to configure the thread right before it is
started/stopped. This can be used to configure the thread
priority.
- Before the GstTask is changed state (start/pause/stop) a
STREAM_STATUS message is posted that can be used by the
application to keep track of the running streaming threads.
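A sketch of such a sync bus handler; the setpriority() call is only an
example of a platform-specific tuning that can be done from the
streaming thread itself, since the ENTER message is posted from that
thread:

``` c
#include <gst/gst.h>
#include <sys/resource.h>

static GstBusSyncReply
bus_sync_handler (GstBus * bus, GstMessage * message, gpointer user_data)
{
  if (GST_MESSAGE_TYPE (message) == GST_MESSAGE_STREAM_STATUS) {
    GstStreamStatusType type;
    GstElement *owner;

    gst_message_parse_stream_status (message, &type, &owner);
    if (type == GST_STREAM_STATUS_TYPE_ENTER) {
      /* we are called from the streaming thread here, so thread
       * scheduling calls affect the right thread */
      setpriority (PRIO_PROCESS, 0, -10);
    }
  }
  return GST_BUS_PASS;
}

/* installed on the pipeline bus with:
 *   gst_bus_set_sync_handler (bus, bus_sync_handler, NULL, NULL);
 */
```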
## Messages
The existing STREAM_STATUS message will be further defined and implemented in
(selected) elements. The following fields will be contained in the message:
- **`type`**, GST_TYPE_STREAM_STATUS_TYPE:
  - a set of types to control the lifecycle of the thread:
    - GST_STREAM_STATUS_TYPE_CREATE: a new streaming thread is going
      to be created. The application has the chance to configure a
      custom thread.
    - GST_STREAM_STATUS_TYPE_ENTER: the streaming thread is about to
      enter its loop function for the first time.
    - GST_STREAM_STATUS_TYPE_LEAVE: the streaming thread is about to
      leave its loop.
    - GST_STREAM_STATUS_TYPE_DESTROY: a streaming thread is destroyed.
  - a set of types to control the state of the threads:
    - GST_STREAM_STATUS_TYPE_START: a streaming thread is started
    - GST_STREAM_STATUS_TYPE_PAUSE: a streaming thread is paused
    - GST_STREAM_STATUS_TYPE_STOP: a streaming thread is stopped
- **`owner`**: GST_TYPE_ELEMENT: The owner element of the thread. The
message source will contain the pad (or one of the pads) that will
produce data in this thread. If this thread does not produce data on
a pad, the message source will contain the owner as well. The idea
is that the application should be able to see from the element/pad
what function this thread has in the context of the application and
configure the thread appropriately.
- **`object`**: G_TYPE, GstTask/GThread: A GstTask/GThread controlling
this streaming thread.
- **`flow-return`**: GstFlowReturn: A status code for why the thread state
changed. When threads are created and started, this is usually
GST_FLOW_OK, but when they are stopping it contains the reason code
why it stopped.
- **`reason`**: G_TYPE_STRING: A string describing the reason why the
thread started/stopped/paused. Can be NULL if no reason is given.
## Events
FIXME
@@ -0,0 +1,82 @@
# Streams
This document describes the objects that are passed from element to
element in the streaming thread.
## Stream objects
The following objects are to be expected in the streaming thread:
- events
  - STREAM_START (START)
  - SEGMENT (SEGMENT)
  - EOS * (EOS)
  - TAG (T)
- buffers * (B)
Objects marked with * need to be synchronised to the clock in sinks and
live sources.
## Typical stream
A typical stream starts with a stream start event that marks the
start of the stream, followed by a segment event that marks the
buffer timestamp range. After that buffers are sent one after the
other. After the last buffer an EOS marks the end of the stream. No
more buffers are to be processed after the EOS event.
```
+-----+-------+ +-++-+ +-+ +---+
|START|SEGMENT| |B||B| ... |B| |EOS|
+-----+-------+ +-++-+ +-+ +---+
```
1) **`STREAM_START`**
- marks the start of a stream; unlike the SEGMENT event, there
will be no STREAM_START event after flushing seeks.
2) **`SEGMENT`**, rate, start/stop, time
- marks valid buffer timestamp range (start, stop)
- marks stream_time of buffers (time). This is the stream time of buffers
with a timestamp of S.start.
- marks playback rate (rate). This is the required playback rate.
- marks applied rate (applied_rate). This is the already applied playback
rate. (See also [trickmodes](design/trickmodes.md))
- marks running_time of buffers. This is the time used to synchronize
against the clock.
3) **N buffers**
- displayable buffers are between start/stop of the SEGMENT (S). Buffers
outside the segment range should be dropped or clipped.
- running_time:
```
if (S.rate > 0.0)
running_time = (B.timestamp - S.start) / ABS (S.rate) + S.base
else
running_time = (S.stop - B.timestamp) / ABS (S.rate) + S.base
```
- a monotonically increasing value that can be used to synchronize
against the clock (See also
[synchronisation](design/synchronisation.md)).
- stream_time:
* current position in stream between 0 and duration.
```
stream_time = (B.timestamp - S.start) * ABS (S.applied_rate) + S.time
```
4) **`EOS`**
- marks the end of data, nothing is to be expected after EOS, elements
should refuse more data and return GST_FLOW_EOS. A FLUSH_STOP
event clears the EOS state of an element.
## Elements
These events are generated typically either by the GstBaseSrc class for
sources operating in push mode, or by a parser/demuxer operating in
pull-mode and pushing parsed/demuxed data downstream.
@@ -0,0 +1,271 @@
# Synchronisation
This document outlines the techniques used for doing synchronised
playback of multiple streams.
Synchronisation in a GstPipeline is achieved using the following 3
components:
- a GstClock, which is global for all elements in a GstPipeline.
- Timestamps on a GstBuffer.
- the SEGMENT event preceding the buffers.
## A GstClock
This object provides a counter that represents the current time in
nanoseconds. This value is called the absolute\_time.
Different sources exist for this counter:
- the system time (with g\_get\_current\_time() and with microsecond
accuracy)
- monotonic time (with g\_get\_monotonic\_time () with microsecond
accuracy)
- an audio device (based on number of samples played)
- a network source based on packets received + timestamps in those
packets (a typical example is an RTP source)
- …
In GStreamer any element can provide a GstClock object that can be used
in the pipeline. The GstPipeline object will select a clock from all the
providers and will distribute it to all other elements (see
[gstpipeline](design/gstpipeline.md)).
A GstClock always counts time upwards and does not necessarily start at
0.
While it is possible, it is not recommended to create a clock derived
from the contents of a stream (for example, create a clock from the PCR
in an mpeg-ts stream).
## Running time
After a pipeline selected a clock it will maintain the running\_time
based on the selected clock. This running\_time represents the total
time spent in the PLAYING state and is calculated as follows:
- If the pipeline is NULL/READY, the running\_time is undefined.
- In PAUSED, the running\_time remains at the time when it was last
PAUSED. When the stream is PAUSED for the first time, the
running\_time is 0.
- In PLAYING, the running\_time is the delta between the
absolute\_time and the base time. The base time is defined as the
absolute\_time minus the running\_time at the time when the pipeline
is set to PLAYING.
- after a flushing seek, the running\_time is set to 0 (see
[seeking](design/seeking.md)). This is accomplished by redistributing a new
base\_time to the elements that got flushed.
This algorithm captures the running\_time when the pipeline is set from
PLAYING to PAUSED and restores this time based on the current
absolute\_time when going back to PLAYING. This allows for both clocks
that progress when in the PAUSED state (systemclock) and clocks that
don't (audioclock).
The clock and pipeline now provide a running\_time to all elements that
want to perform synchronisation. Indeed, the running time can be
observed in each element (during the PLAYING state) as:
```
C.running_time = absolute_time - base_time
```
We note C.running\_time as the running\_time obtained by looking at the
clock. This value is monotonically increasing at the rate of the clock.
## Timestamps
The GstBuffer timestamps and the preceding SEGMENT event (See
[streams](design/streams.md)) define a transformation of the buffer timestamps to
running\_time as follows:
The following notation is used:
**B**: GstBuffer
- B.timestamp = buffer timestamp (GST_BUFFER_PTS or GST_BUFFER_DTS)
**S**: SEGMENT event preceding the buffers.
- S.start: start field in the SEGMENT event. This is the lowest allowed
timestamp.
- S.stop: stop field in the SEGMENT event. This is the highest allowed
  timestamp.
- S.rate: rate field of SEGMENT event. This is the playback rate.
- S.base: a base time for the segment. This is the total elapsed
  running_time of any previous segments.
- S.offset: an offset to apply to S.start or S.stop. This is the amount that
has already been elapsed in the segment.
Valid buffers for synchronisation are those with B.timestamp between
S.start and S.stop (after applying the S.offset). All other buffers
outside this range should be dropped or clipped to these boundaries (see
also [segments](design/segments.md)).
The following transformations to running_time exist:
```
if (S.rate > 0.0)
B.running_time = (B.timestamp - (S.start + S.offset)) / ABS (S.rate) + S.base
=>
B.timestamp = (B.running_time - S.base) * ABS (S.rate) + S.start + S.offset
else
B.running_time = ((S.stop - S.offset) - B.timestamp) / ABS (S.rate) + S.base
=>
B.timestamp = S.stop - S.offset - ((B.running_time - S.base) * ABS (S.rate))
```
We write B.running_time as the running_time obtained from the SEGMENT
event and the buffers of that segment.
The first displayable buffer will yield a value of 0 (since B.timestamp
== S.start, and S.offset and S.base are 0).
For S.rate \> 1.0, the timestamps will be scaled down to increase the
playback rate. Likewise, a rate between 0.0 and 1.0 will slow down
playback.
For negative rates, timestamps are received from S.stop to S.start so
that the first buffer received will be transformed into a B.running\_time
of 0 (B.timestamp == S.stop and S.base == 0).
This makes it so that B.running\_time is always monotonically increasing
starting from 0 with both positive and negative rates.
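In code, these transformations are available as GstSegment helpers; a
sketch, assuming `segment` was filled from the SEGMENT event and
`buffer` is the buffer being handled:

``` c
guint64 running_time, stream_time;

running_time = gst_segment_to_running_time (&segment, GST_FORMAT_TIME,
    GST_BUFFER_PTS (buffer));
stream_time = gst_segment_to_stream_time (&segment, GST_FORMAT_TIME,
    GST_BUFFER_PTS (buffer));
```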
## Synchronisation
As we have seen, we can get a running\_time:
- using the clock and the elements base\_time with:
```
C.running_time = absolute_time - base_time
```
- using the buffer timestamp and the preceding SEGMENT event as (assuming
positive playback rate):
```
B.running_time = (B.timestamp - (S.start + S.offset)) / ABS (S.rate) + S.base
```
We prefix C. and B. before the two running times to note how they were
calculated.
The task of synchronized playback is to make sure that we play a buffer
with B.running\_time at the moment when the clock reaches the same
C.running\_time.
Thus the following must hold:
```
B.running_time = C.running_time
```
expanding:
```
B.running_time = absolute_time - base_time
```
or:
```
absolute_time = B.running_time + base_time
```
The absolute\_time when a buffer with B.running\_time should be played
is noted with B.sync\_time. Thus:
```
B.sync_time = B.running_time + base_time
```
One then waits for the clock to reach B.sync\_time before rendering the
buffer in the sink (See also [clocks](design/clocks.md)).
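A sketch of the corresponding clock wait in a sink, assuming `element`
and the buffer's `running_time` are at hand:

``` c
GstClockTime base_time = gst_element_get_base_time (element);
GstClockTime sync_time = running_time + base_time;
GstClock *clock = gst_element_get_clock (element);
GstClockID id = gst_clock_new_single_shot_id (clock, sync_time);

/* blocks until the clock reaches B.sync_time
 * (or the entry is unscheduled) */
gst_clock_id_wait (id, NULL);
gst_clock_id_unref (id);
gst_object_unref (clock);
```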
For multiple streams this means that buffers with the same running\_time
are to be displayed at the same time.
A demuxer must make sure that the SEGMENT it emits on its output pads
yield the same running\_time for buffers that should be played
synchronized. This usually means sending the same SEGMENT on all pads
and making sure that the synchronized buffers have the same timestamps.
## Stream time
The stream time is also known as the position in the stream and is a
value between 0 and the total duration of the media file.
It is the stream time that is used for:
- report the POSITION query in the pipeline
- the position used in seek events/queries
- the position used to synchronize controller values
Additional fields in the SEGMENT are used:
- S.time: time field in the SEGMENT event. This is the stream-time of
S.start
- S.applied\_rate: The rate already applied to the segment.
Stream time is calculated using the buffer times and the preceding
SEGMENT event as follows:
```
stream_time = (B.timestamp - S.start) * ABS (S.applied_rate) + S.time
=> B.timestamp = (stream_time - S.time) / ABS(S.applied_rate) + S.start
```
For negative rates, B.timestamp will go backwards from S.stop to
S.start, making the stream time go backwards:
```
stream_time = (S.stop - B.timestamp) * ABS(S.applied_rate) + S.time
=> B.timestamp = S.stop - (stream_time - S.time) / ABS(S.applied_rate)
```
In the PLAYING state, it is also possible to use the pipeline clock to
derive the current stream\_time.
Given the two formulas above to match the clock times with buffer
timestamps, we can rewrite the above formula for stream\_time (and
for positive rates).
```
C.running_time = absolute_time - base_time
B.running_time = (B.timestamp - (S.start + S.offset)) / ABS (S.rate) + S.base
=>
(B.timestamp - (S.start + S.offset)) / ABS (S.rate) + S.base = absolute_time - base_time;
=>
(B.timestamp - (S.start + S.offset)) / ABS (S.rate) = absolute_time - base_time - S.base;
=>
(B.timestamp - (S.start + S.offset)) = (absolute_time - base_time - S.base) * ABS (S.rate)
=>
(B.timestamp - S.start) = S.offset + (absolute_time - base_time - S.base) * ABS (S.rate)
filling (B.timestamp - S.start) in the above formula for stream time
=>
stream_time = (S.offset + (absolute_time - base_time - S.base) * ABS (S.rate)) * ABS (S.applied_rate) + S.time
```
This last formula is typically used in sinks to report the current
position in an accurate and efficient way.
Note that the stream time is never used for synchronisation against the
clock.
226 markdown/design/toc.md Normal file
@@ -0,0 +1,226 @@
# Implementing GstToc support in GStreamer elements
## General info about GstToc structure
GstToc introduces a general way to handle chapters within multimedia
formats. GstToc can be represented as a tree structure with arbitrary
hierarchy. A tree item can be either of two types: sequence or
alternative. Sequence types act like a part of the media data, for
example an audio track in a CUE sheet, or a part of a movie. Alternative
types act like some kind of selection to process a different version of
the media content, for example DVD angles. GstToc has one constraint on
the tree structure: it does not allow different entry types on the same
level of the hierarchy, i.e. you shouldn't have editions and chapters
mixed together. Here is an example of a valid TOC:
```
------- TOC -------
/ \
edition1 edition2
| |
-chapter1 -chapter3
-chapter2
```
Here are two editions (alternatives): the first contains two chapters
(sequence type), and the second has only one chapter. And here is an
example of an invalid TOC:
```
------- TOC -------
/ \
edition1 chapter1
|
-chapter1
-chapter2
```
Here you have edition1 and chapter1 mixed on the same level of the
hierarchy, and such a TOC will be considered broken.
GstToc has an *entries* field of GList type which consists of children
items. Each item is of type GstTocEntry. GstToc also has a list of tags
and a GstStructure called *info*. Please use the GstToc.info and
GstTocEntry.info fields this way: create a GstStructure, put all info
related to your element there and put this structure into the *info*
field under the name of your element. Some fields in the *info*
structure can be used for internal purposes, so you should use it in the
way described above so as not to overwrite already existing fields.
Let's look at GstTocEntry a bit closer. One of the most important fields
is *uid*, which must be unique for each item within the TOC. This is
used to identify each item inside the TOC, especially when an element
receives a TOC select event with a UID to seek to. The *subentries* field
of type GList contains children items of type GstTocEntry. Thus you can
achieve an arbitrary hierarchy depth. The *type* field can be either
GST\_TOC\_ENTRY\_TYPE\_CHAPTER or GST\_TOC\_ENTRY\_TYPE\_EDITION, which
corresponds to the chapter or edition type of item respectively. The
*tags* field is a list of tags related to the item. And the *info* field
is similar to the GstToc.info described above.
So, a little more about managing GstToc. Use gst\_toc\_new() and
gst\_toc\_unref() to create/free it. GstTocEntry can be created using
gst\_toc\_entry\_new(). While building GstToc you can set start and stop
timestamps for each item using gst\_toc\_entry\_set\_start\_stop() and
loop\_type and repeat\_count using gst\_toc\_entry\_set\_loop(). The
best way to process already created GstToc is to recursively go through
the *entries* and *subentries* fields.
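A sketch of building such a TOC with one edition containing two
chapters (the UIDs are arbitrary; note that in the current core API the
timing setter is named gst_toc_entry_set_start_stop_times()):

``` c
GstToc *toc = gst_toc_new (GST_TOC_SCOPE_GLOBAL);
GstTocEntry *edition = gst_toc_entry_new (GST_TOC_ENTRY_TYPE_EDITION, "00");
GstTocEntry *chap1 = gst_toc_entry_new (GST_TOC_ENTRY_TYPE_CHAPTER, "00.1");
GstTocEntry *chap2 = gst_toc_entry_new (GST_TOC_ENTRY_TYPE_CHAPTER, "00.2");

gst_toc_entry_set_start_stop_times (chap1, 0, 10 * GST_SECOND);
gst_toc_entry_set_start_stop_times (chap2, 10 * GST_SECOND, 20 * GST_SECOND);
gst_toc_entry_append_sub_entry (edition, chap1);  /* takes ownership */
gst_toc_entry_append_sub_entry (edition, chap2);
gst_toc_append_entry (toc, edition);              /* takes ownership */
```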
Applications and plugins should not rely on TOCs having a certain kind
of structure, but should allow for different alternatives. For example,
a simple CUE sheet embedded in a file may be presented as a flat list of
track entries, or could have a top-level edition node (or some other
alternative type entry) with track entries underneath that node; or even
multiple top-level edition nodes (or some other alternative type
entries) each with track entries underneath, in case the source file has
extracted a track listing from different sources.
## TOC scope: global and current
There are two main consumers for TOC information: applications and
elements in the pipeline that are TOC writers (such as e.g.
matroskamux).
Applications typically want to know the entire table of contents (TOC)
with all entries that can possibly be selected.
TOC writers in the pipeline, however, would not want to write a TOC for
all possible/available streams, but only for the current stream.
When transcoding a title from a DVD, for example, the application would
still want to know the entire TOC, with all titles, the chapters for
each title, and the available angles. When transcoding to a file, we
only want the TOC information that is relevant to the transcoded stream
to be written into the file structure, e.g. the chapters of the title
being transcoded (or possibly only chapters 5-7 if only those have been
selected for playback/transcoding).
This is why we may need to create two different TOCs for those two types
of consumers.
Elements that extract TOC information should send TOC events downstream.
Like with tags, sinks will post a TOC message on the bus for the
application with the global TOC, once a global TOC event reaches the
sink.
## Working with GstMessage
If a table of contents is available, applications will receive a TOC
message on the pipeline's GstBus.
A TOC message will be posted on the bus by sinks when they receive a TOC
event containing a TOC with global scope. Elements extracting TOCs
should not post a TOC message themselves, but send a TOC event
downstream.
The reason for this is that there may be cascades of TOCs (e.g. a zip
archive containing multiple matroska files, each with a TOC).
A GstMessage with a GstToc can be created using gst\_message\_new\_toc() and
parsed with gst\_message\_parse\_toc(). The *updated* parameter in these
methods indicates whether the TOC was just discovered (set to false) or
an already-found TOC has been updated (set to true). This message
will typically be posted by sinks to the pipeline when TOC data has been
discovered within an element.
## Working with GstEvent
There are two types of TOC-related events:
- downstream TOC events that contain TOC information and travel
downstream
- toc-select events that travel upstream and can be used to select a
certain TOC entry for playback (similar to seek events)
GstToc supports the select event through the GstEvent infrastructure.
The idea is the following: when you receive a TOC select event, parse it
with gst\_event\_parse\_toc\_select() and, if the stream is seekable,
seek to the specified TOC UID (gst\_toc\_find\_entry() can be used to
find an entry in the TOC by UID). To create a TOC select event, use
gst\_event\_new\_toc\_select(). The common action on such an event is to
seek to the specified UID within your element.
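A rough sketch of handling such an event inside an element (*MyElement*,
its *toc* field and the my\_element\_seek\_to() helper are hypothetical):

``` c
/* a sketch: resolve a toc-select event to a time position and seek there */
static gboolean
handle_toc_select (MyElement * self, GstEvent * event)
{
  gchar *uid = NULL;
  GstTocEntry *entry;
  gint64 start, stop;
  gboolean ret = FALSE;

  gst_event_parse_toc_select (event, &uid);
  entry = gst_toc_find_entry (self->toc, uid);

  if (entry != NULL && gst_toc_entry_get_start_stop_times (entry, &start, &stop))
    ret = my_element_seek_to (self, start);     /* hypothetical helper */

  g_free (uid);
  return ret;
}
```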
## Implementation coverage, Specifications, …
Below is a list of container formats, links to documentation and a
summary of TOC-related features. Each section title also indicates
whether reading/writing a TOC is implemented. In the lists below, hollow
bullet points *o* indicate no support and filled bullet points *\**
indicate that the feature is handled.
### AIFC: -/-
<http://www-mmsp.ece.mcgill.ca/Documents/AudioFormats/AIFF/Docs/AIFF-1.3.pdf>
- o *MARK*
- o *INST*
The *MARK* chunk defines a list of (cue-id, position\_in\_samples,
label).
The *INST* chunk contains a sustainLoop and releaseLoop, each consisting
of (loop-type, cue-begin, cue-end)
### FLAC: read/write
<http://xiph.org/flac/format.html#metadata_block_cuesheet>

- \* METADATA\_BLOCK\_CUESHEET
- \* CUESHEET\_TRACK
- o CUESHEET\_TRACK\_INDEX
Both CUESHEET\_TRACK and CUESHEET\_TRACK\_INDEX have a (relative) offset
in samples. CUESHEET\_TRACK has ISRC metadata.
### MKV: read/write
<http://matroska.org/technical/specs/chapters/index.html>

- \* Chapters and Editions, each having a UID
- \* Chapters have start/end time and metadata: ChapString, ChapLanguage,
  ChapCountry
### MP4:

- \* elst
The *elst* atom contains a list of edits. Each edit consists of (length,
start, play-back speed).
### OGG: -/-

<https://wiki.xiph.org/Chapter_Extension>

- o VorbisComment fields called CHAPTERxxx and CHAPTERxxxNAME, with xxx
  being a number between 000 and 999
### WAV: read/write

<http://www.sonicspot.com/guide/wavefiles.html>

- \* *cue*
- o *plst*
- \* *adtl*
- \* *labl*
- \* *note*
- o *ltxt*
- o *smpl*

The *cue* chunk defines a list of markers in the stream with cue-ids.
The *smpl* chunk defines a list of regions in the stream with cue-ids
in the same namespace (?).

The various *adtl* chunks: *labl*, *note* and *ltxt* refer to the
cue-ids.

A *plst* chunk defines a sequence of segments (cue-id, length\_samples,
repeats). The *smpl* chunk defines a list of loops (cue-id, beg, end,
loop-type, repeats).
## Conclusion/Ideas/Future work
Based on the data of chapter 5, here are a few thoughts and observations
that can be used to extend and refine our API. The items below do not
reflect the current implementation.
All formats have a table of \[cue-id, cue-start, (cue-end), (extra tags)\]:

- cue-id is commonly represented as an unsigned 32-bit integer
- cue-end is optional
- extra tags could be represented as a structure/taglist

Many formats have metadata that references the cue-table:

- loops in instruments in wav, aifc
- edit lists in wav, mp4

For mp4.elst and wav.plst we could expose two editions:

1) the edit list is flattened: default, for playback
2) the stream has the raw data and the edit list is there as chapter
   markers: useful for editing software
We might want to introduce a new GST\_TOC\_ENTRY\_TYPE\_MARKER or \_CUE.
This would be a sequence entry-type and it would not be used for
navigational purposes, but to attach data to a point in time (envelopes,
loops, …).
API-wise there is some overlap between:

- exposing multiple audio/video tracks as pads or as TOC editions; for
  TOC editions, we have the TocSelect event
- exposing subtitles as a sparse stream or as a TOC sequence of markers
  with labels
# Tracing
This subsystem will provide a mechanism to get structured tracing info
from GStreamer applications. This can be used for post-run analysis as
well as for live introspection.
# Use cases
- I'd like to get statistics from a running application.
- I'd like to understand which parts of my pipeline use how many
resources.
- I'd like to know which parts of the pipeline use how much memory.
- I'd like to know about ref-counts of parts in the pipeline to find
ref-count issues.
# Non use-cases
- Some element in the pipeline does not play by the rules; find out
which one. This could be done with generic tests.
# Design
The system introduces the following new items:

- core hooks: probes in the core API that expose internal state when
  tracing is in use
- tracers: plugin features that can process data from the hooks and emit
  a log
- tracing front-ends: applications that consume logs from tracers

Like the logging, the tracer hooks can be compiled out and, if they are
not, they use a local condition to check whether tracing is active.
Certain GStreamer core functions (such as gst_pad_push or
gst_element_add_pad) will call into the tracer subsystem to dispatch
into active tracing modules. Developers will be able to select a list of
plugins by setting an environment variable, such as
GST_TRACERS="meminfo;dbus". One can also pass parameters to plugins:
GST_TRACERS="log(events,buffers);stats(all)". When the plugins are
loaded, we'll add them to the hooks they are interested in.
Right now tracing info is logged as GstStructures to the TRACE level.
Idea: Another env var GST_TRACE_CHANNEL could be used to send the
tracing to a file or a socket. See
<https://bugzilla.gnome.org/show_bug.cgi?id=733188> for discussion on
these environment variables.
# Hook api
We'll wrap interesting API calls with two macros, e.g. gst_pad_push():

``` c
GstFlowReturn
gst_pad_push (GstPad * pad, GstBuffer * buffer)
{
  GstFlowReturn res;

  g_return_val_if_fail (GST_IS_PAD (pad), GST_FLOW_ERROR);
  g_return_val_if_fail (GST_PAD_IS_SRC (pad), GST_FLOW_ERROR);
  g_return_val_if_fail (GST_IS_BUFFER (buffer), GST_FLOW_ERROR);

  GST_TRACER_PAD_PUSH_PRE (pad, buffer);
  res = gst_pad_push_data (pad,
      GST_PAD_PROBE_TYPE_BUFFER | GST_PAD_PROBE_TYPE_PUSH, buffer);
  GST_TRACER_PAD_PUSH_POST (pad, res);
  return res;
}
```
TODO(ensonic): gcc has some magic for wrapping functions:

- <http://gcc.gnu.org/onlinedocs/gcc/Constructing-Calls.html>
- <http://www.clifford.at/cfun/gccfeat/#gccfeat05.c>

TODO(ensonic): we should evaluate if we can use something like jump_label
in the kernel:

- <http://lwn.net/Articles/412072/> + <http://lwn.net/Articles/435215/>
- <http://lxr.free-electrons.com/source/kernel/jump_label.c>
- <http://lxr.free-electrons.com/source/include/linux/jump_label.h>
- <http://lxr.free-electrons.com/source/arch/x86/kernel/jump_label.c>

TODO(ensonic): liblttng-ust provides such a mechanism for user-space, but:

- it is mostly about logging traces
- it is linux specific :/
In addition to api hooks we should also provide timer hooks. Interval
timers are useful to get e.g. resource usage snapshots. Also absolute
timers might make sense. All this could be implemented with a clock
thread. We can use another env-var GST_TRACE_TIMERS="100ms,75ms" to
configure timers and then pass them to the tracers like,
GST_TRACERS="rusage(timer=100ms);meminfo(timer=75ms)". Maybe we can
create them ad-hoc and avoid the GST_TRACE_TIMERS var.
Hooks (* already implemented)
* gst_bin_add
* gst_bin_remove
* gst_element_add_pad
* gst_element_post_message
* gst_element_query
* gst_element_remove_pad
* gst_element_factory_make
* gst_pad_link
* gst_pad_pull_range
* gst_pad_push
* gst_pad_push_list
* gst_pad_push_event
* gst_pad_unlink
## Tracer api
Tracers are plugin features. They have a simple API:

*class init*: Here the tracers describe the data they will emit.

*instance init*: Tracers attach handlers to one or more hooks using
gst_tracing_register_hook(). In case they are configurable, they can
read the options from the *params* property. This is the extra detail
from the environment var.

*hook functions*: Hooks marshal the parameters given to a trace hook into
varargs and also add some extra info such as a timestamp. Hooks will be
called from misc threads. The trace plugins should only consume (=read)
the provided data. Expensive computation should be avoided so as not to
affect the execution too much. Most trace plugins will log data to a
trace channel.

*instance destruction*: Tracers can output results and release data. This
would ideally be done at the end of the application, but gst_deinit()
is not mandatory. gst_tracelib was using a gcc destructor. Ideally
tracer modules log data as they have them and leave aggregation to a
tool that processes the log.
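As an illustration, a tracer's instance init might attach to the pad-push
hook roughly like this (a sketch; MyTracer is a hypothetical GstTracer
subclass, and we assume the "pad-push-pre" hook detail and callback
signature of the 1.x tracer subsystem):

``` c
/* a sketch: hook callback; only read the data, avoid expensive work */
static void
do_push_buffer_pre (GObject * self, GstClockTime ts, GstPad * pad,
    GstBuffer * buffer)
{
  GST_TRACE ("%" GST_TIME_FORMAT ": buffer %p on pad %s",
      GST_TIME_ARGS (ts), buffer, GST_OBJECT_NAME (pad));
}

static void
my_tracer_init (MyTracer * self)
{
  gst_tracing_register_hook (GST_TRACER (self), "pad-push-pre",
      G_CALLBACK (do_push_buffer_pre));
}
```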
## tracer event classes
Most tracers will log some kind of *events*: a data transfer, an event,
a message, a query or a measurement. Every tracer should describe the
data format. This way tools that process tracer logs can show the data
in a meaningful way without having to know about the tracer plugin.
One way would be to introspect the data from the plugin. This has the
disadvantage that the postprocessing app needs to load the plugins or
talk to the gstreamer registry. An alternative is to also log the format
description into the log. Right now we're logging several nested
GstStructures from the `tracer_class_init()` function (except in the
log tracer).
``` c
gst_tracer_record_new ("thread-rusage.class",
    // value in the log record (order does not matter)
    // *thread-id* is a *key* to relate the record to something as indicated
    // by the *scope* substructure
    "thread-id", GST_TYPE_STRUCTURE, gst_structure_new ("scope",
        "type", G_TYPE_GTYPE, G_TYPE_GUINT64,
        "related-to", GST_TYPE_TRACER_VALUE_SCOPE, GST_TRACER_VALUE_SCOPE_THREAD,
        NULL),
    // next value in the record
    // *average-cpuload* is a measurement as indicated by the *value*
    // substructure
    "average-cpuload", GST_TYPE_STRUCTURE, gst_structure_new ("value",
        // value type
        "type", G_TYPE_GTYPE, G_TYPE_UINT,
        // human readable description, that can be used as a graph label
        "description", G_TYPE_STRING, "average cpu usage per thread",
        // flags that help to use the right graph type
        // flags { aggregated, windowed, cumulative, … }
        "flags", GST_TYPE_TRACER_VALUE_FLAGS, GST_TRACER_VALUE_FLAGS_AGGREGATED,
        // value range
        "min", G_TYPE_UINT, 0,
        "max", G_TYPE_UINT, 100,
        NULL),
    … NULL);
```
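Entries for such a record are later emitted by passing values in the field
order of the spec; a minimal sketch, assuming the record above was stored
in a static *tr_rusage* variable by the class init:

``` c
/* a sketch: log one entry for the thread-rusage record defined above */
static GstTracerRecord *tr_rusage;      /* set up in class init */

static void
log_thread_rusage (guint64 thread_id, guint avg_cpuload)
{
  gst_tracer_record_log (tr_rusage, thread_id, avg_cpuload);
}
```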
A few ideas that are not yet in the above spec:

- it would be nice to describe the unit of values
  - putting it into the description is not flexible though, e.g. time
    would be a guint64 but a UI would reformat it to e.g. h:m:s.ms
  - other units are e.g.: percent, per-mille, or kbit/s
- we'd like to have some metadata on scopes
  - e.g. we'd like to log the thread-names, so that a UI can show that
    instead of thread-ids
- the stats tracer logs *new-element* and *new-pad* messages
  - they add a unique *ix* to each instance, as the memory ptr can be
    reused for new instances; the data is attached to the objects as
    qdata
  - the latency tracer would like to also reference this metadata
- right now we log the classes as structures
  - this is important so that the log is self contained
  - it would be nice to add them to the registry, so that gst-inspect
    can show them
We could also consider adding each value as a READABLE gobject property.
The property has name/description. We could use qdata for scope and
flags (or have some new property flags). We would also need a new
"notify" signal, so that value-change notifications would include a
time-stamp. This way the tracers would not need to be aware of the
logging. The core tracer would register the notify handlers and emit the
log. Or we just add a gst_tracer_class_install_event() that mimics
g_object_class_install_property().
Frontends can:

- do an events-over-time histogram
- plot curves of values over time or deltas
- show gauges
- collect statistics (min, max, avg, …)
We can have some tracers under gstreamer/plugins/tracers/:
## latency
- register to buffer and event flow
- send custom event on buffer flow at source elements
- catch events on event transfer at sink elements
## meminfo (not yet implemented)
- register to an interval-timer hook.
- call mallinfo() and log memory usage
## rusage
- register to an interval-timer hook.
- call getrusage() and log resource usage
## dbus (not yet implemented)
- provide a dbus iface to announce applications that are traced
- tracing UIs can use the dbus iface to find the channels where logging and
tracing is getting logged to
- one would start the tracing UI first and when the application is started with
  tracing activated, the dbus plugin will announce the new application,
  upon which the tracing UI can start reading from the log channels; this
  avoids missing data
## topology (not yet implemented)
- register to pipeline topology hooks
- tracing UIs can show a live pipeline graph
## stats
- register to buffer, event, message and query flow
- tracing apps can do e.g. statistics
## refcounts (not yet implemented)
- log ref-counts of objects
- just logging them outside of glib/gobject would still make it hard to detect
issues though
## opengl (not yet implemented)
- upload/download times
- there is no hardware-agnostic way to get e.g. memory usage info (gl
extensions)
## memory (not yet implemented)
- trace live instances (and the pointer to the memory)
- use an atexit handler to dump leaked instances (see
  <https://bugzilla.gnome.org/show_bug.cgi?id=756760#c6>)
## leaks
- track creation/destruction of GstObject and GstMiniObject
- log those which are still alive when app is exiting and raise an
error if any
- If the GST_LEAKS_TRACER_SIG env variable is defined the tracer
will handle the following UNIX signals:
- SIGUSR1: log alive objects
- SIGUSR2: create a checkpoint and print a list of objects created and
destroyed since the previous checkpoint.
- If the GST_LEAKS_TRACER_STACK_TRACE env variable is defined, log
the creation stack trace of leaked objects. This may significantly
increase memory consumption.
## gst-debug-viewer
gst-debug-viewer could be given the trace log in addition to the debug
log (or a combined log). Alternatively it would show a dialog that shows
all local apps (if the dbus plugin is loaded) and read the log streams
from the sockets/files that are configured for the app.
## gst-tracer
Counterpart of gst-tracelib-ui.
## gst-stats
A terminal app that shows summary/running stats like the summary
gst-tracelib shows at the end of a run. Currently only shows an
aggregated status.
## live-graphers
Maybe we can even feed the log into existing live graphers, with a
little driver * <https://github.com/dkogan/feedgnuplot>
## Open questions

- should tracers log into the debug.log or into a separate log?
  - separate log:
    - use a binary format?
    - worse performance (we're writing two logs at the same time)
    - need to be careful when people do GST_DEBUG_CHANNEL=stderr and
      GST_TRACE_CHANNEL=stderr (use a shared channel, but what about the
      formats?)
  - debug log:
    - the tracer subsystem would need to log the GST_TRACE at a level
      that is active
    - should the tracer call gst_debug_category_set_threshold() to
      ensure things work, even though the levels don't make a lot of
      sense here
- make logging a tracer (a hook in gst_debug_log_valist, move
  gst_debug_log_default() to the tracer module)
  - log all debug log to the tracer log; some of the current logging
    statements can be replaced by generic logging as shown in the
    log-tracer
  - add tools/gst-debug to extract a human readable debug log from the
    trace log
- we could maintain a list of log functions, where
  gst_tracer_log_trace() is the default one. This way e.g. gst-validate
  could consume the traces directly.
- when hooking into a timer, should we just have some predefined
  intervals?
  - can we add a tracer module that registers the timer hook? Then we
    could do GST_TRACER="timer(10ms);rusage"; right now the tracer hooks
    are defined as an enum though.
- when connecting to a running app, we can't easily get the *current*
  state if logging is using a socket, as past events are not
  explicitly stored; we could determine the current topology and emit
  events with GST_CLOCK_TIME_NONE as ts to indicate that the events
  are synthetic.
- we need stable ids for scopes (threads, elements, pads)
  - the address can be reused
  - we can use gst_util_seqnum_next()
  - something like gst_object_get_path_string() won't work as
    objects are initially without parent
- right now the tracing-hooks are enabled/disabled from configure with
  --{enable,disable}-gst-tracer-hooks. The tracer code and the plugins
  are still built though. We should add a
  --{enable,disable}-gst-tracer to disable the whole system,
  although this is a bit confusing with the --{enable,disable}-trace
  option we have already.
## Try it
### Traces for buffer flow in TRACE level:
GST_DEBUG="GST_TRACER:7,GST_BUFFER*:7,GST_EVENT:7,GST_MESSAGE:7"
GST_TRACERS=log gst-launch-1.0 fakesrc num-buffers=10 ! fakesink -
### Print some pipeline stats on exit:
GST_DEBUG="GST_TRACER:7" GST_TRACERS="stats;rusage"
GST_DEBUG_FILE=trace.log gst-launch-1.0 fakesrc num-buffers=10
sizetype=fixed ! queue ! fakesink && gst-stats-1.0 trace.log
### Get ts, average-cpuload, current-cpuload, time and plot:

```
GST_DEBUG="GST_TRACER:7" GST_TRACERS="stats;rusage" GST_DEBUG_FILE=trace.log \
/usr/bin/gst-play-1.0 $HOME/Videos/movie.mp4 \
&& ./scripts/gst-plot-traces.sh --format=png | gnuplot
eog trace.log.*.png
```
### Print processing latencies:

```
GST_DEBUG="GST_TRACER:7" GST_TRACERS=latency gst-launch-1.0 \
audiotestsrc num-buffers=10 ! audioconvert ! volume volume=0.7 ! \
autoaudiosink
```
### Raise a warning if a leak is detected
GST_TRACERS="leaks" gst-launch-1.0 videotestsrc num-buffers=10 !
fakesink
### Check if any GstEvent or GstMessage is leaked and raise a warning:

```
GST_DEBUG="GST_TRACER:7" GST_TRACERS="leaks(GstEvent,GstMessage)" \
gst-launch-1.0 videotestsrc num-buffers=10 ! fakesink
```
# Performance
Run ./tests/benchmarks/tracing.sh \<tracer(s)\> \<media\> and count the
logged trace records:

```
egrep -c "(proc|thread)-rusage" trace.log
658618
grep -c "gst_tracer_log_trace" trace.log
823351
```
- we can optimize most of it by using quarks in structures or
eventually avoid structures totally
# Trickmodes
GStreamer provides an API for performing various trickmode playback.
This includes:
- server side trickmodes
- client side fast/slow forward playback
- client side fast/slow backwards playback
Server side trickmodes mean that a source (network source) can provide a
stream with different playback speed and direction. The client does not
have to perform any special algorithms to decode this stream.
Client side trickmodes mean that the decoding client (GStreamer)
performs the needed algorithms to change the direction and speed of the
media file.
Seeking can be done both in a playback pipeline and in a transcoding
pipeline.
## General seeking overview
Consider a typical playback pipeline:
```
                         .---------.   .------.
             .-------.   | decoder |-->| sink |
 .--------.  |       |-->'---------'   '------'
 | source |->| demux |
 '--------'  |       |-->.---------.   .------.
             '-------'   | decoder |-->| sink |
                         '---------'   '------'
```
The pipeline is initially configured to play back at speed 1.0 starting
from position 0 and stopping at the total duration of the file.
When performing a seek, the following steps have to be taken by the
application:
### Create a seek event
The seek event contains:
- various flags describing:
- where to seek to (KEY\_UNIT)
- how accurate the seek should be (ACCURATE)
- how to perform the seek (FLUSH)
- what to do when the stop position is reached (SEGMENT).
- extra playback options (SKIP)
- a format to seek in, this can be time, bytes, units (frames,
samples), …
- a playback rate, 1.0 is normal playback speed, positive values
bigger than 1.0 mean fast playback, negative values mean reverse
playback. A playback speed of 0.0 is not allowed (but is equivalent
to PAUSING the pipeline).
- a start position, this value has to be between 0 and the total
duration of the file. It can also be relative to the previously
configured start value.
- a stop position, this value has to be between 0 and the total
duration. It can also be relative to the previously configured stop
value.
See also gst\_event\_new\_seek().
### Send the seek event
Send the new seek event to the pipeline with
gst\_element\_send\_event().
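As a rough sketch, this is how an application could create and send a
flushing key-unit seek to one minute into the stream at double speed (the
seek\_2x() wrapper is just for illustration):

``` c
/* a sketch: flushing key-unit seek to 1:00 at 2x forward speed */
static gboolean
seek_2x (GstElement * pipeline)
{
  GstEvent *event = gst_event_new_seek (2.0, GST_FORMAT_TIME,
      GST_SEEK_FLAG_FLUSH | GST_SEEK_FLAG_KEY_UNIT,
      GST_SEEK_TYPE_SET, 60 * GST_SECOND,   /* start at 1:00 */
      GST_SEEK_TYPE_NONE, -1);              /* keep the configured stop */

  /* gst_element_send_event() takes ownership of the event */
  return gst_element_send_event (pipeline, event);
}
```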
By default the pipeline will send the event to all sink elements. By
default an element will forward the event upstream on all sinkpads.
Elements can modify the format of the seek event. The most common format
is GST\_FORMAT\_TIME.
One element will actually perform the seek, this is usually the demuxer
or source element. For more information on how to perform the different
seek types see [seeking](design/seeking.md).
For client side trickmode a SEGMENT event will be sent downstream with
the new rate and start/stop positions. All elements prepare themselves
to handle the rate (see below). The applied rate of the SEGMENT event
will be set to 1.0 to indicate that no rate adjustment has been done.

For server side trickmode a SEGMENT event is sent downstream with a
rate of 1.0 and the start/stop positions. The elements will configure
themselves for normal playback speed since the server will perform the
rate conversions. The applied rate will be set to the rate that will be
applied by the server. This is done to ensure that the position
reporting performed in the sink is aware of the trick mode.
When the seek succeeds, the \_send\_event() function will return TRUE.
## Server side trickmode
The source element operates in push mode. It can reopen a server
connection requesting a new byte or time position and a new playback
speed. The capabilities can be queried from the server when the
connection is opened.
We assume the source element is derived from the GstPushSrc base class.
The base source should be configured with gst\_base\_src\_set\_format
(src, GST\_FORMAT\_TIME).
The do\_seek method will be called on the push src subclass with the
seek information passed in the GstSegment argument.
The rate value in the segment should be used to reopen the connection to
the server requesting data at the new speed and possibly a new playback
position.
When the server connection was successfully reopened, set the rate of
the segment to 1.0 so that the client side trickmode is not enabled. The
applied rate in the segment is set to the rate transformation done by
the server.
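A sketch of what such a do\_seek implementation might look like
(*MySource* and my\_source\_reopen() are hypothetical; only the segment
handling follows the rules above):

``` c
/* a sketch: do_seek vmethod of a GstPushSrc subclass backed by a server */
static gboolean
my_source_do_seek (GstBaseSrc * src, GstSegment * segment)
{
  MySource *self = MY_SOURCE (src);     /* hypothetical type */

  /* ask the server for the new position and playback rate */
  if (!my_source_reopen (self, segment->start, segment->rate))
    return FALSE;

  /* the server does the rate conversion: record it in applied_rate and
   * reset rate to 1.0 so that client side trickmode stays disabled */
  segment->applied_rate = segment->rate;
  segment->rate = 1.0;

  return TRUE;
}
```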
Alternatively, a combination of client side and server side trickmode
can be used; for example, if the server does not support certain rates,
the client can perform rate conversion for the remainder.
```
         source                 server
do_seek     |                      |
----------->|                      |
            |  reopen connection   |
            |--------------------->|
            |                      .
            |       success        .
            |<---------------------|
modify      |                      |
rate to 1.0 |                      |
            |                      |
return      |                      |
TRUE        |                      |
            |                      |
```
After performing the seek, the source will inform the downstream
elements of the new segment that is to be played back. Since the segment
will have a rate of 1.0, no client side trick modes are enabled. The
segment will have an applied rate different from 1.0 to indicate that
the media contains data with non-standard playback speed or direction.
## client side forward trickmodes
The seek happens as stated above. A SEGMENT event is sent downstream
with a rate different from 1.0. Plugins receiving the SEGMENT can decide
to perform the rate conversion of the media data (retimestamp video
frames, resample audio, …).

If a plugin decides to resample or retimestamp, it should modify the
SEGMENT with a rate of 1.0 and update the applied rate so that
downstream elements don't resample again but are aware that the media
has been modified.
The GStreamer base audio and video sinks will resample automatically if
they receive a SEGMENT event with a rate different from 1.0. The
position reporting in the base audio and video sinks will also depend on
the applied rate of the segment information.
When the SKIP flag is set, frames can be dropped in the elements. If S
is the speedup factor, a good algorithm for implementing frame skipping
is to send audio in chunks of N ms (usually 300 ms is good) and then skip
(S-1) \* N ms of audio data. For the video we send only the keyframes
in the S \* N ms interval. For example, with S = 4 this means playing
300 ms of audio, skipping 900 ms, and sending only the keyframes in each
1200 ms interval. In this case, the demuxer would scale the
timestamps and would set an applied rate of S.
## client side backwards trickmode
For backwards playback the following rules apply:
- the rate in the SEGMENT is less than 0.0.
- the SEGMENT start position is less than the stop position, playback
will however happen from stop to start in reverse.
- the time member in the SEGMENT is set to the stream time of the
start position.
For plugins the following rules apply:
- A source plugin sends data in chunks starting from the last chunk of
the file. The actual bytes are not reversed. Each chunk that is not
forward continuous with the previous chunk is marked with a DISCONT
flag.
- A demuxer accumulates the chunks. As soon as a keyframe is found,
everything starting from the keyframe up to the accumulated data is
sent downstream. Timestamps on the buffers are set starting from the
stop position to start, effectively going backwards. Chunks are
marked with DISCONT when they are not forward continuous with the
previous buffer.
- A video decoder decodes and accumulates all decoded frames. If a
buffer with a DISCONT, SEGMENT or EOS is received, all accumulated
frames are sent downstream in reverse (see the sketch after this list).
- An audio decoder decodes and accumulates all decoded audio. If a
buffer with a DISCONT, SEGMENT or EOS is received, all accumulated
audio is sent downstream in reverse order. Some audio codecs need
the previous data buffer to decode the current one, in that case,
the previous DISCONT buffer needs to be combined with the last
non-DISCONT buffer to generate the last bit of output.
- A sink reverses (for audio) and retimestamps (audio, video) the
buffers before playing them back. Retimestamping occurs relative to
the stop position, making the timestamps increase again and suitable
for synchronizing against the clock. Audio sinks also have to
perform simple resampling before playing the samples.
- For transcoding, audio and video resamplers can be used to reverse,
resample and retimestamp the buffers. Any rate adjustments performed
on the media must be added to the applied\_rate and subtracted from
the rate members in the SEGMENT
event.
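To illustrate the video decoder rule above, flushing the accumulated
frames in reverse might look roughly like this (a sketch; *MyDec*, its
*queued* list and *srcpad* field are hypothetical):

``` c
/* a sketch: push frames accumulated in decode order downstream in reverse */
static GstFlowReturn
my_dec_flush_reverse (MyDec * self)
{
  GstFlowReturn ret = GST_FLOW_OK;

  self->queued = g_list_reverse (self->queued);
  while (self->queued != NULL) {
    GstBuffer *buf = self->queued->data;

    self->queued = g_list_delete_link (self->queued, self->queued);
    if (ret == GST_FLOW_OK)
      ret = gst_pad_push (self->srcpad, buf);
    else
      gst_buffer_unref (buf);   /* drop the rest on error */
  }
  return ret;
}
```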
In SKIP mode, the same algorithm as for forward SKIP mode can be used.
## Notes
- The clock/running\_time keeps running forward.
- backwards playback potentially uses a lot of memory as frames and
undecoded data gets buffered.
splitup.md
licensing.md
rtp.md
design/index.md
design/MT-refcounting.md
design/TODO.md
design/activation.md
design/buffer.md
design/buffering.md
design/bufferpool.md
design/caps.md
design/clocks.md
design/context.md
design/controller.md
design/conventions.md
design/dynamic.md
design/element-sink.md
design/element-source.md
design/element-transform.md
design/events.md
design/framestep.md
design/gstbin.md
design/gstbus.md
design/gstelement.md
design/gstghostpad.md
design/gstobject.md
design/gstpipeline.md
design/draft-klass.md
design/latency.md
design/live-source.md
design/memory.md
design/messages.md
design/meta.md
design/draft-metadata.md
design/miniobject.md
design/missing-plugins.md
design/negotiation.md
design/overview.md
design/preroll.md
design/probes.md
design/progress.md
design/push-pull.md
design/qos.md
design/query.md
design/relations.md
design/scheduling.md
design/seeking.md
design/segments.md
design/seqnums.md
design/sparsestreams.md
design/standards.md
design/states.md
design/stream-selection.md
design/stream-status.md
design/streams.md
design/synchronisation.md
design/draft-tagreading.md
design/toc.md
design/tracing.md
design/trickmodes.md