diff --git a/markdown/design/MT-refcounting.md b/markdown/design/MT-refcounting.md new file mode 100644 index 0000000000..043a3bb233 --- /dev/null +++ b/markdown/design/MT-refcounting.md @@ -0,0 +1,424 @@

# Conventions for a thread safe API

The GStreamer API is designed to be thread safe. This means that API functions
can be called from multiple threads at the same time. GStreamer internally uses
threads to perform the data passing, and various asynchronous services such as
the clock can also use threads.

This design decision has implications for the usage of the API and the objects,
which this document explains.

## MT safety techniques

Several design patterns are used to guarantee object consistency in GStreamer.
This is an overview of the methods used in various GStreamer subsystems.

### Refcounting:

All shared objects have a refcount associated with them. Each reference
obtained to the object should increase the refcount and each reference lost
should decrease the refcount.

The refcounting is used to make sure that when another thread destroys the
object, the ones which still hold a reference to the object do not read from
invalid memory when accessing the object.

Refcounting is also used to ensure that mutable data structures are only
modified when they are owned by the calling code.

It is a requirement that when two threads have a handle on an object, the
refcount must be more than one. This means that when one thread passes an
object to another thread, it must increase the refcount. This requirement
makes sure that one thread cannot suddenly dispose of the object, which would
make the other thread crash when its pointer suddenly refers to invalid
memory.

### Shared data structures and writability:

All objects have a refcount associated with them. Each reference obtained to
the object should increase the refcount and each reference lost should
decrease the refcount.
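The ref/unref contract above can be sketched with a plain C11 atomic counter. This is illustrative only: GStreamer's actual refcounting lives in GstMiniObject/GstObject and differs in detail, and the `MyObject` type here is entirely hypothetical.

``` c
#include <stdatomic.h>
#include <stdlib.h>

/* A minimal refcounted object, for illustration only. */
typedef struct {
  atomic_int refcount;
  void (*dispose) (void *obj);  /* called when the last ref is dropped */
} MyObject;

static MyObject *
my_object_new (void (*dispose) (void *obj))
{
  MyObject *obj = malloc (sizeof (MyObject));
  atomic_init (&obj->refcount, 1);      /* caller owns the initial reference */
  obj->dispose = dispose;
  return obj;
}

/* Each new handle to the object must take a reference first. */
static MyObject *
my_object_ref (MyObject * obj)
{
  atomic_fetch_add (&obj->refcount, 1);
  return obj;
}

/* Dropping the last reference destroys the object. */
static void
my_object_unref (MyObject * obj)
{
  if (atomic_fetch_sub (&obj->refcount, 1) == 1) {
    if (obj->dispose)
      obj->dispose (obj);
    free (obj);
  }
}
```

Because a second thread holds its own reference, the first thread unreffing does not destroy the object; only the last unref does.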
Each thread having a refcount to the object can safely read from the object,
but modifications made to the object should be preceded by a
`_get_writable()` function call. This function will check the refcount of the
object and, if the object is referenced by more than one instance, a copy is
made of the object that is then by definition only referenced from the calling
thread. This new copy is then modifiable without being visible to other
refcount holders.

This technique is used for information objects that, once created, never
change their values. The lifetime of these objects is generally short, and the
objects are usually simple and cheap to copy/create.

The advantage of this method is that no reader/writer locks are needed: all
threads can read concurrently, while writes happen locally on a new copy. In
most cases `_get_writable()` can avoid a real copy because the calling method
is the only one holding a reference, which makes read/write very cheap.

The drawback is that sometimes one needless copy can be made. This happens
when N threads call `_get_writable()` at the same time, all seeing that N
references are held on the object. In this case one copy too many is made.
This is not a problem in any practical situation because the copy operation is
fast.

### Mutable substructures:

Special techniques are necessary to ensure the consistency of compound shared
objects. As mentioned above, shared objects need to have a reference count of
1 if they are to be modified. Implicit in this assumption is that all parts of
the shared object belong only to the object. For example, a GstStructure in
one GstCaps object should not belong to any other GstCaps object. This
condition suggests a parent-child relationship: structures can only be added
to a parent object if they do not already have a parent object.

In addition, these substructures must not be modified while more than one code
segment has a reference on the parent object.
For example, if the user creates
a GstStructure, adds it to a GstCaps, and the GstCaps is then referenced by
other code segments, the GstStructure should then become immutable, so that
changes to that data structure do not affect other parts of the code. This
means that the child is only mutable when the parent's reference count is 1,
or when the child structure has no parent.

The general solution to this problem is to include a field in child structures
pointing to the parent's atomic reference count. When set to NULL, this
indicates that the child has no parent. Otherwise, procedures that modify the
child structure must check if the parent's refcount is 1, and otherwise must
cause an error to be signaled.

Note that this is an internal implementation detail; application or plugin
code that calls `_get_writable()` on an object is guaranteed to receive an
object with a refcount of 1, which must then be writable. The only trick is
that a pointer to a child structure of an object is only valid while the
calling code has a reference on the parent object, because the parent is the
owner of the child.

### Object locking:

For objects that contain state information and generally have a longer
lifetime, object locking is used to update the information contained in the
object.

All readers and writers acquire the lock before accessing the object. Only one
thread is allowed to access the protected structures at a time.

Object locking is used for all objects extending from GstObject, such as
GstElement and GstPad.

Object locking could be done with recursive locks or regular mutexes. Object
locks in GStreamer are implemented with regular mutexes, which deadlock when
locked recursively from the same thread. This is done because regular mutexes
are cheaper than recursive ones.

### Atomic operations

Atomic operations are operations that are performed as one consistent
operation even when executed by multiple threads.
They do not, however, use the
conventional approach of taking mutexes to protect the critical section, but
rely on CPU features and instructions.

The advantages are mostly speed related since there are no heavyweight locks
involved. Most of these instructions also do not cause a context switch in case
of concurrent access but use a retry mechanism or spinlocking.

The disadvantage is that each of these instructions usually causes a cache
flush on multi-CPU machines when two processors perform concurrent access.

Atomic operations are generally used for refcounting and for the allocation of
small fixed-size objects in a memchunk. They can also be used to implement a
lockfree list or stack.

### Compare and swap

As part of the atomic operations, compare-and-swap (CAS) can be used to access
or update a single property or pointer in an object without having to take a
lock.

This technique is currently not used in GStreamer but might be added in the
future in performance critical places.


## Objects

### Locking involved:

- atomic operations for refcounting
- object locking

All objects should have a lock associated with them. This lock is used to keep
internal consistency when multiple threads call API functions on the object.

For objects that extend the GStreamer base object class this lock can be
obtained with the macros `GST_OBJECT_LOCK()` and `GST_OBJECT_UNLOCK()`. For
other objects that do not extend from the base GstObject class these macros
can be different.

### refcounting

All new objects created have the FLOATING flag set. This means that the object
is not owned or managed yet by anybody other than the one holding a reference
to the object. The object in this state has a reference count of 1.

Various object methods can take ownership of another object; this means that
after calling a method on object A with an object B as an argument, object B
becomes the sole property of object A.
This means that after the method call you
are not allowed to access the object anymore unless you keep an extra
reference to the object. An example of such a method is the `_bin_add()`
method. As soon as this function is called on a bin, the element passed as an
argument is owned by the bin and you are not allowed to access it anymore
without taking a `_ref()` before adding it to the bin. The reason is that
after the `_bin_add()` call, disposing the bin also destroys the element.

Taking ownership of an object happens through the process of "sinking" the
object. The `_sink()` method on an object will decrease the refcount of the
object if the FLOATING flag is set. The act of taking ownership of an object
is then performed as a `_ref()` followed by a `_sink()` call on the object.

The float/sink process is very useful when initializing elements that will
then be placed under the control of a parent. The floating ref keeps the
object alive until it is parented, and once the object is parented you can
forget about it.

Also see [relations](design/relations.md).

### parent-child relations

One can create parent-child relationships with the `_object_set_parent()`
method. This method refs and sinks the object and assigns the managing parent
as its parent property.

The child is said to have a weak link to the parent since the refcount of the
parent is not increased in this process. This means that if the parent is
disposed, it has to unset itself as the parent of the object before disposing
itself, or else the child object holds a parent pointer to invalid memory.

The responsibilities for an object that sinks other objects are summarised as:

- taking ownership of the object
  - call `_object_set_parent()` to set itself as the object parent; this call
    will `_ref()` and `_sink()` the object.
  - keep a reference to the object in a data structure such as a list or
    array.
- on dispose
  - call `_object_unparent()` to reset the parent property and unref the
    object.
  - remove the object from the list.

Also see [relations](design/relations.md).

### Properties

Most objects also expose state information with public properties in the
object. Two types of properties might exist: accessible with or without
holding the object lock. All properties should only be accessed with their
corresponding macros. The public object properties are marked in the .h files
with `/*< public >*/`. The public properties that require a lock to be held
are marked with `/*< public >*/ /* with <LOCK_TYPE> */`, where `<LOCK_TYPE>`
can be `LOCK` or `STATE_LOCK` or any other lock to mark the type(s) of lock to
be held.

**Example**:

In GstPad there is a public property `direction`. It can be found in the
section marked as public and requiring the LOCK to be held. There is also a
macro to access the property.

    struct _GstRealPad {
      ...
      /*< public >*/ /* with LOCK */
      ...
      GstPadDirection direction;
      ...
    };

    #define GST_RPAD_DIRECTION(pad) (GST_REAL_PAD_CAST(pad)->direction)

Accessing the property is therefore allowed with the following code example:

    GST_OBJECT_LOCK (pad);
    direction = GST_RPAD_DIRECTION (pad);
    GST_OBJECT_UNLOCK (pad);

### Property lifetime

All properties requiring a lock can change after releasing the associated
lock. This means that as long as you hold the lock, the state of the
object regarding the locked properties is consistent with the information
obtained. As soon as the lock is released, any values acquired from the
properties might not be valid anymore and can best be described as a
snapshot of the state when the lock was held.

This means that all properties that require access beyond the scope of the
critical section should be copied or refcounted before releasing the lock.

Most objects provide a `_get_<property>()` method to get a copy or refcounted
instance of the property value.
The caller does not have to worry about any locks
but should unref/free the object after usage.

**Example**:

The following example correctly gets the peer pad of an element. It is
required to increase the refcount of the peer pad because as soon as the
lock is released, the peer could be unreffed and disposed, making the
pointer obtained in the critical section point to invalid memory.

``` c
  GST_OBJECT_LOCK (pad);
  peer = GST_RPAD_PEER (pad);
  if (peer)
    gst_object_ref (GST_OBJECT (peer));
  GST_OBJECT_UNLOCK (pad);
  ... use peer ...

  if (peer)
    gst_object_unref (GST_OBJECT (peer));
```

Note that after releasing the lock the peer might not actually be the peer
of the pad anymore. If you need to be sure it is, you need to extend the
critical section to include the operations on the peer.

The following code is equivalent to the above but uses the functions
to access object properties.

``` c
  peer = gst_pad_get_peer (pad);
  if (peer) {
    ... use peer ...

    gst_object_unref (GST_OBJECT (peer));
  }
```

**Example**:

Accessing the name of an object makes a copy of the name. The caller of the
function should g_free() the name after usage.

``` c
  GST_OBJECT_LOCK (object);
  name = g_strdup (GST_OBJECT_NAME (object));
  GST_OBJECT_UNLOCK (object);
  ... use name ...

  g_free (name);
```

or:

``` c
  name = gst_object_get_name (object);

  ... use name ...

  g_free (name);
```

### Accessor methods

Applications are encouraged to use the public methods of the object. Most
useful operations can be performed with these methods, so it is seldom
required to access the public fields manually.

All accessor methods that return an object should increase the refcount of the
returned object. The caller should `_unref()` the object after usage. Each
method should state this refcounting policy in its documentation.
### Accessing lists

If the object property is a list, concurrent list iteration is needed to get
the contents of the list. GStreamer uses a cookie mechanism to mark the last
update of a list. The list and the cookie are protected by the same lock. Each
update to a list requires the following actions:

- acquire lock
- update list
- update cookie
- release lock

Updating the cookie is usually done by incrementing its value by one. Since
cookies are guint32 values, wraparound is for all practical purposes not a
problem.

Iterating a list can safely be done by surrounding the list iteration with a
lock/unlock of the lock.

In some cases it is not a good idea to hold the lock for a long time while
iterating the list. The state change code for a bin in GStreamer, for example,
has to iterate over each element and perform a blocking call on each of them,
potentially causing the bin lock to be held indefinitely. In this case the
cookie can be used to iterate a list.

**Example**:

The following algorithm iterates a list and reverses the updates in the
case a concurrent update was done to the list while iterating. The idea is
that whenever we reacquire the lock, we check for updates to the cookie to
decide if we are still iterating the right list.

``` c
  GST_OBJECT_LOCK (lock);
  /* grab list and cookie */
  cookie = object->list_cookie;
  list = object->list;
  while (list) {
    GstObject *item = GST_OBJECT (list->data);
    /* need to ref the item before releasing the lock */
    gst_object_ref (item);
    GST_OBJECT_UNLOCK (lock);

    ... use/change item here...

    /* release item here */
    gst_object_unref (item);

    GST_OBJECT_LOCK (lock);
    if (cookie != object->list_cookie) {
      /* handle rollback caused by concurrent modification
       * of the list here */

      ...rollback changes to items...
      /* grab new cookie and list */
      cookie = object->list_cookie;
      list = object->list;
    }
    else {
      list = g_list_next (list);
    }
  }
  GST_OBJECT_UNLOCK (lock);
```

### GstIterator

GstIterator provides an easier way of retrieving elements in a concurrent
list. The following code example is equivalent to the previous example.

**Example**:

``` c
it = _get_iterator (object);
while (!done) {
  switch (gst_iterator_next (it, &item)) {
    case GST_ITERATOR_OK:

      ... use/change item here...

      /* release item here */
      gst_object_unref (item);
      break;
    case GST_ITERATOR_RESYNC:
      /* handle rollback caused by concurrent modification
       * of the list here */

      ...rollback changes to items...

      /* resync iterator to start again */
      gst_iterator_resync (it);
      break;
    case GST_ITERATOR_DONE:
      done = TRUE;
      break;
  }
}
gst_iterator_free (it);
```

diff --git a/markdown/design/TODO.md b/markdown/design/TODO.md new file mode 100644 index 0000000000..e101215846 --- /dev/null +++ b/markdown/design/TODO.md @@ -0,0 +1,96 @@

# TODO - Future Development

## API/ABI

- implement return values from events in addition to the gboolean.
This should be done by making the event contain a GstStructure with
input/output values, similar to GstQuery. A typical use case is
performing a non-accurate seek to a keyframe; after the seek you
want to get the new stream time that will actually be used to update
the slider bar.

- make gst\_pad\_push\_event() return a GstFlowReturn.

- make GstEvent and GstMessage registerable, like GstFormat or GstQuery.

- query POSITION/DURATION return accuracy. Just a flag or accuracy
percentage.

- use | instead of + as divider in serialization of Flags
(gstvalue/gststructure).

- rethink how we handle dynamic replugging wrt segments and other
events that already got pushed and need to be pushed again. Might
need GstFlowReturn from gst\_pad\_push\_event(). FIXED in 0.11 with
sticky events.

- Optimize negotiation.
We currently do a get\_caps() call when we
link pads, which could potentially generate a huge list of caps and
all their combinations. We need to avoid generating these huge lists
by generating them somewhat incrementally when needed. We can do this
with a gst\_pad\_iterate\_caps() call. We also need to incrementally
return intersections etc. for this. FIXED in 0.11 with a filter on
getcaps functions.

- Elements in a bin have no clue about the final state of the parent
element since the bin sets the target state on its children in small
steps. This causes problems for elements that like to know the final
state (rtspsrc going to PAUSED or READY is different in that we can
avoid sending the useless PAUSED request).

- Make serialisation of structures more consistent, readable and nicer
code-wise.

- pad block has several issues:

  - can’t block on selected things, like push, pull, pad\_alloc,
    events, …

  - can’t check why the block happened. We should also be able to
    get the item/reason that blocked the pad.

  - it only blocks on datapassing. When EOS, the block never happens
    but ideally should, because pad block should inform the app when
    there is no dataflow.

  - the same goes for segment seeks that don’t push in-band EOS
    events. Maybe segment seeks should also send an EOS event when
    they’re done.

  - blocking should only happen from one thread. If one thread does
    pad\_alloc and another a push, the push might be busy while the
    block callback is done.

  - maybe this name is overloaded. We need to look at some more use
    cases before trying to fix this. FIXED in 0.11 with BLOCKING
    probes.

- rethink the way we do upstream renegotiation.
Currently it’s done
with pad\_alloc, but this has many issues, such as only being able to
suggest 1 format and the need to allocate a buffer of this suggested
format (some elements such as capsfilter only know about the format,
not the size). We would ideally like to let upstream renegotiate a
new format just like it did when it started. This could, for
example, easily be triggered with a RENEGOTIATE event. FIXED in 0.11
with RECONFIGURE events.

- Remove the result format value in queries. FIXED in 0.11

- Try to minimize the amount of acceptcaps calls when pushing buffers
around. The element pushing the buffer usually negotiated already
and decided on the format. The element receiving the buffer usually
has to accept the caps anyway.

## IMPLEMENTATION

  - implement more QOS, [qos](design/qos.md).

  - implement BUFFERSIZE.

## DESIGN

  - unlinking pads in the PAUSED state needs to make sure the stream
    thread is not executing code. Can this be done with a flush to
    unlock all downstream chain functions? Do we do this automatically
    or let the app handle this?

diff --git a/markdown/design/activation.md b/markdown/design/activation.md new file mode 100644 index 0000000000..cfbe303e3d --- /dev/null +++ b/markdown/design/activation.md @@ -0,0 +1,88 @@

# Pad (de)activation

## Activation

When changing states, a bin will set the state on all of its children in
sink-to-source order. As elements undergo the READY→PAUSED transition,
their pads are activated so as to prepare for data flow. Some pads will
start tasks to drive the data flow.

An element activates its pads from sourcepads to sinkpads. This is to make
sure that when the sinkpads are activated and ready to accept data, the
sourcepads are already active to pass the data downstream.

Pads can be activated in one of two modes, PUSH and PULL. PUSH pads are
the normal case, where the source pad in a link sends data to the sink
pad via `gst_pad_push()`.
PULL pads instead have sink pads request data
from the source pads via `gst_pad_pull_range()`.

To activate a pad, the core will call `gst_pad_set_active()` with a
TRUE argument, indicating that the pad should be active. If the pad is
already active, be it in PUSH or PULL mode, `gst_pad_set_active()`
will return without doing anything. Otherwise it will call the
activation function of the pad.

Because the core does not know in which mode to activate a pad (PUSH or
PULL), it delegates that choice to a method on the pad, activate(). The
activate() function of a pad should choose whether to operate in PUSH or
PULL mode. Once the choice is made, it should call `activate_mode()` with
the selected activation mode. The default activate() function will call
`activate_mode()` with `#GST_PAD_MODE_PUSH`, as it is the default
mechanism for data flow. A sink pad that supports either mode of
operation might call `activate_mode(PULL)` if the SCHEDULING query
upstream contains the `#GST_PAD_MODE_PULL` scheduling mode, and
`activate_mode(PUSH)` otherwise.

Consider the case `fakesrc ! fakesink`, where fakesink is configured to
operate in PULL mode. State changes in the pipeline will start with
fakesink, which is the most downstream element. The core will call
`activate()` on fakesink’s sink pad. For fakesink to go into PULL mode, it
needs to implement a custom activate() function that will call
`activate_mode(PULL)` on its sink pad (because the default is to use PUSH
mode). `activate_mode(PULL)` is then responsible for starting the task
that pulls from fakesrc:src. Clearly, fakesrc needs to be notified that
fakesink is about to pull on its src pad, even though the pipeline has
not yet changed fakesrc’s state. For this reason, GStreamer will first
call `activate_mode(PULL)` on fakesink:sink’s peer before calling
`activate_mode(PULL)` on fakesink:sink.
In short, upstream elements operating in PULL mode must be ready to
produce data in READY, after having `activate_mode(PULL)` called on their
source pad. Also, a call to `activate_mode(PULL)` needs to propagate
through the pipeline to every pad that a `gst_pad_pull()` will reach. In
the case `fakesrc ! identity ! fakesink`, calling `activate_mode(PULL)`
on identity’s source pad would need to activate its sink pad in pull
mode as well, which should propagate all the way to fakesrc.

If, on the other hand, `fakesrc ! fakesink` is operating in PUSH mode,
the activation sequence is different. First, activate() on fakesink:sink
calls `activate_mode(PUSH)` on fakesink:sink. Then fakesrc’s pads are
activated: sources first, then sinks (of which fakesrc has none).
fakesrc:src’s activation function is then called.

Note that it does not make sense to set an activation function on a
source pad. The peer of a source pad is downstream, meaning it should
have been activated first. If it was activated in PULL mode, the source
pad should have already had `activate_mode(PULL)` called on it, and thus
needs no further activation. Otherwise it should be in PUSH mode, which
is the choice of the default activation function.

So, in the PUSH case, the default activation function chooses PUSH mode,
which calls `activate_mode(PUSH)`, which will then start a task on the
source pad and begin pushing. In this way PUSH scheduling is a bit
easier, because it follows the order of state changes in a pipeline.
fakesink is already in PAUSED with an active sink pad by the time
fakesrc starts pushing data.

## Deactivation

Pad deactivation occurs when its parent goes into the READY state or
when the pad is deactivated explicitly by the application or element.
`gst_pad_set_active()` is called with a FALSE argument, which then
calls `activate_mode(PUSH)` or `activate_mode(PULL)` with a FALSE
argument, depending on the current activation mode of the pad.
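The (de)activation dispatch can be summarized in a small sketch. This is a simplification under assumed names (`Pad`, `PadMode`, `pad_set_active`): the real `gst_pad_set_active()` also deals with flushing, locking, and failure paths:

``` c
#include <stdbool.h>

/* Toy sketch of the (de)activation dispatch, for illustration only. */
typedef enum { MODE_NONE, MODE_PUSH, MODE_PULL } PadMode;

typedef struct {
  PadMode mode;                 /* current activation mode */
} Pad;

static void
activate_mode (Pad * pad, PadMode mode, bool active)
{
  pad->mode = active ? mode : MODE_NONE;
}

static bool
pad_set_active (Pad * pad, bool active)
{
  if (active) {
    if (pad->mode != MODE_NONE)
      return true;              /* already active: nothing to do */
    /* the default activate() chooses PUSH */
    activate_mode (pad, MODE_PUSH, true);
  } else {
    if (pad->mode == MODE_NONE)
      return true;              /* already inactive */
    /* deactivate in whatever mode the pad is currently in */
    activate_mode (pad, pad->mode, false);
  }
  return true;
}
```

The key point mirrored here is that deactivation dispatches on the pad's current mode, while default activation picks PUSH.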
## Mode switching

Changing from push to pull mode needs a bit of thought. This is
actually possible and implemented but not yet documented here.

diff --git a/markdown/design/buffer.md b/markdown/design/buffer.md new file mode 100644 index 0000000000..728b30b46d --- /dev/null +++ b/markdown/design/buffer.md @@ -0,0 +1,137 @@

# GstBuffer

This document describes the design for buffers.

A GstBuffer is the object that is passed from an upstream element to a
downstream element and contains memory and metadata information.

## Requirements

 - It must be fast
   - allocation, free, low fragmentation
 - Must be able to attach multiple memory blocks to the buffer
 - Must be able to attach arbitrary metadata to buffers
 - efficient handling of subbuffer, copy, span, trim

## Lifecycle

GstBuffer extends from GstMiniObject and therefore uses its lifecycle
management (see [miniobject](design/miniobject.md)).

## Writability

When a buffer is writable, as returned from `gst_buffer_is_writable()`:

 - metadata can be added/removed and the metadata can be changed

 - GstMemory blocks can be added/removed

The individual memory blocks have their own locking and READONLY flags
that might influence their writability.

Buffers can be made writable with `gst_buffer_make_writable()`. This
will copy the buffer with the metadata and will ref the memory in the
buffer. This means that the memory is not automatically copied when
copying buffers.

# Managing GstMemory

A GstBuffer contains an array of pointers to GstMemory objects.

When the buffer is writable, `gst_buffer_insert_memory()` can be used
to add a new GstMemory object to the buffer. When the array of memory is
full, memory will be merged to make room for the new memory object.

`gst_buffer_n_memory()` is used to get the amount of memory blocks on
the `GstBuffer`.

With `gst_buffer_peek_memory()`, memory can be retrieved from the
memory array. The desired access pattern for the memory block should be
The desired access pattern for the memory block should be +specified so that appropriate checks can be made and, in case of +`GST_MAP_WRITE`, a writable copy can be constructed when needed. + +`gst_buffer_remove_memory_range()` and `gst_buffer_remove_memory()` +can be used to remove memory from the GstBuffer. + +# Subbuffers + +Subbuffers are made by copying only a region of the memory blocks and +copying all of the metadata. + +# Span + +Spanning will merge together the data of 2 buffers into a new + buffer + +# Data access + +Accessing the data of the buffer can happen by retrieving the individual +GstMemory objects in the GstBuffer or by using the `gst_buffer_map()` and +`gst_buffer_unmap()` functions. + +The `_map` and `_unmap` functions will always return the memory of all blocks as +one large contiguous region of memory. Using the `_map` and `_unmap` functions +might be more convenient than accessing the individual memory blocks at the +expense of being more expensive because it might perform memcpy operations. + +For buffers with only one GstMemory object (the most common case), `_map` and +`_unmap` have no performance penalty at all. + +- **Read access with 1 memory block**: The memory block is accessed and mapped +for read access. The memory block is unmapped after usage + +- **write access with 1 memory block**: The buffer should be writable or this +operation will fail. The memory block is accessed. If the memory block is +readonly, a copy is made and the original memory block is replaced with this +copy. Then the memory block is mapped in write mode and unmapped after usage. + +- **Read access with multiple memory blocks**: The memory blocks are combined +into one large memory block. If the buffer is writable, the memory blocks are +replaced with this new combined block. If the buffer is not writable, the +memory is returned as is. The memory block is then mapped in read mode. 
When the memory is unmapped after usage and the buffer still has multiple
memory blocks, this means that the map operation was not able to store the
combined block back in the buffer, and it thus returned memory that should be
freed. Otherwise, the memory is simply unmapped.

- **Write access with multiple memory blocks**: The buffer should be writable
or the operation fails. The memory blocks are combined into one large memory
block and the existing blocks are replaced with this new block. The memory is
then mapped in write mode and unmapped after usage.

# Use cases

## Generating RTP packets from h264 video

We receive as input a GstBuffer with an encoded h264 image and we need
to create RTP packets containing this h264 data as the payload. We
typically need to fragment the h264 data into multiple packets, each
with their own RTP and payload specific header.

```
                      +-------+-------+---------------------------+--------+
  input H264 buffer:  | NALU1 | NALU2 |           .....           | NALUx  |
                      +-------+-------+---------------------------+--------+
                                            |
                                            V
  array of            +-+ +-------+    +-+ +-------+          +-+ +-------+
  output buffers:     | | | NALU1 |    | | | NALU2 |   ....   | | | NALUx |
                      +-+ +-------+    +-+ +-------+          +-+ +-------+
                      :            :   :            :
                      \-----------/    \-----------/
                         buffer 1         buffer 2
```

The output buffer array consists of x buffers, each consisting of an RTP
payload header and a subbuffer of the original input H264 buffer. Since
the RTP headers and the h264 data don’t need to be contiguous in memory,
they are added to the buffer as separate GstMemory blocks and we can
avoid having to memcpy the h264 data into contiguous memory.

A typical udpsink will then use something like sendmsg to send the
memory regions on the network inside one UDP packet. This will further
avoid having to memcpy data into contiguous memory.

Using bufferlists, the complete array of output buffers can be pushed in
one operation to the peer element.
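The scatter/gather idea behind this use case can be shown with plain POSIX I/O. The sketch below (a hypothetical `send_packet()` helper, not udpsink code) uses `writev()`, which takes the same `struct iovec` array that `sendmsg()` does, to transmit a header block and a payload block in one call without first copying them into one contiguous buffer:

``` c
#include <string.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send a header and a payload that live in separate memory blocks as
 * one packet, without an intermediate memcpy. */
static ssize_t
send_packet (int fd, const void *header, size_t header_len,
    const void *payload, size_t payload_len)
{
  struct iovec iov[2];

  iov[0].iov_base = (void *) header;    /* RTP header block */
  iov[0].iov_len = header_len;
  iov[1].iov_base = (void *) payload;   /* NALU payload block */
  iov[1].iov_len = payload_len;

  /* the kernel gathers both regions into one write */
  return writev (fd, iov, 2);
}
```

A real udpsink would use `sendmsg()` on a UDP socket with a `struct msghdr` carrying this iovec array; the zero-copy principle is the same.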
diff --git a/markdown/design/buffering.md b/markdown/design/buffering.md new file mode 100644 index 0000000000..9558b6a9b5 --- /dev/null +++ b/markdown/design/buffering.md @@ -0,0 +1,310 @@

# Buffering

This document outlines the buffering policy used in the GStreamer core
that can be used by plugins and applications.

The purpose of buffering is to accumulate enough data in a pipeline so
that playback can occur smoothly and without interruptions. It is
typically done when reading from a (slow) non-live network source but
can also be used for live sources.

We want to be able to implement the following features:

- buffering up to a specific amount of data, in memory, before
starting playback so that network fluctuations are minimized.

- download of the network file to a local disk with fast seeking in
the downloaded data. This is similar to the quicktime/youtube
players.

- caching of semi-live streams to a local, on disk, ringbuffer with
seeking in the cached area. This is similar to tivo-like
timeshifting.

- progress report about the buffering operations

- the possibility for the application to do more complex buffering

Some use cases:

  - Stream buffering:

        +---------+     +---------+     +-------+
        | httpsrc |     | buffer  |     | demux |
        |        src - sink      src - sink     ....
        +---------+     +---------+     +-------+

In this case we are reading from a slow network source into a buffer element
(such as queue2).

The buffer element has a low and high watermark expressed in bytes. The
buffer uses the watermarks as follows:

- The buffer element will post `BUFFERING` messages until the high
watermark is hit. This instructs the application to keep the
pipeline PAUSED, which will eventually block the srcpad from
pushing while data is prerolled in the sinks.

- When the high watermark is hit, a `BUFFERING` message with 100%
will be posted, which instructs the application to continue
playback.
+ +- When the low watermark is hit during playback, the queue will +start posting `BUFFERING` messages again, making the application +PAUSE the pipeline again until the high watermark is hit again. +This is called the rebuffering stage. + +- During playback, the queue level will fluctuate between the high +and low watermarks as a way to compensate for network +irregularities. + +This buffering method is usable when the demuxer operates in push mode. +Seeking in the stream requires the seek to happen in the network source. +It is mostly desirable when the total duration of the file is not known, such +as in live streaming or when efficient seeking is not possible/required. + + - Incremental download + + +---------+ +---------+ +-------+ + | httpsrc | | buffer | | demux | + | src - sink src - sink .... + +---------+ +----|----+ +-------+ + V + file + +In this case, we know the server is streaming a fixed length file to the +client. The application can choose to download the file to disk. The buffer +element will provide a push or pull based srcpad to the demuxer to navigate in +the downloaded file. + +This mode is only suitable when the client can determine the length of the +file on the server. + +In this case, buffering messages will be emitted as usual when the requested +range is not within the downloaded area + buffersize. The buffering message +will also contain an indication that incremental download is being performed. +This flag can be used to let the application control the buffering in a more +intelligent way, using the `BUFFERING` query, for example. + +The application can use the `BUFFERING` query to get the estimated download time +and match this time to the current/remaining playback time to control when +playback should start to have a non-interrupted playback experience. + + - Timeshifting + + +---------+ +---------+ +-------+ + | httpsrc | | buffer | | demux | + | src - sink src - sink .... 
+ +---------+ +----|----+ +-------+ + V + file-ringbuffer + +In this mode, a fixed size ringbuffer is kept to download the server content. +This allows for seeking in the buffered data. Depending on the size of the +buffer one can seek further back in time. + +This mode is suitable for all live streams. + +As with the incremental download mode, buffering messages are emitted along +with an indication that timeshifting download is in progress. + + - Live buffering + +In live pipelines we usually introduce some latency between the capture and +the playback elements. This latency can be introduced by a queue (such as a +jitterbuffer) or by other means (in the audiosink). + +Buffering messages can be emitted in those live pipelines as well and serve as +an indication to the user of the latency buffering. The application usually +does not react to these buffering messages with a state change. + +## Messages + +A `GST_MESSAGE_BUFFERING` must be posted on the bus when playback +temporarily stops to buffer and when buffering finishes. When the +percentage field in the `BUFFERING` message is 100, buffering is done. +Values less than 100 mean that buffering is in progress. + +The `BUFFERING` message should be intercepted and acted upon by the +application. The message contains at least one field that is sufficient +for basic functionality: + +* **`buffer-percent`**, G_TYPE_INT: between 0 and 100 + +Several more clever ways of dealing with the buffering messages can be +used when in incremental or timeshifting download mode. For this purpose +additional fields are added to the buffering message: + +* **`buffering-mode`**, `GST_TYPE_BUFFERING_MODE`: `enum { "stream", "download", +"timeshift", "live" }`: Buffering mode in use. See above for an explanation of the different +alternatives. This field can be used to let the application have more control +over the buffering process. + +* **`avg-in-rate`**, G_TYPE_INT: Average input buffering speed in bytes/second. +-1 is unknown. 
This is the average number of bytes per second that is received +on the buffering element input (sink) pads. It is a measurement of the network +speed in most cases. + +* **`avg-out-rate`**, G_TYPE_INT: Average consumption speed in bytes/second. -1 +is unknown. This is the average number of bytes per second that is consumed by +the downstream element of the buffering element. + +* **`buffering-left`**, G_TYPE_INT64: Estimated time that buffering will take +in milliseconds. -1 is unknown. This is measured based on the avg-in-rate and +the filled level of the queue. The application can use this hint to update the +GUI about the estimated remaining time that buffering will take. + +## Application + +While data is buffered the pipeline should remain in the PAUSED state. +It is also possible that more data should be buffered while the pipeline +is PLAYING, in which case the pipeline should be PAUSED until the +buffering finishes. + +`BUFFERING` messages can be posted while the pipeline is prerolling. The +application should not set the pipeline to PLAYING before a `BUFFERING` +message with a 100 percent value is received, which might only happen +after the pipeline prerolls. + +An exception is made for live pipelines. The application may not change +the state of a live pipeline when a buffering message is received. +Usually these buffering messages contain the "buffering-mode" = "live". + +The buffering message can also instruct the application to switch to a +periodical `BUFFERING` query instead, so it can more precisely control the +buffering process. The application can, for example, choose not to act +on the `BUFFERING` complete message (buffer-percent = 100) to resume +playback but use the estimated download time instead, resuming playback +when it has determined that it should be able to provide uninterrupted +playback. 
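As a sketch of the arithmetic involved — these helpers are hypothetical, not GStreamer API, and assume the number of bytes still missing from the buffer is known:

```c
#include <stdint.h>

/* Hypothetical helper: estimate the "buffering-left" value in
 * milliseconds from the average input rate and the amount of data still
 * missing from the buffer. Returns -1 (unknown) when the rate is
 * unknown, mirroring the message field semantics. */
static int64_t
estimate_buffering_left_ms (int64_t bytes_missing, int64_t avg_in_rate)
{
  if (avg_in_rate <= 0)
    return -1;
  return (bytes_missing * 1000) / avg_in_rate;
}

/* The uninterrupted-playback decision: resume once the estimated
 * buffering time is no longer than the remaining playback time. */
static int
can_resume_playback (int64_t buffering_left_ms, int64_t remaining_playback_ms)
{
  return buffering_left_ms >= 0 && buffering_left_ms <= remaining_playback_ms;
}
```

An application using this decision would keep the pipeline PAUSED while `can_resume_playback()` returns 0 and re-evaluate it on each periodic `BUFFERING` query.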
+
+## Buffering Query
+
+In addition to the `BUFFERING` messages posted by the buffering elements,
+we want the application to be able to query the same information.
+We also want to be able to present the user with information about the
+downloaded range in the file so that the GUI can react to it.
+
+In addition to all the fields present in the buffering message, the
+`BUFFERING` query contains the following fields, which indicate the
+available downloaded range in a specific format and the estimated time
+to complete:
+
+* **`busy`**, G_TYPE_BOOLEAN: whether buffering is currently busy. This flag allows the
+application to pause the pipeline by using the query only.
+
+* **`format`**, GST_TYPE_FORMAT: the format of the "start" and "stop" values
+below
+
+* **`start`**, G_TYPE_INT64, -1 unknown: the start position of the available
+data. If there are multiple ranges, this field contains the start position of
+the currently downloading range.
+
+* **`stop`**, G_TYPE_INT64, -1 unknown: the stop position of the available
+data. If there are multiple ranges, this field contains the stop position of
+the currently downloading range.
+
+* **`estimated-total`**, G_TYPE_INT64: gives the estimated download time in
+milliseconds. -1 unknown. When the size of the downloaded file is known, this
+value will contain the latest estimate of the remaining download time of the
+currently downloading range. This value is usually only filled for the
+"download" buffering mode. The application can use this information to estimate
+the remaining time to download to the end of the file.
+
+* **`buffering-ranges`**, G_TYPE_ARRAY of GstQueryBufferingRange: contains
+optionally the downloaded areas in the format given above.
One of the ranges +contains the same start/stop position as above: + + typedef struct + { + gint64 start; + gint64 stop; + } GstQueryBufferingRange; + +For the `download` and `timeshift` buffering-modes, the start and stop +positions specify the ranges where efficient seeking in the downloaded +media is possible. Seeking outside of these ranges might be slow or not +at all possible. + +For the `stream` and `live` mode the start and stop values describe the +oldest and newest item (expressed in `format`) in the buffer. + +## Defaults + +Some defaults for common elements: + +A GstBaseSrc with random access replies to the `BUFFERING` query with: + + "buffer-percent" = 100 + "buffering-mode" = "stream" + "avg-in-rate" = -1 + "avg-out-rate" = -1 + "buffering-left" = 0 + "format" = GST_FORMAT_BYTES + "start" = 0 + "stop" = the total filesize + "estimated-total" = 0 + "buffering-ranges" = NULL + +A GstBaseSrc in push mode replies to the `BUFFERING` query with: + + "buffer-percent" = 100 + "buffering-mode" = "stream" + "avg-in-rate" = -1 + "avg-out-rate" = -1 + "buffering-left" = 0 + "format" = a valid GST_TYPE_FORMAT + "start" = current position + "stop" = current position + "estimated-total" = -1 + "buffering-ranges" = NULL + +## Buffering strategies + +Buffering strategies are specific implementations based on the buffering +message and query described above. + +Most strategies have to balance buffering time versus maximal playback +experience. + +### Simple buffering + +NON-live pipelines are kept in the paused state while buffering messages with +a percent < 100% are received. + +This buffering strategy relies on the buffer size and low/high watermarks of +the element. It can work with a fixed size buffer in memory or on disk. + +The size of the buffer is usually expressed in a fixed amount of time units +and the estimated bitrate of the upstream source is used to convert this time +to bytes. + +All GStreamer applications must implement this strategy. 
Failure to do so
+will result in starvation at the sink.
+
+### No-rebuffer strategy
+
+This strategy tries to buffer as much data as possible so that playback can
+continue without any further rebuffering.
+
+This strategy is initially similar to simple buffering; the difference lies in
+deciding on the condition to continue playback. When a 100% buffering message
+has been received, the application will not yet start the playback but it will
+start a periodic buffering query, which will return the estimated amount of
+buffering time left. When the estimated time left is less than the remaining
+playback time, playback can continue.
+
+This strategy requires an unlimited buffer size in memory or on disk, such as
+provided by elements that implement the incremental download buffering mode.
+
+Usually, the application can choose to start playback even before the
+remaining buffer time has elapsed in order to start playback more quickly, at
+the expense of a possible rebuffering phase.
+
+### Incremental rebuffering
+
+The application implements the simple buffering strategy but increases the
+size of the buffer with each rebuffering phase.
+
+This strategy has quick, fixed startup times but incrementally longer
+rebuffering times if the network is slower than the media bitrate.
diff --git a/markdown/design/bufferpool.md b/markdown/design/bufferpool.md
new file mode 100644
index 0000000000..a7fefdcc52
--- /dev/null
+++ b/markdown/design/bufferpool.md
@@ -0,0 +1,365 @@
+# Bufferpool
+
+This document details the design of how buffers are allocated and
+managed in pools.
+
+Bufferpools increase performance by reducing allocation overhead and
+improving the possibilities to implement zero-copy memory transfer.
+
+Together with the ALLOCATION query, elements can negotiate allocation
+properties and bufferpools between themselves. This also allows elements
+to negotiate buffer metadata between themselves.
+
+# Requirements
+
+- Provide a GstBufferPool base class to help the efficient
+implementation of a list of reusable GstBuffer objects.
+
+- Let upstream elements initiate the negotiation of a bufferpool and
+its configuration. Allow downstream elements to provide bufferpool
+properties and/or a bufferpool. This includes the following
+properties:
+
+    - a minimum and maximum number of buffers, with the option of
+      preallocating buffers
+
+    - allocator, alignment and padding support
+
+    - buffer metadata
+
+    - arbitrary extra options
+
+- Integrate with dynamic caps renegotiation.
+
+- Notify upstream elements of new bufferpool availability. This is
+important when a new element that can provide a bufferpool is
+dynamically linked downstream.
+
+# GstBufferPool
+
+The bufferpool object manages a list of buffers with the same properties such
+as size, padding and alignment.
+
+The bufferpool has two states: active and inactive. In the inactive
+state, the bufferpool can be configured with the required allocation
+preferences. In the active state, buffers can be retrieved from and
+returned to the pool.
+
+The default implementation of the bufferpool is able to allocate buffers
+from any allocator with arbitrary alignment and padding/prefix.
+
+Custom implementations of the bufferpool can override the allocation and
+free algorithms of the buffers from the pool. This should allow for
+different allocation strategies such as using shared memory or hardware
+mapped memory.
+
+# Negotiation
+
+After a particular media format has been negotiated between two pads (using the
+CAPS event), they must agree on how to allocate buffers.
+
+The srcpad will always take the initiative to negotiate the allocation
+properties. It starts with creating a GST_QUERY_ALLOCATION with the negotiated
+caps.
+
+The srcpad can set the need-pool flag to TRUE in the query to optionally make the
+peer pad allocate a bufferpool.
It should only do this if it is able to use
+the peer provided bufferpool.
+
+It will then inspect the returned results and configure the returned pool or
+create a new pool with the returned properties when needed.
+
+Buffers are then allocated by the srcpad from the negotiated pool and pushed to
+the peer pad as usual.
+
+The allocation query can also return an allocator object when the buffers are of
+different sizes and can't be allocated from a pool.
+
+# Allocation query
+
+The allocation query has the following fields:
+
+* (in) **`caps`**, GST_TYPE_CAPS: the caps that were negotiated
+
+* (in) **`need-pool`**, G_TYPE_BOOLEAN: if a GstBufferPool is requested
+
+* (out) **`pool`**, G_TYPE_ARRAY of structure: an array of pool configurations:
+
+``` c
+  struct {
+    GstBufferPool *pool;
+    guint size;
+    guint min_buffers;
+    guint max_buffers;
+  }
+```
+
+Use `gst_query_parse_nth_allocation_pool()` to get the values.
+
+The query can contain multiple pool configurations. If need-pool
+was TRUE, the pool member might contain a GstBufferPool when the
+downstream element can provide one.
+
+`size` contains the size of the bufferpool's buffers and is never 0.
+
+`min_buffers` and `max_buffers` contain the suggested min and max number of
+buffers that should be managed by the pool.
+
+The upstream element can choose to use the provided pool or make its own
+pool when none was provided or when the suggested pool was not
+acceptable.
+
+The pool can then be configured with the suggested min and max amount of
+buffers or a downstream element might choose different values.
+
+* (out) **`allocator`**, G_TYPE_ARRAY of structure: an array of allocator parameters that can be used.
+
+``` c
+  struct {
+    GstAllocator *allocator;
+    GstAllocationParams params;
+  }
+```
+
+Use `gst_query_parse_nth_allocation_param()` to get the values.
+
+The element performing the query can use the allocators and their
+parameters to allocate memory for the downstream element.
+
+It is also possible to configure the allocator in a provided pool.
+
+* (out) **`metadata`**, G_TYPE_ARRAY of structure: an array of metadata params that can be accepted.
+
+``` c
+  struct {
+    GType api;
+    GstStructure *params;
+  }
+```
+
+Use `gst_query_parse_nth_allocation_meta()` to get the values.
+
+These metadata items can be accepted by the downstream element when
+placed on buffers. There is also an arbitrary `GstStructure` associated
+with the metadata that contains metadata-specific options.
+
+Some bufferpools have options to enable metadata on the buffers
+allocated by the pool.
+
+# Allocating from pool
+
+Buffers are allocated from the pool of a pad:
+
+``` c
+res = gst_buffer_pool_acquire_buffer (pool, &buffer, &params);
+```
+
+A `GstBuffer` that is allocated from the pool will always be writable (have a
+refcount of 1) and it will also have its pool member point to the `GstBufferPool`
+that created the buffer.
+
+Buffers are refcounted in the usual way. When the refcount of the buffer
+reaches 0, the buffer is automatically returned to the pool.
+
+Since all the buffers allocated from the pool keep a reference to the pool,
+when nothing else is holding a refcount to the pool, it will be finalized
+when all the buffers from the pool are unreffed. By setting the pool to
+the inactive state we can drain all buffers from the pool.
+
+When the pool is in the inactive state, `gst_buffer_pool_acquire_buffer()` will
+return `GST_FLOW_FLUSHING` immediately.
+
+Extra parameters can be given to the `gst_buffer_pool_acquire_buffer()` method to
+influence the allocation decision. `GST_BUFFER_POOL_FLAG_KEY_UNIT` and
+`GST_BUFFER_POOL_FLAG_DISCONT` serve as hints.
+
+When the bufferpool is configured with a maximum number of buffers, allocation
+will block when all buffers are outstanding until a buffer is returned to the
+pool. This behaviour can be changed by specifying the
+`GST_BUFFER_POOL_FLAG_DONTWAIT` flag in the parameters.
With this flag set,
+allocation will return `GST_FLOW_EOS` when the pool is empty.
+
+# Renegotiation
+
+Renegotiation of the bufferpool might need to be performed when the
+configuration of the pool changes. Changes can be in the buffer size
+(because of a caps change), alignment or number of buffers.
+
+## Downstream
+
+When the upstream element wants to negotiate a new format, it might need
+to renegotiate a new bufferpool configuration with the downstream element.
+This can, for example, happen when the buffer size changes.
+
+We cannot just reconfigure the existing bufferpool because there might
+still be outstanding buffers from the pool in the pipeline. Therefore we
+need to create a new bufferpool for the new configuration while we let the
+old pool drain.
+
+Implementations can choose to reuse the same bufferpool object and wait for
+the drain to finish before reconfiguring the pool.
+
+The element that wants to renegotiate a new bufferpool uses exactly the same
+algorithm as when it first started. It will negotiate caps first and then use
+the ALLOCATION query to get and configure the new pool.
+
+## Upstream
+
+When a downstream element wants to negotiate a new format, it will send a
+RECONFIGURE event upstream. This instructs upstream to renegotiate both
+the format and the bufferpool when needed.
+
+A pipeline reconfiguration happens when new elements are added or removed from
+the pipeline or when the topology of the pipeline changes. Pipeline
+reconfiguration also triggers possible renegotiation of the bufferpool and
+caps.
+
+A RECONFIGURE event tags each pad it travels on as needing reconfiguration.
+The next buffer allocation will then require the renegotiation or
+reconfiguration of a pool.
+
+# Shutting down
+
+In push mode, a source pad is responsible for setting the pool to the
+inactive state when streaming stops. The inactive state will unblock any pending
+allocations so that the element can shut down.
+
+In pull mode, the sink element should set the pool to the inactive state when
+shutting down so that the peer `_get_range()` function can unblock.
+
+In the inactive state, all the buffers that are returned to the pool will
+automatically be freed by the pool and new allocations will fail.
+
+# Use cases
+
+## - `videotestsrc ! xvimagesink`
+
+* Before videotestsrc can output a buffer, it needs to negotiate caps and
+a bufferpool with the downstream peer pad.
+
+* First it will negotiate a suitable format with downstream according to the
+normal rules. It will send a CAPS event downstream with the negotiated
+configuration.
+
+* Then it does an ALLOCATION query. It will use the returned bufferpool or
+configure its own bufferpool with the returned parameters. The bufferpool is
+initially in the inactive state.
+
+* The ALLOCATION query lists the desired configuration of the downstream
+xvimagesink, which can have specific alignment and/or min/max amount of
+buffers.
+
+* videotestsrc updates the configuration of the bufferpool: it will likely set
+the minimum number of buffers to 1 and the desired buffer size. It then updates
+the bufferpool configuration with the new properties.
+
+* When the configuration is successfully updated, videotestsrc sets the
+bufferpool to the active state. This preallocates the buffers in the pool (if
+needed). This operation can fail when there is not enough memory available.
+Since the bufferpool is provided by xvimagesink, it will allocate buffers
+backed by an XvImage and pointing to shared memory with the X server.
+
+* If the bufferpool is successfully activated, videotestsrc can acquire
+a buffer from the pool, fill in the data and push it out to xvimagesink.
+
+* xvimagesink can know that the buffer originated from its pool by following
+the pool member.
+
+* When shutting down, videotestsrc will set the pool to the inactive state.
+This will cause further allocations to fail and currently allocated buffers to
+be freed.
videotestsrc will then free the pool and stop streaming.
+
+## - `videotestsrc ! queue ! myvideosink`
+
+* In this second use case we have a videosink that can at most allocate 3 video
+buffers.
+
+* Again videotestsrc will have to negotiate a bufferpool with the peer element.
+For this it will perform the ALLOCATION query, which queue will proxy to its
+downstream peer element.
+
+* The bufferpool returned from myvideosink will have a max_buffers set to 3.
+queue and videotestsrc can operate with this upper limit because none of those
+elements require more than that amount of buffers for temporary storage.
+
+* myvideosink's bufferpool will then be configured with the size of the buffers
+for the negotiated format and according to the padding and alignment rules.
+When videotestsrc sets the pool to active, the 3 video buffers will be
+preallocated in the pool.
+
+* videotestsrc acquires a buffer from the configured pool on its srcpad and
+pushes this into the queue. When videotestsrc has acquired and pushed 3 frames,
+the next call to `gst_buffer_pool_acquire_buffer()` will block (assuming
+`GST_BUFFER_POOL_FLAG_DONTWAIT` is not specified).
+
+* When the queue has pushed out a buffer and the sink has rendered it, the
+refcount of the buffer reaches 0 and the buffer is recycled in the pool. This
+wakes up the videotestsrc that was blocked waiting for more buffers and
+makes it produce the next buffer.
+
+* In this setup, there are at most 3 buffers active in the pipeline and the
+videotestsrc is rate limited by the rate at which buffers are recycled in the
+bufferpool.
+
+* When shutting down, videotestsrc will first set the bufferpool on the srcpad
+to inactive. This causes any pending (blocked) acquire to return with
+a FLUSHING result and causes the streaming thread to pause.
+
+## - `.. ! myvideodecoder ! queue ! fakesink`
+
+* In this case, the myvideodecoder requires buffers to be aligned to 128 bytes
+and padded with 4096 bytes.
The pipeline starts out with the decoder linked to
+a fakesink but we will then dynamically change the sink to one that can provide
+a bufferpool.
+
+* When myvideodecoder negotiates the size with the downstream fakesink element,
+it will receive a NULL bufferpool because fakesink does not provide
+a bufferpool. It will then select its own custom bufferpool to start the data
+transfer.
+
+* At some point we block the queue srcpad, unlink the queue from the fakesink,
+link a new sink and set the new sink to the PLAYING state. Linking the new sink
+would automatically send a RECONFIGURE event upstream and, through queue,
+inform myvideodecoder that it should renegotiate its bufferpool because
+downstream has been reconfigured.
+
+* Before pushing the next buffer, myvideodecoder has to renegotiate a new
+bufferpool. To do this, it performs the usual bufferpool negotiation algorithm.
+If it can obtain and configure a new bufferpool from downstream, it sets its
+own (old) pool to inactive and unrefs it. This will eventually drain and unref
+the old bufferpool.
+
+* The new bufferpool is set as the bufferpool for the srcpad and sinkpad of
+the queue and set to the active state.
+
+## - `.. ! myvideodecoder ! queue ! myvideosink`
+
+* myvideodecoder has negotiated a bufferpool with the downstream myvideosink to
+handle buffers of size 320x240. It has now detected a change in the video
+format and needs to renegotiate to a resolution of 640x480. This requires it to
+negotiate a new bufferpool with a larger buffer size.
+
+* When myvideodecoder needs to get the bigger buffer, it starts the negotiation
+of a new bufferpool. It queries a bufferpool from downstream, reconfigures it
+with the new configuration (which includes the bigger buffer size) and sets the
+bufferpool to active. The old pool is inactivated and unreffed, which causes
+the old pool to drain.
+
+* It then uses the new bufferpool for allocating new buffers of the new
+dimension.
+
+* If at some point the decoder wants to switch to a lower resolution again, it
+can choose to use the current pool (which has buffers that are larger than the
+required size) or it can choose to renegotiate a new bufferpool.
+
+## - `.. ! myvideodecoder ! videoscale ! myvideosink`
+
+* myvideosink is providing a bufferpool for upstream elements and wants to
+change the resolution.
+
+* myvideosink sends a RECONFIGURE event upstream to notify upstream that a new
+format is desirable. Upstream elements try to negotiate a new format and
+bufferpool before pushing out a new buffer. The old bufferpools are drained in
+the regular way.
diff --git a/markdown/design/caps.md b/markdown/design/caps.md
new file mode 100644
index 0000000000..3f15ec9991
--- /dev/null
+++ b/markdown/design/caps.md
@@ -0,0 +1,141 @@
+# Caps
+
+Caps are lightweight refcounted objects describing media types. They are
+composed of an array of GstStructures plus, optionally, a
+GstCapsFeatures for each GstStructure.
+
+Caps are exposed on GstPadTemplates to describe all possible types a
+given pad can handle. They are also stored in the registry along with a
+description of the element.
+
+Caps are exposed on the element pads via CAPS and `ACCEPT_CAPS` queries.
+These queries describe the possible types that the pad can handle or
+produce ([negotiation](design/negotiation.md)).
+
+Various methods exist to work with the media types such as subtracting
+or intersecting.
+
+## Operations
+
+# Fixating
+
+Caps are fixed if they only contain a single structure and this
+structure is fixed. A structure is fixed if none of the fields of the
+structure is an unfixed type, for example a range, list or array.
+
+When fixating caps, only the first structure is kept, as the order of
+structures is meant to express the preferences for the different
+structures.
Afterwards, each unfixed field of this structure is set to
+the value that makes most sense for the media format by the element or
+pad implementation and then every remaining unfixed field is set to an
+arbitrary value that is a subset of the unfixed field’s values.
+
+EMPTY caps are fixed caps and ANY caps are not. Caps with ANY caps
+features are not fixed.
+
+# Subset
+
+One caps "A" is a subset of another caps "B" if for each structure in
+"A" there exists a structure in "B" that is a superset of the structure
+in "A".
+
+A structure "a" is a subset of a structure "b" if it has the same
+structure name, the same caps features and each field in "b" exists in
+"a" and the value of the field in "a" is a subset of the value of the
+field in "b". "a" can have additional fields that are not in "b".
+
+EMPTY caps are a subset of every other caps. Every caps are a subset of
+ANY caps.
+
+# Equality
+
+Caps "A" and "B" are equal if "A" is a subset of "B" and "B" is a subset
+of "A". This means that both caps express the same possibilities
+but their structures can still be different if they contain unfixed
+fields.
+
+# Intersection
+
+The intersection of caps "A" and caps "B" is the caps that contains the
+intersection of all their structures with each other.
+
+The intersection of structure "a" and structure "b" is empty if their
+structure name or their caps features are not equal, or if "a" and "b"
+contain the same field but the intersection of both field values is
+empty. If one structure contains a field that does not exist in the
+other structure, it will be copied over to the intersection with the same
+value.
+
+The intersection with ANY caps is always the other caps and the
+intersection with EMPTY caps is always EMPTY.
+
+# Union
+
+The union of caps "A" and caps "B" is the caps that contains the union
+of all their structures with each other.
+
+The union of structure "a" and structure "b" consists of the two structures "a"
+and "b" if the structure names or caps features are not equal.
+Otherwise, the union is the structure that contains the union of each
+field's value. If a field is present in only one of the two structures, it is
+not contained in the union.
+
+The union with ANY caps is always ANY and the union with EMPTY caps is
+always the other caps.
+
+# Subtraction
+
+The subtraction of caps "A" from caps "B" is the most generic subset of
+"B" that has an empty intersection with "A" but only contains structures
+with names and caps features that exist in "B".
+
+## Basic Rules
+
+# Semantics of caps and their usage
+
+A caps can contain multiple structures, in which case any of the
+structures would be acceptable. The structures are in the preferred
+order of the creator of the caps, with the most preferred structure
+first; during negotiation of caps this order should be considered to
+select the most optimal structure.
+
+Each of these structures has a name that specifies the media type, e.g.
+"video/x-theora" to specify Theora video. Additional fields in the
+structure add additional constraints and/or information about the media
+type, like the width and height of a video frame, or the codec profile
+that is used. These fields can be non-fixed (e.g. ranges) for non-fixed
+caps but must be fixated to a fixed value during negotiation. If a field
+is included in the caps returned by a pad via the CAPS query, it imposes
+an additional constraint during negotiation. The caps in the end must
+have this field with a value that is a subset of the non-fixed value.
+Additional fields that are added in the negotiated caps give additional
+information about the media but are treated as optional. Information
+that can change for every buffer and is not relevant during negotiation
+must not be stored inside the caps.
+
+For each of the structures in caps it is possible to store caps
+features.
The caps features express additional requirements for a
+specific structure, and only structures with the same name *and* equal
+caps features are considered compatible. Caps features can be used to
+require a specific memory representation or a specific meta to be set on
+buffers; for example, a pad could require for a specific structure that
+it is passed EGLImage memory or buffers with the video meta. If no caps
+features are provided for a structure, it is assumed that system memory
+is required unless later negotiation steps (e.g. the ALLOCATION query)
+detect that something else can be used. The special ANY caps features
+can be used to specify that any caps feature would be accepted, for
+example if the buffer memory is not touched at all.
+
+# Compatibility of caps
+
+Pads can be linked when the caps of both pads are compatible. This is
+the case when their intersection is not empty.
+
+To check whether a pad actually supports fixed caps, an intersection is
+not enough. Instead, the fixed caps must be at least a subset of the
+pad’s caps, but pads can introduce additional constraints, which would
+be checked in the `ACCEPT_CAPS` query handler.
+
+Data flow can only happen after pads have decided on common fixed caps.
+These caps are distributed to both pads with the CAPS event.
diff --git a/markdown/design/clocks.md b/markdown/design/clocks.md
new file mode 100644
index 0000000000..d2d17c3530
--- /dev/null
+++ b/markdown/design/clocks.md
@@ -0,0 +1,83 @@
+# Clocks
+
+The GstClock returns a monotonically increasing time with the method
+`_get_time()`. Its accuracy and base time depend on the specific clock
+implementation but time is always expressed in nanoseconds. Since the
+baseline of the clock is undefined, the clock time returned is not
+meaningful in itself; what matters are the deltas between two clock
+times. The time reported by the clock is called the `absolute_time`.
+
+## Clock Selection
+
+To synchronize the different elements, the GstPipeline is responsible
+for selecting and distributing a global GstClock for all the elements in
+it.
+
+This selection happens whenever the pipeline goes to PLAYING. Whenever
+an element is added/removed from the pipeline, this selection will be
+redone in the next state change to PLAYING. Adding an element that can
+provide a clock will post a `GST_MESSAGE_CLOCK_PROVIDE` message on the
+bus to inform parent bins of the fact that a clock recalculation is
+needed.
+
+When a clock is selected, a `NEW_CLOCK` message is posted on the bus
+signaling the clock to the application.
+
+When the element that provided the clock is removed from the pipeline, a
+`CLOCK_LOST` message is posted. The application must then set the
+pipeline to PAUSED and PLAYING again in order to let the pipeline select
+a new clock and distribute a new base time.
+
+The clock selection is performed as part of the state change from PAUSED
+to PLAYING and is described in [states](design/states.md).
+
+## Clock features
+
+The clock supports periodic and single shot clock notifications, both
+synchronous and asynchronous.
+
+One first needs to create a GstClockID for the periodic or single shot
+notification using `_clock_new_single_shot_id()` or
+`_clock_new_periodic_id()`.
+
+To perform a blocking wait for the specific time of the GstClockID, use
+`gst_clock_id_wait()`. To receive a callback when the specific time
+is reached in the clock, use `gst_clock_id_wait_async()`. Both these
+calls can be interrupted with the `gst_clock_id_unschedule()` call. If
+the blocking wait is unscheduled, a value of `GST_CLOCK_UNSCHEDULED` is
+returned.
+
+The async callbacks can happen from any thread, either provided by the
+core or from a streaming thread. The application should be prepared for
+this.
+
+A GstClockID that has been unscheduled cannot be used again for any wait
+operation. 
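The single-shot wait/unschedule semantics described above can be modelled in a few lines of Python. This is a toy model, not the GStreamer API: the class, the return-code names, and the timeout argument are invented for illustration.

```python
import threading

GST_CLOCK_OK, GST_CLOCK_UNSCHEDULED = 0, 1  # illustrative return codes

class SingleShotId:
    """Toy single-shot clock ID: one blocking wait that another thread
    can interrupt with unschedule()."""

    def __init__(self, timeout_s):
        self._timeout_s = timeout_s
        self._unscheduled = threading.Event()
        self._used = False

    def wait(self):
        # A real clock would block until the target time is reached; we
        # block on an Event so that unschedule() can wake us up early.
        assert not self._used, "an ID cannot be reused for another wait"
        self._used = True
        interrupted = self._unscheduled.wait(self._timeout_s)
        return GST_CLOCK_UNSCHEDULED if interrupted else GST_CLOCK_OK

    def unschedule(self):
        self._unscheduled.set()

# unscheduling from another thread interrupts the blocking wait
cid = SingleShotId(timeout_s=5.0)
threading.Timer(0.05, cid.unschedule).start()
assert cid.wait() == GST_CLOCK_UNSCHEDULED
```
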
+ +It is possible to perform a blocking wait on the same ID from multiple +threads. However, registering the same ID for multiple async +notifications is not possible, the callback will only be called once. + +None of the wait operations unref the GstClockID, the owner is +responsible for unreffing the ids itself. This holds true for both +periodic and single shot notifications. The reason being that the owner +of the ClockID has to keep a handle to the ID to unblock the wait on +FLUSHING events or state changes and if we unref it automatically, the +handle might be invalid. + +These clock operations do not operate on the stream time, so the +callbacks will also occur when not in PLAYING state as if the clock just +keeps on running. Some clocks however do not progress when the element +that provided the clock is not PLAYING. + +## Clock implementations + +The GStreamer core provides a GstSystemClock based on the system time. +Asynchronous callbacks are scheduled from an internal thread. + +Clock implementers are encouraged to subclass this systemclock as it +implements the async notification. + +Subclasses can however override all of the important methods for sync +and async notifications to implement their own callback methods or +blocking wait operations. diff --git a/markdown/design/context.md b/markdown/design/context.md new file mode 100644 index 0000000000..ef35aef908 --- /dev/null +++ b/markdown/design/context.md @@ -0,0 +1,61 @@ +# Context + +GstContext is a container object, containing a type string and a generic +GstStructure. It is used to store and propagate context information in a +pipeline, like device handles, display server connections and other +information that should be shared between multiple elements in a +pipeline. + +For sharing context objects and distributing them between application +and elements in a pipeline, there are downstream queries, upstream +queries, messages and functions to set a context on a complete pipeline. 
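The container-and-distribution idea can be sketched as a toy Python model (not the GStreamer API; the class and method names merely mirror the concepts in this document):

```python
class Context:
    """Toy stand-in for GstContext: a type string plus a structure."""
    def __init__(self, ctx_type, structure):
        self.ctx_type = ctx_type          # e.g. "gst.egl.EGLDisplay"
        self.structure = dict(structure)  # generic key/value payload

class Element:
    def __init__(self):
        self.contexts = {}
    def set_context(self, ctx):
        self.contexts[ctx.ctx_type] = ctx

class Bin(Element):
    """Propagates any context set on it to current and future children."""
    def __init__(self):
        super().__init__()
        self.children = []
    def set_context(self, ctx):
        super().set_context(ctx)
        for child in self.children:
            child.set_context(ctx)
    def add(self, element):
        self.children.append(element)
        # elements added after a context was set still receive it
        for ctx in self.contexts.values():
            element.set_context(ctx)

pipeline = Bin()
early = Element()
pipeline.add(early)
pipeline.set_context(Context("gst.egl.EGLDisplay", {"display": "handle"}))
late = Element()
pipeline.add(late)  # added after the context was set
assert "gst.egl.EGLDisplay" in early.contexts
assert "gst.egl.EGLDisplay" in late.contexts
```
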
+
+## Context types
+
+Context type names should be unique and be put in appropriate
+namespaces, to prevent name conflicts, e.g. "gst.egl.EGLDisplay". Only
+one specific type is allowed per context type name.
+
+## Elements
+
+Elements that need a specific context for their operation would do the
+following steps until one succeeds:
+
+1) Check if the element already has a context of the specific type,
+   i.e. it was previously set via gst_element_set_context().
+
+2) Query downstream with GST_QUERY_CONTEXT for the context and check if
+   downstream already has a context of the specific type.
+
+3) Query upstream with GST_QUERY_CONTEXT for the context and check if
+   upstream already has a context of the specific type.
+
+4) Post a GST_MESSAGE_NEED_CONTEXT message on the bus with the required
+   context types and afterwards check if a usable context was set now
+   as in 1). The message could be handled by the parent bins of the
+   element and the application.
+
+5) Create a context by itself and post a GST_MESSAGE_HAVE_CONTEXT message
+   on the bus.
+
+Bins will propagate any context that is set on them to their child
+elements via gst\_element\_set\_context(), even to elements added after
+a given context has been set.
+
+Bins can handle the GST\_MESSAGE\_NEED\_CONTEXT message, can filter both
+messages and can also set different contexts for different pipeline
+parts.
+
+## Applications
+
+Applications can set a specific context on a pipeline or elements inside
+a pipeline with gst\_element\_set\_context().
+
+If an element inside the pipeline needs a specific context, it will post
+a GST\_MESSAGE\_NEED\_CONTEXT message on the bus. The application can
+now create a context of the requested type or pass an already existing
+context to the element (or to the complete pipeline).
+
+Whenever an element creates a context internally it will post a
+GST\_MESSAGE\_HAVE\_CONTEXT message on the bus. 
Bins will cache these
+contexts and pass them to any future element that requests them.
diff --git a/markdown/design/controller.md b/markdown/design/controller.md
new file mode 100644
index 0000000000..555d5fbd40
--- /dev/null
+++ b/markdown/design/controller.md
@@ -0,0 +1,65 @@
+# Controller
+
+The controller subsystem makes it possible to automate element property
+changes. All parameter changes are time based, and elements request
+property updates at processing time.
+
+## Element view
+
+Elements don’t need to do much. They need to:
+
+- mark object properties that can be changed while processing with
+  GST\_PARAM\_CONTROLLABLE
+
+- call gst\_object\_sync\_values (self, timestamp) in the processing
+  function before accessing the parameters.
+
+All ordered property types can be automated (int, double, boolean,
+enum). Other property types can also be automated by using special
+control bindings. One can e.g. write a control-binding that updates a
+text property based on timestamps.
+
+## Application view
+
+Applications need to set up the property automation. For that they need
+to create a GstControlSource and attach it to a property using
+GstControlBinding. Various control-sources and control-bindings exist.
+All control sources produce control value sequences in the form of
+gdouble values. The control bindings map them to the value range and
+type of the bound property.
+
+One control-source can be attached to one or more properties at the same
+time. If it is attached multiple times, then each control-binding will
+scale and convert the control values to the target property type and
+range.
+
+One can create complex control-curves by using a
+GstInterpolationControlSource. This enables the classic user-editable
+control-curve (often seen in audio/video editors). Another way is to use
+computed control curves. GstLFOControlSource can generate various
+repetitive signals. Those can be made more complex by chaining the
+control sources. 
One can attach another control-source to e.g. modulate
+the frequency of the first GstLFOControlSource.
+
+In most cases GstControlBindingDirect will be the binding to be used.
+Other control bindings are there to handle special cases, such as having
+1-4 control-sources and combining their values into a single guint to
+control an rgba-color property.
+
+## TODO
+
+* control-source value ranges
+  - control sources should ideally emit values between \[0.0 and 1.0\]
+  - right now lfo-control-sources emit values between \[-1.0 and 1.0\]
+  - we can make control-sources announce that or fix it in a
+    lfo2-control-source
+
+* ranged-control-binding
+  - it might be a nice thing to have a control-binding that has scale
+    and offset properties
+  - when attaching a control-source to e.g. volume, one needs to be
+    aware that the values go from \[0.0 to 4.0\]
+  - we can also have a "mapping-mode"={AS\_IS, TRANSFORMED} on
+    direct-control-binding and two extra properties that are used in
+    TRANSFORMED mode
+
+* control-setup descriptions
+  - it would be nice to have a way to parse a textual control-setup
+    description. This could be used in gst-launch and in presets. It
+    needs to be complemented with a formatter (for the preset storage
+    or e.g. for debug logging).
+  - this could be function-style:
+    direct(control-source=lfo(waveform=*sine*,offset=0.5)) or
+    gst-launch style (looks weird):
+    lfo wave=sine offset=0.5 \! direct .control-source
diff --git a/markdown/design/conventions.md b/markdown/design/conventions.md
new file mode 100644
index 0000000000..b1dcc57208
--- /dev/null
+++ b/markdown/design/conventions.md
@@ -0,0 +1,78 @@
+# Documentation conventions
+
+Due to the potential for exponential growth, several abbreviating
+conventions will be used throughout this documentation. These
+conventions have grown primarily from extremely in-depth discussions of
+the architecture in IRC. This has verified the safety of these
+conventions, if used properly. 
There are no known namespace conflicts as
+long as context is rigorously observed.
+
+## Object classes
+
+Since everything starts with Gst, we will generally refer to objects by
+the shorter name, i.e. Element or Pad. These names will always have
+their first letter capitalized.
+
+## Function names
+
+Within the context of a given object, functions defined in that object’s
+header and/or source file will have their object-specific prefix
+stripped. For instance, gst\_element\_add\_pad() would be referred to as
+simply \_add\_pad(). Note that the trailing parentheses should always be
+present, but sometimes may not be. A prefixing underscore (\_) will
+always tell you it’s a function, however, regardless of the presence or
+absence of the trailing parentheses.
+
+## defines and enums
+
+Values and macros defined as enums and preprocessor macros will be
+referred to in all capitals, as per their definition. This includes
+object flags and element states, as well as general enums. Examples are
+the states NULL, READY, PLAYING, and PAUSED; the element flag
+LOCKED\_STATE, and state return values SUCCESS, FAILURE, and ASYNC.
+Where there is a prefix, as in the element flags, it is usually dropped
+and implied. Note however that element flags should be cross-checked
+with the header, as there are currently two conventions in use: with and
+without \_FLAGS\_ in the middle.
+
+## Drawing conventions
+
+When drawing pictures the following conventions apply:
+
+### objects
+
+Objects are drawn with a box like:
+
+    +------+
+    |      |
+    +------+
+
+### pointers
+
+a pointer to an object.
+
+```
+     +-----+
+*--->|     |
+     +-----+
+```
+
+an invalid pointer, this is a pointer that should not be used. 
+
+    *-//->
+
+### elements
+
+```
+    +----------+
+    |   name   |
+sink           src
+    +----------+
+```
+
+### pad links
+
+    -----+    +---
+         |    |
+       src--sink
+    -----+    +---
diff --git a/markdown/design/draft-klass.md b/markdown/design/draft-klass.md
new file mode 100644
index 0000000000..5bd76e4b07
--- /dev/null
+++ b/markdown/design/draft-klass.md
@@ -0,0 +1,215 @@
+# Element Klass definition
+
+## Purpose
+
+Applications should be able to retrieve elements from the registry of
+existing elements based on specific capabilities or features of the
+element.
+
+A playback application might want to retrieve all the elements that can
+be used for visualisation, for example, or a video editor might want to
+select all video effect filters.
+
+The topic of defining the klass of elements should be based on use
+cases.
+
+A list of classes that are used in an installation can be generated
+using:
+
+    gst-inspect-1.0 -a | grep -ho Class:.* | cut -c8- | sed "s/\//\\n/g" | sort | uniq
+
+## Proposal
+
+The GstElementDetails structure contains a field named klass that is a
+pointer to a string describing the element type.
+
+In this document we describe the format and contents of the string.
+Elements should adhere to this specification, although it is not
+enforced, to allow for wild (application specific) customisation. 
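Since a klass string is a '/'-separated list of keywords, the registry filtering described in the use cases later in this document boils down to keyword matching. A Python sketch under that assumption (`klass_contains` and the inline registry are hypothetical, for illustration only):

```python
def klass_contains(klass, *required):
    """True if every required keyword appears in the '/'-separated klass."""
    keywords = set(klass.split("/"))
    return all(word in keywords for word in required)

# tiny stand-in for the element registry, using klass strings
# from the examples in this document
registry = {
    "vorbisenc": "Encoder/Audio",
    "videobox": "Effect/Video",
    "vertigotv": "Effect/Video",
    "xvimagesink": "Sink/Video/Device",
}

# e.g. "get a list of all elements implementing a video effect":
video_effects = sorted(name for name, klass in registry.items()
                       if klass_contains(klass, "Effect", "Video"))
assert video_effects == ["vertigotv", "videobox"]
```
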
+
+### string format
+
+    <keyword>['/'<keyword>[/<keyword>]*[/<keyword>]*
+
+### examples:
+
+    apedemux : Extracter/Metadata
+    audiotestsrc : Source/Audio
+    autoaudiosink : Sink/Audio/Device
+    cairotimeoverlay : Mixer/Video/Text
+    dvdec : Decoder/Video
+    dvdemux : Demuxer
+    goom : Converter/Audio/Video
+    id3demux : Extracter/Metadata
+    udpsrc : Source/Network/Protocol/Device
+    videomixer : Mixer/Video
+    videoconvert : Filter/Video (intended use to convert video with as little
+      visible change as possible)
+    vertigotv : Effect/Video (intended use is to change the video)
+    volume : Effect/Audio (intended use is to change the audio data)
+    vorbisdec : Decoder/Audio
+    vorbisenc : Encoder/Audio
+    oggmux : Muxer
+    adder : Mixer/Audio
+    videobox : Effect/Video
+    alsamixer : Control/Audio/Device
+    audioconvert : Filter/Audio
+    audioresample : Filter/Audio
+    xvimagesink : Sink/Video/Device
+    navseek : Filter/Debug
+    decodebin : Decoder/Demuxer
+    level : Filter/Analyzer/Audio
+    tee : Connector/Debug
+
+### open issues:
+
+  - how to differentiate physical devices from logical ones? 
+
+      autoaudiosink : Sink/Audio/Device
+      alsasink : Sink/Audio/Device
+
+## Use cases
+
+- Get a list of all elements implementing a video effect (pitivi):
+
+    klass.contains (Effect & Video)
+
+- Get a list of muxers (pitivi):
+
+    klass.contains (Muxer)
+
+- Get a list of video encoders (pitivi):
+
+    klass.contains (Encoder & Video)
+
+- Get a list of all audio/video visualisations (totem):
+
+    klass.contains (Visualisation)
+
+- Get a list of all decoders/demuxers/metadata parsers/vis (playbin):
+
+    klass.contains (Visualisation | Demuxer | Decoder | (Extractor & Metadata))
+
+- Get a list of elements that can capture from an audio device
+(gst-properties):
+
+    klass.contains (Source & Audio & Device)
+
+  - filters out audiotestsrc, since it is not a device
diff --git a/markdown/design/draft-metadata.md b/markdown/design/draft-metadata.md
new file mode 100644
index 0000000000..be5f29fb7d
--- /dev/null
+++ b/markdown/design/draft-metadata.md
@@ -0,0 +1,194 @@
+# Metadata
+
+This draft recaps the current metadata handling in GStreamer and
+proposes some additions.
+
+## Supported Metadata standards
+
+The paragraphs below list supported native metadata standards sorted by
+type and then in alphabetical order. Some standards have been extended
+to support additional metadata. GStreamer already supports all of those
+to some extent. This is shown in the table below as either \[--\],
+\[r-\], \[-w\] or \[rw\] depending on read/write support (08.Feb.2010). 
+ +### Audio +- mp3 + * ID3v2: \[rw] + * http://www.id3.org/Developer_Information + * ID3v1: [rw] + * http://www.id3.org/ID3v1 + * XMP: \[--] (inside ID3v2 PRIV tag of owner XMP) + * http://www.adobe.com/devnet/xmp/ +- ogg/vorbis + * vorbiscomment: \[rw] + * http://www.xiph.org/vorbis/doc/v-comment.html + * http://wiki.xiph.org/VorbisComment +- wav + * LIST/INFO chunk: \[rw] + * http://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/RIFF.html#Info + * http://www.kk.iij4u.or.jp/~kondo/wave/mpidata.txt + * XMP: \[--] + * http://www.adobe.com/devnet/xmp/ + +### Video +- 3gp + * {moov,trak}.udta: \[rw] + * http://www.3gpp.org/ftp/Specs/html-info/26244.htm + * ID3V2: \[--] + * http://www.3gpp.org/ftp/Specs/html-info/26244.htm + * http://www.mp4ra.org/specs.html#id3v2 +- avi + * LIST/INFO chunk: \[rw] + * http://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/RIFF.html#Info + * http://www.kk.iij4u.or.jp/~kondo/wave/mpidata.txt + * XMP: \[--] (inside "_PMX" chunk) + * http://www.adobe.com/devnet/xmp/ +- asf + * ??: + * XMP: \[--] + * http://www.adobe.com/devnet/xmp/ +- flv \[--] + * XMP: (inside onXMPData script data tag) + * http://www.adobe.com/devnet/xmp/ +- mkv + * tags: \[rw] + * http://www.matroska.org/technical/specs/tagging/index.html +- mov + * XMP: \[--] (inside moov.udta.XMP_ box) + * http://www.adobe.com/devnet/xmp/ +- mp4 + * {moov,trak}.udta: \[rw] + * http://standards.iso.org/ittf/PubliclyAvailableStandards/c051533_ISO_IEC_14496-12_2008.zip + * moov.udta.meta.ilst: \[rw] + * http://atomicparsley.sourceforge.net/ + * http://atomicparsley.sourceforge.net/mpeg-4files.html + * ID3v2: \[--] + * http://www.mp4ra.org/specs.html#id3v2 + * XMP: \[--] (inside UUID box) + * http://www.adobe.com/devnet/xmp/ +- mxf + * ?? 
+
+### Images
+- gif
+  * XMP: \[--]
+    * http://www.adobe.com/devnet/xmp/
+- jpg
+  * jif: \[rw] (only comments)
+  * EXIF: \[rw] (via metadata plugin)
+    * http://www.exif.org/specifications.html
+  * IPTC: \[rw] (via metadata plugin)
+    * http://www.iptc.org/IPTC4XMP/
+  * XMP: \[rw] (via metadata plugin)
+    * http://www.adobe.com/devnet/xmp/
+- png
+  * XMP: \[--]
+    * http://www.adobe.com/devnet/xmp/
+
+### Further links:
+
+http://age.hobba.nl/audio/tag_frame_reference.html
+http://wiki.creativecommons.org/Tracker_CC_Indexing
+
+## Current Metadata handling
+
+When reading files, demuxers or parsers extract the metadata. It is then
+sent as a GST\_EVENT\_TAG event to downstream elements. When a sink
+element receives a tag event, it will post a GST\_MESSAGE\_TAG message
+on the bus with the contents of the tag event.
+
+Elements receiving GST\_EVENT\_TAG events can mangle them, mux them into
+the buffers they send or just pass them through. Usually it is muxers
+that format the tag data into the form required by the format they mux.
+Such elements would also implement the GstTagSetter interface to receive
+tags from the application.
+
+```
+        +----------+
+        |  demux   |
+    sink          src --> GstEvent(tag) over GstPad to downstream element
+        +----------+
+
+      method call over GstTagSetter interface from application
+        |
+        v
+        +----------+
+        |   mux    |
+    --> sink      src
+        +----------+
+     GstEvent(tag) over GstPad from upstream element
+```
+
+The data used in all those interfaces is GstTagList. It is based on a
+GstStructure which is like a hash table with differently typed entries.
+The key is always a string/GQuark. Many keys are predefined in GStreamer
+core. More keys are defined in gst-plugins-base/gst-libs/gst/tag/tag.h.
+If elements and applications use predefined types, it is possible to
+transcode a file from one format into another while preserving all known
+and mapped metadata. 
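The tag list described above behaves like a string-keyed map with typed entries that can be combined under different policies. A toy Python model of that behaviour (the merge modes are loosely modelled on GStreamer's tag merge semantics; none of this is the real API):

```python
# merge modes, loosely modeled on GStreamer tag merge modes
APPEND, REPLACE, KEEP = range(3)

def merge_tag_lists(into, from_, mode):
    """Toy tag-list merge: tag lists are string-keyed mappings whose
    values are lists of typed entries."""
    result = {key: list(values) for key, values in into.items()}
    for key, values in from_.items():
        if key not in result:
            result[key] = list(values)
        elif mode == APPEND:
            result[key].extend(values)
        elif mode == REPLACE:
            result[key] = list(values)
        # KEEP: existing entries win, do nothing
    return result

demuxed = {"title": ["Some Song"], "bitrate": [128000]}
user_set = {"title": ["Corrected Title"], "artist": ["Someone"]}
merged = merge_tag_lists(demuxed, user_set, REPLACE)
assert merged["title"] == ["Corrected Title"]
assert merged["artist"] == ["Someone"]
assert merged["bitrate"] == [128000]   # untouched keys survive the merge
```
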
+
+## Issues
+
+### Unknown/Unmapped metadata
+
+Right now GStreamer can lose metadata when transcoding or remuxing
+content. This can happen because we don’t map all metadata fields to
+generic ones.
+
+We should probably also add the whole metadata blob to the GstTagList.
+We would need a GST\_TAG\_SYSTEM\_xxx define (e.g.
+GST\_TAG\_SYSTEM\_ID3V2) for each standard. The content is not printable
+and should be treated as binary if not known. The tag is not mergeable -
+call gst\_tag\_register() with GstTagMergeFunc=NULL. Also the tag data
+is only useful for upstream elements, not for the application.
+
+A muxer would first scan a taglist for known system tags. Unknown tags
+are ignored, as they are today. It would first populate its own metadata
+store with the entries from the system tag and then update the entries
+with the data in normal tags.
+
+Below is an initial list of tag systems:
+
+  - ID3V1 - GST\_TAG\_SYSTEM\_ID3V1
+  - ID3V2 - GST\_TAG\_SYSTEM\_ID3V2
+  - RIFF\_INFO - GST\_TAG\_SYSTEM\_RIFF\_INFO
+  - XMP - GST\_TAG\_SYSTEM\_XMP
+
+We would basically need this for each container format.
+
+See also
+
+### Lost metadata
+
+A slightly different case from the previous one is when an application
+sets a GstTagList on a pipeline. Right now, elements consuming tags do
+not report which tags have been consumed. Especially when using elements
+that make metadata persistent, we have no means of knowing which of the
+tags made it into the target stream and which were not serialized.
+Ideally the application would like to know which kind of metadata is
+accepted by a pipeline to reflect that in the UI.
+
+Although in practice it is elements implementing GstTagSetter that
+are the ones that serialize, this does not have to be so. Otherwise, we
+could add a means to that interface, where elements add the tags they
+have serialized. The application could build one list from all the tag
+messages and then query all the serialized tags from tag-setters. The
+delta tells what has not been serialized. 
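Computing that delta is then a plain set difference; a minimal sketch (hypothetical helper, illustration only):

```python
def unserialized_tags(posted_tags, serialized_by_setters):
    """Tags the application posted but no tag-setter reported as
    serialized; these did not make it into the target stream."""
    return sorted(set(posted_tags) - set(serialized_by_setters))

posted = {"title", "artist", "geo-location"}
serialized = {"title", "artist"}   # gathered from all tag-setters
assert unserialized_tags(posted, serialized) == ["geo-location"]
```
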
+
+A different approach would be to query the list of supported tags in
+advance. This could be a query (GST\_QUERY\_TAG\_SUPPORT). The query
+result could be a list of elements and their tags. As a convenience we
+could flatten the list of tags for the top-level element (if the query
+was sent to a bin) and add that.
+
+### Tags are per Element
+
+In many cases we want tags per stream. Even metadata standards like
+mp4/3gp metadata support that. Right now GST\_MESSAGE\_SRC(tags) is the
+element. We tried changing that to the pad, but that broke applications.
+Also we miss the symmetric functionality in GstTagSetter. This interface
+is usually implemented by elements.
+
+### Open bugs
+
+Add GST\_TAG\_MERGE\_REMOVE
+
diff --git a/markdown/design/draft-push-pull.md b/markdown/design/draft-push-pull.md
new file mode 100644
index 0000000000..88b65c3942
--- /dev/null
+++ b/markdown/design/draft-push-pull.md
@@ -0,0 +1,117 @@
+# DRAFT push-pull scheduling
+
+Status
+
+  DRAFT. DEPRECATED by better current implementation.
+
+Observations:
+
+  - The main scheduling mode is chain based scheduling where the source
+    element pushes buffers through the pipeline to the sinks. This is
+    called the push model.
+
+  - In the pull model, some plugin pulls buffers from an upstream peer
+    element before consuming and/or pushing them further downstream.
+
+Usages of pull based scheduling:
+
+  - sinks that pull in data, possibly at fixed intervals driven by some
+    hardware device (audiocard, videodevice, …).
+
+  - Efficient random access to resources. Especially useful for certain
+    types of demuxers.
+
+API for pull-based scheduling:
+
+  - an element that wants to pull data from a peer element needs to call
+    the pull\_range() method. This method requires an offset and a
+    size. It is possible to leave the offset and size at -1, indicating
+    that any offset or size is acceptable; this of course removes the
+    advantages of getrange based scheduling. 
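The pull\_range() contract sketched above — an explicit offset and size, either of which may be left at -1 — can be modelled with a toy source in Python (illustrative only; the real method lives on pads and deals in GstBuffers):

```python
ANY = -1  # illustrative: -1 means "any offset" / "any size"

class RandomAccessSource:
    """Toy pull_range(): offset -1 means 'current position', size -1
    means 'whatever amount the source prefers'."""

    def __init__(self, data, preferred_chunk=4):
        self._data = data
        self._pos = 0
        self._preferred = preferred_chunk

    def pull_range(self, offset=ANY, size=ANY):
        if offset != ANY:
            self._pos = offset       # caller requested random access
        if size == ANY:
            size = self._preferred   # source picks the amount
        chunk = self._data[self._pos:self._pos + size]
        self._pos += len(chunk)
        return chunk

src = RandomAccessSource(b"abcdefghij")
assert src.pull_range(offset=4, size=3) == b"efg"  # demuxer-style seek+read
assert src.pull_range() == b"hij"                  # sequential, source-sized
```
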
+
+Types of pull based scheduling:
+
+  - some sources can do random access (file source, …)
+
+  - some sources can read a random number of bytes but not at a random
+    offset. (audio cards, …) Audio cards using a ringbuffer can however
+    do random access in the ringbuffer.
+
+  - some sources can do random access in a range of bytes but not in
+    another range. (a caching network source).
+
+  - some sources can only provide fixed-size data without an offset.
+    (video sources, …)
+
+Current scheduling decision:
+
+  - core selects scheduling type starting on sinks by looking at
+    existence of loop function on sinkpad and calling
+    \_check\_pull\_range() on the source pad to activate the pads in
+    push/pull mode.
+
+  - element proxies pull mode pad activation to peer pad.
+
+Problems:
+
+  - core makes a tough decision without knowing anything about the
+    element. Some elements are able to deal with a pull\_range() without
+    offset while others need full random access.
+
+Requirements:
+
+  - element should be able to select scheduling method itself based on
+    how it can use the peer element pull\_range. This includes if the
+    peer can operate with or without offset/size. This also means that
+    the core does not need to select the scheduling method anymore and
+    allows for more efficient scheduling methods adjusted for the
+    particular element.
+
+Proposition:
+
+  - pads are activated without the core selecting a method.
+
+  - a pad queries the scheduling mode of the peer pad. This query is
+    rather fine-grained and allows the element to know if the peer
+    supports offsets and sizes in the get\_range function. A proposition
+    for the query is outlined in draft-query.txt.
+
+  - pad selects scheduling mode and informs the peer pad of this
+    decision.
+
+Things to query:
+
+  - pad can do real random access (downstream peer can ask for offset
+    \!= -1)
+
+  - min offset
+
+  - suggest sequential access
+
+  - max offset
+
+  - align: all offsets should be aligned with this value. 
+
+  - pad can give ranges from A to B length (peer can ask for
+    A <= length <= B)
+
+  - min length
+
+  - suggested length
+
+  - max length
+
+Use cases:
+
+  - An audio source can provide random access to the samples queued in
+    its DMA buffer; it however suggests the sequential access method. An
+    audio source can provide a random number of samples but prefers
+    reading from the hardware using a fixed segment size.
+
+  - A caching network source would suggest sequential access but is
+    seekable in the cached region. Applications can query for the
+    already downloaded portion and update the GUI; a seek can be done in
+    that area.
+
+  - a live video source can only provide buffers sequentially. It
+    exposes offsets as -1; lengths are also -1.
diff --git a/markdown/design/draft-tagreading.md b/markdown/design/draft-tagreading.md
new file mode 100644
index 0000000000..ea07d323e1
--- /dev/null
+++ b/markdown/design/draft-tagreading.md
@@ -0,0 +1,107 @@
+# Tagreading
+
+The tagreading (metadata reading) use case for mediacenter applications
+is not too well supported by the current GStreamer architecture. It uses
+demuxers on the files, which generally takes too long (building
+seek-index, prerolling). What we want is specialized elements / parsing
+modes that just do the tag-reading.
+
+The idea is to define a TagReadIFace. Tag-demuxers, classic demuxers and
+decoder plugins can just implement the interface or provide a separate
+element that implements the interface.
+
+In addition we need a tagreadbin, that, similar to decodebin, does a
+typefind and then plugs the right tagread element(s). It will only look
+at elements that implement the interface. It can plug several if
+possible.
+
+For optimal performance typefind and tagread could share the list of
+already peeked buffers (a queue element after sink, but that would
+change pull to push).
+
+The plan is that applications can do the following: pipeline = "filesrc
+\! 
tagbin" for (file\_path in list\_of\_files) { +filesrc.location=file\_path pipeline.set\_state(PAUSED) // wait for TAGS +& EOS pipeline.set\_state(READY) } + + - it should have one sinkpad of type ANY + + - it should send EOS when all metadata has been read "done"-signal + from all tagread-elements + + - special tagread-elements should have RANK\_NONE to be not + autoplugged by decodebin + +## Interface + + - gboolean iface property "tag-reading" Switches the element to + tagreading mode. Needed if normal element implement that behaviour. + Elements will skip parsing unneeded data, don’t build a seeking + index, etc. + + - signal "done" Equivalent of EOS. + +## Use Cases + + - mp3 with id3- and apetags + + - plug id3demux \! apedemux + + - avi with vorbis audio + + - plug avidemux + + - new pad → audio/vorbis + + - plug vorbisdec or special vorbiscomment reader + +## Additional Thoughts + + - would it make sense to have 2-phase tag-reading (property on tagbin + and/or tagread elements) + + - 1st phase: get tag-data that are directly embedded in the data + + - 2nd phase: get tag-data that has to be generated + + - e.g. album-art via web, video-thumbnails + + - what about caching backends + + - it would be good to allow applications to supply tagbin with a + tagcache- object instance. Whenever tagbin gets a *location* to + tagread, it consults the cache first. 
whenever there is a cache-miss + it will tag-read and then store in the + cache + +``` c + GstTagList *gst_tag_cache_load_tag_data (GstTagCache *self, const gchar *uri); + gst_tag_cache_store_tag_data (GstTagCache *self, const gchar *uri, GstTagList *tags); +``` + +## Tests + + - write a generic test for parsers/demuxers to ensure they send tags + until they reached PAUSED (elements need to parse file for + prerolling anyway): set pipeline to paused, check for tags, set to + playing, error out if tags come after paused + +## Code Locations + + - tagreadbin → gst-plugins-base/gst/tagread + + - tagreaderiface → gst-plugins-base/gst-libs/gst/tag + +## Reuse + + - ogg : gst-plugins-base/ext/ogg + + - avi : gst-plugins-good/gst/avi + + - mp3 : gst-plugins-good/gst/id3demux + + - wav : gst-plugins-good/gst/wavparse + + - qt : gst-plugins-bad/gst/qtdemux diff --git a/markdown/design/dynamic.md b/markdown/design/dynamic.md new file mode 100644 index 0000000000..301460d475 --- /dev/null +++ b/markdown/design/dynamic.md @@ -0,0 +1,14 @@ +# Dynamic pipelines + +This document describes many use cases for dynamically constructing and +manipulating a running or paused pipeline and the features provided by +GStreamer. + +When constructing dynamic pipelines it is important to understand the +following features of gstreamer: + + - pad blocking + + - playback segments. + + - streaming vs application threads. diff --git a/markdown/design/element-sink.md b/markdown/design/element-sink.md new file mode 100644 index 0000000000..9e6f31def0 --- /dev/null +++ b/markdown/design/element-sink.md @@ -0,0 +1,290 @@ +# Sink elements + +Sink elements consume data and normally have no source pads. + +Typical sink elements include: + + - audio/video renderers + + - network sinks + + - filesinks + +Sinks are harder to construct than other element types as they are +treated specially by the GStreamer core. 
+
+## state changes
+
+A sink always returns ASYNC from the state change to PAUSED, this
+includes a state change from READY→PAUSED and PLAYING→PAUSED. The reason
+for this is that this way we can detect when the first buffer or event
+arrives in the sink when the state change completes.
+
+A sink should block on the first EOS event or buffer received in the
+READY→PAUSED state before committing the state to PAUSED.
+
+FLUSHING events have to be handled out of sync with the buffer flow and
+take no part in the preroll procedure.
+
+Events other than EOS do not complete the preroll stage.
+
+## sink overview
+
+ - TODO: PREROLL\_LOCK can be removed and we can safely use the STREAM\_LOCK.
+
+```
+    # Commit the state. We return TRUE if we can continue
+    # streaming, FALSE in the case we go to a READY or NULL state.
+    # if we go to PLAYING, we don't need to block on preroll.
+    commit
+    {
+      LOCK
+      switch (pending)
+        case PLAYING:
+          need_preroll = FALSE
+          break
+        case PAUSED:
+          break
+        case READY:
+        case NULL:
+          return FALSE
+        case VOID:
+          return TRUE
+
+      # update state
+      state = pending
+      next = VOID
+      pending = VOID
+      UNLOCK
+      return TRUE
+    }
+
+    # Sync an object. We have to wait for the element to reach
+    # the PLAYING state before we can wait on the clock.
+    # Some items do not need synchronisation (most events) so the
+    # get_times method returns FALSE (not syncable)
+    # need_preroll indicates that we are not in the PLAYING state
+    # and therefore need to commit and potentially block on preroll
+    # if our clock_wait got interrupted we commit and block again.
+    # The reason for this is that the current item being rendered is
+    # not yet finished and we can use that item to finish preroll. 
+ do_sync (obj) + { + # get timing information for this object + syncable = get_times (obj, &start, &stop) + if (!syncable) + return OK; + again: + while (need_preroll) + if (need_commit) + need_commit = FALSE + if (!commit) + return FLUSHING + + if (need_preroll) + # release PREROLL_LOCK and wait. prerolled can be observed + # and will be TRUE + prerolled = TRUE + PREROLL_WAIT (releasing PREROLL_LOCK) + prerolled = FALSE + if (flushing) + return FLUSHING + + if (valid (start || stop)) + PREROLL_UNLOCK + end_time = stop + ret = wait_clock (obj,start) + PREROLL_LOCK + if (flushing) + return FLUSHING + # if the clock was unscheduled, we redo the + # preroll + if (ret == UNSCHEDULED) + goto again + } + + # render a prerollable item (EOS or buffer). It is + # always called with the PREROLL_LOCK helt. + render_object (obj) + { + ret = do_sync (obj) + if (ret != OK) + return ret; + + # preroll and syncing done, now we can render + render(obj) + } + | # sinks that sync on buffer contents do like this + | while (more_to_render) + | ret = render + | if (ret == interrupted) + | prerolled = TRUE + render (buffer) ----->| PREROLL_WAIT (releasing PREROLL_LOCK) + | prerolled = FALSE + | if (flushing) + | return FLUSHING + | + + # queue a prerollable item (EOS or buffer). It is + # always called with the PREROLL_LOCK helt. + # This function will commit the state when receiving the + # first prerollable item. + # items are then added to the rendering queue or rendered + # right away if no preroll is needed. 
queue (obj, prerollable)
{
  if (need_preroll)
    if (prerollable)
      queuelen++

    # first item in the queue while we need preroll
    # will complete state change and call preroll
    if (queuelen == 1)
      preroll (obj)
      if (need_commit)
        need_commit = FALSE
        if (!commit)
          return FLUSHING

    # then see if we need more preroll items before we
    # can block
    if (need_preroll)
      if (queuelen <= maxqueue)
        queue.add (obj)
        return OK

  # now clear the queue and render each item before
  # rendering the current item.
  while (queue.hasItem)
    render_object (queue.remove())

  render_object (obj)
  queuelen = 0
}

# various event functions
event
  EOS:
    # events must complete preroll too
    STREAM_LOCK
    PREROLL_LOCK
    if (flushing)
      return FALSE
    ret = queue (event, TRUE)
    if (ret == FLUSHING)
      return FALSE
    PREROLL_UNLOCK
    STREAM_UNLOCK
    break
  SEGMENT:
    # the segment must be used to clip incoming
    # buffers. They then go into the queue as non-prerollable
    # items used for syncing the buffers
    STREAM_LOCK
    PREROLL_LOCK
    if (flushing)
      return FALSE
    set_clip
    ret = queue (event, FALSE)
    if (ret == FLUSHING)
      return FALSE
    PREROLL_UNLOCK
    STREAM_UNLOCK
    break
  FLUSH_START:
    # set flushing and unblock all that is waiting
    event ----> subclasses can interrupt render
    PREROLL_LOCK
    flushing = TRUE
    unlock_clock
    PREROLL_SIGNAL
    PREROLL_UNLOCK
    STREAM_LOCK
    lost_state
    STREAM_UNLOCK
    break
  FLUSH_END:
    # unset flushing and clear all data and eos
    STREAM_LOCK
    event
    PREROLL_LOCK
    queue.clear
    queuelen = 0
    flushing = FALSE
    eos = FALSE
    PREROLL_UNLOCK
    STREAM_UNLOCK
    break

# The chain function checks that the buffer falls within the
# configured segment and queues the buffer for preroll and
# rendering
chain
  STREAM_LOCK
  PREROLL_LOCK
  if (flushing)
    return FLUSHING
  if (clip)
    queue (buffer, TRUE)
  PREROLL_UNLOCK
  STREAM_UNLOCK

state
  switch (transition)
    READY_PAUSED:
      # no datapassing is
      going on so we always return ASYNC
      ret = ASYNC
      need_commit = TRUE
      eos = FALSE
      flushing = FALSE
      need_preroll = TRUE
      prerolled = FALSE
      break
    PAUSED_PLAYING:
      # We take the preroll lock. We can only do this if the
      # chain function is doing some clock sync, we are
      # waiting for preroll, or the chain function is not being called.
      PREROLL_LOCK
      if (prerolled || eos)
        ret = OK
        need_commit = FALSE
        need_preroll = FALSE
        if (eos)
          post_eos
        else
          PREROLL_SIGNAL
      else
        need_preroll = TRUE
        need_commit = TRUE
        ret = ASYNC
      PREROLL_UNLOCK
      break
    PLAYING_PAUSED:
      ---> subclass can interrupt render
      # We take the preroll lock. We can only do this if the
      # chain function is doing some clock sync
      # or the chain function is not being called.
      PREROLL_LOCK
      need_preroll = TRUE
      unlock_clock
      if (prerolled || eos)
        ret = OK
      else
        ret = ASYNC
      PREROLL_UNLOCK
      break
    PAUSED_READY:
      ---> subclass can interrupt render
      # We take the preroll lock. Set to flushing and unlock
      # everything. This should exit the chain functions and stop
      # streaming.
      PREROLL_LOCK
      flushing = TRUE
      unlock_clock
      queue.clear
      queuelen = 0
      PREROLL_SIGNAL
      ret = OK
      PREROLL_UNLOCK
      break
```
diff --git a/markdown/design/element-source.md b/markdown/design/element-source.md
new file mode 100644
index 0000000000..84e9060e3d
--- /dev/null
+++ b/markdown/design/element-source.md
@@ -0,0 +1,132 @@
# Source elements

A source element is an element that provides data to the pipeline. It
typically does not have any sink (input) pads.

Typical source elements include:

 - file readers

 - network elements (live or not)

 - capture elements (video/audio/…)

 - generators (signals/video/audio/…)

## Live sources

A source is said to be a live source when it has the following property:

 - temporarily stopping reading from the source causes data to be lost.

In general when this property holds, the source also produces data at a
fixed rate. Most sources have a limit on the rate at which they can
deliver data, which might be faster or slower than the consumption rate.
This property on its own, however, does not make them a live source.

Let’s look at some example sources.

 - file readers: you can PAUSE without losing data. There is however a
   limit to how fast you can read from this source. This limit is
   usually much higher than the consumption rate. In some cases it
   might be slower (an NFS share, for example), in which case you might
   need to use some buffering (see [buffering](design/buffering.md)).

 - HTTP network element: you can PAUSE without data loss. Depending on
   the available network bandwidth, the consumption rate might be higher
   than the production rate, in which case buffering should be used (see
   [buffering](design/buffering.md)).

 - audio source: pausing the audio capture will lead to lost data. This
   source is therefore definitely live. In addition, an audio source
   will produce data at a fixed rate (the samplerate). Also, depending
   on the buffersize, this source will introduce a latency (see
   [latency](design/latency.md)).

 - udp network source: pausing the receiving part will lead to lost
   data. This source is therefore a live source. Also, in a typical case
   the udp packets will be received at a certain rate, which might be
   difficult to guess because of network jitter. This source does not
   necessarily introduce latency on its own.

 - dvb source: pausing this element will lead to data loss; it’s a live
   source similar to a UDP source.

## Source types

A source element can operate in three ways:

 - it is fully seekable, this means that random access can be performed
   on it in an efficient way (a file reader, …). This also typically
   means that the source is not live.

 - data can be obtained from it with a variable size.
   This means that
   the source can give N bytes of data. An example is an audio source.
   A video source always provides the same amount of data (one video
   frame). Note that this is not a fully seekable source.

 - it is a live source, see above.

When writing a source, one has to look at how the source can operate to
decide on the scheduling methods to implement on the source.

 - fully seekable sources implement a getrange function on the source
   pad.

 - sources that can give N bytes but cannot do seeking also implement a
   getrange function but state that they cannot do random access.

 - sources that are purely live sources implement a task to push out
   data.

Any source that has a getrange function must also implement a push-based
scheduling mode. In this mode the source starts a task that gets N bytes
and pushes them out. Whenever possible, the peer element will select the
getrange-based scheduling method of the source, though.

A source with a getrange function must activate itself in the pad
activate function. This is needed because the downstream peer element
will decide on and activate the source element in its state change
function before the source’s state change function is called.

## Source base classes

GstBaseSrc:

This base class provides an implementation of a random access source and
is very well suited for file-reader-like sources.

GstPushSrc:

Base class for block-based sources. This class is mostly useful for
elements that cannot do random access, or can do so only very slowly.
The source usually prefers to push out a fixed-size buffer.

Classes extending this base class will usually be scheduled in a
push-based mode. If the peer agrees to operate without offsets and within
the limits of the allowed block size, this class can operate in
getrange-based mode automatically.

The subclass should extend the methods from the base class in addition to
the create method.
If the source is seekable, it needs to override
GstBaseSrc::event() in addition to GstBaseSrc::is\_seekable() in order
to retrieve the seek offset, which is the offset of the next buffer to
be requested.

Flushing, scheduling and sync are all handled by this base class.

## Timestamps

A non-live source should timestamp the buffers it produces starting from
0. If it is not possible to timestamp every buffer (filesrc), the source
is allowed to only timestamp the first buffer (as 0).

Live sources only produce data in the PLAYING state, when the clock is
running. They should timestamp each buffer they produce with the current
running\_time of the pipeline, which is expressed as:

    absolute_time - base_time

where absolute\_time is the time obtained from the global pipeline clock
with gst\_clock\_get\_time() and base\_time is the time of that clock
when the pipeline was last set to PLAYING.
diff --git a/markdown/design/element-transform.md b/markdown/design/element-transform.md
new file mode 100644
index 0000000000..2f60069a90
--- /dev/null
+++ b/markdown/design/element-transform.md
@@ -0,0 +1,327 @@
# Transform elements

Transform elements transform input buffers to output buffers based on
the sink and source caps.

An important requirement for a transform is that the output caps are
completely defined by the input caps and vice versa. This means that a
typical decoder element can NOT be implemented with a transform element,
because the output caps, such as the width and height of the
decompressed video frame, are encoded in the stream and thus not defined
by the input caps.

Typical transform elements include:

 - audio converters (audioconvert, audioresample, …)

 - video converters (colorspace, videoscale, …)

 - filters (capsfilter, volume, colorbalance, …)

The implementation of the transform element has to take care of the
following things:

 - efficient negotiation both up and downstream

 - efficient buffer alloc and other buffer management

Some transform elements can operate in different modes:

 - passthrough (no changes are done on the input buffers)

 - in-place (changes made directly to the incoming buffers without
   requiring a copy or new buffer allocation)

 - metadata changes only

Depending on the mode of operation, the buffer allocation strategy might
change.

The transform element should at any point be able to renegotiate sink
and src caps as well as change the operation mode.

In addition, the transform element will typically take care of the
following things as well:

 - flushing, seeking

 - state changes

 - timestamping; this is typically done by copying the input timestamps
   to the output buffers, but subclasses should be able to override
   this.

 - QoS, avoiding calls to the subclass transform function

 - handling scheduling issues such as push- and pull-based operation.

In the next sections, we will describe the behaviour of the transform
element in each of the above use cases. We focus mostly on the buffer
allocation strategies and caps negotiation.

## Processing

A transform has 2 main processing functions:

- **`transform()`**: Transform the input buffer to the output buffer. The
output buffer is guaranteed to be writable and different from the input buffer.

- **`transform_ip()`**: Transform the input buffer in-place. The input buffer
is writable and at least as large as the output buffer.
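The contract of these two hooks can be sketched with a toy model (plain
Python with hypothetical names, not the actual GstBaseTransform vmethod
signatures; a "buffer" is just a list of samples and the transform is an
illustrative gain of 2):

```python
# Toy model of the two processing hooks of a transform element.

def transform(inbuf, outbuf):
    # Copy-transform: read from inbuf, write into outbuf.
    # outbuf is writable and distinct from inbuf; inbuf stays untouched.
    for i, sample in enumerate(inbuf):
        outbuf[i] = sample * 2

def transform_ip(buf):
    # In-place transform: modify the (writable) buffer directly.
    for i, sample in enumerate(buf):
        buf[i] = sample * 2

inbuf, outbuf = [1, 2, 3], [0, 0, 0]
transform(inbuf, outbuf)   # inbuf stays [1, 2, 3], outbuf becomes [2, 4, 6]
transform_ip(inbuf)        # inbuf itself becomes [2, 4, 6]
```

The key distinction is ownership of the output: `transform()` needs two
distinct buffers, while `transform_ip()` reuses the input buffer and so
requires it to be writable.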

A transform can operate in the following modes:

- *passthrough*: The element will not make changes to the buffers; buffers
are pushed straight through, and the caps on both sides need to be the same.
The element can optionally implement a transform_ip() function to take a
look at the data; the buffer does not have to be writable.

- *in-place*: Changes can be made to the input buffer directly to obtain the
output buffer. The transform must implement a transform_ip() function.

- *copy-transform*: The transform is performed by copying and transforming
the input buffer into a new output buffer. The transform must implement a
transform() function.

When no `transform()` function is provided, only in-place and passthrough
operation are allowed; this means that source and destination caps must
be equal, or that the source buffer size must be greater than or equal to
the destination buffer size.

When no `transform_ip()` function is provided, only passthrough and
copy-transforms are supported. Providing this function is an
optimisation that can avoid a buffer copy.

When no functions are provided, we can only process in passthrough mode.

## Negotiation

Typical (re)negotiation of the transform element in push mode always
goes from sink to src and triggers the following sequence:

 - the sinkpad receives a new caps event.

 - the transform function figures out what it can convert these caps
   to.

 - try to see if we can configure the caps unmodified on the peer. We
   need to do this because we prefer to not do anything.

 - the transform configures itself to transform from the new sink caps
   to the target src caps

 - the transform processes and sets the output caps on the src pad

We call this downstream negotiation (DN) and it goes roughly like this:

```
    sinkpad              transform              srcpad
CAPS event |                 |                      |
---------->| find_transform()|                      |
           |---------------->|                      |
           |                 |      CAPS event      |
           |                 |--------------------->|
           |               <-|                      |
```

These steps configure the element for a transformation from the input
caps to the output caps.

The transform has 3 functions to perform the negotiation:

- **`transform_caps()`**: Transform the caps on a certain pad to all the
possible supported caps on the other pad. The input caps are guaranteed to be
a simple caps with just one structure. The caps do not have to be fixed.

- **`fixate_caps()`**: Given a caps on one pad, fixate the caps on the other
pad. The target caps are writable.

- **`set_caps()`**: Configure the transform for a transformation between src
caps and dest caps. Both caps are guaranteed to be fixed caps.

If no `transform_caps()` is defined, we can only perform the identity
transform, by default.

If no `set_caps()` is defined, we don’t care about caps. In that case we
also assume nothing is going to write to the buffer and we don’t enforce
a writable buffer for the `transform_ip` function, when present.

One common function that we need for the transform element is to find
the best transform from one format (src) to another (dest). Some
requirements of this function are:

 - it has a fixed src caps

 - it finds a fixed dest caps that the transform element can transform to

 - the dest caps are compatible with and can be accepted by peer elements

 - the transform function prefers to make src caps == dest caps

 - the transform function can optionally fixate dest caps.

The `find_transform()` function goes like this:

 - start from the src caps; these caps are fixed.

 - check if the caps are acceptable for us as src caps. This is usually
   enforced by the padtemplate of the element.

 - calculate all caps we can transform to with `transform_caps()`

 - if the original caps are a subset of the transforms, try to see if
   the caps are acceptable for the peer. If this is possible, we
   can perform passthrough and make src == dest. This is performed by
   simply calling gst\_pad\_peer\_accept\_caps().

 - if the caps are not fixed, we need to fixate them; start by taking
   the peer caps and intersecting with them.

 - for each of the transformed caps retrieved with transform\_caps():

   - try to fixate the caps with fixate\_caps()

   - if the caps are fixated, check if the peer accepts them with
     `_peer_accept_caps()`; if the peer accepts, we have found a dest caps.

 - if we run out of caps, we fail to find a transform.

 - if we found a destination caps, configure the transform with
   set\_caps().

After this negotiation process, the transform element is usually in a
steady state. We can identify these steady states:

 - src and sink pads both have the same caps. Note that when the caps
   are equal on both pads, the input and output buffers automatically
   have the same size. The element can operate on the buffers in the
   following ways: (Same caps, SC)

   - passthrough: buffers are inspected but no metadata or buffer data is
     changed. The input buffers don’t need to be writable. The input
     buffer is simply pushed out again without modifications. (SCP)

     ```
        sinkpad              transform              srcpad
     chain() |                  |                      |
     ------->| handle_buffer()  |                      |
             |----------------->|      pad_push()      |
             |                  |--------------------->|
             |                  |                      |
     ```

   - in-place: buffers are modified in-place, this means that the input
     buffer is modified to produce a new output buffer. This requires the
     input buffer to be writable. If the input buffer is not writable, a
     new buffer has to be allocated from the bufferpool.
(SCI)

     ```
        sinkpad              transform              srcpad
     chain() |                  |                      |
     ------->| handle_buffer()  |                      |
             |----------------->|                      |
             |                  | [!writable]          |
             |                  | alloc buffer         |
             |                .-|                      |
             |                | |                      |
             |                '>|                      |
             |                  |      pad_push()      |
             |                  |--------------------->|
             |                  |                      |
     ```

   - copy transform: a new output buffer is allocated from the bufferpool
     and data from the input buffer is transformed into the output
     buffer. (SCC)

     ```
        sinkpad              transform              srcpad
     chain() |                  |                      |
     ------->| handle_buffer()  |                      |
             |----------------->|                      |
             |                  | alloc buffer         |
             |                .-|                      |
             |                | |                      |
             |                '>|                      |
             |                  |      pad_push()      |
             |                  |--------------------->|
             |                  |                      |
     ```

 - src and sink pads have different caps. The element can operate on
   the buffers in the following way: (Different Caps, DC)

   - in-place: input buffers are modified in-place. This means that the
     input buffer has a size that is larger than or equal to the output
     size. The input buffer will be resized to the size of the output
     buffer. If the input buffer is not writable or the output size is
     bigger than the input size, we need to pad-alloc a new buffer. (DCI)

     ```
        sinkpad              transform              srcpad
     chain() |                  |                      |
     ------->| handle_buffer()  |                      |
             |----------------->|                      |
             |                  | [!writable || !size] |
             |                  | alloc buffer         |
             |                .-|                      |
             |                | |                      |
             |                '>|                      |
             |                  |      pad_push()      |
             |                  |--------------------->|
             |                  |                      |
     ```

   - copy transform: a new output buffer is allocated and the data from
     the input buffer is transformed into the output buffer. The flow is
     exactly the same as in the case with same-caps negotiation. (DCC)

We can immediately observe that the copy-transform states will need to
allocate a new buffer from the bufferpool. When the transform element is
receiving a non-writable buffer in the in-place state, it will also need
to perform an allocation. There is no reason why the passthrough state
would perform an allocation.
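The allocation behaviour of these steady states can be condensed into a
small decision sketch (a Python model, not real GStreamer code; `writable`
stands in for the refcount-is-one check and the names are illustrative):

```python
def transform_ip(buf):
    # illustrative in-place operation: double every sample
    for i, s in enumerate(buf):
        buf[i] = s * 2

def transform(inbuf, outbuf):
    # illustrative copy-transform of the same operation
    for i, s in enumerate(inbuf):
        outbuf[i] = s * 2

def handle_buffer(buf, mode, writable):
    """mode is 'passthrough', 'in-place' or 'copy'."""
    if mode == "passthrough":
        return buf                             # SCP: pushed out unmodified
    if mode == "in-place":
        work = buf if writable else list(buf)  # SCI/DCI: alloc when shared
        transform_ip(work)
        return work
    out = [0] * len(buf)                       # SCC/DCC: always a new buffer
    transform(buf, out)
    return out

shared = [1, 2, 3]
out = handle_buffer(shared, "in-place", writable=False)
# the shared (non-writable) input is left intact; out holds the result
```

Only the passthrough branch never allocates; in-place allocates exactly
when the input is shared, and copy-transform always does.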

This steady state changes when one of the following actions occurs:

 - the sink pad receives new caps; this triggers the downstream
   renegotiation process, see above for the flow.

 - the transform element wants to renegotiate (because of changed
   properties, for example). This essentially clears the current steady
   state and triggers the downstream and upstream renegotiation
   process. This situation also happens when a RECONFIGURE event was
   received on the transform srcpad.

## Allocation

After the transform element is configured with caps, a bufferpool needs
to be negotiated to perform the allocation of buffers. We have 2 cases:

 - The element is operating in passthrough: we don’t need to allocate a
   buffer in the transform element.

 - The element is not operating in passthrough and needs to allocate an
   output buffer.

In case 1, we don’t query and configure a pool. We let upstream decide
if it wants to use a bufferpool and then we will proxy the bufferpool
from downstream to upstream.

In case 2, we query and set a bufferpool on the srcpad that will be used
for doing the allocations.

In order to perform allocation, we need to be able to get the size of
the output buffer after the transform. There are two functions to
retrieve the size:

- `transform_size()`: Given a caps and a size on one pad, and a caps on the
other pad, calculate the size of the other buffer. This function is able to
perform all size transforms and is the preferred method of transforming
a size.

- `get_unit_size()`: When the input size and output size are always
a multiple of each other (audio conversion, …) we can define a simpler
get_unit_size() function. The transform will use this function to get the
same amount of units in the source and destination buffers. For performance
reasons, the mapping between caps and size is kept in a cache.
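The unit-size relation for the audio-conversion case can be sketched like
this (a Python model where caps are reduced to a hypothetical
`(bytes_per_sample, channels)` tuple, purely for illustration):

```python
def get_unit_size(caps):
    # one "unit" = one audio frame: bytes per sample * channel count
    bytes_per_sample, channels = caps
    return bytes_per_sample * channels

def transform_size(in_caps, in_size, out_caps):
    # same number of units on both sides, different unit sizes
    units = in_size // get_unit_size(in_caps)
    return units * get_unit_size(out_caps)

# 1000 frames of S16 mono (2 bytes/frame) -> F32 stereo (8 bytes/frame)
out_size = transform_size((2, 1), 2000, (4, 2))   # -> 8000
```

A generic `transform_size()` can implement arbitrary size mappings; the
unit-based version above only covers the "sizes are multiples of a unit"
case the text describes.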
diff --git a/markdown/design/events.md b/markdown/design/events.md
new file mode 100644
index 0000000000..74512d878b
--- /dev/null
+++ b/markdown/design/events.md
@@ -0,0 +1,282 @@
# Events

Events are objects passed around in parallel to the buffer dataflow to
notify elements of various events.

Events are received on pads using the event function. Some events should
be interleaved with the data stream so they require taking the
STREAM_LOCK, others don’t.

Different types of events exist to implement various functionalities.

* `GST_EVENT_FLUSH_START`: data is to be discarded
* `GST_EVENT_FLUSH_STOP`: data is allowed again
* `GST_EVENT_CAPS`: Format information about the following buffers
* `GST_EVENT_SEGMENT`: Timing information for the following buffers
* `GST_EVENT_TAG`: Stream metadata.
* `GST_EVENT_BUFFERSIZE`: Buffer size requirements
* `GST_EVENT_SINK_MESSAGE`: An event turned into a message by sinks
* `GST_EVENT_EOS`: no more data is to be expected on a pad.
* `GST_EVENT_QOS`: A notification of the quality of service of the stream
* `GST_EVENT_SEEK`: A seek should be performed to a new position in the stream
* `GST_EVENT_NAVIGATION`: A navigation event.
* `GST_EVENT_LATENCY`: Configure the latency in a pipeline
* `GST_EVENT_STEP`: Stepping event
* `GST_EVENT_RECONFIGURE`: stream reconfigure event
* `GST_EVENT_DRAIN`: Play all data downstream before returning.
  > Not yet implemented; under investigation, might be needed to do
  > still frames in DVD.

# src pads

A `gst_pad_push_event()` on a srcpad will first store the sticky event
in the sticky array before sending the event to the peer pad. If there
is no peer pad and the event was not stored in the sticky array, FALSE
is returned.

Flushing pads will refuse the events and will not store the sticky
events.

# sink pads

A `gst_pad_send_event()` on a sinkpad will call the event function on
the pad.
If the event function returns success, the sticky event is
stored in the sticky event array and the event is marked for update.

When the pad is flushing, the `_send_event()` function returns FALSE
immediately.

When the next data item is pushed, the pending events are pushed first.

This ensures that the event function is never called for flushing pads
and that the sticky array only contains events for which the event
function returned success.

# pad link

When linking pads, the srcpad sticky events are marked for update when
they are different from the sinkpad events. The next buffer push will
push the events to the sinkpad.

## FLUSH_START/STOP

A flush event is sent both downstream and upstream to clear any pending
data from the pipeline. This might be needed to make the graph more
responsive when the normal dataflow gets interrupted by, for example, a
seek event.

Flushing happens in two stages.

1) a source element sends the FLUSH_START event to the downstream peer element.
   The downstream element starts rejecting buffers from the upstream elements.
   It sends the flush event further downstream, discards any buffers it is
   holding, and returns from the chain function as soon as possible.
   This makes sure that all upstream elements get unblocked.
   This event is not synchronized with the STREAM_LOCK and can be done in the
   application thread.

2) a source element sends the FLUSH_STOP event to indicate
   that the downstream element can accept buffers again. The downstream
   element sends the flush event to its peer elements. After this step dataflow
   continues. The FLUSH_STOP call is synchronized with the STREAM_LOCK so any
   data used by the chain function can safely be freed here if needed. Any
   pending EOS events should be discarded too.

After the flush completes the second stage, data is flowing again in the
pipeline and all buffers are more recent than those before the flush.
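The two-stage behaviour can be modeled in a few lines (a sketch, not real
GstPad code; the class and method names are illustrative):

```python
class Element:
    # Minimal model of how FLUSH_START/FLUSH_STOP gate the chain function.
    def __init__(self):
        self.flushing = False
        self.queue = []

    def send_event(self, event):
        if event == "FLUSH_START":
            self.flushing = True   # chain() starts returning FLUSHING
            self.queue.clear()     # discard any data we were holding
        elif event == "FLUSH_STOP":
            self.flushing = False  # buffers are accepted again

    def chain(self, buf):
        if self.flushing:
            return "FLUSHING"      # upstream sees this and stops pushing
        self.queue.append(buf)
        return "OK"

e = Element()
e.chain("buf0")                    # accepted
e.send_event("FLUSH_START")        # held data is dropped
r1 = e.chain("buf1")               # rejected while flushing
e.send_event("FLUSH_STOP")
r2 = e.chain("buf2")               # accepted again
```

The FLUSHING return from `chain()` is what propagates the unblocking
upstream: each upstream element stops pushing as soon as it sees it.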

Elements that use the pullrange function send both flush events to the
upstream pads in the same way, to make sure that the pullrange function
unlocks and any pending buffers are cleared in the upstream elements.

A `FLUSH_START` may instruct the pipeline to distribute a new base_time
to elements so that the running_time is reset to 0 (see
[clocks](design/clocks.md) and [synchronisation](design/synchronisation.md)).

## EOS

The EOS event can only be sent on a sinkpad. It is typically emitted by
the source element when it has finished sending data. This event is
mainly sent in the streaming thread but can also be sent from the
application thread.

The downstream element should forward the EOS event to its downstream
peer elements. This way the event will eventually reach the sinks, which
should then post an EOS message on the bus when in PLAYING.

An element might want to flush its internally queued data before
forwarding the EOS event downstream. This flushing can be done in the
same thread as the one handling the EOS event.

For elements with multiple sink pads it might be possible to wait for
EOS on all the pads before forwarding the event.

The EOS event should always be interleaved with the data flow, therefore
the GStreamer core will take the `STREAM_LOCK`.

Sometimes the EOS event is generated by another element than the source;
for example, a demuxer element can generate an EOS event before the
source element. This is not a problem: the demuxer does not send an EOS
event to the upstream element but returns `GST_FLOW_EOS`, causing the
source element to stop sending data.

An element that sends EOS on a pad should stop sending data on that pad.
Source elements typically pause() their task for that purpose.

By default, a GstBin collects all EOS messages from all its sinks before
posting the EOS message to its parent.

The EOS is only posted on the bus by the sink elements in the PLAYING
state.
If the EOS event is received in the PAUSED state, it is queued
until the element goes to PLAYING.

A `FLUSH_STOP` event on an element flushes the EOS state and all pending
EOS messages.

## SEGMENT

A segment event is sent downstream by an element to indicate that the
following group of buffers starts and ends at the specified positions.
The segment event also contains the playback speed and the applied rate
of the stream.

Since the stream time is always set to 0 at start and after a seek, a 0
point for the timestamps of all following buffers has to be propagated
through the pipeline using the SEGMENT event.

Before sending buffers, an element must send a SEGMENT event. An element
is free to refuse buffers if they were not preceded by a SEGMENT event.

Elements that sync to the clock should store the SEGMENT start and stop
values and subtract the start value from the buffer timestamp before
comparing it against the stream time (see [clocks](design/clocks.md)).

An element is allowed to send out buffers with the SEGMENT start time
already subtracted from the timestamp. If it does so, it needs to send a
corrected SEGMENT downstream, i.e. one with start time 0.

A SEGMENT event should be generated as soon as possible in the pipeline
and is usually generated by a demuxer or source. The event is generated
before pushing the first buffer and after a seek, right before pushing
the new buffer.

The SEGMENT event should be sent from the streaming thread and should be
serialized with the buffers.

Buffers should be clipped within the range indicated by the SEGMENT
event start and stop values. Sinks must drop buffers with timestamps
outside the indicated segment range.

## TAG

The tag event is sent downstream when an element has discovered metadata
tags in a media file. Encoders can use this event to adjust their
tagging system. A tag is serialized with buffers.

## BUFFERSIZE

> **Note**
>
> This event is not yet implemented.

An element can suggest a buffersize for downstream elements. This is
typically done by elements that produce data on multiple source pads
such as demuxers.

## QOS

A QOS, or quality of service, event is generated in an element to
report to the upstream elements about the current quality of real-time
performance of the stream. This is typically done by the sinks that
measure the amount of framedrops they have (see [qos](design/qos.md)).

## SEEK

A seek event is issued by the application to configure the playback
range of a stream. It is called from the application thread and travels
upstream.

The seek event contains the new start and stop position of playback
after the seek is performed. Optionally the stop position can be left at
-1 to continue playback to the end of the stream. The seek event also
contains the new playback rate of the stream: 1.0 is normal playback,
2.0 double speed, and negative values mean backwards playback.

A seek usually flushes the graph to minimize latency after the seek.
This behaviour is triggered by using the `SEEK_FLUSH` flag on the seek
event.

The seek event usually starts from the sink elements and travels
upstream from element to element until it reaches an element that can
perform the seek. No intermediate element is allowed to assume that a
seek to this location will happen. It is allowed to modify the start and
stop times if it needs to do so. This is typically the case if a seek is
requested for a non-time position.

The actual seek is performed in the application thread so that success
or failure can be reported as a return value of the seek event. It is
therefore important that before executing the seek, the element acquires
the `STREAM_LOCK` so that the streaming thread and the seek get
serialized.

The general flow of executing the seek with FLUSH is as follows:

1) unblock the streaming threads, they could be blocked in a chain
   function.
   This is done by sending a FLUSH_START on all srcpads or by pausing
   the streaming task, depending on the seek FLUSH flag.
   The flush will make sure that all downstream elements unlock and
   that control will return to this element’s chain/loop function.
   We cannot take the STREAM_LOCK before doing this since it might
   cause a deadlock.

2) acquire the STREAM_LOCK. This will work since the chain/loop function
   was unlocked/paused in step 1).

3) perform the seek. Since the STREAM_LOCK is held, the streaming thread
   will wait for the seek to complete. Most likely, the stream thread
   will pause because the peer elements are flushing.

4) send a FLUSH_STOP event to all peer elements to allow streaming again.

5) create a SEGMENT event to signal the new buffer timestamp base time.
   This event must be queued to be sent by the streaming thread.

6) start stopped tasks and release the STREAM_LOCK; dataflow will
   continue now from the new position.

More information about the different seek types can be found in
[seeking](design/seeking.md).

## NAVIGATION

A navigation event is generated by a sink element to signal the elements
of a navigation event such as a mouse movement or button click.
Navigation events travel upstream.

## LATENCY

A latency event is used to configure a certain latency in the pipeline.
It contains a single GstClockTime with the required latency. The latency
value is calculated by the pipeline and distributed to all sink elements
before they are set to PLAYING. The sinks will add the configured
latency value to the timestamps of the buffers in order to delay their
presentation. (See also [latency](design/latency.md).)

## DRAIN

> **Note**
>
> This event is not yet implemented.

A drain event indicates that upstream is about to perform a real-time
event, such as pausing to present an interactive menu or such, and needs
to wait for all data it has sent to be played out in the sink.
+ +Drain should only be used by live elements, as it may otherwise occur +during prerolling. + +Usually after draining the pipeline, an element either needs to modify +timestamps, or FLUSH to prevent subsequent data being discarded at the +sinks for arriving late (only applies during playback scenarios). diff --git a/markdown/design/framestep.md b/markdown/design/framestep.md new file mode 100644 index 0000000000..1802a1fa31 --- /dev/null +++ b/markdown/design/framestep.md @@ -0,0 +1,246 @@ +# Frame step + +This document outlines the details of the frame stepping functionality +in GStreamer. + +The stepping functionality operates on the current playback segment, +position and rate as it was configured with a regular seek event. In +contrast to the seek event, it operates very closely to the sink and +thus has a very low latency and is not slowed down by queues and does +not actually perform any seeking logic. For this reason we want to +include a new API instead of reusing the seek API. + +The following requirements are needed: + +- The ability to walk forwards and backwards in the stream. + +- Arbitrary increments in any supported format (time, frames, bytes …) + +- High speed, minimal overhead. This mechanism is not more expensive +than simple playback. + +- switching between forwards and backwards stepping should be fast. + +- Maintain synchronisation between streams. + +- Get feedback of the amount of skipped data. + +- Ability to play a certain amount of data at an arbitrary speed. + +We want a system where we can step frames in PAUSED as well as play +short segments of data in PLAYING. + +## Use Cases + +* frame stepping in video only pipeline in PAUSED + +``` + .-----. .-------. .------. .-------. + | src | | demux | .-----. 
| vdec |   | vsink |
+ | src->sink      src1->|queue|->sink      src->sink    |
+ '-----'     '-------'  '-----'   '------'   '-------'
+```
+
+  - app sets the pipeline to PAUSED to block on the preroll picture
+
+  - app seeks to the required position in the stream. This can be done
+    with a positive or negative rate depending on the required frame
+    stepping direction.
+
+  - app steps frames (in `GST_FORMAT_DEFAULT` or `GST_FORMAT_BUFFER`). The
+    pipeline loses its PAUSED state until the required number of frames
+    have been skipped; it then prerolls again. This skipping is purely
+    done in the sink.
+
+  - sink posts `STEP_DONE` with the amount of frames stepped and the
+    corresponding time interval.
+
+* frame stepping in audio/video pipeline in PAUSED
+
+```
+ .-----.     .-------.            .------.   .-------.
+ | src |     | demux |  .-----.   | vdec |   | vsink |
+ | src->sink      src1->|queue|->sink      src->sink    |
+ '-----'     |       |  '-----'   '------'   '-------'
+             |       |            .------.   .-------.
+             |       |  .-----.   | adec |   | asink |
+             |      src2->|queue|->sink      src->sink    |
+             '-------'  '-----'   '------'   '-------'
+```
+
+  - app sets the pipeline to PAUSED to block on the preroll picture
+
+  - app seeks to the required position in the stream. This can be done
+    with a positive or negative rate depending on the required frame
+    stepping direction.
+
+  - app steps frames (in `GST_FORMAT_DEFAULT` or `GST_FORMAT_BUFFER`) or an
+    amount of time on the video sink. The pipeline loses its PAUSED state
+    until the required number of frames have been skipped; it then
+    prerolls again. This skipping is purely done in the sink.
+
+  - sink posts `STEP_DONE` with the amount of frames stepped and the
+    corresponding time interval.
+
+  - the app skips the same amount of time on the audiosink to align
+    the streams again. When huge amounts of video frames are skipped,
+    there needs to be enough queueing in the pipeline to compensate
+    for the accumulated audio.
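+The audio/video PAUSED flow above can be sketched as follows. This is an
+illustrative sketch, not code from the GStreamer sources: `pipeline`,
+`video_sink` and `audio_sink` are assumed to exist already, with the
+pipeline prerolled in PAUSED at the desired position (note that the
+actual format enum is spelled `GST_FORMAT_BUFFERS`):
+
+```c
+#include <gst/gst.h>
+
+static void
+step_one_video_frame (GstElement * pipeline, GstElement * video_sink,
+    GstElement * audio_sink)
+{
+  GstBus *bus = gst_element_get_bus (pipeline);
+  GstMessage *msg;
+  GstFormat format;
+  guint64 amount, duration;
+  gdouble rate;
+  gboolean flush, intermediate, eos;
+
+  /* step a single buffer (one video frame) on the video sink only */
+  gst_element_send_event (video_sink,
+      gst_event_new_step (GST_FORMAT_BUFFERS, 1, 1.0, TRUE, FALSE));
+
+  /* wait until the sink reports how much data was actually stepped */
+  msg = gst_bus_timed_pop_filtered (bus, GST_CLOCK_TIME_NONE,
+      GST_MESSAGE_STEP_DONE);
+  gst_message_parse_step_done (msg, &format, &amount, &rate, &flush,
+      &intermediate, &duration, &eos);
+  gst_message_unref (msg);
+
+  /* skip the reported interval on the audio sink to realign the streams */
+  gst_element_send_event (audio_sink,
+      gst_event_new_step (GST_FORMAT_TIME, duration, 1.0, TRUE, FALSE));
+
+  gst_object_unref (bus);
+}
+```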
+
+* frame stepping in audio/video pipeline in PLAYING
+
+  - app sets the pipeline to PAUSED to block on the preroll picture
+
+  - app seeks to the required position in the stream. This can be done
+    with a positive or negative rate depending on the required frame
+    stepping direction.
+
+  - app configures frame steps (in `GST_FORMAT_DEFAULT` or
+    `GST_FORMAT_BUFFER`) or an amount of time on the sink. The step event
+    has a flag indicating live stepping so that the stepping will only
+    happen in PLAYING.
+
+  - app sets the pipeline to PLAYING. The pipeline continues PLAYING
+    until it has consumed the amount of time.
+
+  - sink posts `STEP_DONE` with the amount of frames stepped and the
+    corresponding time interval. The sink will then wait for another
+    step event. Since the `STEP_DONE` message was emitted by the sink
+    when it handed off the buffer to the device, there is usually
+    sufficient time to queue a new STEP event so that one can
+    seamlessly continue stepping.
+
+## events
+
+A new `GST_EVENT_STEP` event is introduced to start the step operation.
+The step event is created with the following fields in the structure:
+
+* **`format`** GST_TYPE_FORMAT: The format of the step units.
+
+* **`amount`** G_TYPE_UINT64: The amount of units to step. A 0 amount
+immediately completes and can be used to cancel the current step and resume
+normal non-stepping behaviour to the end of the segment. A -1 amount steps
+until the end of the segment.
+
+* **`rate`** G_TYPE_DOUBLE: The rate at which the frames should be stepped in
+PLAYING mode. 1.0 is the normal playback speed and direction of the segment,
+2.0 is double speed. A speed of 0.0 is not allowed. When performing a flushing
+step, the speed is not relevant. Note that we don't allow negative rates here,
+use a seek with a negative rate first to reverse the playback direction.
+
+* **`flush`** G_TYPE_BOOLEAN: when flushing is TRUE, the step is performed
+immediately:
+
+  - In the PAUSED state the pipeline loses the PAUSED state, the
+    requested amount of data is skipped and the pipeline prerolls again
+    when a non-intermediate step completes. When the pipeline was
+    stepping while the event is sent, the current step operation is
+    updated with the new amount and format. The sink will do a best
+    effort to comply with the new amount.
+
+  - In the PLAYING state, the pipeline loses the PLAYING state, the
+    requested amount of data is skipped (not rendered) from the previous
+    STEP request or from the position of the last PAUSED if no previous
+    STEP operation was performed. The pipeline goes back to the PLAYING
+    state when a non-intermediate step completes.
+
+  When flushing is FALSE, the step will be performed later:
+
+  - In the PAUSED state the step will be done when going to PLAYING. Any
+    previous step operation will be overridden with the new STEP event.
+
+  - In the PLAYING state the step operation will be performed after the
+    current step operation completes. If there was no previous step
+    operation, the step operation will be performed from the position of
+    the last PAUSED state.
+
+* **`intermediate`** G_TYPE_BOOLEAN: Signal that this step operation is an
+intermediate step, part of a series of step operations. It is mostly
+interesting for stepping in the PAUSED state because the sink will only perform
+a preroll after a non-intermediate step operation completes. Intermediate steps
+are useful to flush out data from other sinks in order to not cause excessive
+queueing. In the PLAYING state the intermediate flag has no visual effect. In
+all states, the intermediate flag is passed to the corresponding
+GST_MESSAGE_STEP_DONE.
+
+The application will create a STEP event to start or stop the stepping
+operation. Both stepping in PAUSED and PLAYING can be performed by means
+of the flush flag.
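+As a sketch, the fields above map directly onto the arguments of
+`gst_event_new_step()`; the values shown (a single flushing,
+non-intermediate frame step) are only an example:
+
+```c
+#include <gst/gst.h>
+
+static GstEvent *
+make_single_frame_step (void)
+{
+  return gst_event_new_step (GST_FORMAT_BUFFERS, /* format: step in buffers */
+      1,        /* amount: one unit */
+      1.0,      /* rate: normal speed, forward */
+      TRUE,     /* flush: perform the step immediately */
+      FALSE);   /* intermediate: a standalone step */
+}
+```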
+ +The event is usually sent to the pipeline, which will typically +distribute the event to all of its sinks. For some use cases, like frame +stepping on video frames only, the event should only be sent to the +video sink and upon reception of the `STEP_DONE` message, one can step +the other sinks to align the streams again. + +For large stepping amounts, there needs to be enough queueing in front +of all the sinks. If large steps need to be performed, they can be split +up into smaller step operations using the "intermediate" flag on the +step. + +Since the step event does not update the `base_time` of any of the +elements, the sinks should keep track of the amount of stepped data in +order to remain synchronized against the clock. + +## messages + +A `GST_MESSAGE_STEP_START` is created. It contains the following +fields. + +* **`active`**: If the step was queued or activated. + +* **`format`** GST_TYPE_FORMAT: The format of the step units that queued/activated. + +* **`amount`** G_TYPE_UINT64: The amount of units that were queued/activated. + +* **`rate`** G_TYPE_DOUBLE: The rate and direction at which the frames were queued/activated. + +* **`flush`** G_TYPE_BOOLEAN: If the queued/activated frames will be flushed. + +* **`intermediate`** G_TYPE_BOOLEAN: If this is an intermediate step operation +that queued/activated. + +The `STEP_START` message is emitted 2 times: + + - first when an element received the STEP event and queued it. The + "active" field will be FALSE in this case. + + - second when the step operation started in the streaming thread. The + "active" field is TRUE in this case. After this message is emitted, + the application can queue a new step operation. + +The purpose of this message is to find out how many elements participate +in the step operation and to queue new step operations at the earliest +possible moment. + +A new `GST_MESSAGE_STEP_DONE` message is created. 
It contains the
+following fields:
+
+* **`format`** GST_TYPE_FORMAT: The format of the step units that completed.
+* **`amount`** G_TYPE_UINT64: The amount of units that were stepped.
+* **`rate`** G_TYPE_DOUBLE: The rate and direction at which the frames were stepped.
+* **`flush`** G_TYPE_BOOLEAN: If the stepped frames were flushed.
+* **`intermediate`** G_TYPE_BOOLEAN: If this is an intermediate step operation that completed.
+* **`duration`** G_TYPE_UINT64: The total duration of the stepped units in `GST_FORMAT_TIME`.
+* **`eos`** G_TYPE_BOOLEAN: The step ended because of EOS.
+
+The message is emitted by the element that performs the step operation.
+The purpose is to return the duration in `GST_FORMAT_TIME` of the
+stepped media. This is especially interesting for aligning the other
+streams when stepping frames on the video sink element.
+
+## Direction switch
+
+When quickly switching between a forwards and a backwards step of, for
+example, one video frame, we need to either:
+
+1) issue a new seek to change the direction from the current position, or
+2) cache a certain number of stepped frames and walk the cache.
+
+Option 1) might be very slow. For option 2) we would ideally like to
+offload this caching functionality to a separate element, which means
+that we need to forward the STEP event upstream. It’s unclear how this
+could work in a generic way. What is a demuxer supposed to do when it
+receives a step event? A flushing seek to what stream position?
diff --git a/markdown/design/gstbin.md b/markdown/design/gstbin.md
new file mode 100644
index 0000000000..9583e3af75
--- /dev/null
+++ b/markdown/design/gstbin.md
@@ -0,0 +1,105 @@
+# GstBin
+
+GstBin is a container element for other GstElements. This makes it
+possible to group elements together so that they can be treated as one
+single GstElement. A GstBin provides a GstBus for the children and
+collates messages from them.
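+The grouping described above can be sketched with a trivial bin; the
+`fakesrc` and `fakesink` elements from GStreamer core are used purely
+for illustration:
+
+```c
+#include <gst/gst.h>
+
+int
+main (int argc, char *argv[])
+{
+  GstElement *bin, *src, *sink;
+
+  gst_init (&argc, &argv);
+
+  bin = gst_bin_new ("mybin");
+  src = gst_element_factory_make ("fakesrc", "src");
+  sink = gst_element_factory_make ("fakesink", "sink");
+
+  gst_bin_add_many (GST_BIN (bin), src, sink, NULL);
+  gst_element_link (src, sink);
+
+  /* the bin can now be treated as a single GstElement */
+  gst_element_set_state (bin, GST_STATE_PLAYING);
+  g_usleep (G_USEC_PER_SEC / 10);
+  gst_element_set_state (bin, GST_STATE_NULL);
+
+  gst_object_unref (bin);
+  return 0;
+}
+```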
+
+## Adding/removing elements
+
+The basic functionality of a bin is to add and remove GstElements
+to/from it. `gst_bin_add()` and `gst_bin_remove()` perform these
+operations respectively.
+
+The bin maintains a parent-child relationship with its elements (see
+[relations](design/relations.md)).
+
+## Retrieving elements
+
+GstBin provides a number of functions to retrieve one or more children
+from itself. A few examples of the provided functions:
+
+* `gst_bin_get_by_name()` retrieves an element by name.
+* `gst_bin_iterate_elements()` returns an iterator to all the children.
+
+## Element management
+
+The most important function of the GstBin is to distribute all
+GstElement operations on itself to all of its children. This includes:
+
+  - state changes
+
+  - index get/set
+
+  - clock get/set
+
+The state change distribution is the most complex and is explained in
+[states](design/states.md).
+
+## GstBus
+
+The GstBin creates a GstBus for its children and distributes it when
+child elements are added to the bin. The bin attaches a sync handler to
+receive messages from children. The bus for receiving messages from
+children is distinct from the bin’s own externally-visible GstBus.
+
+Messages received from children are forwarded intact onto the bin’s
+external message bus, except for EOS and SEGMENT_START/DONE which are
+handled specially.
+
+ASYNC_START/ASYNC_STOP messages received from the children are used to
+trigger a recalculation of the current state of the bin, as described in
+[states](design/states.md).
+
+The application can retrieve the external GstBus and integrate it in the
+mainloop or it can just `pop()` messages off in its own thread.
+
+When a bin goes to READY it will clear all cached messages.
+
+## EOS
+
+The sink elements will post an EOS message on the bus when they reach
+EOS. The EOS message is only posted to the bus when the sink element is
+in PLAYING.
+
+The bin collects all EOS messages and forwards them to the application
+as soon as all of the sinks have posted an EOS.
+
+The list of queued EOS messages is cleared when the bin goes to PAUSED
+again. This means that all elements should repost the EOS message when
+going to PLAYING again.
+
+## SEGMENT_START/DONE
+
+A bin collects `SEGMENT_START` messages but does not post them to the
+application. It counts the number of `SEGMENT_START` messages and posts a
+`SEGMENT_DONE` message to the application when an equal number of
+`SEGMENT_DONE` messages were received.
+
+The cached SEGMENT_START/DONE messages are cleared when going to READY.
+
+## DURATION
+
+When a DURATION query is performed on a bin, it will forward the query
+to all its sink elements. The bin will calculate the total duration as
+the MAX of all returned durations and will then cache the result so that
+any further query can use the cached version. The reason for caching the
+result is that the duration of a stream typically does not change
+that often.
+
+A `GST_MESSAGE_DURATION_CHANGED` posted by an element will clear the
+cached duration value so that the bin will query the sinks again. This
+message is typically posted by elements that calculate the duration of
+the stream based on some average bitrate, which might change while
+playing the stream. The `DURATION_CHANGED` message is posted to the
+application, which can then fetch the updated DURATION.
+
+## Subclassing
+
+Subclasses of GstBin are free to implement their own add/remove
+implementations. It is a good idea to update the GList of children so
+that the `_iterate()` functions can still be used if the custom bin
+allows access to its children.
+
+Any bin subclass can also implement a custom message handler by
+overriding the default message handler.
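+A minimal sketch of such a subclass follows; the `MyBin` type and its
+GObject boilerplate are hypothetical:
+
+```c
+#include <gst/gst.h>
+
+typedef struct _MyBin { GstBin parent; } MyBin;
+typedef struct _MyBinClass { GstBinClass parent_class; } MyBinClass;
+
+G_DEFINE_TYPE (MyBin, my_bin, GST_TYPE_BIN);
+
+static void
+my_bin_handle_message (GstBin * bin, GstMessage * message)
+{
+  /* inspect (or swallow) messages posted by the children here */
+  if (GST_MESSAGE_TYPE (message) == GST_MESSAGE_EOS)
+    GST_INFO_OBJECT (bin, "a child posted EOS");
+
+  /* chain up so the default EOS/SEGMENT collection keeps working */
+  GST_BIN_CLASS (my_bin_parent_class)->handle_message (bin, message);
+}
+
+static void
+my_bin_class_init (MyBinClass * klass)
+{
+  GST_BIN_CLASS (klass)->handle_message = my_bin_handle_message;
+}
+
+static void
+my_bin_init (MyBin * bin)
+{
+}
+```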
diff --git a/markdown/design/gstbus.md b/markdown/design/gstbus.md
new file mode 100644
index 0000000000..303ee0e8be
--- /dev/null
+++ b/markdown/design/gstbus.md
@@ -0,0 +1,41 @@
+# GstBus
+
+The GstBus is an object responsible for delivering GstMessages in a
+first-in first-out way from the streaming threads to the application.
+
+Since the application typically only wants to deal with delivery of
+these messages from one thread, the GstBus will marshal the messages
+between different threads. This is important since the actual streaming
+of media is done in other threads (streaming threads) than the
+application thread. It is also important to not block the streaming
+threads while the application deals with the message.
+
+The GstBus provides support for GSource based notifications. This makes
+it possible to handle the delivery in the glib mainloop. Different
+GSources can be added to the same bus provided they listen to different
+message types.
+
+A message is posted on the bus with the `gst_bus_post()` method. With
+the `gst_bus_peek()` and `_pop()` methods one can look at or retrieve a
+previously posted message.
+
+The bus can be polled with the `gst_bus_poll()` method. This method
+blocks up to the specified timeout value until one of the specified
+message types is posted on the bus. The application can then `_pop()`
+the messages from the bus to handle them.
+
+It is also possible to get messages from the bus without any thread
+marshalling with the `gst_bus_set_sync_handler()` method. This makes
+it possible to react to a message in the same thread that posted the
+message on the bus. This should only be used if the application is able
+to deal with messages from different threads.
+
+If no messages are popped from the bus with either a GSource or
+`gst_bus_pop()`, they remain on the bus.
+
+When a pipeline or bin goes from READY into NULL state, it will set its
+bus to flushing, i.e. the bus will drop all existing and new messages on
+the bus. This is necessary because bus messages hold references to the
+bin/pipeline or its elements, so there are circular references that need
+to be broken if one ever wants to be able to destroy a bin or pipeline
+properly.
diff --git a/markdown/design/gstelement.md b/markdown/design/gstelement.md
new file mode 100644
index 0000000000..58bd6daf23
--- /dev/null
+++ b/markdown/design/gstelement.md
@@ -0,0 +1,61 @@
+# GstElement
+
+The Element is the most important object in the entire GStreamer system,
+as it defines the structure of the pipeline. Elements include sources,
+filters, sinks, and containers (Bins). They may be an intrinsic part of
+the core GStreamer library, or may be loaded from a plugin. In some
+cases they’re even fabricated from completely different systems (see the
+LADSPA plugin). They are generally created from a GstElementFactory,
+which will be covered in another chapter, but for the intrinsic types
+they can be created with specific functions.
+
+Elements contain GstPads (also covered in another chapter), which are
+subsequently used to connect the Elements together to form a pipeline
+capable of passing and processing data. They have a parent, which must
+be another Element. This allows deeply nested pipelines, and the
+possibility of "black-box" meta-elements.
+
+## Name
+
+All elements are named, and while they should ideally be unique in any
+given pipeline, they do not have to be. The only guaranteed unique name
+for an element is its complete path in the object hierarchy. In other
+words, an element’s name is unique inside its parent. (This follows from
+GstObject’s name explanation)
+
+This uniqueness is guaranteed through all functions where either the
+parentage or name of an element is changed.
+
+## Pads
+
+GstPads are the property of a given GstElement. They provide the
+connection capability, allowing arbitrary structure in the graph.
+
+For any Element but a source or sink, there will be at least 2 Pads
+owned by the Element. These pads are stored in a single GList within the
+Element. Several counters are kept in order to allow quicker
+determination of the type and properties of a given Element.
+
+Pads may be added to an element with `_add_pad()`. Retrieval is via
+`_get_static_pad()`, which operates on the name of the Pad (the unique
+key). This means that all Pads owned by a given Element must have unique
+names. A pointer to the GList of pads may be obtained with
+`_iterate_pads()`.
+
+`gst_element_add_pad(element,pads)`: Sets the element as the parent of
+the pad, then adds the pad to the element’s list of pads, keeping the
+counts of total, src, and sink pads up to date. Emits the `new_pad`
+signal with the pad as argument. Fails if either the element or the pad
+is NULL or not what it claims to be. Should fail if the pad already
+has a parent. Should fail if the pad is already owned by the element.
+Should fail if there’s already a pad by that name in the list of pads.
+
+`pad = gst_element_get_pad(element, "padname")`: Searches through the
+list of pads owned by the element for a pad with the given name.
+
+## Ghost Pads
+
+More info in [ghostpad](design/gstghostpad.md).
+
+## State
+
+An element has a state. More info in [state](design/states.md).
diff --git a/markdown/design/gstghostpad.md b/markdown/design/gstghostpad.md
new file mode 100644
index 0000000000..982e9f97ab
--- /dev/null
+++ b/markdown/design/gstghostpad.md
@@ -0,0 +1,451 @@
+# Ghostpads
+
+GhostPads are used to build complex compound elements out of existing
+elements. They are used to expose internal element pads on the complex
+element.
+
+## Some design requirements
+
+- Must look like a real GstPad on both sides.
+
+- target of Ghostpad must be changeable
+- target can be initially NULL
+
+- a GhostPad is implemented using a private GstProxyPad class:
+
+```
+  GstProxyPad
+  (------------------)
+  | GstPad           |
+  |------------------|
+  | GstPad *target   |
+  (------------------)
+  | GstPad *internal |
+  (------------------)
+
+  GstGhostPad
+  (------------------) -\
+  | GstPad           |  |
+  |------------------|  |
+  | GstPad *target   |   > GstProxyPad
+  |------------------|  |
+  | GstPad *internal |  |
+  |------------------| -/
+  |                  |
+  (------------------)
+```
+
+A GstGhostPad (X) is _always_ created together with a GstProxyPad (Y).
+The internal pad pointers are set to point to each other. The two pads
+of such a pair have opposite directions; the GstGhostPad has the same
+direction as the (future) ghosted pad (target).
+
+    (- X --------)
+    |            |
+    | target *   |
+    |------------|
+    | internal *----+
+    (------------)  |
+        ^           V
+        |       (- Y --------)
+        |       |            |
+        |       | target *   |
+        |       |------------|
+        +----* internal      |
+                (------------)
+
+    Which we will abbreviate to:
+
+    (- X --------)
+    |            |
+    | target *--------->//
+    (------------)
+          |
+    (- Y --------)
+    | target *----->//
+    (------------)
+
+The GstGhostPad (X) is also set as the parent of the GstProxyPad (Y).
+
+The target is a pointer to the internal pad's peer. It is an optimisation to
+quickly get to the peer of a ghostpad without having to dereference
+internal->peer.
+
+Some use cases follow with a description of how the datastructure
+is modified.
+
+## Creating a ghostpad with a target:
+
+    gst_ghost_pad_new (char *name, GstPad *target)
+
+1) create new GstGhostPad X + GstProxyPad Y
+2) X name set to @name
+3) X direction is the same as the target, Y is opposite.
+4) the target of X is set to @target
+5) Y is linked to @target
+6) link/unlink and activate functions are set up
+   on GstGhostPad.
+
+```
+                    (--------------
+   (- X --------)   |
+   |            |   |------)
+   | target *------------------> | sink |
+   (------------)    ------->    |------)
+         |          /  (--------------
+   (- Y --------)  /  (pad link)
+//<-----* target |/
+   (------------)
+```
+
+- Automatically takes same direction as target.
+- target is filled in automatically.
+
+## Creating a ghostpad without a target
+
+```
+gst_ghost_pad_new_no_target (char *name, GstPadDirection dir)
+```
+
+1) create new GstGhostPad X + GstProxyPad Y
+2) X name set to @name
+3) X direction is @dir
+4) link/unlink and activate functions are set up on GstGhostPad.
+
+```
+(- X --------)
+|            |
+| target *--------->//
+(------------)
+      |
+  (- Y --------)
+  | target *----->//
+  (------------)
+```
+
+- allows for setting the target later
+
+## Setting target on an untargetted unlinked ghostpad
+
+```
+gst_ghost_pad_set_target (char *name, GstPad *newtarget)
+
+  (- X --------)
+  |            |
+  | target *--------->//
+  (------------)
+        |
+    (- Y --------)
+    | target *----->//
+    (------------)
+```
+
+1) assert direction of newtarget == X direction
+2) target is set to newtarget
+3) internal pad Y is linked to newtarget
+
+```
+                    (--------------
+   (- X --------)   |
+   |            |   |------)
+   | target *------------------> | sink |
+   (------------)    ------->    |------)
+         |          /  (--------------
+   (- Y --------)  /  (pad link)
+//<-----* target |/
+   (------------)
+```
+
+## Setting target on a targetted unlinked ghostpad
+
+```
+gst_ghost_pad_set_target (char *name, GstPad *newtarget)
+
+                    (--------------
+   (- X --------)   |
+   |            |   |-------)
+   | target *------------------> | sink1 |
+   (------------)    ------->    |-------)
+         |          /  (--------------
+   (- Y --------)  /  (pad link)
+//<-----* target |/
+   (------------)
+```
+
+1) assert direction of newtarget (sink2) == X direction
+2) unlink internal pad Y and oldtarget
+3) target is set to newtarget (sink2)
+4) internal pad Y is linked to newtarget
+
+```
+                    (--------------
+   (- X --------)   |
+   |            |   |-------)
+   | target *------------------> | 
sink2 | + (------------) -------> |-------) + | / (-------------- + (- Y --------) / (pad link) +//<-----* target |/ + (------------) +``` + +- Linking a pad to an untargetted ghostpad: + +``` + gst_pad_link (src, X) + + (- X --------) + | | + | target *--------->// + (------------) + | + (- Y --------) + | target *----->// + (------------) +-------) + | + (-----| + | src | + (-----| +-------) +``` + +X is a sink GstGhostPad without a target. The internal GstProxyPad Y has +the same direction as the src pad (peer). + +1) link function is called + - Y direction is same as @src + - Y target is set to @src + - Y is activated in the same mode as X + - core makes link from @src to X + + ``` + (- X --------) + | | + | target *----->// + >(------------) + (real pad link) / | + / (- Y ------) + / -----* target | + -------) / / (----------) + | / / + (-----|/ / + | src |<---- + (-----| + -------) + ``` + +## Linking a pad to a targetted ghostpad: + +``` + gst_pad_link (src, X) + + (-------- + (- X --------) | + | | |------) + | target *------------->| sink | + (------------) >|------) + | / (-------- + | / + | / +-------) | / (real pad link) + | (- Y ------) / + (-----| | |/ + | src | //<----* target | + (-----| (----------) +-------) +``` + +1) link function is called + - Y direction is same as @src + - Y target is set to @src + - Y is activated in the same mode as X + - core makes link from @src to X + +``` + (-------- + (- X --------) | + | | |------) + | target *------------->| sink | + >(------------) >|------) +(real pad link) / | / (-------- + / | / + / | / + -------) / | / (real pad link) + | / (- Y ------) / + (-----|/ | |/ + | src |<-------------* target | + (-----| (----------) + -------) +``` + +## Setting target on untargetted linked ghostpad: + +``` + gst_ghost_pad_set_target (char *name, GstPad *newtarget) + + (- X --------) + | | + | target *------>// + >(------------) +(real pad link) / | + / | + / | + -------) / | + | / (- Y ------) + (-----|/ | | + | src 
|<-------------* target | + (-----| (----------) + -------) +``` + +1) assert direction of @newtarget == X direction +2) X target is set to @newtarget +3) Y is linked to @newtarget + +``` + (-------- + (- X --------) | + | | |------) + | target *------------->| sink | + >(------------) >|------) +(real pad link) / | / (-------- + / | / + / | / + -------) / | / (real pad link) + | / (- Y ------) / + (-----|/ | |/ + | src |<-------------* target | + (-----| (----------) + -------) +``` + +## Setting target on targetted linked ghostpad: + +``` + gst_ghost_pad_set_target (char *name, GstPad *newtarget) + + (-------- + (- X --------) | + | | |-------) + | target *------------->| sink1 | + >(------------) >|-------) +(real pad link) / | / (-------- + / | / + / | / + -------) / | / (real pad link) + | / (- Y ------) / + (-----|/ | |/ + | src |<-------------* target | + (-----| (----------) + -------) +``` + +1) assert direction of @newtarget == X direction +2) Y and X target are unlinked +2) X target is set to @newtarget +3) Y is linked to @newtarget + +``` + (-------- + (- X --------) | + | | |-------) + | target *------------->| sink2 | + >(------------) >|-------) +(real pad link) / | / (-------- + / | / + / | / + -------) / | / (real pad link) + | / (- Y ------) / + (-----|/ | |/ + | src |<-------------* target | + (-----| (----------) + -------) +``` + +## Activation + +Sometimes ghost pads should proxy activation functions. This thingie +attempts to explain how it should work in the different cases. + +``` + +---+ +----+ +----+ +----+ + | A +-----+ B | | C |-------+ D | + +---+ +---=+ +=---+ +----+ + +--=-----------------------------=-+ + | +=---+ +----+ +----+ +---=+ | + | | a +---+ b ==== c +--+ d | | + | +----+ +----+ +----+ +----+ | + | | + +----------------------------------+ + state change goes from right to left + <----------------------------------------------------------- +``` + +All of the labeled boxes are pads. 
The dashes (---) show pad links, and +the double-lines (===) are internal connections. The box around a, b, c, +and d is a bin. B and C are ghost pads, and a and d are proxy pads. The +arrow represents the direction of a state change algorithm. Not counting +the bin, there are three elements involved here — the parent of D, the +parent of A, and the parent of b and c. + +Now, in the state change from READY to PAUSED, assuming the pipeline +does not have a live source, all of the pads will end up activated at +the end. There are 4 possible activation modes: + +1) AD and ab in PUSH, cd and CD in PUSH +2) AD and ab in PUSH, cd and CD in PULL +3) AD and ab in PULL, cd and CD in PUSH +4) AD and ab in PULL, cd and CD in PULL + +When activating (1), the state change algorithm will first visit the +parent of D and activate D in push mode. Then it visits the bin. The bin +will first change the state of its child before activating its pads. +That means c will be activated in push mode. \[\*\] At this point, d and +C should also be active in push mode, because it could be that +activating c in push mode starts a thread, which starts pushing to pads +which aren’t ready yet. Then b is activated in push mode. Then, the bin +activates C in push mode, which should already be in push mode, so +nothing is done. It then activates B in push mode, which activates b in +push mode, but it’s already there, then activates a in push mode as +well. The order of activating a and b does not matter in this case. +Then, finally, the state change algorithm moves to the parent of A, +activates A in push mode, and dataflow begins. + +\[\*\] Not yet implemented. + +Activation mode (2) is implausible, so we can ignore it for now. That +leaves us with the rest. + +(3) is the same as (1) until you get to activating b. Activating b will +proxy directly to activating a, which will activate B and A as well. 
+Then when the state change algorithm gets to B and A it sees that they +are already active, so it ignores them. + +Similarly in (4), activating D will cause the activation of all of the +rest of the pads, in this order: C d c b a B A. Then when the state +change gets to the other elements they are already active, and in fact +data flow is already occurring. + +So, from these scenarios, we can distill how ghost pad activation +functions should work: + +Ghost source pads (e.g. C): push: called by: element state change +handler behavior: just return TRUE pull: called by: peer’s activatepull +behavior: change the internal pad, which proxies to its peer e.g. C +changes d which changes c. + +Internal sink pads (e.g. d): push: called by: nobody (doesn’t seem +possible) behavior: n/a pull: called by: ghost pad behavior: proxy to +peer first + +Internal src pads (e.g. a): push: called by: ghost pad behavior: +activate peer in push mode pull: called by: peer’s activatepull +behavior: proxy to ghost pad, which proxies to its peer (e.g. a calls B +which calls A) + +Ghost sink pads (e.g. B): push: called by: element state change handler +behavior: change the internal pad, which proxies to peer (e.g. B changes +a which changes b) pull: called by: internal pad behavior: proxy to peer + +It doesn’t really make sense to have activation functions on proxy pads +that aren’t part of a ghost pad arrangement. diff --git a/markdown/design/gstobject.md b/markdown/design/gstobject.md new file mode 100644 index 0000000000..7bbcffec98 --- /dev/null +++ b/markdown/design/gstobject.md @@ -0,0 +1,79 @@ +# GstObject + +The base class for the entire GStreamer hierarchy is the GstObject. + +## Parentage + +A pointer is available to store the current parent of the object. This +is one of the two fundamental requirements for a hierarchical system +such as GStreamer (for the other, read up on GstBin). Three functions +are provided: `_set_parent()`, `_get_parent()`, and `_unparent()`. 
The third is required because there is an explicit check in
`_set_parent()`: an object must not already have a parent if you wish to
set one. You must unparent the object first. This allows for new
additions later.

- GstObjects that can be parented: GstElement (inside a bin), GstPad
  (inside an element)

## Naming

- names of objects cannot be changed while they are parented
- names of objects should be unique within their parent
  - set_name() can fail because of this
  - as can gst_element_add_pad()/gst_bin_add_element()
- gst_object_set_name() only changes the object’s name
- objects also have a name_prefix that is used to prefix the object
  name during debugging and identification
- there are object-specific set_name() variants which also set the
  name_prefix on the object. This is useful for debugging purposes to
  give the object a more identifiable name. Typically a parent will
  call _set_name_prefix on its children, taking a lock on them to do
  so.

## Locking

The GstObject contains the necessary primitives to lock the object in a
thread-safe manner. This will be used to provide general thread safety
as needed. However, this lock is generic, i.e. it covers the whole
object.

The object LOCK is a very low-level lock that should only be held to
access the object properties for short sections of code.

All members of the GstObject structure marked as `/**< public >**/ /*
with LOCK */` are protected by this lock. These members can only be
accessed for reading or writing while the lock is held. All members
should be copied or reffed if they are used after releasing the LOCK.

Note that this does **not** mean that no other thread can modify the
object at the same time that the lock is held. It only means that any
two sections of code that obey the lock are guaranteed not to be running
simultaneously. "The lock is voluntary and cooperative".
This lock will ideally be used for parentage, flags and naming, which is
reasonable, since they are the only possible things to protect in the
GstObject.

## Locking order

In parent-child situations the lock of the parent must always be taken
first before taking the lock of the child. It is NOT allowed to hold the
child lock before taking the parent lock.

This policy allows parents to iterate over their children and set
properties on them.

Whenever a nested lock needs to be taken on objects not involved in a
parent-child relation (e.g. pads), an explicit locking order has to be
defined.

## Path Generation

Due to the base nature of the GstObject, it becomes the only reasonable
place to put this particular function (`_get_path_string()`). It will
generate a string describing the parent hierarchy of a given GstObject.

## Flags

Each object in the GStreamer object hierarchy can have flags associated
with it, which are used to describe a state or a feature of the object.

diff --git a/markdown/design/gstpipeline.md b/markdown/design/gstpipeline.md
new file mode 100644
index 0000000000..3bb235e0f3
--- /dev/null
+++ b/markdown/design/gstpipeline.md

# GstPipeline

A GstPipeline is usually a toplevel bin and provides all of its children
with a clock.

A GstPipeline also provides a toplevel GstBus (see [gstbus](design/gstbus.md)).

The pipeline also calculates the running\_time based on the selected
clock (see also [clocks](design/clocks.md) and
[synchronisation](design/synchronisation.md)).

The pipeline will calculate a global latency for the elements in the
pipeline (see also [latency](design/latency.md)).

## State changes

In addition to the normal state change procedure of its parent class
GstBin, the pipeline performs the following actions during a state
change:

- NULL → READY:
  - set the bus to non-flushing
- READY → PAUSED:
  - reset the running_time to 0
- PAUSED → PLAYING:
  - select a clock.
  - calculate base_time using the running_time.
  - calculate and distribute latency.
  - set clock and base_time on all elements before performing the state
    change.
- PLAYING → PAUSED:
  - calculate the running_time when the pipeline was PAUSED.
- READY → NULL:
  - set the bus to flushing (when auto-flushing is enabled)

The running_time represents the total elapsed time, measured in clock
units, that the pipeline spent in the PLAYING state (see
[synchronisation](design/synchronisation.md)). The running_time is set to 0 after a
flushing seek.

## Clock selection

Since all of the children of a GstPipeline must use the same clock, the
pipeline must select a clock. This clock selection happens when the
pipeline goes to the PLAYING state.

The default clock selection algorithm works as follows:

- If the application selected a clock, use that clock (see below).

- Use the clock of the most upstream element that can provide a clock.
  This selection is performed by iterating over the elements, starting
  from the sinks and going upstream.
  - since this selection procedure happens in the PAUSED→PLAYING
    state change, all the sinks are prerolled and we can thus be
    sure that each sink is linked to some upstream element.
  - in the case of a live pipeline (`NO_PREROLL`), the sink will not
    yet be prerolled and the selection process will select the clock
    of a more upstream element.

- Use the GstSystemClock; this only happens when no element provides a
  usable clock.

The application can influence this clock selection with two methods:
`gst_pipeline_use_clock()` and `gst_pipeline_auto_clock()`.

The `_use_clock()` method forces the use of a specific clock on the
pipeline regardless of what clock providers are children of the
pipeline. Setting NULL disables the clock completely and makes the
pipeline run as fast as possible.

The `_auto_clock()` method removes the fixed clock and reactivates the
automatic clock selection algorithm described above.
## GstBus

A GstPipeline provides a GstBus to the application. The bus can be
retrieved with `gst_pipeline_get_bus()` and can then be used to
retrieve messages posted by the elements in the pipeline (see
[gstbus](design/gstbus.md)).

diff --git a/markdown/design/index.md b/markdown/design/index.md
new file mode 100644
index 0000000000..4212297b71
--- /dev/null
+++ b/markdown/design/index.md

# GStreamer design documents

This section gathers the various GStreamer design documents. These are
the technical documents that were produced while developing or
refactoring parts of the GStreamer design, explaining each problem and
the design solution we came up with to solve it.

diff --git a/markdown/design/latency.md b/markdown/design/latency.md
new file mode 100644
index 0000000000..bb9351bc90
--- /dev/null
+++ b/markdown/design/latency.md

# Latency

The latency is the time it takes for a sample captured at timestamp 0 to
reach the sink. This time is measured against the clock in the pipeline.
For pipelines where the only elements that synchronize against the clock
are the sinks, the latency is always 0, since no other element is
delaying the buffer.

For pipelines with live sources, a latency is introduced, mostly because
of the way a live source works. Consider an audio source: it will start
capturing the first sample at time 0. If the source pushes buffers of
44100 samples at a time at 44100Hz, it will have collected the first
buffer at second 1. Since the timestamp of the buffer is 0 and the time
of the clock is now \>= 1 second, the sink will drop this buffer because
it is too late. Without any latency compensation in the sink, all
buffers will be dropped.

The situation becomes more complex in the presence of:

- 2 live sources connected to 2 live sinks with different latencies
  - audio/video capture with synchronized live preview.
  - added latencies due to effects (delays, resamplers…)

- 1 live source connected to 2 live sinks
  - firewire DV
  - RTP, with added latencies because of jitter buffers.

- mixed live source and non-live source scenarios.
  - synchronized audio capture with non-live playback. (overdubs, …)

- clock slaving in the sinks due to the live sources providing their
  own clocks.

To perform the needed latency corrections in the above scenarios, we
must develop an algorithm to calculate a global latency for the
pipeline. The algorithm must be extensible so that it can optimize the
latency at runtime. It must also be possible to disable or tune the
algorithm based on specific application needs (such as a required
minimal latency).

## Pipelines without latency compensation

We show some examples to demonstrate the problem of latency in typical
capture pipelines.

### Example 1

An audio capture/playback pipeline.

* asrc: audio source, provides a clock
* asink: audio sink, provides a clock

```
    .--------------------------.
    | pipeline                 |
    | .------.      .-------.  |
    | | asrc |      | asink |  |
    | | src -> sink         |  |
    | '------'      '-------'  |
    '--------------------------'
```

* *NULL→READY*:
  * asink: *NULL→READY*: probes device, returns `SUCCESS`
  * asrc: *NULL→READY*: probes device, returns `SUCCESS`

* *READY→PAUSED*:
  * asink: *READY→PAUSED*: open device, returns `ASYNC`
  * asrc: *READY→PAUSED*: open device, returns `NO_PREROLL`

- Since the source is a live source, it will only produce data in
  the `PLAYING` state. To note this fact, it returns `NO_PREROLL`
  from the state change function.

- The sink returns `ASYNC` because it can only complete the state
  change to `PAUSED` when it receives the first buffer.

At this point the pipeline is not processing data and the clock is not
running. Unless a new action is performed on the pipeline, this
situation will never change.
* *PAUSED→PLAYING*: the asrc clock is selected because it is the most
upstream clock provider. asink can only provide a clock when it has
received the first buffer and configured the device with the samplerate
from the caps.

* sink: *PAUSED→PLAYING*: sets the pending state to `PLAYING`, returns
`ASYNC` because it is not prerolled. The sink will commit its state to
`PLAYING` when it prerolls.
* src: *PAUSED→PLAYING*: starts pushing buffers.

- Since the sink is still performing a state change from
  `READY→PAUSED`, it remains ASYNC. The pending state will be set to
  PLAYING.

- The clock starts running as soon as all the elements have been
  set to PLAYING.

- The source is a live source with a latency. Since it is
  synchronized with the clock, it will produce a buffer with
  timestamp 0 and duration D after time D, i.e. it will only be
  able to produce the last sample of the buffer (with timestamp D)
  at time D. This latency depends on the size of the buffer.

- The sink will receive the buffer with timestamp 0 at time \>= D.
  At this point the buffer is already too late and might be
  dropped. This state of constantly dropping data will not change
  unless a constant latency correction is added to the incoming
  buffer timestamps.

The problem is due to the fact that the sink is set to (pending) PLAYING
without being prerolled, which only happens in live pipelines.

### Example 2

An audio/video capture/playback pipeline. We capture both audio and video and
have them played back synchronized again.

* asrc: audio source, provides a clock
* asink: audio sink, provides a clock
* vsrc: video source
* vsink: video sink

```
    .--------------------------.
    | pipeline                 |
    | .------.      .-------.  |
    | | asrc |      | asink |  |
    | | src -> sink         |  |
    | '------'      '-------'  |
    | .------.      .-------.  |
    | | vsrc |      | vsink |  |
    | | src -> sink         |  |
    | '------'      '-------'  |
    '--------------------------'
```

The state changes happen in the same way as example 1.
Both sinks end up with a pending state of `PLAYING` and a return value of
ASYNC until they receive the first buffer.

For audio and video to be played in sync, both sinks must compensate for
the latency of their respective source, but they must also use exactly
the same latency correction.

Suppose asrc has a latency of 20ms and vsrc a latency of 33ms; the total
latency in the pipeline has to be at least 33ms. This also means that the
pipeline must have at least 33 - 20 = 13ms of buffering on the audio
stream, or else the audio src will underrun while the audiosink waits for
the previous sample to play.

### Example 3

An example of the combination of a non-live (file) source and a live
source (vsrc) connected to live sinks (vsink, sink).

```
    .--------------------------.
    | pipeline                 |
    | .------.      .-------.  |
    | | file |      | sink  |  |
    | | src -> sink         |  |
    | '------'      '-------'  |
    | .------.      .-------.  |
    | | vsrc |      | vsink |  |
    | | src -> sink         |  |
    | '------'      '-------'  |
    '--------------------------'
```

The state changes happen in the same way as example 1, except that sink
will be able to preroll (commit its state to PAUSED).

In this case sink will have no latency but vsink will. The total latency
should be that of vsink.

Note that because of the presence of a live source (vsrc), the pipeline can be
set to playing before sink is able to preroll. Without compensation for the
live source, this might lead to synchronisation problems because the latency
should be configured in the element before it can go to PLAYING.

### Example 4

An example of the combination of a non-live and a live source. The non-live
source is connected to a live sink and the live source to a non-live sink.

```
    .--------------------------.
    | pipeline                 |
    | .------.      .-------.  |
    | | file |      | sink  |  |
    | | src -> sink         |  |
    | '------'      '-------'  |
    | .------.      .-------.  |
    | | vsrc |      | files |  |
    | | src -> sink         |  |
    | '------'      '-------'  |
    '--------------------------'
```

The state changes happen in the same way as example 3. sink will be
able to preroll (commit its state to PAUSED); files will not be able to
preroll.

sink will have no latency since it is not connected to a live source. files
does not do synchronisation, so it does not care about latency.

The total latency in the pipeline is 0. The vsrc captures in sync with the
playback in sink.

As in example 3, sink can only be set to `PLAYING` after it successfully
prerolled.

## State Changes

A sink is never set to `PLAYING` before it is prerolled. In order to do
this, the pipeline (at the GstBin level) keeps track of all elements
that require preroll (the ones that return ASYNC from the state change).
These elements posted an `ASYNC_START` message without a matching
`ASYNC_DONE` message.

The pipeline will not change the state of elements that are still
performing an ASYNC state change.

When an ASYNC element prerolls, it commits its state to PAUSED and posts
an `ASYNC_DONE` message. The pipeline notices this `ASYNC_DONE` message
and matches it with the `ASYNC_START` message it cached for the
corresponding element.

When all `ASYNC_START` messages are matched with an `ASYNC_DONE` message,
the pipeline proceeds with setting the elements to the final state
again.

The base time of the element was already set by the pipeline when it
changed the NO\_PREROLL element to PLAYING. This operation has to be
performed in a separate async state change thread (like the one
currently used for going from `PAUSED→PLAYING` in a non-live pipeline).

## Query

The pipeline latency is queried with the LATENCY query.
* **`live`** G_TYPE_BOOLEAN (default FALSE): set to TRUE if a live
element is found upstream.

* **`min-latency`** G_TYPE_UINT64 (default 0, must not be NONE): the minimum
latency in the pipeline, meaning the minimum time downstream elements
synchronizing to the clock have to wait until they can be sure that all data
for the current running time has been received.

Elements answering the latency query and introducing latency must
set this to the maximum time for which they will delay data, while
considering upstream's minimum latency. As such, from an element's
perspective this is *not* its own minimum latency but its own
maximum latency.
Considering upstream's minimum latency in general means that the
element's own value is added to upstream's value, as this will give
the overall minimum latency of all elements from the source to the
current element:

    min_latency = upstream_min_latency + own_min_latency

* **`max-latency`** G_TYPE_UINT64 (default 0, NONE meaning infinity): the
maximum latency in the pipeline, meaning the maximum time an element
synchronizing to the clock is allowed to wait for receiving all data for the
current running time. Waiting for a longer time will result in data loss,
overruns and underruns of buffers, and in general breaks synchronized data
flow in the pipeline.

Elements answering the latency query should set this to the maximum
time for which they can buffer upstream data without blocking or
dropping further data. For an element this value will generally be
its own minimum latency, but might be bigger than that if it can
buffer more data. As such, queue elements can be used to increase
the maximum latency.

The value set in the query should again consider upstream's maximum
latency:

- If the current element has blocking buffering, i.e. it does not drop data by
itself when its internal buffer is full, it should just add its own maximum
latency (i.e. the size of its internal buffer) to upstream's value.
If upstream's maximum latency or the element's internal maximum latency
is NONE (i.e. infinity), the result will be set to infinity.

    if (upstream_max_latency == NONE || own_max_latency == NONE)
      max_latency = NONE;
    else
      max_latency = upstream_max_latency + own_max_latency;

If the element has multiple sinkpads, the minimum upstream latency is
the maximum of all live upstream minimum latencies.

If the current element has leaky buffering, i.e. it drops data by itself
when its internal buffer is full, it should take the minimum of its own
maximum latency and upstream’s. Examples of such elements are audio sinks
and sources with an internal ringbuffer, leaky queues and in general live
sources with a limited amount of internal buffers that can be used.

    max_latency = MIN (upstream_max_latency, own_max_latency)

> Note: many GStreamer base classes allow subclasses to set a
> minimum and maximum latency and handle the query themselves. These
> base classes assume non-leaky (i.e. blocking) buffering for the
> maximum latency. The base class' default query handler needs to be
> overridden to correctly handle leaky buffering.

If the element has multiple sinkpads, the maximum upstream latency is the
minimum of all live upstream maximum latencies.

## Event

The latency in the pipeline is configured with the LATENCY event, which
contains the following fields:

* **`latency`** G_TYPE_UINT64: the configured latency in the pipeline

## Latency compensation

Latency calculation and compensation is performed before the pipeline
proceeds to the `PLAYING` state.
+ +When the pipeline collected all `ASYNC_DONE` messages it can calculate +the global latency as follows: + +- perform a latency query on all sinks +- sources set their minimum and maximum latency +- other elements add their own values as described above +- latency = MAX (all min latencies) +- if MIN (all max latencies) \< latency we have an impossible +situation and we must generate an error indicating that this +pipeline cannot be played. This usually means that there is not +enough buffering in some chain of the pipeline. A queue can be added +to those chains. + +The sinks gather this information with a LATENCY query upstream. +Intermediate elements pass the query upstream and add the amount of +latency they add to the result. + +ex1: sink1: \[20 - 20\] sink2: \[33 - 40\] + + MAX (20, 33) = 33 + MIN (20, 40) = 20 < 33 -> impossible + +ex2: sink1: \[20 - 50\] sink2: \[33 - 40\] + + MAX (20, 33) = 33 + MIN (50, 40) = 40 >= 33 -> latency = 33 + +The latency is set on the pipeline by sending a LATENCY event to the +sinks in the pipeline. This event configures the total latency on the +sinks. The sink forwards this LATENCY event upstream so that +intermediate elements can configure themselves as well. + +After this step, the pipeline continues setting the pending state on its +elements. + +A sink adds the latency value, received in the LATENCY event, to the +times used for synchronizing against the clock. This will effectively +delay the rendering of the buffer with the required latency. Since this +delay is the same for all sinks, all sinks will render data relatively +synchronised. + +## Flushing a playing pipeline + +We can implement resynchronisation after an uncontrolled FLUSH in (part +of) a pipeline in the same way. Indeed, when a flush is performed on a +PLAYING live element, a new base time must be distributed to this +element. 
A flush in a pipeline can happen in the following cases:

  - a flushing seek in the pipeline

    - performed by the application on the pipeline

    - performed by the application on an element

  - a flush performed by an element

    - after receiving a navigation event (DVD, …)

When a playing sink is flushed by a `FLUSH_START` event, an `ASYNC_START`
message is posted by the element. As part of the message, the fact that
the element got flushed is included. The element also goes to a pending
PAUSED state and has to be set to the `PLAYING` state again later.

The `ASYNC_START` message is kept by the parent bin. When the element
prerolls, it posts an `ASYNC_DONE` message.

When all `ASYNC_START` messages are matched with an `ASYNC_DONE` message,
the bin will capture a new base\_time from the clock and will bring all
the sinks back to `PLAYING` after setting the new base time on them. It’s
also possible to perform additional latency calculations and adjustments
before doing this.

## Dynamically adjusting latency

An element that wants to change the latency in the pipeline can do this
by posting a LATENCY message on the bus. This message instructs the
pipeline to:

  - query the latency in the pipeline (which might now have changed)
    with a LATENCY query.

  - redistribute a new global latency to all elements with a LATENCY
    event.

A use case where the latency in a pipeline can change could be a network
element that observes increased inter-packet arrival jitter or
excessive packet loss and decides to increase its internal buffering
(and thus the latency). The element must post a LATENCY message and
perform the additional latency adjustments when it receives the LATENCY
event from the downstream peer element.

In a similar way, the latency can be decreased when network conditions
improve again.

Latency adjustments will introduce glitches in playback in the sinks and
must only be performed in special conditions.
diff --git a/markdown/design/live-source.md b/markdown/design/live-source.md
new file mode 100644
index 0000000000..dc12d1de1e
--- /dev/null
+++ b/markdown/design/live-source.md

# Live sources

A live source is a source that cannot be arbitrarily `PAUSED` without
losing data.

A live source, such as an element capturing audio or video, needs to be
handled in a special way. It does not make sense to start the dataflow
in the `PAUSED` state for those devices, as the user might wait a long
time between going from `PAUSED` to PLAYING, making the previously
captured buffers irrelevant.

A live source therefore only produces buffers in the PLAYING state. This
has implications for sinks waiting for a buffer to complete the preroll
stage, since such a buffer might never arrive.

Live sources return `NO_PREROLL` when going to the `PAUSED` state to inform
the bin/pipeline that this element will not be able to produce data in
the `PAUSED` state. `NO_PREROLL` should be returned for both READY→PAUSED
and PLAYING→PAUSED.

When performing a get\_state() on a bin with a non-zero timeout value,
the bin must be sure that there are no live sources in the pipeline,
because otherwise the get\_state() function would block on the sinks.

A GstBin therefore always performs a zero-timeout get\_state() on its
elements to discover the `NO_PREROLL` (and ERROR) elements before
performing a blocking wait.

## Scheduling

Live sources will not produce data in the paused state. They block in
the getrange function or in the loop function until they go to PLAYING.

## Latency

The live source timestamps its data with the time of the clock at the
time the data was captured. Normally, some time passes between capturing
the first sample and the last sample of a buffer. This means that when
the buffer arrives at the sink, it will already be late and will be
dropped.

The latency is the time it takes to construct one buffer of data.
This latency is exposed with a `LATENCY` query.

See [latency](design/latency.md)

## Timestamps

Live sources always timestamp their buffers with the running\_time of
the pipeline. This is needed to be able to match the timestamps of
different live sources in order to synchronize them.

This is in contrast to non-live sources, which timestamp their buffers
starting from running\_time 0.

diff --git a/markdown/design/memory.md b/markdown/design/memory.md
new file mode 100644
index 0000000000..335281767b
--- /dev/null
+++ b/markdown/design/memory.md

# GstMemory

This document describes the design of the memory objects.

GstMemory objects are usually added to GstBuffer objects and contain the
multimedia data passed around in the pipeline.

## Requirements

- It must be possible to have different memory allocators
- It must be possible to efficiently share memory objects, copy, span
  and trim.

## Memory layout

`GstMemory` manages a memory region. The accessible part of the managed
region is defined by an offset relative to the start of the region and a
size. This means that the managed region can be larger than what is
visible to the user of the GstMemory API.

Schematically, GstMemory has a pointer to a memory region of _maxsize_
bytes. The area of _size_ bytes starting at _offset_ is accessible.

```
                          memory
GstMemory ->*----------------------------------------------------*
            ^----------------------------------------------------^
                               maxsize
                 ^--------------------------------------^
               offset               size
```

The current properties of the accessible memory can be retrieved with:

``` c
  gsize gst_memory_get_sizes (GstMemory *mem, gsize *offset, gsize *maxsize);
```

The offset and size can be changed with:

``` c
  void gst_memory_resize (GstMemory *mem, gssize offset, gsize size);
```

## Allocators

GstMemory objects are created by allocators.
Allocators are a subclass of GstObject and can be subclassed to make
custom allocators.

``` c
struct _GstAllocator {
  GstObject object;

  const gchar *mem_type;

  GstMemoryMapFunction mem_map;
  GstMemoryUnmapFunction mem_unmap;
  GstMemoryCopyFunction mem_copy;
  GstMemoryShareFunction mem_share;
  GstMemoryIsSpanFunction mem_is_span;
};
```

The allocator class has two virtual methods: one to allocate a GstMemory
and another to free it again.

``` c
struct _GstAllocatorClass {
  GstObjectClass object_class;

  GstMemory * (*alloc) (GstAllocator *allocator, gsize size,
                        GstAllocationParams *params);
  void (*free) (GstAllocator *allocator, GstMemory *memory);
};
```

Allocators are refcounted. It is also possible to register an allocator
with the GStreamer system. This way, the allocator can be retrieved by
name.

After an allocator is created, new GstMemory objects can be created with

``` c
  GstMemory * gst_allocator_alloc (const GstAllocator * allocator,
                                   gsize size,
                                   GstAllocationParams *params);
```

GstAllocationParams contain extra info such as flags, alignment, prefix
and padding.

The GstMemory object is a refcounted object that must be freed with
`gst_memory_unref()`.

The GstMemory keeps a ref to the allocator that allocated it. The most
common GstMemory operations are listed inside the allocator structure.
Custom GstAllocator implementations must implement the various
operations on the memory they allocate.

It is also possible to create a new GstMemory object that wraps existing
memory with:

``` c
  GstMemory * gst_memory_new_wrapped (GstMemoryFlags flags,
                                      gpointer data, gsize maxsize,
                                      gsize offset, gsize size,
                                      gpointer user_data,
                                      GDestroyNotify notify);
```

## Lifecycle

GstMemory extends from GstMiniObject and therefore uses its lifecycle
management (See [miniobject](design/miniobject.md)).

## Data Access

Access to the memory region is always controlled with a map and unmap
method call.
This allows the implementation to monitor the access patterns or set up +the required memory mappings when needed. + +The access of the memory object is controlled with the locking mechanism on +GstMiniObject (See [miniobject](design/miniobject.md)). + +Mapping a memory region requires the caller to specify the access method: READ +and/or WRITE. Mapping a memory region will first try to get a lock on the +memory in the requested access mode. This means that the map operation can +fail when WRITE access is requested on a non-writable memory object (it has +an exclusive counter > 1, the memory is already locked in an incompatible +access mode or the memory is marked readonly). + +After the data has been accessed in the object, the unmap call must be +performed, which will unlock the memory again. + +It is allowed to recursively map multiple times with the same or narrower +access modes. For each of the map calls, a corresponding unmap call needs to +be made. WRITE-only memory cannot be mapped in READ mode and READ-only memory +cannot be mapped in WRITE mode. + +The memory pointer returned from the map call is guaranteed to remain valid in +the requested mapping mode until the corresponding unmap call is performed on +the pointer. + +When multiple map operations are nested and return the same pointer, the pointer +is valid until the last unmap call is done. + +When the final reference on a memory object is dropped, all outstanding +mappings should have been unmapped. + +Resizing a GstMemory does not influence any current mappings in any way. + +## Copy + +A GstMemory copy can be made with the `gst_memory_copy()` call. Normally, +allocators will implement a custom version of this function to make a copy of +the same kind of memory as the original one. + +This is what the fallback version of the copy function does, albeit slower +than what a custom implementation could do. + +The copy operation is only required to copy the visible range of the memory +block. 
## Share

A memory region can be shared between GstMemory objects with the
`gst_memory_share()` operation.

diff --git a/markdown/design/messages.md b/markdown/design/messages.md
new file mode 100644
index 0000000000..c3a94a228f
--- /dev/null
+++ b/markdown/design/messages.md

# Messages

Messages are refcounted lightweight objects used to signal the
application of pipeline events.

Messages are implemented as a subclass of GstMiniObject with a generic
GstStructure as the content. This allows for writing custom messages
without requiring an API change while allowing a wide range of different
types of messages.

Messages are posted by objects in the pipeline and are passed to the
application using the GstBus (See also [gstbus](design/gstbus.md)
and [gstpipeline](design/gstpipeline.md)).

## Message types

**`GST_MESSAGE_EOS`**: Posted by sink elements. This message is posted to the
application when all the sinks in a pipeline have posted an EOS message. When
performing a flushing seek, the EOS state of the pipeline and sinks is reset.

**`GST_MESSAGE_ERROR`**: An element in the pipeline got into an error state.
The message carries a GError and a debug string describing the error. This
usually means that part of the pipeline is not streaming anymore.

**`GST_MESSAGE_WARNING`**: An element in the pipeline encountered a condition
that made it produce a warning. This could be a recoverable decoding error or
some other non-fatal event. The pipeline continues streaming after a warning.

**`GST_MESSAGE_INFO`**: An element produced an informational message.

**`GST_MESSAGE_TAG`**: An element decoded metadata about the stream. The
message carries a GstTagList with the tag information.

**`GST_MESSAGE_BUFFERING`**: An element is buffering data and that could
potentially take some time. This message is typically emitted by elements that
perform some sort of network buffering.
While the pipeline is buffering, it +should remain in the PAUSED state. When the buffering is finished, it can +resume PLAYING. + +**`GST_MESSAGE_STATE_CHANGED`**: An element changed state in the pipeline. +The message carries the old, new and pending state of the element. + +**`GST_MESSAGE_STATE_DIRTY`**: An internal message used to instruct +a pipeline hierarchy that a state recalculation must be performed because an +ASYNC state change completed. This message is not used anymore. + +**`GST_MESSAGE_STEP_DONE`**: An element stepping frames has finished. This is +currently not used. + +**`GST_MESSAGE_CLOCK_PROVIDE`**: An element notifies its capability of +providing a clock for the pipeline. + +**`GST_MESSAGE_CLOCK_LOST`**: The current clock, as selected by the pipeline, +became unusable. The pipeline will select a new clock on the next PLAYING state +change. + +**`GST_MESSAGE_NEW_CLOCK`**: A new clock was selected for the pipeline. + +**`GST_MESSAGE_STRUCTURE_CHANGE`**: The pipeline changed its structure. This +means elements were added or removed or pads were linked or unlinked. This +message is not yet used. + +**`GST_MESSAGE_STREAM_STATUS`**: Posted by an element when it +starts/stops/pauses a streaming task. It contains information about the reason +why the stream state changed along with the thread id. The application can use +this information to detect failures in streaming threads and/or to adjust +streaming thread priorities. + +**`GST_MESSAGE_APPLICATION`**: The application posted a message. This message +must be used when the application posts a message on the bus. + +**`GST_MESSAGE_ELEMENT`**: Element-specific message. See the specific +element's documentation. + +**`GST_MESSAGE_SEGMENT_START`**: An element started playback of a new +segment. This message is not forwarded to applications but is used internally +to schedule SEGMENT_DONE messages. + +**`GST_MESSAGE_SEGMENT_DONE`**: An element or bin completed playback of +a segment.
This message is only posted on the bus if a SEGMENT seek is +performed on a pipeline. + +**`GST_MESSAGE_DURATION_CHANGED`**: An element posts this message when it has +detected or updated the stream duration. + +**`GST_MESSAGE_ASYNC_START`**: Posted by sinks when they start an +asynchronous state change. + +**`GST_MESSAGE_ASYNC_DONE`**: Posted by sinks when they receive the first +data buffer and complete the asynchronous state change. + +**`GST_MESSAGE_LATENCY`**: Posted by elements when the latency in a pipeline +changed and a new global latency should be calculated by the pipeline or +application. + +**`GST_MESSAGE_REQUEST_STATE`**: Posted by elements when they want to change +the state of the pipeline they are in. A typical use case would be an audio +sink that requests the pipeline to pause in order to play a higher priority +stream. + +**`GST_MESSAGE_STEP_START`**: A Stepping operation has started. + +**`GST_MESSAGE_QOS`**: A buffer was dropped or an element changed its +processing strategy for Quality of Service reasons. + +**`GST_MESSAGE_PROGRESS`**: A progress message was posted. Progress messages +inform the application about the state of asynchronous operations. diff --git a/markdown/design/meta.md b/markdown/design/meta.md new file mode 100644 index 0000000000..8e92d561c6 --- /dev/null +++ b/markdown/design/meta.md @@ -0,0 +1,410 @@ +# GstMeta + +This document describes the design for arbitrary per-buffer metadata. + +Buffer metadata typically describes the low level properties of the +buffer content. These properties are commonly not negotiated with caps +but they are negotiated in the bufferpools. 
+ +Some examples of metadata: + + - interlacing information + + - video alignment, cropping, panning information + + - extra container information such as granulepos, … + + - extra global buffer properties + +## Requirements + + - It must be fast + + - allocation, free, low fragmentation + + - access to the metadata fields, preferably not much slower than + directly accessing a C structure field + + - It must be extensible. Elements should be able to add new arbitrary + metadata without requiring much effort. Adding new metadata fields + should not break the API or ABI. + + - It must play nicely with subbuffers. When a subbuffer is created, the + various buffer metadata should be copied/updated correctly. + + - We should be able to negotiate metadata between elements. + +## Use cases + +- **Video planes**: Video data is sometimes allocated in non-contiguous planes +for the Y and the UV data. We need to be able to specify the data on a buffer +using multiple pointers in memory. We also need to be able to specify the +stride for these planes. + +- **Extra buffer data**: Some elements might need to store extra data for +a buffer. This is typically done when the resources are allocated from another +subsystem such as OMX or X11. + +- **Processing information**: Pan and crop information can be added to the +buffer data when the downstream element can understand and use this metadata. +An imagesink can, for example, use the pan and cropping information when +blitting the image on the screen with little overhead. + +## GstMeta + +A GstMeta is a structure as follows: + +``` c +struct _GstMeta { + GstMetaFlags flags; + const GstMetaInfo *info; /* tag and info for the meta item */ +}; +``` + +The purpose of this structure is to serve as a common header for all +metadata information that we can attach to a buffer. Specific metadata, +such as timing metadata, will have this structure as the first field.
+For example: + +``` c +struct _GstMetaTiming { + GstMeta meta; /* common meta header */ + + GstClockTime dts; /* decoding timestamp */ + GstClockTime pts; /* presentation timestamp */ + GstClockTime duration; /* duration of the data */ + GstClockTime clock_rate; /* clock rate for the above values */ +}; +``` + +Another example, for video memory regions, consists of both +fields and methods: + +``` c + #define GST_VIDEO_MAX_PLANES 4 + +struct GstMetaVideo { + GstMeta meta; + + GstBuffer *buffer; + + GstVideoFlags flags; + GstVideoFormat format; + guint id; + guint width; + guint height; + + guint n_planes; + gsize offset[GST_VIDEO_MAX_PLANES]; /* offset in the buffer memory region of the + * first pixel. */ + gint stride[GST_VIDEO_MAX_PLANES]; /* stride of the image lines. Can be negative when + * the image is upside-down */ + + gpointer (*map) (GstMetaVideo *meta, guint plane, gpointer * data, gint *stride, + GstMapFlags flags); + gboolean (*unmap) (GstMetaVideo *meta, guint plane, gpointer data); +}; + +gpointer gst_meta_video_map (GstMetaVideo *meta, guint plane, gpointer * data, + gint *stride, GstMapFlags flags); +gboolean gst_meta_video_unmap (GstMetaVideo *meta, guint plane, gpointer data); +``` + +GstMeta-derived structures define the API of the metadata. The API can +consist of fields and/or methods. It is possible to have different +implementations for the same GstMeta structure. + +The implementation of the GstMeta API would typically add more fields to +the public structure that allow it to implement the API. + +GstMetaInfo will point to more information about the metadata and looks +like this: + +``` c +struct _GstMetaInfo { + GType api; /* api type */ + GType type; /* implementation type */ + gsize size; /* size of the structure */ + + GstMetaInitFunction init_func; + GstMetaFreeFunction free_func; + GstMetaTransformFunction transform_func; +}; +``` + +The `api` field will contain the GType of the metadata API.
A repository of registered +MetaInfo will be maintained by the core. We will register some common +metadata structures in core and some media specific info for +audio/video/text in -base. Plugins can register additional custom +metadata. + +For each implementation of api, there will thus be a unique GstMetaInfo. +In the case of metadata with a well defined API, the implementation +specific init function will setup the methods in the metadata structure. +A unique GType will be made for each implementation and stored in the +type field. + +Along with the metadata description we will have functions to +initialize/free (and/or refcount) a specific GstMeta instance. We also +have the possibility to add a custom transform function that can be used +to modify the metadata when a transformation happens. + +There are no explicit methods to serialize and deserialize the metadata. +Since each type has a GType, we can reuse the GValue transform functions +for this. + +The purpose of the separate MetaInfo is to not have to carry the +free/init functions in each buffer instance but to define them globally. +We still want quick access to the info so we need to make the buffer +metadata point to the info. + +Technically we could also specify the field and types in the MetaInfo +and provide a generic API to retrieve the metadata fields without the +need for a header file. We will not do this yet. + +Allocation of the GstBuffer structure will result in the allocation of a +memory region of a customizable size (512 bytes). Only the first sizeof +(GstBuffer) bytes of this region will initially be used. The remaining +bytes will be part of the free metadata region of the buffer. Different +implementations are possible and are invisible in the API or ABI. 
+ +The complete buffer with metadata could, for example, look as follows: + +``` + +-------------------------------------+ +GstMiniObject | GType (GstBuffer) | + | refcount, flags, copy/disp/free | + +-------------------------------------+ +GstBuffer | pool,pts,dts,duration,offsets | + | | + +.....................................+ + | next ---+ + +- | info ------> GstMetaInfo +GstMetaTiming | | | | + | | dts | | + | | pts | | + | | duration | | + +- | clock_rate | | + + . . . . . . . . . . . . . . . . . . + | + | next <--+ +GstMetaVideo +- +- | info ------> GstMetaInfo + | | | | | + | | | flags | | + | | | n_planes | | + | | | planes[] | | + | | | map | | + | | | unmap | | + +- | | | | + | | private fields | | +GstMetaVideoImpl | | ... | | + | | ... | | + +- | | | + + . . . . . . . . . . . . . . . . . . + . + . . +``` + +## API examples + +Buffers are created using the normal gst\_buffer\_new functions. The +standard fields are initialized as usual. A memory area that is bigger +than the structure size is allocated for the buffer metadata. + +``` c + gst_buffer_new (); +``` + +After creating a buffer, the application can set caps and add metadata +information. + +To add or retrieve metadata, a handle to a GstMetaInfo structure needs +to be obtained. This defines the implementation and API of the metadata. +Usually, a handle to this info structure can be obtained by calling a +public `_get\_info()` method from a shared library (for shared metadata). + +The following defines can usually be found in the shared .h file. + +``` c + GstMetaInfo * gst_meta_timing_get_info(); + #define GST_META_TIMING_INFO (gst_meta_timing_get_info()) +``` + +Adding metadata to a buffer can be done with the +`gst_buffer_add_meta()` call. This function will create new metadata +based on the implementation specified by the GstMetaInfo. It is also +possible to pass a generic pointer to the `add_meta()` function that can +contain parameters to initialize the new metadata fields. 
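The add/get mechanism sketched above can be modelled in a few lines of plain C. This is a toy illustration, not the real implementation: `Buffer`, `Meta`, `MetaInfo` and the helper functions are invented stand-ins, and the real GstBuffer allocates metadata from its preallocated region rather than with `calloc()`.

``` c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-ins for GstMetaInfo and GstMeta */
typedef struct { const char *api; size_t size; } MetaInfo;

typedef struct Meta {
  const MetaInfo *info;
  struct Meta    *next;
} Meta;

typedef struct { Meta *metas; } Buffer;

/* Allocate a meta item of info->size and link it into the buffer */
static void *
buffer_add_meta (Buffer *b, const MetaInfo *info)
{
  Meta *m = calloc (1, info->size);
  m->info = info;
  m->next = b->metas;
  b->metas = m;
  return m;
}

/* Return the first meta implementing the given API, or NULL */
static void *
buffer_get_meta (Buffer *b, const MetaInfo *info)
{
  for (Meta *m = b->metas; m != NULL; m = m->next)
    if (m->info == info)
      return m;
  return NULL;
}

/* An example "timing" meta with the common header as first field */
typedef struct {
  Meta meta;
  long pts, duration;
} MetaTiming;

static const MetaInfo TIMING_INFO = { "timing", sizeof (MetaTiming) };
```

Because the common `Meta` header is the first field, the pointer returned by `buffer_add_meta()` can be cast directly to the specific metadata type, just as in the API examples that follow.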
+ +Retrieving the metadata on a buffer can be done with the +`gst_buffer_get_meta()` method. This function retrieves existing +metadata conforming to the API specified in the given info. When no such +metadata exists, the function will return NULL. + +``` c + GstMetaTiming *timing; + + timing = gst_buffer_get_meta (buffer, GST_META_TIMING_INFO); +``` + +Once a reference to the info has been obtained, the associated metadata +can be added or modified on a buffer. + +``` c + timing->timestamp = 0; + timing->duration = 20 * GST_MSECOND; +``` + +Other convenience macros can be made to simplify the above code: + +``` c + #define gst_buffer_get_meta_timing(b) \ + ((GstMetaTiming *) gst_buffer_get_meta ((b), GST_META_TIMING_INFO)) +``` + +This makes the code look like this: + +``` c + GstMetaTiming *timing; + + timing = gst_buffer_get_meta_timing (buffer); + timing->timestamp = 0; + timing->duration = 20 * GST_MSECOND; +``` + +To iterate the different metainfo structures, one can use the +`gst_buffer_meta_get_next()` method. + +``` c + GstMeta *current = NULL; + + /* passing NULL gives the first entry */ + current = gst_buffer_meta_get_next (buffer, current); + + /* passing a GstMeta returns the next */ + current = gst_buffer_meta_get_next (buffer, current); +``` + +## Memory management + +### allocation + +We initially allocate a reasonably sized GstBuffer structure (say 512 bytes). + +Since the complete buffer structure, including a large area for metadata, is +allocated in one go, we can reduce the number of memory allocations while still +providing dynamic metadata. + +When adding metadata, we need to call the init function of the associated +metadata info structure. Since adding the metadata requires the caller to pass +a handle to the info, this operation does not require table lookups. + +Per-metadata memory initialisation is needed because not all metadata is +initialized in the same way.
We need to, for example, set the timestamps to +NONE in the MetaTiming structures. + +The init/free functions can also be used to implement refcounting for a metadata +structure. This can be useful when a structure is shared between buffers. + +When the free_size of the GstBuffer is exhausted, we will allocate new memory +for each newly added Meta and use the next pointers to point to this. It +is expected that this does not occur often and we might be able to optimize +this transparently in the future. + +### free + +When a GstBuffer is freed, we might have to call a custom free +function on the metadata info. In the case of the Memory metadata, we need to +call the associated free function to free the memory. + +When freeing a GstBuffer, the custom buffer free function will iterate all of +the metadata in the buffer and call the associated free functions in the +MetaInfo associated with the entries. Usually, this function will be NULL. + +## Serialization + +When a buffer should be sent over the wire or be serialized in GDP, we +need a way to perform custom serialization and deserialization on the +metadata. + +For this, we can use the GValue transform functions. + +## Transformations + +After certain transformations, the metadata on a buffer might not be +relevant anymore. + +Consider, for example, metadata that lists certain regions of interest +on the video data. If the video is scaled or rotated, the coordinates +might not make sense anymore. A transform element should be able to +adjust or remove the associated metadata when it becomes invalid. + +We can make the transform element aware of the metadata so that it can +adjust or remove it in an intelligent way. Since we allow arbitrary +metadata, we can’t do this for all metadata and thus we need some other +way. + +One proposition is to tag the metadata type with keywords that specify +what it functionally refers to.
We could, for example, tag the metadata +for the regions of interest with a tag that notes that the metadata +refers to absolute pixel positions. A transform could then know that the +metadata is not valid anymore when the position of the pixels changed +(due to rotation, flipping, scaling and so on). + +## Subbuffers + +Subbuffers are implemented with a generic copy. Parameters to the copy +are the offset and size. This allows each metadata structure to +implement the actions needed to update the metadata of the subbuffer. + +It might not make sense for some metadata to work with subbuffers. For +example, when we take a subbuffer of a buffer with a video frame, the +GstMetaVideo simply becomes invalid and is removed from the new +subbuffer. + +## Relationship with GstCaps + +The difference between GstCaps, used in negotiation, and the metadata is +not clearly defined. + +We would like to think of the GstCaps as containing the information needed +to functionally negotiate the format between two elements. The metadata +should then only contain variables that can change between each buffer. + +For example, for video we would have width/height/framerate in the caps +but then have the more technical details, such as stride, data pointers, +pan/crop/zoom, etc. in the metadata. + +A scheme like this would still allow us to functionally specify the +desired video resolution while the implementation details would be +inside the metadata. + +## Relationship with GstMiniObject qdata + +qdata on a miniobject is element-private and is not visible to other +elements. Therefore qdata never contains essential information that +describes the buffer content. + +## Compatibility + +We need to make sure that elements exchange metadata that they both +understand. This is particularly important when the metadata describes +the data layout in memory (such as strides). + +The ALLOCATION query is used to let upstream know what metadata we can +support.
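A hypothetical sketch of such a compatibility check: upstream only attaches metadata whose API is in the set that downstream reported (modelled here as a plain string array; in reality this information travels in the ALLOCATION query, and the names below are invented).

``` c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Return TRUE if 'api' is among the 'n' metadata APIs supported downstream */
static bool
meta_api_supported (const char **supported, int n, const char *api)
{
  for (int i = 0; i < n; i++)
    if (strcmp (supported[i], api) == 0)
      return true;
  return false;
}
```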
+ +It is also possible to have a bufferpool add certain metadata to the +buffers from the pool. This feature is activated by enabling a buffer +option when configuring the pool. + +## Notes + +Some structures that we need to be able to add to buffers: + +- Clean Aperture +- Arbitrary Matrix Transform +- Aspect ratio +- Pan/crop/zoom +- Video strides + +Some of these overlap; we need to find a minimal set of metadata +structures that allows us to define all use cases. diff --git a/markdown/design/miniobject.md b/markdown/design/miniobject.md new file mode 100644 index 0000000000..5402ea6f7f --- /dev/null +++ b/markdown/design/miniobject.md @@ -0,0 +1,199 @@ +# GstMiniObject + +This document describes the design of the miniobject base class. + +The miniobject abstract base class is used to construct lightweight +refcounted and boxed types that are frequently created and destroyed. + +## Requirements + +- Be lightweight +- Refcounted +- It must be possible to control access to the object, i.e. when the +object is readable and writable. +- Subclasses must be able to use their own allocator for the memory. + +## Usage + +Users of the GstMiniObject infrastructure will need to define a +structure that includes the GstMiniObject structure as the first field. + +``` c +typedef struct { + GstMiniObject mini_object; + + /* my fields */ + ... +} MyObject; +``` + +The subclass should then implement a constructor method where it +allocates the memory for its structure and initializes the miniobject +structure with `gst_mini_object_init()`. Copy and Free functions are +provided to the `gst_mini_object_init()` function. + +``` c +MyObject * +my_object_new() +{ + MyObject *res = g_slice_new (MyObject); + + gst_mini_object_init (GST_MINI_OBJECT_CAST (res), 0, + MY_TYPE_OBJECT, + (GstMiniObjectCopyFunction) _my_object_copy, + (GstMiniObjectDisposeFunction) NULL, + (GstMiniObjectFreeFunction) _my_object_free); + + /* other init */ + .....
+ + return res; +} +``` + +The Free function is responsible for freeing the allocated memory for +the structure. + +``` c +static void +_my_object_free (MyObject *obj) +{ + /* other cleanup */ + ... + + g_slice_free (MyObject, obj); +} +``` + +## Lifecycle + +GstMiniObject is refcounted. When a GstMiniObject is first created, it +has a refcount of 1. + +Each variable holding a reference to a GstMiniObject is responsible for +updating the refcount. This includes incrementing the refcount with +`gst_mini_object_ref()` when a reference is kept to a miniobject and +decrementing it with `gst_mini_object_unref()` when a reference is released. + +When the refcount reaches 0, and thus no objects hold a reference to the +miniobject anymore, we can free the miniobject. + +When freeing the miniobject, first the GstMiniObjectDisposeFunction is +called. This function is allowed to revive the object again by +incrementing the refcount, in which case it should return FALSE from the +dispose function. The dispose function is used by GstBuffer to revive +the buffer and return it to the GstBufferPool when needed. + +When the dispose function returns TRUE, the GstMiniObjectFreeFunction +will be called and the miniobject will be freed. + +## Copy + +A miniobject can be copied with `gst_mini_object_copy()`. This function +will call the custom copy function that was provided when registering +the new GstMiniObject subclass. + +The copy function should try to preserve as much info from the original +object as possible. + +The new copy should be writable. + +## Access management + +GstMiniObject can be shared between multiple threads. It is important +that when one thread writes to a GstMiniObject, the other threads +do not see the changes. + +To avoid exposing changes from one thread to another thread, the +miniobjects are managed in a copy-on-write way. A copy is only made when +it is known that the object is shared between multiple objects or +threads.
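The copy-on-write behaviour can be sketched with a refcount-based model (the non-LOCKABLE case described below). All names here are invented for illustration; the real logic lives in GstMiniObject.

``` c
#include <assert.h>
#include <stdlib.h>

typedef struct {
  int refcount;
  int value;     /* stand-in for the object's payload */
} MiniObject;

static MiniObject *
mini_object_new (int value)
{
  MiniObject *o = malloc (sizeof (MiniObject));
  o->refcount = 1;
  o->value = value;
  return o;
}

static MiniObject *
mini_object_ref (MiniObject *o)
{
  o->refcount++;
  return o;
}

static void
mini_object_unref (MiniObject *o)
{
  if (--o->refcount == 0)
    free (o);
}

/* Return a writable version: the same object if we hold the only
 * reference, otherwise a fresh copy (the caller's ref is released). */
static MiniObject *
mini_object_make_writable (MiniObject *o)
{
  if (o->refcount == 1)
    return o;
  MiniObject *copy = mini_object_new (o->value);
  mini_object_unref (o);
  return copy;
}
```

Writes after `mini_object_make_writable()` are never visible to other holders of the original reference, which is exactly the isolation property described above.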
+ +There are 2 methods implemented for controlling access to the +miniobject. + + - The first method relies on the refcount of the object to control + writability. Objects using this method have the LOCKABLE flag unset. + + - The second method relies on a separate counter for controlling the + access to the object. Objects using this method have the LOCKABLE + flag set. + You can check if an object is writable with `gst_mini_object_is_writable()` and + you can make any miniobject writable with `gst_mini_object_make_writable()`. + This will create a writable copy when the object was not writable. + +### non-LOCKABLE GstMiniObjects + +These GstMiniObjects have the LOCKABLE flag unset. They use the refcount value +to control writability of the object. + +When the refcount of the miniobject is > 1, the object is referenced by at +least 2 objects and is thus considered unwritable. A copy must be made before a +modification to the object can be done. + +Using the refcount to control writability is problematic for many language +bindings that can keep additional references to the objects. This method is +kept mainly for historical reasons until all users of the miniobjects are +converted to use the LOCKABLE flag. + +### LOCKABLE GstMiniObjects + +These GstMiniObjects have the LOCKABLE flag set. They use a separate counter +for controlling writability and access to the object. + +It consists of 2 components: + +#### exclusive counter + +Each object that wants to keep a reference to a GstMiniObject and doesn't want to +see the changes from other owners of the same GstMiniObject needs to lock the +GstMiniObject in EXCLUSIVE mode, which will increase the exclusive counter. + +The exclusive counter counts the number of objects that share this +GstMiniObject. The counter is initially 0, meaning that the object is not shared with +any object.
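The interaction between the exclusive counter and WRITE locking can be modelled with two counters. Again, a simplified hypothetical sketch rather than the real GstMiniObject locking code:

``` c
#include <assert.h>
#include <stdbool.h>

typedef enum { LOCK_READ = 1, LOCK_WRITE = 2, LOCK_EXCLUSIVE = 4 } LockFlags;

typedef struct {
  int exclusive;   /* number of owners sharing the object */
  int lock_count;  /* outstanding READ/WRITE locks */
  int locked_mode; /* mode of the first (widest) lock */
} LockableObject;

static bool
object_lock (LockableObject *o, LockFlags flags)
{
  if (flags & LOCK_EXCLUSIVE) {
    o->exclusive++;
    return true;
  }
  /* WRITE fails when the object is shared by more than one owner */
  if ((flags & LOCK_WRITE) && o->exclusive > 1)
    return false;
  if (o->lock_count > 0 && (flags & ~o->locked_mode))
    return false;              /* only same or narrower recursive locks */
  if (o->lock_count == 0)
    o->locked_mode = flags;
  o->lock_count++;
  return true;
}

static void
object_unlock (LockableObject *o, LockFlags flags)
{
  if (flags & LOCK_EXCLUSIVE) {
    o->exclusive--;
    return;
  }
  assert (o->lock_count > 0);
  if (--o->lock_count == 0)
    o->locked_mode = 0;
}
```

As in the text, an exclusive counter of 0 or 1 leaves the object writable; only when two or more owners hold it EXCLUSIVEly does a WRITE lock fail.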
+ +When a reference to a GstMiniObject is released, both the refcount and the +exclusive counter are decreased with `gst_mini_object_unref()` and +`gst_mini_object_unlock()` respectively. + +#### locking + +All read and write access must be performed between a `gst_mini_object_lock()` +and `gst_mini_object_unlock()` pair with the requested access mode. + +A `gst_mini_object_lock()` can fail when a `WRITE` lock is requested and the +exclusive counter is > 1. Indeed, a GstMiniObject with an exclusive +counter > 1 is locked EXCLUSIVELY by at least 2 objects and is therefore not +writable. + +Once the GstMiniObject is locked with a certain access mode, it can be +recursively locked with the same or narrower access mode. For example, first +locking the GstMiniObject in READWRITE mode allows you to recursively lock the +GstMiniObject in READWRITE, READ and WRITE mode. Memory locked in READ mode +cannot be locked recursively in WRITE or READWRITE mode. + +Note that multiple threads can READ lock the GstMiniObject concurrently but +cannot lock the object in WRITE mode, because the exclusive counter must then be > 1. + +All calls to `gst_mini_object_lock()` need to be paired with one +`gst_mini_object_unlock()` call with the same access mode. When the last +refcount of the object is removed, there should be no more outstanding locks. + +Note that an exclusive counter of both 0 and 1 leaves the GstMiniObject writable. +The reason is to make it easy to create and pass ownership of the GstMiniObject +to another object while keeping it writable. When the GstMiniObject is created +with an exclusive count of 0, it is writable. When the GstMiniObject is then added +to another object, the exclusive count is incremented to 1 and the GstMiniObject +remains writable. The 0 exclusive counter has a similar purpose as the floating +reference in GObject.
A callback will be called +when the object is freed for all registered weak references. + +## QData + +Extra data can be associated with a GstMiniObject by using the QData +API. diff --git a/markdown/design/missing-plugins.md b/markdown/design/missing-plugins.md new file mode 100644 index 0000000000..715698c56a --- /dev/null +++ b/markdown/design/missing-plugins.md @@ -0,0 +1,257 @@ +# What to do when a plugin is missing + +The mechanism and API described in this document requires GStreamer core +and gst-plugins-base versions \>= 0.10.12. Further information on some +aspects of this document can be found in the libgstbaseutils API +reference. + +We only discuss playback pipelines for now. + +A three step process: + +1\) GStreamer level + +Elements will use a "missing-plugin" element message to report +missing plugins, with the following fields set: + +* **`type`**: (string) { "urisource", "urisink", "decoder", "encoder", +"element" } (we do not distinguish between demuxer/decoders/parsers etc.) + +* **`detail`**: (string) or (caps) depending on the type { ANY } ex: "mms, +"mmsh", "audio/x-mp3,rate=48000,…" + +* **`name`**: (string) { ANY } ex: "MMS protocol handler",.. + +## missing uri handler + +ex. mms://foo.bar/file.asf + +When no protocol handler is installed for mms://, the application will not be +able to instantiate an element for that uri (gst_element_make_from_uri() +returns NULL). + +Playbin will post a "missing-plugin" element message with the type set to +"urisource", detail set to "mms". Optionally the friendly name can be filled +in as well. + +## missing typefind function + +We don't recognize the type of the file, this should normally not happen +because all the typefinders are in the basic GStreamer installation. +There is not much useful information we can give about how to resolve this +issue. It is possible to use the first N bytes of the data to determine the +type (and needed plugin) on the server. 
We don't explore this option in this +document yet, but the proposal is flexible enough to accommodate this in the +future should the need arise. + +## missing demuxer + +Typically, after running typefind on the data, we determine the type of the +file. If no plugin is found for the type, a "missing-plugin" element +message is posted by decodebin with the following fields: type set to +"decoder", detail set to the caps for which no plugin was found. Optionally +the friendly name can be filled in as well. + +## missing decoder + +The demuxer will dynamically create new pads with specific caps while it +figures out the contents of the container format. Decodebin tries to find the +decoders for these formats in the registry. If no decoder is found, a +"missing-plugin" element message is posted by decodebin with the following +fields: type set to "decoder", detail set to the caps for which no plugin +was found. Optionally the friendly name can be filled in as well. There is +no distinction made between a missing demuxer and a missing decoder at the +application level. + +## missing element + +Decodebin and playbin will create a set of helper elements when they set up +their decoding pipeline. These elements are typically colorspace, sample rate, +audio sinks,... Their presence on the system is required for the functionality +of decodebin. It is typically a package dependency error if they are not +present, but in case of a corrupted system the following "missing-plugin" +element message will be emitted: type set to "element", detail set to the +element factory name and the friendly name optionally set to a description +of the element's functionality in the decoding pipeline. + +Except for reporting the missing plugins, no further policy is enforced at the +GStreamer level. It is up to the application to decide whether a missing +plugin constitutes a problem or not.
+ +# Application level + +The application's job is to listen for the "missing-plugin" element messages +and to decide on a policy to handle them. The following cases exist: + +## partially missing plugins + +The application will be able to complete a state change to PAUSED, but there +will be a "missing-plugin" element message on the GstBus. + +This means that it will be possible to play back part of the media file but not +all of it. + +For example: suppose we have an .avi file with mp3 audio and divx video. If we +have the mp3 audio decoder but not the divx video decoder, it will be possible +to play only the audio part but not the video part. For an audio playback +application, this is not a problem, but a video player might want to decide on: + + - requiring the user to install the additionally required plugins. + - informing the user that only the audio will be played back. + - asking the user whether it should download the additional codec or only play + the audio part. + - … + +## completely unplayable stream + +The application will receive an ERROR message from GStreamer informing it that +playback stopped (before it could reach PAUSED). This happens because none of +the streams is connected to a decoder. The error code and domain should be one +of the following in this case: + + - `GST_CORE_ERROR_MISSING_PLUGIN` (domain: GST_CORE_ERROR) + - `GST_STREAM_ERROR_CODEC_NOT_FOUND` (domain: GST_STREAM_ERROR) + +The application can then see that there is a set of "missing-plugin" element +messages on the GstBus and can decide to trigger the download procedure. It +does that as described in the following section. + +"missing-plugin" element messages can be identified using the function +`gst_is_missing_plugin_message()`.
+ +# Plugin download stage + +At this point the application has + - collected one or more "missing-plugin" element messages + - made a decision that additional plugins should be installed + +It will call a GStreamer utility function to convert each "missing-plugin" +message into an identifier string describing the missing capability. This is +done using the function `gst_missing_plugin_message_get_installer_detail()`. + +The application will then pass these strings to `gst_install_plugins_async()` +or `gst_install_plugins_sync()` to initiate the download. See the API +documentation there (`libgstbaseutils`, part of `gst-plugins-base`) for more +details. + +When new plugins have been installed, the application will have to initiate +a re-scan of the GStreamer plugin registry using gst_update_registry(). + +# Format of the (UTF-8) string ID passed to the external installer system + +The string is made up of several fields, separated by '|' characters. +The fields are: + +- plugin system identifier, ie. "gstreamer" This identifier determines +the format of the rest of the detail string. Automatic plugin +installers should not process detail strings with unknown +identifiers. This allows other plugin-based libraries to use the +same mechanism for their automatic plugin installation needs, or for +the format to be changed should it turn out to be insufficient. + +- plugin system version, e.g. "1.0" This is required so that when +there is a GStreamer-2.0 or GStreamer-3.0 at some point in future, +the different major versions can still co-exist and use the same +plugin install mechanism in the same way. + +- application identifier, e.g. "totem" This may also be in the form of +"pid/12345" if the program name can’t be obtained for some reason. + +- human-readable localised description of the required component, e.g. +"Vorbis audio decoder" + +- identifier string for the required component, e.g. + +- urisource-(PROTOCOL_REQUIRED) e.g. 
`urisource-http` or `urisource-mms`

- element-(ELEMENT_REQUIRED), e.g. `element-videoconvert`

- decoder-(CAPS_REQUIRED), e.g. `decoder-audio/x-vorbis` or
`decoder-application/ogg` or `decoder-audio/mpeg, mpegversion=(int)4` or
`decoder-video/mpeg, systemstream=(boolean)true, mpegversion=(int)2`

- encoder-(CAPS_REQUIRED), e.g. `encoder-audio/x-vorbis`

- optional further fields not yet specified

* An entire ID string might then look like this:
`gstreamer|0.10|totem|Vorbis audio decoder|decoder-audio/x-vorbis`

* Plugin installers parsing this ID string should expect further fields also
separated by '|' symbols and either ignore them, warn the user, or error
out when encountering them.

* The human-readable description string is provided by the `libgstbaseutils`
library that can be found in gst-plugins-base versions >= 0.10.12 and can
also be used by demuxers to find out the codec names for taglists from given
caps in a unified and consistent way.

* Applications can create these detail strings using the function
`gst_missing_plugin_message_get_installer_detail()` on a given missing-plugin
message.

# Using missing-plugin messages for error reporting:

Missing-plugin messages are also useful for error reporting purposes, either in
the case where the application does not support libgimme-codec, or the external
installer is not available or not able to install the required plugins.

When creating error messages, applications may use the function
`gst_missing_plugin_message_get_description()` to obtain a possibly translated
description from each missing-plugin message (e.g. "Matroska demuxer" or
"Theora video depayloader"). This can be used to report to the user exactly
what it is that is missing.
# Notes for packagers

An easy way to introspect plugin .so files is:

```
$ gst-inspect --print-plugin-auto-install-info /path/to/libgstfoo.so
```

The output will be something like:

```
decoder-audio/x-vorbis
element-vorbisdec
element-vorbisenc
element-vorbisparse
element-vorbistag
encoder-audio/x-vorbis
```

But it could also be like this (from the faad element in this case):

```
decoder-audio/mpeg, mpegversion=(int){ 2, 4 }
```

Note that this does not exactly match the caps string that the installer
will get from the application. The application will only ever ask for
one of

```
decoder-audio/mpeg, mpegversion=(int)2
decoder-audio/mpeg, mpegversion=(int)4
```

When introspecting, keep in mind that there are GStreamer plugins
that in turn load external plugins. Examples of these are pitfdll,
ladspa, or the GStreamer libvisual plugin. Those plugins will only
announce elements for the currently installed external plugins at
the time of introspection! With the exception of pitfdll, this is
not really relevant to the playback case, but may become an issue in
the future when applications like buzztard, jokosher or pitivi start
requesting elements by name, for example ladspa effect elements.

This case could be handled if those wrapper plugins also provided a
`gst-install-xxx-plugins-helper`, where xxx={ladspa|visual|...}. Thus if the
distro specific `gst-install-plugins-helper` can't resolve a request for e.g.
`element-bml-sonicverb` it can forward the request to
`gst-install-bml-plugins-helper` (bml is the buzz machine loader).

# Further references:


diff --git a/markdown/design/negotiation.md b/markdown/design/negotiation.md
new file mode 100644
index 0000000000..d8bb8854a4
--- /dev/null
+++ b/markdown/design/negotiation.md
@@ -0,0 +1,333 @@

# Negotiation

Capabilities negotiation is the process of deciding on an adequate
format for dataflow within a GStreamer pipeline.
Ideally, negotiation (also known as "capsnego") transfers information
from those parts of the pipeline that have information to those parts
of the pipeline that are flexible, constrained by those parts of the
pipeline that are not flexible.

## Basic rules

These simple rules must be followed:

1) downstream suggests formats
2) upstream decides on format

There are 4 queries/events used in caps negotiation:

1) `GST_QUERY_CAPS`: get possible formats
2) `GST_QUERY_ACCEPT_CAPS`: check if format is possible
3) `GST_EVENT_CAPS`: configure format (downstream)
4) `GST_EVENT_RECONFIGURE`: inform upstream of possibly new caps

## Queries

A pad can ask the peer pad for its supported GstCaps. It does this with
the CAPS query. The list of supported caps can be used to choose an
appropriate GstCaps for the data transfer. The CAPS query works
recursively: elements should take their peers into consideration when
constructing the possible caps. Because the result caps can be very
large, the filter can be used to restrict the caps. Only the caps that
match the filter will be returned as the result caps. The order of the
filter caps gives the order of preference of the caller and should be
taken into account for the returned caps.

* **`filter`** (in) GST_TYPE_CAPS (default NULL): - a GstCaps to filter the results against
* **`caps`** (out) GST_TYPE_CAPS (default NULL): - the result caps

A pad can ask the peer pad if it supports a given caps. It does this
with the ACCEPT_CAPS query. The caps must be fixed. The ACCEPT_CAPS
query is not required to work recursively; it can simply return TRUE if
a subsequent CAPS event with those caps would return success.

* **`caps`** (in) GST_TYPE_CAPS: - a GstCaps to check, must be fixed
* **`result`** (out) G_TYPE_BOOLEAN (default FALSE): - TRUE if the caps are accepted

## Events

When a media format is negotiated, peer elements are notified of the
GstCaps with the CAPS event.
The caps must be fixed.

* **`caps`** GST_TYPE_CAPS: - the negotiated GstCaps, must be fixed

## Operation

GStreamer’s two scheduling modes, push mode and pull mode, lend
themselves to different mechanisms to achieve this goal. As it is more
common, we describe push-mode negotiation first.

## Push-mode negotiation

Push-mode negotiation happens when elements want to push buffers and
need to decide on the format. This is called downstream negotiation
because the upstream element decides the format for the downstream
element. This is the most common case.

Negotiation can also happen when a downstream element wants to receive
another data format from an upstream element. This is called upstream
negotiation.

The basics of negotiation are as follows:

- GstCaps (see [caps](design/caps.md)) are refcounted before they are pushed as
an event to describe the contents of the following buffer.

- An element should reconfigure itself to the new format received as a
CAPS event before processing the following buffers. If the data type
in the caps event is not acceptable, the element should refuse the
event. The element should also refuse the next buffers by returning
an appropriate `GST_FLOW_NOT_NEGOTIATED` return value from the
chain function.

- Downstream elements can request a format change of the stream by
sending a RECONFIGURE event upstream. Upstream elements will
renegotiate a new format when they receive a RECONFIGURE event.

The general flow for a source pad starting the negotiation:

```
            src              sink
             |                 |
             |  querycaps?     |
             |---------------->|
             |     caps        |
select caps  |< - - - - - - - -|
from the     |                 |
candidates   |                 |
             |                 |-.
             |  accepts?       | |
  type A     |---------------->| | optional
             |      yes        | |
             |< - - - - - - - -| |
             |                 |-'
             |  send_event()   |
send CAPS    |---------------->| Receive type A, reconfigure to
event A      |                 | process type A.
             |                 |
             |     push        |
push buffer  |---------------->| Process buffer of type A
             |                 |
```

One possible implementation in pseudo code:

```
[element wants to create a buffer]
if not format
  # see what we can do
  ourcaps = gst_pad_query_caps (srcpad)
  # see what the peer can do filtered against our caps
  candidates = gst_pad_peer_query_caps (srcpad, ourcaps)

  foreach candidate in candidates
    # make sure the caps is fixed
    fixedcaps = gst_pad_fixate_caps (srcpad, candidate)

    # see if the peer accepts it
    if gst_pad_peer_accept_caps (srcpad, fixedcaps)
      # store the caps as the negotiated caps, this will
      # call the setcaps function on the pad
      gst_pad_push_event (srcpad, gst_event_new_caps (fixedcaps))
      break
    endif
  done
endif

# negotiate allocator/bufferpool with the ALLOCATION query

buffer = gst_buffer_new_allocate (NULL, size, 0);
# fill buffer and push
```

The general flow for a sink pad starting a renegotiation:

```
            src              sink
             |                 |
             |  accepts?       |
             |<----------------| type B
             |      yes        |
             |- - - - - - - - >|-.
             |                 | | suggest B caps next
             |                 |<'
             |                 |
             |  push_event()   |
   mark    .-|<----------------| send RECONFIGURE event
renegotiate| |                 |
           '>|                 |
             |  querycaps()    |
renegotiate  |---------------->|
             |   suggest B     |
             |< - - - - - - - -|
             |                 |
             |  send_event()   |
send CAPS    |---------------->| Receive type B, reconfigure to
event B      |                 | process type B.
             |                 |
             |     push        |
push buffer  |---------------->| Process buffer of type B
             |                 |
```

## Use case:

### `videotestsrc ! xvimagesink`

* Who decides what format to use?
  - src pad always decides, by convention. The sinkpad can suggest a format
    by putting it high in the caps query result GstCaps.
  - since the src decides, it can always choose something that it can do,
    so this step can only fail if the sinkpad stated it could accept
    something while later on it couldn't.

* When does negotiation happen?
  - before srcpad does a push, it figures out a type as stated in 1), then
    it pushes a caps event with the type. The sink checks the media type and
    configures itself for this type.
  - the source then usually does an ALLOCATION query to negotiate a bufferpool
    with the sink. It then allocates a buffer from the pool and pushes it to
    the sink. Since the sink accepted the caps, it can create a pool for the
    format.
  - since the sink stated in 1) it could accept the type, it will be able to
    handle it.

* How can the sink request another format?
  - the sink asks if a new format is possible for the source.
  - the sink pushes a RECONFIGURE event upstream
  - the src receives the RECONFIGURE event and marks renegotiation
  - on the next buffer push, the source renegotiates the caps and the
    bufferpool. The sink will put the new preferred format high in the list
    of caps it returns from its caps query.

### `videotestsrc ! queue ! xvimagesink`

- queue proxies all accept and caps queries to the other peer pad.
- queue proxies the bufferpool
- queue proxies the RECONFIGURE event
- queue stores the CAPS event in the queue. This means that the queue can
contain buffers with different types.

## Pull-mode negotiation

### Rationale

A pipeline in pull mode has different negotiation needs than one
activated in push mode. Push mode is optimized for two use cases:

- Playback of media files, in which the demuxers and the decoders are
the points from which format information should disseminate to the
rest of the pipeline; and

- Recording from live sources, in which users are accustomed to
putting a capsfilter directly after the source element; thus the
caps information flow proceeds from the user, through the potential
caps of the source, to the sinks of the pipeline.
In contrast, pull mode has other typical use cases:

- Playback from a lossy source, such as RTP, in which more knowledge
about the latency of the pipeline can increase quality; or

- Audio synthesis, in which audio APIs are tuned to produce only the
necessary number of samples, typically driven by a hardware
interrupt to fill a DMA buffer or a Jack[0] port buffer.

- Low-latency effects processing, whereby filters should be applied as
data is transferred from a ring buffer to a sink instead of
beforehand. For example, instead of using the internal alsasink
ringbuffer thread of the push-mode pipeline `wavsrc ! volume ! alsasink`,
the volume element is placed inside the sound card writer thread via
`wavsrc ! audioringbuffer ! volume ! alsasink`.

[0]

The problem with pull mode is that the sink has to know the format in
order to know how many bytes to pull via `gst_pad_pull_range()`. This
means that before pulling, the sink must initiate negotiation to decide
on a format.

Recalling the principles of capsnego, whereby information must flow from
those that have it to those that do not, we see that the three named use
cases have different negotiation requirements:

- RTP and low-latency playback are both like the normal playback case,
in which information flows downstream.

- In audio synthesis, the part of the pipeline that has the most
information is the sink, constrained by the capabilities of the
graph that feeds it. However the caps are not completely specified;
at some point the user has to intervene to choose the sample rate,
at least. This can be done externally to GStreamer, as in the jack
elements, or internally via a capsfilter, as is customary with live
sources.

Given that sinks potentially need the input of sources, as in the RTP
case and at least as a filter in the synthesis case, there must be a
negotiation phase before the pull thread is activated.
Also, given the
low latency offered by pull mode, we want to avoid capsnego from within
the pulling thread, in case it causes us to miss our scheduling
deadlines.

The pull thread is usually started in the PAUSED→PLAYING state change.
We must be able to complete the negotiation before this state change
happens.

The time to do capsnego, then, is after the SCHEDULING query has
succeeded, but before the sink has spawned the pulling thread.

### Mechanism

The sink determines that the upstream elements support pull-based
scheduling by doing a SCHEDULING query.

The sink initiates the negotiation process by intersecting the results
of `gst_pad_query_caps()` on its sink pad and its peer src pad. This is
the operation performed by `gst_pad_get_allowed_caps()`. In the simple
passthrough case, the peer pad’s caps query should return the
intersection of calling `get_allowed_caps()` on all of its sink pads. In
this way the sink element knows the capabilities of the entire pipeline.

The sink element then fixates the resulting caps, if necessary,
resulting in the flow caps. From now on, the caps query of the sinkpad
will only return these fixed caps, meaning that upstream elements will
only be able to produce this format.

If the sink element could not set caps on its sink pad, it should post
an error message on the bus indicating that negotiation was not
possible.

When negotiation has succeeded, the sinkpad and all upstream internally
linked pads are activated in pull mode. Typically, this operation will
trigger negotiation on the downstream elements, which will now be forced
to negotiate to the final fixed desired caps of the sinkpad.

After these steps, the sink element returns ASYNC from the state change
function. The state will commit to PAUSED when the first buffer is
received in the sink.
This is needed to provide a consistent API to the
applications that expect ASYNC return values from sinks, but it also
allows us to perform the remainder of the negotiation outside of the
context of the pulling thread.

## Patterns

We can identify 3 patterns in negotiation:

* Fixed: can't choose the output format
  - Caps encoded in the stream
  - A video/audio decoder
  - usually uses `gst_pad_use_fixed_caps()`

* Transform
  - Caps not modified (passthrough)
  - can do caps transform based on element property
  - fixed caps get transformed into fixed caps
  - videobox

* Dynamic: can choose the output format
  - A converter element
  - depends on downstream caps, needs to do a CAPS query to find
    the transform.
  - usually prefers to use the identity transform
  - fixed caps can be transformed into unfixed caps.

diff --git a/markdown/design/overview.md b/markdown/design/overview.md
new file mode 100644
index 0000000000..fd75e90240
--- /dev/null
+++ b/markdown/design/overview.md
@@ -0,0 +1,568 @@

# Overview

This part gives an overview of the design of GStreamer with references
to the more detailed explanations of the different topics.

This document is intended for people who want to have a global overview
of the inner workings of GStreamer.

## Introduction

GStreamer is a set of libraries and plugins that can be used to
implement various multimedia applications such as desktop players,
audio/video recorders, multimedia servers, transcoders, etc.

Applications are built by constructing a pipeline composed of elements.
An element is an object that performs some action on a multimedia stream
such as:

- read a file
- decode or encode between formats
- capture from a hardware device
- render to a hardware device
- mix or multiplex multiple streams

Elements have input and output pads called sink and source pads in
GStreamer. An application links elements together on pads to construct a
pipeline.
Below is an example of an ogg/vorbis playback pipeline.

```
+-----------------------------------------------------------+
| ----------> downstream ------------------->               |
|                                                           |
| pipeline                                                  |
| +---------+   +----------+   +-----------+   +----------+ |
| | filesrc |   | oggdemux |   | vorbisdec |   | alsasink | |
| |        src-sink        src-sink        src-sink       | |
| +---------+   +----------+   +-----------+   +----------+ |
|                                                           |
| <---------< upstream <-------------------<                |
+-----------------------------------------------------------+
```

The filesrc element reads data from a file on disk. The oggdemux element
parses the data and sends the compressed audio data to the vorbisdec
element. The vorbisdec element decodes the compressed data and sends it
to the alsasink element. The alsasink element sends the samples to the
audio card for playback.

Downstream and upstream are the terms used to describe the direction in
the pipeline. From source to sink is called "downstream" and "upstream"
is from sink to source. Dataflow always happens downstream.

The task of the application is to construct a pipeline as above using
existing elements. This is further explained in the pipeline building
topic.

The application does not have to manage any of the complexities of the
actual dataflow/decoding/conversions/synchronisation etc. but only calls
high level functions on the pipeline object such as PLAY/PAUSE/STOP.

The application also receives messages and notifications from the
pipeline such as metadata, warning, error and EOS messages.

If the application needs more control over the graph it is possible to
directly access the elements and pads in the pipeline.
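The construction of the playback pipeline above can be sketched in the pseudocode style used in these documents (the functions named here are described in the pipeline construction section of this overview; the dynamic link is needed because oggdemux only creates its source pads once it has parsed the stream):

```
pipeline = gst_pipeline_new ()
add filesrc, oggdemux, vorbisdec, alsasink to the pipeline

gst_pad_link (filesrc:src, oggdemux:sink)
gst_pad_link (vorbisdec:src, alsasink:sink)

# oggdemux creates its source pad only when it has seen the stream,
# so the remaining link is made from a callback attached with
# g_signal_connect()
on pad-added (oggdemux, newpad):
  gst_pad_link (newpad, vorbisdec:sink)

set the pipeline to PLAYING
```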
## Design overview

GStreamer design goals include:

- Process large amounts of data quickly
- Allow fully multithreaded processing
- Ability to deal with multiple formats
- Synchronize different dataflows
- Ability to deal with multiple devices

The capabilities presented to the application depend on the number of
elements installed on the system and their functionality.

The GStreamer core is designed to be media agnostic but provides many
features to elements to describe media formats.

## Elements

The smallest building blocks in a pipeline are elements. An element
provides a number of pads which can be source or sinkpads. Sourcepads
provide data and sinkpads consume data. Below is an example of an ogg
demuxer element that has one pad that takes (sinks) data and two source
pads that produce data.

```
      +-----------+
      | oggdemux  |
 sink |           | src0
      |           | src1
      +-----------+
```

An element can be in four different states: NULL, READY, PAUSED,
PLAYING. In the NULL and READY state, the element is not processing any
data. In the PLAYING state it is processing data. The intermediate
PAUSED state is used to preroll data in the pipeline. A state change can
be performed with `gst_element_set_state()`.

An element always goes through all the intermediate state changes. This
means that when an element is in the READY state and is put to PLAYING,
it will first go through the intermediate PAUSED state.

An element state change to PAUSED will activate the pads of the element.
First the source pads are activated, then the sinkpads. When the pads
are activated, the pad activate function is called. Some pads will start
a thread (GstTask) or some other mechanism to start producing or
consuming data.

The PAUSED state is special as it is used to preroll data in the
pipeline. The purpose is to fill all connected elements in the pipeline
with data so that the subsequent PLAYING state change happens very
quickly.
Some elements will therefore not complete the state change to
PAUSED before they have received enough data. Sink elements are required
to only complete the state change to PAUSED after receiving the first
data.

Normally the state changes of elements are coordinated by the pipeline
as explained in [states](design/states.md).

Different categories of elements exist:

- *source elements*: these are elements that do not consume data but
only provide data for the pipeline.

- *sink elements*: these are elements that do not produce data but
render data to an output device.

- *transform elements*: these elements transform an input stream in a
certain format into a stream of another format.
Encoder/decoder/converters are examples.

- *demuxer elements*: these elements parse a stream and produce several
output streams.

- *mixer/muxer elements*: combine several input streams into one output
stream.

Other categories of elements can be constructed (see [klass](design/draft-klass.md)).

## Bins

A bin is an element subclass and acts as a container for other elements
so that multiple elements can be combined into one element.

A bin coordinates its children’s state changes as explained later. It
also distributes events and various other functionality to elements.

A bin can have its own source and sinkpads by ghostpadding one or more
of its children’s pads to itself.

Below is a picture of a bin with two elements. The sinkpad of one
element is ghostpadded to the bin.

```
     +---------------------------+
     | bin                       |
     |   +--------+   +-------+  |
     |   |        |   |       |  |
     |  /sink    src-sink     |  |
  sink   +--------+   +-------+  |
     +---------------------------+
```

## Pipeline

A pipeline is a special bin subclass that provides the following
features to its children:

- Select and manage a global clock for all its children.
- Manage running_time based on the selected clock.
`running_time` is
the elapsed time the pipeline spent in the PLAYING state and is used
for synchronisation.
- Manage latency in the pipeline.
- Provide means for elements to communicate with the application via the
GstBus.
- Manage the global state of the elements such as errors and
end-of-stream.

Normally the application creates one pipeline that will manage all the
elements in the application.

## Dataflow and buffers

GStreamer supports two possible types of dataflow, the push and pull
model. In the push model, an upstream element sends data to a downstream
element by calling a method on a sinkpad. In the pull model, a
downstream element requests data from an upstream element by calling a
method on a source pad.

The most common dataflow is the push model. The pull model can be used
in specific circumstances by demuxer elements. The pull model can also
be used by low latency audio applications.

The data passed between pads is encapsulated in Buffers. The buffer
contains pointers to the actual memory and also metadata describing the
memory. This metadata includes:

- timestamp of the data, this is the time instant at which the data
was captured or the time at which the data should be played back.

- offset of the data: a media specific offset, this could be samples
for audio or frames for video.

- the duration of the data in time.

- additional flags describing special properties of the data such as
discontinuities or delta units.

- additional arbitrary metadata

When an element wishes to send a buffer to another element, it does this
using one of the pads that is linked to a pad of the other element. In
the push model, a buffer is pushed to the peer pad with
`gst_pad_push()`. In the pull model, a buffer is pulled from the peer
with the `gst_pad_pull_range()` function.

Before an element pushes out a buffer, it should make sure that the peer
element can understand the buffer contents.
It does this by querying the
peer element for the supported formats and by selecting a suitable
common format. The selected format is then first sent to the peer
element with a CAPS event before pushing the buffer (see
[negotiation](design/negotiation.md)).

When an element pad receives a CAPS event, it has to check if it
understands the media type. The element must refuse the following
buffers if it did not accept the preceding media type.

Both `gst_pad_push()` and `gst_pad_pull_range()` have a return value
indicating whether the operation succeeded. An error code means that no
more data should be sent to that pad. A source element that initiates
the data flow in a thread typically pauses the producing thread when
this happens.

A buffer can be created with `gst_buffer_new()` or by requesting a
usable buffer from a buffer pool using
`gst_buffer_pool_acquire_buffer()`. Using the second method, it is
possible for the peer element to implement a custom buffer allocation
algorithm.

The process of selecting a media type is called caps negotiation.

## Caps

A media type (Caps) is described using a generic list of key/value
pairs. The key is a string and the value can be a single/list/range of
int/float/string.

Caps that have no ranges/lists or other variable parts are said to be
fixed and can be put on a buffer.

Caps with variables in them are used to describe possible media types
that can be handled by a pad.

## Dataflow and events

Parallel to the dataflow is a flow of events. Unlike the buffers, events
can pass both upstream and downstream. Some events travel only upstream,
others only downstream.

The events are used to denote special conditions in the dataflow such as
EOS or to inform plugins of special events such as flushing or seeking.

Some events must be serialized with the buffer flow, others need not be.
Serialized events are inserted between the buffers.
Non-serialized
events jump in front of any buffers currently being processed.

An example of a serialized event is a TAG event that is inserted between
buffers to mark metadata for those buffers.

An example of a non-serialized event is the FLUSH event.

## Pipeline construction

The application starts by creating a Pipeline element using
`gst_pipeline_new ()`. Elements are added to and removed from the
pipeline with `gst_bin_add()` and `gst_bin_remove()`.

After adding the elements, the pads of an element can be retrieved with
`gst_element_get_pad()`. Pads can then be linked together with
`gst_pad_link()`.

Some elements create new pads when actual dataflow is happening in the
pipeline. With `g_signal_connect()` one can receive a notification when
an element has created a pad. These new pads can then be linked to other
unlinked pads.

Some elements cannot be linked together because they operate on
different incompatible data types. The possible datatypes a pad can
provide or consume can be retrieved with `gst_pad_get_caps()`.

Below is a simple mp3 playback pipeline that we constructed. We will use
this pipeline in further examples.

    +-------------------------------------------+
    | pipeline                                  |
    | +---------+   +----------+   +----------+ |
    | | filesrc |   |  mp3dec  |   | alsasink | |
    | |        src-sink       src-sink        | |
    | +---------+   +----------+   +----------+ |
    +-------------------------------------------+

## Pipeline clock

One of the important functions of the pipeline is to select a global
clock for all the elements in the pipeline.

The purpose of the clock is to provide a strictly increasing value at
the rate of one `GST_SECOND` per second. Clock values are expressed in
nanoseconds. Elements use the clock time to synchronize the playback of
data.

Before the pipeline is set to PLAYING, the pipeline asks each element if
it can provide a clock.
The clock is selected in the following order:

- If the application selected a clock, use that one.

- If a source element provides a clock, use that clock.

- Select a clock from any other element that provides a clock, starting
with the sinks.

- If no element provides a clock, a default system clock is used for
the pipeline.

In a typical playback pipeline this algorithm will select the clock
provided by a sink element such as an audio sink.

In capture pipelines, this will typically select the clock of the data
producer, which in most cases cannot control the rate at which it
produces data.

## Pipeline states

When all the pads are linked and signals have been connected, the
pipeline can be put in the PAUSED state to start dataflow.

When a bin (and hence a pipeline) performs a state change, it will
change the state of all its children. The pipeline will change the state
of its children from the sink elements to the source elements; this
makes sure that no upstream element produces data for an element that is
not yet ready to accept it.

In the mp3 playback pipeline, the state of the elements is changed in
the order alsasink, mp3dec, filesrc.

All intermediate states are traversed for each element resulting in the
following chain of state changes:

* alsasink to READY: the audio device is probed

* mp3dec to READY: nothing happens.

* filesrc to READY: the file is probed

* alsasink to PAUSED: the audio device is opened. alsasink is a sink and
returns ASYNC because it has not received any data yet.

* mp3dec to PAUSED: the decoding library is initialized

* filesrc to PAUSED: the file is opened and a thread is started to push data to mp3dec

At this point data flows from filesrc to mp3dec and alsasink. Since
mp3dec is PAUSED, it accepts the data from filesrc on the sinkpad and
starts decoding the compressed data to raw audio samples.
The mp3 decoder figures out the samplerate, the number of channels and
other audio properties of the raw audio samples and sends out a caps
event with the media type.

Alsasink then receives the caps event, inspects the caps and
reconfigures itself to process the media type.

mp3dec then puts the decoded samples into a Buffer and pushes this
buffer to the next element.

Alsasink receives the buffer with samples. Since it received the first
buffer of samples, it completes the state change to the PAUSED state. At
this point the pipeline is prerolled and all elements have samples.
Alsasink is now also capable of providing a clock to the pipeline.

Since alsasink is now in the PAUSED state, it blocks while receiving the
first buffer. This effectively blocks both mp3dec and filesrc in their
`gst_pad_push()`.

Since all elements now return SUCCESS from the
`gst_element_get_state()` function, the pipeline can be put in the
PLAYING state.

Before going to PLAYING, the pipeline selects a clock and samples the
current time of the clock. This is the `base_time`. It then distributes
this time to all elements. Elements can then synchronize against the
clock using the buffer `running_time` + `base_time` (see also
[synchronisation](design/synchronisation.md)).

The following chain of state changes then takes place:

* alsasink to PLAYING: the samples are played to the audio device

* mp3dec to PLAYING: nothing happens

* filesrc to PLAYING: nothing happens

## Pipeline status

The pipeline informs the application of any special events that occur in
the pipeline with the bus. The bus is an object that the pipeline
provides and that can be retrieved with `gst_pipeline_get_bus()`.

The bus can be polled or added to the glib mainloop.

The bus is distributed to all elements added to the pipeline. The
elements use the bus to post messages on. Various message types exist
such as ERRORS, WARNINGS, EOS, `STATE_CHANGED`, etc.
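A minimal message loop over the bus can be sketched in pseudocode (whether the bus is polled or attached to the glib mainloop, the handling is the same; the reactions shown are typical application policy, not fixed API behaviour):

```
bus = gst_pipeline_get_bus (pipeline)
loop
  msg = wait for the next message on the bus
  switch on message type
    ERROR:         report the error and stop the pipeline
    EOS:           all sinks posted EOS, playback has finished
    STATE_CHANGED: update the application state if needed
    ...
  endswitch
endloop
```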
+
+The pipeline handles EOS messages received from elements in a special
+way. It will only forward the message to the application when all sink
+elements have posted an EOS message.
+
+Other methods for obtaining the pipeline status include the Query
+functionality that can be performed with `gst_element_query()` on the
+pipeline. This type of query is useful for obtaining information about
+the current position and total time of the pipeline. It can also be used
+to query for the supported seeking formats and ranges.
+
+## Pipeline EOS
+
+When the source element encounters the end of the stream, it sends an EOS
+event to the peer element. This event will then travel downstream to all
+of the connected elements to inform them of the EOS. The element is not
+supposed to accept any more data after receiving an EOS event on a
+sinkpad.
+
+The element providing the streaming thread stops sending data after
+sending the EOS event.
+
+The EOS event will eventually arrive in the sink element. The sink will
+then post an EOS message on the bus to inform the pipeline that a
+particular stream has finished. When all sinks have reported EOS, the
+pipeline forwards the EOS message to the application. The EOS message is
+only forwarded to the application in the PLAYING state.
+
+When in EOS, the pipeline remains in the PLAYING state; it is the
+application's responsibility to PAUSE or READY the pipeline. The
+application can also issue a seek, for example.
+
+## Pipeline READY
+
+When a running pipeline is set from the PLAYING to READY state, the
+following actions occur in the pipeline:
+
+* alsasink to PAUSED: alsasink blocks and completes the state change on the
+next sample. If the element was EOS, it does not wait for a sample to complete
+the state change.
+* mp3dec to PAUSED: nothing
+* filesrc to PAUSED: nothing
+
+Going to the intermediate PAUSED state will block all elements in the
+`_push()` functions.
This happens because the sink element blocks on the
+first buffer it receives.
+
+Some elements might be performing blocking operations in the PLAYING
+state that must be unblocked when they go into the PAUSED state. This
+makes sure that the state change happens very fast.
+
+In the next PAUSED to READY state change the pipeline has to shut down
+and all streaming threads must stop sending data. This happens in the
+following sequence:
+
+* alsasink to READY: alsasink unblocks from the `_chain()` function and returns
+a FLUSHING return value to the peer element. The sinkpad is deactivated and
+becomes unusable for sending more data.
+* mp3dec to READY: the pads are deactivated and the state change completes
+when mp3dec leaves its `_chain()` function.
+* filesrc to READY: the pads are deactivated and the thread is paused.
+
+The upstream elements finish their `_chain()` function because the
+downstream element returned an error code (FLUSHING) from the `_push()`
+functions. These error codes are eventually returned to the element that
+started the streaming thread (filesrc), which pauses the thread and
+completes the state change.
+
+This sequence of events ensures that all elements are unblocked and all
+streaming threads are stopped.
+
+## Pipeline seeking
+
+Seeking in the pipeline requires a very specific order of operations to
+make sure that the elements remain synchronized and that the seek is
+performed with a minimal amount of latency.
+
+An application issues a seek event on the pipeline using
+`gst_element_send_event()` on the pipeline element. The event can be a
+seek event in any of the formats supported by the elements.
+
+The pipeline first pauses itself to speed up the seek operations.
+
+The pipeline then issues the seek event to all sink elements. The sink
+then forwards the seek event upstream until some element can perform the
+seek operation, which is typically the source or demuxer element.
All
+intermediate elements can transform the requested seek offset to another
+format; this way a decoder element can, for example, transform a seek to
+a frame number into a seek to a timestamp.
+
+When the seek event reaches an element that will perform the seek
+operation, that element performs the following steps.
+
+1) send a FLUSH_START event to all downstream and upstream peer elements.
+2) make sure the streaming thread is not running. The streaming thread will
+   always stop because of step 1).
+3) perform the seek operation
+4) send a FLUSH done event to all downstream and upstream peer elements.
+5) send SEGMENT event to inform all elements of the new position and to complete
+   the seek.
+
+In step 1) all downstream elements have to return from any blocking
+operations and have to refuse any further buffers or events other than
+a FLUSH done.
+
+The first step ensures that the streaming thread eventually unblocks and
+that step 2) can be performed. At this point, dataflow is completely
+stopped in the pipeline.
+
+In step 3) the element performs the seek to the requested position.
+
+In step 4) all peer elements are allowed to accept data again and
+streaming can continue from the new position. A FLUSH done event is sent
+to all the peer elements so that they accept new data again and restart
+their streaming threads.
+
+Step 5) informs all elements of the new position in the stream. After
+that, the event function returns to the application and the
+streaming threads start to produce new data.
+
+Since the pipeline is still PAUSED, this will preroll the next media
+sample in the sinks. The application can wait for this preroll to
+complete by performing a `_get_state()` on the pipeline.
+
+The last step in the seek operation is then to adjust the stream
+running_time of the pipeline to 0 and to set the pipeline back to
+PLAYING.
+
+The sequence of events in our mp3 playback example:
+
+```
+                        | a) seek on pipeline
+                        | b) PAUSE pipeline
++----------------------------------V--------+
+| pipeline                         | c) seek on sink
+| +---------+   +----------+   +--V-------+ |
+| | filesrc |   |  mp3dec  |   | alsasink | |
+| |        src-sink       src-sink        | |
+| +---------+   +----------+   +----|-----+ |
++-----------------------------------|-------+
+                 <------------------+
+                 d) seek travels upstream
+
+     --------------------------> 1) FLUSH event
+    |                            2) stop streaming
+    |                            3) perform seek
+     --------------------------> 4) FLUSH done event
+     --------------------------> 5) SEGMENT event
+
+                        | e) update running_time to 0
+                        | f) PLAY pipeline
+```
diff --git a/markdown/design/preroll.md b/markdown/design/preroll.md
new file mode 100644
index 0000000000..f728e1b98a
--- /dev/null
+++ b/markdown/design/preroll.md
@@ -0,0 +1,55 @@
+# Preroll
+
+A sink element can only complete the state change to `PAUSED` after a
+buffer has been queued on the input pad or pads. This process is called
+prerolling and is needed to fill the pipeline with buffers so that the
+transition to `PLAYING` goes as fast as possible, with no visual delay for
+the user.
+
+Preroll is also crucial in maintaining correct audio and video
+synchronisation and ensuring that no buffers are dropped in the sinks.
+
+After receiving a buffer (or EOS) on a pad, the chain/event function
+should wait to render the buffers or, in the EOS case, wait to post the
+EOS message. While waiting, the sink will wait for the preroll cond to
+be signalled.
+
+Several things can happen that require the preroll cond to be signalled.
+These include state changes and flush events. The prerolling is
+implemented in sinks (see [element-sink](design/element-sink.md)).
+
+## Committing the state
+
+When going to `PAUSED` and `PLAYING`, a buffer should be queued in the pad.
+We also make this requirement for going to `PLAYING` since a flush event
+in the `PAUSED` state could unqueue the buffer again.
+
+The state is committed in the following conditions:
+
+- a buffer is received on a sinkpad.
+- a GAP event is received on a sinkpad.
+- an EOS event is received on a sinkpad.
+
+We require the state change to be committed on EOS as well since an EOS
+means by definition that no buffer is going to arrive anymore.
+
+After the state is committed, a blocking wait should be performed for the
+next event. Some sinks might render the preroll buffer before starting
+this blocking wait.
+
+## Unlocking the preroll
+
+The following conditions unlock the preroll:
+
+- a state change
+- a flush event
+
+When the preroll is unlocked by a flush event, a return value of
+`GST_FLOW_FLUSHING` is to be returned to the peer pad.
+
+When preroll is unlocked by a state change to `PLAYING`, playback and
+rendering of the buffers shall start.
+
+When preroll is unlocked by a state change to `READY`, the buffer is to be
+discarded and `GST_FLOW_FLUSHING` shall be returned to the peer
+element.
diff --git a/markdown/design/probes.md b/markdown/design/probes.md
new file mode 100644
index 0000000000..aa2626695d
--- /dev/null
+++ b/markdown/design/probes.md
@@ -0,0 +1,363 @@
+# Probes
+
+Probes are callbacks that can be installed by the application and will notify
+the application about the state of the dataflow.
+
+# Requirements
+
+Applications should be able to monitor and control the dataflow on pads.
+We identify the following types:
+
+ - be notified when the pad is/becomes idle and make sure the pad stays
+   idle. This is essential to be able to implement dynamic relinking of
+   elements without breaking the dataflow.
+
+ - be notified when data, events or queries are pushed or sent on a
+   pad. It should also be possible to inspect and modify the data.
+
+ - be able to drop, pass and block on data based on the result of the
+   callback.
+
+ - be able to drop, pass data on blocking pads based on methods
+   performed by the application thread.
+ +# Overview + +The function gst_pad_add_probe() is used to add a probe to a pad. It accepts a +probe type mask and a callback. + +``` c + gulong gst_pad_add_probe (GstPad *pad, + GstPadProbeType mask, + GstPadProbeCallback callback, + gpointer user_data, + GDestroyNotify destroy_data); +``` + +The function returns a gulong that uniquely identifies the probe and that can +be used to remove the probe with gst_pad_remove_probe(): + +``` c + void gst_pad_remove_probe (GstPad *pad, gulong id); +``` + +The mask parameter is a bitwise or of the following flags: + +``` c +typedef enum +{ + GST_PAD_PROBE_TYPE_INVALID = 0, + + /* flags to control blocking */ + GST_PAD_PROBE_TYPE_IDLE = (1 << 0), + GST_PAD_PROBE_TYPE_BLOCK = (1 << 1), + + /* flags to select datatypes */ + GST_PAD_PROBE_TYPE_BUFFER = (1 << 4), + GST_PAD_PROBE_TYPE_BUFFER_LIST = (1 << 5), + GST_PAD_PROBE_TYPE_EVENT_DOWNSTREAM = (1 << 6), + GST_PAD_PROBE_TYPE_EVENT_UPSTREAM = (1 << 7), + GST_PAD_PROBE_TYPE_EVENT_FLUSH = (1 << 8), + GST_PAD_PROBE_TYPE_QUERY_DOWNSTREAM = (1 << 9), + GST_PAD_PROBE_TYPE_QUERY_UPSTREAM = (1 << 10), + + /* flags to select scheduling mode */ + GST_PAD_PROBE_TYPE_PUSH = (1 << 12), + GST_PAD_PROBE_TYPE_PULL = (1 << 13), +} GstPadProbeType; +``` + +When adding a probe with the IDLE or BLOCK flag, the probe will become a +blocking probe (see below). Otherwise the probe will be a DATA probe. + +The datatype and scheduling selector flags are used to select what kind of +datatypes and scheduling modes should be allowed in the callback. + +The blocking flags must match the triggered probe exactly. + +The probe callback is defined as: + +``` c + GstPadProbeReturn (*GstPadProbeCallback) (GstPad *pad, GstPadProbeInfo *info, + gpointer user_data); +``` + +A probe info structure is passed as an argument and its type is guaranteed +to match the mask that was used to register the callback. 
The data item in the
+info contains type specific data, which is usually the data item that is
+blocked, or NULL when no data item is present.
+
+The probe can return any of the following return values:
+
+``` c
+typedef enum
+{
+  GST_PAD_PROBE_DROP,
+  GST_PAD_PROBE_OK,
+  GST_PAD_PROBE_REMOVE,
+  GST_PAD_PROBE_PASS,
+} GstPadProbeReturn;
+```
+
+`GST_PAD_PROBE_OK` is the normal return value. `GST_PAD_PROBE_DROP` will drop
+the item that is currently being probed. `GST_PAD_PROBE_REMOVE` removes the
+currently executing probe from the list of probes.
+
+`GST_PAD_PROBE_PASS` is relevant for blocking probes and will temporarily unblock the
+pad and let the item through; it will then block again on the next item.
+
+# Blocking probes
+
+Blocking probes are probes with BLOCK or IDLE flags set. They will always
+block the dataflow and trigger the callback according to the following rules:
+
+When the IDLE flag is set, the probe callback is called as soon as no data is
+flowing over the pad. If at the time of probe registration the pad is idle,
+the callback will be called immediately from the current thread. Otherwise,
+the callback will be called as soon as the pad becomes idle in the streaming
+thread.
+
+The IDLE probe is useful to perform dynamic linking: it makes it possible to
+wait for a safe moment when an unlink/link operation can be done. Since the
+probe is a blocking probe, it will also make sure that the pad stays idle
+until the probe is removed.
+
+When the BLOCK flag is set, the probe callback will be called when new data
+arrives on the pad and right before the pad goes into the blocking state. This
+callback is thus only called when there is new data on the pad.
+
+The blocking probe is removed with gst_pad_remove_probe() or when the probe
+callback returns GST_PAD_PROBE_REMOVE. In both cases, and if this was the last
+blocking probe on the pad, the pad is unblocked and dataflow can continue.
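The return-value semantics above can be modelled in a few lines of plain C. This is a toy dispatcher, not the GstPad implementation: it ignores blocking, threading and probe type masks, and in this non-blocking model `PROBE_PASS` simply behaves like `PROBE_OK`. All names are stand-ins for illustration.

``` c
#include <stddef.h>

/* Mirrors GstPadProbeReturn in a simplified form. */
typedef enum { PROBE_DROP, PROBE_OK, PROBE_REMOVE, PROBE_PASS } ProbeReturn;

typedef ProbeReturn (*ProbeCallback) (int item);

#define MAX_PROBES 8

typedef struct {
  ProbeCallback probes[MAX_PROBES];
  size_t n_probes;
} Pad;

static void pad_add_probe (Pad *pad, ProbeCallback cb)
{
  if (pad->n_probes < MAX_PROBES)
    pad->probes[pad->n_probes++] = cb;
}

/* Runs all probes on an item; returns 1 when the item may continue
 * downstream, 0 when a probe dropped it. REMOVE unregisters the probe
 * that returned it and processing continues with the next probe. */
static int pad_push_item (Pad *pad, int item)
{
  size_t i = 0;
  while (i < pad->n_probes) {
    ProbeReturn ret = pad->probes[i] (item);
    if (ret == PROBE_REMOVE) {
      size_t j;
      for (j = i; j + 1 < pad->n_probes; j++)
        pad->probes[j] = pad->probes[j + 1];
      pad->n_probes--;
      continue;                 /* same index now holds the next probe */
    }
    if (ret == PROBE_DROP)
      return 0;                 /* item is discarded */
    i++;                        /* OK and PASS let the item continue */
  }
  return 1;
}

/* Example callbacks: drop odd items; a one-shot probe that removes itself. */
static ProbeReturn drop_odd (int item)
{
  return (item % 2) ? PROBE_DROP : PROBE_OK;
}

static ProbeReturn one_shot (int item)
{
  (void) item;
  return PROBE_REMOVE;
}
```

The real dispatcher also has to re-check the probe list under the pad lock, since callbacks may add or remove probes concurrently.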
+
+# Non-Blocking probes
+
+Non-blocking probes or DATA probes are probes triggered when data is flowing
+over the pad. They are called after the blocking probes are run and always with
+data.
+
+# Push dataflow
+
+Push probes have the GST\_PAD\_PROBE\_TYPE\_PUSH flag set in the
+callbacks.
+
+In push based scheduling, the blocking probe is called first with the
+data item. Then the data probes are called before the peer pad chain or
+event function is called.
+
+The data probes are called before the peer pad is checked. This allows
+for linking the pad in either the BLOCK or DATA probes on the pad.
+
+Before the peerpad chain or event function is called, the peer pad block
+and data probes are called.
+
+Finally, the IDLE probe is called on the pad after the data was sent to
+the peer pad.
+
+The push dataflow probe behavior is the same for buffers and
+bidirectional events.
+
+```
+         pad                            peerpad
+          |                                |
+gst_pad_push() /                           |
+gst_pad_push_event()                       |
+-------------------->O                     |
+                     O                     |
+          flushing?  O                     |
+           FLUSHING  O                     |
+      < - - - - - -  O                     |
+                     O-> do BLOCK probes   |
+                     O                     |
+                     O-> do DATA probes    |
+           no peer?  O                     |
+          NOT_LINKED O                     |
+      < - - - - - -  O                     |
+                     O  gst_pad_chain() /  |
+                     O  gst_pad_send_event()
+                     O-------------------->O
+                     O           flushing? O
+                     O            FLUSHING O
+                     O< - - - - - - - - - -O
+                     O                     O-> do BLOCK probes
+                     O                     O
+                     O                     O-> do DATA probes
+                     O                     O
+                     O                     O---> chainfunc /
+                     O                     O      eventfunc
+                     O< - - - - - - - - - -O
+                     O                     |
+                     O-> do IDLE probes    |
+                     O                     |
+      < - - - - - -  O                     |
+          |                                |
+```
+
+# Pull dataflow
+
+Pull probes have the `GST_PAD_PROBE_TYPE_PULL` flag set in the
+callbacks.
+
+The `gst_pad_pull_range()` call will first trigger the BLOCK probes
+without a DATA item. This allows the pad to be linked before the peer
+pad is resolved. It also allows the callback to set a data item in the
+probe info.
+
+After the blocking probes have run and the getrange function has been
+called on the peer pad, the DATA probes are called if there is a data item.
+ +When control returns to the sinkpad, the IDLE callbacks are called. The +IDLE callback is called without a data item so that it will also be +called when there was an error. + +If there is a valid DATA item, the DATA probes are called for the item. + +``` + srcpad sinkpad + | | + | | gst_pad_pull_range() + | O<--------------------- + | O + | O flushing? + | O FLUSHING + | O - - - - - - - - - - > + | do BLOCK probes <-O + | O no peer? + | O NOT_LINKED + | O - - - - - - - - - - > + | gst_pad_get_range() O + O<------------------------------O + O O + O flushing? O + O FLUSHING O + O- - - - - - - - - - - - - - - >O +do BLOCK probes <-O O + O O + getrangefunc <---O O + O flow error? O + O- - - - - - - - - - - - - - - >O + O O + do DATA probes <-O O + O- - - - - - - - - - - - - - - >O + | O + | do IDLE probes <-O + | O flow error? + | O - - - - - - - - - - > + | O + | do DATA probes <-O + | O - - - - - - - - - - > + | | +``` + +# Queries + +Query probes have the GST_PAD_PROBE_TYPE_QUERY_* flag set in the +callbacks. + +``` + pad peerpad + | | +gst_pad_peer_query() | | +-------------------->O | + O | + O-> do BLOCK probes | + O | + O-> do QUERY | PUSH probes | + no peer? O | + FALSE O | + < - - - - - - O | + O gst_pad_query() | + O------------------------------>O + O O-> do BLOCK probes + O O + O O-> do QUERY | PUSH probes + O O + O O---> queryfunc + O error O + <- - - - - - - - - - - - - - - - - - - - - - -O + O O + O O-> do QUERY | PULL probes + O< - - - - - - - - - - - - - - -O + O | + O-> do QUERY | PULL probes | + O | + < - - - - - - O | + | | +``` + +For queries, the PUSH ProbeType is set when the query is traveling to +the object that will answer the query and the PULL type is set when the +query contains the answer. + +# Use-cases + +## Prerolling a partial pipeline + +``` + .---------. .---------. .----------. + | filesrc | | demuxer | .-----. | decoder1 | + | src -> sink src1 ->|queue|-> sink src + '---------' | | '-----' '----------' X + | | .----------. 
+
+                    |         |     .-----.   | decoder2 |
+                    |        src2 ->|queue|-> sink      src
+                    '---------'     '-----'   '----------' X
+```
+
+The purpose is to create the pipeline dynamically up to the decoders but
+not yet connect them to a sink and without losing any data.
+
+To do this, the source pads of the decoders are blocked so that no events
+or buffers can escape and we don’t interrupt the stream.
+
+When all of the dynamic pads are created (no-more-pads emitted by the
+branching point, i.e. the demuxer, or the queues filled) and the pads are
+blocked (blocked callback received), the pipeline is completely
+prerolled.
+
+It should then be possible to perform the following actions on the
+prerolled pipeline:
+
+ - query duration/position
+
+ - perform a flushing seek to preroll a new position
+
+ - connect other elements and unblock the blocked pads.
+
+## dynamically switching an element in a PLAYING pipeline
+
+```
+   .----------.      .----------.      .----------.
+   | element1 |      | element2 |      | element3 |
+...        src -> sink        src -> sink        ...
+   '----------'      '----------'      '----------'
+                     .----------.
+                     | element4 |
+                    sink      src
+                     '----------'
+```
+
+The purpose is to replace element2 with element4 in the PLAYING
+pipeline.
+
+1) block element1 src pad.
+2) inside the block callback nothing is flowing between
+   element1 and element2 and nothing will flow until unblocked.
+3) unlink element1 and element2
+4) optional step: make sure data is flushed out of element2:
+   4a) pad event probe on element2 src
+   4b) send EOS to element2, this makes sure that element2 flushes out the
+       last bits of data it holds.
+   4c) wait for EOS to appear in the probe, drop the EOS.
+   4d) remove the EOS pad event probe.
+5) unlink element2 and element3
+   5a) optionally element2 can now be set to NULL and/or removed from the
+       pipeline.
+6) link element4 and element3
+7) link element1 and element4
+8) make sure element4 is in the same state as the rest of the elements. The
+   element should at least be PAUSED.
+9) unblock element1 src
+
+The same flow can be used to replace an element in a PAUSED pipeline. Of
+course in a PAUSED pipeline there might not be dataflow so the block
+might not happen immediately.
diff --git a/markdown/design/progress.md b/markdown/design/progress.md
new file mode 100644
index 0000000000..bb1ecdc053
--- /dev/null
+++ b/markdown/design/progress.md
@@ -0,0 +1,222 @@
+# Progress Reporting
+
+This document describes the design and use cases for the progress
+reporting messages.
+
+PROGRESS messages are posted on the bus to inform the application about
+the progress of asynchronous operations in the pipeline. This should not
+be confused with asynchronous state changes.
+
+We accommodate the following requirements:
+
+ - Application is informed when an async operation starts and
+   completes.
+
+ - It should be possible for the application to generically detect
+   common operations and incorporate their progress into the GUI.
+
+ - Applications can cancel pending operations by doing regular state
+   changes.
+
+ - Applications should be able to wait for completion of async
+   operations.
+
+We allow for the following scenarios:
+
+ - Elements want to inform the application about asynchronous DNS
+   lookups and pending network requests. This includes starting and
+   completing the lookup.
+
+ - Elements opening devices and resources asynchronously.
+
+ - Applications having more freedom to implement timeout and
+   cancellation of operations that currently block the state changes or
+   happen invisibly behind the scenes.
+
+## Rationale
+
+The main reason for adding these extra progress notifications is
+twofold:
+
+### To give the application more information about what is going on
+
+When there are well defined progress information codes, applications
+can let the user know about the status of the progress. We anticipate
+that at least DNS resolving, server connections and requests will be
+well defined.
+
+### To make the state changes non-blocking and cancellable
+
+Currently state changes such as going to the READY or PAUSED state often do
+blocking calls such as resolving DNS or connecting to a remote server. These
+operations often block the main thread and are often not cancellable, causing
+application lockups.
+
+We would like to make the state change function, instead, start a separate
+thread that performs the blocking operations in a cancellable way. When going
+back to the NULL state, all pending operations would be canceled immediately.
+
+For downward state changes, we want to let the application implement its own
+timeout mechanism. For example: when stopping an RTSP stream, the client
+needs to send a TEARDOWN request to the server. This can however take an
+unlimited amount of time in case of network problems. We want to give the
+application an opportunity to wait (and time out) for the completion of the
+async operation before setting the element to the final NULL state.
+
+Progress updates are very similar to buffering messages in the sense
+that the application can decide to wait for the completion of the
+buffering process before performing the next state change. It might make
+sense to implement buffering with the progress messages in the future.
+
+## Async state changes
+
+GStreamer currently has a `GST_STATE_CHANGE_ASYNC` return value to notify
+the application that a state change is happening asynchronously.
+
+The main purpose of this return value is to make the pipeline wait for
+preroll and delay future (upwards) state changes until the sinks are
+prerolled.
+
+In the case of async operations on a source, this will automatically force
+sinks to stay async because they will not preroll before the source can
+produce data.
+
+The fact that other asynchronous operations happen behind the scenes is
+irrelevant for the prerolling process, so it is not implemented with the
+ASYNC state change return value in order to not complicate the state
+changes and mix concepts.
+
+## Use cases
+
+### RTSP client (but also HTTP, MMS, …)
+
+When the client goes from the READY to the PAUSED state, it opens a socket,
+performs a DNS lookup, retrieves the SDP and negotiates the streams. All these
+operations currently block the state change function for an indefinite amount
+of time and cannot be canceled while they are blocking.
+
+Instead, a thread would be started to perform these operations asynchronously
+and the state change would complete with the usual NO_PREROLL return value.
+Before starting the thread a PROGRESS message would be posted to mark the
+start of the async operation.
+
+As the DNS lookup completes and the connection is established, PROGRESS
+messages are posted on the bus to inform the application of the progress. When
+something fails, an error is posted and a PROGRESS CANCELED message is posted.
+The application can then stop the pipeline.
+
+If there are no errors and the setup of the streams completes successfully, a
+PROGRESS COMPLETED is posted on the bus. The thread then goes to sleep and the
+asynchronous operation is completed.
+
+The RTSP protocol requires a TEARDOWN request to be sent to the server
+before closing the connection and destroying the socket. A state change to the
+READY state will issue the TEARDOWN request in the background and notify the
+application of this pending request with a PROGRESS message.
+
+The application might want to go to the NULL state only after it gets
+confirmation that the TEARDOWN request completed, or it might choose to go to
+NULL after a timeout. It is also possible that the application just wants to
+close the socket as fast as possible without waiting for completion of the
+TEARDOWN request.
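The "wait for confirmation before going to NULL" logic above only needs a little bookkeeping on the application side. Below is a hedged sketch in plain C: the enum names follow the progress types defined later in this document, but the tracker itself is an illustration and not GStreamer API.

``` c
/* Count pending async operations from PROGRESS messages: START opens
 * one; COMPLETE, CANCELED and ERROR close one; CONTINUE replaces the
 * previous task with a new one, so it stays pending. */
typedef enum {
  PROGRESS_START,
  PROGRESS_CONTINUE,
  PROGRESS_COMPLETE,
  PROGRESS_CANCELED,
  PROGRESS_ERROR
} ProgressType;

typedef struct {
  int pending;
} ProgressTracker;

static void tracker_update (ProgressTracker *t, ProgressType type)
{
  switch (type) {
    case PROGRESS_START:
      t->pending++;
      break;
    case PROGRESS_COMPLETE:
    case PROGRESS_CANCELED:
    case PROGRESS_ERROR:
      if (t->pending > 0)
        t->pending--;
      break;
    case PROGRESS_CONTINUE:
      /* previous task completed but a new one continues: still pending */
      break;
  }
}

/* The application may safely go to NULL once nothing is pending. */
static int tracker_idle (const ProgressTracker *t)
{
  return t->pending == 0;
}
```

A real application would feed this from a bus handler and combine it with its own timeout before forcing the final state change.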
+
+### Network performance measuring
+
+DNS lookup and connection times can be measured by calculating the elapsed
+time between the various PROGRESS messages.
+
+## Messages
+
+A new `PROGRESS` message will be created.
+
+The following fields will be contained in the message:
+
+- **`type`**, GST_TYPE_PROGRESS_TYPE: a set of types to define the type of progress
+  * GST_PROGRESS_TYPE_START: A new task is started in the background.
+  * GST_PROGRESS_TYPE_CONTINUE: The previous task completed and a new one
+    continues. This is done so that the application can follow a set of
+    continuous tasks and react to COMPLETE only when the element completely
+    finished.
+  * GST_PROGRESS_TYPE_CANCELED: A task is canceled by the user.
+  * GST_PROGRESS_TYPE_ERROR: A task stopped because of an error. In case of
+    an error, an error message will have been posted before.
+  * GST_PROGRESS_TYPE_COMPLETE: A task completed successfully.
+
+- **`code`**, G_TYPE_STRING: A generic extensible string that can be used to
+programmatically determine the action that is in progress. Some standard
+predefined codes will be defined.
+
+- **`text`**, G_TYPE_STRING: A user visible string detailing the action.
+
+- **`percent`**, G_TYPE_INT: Progress of the action as a percentage, between
+0 and 100. The following values are allowed:
+  - GST_PROGRESS_TYPE_START always has a 0% value.
+  - GST_PROGRESS_TYPE_CONTINUE has a value between 0 and 100.
+  - GST_PROGRESS_TYPE_CANCELED, GST_PROGRESS_TYPE_ERROR and
+    GST_PROGRESS_TYPE_COMPLETE always have a 100% value.
+
+- **`timeout`**, G_TYPE_INT in milliseconds: The timeout of the async
+operation. -1 if unknown/unlimited. This field can be interesting to the
+application when it wants to display some sort of progress indication.
+
+- ….
+
+Depending on the code, more fields can be put here.
+
+## Implementation
+
+Elements should not do blocking operations from the state change
+function.
Instead, elements should post an appropriate progress message
+with the right code and of type `GST_PROGRESS_TYPE_START` and then
+start a thread to perform the blocking calls in a cancellable manner.
+
+It is highly recommended to only start async operations from the READY
+to PAUSED state and onwards and not from the NULL to READY state. The
+reason for this is that streaming threads are usually started in the
+READY to PAUSED state and that the current NULL to READY state change is
+used to perform a blocking check for the presence of devices.
+
+The progress message needs to be posted from the state change function
+so that the application can immediately take appropriate action after
+setting the state.
+
+The threads will usually perform many blocking calls with different
+codes in a row; a client might first do a DNS query and then continue
+with establishing a connection to the server. For this purpose the
+`GST_PROGRESS_TYPE_CONTINUE` must be used.
+
+Usually, the thread used to perform the blocking operations can be used
+to implement the streaming threads when needed.
+
+Upon downward state changes, operations that are busy in the thread are
+canceled and `GST_PROGRESS_TYPE_CANCELED` is posted.
+
+The application knows about pending tasks because it received
+`GST_PROGRESS_TYPE_START` messages that didn’t complete with a
+`GST_PROGRESS_TYPE_COMPLETE` message, got canceled with a
+`GST_PROGRESS_TYPE_CANCELED` or errored with
+`GST_PROGRESS_TYPE_ERROR`. Applications should be able to choose whether
+they wait for the pending operations or cancel them.
+
+If an async operation fails, an error message is posted first, before the
+`GST_PROGRESS_TYPE_ERROR` progress message.
+
+## Categories
+
+We want to propose some standard codes here:
+
+* "open" : A resource is being opened
+
+* "close" : A resource is being closed
+
+* "name-lookup" : A DNS lookup
+
+* "connect" : A socket connection is established
+
+* "disconnect" : A socket connection is closed
+
+* "request" : A request is sent to a server and we are waiting for a reply.
+This message is posted right before the request is sent and completed when the
+reply has arrived completely.
+
+* "mount" : A volume is being mounted
+
+* "unmount" : A volume is being unmounted
+
+More codes can be posted by elements and can be made official later.
diff --git a/markdown/design/push-pull.md b/markdown/design/push-pull.md
new file mode 100644
index 0000000000..1d5a5e4629
--- /dev/null
+++ b/markdown/design/push-pull.md
@@ -0,0 +1,43 @@
+# push-pull
+
+Normally a source element will push data to the downstream element using
+the `gst_pad_push()` method. The downstream peer pad will receive the
+buffer in the chain function. In push mode, the source element is
+the driving force in the pipeline as it initiates data transport.
+
+It is also possible for an element to pull data from an upstream
+element. The downstream element does this by calling
+`gst_pad_pull_range()` on one of its sinkpads. In this mode, the
+downstream element is the driving force in the pipeline as it initiates
+data transfer.
+
+It is important that the elements are in the correct state to handle a
+`push()` or a `pull_range()` from the peer element. For `push()` based
+elements this means that all downstream elements should be in the
+correct state, and for `pull_range()` based elements this means the
+upstream elements should be in the correct state.
+
+Most sinkpads implement a chain function. This is the most common case.
+Sinkpads implementing a loop function will be the exception. Likewise,
+srcpads implementing a getrange function will be the exception.
+
+## state changes
+
+The GstBin sets the state of all the sink elements. These are the
+elements without source pads.
+
+Setting the state on an element will first activate all the srcpads and
+then the sinkpads.
For each of the sinkpads,
+`gst_pad_check_pull_range()` is performed. If the sinkpad supports a
+loop function and the peer pad returns TRUE from the GstPadCheckPullRange
+function, then the peer pad is activated first as it must be in the
+right state to handle a `_pull_range()`. Note that the state change of
+the element is not yet performed; just the activate function is called
+on the source pad. This means that elements that implement a getrange
+function must be prepared to get their activate function called before
+their state change function.
+
+Elements that have multiple sinkpads that require all of them to operate
+in the same mode (push/pull) can use `_check_pull_range()` on all
+their pads and can then remove the loop functions if one of the pads
+does not support pull based mode.
diff --git a/markdown/design/qos.md b/markdown/design/qos.md
new file mode 100644
index 0000000000..2181919e9a
--- /dev/null
+++ b/markdown/design/qos.md
@@ -0,0 +1,445 @@
+# Quality-of-Service
+
+Quality of service is about measuring and adjusting the real-time
+performance of a pipeline.
+
+The real-time performance is always measured relative to the pipeline
+clock and typically happens in the sinks when they synchronize buffers
+against the clock.
+
+The measurements result in QOS events that aim to adjust the datarate in
+one or more upstream elements. Two types of adjustments can be made:
+
+ - short time "emergency" corrections based on the latest observations
+   in the sinks.
+
+ - long term rate corrections based on trends observed in the sinks.
+
+It is also possible for the application to artificially introduce delay
+between synchronized buffers; this is called throttling. It can be used
+to reduce the framerate, for example.
+
+## Sources of quality problems
+
+ - High CPU load
+
+ - Network problems
+
+ - Other resource problems such as disk load, memory bottlenecks etc.
  - Application level throttling

## QoS event

The QoS event is generated by an element that synchronizes against the
clock. It travels upstream and contains the following fields:

* **`type`**: `GST_TYPE_QOS_TYPE`: The type of the QoS event. We have the
following types, and the default type is `GST_QOS_TYPE_UNDERFLOW`:

  * `GST_QOS_TYPE_OVERFLOW`: an element is receiving buffers too fast and can't
    keep up processing them. Upstream should reduce the rate.

  * `GST_QOS_TYPE_UNDERFLOW`: an element is receiving buffers too slowly
    and has to drop them because they are too late. Upstream should
    increase the processing rate.

  * `GST_QOS_TYPE_THROTTLE`: the application is asking to add extra delay
    between buffers; upstream is allowed to drop buffers.

* **`timestamp`**: `G_TYPE_UINT64`: The timestamp on the buffer that
generated the QoS event. These timestamps are expressed in total
running_time in the sink so that the value is ever increasing.

* **`jitter`**: `G_TYPE_INT64`: The difference of that timestamp against the
current clock time. Negative values mean the timestamp was on time.
Positive values indicate the timestamp was late by that amount. When
buffers are received in time and throttling is not enabled, the QoS
type field is set to OVERFLOW. When throttling, the jitter contains
the throttling delay added by the application and the type is set to
THROTTLE.

* **`proportion`**: `G_TYPE_DOUBLE`: Long term prediction of the ideal rate
relative to the normal rate to get optimal quality.

The rest of this document deals with how these values can be calculated
in a sink and how the values can be used by other elements to adjust
their operations.
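As an illustration of how these fields fit together, here is a small self-contained C model of the event payload and the type-selection rules above. The struct and helper names are ours, not GStreamer API; the real event is created with `gst_event_new_qos()`.

``` c
#include <stdint.h>

/* Hypothetical model of the QoS event payload described above. */
typedef enum {
  QOS_TYPE_OVERFLOW,   /* buffers arrive in time, upstream may slow down */
  QOS_TYPE_UNDERFLOW,  /* buffers arrive too late and get dropped */
  QOS_TYPE_THROTTLE    /* application asked for extra delay */
} QosType;

typedef struct {
  QosType  type;
  uint64_t timestamp;   /* running_time of the buffer, ever increasing */
  int64_t  jitter;      /* negative: on time, positive: late by that much */
  double   proportion;  /* long term ideal rate relative to normal rate */
} QosEvent;

/* Select the event type following the rules above: throttling wins;
 * otherwise a late buffer (positive jitter) means UNDERFLOW and a
 * timely buffer means OVERFLOW. */
QosType qos_event_type (int64_t jitter, int throttling)
{
  if (throttling)
    return QOS_TYPE_THROTTLE;
  return jitter >= 0 ? QOS_TYPE_UNDERFLOW : QOS_TYPE_OVERFLOW;
}
```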
## QoS message

A QoS message is posted on the bus whenever an element decides to:

  - drop a buffer because of QoS reasons

  - change its processing strategy because of QoS reasons (quality)

It should be expected that creating and posting the QoS message is
reasonably fast and does not significantly contribute to the QoS
problems. Options to disable this feature could also be presented on
elements.

This message can be posted by a sink/src that performs synchronisation
against the clock (live) or it could be posted by an upstream element
that performs QoS because of QoS events received from a downstream
element (!live).

The `GST_MESSAGE_QOS` contains at least the following info:

* **`live`**: `G_TYPE_BOOLEAN`: If the QoS message was generated by a live
element such as a sink or a live source. If the live property is
FALSE, the QoS message was generated as a response to a QoS event in
a non-live element.

* **`running-time`**: `G_TYPE_UINT64`: The running_time of the buffer that
generated the QoS message.

* **`stream-time`**: `G_TYPE_UINT64`: The stream_time of the buffer that
generated the QoS message.

* **`timestamp`**: `G_TYPE_UINT64`: The timestamp of the buffer that
generated the QoS message.

* **`duration`**: `G_TYPE_UINT64`: The duration of the buffer that generated
the QoS message.

* **`jitter`**: `G_TYPE_INT64`: The difference of the running-time against
the deadline. Negative values mean the timestamp was on time.
Positive values indicate the timestamp was late (and dropped) by
that amount. The deadline can be a realtime running_time or an
estimated running_time.

* **`proportion`**: `G_TYPE_DOUBLE`: Long term prediction of the ideal rate
relative to the normal rate to get optimal quality.

* **`quality`**: `G_TYPE_INT`: An element dependent integer value that
specifies the current quality level of the element. The default
maximum quality is 1000000.
* **`format`**: `GST_TYPE_FORMAT`: Units of the *processed* and *dropped*
fields. Video sinks and video filters will use `GST_FORMAT_BUFFERS`
(frames). Audio sinks and audio filters will likely use
`GST_FORMAT_DEFAULT` (samples).

* **`processed`**: `G_TYPE_UINT64`: Total number of units correctly
processed since the last state change to READY or a flushing
operation.

* **`dropped`**: `G_TYPE_UINT64`: Total number of units dropped since the
last state change to READY or a flushing operation.

The *running-time* and *processed* fields can be used to estimate the
average processing rate (framerate for video).

Elements might add additional fields in the message which are documented
in the relevant elements or baseclasses.

## Collecting statistics

A buffer with timestamp B1 arrives in the sink at time T1. The buffer
timestamp is then synchronized against the clock, which yields a jitter
J1 return value from the clock. The jitter J1 is simply calculated as

    J1 = CT - B1

where CT is the clock time when the entry arrives in the sink. This
value is calculated inside the clock when we perform
`gst_clock_id_wait()`.

If the jitter is negative, the entry arrived in time and can be rendered
after waiting for the clock to reach time B1 (which is also CT - J1).

If the jitter is positive however, the entry arrived too late in the
sink and should therefore be dropped. J1 is the amount of time the entry
was late.

Any buffer that arrives in the sink should generate a QoS event
upstream.

Using the jitter we can calculate the time when the buffer arrived in
the sink:

    T1 = B1 + J1 (1)

The time the buffer leaves the sink after synchronisation is measured
as:

    T2 = B1 + (J1 < 0 ? 0 : J1) (2)

For buffers that arrive in time (J1 < 0) the buffer leaves after
synchronisation, which is exactly B1. Late buffers (J1 >= 0) leave the
sink when they arrive, without any synchronisation, which is T2 = T1 =
B1 + J1.
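Equations (1) and (2) can be written down directly; a minimal sketch with our own helper names, using signed 64-bit times:

``` c
#include <stdint.h>

/* (1) T1 = B1 + J1: the time the buffer arrived in the sink. */
int64_t arrival_time (int64_t b1, int64_t j1)
{
  return b1 + j1;
}

/* (2) T2 = B1 + (J1 < 0 ? 0 : J1): timely buffers leave after waiting
 * for the clock to reach B1, late buffers leave immediately on arrival. */
int64_t leave_time (int64_t b1, int64_t j1)
{
  return b1 + (j1 < 0 ? 0 : j1);
}
```

For example, a buffer with B1 = 100 and J1 = -20 arrived at T1 = 80 and leaves at T2 = 100.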
Using a previous T0 and a new T1, we can calculate the time it took for
upstream to generate a buffer with timestamp B1.

    PT1 = T1 - T0 (3)

We call PT1 the processing time needed to generate the buffer with timestamp
B1.

Moreover, given the duration of the buffer D1, the current data rate
(DR1) of the upstream element is given as:

```
      PT1   T1 - T0
DR1 = --- = -------    (4)
      D1      D1
```

For values 0.0 < DR1 <= 1.0 the upstream element is producing faster
than real-time. If DR1 is exactly 1.0, the element is running at a
perfect speed.

Values DR1 > 1.0 mean that the upstream element cannot produce buffers
of duration D1 in real-time. It is exactly DR1 that tells the amount of
speedup we require from upstream to regain real-time performance.

An element that is not receiving enough data is said to be underflowed.

## Element measurements

In addition to the measurements of the datarate of the upstream element,
a typical element must also measure its own performance. Global pipeline
performance problems can indeed also be caused by the element itself
when it receives more data than it can process in time. The element is
then said to be overflowed.

# Short term correction

The timestamp and jitter serve as short term correction information for
upstream elements. Indeed, given arrival time T1 as given in (1), we can
be certain that buffers with a timestamp B2 < T1 will be too late in
the sink.

In case of a positive jitter we can therefore send a QoS event with a
timestamp B1, jitter J1 and proportion given by (4).

This allows an upstream element to not generate any data with timestamps
B2 < T1, where the element can derive T1 as B1 + J1.

This will effectively result in frame drops.

The element can even do a better estimation of the next valid timestamp
it should output.
Indeed, suppose the element generated a buffer with timestamp B0 that
arrived in time in the sink but then received a QoS event stating B1
arrived J1 too late. This means generating B1 took (B1 + J1) - B0 = T1 -
T0 = PT1, as given in (3). Given that buffer B1 had a duration D1 and
assuming that generating a new buffer B2 will take the same amount of
processing time, a better estimation for B2 would then be:

```
  B2 = T1 + D2 * DR1
```

expanding gives:

```
                        (B1 + J1 - B0)
  B2 = (B1 + J1) + D2 * --------------
                              D1
```

assuming the durations of the frames are equal and thus D1 = D2:

```
  B2 = (B1 + J1) + (B1 + J1 - B0)

  B2 = 2 * (B1 + J1) - B0
```

also:

```
  B0 = B1 - D1
```

so:

```
  B2 = 2 * (B1 + J1) - (B1 - D1)
```

which yields a more accurate prediction for the next buffer, given as:

```
  B2 = B1 + 2 * J1 + D1 (5)
```

# Long term correction

The datarate used to calculate (5) for the short term prediction is
based on a single observation. A more accurate datarate can be obtained
by creating a running average over multiple datarate observations.

This average is less susceptible to sudden changes that would only
influence the datarate for a very short period.

A running average is calculated over the observations given in (4) and
is used as the proportion member in the QoS event that is sent upstream.

Receivers of the QoS event should permanently reduce their datarate as
given by the proportion member. Failure to do so will certainly lead to
more dropped frames and a generally worse QoS.

# Throttling

In throttle mode, the time distance between buffers is kept to a
configurable throttle interval. This means that effectively the buffer
rate is limited to 1 buffer per throttle interval. This can be used to
limit the framerate, for example.
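Looking back at the short term correction, the prediction (5) derived above is cheap to apply; a sketch (the helper name is ours):

``` c
#include <stdint.h>

/* Prediction (5): given the timestamp B1 of the late buffer, its
 * jitter J1 from the QoS event and its duration D1, estimate the next
 * timestamp B2 that can still be rendered in time. */
int64_t predict_next_timestamp (int64_t b1, int64_t j1, int64_t d1)
{
  return b1 + 2 * j1 + d1;
}
```

With B1 = 40, J1 = 10 and D1 = 40 this yields B2 = 100, matching the full derivation B2 = T1 + D2 * DR1 = 50 + 40 * 1.25 for a B0 of 0.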
When an element is configured in throttling mode (this is usually only
implemented on sinks) it should produce QoS events upstream with the
jitter field set to the throttle interval. This should instruct upstream
elements to skip or drop the remaining buffers in the configured
throttle interval.

The proportion field is set to the desired slowdown needed to get the
desired throttle interval. Implementations can use the QoS Throttle
type, the proportion and the jitter member to tune their
implementations.

# QoS strategies

Several strategies exist to reduce processing delay that might affect
real time performance.

  - lowering quality

      - dropping frames (reduce CPU/bandwidth usage)

      - switch to a lower decoding/encoding quality (reduce algorithmic
        complexity)

      - switch to a lower quality source (reduce network usage)

  - increasing thread priorities

      - switch to real-time scheduling

      - assign more CPU cycles to critical pipeline parts

      - assign more CPU(s) to critical pipeline parts

# QoS implementations

Here follows a small overview of how QoS can be implemented in a range
of different types of elements.

# GstBaseSink

The primary implementor of QoS is GstBaseSink. It will calculate the
following values:

  - upstream running average of processing time (5) in stream time.

  - running average of buffer durations.

  - running average of render time (in system time)

  - rendered/dropped buffers

The processing time and the average buffer durations will be used to
calculate a proportion.

The processing time in system time is compared to the render time to decide
if the majority of the time is spent upstream or in the sink itself.
This value is used to decide overflow or underflow.

The number of rendered and dropped buffers is used to query stats on the
sink.

A QoS event with the most current values is sent upstream for each
buffer that was received by the sink.
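The running averages above can be kept with a simple exponential moving average; a sketch (the 1/8 weight is an arbitrary illustrative choice, not necessarily what GstBaseSink uses):

``` c
/* Fold a new datarate observation from (4) into the running average
 * that ends up in the proportion field of the QoS event. */
double update_proportion (double average, double observation)
{
  return average + (observation - average) / 8.0;
}
```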
Normally QoS is only enabled for video pipelines, the reason being that
dropped audio is more disturbing than dropped video frames. Also, video
generally requires more processing than audio.

Normally there is a threshold for when buffers get dropped in a video
sink. Frames that arrive 20 milliseconds late are still rendered as this
is not noticeable to the human eye.

A QoS message is posted whenever a (part of a) buffer is dropped.

In throttle mode, the sink sends a QoS event upstream with the timestamp
set to the running_time of the latest buffer and the jitter set to the
throttle interval. If the throttled buffer is late, the lateness is
subtracted from the throttle interval in order to keep the desired
throttle interval.

# GstBaseTransform

Transform elements can entirely skip the transform based on the
timestamp and jitter values of a recent QoS event, since these buffers
will certainly arrive too late.

As with any intermediate element, a transform should measure its own
performance to decide whether it is responsible for the quality problems
or whether an upstream or downstream element is.

Some transforms can reduce the complexity of their algorithms. Depending
on the algorithm, the changes in quality may have disturbing visual or
audible effects that should be avoided.

A QoS message should be posted when a frame is dropped or when the
quality of the filter is reduced. The quality member in the QoS message
should reflect the quality setting of the filter.

# Video Decoders

A video decoder can, based on the codec in use, decide to not decode
intermediate frames. A typical codec can for example skip the decoding
of B-frames to reduce the CPU usage and framerate.

If each frame is independently decodable, any arbitrary frame can be
skipped based on the timestamp and jitter values of the latest QoS
event. In addition, the proportion member can be used to permanently skip
frames.
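The lateness test such a decoder or transform can apply follows directly from the short term correction: with the timestamp and jitter of the most recent QoS event, a frame is certain to be late when its timestamp falls below the observed arrival time B1 + J1. A sketch (names are ours):

``` c
#include <stdint.h>

/* A frame with timestamp below the arrival time (B1 + J1) reported by
 * the last QoS event would reach the sink too late and can be skipped. */
int frame_is_late (int64_t frame_ts, int64_t qos_ts, int64_t qos_jitter)
{
  return frame_ts < qos_ts + qos_jitter;
}
```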
It is suggested to adjust the quality field of the QoS message with the
expected amount of dropped frames (skipping B and/or P frames). This
depends on the particular spacing of B and P frames in the stream. If
quality control would result in half of the frames being dropped
(typical B frame skipping), the quality field would be set to `1000000 *
1/2 = 500000`. If a typical I frame spacing of 18 frames is used,
skipping B and P frames would result in 17 dropped frames, or 1 decoded
frame every 18 frames. The quality member should be set to `1000000 *
1/18 = 55555`.

  - skipping B frames: quality = 500000

  - skipping P/B frames: quality = 55555 (for an I-frame spacing of 18
    frames)

# Demuxers

Demuxers usually cannot do a lot regarding QoS except for skipping
frames to the next keyframe when a lateness QoS event arrives on a
source pad.

A demuxer can however measure if the performance problems are upstream
or downstream and forward an updated QoS event upstream.

Demuxers that have multiple output pads might need to combine the
QoS events on all the pads and derive an aggregated QoS event for the
upstream element.

# Sources

The QoS events only apply to push based sources, since pull based sources
are entirely controlled by another downstream element.

Sources can receive an overflow or underflow event that can be used to
switch to less demanding source material. In the case of a network stream,
a switch could be made to a lower or higher quality stream, or additional
enhancement layers could be used or ignored.

Live sources will automatically drop data when it takes too long to
process the data that the element pushes out.

Live sources should post a QoS message when data is dropped.
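The quality values quoted above are just the decoded fraction scaled to the default maximum; a sketch:

``` c
/* Quality as described above: the fraction of frames still decoded,
 * scaled to the default maximum quality of 1000000. */
int compute_quality (int decoded_frames, int total_frames)
{
  return (int) ((1000000LL * decoded_frames) / total_frames);
}
```

`compute_quality (1, 2)` gives 500000 for B frame skipping and `compute_quality (1, 18)` gives 55555 for the 18-frame I spacing example.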
diff --git a/markdown/design/query.md b/markdown/design/query.md new file mode 100644 index 0000000000..0c2c910f87 --- /dev/null +++ b/markdown/design/query.md @@ -0,0 +1,71 @@

# Query

## Purpose

Queries are used to get information about the stream. A query is started
on a specific pad and travels up or downstream.

## Requirements

  - multiple return values, grouped together when they make sense.

  - one pad function to perform the query

  - extensible queries.

## Implementation

  - GstQuery extends GstMiniObject and contains a GstStructure (see
    GstMessage)

  - some standard query types are defined below

  - methods to create and parse the results in the GstQuery.

  - a pad query function is defined as:

``` c
    gboolean (*GstPadQueryFunction) (GstPad *pad,
                                     GstObject *parent,
                                     GstQuery *query);
```

The pad fills in the result in the query structure and returns TRUE, or
returns FALSE when the query is not supported.

## Query types

**`GST_QUERY_POSITION`**: get info on the current position of the stream in stream_time.

**`GST_QUERY_DURATION`**: get info on the total duration of the stream.

**`GST_QUERY_LATENCY`**: get the amount of latency introduced in the pipeline. (See [latency](design/latency.md))

**`GST_QUERY_RATE`**: get the current playback rate of the pipeline.

**`GST_QUERY_SEEKING`**: get info on how seeking can be done
  - getrange, with/without offset/size
  - ranges where seeking is efficient (for caching network sources)
  - flags describing seeking behaviour (forward, backward, segments,
    play backwards, ...)

**`GST_QUERY_SEGMENT`**: get info about the currently configured playback segment.

**`GST_QUERY_CONVERT`**: convert a format/value to another format/value pair.

**`GST_QUERY_FORMATS`**: return a list of supported formats that can be used for GST_QUERY_CONVERT.
**`GST_QUERY_BUFFERING`**: query available media for efficient seeking (See [buffering](design/buffering.md))

**`GST_QUERY_CUSTOM`**: a custom query; the name of the query defines the properties of the query.

**`GST_QUERY_URI`**: query the URI of the source or sink element

**`GST_QUERY_ALLOCATION`**: the buffer allocation properties (See [bufferpool](design/bufferpool.md))

**`GST_QUERY_SCHEDULING`**: the scheduling properties (See [scheduling](design/scheduling.md))

**`GST_QUERY_ACCEPT_CAPS`**: check if caps are supported (See [negotiation](design/negotiation.md))

**`GST_QUERY_CAPS`**: get the possible caps (See [negotiation](design/negotiation.md))

diff --git a/markdown/design/relations.md b/markdown/design/relations.md new file mode 100644 index 0000000000..7e87dd99b7 --- /dev/null +++ b/markdown/design/relations.md @@ -0,0 +1,522 @@

# Object relation types

This document describes the relations between objects that exist in
GStreamer. It also describes the way these relations are handled with
respect to locking and refcounting.

## parent-child relation

```
     +---------+          +-------+
     | parent  |          | child |
*--->|        *---------->|       |
     |       F1|<----------*     1|
     +---------+          +-------+
```

### properties
  - parent has references to multiple children
  - child has a reference to the parent
  - reference fields protected with LOCK
  - the reference held by each child to the parent is NOT reflected in
    the refcount of the parent.
  - the parent removes the floating flag of the child when taking
    ownership.
  - the application has a valid reference to the parent
  - creation/destruction requires two unnested locks and 1 refcount.

### usage in GStreamer

  * GstBin -> GstElement
  * GstElement -> GstRealPad

### lifecycle

#### object creation

The application creates two objects and holds a pointer
to them. The objects are initially FLOATING with a refcount of 1.
```
     +---------+          +-------+
*--->| parent  |     *--->| child |
     |        *|          |       |
     |       F1|          |*    F1|
     +---------+          +-------+
```

#### establishing the parent-child relationship

The application then calls a method on the parent object to take ownership of
the child object. The parent performs the following actions:

```
result = _set_parent (child, parent);
if (result) {
  lock (parent);
  ref_pointer = child;
  .. update other data structures ..
  unlock (parent);
} else {
  .. child had a parent ..
}
```

The `_set_parent()` method performs the following actions:

```
  lock (child);
  if (child->parent != null) {
    unlock (child);
    return false;
  }
  if (is_floating (child)) {
    unset (child, floating);
  }
  else {
    _ref (child);
  }
  child->parent = parent;
  unlock (child);
  _signal (parent_set, child, parent);
  return true;
```

The function atomically checks if the child has no parent yet
and will set the parent if not. It will also sink the child, meaning
all floating references to the child are invalid now as it takes
over the refcount of the object.

Visually:

after `_set_parent()` returns TRUE:

```
     +---------+          +-------+
*--->| parent  |    *-//->| child |
     |        *|          |       |
     |       F1|<----------*     1|
     +---------+          +-------+
```

after the parent updates ref_pointer to the child:

```
     +---------+          +-------+
*--->| parent  |    *-//->| child |
     |        *---------->|       |
     |       F1|<----------*     1|
     +---------+          +-------+
```

- only one parent is able to `_sink` the same object because the
  `_set_parent()` method is atomic.

- since only one parent is able to `_set_parent()` the object, only
  one will add a reference to the object.

- since the parent can hold multiple references to children, we don't
  need to lock the parent when locking the child. Many threads can
  call `_set_parent()` on children with the same parent; the
  parent can then add all those to its lists.
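A single-threaded toy model of the `_set_parent()` logic above in plain C; the `Obj` type and helper names are ours, and the real code lives in GstObject, where the child's lock is taken around the whole block:

``` c
#include <stdbool.h>
#include <stddef.h>

/* Toy object with only the fields that matter for _set_parent(). */
typedef struct _Obj {
  struct _Obj *parent;
  bool floating;
  int refcount;
} Obj;

bool set_parent (Obj *child, Obj *parent)
{
  /* LOCK (child) would be taken here */
  if (child->parent != NULL)
    return false;              /* child already had a parent */
  if (child->floating)
    child->floating = false;   /* sink: take over the floating reference */
  else
    child->refcount++;         /* otherwise take a new reference */
  child->parent = parent;
  /* UNLOCK (child), then emit the parent-set signal */
  return true;
}

/* A fresh floating child gets sunk rather than re-reffed: the
 * refcount stays 1 and the floating flag is cleared. */
int sunk_refcount (void)
{
  Obj parent = { NULL, false, 1 };
  Obj child  = { NULL, true,  1 };
  if (!set_parent (&child, &parent))
    return -1;
  return child.floating ? -1 : child.refcount;
}

/* Setting a parent a second time fails, so only one parent can ever
 * sink the child. */
int second_set_parent_fails (void)
{
  Obj p1 = { NULL, false, 1 }, p2 = { NULL, false, 1 };
  Obj child = { NULL, true, 1 };
  set_parent (&child, &p1);
  return set_parent (&child, &p2) ? 0 : 1;
}
```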
> Note that the signal is emitted before the parent has added the
> element to its internal data structures. This is not a problem
> since the parent usually has its own signal to inform the app that
> the child was reffed. One possible solution would be to update the
> internal structure first and then perform a rollback if the `_set_parent()`
> failed. This is not a good solution as iterators might grab the
> 'half-added' child too soon.

#### using the parent-child relationship

  - since the initial floating reference to the child object became
    invalid after giving it to the parent, any reference to a child
    implies a refcount > 1.

  - this means that unreffing a child object cannot decrease the
    refcount to 0. In fact, only the parent can destroy and dispose the
    child object.

  - given a reference to the child object, the parent pointer is only
    valid when holding the child LOCK. Indeed, after unlocking the child
    LOCK, the parent can unparent the child or the parent could even
    become disposed. To avoid the parent dispose problem, when obtaining
    the parent pointer, it should be reffed before releasing the child
    LOCK.

  * getting a reference to the parent

    - a reference is held to the child, so it cannot be disposed.

``` c
  LOCK (child);
  parent = _ref (child->parent);
  UNLOCK (child);

  .. use parent ..

  _unref (parent);
```

  * getting a reference to a child

    - a reference to a child can be obtained by reffing it before adding
      it to the parent or by querying the parent.

    - when requesting a child from the parent, a reference is held to the
      parent so it cannot be disposed. The parent will use its internal
      data structures to locate the child element and will return a
      reference to it with an incremented refcount. The requester should
      `_unref()` the child after usage.
  * destroying the parent-child relationship

    - only the parent can actively destroy the parent-child relationship;
      this typically happens when a method is called on the parent to
      release ownership of the child.

    - a child shall never remove itself from the parent.

    - since calling a method on the parent with the child as an argument
      requires the caller to obtain a valid reference to the child, the
      child refcount is at least > 1.

    - the parent will perform the following actions:

``` c
  LOCK (parent);
  if (ref_pointer == child) {
    ref_pointer = NULL;

    .. update other data structures ..
    UNLOCK (parent);

    _unparent (child);
  } else {
    UNLOCK (parent);
    .. not our child ..
  }
```

The `_unparent()` method performs the following actions:

``` c
  LOCK (child);
  if (child->parent != NULL) {
    child->parent = NULL;
    UNLOCK (child);
    _signal (PARENT_UNSET, child, parent);

    _unref (child);
  } else {
    UNLOCK (child);
  }
```

Since the `_unparent()` method unrefs the child object, it is possible that
the child pointer is invalid after this function. If the parent wants to
perform other actions on the child (such as signal emission) it should
`_ref()` the child first.

## single-reffed relation

```
     +---------+          +---------+
*--->| object1 |     *--->| object2 |
     |        *---------->|         |
     |        1|          |        2|
     +---------+          +---------+
```

### properties
  - one object has a reference to another
  - reference field protected with LOCK
  - the reference held by the object is reflected in the refcount of the
    other object.
  - typically the other object can be shared among multiple other
    objects where each ref is counted for in the refcount.
  - no object has ownership of the other.
  - either shared state or copy-on-write.
  - creation/destruction requires one lock and one refcount.
### usage

    GstRealPad -> GstCaps
    GstBuffer -> GstCaps
    GstEvent -> GstCaps
    GstEvent -> GstObject
    GstMessage -> GstCaps
    GstMessage -> GstObject

### lifecycle

#### Two objects exist unlinked.

```
     +---------+          +---------+
*--->| object1 |     *--->| object2 |
     |        *|          |         |
     |        1|          |        1|
     +---------+          +---------+
```

#### establishing the single-reffed relationship

The second object is attached to the first one using a method
on the first object. The second object is reffed and a pointer
is updated in the first object using the following algorithm:

``` c
  LOCK (object1);
  if (object1->pointer)
    _unref (object1->pointer);
  object1->pointer = _ref (object2);
  UNLOCK (object1);
```

After releasing the lock on the first object it is not certain that
object2 is still reffed from object1.

```
     +---------+          +---------+
*--->| object1 |     *--->| object2 |
     |        *---------->|         |
     |        1|          |        2|
     +---------+          +---------+
```

#### using the single-reffed relationship

The only way to access object2 is by holding a ref to it or by
getting the reference from object1.
Reading the object pointed to by object1 can be done like this:

``` c
  LOCK (object1);
  object2 = object1->pointer;
  _ref (object2);
  UNLOCK (object1);

  .. use object2 ..

  _unref (object2);
```

Depending on the type of the object, modifications can be done either with
copy-on-write or directly in the object.

Copy-on-write can practically only be done like this:

``` c
  LOCK (object1);
  object2 = object1->pointer;
  object2 = _copy_on_write (object2);
  .. make modifications to object2 ..
  UNLOCK (object1);
```

Releasing the lock leaves only a very small window where the copy-on-write
actually does not perform a copy:

``` c
  LOCK (object1);
  object2 = object1->pointer;
  _ref (object2);
  UNLOCK (object1);

  /* object2 now has at least 2 refcounts making the next
     copy-on-write make a real copy, unless some other thread writes
     another object2 to object1 here .. */

  object2 = _copy_on_write (object2);

  /* make modifications to object2 .. */

  LOCK (object1);
  if (object1->pointer != object2) {
    if (object1->pointer)
      _unref (object1->pointer);
    object1->pointer = gst_object_ref (object2);
  }
  UNLOCK (object1);
```

#### destroying the single-reffed relationship

The following algorithm removes the single-reffed link between
object1 and object2:

``` c
  LOCK (object1);
  _unref (object1->pointer);
  object1->pointer = NULL;
  UNLOCK (object1);
```

Which yields the following initial state again:

```
     +---------+          +---------+
*--->| object1 |     *--->| object2 |
     |        *|          |         |
     |        1|          |        1|
     +---------+          +---------+
```

## unreffed relation

```
     +---------+          +---------+
*--->| object1 |     *--->| object2 |
     |        *---------->|         |
     |        1|<----------*       1|
     +---------+          +---------+
```

### properties

  - two objects have references to each other
  - both objects can only have 1 reference to another object.
  - reference fields protected with LOCK
  - the references held by each object are NOT reflected in the refcount
    of the other object.
  - no object has ownership of the other.
  - typically each object is owned by a different parent.
  - creation/destruction requires two nested locks and no refcounts.

### usage

  - This type of link is used when the link is less important than the
    existence of the objects. If one of the objects is disposed, so is
    the link.

        GstRealPad <-> GstRealPad (srcpad lock taken first)

### lifecycle

#### Two objects exist unlinked.
```
     +---------+          +---------+
*--->| object1 |     *--->| object2 |
     |        *|          |         |
     |        1|          |*       1|
     +---------+          +---------+
```

#### establishing the unreffed relationship

Since we need to take two locks, the order in which these locks are
taken is very important or we might cause deadlocks. This lock order
must be defined for all unreffed relations. In these examples we always
lock object1 first and then object2.

``` c
  LOCK (object1);
  LOCK (object2);
  object2->refpointer = object1;
  object1->refpointer = object2;
  UNLOCK (object2);
  UNLOCK (object1);
```

#### using the unreffed relationship

Reading requires taking one of the locks and reading the corresponding
object. Again we need to ref the object before releasing the lock.

``` c
  LOCK (object1);
  object2 = _ref (object1->refpointer);
  UNLOCK (object1);

  .. use object2 ..
  _unref (object2);
```

#### destroying the unreffed relationship

Because of the lock order we need to be careful when destroying this
relation.

When only a reference to object1 is held:

``` c
  LOCK (object1);
  LOCK (object2);
  object1->refpointer->refpointer = NULL;
  object1->refpointer = NULL;
  UNLOCK (object2);
  UNLOCK (object1);
```

When only a reference to object2 is held we need to get a handle to the
other object first so that we can lock it first. There is a window where
we need to release all locks and the relation could become invalid. To solve
this we check the relation after grabbing both locks and retry if the
relation changed.

``` c
retry:
  LOCK (object2);
  object1 = _ref (object2->refpointer);
  UNLOCK (object2);
  .. things can change here ..
  LOCK (object1);
  LOCK (object2);
  if (object1 == object2->refpointer) {
    /* relation unchanged */
    object1->refpointer->refpointer = NULL;
    object1->refpointer = NULL;
  }
  else {
    /* relation changed..
       retry */
    UNLOCK (object2);
    UNLOCK (object1);
    _unref (object1);
    goto retry;
  }
  UNLOCK (object2);
  UNLOCK (object1);
  _unref (object1);
```

When references are held to both objects, the relation can be destroyed
as follows. Note that it is not possible to get references to both
objects with the locks released, since when the references are taken and
the locks are released, a concurrent update might have changed the link,
making the references not point to linked objects.

``` c
  LOCK (object1);
  LOCK (object2);
  if (object1->refpointer == object2) {
    object2->refpointer = NULL;
    object1->refpointer = NULL;
  }
  else {
    .. objects are not linked ..
  }
  UNLOCK (object2);
  UNLOCK (object1);
```

## double-reffed relation

```
     +---------+          +---------+
*--->| object1 |     *--->| object2 |
     |        *---------->|         |
     |        2|<----------*       2|
     +---------+          +---------+
```

### properties

  - two objects have references to each other
  - reference fields protected with LOCK
  - the references held by each object are reflected in the refcount of
    the other object.
  - no object has ownership of the other.
  - typically each object is owned by a different parent.
  - creation/destruction requires two locks and two refcounts.

### usage

Not used in GStreamer.

### lifecycle

diff --git a/markdown/design/scheduling.md b/markdown/design/scheduling.md new file mode 100644 index 0000000000..63fc2ed527 --- /dev/null +++ b/markdown/design/scheduling.md @@ -0,0 +1,246 @@

# Scheduling

The scheduling in GStreamer is based on pads actively pushing
(producing) data or pads pulling in data (consuming) from other pads.

## Pushing

A pad can produce data and push it to the next pad. A pad that behaves
this way exposes a loop function that will be called repeatedly until it
returns false. The loop function is allowed to block whenever it wants.
When the pad is deactivated the loop function should unblock though.
A pad operating in push mode can only produce data to a pad that
exposes a chain function. This chain function will be called with the
buffer produced by the pushing pad.

This method of producing data is called the streaming mode since the
producer produces a constant stream of data.

## Pulling

Pads that operate in pulling mode can only pull data from a pad that
exposes the `pull_range()` function. In this case, the sink pad exposes a
loop function that will be called repeatedly until the task is stopped.

After pulling data from the peer pad, the loop function will typically
call the push function to push the result to the peer sinkpad.

## Deciding the scheduling mode

When a pad is activated, the `_activate()` function is called. The pad
can then choose to activate itself in push or pull mode depending on
upstream capabilities.

The GStreamer core will by default activate pads in push mode when there
is no activate function for the pad.

## The chain function

The chain function will be called when an upstream element performs a
`_push()` on the pad. The upstream element can be another chain based
element or a pushing source.

## The getrange function

The getrange function is called when a peer pad performs a
`_pull_range()` on the pad. This downstream pad can be a pulling element
or another `_pull_range()` based element.

## Scheduling Query

A sinkpad can ask the upstream srcpad for its scheduling attributes. It
does this with the SCHEDULING query.

* (out) **`modes`**: G_TYPE_ARRAY (default NULL): an array of GST_TYPE_PAD_MODE enums. Contains all the supported scheduling modes.
+
+* (out) **`flags`**, GST_TYPE_SCHEDULING_FLAGS (default 0):
+
+```c
+typedef enum {
+  GST_SCHEDULING_FLAG_SEEKABLE          = (1 << 0),
+  GST_SCHEDULING_FLAG_SEQUENTIAL        = (1 << 1),
+  GST_SCHEDULING_FLAG_BANDWIDTH_LIMITED = (1 << 2)
+} GstSchedulingFlags;
+```
+
+*
+  * **`_SEEKABLE`**: the offset of a pull operation can be specified. If this
+    flag is false, the offset should be -1.
+
+  * **`_SEQUENTIAL`**: suggest sequential access to the data. If `_SEEKABLE` is
+    specified, seeks are allowed but should be avoided. This is common for network
+    streams.
+
+  * **`_BANDWIDTH_LIMITED`**: suggest the element supports buffering data for
+    downstream to cope with bandwidth limitations. If this flag is on, the
+    downstream element might ask for more data than necessary for normal playback.
+    This use case is interesting for on-disk buffering scenarios, for instance. Seek
+    operations might be slow as well, so downstream elements should take this into
+    consideration.
+
+* (out) **`minsize`**: G_TYPE_INT (default 1): the suggested minimum size of pull requests
+* (out) **`maxsize`**: G_TYPE_INT (default -1, unlimited): the suggested maximum size of pull requests
+* (out) **`align`**: G_TYPE_INT (default 0): the suggested alignment for pull requests.
+
+## Plug-in techniques
+
+### Multi-sink elements
+
+Elements with multiple sinks can either expose a loop function on each
+of the pads to actively pull\_range data or they can expose a chain
+function on each pad.
+
+Implementing a chain function is usually easy and allows for all
+possible scheduling methods.
+
+### Pad select
+
+If a chain-based sink wants to wait for one of the pads to receive a buffer, just
+implement the action to perform in the chain function. Be aware that the action could
+be performed in different threads and possibly simultaneously, so grab the STREAM_LOCK.
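+
+A minimal pad-select chain function might look like this (pseudocode in
+the style of the snippets below; `STREAM_LOCK`/`STREAM_UNLOCK` and
+`_handle_buffer` are illustrative, not exact GStreamer API):
+
+``` c
+static GstFlowReturn _chain (GstPad *pad, GstBuffer *buffer)
+{
+  /* chain functions of different pads can run concurrently and even
+   * simultaneously, so serialize the action with the stream lock */
+  STREAM_LOCK (pad);
+  _handle_buffer (pad, buffer);
+  STREAM_UNLOCK (pad);
+
+  return GST_FLOW_OK;
+}
+```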
+
+### Collect pads
+
+If the chain-based sink pads all require one buffer before the element can operate on
+the data, collect all the buffers in the chain function and perform the action when
+all chainpads have received a buffer.
+
+In this case you probably also don't want to accept more data on a pad that has a buffer
+queued. This can easily be done with the following code snippet:
+
+``` c
+static GstFlowReturn _chain (GstPad *pad, GstBuffer *buffer)
+{
+  LOCK (mylock);
+  while (pad->store != NULL) {
+    WAIT (mycond, mylock);
+  }
+  pad->store = buffer;
+  SIGNAL (mycond);
+  UNLOCK (mylock);
+
+  return GST_FLOW_OK;
+}
+
+static void _pull (GstPad *pad, GstBuffer **buffer)
+{
+  LOCK (mylock);
+  while (pad->store == NULL) {
+    WAIT (mycond, mylock);
+  }
+  *buffer = pad->store;
+  pad->store = NULL;
+  SIGNAL (mycond);
+  UNLOCK (mylock);
+}
+```
+
+## Cases
+
+The braces below the pads state which functions the pads expose:
+
+* l: exposes a loop function, so it can act as a pushing source.
+* g: exposes a getrange function
+* c: exposes a chain function
+
+The following scheduling decisions are made based on the scheduling methods exposed
+by the pads:
+
+* (g) - (l): sinkpad will pull data from src
+* (l) - (c): srcpad actively pushes data to sinkpad
+* ()  - (c): srcpad will push data to sinkpad.
+
+* ()  - () : not schedulable.
+* ()  - (l): not schedulable.
+* (g) - () : not schedulable.
+* (g) - (c): not schedulable.
+* (l) - () : not schedulable.
+* (l) - (l): not schedulable.
+
+* ()  - (g): impossible
+* (g) - (g): impossible.
+* (l) - (g): impossible
+* (c) - () : impossible
+* (c) - (g): impossible
+* (c) - (l): impossible
+* (c) - (c): impossible
+
+```
+ +---------+    +------------+    +-----------+
+ | filesrc |    | mp3decoder |    | audiosink |
+ |        src--sink         src--sink         |
+ +---------+    +------------+    +-----------+
+       (l-g)   (c)          ()   (c)
+```
+
+When activating the pads:
+
+  - audiosink has a chain function and the peer pad has no loop
+    function, no scheduling is done.
+
+  - mp3decoder and filesrc expose an (l) - (c) connection, a thread is
+    created to call the srcpad loop function.
+
+```
+ +---------+    +------------+    +----------+
+ | filesrc |    | avidemuxer |    | fakesink |
+ |        src--sink         src--sink        |
+ +---------+    +------------+    +----------+
+       (l-g)   (l)          ()   (c)
+```
+
+  - fakesink has a chain function and the peer pad has no loop function,
+    no scheduling is done.
+
+  - avidemuxer and filesrc expose a (g) - (l) connection, a thread is
+    created to call the sinkpad loop function.
+
+```
+ +---------+    +----------+    +------------+    +----------+
+ | filesrc |    | identity |    | avidemuxer |    | fakesink |
+ |        src--sink        src--sink          src--sink      |
+ +---------+    +----------+    +------------+    +----------+
+       (l-g)   (c)        ()   (l)          ()   (c)
+```
+
+  - fakesink has a chain function and the peer pad has no loop function,
+    no scheduling is done.
+
+  - avidemuxer and identity expose no schedulable connection so this
+    pipeline is not schedulable.
+
+```
+ +---------+    +----------+    +------------+    +----------+
+ | filesrc |    | identity |    | avidemuxer |    | fakesink |
+ |        src--sink        src--sink          src--sink      |
+ +---------+    +----------+    +------------+    +----------+
+       (l-g)   (c-l)      (g)  (l)          ()   (c)
+```
+
+  - fakesink has a chain function and the peer pad has no loop function,
+    no scheduling is done.
+
+  - avidemuxer and identity expose a (g) - (l) connection, a thread is
+    created to call the sinkpad loop function.
+
+  - identity knows the srcpad is getrange based and uses the thread from
+    avidemux to getrange data from filesrc.
+
+```
+ +---------+    +----------+    +------------+    +----------+
+ | filesrc |    | identity |    | oggdemuxer |    | fakesink |
+ |        src--sink        src--sink          src--sink      |
+ +---------+    +----------+    +------------+    +----------+
+       (l-g)   (c)        ()   (l-c)        ()   (c)
+```
+
+  - fakesink has a chain function and the peer pad has no loop function,
+    no scheduling is done.
+
+  - oggdemuxer and identity expose a () - (l-c) connection, oggdemux
+    has to operate in chain mode.
+
+  - identity can only work chain-based, so filesrc creates a thread
+    to push data to identity.
diff --git a/markdown/design/seeking.md b/markdown/design/seeking.md
new file mode 100644
index 0000000000..dbd7d0c51a
--- /dev/null
+++ b/markdown/design/seeking.md
@@ -0,0 +1,229 @@
+# Seeking
+
+Seeking in GStreamer means configuring the pipeline for playback of the
+media between a certain start and stop time, called the playback
+segment. By default a pipeline will play from position 0 to the total
+duration of the media at a rate of 1.0.
+
+A seek is performed by sending a seek event to the sink elements of a
+pipeline. Sending the seek event to a bin will by default forward the
+event to all sinks in the bin.
+
+When performing a seek, the start and stop values of the segment can be
+specified as absolute positions or relative to the currently configured
+playback segment. Note that it is not possible to seek relative to the
+current playback position. To seek relative to the current playback
+position, one must query the position first and then perform an absolute
+seek to the desired position.
+
+Feedback of the seek operation can be made immediate by using the
+`GST_SEEK_FLAG_FLUSH` flag. With this flag, all pending data in the
+pipeline is discarded and playback starts from the new position
+immediately.
+
+When the FLUSH flag is not set, the seek will be queued and executed as
+soon as possible, which might be after all queues are emptied.
+
+Seeking can be performed in different formats such as time, frames or
+samples.
+
+The seek can be performed to a nearby key unit or to the exact
+(estimated) unit in the media (`GST_SEEK_FLAG_KEY_UNIT`). See below
+for more details on this.
+
+The seek can be performed using an estimated target position or in
+an accurate way (`GST_SEEK_FLAG_ACCURATE`). For some formats this can
+result in having to scan the complete file in order to accurately find
+the target unit. See below for more details on this.
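+
+As a sketch of the query-then-seek approach mentioned above, a seek 10
+seconds forward relative to the current position could look like this
+(error handling omitted; `pipeline` is assumed to be an existing
+element):
+
+``` c
+gint64 pos;
+
+/* query the current position, then issue an absolute flushing seek */
+if (gst_element_query_position (pipeline, GST_FORMAT_TIME, &pos)) {
+  gst_element_seek (pipeline, 1.0, GST_FORMAT_TIME,
+      GST_SEEK_FLAG_FLUSH | GST_SEEK_FLAG_KEY_UNIT,
+      GST_SEEK_TYPE_SET, pos + 10 * GST_SECOND,
+      GST_SEEK_TYPE_NONE, GST_CLOCK_TIME_NONE);
+}
+```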
+
+A non-segment seek will make the pipeline emit EOS when the configured
+segment has been played.
+
+A segment seek (using the `GST_SEEK_FLAG_SEGMENT` flag) will not emit an
+EOS at the end of the playback segment but will post a SEGMENT_DONE
+message on the bus. This message is posted by the element driving the
+playback in the pipeline, typically a demuxer. After receiving the
+message, the application can reconnect the pipeline or issue other seek
+events in the pipeline. Since the message is posted as early as possible
+in the pipeline, the application has some time to issue a new seek to
+make the transition seamless. Typically the allowed delay is defined by
+the buffer sizes of the sinks as well as the size of any queues in the
+pipeline.
+
+The seek can also change the playback speed of the configured segment. A
+speed of 1.0 is normal speed, 2.0 is double speed. Negative values mean
+backward playback.
+
+When performing a seek with a playback rate different from 1.0, the
+`GST_SEEK_FLAG_SKIP` flag can be used to instruct decoders and demuxers
+that they are allowed to skip decoding. This can be useful when resource
+consumption is more important than accurately producing all frames.
+
+## Generating seeking events
+
+A seek event is created with `gst_event_new_seek ()`.
+
+## Seeking variants
+
+The different kinds of seeking methods and their internal workings are
+described below.
+
+### FLUSH seeking
+
+This is the most common way of performing a seek in a playback
+application. The application issues a seek on the pipeline and the new
+media is immediately played after the seek call returns.
+
+### seeking without FLUSH
+
+This seek type is typically performed after issuing segment seeks to
+finish the playback of the pipeline.
+
+Performing a non-flushing seek in a PAUSED pipeline blocks until the
+pipeline is set to PLAYING again, since all data passing is blocked in
+the prerolled sinks.
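+
+Seamless looping with segment seeks can be sketched as follows: when the
+SEGMENT_DONE message arrives on the bus, issue a new non-flushing
+segment seek back to the loop start (`loop_start` and `loop_stop` are
+assumed positions; fragment of a bus message handler, error handling
+omitted):
+
+``` c
+case GST_MESSAGE_SEGMENT_DONE:
+  /* re-arm the segment before the sinks run dry */
+  gst_element_seek (pipeline, 1.0, GST_FORMAT_TIME,
+      GST_SEEK_FLAG_SEGMENT,
+      GST_SEEK_TYPE_SET, loop_start,
+      GST_SEEK_TYPE_SET, loop_stop);
+  break;
+```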
+
+### segment seeking with FLUSH
+
+This seek is typically performed when starting seamless looping.
+
+### segment seeking without FLUSH
+
+This seek is typically performed when continuing seamless looping.
+
+### Demuxer/parser behaviour and `SEEK_FLAG_KEY_UNIT` and `SEEK_FLAG_ACCURATE`
+
+This section aims to explain the behaviour expected of an element with
+regard to the KEY_UNIT and ACCURATE seek flags, using the example of a
+parser or demuxer.
+
+#### DEFAULT BEHAVIOUR:
+
+When a seek to a certain position is requested, the demuxer/parser will
+do two things (ignoring flushing and segment seeks, and simplified for
+illustration purposes):
+
+  - send a segment event with a new start position
+
+  - start pushing data/buffers again
+
+To ensure that the data corresponding to the requested seek position can
+actually be decoded, a demuxer or parser needs to start pushing data
+from a keyframe/keyunit at or before the requested seek position.
+
+Unless requested differently (via the KEY_UNIT flag), the start of the
+segment event should be the requested seek position.
+
+So by default a demuxer/parser will start pushing data from
+position DATA and send a segment event with start position SEG_START,
+where DATA <= SEG_START.
+
+If DATA < SEG_START, a well-behaved video decoder will start decoding
+frames from DATA, but take into account the segment configured by the
+demuxer via the segment event, and only actually output decoded video
+frames from SEG_START onwards, dropping all decoded frames that are
+before the segment start and adjusting the timestamp/duration of the
+buffer that overlaps the segment start ("clipping"). A
+not-so-well-behaved video decoder will start decoding frames from DATA
+and push decoded video frames out starting from position DATA, in which
+case the frames that are before the configured segment start will
+usually be dropped/clipped downstream (e.g. by the video sink).
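+
+The clipping behaviour described above amounts to intersecting each
+decoded frame with the configured segment. A simplified sketch
+(illustrative helper, not GStreamer API; GStreamer itself provides
+`gst_segment_clip()` for this):
+
+``` c
+/* Clip a frame [start, stop) against a segment [seg_start, seg_stop).
+ * Returns FALSE if the frame should be dropped, otherwise adjusts
+ * start/stop to the overlapping part. */
+static gboolean
+clip_to_segment (guint64 *start, guint64 *stop,
+    guint64 seg_start, guint64 seg_stop)
+{
+  if (*stop <= seg_start || *start >= seg_stop)
+    return FALSE;               /* entirely outside, drop */
+
+  if (*start < seg_start)
+    *start = seg_start;         /* clip the leading part */
+  if (*stop > seg_stop)
+    *stop = seg_stop;           /* clip the trailing part */
+
+  return TRUE;
+}
+```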
+
+#### GST_SEEK_FLAG_KEY_UNIT:
+
+If the KEY_UNIT flag is specified, the demuxer/parser should adjust the
+segment start to the position of the keyframe closest to the requested
+seek position and then start pushing out data from there. The nearest
+keyframe may be before or after the requested seek position, but many
+implementations will only look for the closest keyframe before the
+requested position.
+
+Most media players and thumbnailers do (and should be doing) KEY_UNIT
+seeks by default, for performance reasons, to ensure almost-instant
+responsiveness when scrubbing (dragging the seek slider in PAUSED or
+PLAYING mode). This works well for most media, but results in suboptimal
+behaviour for a small number of *odd* files (e.g. files that only have
+one keyframe at the very beginning, or only a few keyframes throughout
+the entire stream). At the time of writing, a solution for this still
+needs to be found, but it could be implemented demuxer/parser-side, e.g.
+by making demuxers/parsers ignore the KEY_UNIT flag if the position
+adjustment would be larger than 1/10th of the duration or some similar
+threshold.
+
+Flags can be used to influence the snapping direction for those cases where
+it matters. SNAP_BEFORE will select the position preceding the seek
+target, and SNAP_AFTER will select the following one. If both flags are
+set, the one nearest to the seek target will be used. If neither of these
+flags is set, the seeking implementation is free to select whichever it
+wants.
+
+#### Summary:
+
+  - if the KEY_UNIT flag is **not** specified, the demuxer/parser
+    should start pushing data from a key unit preceding the seek
+    position (or from the seek position if that falls on a key unit),
+    and the start of the new segment should be the requested seek
+    position.
+
+  - if the KEY_UNIT flag is specified, the demuxer/parser should start
+    pushing data from the key unit nearest the seek position (or from
+    the seek position if that falls on a key unit), and the start of the
+    new segment should be adjusted to the position of the key unit
+    nearest the requested seek position (i.e. the new segment
+    start should be the position from which data is pushed).
+
+#### GST_SEEK_FLAG_ACCURATE:
+
+If the ACCURATE flag is specified in a seek request, the demuxer/parser
+is asked to do whatever it takes (!) to make sure that the position
+seeked to is accurate in relation to the beginning of the stream. This
+means that it is not acceptable to just approximate the position (e.g.
+using an average bitrate). The achieved position must be exact. In the
+worst case, the demuxer or parser needs to push data from the beginning
+of the file and let downstream clip everything before the requested
+segment start.
+
+The ACCURATE flag does not affect what the segment start should be in
+relation to the requested seek position. Only the KEY_UNIT flag (or its
+absence) has any effect on that.
+
+Video editors and frame-stepping applications usually use the ACCURATE
+flag.
+
+#### Summary:
+
+  - if the ACCURATE flag is **not** specified, it is up to the
+    demuxer/parser to decide how exact the seek should be. If the flag
+    is not specified, the expectation is that the demuxer/parser does a
+    reasonable best-effort attempt, trading speed for accuracy. In the
+    absence of an index, the seek position may be approximated.
+
+  - if the ACCURATE flag is specified, absolute accuracy is required,
+    and speed is of no concern. It is not acceptable to just approximate
+    the seek position in that case.
+
+  - the ACCURATE flag does not imply that the segment starts at the
+    requested seek position or should be adjusted to the nearest
+    keyframe; only the KEY_UNIT flag determines that.
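+
+The effect of the KEY_UNIT flag on the two positions involved can be
+condensed into a small sketch (hypothetical helper, assuming the
+position of the keyframe at/before the seek target is already known):
+
+``` c
+/* decide where to start pushing data (*data_start) and where the
+ * new segment starts (*seg_start) for a requested seek position */
+static void
+decide_positions (guint64 keyframe_before, guint64 seek_pos,
+    gboolean key_unit, guint64 *data_start, guint64 *seg_start)
+{
+  /* data is always pushed from a keyframe at/before the position */
+  *data_start = keyframe_before;
+
+  /* KEY_UNIT adjusts the segment start to the keyframe; otherwise
+   * the segment starts at the requested position and downstream
+   * clips the frames that come before it */
+  *seg_start = key_unit ? keyframe_before : seek_pos;
+}
+```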
+
+### ACCURATE and KEY_UNIT combinations:
+
+All combinations of these two flags are valid:
+
+  - neither flag specified: segment starts at the seek position, send data
+    from the preceding key frame (or earlier), feel free to approximate the
+    seek position
+
+  - only KEY_UNIT specified: segment starts at the position of the nearest
+    keyframe, send data from the nearest keyframe, feel free to approximate
+    the seek position
+
+  - only ACCURATE specified: segment starts at the seek position, send data
+    from the preceding key frame (or earlier), do not approximate the seek
+    position under any circumstances
+
+  - ACCURATE | KEY_UNIT specified: segment starts at the position of the
+    nearest keyframe, send data from the nearest key frame, do not
+    approximate the seek position under any circumstances
diff --git a/markdown/design/segments.md b/markdown/design/segments.md
new file mode 100644
index 0000000000..24e925e710
--- /dev/null
+++ b/markdown/design/segments.md
@@ -0,0 +1,108 @@
+# Segments
+
+A segment in GStreamer denotes a set of media samples that must be
+processed. A segment has a start time, a stop time and a processing
+rate.
+
+A media stream has a start and a stop time. The start time is always 0
+and the stop time is the total duration (or -1 if unknown, for example
+a live stream). We call this the complete media stream.
+
+A segment of the complete media stream can be played by issuing a seek
+on the stream. The seek has a start time, a stop time and a processing
+rate.
+
+```
+                 complete stream
++------------------------------------------------+
+0                                         duration
+
+                    segment
+          |--------------------------|
+        start                     stop
+```
+
+The playback of a segment starts with a source or demuxer element
+pushing a segment event containing the start time, stop time and rate of
+the segment. The purpose of this segment event is to inform downstream
+elements of the requested segment positions.
+Some elements might produce
+buffers that fall outside of the segment and those buffers might therefore be
+discarded or clipped.
+
+## Use case: FLUSHING seek
+
+ex. `filesrc ! avidemux ! videodecoder ! videosink`
+
+When doing a seek in this pipeline for the segment from 1 to 5 seconds,
+avidemux will perform the seek.
+
+Avidemux starts by sending a FLUSH_START event downstream and upstream. This
+will cause its streaming task to pause because \_pad_pull_range() and
+\_pad_push() will return FLUSHING. It then waits for the STREAM_LOCK,
+which will be unlocked when the streaming task pauses. At this point no
+streaming is happening anymore in the pipeline and a FLUSH_STOP is sent
+upstream and downstream.
+
+When avidemux starts playback of the segment from second 1 to 5, it pushes
+out a segment event with 1 and 5 as start and stop times. The stream_time in
+the segment is also 1 as this is the position we seek to.
+
+The video decoder stores these values internally and forwards them to the
+next downstream element (videosink, which also stores the values).
+
+Since second 1 does not contain a keyframe, the avi demuxer starts sending
+data from the previous keyframe, which is at timestamp 0.
+
+The video decoder decodes the keyframe but knows it should not push the
+video frame yet as it falls outside of the configured segment.
+
+When the video decoder receives the frame with timestamp 1, it is able to
+decode this frame as it received and decoded the data up to the previous
+keyframe. It then continues to decode and push frames with timestamps >= 1.
+When it reaches timestamp 5, it does not decode and push frames anymore.
+
+The video sink receives a frame with timestamp 1. It takes the start value of
+the previous segment and applies the following (simplified) formula:
+
+```
+  render_time = BUFFER_TIMESTAMP - segment_start + element->base_time
+```
+
+It then syncs against the clock with this render_time.
+Note that
+BUFFER_TIMESTAMP is always >= segment_start or else the buffer would fall
+outside of the configured segment.
+
+Videosink reports its current position as (simplified):
+
+```
+  current_position = clock_time - element->base_time + segment_time
+```
+
+See [synchronisation](design/synchronisation.md) for a more detailed and accurate explanation of
+synchronisation and position reporting.
+
+Since after a flushing seek the stream_time is reset to 0, the new buffer
+will be rendered immediately after the seek and the current_position will be
+the stream_time of the seek that was performed.
+
+The stop time is important when the video format contains B frames. The
+video decoder receives a P frame first, which it can decode but not push yet.
+When it receives a B frame, it can decode the B frame and push the B frame
+followed by the previously decoded P frame. If the P frame is outside of the
+segment, the decoder knows it should not send the P frame.
+
+Avidemux stops sending data after pushing a frame with timestamp 5 and
+returns GST_FLOW_EOS from the chain function to make the upstream
+elements perform the EOS logic.
+
+## Use case: live stream
+
+## Use case: segment looping
+
+Consider the case of a wav file with raw audio.
+
+```
+  filesrc ! wavparse ! alsasink
+```
+
+FIXME!
diff --git a/markdown/design/seqnums.md b/markdown/design/seqnums.md
new file mode 100644
index 0000000000..5e1bb1067f
--- /dev/null
+++ b/markdown/design/seqnums.md
@@ -0,0 +1,85 @@
+# Seqnums (Sequence numbers)
+
+Seqnums are integers associated with events and messages. They are used to
+identify a group of events and messages as being part of the same
+*operation* on the pipeline.
+
+Whenever a new event or message is created, a seqnum is set on it.
+This seqnum is created from an ever-increasing source (starting from 0,
+and it might wrap around), so each new event and message gets a new and
+hopefully unique seqnum.
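+
+Propagating a seqnum from a received event to a newly created message
+can be sketched in GStreamer API terms like this (for example in a sink
+handling EOS; simplified, no error handling):
+
+``` c
+guint32 seqnum = gst_event_get_seqnum (event);
+
+message = gst_message_new_eos (GST_OBJECT (sink));
+/* tie the message to the operation that caused it */
+gst_message_set_seqnum (message, seqnum);
+gst_element_post_message (GST_ELEMENT (sink), message);
+```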
+
+Suppose an element receives an event A and, as part of the logic of
+handling event A, creates a new event B. B should have its seqnum set to
+the same value as A's, because they are part of the same operation. The same
+logic applies if this element had to create multiple events or messages:
+all of those should have their seqnum set to the value on the received
+event. For example, when a sink element receives an EOS event and
+creates a new EOS message to post, it should copy the seqnum from the
+event to the message because the EOS message is a consequence of the EOS
+event being received.
+
+Preserving the seqnums across related events and messages allows
+elements and applications to identify a set of events/messages as being
+part of a single operation on the pipeline: for example, the flushes,
+segments and EOS that are related to a seek event started by the
+application.
+
+Seqnums are also useful for elements to discard duplicated events,
+avoiding handling them again.
+
+Below are some scenarios as examples of how to handle seqnums when
+receiving events:
+
+## Forcing EOS on the pipeline
+
+The application has a pipeline running and does a
+`gst_element_send_event` on the pipeline with an EOS event. All the
+sources in the pipeline will have their `send_event` handlers called and
+will receive the event from the application.
+
+When handling this event, the sources will either push the same EOS
+downstream or create their own EOS event and push it. In the latter case,
+the source should copy the seqnum from the original EOS to the newly
+created one. This same logic applies to all elements that receive the EOS
+downstream: either push the same event or, if creating a new one, copy
+the seqnum.
+
+When the EOS reaches the sink, it will create an EOS message, copy the
+seqnum to the message and post it to the bus. The application receives the
+message and can compare the seqnum of the message with the one from the
+original event sent to the pipeline.
+If they match, it knows that this
+EOS message was caused by the event it pushed and not by some other reason
+(the input finished or the configured segment was over).
+
+## Seeking
+
+A seek event sent to the pipeline is forwarded to all sinks in it. Those
+sinks then push the seek event upstream until they reach an element
+that is capable of handling it. If the element handling the seek has
+multiple source pads (typically a demuxer is handling the seek) it might
+receive the same seek event on all pads. To prevent handling the same
+seek event multiple times, the seqnum can be used to identify those
+events as being the same and only handle the first one received.
+
+Also, when handling the seek, the element might push flush-start,
+flush-stop and a segment event. All those events should have the same
+seqnum as the seek event received. When this segment is over and an
+EOS/segment-done event is going to be pushed, it also should have the
+same seqnum as the seek that originated the segment to be played.
+
+Having the same seqnum as the seek on the segment-done or EOS events is
+important for the application to identify that the segment requested by
+its seek has finished playing.
+
+## Questions
+
+What happens if the application has sent a seek to the pipeline and,
+while the segment relative to this seek is playing, it sends an EOS
+event? Should the EOS pushed by the source have the seqnum of the
+segment or of the EOS from the application?
+
+If the EOS was received from the application before the segment ended,
+it should have the seqnum of the application's EOS event. If the segment
+ends before the application's event is received/handled, it should have
+the seek/segment seqnum.
diff --git a/markdown/design/sparsestreams.md b/markdown/design/sparsestreams.md
new file mode 100644
index 0000000000..708a99ad38
--- /dev/null
+++ b/markdown/design/sparsestreams.md
@@ -0,0 +1,110 @@
+# DRAFT Sparse Streams
+
+## Introduction
+
+In 0.8, there was some support for sparse streams through the use of
+FILLER events. These were used to mark gaps between buffers so that
+downstream elements could know not to expect any more data for that gap.
+
+In 0.10, segment information conveyed through SEGMENT events can be used
+for the same purpose.
+
+In 1.0, there is a GAP event that works in a similar fashion to the
+FILLER event in 0.8.
+
+## Use cases
+
+1) Sub-title streams
+
+Sub-title information from muxed formats such as Matroska or MPEG
+consists of irregular buffers spaced far apart compared
+to the other streams (audio and video). Since these usually only appear
+when someone speaks or some other action in the video/audio needs
+describing, they can be anywhere from 1-2 seconds to several minutes
+apart. Downstream elements that want to mix sub-titles and video (and muxers)
+have no way of knowing whether to process a video packet or wait a moment
+for a corresponding sub-title to be delivered on another pad.
+
+2) Still frame/menu support
+
+In DVDs (and other formats), there are still-frame regions where the
+current video frame should be retained and
+no audio played for a period. In DVD, these are described either as a
+fixed-duration or an infinite-duration still frame.
+
+3) Avoiding processing silence from audio generators
+
+Imagine a source that from time to time produces empty buffers (silence
+or blank images).
+If the pipeline contains many further elements, it is better to avoid
+needlessly processing this empty data. Examples of such sources are
+sound generators (simsyn in gst-buzztard) or a source in a VoIP
+application that uses noise gating (to save bandwidth).
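+
+In 1.0, a source that knows no data will be produced for a while can
+announce this downstream with a GAP event instead of pushing silence or
+blank frames (sketch; `srcpad`, `timestamp` and `duration` are assumed
+to exist in the surrounding code):
+
+``` c
+/* tell downstream that [timestamp, timestamp + duration) is empty */
+gst_pad_push_event (srcpad,
+    gst_event_new_gap (timestamp, duration));
+```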
+
+## Details
+
+### Sub-title streams
+
+The main requirement here is to avoid stalling the pipeline between
+sub-title packets; this effectively means updating the minimum
+timestamp for that stream.
+
+A demuxer can do this by sending an 'update' SEGMENT with a new start time
+to the subtitle pad. For example, every time the SCR in MPEG data
+advances more than 0.5 seconds, the MPEG demuxer can issue a SEGMENT with
+(update=TRUE, start=SCR). Downstream elements can then be aware not to
+expect any data older than the new start time.
+
+The same holds true for any element that knows the current position in the
+stream - once the element knows that there is no more data to be presented
+until time 'n' it can advance the start time of the current segment to 'n'.
+
+This technique can also be used, for example, to represent a stream of
+MIDI events spaced to a clock period. When there is no event present for
+a clock time, a SEGMENT update can be sent in its place.
+
+### Still frame/menu support
+
+Still frames in DVD menus are not the same,
+in that they do not introduce a gap in the timestamps of the data.
+Instead, they represent a pause in the presentation of a stream.
+Correctly performing the wait requires some synchronisation with
+downstream elements.
+
+In this scenario, an upstream element that wants to execute a still frame
+performs the following steps:
+
+  - Send all data before the still frame wait
+
+  - Send a DRAIN event to ensure that all data has been played
+    downstream.
+
+  - Wait on the clock for the required duration, possibly interrupting
+    if necessary due to an intervening activity (such as a user
+    navigation)
+
+  - FLUSH the pipeline using a normal flush sequence (FLUSH\_START,
+    chain-lock, FLUSH\_STOP)
+
+  - Send a SEGMENT to restart playback with the next timestamp in the
+    stream.
+
+The upstream element performing the wait must only do so when in the PLAYING
During PAUSED, the clock will not be running, and may not even have +been distributed to the element yet. + +DRAIN is a new event that will block on a src pad until all data downstream +has been played out. + +Flushing after completing the still wait is to ensure that data after the wait +is played correctly. Without it, sinks will consider the first buffers +(x seconds, where x is the duration of the wait that occurred) to be +arriving late at the sink, and they will be discarded instead of played. + +### For audio + +It is the same case as the first one - there is a *gap* in the audio +data that needs to be presented, and this can be done by sending a +SEGMENT update that moves the start time of the segment to the next +timestamp when data will be sent. + +For video, however it is slightly different. Video frames are typically +treated at the moment as continuing to be displayed after their indicated +duration if no new frame arrives. Here, it is desired to display a blank +frame instead, in which case at least one blank frame should be sent before +updating the start time of the segment. diff --git a/markdown/design/standards.md b/markdown/design/standards.md new file mode 100644 index 0000000000..5bff604248 --- /dev/null +++ b/markdown/design/standards.md @@ -0,0 +1,51 @@ +# Ownership of dynamic objects + +Any object-oriented system or language that doesn’t have automatic +garbage collection has many potential pitfalls as far as the pointers +go. Therefore, some standards must be adhered to as far as who owns +what. + +## Strings + +Arguments passed into a function are owned by the caller, and the +function will make a copy of the string for its own internal use. The +string should be const gchar \*. Strings returned from a function are +always a copy of the original and should be freed after usage by the +caller. + +ex: + +``` c + name = gst_element_get_name (element); /* copy of name is made */ + .. use name .. 
+  g_free (name); /* free after usage */
+```
+
+## Objects
+
+Objects passed into a function are owned by the caller; any additional
+reference held to the object after leaving the function should increase
+the refcount of that object.
+
+Objects returned from a function are owned by the caller. This means
+that the caller should \_free() or \_unref() the object after usage.
+
+ex:
+
+``` c
+  peer = gst_pad_get_peer (pad); /* peer with increased refcount */
+  if (peer) {
+    .. use peer ..
+    gst_object_unref (GST_OBJECT (peer)); /* unref peer after usage */
+  }
+```
+
+## Iterators
+
+When retrieving multiple objects from an object, an iterator should be
+used. The iterator allows you to access the objects one after another
+while making sure that the set of objects retrieved remains consistent.
+
+Each object retrieved from an iterator has its refcount increased or is
+a copy of the original. In either case, the object should be unreffed or
+freed after usage.
diff --git a/markdown/design/states.md b/markdown/design/states.md
new file mode 100644
index 0000000000..51b73ce1be
--- /dev/null
+++ b/markdown/design/states.md
@@ -0,0 +1,404 @@
+# States
+
+Both elements and pads can be in different states. The states of the
+pads are linked to the state of the element, so the design of the states
+is mainly focused around the element states.
+
+An element can be in 4 states: NULL, READY, PAUSED and PLAYING. When an
+element is initially instantiated, it is in the NULL state.
+
+## State definitions
+
+  - NULL: This is the initial state of an element.
+
+  - READY: The element should be prepared to go to PAUSED.
+
+  - PAUSED: The element should be ready to accept and process data. Sink
+    elements however only accept one buffer and then block.
+
+  - PLAYING: The same as PAUSED except for live sources and sinks. Sinks
+    accept and render data. Live sources produce data.
+
+We call the sequence NULL→PLAYING an upwards state change and
+PLAYING→NULL a downwards state change.
+
+## State transitions
+
+The following state changes are possible:
+
+* *NULL -> READY*:
+  - The element must check if the resources it needs are available.
+    Device sinks and sources typically try to probe the device to constrain
+    their caps.
+  - The element opens the device; this is needed if the previous step requires
+    the device to be opened.
+
+* *READY -> PAUSED*:
+  - The element pads are activated in order to receive data in PAUSED.
+    Streaming threads are started.
+  - Some elements might need to return `ASYNC` and complete the state change
+    when they have enough information. It is a requirement for sinks to
+    return `ASYNC` and complete the state change when they receive the first
+    buffer or EOS event (preroll). Sinks also block the dataflow when in PAUSED.
+  - A pipeline resets the running_time to 0.
+  - Live sources return NO_PREROLL and don't generate data.
+
+* *PAUSED -> PLAYING*:
+  - Most elements ignore this state change.
+  - The pipeline selects a clock and distributes this to all the children
+    before setting them to PLAYING. This means that it is only allowed to
+    synchronize on the clock in the PLAYING state.
+  - The pipeline uses the clock and the running_time to calculate the base_time.
+    The base_time is distributed to all children when performing the state
+    change.
+  - Sink elements stop blocking on the preroll buffer or event and start
+    rendering the data.
+  - Sinks can post the EOS message in the PLAYING state. It is not allowed to
+    post EOS when not in the PLAYING state.
+  - While streaming in PAUSED or PLAYING, elements can create and remove
+    sometimes pads.
+  - Live sources start generating data and return SUCCESS.
+
+* *PLAYING -> PAUSED*:
+  - Most elements ignore this state change.
+  - The pipeline calculates the running_time based on the last selected clock
+    and the base_time. It stores this information to continue playback when
+    going back to the PLAYING state.
+  - Sinks unblock any clock wait calls.
+  - When a sink does not have a pending buffer to play, it returns `ASYNC` from
+    this state change and completes the state change when it receives a new
+    buffer or an EOS event.
+  - Any queued EOS messages are removed since they will be reposted when going
+    back to the PLAYING state. The EOS messages are queued in GstBins.
+  - Live sources stop generating data and return NO_PREROLL.
+
+* *PAUSED -> READY*:
+  - Sinks unblock any waits in the preroll.
+  - Elements unblock any waits on devices.
+  - Chain or get_range functions return FLUSHING.
+  - The element pads are deactivated so that streaming becomes impossible and
+    all streaming threads are stopped.
+  - The sink forgets all negotiated formats.
+  - Elements remove all sometimes pads.
+
+* *READY -> NULL*:
+  - Elements close devices.
+  - Elements reset any internal state.
+
+## State variables
+
+An element has 4 state variables that are protected with the object LOCK:
+
+  - *STATE*
+  - *STATE_NEXT*
+  - *STATE_PENDING*
+  - *STATE_RETURN*
+
+The STATE always reflects the current state of the element. The
+STATE\_NEXT reflects the next state the element will go to. The
+STATE\_PENDING always reflects the required state of the element. The
+STATE\_RETURN reflects the last return value of a state change.
+
+The STATE\_NEXT and STATE\_PENDING can be VOID\_PENDING if the element
+is in the right state.
+
+An element has a special lock to protect against concurrent invocations
+of set\_state(), called the STATE\_LOCK.
+
+## Setting state on elements
+
+The state of an element can be changed with \_element\_set\_state().
+When changing the state of an element all intermediate states will also
+be set on the element until the final desired state is set.
+
+The `set_state()` function can return 4 possible values:
+
+* *GST_STATE_FAILURE*: The state change failed for some reason. The plugin should
+have posted an error message on the bus with information.
+
+* *GST_STATE_SUCCESS*: The state change is completed successfully.
+
+* *GST_STATE_ASYNC*: The state change will complete later on. This can happen
+when the element needs a long time to perform the state change or for sinks
+that need to receive the first buffer before they can complete the state change
+(preroll).
+
+* *GST_STATE_NO_PREROLL*: The state change is completed successfully but the
+element will not be able to produce data in the PAUSED state.
+
+In the case of an `ASYNC` state change, it is possible to proceed to the
+next state before the current state change completed, however, the
+element will only get to this next state after completing the previous
+`ASYNC` state change. After receiving an `ASYNC` return value, you can use
+`element_get_state()` to poll the status of the element. If the
+polling returns `SUCCESS`, the element completed the state change to the
+last requested state with `set_state()`.
+
+When setting the state of an element, the STATE\_PENDING is set to the
+required state. Then the state change function of the element is called
+and the result of that function is used to update the STATE,
+STATE\_NEXT, STATE\_PENDING and STATE\_RETURN fields. If the function
+returned `ASYNC`, this result is immediately returned to the caller.
+
+## Getting state of elements
+
+The get\_state() function takes 3 arguments, two pointers that will
+hold the current and pending state and one GstClockTime that holds a
+timeout value. The function returns a GstElementStateReturn.
+
+  - If the element returned `SUCCESS` to the previous \_set\_state()
+    function, this function will return the last state set on the
+    element and VOID\_PENDING in the pending state value. The function
+    returns GST\_STATE\_SUCCESS.
+
+  - If the element returned NO\_PREROLL to the previous \_set\_state()
+    function, this function will return the last state set on the
+    element and VOID\_PENDING in the pending state value. The function
+    returns GST\_STATE\_NO\_PREROLL.
+
+  - If the element returned FAILURE to the previous \_set\_state() call,
+    this function will return FAILURE with the state set to the current
+    state of the element and the pending state set to the value used in
+    the last call of \_set\_state().
+
+  - If the element returned `ASYNC` to the previous \_set\_state() call,
+    this function will wait for the element to complete its state change
+    up to the amount of time specified in the GstClockTime.
+
+      - If the element does not complete the state change in the
+        specified amount of time, this function will return `ASYNC` with
+        the state set to the current state and the pending state set to
+        the pending state.
+
+      - If the element completes the state change within the specified
+        timeout, this function returns the updated state and
+        VOID\_PENDING as the pending state.
+
+      - If the element aborts the `ASYNC` state change due to an error
+        within the specified timeout, this function returns FAILURE with
+        the state set to the last successful state and the pending state
+        set to the last attempted state. The element should also post an
+        error message on the bus with more information about the problem.
+
+## States in GstBin
+
+A GstBin manages the state of its children. It does this by propagating
+the state changes performed on it to all of its children. The
+\_set\_state() function on a bin will call the \_set\_state() function
+on all of its children that are not already in the target state or
+already changing to the target state.
+
+The children are iterated from the sink elements to the source elements.
+This makes sure that when changing the state of an element, the
+downstream elements are in the correct state to process any resulting
+buffers. In the case of a downwards state change, the sink elements will
+shut down first which makes the upstream elements shut down as well
+since the \_push() function returns a GST\_FLOW\_FLUSHING error.
+
+If all the children return `SUCCESS`, the function returns `SUCCESS` as
+well.
+
+If one of the children returns FAILURE, the function returns FAILURE as
+well. In this state it is possible that some elements successfully
+changed state. The application can check which elements have a changed
+state, which were in error and which were not affected by iterating the
+elements and calling \_get\_state() on the elements.
+
+If after calling the state function on all children, one of the children
+returned `ASYNC`, the function returns `ASYNC` as well.
+
+If after calling the state function on all children, one of the children
+returned NO\_PREROLL, the function returns NO\_PREROLL as well.
+
+If both NO\_PREROLL and `ASYNC` children are present, NO\_PREROLL is
+returned.
+
+The current state of the bin can be retrieved with \_get\_state().
+
+If the bin is performing an `ASYNC` state change, it will automatically
+update its current state fields when it receives state messages from the
+children.
+
+## Implementing states in elements
+
+### READY
+
+### Upward state change
+
+Upward state changes can return `ASYNC` whether or not the final
+STATE\_PENDING state has been reached.
+
+Element:
+
+* A -> B => `SUCCESS`
+  - commit state
+
+* A -> B => `ASYNC`
+  - no commit state
+  - element commits state `ASYNC`
+
+* A -> B while `ASYNC`
+  - update STATE_PENDING state
+  - no commit state
+  - no change_state called on element
+
+Bin:
+
+* A->B: all elements `SUCCESS`
+  - commit state
+
+* A->B: some elements `ASYNC`
+  - no commit state
+  - listen for commit messages on bus
+  - for each commit message, poll elements; this happens in another
+    thread.
+  - if no `ASYNC` elements, commit state, continue state change
+    to STATE_PENDING
+
+### Downward state change
+
+Downward state changes only return `ASYNC` if the final state is `ASYNC`.
+This is to make sure that it’s not needed to wait for an element to
+complete the preroll or other `ASYNC` state changes when one only wants to
+shut down an element.
+ +Element: + +A -> B => `SUCCESS` + - commit state + +A -> B => `ASYNC` not final state + - commit state on behalf of element + +A -> B => `ASYNC` final state + - element will commit `ASYNC` + +Bin: + +A -> B -> `SUCCESS` + - commit state + +A -> B -> `ASYNC` not final state + - commit state on behalf of element, continue state change + +A -> B => `ASYNC` final state + - no commit state + - listen for commit messages on bus + - for each commit message, poll elements + - if no `ASYNC` elements, commit state + +## Locking overview (element) + +- Element committing `SUCCESS` + + - STATE\_LOCK is taken in set\_state + + - change state is called if `SUCCESS`, commit state is called + + - commit state calls change\_state to next state change. + + - if final state is reached, stack unwinds and result is returned + to set\_state and + caller. + +``` +set_state(element) change_state (element) commit_state + + | | | + | | | +STATE_LOCK | | + | | | + |------------------------>| | + | | | + | | | + | | (do state change) | + | | | + | | | + | | if `SUCCESS` | + | |---------------------->| + | | | post message + | | | + | |<----------------------| if (!final) change_state (next) + | | | else SIGNAL + | | | + | | | + | | | + |<------------------------| | + | `SUCCESS` + | +STATE_UNLOCK + | + `SUCCESS` +``` + +- Element committing `ASYNC` + + - STATE\_LOCK is taken in set\_state + + - change state is called and returns `ASYNC` + + - `ASYNC` returned to the caller. + + - element takes LOCK in streaming thread. + + - element calls commit\_state in streaming thread. + + - commit state calls change\_state to next state + change. + +``` +set_state(element) change_state (element) stream_thread commit_state (element) + + | | | | + | | | | +STATE_LOCK | | | + | | | | + |------------------------>| | | + | | | | + | | | | + | | (start_task) | | + | | | | + | | STREAM_LOCK | + | | |... | + |<------------------------| | | + | ASYNC STREAM_UNLOCK | +STATE_UNLOCK | | + | .....sync........ 
     STATE_LOCK             |
+    ASYNC                                   |----------------->|
+                                            |                  |
+                                            |                  |---> post_message()
+                                            |                  |---> if (!final) change_state (next)
+                                            |                  |     else SIGNAL
+                                            |<-----------------|
+                                       STATE_UNLOCK
+                                            |
+                                       STREAM_LOCK
+                                            | ...
+                                       STREAM_UNLOCK
+```
+
+## Remarks
+
+set\_state cannot be called from multiple threads at the same time. The
+STATE\_LOCK prevents this.
+
+State variables are protected with the LOCK.
+
+Calling set\_state while get\_state is called should unlock the
+get\_state with an error. The cookie will do that.
+
+``` c
+set_state(element)
+
+STATE_LOCK
+
+LOCK
+update current, next, pending state
+cookie++
+UNLOCK
+
+change_state
+
+STATE_UNLOCK
+```
diff --git a/markdown/design/stream-selection.md b/markdown/design/stream-selection.md
new file mode 100644
index 0000000000..0e2968731c
--- /dev/null
+++ b/markdown/design/stream-selection.md
@@ -0,0 +1,580 @@
+# Stream selection
+
+History
+```
+v0.1: Jun 11th 2015
+    Initial Draft
+v0.2: Sep 18th 2015
+    Update to reflect design changes
+v1.0: Jun 28th 2016
+    Pre-commit revision
+```
+
+This document describes the events and objects involved in stream
+selection in GStreamer pipelines, elements and applications.
+
+## Background
+
+This new API is intended to address the use cases described in
+this section:
+
+1) As a user/app I want an overview and control of the media streams
+   that can be configured within a pipeline for processing, even
+   when some streams are mutually exclusive or logical constructs only.
+
+2) The user/app can disable entirely streams it's not interested
+   in so they don't occupy memory or processing power - discarded
+   as early as possible in the pipeline. The user/app can also
+   (re-)enable them at a later time.
+
+3) If the set of possible stream configurations is changing,
+   the user/app should be aware of the pending change and
+   be able to make configuration choices for the new set of streams,
+   as well as possibly still reconfiguring the old set.
+
+4) Elements that have some other internal mechanism for triggering
+   stream selections (DVD, or maybe some scripted playback
+   playlist) should be able to trigger 'selection' of some particular
+   stream.
+
+5) Indicate known relationships between streams - for example that
+   2 separate video feeds represent the 2 views of a stereoscopic
+   view, or that certain streams are mutually exclusive.
+
+> Note: the streams that are "available" are not automatically
+> the ones active, or present in the pipeline as pads. Think HLS/DASH
+> alternate streams.
+
+Use case examples:
+
+1) Playing an MPEG-TS multi-program stream, we want to tell the
+   app that there are multiple programs that could be extracted
+   from the incoming feed. Further, we want to provide a mechanism
+   for the app to select which program(s) to decode, and once
+   that is known to further tell the app which elementary streams
+   are then available within those program(s) so the app/user can
+   choose which audio track(s) to decode and/or use.
+
+2) A new PMT arrives for an MPEG-TS stream, due to a codec or
+   channel change. The pipeline will need to reconfigure to
+   play the desired streams from the new program. Equally, there
+   may be multiple seconds of content buffered from the old
+   program and it should still be possible to switch (for example)
+   subtitle tracks responsively in the draining-out data, as
+   well as selecting which subs track to play from the new feed.
+   This same scenario applies when doing gapless transition to a
+   new source file/URL, except that likely the element providing
+   the list of streams also changes as a new demuxer is installed.
+
+3) When playing a multi-angle DVD, the DVD Virtual Machine needs to
+   extract 1 angle from the data for presentation. It can publish
+   the available angles as logical streams, even though only one
+   stream can be chosen.
+
+4) When playing a DVD, the user can make stream selections from the
+   DVD menu to choose audio or sub-picture tracks, or the DVD VM
+   can trigger automatic selections. In addition, the player UI
+   should be able to show which audio/subtitle tracks are available
+   and allow direct selection in a GUI the same as for normal
+   files with subtitle tracks in them.
+
+5) Playing a SCHC (3DTV) feed, where one view is MPEG-2 and the other
+   is H.264 and they should be combined for 3D presentation, or
+   not bother decoding 1 stream if displaying 2D.
+   (bug https://bugzilla.gnome.org/show_bug.cgi?id=719333)
+
+FIXME - need some use cases indicating what alternate streams in
+HLS might require - what are the possibilities?
+
+## Design Overview
+
+Stream selection in GStreamer is implemented in several parts:
+1) Objects describing streams: GstStream
+2) Objects describing a collection of streams: GstStreamCollection
+3) Events from the app allowing selection and activation of some streams:
+   GST_EVENT_SELECT_STREAMS
+4) Messages informing the user/application about the available
+   streams and current status:
+   GST_MESSAGE_STREAM_COLLECTION
+   GST_MESSAGE_STREAMS_SELECTED
+
+## GstStream objects
+
+* API: GstStream
+* API: gst_stream_new(..)
+* API: gst_stream_get_\*(...)
+* API: gst_stream_set_\*()
+* API: gst_event_set_stream(...)
+* API: gst_event_parse_stream(...)
+
+GstStream objects are a high-level convenience object containing
+information regarding a possible data stream that can be exposed by
+GStreamer elements.
+ +They are mostly the aggregation of information present in other +GStreamer components (STREAM_START, CAPS, TAGS event) but are not +tied to the presence of a GstPad, and for some use-cases provide +information that the existing components don't provide. + +The various properties of a GstStream object are: + - stream_id (from the STREAM_START event) + - flags (from the STREAM_START event) + - caps + - tags + - type (high-level type of stream: Audio, Video, Container,...) + +GstStream objects can be subclassed so that they can be re-used by +elements already using the notion of stream (which is common for +example in demuxers). + +Elements that create GstStream should also set it on the +GST_EVENT_STREAM_START event of the relevant pad. This helps +downstream elements to have all information in one location. + +## Exposing collections of streams + +* API: GstStreamCollection +* API: gst_stream_collection_new(...) +* API: gst_stream_collection_add_stream(...) +* API: gst_stream_collection_get_size(...) +* API: gst_stream_collection_get_stream(...) +* API: GST_MESSAGE_STREAM_COLLECTION +* API: gst_message_new_stream_collection(...) +* API: gst_message_parse_stream_collection(...) +* API: GST_EVENT_STREAM_COLLECTION +* API: gst_event_new_stream_collection(...) +* API: gst_event_parse_stream_collection(...) + +Elements that create new streams (such as demuxers) or can create +new streams (like the HLS/DASH alternative streams) can list the +streams they can make available with the GstStreamCollection object. + +Other elements that might generate GstStreamCollections are the +DVD-VM, which handles internal switching of tracks, or parsebin and +decodebin3 when it aggregates and presents multiple internal stream +sources as a single configurable collection. + +The GstStreamCollection object is a flat listing of GstStream objects. 
+ +The various properties of a GstStreamCollection are: + - 'identifier' + - the identifier of the collection (unique name) + - Generated from the 'upstream stream id' (or stream ids, plural) + - the list of GstStreams in the collection. + - (Not implemented) : Flags - + For now, the only flag is 'INFORMATIONAL' - used by container parsers to + publish information about detected streams without allowing selection of + the streams. + - (Not implemented yet) : The relationship between the various streams + This specifies which streams are exclusive (can not be selected at the + same time), are related (such as LINKED_VIEW or ENHANCEMENT), or need to + be selected together. + +An element will inform outside components about that collection via: + +* a GST_MESSAGE_STREAM_COLLECTION message on the bus. +* a GST_EVENT_STREAM_COLLECTION on each source pads. + +Applications and container bin elements can listen and collect the +various stream collections to know the full range of streams +available within a bin/pipeline. + +Once posted on the bus, a GstStreamCollection is immutable. It is +updated by subsequent messages with a matching identifier. + +If the element that provided the collection goes away, there is no way +to know that the streams are no longer valid (without having the +user/app track that element). The exception to that is if the bin +containing that element (such as parsebin or decodebin3) informs that +the next collection is a replacement of the former one. + +The mutual exclusion and relationship lists use stream-ids +rather than GstStream references in order to avoid circular +referencing problems. + +### Usage from elements + +When a demuxer knows the list of streams it can expose, it +creates a new GstStream for each stream it can provide with the +appropriate information (stream id, flag, tags, caps, ...). + +The demuxer then creates a GstStreamCollection object in which it +will put the list of GstStream it can expose. 
That collection is
+then both posted on the bus (via a GST_MESSAGE_STREAM_COLLECTION) and on
+each pad (via a GST_EVENT_STREAM_COLLECTION).
+
+That new collection must be posted on the bus *before* the changes
+are made available, i.e. before pads corresponding to that selection
+are added/removed.
+
+In order to be backwards-compatible and support elements that don't
+create streams/collection yet, the new 'parsebin' element used by
+decodebin3 will automatically create those if not provided.
+
+### Usage from application
+
+Applications can know what streams are available by listening to the
+GST_MESSAGE_STREAM_COLLECTION messages posted on the bus.
+
+The application can list the available streams per-type (such as all
+the audio streams, or all the video streams) by iterating the
+streams available in the collection by GST_STREAM_TYPE.
+
+The application will also be able to use this stream information to
+decide which streams should be activated or not (see the stream
+selection event below).
+
+### Backwards compatibility
+
+Not all demuxers will create the various GstStream and
+GstStreamCollection objects. In order to remain backwards
+compatible, a parent bin (parsebin in decodebin3) will create the
+GstStream and GstStreamCollection based on the pads being
+added/removed from an element.
+
+This allows providing stream listing/selection for any demuxer-like
+element even if it doesn't implement the GstStreamCollection usage.
+
+## Stream selection event
+
+* API: GST_EVENT_SELECT_STREAMS
+* API: gst_event_new_select_streams(...)
+* API: gst_event_parse_select_streams(...)
+
+Stream selection events are generated by the application and
+sent into the pipeline to configure the streams.
+
+The event carries:
+ * List of GstStreams to activate - a subset of the GstStreamCollection
+ * (Not implemented) - List of GstStreams to be kept discarded - a
+   subset of streams for which hot-swapping will not be desired,
+   allowing elements (such as decodebin3, demuxers, ...) to not parse or
+   buffer those streams at all.
+
+### Usage from application
+
+There are two use-cases where an application needs to specify in a
+generic fashion which streams it wants in output:
+
+1) When there are several streams present, of which it only wants a
+   subset (such as one audio, one video and one subtitle
+   stream). Those streams are demuxed and present in the pipeline.
+2) When the stream the user wants requires some element to undertake
+   some action to expose that stream in the pipeline (such as
+   DASH/HLS alternative streams).
+
+From the point of view of the application, those two use-cases are
+treated identically. The streams are all available through the
+GstStreamCollection posted on the bus, and it will select a subset.
+
+The application can select the streams it wants by creating a
+GST_EVENT_SELECT_STREAMS event with the list of stream-ids of the
+streams it wants. That event is then sent on the pipeline,
+eventually traveling all the way upstream from each sink.
+
+In some cases, selecting one stream may trigger the availability of
+other dependent streams, resulting in new GstStreamCollection
+messages. This can happen in the case where choosing a different DVB
+channel would create a new single-program collection.
+
+### Usage in elements
+
+Elements that receive the GST_EVENT_SELECT_STREAMS event and that
+can activate/deactivate streams need to look at the list of
+stream-ids contained in the event and decide if they need to do some
+action.
+
+In the standard demuxer case (demuxing and exposing all streams),
+there is nothing to do by default.
+
+In decodebin3, activating or deactivating streams is taken care of by
+linking only the streams present in the event to decoders and output
+ghostpads.
+
+In the case of elements that can expose alternate streams that are
+not present in the pipeline as pads, they will take the appropriate
+action to add/remove those streams.
+
+Containers that receive the event should pass it to any elements
+with no downstream peers, so that streams can be configured during
+pre-roll before a pipeline is completely linked down to sinks.
+
+## decodebin3 usage and example
+
+This is an example of how decodebin3 works by using the
+above-mentioned objects/events/messages.
+
+For clarity/completeness, we will consider an mpeg-ts stream that has
+multiple audio streams. Furthermore, that stream might have changes
+at some point (switching video codec, or adding/removing audio
+streams).
+
+### Initial differences
+
+decodebin3 is different, compared to decodebin2, in the sense that, by
+default:
+* it will only expose as output ghost source pads one stream of each
+  type (one audio, one video, ...).
+* It will only decode the exposed streams.
+
+The multiqueue element is still used and takes in all elementary
+(non-decoded) streams. If parsers are needed/present they are placed
+before the multiqueue. This is needed in order for multiqueue to
+work only with packetized and properly timestamped streams.
+
+Note that the whole typefinding of streams, and optional depayloading,
+demuxing and parsing are done in a new 'parsebin' element.
+
+Just like the current implementation, demuxers will expose all
+streams present within a program as source pads. They will connect
+to parsers and multiqueue.
+
+Initial setup. 1 video stream, 2 audio streams.
+
+```
+  +----------------------+
+  | parsebin             |
+  |  ---------           |   +------------+
+  |  | demux |-[parser]--+---| multiqueue |--[videodec]---[
+  |  |       |-[parser]--+---|            |
+  |  |       |-[parser]--+---|            |--[audiodec]---[
+  |  ---------           |   +------------+
+  +----------------------+
+```
+
+### GstStreamCollection
+
+When parsing the initial PAT/PMT, the demuxer will:
+1) create the various GstStream objects for each stream.
+2) create the GstStreamCollection for that initial PMT.
+3) post the GST_MESSAGE_STREAM_COLLECTION. Decodebin will intercept that
+message and know what the demuxer will be exposing.
+4) The demuxer creates the various pads and sends the corresponding
+STREAM_START event (with the same stream-id as the corresponding
+GstStream objects), CAPS event, and TAGS event.
+
+  * parsebin will add all relevant parsers and expose those streams.
+
+  * Decodebin will be able to correlate, based on STREAM_START event
+stream-id, what pad corresponds to which stream. It links each stream
+from parsebin to multiqueue.
+
+  * Decodebin knows all the streams that will be available. Since by
+default it is configured to only expose a stream of each type, it
+will pick a stream of each for which it will complete the
+auto-plugging (finding a decoder and then exposing that stream as a
+source ghostpad).
+
+> Note: If the demuxer doesn't create/post the GstStreamCollection,
+> parsebin will create it itself, as explained in section 2.3
+> above.
+
+### Changing the active selection from the application
+
+The user wants to change the audio track. The application received
+the GST_MESSAGE_STREAM_COLLECTION containing the list of available
+streams. For clarity, we will assume those stream-ids are
+"video-main", "audio-english" and "audio-french".
+
+The user prefers to use the french soundtrack (which it knows based
+on the language tag contained in the GstStream objects).
+
+The application will create and send a GST_EVENT_SELECT_STREAMS event
+containing the list of streams: "video-main", "audio-french".
+
+That event gets sent on the pipeline, the sinks send it upstream and
+it eventually reaches decodebin.
+
+Decodebin compares:
+* The currently active selection ("video-main", "audio-english")
+* The available stream collection ("video-main", "audio-english",
+  "audio-french")
+* The list of streams in the event ("video-main", "audio-french")
+
+Decodebin determines that no change is required for "video-main",
+but sees that it needs to deactivate "audio-english" and activate
+"audio-french".
+
+It unlinks the multiqueue source pad connected to the audiodec. Then
+it queries audiodec, using the GST_QUERY_ACCEPT_CAPS, whether it can
+accept as-is the caps from the "audio-french" stream.
+1) If it does, the multiqueue source pad corresponding to
+   "audio-french" is linked to the decoder.
+2) If it does not, the existing audio decoder is removed,
+   a new decoder is selected (like during initial
+   auto-plugging), and replaces the old audio decoder element.
+
+The newly selected stream gets decoded and output through the same
+pad as the previous audio stream.
+
+Note:
+The default behaviour would be to only expose one stream of each
+type. But nothing prevents decodebin from outputting more/fewer of
+each type if the GST_EVENT_SELECT_STREAMS event specifies that. This
+allows covering more use-cases than the simple playback one.
+Such examples could be:
+ * Wanting just a video stream or just an audio stream
+ * Wanting all decoded streams
+ * Wanting all audio streams
+ ...
+
+### Changes coming from upstream
+
+At some point in time, a PMT change happens. Let's assume a change
+in video-codec and/or PID.
+
+The demuxer creates a new GstStream for the changed/new stream,
+creates a new GstStreamCollection for the updated PMT and posts it.
+
+Decodebin sees the new GstStreamCollection message.
+
+The demuxer (and parsebin) then adds and removes pads.
+1) decodebin will match the new pads to GstStream in the "new"
+   GstStreamCollection the same way it did for the initial pads in
+   section 4.2 above.
+2) decodebin will see whether the new stream can re-use a multiqueue
+   slot used by a stream of the same type no longer present (it
+   compares the old collection to the new collection).
+   In this case, decodebin sees that the new video stream can re-use
+   the same slot as the previous video stream.
+3) If the new stream is going to be active by default (in this case
+   it is, because we are replacing the only video stream, which was
+   active), it will check whether the caps are compatible with the
+   existing videodec (in the same way it was done for the audio
+   decoder switch in section 4.3).
+
+Eventually, the stream that switched will be decoded and output
+through the same pad as the previous video stream in a gapless fashion.
+
+### Further examples
+
+#### HLS alternates
+
+There is a main (multi-bitrate or not) stream with audio and
+video interleaved in mpeg-ts. The manifest also indicates the
+presence of alternate language audio-only streams.
+HLS would expose one collection containing:
+1) The main A+V CONTAINER stream (mpeg-ts), initially active,
+   downloaded and exposed as a pad
+2) The alternate A-only streams, initially inactive and not exposed as pads
+
+The tsdemux element connected to the first stream will also expose
+a collection containing:
+1.1) A video stream
+1.2) An audio stream
+
+```
+   [ Collection 1 ]         [ Collection 2 ]
+   [ (hlsdemux)   ]         [ (tsdemux)    ]
+   [ upstream:nil ]    /----[ upstream:main]
+   [              ]   /     [              ]
+   [ "main" (A+V) ]<-/      [ "video" (V)  ]   viddec1 : "video"
+   [ "fre" (A)    ]         [ "eng" (A)    ]   auddec1 : "eng"
+   [ "kor" (A)    ]         [              ]
+```
+
+The user might want to use the korean audio track instead of the
+default english one.
+   => SELECT_STREAMS ("video", "kor")
+
+1) decodebin3 receives and sends the event further upstream
+2) tsdemux sees that "video" is part of its current upstream,
+   so adds the corresponding stream-id ("main") to the event
+   and sends it upstream ("main", "video", "kor")
+3) hlsdemux receives the event
+   => It activates "kor" in addition to "main"
+4) The event travels back to decodebin3, which will remember the
+   requested selection. If "kor" is already present, it will switch
+   the audio decoder from the "eng" stream to the "kor" stream.
+   If "kor" appears a bit later, decodebin3 will wait until that
+   stream is available before switching.
+
+#### multi-program MPEG-TS
+
+Assume the case of an mpeg-ts stream which contains multiple
+programs.
+There would be three "levels" of collection:
+ 1) The collection of programs present in the stream
+ 2) The collection of elementary streams present in a stream
+ 3) The collection of streams decodebin can expose
+
+Initially tsdemux exposes the first program present (the default).
+
+```
+ [ Collection 1 ]       [ Collection 2 ]          [ Collection 3 ]
+ [  (tsdemux)   ]       [   (tsdemux)  ]          [  (decodebin) ]
+ [ id:Programs  ]<-\    [ id:BBC1          ]<-\   [ id:BBC1-decoded ]
+ [ upstream:nil ]   \---[ upstream:Programs]   \--[ upstream:BBC1   ]
+ [              ]       [                  ]      [                 ]
+ [ "BBC1" (C)   ]       [ id:"bbcvideo"(V) ]      [ id:"bbcvideo"(V)]
+ [ "ITV" (C)    ]       [ id:"bbcaudio"(A) ]      [ id:"bbcaudio"(A)]
+ [ "NBC" (C)    ]       [                  ]      [                 ]
+```
+
+At some point the user wants to switch to ITV (of which we do not
+know the topology at this point in time). A SELECT_STREAMS event
+is sent with "ITV" in it and a pointer to Collection 1.
+1) The event travels up the pipeline until tsdemux receives it
+   and begins the switch.
+2) tsdemux publishes a new 'Collection 2a/ITV' and marks 'Collection 2/BBC'
+   as replaced.
+2a) App may send a SELECT_STREAMS event configuring which demuxer output
+    streams should be selected (parsed)
+3) tsdemux adds/removes pads as needed (flushing pads as it removes them?)
+4) Decodebin feeds new pad streams through existing parsers/decoders as
+   needed. As data from the new collection arrives out of each decoder,
+   decodebin sends new GstStreamCollection messages to the app so it
+   can know that the new streams are now switchable at that level.
+4a) As new GstStreamCollections are published, the app may override
+    the default decodebin stream selection to expose more/fewer streams.
+    The default is to decode and output 1 stream of each type.
+
+Final state:
+
+```
+ [ Collection 1 ]       [ Collection 4 ]          [ Collection 5 ]
+ [  (tsdemux)   ]       [   (tsdemux)  ]          [  (decodebin) ]
+ [ id:Programs  ]<-\    [ id:ITV           ]<-\   [ id:ITV-decoded  ]
+ [ upstream:nil ]   \---[ upstream:Programs]   \--[ upstream:ITV    ]
+ [              ]       [                  ]      [                 ]
+ [ "BBC1" (C)   ]       [ id:"itvvideo"(V) ]      [ id:"itvvideo"(V)]
+ [ "ITV" (C)    ]       [ id:"itvaudio"(A) ]      [ id:"itvaudio"(A)]
+ [ "NBC" (C)    ]       [                  ]      [                 ]
+```
+
+### TODO
+
+- Add missing implementation
+
+  - Add flags to GstStreamCollection
+
+  - Add mutual-exclusion and relationship API to GstStreamCollection
+
+- Add helper API to figure out whether a collection is a replacement
+of another or a completely new one. This will require a more generic
+system to know whether a certain stream-id is a replacement of
+another or not.
+
+### OPEN QUESTIONS
+
+- Is a FLUSHING flag for stream-selection required or not? This would
+make the handler of the SELECT\_STREAMS event send FLUSH START/STOP
+before switching to the other streams. This is tricky when dealing
+with situations where we keep some streams and only switch some
+others. Do we flush all streams? Do we only flush the new streams,
+potentially resulting in a delay before the switch fully completes?
+Furthermore, due to efficient buffering in decodebin3, the switching
+time has been reduced considerably, to the point where flushing might
+not bring a noticeable improvement.
+
+- Store the stream collection in bins/pipelines? 
A Bin/Pipeline could
+store all active collections internally, so that they could be queried
+later on. This could be useful to then get, on any pipeline, at any
+point in time, the full list of collections available without having
+to listen to all COLLECTION messages on the bus. This would require
+fixing the "is a collection a replacement or not" issue first.
+
+- When switching to new collections, should decodebin3 make any effort
+to *map* corresponding streams from the old to the new PMT - that is,
+try and stick to the *english* language audio track, for example?
+Alternatively, should it rely on the app to do such smarts with
+stream-select messages?
diff --git a/markdown/design/stream-status.md b/markdown/design/stream-status.md
new file mode 100644
index 0000000000..13174dae67
--- /dev/null
+++ b/markdown/design/stream-status.md
@@ -0,0 +1,106 @@
+# Stream Status
+
+This document describes the design and use cases for the stream status
+messages.
+
+STREAM_STATUS messages are posted on the bus when the state of a
+streaming thread changes. The purpose of this message is to allow the
+application to interact with the streaming thread properties, such as
+the thread priority or the threadpool to use.
+
+We accommodate the following requirements:
+
+  - Application is informed when a streaming thread is about to be
+    created. It should be possible for the application to suggest a
+    custom GstTaskPool.
+
+  - Application is informed when the status of a streaming thread is
+    changed. This can be interesting for GUI applications that want to
+    visualize the status of the streaming threads
+    (playing/paused/stopped).
+
+  - Application is informed when a streaming thread is destroyed.
+
+We allow for the following scenarios:
+
+  - Elements require a specific (internal) streaming thread to operate
+    or the application can create/specify a thread for the element.
+
+  - Elements allow the application to configure a priority on the
+    threads.
+
+## Use cases
+
+- boost the priority of the udp receiver streaming thread
+
+```
+ .--------.    .-------.    .------.    .-------.
+ | udpsrc |    | depay |    | adec |    | asink |
+ |      src->sink     src->sink    src->sink    |
+ '--------'    '-------'    '------'    '-------'
+```
+
+- when going from READY to PAUSED state, udpsrc will require a
+streaming thread for pushing data into the depayloader. It will
+post a STREAM_STATUS message indicating its requirement for a
+streaming thread.
+
+- The application will usually react to the STREAM_STATUS
+messages with a sync bus handler.
+
+- The application can configure the GstTask with a custom
+GstTaskPool to manage the streaming thread, or it can ignore the
+message, which will make the element use its default GstTaskPool.
+
+- The application can react to the ENTER/LEAVE stream status
+messages to configure the thread right before it is
+started/stopped. This can be used to configure the thread
+priority.
+
+- Before the GstTask changes state (start/pause/stop), a
+STREAM_STATUS message is posted that can be used by the
+application to keep track of the running streaming threads.
+
+## Messages
+
+The existing STREAM_STATUS message will be further defined and implemented in
+(selected) elements. The following fields will be contained in the message:
+
+  - **`type`**, GST_TYPE_STREAM_STATUS_TYPE:
+
+      - a set of types to control the lifecycle of the thread:
+        GST_STREAM_STATUS_TYPE_CREATE: a new streaming thread is going
+        to be created. The application has the chance to configure a custom
+        thread. GST_STREAM_STATUS_TYPE_ENTER: the streaming thread is
+        about to enter its loop function for the first time.
+        GST_STREAM_STATUS_TYPE_LEAVE: the streaming thread is about to
+        leave its loop. 
GST_STREAM_STATUS_TYPE_DESTROY: a streaming
+        thread is destroyed.
+
+      - A set of types to control the state of the threads:
+        GST_STREAM_STATUS_TYPE_START: a streaming thread is started.
+        GST_STREAM_STATUS_TYPE_PAUSE: a streaming thread is paused.
+        GST_STREAM_STATUS_TYPE_STOP: a streaming thread is stopped.
+
+  - **`owner`**: GST_TYPE_ELEMENT: The owner element of the thread. The
+    message source will contain the pad (or one of the pads) that this
+    thread will produce data on. If this thread does not produce data on
+    a pad, the message source will contain the owner as well. The idea
+    is that the application should be able to see from the element/pad
+    what function this thread has in the context of the application and
+    configure the thread appropriately.
+
+  - **`object`**: G_TYPE, GstTask/GThread: A GstTask/GThread controlling
+    this streaming thread.
+
+  - **`flow-return`**: GstFlowReturn: A status code for why the thread state
+    changed. When threads are created and started, this is usually
+    GST_FLOW_OK, but when they are stopping it contains the reason code
+    why they stopped.
+
+  - **`reason`**: G_TYPE_STRING: A string describing the reason why the
+    thread started/stopped/paused. Can be NULL if no reason is given.
+
+## Events
+
+FIXME
diff --git a/markdown/design/streams.md b/markdown/design/streams.md
new file mode 100644
index 0000000000..24f4f870e9
--- /dev/null
+++ b/markdown/design/streams.md
@@ -0,0 +1,82 @@
+# Streams
+
+ This document describes the objects that are passed from element to
+ element in the streaming thread.
+
+## Stream objects
+
+The following objects are to be expected in the streaming thread:
+
+  - events
+      - STREAM_START (START)
+      - SEGMENT (SEGMENT)
+      - EOS * (EOS)
+      - TAG (T)
+
+  - buffers * (B)
+
+Objects marked with * need to be synchronised to the clock in sinks and
+live sources.
+ +## Typical stream + +A typical stream starts with a stream start event that marks the +start of the stream, followed by a segment event that marks the +buffer timestamp range. After that buffers are sent one after the +other. After the last buffer an EOS marks the end of the stream. No +more buffers are to be processed after the EOS event. + +``` ++-----+-------+ +-++-+ +-+ +---+ +|START|SEGMENT| |B||B| ... |B| |EOS| ++-----+-------+ +-++-+ +-+ +---+ +``` + +1) **`STREAM_START`** + - marks the start of a stream; unlike the SEGMENT event, there + will be no STREAM_START event after flushing seeks. + +2) **`SEGMENT`**, rate, start/stop, time + - marks valid buffer timestamp range (start, stop) + - marks stream_time of buffers (time). This is the stream time of buffers + with a timestamp of S.start. + - marks playback rate (rate). This is the required playback rate. + - marks applied rate (applied_rate). This is the already applied playback + rate. (See also [trickmodes](design/trickmodes.md)) + - marks running_time of buffers. This is the time used to synchronize + against the clock. + +3) **N buffers** + - displayable buffers are between start/stop of the SEGMENT (S). Buffers + outside the segment range should be dropped or clipped. + + - running_time: + +``` + if (S.rate > 0.0) + running_time = (B.timestamp - S.start) / ABS (S.rate) + S.base + else + running_time = (S.stop - B.timestamp) / ABS (S.rate) + S.base +``` + + - a monotonically increasing value that can be used to synchronize + against the clock (See also + [synchronisation](design/synchronisation.md)). + + - stream_time: + * current position in stream between 0 and duration. + +``` + stream_time = (B.timestamp - S.start) * ABS (S.applied_rate) + S.time +``` + + +4) **`EOS`** + - marks the end of data, nothing is to be expected after EOS, elements + should refuse more data and return GST_FLOW_EOS. A FLUSH_STOP + event clears the EOS state of an element. 
+ +## Elements + +These events are generated typically either by the GstBaseSrc class for +sources operating in push mode, or by a parser/demuxer operating in +pull-mode and pushing parsed/demuxed data downstream. diff --git a/markdown/design/synchronisation.md b/markdown/design/synchronisation.md new file mode 100644 index 0000000000..1b5a8e0a3f --- /dev/null +++ b/markdown/design/synchronisation.md @@ -0,0 +1,271 @@ +# Synchronisation + +This document outlines the techniques used for doing synchronised +playback of multiple streams. + +Synchronisation in a GstPipeline is achieved using the following 3 +components: + + - a GstClock, which is global for all elements in a GstPipeline. + + - Timestamps on a GstBuffer. + + - the SEGMENT event preceding the buffers. + +## A GstClock + +This object provides a counter that represents the current time in +nanoseconds. This value is called the absolute\_time. + +Different sources exist for this counter: + + - the system time (with g\_get\_current\_time() and with microsecond + accuracy) + + - monotonic time (with g\_get\_monotonic\_time () with microsecond + accuracy) + + - an audio device (based on number of samples played) + + - a network source based on packets received + timestamps in those + packets (a typical example is an RTP source) + + - … + +In GStreamer any element can provide a GstClock object that can be used +in the pipeline. The GstPipeline object will select a clock from all the +providers and will distribute it to all other elements (see +[gstpipeline](design/gstpipeline.md)). + +A GstClock always counts time upwards and does not necessarily start at +0. + +While it is possible, it is not recommended to create a clock derived +from the contents of a stream (for example, create a clock from the PCR +in an mpeg-ts stream). + +## Running time + +After a pipeline selected a clock it will maintain the running\_time +based on the selected clock. 
This running\_time represents the total
+time spent in the PLAYING state and is calculated as follows:
+
+  - If the pipeline is NULL/READY, the running\_time is undefined.
+
+  - In PAUSED, the running\_time remains at the time when it was last
+    PAUSED. When the stream is PAUSED for the first time, the
+    running\_time is 0.
+
+  - In PLAYING, the running\_time is the delta between the
+    absolute\_time and the base time. The base time is defined as the
+    absolute\_time minus the running\_time at the time when the pipeline
+    is set to PLAYING.
+
+  - After a flushing seek, the running\_time is set to 0 (see
+    [seeking](design/seeking.md)). This is accomplished by redistributing a new
+    base\_time to the elements that got flushed.
+
+This algorithm captures the running\_time when the pipeline is set from
+PLAYING to PAUSED and restores this time based on the current
+absolute\_time when going back to PLAYING. This allows for both clocks
+that progress when in the PAUSED state (systemclock) and clocks that
+don’t (audioclock).
+
+The clock and pipeline now provide a running\_time to all elements that
+want to perform synchronisation. Indeed, the running time can be
+observed in each element (during the PLAYING state) as:
+
+```
+  C.running_time = absolute_time - base_time
+```
+
+We note C.running\_time as the running\_time obtained by looking at the
+clock. This value is monotonically increasing at the rate of the clock.
+
+## Timestamps
+
+The GstBuffer timestamps and the preceding SEGMENT event (see
+[streams](design/streams.md)) define a transformation of the buffer timestamps to
+running\_time as follows. The following notation is used:
+
+**B**: GstBuffer
+ - B.timestamp = buffer timestamp (GST_BUFFER_PTS or GST_BUFFER_DTS)
+
+**S**: SEGMENT event preceding the buffers.
+ - S.start: start field in the SEGMENT event. This is the lowest allowed
+   timestamp.
+ - S.stop: stop field in the SEGMENT event. This is the highest allowed
+   timestamp. 
+ - S.rate: rate field of the SEGMENT event. This is the playback rate.
+ - S.base: a base time for the segment. This is the total elapsed
+   running_time of any previous segments.
+ - S.offset: an offset to apply to S.start or S.stop. This is the amount
+   that has already elapsed in the segment.
+
+Valid buffers for synchronisation are those with B.timestamp between
+S.start and S.stop (after applying the S.offset). All other buffers
+outside this range should be dropped or clipped to these boundaries (see
+also [segments](design/segments.md)).
+
+The following transformations to running_time exist:
+
+```
+ if (S.rate > 0.0)
+   B.running_time = (B.timestamp - (S.start + S.offset)) / ABS (S.rate) + S.base
+   =>
+   B.timestamp = (B.running_time - S.base) * ABS (S.rate) + S.start + S.offset
+ else
+   B.running_time = ((S.stop - S.offset) - B.timestamp) / ABS (S.rate) + S.base
+   =>
+   B.timestamp = S.stop - S.offset - ((B.running_time - S.base) * ABS (S.rate))
+```
+
+We write B.running_time as the running_time obtained from the SEGMENT
+event and the buffers of that segment.
+
+The first displayable buffer will yield a value of 0 (since B.timestamp
+== S.start, S.offset == 0 and S.base == 0).
+
+For S.rate \> 1.0, the timestamps will be scaled down to increase the
+playback rate. Likewise, a rate between 0.0 and 1.0 will slow down
+playback.
+
+For negative rates, timestamps are received from S.stop to S.start so
+that the first buffer received will be transformed into a B.running\_time
+of 0 (B.timestamp == S.stop and S.base == 0).
+
+This makes it so that B.running\_time is always monotonically increasing
+starting from 0 with both positive and negative rates. 
+
+## Synchronisation
+
+As we have seen, we can get a running\_time:
+
+  - using the clock and the element’s base\_time with:
+
+```
+  C.running_time = absolute_time - base_time
+```
+
+- using the buffer timestamp and the preceding SEGMENT event as (assuming
+positive playback rate):
+
+```
+  B.running_time = (B.timestamp - (S.start + S.offset)) / ABS (S.rate) + S.base
+```
+
+We prefix C. and B. before the two running times to note how they were
+calculated.
+
+The task of synchronized playback is to make sure that we play a buffer
+with B.running\_time at the moment when the clock reaches the same
+C.running\_time.
+
+Thus the following must hold:
+
+```
+  B.running_time = C.running_time
+```
+
+expanding:
+
+```
+  B.running_time = absolute_time - base_time
+```
+
+or:
+
+```
+  absolute_time = B.running_time + base_time
+```
+
+The absolute\_time when a buffer with B.running\_time should be played
+is noted with B.sync\_time. Thus:
+
+```
+  B.sync_time = B.running_time + base_time
+```
+
+One then waits for the clock to reach B.sync\_time before rendering the
+buffer in the sink (see also [clocks](design/clocks.md)).
+
+For multiple streams this means that buffers with the same running\_time
+are to be displayed at the same time.
+
+A demuxer must make sure that the SEGMENT it emits on its output pads
+yields the same running\_time for buffers that should be played
+synchronized. This usually means sending the same SEGMENT on all pads
+and making sure that the synchronized buffers have the same timestamps.
+
+## Stream time
+
+The stream time is also known as the position in the stream and is a
+value between 0 and the total duration of the media file.
+
+It is the stream time that is used for:
+
+  - the position reported by the POSITION query in the pipeline
+
+  - the position used in seek events/queries
+
+  - the position used to synchronize controller values
+
+Additional fields in the SEGMENT are used:
+
+  - S.time: time field in the SEGMENT event. 
This is the stream-time of
+   S.start.
+
+  - S.applied\_rate: The rate already applied to the segment.
+
+Stream time is calculated using the buffer times and the preceding
+SEGMENT event as follows:
+
+```
+  stream_time = (B.timestamp - S.start) * ABS (S.applied_rate) + S.time
+  => B.timestamp = (stream_time - S.time) / ABS (S.applied_rate) + S.start
+```
+
+For negative rates, B.timestamp will go backwards from S.stop to
+S.start, making the stream time go backwards:
+
+```
+  stream_time = (S.stop - B.timestamp) * ABS (S.applied_rate) + S.time
+  => B.timestamp = S.stop - (stream_time - S.time) / ABS (S.applied_rate)
+```
+
+In the PLAYING state, it is also possible to use the pipeline clock to
+derive the current stream\_time.
+
+Using the two formulas above to match the clock times with buffer
+timestamps allows us to rewrite the above formula for stream\_time (for
+positive rates):
+
+```
+  C.running_time = absolute_time - base_time
+  B.running_time = (B.timestamp - (S.start + S.offset)) / ABS (S.rate) + S.base
+
+    =>
+  (B.timestamp - (S.start + S.offset)) / ABS (S.rate) + S.base = absolute_time - base_time
+
+    =>
+  (B.timestamp - (S.start + S.offset)) / ABS (S.rate) = absolute_time - base_time - S.base
+
+    =>
+  (B.timestamp - (S.start + S.offset)) = (absolute_time - base_time - S.base) * ABS (S.rate)
+
+    =>
+  (B.timestamp - S.start) = S.offset + (absolute_time - base_time - S.base) * ABS (S.rate)
+
+  filling (B.timestamp - S.start) in the above formula for stream time
+
+    =>
+  stream_time = (S.offset + (absolute_time - base_time - S.base) * ABS (S.rate)) * ABS (S.applied_rate) + S.time
+```
+
+This last formula is typically used in sinks to report the current
+position in an accurate and efficient way.
+
+Note that the stream time is never used for synchronisation against the
+clock. 
diff --git a/markdown/design/toc.md b/markdown/design/toc.md
new file mode 100644
index 0000000000..c833bb1131
--- /dev/null
+++ b/markdown/design/toc.md
@@ -0,0 +1,226 @@
+# Implementing GstToc support in GStreamer elements
+
+## General info about GstToc structure
+
+GstToc introduces a general way to handle chapters within multimedia
+formats. A GstToc can be represented as a tree structure with arbitrary
+hierarchy. A tree item can be one of two types: sequence or
+alternative. Sequence types act like a part of the media data, for
+example an audio track in a CUE sheet, or a part of the movie.
+Alternative types act like a selection to process a different version
+of the media content, for example DVD angles. GstToc has one constraint
+on the tree structure: it does not allow different entry types on the
+same level of the hierarchy, i.e. you shouldn’t have editions and
+chapters mixed together. Here is an example of a valid TOC:
+
+```
+       ------- TOC -------
+      /                   \
+  edition1             edition2
+      |                    |
+  -chapter1            -chapter3
+  -chapter2
+```
+
+Here are two editions (alternatives), the first contains two chapters
+(sequence type), and the second has only one chapter. And here is an
+example of an invalid TOC:
+
+```
+       ------- TOC -------
+      /                   \
+  edition1             chapter1
+      |
+  -chapter1
+  -chapter2
+```
+
+Here you have edition1 and chapter1 mixed on the same level of the
+hierarchy, and such a TOC will be considered broken.
+
+GstToc has an *entries* field of GList type which consists of children
+items. Each item is of type GstTocEntry. GstToc also has a list of tags
+and a GstStructure called *info*. Please use the GstToc.info and
+GstTocEntry.info fields this way: create a GstStructure, put all info
+related to your element there and put this structure into the *info*
+field under the name of your element. Some fields in the *info*
+structure can be used for internal purposes, so you should use it in
+the way described above so as not to overwrite already existing fields.
+
+Let’s look at GstTocEntry a bit closer. One of the most important fields
+is *uid*, which must be unique for each item within the TOC. This is
+used to identify each item inside the TOC, especially when the element
+receives a TOC select event with a UID to seek to. The *subentries*
+field of type GList contains children items of type GstTocEntry. Thus
+you can achieve an arbitrary hierarchy level. The *type* field can be
+either GST\_TOC\_ENTRY\_TYPE\_CHAPTER or GST\_TOC\_ENTRY\_TYPE\_EDITION,
+which corresponds to the chapter or edition type of item respectively.
+The *tags* field is a list of tags related to the item. And the *info*
+field is similar to the GstToc.info described above.
+
+So, a little more about managing GstToc. Use gst\_toc\_new() and
+gst\_toc\_unref() to create/free it. A GstTocEntry can be created using
+gst\_toc\_entry\_new(). While building a GstToc you can set start and
+stop timestamps for each item using gst\_toc\_entry\_set\_start\_stop()
+and loop\_type and repeat\_count using gst\_toc\_entry\_set\_loop(). The
+best way to process an already created GstToc is to recursively go
+through the *entries* and *subentries* fields.
+
+Applications and plugins should not rely on TOCs having a certain kind
+of structure, but should allow for different alternatives. For example,
+a simple CUE sheet embedded in a file may be presented as a flat list of
+track entries, or could have a top-level edition node (or some other
+alternative type entry) with track entries underneath that node; or even
+multiple top-level edition nodes (or some other alternative type
+entries) each with track entries underneath, in case the source file
+has extracted a track listing from different sources.
+
+## TOC scope: global and current
+
+There are two main consumers for TOC information: applications and
+elements in the pipeline that are TOC writers (such as e.g.
+matroskamux).
+
+Applications typically want to know the entire table of contents (TOC)
+with all entries that can possibly be selected. 
+
+TOC writers in the pipeline, however, would not want to write a TOC for
+all possible/available streams, but only for the current stream.
+
+When transcoding a title from a DVD, for example, the application would
+still want to know the entire TOC, with all titles, the chapters for
+each title, and the available angles. When transcoding to a file, we
+only want the TOC information that is relevant to the transcoded stream
+to be written into the file structure, e.g. the chapters of the title
+being transcoded (or possibly only chapters 5-7 if only those have been
+selected for playback/transcoding).
+
+This is why we may need to create two different TOCs for those two types
+of consumers.
+
+Elements that extract TOC information should send TOC events downstream.
+
+Like with tags, sinks will post a TOC message on the bus for the
+application with the global TOC, once a global TOC event reaches the
+sink.
+
+## Working with GstMessage
+
+If a table of contents is available, applications will receive a TOC
+message on the pipeline’s GstBus.
+
+A TOC message will be posted on the bus by sinks when they receive a TOC
+event containing a TOC with global scope. Elements extracting TOCs
+should not post a TOC message themselves, but send a TOC event
+downstream.
+
+The reason for this is that there may be cascades of TOCs (e.g. a zip
+archive containing multiple matroska files, each with a TOC).
+
+A GstMessage with a GstToc can be created using gst\_message\_new\_toc()
+and parsed with gst\_message\_parse\_toc(). The *updated* parameter in
+these methods indicates whether the TOC was just discovered (set to
+false) or was already found and has been updated (set to true). This
+message will typically be posted to the pipeline by sinks in case you
+have discovered TOC data within your element. 
+
+## Working with GstEvent
+
+There are two types of TOC-related events:
+
+  - downstream TOC events that contain TOC information and travel
+    downstream
+
+  - toc-select events that travel upstream and can be used to select a
+    certain TOC entry for playback (similar to seek events)
+
+GstToc supports select events through the GstEvent infrastructure. The
+idea is the following: when you receive a TOC select event, parse it
+with gst\_event\_parse\_toc\_select() and seek the stream (if it is not
+streamable) to the specified TOC UID (you can use
+gst\_toc\_find\_entry() to find an entry in the TOC by UID). To create a
+TOC select event use gst\_event\_new\_toc\_select(). The common action
+on such an event is to seek to the specified UID within your element.
+
+## Implementation coverage, Specifications, …
+
+Below is a list of container formats, links to documentation and a
+summary of toc related features. Each section title also indicates
+whether reading/writing a toc is implemented. Below, hollow bullet
+points *o* indicate no support and filled bullets *\** indicate that
+the feature is handled.
+
+### AIFC: -/-
+
+  o *MARK*
+  o *INST*
+
+The *MARK* chunk defines a list of (cue-id, position\_in\_samples,
+label).
+
+The *INST* chunk contains a sustainLoop and releaseLoop, each consisting
+of (loop-type, cue-begin, cue-end).
+
+### FLAC: read/write
+
+  \* METADATA\_BLOCK\_CUESHEET
+  \* CUESHEET\_TRACK
+  o CUESHEET\_TRACK\_INDEX
+
+Both CUESHEET\_TRACK and CUESHEET\_TRACK\_INDEX have a (relative) offset
+in samples. CUESHEET\_TRACK has ISRC metadata.
+
+### MKV: read/write
+
+  \* Chapters and Editions each having a uid
+  \* Chapters have start/end time and metadata: ChapString,
+    ChapLanguage, ChapCountry
+
+### MP4:
+
+  \* elst
+
+The *elst* atom contains a list of edits. Each edit consists of (length,
+start, play-back speed).
+
+### OGG: -/-
+
+  o VorbisComment
+
+Fields called CHAPTERxxx and CHAPTERxxxNAME with xxx being a number
+between 000 and 999. 
+### WAV: read/write
+
+  \* *cue*
+  o *plst*
+  \* *adtl*
+  \* *labl*
+  \* *note*
+  o *ltxt*
+  o *smpl*
+
+The *cue* chunk defines a list of markers in the stream with cue-ids.
+The *smpl* chunk defines a list of regions in the stream with cue-ids
+in the same namespace (?).
+
+The various *adtl* chunks: *labl*, *note* and *ltxt* refer to the
+cue-ids.
+
+A *plst* chunk defines a sequence of segments (cue-id, length\_samples,
+repeats). The *smpl* chunk defines a list of loops (cue-id, beg, end,
+loop-type, repeats).
+
+## Conclusion/Ideas/Future work
+
+Based on the data of chapter 5, a few thoughts and observations that can
+be used to extend and refine our API. The notes below do not reflect the
+current implementation.
+
+All formats have a table of \[cue-id, cue-start, (cue-end), (extra
+tags)\]:
+
+  - cue-id is commonly represented as an unsigned 32-bit int
+  - cue-end is optional
+  - extra tags could be represented as a structure/taglist
+
+Many formats have metadata that references the cue-table:
+
+  - loops in instruments in wav, aifc
+  - edit lists in wav, mp4
+
+For mp4.edtl, wav.plst we could expose two editions:
+1) the edit list is flattened: default, for playback
+2) the stream has the raw data and the edit list is there as chapter
+   markers: useful for editing software
+
+We might want to introduce a new GST\_TOC\_ENTRY\_TYPE\_MARKER or
+\_CUE. This would be a sequence entry-type and it would not be used for
+navigational purposes, but to attach data to a point in time (envelopes,
+loops, …).
+
+API wise there is some overlap between:
+
+  - exposing multiple audio/video tracks as pads or as ToC editions.
+    For ToC editions, we have the TocSelect event.
  - exposing subtitles as a sparse stream or as a ToC
+    sequence of markers with labels
diff --git a/markdown/design/tracing.md b/markdown/design/tracing.md
new file mode 100644
index 0000000000..4dc66af4eb
--- /dev/null
+++ b/markdown/design/tracing.md
@@ -0,0 +1,405 @@
+# Tracing
+
+This subsystem will provide a mechanism to get structured tracing info
+from GStreamer applications. This can be used for post-run analysis as
+well as for live introspection.
+
+# Use cases
+
+  - I’d like to get statistics from a running application.
+
+  - I’d like to understand which parts of my pipeline use how many
+    resources.
+
+  - I’d like to know which parts of the pipeline use how much memory.
+
+  - I’d like to know about ref-counts of parts in the pipeline to find
+    ref-count issues.
+
+# Non use-cases
+
+  - Some element in the pipeline does not play by the rules; find out
+    which one. This could be done with generic tests.
+
+# Design
+
+The system brings the following new items:
+
+  - core hooks: probes in the core api that will expose internal state
+    when tracing is in use
+  - tracers: plugin features that can process data from the hooks and
+    emit a log
+  - tracing front-ends: applications that consume logs from tracers
+
+Like the logging, the tracer hooks can be compiled out and, if not, use
+a local condition to check if tracing is active.
+
+Certain GStreamer core functions (such as gst_pad_push or
+gst_element_add_pad) will call into the tracer subsystem to dispatch
+into active tracing modules. Developers will be able to select a list of
+plugins by setting an environment variable, such as
+GST_TRACERS="meminfo;dbus". One can also pass parameters to plugins:
+GST_TRACERS="log(events,buffers);stats(all)". When the plugins are
+loaded, we’ll add them to certain hooks according to which they are
+interested in.
+
+Right now tracing info is logged as GstStructures to the TRACE level.
+Idea: Another env var GST_TRACE_CHANNEL could be used to send the
+tracing to a file or a socket.
See the linked discussion for more on these environment variables.

# Hook api

We’ll wrap interesting api calls with two macros, e.g. gst_pad_push():

``` c
GstFlowReturn
gst_pad_push (GstPad * pad, GstBuffer * buffer)
{
  GstFlowReturn res;

  g_return_val_if_fail (GST_IS_PAD (pad), GST_FLOW_ERROR);
  g_return_val_if_fail (GST_PAD_IS_SRC (pad), GST_FLOW_ERROR);
  g_return_val_if_fail (GST_IS_BUFFER (buffer), GST_FLOW_ERROR);

  GST_TRACER_PAD_PUSH_PRE (pad, buffer);
  res = gst_pad_push_data (pad,
      GST_PAD_PROBE_TYPE_BUFFER | GST_PAD_PROBE_TYPE_PUSH, buffer);
  GST_TRACER_PAD_PUSH_POST (pad, res);
  return res;
}
```

TODO(ensonic): gcc has some magic for wrapping functions.

TODO(ensonic): we should eval if we can use something like jump_label
in the kernel.

TODO(ensonic): liblttng-ust provides such a mechanism for user-space,
but this is mostly about logging traces and it is linux specific :/

In addition to api hooks we should also provide timer hooks. Interval
timers are useful to get e.g. resource usage snapshots. Also absolute
timers might make sense. All this could be implemented with a clock
thread. We can use another env-var GST_TRACE_TIMERS="100ms,75ms" to
configure timers and then pass them to the tracers like
GST_TRACERS="rusage(timer=100ms);meminfo(timer=75ms)". Maybe we can
create them ad-hoc and avoid the GST_TRACE_TIMERS var.

Hooks (* already implemented)

* gst_bin_add
* gst_bin_remove
* gst_element_add_pad
* gst_element_post_message
* gst_element_query
* gst_element_remove_pad
* gst_element_factory_make
* gst_pad_link
* gst_pad_pull_range
* gst_pad_push
* gst_pad_push_list
* gst_pad_push_event
* gst_pad_unlink

## Tracer api

Tracers are plugin features. They have a simple api:

class init: here the tracers describe the data they will emit.

instance init: tracers attach handlers to one or more hooks using
gst_tracing_register_hook().
In case they are configurable, they can
read the options from the *params* property. This is the extra detail
from the environment var.

hook functions: hooks marshal the parameters given to a trace hook into
varargs and also add some extra info such as a timestamp. Hooks will be
called from misc threads. The trace plugins should only consume (=read)
the provided data. Expensive computation should be avoided so as not to
affect the execution too much. Most trace plugins will log data to a
trace channel.

instance destruction: tracers can output results and release data. This
would ideally be done at the end of the application, but gst_deinit()
is not mandatory. gst_tracelib was using a gcc_destructor. Ideally
tracer modules log data as they have it and leave aggregation to a
tool that processes the log.

## tracer event classes

Most tracers will log some kind of *events*: a data transfer, an event,
a message, a query or a measurement. Every tracer should describe the
data format. This way tools that process tracer logs can show the data
in a meaningful way without having to know about the tracer plugin.

One way would be to introspect the data from the plugin. This has the
disadvantage that the postprocessing app needs to load the plugins or
talk to the gstreamer registry. An alternative is to also log the format
description into the log. Right now we’re logging several nested
GstStructures from the `tracer_class_init()` function (except in the
log tracer).
```
gst_tracer_record_new ("thread-rusage.class",
    // value in the log record (order does not matter)
    // *thread-id* is a *key* to relate the record to something as indicated
    // by the *scope* substructure
    "thread-id", GST_TYPE_STRUCTURE, gst_structure_new ("scope",
        "type", G_TYPE_GTYPE, G_TYPE_GUINT64,
        "related-to", GST_TYPE_TRACER_VALUE_SCOPE, GST_TRACER_VALUE_SCOPE_THREAD,
        NULL),
    // next value in the record
    // *average-cpuload* is a measurement as indicated by the *value*
    // substructure
    "average-cpuload", GST_TYPE_STRUCTURE, gst_structure_new ("value",
        // value type
        "type", G_TYPE_GTYPE, G_TYPE_UINT,
        // human readable description, that can be used as a graph label
        "description", G_TYPE_STRING, "average cpu usage per thread",
        // flags that help to use the right graph type
        // flags { aggregated, windowed, cumulative, … }
        "flags", GST_TYPE_TRACER_VALUE_FLAGS, GST_TRACER_VALUE_FLAGS_AGGREGATED,
        // value range
        "min", G_TYPE_UINT, 0,
        "max", G_TYPE_UINT, 100,
        NULL),
    … NULL);
```

A few ideas that are not yet in the above spec:

  - it would be nice to describe the unit of values
      - putting it into the description is not flexible though, e.g.
        time would be a guint64 but a ui would reformat it to e.g.
        h:m:s.ms
      - other units are e.g.: percent, per-mille, or kbit/s
  - we’d like to have some metadata on scopes
      - e.g. we’d like to log the thread-names, so that a UI can show
        that instead of thread-ids
  - the stats tracer logs *new-element* and *new-pad* messages
      - they add a unique *ix* to each instance as the memory ptr can be
        reused for new instances; the data is attached to the objects as
        qdata
      - the latency tracer would like to also reference this metadata
  - right now we log the classes as structures
      - this is important so that the log is self contained
      - it would be nice to add them to the registry, so that
        gst-inspect can show them

We could also consider adding each value as a READABLE gobject property.
The property has name/description. We could use qdata for scope and
flags (or have some new property flags). We would also need a new
"notify" signal, so that value-change notifications would include a
time-stamp. This way the tracers would not need to be aware of the
logging. The core tracer would register the notify handlers and emit the
log. Or we just add a gst_tracer_class_install_event() that
mimics g_object_class_install_property().

Frontends can:

  - do an events-over-time histogram
  - plot curves of values over time or deltas
  - show gauges
  - collect statistics (min, max, avg, …)

We can have some under gstreamer/plugins/tracers/

## latency

  - register to buffer and event flow

  - send custom event on buffer flow at source elements

  - catch events on event transfer at sink elements

## meminfo (not yet implemented)

  - register to an interval-timer hook.
  - call mallinfo() and log memory usage

## rusage

  - register to an interval-timer hook.

  - call getrusage() and log resource usage

## dbus (not yet implemented)

  - provide a dbus iface to announce applications that are traced
  - tracing UIs can use the dbus iface to find the channels where logging and
    tracing is getting logged to
  - one would start the tracing UI first and when the application is started with
    tracing activated, the dbus plugin will announce the new application,
    upon which the tracing UI can start reading from the log channels; this
    avoids missing some data

## topology (not yet implemented)

  - register to pipeline topology hooks

  - tracing UIs can show a live pipeline graph

## stats

  - register to buffer, event, message and query flow

  - tracing apps can do e.g.
statistics

## refcounts (not yet implemented)

  - log ref-counts of objects
  - just logging them outside of glib/gobject would still make it hard to detect
    issues though

## opengl (not yet implemented)

  - upload/download times

  - there is no hardware-agnostic way to get e.g. memory usage info (gl
    extensions)

## memory (not yet implemented)

  - trace live instances (and pointers to the memory)
  - use an atexit handler to dump leaked instances
    https://bugzilla.gnome.org/show_bug.cgi?id=756760#c6

## leaks

  - track creation/destruction of GstObject and GstMiniObject

  - log those which are still alive when the app is exiting and raise an
    error if any

  - If the GST_LEAKS_TRACER_SIG env variable is defined the tracer
    will handle the following UNIX signals:

      - SIGUSR1: log alive objects

      - SIGUSR2: create a checkpoint and print a list of objects created and
        destroyed since the previous checkpoint.

  - If the GST_LEAKS_TRACER_STACK_TRACE env variable is defined, log
    the creation stack trace of leaked objects. This may significantly
    increase memory consumption.

## gst-debug-viewer

gst-debug-viewer could be given the trace log in addition to the debug
log (or a combined log). Alternatively it would show a dialog that shows
all local apps (if the dbus plugin is loaded) and read the log streams
from the sockets/files that are configured for the app.

## gst-tracer

Counterpart of gst-tracelib-ui.

## gst-stats

A terminal app that shows summary/running stats like the summary
gst-tracelib shows at the end of a run. Currently it only shows an
aggregated status.

## live-graphers

Maybe we can even feed the log into existing live graphers, with a
little driver.

  - should tracers log into the debug.log or into a separate log?

  - separate log

  - use a binary format?
  - worse performance (we’re writing two logs at the same time)

  - need to be careful when people set GST_DEBUG_CHANNEL=stderr and
    GST_TRACE_CHANNEL=stderr (use a shared channel, but what about the
    formats?)

  - debug log

  - the tracer subsystem would need to log the GST_TRACE at a level
    that is active

  - should the tracer call gst_debug_category_set_threshold() to
    ensure things work, even though the levels don’t make a lot of sense
    here

  - make logging a tracer (a hook in gst_debug_log_valist, move
    gst_debug_log_default() to the tracer module)

  - log all debug log to the tracer log; some of the current logging
    statements can be replaced by generic logging as shown in the
    log-tracer

  - add tools/gst-debug to extract a human readable debug log from the
    trace log

  - we could maintain a list of log functions, where
    gst_tracer_log_trace() is the default one. This way e.g.
    gst-validate could consume the traces directly.

  - when hooking into a timer, should we just have some predefined
    intervals?

  - can we add a tracer module that registers the timer hook? Then we
    could do GST_TRACER="timer(10ms);rusage". Right now the tracer
    hooks are defined as an enum though.

  - when connecting to a running app, we can’t easily get the *current*
    state if logging is using a socket, as past events are not
    explicitly stored. We could determine the current topology and emit
    events with GST_CLOCK_TIME_NONE as ts to indicate that the events
    are synthetic.

  - we need stable ids for scopes (threads, elements, pads)

      - the address can be reused

      - we can use gst_util_seqnum_next()

      - something like gst_object_get_path_string() won’t work as
        objects are initially without parent

  - right now the tracing-hooks are enabled/disabled from configure with
    --{enable,disable}-gst-tracer-hooks. The tracer code and the plugins
    are still built though.
We should add a
    --{enable,disable}-gst-tracer option to disable the whole system,
    although this is a bit confusing with the --{enable,disable}-trace
    option we have already.

## Try it

### Traces for buffer flow in TRACE level:

    GST_DEBUG="GST_TRACER:7,GST_BUFFER*:7,GST_EVENT:7,GST_MESSAGE:7" \
    GST_TRACERS=log gst-launch-1.0 fakesrc num-buffers=10 ! fakesink

### Print some pipeline stats on exit:

    GST_DEBUG="GST_TRACER:7" GST_TRACERS="stats;rusage" \
    GST_DEBUG_FILE=trace.log gst-launch-1.0 fakesrc num-buffers=10 \
    sizetype=fixed ! queue ! fakesink && gst-stats-1.0 trace.log

### get ts, average-cpuload, current-cpuload, time and plot

    GST_DEBUG="GST_TRACER:7" GST_TRACERS="stats;rusage" \
    GST_DEBUG_FILE=trace.log /usr/bin/gst-play-1.0 $HOME/Videos/movie.mp4
    ./scripts/gst-plot-traces.sh --format=png | gnuplot
    eog trace.log.*.png

### print processing latencies

    GST_DEBUG="GST_TRACER:7" GST_TRACERS=latency gst-launch-1.0 \
    audiotestsrc num-buffers=10 ! audioconvert ! volume volume=0.7 ! \
    autoaudiosink

### Raise a warning if a leak is detected

    GST_TRACERS="leaks" gst-launch-1.0 videotestsrc num-buffers=10 ! \
    fakesink

### check if any GstEvent or GstMessage is leaked and raise a warning

    GST_DEBUG="GST_TRACER:7" GST_TRACERS="leaks(GstEvent,GstMessage)" \
    gst-launch-1.0 videotestsrc num-buffers=10 ! fakesink

# Performance

    run ./tests/benchmarks/tracing.sh

    egrep -c "(proc|thread)-rusage" trace.log
    658618
    grep -c "gst_tracer_log_trace" trace.log
    823351

  - we can optimize most of it by using quarks in structures or
    eventually avoiding structures totally

diff --git a/markdown/design/trickmodes.md b/markdown/design/trickmodes.md
new file mode 100644
index 0000000000..4f5da78665
--- /dev/null
+++ b/markdown/design/trickmodes.md
@@ -0,0 +1,235 @@

# Trickmodes

GStreamer provides API for performing various trickmode playback.
This includes:

  - server side trickmodes

  - client side fast/slow forward playback

  - client side fast/slow backwards playback

Server side trickmodes mean that a source (network source) can provide a
stream with different playback speed and direction. The client does not
have to perform any special algorithms to decode this stream.

Client side trickmodes mean that the decoding client (GStreamer)
performs the needed algorithms to change the direction and speed of the
media file.

Seeking can be done both in a playback pipeline and in a transcoding
pipeline.

## General seeking overview

Consider a typical playback pipeline:

```
                         .---------.   .------.
            .-------.    | decoder |-->| sink |
.--------.  |       |--->'---------'   '------'
| source |->| demux |
'--------'  |       |--->.---------.   .------.
            '-------'    | decoder |-->| sink |
                         '---------'   '------'
```

The pipeline is initially configured to play back at speed 1.0 starting
from position 0 and stopping at the total duration of the file.

When performing a seek, the following steps have to be taken by the
application:

### Create a seek event

The seek event contains:

  - various flags describing:

      - where to seek to (KEY\_UNIT)

      - how accurate the seek should be (ACCURATE)

      - how to perform the seek (FLUSH)

      - what to do when the stop position is reached (SEGMENT)

      - extra playback options (SKIP)

  - a format to seek in; this can be time, bytes, units (frames,
    samples), …

  - a playback rate; 1.0 is normal playback speed, positive values
    bigger than 1.0 mean fast playback, negative values mean reverse
    playback. A playback speed of 0.0 is not allowed (but is equivalent
    to PAUSING the pipeline).

  - a start position; this value has to be between 0 and the total
    duration of the file. It can also be relative to the previously
    configured start value.

  - a stop position; this value has to be between 0 and the total
    duration.
It can also be relative to the previously configured stop
    value.

See also gst\_event\_new\_seek().

### Send the seek event

Send the new seek event to the pipeline with
gst\_element\_send\_event().

By default the pipeline will send the event to all sink elements. By
default an element will forward the event upstream on all sinkpads.
Elements can modify the format of the seek event. The most common format
is GST\_FORMAT\_TIME.

One element will actually perform the seek; this is usually the demuxer
or source element. For more information on how to perform the different
seek types see [seeking](design/seeking.md).

For client side trickmode, a SEGMENT event will be sent downstream with
the new rate and start/stop positions. All elements prepare themselves
to handle the rate (see below). The applied rate of the SEGMENT event
will be set to 1.0 to indicate that no rate adjustment has been done.

For server side trick mode, a SEGMENT event is sent downstream with a
rate of 1.0 and the start/stop positions. The elements will configure
themselves for normal playback speed since the server will perform the
rate conversions. The applied rate will be set to the rate that will be
applied by the server. This is done to ensure that the position
reporting performed in the sink is aware of the trick mode.

When the seek succeeds, the \_send\_event() function will return TRUE.

## Server side trickmode

The source element operates in push mode. It can reopen a server
connection requesting a new byte or time position and a new playback
speed. The capabilities can be queried from the server when the
connection is opened.

We assume the source element is derived from the GstPushSrc base class.
The base source should be configured with gst\_base\_src\_set\_format
(src, GST\_FORMAT\_TIME).

The do\_seek method will be called on the push src subclass with the
seek information passed in the GstSegment argument.
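The two SEGMENT configurations described above can be summarized in a
small sketch. This is plain C with a simplified stand-in for the rate
fields of a GstSegment; the struct and function names are illustrative,
not GStreamer API:

``` c
#include <assert.h>

/* Simplified stand-in for the rate fields of a GstSegment. */
typedef struct {
  double rate;         /* rate downstream elements still have to apply */
  double applied_rate; /* rate already applied to the media */
} SegmentRates;

/* Client side trick mode: downstream elements perform the conversion,
 * so the full requested rate travels in the segment. */
static SegmentRates
segment_for_client_side (double requested_rate)
{
  SegmentRates s = { requested_rate, 1.0 };
  return s;
}

/* Server side trick mode: the server already delivers converted data,
 * so the segment carries rate 1.0 and records the server's rate as
 * the applied rate. */
static SegmentRates
segment_for_server_side (double requested_rate)
{
  SegmentRates s = { 1.0, requested_rate };
  return s;
}
```

In both cases the product of rate and applied rate equals the rate the
application requested in the seek event.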
The rate value in the segment should be used to reopen the connection to
the server requesting data at the new speed and possibly a new playback
position.

When the server connection has been successfully reopened, set the rate
of the segment to 1.0 so that the client side trickmode is not enabled.
The applied rate in the segment is set to the rate transformation done
by the server.

Alternatively a combination of client side and server side trickmode can
be used: for example, if the server does not support certain rates, the
client can perform rate conversion for the remainder.

```
            source                server
do_seek       |                      |
  ----------->|                      |
              |  reopen connection   |
              |--------------------->|
              |                      .
              |       success        .
              |<---------------------|
modify        |                      |
rate to 1.0   |                      |
              |                      |
return        |                      |
TRUE          |                      |
              |                      |
```

After performing the seek, the source will inform the downstream
elements of the new segment that is to be played back. Since the segment
will have a rate of 1.0, no client side trick modes are enabled. The
segment will have an applied rate different from 1.0 to indicate that
the media contains data with non-standard playback speed or direction.

## client side forward trickmodes

The seek happens as stated above. A SEGMENT event is sent downstream
with a rate different from 1.0. Plugins receiving the SEGMENT can decide
to perform the rate conversion of the media data (retimestamp video
frames, resample audio, …).

If a plugin decides to resample or retimestamp, it should modify the
SEGMENT with a rate of 1.0 and update the applied rate so that
downstream elements don’t resample again but are aware that the media
has been modified.

The GStreamer base audio and video sinks will resample automatically if
they receive a SEGMENT event with a rate different from 1.0. The
position reporting in the base audio and video sinks will also depend on
the applied rate of the segment information.
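The retimestamping described above reduces to scaling the stream time
since the segment start by the rate. The sketch below is plain C with an
illustrative helper name; real elements operate on GstBuffer timestamps
and would also set the segment rate to 1.0 while folding the old rate
into the applied rate:

``` c
#include <assert.h>
#include <stdint.h>

/* Scale a timestamp for client side forward trick mode: divide the
 * stream time elapsed since the segment start by the rate, so the
 * retimestamped stream can be played back at normal speed. */
static uint64_t
retimestamp (uint64_t ts, uint64_t segment_start, double rate)
{
  return segment_start + (uint64_t) ((double) (ts - segment_start) / rate);
}
```

At rate 2.0 a buffer 2 seconds into the segment lands 1 second in after
retimestamping; at rate 0.5 it lands 4 seconds in.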
When the SKIP flag is set, frames can be dropped in the elements. If S
is the speedup factor, a good algorithm for implementing frame skipping
is to send audio in chunks of Nms (usually 300ms is good) and then skip
((S-1) \* Nms) of audio data. For the video we send only the keyframes
in the (S \* Nms) interval. In this case, the demuxer would scale the
timestamps and would set an applied rate of S.

## client side backwards trickmode

For backwards playback the following rules apply:

  - the rate in the SEGMENT is less than 0.0.

  - the SEGMENT start position is less than the stop position; playback
    will however happen from stop to start in reverse.

  - the time member in the SEGMENT is set to the stream time of the
    start position.

For plugins the following rules apply:

  - A source plugin sends data in chunks starting from the last chunk of
    the file. The actual bytes are not reversed. Each chunk that is not
    forward continuous with the previous chunk is marked with a DISCONT
    flag.

  - A demuxer accumulates the chunks. As soon as a keyframe is found,
    everything starting from the keyframe up to the accumulated data is
    sent downstream. Timestamps on the buffers are set starting from the
    stop position to start, effectively going backwards. Chunks are
    marked with DISCONT when they are not forward continuous with the
    previous buffer.

  - A video decoder decodes and accumulates all decoded frames. If a
    buffer with a DISCONT, SEGMENT or EOS is received, all accumulated
    frames are sent downstream in reverse.

  - An audio decoder decodes and accumulates all decoded audio. If a
    buffer with a DISCONT, SEGMENT or EOS is received, all accumulated
    audio is sent downstream in reverse order. Some audio codecs need
    the previous data buffer to decode the current one; in that case,
    the previous DISCONT buffer needs to be combined with the last
    non-DISCONT buffer to generate the last bit of output.
  - A sink reverses (for audio) and retimestamps (audio, video) the
    buffers before playing them back. Retimestamping occurs relative to
    the stop position, making the timestamps increase again and suitable
    for synchronizing against the clock. Audio sinks also have to
    perform simple resampling before playing the samples.

  - For transcoding, audio and video resamplers can be used to reverse,
    resample and retimestamp the buffers. Any rate adjustments performed
    on the media must be added to the applied\_rate and subtracted from
    the rate members in the SEGMENT event.

In SKIP mode, the same algorithm as for forward SKIP mode can be used.

## Notes

  - The clock/running\_time keeps running forward.

  - backwards playback potentially uses a lot of memory as frames and
    undecoded data get buffered.

diff --git a/sitemap.txt b/sitemap.txt
index bea97cda28..6166b172ac 100644
--- a/sitemap.txt
+++ b/sitemap.txt
@@ -137,3 +137,60 @@ index.md
 splitup.md
 licensing.md
 rtp.md
+ design/index.md
+ design/MT-refcounting.md
+ design/TODO.md
+ design/activation.md
+ design/buffer.md
+ design/buffering.md
+ design/bufferpool.md
+ design/caps.md
+ design/clocks.md
+ design/context.md
+ design/controller.md
+ design/conventions.md
+ design/dynamic.md
+ design/element-sink.md
+ design/element-source.md
+ design/element-transform.md
+ design/events.md
+ design/framestep.md
+ design/gstbin.md
+ design/gstbus.md
+ design/gstelement.md
+ design/gstghostpad.md
+ design/gstobject.md
+ design/gstpipeline.md
+ design/draft-klass.md
+ design/latency.md
+ design/live-source.md
+ design/memory.md
+ design/messages.md
+ design/meta.md
+ design/draft-metadata.md
+ design/miniobject.md
+ design/missing-plugins.md
+ design/negotiation.md
+ design/overview.md
+ design/preroll.md
+ design/probes.md
+ design/progress.md
+ design/push-pull.md
+ design/qos.md
+ design/query.md
+ design/relations.md
+ design/scheduling.md
+ design/seeking.md
+ design/segments.md
+ design/seqnums.md
+ design/sparsestreams.md + design/standards.md + design/states.md + design/stream-selection.md + design/stream-status.md + design/streams.md + design/synchronisation.md + design/draft-tagreading.md + design/toc.md + design/tracing.md + design/trickmodes.md