docs: Remove design docs as they have been moved to gst-docs

https://bugzilla.gnome.org/show_bug.cgi?id=775667
This commit is contained in:
Thibault Saunier 2016-12-05 18:16:34 -03:00
parent dbe3d2b328
commit d1c79fc177
59 changed files with 2 additions and 12147 deletions


@@ -10,8 +10,8 @@ endif
 BUILT_SOURCES = version.entities
-SUBDIRS = design gst libs $(PLUGIN_DOCS_DIRS)
-DIST_SUBDIRS = design gst libs plugins
+SUBDIRS = gst libs $(PLUGIN_DOCS_DIRS)
+DIST_SUBDIRS = gst libs plugins
 EXTRA_DIST = version.entities.in list-ulink.xsl


@@ -1,78 +0,0 @@
EXTRA_DIST = \
draft-klass.txt \
draft-metadata.txt \
draft-push-pull.txt \
draft-tagreading.txt \
part-activation.txt \
part-buffering.txt \
part-bufferpool.txt \
part-buffer.txt \
part-caps.txt \
part-clocks.txt \
part-context.txt \
part-controller.txt \
part-conventions.txt \
part-dynamic.txt \
part-element-sink.txt \
part-element-source.txt \
part-element-transform.txt \
part-events.txt \
part-framestep.txt \
part-gstbin.txt \
part-gstbus.txt \
part-gstelement.txt \
part-gstghostpad.txt \
part-gstobject.txt \
part-gstpipeline.txt \
part-latency.txt \
part-live-source.txt \
part-memory.txt \
part-messages.txt \
part-meta.txt \
part-miniobject.txt \
part-missing-plugins.txt \
part-MT-refcounting.txt \
part-negotiation.txt \
part-overview.txt \
part-preroll.txt \
part-probes.txt \
part-progress.txt \
part-push-pull.txt \
part-qos.txt \
part-query.txt \
part-relations.txt \
part-scheduling.txt \
part-seeking.txt \
part-segments.txt \
part-seqnums.txt \
part-sparsestreams.txt \
part-standards.txt \
part-states.txt \
part-stream-status.txt \
part-streams.txt \
part-synchronisation.txt \
part-toc.txt \
part-TODO.txt \
part-tracing.txt \
part-trickmodes.txt
CLEANFILES = index.html index.txt
html:
	if ! test -z `which asciidoc`; then \
	  echo >index.txt "GStreamer design"; \
	  echo >>index.txt "================"; \
	  echo >>index.txt "The Gstreamer developers"; \
	  echo >>index.txt "Version $(PACKAGE_VERSION)"; \
	  echo >>index.txt ""; \
	  ( cd $(srcdir) && \
	    cat >>$(abs_builddir)/index.txt $(EXTRA_DIST) ); \
	  asciidoc -o index.html index.txt; \
	else \
	  echo "need asciidoc to generate html"; \
	fi;

upload:
	@echo nothing to upload


@@ -1,187 +0,0 @@
Element Klass definition
------------------------
Purpose
~~~~~~~
Applications should be able to retrieve elements from the registry of existing
elements based on specific capabilities or features of the element.
A playback application might want to retrieve all the elements that can be
used for visualisation, for example, or a video editor might want to select
all video effect filters.
The topic of defining the klass of elements should be based on use cases.
A list of classes that are used in an installation can be generated using:
gst-inspect-1.0 -a | grep -ho Class:.* | cut -c8- | sed "s/\//\\n/g" | sort | uniq
Proposal
~~~~~~~~
The GstElementDetails contains a field named klass that is a pointer to a
string describing the element type.
In this document we describe the format and contents of the string. Elements
should adhere to this specification although that is not enforced to allow
for wild (application specific) customisation.
1) string format
<keyword>['/'<keyword>]*
The string consists of an _unordered_ list of keywords separated with a '/'
character. While the / suggests a hierarchy, this is _not_ the case.
2) keyword categories
- functional
Categories are based on _intended usage_ of the element. Some elements
might have other side-effects (especially for filters/effects). The purpose
is to list enough keywords so that applications can do meaningful filtering,
not to completely describe the functionality, which is expressed in caps etc.
* Source : produces data
* Sink : consumes data
* Filter : filters/transforms data, no modification on the data is
intended (although it might be unavoidable). The
filter can decide on input and output caps independently
of the stream contents (GstBaseTransform).
* Effect : applies an effect to some data, changes to data are
intended. Examples are colorbalance, volume. These
elements can also be implemented with GstBaseTransform.
* Demuxer : splits audio, video, ... from a stream
* Muxer : interleave audio, video, ... into one stream, this is
like mixing but without losing or degrading each separate
input stream. The reverse operation is possible with a
Demuxer that reproduces the exact same input streams.
* Decoder : decodes encoded data into a raw format, there is typically
no relation between input caps and output caps. The output
caps are defined in the stream data. This separates the
Decoder from the Filter and Effect.
* Encoder : encodes raw data into an encoded format.
* Mixer : combine audio, video, .. this is like muxing but with
applying some algorithm so that the individual streams
are not extractable anymore, there is therefore no
reverse operation to mixing. (audio mixer, video mixer, ...)
* Converter : convert audio into video, text to audio, ... The converter
typically works on raw types only. The source media type
is listed first.
* Analyzer : reports about the stream contents.
* Control : controls some aspect of a hardware device
* Extracter : extracts tags/headers from a stream
* Formatter : adds tags/headers to a stream
* Connector : allows for new connections in the pipeline. (tee, ...)
* ...
- Based on media type
Purpose is to make a selection for elements operating on the different
types of media. An audio application must be able to filter out the
elements operating on audio, for example.
* Audio : operates on audio data
* Video : operates on video data
* Image : operates on image data. Usually this media type can also
be used to make a video stream in which case it is added
together with the Video media type.
* Text : operates on text data
* Metadata : operates on metadata
* ...
- Extra features
The purpose is to further specialize the element, mostly for
application specific needs.
* Network : element is used in networked situations
* Protocol : implements some protocol (RTSP, HTTP, ...)
* Payloader : encapsulate as payload (RTP, RDT,.. )
* Depayloader : strip a payload (RTP, RDT,.. )
* RTP : intended to be used in RTP applications
* Device : operates on some hardware device (disk, network,
audio card, video card, usb, ...)
* Visualisation : intended to be used for audio visualisation
* Debug : intended usage is more for debugging purposes.
- Categories found, but not yet in one of the above lists
* Bin : playbin, decodebin, bin, pipeline
* Codec : lots of decoders, encoder, demuxers
should be removed?
* Generic : should be removed?
* File : like network, should go to Extra?
* Editor : gnonlin, textoverlays
* DVD, GDP, LADSPA, Parser, Player, Subtitle, Testing, ...
3) suggested order:
<functional>[/<media type>]*[/<extra...>]*
4) examples:
apedemux : Extracter/Metadata
audiotestsrc : Source/Audio
autoaudiosink : Sink/Audio/Device
cairotimeoverlay : Mixer/Video/Text
dvdec : Decoder/Video
dvdemux : Demuxer
goom : Converter/Audio/Video
id3demux : Extracter/Metadata
udpsrc : Source/Network/Protocol/Device
videomixer : Mixer/Video
videoconvert : Filter/Video (intended use to convert video with as little
visible change as possible)
vertigotv : Effect/Video (intended use is to change the video)
volume : Effect/Audio (intended use is to change the audio data)
vorbisdec : Decoder/Audio
vorbisenc : Encoder/Audio
oggmux : Muxer
adder : Mixer/Audio
videobox : Effect/Video
alsamixer : Control/Audio/Device
audioconvert : Filter/Audio
audioresample : Filter/Audio
xvimagesink : Sink/Video/Device
navseek : Filter/Debug
decodebin : Decoder/Demuxer
level : Filter/Analyzer/Audio
tee : Connector/Debug
5) open issues:
- how to differentiate physical devices from logical ones?
autoaudiosink : Sink/Audio/Device
alsasink : Sink/Audio/Device
Use cases
~~~~~~~~~
- get a list of all elements implementing a video effect (pitivi):
klass.contains (Effect & Video)
- get list of muxers (pitivi):
klass.contains (Muxer)
- get list of video encoders (pitivi):
klass.contains (Encoder & video)
- Get a list of all audio/video visualisations (totem):
klass.contains (Visualisation)
- Get a list of all decoders/demuxer/metadata parsers/vis (playbin):
klass.contains (Visualisation | Demuxer | Decoder | (Extracter & Metadata))
- Get a list of elements that can capture from an audio device (gst-properties):
klass.contains (Source & Audio & Device)
* filters out audiotestsrc, since it is not a device
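A minimal sketch of how such a filter could look with the registry API
(gst_registry_get(), gst_element_factory_get_metadata()); the strstr() keyword
match stands in for the klass.contains() operation above and is an assumption,
not a defined API:

  #include <string.h>
  #include <gst/gst.h>

  /* list all element factories whose klass string contains both keywords,
   * e.g. "Effect" and "Video" */
  static void
  list_factories_with_klass (const gchar * kw1, const gchar * kw2)
  {
    GList *features, *walk;

    features = gst_registry_get_feature_list (gst_registry_get (),
        GST_TYPE_ELEMENT_FACTORY);
    for (walk = features; walk; walk = walk->next) {
      GstElementFactory *factory = GST_ELEMENT_FACTORY (walk->data);
      const gchar *klass = gst_element_factory_get_metadata (factory,
          GST_ELEMENT_METADATA_KLASS);

      if (klass && strstr (klass, kw1) && strstr (klass, kw2))
        g_print ("%s: %s\n", GST_OBJECT_NAME (factory), klass);
    }
    gst_plugin_feature_list_free (features);
  }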


@@ -1,201 +0,0 @@
Metadata
--------
This draft recaps the current metadata handling in GStreamer and proposes some
additions.
Supported Metadata standards
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The paragraphs below list supported native metadata standards sorted by type and
then in alphabetical order. Some standards have been extended to support
additional metadata. GStreamer already supports all of those to some extent.
This is shown in the lists below as either [--], [r-], [-w] or [rw] depending
on read/write support (08.Feb.2010).
Audio
- mp3
ID3v2: [rw]
http://www.id3.org/Developer_Information
ID3v1: [rw]
http://www.id3.org/ID3v1
XMP: [--] (inside ID3v2 PRIV tag of owner XMP)
http://www.adobe.com/devnet/xmp/
- ogg/vorbis
vorbiscomment: [rw]
http://www.xiph.org/vorbis/doc/v-comment.html
http://wiki.xiph.org/VorbisComment
- wav
LIST/INFO chunk: [rw]
http://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/RIFF.html#Info
http://www.kk.iij4u.or.jp/~kondo/wave/mpidata.txt
XMP: [--]
http://www.adobe.com/devnet/xmp/
Video
- 3gp
{moov,trak}.udta: [rw]
http://www.3gpp.org/ftp/Specs/html-info/26244.htm
ID3V2: [--]
http://www.3gpp.org/ftp/Specs/html-info/26244.htm
http://www.mp4ra.org/specs.html#id3v2
- avi
LIST/INFO chunk: [rw]
http://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/RIFF.html#Info
http://www.kk.iij4u.or.jp/~kondo/wave/mpidata.txt
XMP: [--] (inside "_PMX" chunk)
http://www.adobe.com/devnet/xmp/
- asf
??:
XMP: [--]
http://www.adobe.com/devnet/xmp/
- flv [--]
XMP: (inside onXMPData script data tag)
http://www.adobe.com/devnet/xmp/
- mkv
tags: [rw]
http://www.matroska.org/technical/specs/tagging/index.html
- mov
XMP: [--] (inside moov.udta.XMP_ box)
http://www.adobe.com/devnet/xmp/
- mp4
{moov,trak}.udta: [rw]
http://standards.iso.org/ittf/PubliclyAvailableStandards/c051533_ISO_IEC_14496-12_2008.zip
moov.udta.meta.ilst: [rw]
http://atomicparsley.sourceforge.net/
http://atomicparsley.sourceforge.net/mpeg-4files.html
ID3v2: [--]
http://www.mp4ra.org/specs.html#id3v2
XMP: [--] (inside UUID box)
http://www.adobe.com/devnet/xmp/
- mxf
??
Images
- gif
XMP: [--]
http://www.adobe.com/devnet/xmp/
- jpg
jif: [rw] (only comments)
EXIF: [rw] (via metadata plugin)
http://www.exif.org/specifications.html
IPTC: [rw] (via metadata plugin)
http://www.iptc.org/IPTC4XMP/
XMP: [rw] (via metadata plugin)
http://www.adobe.com/devnet/xmp/
- png
XMP: [--]
http://www.adobe.com/devnet/xmp/
Further links:
http://age.hobba.nl/audio/tag_frame_reference.html
http://wiki.creativecommons.org/Tracker_CC_Indexing
Current Metadata handling
~~~~~~~~~~~~~~~~~~~~~~~~~
When reading files, demuxers or parsers extract the metadata. It is sent as
a GST_EVENT_TAG event to downstream elements. When a sink element receives a tag
event, it will post a GST_MESSAGE_TAG message on the bus with the contents of
the tag event.
Elements receiving GST_EVENT_TAG events can mangle them, mux them into the
buffers they send or just pass them through. Usually it is muxers that format
the tag data into the form required by the format they mux. Such elements would
also implement the GstTagSetter interface to receive tags from the application.
           +----------+
           |  demux   |
       sink            src --> GstEvent(tag) over GstPad to downstream element
           +----------+

   method call over GstTagSetter interface from application
           |
           v
           +----------+
           |   mux    |
       sink            src
           +----------+
        ^
        |
   GstEvent(tag) over GstPad from upstream element
The data used in all those interfaces is GstTagList. It is based on a
GstStructure which is like a hash table with differently typed entries. The key
is always a string/GQuark. Many keys are predefined in GStreamer core. More keys
are defined in gst-plugins-base/gst-libs/gst/tag/tag.h.
If elements and applications use predefined types, it is possible to transcode a
file from one format into another while preserving all known and mapped
metadata.
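For illustration, a minimal sketch of building such a taglist with predefined
keys (standard GstTagList API; srcpad is an assumed pad of the demuxer):

  GstTagList *tags;

  /* keys are predefined strings/GQuarks, values are typed */
  tags = gst_tag_list_new_empty ();
  gst_tag_list_add (tags, GST_TAG_MERGE_APPEND,
      GST_TAG_TITLE, "Some Title",
      GST_TAG_ARTIST, "Some Artist",
      GST_TAG_TRACK_NUMBER, 1, NULL);

  /* a demuxer would then send the list downstream as a tag event */
  gst_pad_push_event (srcpad, gst_event_new_tag (tags));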
Issues
~~~~~~
Unknown/Unmapped metadata
^^^^^^^^^^^^^^^^^^^^^^^^^
Right now GStreamer can lose metadata when transcoding or remuxing content.
This can happen because we don't map all metadata fields to generic ones.
We should probably also add the whole metadata blob to the GstTagList. We would
need a GST_TAG_SYSTEM_xxx define (e.g. GST_TAG_SYSTEM_ID3V2) for each standard.
The content is not printable and should be treated as binary if not known. The
tag is not mergeable - call gst_tag_register() with GstTagMergeFunc=NULL. Also
the tag data is only useful for upstream elements, not for the application.
A muxer would first scan a taglist for known system tags. Unknown tags are
ignored, as they are now. It would first populate its own metadata store with
the entries from the system tag and then update the entries with the data in
the normal tags.
Below is an initial list of tag systems:
ID3V1 - GST_TAG_SYSTEM_ID3V1
ID3V2 - GST_TAG_SYSTEM_ID3V2
RIFF_INFO - GST_TAG_SYSTEM_RIFF_INFO
XMP - GST_TAG_SYSTEM_XMP
We would basically need this for each container format.
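A sketch of how one such system tag could be registered; the tag name is the
proposed GST_TAG_SYSTEM_ID3V2 (which does not exist today), registered here as
an opaque sample with a NULL merge function to make it non-mergeable as
described:

  /* hypothetical registration of the ID3v2 blob tag */
  gst_tag_register ("system-id3v2", GST_TAG_FLAG_META, GST_TYPE_SAMPLE,
      "ID3v2 blob", "complete ID3v2 tag as found in the stream",
      NULL /* GstTagMergeFunc: not mergeable */);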
See also https://bugzilla.gnome.org/show_bug.cgi?id=345352
Lost metadata
^^^^^^^^^^^^^
A case slightly different from the previous one is when an application sets a
GstTagList on a pipeline. Right now elements consuming tags do not report which
tags have been consumed. Especially when using elements that make metadata
persistent, we have no means of knowing which of the tags made it into the
target stream and which were not serialized. Ideally the application would like
to know which kind of metadata is accepted by a pipeline so it can reflect that
in the UI.
Although in practice it is the elements implementing GstTagSetter that
serialize, this does not have to be so. Either way, we could add a
means to that interface where elements add the tags they have serialized. The
application could build one list from all the tag messages and then query all
the serialized tags from tag-setters. The delta tells what has not been
serialized.
A different approach would be to query the list of supported tags in advance.
This could be a query (GST_QUERY_TAG_SUPPORT). The query result could be a list
of elements and their tags. As a convenience we could flatten the list of tags
for the top-level element (if the query was sent to a bin) and add that.
Tags are per Element
^^^^^^^^^^^^^^^^^^^^
In many cases we want tags per stream. Even metadata standards like mp4/3gp
support that. Right now GST_MESSAGE_SRC(tags) is the element. We tried
changing that to the pad, but that broke applications.
Also we miss the symmetric functionality in GstTagSetter. This interface is
usually implemented by elements.
Open bugs
^^^^^^^^^
https://bugzilla.gnome.org/buglist.cgi?query_format=advanced;short_desc=tag;bug_status=UNCONFIRMED;bug_status=NEW;bug_status=ASSIGNED;bug_status=REOPENED;bug_status=NEEDINFO;short_desc_type=allwordssubstr;product=GStreamer
Add GST_TAG_MERGE_REMOVE
https://bugzilla.gnome.org/show_bug.cgi?id=560302


@@ -1,119 +0,0 @@
DRAFT push-pull scheduling
--------------------------
Status
DRAFT. DEPRECATED by better current implementation.
Observations:
- The main scheduling mode is chain based scheduling where the source
element pushes buffers through the pipeline to the sinks. This is
called the push model.
- In the pull model, some plugin pulls buffers from an upstream peer
element before consuming and/or pushing them further downstream.
Usages of pull based scheduling:
- sinks that pull in data, possibly at fixed intervals driven by some
hardware device (audiocard, videodevice, ...).
- Efficient random access to resources. Especially useful for certain
types of demuxers.
API for pull-based scheduling:
- an element that wants to pull data from a peer element needs to call
the pull_range() method. This method requires an offset and a size.
It is possible to leave the offset and size at -1, indicating that
any offset or size is acceptable, but this of course removes the
advantages of getrange based scheduling.
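For reference, a sketch of what the call looks like on a sink pad; error
handling is omitted and sinkpad, the offset and the size are assumptions:

  GstBuffer *buffer = NULL;
  GstFlowReturn ret;

  /* pull 4096 bytes at byte offset 0 from the peer; -1 for the offset
   * means any offset is acceptable */
  ret = gst_pad_pull_range (sinkpad, 0, 4096, &buffer);
  if (ret == GST_FLOW_OK) {
    /* ... consume buffer ... */
    gst_buffer_unref (buffer);
  }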
Types of pull based scheduling:
- some sources can do random access (file source, ...)
- some sources can read a random number of bytes but not at a random
offset. (audio cards, ...) Audio cards using a ringbuffer can
however do random access in the ringbuffer.
- some sources can do random access in a range of bytes but not in
another range. (a caching network source).
- some sources can only provide data of a fixed size and without an
offset. (video sources, ...)
Current scheduling decision:
- core selects scheduling type starting on sinks by looking at existence
of loop function on sinkpad and calling _check_pull_range() on the
source pad to activate the pads in push/pull mode.
- element proxies pull mode pad activation to peer pad.
Problems:
- core makes a tough decision without knowing anything about the
element. Some elements are able to deal with a pull_range()
without offset while others need full random access.
Requirements:
- element should be able to select scheduling method itself based on
how it can use the peer element pull_range. This includes if the
peer can operate with or without offset/size. This also means that
the core does not need to select the scheduling method anymore and
allows for more efficient scheduling methods adjusted for the
particular element.
Proposition:
- pads are activated without the core selecting a method.
- pad queries the scheduling mode of the peer pad. This query is rather
fine-grained and allows the element to know if the peer supports
offsets and sizes in the get_range function. A proposition for
the query is outlined in draft-query.txt.
- pad selects scheduling mode and informs the peer pad of this
decision.
Things to query:
- pad can do real random access (downstream peer can ask for offset != -1)
- min offset
- suggest sequential access
- max offset
- align: all offsets should be aligned with this value.
- pad can give ranges from A to B length (peer can ask for A <= length <= B)
- min length
- suggested length
- max length
Use cases:
- An audio source can provide random access to the samples queued in its
DMA buffer; it does, however, suggest the sequential access method.
An audio source can provide a random number of samples but prefers
reading from the hardware using a fixed segment size.
- A caching network source would suggest sequential access but is seekable
in the cached region. Applications can query for the already downloaded
portion and update the GUI, a seek can be done in that area.
- a live video source can only provide buffers sequentially. It exposes
offsets as -1; lengths are also -1.


@@ -1,99 +0,0 @@
Tagreading
----------
The tagreading (metadata reading) use case for mediacenter applications is not
too well supported by the current GStreamer architecture. It uses demuxers on
the files, which, generally speaking, takes too long (building a seek index,
prerolling).
What we want is specialized elements / parsing modes that just do the
tag-reading.
The idea is to define a TagReadIFace. Tag-demuxers, classic demuxers and decoder
plugins can just implement the interface or provide a separate element that
implements the interface.
In addition we need a tagreadbin that, similar to decodebin, does a typefind and
then plugs the right tagread element(s). It will only look at elements that
implement the interface. It can plug several if possible.
For optimal performance typefind and tagread could share the list of already
peeked buffers (a queue element after sink, but that would change pull to push).
Design
~~~~~~
The plan is that applications can do the following:
  pipeline = "filesrc ! tagbin"
  for (file_path in list_of_files) {
    filesrc.location = file_path
    pipeline.set_state(PAUSED)
    // wait for TAGS & EOS
    pipeline.set_state(READY)
  }
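A hedged C sketch of the same loop; "tagbin" is the proposed element and does
not exist, and list_of_files is an assumed GList of file paths:

  GstElement *pipeline = gst_parse_launch ("filesrc name=src ! tagbin", NULL);
  GstElement *filesrc = gst_bin_get_by_name (GST_BIN (pipeline), "src");
  GstBus *bus = gst_element_get_bus (pipeline);
  GList *walk;

  for (walk = list_of_files; walk; walk = walk->next) {
    GstMessage *msg;

    g_object_set (filesrc, "location", (gchar *) walk->data, NULL);
    gst_element_set_state (pipeline, GST_STATE_PAUSED);
    /* wait for TAG, EOS or ERROR; collect tags with gst_message_parse_tag() */
    msg = gst_bus_timed_pop_filtered (bus, GST_CLOCK_TIME_NONE,
        GST_MESSAGE_TAG | GST_MESSAGE_EOS | GST_MESSAGE_ERROR);
    gst_message_unref (msg);
    gst_element_set_state (pipeline, GST_STATE_READY);
  }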
* it should have one sinkpad of type ANY
* it should send EOS when all metadata has been read
"done"-signal from all tagread-elements
* special tagread-elements should have RANK_NONE to be not autoplugged by
decodebin
Interface
~~~~~~~~~
* gboolean iface property "tag-reading"
Switches the element to tagreading mode. Needed if normal elements implement
that behaviour. Elements will skip parsing unneeded data, not build a
seek index, etc.
* signal "done"
Equivalent of EOS.
Use Cases
~~~~~~~~~
* mp3 with id3- and apetags
* plug id3demux ! apedemux
* avi with vorbis audio
* plug avidemux
* new pad -> audio/vorbis
* plug vorbisdec or special vorbiscomment reader
Additional Thoughts
~~~~~~~~~~~~~~~~~~~
* would it make sense to have 2-phase tag-reading (property on tagbin and/or
tagread elements)
* 1st phase: get tag-data that are directly embedded in the data
* 2nd phase: get tag-data that has to be generated
* e.g. album-art via web, video-thumbnails
* what about caching backends
* it would be good to allow applications to supply tagbin with a tagcache-
object instance. Whenever tagbin gets a 'location' to tagread, it consults
the cache first. Whenever there is a cache miss, it will tag-read and then
store the result in the cache:

  GstTagList *gst_tag_cache_load_tag_data (GstTagCache *self, const gchar *uri);
  void gst_tag_cache_store_tag_data (GstTagCache *self, const gchar *uri, GstTagList *tags);
Tests
~~~~~
* write a generic test for parsers/demuxers to ensure they send tags before
they reach PAUSED (elements need to parse the file for prerolling anyway):
set the pipeline to PAUSED, check for tags, set to PLAYING, error out if
tags come after PAUSED
Code Locations
~~~~~~~~~~~~~~
* tagreadbin -> gst-plugins-base/gst/tagread
* tagreaderiface -> gst-plugins-base/gst-libs/gst/tag
Reuse
~~~~~
* ogg : gst-plugins-base/ext/ogg
* avi : gst-plugins-good/gst/avi
* mp3 : gst-plugins-good/gst/id3demux
* wav : gst-plugins-good/gst/wavparse
* qt : gst-plugins-bad/gst/qtdemux


@@ -1,417 +0,0 @@
Conventions for a thread safe API
---------------------------------
The GStreamer API is designed to be thread safe. This means that API functions
can be called from multiple threads at the same time. GStreamer internally uses
threads to perform the data passing and various asynchronous services such as
the clock can also use threads.
This design decision has implications for the usage of the API and the objects
which this document explains.
MT safety techniques
~~~~~~~~~~~~~~~~~~~~
Several design patterns are used to guarantee object consistency in GStreamer.
This is an overview of the methods used in various GStreamer subsystems.
Refcounting:
All shared objects have a refcount associated with them. Each reference
obtained to the object should increase the refcount and each reference lost
should decrease the refcount.
The refcounting is used to make sure that when another thread destroys the
object, the ones which still hold a reference to the object do not read from
invalid memory when accessing the object.
Refcounting is also used to ensure that mutable data structures are only
modified when they are owned by the calling code.
It is a requirement that when two threads have a handle on an object, the
refcount must be more than one. This means that when one thread passes an
object to another thread it must increase the refcount. This requirement makes
sure that one thread cannot suddenly dispose the object making the other
thread crash when it tries to access the pointer to invalid memory.
Shared data structures and writability:
All objects have a refcount associated with them. Each reference obtained to
the object should increase the refcount and each reference lost should
decrease the refcount.
Each thread having a refcount to the object can safely read from the object,
but modifications made to the object should be preceded with a
_get_writable() function call. This function will check the refcount of the
object and if the object is referenced by more than one instance, a copy is
made of the object that is then by definition only referenced from the calling
thread. This new copy is then modifiable without being visible to other
refcount holders.
This technique is used for information objects that, once created, never
change their values. The lifetime of these objects is generally short, the
objects are usually simple and cheap to copy/create.
The advantage of this method is that no reader/writer locks are needed. All
threads can concurrently read but writes happen locally on a new copy. In most
cases _get_writable() can avoid a real copy because the calling method is the
only one holding a reference, which makes read/write very cheap.
The drawback is that sometimes 1 needless copy can be done. This would happen
when N threads call _get_writable() at the same time, all seeing that N
references are held on the object. In this case 1 copy too many will be done.
This is not a problem in any practical situation because the copy operation is
fast.
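An example of this pattern in the public API (a sketch using GstCaps):

  /* caps may be shared with other threads; _make_writable() returns a
   * writable copy, copying only when the refcount is > 1 */
  caps = gst_caps_make_writable (caps);
  gst_caps_set_simple (caps, "channels", G_TYPE_INT, 2, NULL);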
Mutable substructures:
Special techniques are necessary to ensure the consistency of compound shared
objects. As mentioned above, shared objects need to have a reference count of
1 if they are to be modified. Implicit in this assumption is that all parts of
the shared object belong only to the object. For example, a GstStructure in
one GstCaps object should not belong to any other GstCaps object. This
condition suggests a parent-child relationship: structures can only be added
to a parent object if they do not already have a parent object.
In addition, these substructures must not be modified while more than one code
segment has a reference on the parent object. For example, if the user creates
a GstStructure, adds it to a GstCaps, and the GstCaps is then referenced by
other code segments, the GstStructure should then become immutable, so that
changes to that data structure do not affect other parts of the code. This
means that the child is only mutable when the parent's reference count is 1,
as well as when the child structure has no parent.
The general solution to this problem is to include a field in child structures
pointing to the parent's atomic reference count. When set to NULL, this
indicates that the child has no parent. Otherwise, procedures that modify the
child structure must check if the parent's refcount is 1, and otherwise must
cause an error to be signaled.
Note that this is an internal implementation detail; application or plugin
code that calls _get_writable() on an object is guaranteed to receive an
object of refcount 1, which must then be writable. The only trick is that a
pointer to a child structure of an object is only valid while the calling code
has a reference on the parent object, because the parent is the owner of the
child.
Object locking:
For objects that contain state information and generally have a longer
lifetime, object locking is used to update the information contained in the
object.
All readers and writers acquire the lock before accessing the object. Only one
thread is allowed to access the protected structures at a time.
Object locking is used for all objects extending from GstObject, such as
GstElement and GstPad.
Object locking can be done with recursive locks or regular mutexes. Object
locks in GStreamer are implemented with mutexes which cause deadlocks when
locked recursively from the same thread. This is done because regular mutexes
are cheaper.
Atomic operations:
Atomic operations are operations that are performed as one consistent
operation even when executed by multiple threads. They do, however, not use the
conventional approach of using mutexes to protect the critical section but rely
on CPU features and instructions.
The advantages are mostly speed related since there are no heavyweight locks
involved. Most of these instructions also do not cause a context switch in case
of concurrent access but use a retry mechanism or spinlocking.
Disadvantages are that each of these instructions usually causes a cache flush
on multi-CPU machines when two processors perform concurrent access.
Atomic operations are generally used for refcounting and for the allocation of
small fixed size objects in a memchunk. They can also be used to implement a
lockfree list or stack.
Compare and swap:
As part of the atomic operations, compare-and-swap (CAS) can be used to access
or update a single property or pointer in an object without having to take a
lock.
This technique is currently not used in GStreamer but might be added in the
future in performance critical places.
Objects
~~~~~~~
* Locking involved:
- atomic operations for refcounting
- object locking
All objects should have a lock associated with them. This lock is used to keep
internal consistency when multiple threads call API function on the object.
For objects that extend the GStreamer base object class this lock can be
obtained with the macros GST_OBJECT_LOCK() and GST_OBJECT_UNLOCK(). For other
objects that do not extend from the base GstObject class these macros can be
different.
* refcounting
All new objects created have the FLOATING flag set. This means that the object
is not owned or managed yet by anybody other than the one holding a reference
to the object. The object in this state has a reference count of 1.
Various object methods can take ownership of another object, this means that
after calling a method on object A with an object B as an argument, the object
B is made sole property of object A. This means that after the method call you
are not allowed to access the object anymore unless you keep an extra
reference to the object. An example of such a method is the _bin_add() method.
As soon as this function is called in a Bin, the element passed as an argument
is owned by the bin and you are not allowed to access it anymore without
taking a _ref() before adding it to the bin. The reason being that after the
_bin_add() call disposing the bin also destroys the element.
Taking ownership of an object happens through the process of "sinking" the
object. The _sink() method on an object will decrease the refcount of the
object if the FLOATING flag is set. The act of taking ownership of an object
is then performed as a _ref() followed by a _sink() call on the object.
The float/sink process is very useful when initializing elements that will
then be placed under control of a parent. The floating ref keeps the object
alive until it is parented, and once the object is parented you can forget
about it.
also see part-relations.txt
* parent-child relations
One can create parent-child relationships with the _object_set_parent()
method. This method refs and sinks the object and assigns its parent property
to that of the managing parent.
The child is said to have a weak link to the parent since the refcount of the
parent is not increased in this process. This means that if the parent is
disposed it has to unset itself as the parent of the object before disposing
itself, else the child object holds a parent pointer to invalid memory.
The responsibilities for an object that sinks other objects are summarised as:
- taking ownership of the object
- call _object_set_parent() to set itself as the object parent, this call
will _ref() and _sink() the object.
- keep a reference to the object in a data structure such as a list or array.
- on dispose
- call _object_unparent() to reset the parent property and unref the
object.
- remove the object from the list.
also see part-relations.txt
* Properties
Most objects also expose state information with public properties in the
object. Two types of properties might exist: accessible with or without
holding the object lock. All properties should only be accessed with their
corresponding macros. The public object properties are marked in the .h files
with /*< public >*/. The public properties that require a lock to be held are
marked with /*< public >*/ /* with <lock_type> */, where <lock_type> can be
"LOCK" or "STATE_LOCK" or any other lock to mark the type(s) of lock to be
held.
Example:
in GstPad there is a public property "direction". It can be found in the
section marked as public and requiring the LOCK to be held. There exists
also a macro to access the property.
struct _GstRealPad {
  ...
  /*< public >*/ /* with LOCK */
  ...
  GstPadDirection direction;
  ...
};

#define GST_RPAD_DIRECTION(pad) (GST_REAL_PAD_CAST(pad)->direction)
Accessing the property is therefore allowed with the following code example:
GST_OBJECT_LOCK (pad);
direction = GST_RPAD_DIRECTION (pad);
GST_OBJECT_UNLOCK (pad);
* Property lifetime
All properties requiring a lock can change after releasing the associated
lock. This means that as long as you hold the lock, the state of the
object regarding the locked properties is consistent with the information
obtained. As soon as the lock is released, any values acquired from the
properties might not be valid anymore and can as best be described as a
snapshot of the state when the lock was held.
This means that all properties that require access beyond the scope of the
critical section should be copied or refcounted before releasing the lock.
Most objects provide a _get_<property>() method to get a copy or refcounted
instance of the property value. The caller should not worry about any locks
but should unref/free the object after usage.
Example:
the following example correctly gets the peer pad of an element. It is
required to increase the refcount of the peer pad because as soon as the
lock is released, the peer could be unreffed and disposed, making the
pointer obtained in the critical section point to invalid memory.
GST_OBJECT_LOCK (pad);
peer = GST_RPAD_PEER (pad);
if (peer)
  gst_object_ref (GST_OBJECT (peer));
GST_OBJECT_UNLOCK (pad);

... use peer ...

if (peer)
  gst_object_unref (GST_OBJECT (peer));
Note that after releasing the lock the peer might not actually be the peer
anymore of the pad. If you need to be sure it is, you need to extend the
critical section to include the operations on the peer.
The following code is equivalent to the above but with using the functions
to access object properties.
peer = gst_pad_get_peer (pad);
if (peer) {
  ... use peer ...
  gst_object_unref (GST_OBJECT (peer));
}
Example:
Accessing the name of an object makes a copy of the name. The caller of the
function should g_free() the name after usage.
GST_OBJECT_LOCK (object);
name = g_strdup (GST_OBJECT_NAME (object));
GST_OBJECT_UNLOCK (object);
... use name ...
g_free (name);
or:
name = gst_object_get_name (object);
... use name ...
g_free (name);
* Accessor methods
Applications are encouraged to use the public methods of the object. Most
useful operations can be performed with the methods, so it is seldom required
to access the public fields manually.
All accessor methods that return an object should increase the refcount of the
returned object. The caller should _unref() the object after usage. Each
method should state this refcounting policy in the documentation.
* Accessing lists
If the object property is a list, concurrent list iteration is needed to get
the contents of the list. GStreamer uses the cookie mechanism to mark the last
update of a list. The list and the cookie are protected by the same lock. Each
update to a list requires the following actions:
- acquire lock
- update list
- update cookie
- release lock
Updating the cookie is usually done by incrementing its value by one. Since
cookies use guint32, wraparound is for all practical purposes not a
problem.
Iterating a list can safely be done by surrounding the list iteration with a
lock/unlock of the lock.
In some cases it is not a good idea to hold the lock for a long time while
iterating the list. The state change code for a bin in GStreamer, for example,
has to iterate over each element and perform a blocking call on each of them
potentially causing infinite bin locking. In this case the cookie can be used
to iterate a list.
Example:
The following algorithm iterates a list and reverses the updates in the
case a concurrent update was done to the list while iterating. The idea is
that whenever we reacquire the lock, we check for updates to the cookie to
decide if we are still iterating the right list.
GST_OBJECT_LOCK (object);
/* grab list and cookie */
cookie = object->list_cookie;
list = object->list;
while (list) {
  GstObject *item = GST_OBJECT (list->data);
  /* need to ref the item before releasing the lock */
  gst_object_ref (item);
  GST_OBJECT_UNLOCK (object);

  ... use/change item here...

  /* release item here */
  gst_object_unref (item);

  GST_OBJECT_LOCK (object);
  if (cookie != object->list_cookie) {
    /* handle rollback caused by concurrent modification
     * of the list here */

    ...rollback changes to items...

    /* grab new cookie and list */
    cookie = object->list_cookie;
    list = object->list;
  } else {
    list = g_list_next (list);
  }
}
GST_OBJECT_UNLOCK (object);
* GstIterator
GstIterator provides an easier way of retrieving elements in a concurrent
list. The following code example is equivalent to the previous example.
Example:
it = _get_iterator (object);
done = FALSE;
while (!done) {
  switch (gst_iterator_next (it, &item)) {
    case GST_ITERATOR_OK:

      ... use/change item here...

      /* release item here */
      gst_object_unref (item);
      break;
    case GST_ITERATOR_RESYNC:
      /* handle rollback caused by concurrent modification
       * of the list here */

      ...rollback changes to items...

      /* resync iterator to start again */
      gst_iterator_resync (it);
      break;
    case GST_ITERATOR_ERROR:
      /* an error occurred while iterating, stop */
      done = TRUE;
      break;
    case GST_ITERATOR_DONE:
      done = TRUE;
      break;
  }
}
gst_iterator_free (it);


@@ -1,86 +0,0 @@
TODO - Future Development
-------------------------
API/ABI
~~~~~~~
- implement return values from events in addition to the gboolean. This should
be done by making the event contain a GstStructure with input/output values,
similar to GstQuery. A typical use case is performing a non-accurate seek to a
keyframe; after the seek you want to get the new stream time that will
actually be used to update the slider bar.
- make gst_pad_push_event() return a GstFlowReturn
- GstEvent, GstMessage register like GstFormat or GstQuery.
- query POSITION/DURATION return accuracy. Just a flag or accuracy percentage.
- use | instead of + as divider in serialization of Flags
(gstvalue/gststructure)
- rethink how we handle dynamic replugging wrt segments and other events that
already got pushed and need to be pushed again. Might need GstFlowReturn from
gst_pad_push_event(). FIXED in 0.11 with sticky events.
- Optimize negotiation. We currently do a get_caps() call when we link pads,
which could potentially generate a huge list of caps and all their
combinations; we need to avoid generating these huge lists by generating them
somewhat incrementally when needed. We can do this with a
gst_pad_iterate_caps() call. We also need to incrementally return
intersections etc. for this.
FIXED in 0.11 with a filter on getcaps functions.
- Elements in a bin have no clue about the final state of the parent element
since the bin sets the target state on its children in small steps. This
causes problems for elements that like to know the final state (rtspsrc going
to PAUSED or READY is different in that we can avoid sending the useless
PAUSED request).
- Make serialisation of structures more consistent, readable and nicer code-wise.
- pad block has several issues:
* can't block on selected things, like push, pull, pad_alloc, events, ...
* can't check why the block happened. We should also be able to get the item/
reason that blocked the pad.
* it only blocks on datapassing. When EOS, the block never happens but ideally
should because pad block should inform the app when there is no dataflow.
* the same goes for segment seeks that don't push in-band EOS events. Maybe
segment seeks should also send an EOS event when they're done.
* blocking should only happen from one thread. If one thread does pad_alloc
and another a push, the push might be busy while the block callback is done.
* maybe this name is overloaded. We need to look at some more use cases before
trying to fix this.
FIXED in 0.11 with BLOCKING probes.
- rethink the way we do upstream renegotiation. Currently it's done with
pad_alloc but this has many issues such as only being able to suggest 1 format
and the need to allocate a buffer of this suggested format (some elements such
as capsfilter only know about the format, not the size). We would ideally like
to let upstream renegotiate a new format just like it did when it started.
This could, for example, easily be triggered with a RENEGOTIATE event.
FIXED in 0.11 with RECONFIGURE events.
- Remove the result format value in queries. FIXED in 0.11
- Try to minimize the amount of acceptcaps calls when pushing buffers around.
The element pushing the buffer usually negotiated already and decided on the
format.
The element receiving the buffer usually has to accept the caps anyway.
IMPLEMENTATION
~~~~~~~~~~~~~~
- implement more QOS, see part-qos.txt.
- implement BUFFERSIZE.
DESIGN
~~~~~~
- unlinking pads in the PAUSED state needs to make sure the stream thread is not
executing code. Can this be done with a flush to unlock all downstream chain
functions? Do we do this automatically or let the app handle this?


@@ -1,93 +0,0 @@
Pad (de)activation
------------------
Activation
~~~~~~~~~~
When changing states, a bin will set the state on all of its children in
sink-to-source order. As elements undergo the READY->PAUSED transition,
their pads are activated so as to prepare for data flow. Some pads will
start tasks to drive the data flow.
An element activates its pads from sourcepads to sinkpads. This is to make
sure that when the sinkpads are activated and ready to accept data, the
sourcepads are already active to pass the data downstream.
Pads can be activated in one of two modes, PUSH and PULL. PUSH pads are
the normal case, where the source pad in a link sends data to the sink
pad via gst_pad_push(). PULL pads instead have sink pads request data
from the source pads via gst_pad_pull_range().
To activate a pad, the core will call gst_pad_set_active() with a TRUE
argument, indicating that the pad should be active. If the pad is
already active, be it in a PUSH or PULL mode, gst_pad_set_active() will
return without doing anything. Otherwise it will call the activation
function of the pad.
Because the core does not know in which mode to activate a pad (PUSH or
PULL), it delegates that choice to a method on the pad, activate(). The
activate() function of a pad should choose whether to operate in PUSH or
PULL mode. Once the choice is made, it should call activate_mode()
with the selected activation mode.
The default activate() function will call activate_mode() with
#GST_PAD_MODE_PUSH, as it is the default mechanism for data flow.
A sink pad that supports either mode of operation might call
activate_mode(PULL) if the SCHEDULING query upstream contains the
#GST_PAD_MODE_PULL scheduling mode, and activate_mode(PUSH) otherwise.
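A sketch of such a custom activate function for a sink pad; the function name
is illustrative:

  static gboolean
  my_sink_activate (GstPad * pad, GstObject * parent)
  {
    GstQuery *query;
    gboolean pull_mode;

    /* ask upstream which scheduling modes it supports */
    query = gst_query_new_scheduling ();
    if (!gst_pad_peer_query (pad, query)) {
      gst_query_unref (query);
      return gst_pad_activate_mode (pad, GST_PAD_MODE_PUSH, TRUE);
    }
    pull_mode = gst_query_has_scheduling_mode (query, GST_PAD_MODE_PULL);
    gst_query_unref (query);

    return gst_pad_activate_mode (pad,
        pull_mode ? GST_PAD_MODE_PULL : GST_PAD_MODE_PUSH, TRUE);
  }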
Consider the case fakesrc ! fakesink, where fakesink is configured to
operate in PULL mode. State changes in the pipeline will start with
fakesink, which is the most downstream element. The core will call
activate() on fakesink's sink pad. For fakesink to go into PULL mode, it
needs to implement a custom activate() function that will call
activate_mode(PULL) on its sink pad (because the default is to
use PUSH mode). activate_mode(PULL) is then responsible for starting
the task that pulls from fakesrc:src. Clearly, fakesrc needs to be
notified that fakesink is about to pull on its src pad, even though the
pipeline has not yet changed fakesrc's state. For this reason,
GStreamer will first call activate_mode(PULL) on fakesink:sink's
peer before calling activate_mode(PULL) on fakesink:sink.
In short, upstream elements operating in PULL mode must be ready to
produce data in READY, after having activate_mode(PULL) called on their
source pad. Also, a call to activate_mode(PULL) needs to propagate through
the pipeline to every pad that a gst_pad_pull() will reach. In the case
fakesrc ! identity ! fakesink, calling activate_mode(PULL) on identity's
source pad would need to activate its sink pad in pull mode as well,
which should propagate all the way to fakesrc.
If, on the other hand, fakesrc ! fakesink is operating in PUSH mode, the
activation sequence is different. First, activate() on fakesink:sink
calls activate_mode(PUSH) on fakesink:sink. Then fakesrc's pads are
activated: sources first, then sinks (of which fakesrc has none).
fakesrc:src's activation function is then called.
Note that it does not make sense to set an activation function on a
source pad. The peer of a source pad is downstream, meaning it should
have been activated first. If it was activated in PULL mode, the
source pad should have already had activate_mode(PULL) called on it, and
thus needs no further activation. Otherwise it should be in PUSH mode,
which is the choice of the default activation function.
So, in the PUSH case, the default activation function chooses PUSH mode,
which calls activate_mode(PUSH), which will then start a task on the source
pad and begin pushing. In this way PUSH scheduling is a bit easier,
because it follows the order of state changes in a pipeline. fakesink is
already in PAUSED with an active sink pad by the time fakesrc starts
pushing data.
Deactivation
~~~~~~~~~~~~
Pad deactivation occurs when its parent goes into the READY state or when the
pad is deactivated explicitly by the application or element.
gst_pad_set_active() is called with a FALSE argument, which then calls
activate_mode(PUSH) or activate_mode(PULL) with a FALSE argument, depending
on the current activation mode of the pad.
Mode switching
~~~~~~~~~~~~~~
Changing from push to pull modes needs a bit of thought. This is actually
possible and implemented but not yet documented here.


@@ -1,159 +0,0 @@
GstBuffer
---------
This document describes the design for buffers.
A GstBuffer is the object that is passed from an upstream element to a
downstream element and contains memory and metadata information.
Requirements
~~~~~~~~~~~~
- It must be fast
* allocation, free, low fragmentation
- Must be able to attach multiple memory blocks to the buffer
- Must be able to attach arbitrary metadata to buffers
- efficient handling of subbuffer, copy, span, trim
Lifecycle
~~~~~~~~~
GstBuffer extends from GstMiniObject and therefore uses its lifecycle
management (see part-miniobject.txt).
Writability
~~~~~~~~~~~
When a buffer is writable, as returned from gst_buffer_is_writable():
- metadata can be added/removed and the metadata can be changed
- GstMemory blocks can be added/removed
The individual memory blocks have their own locking and READONLY flags
that might influence their writability.
Buffers can be made writable with gst_buffer_make_writable(). This will copy the
buffer with the metadata and will ref the memory in the buffer. This means that
the memory is not automatically copied when copying buffers.
Managing GstMemory
------------------
A GstBuffer contains an array of pointers to GstMemory objects.
When the buffer is writable, gst_buffer_insert_memory() can be used to add a
new GstMemory object to the buffer. When the array of memory is full, memory
will be merged to make room for the new memory object.
gst_buffer_n_memory() is used to get the amount of memory blocks on the
GstBuffer.
With gst_buffer_peek_memory(), memory can be retrieved from the memory array.
The desired access pattern for the memory block should be specified so that
appropriate checks can be made and, in case of GST_MAP_WRITE, a writable copy
can be constructed when needed.
gst_buffer_remove_memory_range() and gst_buffer_remove_memory() can be used to
remove memory from the GstBuffer.
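A small sketch of these calls (standard GstBuffer API; the sizes are
arbitrary):

  GstBuffer *buffer = gst_buffer_new ();
  GstMemory *mem = gst_allocator_alloc (NULL, 4096, NULL);

  /* append the block; an index of -1 inserts at the end */
  gst_buffer_insert_memory (buffer, -1, mem);
  g_print ("memory blocks: %u\n", gst_buffer_n_memory (buffer));

  /* peek does not give away a reference; map the memory for access */
  mem = gst_buffer_peek_memory (buffer, 0);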
Subbuffers
----------
Subbuffers are made by copying only a region of the memory blocks and copying
all of the metadata.
Span
----
Spanning will merge together the data of 2 buffers into a new buffer.
Data access
-----------
Accessing the data of the buffer can happen by retrieving the individual
GstMemory objects in the GstBuffer or by using the gst_buffer_map() and
gst_buffer_unmap() functions.
The _map and _unmap functions will always return the memory of all blocks as
one large contiguous region of memory. Using the _map and _unmap functions
might be more convenient than accessing the individual memory blocks at the
expense of being more expensive because it might perform memcpy operations.
For buffers with only one GstMemory object (the most common case), _map and
_unmap have no performance penalty at all.
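The usual map/unmap pattern then looks like this (a sketch):

  GstMapInfo map;

  if (gst_buffer_map (buffer, &map, GST_MAP_READ)) {
    /* map.data/map.size cover all memory blocks as one contiguous region */
    g_print ("buffer of %" G_GSIZE_FORMAT " bytes\n", map.size);
    gst_buffer_unmap (buffer, &map);
  }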
* Read access with 1 memory block
The memory block is accessed and mapped for read access.
The memory block is unmapped after usage
* Write access with 1 memory block
The buffer should be writable or this operation will fail.
The memory block is accessed. If the memory block is readonly, a copy is made
and the original memory block is replaced with this copy. Then the memory
block is mapped in write mode and unmapped after usage.
* Read access with multiple memory blocks
The memory blocks are combined into one large memory block. If the buffer is
writable, the memory blocks are replaced with this new combined block. If the
buffer is not writable, the memory is returned as is. The memory block is
then mapped in read mode.
When the memory is unmapped after usage and the buffer has multiple memory
blocks, this means that the map operation was not able to store the combined
buffer and it thus returned memory that should be freed. Otherwise, the memory
is unmapped.
* Write access with multiple memory blocks
The buffer should be writable or the operation fails. The memory blocks are
combined into one large memory block and the existing blocks are replaced with
this new block. The memory is then mapped in write mode and unmapped after
usage.
Use cases
---------
Generating RTP packets from h264 video
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
We receive as input a GstBuffer with an encoded h264 image and we need to
create RTP packets containing this h264 data as the payload. We typically need
to fragment the h264 data into multiple packets, each with their own RTP and
payload specific header.
                      +-------+-------+---------------------------+--------+
   input H264 buffer: | NALU1 | NALU2 |           .....           | NALUx  |
                      +-------+-------+---------------------------+--------+
                        |
                        V
   array of           +-+ +-------+  +-+ +-------+       +-+ +-------+
   output buffers:    | | | NALU1 |  | | | NALU2 | ....  | | | NALUx |
                      +-+ +-------+  +-+ +-------+       +-+ +-------+
                      :           :  :           :
                      \-----------/  \-----------/
                         buffer 1       buffer 2
The output buffer array consists of x buffers, each consisting of an RTP payload
header and a subbuffer of the original input H264 buffer. Since the RTP headers
and the h264 data don't need to be contiguous in memory, they are added to the
buffer as separate GstMemory blocks and we avoid having to memcpy the h264 data
into contiguous memory.
A typical udpsink will then use something like sendmsg to send the memory regions
on the network inside one UDP packet. This will further avoid having to memcpy
data into contiguous memory.
Using bufferlists, the complete array of output buffers can be pushed in one
operation to the peer element.
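A sketch of how one output packet could be assembled without copying the
payload; the 12-byte header size and the input, offset and payload_size
variables are illustrative assumptions:

  GstBuffer *header = gst_buffer_new_allocate (NULL, 12, NULL);
  GstBuffer *payload, *packet;

  /* ... fill the RTP header in header ... */

  /* share a region of the input buffer's memory instead of copying it */
  payload = gst_buffer_copy_region (input, GST_BUFFER_COPY_MEMORY,
      offset, payload_size);

  /* packet now holds two non-contiguous memory blocks */
  packet = gst_buffer_append (header, payload);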

View file

@@ -1,324 +0,0 @@
Buffering
---------
This document outlines the buffering policy used in the GStreamer
core that can be used by plugins and applications.
The purpose of buffering is to accumulate enough data in a pipeline so that
playback can occur smoothly and without interruptions. It is typically done
when reading from a (slow) non-live network source but can also be used for
live sources.
We want to be able to implement the following features:
- buffering up to a specific amount of data, in memory, before starting playback
so that network fluctuations are minimized.
- download of the network file to a local disk with fast seeking in the
downloaded data. This is similar to the quicktime/youtube players.
- caching of semi-live streams to a local, on disk, ringbuffer with seeking in
the cached area. This is similar to tivo-like timeshifting.
- progress report about the buffering operations
- the possibility for the application to do more complex buffering
Some use cases:
* Stream buffering:
   +---------+     +---------+     +-------+
   | httpsrc |     | buffer  |     | demux |
   |        src - sink      src - sink     ....
   +---------+     +---------+     +-------+
In this case we are reading from a slow network source into a buffer element
(such as queue2).
The buffer element has a low and high watermark expressed in bytes. The
buffer uses the watermarks as follows:
- The buffer element will post BUFFERING messages until the high watermark
is hit. This instructs the application to keep the pipeline PAUSED, which
will eventually block the srcpad from pushing while data is prerolled in
the sinks.
- When the high watermark is hit, a BUFFERING message with 100% will be
posted, which instructs the application to continue playback.
- When the low watermark is hit during playback, the queue will start posting
BUFFERING messages again, making the application PAUSE the pipeline again
until the high watermark is hit again. This is called the rebuffering
stage.
- During playback, the queue level will fluctuate between the high and
low watermarks as a way to compensate for network irregularities.
This buffering method is usable when the demuxer operates in push mode.
Seeking in the stream requires the seek to happen in the network source.
It is mostly desirable when the total duration of the file is not known, such
as in live streaming or when efficient seeking is not possible/required.
* Incremental download
   +---------+     +---------+     +-------+
   | httpsrc |     | buffer  |     | demux |
   |        src - sink      src - sink     ....
   +---------+     +----|----+     +-------+
                        V
                       file
In this case, we know the server is streaming a fixed length file to the
client. The application can choose to download the file to disk. The buffer
element will provide a push or pull based srcpad to the demuxer to navigate in
the downloaded file.
This mode is only suitable when the client can determine the length of the
file on the server.
In this case, buffering messages will be emitted as usual when the requested
range is not within the downloaded area + buffersize. The buffering message
will also contain an indication that incremental download is being performed.
This flag can be used to let the application control the buffering in a more
intelligent way, using the BUFFERING query, for example.
The application can use the BUFFERING query to get the estimated download time
and match this time to the current/remaining playback time to control when
playback should start to have a non-interrupted playback experience.
* Timeshifting
   +---------+     +---------+     +-------+
   | httpsrc |     | buffer  |     | demux |
   |        src - sink      src - sink     ....
   +---------+     +----|----+     +-------+
                        V
                  file-ringbuffer
In this mode, a fixed size ringbuffer is kept to download the server content.
This allows for seeking in the buffered data. Depending on the size of the
buffer one can seek further back in time.
This mode is suitable for all live streams.
As with the incremental download mode, buffering messages are emitted along
with an indication that timeshifting download is in progress.
* Live buffering
In live pipelines we usually introduce some latency between the capture and
the playback elements. This latency can be introduced by a queue (such as a
jitterbuffer) or by other means (in the audiosink).
Buffering messages can be emitted in those live pipelines as well and serve as
an indication to the user of the latency buffering. The application usually
does not react to these buffering messages with a state change.
Messages
~~~~~~~~
A GST_MESSAGE_BUFFERING must be posted on the bus when playback temporarily
stops to buffer and when buffering finishes. When the percentage field in the
BUFFERING message is 100, buffering is done. Values less than 100 mean that
buffering is in progress.
The BUFFERING message should be intercepted and acted upon by the application.
The message contains at least one field that is sufficient for basic
functionality:
"buffer-percent", G_TYPE_INT, between 0 and 100
Several more clever ways of dealing with the buffering messages can be used when
in incremental or timeshifting download mode. For this purpose additional fields
are added to the buffering message:
"buffering-mode", GST_TYPE_BUFFERING_MODE,
enum { "stream", "download", "timeshift", "live" }
- Buffering mode in use. See above for an explanation of the
different alternatives. This field can be used to let the
application have more control over the buffering process.
"avg-in-rate", G_TYPE_INT
- Average input buffering speed in bytes/second. -1 is unknown.
This is the average number of bytes per second that is received on the
buffering element input (sink) pads. It is a measurement of the network
speed in most cases.
"avg-out-rate", G_TYPE_INT
- Average consumption speed in bytes/second. -1 is unknown.
This is the average number of bytes per second that is consumed by the
downstream element of the buffering element.
"buffering-left", G_TYPE_INT64
- Estimated time that buffering will take in milliseconds. -1 is unknown.
This is measured based on the avg-in-rate and the filled level of the
queue. The application can use this hint to update the GUI about the
estimated remaining time that buffering will take.
Application
~~~~~~~~~~~
While data is buffered the pipeline should remain in the PAUSED state. It is
also possible that more data should be buffered while the pipeline is PLAYING,
in which case the pipeline should be PAUSED until the buffering finishes.
BUFFERING messages can be posted while the pipeline is prerolling. The
application should not set the pipeline to PLAYING before a BUFFERING message
with a 100 percent value is received, which might only happen after the pipeline
prerolls.
An exception is made for live pipelines. The application may not change
the state of a live pipeline when a buffering message is received. Usually these
buffering messages contain the "buffering-mode" = "live".
The buffering message can also instruct the application to switch to a
periodical BUFFERING query instead, so it can more precisely control the
buffering process. The application can, for example, choose not to act on the
BUFFERING complete message (buffer-percent = 100) to resume playback but use
the estimated download time instead, resuming playback when it has determined
that it should be able to provide uninterrupted playback.
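As an illustration, a minimal bus callback implementing this behaviour could
look as follows (a sketch only; "pipeline" and "is_live" are assumed
application state, with is_live typically derived from a
GST_STATE_CHANGE_NO_PREROLL return of gst_element_set_state()):

  /* sketch: connected to the "message::buffering" signal of the bus */
  static void
  on_buffering (GstBus * bus, GstMessage * msg, gpointer user_data)
  {
    gint percent;

    gst_message_parse_buffering (msg, &percent);

    /* never change the state of a live pipeline here */
    if (is_live)
      return;

    if (percent < 100)
      gst_element_set_state (pipeline, GST_STATE_PAUSED);
    else
      gst_element_set_state (pipeline, GST_STATE_PLAYING);
  }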
Buffering Query
~~~~~~~~~~~~~~~
In addition to the BUFFERING messages posted by the buffering elements, we want
to be able to query the same information from the application. We also want to
be able to present the user with information about the downloaded range in the
file so that the GUI can react on it.
In addition to all the fields present in the buffering message, the BUFFERING
query contains the following fields, which indicate whether buffering is busy,
the available downloaded range in a specific format and the estimated time to
completion:
"busy", G_TYPE_BOOLEAN
- if buffering was busy. This flag allows the application to pause the
pipeline by using the query only.
"format", GST_TYPE_FORMAT
- the format of the "start" and "stop" values below
"start", G_TYPE_INT64, -1 unknown
- the start position of the available data. If there are multiple ranges,
this field contains the start position of the currently downloading
range.
"stop", G_TYPE_INT64, -1 unknown
- the stop position of the available data. If there are multiple ranges,
this field contains the stop position of the currently downloading
range.
"estimated-total", G_TYPE_INT64
- gives the estimated download time in milliseconds. -1 unknown.
When the size of the downloaded file is known, this value will contain
the latest estimate of the remaining download time of the currently
downloading range. This value is usually only filled for the "download"
buffering mode. The application can use this information to estimate the
amount of remaining time to download till the end of the file.
"buffering-ranges", G_TYPE_ARRAY of GstQueryBufferingRange
- contains optionally the downloaded areas in the format given above. One
of the ranges contains the same start/stop position as above.
  typedef struct {
    gint64 start;
    gint64 stop;
  } GstQueryBufferingRange;
For the "download" and "timeshift" buffering-modes, the start and stop positions
specify the ranges where efficient seeking in the downloaded media is possible.
Seeking outside of these ranges might be slow or not at all possible.
For the "stream" and "live" mode the start and stop values describe the oldest
and newest item (expressed in "format") in the buffer.
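For example, the application could periodically issue the query like this (a
sketch; "pipeline" is the top-level element and error handling is omitted):

  /* sketch of a periodic BUFFERING query in time format */
  GstQuery *query;
  gboolean busy;
  gint percent;
  gint64 estimated_total;

  query = gst_query_new_buffering (GST_FORMAT_TIME);
  if (gst_element_query (pipeline, query)) {
    gst_query_parse_buffering_percent (query, &busy, &percent);
    gst_query_parse_buffering_range (query, NULL, NULL, NULL,
        &estimated_total);
    /* busy, percent and estimated_total can now drive the buffering logic */
  }
  gst_query_unref (query);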
Defaults
~~~~~~~~
Some defaults for common elements:
A GstBaseSrc with random access replies to the BUFFERING query with:
"buffer-percent" = 100
"buffering-mode" = "stream"
"avg-in-rate" = -1
"avg-out-rate" = -1
"buffering-left" = 0
"format" = GST_FORMAT_BYTES
"start" = 0
"stop" = the total filesize
"estimated-total" = 0
"buffering-ranges" = NULL
A GstBaseSrc in push mode replies to the BUFFERING query with:
"buffer-percent" = 100
"buffering-mode" = "stream"
"avg-in-rate" = -1
"avg-out-rate" = -1
"buffering-left" = 0
"format" = a valid GST_TYPE_FORMAT
"start" = current position
"stop" = current position
"estimated-total" = -1
"buffering-ranges" = NULL
Buffering strategies
~~~~~~~~~~~~~~~~~~~~
Buffering strategies are specific implementations based on the buffering
message and query described above.
Most strategies have to balance buffering time versus maximal playback
experience.
* simple buffering
NON-live pipelines are kept in the paused state while buffering messages with
a percent < 100% are received.
This buffering strategy relies on the buffer size and low/high watermarks of
the element. It can work with a fixed size buffer in memory or on disk.
The size of the buffer is usually expressed in a fixed amount of time units
and the estimated bitrate of the upstream source is used to convert this time
to bytes.
All GStreamer applications must implement this strategy. Failure to do so
will result in starvation at the sink.
* no-rebuffer strategy
This strategy tries to buffer as much data as possible so that playback can
continue without any further rebuffering.
This strategy is initially similar to simple buffering, the difference is in
deciding on the condition to continue playback. When a 100% buffering message
has been received, the application will not yet start the playback but it will
start a periodic buffering query, which will return the estimated amount of
buffering time left. When the estimated time left is less than the remaining
playback time, playback can continue.
This strategy requires an unlimited buffer size in memory or on disk, such as
provided by elements that implement the incremental download buffering mode.
Usually, the application can choose to resume playback even before the
remaining buffer time has elapsed, trading a quicker start for a possible
rebuffering phase (see the sketch after this list).
* Incremental rebuffering
The application implements the simple buffering strategy but with each
rebuffering phase, it increases the size of the buffer.
This strategy has quick, fixed time startup times but incrementally longer
rebuffering times if the network is slower than the media bitrate.
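A minimal sketch of the no-rebuffer decision described above, assuming it runs
from a periodic timeout on a non-live "pipeline" with a known duration:

  /* compare the estimated remaining download time with the remaining
   * playback time and resume when the download can keep ahead */
  gint64 position, duration, estimated_total = -1;
  GstQuery *query;

  gst_element_query_position (pipeline, GST_FORMAT_TIME, &position);
  gst_element_query_duration (pipeline, GST_FORMAT_TIME, &duration);

  query = gst_query_new_buffering (GST_FORMAT_TIME);
  if (gst_element_query (pipeline, query))
    gst_query_parse_buffering_range (query, NULL, NULL, NULL,
        &estimated_total);
  gst_query_unref (query);

  if (estimated_total != -1 &&
      estimated_total < GST_TIME_AS_MSECONDS (duration - position))
    gst_element_set_state (pipeline, GST_STATE_PLAYING);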
Bufferpool
----------
This document details the design of how buffers are allocated and
managed in pools.
Bufferpools increase performance by reducing allocation overhead and
improving possibilities to implement zero-copy memory transfer.
Together with the ALLOCATION query, elements can negotiate allocation properties
and bufferpools between themselves. This also allows elements to negotiate
buffer metadata between themselves.
Requirements
------------
- Provide a GstBufferPool base class to help the efficient implementation of a
list of reusable GstBuffer objects.
- Let upstream elements initiate the negotiation of a bufferpool and its
configuration. Allow downstream elements to provide bufferpool properties
and/or a bufferpool. This includes the following properties:
* a minimum and maximum number of buffers, with the option of
preallocating buffers.
* allocator, alignment and padding support
* buffer metadata
* arbitrary extra options
- Integrate with dynamic caps renegotiation.
- Notify upstream element of new bufferpool availability. This is important
when a new element, that can provide a bufferpool, is dynamically linked
downstream.
GstBufferPool
-------------
The bufferpool object manages a list of buffers with the same properties such
as size, padding and alignment.
The bufferpool has two states: active and inactive. In the inactive
state, the bufferpool can be configured with the required allocation
preferences. In the active state, buffers can be retrieved from and
returned to the pool.
The default implementation of the bufferpool is able to allocate buffers
from any allocator with arbitrary alignment and padding/prefix.
Custom implementations of the bufferpool can override the allocation and
free algorithms of the buffers from the pool. This should allow for
different allocation strategies such as using shared memory or hardware
mapped memory.
Negotiation
-----------
After a particular media format has been negotiated between two pads (using the
CAPS event), they must agree on how to allocate buffers.
The srcpad will always take the initiative to negotiate the allocation
properties. It starts with creating a GST_QUERY_ALLOCATION with the negotiated
caps.
The srcpad can set the need-pool flag to TRUE in the query to optionally make the
peer pad allocate a bufferpool. It should only do this if it is able to use
the peer provided bufferpool.
It will then inspect the returned results and configure the returned pool or
create a new pool with the returned properties when needed.
Buffers are then allocated by the srcpad from the negotiated pool and pushed to
the peer pad as usual.
The allocation query can also return an allocator object when the buffers are of
different sizes and can't be allocated from a pool.
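A condensed sketch of this srcpad-side negotiation, assuming fixed "caps" and
hypothetical default values for size/min/max when the peer returns no pool
(error handling omitted):

  GstQuery *query;
  GstBufferPool *pool = NULL;
  GstStructure *config;
  guint size = 4096, min = 2, max = 0;  /* hypothetical defaults */

  query = gst_query_new_allocation (caps, TRUE);
  gst_pad_peer_query (srcpad, query);

  if (gst_query_get_n_allocation_pools (query) > 0)
    gst_query_parse_nth_allocation_pool (query, 0, &pool, &size, &min, &max);
  gst_query_unref (query);

  if (pool == NULL)
    pool = gst_buffer_pool_new ();      /* make our own pool */

  config = gst_buffer_pool_get_config (pool);
  gst_buffer_pool_config_set_params (config, caps, size, min, max);
  gst_buffer_pool_set_config (pool, config);
  gst_buffer_pool_set_active (pool, TRUE);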
Allocation query
----------------
The allocation query has the following fields:
(in) "caps", GST_TYPE_CAPS
- the caps that was negotiated
(in) "need-pool", G_TYPE_BOOLEAN
- if a GstBufferPool is requested
(out) "pool", G_TYPE_ARRAY of structure
- an array of pool configurations.
struct {
GstBufferPool *pool;
guint size;
guint min_buffers;
guint max_buffers;
}
Use gst_query_parse_nth_allocation_pool() to get the values.
The query can return multiple pool configurations. If need-pool
was TRUE, the pool member might contain a GstBufferPool when the
downstream element can provide one.
size contains the size of the bufferpool's buffers and is never 0.
min_buffers and max_buffers contain the suggested min and max number of
buffers that should be managed by the pool.
The upstream element can choose to use the provided pool or make its own
pool when none was provided or when the suggested pool was not
acceptable.
The pool can then be configured with the suggested min and max amount of
buffers or a downstream element might choose different values.
(out) "allocator", G_TYPE_ARRAY of structure
- an array of allocator parameters that can be used.
struct {
GstAllocator *allocator;
GstAllocationParams params;
}
Use gst_query_parse_nth_allocation_param() to get the values.
The element performing the query can use the allocators and its
parameters to allocate memory for the downstream element.
It is also possible to configure the allocator in a provided pool.
(out) "metadata", G_TYPE_ARRAY of structure
- an array of metadata params that can be accepted.
struct {
GType api;
GstStructure *params;
}
Use gst_query_parse_nth_allocation_meta() to get the values.
These metadata items can be accepted by the downstream element when
placed on buffers. There is also an arbitrary GstStructure associated
with the metadata that contains metadata-specific options.
Some bufferpools have options to enable metadata on the buffers
allocated by the pool.
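For example, a video sink could advertise support for the video meta in its
ALLOCATION query handler like this (a sketch; GST_VIDEO_META_API_TYPE comes
from the gst-plugins-base video library):

  /* in the sink's GST_QUERY_ALLOCATION handler */
  gst_query_add_allocation_meta (query, GST_VIDEO_META_API_TYPE, NULL);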
Allocating from pool
--------------------
Buffers are allocated from the pool of a pad:
res = gst_buffer_pool_acquire_buffer (pool, &buffer, &params);
A GstBuffer that is allocated from the pool will always be writable (have a
refcount of 1) and it will also have its pool member point to the GstBufferPool
that created the buffer.
Buffers are refcounted in the usual way. When the refcount of the buffer
reaches 0, the buffer is automatically returned to the pool.
Since all the buffers allocated from the pool keep a reference to the pool,
when nothing else is holding a refcount to the pool, it will be finalized
when all the buffers from the pool are unreffed. By setting the pool to
the inactive state we can drain all buffers from the pool.
When the pool is in the inactive state, gst_buffer_pool_acquire_buffer() will
return GST_FLOW_FLUSHING immediately.
Extra parameters can be given to the gst_buffer_pool_acquire_buffer() method to
influence the allocation decision. GST_BUFFER_POOL_FLAG_KEY_UNIT and
GST_BUFFER_POOL_FLAG_DISCONT serve as hints.
When the bufferpool is configured with a maximum number of buffers, allocation
will block when all buffers are outstanding until a buffer is returned to the
pool. This behaviour can be changed by specifying the
GST_BUFFER_POOL_FLAG_DONTWAIT flag in the parameters. With this flag set,
allocation will return GST_FLOW_EOS when the pool is empty.
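In the GStreamer 1.x API these hints are passed through
GstBufferPoolAcquireParams, where the flags are named
GST_BUFFER_POOL_ACQUIRE_FLAG_*; a sketch of a non-blocking acquire:

  GstBufferPoolAcquireParams params = { 0, };
  GstBuffer *buffer;
  GstFlowReturn ret;

  params.flags = GST_BUFFER_POOL_ACQUIRE_FLAG_DONTWAIT;
  ret = gst_buffer_pool_acquire_buffer (pool, &buffer, &params);
  if (ret == GST_FLOW_EOS) {
    /* pool is empty and we chose not to wait */
  } else if (ret == GST_FLOW_OK) {
    /* ... fill and push the buffer ... */
    gst_buffer_unref (buffer);  /* last unref returns it to the pool */
  }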
Renegotiation
-------------
Renegotiation of the bufferpool might need to be performed when the
configuration of the pool changes. Changes can be in the buffer size (because
of a caps change), alignment or number of buffers.
* downstream
When the upstream element wants to negotiate a new format, it might need
to renegotiate a new bufferpool configuration with the downstream element.
This can, for example, happen when the buffer size changes.
We can not just reconfigure the existing bufferpool because there might
still be outstanding buffers from the pool in the pipeline. Therefore we
need to create a new bufferpool for the new configuration while we let the
old pool drain.
Implementations can choose to reuse the same bufferpool object and wait for
the drain to finish before reconfiguring the pool.
The element that wants to renegotiate a new bufferpool uses exactly the same
algorithm as when it first started: it will negotiate caps first and then use
the ALLOCATION query to get and configure the new pool.
* upstream
When a downstream element wants to negotiate a new format, it will send a
RECONFIGURE event upstream. This instructs upstream to renegotiate both
the format and the bufferpool when needed.
A pipeline reconfiguration happens when new elements are added or removed from
the pipeline or when the topology of the pipeline changes. Pipeline
reconfiguration also triggers possible renegotiation of the bufferpool and
caps.
A RECONFIGURE event tags each pad it travels on as needing reconfiguration.
The next buffer allocation will then require the renegotiation or
reconfiguration of a pool.
Shutting down
-------------
In push mode, a source pad is responsible for setting the pool to the
inactive state when streaming stops. The inactive state will unblock any pending
allocations so that the element can shut down.
In pull mode, the sink element should set the pool to the inactive state when
shutting down so that the peer _get_range() function can unblock.
In the inactive state, all the buffers that are returned to the pool will
automatically be freed by the pool and new allocations will fail.
Use cases
---------
1) videotestsrc ! xvimagesink
Before videotestsrc can output a buffer, it needs to negotiate caps and
a bufferpool with the downstream peer pad.
First it will negotiate a suitable format with downstream according to the
normal rules. It will send a CAPS event downstream with the negotiated
configuration.
Then it does an ALLOCATION query. It will use the returned bufferpool or
configures its own bufferpool with the returned parameters. The bufferpool is
initially in the inactive state.
The ALLOCATION query lists the desired configuration of the downstream
xvimagesink, which can have specific alignment and/or min/max amount of
buffers.
videotestsrc then updates the configuration of the bufferpool: it will
likely set the min buffers to 1 and the desired buffer size. Finally it
sets the updated configuration on the bufferpool.
When the configuration is successfully updated, videotestsrc sets the
bufferpool to the active state. This preallocates the buffers in the pool
(if needed). This operation can fail when there is not enough memory
available. Since the bufferpool is provided by xvimagesink, it will allocate
buffers backed by an XvImage and pointing to shared memory with the X server.
If the bufferpool is successfully activated, videotestsrc can acquire a
buffer from the pool, fill in the data and push it out to xvimagesink.
xvimagesink can know that the buffer originated from its pool by following
the pool member.
When shutting down, videotestsrc will set the pool to the inactive state.
This will cause further allocations to fail and currently allocated buffers
to be freed. videotestsrc will then free the pool and stop streaming.
2) videotestsrc ! queue ! myvideosink
In this second use case we have a videosink that can at most allocate
3 video buffers.
Again videotestsrc will have to negotiate a bufferpool with the peer
element. For this it will perform the ALLOCATION query which
queue will proxy to its downstream peer element.
The bufferpool returned from myvideosink will have a max_buffers set to 3.
queue and videotestsrc can operate with this upper limit because none of
those elements require more than that amount of buffers for temporary
storage.
Myvideosink's bufferpool will then be configured with the size of the
buffers for the negotiated format and according to the padding and alignment
rules. When videotestsrc sets the pool to active, the 3 video
buffers will be preallocated in the pool.
videotestsrc acquires a buffer from the configured pool on its srcpad and
pushes this into the queue. When videotestsrc has acquired and pushed
3 frames, the next call to gst_buffer_pool_acquire_buffer() will block
(assuming the GST_BUFFER_POOL_FLAG_DONTWAIT is not specified).
When the queue has pushed out a buffer and the sink has rendered it, the
refcount of the buffer reaches 0 and the buffer is recycled in the pool.
This will wake up the videotestsrc that was blocked, waiting for more
buffers and will make it produce the next buffer.
In this setup, there are at most 3 buffers active in the pipeline and
the videotestsrc is rate limited by the rate at which buffers are recycled
in the bufferpool.
When shutting down, videotestsrc will first set the bufferpool on the srcpad
to inactive. This causes any pending (blocked) acquire to return with a
FLUSHING result and causes the streaming thread to pause.
3) .. ! myvideodecoder ! queue ! fakesink
In this case, the myvideodecoder requires buffers to be aligned to 128
bytes and padded with 4096 bytes. The pipeline starts out with the
decoder linked to a fakesink but we will then dynamically change the
sink to one that can provide a bufferpool.
When myvideodecoder negotiates the size with the downstream fakesink element, it will
receive a NULL bufferpool because fakesink does not provide a bufferpool.
It will then select its own custom bufferpool to start the data transfer.
At some point we block the queue srcpad, unlink the queue from the
fakesink, link a new sink and set the new sink to the PLAYING state.
Linking the new sink would automatically send a RECONFIGURE event upstream
and, through queue, inform myvideodecoder that it should renegotiate its
bufferpool because downstream has been reconfigured.
Before pushing the next buffer, myvideodecoder has to renegotiate a new
bufferpool. To do this, it performs the usual bufferpool negotiation
algorithm. If it can obtain and configure a new bufferpool from downstream,
it sets its own (old) pool to inactive and unrefs it. This will eventually
drain and unref the old bufferpool.
The new bufferpool is set as the new bufferpool for the srcpad and sinkpad
of the queue and set to the active state.
4) .. ! myvideodecoder ! queue ! myvideosink
myvideodecoder has negotiated a bufferpool with the downstream myvideosink
to handle buffers of size 320x240. It has now detected a change in the
video format and needs to renegotiate to a resolution of 640x480. This
requires it to negotiate a new bufferpool with a larger buffer size.
When myvideodecoder needs to get the bigger buffer, it starts the
negotiation of a new bufferpool. It queries a bufferpool from downstream,
reconfigures it with the new configuration (which includes the bigger buffer
size) and sets the bufferpool to active. The old pool is deactivated
and unreffed, which causes the old pool to drain.
It then uses the new bufferpool for allocating new buffers of the new
dimension.
If at some point, the decoder wants to switch to a lower resolution again,
it can choose to use the current pool (which has buffers that are larger
than the required size) or it can choose to renegotiate a new bufferpool.
5) .. ! myvideodecoder ! videoscale ! myvideosink
myvideosink is providing a bufferpool for upstream elements and wants to
change the resolution.
myvideosink sends a RECONFIGURE event upstream to notify upstream that a
new format is desirable. Upstream elements try to negotiate a new format
and bufferpool before pushing out a new buffer. The old bufferpools are
drained in the regular way.
Caps
----
Caps are lightweight refcounted objects describing media types.
They are composed of an array of GstStructures plus, optionally,
a GstCapsFeatures set for the GstStructure.
Caps are exposed on GstPadTemplates to describe all possible types a
given pad can handle. They are also stored in the registry along with
a description of the element.
Caps are exposed on the element pads via the CAPS and ACCEPT_CAPS queries.
These queries describe the possible types that the pad can handle or
produce (see part-pads.txt and part-negotiation.txt).
Various methods exist to work with the media types such as subtracting
or intersecting.
Operations
~~~~~~~~~~
Fixating
--------
Caps are fixed if they only contain a single structure and this
structure is fixed. A structure is fixed if none of the fields of the
structure is an unfixed type, for example a range, list or array.
For fixating caps only the first structure is kept as the order of
structures is meant to express the preferences for the different
structures. Afterwards, each unfixed field of this structure is set
to the value that makes most sense for the media format by the element
or pad implementation and then every remaining unfixed field is set to
an arbitrary value that is a subset of the unfixed field's values.
EMPTY caps are fixed caps and ANY caps are not. Caps with ANY caps features
are not fixed.
Subset
------
One caps "A" is a subset of another caps "B" if for each structure in
"A" there exists a structure in "B" that is a superset of the structure
in "A".
A structure "a" is the subset of a structure "b" if it has the same
structure name, the same caps features and each field in "b" exists
in "a" and the value of the field in "a" is a subset of the value of
the field in "b". "a" can have additional fields that are not in "b".
EMPTY caps are a subset of every other caps. Every caps are a subset of
ANY caps.
Equality
--------
Caps "A" and "B" are equal if "A" is a subset of "B" and "B" is a subset
of "A". This means that both caps are expressing the same possibilities
but their structures can still be different if they contain unfixed
fields.
Intersection
------------
The intersection of caps "A" and caps "B" are the caps that contain the
intersection of all their structures with each other.
The intersection of structure "a" and structure "b" is empty if their
structure name or their caps features are not equal, or if "a" and "b"
contain the same field but the intersection of both field values is empty.
If one structure contains a field that does not exist in the other
structure, it will be copied over to the intersection with the same
value.
The intersection with ANY caps is always the other caps and the intersection
with EMPTY caps is always EMPTY.
Union
-----
The union of caps "A" and caps "B" are the caps that contain the union
of all their structures with each other.
The union of structure "a" and structure "b" are the two structures "a"
and "b" if the structure names or caps features are not equal. Otherwise,
the union is the structure that contains the union of each fields value.
If a field is only in one of the two structures it is not contained in
the union.
The union with ANY caps is always ANY and the union with EMPTY caps is
always the other caps.
Subtraction
-----------
The subtraction of caps "A" from caps "B" is the most generic subset
of "B" that has an empty intersection with "A" but only contains
structures with names and caps features that are existing in "B".
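A small illustration of these operations (a sketch; the caps strings are
arbitrary examples):

  GstCaps *a, *b, *i, *f;

  a = gst_caps_from_string ("video/x-raw, width=(int)[ 16, 4096 ]");
  b = gst_caps_from_string ("video/x-raw, width=(int)640, height=(int)480");

  gst_caps_is_subset (b, a);      /* TRUE: 640 is in the range and the
                                   * extra height field is allowed */
  i = gst_caps_intersect (a, b);  /* video/x-raw, width=640, height=480 */
  f = gst_caps_fixate (gst_caps_ref (i));  /* already fixed here */

  gst_caps_unref (a);
  gst_caps_unref (b);
  gst_caps_unref (i);
  gst_caps_unref (f);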
Basic Rules
~~~~~~~~~~~
Semantics of caps and their usage
---------------------------------
A caps can contain multiple structures, in which case any of the
structures would be acceptable. The structures are in the preferred
order of the creator of the caps, with the most preferred structure
first; during negotiation this order should be considered in order to
select the most optimal structure.
Each of these structures has a name that specifies the media type, e.g.
"video/x-theora" to specify Theora video. Additional fields in the
structure add additional constraints and/or information about the media
type, like the width and height of a video frame, or the codec profile
that is used. These fields can be non-fixed (e.g. ranges) for non-fixed
caps but must be fixated to a fixed value during negotiation.
If a field is included in the caps returned by a pad via the CAPS query,
it imposes an additional constraint during negotiation. The caps in the
end must have this field with a value that is a subset of the non-fixed
value. Additional fields that are added in the negotiated caps give
additional information about the media but are treated as optional.
Information that can change for every buffer and is not relevant during
negotiation must not be stored inside the caps.
For each of the structures in caps it is possible to store caps
features. The caps features are expressing additional requirements
for a specific structure, and only structures with the same name _and_
equal caps features are considered compatible.
Caps features can be used to require a specific memory representation
or a specific meta to be set on buffers, for example a pad could require
for a specific structure that it is passed EGLImage memory or buffers with
the video meta.
If no caps features are provided for a structure, it is assumed that
system memory is required unless later negotiation steps (e.g. the
ALLOCATION query) detect that something else can be used. The special
ANY caps features can be used to specify that any caps feature would
be accepted, for example if the buffer memory is not touched at all.
Compatibility of caps
---------------------
Pads can be linked when the caps of both pads are compatible. This is
the case when their intersection is not empty.
For checking whether a pad actually supports fixed caps, an intersection is
not enough. Instead, the fixed caps must be at least a subset of the
pad's caps; pads can introduce additional constraints, which would be
checked in the ACCEPT_CAPS query handler.
Data flow can only happen after pads have decided on common fixed caps.
These caps are distributed to both pads with the CAPS event.
Clocks
------
The GstClock returns a monotonically increasing time with the method
_get_time(). Its accuracy and base time depends on the specific clock
implementation but time is always expressed in nanoseconds. Since the
baseline of the clock is undefined, the clock time returned is not
meaningful in itself, what matters are the deltas between two clock
times.
The time reported by the clock is called the absolute_time.
Clock Selection
~~~~~~~~~~~~~~~
To synchronize the different elements, the GstPipeline is responsible for
selecting and distributing a global GstClock for all the elements in it.
This selection happens whenever the pipeline goes to PLAYING. Whenever an
element is added/removed from the pipeline, this selection will be redone in the
next state change to PLAYING. Adding an element that can provide a clock will
post a GST_MESSAGE_CLOCK_PROVIDE message on the bus to inform parent bins of the
fact that a clock recalculation is needed.
When a clock is selected, a NEW_CLOCK message is posted on the bus signaling the
clock to the application.
When the element that provided the clock is removed from the pipeline, a
CLOCK_LOST message is posted. The application must then set the pipeline to
PAUSED and PLAYING again in order to let the pipeline select a new clock
and distribute a new base time.
The clock selection is performed as part of the state change from PAUSED to
PLAYING and is described in part-states.txt.
Clock features
~~~~~~~~~~~~~~
The clock supports periodic and single shot clock notifications both
synchronous and asynchronous.
One first needs to create a GstClockID for the periodic or single shot
notification using _clock_new_single_shot_id() or _clock_new_periodic_id().
To perform a blocking wait for the specific time of the GstClockID, use
gst_clock_id_wait(). To receive a callback when the specific time is reached
in the clock, use gst_clock_id_wait_async(). Both calls can be interrupted
with gst_clock_id_unschedule(). If a blocking wait is unscheduled,
a value of GST_CLOCK_UNSCHEDULED is returned.
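For example, a blocking single shot wait could look like this (a sketch;
"element" is assumed to have a clock and error handling is omitted):

  GstClock *clock;
  GstClockID id;
  GstClockReturn ret;

  clock = gst_element_get_clock (element);
  id = gst_clock_new_single_shot_id (clock,
      gst_clock_get_time (clock) + 5 * GST_SECOND);

  ret = gst_clock_id_wait (id, NULL);   /* blocks; may return OK or EARLY */
  if (ret == GST_CLOCK_UNSCHEDULED) {
    /* somebody called gst_clock_id_unschedule() on us */
  }

  gst_clock_id_unref (id);              /* the owner must unref the id */
  gst_object_unref (clock);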
The async callbacks can happen from any thread, either provided by the
core or from a streaming thread. The application should be prepared for this.
A GstClockID that has been unscheduled cannot be used again for any wait
operation.
It is possible to perform a blocking wait on the same ID from multiple
threads. However, registering the same ID for multiple async notifications is
not possible, the callback will only be called once.
None of the wait operations unref the GstClockID, the owner is
responsible for unreffing the ids itself. This holds true for both periodic and
single shot notifications. The reason being that the owner of the ClockID
has to keep a handle to the ID to unblock the wait on FLUSHING events
or state changes and if we unref it automatically, the handle might be
invalid.
These clock operations do not operate on the stream time, so the callbacks
will also occur when not in PLAYING state as if the clock just keeps on
running. Some clocks however do not progress when the element that provided
the clock is not PLAYING.
Clock implementations
~~~~~~~~~~~~~~~~~~~~~
The GStreamer core provides a GstSystemClock based on the system time.
Asynchronous callbacks are scheduled from an internal thread.
Clock implementers are encouraged to subclass this systemclock as it
implements the async notification.
Subclasses can however override all of the important methods for sync and
async notifications to implement their own callback methods or blocking
wait operations.
Context
-------
GstContext is a container object, containing a type string and a
generic GstStructure. It is used to store and propagate context
information in a pipeline, like device handles, display server
connections and other information that should be shared between
multiple elements in a pipeline.
For sharing context objects and distributing them between application
and elements in a pipeline, there are downstream queries, upstream
queries, messages and functions to set a context on a complete pipeline.
Context types
~~~~~~~~~~~~~
Context type names should be unique and be put in appropriate namespaces,
to prevent name conflicts, e.g. "gst.egl.EGLDisplay". Only one specific
type is allowed per context type name.
Elements
~~~~~~~~
Elements that need a specific context for their operation would
do the following steps until one succeeds:
1) Check if the element already has a context of the specific type,
i.e. it was previously set via gst_element_set_context().
2) Query downstream with GST_QUERY_CONTEXT for the context and check if
downstream already has a context of the specific type
3) Query upstream with GST_QUERY_CONTEXT for the context and check if
upstream already has a context of the specific type
4) Post a GST_MESSAGE_NEED_CONTEXT message on the bus with the required
context types and afterwards check if a usable context was set now
as in 1). The message could be handled by the parent bins of the
element and the application.
5) Create a context by itself and post a GST_MESSAGE_HAVE_CONTEXT message
on the bus. (A sketch of steps 2) and 4) is shown after this list.)
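A sketch of steps 2) and 4), assuming a hypothetical "gst.egl.EGLDisplay"
context type and a "srcpad" on the element:

  GstQuery *query;
  GstContext *ctx = NULL;
  GstMessage *msg;

  /* step 2): ask downstream for an existing context */
  query = gst_query_new_context ("gst.egl.EGLDisplay");
  if (gst_pad_peer_query (srcpad, query))
    gst_query_parse_context (query, &ctx);
  gst_query_unref (query);

  if (ctx == NULL) {
    /* step 4): ask the bins/application for the context */
    msg = gst_message_new_need_context (GST_OBJECT (element),
        "gst.egl.EGLDisplay");
    gst_element_post_message (element, msg);
    /* gst_element_set_context() may now have been called on us */
  }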
Bins will propagate any context that is set on them to their child elements
via gst_element_set_context(), even to elements added after a given context
has been set.
Bins can handle the GST_MESSAGE_NEED_CONTEXT message, can filter both
messages and can also set different contexts for different pipeline parts.
Applications
~~~~~~~~~~~~
Applications can set a specific context on a pipeline or elements inside
a pipeline with gst_element_set_context().
If an element inside the pipeline needs a specific context, it will post
a GST_MESSAGE_NEED_CONTEXT message on the bus. The application can now
create a context of the requested type or pass an already existing context
to the element (or to the complete pipeline).
Whenever an element creates a context internally it will post a
GST_MESSAGE_HAVE_CONTEXT message on the bus. Bins will cache these
contexts and pass them to any future element that requests them.
Controller
----------
The controller subsystem allows automating element property changes. It works
in such a way that all parameter changes are time based and elements request
property updates at processing time.
Element view
~~~~~~~~~~~~
Elements don't need to do much. They need to:
- mark object properties that can be changed while processing with
GST_PARAM_CONTROLLABLE
- call gst_object_sync_values (self, timestamp) in the processing function
before accessing the parameters.
All ordered property types can be automated (int, double, boolean, enum). Other
property types can also be automated by using special control bindings. One can
e.g. write a control-binding that updates a text property based on timestamps.
Application view
~~~~~~~~~~~~~~~~
Applications need to set up the property automation. For that they need to create
a GstControlSource and attach it to a property using a GstControlBinding. Various
control-sources and control-bindings exist. All control sources produce control
value sequences in the form of gdouble values. The control bindings map them to
the value range and type of the bound property.
One control-source can be attached to one or more properties at the same time.
If it is attached multiple times, then each control-binding will scale and
convert the control values to the target property type and range.
One can create complex control-curves by using a GstInterpolationControlSource.
This allows the classic user editable control-curve (often seen in audio/video
editors). Another way is to use computed control curves. GstLFOControlSource can
generate various repetitive signals. Those can be made more complex by chaining
the control sources. One can attach another control-source to e.g. modulate the
frequency of the first GstLFOControlSource.
In most cases GstControlBindingDirect will be the binding to use. Other
control bindings exist to handle special cases, such as having 1-4 control
sources and combining their values into a single guint to control an
rgba-color property.
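For example, fading in a controllable "volume" property over five seconds with
an interpolation control source could look like this (a sketch; the control
values in [0.0, 1.0] are mapped onto the property range by the direct control
binding):

  GstControlSource *cs;
  GstTimedValueControlSource *tvcs;

  cs = gst_interpolation_control_source_new ();
  g_object_set (cs, "mode", GST_INTERPOLATION_MODE_LINEAR, NULL);

  gst_object_add_control_binding (GST_OBJECT (element),
      gst_direct_control_binding_new (GST_OBJECT (element), "volume", cs));

  /* set two timed control points; values in between are interpolated */
  tvcs = (GstTimedValueControlSource *) cs;
  gst_timed_value_control_source_set (tvcs, 0 * GST_SECOND, 0.0);
  gst_timed_value_control_source_set (tvcs, 5 * GST_SECOND, 1.0);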
TODO
~~~~
control-source value ranges
- control sources should ideally emit values between [0.0 and 1.0]
- right now lfo-control-sources emits values between [-1.0 and 1.0]
- we can make control-sources announce that or fix it in a lfo2-control-source
ranged-control-binding
- it might be a nice thing to have a control-binding that has scale and offset
properties
- when attaching a control-source to e.g. volume, one needs to be aware that
the values go from [0.0 to 4.0]
- we can also have a "mapping-mode"={AS_IS, TRANSFORMED} on direct-control-binding
and two extra properties that are used in TRANSFORMED mode
control-setup descriptions
- it would be nice to have a way to parse a textual control-setup description. This
could be used in gst-launch and in presets. It needs to be complemented with a
formatter (for the preset storage or e.g. for debug logging).
- this could be function-style:
direct(control-source=lfo(waveform='sine',offset=0.5))
or gst-launch style (looks weird)
lfo wave=sine offset=0.5 ! direct .control-source
Documentation conventions
-------------------------
Due to the potential for exponential growth, several abbreviating conventions will be used throughout this
documentation. These conventions have grown primarily from extremely in-depth discussions of the architecture in IRC.
This has verified the safety of these conventions, if used properly. There are no known namespace conflicts as long as
context is rigorously observed.
Object classes
~~~~~~~~~~~~~~
Since everything starts with Gst, we will generally refer to objects by the shorter name, i.e. Element or Pad. These
names will always have their first letter capitalized.
Function names
~~~~~~~~~~~~~~
Within the context of a given object, functions defined in that object's header and/or source file will have their
object-specific prefix stripped. For instance, gst_element_add_pad() would be referred to as simply _add_pad(). Note
that the trailing parentheses should always be present, but sometimes may not be. A prefixing underscore (_) will
always tell you it's a function, however, regardless of the presence or absence of the trailing parentheses.
defines and enums
~~~~~~~~~~~~~~~~~
Values and macros defined as enums and preprocessor macros will be referred to in all capitals, as per their
definition. This includes object flags and element states, as well as general enums. Examples are the states NULL,
READY, PLAYING, and PAUSED; the element flags LOCKED_STATE, and the state return values SUCCESS, FAILURE, and
ASYNC. Where there is a prefix, as in the element flags, it is usually dropped and implied. Note however that
element flags should be cross-checked with the header, as there are currently two conventions in use: with and without
_FLAGS_ in the middle.
Drawing conventions
~~~~~~~~~~~~~~~~~~~
When drawing pictures the following conventions apply:
objects
^^^^^^^
Objects are drawn with a box like:
 +------+
 |      |
 +------+
pointers
^^^^^^^^
a pointer to an object.
      +-----+
 *--->|     |
      +-----+
an invalid pointer, this is a pointer that should not be used.
*-//->
elements
^^^^^^^^
  +----------+
  |   name   |
 sink       src
  +----------+
pad links
^^^^^^^^^
 -----+    +---
      |    |
     src--sink
 -----+    +---
Dynamic pipelines
-----------------
This document describes many use cases for dynamically constructing and
manipulating a running or paused pipeline and the features provided by
GStreamer.
When constructing dynamic pipelines it is important to understand the
following features of GStreamer:
- pad blocking (part-block.txt)
- playback segments.
- streaming vs application threads.
Sink elements
-------------
Sink elements consume data and normally have no source pads.
Typical sink elements include:
- audio/video renderers
- network sinks
- filesinks
Sinks are harder to construct than other element types as they are
treated specially by the GStreamer core.
state changes
~~~~~~~~~~~~~
A sink always returns ASYNC from the state change to PAUSED, this
includes a state change from READY->PAUSED and PLAYING->PAUSED. The
reason for this is that this way we can detect when the first buffer
or event arrives in the sink when the state change completes.
A sink should block on the first EOS event or buffer received in the
READY->PAUSED state before committing the state to PAUSED.
FLUSHING events have to be handled out of sync with the buffer flow
and take no part in the preroll procedure.
Events other than EOS do not complete the preroll stage.
sink overview
~~~~~~~~~~~~~
- TODO: PREROLL_LOCK can be removed and we can safely use the STREAM_LOCK.
# Commit the state. We return TRUE if we can continue
# streaming, FALSE in the case we go to a READY or NULL state.
# if we go to PLAYING, we don't need to block on preroll.
commit
{
LOCK
switch (pending)
case PLAYING:
need_preroll = FALSE
break
case PAUSED:
break
case READY:
case NULL:
return FALSE
case VOID:
return TRUE
# update state
state = pending
next = VOID
pending = VOID
UNLOCK
return TRUE
}
# Sync an object. We have to wait for the element to reach
# the PLAYING state before we can wait on the clock.
# Some items do not need synchronisation (most events) so the
# get_times method returns FALSE (not syncable)
# need_preroll indicates that we are not in the PLAYING state
# and therefore need to commit and potentially block on preroll
# if our clock_wait got interrupted we commit and block again.
# The reason for this is that the current item being rendered is
# not yet finished and we can use that item to finish preroll.
do_sync (obj)
{
# get timing information for this object
syncable = get_times (obj, &start, &stop)
if (!syncable)
return OK;
again:
while (need_preroll)
if (need_commit)
need_commit = FALSE
if (!commit)
return FLUSHING
if (need_preroll)
# release PREROLL_LOCK and wait. prerolled can be observed
# and will be TRUE
prerolled = TRUE
PREROLL_WAIT (releasing PREROLL_LOCK)
prerolled = FALSE
if (flushing)
return FLUSHING
if (valid (start || stop))
PREROLL_UNLOCK
end_time = stop
ret = wait_clock (obj,start)
PREROLL_LOCK
if (flushing)
return FLUSHING
# if the clock was unscheduled, we redo the
# preroll
if (ret == UNSCHEDULED)
goto again
}
# render a prerollable item (EOS or buffer). It is
# always called with the PREROLL_LOCK held.
render_object (obj)
{
ret = do_sync (obj)
if (ret != OK)
return ret;
# preroll and syncing done, now we can render
render(obj)
}
| # sinks that sync on buffer contents do like this
| while (more_to_render)
| ret = render
| if (ret == interrupted)
| prerolled = TRUE
render (buffer) ----->| PREROLL_WAIT (releasing PREROLL_LOCK)
| prerolled = FALSE
| if (flushing)
| return FLUSHING
|
# queue a prerollable item (EOS or buffer). It is
# always called with the PREROLL_LOCK held.
# This function will commit the state when receiving the
# first prerollable item.
# items are then added to the rendering queue or rendered
# right away if no preroll is needed.
queue (obj, prerollable)
{
if (need_preroll)
if (prerollable)
queuelen++
# first item in the queue while we need preroll
# will complete state change and call preroll
if (queuelen == 1)
preroll (obj)
if (need_commit)
need_commit = FALSE
if (!commit)
return FLUSHING
# then see if we need more preroll items before we
# can block
if (need_preroll)
if (queuelen <= maxqueue)
queue.add (obj)
return OK
# now clear the queue and render each item before
# rendering the current item.
while (queue.hasItem)
render_object (queue.remove())
render_object (obj)
queuelen = 0
}
# various event functions
event
EOS:
# events must complete preroll too
STREAM_LOCK
PREROLL_LOCK
if (flushing)
return FALSE
ret = queue (event, TRUE)
if (ret == FLUSHING)
return FALSE
PREROLL_UNLOCK
STREAM_UNLOCK
break
SEGMENT:
# the segment must be used to clip incoming
# buffers. They then go into the queue as non-prerollable
# items used for syncing the buffers
STREAM_LOCK
PREROLL_LOCK
if (flushing)
return FALSE
set_clip
ret = queue (event, FALSE)
if (ret == FLUSHING)
return FALSE
PREROLL_UNLOCK
STREAM_UNLOCK
break
FLUSH_START:
# set flushing and unblock all that is waiting
event ----> subclasses can interrupt render
PREROLL_LOCK
flushing = TRUE
unlock_clock
PREROLL_SIGNAL
PREROLL_UNLOCK
STREAM_LOCK
lost_state
STREAM_UNLOCK
break
FLUSH_END:
# unset flushing and clear all data and eos
STREAM_LOCK
event
PREROLL_LOCK
queue.clear
queuelen = 0
flushing = FALSE
eos = FALSE
PREROLL_UNLOCK
STREAM_UNLOCK
break
# the chain function checks the buffer falls within the
# configured segment and queues the buffer for preroll and
# rendering
chain
STREAM_LOCK
PREROLL_LOCK
if (flushing)
return FLUSHING
if (clip)
queue (buffer, TRUE)
PREROLL_UNLOCK
STREAM_UNLOCK
state
switch (transition)
READY_PAUSED:
# no datapassing is going on so we always return ASYNC
ret = ASYNC
need_commit = TRUE
eos = FALSE
flushing = FALSE
need_preroll = TRUE
prerolled = FALSE
break
PAUSED_PLAYING:
# we grab the preroll lock. This we can only do if the
# chain function is either doing some clock sync, we are
# waiting for preroll or the chain function is not being called.
PREROLL_LOCK
if (prerolled || eos)
ret = OK
need_commit = FALSE
need_preroll = FALSE
if (eos)
post_eos
else
PREROLL_SIGNAL
else
need_preroll = TRUE
need_commit = TRUE
ret = ASYNC
PREROLL_UNLOCK
break
PLAYING_PAUSED:
---> subclass can interrupt render
# we grab the preroll lock. This we can only do if the
# chain function is either doing some clock sync
# or the chain function is not being called.
PREROLL_LOCK
need_preroll = TRUE
unlock_clock
if (prerolled || eos)
ret = OK
else
ret = ASYNC
PREROLL_UNLOCK
break
PAUSED_READY:
---> subclass can interrupt render
# we grab the preroll lock. Set to flushing and unlock
# everything. This should exit the chain functions and stop
# streaming.
PREROLL_LOCK
flushing = TRUE
unlock_clock
queue.clear
queuelen = 0
PREROLL_SIGNAL
ret = OK
PREROLL_UNLOCK
break
Source elements
---------------
A source element is an element that provides data to the pipeline. It
does typically not have any sink (input) pads.
Typical source elements include:
- file readers
- network elements (live or not)
- capture elements (video/audio/...)
- generators (signals/video/audio/...)
Live sources
~~~~~~~~~~~~
A source is said to be a live source when it has the following property:
* temporarily stopping reading from the source causes data to be lost.
In general when this property holds, the source also produces data at a fixed
rate. Most sources have a limit on the rate at which they can deliver data, which
might be faster or slower than the consumption rate. This property however does
not make them a live source.
Let's look at some example sources.
- file readers: you can PAUSE without losing data. There is however a limit to
how fast you can read from this source. This limit is usually much higher
than the consumption rate. In some cases it might be slower (an NFS share,
for example) in which case you might need to use some buffering
(see part-buffering.txt).
- HTTP network element: you can PAUSE without data loss. Depending on the
available network bandwidth, consumption rate might be higher than production
rate in which case buffering should be used (see part-buffering.txt).
- audio source: pausing the audio capture will lead to lost data. This source
is therefore definitely live. In addition, an audio source will produce data
at a fixed rate (the samplerate). Also, depending on the buffersize, this
source will introduce a latency (see part-latency.txt).
- udp network source: Pausing the receiving part will lead to lost data. This
source is therefore a live source. Also in a typical case the udp packets
will be received at a certain rate, which might be difficult to guess because
of network jitter. This source does not necessarily introduce latency on its
own.
- dvb source: pausing this element will lead to data loss; it is a live source
similar to a UDP source.
Source types
~~~~~~~~~~~~
A source element can operate in three ways:
- it is fully seekable, this means that random access can be performed
on it in an efficient way. (a file reader,...). This also typically
means that the source is not live.
- data can be obtained from it with a variable size. This means that
the source can give N bytes of data. An example is an audio source.
A video source always provides the same amount of data (one video
frame). Note that this is not a fully seekable source.
- it is a live source, see above.
When writing a source, one has to look at how the source can operate to
decide on the scheduling methods to implement on the source.
- fully seekable sources implement a getrange function on the source pad.
- sources that can give N bytes but cannot do seeking also implement a
getrange function but state that they cannot do random access.
- sources that are purely live sources implement a task to push out
data.
Any source that has a getrange function must also implement a push based
scheduling mode. In this mode the source starts a task that gets N bytes
and pushes them out. Whenever possible, the peer element will select the
getrange based scheduling method of the source, though.
A source with a getrange function must activate itself in the pad activate
function. This is needed because the downstream peer element will decide
and activate the source element in its state change function before the
source's state change function is called.
Source base classes
~~~~~~~~~~~~~~~~~~~
GstBaseSrc:
This base class provides an implementation of a random access source and
is very well suited for file reader like sources.
GstPushSrc:
Base class for block-based sources. This class is mostly useful for
elements that cannot do random access, or can do so only very slowly. The
source usually prefers to push out a fixed size buffer.
Classes extending this base class will usually be scheduled in a push
based mode. If the peer accepts to operate without offsets and within
the limits of the allowed block size, this class can operate in getrange
based mode automatically.
The subclass should extend the methods from the baseclass in
addition to the create method. If the source is seekable, it
needs to override GstBaseSrc::event() in addition to
GstBaseSrc::is_seekable() in order to retrieve the seek offset,
which is the offset of the next buffer to be requested.
Flushing, scheduling and sync is all handled by this base class.
Timestamps
~~~~~~~~~~
A non-live source should timestamp the buffers it produces starting from 0. If
it is not possible to timestamp every buffer (filesrc), the source is allowed to
only timestamp the first buffer (as 0).
Live sources only produce data in the PLAYING state, when the clock is running.
They should timestamp each buffer they produce with the current running_time of
the pipeline, which is expressed as:
absolute_time - base_time
where absolute_time is the time obtained from the global pipeline clock with
gst_clock_get_time() and base_time is the time of that clock when the
pipeline was last set to PLAYING.
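A sketch of this timestamping, as it could appear in a live source's create
function ("src" and "buffer" are assumptions; the timestamp field is called
PTS in the 1.x API):

  /* timestamp the buffer with the current running time */
  GstClock *clock;
  GstClockTime now, base_time;

  clock = gst_element_get_clock (GST_ELEMENT (src));
  base_time = gst_element_get_base_time (GST_ELEMENT (src));
  now = gst_clock_get_time (clock);
  gst_object_unref (clock);

  GST_BUFFER_PTS (buffer) = now - base_time;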
Transform elements
------------------
Transform elements transform input buffers to output buffers based
on the sink and source caps.
An important requirement for a transform is that the output caps are completely
defined by the input caps and vice versa. This means that a typical decoder
element can NOT be implemented with a transform element, because output
properties such as the width and height of the decompressed video frame
are encoded in the stream and thus not defined by the input caps.
Typical transform elements include:
- audio convertors (audioconvert, audioresample,...)
- video convertors (colorspace, videoscale, ...)
- filters (capsfilter, volume, colorbalance, ...)
The implementation of the transform element has to take care of
the following things:
- efficient negotiation both up and downstream
- efficient buffer alloc and other buffer management
Some transform elements can operate in different modes:
- passthrough (no changes are done on the input buffers)
- in-place (changes made directly to the incoming buffers without requiring a
copy or new buffer allocation)
- metadata changes only
Depending on the mode of operation the buffer allocation strategy might change.
The transform element should at any point be able to renegotiate sink and src
caps as well as change the operation mode.
In addition, the transform element will typically take care of the following
things as well:
- flushing, seeking
- state changes
- timestamping, this is typically done by copying the input timestamps to the
output buffers but subclasses should be able to override this.
- QoS, avoiding calls to the subclass transform function
- handle scheduling issues such as push and pull based operation.
In the next sections, we will describe the behaviour of the transform element in
each of the above use cases. We focus mostly on the buffer allocation strategies
and caps negotiation.
Processing
~~~~~~~~~~
A transform has 2 main processing functions:
- transform():
Transform the input buffer to the output buffer. The output buffer is
guaranteed to be writable and different from the input buffer.
- transform_ip():
Transform the input buffer in-place. The input buffer is writable and
at least as large as the output buffer.
A transform can operate in the following modes:
- passthrough:
The element will not make changes to the buffers, buffers are pushed straight
through, caps on both sides need to be the same. The element can optionally
implement a transform_ip() function to take a look at the data, the buffer
does not have to be writable.
- in-place:
Changes can be made to the input buffer directly to obtain the output buffer.
The transform must implement a transform_ip() function.
- copy-transform
The transform is performed by copying and transforming the input buffer to a
new output buffer. The transform must implement a transform() function.
When no transform() function is provided, only in-place and passthrough
operation are allowed; this means that source and destination caps must be
equal, or that the source buffer size must be bigger than or equal to the
destination buffer size.
When no transform_ip() function is provided, only passthrough and
copy-transforms are supported. Providing this function is an optimisation that
can avoid a buffer copy.
When no functions are provided, we can only process in passthrough mode.
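In terms of GstBaseTransform, a subclass selects between these modes through
the functions it installs in class_init; a sketch with hypothetical names:

  static void
  my_filter_class_init (MyFilterClass * klass)
  {
    GstBaseTransformClass *trans = GST_BASE_TRANSFORM_CLASS (klass);

    /* in-place filter: only transform_ip() is provided */
    trans->transform_ip = GST_DEBUG_FUNCPTR (my_filter_transform_ip);
    /* equal caps on both pads short-circuit into passthrough */
    trans->passthrough_on_same_caps = TRUE;
  }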
Negotiation
~~~~~~~~~~~
Typical (re)negotiation of the transform element in push mode always goes from
sink to src and triggers the following sequence:
- the sinkpad receives a new caps event.
- the transform function figures out what it can convert these caps to.
- try to see if we can configure the caps unmodified on the peer. We need to
do this because we prefer to not do anything.
- the transform configures itself to transform from the new sink caps to the
target src caps
- the transform processes and sets the output caps on the src pad
We call this downstream negotiation (DN) and it goes roughly like this:
     sinkpad             transform               srcpad
 CAPS event |                   |                       |
 ---------->| find_transform()  |                       |
            |------------------>|                       |
            |                   |       CAPS event      |
            |                   |---------------------->|
            | <configure caps> <|                       |
These steps configure the element for a transformation from the input caps to
the output caps.
The transform has 3 functions to perform the negotiation:
- transform_caps():
Transform the caps on a certain pad to all the possible supported caps on
the other pad. The input caps are guaranteed to be a simple caps with just
one structure. The caps do not have to be fixed.
- fixate_caps():
Given a caps on one pad, fixate the caps on the other pad. The target caps
are writable.
- set_caps():
Configure the transform for a transformation between src caps and dest
caps. Both caps are guaranteed to be fixed caps.
If no transform_caps() is defined, we can only perform the identity transform,
by default.
If no set_caps() is defined, we don't care about caps. In that case we also
assume nothing is going to write to the buffer and we don't enforce a writable
buffer for the transform_ip function, when present.
One common function that we need for the transform element is to find the best
transform from one format (src) to another (dest). Some requirements of this
function are:
- has a fixed src caps
- finds a fixed dest caps that the transform element can transform to
- the dest caps are compatible and can be accepted by peer elements
- the transform function prefers to make src caps == dest caps
- the transform function can optionally fixate dest caps.
The find_transform() function goes like this:
- start from the src caps; these caps are fixed.
- check if the caps are acceptable for us as src caps. This is usually
enforced by the padtemplate of the element.
- calculate all caps we can transform to with transform_caps()
- if the original caps are a subset of the transforms, try to see if
the caps are acceptable for the peer. If this is possible, we can
perform passthrough and make src == dest. This is performed by simply
calling gst_pad_peer_accept_caps().
- if the caps are not fixed, we need to fixate them; start by taking the
peer caps and intersecting with them.
- for each of the transformed caps retrieved with transform_caps():
- try to fixate the caps with fixate_caps()
- if the caps are fixated, check if the peer accepts them with
_peer_accept_caps(), if the peer accepts, we have found a dest caps.
- if we run out of caps, we fail to find a transform.
- if we found a destination caps, configure the transform with set_caps().
After this negotiation process, the transform element is usually in a steady
state. We can identify these steady states:
- src and sink pads both have the same caps. Note that when the caps are equal
on both pads, the input and output buffers automatically have the same size.
The element can operate on the buffers in the following ways: (Same caps, SC)
- passthrough: buffers are inspected but no metadata or buffer data
is changed. The input buffers don't need to be writable. The input
buffer is simply pushed out again without modifications. (SCP)
sinkpad transform srcpad
chain() | | |
------------>| handle_buffer() | |
|------------------->| pad_push() |
| |--------------------->|
| | |
- in-place: buffers are modified in-place, this means that the input
buffer is modified to produce a new output buffer. This requires the
input buffer to be writable. If the input buffer is not writable, a new
buffer has to be allocated from the bufferpool. (SCI)
sinkpad transform srcpad
chain() | | |
------------>| handle_buffer() | |
|------------------->| |
| | [!writable] |
| | alloc buffer |
| .-| |
| <transform_ip> | | |
| '>| |
| | pad_push() |
| |--------------------->|
| | |
- copy transform: a new output buffer is allocated from the bufferpool
and data from the input buffer is transformed into the output buffer.
(SCC)
sinkpad transform srcpad
chain() | | |
------------>| handle_buffer() | |
|------------------->| |
| | alloc buffer |
| .-| |
| <transform> | | |
| '>| |
| | pad_push() |
| |--------------------->|
| | |
- src and sink pads have different caps. The element can operate on the
buffers in the following way: (Different Caps, DC)
- in-place: input buffers are modified in-place. This means that the input
buffer has a size that is larger or equal to the output size. The input
buffer will be resized to the size of the output buffer. If the input
buffer is not writable or the output size is bigger than the input size,
we need to pad-alloc a new buffer. (DCI)
sinkpad transform srcpad
chain() | | |
------------>| handle_buffer() | |
|------------------->| |
| | [!writable || !size] |
| | alloc buffer |
| .-| |
| <transform_ip> | | |
| '>| |
| | pad_push() |
| |--------------------->|
| | |
- copy transform: a new output buffer is allocated and the data from the
input buffer is transformed into the output buffer. The flow is exactly
the same as the case with the same-caps negotiation. (DCC)
We can immediately observe that the copy transform states will need to
allocate a new buffer from the bufferpool. When the transform element is
receiving a non-writable buffer in the in-place state, it will also
need to perform an allocation. There is no reason why the passthrough state would
perform an allocation.
This steady state changes when one of the following actions occur:
- the sink pad receives new caps, this triggers the above downstream
renegotiation process, see above for the flow.
- the transform element wants to renegotiate (because of changed properties,
for example). This essentially clears the current steady state and
triggers the downstream and upstream renegotiation process. This situation
also happens when a RECONFIGURE event was received on the transform srcpad.
Allocation
~~~~~~~~~~
After the transform element is configured with caps, a bufferpool needs to be
negotiated to perform the allocation of buffers. We have 2 cases:
- The element is operating in passthrough; we don't need to allocate a buffer
in the transform element.
- The element is not operating in passthrough and needs to allocate an
output buffer.
In case 1, we don't query and configure a pool. We let upstream decide if it
wants to use a bufferpool and then we will proxy the bufferpool from downstream
to upstream.
In case 2, we query and set a bufferpool on the srcpad that will be used for
doing the allocations.
In order to perform allocation, we need to be able to get the size of the
output buffer after the transform. We need additional functions to
retrieve the size. There are two functions:
- transform_size()
Given a caps and a size on one pad, and a caps on the other pad, calculate
the size of the other buffer. This function is able to perform all size
transforms and is the preferred method of transforming a size.
- get_unit_size()
When the input size and output size are always a multiple of each other
(audio conversion, ...) we can define a simpler get_unit_size() function.
The transform will use this function to get the same amount of units in the
source and destination buffers.
For performance reasons, the mapping between caps and size is kept in a cache.
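For example, a raw audio element could implement get_unit_size() along these
lines. This is a sketch that assumes S16 samples and a "channels" field in
the caps:

  static gboolean
  my_filter_get_unit_size (GstBaseTransform * trans, GstCaps * caps,
      gsize * size)
  {
    GstStructure *s = gst_caps_get_structure (caps, 0);
    gint channels;

    if (!gst_structure_get_int (s, "channels", &channels))
      return FALSE;

    /* one unit is one audio frame: all channels of one S16 sample */
    *size = channels * sizeof (gint16);
    return TRUE;
  }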
@ -1,295 +0,0 @@
Events
------
Events are objects passed around in parallel to the buffer dataflow to
notify elements of various events.
Events are received on pads using the event function. Some events should
be interleaved with the data stream so they require taking the STREAM_LOCK,
others don't.
Different types of events exist to implement various functionalities.
GST_EVENT_FLUSH_START: data is to be discarded
GST_EVENT_FLUSH_STOP: data is allowed again
GST_EVENT_CAPS: Format information about the following buffers
GST_EVENT_SEGMENT: Timing information for the following buffers
GST_EVENT_TAG: Stream metadata.
GST_EVENT_BUFFERSIZE: Buffer size requirements
GST_EVENT_SINK_MESSAGE: An event turned into a message by sinks
GST_EVENT_EOS: no more data is to be expected on a pad.
GST_EVENT_QOS: A notification of the quality of service of the stream
GST_EVENT_SEEK: A seek should be performed to a new position in the stream
GST_EVENT_NAVIGATION: A navigation event.
GST_EVENT_LATENCY: Configure the latency in a pipeline
GST_EVENT_STEP: Stepping event
GST_EVENT_RECONFIGURE: stream reconfigure event
* GST_EVENT_DRAIN: Play all data downstream before returning.
* not yet implemented, under investigation, might be needed to do still frames
in DVD.
src pads
--------
A gst_pad_push_event() on a srcpad will first store the sticky event in the
sticky array before sending the event to the peer pad. If there is no peer pad
and the event was not stored in the sticky array, FALSE is returned.
Flushing pads will refuse the events and will not store the sticky events.
sink pads
---------
A gst_pad_send_event() on a sinkpad will call the event function on the pad. If
the event function returns success, the sticky event is stored in the sticky
event array and the event is marked for update.
When the pad is flushing, the _send_event() function returns FALSE immediately.
When the next data item is pushed, the pending events are pushed first.
This ensures that the event function is never called for flushing pads and that
the sticky array only contains events for which the event function returned
success.
pad link
--------
When linking pads, the srcpad sticky events are marked for update when they are
different from the sinkpad events. The next buffer push will push the events to
the sinkpad.
FLUSH_START/STOP
~~~~~~~~~~~~~~~~
A flush event is sent both downstream and upstream to clear any pending data
from the pipeline. This might be needed to make the graph more responsive
when the normal dataflow gets interrupted by for example a seek event.
Flushing happens in two stages.
1) a source element sends the FLUSH_START event to the downstream peer element.
The downstream element starts rejecting buffers from the upstream elements. It
sends the flush event further downstream and discards any buffers it is
holding, and returns from the chain function as soon as possible.
This makes sure that all upstream elements get unblocked.
This event is not synchronized with the STREAM_LOCK and can be done in the
application thread.
2) a source element sends the FLUSH_STOP event to indicate
that the downstream element can accept buffers again. The downstream
element sends the flush event to its peer elements. After this step dataflow
continues. The FLUSH_STOP call is synchronized with the STREAM_LOCK so any
data used by the chain function can safely be freed here if needed. Any
pending EOS events should be discarded too.
After the second stage of the flush completes, data is flowing again in the pipeline
and all buffers are more recent than those before the flush.
Elements that use the pullrange function send both flush events to
the upstream pads in the same way to make sure that the pullrange function
unlocks and any pending buffers are cleared in the upstream elements.
A FLUSH_START may instruct the pipeline to distribute a new base_time to
elements so that the running_time is reset to 0.
(see part-clocks.txt and part-synchronisation.txt).
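A seeking element typically performs the two stages like this (a sketch;
srcpad is the element's source pad):

  /* stage 1: unblock everything downstream */
  gst_pad_push_event (srcpad, gst_event_new_flush_start ());

  /* ... take the STREAM_LOCK and perform the seek ... */

  /* stage 2: allow dataflow again, resetting the running_time */
  gst_pad_push_event (srcpad, gst_event_new_flush_stop (TRUE));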
EOS
~~~
The EOS event can only be sent on a sinkpad. It is typically emitted by the
source element when it has finished sending data. This event is mainly sent
in the streaming thread but can also be sent from the application thread.
The downstream element should forward the EOS event to its downstream peer
elements. This way the event will eventually reach the sinks which should
then post an EOS message on the bus when in PLAYING.
An element might want to flush its internally queued data before forwarding
the EOS event downstream. This flushing can be done in the same thread as
the one handling the EOS event.
For elements with multiple sink pads it might be possible to wait for EOS on
all the pads before forwarding the event.
The EOS event should always be interleaved with the data flow, therefore the
GStreamer core will take the STREAM_LOCK.
Sometimes the EOS event is generated by another element than the source, for
example a demuxer element can generate an EOS event before the source element.
This is not a problem, the demuxer does not send an EOS event to the upstream
element but returns GST_FLOW_EOS, causing the source element to stop
sending data.
An element that sends EOS on a pad should stop sending data on that pad. Source
elements typically pause() their task for that purpose.
By default, a GstBin collects all EOS messages from all its sinks before
posting the EOS message to its parent.
The EOS is only posted on the bus by the sink elements in the PLAYING state. If
the EOS event is received in the PAUSED state, it is queued until the element
goes to PLAYING.
A FLUSH_STOP event on an element flushes the EOS state and all pending EOS messages.
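An application can also inject EOS itself, for example to cleanly finish a
recording; the EOS message will appear on the bus once all sinks received the
event. A minimal sketch, with pipeline being the toplevel element:

  gst_element_send_event (pipeline, gst_event_new_eos ());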
SEGMENT
~~~~~~~
A segment event is sent downstream by an element to indicate that the following
group of buffers start and end at the specified positions. The segment event
also contains the playback speed and the applied rate of the stream.
Since the stream time is always set to 0 at start and after a seek, a zero
point for the timestamps of all following buffers has to be propagated through the
pipeline using the SEGMENT event.
Before sending buffers, an element must send a SEGMENT event. An element is
free to refuse buffers if they were not preceded by a SEGMENT event.
Elements that sync to the clock should store the SEGMENT start and end values
and subtract the start value from the buffer timestamp before comparing
it against the stream time (see part-clocks.txt).
An element is allowed to send out buffers with the SEGMENT start time already
subtracted from the timestamp. If it does so, it needs to send a corrected
SEGMENT downstream, ie, one with start time 0.
A SEGMENT event should be generated as soon as possible in the pipeline and
is usually generated by a demuxer or source. The event is generated before
pushing the first buffer and after a seek, right before pushing the new buffer.
The SEGMENT event should be sent from the streaming thread and should be
serialized with the buffers.
Buffers should be clipped within the range indicated by the segment event
start and stop values. Sinks must drop buffers with timestamps out of the
indicated segment range.
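For example, a source or demuxer pushes a time segment before its first
buffer roughly like this (a sketch; srcpad and the values are illustrative):

  GstSegment segment;

  gst_segment_init (&segment, GST_FORMAT_TIME);
  segment.start = 0;
  segment.stop = GST_CLOCK_TIME_NONE;   /* play to the end */
  segment.rate = 1.0;

  gst_pad_push_event (srcpad, gst_event_new_segment (&segment));
  /* ... followed by the buffers ... */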
TAG
~~~
The tag event is sent downstream when an element has discovered metadata
tags in a media file. Encoders can use this event to adjust their tagging
system. A tag is serialized with buffers.
BUFFERSIZE
~~~~~~~~~~
NOTE: This event is not yet implemented.
An element can suggest a buffersize for downstream elements. This is
typically done by elements that produce data on multiple source pads
such as demuxers.
QOS
~~~
A QOS, or quality of service message, is generated in an element to report
to the upstream elements about the current quality of real-time performance
of the stream. This is typically done by the sinks that measure the amount
of framedrops they have. (see part-qos.txt)
SEEK
~~~~
A seek event is issued by the application to configure the playback range
of a stream. It is called from the application thread and travels upstream.
The seek event contains the new start and stop position of playback
after the seek is performed. Optionally the stop position can be left
at -1 to continue playback to the end of the stream. The seek event
also contains the new playback rate of the stream, 1.0 is normal playback,
2.0 double speed and negative values mean backwards playback.
A seek usually flushes the graph to minimize latency after the seek. This
behaviour is triggered by using the SEEK_FLUSH flag on the seek event.
The seek event usually starts from the sink elements and travels upstream
from element to element until it reaches an element that can perform the
seek. No intermediate element is allowed to assume that a seek to this
location will happen. It is allowed to modify the start and stop times if it
needs to do so. This is typically the case if a seek is requested for a
non-time position.
The actual seek is performed in the application thread so that success
or failure can be reported as a return value of the seek event. It is
therefore important that before executing the seek, the element acquires
the STREAM_LOCK so that the streaming thread and the seek get serialized.
The general flow of executing the seek with FLUSH is as follows:
1) unblock the streaming threads, they could be blocked in a chain
function. This is done by sending a FLUSH_START on all srcpads or by pausing
the streaming task, depending on the seek FLUSH flag.
The flush will make sure that all downstream elements unlock and
that control will return to this element chain/loop function.
We cannot lock the STREAM_LOCK before doing this since it might
cause a deadlock.
2) acquire the STREAM_LOCK. This will work since the chain/loop function
was unlocked/paused in step 1).
3) perform the seek. since the STREAM_LOCK is held, the streaming thread
will wait for the seek to complete. Most likely, the stream thread
will pause because the peer elements are flushing.
4) send a FLUSH_STOP event to all peer elements to allow streaming again.
5) create a SEGMENT event to signal the new buffer timestamp base time.
This event must be queued to be sent by the streaming thread.
6) start stopped tasks and unlock the STREAM_LOCK, dataflow will continue
now from the new position.
More information about the different seek types can be found in
part-seeking.txt.
NAVIGATION
~~~~~~~~~~~
A navigation event is generated by a sink element to signal the elements
of a navigation event such as a mouse movement or button click.
Navigation events travel upstream.
LATENCY
~~~~~~~
A latency event is used to configure a certain latency in the pipeline. It
contains a single GstClockTime with the required latency. The latency value is
calculated by the pipeline and distributed to all sink elements before they are
set to PLAYING. The sinks will add the configured latency value to the
timestamps of the buffer in order to delay their presentation.
(See also part-latency.txt).
DRAIN
~~~~~
NOTE: This event is not yet implemented.
Drain event indicates that upstream is about to perform a real-time event, such
as pausing to present an interactive menu or such, and needs to wait for all
data it has sent to be played-out in the sink.
Drain should only be used by live elements, as it may otherwise occur during
prerolling.
Usually after draining the pipeline, an element either needs to modify timestamps,
or FLUSH to prevent subsequent data being discarded at the sinks for arriving
late (only applies during playback scenarios).
@ -1,250 +0,0 @@
Frame step
----------
This document outlines the details of the frame stepping functionality in
GStreamer.
The stepping functionality operates on the current playback segment, position
and rate as it was configured with a regular seek event. In contrast to the seek
event, it operates very closely to the sink and thus has a very low latency and
is not slowed down by queues and does not actually perform any seeking logic.
For this reason we want to include a new API instead of reusing the seek API.
The following requirements are needed:
- The ability to walk forwards and backwards in the stream.
- Arbitrary increments in any supported format (time, frames, bytes ...)
- High speed, minimal overhead. This mechanism is not more expensive than
simple playback.
- switching between forwards and backwards stepping should be fast.
- Maintain synchronisation between streams.
- Get feedback of the amount of skipped data.
- Ability to play a certain amount of data at an arbitrary speed.
We want a system where we can step frames in PAUSED as well as play short
segments of data in PLAYING.
Use Cases
~~~~~~~~~
* frame stepping in video only pipeline in PAUSED
.-----. .-------. .------. .-------.
| src | | demux | .-----. | vdec | | vsink |
| src->sink src1->|queue|->sink src->sink |
'-----' '-------' '-----' '------' '-------'
- app sets the pipeline to PAUSED to block on the preroll picture
- app seeks to required position in the stream. This can be done with a
positive or negative rate depending on the required frame stepping
direction.
- app steps frames (in GST_FORMAT_DEFAULT or GST_FORMAT_BUFFERS). The
pipeline loses its PAUSED state until the required number of frames have
been skipped, it then prerolls again. This skipping is purely done in
the sink.
- sink posts STEP_DONE with amount of frames stepped and corresponding time
interval.
* frame stepping in audio/video pipeline in PAUSED
.-----. .-------. .------. .-------.
| src | | demux | .-----. | vdec | | vsink |
| src->sink src1->|queue|->sink src->sink |
'-----' | | '-----' '------' '-------'
| | .------. .-------.
| | .-----. | adec | | asink |
| src2->|queue|->sink src->sink |
'-------' '-----' '------' '-------'
- app sets the pipeline to PAUSED to block on the preroll picture
- app seeks to required position in the stream. This can be done with a
positive or negative rate depending on the required frame stepping
direction.
- app steps frames (in GST_FORMAT_DEFAULT or GST_FORMAT_BUFFERS) or an amount
of time on the video sink. The pipeline loses its PAUSED state until the
required number of frames have been skipped, it then prerolls again.
This skipping is purely done in the sink.
- sink posts STEP_DONE with amount of frames stepped and corresponding time
interval.
- the app skips the same amount of time on the audiosink to align the
streams again. When a huge amount of video frames is skipped, there needs
to be enough queueing in the pipeline to compensate for the accumulated
audio.
* frame stepping in audio/video pipeline in PLAYING
- app sets the pipeline to PAUSED to block on the preroll picture
- app seeks to required position in the stream. This can be done with a
positive or negative rate depending on the required frame stepping
direction.
- app configures frame steps (in GST_FORMAT_DEFAULT or GST_FORMAT_BUFFERS) or
an amount of time on the sink. The step event has a flag indicating live
stepping so that the stepping will only happen in PLAYING.
- app sets pipeline to PLAYING. The pipeline continues PLAYING until it
consumed the amount of time.
- sink posts STEP_DONE with amount of frames stepped and corresponding time
interval. The sink will then wait for another step event. Since the
STEP_DONE message was emitted by the sink when it handed off the buffer to
the device, there is usually sufficient time to queue a new STEP event so
that one can seamlessly continue stepping.
events
~~~~~~
A new GST_EVENT_STEP event is introduced to start the step operation.
The step event is created with the following fields in the structure:
"format", GST_TYPE_FORMAT
The format of the step units
"amount", G_TYPE_UINT64
The amount of units to step. A 0 amount immediately completes and can be
used to cancel the current step and resume normal non-stepping behaviour
to the end of the segment.
A -1 amount steps until the end of the segment.
"rate", G_TYPE_DOUBLE
The rate at which the frames should be stepped in PLAYING mode. 1.0 is
the normal playback speed and direction of the segment, 2.0
is double speed. A speed of 0.0 is not allowed. When performing a
flushing step, the speed is not relevant. Note that we don't allow negative
rates here, use a seek with a negative rate first to reverse the playback
direction.
"flush", G_TYPE_BOOLEAN
when flushing is TRUE, the step is performed immediately:
- In the PAUSED state the pipeline loses the PAUSED state, the requested
amount of data is skipped and the pipeline prerolls again when a
non-intermediate step completes.
If the pipeline was already stepping when the event is sent, the current step
operation is updated with the new amount and format. The sink will do a
best effort to comply with the new amount.
- In the PLAYING state, the pipeline loses the PLAYING state, the
requested amount of data is skipped (not rendered) from the previous STEP
request or from the position of the last PAUSED if no previous STEP
operation was performed. The pipeline goes back to the PLAYING state
when a non-intermediate step completes.
When flushing is FALSE, the step will be performed later.
- In the PAUSED state the step will be done when going to PLAYING. Any
previous step operation will be overridden with the new STEP event.
- In the PLAYING state the step operation will be performed after the
current step operation completes. If there was no previous step
operation, the step operation will be performed from the position of the
last PAUSED state.
"intermediate", G_TYPE_BOOLEAN
Signal that this step operation is an intermediate step, part of a series
of step operations. It is mostly interesting for stepping in the PAUSED state
because the sink will only perform a preroll after a non-intermediate step
operation completes. Intermediate steps are useful to flush out data from
other sinks in order to not cause excessive queueing. In the PLAYING state
the intermediate flag has no visual effect. In all states, the intermediate
flag is passed to the corresponding GST_MESSAGE_STEP_DONE.
The application will create a STEP event to start or stop the stepping
operation. Both stepping in PAUSED and PLAYING can be performed by means of
the flush flag.
The event is usually sent to the pipeline, which will typically distribute the
event to all of its sinks. For some use cases, like frame stepping on video
frames only, the event should only be sent to the video sink and upon reception
of the STEP_DONE message, one can step the other sinks to align the streams
again.
For large stepping amounts, there needs to be enough queueing in front of all
the sinks. If large steps need to be performed, they can be split up into
smaller step operations using the "intermediate" flag on the step.
Since the step event does not update the base_time of any of the elements, the
sinks should keep track of the amount of stepped data in order to remain
synchronized against the clock.
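For example, stepping a single video frame could be requested with a
flushing, non-intermediate step event sent to the video sink (a sketch;
video_sink is illustrative):

  GstEvent *step = gst_event_new_step (GST_FORMAT_BUFFERS, 1, 1.0,
      TRUE /* flush */, FALSE /* intermediate */);

  gst_element_send_event (video_sink, step);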
messages
~~~~~~~~
A GST_MESSAGE_STEP_START is created. It contains the following fields.
"active"
If the step was queued or activated.
"format", GST_TYPE_FORMAT
The format of the step units that queued/activated.
"amount", G_TYPE_UINT64
The amount of units that were queued/activated.
"rate", G_TYPE_DOUBLE
The rate and direction at which the frames were queued/activated.
"flush", G_TYPE_BOOLEAN
If the queued/activated frames will be flushed.
"intermediate", G_TYPE_BOOLEAN
If this is an intermediate step operation that queued/activated.
The STEP_START message is emitted 2 times:
* first when an element received the STEP event and queued it. The "active"
field will be FALSE in this case.
* second when the step operation started in the streaming thread. The "active"
field is TRUE in this case. After this message is emitted, the application
can queue a new step operation.
The purpose of this message is to find out how many elements participate in the
step operation and to queue new step operations at the earliest possible
moment.
A new GST_MESSAGE_STEP_DONE message is created. It contains the following
fields:
"format", GST_TYPE_FORMAT
The format of the step units that completed.
"amount", G_TYPE_UINT64
The amount of units that were stepped.
"rate", G_TYPE_DOUBLE
The rate and direction at which the frames were stepped.
"flush", G_TYPE_BOOLEAN
If the stepped frames were flushed.
"intermediate", G_TYPE_BOOLEAN
If this is an intermediate step operation that completed.
"duration", G_TYPE_UINT64
The total duration of the stepped units in GST_FORMAT_TIME.
"eos", G_TYPE_BOOLEAN
The step ended because of EOS.
The message is emitted by the element that performs the step operation. The
purpose is to return the duration in GST_FORMAT_TIME of the stepped media. This
is especially interesting for aligning other streams when stepping frames on the
video sink element.
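An application would read the duration out of the message along these lines
(a sketch; msg comes from the bus):

  GstFormat format;
  guint64 amount, duration;
  gdouble rate;
  gboolean flush, intermediate, eos;

  if (GST_MESSAGE_TYPE (msg) == GST_MESSAGE_STEP_DONE) {
    gst_message_parse_step_done (msg, &format, &amount, &rate,
        &flush, &intermediate, &duration, &eos);
    /* duration is in GST_FORMAT_TIME and can be used to step the
     * other sinks by the same amount */
  }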
Direction switch
~~~~~~~~~~~~~~~~
When quickly switching between a forwards and a backwards step of, for example,
one video frame, we need either:
a) issue a new seek to change the direction from the current position.
b) cache a certain number of stepped frames and walk the cache.
option a) might be very slow.
For option b) we would ideally like to offload this caching functionality to a
separate element, which means that we need to forward the STEP event upstream.
It's unclear how this could work in a generic way. What is a demuxer supposed
to do when it received a step event? a flushing seek to what stream position?
@ -1,115 +0,0 @@
GstBin
------
GstBin is a container element for other GstElements. This makes it possible
to group elements together so that they can be treated as one single
GstElement. A GstBin provides a GstBus for the children and collates messages
from them.
Add/removing elements
~~~~~~~~~~~~~~~~~~~~~
The basic functionality of a bin is to add and remove GstElements to/from it.
gst_bin_add() and gst_bin_remove() perform these operations respectively.
The bin maintains a parent-child relationship with its elements (see
part-relations.txt).
Retrieving elements
~~~~~~~~~~~~~~~~~~~
GstBin provides a number of functions to retrieve one or more children from
itself. A few examples of the provided functions:
gst_bin_get_by_name() retrieves an element by name.
gst_bin_iterate_elements() returns an iterator to all the children.
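A sketch of both retrieval styles (bin and the child name are illustrative):

  GstElement *elem = gst_bin_get_by_name (GST_BIN (bin), "mysrc");
  if (elem) {
    /* ... use elem ... */
    gst_object_unref (elem);
  }

  GstIterator *it = gst_bin_iterate_elements (GST_BIN (bin));
  GValue item = G_VALUE_INIT;

  while (gst_iterator_next (it, &item) == GST_ITERATOR_OK) {
    GstElement *child = GST_ELEMENT (g_value_get_object (&item));

    GST_INFO ("child: %s", GST_ELEMENT_NAME (child));
    g_value_reset (&item);
  }
  g_value_unset (&item);
  gst_iterator_free (it);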
element management
~~~~~~~~~~~~~~~~~~
The most important function of the GstBin is to distribute all GstElement
operations on itself to all of its children. This includes:
- state changes
- index get/set
- clock get/set
The state change distribution is the most complex and is explained in
part-states.txt.
GstBus
~~~~~~
The GstBin creates a GstBus for its children and distributes it when child
elements are added to the bin. The bin attaches a sync handler to receive
messages from children. The bus for receiving messages from children is
distinct from the bin's own externally-visible GstBus.
Messages received from children are forwarded intact onto the bin's
external message bus, except for EOS and SEGMENT_START/DONE which are
handled specially.
ASYNC_START/ASYNC_STOP messages received from the children are used to
trigger a recalculation of the current state of the bin, as described in
part-states.txt.
The application can retrieve the external GstBus and integrate it in the
mainloop or it can just _pop() messages off in its own thread.
When a bin goes to READY it will clear all cached messages.
EOS
~~~
The sink elements will post an EOS message on the bus when they reach EOS. The
EOS message is only posted to the bus when the sink element is in PLAYING.
The bin collects all EOS messages and forwards it to the application as
soon as all the sinks have posted an EOS.
The list of queued EOS messages is cleared when the bin goes to PAUSED
again. This means that all elements should repost the EOS message when going
to PLAYING again.
SEGMENT_START/DONE
~~~~~~~~~~~~~~~~~~
A bin collects SEGMENT_START messages but does not post them to the application.
It counts the number of SEGMENT_START messages and posts a SEGMENT_DONE message
to the application when an equal number of SEGMENT_DONE messages were received.
The cached SEGMENT_START/DONE messages are cleared when going to READY.
DURATION
~~~~~~~~
When a DURATION query is performed on a bin, it will forward the query to all
its sink elements. The bin will calculate the total duration as the MAX of all
returned durations and will then cache the result so that any further query can
use the cached version. The reason for caching the result is because the
duration of a stream typically does not change that often.
A GST_MESSAGE_DURATION_CHANGED posted by an element will clear the cached
duration value so that the bin will query the sinks again. This message is
typically posted by elements that calculate the duration of the stream based
on some average bitrate, which might change while playing the stream. The
DURATION_CHANGED message is posted to the application, which can then fetch
the updated DURATION.
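From the application, the cached duration is typically obtained like this
(a sketch):

  gint64 duration;

  if (gst_element_query_duration (pipeline, GST_FORMAT_TIME, &duration))
    g_print ("duration: %" GST_TIME_FORMAT "\n", GST_TIME_ARGS (duration));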
Subclassing
~~~~~~~~~~~
Subclasses of GstBin are free to implement their own add/remove implementations.
It is a good idea to update the GList of children so that the _iterate() functions
can still be used if the custom bin allows access to its children.
Any bin subclass can also implement a custom message handler by overriding the
default message handler.
@ -1,41 +0,0 @@
GstBus
------
The GstBus is an object responsible for delivering GstMessages in
a first-in first-out way from the streaming threads to the application.
Since the application typically only wants to deal with delivery of these
messages from one thread, the GstBus will marshall the messages between
different threads. This is important since the actual streaming of media
is done in other threads (the streaming threads) than the application. It is
also important to not block the streaming threads while the application deals
with the message.
The GstBus provides support for GSource based notifications. This makes it
possible to handle the delivery in the glib mainloop. Different GSources
can be added to the same bus provided they listen to different message types.
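A typical mainloop integration looks like this (a sketch, assuming a running
GMainLoop):

  static gboolean
  bus_cb (GstBus * bus, GstMessage * msg, gpointer user_data)
  {
    switch (GST_MESSAGE_TYPE (msg)) {
      case GST_MESSAGE_ERROR:{
        GError *err = NULL;

        gst_message_parse_error (msg, &err, NULL);
        g_printerr ("error: %s\n", err->message);
        g_error_free (err);
        break;
      }
      case GST_MESSAGE_EOS:
        g_print ("end of stream\n");
        break;
      default:
        break;
    }
    return TRUE;                /* keep the watch installed */
  }

  /* ... */
  GstBus *bus = gst_pipeline_get_bus (GST_PIPELINE (pipeline));

  gst_bus_add_watch (bus, bus_cb, NULL);
  gst_object_unref (bus);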
A message is posted on the bus with the gst_bus_post() method. With the
gst_bus_peek() and _pop() methods one can look at or retrieve a previously
posted message.
The bus can be polled with the gst_bus_poll() method. This method blocks
up to the specified timeout value until one of the specified messages types
is posted on the bus. The application can then _pop() the messages from the
bus to handle them.
It is also possible to get messages from the bus without any thread
marshalling with the gst_bus_set_sync_handler() method. This makes it
possible to react to a message in the same thread that posted the
message on the bus. This should only be used if the application is able
to deal with messages from different threads.
If no messages are popped from the bus with either a GSource or gst_bus_pop(),
they remain on the bus.
When a pipeline or bin goes from READY into NULL state, it will set its bus
to flushing, ie. the bus will drop all existing and new messages on the bus.
This is necessary because bus messages hold references to the bin/pipeline
or its elements, so there are circular references that need to be broken if
one ever wants to be able to destroy a bin or pipeline properly.
@ -1,69 +0,0 @@
GstElement
----------
The Element is the most important object in the entire GStreamer system, as it
defines the structure of the pipeline. Elements include sources, filters,
sinks, and containers (Bins). They may be an intrinsic part of the core
GStreamer library, or may be loaded from a plugin. In some cases they're even
fabricated from completely different systems (see the LADSPA plugin). They
are generally created from a GstElementFactory, which will be covered in
another chapter, but for the intrinsic types they can be created with specific
functions.
Elements contain GstPads (also covered in another chapter), which are
subsequently used to connect the Elements together to form a pipeline capable
of passing and processing data. They have a parent, which must be another
Element. This allows deeply nested pipelines, and the possibility of
"black-box" meta-elements.
Name
~~~~
All elements are named, and while they should ideally be unique in any given
pipeline, they do not have to be. The only guaranteed unique name for an
element is its complete path in the object hierarchy. In other words, an
element's name is unique inside its parent. (This follows from GstObject's
name explanation)
This uniqueness is guaranteed through all functions where either parentage
or name of an element is changed.
Pads
~~~~
GstPads are the property of a given GstElement. They provide the connection
capability, with allowing arbitrary structure in the graph. For any Element
but a source or sink, there will be at least 2 Pads owned by the Element.
These pads are stored in a single GList within the Element. Several counters
are kept in order to allow quicker determination of the type and properties of
a given Element.
Pads may be added to an element with _add_pad. Retrieval is via _get_static_pad(),
which operates on the name of the Pad (the unique key). This means that all
Pads owned by a given Element must have unique names.
A pointer to the GList of pads may be obtained with _iterate_pads.
gst_element_add_pad(element,pads):
Sets the element as the parent of the pad, then adds the pad to the
element's list of pads, keeping the counts of total, src, and sink pads
up to date. Emits the "new_pad" signal with the pad as argument.
Fails if either the element or pad are either NULL or not what they
claim to be. Should fail if the pad already has a parent. Should fail
if the pad is already owned by the element. Should fail if there's
already a pad by that name in the list of pads.
pad = gst_element_get_pad(element,"padname"):
Searches through the element's list of pads and returns the pad with the
given name.
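A sketch of the add/retrieve cycle (element is illustrative):

  GstPad *pad = gst_pad_new ("src", GST_PAD_SRC);

  gst_element_add_pad (element, pad);   /* element takes ownership */

  pad = gst_element_get_static_pad (element, "src");
  /* ... */
  gst_object_unref (pad);               /* _get_static_pad() returned a ref */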
Ghost Pads
~~~~~~~~~~
More info in part-gstghostpad.txt.
State
~~~~~
An element has a state. More info in part-states.txt.
@ -1,451 +0,0 @@
Ghostpads
---------
GhostPads are used to build complex compound elements out of
existing elements. They are used to expose internal element pads
on the complex element.
Some design requirements
- Must look like a real GstPad on both sides.
- target of Ghostpad must be changeable
- target can be initially NULL
* a GhostPad is implemented using a private GstProxyPad class:
GstProxyPad
(------------------)
| GstPad |
|------------------|
| GstPad *target |
(------------------)
| GstPad *internal |
(------------------)
GstGhostPad
(------------------) -\
| GstPad | |
|------------------| |
| GstPad *target | > GstProxyPad
|------------------| |
| GstPad *internal | |
|------------------| -/
| <private data> |
(------------------)
A GstGhostPad (X) is _always_ created together with a GstProxyPad (Y).
The internal pad pointers are set to point to each other. The
GstProxyPad pairs have opposite directions, the GstGhostPad has the same
direction as the (future) ghosted pad (target).
(- X --------)
| |
| target * |
|------------|
| internal *----+
(------------) |
^ V
| (- Y --------)
| | |
| | target * |
| |------------|
+----* internal |
(------------)
Which we will abbreviate to:
(- X --------)
| |
| target *--------->//
(------------)
|
(- Y --------)
| target *----->//
(------------)
The GstGhostPad (X) is also set as the parent of the GstProxyPad (Y).
The target is a pointer to the internal pad's peer. It is an optimisation to
quickly get to the peer of a ghostpad without having to dereference the
internal->peer.
Some use cases follow with a description of how the data structure
is modified.
* Creating a ghostpad with a target:
gst_ghost_pad_new (char *name, GstPad *target)
1) create new GstGhostPad X + GstProxyPad Y
2) X name set to @name
3) X direction is the same as the target, Y is opposite.
4) the target of X is set to @target
5) Y is linked to @target
6) link/unlink and activate functions are set up
on GstGhostPad.
(--------------
(- X --------) |
| | |------)
| target *------------------> | sink |
(------------) -------> |------)
| / (--------------
(- Y --------) / (pad link)
//<-----* target |/
(------------)
- Automatically takes same direction as target.
- target is filled in automatically.
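The classic use is exposing a child element's pad on a bin; a sketch
(element names are illustrative):

  GstElement *bin = gst_bin_new ("mybin");
  GstElement *conv = gst_element_factory_make ("audioconvert", NULL);
  GstPad *target, *ghost;

  gst_bin_add (GST_BIN (bin), conv);

  target = gst_element_get_static_pad (conv, "sink");
  ghost = gst_ghost_pad_new ("sink", target);   /* creates X and Y */
  gst_element_add_pad (bin, ghost);
  gst_object_unref (target);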
* Creating a ghostpad without a target
gst_ghost_pad_new_no_target (char *name, GstPadDirection dir)
1) create new GstGhostPad X + GstProxyPad Y
2) X name set to @name
3) X direction is @dir
5) link/unlink and activate functions are set up
on GstGhostPad.
(- X --------)
| |
| target *--------->//
(------------)
|
(- Y --------)
| target *----->//
(------------)
- allows for setting the target later
* Setting target on an untargetted unlinked ghostpad
gst_ghost_pad_set_target (char *name, GstPad *newtarget)
(- X --------)
| |
| target *--------->//
(------------)
|
(- Y --------)
| target *----->//
(------------)
1) assert direction of newtarget == X direction
2) target is set to newtarget
3) internal pad Y is linked to newtarget
(--------------
(- X --------) |
| | |------)
| target *------------------> | sink |
(------------) -------> |------)
| / (--------------
(- Y --------) / (pad link)
//<-----* target |/
(------------)
* Setting target on a targetted unlinked ghostpad
gst_ghost_pad_set_target (char *name, GstPad *newtarget)
(--------------
(- X --------) |
| | |-------)
| target *------------------> | sink1 |
(------------) -------> |-------)
| / (--------------
(- Y --------) / (pad link)
//<-----* target |/
(------------)
1) assert direction of newtarget (sink2) == X direction
2) unlink internal pad Y and oldtarget
3) target is set to newtarget (sink2)
4) internal pad Y is linked to newtarget
(--------------
(- X --------) |
| | |-------)
| target *------------------> | sink2 |
(------------) -------> |-------)
| / (--------------
(- Y --------) / (pad link)
//<-----* target |/
(------------)
* Linking a pad to an untargetted ghostpad:
gst_pad_link (src, X)
(- X --------)
| |
| target *--------->//
(------------)
|
(- Y --------)
| target *----->//
(------------)
-------)
|
(-----|
| src |
(-----|
-------)
X is a sink GstGhostPad without a target. The internal GstProxyPad Y has
the same direction as the src pad (peer).
1) link function is called
a) Y direction is same as @src
b) Y target is set to @src
c) Y is activated in the same mode as X
d) core makes link from @src to X
(- X --------)
| |
| target *----->//
>(------------)
(real pad link) / |
/ (- Y ------)
/ -----* target |
-------) / / (----------)
| / /
(-----|/ /
| src |<----
(-----|
-------)
* Linking a pad to a targetted ghostpad:
gst_pad_link (src, X)
(--------
(- X --------) |
| | |------)
| target *------------->| sink |
(------------) >|------)
| / (--------
| /
| /
-------) | / (real pad link)
| (- Y ------) /
(-----| | |/
| src | //<----* target |
(-----| (----------)
-------)
1) link function is called
a) Y direction is same as @src
b) Y target is set to @src
c) Y is activated in the same mode as X
d) core makes link from @src to X
(--------
(- X --------) |
| | |------)
| target *------------->| sink |
>(------------) >|------)
(real pad link) / | / (--------
/ | /
/ | /
-------) / | / (real pad link)
| / (- Y ------) /
(-----|/ | |/
| src |<-------------* target |
(-----| (----------)
-------)
* Setting target on untargetted linked ghostpad:
gst_ghost_pad_set_target (char *name, GstPad *newtarget)
(- X --------)
| |
| target *------>//
>(------------)
(real pad link) / |
/ |
/ |
-------) / |
| / (- Y ------)
(-----|/ | |
| src |<-------------* target |
(-----| (----------)
-------)
1) assert direction of @newtarget == X direction
2) X target is set to @newtarget
3) Y is linked to @newtarget
(--------
(- X --------) |
| | |------)
| target *------------->| sink |
>(------------) >|------)
(real pad link) / | / (--------
/ | /
/ | /
-------) / | / (real pad link)
| / (- Y ------) /
(-----|/ | |/
| src |<-------------* target |
(-----| (----------)
-------)
* Setting target on targetted linked ghostpad:
gst_ghost_pad_set_target (char *name, GstPad *newtarget)
(--------
(- X --------) |
| | |-------)
| target *------------->| sink1 |
>(------------) >|-------)
(real pad link) / | / (--------
/ | /
/ | /
-------) / | / (real pad link)
| / (- Y ------) /
(-----|/ | |/
| src |<-------------* target |
(-----| (----------)
-------)
1) assert direction of @newtarget == X direction
2) Y and X target are unlinked
3) X target is set to newtarget
4) Y is linked to newtarget
(--------
(- X --------) |
| | |-------)
| target *------------->| sink2 |
>(------------) >|-------)
(real pad link) / | / (--------
/ | /
/ | /
-------) / | / (real pad link)
| / (- Y ------) /
(-----|/ | |/
| src |<-------------* target |
(-----| (----------)
-------)
Activation
~~~~~~~~~~
Sometimes ghost pads should proxy activation functions. This section
attempts to explain how it should work in the different cases.
+---+ +----+ +----+ +----+
| A +-----+ B | | C |-------+ D |
+---+ +---=+ +=---+ +----+
+--=-----------------------------=-+
| +=---+ +----+ +----+ +---=+ |
| | a +---+ b ==== c +--+ d | |
| +----+ +----+ +----+ +----+ |
| |
+----------------------------------+
state change goes from right to left
<-----------------------------------------------------------
All of the labeled boxes are pads. The dashes (---) show pad links, and
the double-lines (===) are internal connections. The box around a, b, c,
and d is a bin. B and C are ghost pads, and a and d are proxy pads. The
arrow represents the direction of a state change algorithm. Not counting
the bin, there are three elements involved here -- the parent of D, the
parent of A, and the parent of b and c.
Now, in the state change from READY to PAUSED, assuming the pipeline
does not have a live source, all of the pads will end up activated at
the end. There are 4 possible activation modes:
1) AD and ab in PUSH, cd and CD in PUSH
2) AD and ab in PUSH, cd and CD in PULL
3) AD and ab in PULL, cd and CD in PUSH
4) AD and ab in PULL, cd and CD in PULL
When activating (1), the state change algorithm will first visit the
parent of D and activate D in push mode. Then it visits the bin. The bin
will first change the state of its child before activating its pads.
That means c will be activated in push mode. [*] At this point, d and C
should also be active in push mode, because it could be that activating
c in push mode starts a thread, which starts pushing to pads which
aren't ready yet. Then b is activated in push mode. Then, the bin
activates C in push mode, which should already be in push mode, so
nothing is done. It then activates B in push mode, which activates b in
push mode, but it's already there, then activates a in push mode as
well. The order of activating a and b does not matter in this case.
Then, finally, the state change algorithm moves to the parent of A,
activates A in push mode, and dataflow begins.
[*] Not yet implemented.
Activation mode (2) is implausible, so we can ignore it for now. That
leaves us with the rest.
(3) is the same as (1) until you get to activating b. Activating b will
proxy directly to activating a, which will activate B and A as well.
Then when the state change algorithm gets to B and A it sees that they
are already active, so it ignores them.
Similarly in (4), activating D will cause the activation of all of the
rest of the pads, in this order: C d c b a B A. Then when the state
change gets to the other elements they are already active, and in fact
data flow is already occurring.
So, from these scenarios, we can distill how ghost pad activation
functions should work:
Ghost source pads (e.g. C):
push:
called by: element state change handler
behavior: just return TRUE
pull:
called by: peer's activatepull
behavior: change the internal pad, which proxies to its peer e.g. C
changes d which changes c.
Internal sink pads (e.g. d):
push:
called by: nobody (doesn't seem possible)
behavior: n/a
pull:
called by: ghost pad
behavior: proxy to peer first
Internal src pads (e.g. a):
push:
called by: ghost pad
behavior: activate peer in push mode
pull:
called by: peer's activatepull
behavior: proxy to ghost pad, which proxies to its peer (e.g. a
calls B which calls A)
Ghost sink pads (e.g. B):
push:
called by: element state change handler
behavior: change the internal pad, which proxies to peer (e.g. B
changes a which changes b)
pull:
called by: internal pad
behavior: proxy to peer
It doesn't really make sense to have activation functions on proxy pads
that aren't part of a ghost pad arrangement.
@ -1,91 +0,0 @@
GstObject
---------
The base class for the entire GStreamer hierarchy is the GstObject.
Parentage
~~~~~~~~~
A pointer is available to store the current parent of the object. This is one
of the two fundamental requirements for a hierarchical system such as GStreamer
(for the other, read up on GstBin). Three functions are provided:
_set_parent(), _get_parent(), and _unparent(). The third is required because
there is an explicit check in _set_parent(): an object must not already have a
parent if you wish to set one. You must unparent the object first. This
allows for new additions later.
- GstObject's that can be parented:
GstElement (inside a bin)
GstPad (inside an element)
Naming
~~~~~~
- names of objects cannot be changed when they are parented
- names of objects should be unique within their parent
- set_name() can fail because of this
- as can gst_element_add_pad()/gst_bin_add_element()
- gst_object_set_name() only changes the object's name
- objects also have a name_prefix that is used to prefix the object name
during debugging and identification
- there are object-specific set_name's() which also set the name_prefix
on the object. This is useful for debugging purposes to give the object
a more identifiable name. Typically a parent will call _set_name_prefix
on children, taking a lock on them to do so.
Locking
~~~~~~~
The GstObject contains the necessary primitives to lock the object in a
thread-safe manner. This will be used to provide general thread-safety as
needed. However, this lock is generic, i.e. it covers the whole object.
The object LOCK is a very lowlevel lock that should only be held to access
the object properties for short periods of code.
All members of the GstObject structure marked as
/*< public >*/ /* with LOCK */
are protected by this lock. These members can only be accessed for reading
or writing while the lock is held. All members should be copied or reffed
if they are used after releasing the LOCK.
Note that this does *not* mean that no other thread can modify the object at
the same time that the lock is held. It only means that any two sections of
code that obey the lock are guaranteed to not be running simultaneously. "The
lock is voluntary and cooperative".
This lock will ideally be used for parentage, flags and naming, which is
reasonable, since they are the only possible things to protect in the
GstObject.
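Such accesses look like this in practice (a sketch; the protected member is
copied before releasing the LOCK):

  gchar *name;

  GST_OBJECT_LOCK (element);
  name = g_strdup (GST_OBJECT_NAME (element));  /* copy while locked */
  GST_OBJECT_UNLOCK (element);

  /* name can now be used safely outside the lock */
  g_free (name);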
Locking order
~~~~~~~~~~~~~
In parent-child situations the lock of the parent must always be taken first
before taking the lock of the child. It is NOT allowed to hold the child
lock before taking the parent lock.
This policy allows for parents to iterate their children and setting properties
on them.
Whenever a nested lock needs to be taken on objects not involved in a
parent-child relation (e.g. pads), an explicit locking order has to be defined.
Path Generation
~~~~~~~~~~~~~~~
Due to the base nature of the GstObject, it becomes the only reasonable place
to put this particular function (_get_path_string). It will generate a string
describing the parent hierarchy of a given GstObject.
Flags
~~~~~
Each object in the GStreamer object hierarchy can have flags associated with it,
which are used to describe a state or a feature of the object.
@ -1,88 +0,0 @@
GstPipeline
-----------
A GstPipeline is usually a toplevel bin and provides all of its
children with a clock.
A GstPipeline also provides a toplevel GstBus (see part-gstbus.txt)
The pipeline also calculates the running_time based on the selected
clock (see also part-clocks.txt and part-synchronisation.txt).
The pipeline will calculate a global latency for the elements in the pipeline.
(See also part-latency.txt).
State changes
~~~~~~~~~~~~~
In addition to the normal state change procedure of its parent class
GstBin, the pipeline performs the following actions during a state change:
- NULL -> READY:
- set the bus to non-flushing
- READY -> PAUSED:
- reset the running_time to 0
- PAUSED -> PLAYING:
- Select a clock.
- calculate base_time using the running_time.
- calculate and distribute latency.
- set clock and base_time on all elements before performing the
state change.
- PLAYING -> PAUSED:
- calculate the running_time when the pipeline was PAUSED.
- READY -> NULL:
- set the bus to flushing (when auto-flushing is enabled)
The running_time represents the total elapsed time, measured in clock units,
that the pipeline spent in the PLAYING state (see part-synchronisation.txt).
The running_time is set to 0 after a flushing seek.
Clock selection
~~~~~~~~~~~~~~~
Since all of the children of a GstPipeline must use the same clock, the
pipeline must select a clock. This clock selection happens when the pipeline
goes to the PLAYING state.
The default clock selection algorithm works as follows:
- If the application selected a clock, use that clock. (see below)
- Use the clock of the most upstream element that can provide a clock. This
selection is performed by iterating the elements starting from the
sinks and going upstream.
* since this selection procedure happens in the PAUSED->PLAYING
state change, all the sinks are prerolled and we can thus be sure
that each sink is linked to some upstream element.
* in the case of a live pipeline (NO_PREROLL), the sink will not yet
be prerolled and the selection process will select the clock of
a more upstream element.
- use GstSystemClock, this only happens when no element provides a
usable clock.
The application can influence this clock selection with two methods:
gst_pipeline_use_clock() and gst_pipeline_auto_clock().
The _use_clock() method forces the use of a specific clock on the pipeline
regardless of what clock providers are children of the pipeline. Setting
NULL disables the clock completely and makes the pipeline run as fast as
possible.
The _auto_clock() method removes the fixed clock and reactivates the
automatic clock selection algorithm described above.
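A sketch of both methods (pipeline is illustrative):

  GstClock *clock = gst_system_clock_obtain ();

  /* force a specific clock ... */
  gst_pipeline_use_clock (GST_PIPELINE (pipeline), clock);
  gst_object_unref (clock);

  /* ... or go back to automatic selection */
  gst_pipeline_auto_clock (GST_PIPELINE (pipeline));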
GstBus
~~~~~~
A GstPipeline provides a GstBus to the application. The bus can be retrieved
with gst_pipeline_get_bus() and can then be used to retrieve messages posted by
the elements in the pipeline (see part-gstbus.txt).
@ -1,407 +0,0 @@
Latency
-------
The latency is the time it takes for a sample captured at timestamp 0 to reach the
sink. This time is measured against the clock in the pipeline. For pipelines
where the only elements that synchronize against the clock are the sinks, the
latency is always 0 since no other element is delaying the buffer.
For pipelines with live sources, a latency is introduced, mostly because of the
way a live source works. Consider an audio source, it will start capturing the
first sample at time 0. If the source pushes buffers with 44100 samples at a
time at 44100Hz it will have collected the buffer at second 1.
Since the timestamp of the buffer is 0 and the time of the clock is now >= 1
second, the sink will drop this buffer because it is too late.
Without any latency compensation in the sink, all buffers will be dropped.
The situation becomes more complex in the presence of:
- 2 live sources connected to 2 live sinks with different latencies
* audio/video capture with synchronized live preview.
* added latencies due to effects (delays, resamplers...)
- 1 live source connected to 2 live sinks
* firewire DV
* RTP, with added latencies because of jitter buffers.
- mixed live source and non-live source scenarios.
* synchronized audio capture with non-live playback. (overdubs,..)
- clock slaving in the sinks due to the live sources providing their own
clocks.
To perform the needed latency corrections in the above scenarios, we must
develop an algorithm to calculate a global latency for the pipeline. The
algorithm must be extensible so that it can optimize the latency at runtime.
It must also be possible to disable or tune the algorithm based on specific
application needs (required minimal latency).
Pipelines without latency compensation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
We show some examples to demonstrate the problem of latency in typical
capture pipelines.
- Example 1
An audio capture/playback pipeline.
asrc: audio source, provides a clock
asink audio sink, provides a clock
.--------------------------.
| pipeline |
| .------. .-------. |
| | asrc | | asink | |
| | src -> sink | |
| '------' '-------' |
'--------------------------'
NULL->READY:
asink: NULL->READY: probes device, returns SUCCESS
asrc: NULL->READY: probes device, returns SUCCESS
READY->PAUSED:
asink: READY->PAUSED: open device, returns ASYNC
asrc: READY->PAUSED: open device, returns NO_PREROLL
* Since the source is a live source, it will only produce data in the
PLAYING state. To note this fact, it returns NO_PREROLL from the state change
function.
* This sink returns ASYNC because it can only complete the state change to
PAUSED when it receives the first buffer.
At this point the pipeline is not processing data and the clock is not
running. Unless a new action is performed on the pipeline, this situation will
never change.
PAUSED->PLAYING:
asrc clock selected because it is the most upstream clock provider. asink can
only provide a clock when it received the first buffer and configured the
device with the samplerate in the caps.
asink: PAUSED->PLAYING: sets pending state to PLAYING, returns ASYNC because
it is not prerolled. The sink will commit state to
PLAYING when it prerolls.
asrc: PAUSED->PLAYING: starts pushing buffers.
* since the sink is still performing a state change from READY -> PAUSED, it
remains ASYNC. The pending state will be set to PLAYING.
* The clock starts running as soon as all the elements have been set to
PLAYING.
* the source is a live source with a latency. Since it is synchronized with
the clock, it will produce a buffer with timestamp 0 and duration D after
time D, ie. it will only be able to produce the last sample of the buffer
(with timestamp D) at time D. This latency depends on the size of the
buffer.
* the sink will receive the buffer with timestamp 0 at time >= D. At this
point the buffer is too late already and might be dropped. This state of
constantly dropping data will not change unless a constant latency
correction is added to the incoming buffer timestamps.
The problem is due to the fact that the sink is set to (pending) PLAYING
without being prerolled, which only happens in live pipelines.
- Example 2
An audio/video capture/playback pipeline. We capture both audio and video and
have them played back synchronized again.
asrc: audio source, provides a clock
asink audio sink, provides a clock
vsrc: video source
vsink video sink
.--------------------------.
| pipeline |
| .------. .-------. |
| | asrc | | asink | |
| | src -> sink | |
| '------' '-------' |
| .------. .-------. |
| | vsrc | | vsink | |
| | src -> sink | |
| '------' '-------' |
'--------------------------'
The state changes happen in the same way as example 1. Both sinks end up with
pending state of PLAYING and a return value of ASYNC until they receive the
first buffer.
For audio and video to be played in sync, both sinks must compensate for the
latency of its source but must also use exactly the same latency correction.
Suppose asrc has a latency of 20ms and vsrc a latency of 33ms, the total
latency in the pipeline has to be at least 33ms. This also means that the
pipeline must have at least a 33 - 20 = 13ms buffering on the audio stream or
else the audio src will underrun while the audiosink waits for the previous
sample to play.
- Example 3
An example of the combination of a non-live (file) and a live source (vsrc)
connected to live sinks (vsink, sink).
.--------------------------.
| pipeline |
| .------. .-------. |
| | file | | sink | |
| | src -> sink | |
| '------' '-------' |
| .------. .-------. |
| | vsrc | | vsink | |
| | src -> sink | |
| '------' '-------' |
'--------------------------'
The state changes happen in the same way as example 1, except that sink will
be able to preroll (commit its state to PAUSED).
In this case sink will have no latency but vsink will. The total latency
should be that of vsink.
Note that because of the presence of a live source (vsrc), the pipeline can be
set to PLAYING before sink is able to preroll. Without compensation for the
live source, this might lead to synchronisation problems, because the latency
has to be configured in the element before it can go to PLAYING.
- Example 4
An example of the combination of a non-live and a live source. The non-live
source is connected to a live sink and the live source to a non-live sink.
  .---------------------------.
  | pipeline                  |
  | .------.       .-------.  |
  | | file |       | sink  |  |
  | |     src -> sink      |  |
  | '------'       '-------'  |
  | .------.       .-------.  |
  | | vsrc |       | files |  |
  | |     src -> sink      |  |
  | '------'       '-------'  |
  '---------------------------'
The state changes happen in the same way as in example 3: sink will be able to
preroll (commit its state to PAUSED), while files will not be able to preroll.
sink will have no latency since it is not connected to a live source. files
does not do synchronisation so it does not care about latency.
The total latency in the pipeline is 0. The vsrc captures in sync with the
playback in sink.
As in example 3, sink can only be set to PLAYING after it successfully
prerolled.
State Changes
~~~~~~~~~~~~~
A sink is never set to PLAYING before it is prerolled. In order to do this, the
pipeline (at the GstBin level) keeps track of all elements that require preroll
(the ones that return ASYNC from the state change). These elements have posted
an ASYNC_START message without a matching ASYNC_DONE message.
The pipeline will not change the state of the elements that are still doing an
ASYNC state change.
When an ASYNC element prerolls, it commits its state to PAUSED and posts an
ASYNC_DONE message. The pipeline notices this ASYNC_DONE message and matches it
with the ASYNC_START message it cached for the corresponding element.
When all ASYNC_START messages are matched with an ASYNC_DONE message, the
pipeline proceeds with setting the elements to the final state again.
The base time of the element was already set by the pipeline when it changed the
NO_PREROLL element to PLAYING. This operation has to be performed in the
separate async state change thread (like the one currently used for going from
PAUSED->PLAYING in a non-live pipeline).
Query
~~~~~
The pipeline latency is queried with the LATENCY query.
(out) "live", G_TYPE_BOOLEAN (default FALSE)
- if a live element is found upstream
(out) "min-latency", G_TYPE_UINT64 (default 0, must not be NONE)
- the minimum latency in the pipeline, meaning the minimum time
downstream elements synchronizing to the clock have to wait until
they can be sure that all data for the current running time has
been received.
Elements answering the latency query and introducing latency must
set this to the maximum time for which they will delay data, while
considering upstream's minimum latency. As such, from an element's
perspective this is *not* its own minimum latency but its own
maximum latency.
Considering upstream's minimum latency in general means that the
element's own value is added to upstream's value, as this will give
the overall minimum latency of all elements from the source to the
current element:
min_latency = upstream_min_latency + own_min_latency
(out) "max-latency", G_TYPE_UINT64 (default 0, NONE meaning infinity)
- the maximum latency in the pipeline, meaning the maximum time an
element synchronizing to the clock is allowed to wait for receiving
all data for the current running time. Waiting for a longer time
will result in data loss, overruns and underruns of buffers and in
general breaks synchronized data flow in the pipeline.
Elements answering the latency query should set this to the maximum
time for which they can buffer upstream data without blocking or
dropping further data. For an element this value will generally be
its own minimum latency, but might be bigger than that if it can
buffer more data. As such, queue elements can be used to increase
the maximum latency.
The value set in the query should again consider upstream's maximum
latency:
- If the current element has blocking buffering, i.e. it does
not drop data by itself when its internal buffer is full, it should
just add its own maximum latency (i.e. the size of its internal
buffer) to upstream's value. If upstream's maximum latency, or the
element's internal maximum latency, was NONE (i.e. infinity), it will
be set to infinity.
  if (upstream_max_latency == NONE || own_max_latency == NONE)
    max_latency = NONE;
  else
    max_latency = upstream_max_latency + own_max_latency;
If the element has multiple sinkpads, the minimum upstream latency is
the maximum of all live upstream minimum latencies.
- If the current element has leaky buffering, i.e. it drops data by
itself when its internal buffer is full, it should take the minimum
of its own maximum latency and upstream's. Examples for such
elements are audio sinks and sources with an internal ringbuffer,
leaky queues and in general live sources with a limited amount of
internal buffers that can be used.
max_latency = MIN (upstream_max_latency, own_max_latency)
Note: many GStreamer base classes allow subclasses to set a
minimum and maximum latency and handle the query themselves. These
base classes assume non-leaky (i.e. blocking) buffering for the
maximum latency. The base class' default query handler needs to be
overridden to correctly handle leaky buffering.
If the element has multiple sinkpads, the maximum upstream latency is
the minimum of all live upstream maximum latencies.
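To make the above concrete, here is a minimal sketch (against the GStreamer
1.x C API) of how a filter element with blocking buffering might handle the
LATENCY query on its source pad. OWN_MIN_LATENCY and OWN_MAX_LATENCY are
hypothetical placeholders for the element's own delay and internal buffer
size; a real element would compute these values.

  #define OWN_MIN_LATENCY (20 * GST_MSECOND)  /* hypothetical values */
  #define OWN_MAX_LATENCY (20 * GST_MSECOND)

  static gboolean
  my_filter_src_query (GstPad * pad, GstObject * parent, GstQuery * query)
  {
    switch (GST_QUERY_TYPE (query)) {
      case GST_QUERY_LATENCY:{
        GstClockTime min, max;
        gboolean live;

        /* the default handler forwards the query upstream for us */
        if (!gst_pad_query_default (pad, parent, query))
          return FALSE;

        gst_query_parse_latency (query, &live, &min, &max);

        /* add our own latency to upstream's minimum... */
        min += OWN_MIN_LATENCY;

        /* ...and to the maximum, keeping NONE (infinity) as NONE;
         * this assumes blocking (non-leaky) buffering */
        if (max != GST_CLOCK_TIME_NONE)
          max += OWN_MAX_LATENCY;

        gst_query_set_latency (query, live, min, max);
        return TRUE;
      }
      default:
        return gst_pad_query_default (pad, parent, query);
    }
  }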
Event
~~~~~
The latency in the pipeline is configured with the LATENCY event, which contains
the following fields:
"latency", G_TYPE_UINT64
- the configured latency in the pipeline
Latency compensation
~~~~~~~~~~~~~~~~~~~~
Latency calculation and compensation are performed before the pipeline
proceeds to the PLAYING state.
When the pipeline collected all ASYNC_DONE messages it can calculate the global
latency as follows:
- perform a latency query on all sinks
- sources set their minimum and maximum latency
- other elements add their own values as described above
- latency = MAX (all min latencies)
- if MIN (all max latencies) < latency we have an impossible situation and we
must generate an error indicating that this pipeline cannot be played. This
usually means that there is not enough buffering in some chain of the
pipeline. A queue can be added to those chains.
The sinks gather this information with a LATENCY query upstream. Intermediate
elements pass the query upstream and add the amount of latency they add to the
result.
ex1:
sink1: [20 - 20]
sink2: [33 - 40]
MAX (20, 33) = 33
MIN (20, 40) = 20 < 33 -> impossible
ex2:
sink1: [20 - 50]
sink2: [33 - 40]
MAX (20, 33) = 33
MIN (50, 40) = 40 >= 33 -> latency = 33
The latency is set on the pipeline by sending a LATENCY event to the sinks
in the pipeline. This event configures the total latency on the sinks. The
sink forwards this LATENCY event upstream so that intermediate elements can
configure themselves as well.
After this step, the pipeline continues setting the pending state on its
elements.
A sink adds the latency value, received in the LATENCY event, to
the times used for synchronizing against the clock. This will effectively
delay the rendering of the buffer with the required latency. Since this delay is
the same for all sinks, all sinks will render data relatively synchronised.
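An application can inspect the resulting values with the same LATENCY query on
the pipeline; a small sketch (pipeline is assumed to be a GstPipeline in the
PAUSED or PLAYING state):

  GstQuery *query = gst_query_new_latency ();

  if (gst_element_query (pipeline, query)) {
    gboolean live;
    GstClockTime min, max;

    gst_query_parse_latency (query, &live, &min, &max);
    /* min is the latency that will be distributed with the LATENCY event */
    g_print ("live %d min %" GST_TIME_FORMAT " max %" GST_TIME_FORMAT "\n",
        live, GST_TIME_ARGS (min), GST_TIME_ARGS (max));
  }
  gst_query_unref (query);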
Flushing a playing pipeline
~~~~~~~~~~~~~~~~~~~~~~~~~~~
We can implement resynchronisation after an uncontrolled FLUSH in (part of) a
pipeline in the same way. Indeed, when a flush is performed on
a PLAYING live element, a new base time must be distributed to this element.
A flush in a pipeline can happen in the following cases:
- flushing seek in the pipeline
- performed by the application on the pipeline
- performed by the application on an element
- flush performed by an element
- after receiving a navigation event (DVD, ...)
When a playing sink is flushed by a FLUSH_START event, an ASYNC_START message is
posted by the element. As part of the message, the fact that the element got
flushed is included. The element also goes to a pending PAUSED state and has to
be set to the PLAYING state again later.
The ASYNC_START message is kept by the parent bin. When the element prerolls,
it posts an ASYNC_DONE message.
When all ASYNC_START messages are matched with an ASYNC_DONE message, the bin
will capture a new base_time from the clock and will bring all the sinks back to
PLAYING after setting the new base time on them. It's also possible
to perform additional latency calculations and adjustments before doing this.
Dynamically adjusting latency
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
An element that wants to change the latency in the pipeline can do so by
posting a LATENCY message on the bus. This message instructs the pipeline to:
- query the latency in the pipeline (which might now have changed) with a
LATENCY query.
- redistribute a new global latency to all elements with a LATENCY event.
A use case where the latency in a pipeline can change could be a network element
that observes an increased inter packet arrival jitter or excessive packet loss
and decides to increase its internal buffering (and thus the latency). The
element must post a LATENCY message and perform the additional latency
adjustments when it receives the LATENCY event from the downstream peer element.
In a similar way, the latency can be decreased when network conditions are
improving again.
Latency adjustments will introduce glitches in playback in the sinks and must
only be performed in special conditions.
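A sketch of how an element triggers this procedure (element is the element
whose latency changed; it is assumed to have updated its internal latency
values before posting):

  /* tell the pipeline that our latency changed; it will then redo the
   * LATENCY query and distribute a new latency with the LATENCY event */
  gst_element_post_message (element,
      gst_message_new_latency (GST_OBJECT_CAST (element)));

An application that sees the LATENCY message on the bus can perform the
recalculation by calling gst_bin_recalculate_latency() on the pipeline.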
@@ -1,60 +0,0 @@
Live sources
------------
A live source is a source that cannot be arbitrarily PAUSED without losing
data.
A live source, such as an element capturing audio or video, needs to be handled
in a special way. It does not make sense to start the dataflow in the PAUSED
state for those devices as the user might wait a long time between going from
PAUSED to PLAYING, making the previously captured buffers irrelevant.
A live source therefore only produces buffers in the PLAYING state. This has
implications for sinks waiting for a buffer to complete the preroll state
since such a buffer might never arrive.
Live sources return NO_PREROLL when going to the PAUSED state to inform the
bin/pipeline that this element will not be able to produce data in the
PAUSED state. NO_PREROLL should be returned for both READY->PAUSED and
PLAYING->PAUSED.
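A minimal sketch of how a live source typically implements this in its state
change function (GStreamer 1.x API; element boilerplate and parent_class setup
omitted):

  static GstStateChangeReturn
  my_live_src_change_state (GstElement * element, GstStateChange transition)
  {
    GstStateChangeReturn ret;

    ret = GST_ELEMENT_CLASS (parent_class)->change_state (element, transition);
    if (ret == GST_STATE_CHANGE_FAILURE)
      return ret;

    switch (transition) {
      case GST_STATE_CHANGE_READY_TO_PAUSED:
      case GST_STATE_CHANGE_PLAYING_TO_PAUSED:
        /* we cannot produce data in PAUSED */
        ret = GST_STATE_CHANGE_NO_PREROLL;
        break;
      default:
        break;
    }
    return ret;
  }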
When performing a get_state() on a bin with a non-zero timeout value, the
bin must be sure that there are no live sources in the pipeline because
otherwise the get_state() function would block on the sinks.
A GstBin therefore always performs a zero-timeout get_state() on its
elements to discover the NO_PREROLL (and ERROR) elements before performing
a blocking wait.
Scheduling
~~~~~~~~~~
Live sources will not produce data in the paused state. They block in the
getrange function or in the loop function until they go to PLAYING.
Latency
~~~~~~~
The live source timestamps its data with the time of the clock at the
time the data was captured. Normally it will take some time to capture
the first sample of data and the last sample. This means that when the
buffer arrives at the sink, it will already be late and will be dropped.
The latency is the time it takes to construct one buffer of data. This latency
is exposed with a LATENCY query.
See part-latency.txt.
Timestamps
~~~~~~~~~~
Live sources always timestamp their buffers with the running_time of the
pipeline. This is needed to be able to match the timestamps of different live
sources in order to synchronize them.
This is in contrast to non-live sources, which timestamp their buffers starting
from running_time 0.
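A sketch of the timestamping described above, using the 1.x API (src and
buffer are assumed to exist; duration stands for the duration of the captured
data):

  GstClock *clock = gst_element_get_clock (GST_ELEMENT (src));
  GstClockTime base_time = gst_element_get_base_time (GST_ELEMENT (src));
  GstClockTime now = gst_clock_get_time (clock);

  /* running_time = absolute clock time - base time */
  GST_BUFFER_PTS (buffer) = now - base_time;
  GST_BUFFER_DURATION (buffer) = duration;

  gst_object_unref (clock);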
@@ -1,169 +0,0 @@
GstMemory
---------
This document describes the design of the memory objects.
GstMemory objects are usually added to GstBuffer objects and contain the
multimedia data passed around in the pipeline.
Requirements
~~~~~~~~~~~~
- It must be possible to have different memory allocators
- It must be possible to efficiently share memory objects, copy, span
and trim.
Memory layout
~~~~~~~~~~~~~
GstMemory manages a memory region. The accessible part of the managed region is
defined by an offset relative to the start of the region and a size. This
means that the managed region can be larger than what is visible to the user of
GstMemory API.
Schematically, a GstMemory has a pointer to a memory region of _maxsize_. The
area starting at _offset_ and spanning _size_ bytes is accessible.

             memory
  GstMemory ->*----------------------------------------------------*
               ^----------------------------------------------------^
                                     maxsize
                     ^--------------------------------------^
                   offset                 size
The current properties of the accessible memory can be retrieved with:
gsize gst_memory_get_sizes (GstMemory *mem, gsize *offset, gsize *maxsize);
The offset and size can be changed with:
void gst_memory_resize (GstMemory *mem, gssize offset, gsize size);
Allocators
~~~~~~~~~~
GstMemory objects are created by allocators. Allocators are a subclass
of GstObject and can be subclassed to make custom allocators.
  struct _GstAllocator {
    GstObject                object;

    const gchar             *mem_type;

    GstMemoryMapFunction     mem_map;
    GstMemoryUnmapFunction   mem_unmap;
    GstMemoryCopyFunction    mem_copy;
    GstMemoryShareFunction   mem_share;
    GstMemoryIsSpanFunction  mem_is_span;
  };
The allocator class has 2 virtual methods. One to create a GstMemory,
another to free it again.
  struct _GstAllocatorClass {
    GstObjectClass object_class;

    GstMemory * (*alloc) (GstAllocator *allocator, gsize size,
                          GstAllocationParams *params);
    void        (*free)  (GstAllocator *allocator, GstMemory *memory);
  };
Allocators are refcounted. It is also possible to register the allocator to the
GStreamer system. This way, the allocator can be retrieved by name.
After an allocator is created, new GstMemory can be created with
GstMemory * gst_allocator_alloc (const GstAllocator * allocator,
gsize size,
GstAllocationParams *params);
GstAllocationParams contain extra info such as flags, alignment, prefix and
padding.
The GstMemory object is a refcounted object that must be freed with
gst_memory_unref ().
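For illustration, a small sketch that allocates 4096 bytes from the default
allocator (passing NULL selects the default) and frees the memory again:

  GstAllocationParams params;
  GstMemory *mem;

  gst_allocation_params_init (&params);
  params.align = 15;   /* request 16-byte alignment */

  mem = gst_allocator_alloc (NULL, 4096, &params);

  /* ... use the memory ... */

  gst_memory_unref (mem);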
The GstMemory keeps a ref to the allocator that allocated it. The most common
GstMemory operations are listed inside the allocator structure. Custom
GstAllocator implementations must implement the various operations on
the memory they allocate.
It is also possible to create a new GstMemory object that wraps existing
memory with:
GstMemory * gst_memory_new_wrapped (GstMemoryFlags flags,
gpointer data, gsize maxsize,
gsize offset, gsize size,
gpointer user_data,
GDestroyNotify notify);
Lifecycle
~~~~~~~~~
GstMemory extends from GstMiniObject and therefore uses its lifecycle
management (See part-miniobject.txt).
Data Access
~~~~~~~~~~~
Access to the memory region is always controlled with a map and unmap method
call. This allows the implementation to monitor the access patterns or set up
the required memory mappings when needed.
The access of the memory object is controlled with the locking mechanism on
GstMiniObject (See part-miniobject.txt).
Mapping a memory region requires the caller to specify the access method: READ
and/or WRITE. Mapping a memory region will first try to get a lock on the
memory in the requested access mode. This means that the map operation can
fail when WRITE access is requested on a non-writable memory object (it has
an exclusive counter > 1, the memory is already locked in an incompatible
access mode or the memory is marked readonly).
After the data has been accessed in the object, the unmap call must be
performed, which will unlock the memory again.
It is allowed to recursively map multiple times with the same or narrower
access modes. For each of the map calls, a corresponding unmap call needs to
be made. WRITE-only memory cannot be mapped in READ mode and READ-only memory
cannot be mapped in WRITE mode.
The memory pointer returned from the map call is guaranteed to remain valid in
the requested mapping mode until the corresponding unmap call is performed on
the pointer.
When multiple map operations are nested and return the same pointer, the pointer
is valid until the last unmap call is done.
When the final reference on a memory object is dropped, all outstanding
mappings should have been unmapped.
Resizing a GstMemory does not influence any current mappings in any way.
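A typical map/unmap cycle then looks like this (sketch; mem is an existing
GstMemory):

  GstMapInfo info;

  if (gst_memory_map (mem, &info, GST_MAP_READ)) {
    /* info.data points to info.size readable bytes until unmap */
    g_print ("mapped %" G_GSIZE_FORMAT " bytes\n", info.size);
    gst_memory_unmap (mem, &info);
  }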
Copy
~~~~
A GstMemory copy can be made with the gst_memory_copy() call. Normally,
allocators will implement a custom version of this function to make a copy of
the same kind of memory as the original one.
This is what the fallback version of the copy function does, albeit slower
than what a custom implementation could do.
The copy operation is only required to copy the visible range of the memory
block.
Share
~~~~~
A memory region can be shared between GstMemory objects with the
gst_memory_share() operation.
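For example, sharing the first half of the accessible region without copying
(sketch; mem is an existing GstMemory):

  gsize size = gst_memory_get_sizes (mem, NULL, NULL);
  GstMemory *sub = gst_memory_share (mem, 0, size / 2);

  /* sub references the same underlying region, no data was copied */
  gst_memory_unref (sub);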
@@ -1,148 +0,0 @@
Messages
--------
Messages are refcounted lightweight objects used to notify the application
of pipeline events.
Messages are implemented as a subclass of GstMiniObject with a generic
GstStructure as the content. This allows for writing custom messages without
requiring an API change while allowing a wide range of different types
of messages.
Messages are posted by objects in the pipeline and are passed to the
application using the GstBus (See also part-gstbus.txt and part-gstpipeline.txt).
Message types
~~~~~~~~~~~~~
GST_MESSAGE_EOS:
Posted by sink elements. This message is posted to the application when all
the sinks in a pipeline have posted an EOS message. When performing a
flushing seek, the EOS state of the pipeline and sinks is reset.
GST_MESSAGE_ERROR:
An element in the pipeline got into an error state. The message carries
a GError and a debug string describing the error. This usually means that
part of the pipeline is not streaming anymore.
GST_MESSAGE_WARNING:
An element in the pipeline encountered a condition that made it produce a
warning. This could be a recoverable decoding error or some other non-fatal
event. The pipeline continues streaming after a warning.
GST_MESSAGE_INFO:
An element produced an informational message.
GST_MESSAGE_TAG:
An element decoded metadata about the stream. The message carries a GstTagList
with the tag information.
GST_MESSAGE_BUFFERING:
An element is buffering data and that could potentially take some time. This
message is typically emitted by elements that perform some sort of network
buffering. While the pipeline is buffering it should remain in the PAUSED
state. When the buffering is finished, it can resume PLAYING.
GST_MESSAGE_STATE_CHANGED:
An element changed state in the pipeline. The message carries the old, new
and pending state of the element.
GST_MESSAGE_STATE_DIRTY:
An internal message used to instruct a pipeline hierarchy that a state
recalculation must be performed because of an ASYNC state change completed.
This message is not used anymore.
GST_MESSAGE_STEP_DONE:
An element stepping frames has finished. This is currently not used.
GST_MESSAGE_CLOCK_PROVIDE:
An element notifies its capability of providing a clock for the pipeline.
GST_MESSAGE_CLOCK_LOST:
The current clock, as selected by the pipeline, became unusable. The pipeline
will select a new clock on the next PLAYING state change.
GST_MESSAGE_NEW_CLOCK:
A new clock was selected for the pipeline.
GST_MESSAGE_STRUCTURE_CHANGE:
The pipeline changed its structure. This means elements were added or removed
or pads were linked or unlinked. This message is not yet used.
GST_MESSAGE_STREAM_STATUS:
Posted by an element when it starts/stops/pauses a streaming task. It
contains information about the reason why the stream state changed along
with the thread id. The application can use this information to detect
failures in streaming threads and/or to adjust streaming thread priorities.
GST_MESSAGE_APPLICATION:
The application posted a message. This message must be used when the
application posts a message on the bus.
GST_MESSAGE_ELEMENT:
Element-specific message. See the specific element's documentation.
GST_MESSAGE_SEGMENT_START:
An element started playback of a new segment. This message is not forwarded
to applications but is used internally to schedule SEGMENT_DONE messages.
GST_MESSAGE_SEGMENT_DONE:
An element or bin completed playback of a segment. This message is only posted
on the bus if a SEGMENT seek is performed on a pipeline.
GST_MESSAGE_DURATION_CHANGED:
An element posts this message when it has detected or updated the stream duration.
GST_MESSAGE_ASYNC_START:
Posted by sinks when they start an asynchronous state change.
GST_MESSAGE_ASYNC_DONE:
Posted by sinks when they receive the first data buffer and complete the
asynchronous state change.
GST_MESSAGE_LATENCY:
Posted by elements when the latency in a pipeline changed and a new global
latency should be calculated by the pipeline or application.
GST_MESSAGE_REQUEST_STATE:
Posted by elements when they want to change the state of the pipeline they
are in. A typical use case would be an audio sink that requests the pipeline
to pause in order to play a higher priority stream.
GST_MESSAGE_STEP_START:
A Stepping operation has started.
GST_MESSAGE_QOS:
A buffer was dropped or an element changed its processing strategy for
Quality of Service reasons.
GST_MESSAGE_PROGRESS:
A progress message was posted. Progress messages inform the application about
the state of asynchronous operations.
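For illustration, a hedged sketch of how an application typically consumes
some of these messages with a bus watch:

  static gboolean
  bus_cb (GstBus * bus, GstMessage * msg, gpointer user_data)
  {
    switch (GST_MESSAGE_TYPE (msg)) {
      case GST_MESSAGE_ERROR:{
        GError *err = NULL;
        gchar *dbg = NULL;

        gst_message_parse_error (msg, &err, &dbg);
        g_printerr ("ERROR from %s: %s\n",
            GST_OBJECT_NAME (msg->src), err->message);
        g_clear_error (&err);
        g_free (dbg);
        break;
      }
      case GST_MESSAGE_EOS:
        g_print ("end of stream\n");
        break;
      default:
        break;
    }
    return TRUE;   /* keep the watch installed */
  }

  /* installed with: gst_bus_add_watch (bus, bus_cb, NULL); */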
@@ -1,396 +0,0 @@
GstMeta
-------
This document describes the design for arbitrary per-buffer metadata.
Buffer metadata typically describes the low level properties of the buffer
content. These properties are commonly not negotiated with caps but they are
negotiated in the bufferpools.
Some examples of metadata:
- interlacing information
- video alignment, cropping, panning information
- extra container information such as granulepos, ...
- extra global buffer properties
Requirements
~~~~~~~~~~~~
- It must be fast
* allocation, free, low fragmentation
* access to the metadata fields, preferably not much slower than directly
accessing a C structure field
- It must be extensible. Elements should be able to add new arbitrary metadata
without requiring much effort. Also new metadata fields should not break API
or ABI.
- It plays nice with subbuffers. When a subbuffer is created, the various
buffer metadata should be copied/updated correctly.
- We should be able to negotiate metadata between elements
Use cases
---------
* Video planes
Video data is sometimes allocated in non-contiguous planes for the Y and the UV
data. We need to be able to specify the data on a buffer using multiple
pointers in memory. We also need to be able to specify the stride for these
planes.
* Extra buffer data
Some elements might need to store extra data for a buffer. This is typically
done when the resources are allocated from another subsystem such as OMX or
X11.
* Processing information
Pan and crop information can be added to the buffer data when the downstream
element can understand and use this metadata. An imagesink can, for example,
use the pan and cropping information when blitting the image on the screen
with little overhead.
GstMeta
~~~~~~~
A GstMeta is a structure as follows:
  struct _GstMeta {
    GstMetaFlags       flags;
    const GstMetaInfo *info;    /* tag and info for the meta item */
  };
The purpose of this structure is to serve as a common header for all metadata
information that we can attach to a buffer. Specific metadata, such as timing
metadata, will have this structure as the first field. For example:
  struct _GstMetaTiming {
    GstMeta      meta;        /* common meta header */

    GstClockTime dts;         /* decoding timestamp */
    GstClockTime pts;         /* presentation timestamp */
    GstClockTime duration;    /* duration of the data */
    GstClockTime clock_rate;  /* clock rate for the above values */
  };
Another example, for the video memory regions, consists of both fields and
methods.
  #define GST_VIDEO_MAX_PLANES 4

  struct GstMetaVideo {
    GstMeta        meta;

    GstBuffer     *buffer;

    GstVideoFlags  flags;
    GstVideoFormat format;
    guint          id;
    guint          width;
    guint          height;

    guint          n_planes;
    gsize          offset[GST_VIDEO_MAX_PLANES]; /* offset in the buffer memory region of the
                                                  * first pixel. */
    gint           stride[GST_VIDEO_MAX_PLANES]; /* stride of the image lines. Can be negative when
                                                  * the image is upside-down */

    gpointer (*map)   (GstMetaVideo *meta, guint plane, gpointer *data, gint *stride,
                       GstMapFlags flags);
    gboolean (*unmap) (GstMetaVideo *meta, guint plane, gpointer data);
  };

  gpointer gst_meta_video_map   (GstMetaVideo *meta, guint plane, gpointer *data,
                                 gint *stride, GstMapFlags flags);
  gboolean gst_meta_video_unmap (GstMetaVideo *meta, guint plane, gpointer data);
GstMeta derived structures define the API of the metadata. The API can consist of
fields and/or methods. It is possible to have different implementations for the
same GstMeta structure.
The implementation of the GstMeta API would typically add more fields to the
public structure that allow it to implement the API.
GstMetaInfo will point to more information about the metadata and looks like this:
  struct _GstMetaInfo {
    GType                     api;    /* api type */
    GType                     type;   /* implementation type */
    gsize                     size;   /* size of the structure */

    GstMetaInitFunction       init_func;
    GstMetaFreeFunction       free_func;
    GstMetaTransformFunction  transform_func;
  };
api will contain a GType of the metadata API. A repository of registered MetaInfo
will be maintained by the core. We will register some common metadata structures
in core and some media specific info for audio/video/text in -base. Plugins can
register additional custom metadata.
For each implementation of api, there will thus be a unique GstMetaInfo. In the
case of metadata with a well defined API, the implementation specific init
function will setup the methods in the metadata structure. A unique GType will
be made for each implementation and stored in the type field.
Along with the metadata description we will have functions to initialize/free (and/or refcount)
a specific GstMeta instance. We also have the possibility to add a custom
transform function that can be used to modify the metadata when a transformation
happens.
There are no explicit methods to serialize and deserialize the metadata. Since
each type has a GType, we can reuse the GValue transform functions for this.
The purpose of the separate MetaInfo is to not have to carry the free/init functions in
each buffer instance but to define them globally. We still want quick access to the info
so we need to make the buffer metadata point to the info.
Technically we could also specify the field and types in the MetaInfo and
provide a generic API to retrieve the metadata fields without the need for a
header file. We will not do this yet.
Allocation of the GstBuffer structure will result in the allocation of a memory region
of a customizable size (512 bytes). Only the first sizeof (GstBuffer) bytes of this
region will initially be used. The remaining bytes will be part of the free metadata
region of the buffer. Different implementations are possible and are invisible
in the API or ABI.
The complete buffer with metadata could, for example, look as follows:
                       +-------------------------------------+
  GstMiniObject        | GType (GstBuffer)                   |
                       | refcount, flags, copy/disp/free     |
                       +-------------------------------------+
  GstBuffer            | pool,pts,dts,duration,offsets       |
                       | <private data>                      |
                       +.....................................+
                       | next                               ---+
                    +- | info                               ------> GstMetaInfo
  GstMetaTiming     |  |                                     |  |
                    |  | dts                                 |  |
                    |  | pts                                 |  |
                    |  | duration                            |  |
                    +- | clock_rate                          |  |
                       + . . . . . . . . . . . . . . . . . . +  |
                       | next                               <--+
  GstMetaVideo   +- +- | info                               ------> GstMetaInfo
                 |  |  |                                     |
                 |  |  | flags                               |
                 |  |  | n_planes                            |
                 |  |  | planes[]                            |
                 |  |  | map                                 |
                 |  |  | unmap                               |
                 +- |  |                                     |
                    |  | private fields                      |
  GstMetaVideoImpl  |  | ...                                 |
                    |  | ...                                 |
                    +- |                                     |
                       + . . . . . . . . . . . . . . . . . . +
                       .                                     .
API examples
~~~~~~~~~~~~
Buffers are created using the normal gst_buffer_new functions. The standard fields
are initialized as usual. A memory area that is bigger than the structure size
is allocated for the buffer metadata.
gst_buffer_new ();
After creating a buffer, the application can set caps and add metadata
information.
To add or retrieve metadata, a handle to a GstMetaInfo structure needs to be
obtained. This defines the implementation and API of the metadata. Usually, a
handle to this info structure can be obtained by calling a public _get_info()
method from a shared library (for shared metadata).
The following defines can usually be found in the shared .h file.
GstMetaInfo * gst_meta_timing_get_info();
#define GST_META_TIMING_INFO (gst_meta_timing_get_info())
Adding metadata to a buffer can be done with the gst_buffer_add_meta() call.
This function will create new metadata based on the implementation specified by
the GstMetaInfo. It is also possible to pass a generic pointer to the add_meta()
function that can contain parameters to initialize the new metadata fields.
Retrieving the metadata on a buffer can be done with the
gst_buffer_get_meta() method. This function retrieves existing metadata
conforming to the API specified in the given info. When no such metadata
exists, the function will return NULL.
GstMetaTiming *timing;
timing = gst_buffer_get_meta (buffer, GST_META_TIMING_INFO);
Once a reference to the info has been obtained, the associated metadata can be
added or modified on a buffer.
timing->timestamp = 0;
timing->duration = 20 * GST_MSECOND;
Other convenience macros can be made to simplify the above code:
  #define gst_buffer_get_meta_timing(b) \
      ((GstMetaTiming *) gst_buffer_get_meta ((b), GST_META_TIMING_INFO))
This makes the code look like this:
GstMetaTiming *timing;
timing = gst_buffer_get_meta_timing (buffer);
timing->timestamp = 0;
timing->duration = 20 * GST_MSECOND;
To iterate the different metainfo structures, one can use the
gst_buffer_meta_get_next() methods.
GstMeta *current = NULL;
/* passing NULL gives the first entry */
current = gst_buffer_meta_get_next (buffer, current);
/* passing a GstMeta returns the next */
current = gst_buffer_meta_get_next (buffer, current);
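For comparison, the iteration API that eventually shipped in GStreamer 1.x
uses an opaque state pointer instead of the previous entry; a sketch:

  GstMeta *meta;
  gpointer state = NULL;

  while ((meta = gst_buffer_iterate_meta (buffer, &state)) != NULL) {
    g_print ("buffer carries meta with API %s\n",
        g_type_name (meta->info->api));
  }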
Memory management
~~~~~~~~~~~~~~~~~
* allocation
We initially allocate a reasonable sized GstBuffer structure (say 512 bytes).
Since the complete buffer structure, including a large area for metadata, is
allocated in one go, we can reduce the number of memory allocations while still
providing dynamic metadata.
When adding metadata, we need to call the init function of the associated
metadata info structure. Since adding the metadata requires the caller to pass
a handle to the info, this operation does not require table lookups.
Per-metadata memory initialisation is needed because not all metadata is
initialized in the same way. We need to, for example, set the timestamps to
NONE in the MetaTiming structures.
The init/free functions can also be used to implement refcounting for a metadata
structure. This can be useful when a structure is shared between buffers.
When the free_size of the GstBuffer is exhausted, we will allocate new memory
for each newly added Meta and use the next pointers to point to this. It
is expected that this does not occur often and we might be able to optimize
this transparently in the future.
* free
When a GstBuffer is freed, we potentially might have to call a custom free
function on the metadata info. In the case of the Memory metadata, we need to
call the associated free function to free the memory.
When freeing a GstBuffer, the custom buffer free function will iterate all of
the metadata in the buffer and call the associated free functions in the
MetaInfo associated with the entries. Usually, this function will be NULL.
Serialization
~~~~~~~~~~~~~
When a buffer should be sent over the wire or be serialized in GDP, we need a
way to perform custom serialization and deserialization on the metadata.
For this we can use the GValue transform functions.
Transformations
~~~~~~~~~~~~~~~
After certain transformations, the metadata on a buffer might not be relevant
anymore.
Consider, for example, metadata that lists certain regions of interest
on the video data. If the video is scaled or rotated, the coordinates might not
make sense anymore. A transform element should be able to adjust or remove the
associated metadata when it becomes invalid.
We can make the transform element aware of the metadata so that it can adjust or
remove in an intelligent way. Since we allow arbitrary metadata, we can't do
this for all metadata and thus we need some other way.
One proposition is to tag the metadata type with keywords that specify what it
functionally refers too. We could, for example, tag the metadata for the regions
of interest with a tag that notes that the metadata refers to absolute pixel
positions. A transform could then know that the metadata is not valid anymore
when the position of the pixels changed (due to rotation, flipping, scaling and
so on).
Subbuffers
~~~~~~~~~~
Subbuffers are implemented with a generic copy. Parameters to the copy
are the offset and size. This allows each metadata structure to implement the
actions needed to update the metadata of the subbuffer.
It might not make sense for some metadata to work with subbuffers. For example
when we take a subbuffer of a buffer with a video frame, the GstMetaVideo
simply becomes invalid and is removed from the new subbuffer.
Relationship with GstCaps
~~~~~~~~~~~~~~~~~~~~~~~~~
The difference between GstCaps, used in negotiation, and the metadata is not
clearly defined.
We would like to think of the GstCaps containing the information needed to
functionally negotiate the format between two elements. The Metadata should then
only contain variables that can change between each buffer.
For example, for video we would have width/height/framerate in the caps but then
have the more technical details, such as stride, data pointers, pan/crop/zoom
etc in the metadata.
A scheme like this would still allow us to functionally specify the desired
video resolution while the implementation details would be inside the metadata.
Relationship with GstMiniObject qdata
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
qdata on a miniobject is element private and is not visible to other elements.
Therefore qdata never contains essential information that describes the buffer
content.
Compatibility
~~~~~~~~~~~~~
We need to make sure that elements exchange metadata that they both understand.
This is particularly important when the metadata describes the data layout in
memory (such as strides).
The ALLOCATION query is used to let upstream know what metadata we can support.
It is also possible to have a bufferpool add certain metadata to the buffers
from the pool. This feature is activated by enabling a buffer option when
configuring the pool.
Notes
~~~~~
Some structures that we need to be able to add to buffers.
* Clean Aperture
* Arbitrary Matrix Transform
* Aspect ratio
* Pan/crop/zoom
* Video strides
Some of these overlap, we need to find a minimal set of metadata structures that
allows us to define all use cases.
@@ -1,209 +0,0 @@
GstMiniObject
-------------
This document describes the design of the miniobject base class.
The miniobject abstract base class is used to construct lightweight refcounted
and boxed types that are frequently created and destroyed.
Requirements
~~~~~~~~~~~~
- Be lightweight
- Refcounted
- It must be possible to control access to the object, ie. when the object is
readable and writable.
- Subclasses must be able to use their own allocator for the memory.
Usage
~~~~~
Users of the GstMiniObject infrastructure will need to define a structure that
includes the GstMiniObject structure as the first field.
  typedef struct {
    GstMiniObject mini_object;

    /* my fields */
    ...
  } MyObject;
The subclass should then implement a constructor method where it allocates the
memory for its structure and initializes the miniobject structure with
gst_mini_object_init(). Copy and Free functions are provided to the
gst_mini_object_init() function.
  MyObject *
  my_object_new (void)
  {
    MyObject *res = g_slice_new (MyObject);

    gst_mini_object_init (GST_MINI_OBJECT_CAST (res), 0,
        MY_TYPE_OBJECT,
        (GstMiniObjectCopyFunction) _my_object_copy,
        (GstMiniObjectDisposeFunction) NULL,
        (GstMiniObjectFreeFunction) _my_object_free);

    /* other init */
    .....

    return res;
  }
The Free function is responsible for freeing the allocated memory for
the structure.
  static void
  _my_object_free (MyObject *obj)
  {
    /* other cleanup */
    ...

    g_slice_free (MyObject, obj);
  }
Lifecycle
~~~~~~~~~
GstMiniObject is refcounted. When a GstMiniObject is first created,
it has a refcount of 1.
Each variable holding a reference to a GstMiniObject is responsible for
updating the refcount. This includes incrementing the refcount with
gst_mini_object_ref() when a reference is kept to a miniobject or
gst_mini_object_unref() when a reference is released.
When the refcount reaches 0, and thus no objects hold a reference to the
miniobject anymore, we can free the miniobject.
When freeing the miniobject, first the GstMiniObjectDisposeFunction is called.
This function is allowed to revive the object again by incrementing the
refcount, in which case it should return FALSE from the dispose function. The
dispose function is used by GstBuffer to revive the buffer back into the
GstBufferPool when needed.
When the dispose function returns TRUE, the GstMiniObjectFreeFunction will be
called and the miniobject will be freed.
Copy
~~~~
A miniobject can be copied with gst_mini_object_copy(). This function will
call the custom copy function that was provided when registering the new
GstMiniObject subclass.
The copy function should try to preserve as much info from the original object
as possible.
The new copy should be writable.
Access management
~~~~~~~~~~~~~~~~~
GstMiniObject can be shared between multiple threads. It is important that
when a thread writes to a GstMiniObject, the other threads do not see the
changes.
To avoid exposing changes from one thread to another thread, the miniobjects
are managed in a Copy-On-Write way. A copy is only made when it is known that
the object is shared between multiple objects or threads.
There are 2 methods implemented for controlling access to the miniobject.
- A first method relies on the refcount of the object to control writability.
Objects using this method have the LOCKABLE flag unset.
- A second method relies on a separate counter for controlling
the access to the object. Objects using this method have the LOCKABLE flag
set.
You can check if an object is writable with gst_mini_object_is_writable() and
you can make any miniobject writable with gst_mini_object_make_writable().
This will create a writable copy when the object was not writable.
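A sketch of the usual pattern, reusing the hypothetical MyObject type from
above:

  MyObject *obj = my_object_new ();

  /* a second reference makes a non-LOCKABLE object non-writable */
  MyObject *ref =
      (MyObject *) gst_mini_object_ref (GST_MINI_OBJECT_CAST (obj));

  /* returns the object itself when writable, otherwise a writable copy */
  obj = (MyObject *)
      gst_mini_object_make_writable (GST_MINI_OBJECT_CAST (obj));

  gst_mini_object_unref (GST_MINI_OBJECT_CAST (ref));
  gst_mini_object_unref (GST_MINI_OBJECT_CAST (obj));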
non-LOCKABLE GstMiniObjects
---------------------------
These GstMiniObjects have the LOCKABLE flag unset. They use the refcount value
to control writability of the object.
When the refcount of the miniobject is > 1, the object is referenced by at
least 2 objects and is thus considered unwritable. A copy must be made before
a modification to the object can be done.
Using the refcount to control writability is problematic for many language
bindings that can keep additional references to the objects. This method is
mainly for historical reasons until all users of the miniobjects are
converted to use the LOCKABLE flag.
LOCKABLE GstMiniObjects
-----------------------
These GstMiniObjects have the LOCKABLE flag set. They use a separate counter
for controlling writability and access to the object.
It consists of 2 components:
* exclusive counter
Each object that wants to keep a reference to a GstMiniObject and doesn't want to
see the changes from other owners of the same GstMiniObject needs to lock the
GstMiniObject in EXCLUSIVE mode, which will increase the exclusive counter.
The exclusive counter counts the number of objects that share this
GstMiniObject. The counter is initially 0, meaning that the object is not
shared with any object.
When a reference to a GstMiniObject is released, the refcount and the
exclusive counter are decreased with gst_mini_object_unref() and
gst_mini_object_unlock() respectively.
* locking
All read and write access must be performed between a gst_mini_object_lock() and
gst_mini_object_unlock() pair with the requested access method.
A gst_mini_object_lock() can fail when a WRITE lock is requested and the exclusive
counter is > 1. Indeed a GstMiniObject object with an exclusive counter > 1 is
locked EXCLUSIVELY by at least 2 objects and is therefore not writable.
Once the GstMiniObject is locked with a certain access mode, it can be recursively
locked with the same or narrower access mode. For example, first locking the
GstMiniObject in READWRITE mode allows you to recursively lock the
GstMiniObject in READWRITE, READ and WRITE mode. Memory locked in READ mode
cannot be locked recursively in WRITE or READWRITE mode.
Note that multiple threads can READ lock the GstMiniObject concurrently, but
they cannot lock the object in WRITE mode because the exclusive counter is
then > 1.
All calls to gst_mini_object_lock() need to be paired with one
gst_mini_object_unlock() call with the same access mode. When the last refcount
of the object is removed, there should be no more outstanding locks.
Note that a shared count of 0 or 1 leaves the GstMiniObject writable. The
reason is to make it easy to create and pass ownership of the GstMiniObject to
another object while keeping it writable. When the GstMiniObject is
created with a shared count of 0, it is writable. When the GstMiniObject is then
added to another object, the shared count is incremented to 1 and the
GstMiniObject remains writable. The 0 share counter has a similar purpose as the floating
reference in GObject.
Weak references
~~~~~~~~~~~~~~~
GstMiniObject has support for weak references. For each registered weak
reference, a callback is invoked when the object is freed.
QData
~~~~~
Extra data can be associated with a GstMiniObject by using the QData API.
@@ -1,262 +0,0 @@
What to do when a plugin is missing
-----------------------------------
The mechanism and API described in this document requires GStreamer core and
gst-plugins-base versions >= 0.10.12. Further information on some aspects of
this document can be found in the libgstbaseutils API reference.
We only discuss playback pipelines for now.
A three step process:
1) GStreamer level
- elements will use a "missing-plugin" element message to report missing
plugins, with the following fields set:
- type: (string) { "urisource", "urisink", "decoder", "encoder", "element" }
(we do not distinguish between demuxer/decoders/parsers etc.)
- detail: (string) or (caps) depending on the type { ANY }
ex: "mms", "mmsh", "audio/x-mp3,rate=48000,..."
- name: (string) { ANY }
ex: "MMS protocol handler",..
- missing uri handler
ex. mms://foo.bar/file.asf
When no protocol handler is installed for mms://, the application will not be
able to instantiate an element for that uri (gst_element_make_from_uri()
returns NULL).
Playbin will post a "missing-plugin" element message with the type set to
"urisource", detail set to "mms". Optionally the friendly name can be filled
in as well.
- missing typefind function
We don't recognize the type of the file. This should normally not happen
because all the typefinders are in the basic GStreamer installation.
There is not much useful information we can give about how to resolve this
issue. It is possible to use the first N bytes of the data to determine the
type (and needed plugin) on the server. We don't explore this option in this
document yet, but the proposal is flexible enough to accommodate this in the
future should the need arise.
- missing demuxer
Typically after running typefind on the data we determine the type of the
file. If there is no plugin found for the type, a "missing-plugin" element
message is posted by decodebin with the following fields: Type set to
"decoder", detail set to the caps for which no plugin was found. Optionally
the friendly name can be filled in as well.
- missing decoder
The demuxer will dynamically create new pads with specific caps while it
figures out the contents of the container format. Decodebin tries to find the
decoders for these formats in the registry. If there is no decoder found, a
"missing-plugin" element message is posted by decodebin with the following
fields: Type set to "decoder", detail set to the caps for which no plugin
was found. Optionally the friendly name can be filled in as well. There is
no distinction made between the missing demuxer and decoder at the
application level.
- missing element
Decodebin and playbin will create a set of helper elements when they set up
their decoding pipeline. These elements are typically colorspace, sample rate,
audio sinks,... Their presence on the system is required for the functionality
of decodebin. It is typically a package dependency error if they are not
present but in case of a corrupted system the following "missing-plugin"
element message will be emitted: type set to "element", detail set to the
element factory name and the friendly name optionally set to a description
of the element's functionality in the decoding pipeline.
Except for reporting the missing plugins, no further policy is enforced at the
GStreamer level. It is up to the application to decide whether a missing
plugin constitutes a problem or not.
2) application level
The application's job is to listen for the "missing-plugin" element messages
and to decide on a policy to handle them. Following cases exist:
- partially missing plugins
The application will be able to complete a state change to PAUSED but there
will be a "missing-plugin" element message on the GstBus.
This means that it will be possible to play back part of the media file but not
all of it.
For example: suppose we have an .avi file with mp3 audio and divx video. If we
have the mp3 audio decoder but not the divx video decoder, it will be possible
to play only the audio part but not the video part. For an audio playback
application, this is not a problem but a video player might want to decide on:
- require the user to install the additionally required plugins.
- inform the user that only the audio will be played back
- ask the user if it should download the additional codec or only play the
audio part.
- ...
- completely unplayable stream
The application will receive an ERROR message from GStreamer informing it that
playback stopped (before it could reach PAUSED). This happens because none of
the streams is connected to a decoder. The error code and domain should be one
of the following in this case:
- GST_CORE_ERROR_MISSING_PLUGIN (domain: GST_CORE_ERROR)
- GST_STREAM_ERROR_CODEC_NOT_FOUND (domain: GST_STREAM_ERROR)
The application can then see that there are a set of "missing-plugin" element
messages on the GstBus and can decide to trigger the download procedure. It
does that as described in the following section.
"missing-plugin" element messages can be identified using the function
gst_is_missing_plugin_message().
3) Plugin download stage
At this point the application has
- collected one or more "missing-plugin" element messages
- made a decision that additional plugins should be installed
It will call a GStreamer utility function to convert each "missing-plugin"
message into an identifier string describing the missing capability. This is
done using the function gst_missing_plugin_message_get_installer_detail().
The application will then pass these strings to gst_install_plugins_async()
or gst_install_plugins_sync() to initiate the download. See the API
documentation there (libgstbaseutils, part of gst-plugins-base) for more
details.
When new plugins have been installed, the application will have to initiate
a re-scan of the GStreamer plugin registry using gst_update_registry().
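A condensed sketch of this stage, using the function names from libgstpbutils
as they exist in GStreamer 1.x (msg is assumed to be a collected
missing-plugin message):

  #include <gst/pbutils/pbutils.h>

  gchar *details[2] = { NULL, NULL };
  GstInstallPluginsReturn ret;

  details[0] = gst_missing_plugin_message_get_installer_detail (msg);

  ret = gst_install_plugins_sync ((const gchar * const *) details, NULL);
  g_free (details[0]);

  if (ret == GST_INSTALL_PLUGINS_SUCCESS)
    gst_update_registry ();   /* re-scan the plugin registry */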
4) Format of the (UTF-8) string ID passed to the external installer system
The string is made up of several fields, separated by '|' characters.
The fields are:
- plugin system identifier, ie. "gstreamer"
This identifier determines the format of the rest of the detail string.
Automatic plugin installers should not process detail strings with
unknown identifiers. This allows other plugin-based libraries to use
the same mechanism for their automatic plugin installation needs, or
for the format to be changed should it turn out to be insufficient.
- plugin system version, e.g. "0.10"
This is required so that when there is a GStreamer-0.12 or GStreamer-1.0
at some point in the future, the different major versions can still co-exist
and use the same plugin install mechanism in the same way.
- application identifier, e.g. "totem"
This may also be in the form of "pid/12345" if the program name can't
be obtained for some reason.
- human-readable localised description of the required component,
e.g. "Vorbis audio decoder"
- identifier string for the required component, e.g.
- urisource-$(PROTOCOL_REQUIRED)
e.g. urisource-http or urisource-mms
- element-$(ELEMENT_REQUIRED),
e.g. element-videoconvert
- decoder-$(CAPS_REQUIRED)
e.g. decoder-audio/x-vorbis or
decoder-application/ogg or
decoder-audio/mpeg, mpegversion=(int)4 or
decoder-video/mpeg, systemstream=(boolean)true, mpegversion=(int)2
- encoder-$(CAPS_REQUIRED)
e.g. encoder-audio/x-vorbis
- optional further fields not yet specified
An entire ID string might then look like this, for example:
gstreamer|0.10|totem|Vorbis audio decoder|decoder-audio/x-vorbis
Plugin installers parsing this ID string should expect further fields also
separated by '|' symbols and either ignore them, warn the user, or error
out when encountering them.
The human-readable description string is provided by the libgstbaseutils
library that can be found in gst-plugins-base versions >= 0.10.12 and can
also be used by demuxers to find out the codec names for taglists from given
caps in a unified and consistent way.
Applications can create these detail strings using the function
gst_missing_plugin_message_get_installer_detail() on a given missing-plugin
message.
5) Using missing-plugin messages for error reporting:
Missing-plugin messages are also useful for error reporting purposes, either
in the case where the application does not support libgimme-codec, or the
external installer is not available or not able to install the required
plugins.
When creating error messages, applications may use the function
gst_missing_plugin_message_get_description() to obtain a possibly translated
description from each missing-plugin message (e.g. "Matroska demuxer" or
"Theora video depayloader"). This can be used to report to the user exactly
what it is that is missing.
6) Notes for packagers
- An easy way to introspect plugin .so files is:
$ gst-inspect --print-plugin-auto-install-info /path/to/libgstfoo.so
The output will be something like:
decoder-audio/x-vorbis
element-vorbisdec
element-vorbisenc
element-vorbisparse
element-vorbistag
encoder-audio/x-vorbis
BUT could also be like this (from the faad element in this case):
decoder-audio/mpeg, mpegversion=(int){ 2, 4 }
NOTE that this does not exactly match the caps string that the installer
will get from the application. The application will only ever ask for
one of
decoder-audio/mpeg, mpegversion=(int)2
decoder-audio/mpeg, mpegversion=(int)4
- when introspecting, keep in mind that there are GStreamer plugins that
in turn load external plugins. Examples of these are pitfdll, ladspa, or
the GStreamer libvisual plugin. Those plugins will only announce elements
for the currently installed external plugins at the time of introspection!
With the exception of pitfdll, this is not really relevant to the playback
case, but may become an issue in future when applications like buzztard,
jokosher or pitivi start requesting elements by name, for example ladspa
effect elements or so.
This case could be handled if those wrapper plugins would also provide a
gst-install-xxx-plugins-helper, where xxx={ladspa|visual|...}. Thus if the
distro specific gst-install-plugins-helper can't resolve a request for e.g.
element-bml-sonicverb it can forward the request to
gst-install-bml-plugins-helper (bml is the buzz machine loader).
7) Further references:
http://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-base-libs/html/gstreamer-base-utils.html
@@ -1,353 +0,0 @@
Negotiation
-----------
Capabilities negotiation is the process of deciding on an adequate
format for dataflow within a GStreamer pipeline. Ideally, negotiation
(also known as "capsnego") transfers information from those parts of the
pipeline that have information to those parts of the pipeline that are
flexible, constrained by those parts of the pipeline that are not
flexible.
Basic rules
~~~~~~~~~~~
These simple rules must be followed:
1) downstream suggests formats
2) upstream decides on format
There are 4 queries/events used in caps negotiation:
1) GST_QUERY_CAPS : get possible formats
2) GST_QUERY_ACCEPT_CAPS : check if format is possible
3) GST_EVENT_CAPS : configure format (downstream)
4) GST_EVENT_RECONFIGURE : inform upstream of possibly new caps
Queries
-------
A pad can ask the peer pad for its supported GstCaps. It does this with
the CAPS query. The list of supported caps can be used to choose an
appropriate GstCaps for the data transfer.
The CAPS query works recursively, elements should take their peers into
consideration when constructing the possible caps. Because the result caps
can be very large, the filter can be used to restrict the caps. Only the
caps that match the filter will be returned as the result caps. The
order of the filter caps gives the order of preference of the caller and
should be taken into account for the returned caps.
(in) "filter", GST_TYPE_CAPS (default NULL)
- a GstCaps to filter the results against
(out) "caps", GST_TYPE_CAPS (default NULL)
- the result caps
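For example (sketch, 1.x API; srcpad is assumed to exist), a pad can ask its
peer for the formats that match a filter:

  GstCaps *filter = gst_caps_new_empty_simple ("video/x-raw");
  GstCaps *result = gst_pad_peer_query_caps (srcpad, filter);

  GST_DEBUG ("peer supports %" GST_PTR_FORMAT, result);

  gst_caps_unref (result);
  gst_caps_unref (filter);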
A pad can ask the peer pad if it supports a given caps. It does this with
the ACCEPT_CAPS query. The caps must be fixed.
The ACCEPT_CAPS query is not required to work recursively, it can simply
return TRUE if a subsequent CAPS event with those caps would return
success.
(in) "caps", GST_TYPE_CAPS
- a GstCaps to check, must be fixed
(out) "result", G_TYPE_BOOLEAN (default FALSE)
- TRUE if the caps are accepted
Events
~~~~~~
When a media format is negotiated, peer elements are notified of the GstCaps
with the CAPS event. The caps must be fixed.
"caps", GST_TYPE_CAPS
- the negotiated GstCaps, must be fixed
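Once a fixed format has been chosen, the source pad sends it downstream; a
sketch (1.x API; srcpad and the caps values are assumed for illustration):

  GstCaps *caps = gst_caps_new_simple ("audio/x-raw",
      "format", G_TYPE_STRING, "S16LE",
      "rate", G_TYPE_INT, 44100,
      "channels", G_TYPE_INT, 2, NULL);

  /* the caps must be fixed; the peer reconfigures itself on reception */
  gst_pad_push_event (srcpad, gst_event_new_caps (caps));
  gst_caps_unref (caps);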
Operation
~~~~~~~~~
GStreamer's two scheduling modes, push mode and pull mode, lend
themselves to different mechanisms to achieve this goal. As it is more
common, we describe push-mode negotiation first.
Push-mode negotiation
~~~~~~~~~~~~~~~~~~~~~
Push-mode negotiation happens when elements want to push buffers and
need to decide on the format. This is called downstream negotiation
because the upstream element decides the format for the downstream
element. This is the most common case.
Negotiation can also happen when a downstream element wants to receive
another data format from an upstream element. This is called upstream
negotiation.
The basics of negotiation are as follows:
- GstCaps (see part-caps.txt) are refcounted before they are pushed as
an event to describe the contents of the following buffer.
- An element should reconfigure itself to the new format received as a CAPS
event before processing the following buffers. If the data type in the
caps event is not acceptable, the element should refuse the event. The
element should also refuse the next buffers by returning an appropriate
GST_FLOW_NOT_NEGOTIATED return value from the chain function.
- Downstream elements can request a format change of the stream by sending a
RECONFIGURE event upstream. Upstream elements will renegotiate a new format
when they receive a RECONFIGURE event.
The general flow for a source pad starting the negotiation.
             src              sink
              |                 |
              |  querycaps?     |
              |---------------->|
              |     caps        |
 select caps  |< - - - - - - - -|
 from the     |                 |
 candidates   |                 |
              |                 |-.
              |  accepts?       | |
   type A     |---------------->| | optional
              |      yes        | |
              |< - - - - - - - -| |
              |                 |-'
              |  send_event()   |
  send CAPS   |---------------->| Receive type A, reconfigure to
   event A    |                 | process type A.
              |                 |
              |  push           |
 push buffer  |---------------->| Process buffer of type A
              |                 |
One possible implementation in pseudo code:
[element wants to create a buffer]
if not format
# see what we can do
ourcaps = gst_pad_query_caps (srcpad)
# see what the peer can do filtered against our caps
candidates = gst_pad_peer_query_caps (srcpad, ourcaps)
foreach candidate in candidates
# make sure the caps is fixed
fixedcaps = gst_pad_fixate_caps (srcpad, candidate)
# see if the peer accepts it
if gst_pad_peer_accept_caps (srcpad, fixedcaps)
# store the caps as the negotiated caps, this will
# call the setcaps function on the pad
gst_pad_push_event (srcpad, gst_event_new_caps (fixedcaps))
break
endif
done
endif
#negotiate allocator/bufferpool with the ALLOCATION query
buffer = gst_buffer_new_allocate (NULL, size, 0);
# fill buffer and push
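The same flow can be written out in C against the 1.x API; a sketch, with
error handling omitted (gst_pad_fixate_caps() and gst_pad_peer_accept_caps()
above correspond to gst_caps_fixate() and gst_pad_peer_query_accept_caps()):

  GstCaps *ourcaps = gst_pad_query_caps (srcpad, NULL);
  GstCaps *candidates = gst_pad_peer_query_caps (srcpad, ourcaps);
  guint i;

  for (i = 0; i < gst_caps_get_size (candidates); i++) {
    /* make sure the caps is fixed */
    GstCaps *fixedcaps = gst_caps_fixate (gst_caps_copy_nth (candidates, i));

    /* see if the peer accepts it and store it as the negotiated caps */
    if (gst_pad_peer_query_accept_caps (srcpad, fixedcaps)) {
      gst_pad_push_event (srcpad, gst_event_new_caps (fixedcaps));
      gst_caps_unref (fixedcaps);
      break;
    }
    gst_caps_unref (fixedcaps);
  }
  gst_caps_unref (ourcaps);
  gst_caps_unref (candidates);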
The general flow for a sink pad starting a renegotiation.
              src              sink
               |                 |
               |  accepts?       |
               |<----------------| type B
               |      yes        |
               |- - - - - - - - >|-.
               |                 | | suggest B caps next
               |                 |<'
               |                 |
               |  push_event()   |
      mark   .-|<----------------| send RECONFIGURE event
 renegotiate | |                 |
             '>|                 |
               |  querycaps()    |
  renegotiate  |---------------->|
               |  suggest B      |
               |< - - - - - - - -|
               |                 |
               |  send_event()   |
  send CAPS    |---------------->| Receive type B, reconfigure to
  event B      |                 | process type B.
               |                 |
               |  push           |
  push buffer  |---------------->| Process buffer of type B
               |                 |
Use case:
videotestsrc ! xvimagesink
1) Who decides what format to use?
- src pad always decides, by convention. sinkpad can suggest a format
by putting it high in the caps query result GstCaps.
- since the src decides, it can always choose something that it can do,
so this step can only fail if the sinkpad stated it could accept
something while later on it couldn't.
2) When does negotiation happen?
- before srcpad does a push, it figures out a type as stated in 1), then
it pushes a caps event with the type. The sink checks the media type and
configures itself for this type.
- the source then usually does an ALLOCATION query to negotiate a bufferpool
with the sink. It then allocates a buffer from the pool and pushes it to
the sink. Since the sink accepted the caps, it can create a pool for the
format.
- since the sink stated in 1) it could accept the type, it will be able to
handle it.
3) How can sink request another format?
- sink asks the source whether a new format is possible
- sink pushes a RECONFIGURE event upstream
- src receives the RECONFIGURE event and marks renegotiation
- On the next buffer push, the source renegotiates the caps and the
bufferpool. The sink will put the new preferred format high in the list
of caps it returns from its caps query.
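Step 2 above is a single call from the sink element; a sketch:

  /* ask upstream to renegotiate; the source will select a new format
   * before pushing the next buffer */
  gst_pad_push_event (sinkpad, gst_event_new_reconfigure ());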
videotestsrc ! queue ! xvimagesink
- queue proxies all accept and caps queries to the other peer pad.
- queue proxies the bufferpool
- queue proxies the RECONFIGURE event
- queue stores CAPS event in the queue. This means that the queue can contain
buffers with different types.
Pull-mode negotiation
~~~~~~~~~~~~~~~~~~~~~
Rationale
^^^^^^^^^
A pipeline in pull mode has different negotiation needs than one
activated in push mode. Push mode is optimized for two use cases:
* Playback of media files, in which the demuxers and the decoders are
the points from which format information should disseminate to the
rest of the pipeline; and
* Recording from live sources, in which users are accustomed to putting
a capsfilter directly after the source element; thus the caps
information flow proceeds from the user, through the potential caps
of the source, to the sinks of the pipeline.
In contrast, pull mode has other typical use cases:
* Playback from a lossy source, such as RTP, in which more knowledge
about the latency of the pipeline can increase quality; or
* Audio synthesis, in which audio APIs are tuned to produce only the
necessary number of samples, typically driven by a hardware interrupt
to fill a DMA buffer or a Jack[0] port buffer.
* Low-latency effects processing, whereby filters should be applied as
data is transferred from a ring buffer to a sink instead of
beforehand. For example, instead of using the internal alsasink
ringbuffer thread in the push-mode pipeline wavsrc ! volume ! alsasink,
the volume element can be placed inside the sound card writer thread via
wavsrc ! audioringbuffer ! volume ! alsasink.
[0] http://jackit.sf.net
The problem with pull mode is that the sink has to know the format in
order to know how many bytes to pull via gst_pad_pull_range(). This
means that before pulling, the sink must initiate negotiation to decide
on a format.
Recalling the principles of capsnego, whereby information must flow from
those that have it to those that do not, we see that the three named use
cases have different negotiation requirements:
* RTP and low-latency playback are both like the normal playback case,
in which information flows downstream.
* In audio synthesis, the part of the pipeline that has the most
information is the sink, constrained by the capabilities of the graph
that feeds it. However the caps are not completely specified; at some
point the user has to intervene to choose the sample rate, at least.
This can be done externally to gstreamer, as in the jack elements, or
internally via a capsfilter, as is customary with live sources.
Given that sinks potentially need the input of sources, as in the RTP
case and at least as a filter in the synthesis case, there must be a
negotiation phase before the pull thread is activated. Also, given the
low latency offered by pull mode, we want to avoid capsnego from within
the pulling thread, in case it causes us to miss our scheduling
deadlines.
The pull thread is usually started in the PAUSED->PLAYING state change. We must
be able to complete the negotiation before this state change happens.
The time to do capsnego, then, is after the SCHEDULING query has succeeded,
but before the sink has spawned the pulling thread.
Mechanism
^^^^^^^^^
The sink determines that the upstream elements support pull based scheduling by
doing a SCHEDULING query.
The sink initiates the negotiation process by intersecting the results
of gst_pad_query_caps() on its sink pad and its peer src pad. This is the
operation performed by gst_pad_get_allowed_caps(). In the simple
passthrough case, the peer pad's caps query should return the
intersection of calling get_allowed_caps() on all of its sink pads. In
this way the sink element knows the capabilities of the entire pipeline.
The sink element then fixates the resulting caps, if necessary,
resulting in the flow caps. From now on, the caps query of the sinkpad
will only return these fixed caps meaning that upstream elements
will only be able to produce this format.
If the sink element could not set caps on its sink pad, it should post
an error message on the bus indicating that negotiation was not
possible.
When negotiation has succeeded, the sinkpad and all upstream internally linked
pads are activated in pull mode. Typically, this operation will trigger
negotiation on the downstream elements, which are now forced to negotiate to
the final fixed caps desired by the sinkpad.
After these steps, the sink element returns ASYNC from the state change
function. The state will commit to PAUSED when the first buffer is received in
the sink. This is needed to provide a consistent API to the applications that
expect ASYNC return values from sinks but it also allows us to perform the
remainder of the negotiation outside of the context of the pulling thread.
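A sketch of how a sink might perform the SCHEDULING check described above
(using the 1.x query API):

  GstQuery *query = gst_query_new_scheduling ();
  gboolean pull_ok = FALSE;

  if (gst_pad_peer_query (sinkpad, query))
    pull_ok = gst_query_has_scheduling_mode (query, GST_PAD_MODE_PULL);
  gst_query_unref (query);

  if (pull_ok) {
    /* negotiation happened before this point; now spawn the pull thread */
    gst_pad_activate_mode (sinkpad, GST_PAD_MODE_PULL, TRUE);
  }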
Patterns
~~~~~~~~
We can identify 3 patterns in negotiation:
1) Fixed : Can't choose the output format
- Caps encoded in the stream
- A video/audio decoder
- usually uses gst_pad_use_fixed_caps()
2) Transform
- Caps not modified (passthrough)
- can do caps transform based on element property
- fixed caps get transformed into fixed caps
- videobox
3) Dynamic : can choose output format
- A converter element
- depends on downstream caps, needs to do a CAPS query to find
transform.
- usually prefers to use the identity transform
- fixed caps can be transformed into unfixed caps.
Overview
--------
This part gives an overview of the design of GStreamer with references to
the more detailed explanations of the different topics.
This document is intended for people that want to have a global overview of
the inner workings of GStreamer.
Introduction
~~~~~~~~~~~~
GStreamer is a set of libraries and plugins that can be used to implement various
multimedia applications such as desktop players, audio/video recorders,
multimedia servers and transcoders.
Applications are built by constructing a pipeline composed of elements. An element
is an object that performs some action on a multimedia stream such as:
- read a file
- decode or encode between formats
- capture from a hardware device
- render to a hardware device
- mix or multiplex multiple streams
Elements have input and output pads called sink and source pads in GStreamer. An
application links elements together on pads to construct a pipeline. Below is
an example of an ogg/vorbis playback pipeline.
+-----------------------------------------------------------+
|    ----------> downstream ------------------->            |
|                                                           |
| pipeline                                                  |
| +---------+   +----------+   +-----------+   +----------+ |
| | filesrc |   | oggdemux |   | vorbisdec |   | alsasink | |
| |       src-sink       src-sink        src-sink        | |
| +---------+   +----------+   +-----------+   +----------+ |
|                                                           |
|    <---------< upstream <-------------------<             |
+-----------------------------------------------------------+
The filesrc element reads data from a file on disk. The oggdemux element parses
the data and sends the compressed audio data to the vorbisdec element. The
vorbisdec element decodes the compressed data and sends it to the alsasink
element. The alsasink element sends the samples to the audio card for playback.
Downstream and upstream are the terms used to describe the direction of dataflow
in the pipeline. From source to sink is called "downstream" and "upstream" is
from sink to source. Dataflow always happens downstream.
The task of the application is to construct a pipeline as above using existing
elements. This is further explained in the pipeline building topic.
The application does not have to manage any of the complexities of the
actual dataflow/decoding/conversions/synchronisation etc. but only calls high
level functions on the pipeline object such as PLAY/PAUSE/STOP.
The application also receives messages and notifications from the pipeline such
as metadata, warning, error and EOS messages.
If the application needs more control over the graph it is possible to directly
access the elements and pads in the pipeline.
Design overview
~~~~~~~~~~~~~~~
GStreamer design goals include:
- Process large amounts of data quickly
- Allow fully multithreaded processing
- Ability to deal with multiple formats
- Synchronize different dataflows
- Ability to deal with multiple devices
The capabilities presented to the application depend on the number of elements
installed on the system and their functionality.
The GStreamer core is designed to be media agnostic but provides many features
to elements to describe media formats.
Elements
~~~~~~~~
The smallest building blocks in a pipeline are elements. An element provides a
number of pads which can be source or sinkpads. Sourcepads provide data and
sinkpads consume data. Below is an example of an ogg demuxer element that has
one pad that takes (sinks) data and two source pads that produce data.
 +-----------+
 | oggdemux  |
 |          src0
sink        src1
 +-----------+
An element can be in four different states: NULL, READY, PAUSED, PLAYING. In the
NULL and READY state, the element is not processing any data. In the PLAYING state
it is processing data. The intermediate PAUSED state is used to preroll data in
the pipeline. A state change can be performed with gst_element_set_state().
An element always goes through all the intermediate state changes. This means that
when an element is in the READY state and is put to PLAYING, it will first go
through the intermediate PAUSED state.
An element state change to PAUSED will activate the pads of the element. First the
source pads are activated, then the sinkpads. When the pads are activated, the
pad activate function is called. Some pads will start a thread (GstTask) or some
other mechanism to start producing or consuming data.
The PAUSED state is special as it is used to preroll data in the pipeline. The purpose
is to fill all connected elements in the pipeline with data so that the subsequent
PLAYING state change happens very quickly. Some elements will therefore not complete
the state change to PAUSED before they have received enough data. Sink elements are
required to only complete the state change to PAUSED after receiving the first data.
Normally the state changes of elements are coordinated by the pipeline as explained
in [part-states.txt].
Different categories of elements exist:
- source elements, these are elements that do not consume data but only provide data
for the pipeline.
- sink elements, these are elements that do not produce data but render data to
an output device.
- transform elements, these elements transform an input stream in a certain format
into a stream of another format. Encoder/decoder/converters are examples.
- demuxer elements, these elements parse a stream and produce several output streams.
- mixer/muxer elements, combine several input streams into one output stream.
Other categories of elements can be constructed (see part-klass.txt).
Bins
~~~~
A bin is an element subclass and acts as a container for other elements so that multiple
elements can be combined into one element.
A bin coordinates its children's state changes as explained later. It also distributes
events and various other functionality to elements.
A bin can have its own source and sinkpads by ghostpadding one or more of its children's
pads to itself.
Below is a picture of a bin with two elements. The sinkpad of one element is ghostpadded
to the bin.
  +---------------------------+
  | bin                       |
  |   +--------+   +-------+  |
  |   |        |   |       |  |
  |  /sink    src-sink     |  |
 sink +--------+   +-------+  |
  +---------------------------+
Pipeline
~~~~~~~~
A pipeline is a special bin subclass that provides the following features to its
children:
- Select and manage a global clock for all its children.
- Manage running_time based on the selected clock. Running_time is the elapsed
time the pipeline spent in the PLAYING state and is used for
synchronisation.
- Manage latency in the pipeline.
- Provide means for elements to communicate with the application via the GstBus.
- Manage the global state of the elements such as Errors and end-of-stream.
Normally the application creates one pipeline that will manage all the elements
in the application.
Dataflow and buffers
~~~~~~~~~~~~~~~~~~~~
GStreamer supports two possible types of dataflow, the push and pull model. In the
push model, an upstream element sends data to a downstream element by calling a
method on a sinkpad. In the pull model, a downstream element requests data from
an upstream element by calling a method on a source pad.
The most common dataflow is the push model. The pull model can be used in specific
circumstances by demuxer elements. The pull model can also be used by low latency
audio applications.
The data passed between pads is encapsulated in Buffers. The buffer contains
pointers to the actual memory and also metadata describing the memory. This metadata
includes:
- timestamp of the data, this is the time instant at which the data was captured
or the time at which the data should be played back.
- offset of the data: a media specific offset, this could be samples for audio or
frames for video.
- the duration of the data in time.
- additional flags describing special properties of the data such as
discontinuities or delta units.
- additional arbitrary metadata
When an element wishes to send a buffer to another element, it does this using one
of the pads that is linked to a pad of the other element. In the push model, a
buffer is pushed to the peer pad with gst_pad_push(). In the pull model, a buffer
is pulled from the peer with the gst_pad_pull_range() function.
Before an element pushes out a buffer, it should make sure that the peer element
can understand the buffer contents. It does this by querying the peer element
for the supported formats and by selecting a suitable common format. The selected
format is then first sent to the peer element with a CAPS event before pushing
the buffer (see part-negotiation.txt).
When an element pad receives a CAPS event, it has to check if it understands the
media type. The element must refuse the following buffers if it did not accept
the preceding media type.
Both gst_pad_push() and gst_pad_pull_range() have a return value indicating whether
the operation succeeded. An error code means that no more data should be sent
to that pad. A source element that initiates the data flow in a thread typically
pauses the producing thread when this happens.
A buffer can be created with gst_buffer_new() or by requesting a usable buffer
from a buffer pool using gst_buffer_pool_acquire_buffer(). Using the second
method, it is possible for the peer element to implement a custom buffer
allocation algorithm.
The process of selecting a media type is called caps negotiation.
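A sketch of the push model with a freshly allocated buffer (the size and
timestamps are only illustrative):

  GstBuffer *buffer = gst_buffer_new_allocate (NULL, 4096, NULL);

  gst_buffer_memset (buffer, 0, 0, 4096);           /* fill with data */
  GST_BUFFER_PTS (buffer) = 0;                      /* timestamp */
  GST_BUFFER_DURATION (buffer) = 10 * GST_MSECOND;  /* duration */

  if (gst_pad_push (srcpad, buffer) != GST_FLOW_OK) {
    /* error: stop sending data, e.g. pause the producing task */
  }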
Caps
~~~~
A media type (Caps) is described using a generic list of key/value pairs. The key is
a string and the value can be a single/list/range of int/float/string.
Caps that have no ranges/list or other variable parts are said to be fixed and
can be used to put on a buffer.
Caps with variables in them are used to describe possible media types that can be
handled by a pad.
Dataflow and events
~~~~~~~~~~~~~~~~~~~
Parallel to the dataflow is a flow of events. Unlike the buffers, events can pass
both upstream and downstream. Some events only travel upstream others only downstream.
The events are used to denote special conditions in the dataflow such as EOS or
to inform plugins of special events such as flushing or seeking.
Some events must be serialized with the buffer flow, others need not be. Serialized
events are inserted between the buffers. Non-serialized events jump ahead
of any buffers currently being processed.
An example of a serialized event is a TAG event that is inserted between buffers
to mark metadata for those buffers.
An example of a non serialized event is the FLUSH event.
Pipeline construction
~~~~~~~~~~~~~~~~~~~~~
The application starts by creating a Pipeline element using gst_pipeline_new ().
Elements are added to and removed from the pipeline with gst_bin_add() and
gst_bin_remove().
After adding the elements, the pads of an element can be retrieved with
gst_element_get_static_pad(). Pads can then be linked together with gst_pad_link().
Some elements create new pads when actual dataflow is happening in the pipeline.
With g_signal_connect() one can receive a notification when an element has created
a pad. These new pads can then be linked to other unlinked pads.
Some elements cannot be linked together because they operate on
incompatible data types. The possible datatypes a pad can provide or consume can
be queried with gst_pad_query_caps().
Below is a simple mp3 playback pipeline that we constructed. We will use this
pipeline in further examples.
+-------------------------------------------+
| pipeline                                  |
| +---------+   +----------+   +----------+ |
| | filesrc |   | mp3dec   |   | alsasink | |
| |       src-sink       src-sink         | |
| +---------+   +----------+   +----------+ |
+-------------------------------------------+
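A sketch of constructing this pipeline with the calls named above (it assumes
gst_init() has been called; "mp3dec" is the placeholder name used in this
document, a real decoder element would be used in practice):

  GstElement *pipeline, *src, *dec, *sink;

  pipeline = gst_pipeline_new ("player");
  src  = gst_element_factory_make ("filesrc", NULL);
  dec  = gst_element_factory_make ("mp3dec", NULL);   /* placeholder name */
  sink = gst_element_factory_make ("alsasink", NULL);

  g_object_set (src, "location", "file.mp3", NULL);

  gst_bin_add_many (GST_BIN (pipeline), src, dec, sink, NULL);

  /* links the pads; fails if the datatypes are incompatible */
  gst_element_link_many (src, dec, sink, NULL);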
Pipeline clock
~~~~~~~~~~~~~~
One of the important functions of the pipeline is to select a global clock
for all the elements in the pipeline.
The purpose of the clock is to provide a strictly increasing value at the rate
of one GST_SECOND per second. Clock values are expressed in nanoseconds.
Elements use the clock time to synchronize the playback of data.
Before the pipeline is set to PLAYING, the pipeline asks each element if they can
provide a clock. The clock is selected in the following order:
- If the application selected a clock, use that one.
- If a source element provides a clock, use that clock.
- Select a clock from any other element that provides a clock, start with the
sinks.
- If no element provides a clock, a default system clock is used for the pipeline.
In a typical playback pipeline this algorithm will select the clock provided by
a sink element such as an audio sink.
In capture pipelines, this will typically select the clock of the data producer, which
in most cases can not control the rate at which it produces data.
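The application override in the first step of the algorithm could look like
this sketch:

  /* force the system clock, bypassing the clock selection algorithm */
  GstClock *clock = gst_system_clock_obtain ();
  gst_pipeline_use_clock (GST_PIPELINE (pipeline), clock);
  gst_object_unref (clock);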
Pipeline states
~~~~~~~~~~~~~~~
When all the pads are linked and signals have been connected, the pipeline can
be put in the PAUSED state to start dataflow.
When a bin (and hence a pipeline) performs a state change, it will change the state
of all its children. The pipeline will change the state of its children from the
sink elements to the source elements, to make sure that no upstream element
produces data for an element that is not yet ready to accept it.
In the mp3 playback pipeline, the state of the elements is changed in the order
alsasink, mp3dec, filesrc.
All intermediate states are traversed for each element resulting in the following
chain of state changes:
  alsasink to READY:  the audio device is probed
  mp3dec to READY:    nothing happens.
  filesrc to READY:   the file is probed
  alsasink to PAUSED: the audio device is opened. alsasink is a sink and
                      returns ASYNC because it did not receive data yet.
  mp3dec to PAUSED:   the decoding library is initialized
  filesrc to PAUSED:  the file is opened and a thread is started to push
                      data to mp3dec
At this point data flows from filesrc to mp3dec and alsasink. Since mp3dec is PAUSED,
it accepts the data from filesrc on the sinkpad and starts decoding the compressed
data to raw audio samples.
The mp3 decoder figures out the samplerate, the number of channels and other audio
properties of the raw audio samples and sends out a caps event with the media type.
Alsasink then receives the caps event, inspects the caps and reconfigures
itself to process the media type.
mp3dec then puts the decoded samples into a Buffer and pushes this buffer to the next
element.
Alsasink receives the buffer with samples. Since it received the first buffer of
samples, it completes the state change to the PAUSED state. At this point the
pipeline is prerolled and all elements have samples. Alsasink is now also
capable of providing a clock to the pipeline.
Since alsasink is now in the PAUSED state, it blocks while receiving the first buffer.
This effectively blocks both mp3dec and filesrc in their gst_pad_push().
Since all elements now return SUCCESS from the gst_element_get_state() function,
the pipeline can be put in the PLAYING state.
Before going to PLAYING, the pipeline selects a clock and samples the current time of
the clock. This is the base_time. It then distributes this time to all elements.
Elements can then synchronize against the clock using the buffer running_time +
base_time (See also part-synchronisation.txt).
The following chain of state changes then takes place:
alsasink to PLAYING: the samples are played to the audio device
mp3dec to PLAYING: nothing happens
filesrc to PLAYING: nothing happens
Pipeline status
~~~~~~~~~~~~~~~
The pipeline informs the application of any special events that occur in the
pipeline with the bus. The bus is an object that the pipeline provides and that
can be retrieved with gst_pipeline_get_bus().
The bus can be polled or added to the glib mainloop.
The bus is distributed to all elements added to the pipeline. The elements use the bus
to post messages on. Various message types exist such as ERRORS, WARNINGS, EOS,
STATE_CHANGED, etc..
The pipeline handles EOS messages received from elements in a special way. It will
only forward the message to the application when all sink elements have posted an
EOS message.
Other methods for obtaining the pipeline status include the Query functionality that
can be performed with gst_element_query() on the pipeline. This type of query
is useful for obtaining information about the current position and total time of
the pipeline. It can also be used to query for the supported seeking formats and
ranges.
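A sketch of the simplest bus usage, blocking until an ERROR or EOS message
arrives:

  GstBus *bus = gst_pipeline_get_bus (GST_PIPELINE (pipeline));
  GstMessage *msg = gst_bus_timed_pop_filtered (bus, GST_CLOCK_TIME_NONE,
      GST_MESSAGE_ERROR | GST_MESSAGE_EOS);

  /* inspect GST_MESSAGE_TYPE (msg) and react to it */
  gst_message_unref (msg);
  gst_object_unref (bus);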
Pipeline EOS
~~~~~~~~~~~~
When the source filter encounters the end of the stream, it sends an EOS event to
the peer element. This event will then travel downstream to all of the connected
elements to inform them of the EOS. The element is not supposed to accept any more
data after receiving an EOS event on a sinkpad.
The element providing the streaming thread stops sending data after sending the
EOS event.
The EOS event will eventually arrive in the sink element. The sink will then post
an EOS message on the bus to inform the pipeline that a particular stream has
finished. When all sinks have reported EOS, the pipeline forwards the EOS message
to the application. The EOS message is only forwarded to the application in the
PLAYING state.
When in EOS, the pipeline remains in the PLAYING state; it is the application's
responsibility to PAUSE or READY the pipeline. The application can also issue
a seek, for example.
Pipeline READY
~~~~~~~~~~~~~~
When a running pipeline is set from the PLAYING to READY state, the following
actions occur in the pipeline:
  alsasink to PAUSED: alsasink blocks and completes the state change on the
                      next sample. If the element was EOS, it does not wait
                      for a sample to complete the state change.
  mp3dec to PAUSED:   nothing
  filesrc to PAUSED:  nothing
Going to the intermediate PAUSED state will block all elements in the _push()
functions. This happens because the sink element blocks on the first buffer
it receives.
Some elements might be performing blocking operations in the PLAYING state that
must be unblocked when they go into the PAUSED state. This makes sure that the
state change happens very fast.
In the next PAUSED to READY state change the pipeline has to shut down and all
streaming threads must stop sending data. This happens in the following sequence:
  alsasink to READY: alsasink unblocks from the _chain() function and returns
                     a FLUSHING return value to the peer element. The sinkpad
                     is deactivated and becomes unusable for sending more data.
  mp3dec to READY:   the pads are deactivated and the state change completes
                     when mp3dec leaves its _chain() function.
  filesrc to READY:  the pads are deactivated and the thread is paused.
The upstream elements finish their chain() function because the downstream element
returned an error code (FLUSHING) from the _push() functions. These error codes
are eventually returned to the element that started the streaming thread (filesrc),
which pauses the thread and completes the state change.
This sequence of events ensures that all elements are unblocked and all streaming
threads are stopped.
Pipeline seeking
~~~~~~~~~~~~~~~~
Seeking in the pipeline requires a very specific order of operations to make
sure that the elements remain synchronized and that the seek is performed with
a minimal amount of latency.
An application issues a seek event on the pipeline using gst_element_send_event()
on the pipeline element. The event can be a seek event in any of the formats
supported by the elements.
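For example, a flushing seek to 30 seconds in GST_FORMAT_TIME might be issued
like this (a sketch):

  GstEvent *seek = gst_event_new_seek (1.0, GST_FORMAT_TIME,
      GST_SEEK_FLAG_FLUSH,
      GST_SEEK_TYPE_SET, 30 * GST_SECOND,
      GST_SEEK_TYPE_NONE, GST_CLOCK_TIME_NONE);

  gst_element_send_event (pipeline, seek);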
The pipeline first pauses itself to speed up the seek operations.
The pipeline then issues the seek event to all sink elements. The sink then forwards
the seek event upstream until some element can perform the seek operation, which is
typically the source or demuxer element. All intermediate elements can transform the
requested seek offset to another format, this way a decoder element can transform a
seek to a frame number to a timestamp, for example.
When the seek event reaches an element that will perform the seek operation, that
element performs the following steps.
1) send a FLUSH_START event to all downstream and upstream peer elements.
2) make sure the streaming thread is not running. The streaming thread will
always stop because of step 1).
3) perform the seek operation
4) send a FLUSH_STOP event to all downstream and upstream peer elements.
5) send SEGMENT event to inform all elements of the new position and to complete
the seek.
In step 1) all downstream elements have to return from any blocking operations
and have to refuse any further buffers or events other than a FLUSH_STOP.
The first step ensures that the streaming thread eventually unblocks and that
step 2) can be performed. At this point, dataflow is completely stopped in the
pipeline.
In step 3) the element performs the seek to the requested position.
In step 4) all peer elements are allowed to accept data again and streaming
can continue from the new position. A FLUSH_STOP event is sent to all the peer
elements so that they accept new data again and restart their streaming threads.
Step 5) informs all elements of the new position in the stream. After that, the
event function returns to the application and the streaming threads start
to produce new data.
Since the pipeline is still PAUSED, this will preroll the next media sample in the
sinks. The application can wait for this preroll to complete by performing a
_get_state() on the pipeline.
The last step in the seek operation is then to adjust the stream running_time of
the pipeline to 0 and to set the pipeline back to PLAYING.
The sequence of events in our mp3 playback example.
                                   | a) seek on pipeline
                                   | b) PAUSE pipeline
+----------------------------------V--------+
| pipeline                         |        | c) seek on sink
| +---------+   +----------+   +---V------+ |
| | filesrc |   | mp3dec   |   | alsasink | |
| |       src-sink       src-sink         | |
| +---------+   +----------+   +----|-----+ |
+-----------------------------------|-------+
           <------------------------+
            d) seek travels upstream
           --------------------------> 1) FLUSH event
           |                           2) stop streaming
           |                           3) perform seek
           --------------------------> 4) FLUSH_STOP event
           --------------------------> 5) SEGMENT event
           | e) update running_time to 0
           | f) PLAY pipeline
Preroll
-------
A sink element can only complete the state change to PAUSED after a buffer
has been queued on the input pad or pads. This process is called prerolling
and is needed to fill the pipeline with buffers so that the transition to
PLAYING goes as fast as possible with no visual delay for the user.
Preroll is also crucial in maintaining correct audio and video synchronisation
and ensuring that no buffers are dropped in the sinks.
After receiving a buffer (or EOS) on a pad the chain/event function should
wait to render the buffers or in the EOS case, wait to post the EOS
message. While waiting, the sink will wait for the preroll cond to be signalled.
Several things can happen that require the preroll cond to be signalled. These
include state changes or flush events. The prerolling is implemented in
sinks (see part-element-sink.txt).
Committing the state
~~~~~~~~~~~~~~~~~~~~
When going to PAUSED and PLAYING a buffer should be queued in the pad. We also
make this requirement for going to PLAYING since a flush event in the PAUSED
state could unqueue the buffer again.
The state is committed in the following conditions:
- a buffer is received on a sinkpad
- a GAP event is received on a sinkpad.
- an EOS event is received on a sinkpad.
We require the state change to be committed on EOS as well since an EOS means
by definition that no buffer is going to arrive anymore.
After the state is committed, a blocking wait should be performed for the
next event. Some sinks might render the preroll buffer before starting this
blocking wait.
Unlocking the preroll
~~~~~~~~~~~~~~~~~~~~~
The following conditions unlock the preroll:
- a state change
- a flush event
When the preroll is unlocked by a flush event, a return value of
GST_FLOW_FLUSHING is to be returned to the peer pad.
When preroll is unlocked by a state change to PLAYING, playback and
rendering of the buffers shall start.
When preroll is unlocked by a state change to READY, the buffer is
to be discarded and a GST_FLOW_FLUSHING shall be returned to the
peer element.
Probes
------
Probes are callbacks that can be installed by the application and will notify
the application about the state of the dataflow.
Requirements
------------
Applications should be able to monitor and control the dataflow on pads. We
identify the following types:
- be notified when the pad is/becomes idle and make sure the pad stays idle.
This is essential to be able to implement dynamic relinking of elements
without breaking the dataflow.
- be notified when data, events or queries are pushed or sent on a pad. It
should also be possible to inspect and modify the data.
- be able to drop, pass and block on data based on the result of the callback.
- be able to drop, pass data on blocking pads based on methods performed by
the application thread.
Overview
--------
The function gst_pad_add_probe() is used to add a probe to a pad. It accepts a
probe type mask and a callback.
  gulong gst_pad_add_probe (GstPad *pad,
                            GstPadProbeType mask,
                            GstPadProbeCallback callback,
                            gpointer user_data,
                            GDestroyNotify destroy_data);
The function returns a gulong that uniquely identifies the probe and that can
be used to remove the probe with gst_pad_remove_probe():
void gst_pad_remove_probe (GstPad *pad, gulong id);
The mask parameter is a bitwise or of the following flags:
  typedef enum
  {
    GST_PAD_PROBE_TYPE_INVALID          = 0,
    /* flags to control blocking */
    GST_PAD_PROBE_TYPE_IDLE             = (1 << 0),
    GST_PAD_PROBE_TYPE_BLOCK            = (1 << 1),
    /* flags to select datatypes */
    GST_PAD_PROBE_TYPE_BUFFER           = (1 << 4),
    GST_PAD_PROBE_TYPE_BUFFER_LIST      = (1 << 5),
    GST_PAD_PROBE_TYPE_EVENT_DOWNSTREAM = (1 << 6),
    GST_PAD_PROBE_TYPE_EVENT_UPSTREAM   = (1 << 7),
    GST_PAD_PROBE_TYPE_EVENT_FLUSH      = (1 << 8),
    GST_PAD_PROBE_TYPE_QUERY_DOWNSTREAM = (1 << 9),
    GST_PAD_PROBE_TYPE_QUERY_UPSTREAM   = (1 << 10),
    /* flags to select scheduling mode */
    GST_PAD_PROBE_TYPE_PUSH             = (1 << 12),
    GST_PAD_PROBE_TYPE_PULL             = (1 << 13),
  } GstPadProbeType;
When adding a probe with the IDLE or BLOCK flag, the probe will become a
blocking probe (see below). Otherwise the probe will be a DATA probe.
The datatype and scheduling selector flags are used to select what kind of
datatypes and scheduling modes should be allowed in the callback.
The blocking flags must match the triggered probe exactly.
The probe callback is defined as:
  GstPadProbeReturn (*GstPadProbeCallback) (GstPad *pad, GstPadProbeInfo *info,
                                            gpointer user_data);
A probe info structure is passed as an argument and its type is guaranteed
to match the mask that was used to register the callback. The data item in the
info contains type specific data, which is usually the data item that is blocked
or NULL when no data item is present.
The probe can return any of the following return values:
  typedef enum
  {
    GST_PAD_PROBE_DROP,
    GST_PAD_PROBE_OK,
    GST_PAD_PROBE_REMOVE,
    GST_PAD_PROBE_PASS,
  } GstPadProbeReturn;
GST_PAD_PROBE_OK is the normal return value. DROP will drop the item that is
currently being probed. GST_PAD_PROBE_REMOVE removes the currently executing
probe from the list of probes.
GST_PAD_PROBE_PASS is relevant for blocking probes and will temporarily unblock
the pad and let the item through; it will then block again on the next item.
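A sketch of a simple DATA probe that counts the buffers flowing over a pad
(counter is assumed to be a guint that outlives the probe):

  static GstPadProbeReturn
  count_buffers (GstPad *pad, GstPadProbeInfo *info, gpointer user_data)
  {
    guint *count = user_data;

    (*count)++;

    return GST_PAD_PROBE_OK;    /* let the buffer pass unmodified */
  }

  /* ... */
  gst_pad_add_probe (pad, GST_PAD_PROBE_TYPE_BUFFER,
      count_buffers, &counter, NULL);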
Blocking probes
---------------
Blocking probes are probes with BLOCK or IDLE flags set. They will always
block the dataflow and trigger the callback according to the following rules:
When the IDLE flag is set, the probe callback is called as soon as no data is
flowing over the pad. If at the time of probe registration, the pad is idle,
the callback will be called immediately from the current thread. Otherwise,
the callback will be called as soon as the pad becomes idle in the streaming
thread.
The IDLE probe is useful to perform dynamic linking: it allows waiting for a
safe moment when an unlink/link operation can be done. Since the probe is a
blocking probe, it will also make sure that the pad stays idle until the probe
is removed.
When the BLOCK flag is set, the probe callback will be called when new data
arrives on the pad and right before the pad goes into the blocking state. This
callback is thus only called when there is new data on the pad.
The blocking probe is removed with gst_pad_remove_probe() or when the probe
callback returns GST_PAD_PROBE_REMOVE. In both cases, and if this was the last
blocking probe on the pad, the pad is unblocked and dataflow can continue.
Non-Blocking probes
--------------------
Non-blocking probes or DATA probes are probes triggered when data is flowing
over the pad. They are called after the blocking probes are run, and always with
a data item.
Push dataflow
-------------
Push probes have the GST_PAD_PROBE_TYPE_PUSH flag set in the callbacks.
In push based scheduling, the blocking probe is called first with the data item.
Then the data probes are called before the peer pad chain or event function is
called.
The data probes are called before the peer pad is checked. This allows for
linking the pad in either the BLOCK or DATA probes on the pad.
Before the peerpad chain or event function is called, the peer pad block and
data probes are called.
Finally, the IDLE probe is called on the pad after the data was sent to the
peer pad.
The push dataflow probe behavior is the same for buffers and bidirectional events.
                     pad                           peerpad
                      |                                 |
     gst_pad_push() / |                                 |
 gst_pad_push_event() |                                 |
  ------------------->O                                 |
                      O                                 |
           flushing?  O                                 |
           FLUSHING   O                                 |
  < - - - - - - - - - O                                 |
                      O-> do BLOCK probes               |
                      O                                 |
                      O-> do DATA probes                |
            no peer?  O                                 |
           NOT_LINKED O                                 |
  < - - - - - - - - - O                                 |
                      O  gst_pad_chain() /              |
                      O  gst_pad_send_event()           |
                      O------------------------------->O
                      O                     flushing? O
                      O                      FLUSHING O
                      O< - - - - - - - - - - - - - - -O
                      O                               O-> do BLOCK probes
                      O                               O
                      O                               O-> do DATA probes
                      O                               O
                      O                               O---> chainfunc /
                      O                               O      eventfunc
                      O< - - - - - - - - - - - - - - -O
                      O                               |
                      O-> do IDLE probes              |
                      O                               |
  < - - - - - - - - - O                               |
                      |                               |
Pull dataflow
-------------
Pull probes have the GST_PAD_PROBE_TYPE_PULL flag set in the callbacks.
The gst_pad_pull_range() call will first trigger the BLOCK probes without a DATA
item. This allows the pad to be linked before the peer pad is resolved. It also
allows the callback to set a data item in the probe info.
After the blocking probes have run, the getrange function is called on the peer
pad; if there is a data item, the DATA probes are called.
When control returns to the sinkpad, the IDLE callbacks are called. The IDLE
callback is called without a data item so that it will also be called when there
was an error.
If there is a valid DATA item, the DATA probes are called for the item.
                   srcpad                        sinkpad
                      |                               |
                      |                               |  gst_pad_pull_range()
                      |                               O<---------------------
                      |                               O
                      |                               O  flushing?
                      |                               O  FLUSHING
                      |                               O - - - - - - - - - - >
                      |           do BLOCK probes <-O
                      |                               O  no peer?
                      |                               O  NOT_LINKED
                      |                               O - - - - - - - - - - >
                      |      gst_pad_get_range()      O
                      O<------------------------------O
                      O                               O
                      O  flushing?                    O
                      O  FLUSHING                     O
                      O- - - - - - - - - - - - - - - >O
    do BLOCK probes <-O                               O
                      O                               O
     getrangefunc <---O                               O
                      O  flow error?                  O
                      O- - - - - - - - - - - - - - - >O
                      O                               O
     do DATA probes <-O                               O
                      O- - - - - - - - - - - - - - - >O
                      |                               O
                      |            do IDLE probes <-O
                      |                               O  flow error?
                      |                               O - - - - - - - - - - >
                      |                               O
                      |            do DATA probes <-O
                      |                               O - - - - - - - - - - >
                      |                               |
Queries
-------
Query probes have the GST_PAD_PROBE_TYPE_QUERY_* flag set in the callbacks.
                     pad                           peerpad
                      |                               |
 gst_pad_peer_query() |                               |
  ------------------->O                               |
                      O                               |
                      O-> do BLOCK probes             |
                      O                               |
                      O-> do QUERY | PUSH probes      |
            no peer?  O                               |
               FALSE  O                               |
  < - - - - - - - - - O                               |
                      O  gst_pad_query()              |
                      O------------------------------>O
                      O                               O-> do BLOCK probes
                      O                               O
                      O                               O-> do QUERY | PUSH probes
                      O                               O
                      O                               O---> queryfunc
                      O                        error  O
  <- - - - - - - - - - - - - - - - - - - - - - - - - -O
                      O                               O
                      O                               O-> do QUERY | PULL probes
                      O< - - - - - - - - - - - - - - -O
                      O                               |
                      O-> do QUERY | PULL probes      |
                      O                               |
  < - - - - - - - - - O                               |
                      |                               |
For queries, the PUSH ProbeType is set when the query is traveling to the object
that will answer the query and the PULL type is set when the query contains the
answer.
Use-cases
---------
Prerolling a partial pipeline
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  .---------.     .---------.               .----------.
  | filesrc |     | demuxer |    .-----.    | decoder1 |
  |        src -> sink     src1 ->|queue|-> sink      src
  '---------'     |         |    '-----'    '----------'  X
                  |         |               .----------.
                  |         |    .-----.    | decoder2 |
                  |        src2 ->|queue|-> sink      src
                  '---------'    '-----'    '----------'  X
The purpose is to create the pipeline dynamically up to the
decoders but not yet connect them to a sink and without losing
any data.
To do this, the source pads of the decoders are blocked so that no
events or buffers can escape and we don't interrupt the stream.
When all of the dynamic pads are created (no-more-pads emitted by the
branching point, ie, the demuxer or the queues filled) and the pads
are blocked (blocked callback received) the pipeline is completely
prerolled.
It should then be possible to perform the following actions on the
prerolled pipeline:
- query duration/position
- perform a flushing seek to preroll a new position
- connect other elements and unblock the blocked pads.
dynamically switching an element in a PLAYING pipeline
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 .----------.      .----------.      .----------.
 | element1 |      | element2 |      | element3 |
...       src -> sink       src -> sink           ...
 '----------'      '----------'      '----------'
                   .----------.
                   | element4 |
                  sink       src
                   '----------'
The purpose is to replace element2 with element4 in the PLAYING
pipeline.
1) block element1 src pad.
2) inside the block callback nothing is flowing between
element1 and element2 and nothing will flow until unblocked.
3) unlink element1 and element2
4) optional step: make sure data is flushed out of element2:
4a) pad event probe on element2 src
4b) send EOS to element2, this makes sure that element2 flushes
out the last bits of data it holds.
4c) wait for EOS to appear in the probe, drop the EOS.
4d) remove the EOS pad event probe.
5) unlink element2 and element3
5a) optionally element2 can now be set to NULL and/or removed from the
pipeline.
6) link element4 and element3
7) link element1 and element4
8) make sure element4 is in the same state as the rest of the elements. The
element should at least be PAUSED.
9) unblock element1 src
The same flow can be used to replace an element in a PAUSED pipeline. Of
course in a PAUSED pipeline there might not be dataflow so the block might
not immediately happen.
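A sketch of steps 1, 3, 5, 6, 7, 8 and 9 with a blocking probe
(GST_PAD_PROBE_TYPE_BLOCK_DOWNSTREAM is the convenience combination of the
BLOCK flag with the downstream data flags; element4 is assumed to have been
added to the pipeline already, and the optional EOS draining of step 4 is
omitted):

  typedef struct
  {
    GstElement *element1, *element2, *element3, *element4;
  } SwitchData;

  static GstPadProbeReturn
  switch_cb (GstPad *pad, GstPadProbeInfo *info, gpointer user_data)
  {
    SwitchData *d = user_data;

    /* nothing flows over the pad here, so relinking is safe */
    gst_element_unlink (d->element1, d->element2);
    gst_element_unlink (d->element2, d->element3);
    gst_element_link (d->element1, d->element4);
    gst_element_link (d->element4, d->element3);

    /* bring the new element to the state of the pipeline */
    gst_element_sync_state_with_parent (d->element4);

    /* removing the probe unblocks the pad */
    return GST_PAD_PROBE_REMOVE;
  }

  /* block element1's source pad */
  gst_pad_add_probe (srcpad, GST_PAD_PROBE_TYPE_BLOCK_DOWNSTREAM,
      switch_cb, &data, NULL);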
Progress Reporting
------------------
This document describes the design and use cases for the progress reporting
messages.
PROGRESS messages are posted on the bus to inform the application about the
progress of asynchronous operations in the pipeline. This should not be confused
with asynchronous state changes.
We accommodate the following requirements:
- Application is informed when an async operation starts and completes.
- It should be possible for the application to generically detect common
operations and incorporate their progress into the GUI.
- Applications can cancel pending operations by doing regular state changes.
- Applications should be able to wait for completion of async operations.
We allow for the following scenarios:
- Elements want to inform the application about asynchronous DNS lookups and
pending network requests. This includes starting and completing the lookup.
- Elements opening devices and resources asynchronously.
- Applications having more freedom to implement timeout and cancellation of
operations that currently block the state changes or happen invisibly behind
the scenes.
Rationale
~~~~~~~~~
The main reason for adding these extra progress notifications is twofold:
1) to give the application more information about what is going on
When there are well defined progress information codes, applications
can let the user know about the status of the operation. We anticipate that
at least DNS resolution, server connections and requests will be well
defined.
2) To make the state changes non-blocking and cancellable.
Currently state changes such as going to the READY or PAUSED state often do
blocking calls such as resolving DNS or connecting to a remote server. These
operations often block the main thread and are often not cancellable, causing
application lockups.
We would like to make the state change function, instead, start a separate
thread that performs the blocking operations in a cancellable way. When going
back to the NULL state, all pending operations would be canceled immediately.
For downward state changes, we want to let the application implement its own
timeout mechanism. For example: when stopping an RTSP stream, the client
needs to send a TEARDOWN request to the server. This can however take an
unlimited amount of time in case of network problems. We want to give the
application an opportunity to wait (and timeout) for the completion of the
async operation before setting the element to the final NULL state.
Progress updates are very similar to buffering messages in the same way that the
application can decide to wait for the completion of the buffering process
before performing the next state change. It might make sense to implement
buffering with the progress messages in the future.
Async state changes
~~~~~~~~~~~~~~~~~~~
GStreamer currently has a GST_STATE_CHANGE_ASYNC return value to notify the
application that a state change is happening asynchronously.
The main purpose of this return value is to make the pipeline wait for preroll
and delay a future (upwards) state changes until the sinks are prerolled.
In the case of async operations on source, this will automatically force sinks
to stay async because they will not preroll before the source can produce data.
The fact that other asynchronous operations happen behind the scenes is
irrelevant for the prerolling process so it is not implemented with the ASYNC
state change return value in order to not complicate the state changes and mix
concepts.
Use cases
~~~~~~~~~
* RTSP client (but also HTTP, MMS, ...)
When the client goes from the READY to the PAUSED state, it opens a socket,
performs a DNS lookup, retrieves the SDP and negotiates the streams. All these
operations currently block the state change function for an indefinite amount
of time and while they are blocking cannot be canceled.
Instead, a thread would be started to perform these operations asynchronously
and the state change would complete with the usual NO_PREROLL return value.
Before starting the thread a PROGRESS message would be posted to mark the
start of the async operation.
As the DNS lookup completes and the connection is established, PROGRESS
messages are posted on the bus to inform the application of the progress. When
something fails, an error is posted and a PROGRESS CANCELED message is posted.
The application can then stop the pipeline.
If there are no errors and the setup of the streams completed successfully, a
PROGRESS COMPLETED is posted on the bus. The thread then goes to sleep and the
asynchronous operation completed.
The RTSP protocol requires sending a TEARDOWN request to the server
before closing the connection and destroying the socket. A state change to the
READY state will issue the TEARDOWN request in the background and notify the
application of this pending request with a PROGRESS message.
The application might want to go to the NULL state only after it got confirmation
that the TEARDOWN request completed, or it might choose to go to NULL after a
timeout. It is also possible that the application just wants to close the
socket as fast as possible without waiting for completion of the TEARDOWN request.
* Network performance measuring
DNS lookup and connection times can be measured by calculating the elapsed
time between the various PROGRESS messages.
Messages
~~~~~~~~
A new PROGRESS message will be created.
The following fields will be contained in the message:
- "type", GST_TYPE_PROGRESS_TYPE
- a set of types to define the type of progress
GST_PROGRESS_TYPE_START: A new task is started in the background
GST_PROGRESS_TYPE_CONTINUE: A previous task completed and a new
one continues. This is done so that the application can follow
a set of continuous tasks and react to COMPLETE only when the
element has completely finished.
GST_PROGRESS_TYPE_CANCELED: A task is canceled by the user.
GST_PROGRESS_TYPE_ERROR: A task stopped because of an error. In case of
an error, an error message will have been posted before.
GST_PROGRESS_TYPE_COMPLETE: A task completed successfully.
- "code", G_TYPE_STRING
A generic extensible string that can be used to programmatically determine the
action that is in progress. Some standard predefined codes will be
defined.
- "text", G_TYPE_STRING
A user visible string detailing the action.
- "percent", G_TYPE_INT between 0 and 100
Progress of the action as a percentage, the following values are allowed:
- GST_PROGRESS_TYPE_START always has a 0% value.
- GST_PROGRESS_TYPE_CONTINUE have a value between 0 and 100
- GST_PROGRESS_TYPE_CANCELED, GST_PROGRESS_TYPE_ERROR and
GST_PROGRESS_TYPE_COMPLETE always have a 100% value.
- "timeout", G_TYPE_INT in milliseconds
The timeout of the async operation, or -1 if unknown/unlimited.
This field can be interesting to the application when it wants to display
some sort of progress indication.
- ....
Depending on the code, more fields can be put here.
Implementation
~~~~~~~~~~~~~~
Elements should not do blocking operations from the state change function.
Instead, elements should post an appropriate progress message with the right
code and of type GST_PROGRESS_TYPE_START and then start a thread to perform
the blocking calls in a cancellable manner.
It is highly recommended to only start async operations from the READY to PAUSED
state and onwards and not from the NULL to READY state. The reason for this is
that streaming threads are usually started in the READY to PAUSED state and that
the current NULL to READY state change is used to perform a blocking check for
the presence of devices.
The progress message needs to be posted from the state change function so that
the application can immediately take appropriate action after setting the state.
The threads will usually perform many blocking calls with different codes
in a row, a client might first do a DNS query and then continue with
establishing a connection to the server. For this purpose the
GST_PROGRESS_TYPE_CONTINUE must be used.
Usually, the thread used to perform the blocking operations can be used to
implement the streaming threads when needed.
Upon downward state changes, operations that are busy in the thread are canceled
and GST_PROGRESS_TYPE_CANCELED is posted.
The application can know about pending tasks because it received
GST_PROGRESS_TYPE_START messages that didn't complete with a
GST_PROGRESS_TYPE_COMPLETE message, got canceled with a
GST_PROGRESS_TYPE_CANCELED or errored with GST_PROGRESS_TYPE_ERROR.
Applications should be able to choose if they wait for the pending
operation or cancel them.
If an async operation fails, an error message is posted first before the
GST_PROGRESS_TYPE_ERROR progress message.
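A sketch of what an element might post around an asynchronous connect, using
one of the codes proposed below:

  /* from the state change function, before spawning the thread */
  gst_element_post_message (GST_ELEMENT (self),
      gst_message_new_progress (GST_OBJECT (self),
          GST_PROGRESS_TYPE_START, "connect", "Connecting to server"));

  /* from the background thread, when the operation finished */
  gst_element_post_message (GST_ELEMENT (self),
      gst_message_new_progress (GST_OBJECT (self),
          GST_PROGRESS_TYPE_COMPLETE, "connect", "Connected"));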
Categories
~~~~~~~~~~
We want to propose some standard codes here:
"open" : A resource is being opened
"close" : A resource is being closed
"name-lookup" : A DNS lookup.
"connect" : A socket connection is established
"disconnect" : a socket connection is closed
"request" : A request is sent to a server and we are waiting for a
reply. This message is posted right before the request is sent
and completed when the reply has arrived completely.
"mount" : A volume is being mounted
"unmount" : A volume is being unmounted
More codes can be posted by elements and can be made official later.
push-pull
---------
Normally a source element will push data to the downstream element using
the gst_pad_push() method. The downstream peer pad will receive the
buffer in the Chain function. In the push mode, the source element is the
driving force in the pipeline as it initiates data transport.
It is also possible for an element to pull data from an upstream element.
The downstream element does this by calling gst_pad_pull_range() on one
of its sinkpads. In this mode, the downstream element is the driving force
in the pipeline as it initiates data transfer.
It is important that the elements are in the correct state to handle a
push() or a pull_range() from the peer element. For push() based elements
this means that all downstream elements should be in the correct state and
for pull_range() based elements this means the upstream elements should
be in the correct state.
Most sinkpads implement a chain function; this is the most common case.
Sinkpads implementing a loop function will be the exception. Likewise,
srcpads implementing a getrange function will be the exception.
state changes
~~~~~~~~~~~~~
The GstBin sets the state of all the sink elements. These are the elements
without source pads.
Setting the state on an element will first activate all the srcpads and then
the sinkpads. For each of the sinkpads, gst_pad_check_pull_range() is
performed. If the sinkpad supports a loopfunction and the peer pad returns TRUE
from the GstPadCheckPullRange function, then the peer pad is activated first as
it must be in the right state to handle a _pull_range(). Note that the
state change of the element is not yet performed, just the activate function
is called on the source pad. This means that elements that implement a
getrange function must be prepared to get their activate function called
before their state change function.
Elements that have multiple sinkpads that require all of them to operate
in the same mode (push/pull) can use the _check_pull_range() on all
their pads and can then remove the loop functions if one of the pads does
not support pull based mode.
Quality-of-Service
------------------
Quality of service is about measuring and adjusting the real-time
performance of a pipeline.
The real-time performance is always measured relative to the pipeline
clock and typically happens in the sinks when they synchronize buffers
against the clock.
The measurements result in QOS events that aim to adjust the datarate
in one or more upstream elements. Two types of adjustments can be
made:
- short time "emergency" corrections based on latest observation
in the sinks.
- long term rate corrections based on trends observed in the sinks.
It is also possible for the application to artificially introduce delay
between synchronized buffers, this is called throttling. It can be used
to reduce the framerate, for example.
Sources of quality problems
~~~~~~~~~~~~~~~~~~~~~~~~~~~
- High CPU load
- Network problems
- Other resource problems such as disk load, memory bottlenecks etc.
- application level throttling
QoS event
~~~~~~~~~
The QoS event is generated by an element that synchronizes against the clock. It
travels upstream and contains the following fields:
- type, GST_TYPE_QOS_TYPE:
The type of the QoS event, we have the following types and the default type
is GST_QOS_TYPE_UNDERFLOW:
  GST_QOS_TYPE_OVERFLOW:  an element is receiving buffers too fast and can't
                          keep up processing them. Upstream should reduce the
                          rate.
  GST_QOS_TYPE_UNDERFLOW: an element is receiving buffers too slowly and has
                          to drop them because they are too late. Upstream
                          should increase the processing rate.
  GST_QOS_TYPE_THROTTLE:  the application is asking to add extra delay between
                          buffers, upstream is allowed to drop buffers.
- timestamp, G_TYPE_UINT64:
The timestamp on the buffer that generated the QoS event. These timestamps
are expressed in total running_time in the sink so that the value is ever
increasing.
- jitter, G_TYPE_INT64:
The difference of that timestamp against the current clock time. Negative
values mean the timestamp was on time. Positive values indicate the
timestamp was late by that amount. When buffers are received in time and
throttling is not enabled, the QoS type field is set to OVERFLOW.
When throttling, the jitter contains the throttling delay added by the
application and the type is set to THROTTLE.
- proportion, G_TYPE_DOUBLE:
Long term prediction of the ideal rate relative to normal rate to get
optimal quality.
The rest of this document deals with how these values can be calculated
in a sink and how the values can be used by other elements to adjust their
operations.
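A sketch of a sink constructing and sending the event after synchronising a
buffer (proportion, jitter and timestamp computed as described above):

  /* negative jitter means the buffer was in time, positive means late */
  GstQOSType type = (jitter < 0) ? GST_QOS_TYPE_OVERFLOW
                                 : GST_QOS_TYPE_UNDERFLOW;
  GstEvent *qos = gst_event_new_qos (type, proportion, jitter, timestamp);

  /* QoS events travel upstream, so send the event out the sink pad */
  gst_pad_push_event (sinkpad, qos);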
QoS message
~~~~~~~~~~~
A QOS message is posted on the bus whenever an element decides to:
- drop a buffer because of QoS reasons
- change its processing strategy because of QoS reasons (quality)
It should be expected that creating and posting the QoS message is reasonably
fast and does not significantly contribute to the QoS problems. Options to
disable this feature could also be presented on elements.
This message can be posted by a sink/src that performs synchronisation against the
clock (live) or it could be posted by an upstream element that performs QoS
because of QOS events received from a downstream element (!live).
The GST_MESSAGE_QOS contains at least the following info:
- live: G_TYPE_BOOLEAN:
If the QoS message was dropped by a live element such as a sink or a live
source. If the live property is FALSE, the QoS message was generated as a
response to a QoS event in a non-live element.
- running-time, G_TYPE_UINT64:
The running_time of the buffer that generated the QoS message.
- stream-time, G_TYPE_UINT64:
The stream_time of the buffer that generated the QoS message.
- timestamp, G_TYPE_UINT64:
The timestamp of the buffer that generated the QoS message.
- duration, G_TYPE_UINT64:
The duration of the buffer that generated the QoS message.
- jitter, G_TYPE_INT64:
The difference of the running-time against the deadline. Negative
values mean the timestamp was on time. Positive values indicate the
timestamp was late (and dropped) by that amount. The deadline can be
a realtime running_time or an estimated running_time.
- proportion, G_TYPE_DOUBLE:
Long term prediction of the ideal rate relative to normal rate to get
optimal quality.
- quality, G_TYPE_INT:
An element dependent integer value that specifies the current quality
level of the element. The default maximum quality is 1000000.
- format, GST_TYPE_FORMAT
Units of the 'processed' and 'dropped' fields. Video sinks and video
filters will use GST_FORMAT_BUFFERS (frames). Audio sinks and audio filters
will likely use GST_FORMAT_DEFAULT (samples).
- processed: G_TYPE_UINT64:
Total number of units correctly processed since the last state change to
READY or a flushing operation.
- dropped: G_TYPE_UINT64:
Total number of units dropped since the last state change to READY or a
flushing operation.
The 'running-time' and 'processed' fields can be used to estimate the average
processing rate (framerate for video).
Elements might add additional fields in the message which are documented in the
relevant elements or baseclasses.
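As a sketch, assuming the values have been collected as described above, an
element could post such a message like this:

  GstMessage *msg;

  msg = gst_message_new_qos (GST_OBJECT (sink), live, running_time,
      stream_time, timestamp, duration);
  gst_message_set_qos_values (msg, jitter, proportion, quality);
  gst_message_set_qos_stats (msg, GST_FORMAT_BUFFERS, processed, dropped);
  gst_element_post_message (GST_ELEMENT (sink), msg);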
Collecting statistics
~~~~~~~~~~~~~~~~~~~~~
A buffer with timestamp B1 arrives in the sink at time T1. The buffer
timestamp is then synchronized against the clock which yields a jitter J1
return value from the clock. The jitter J1 is simply calculated as
J1 = CT - B1
Where CT is the clock time when the entry arrives in the sink. This value
is calculated inside the clock when we perform gst_clock_id_wait().
If the jitter is negative, the entry arrived in time and can be rendered
after waiting for the clock to reach time B1 (which is also CT - J1).
If the jitter is positive however, the entry arrived too late in the sink
and should therefore be dropped. J1 is the amount of time the entry was late.
Any buffer that arrives in the sink should generate a QoS event upstream.
Using the jitter we can calculate the time when the buffer arrived in the
sink:
T1 = B1 + J1. (1)
The time the buffer leaves the sink after synchronisation is measured as:
T2 = B1 + (J1 < 0 ? 0 : J1) (2)
For buffers that arrive in time (J1 < 0) the buffer leaves after synchronisation
which is exactly B1. Late buffers (J1 >= 0) leave the sink when they arrive,
without any synchronisation, which is T2 = T1 = B1 + J1.
Using a previous T0 and a new T1, we can calculate the time it took for
upstream to generate a buffer with timestamp B1.
PT1 = T1 - T0 (3)
We call PT1 the processing time needed to generate buffer with timestamp B1.
Moreover, given the duration of the buffer D1, the current data rate (DR1) of
the upstream element is given as:
DR1 = PT1 / D1 = (T1 - T0) / D1        (4)
For values 0.0 < DR1 <= 1.0 the upstream element is producing faster than
real-time. If DR1 is exactly 1.0, the element is running at a perfect speed.
Values DR1 > 1.0 mean that the upstream element cannot produce buffers of
duration D1 in real-time. It is exactly DR1 that tells the amount of speedup
we require from upstream to regain real-time performance.
An element that is not receiving enough data is said to be underflowed.
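In code, the bookkeeping sketched above could look like this (all variables
are assumptions standing in for the values defined in (1)-(4)):

  /* j1 is the jitter returned by gst_clock_id_wait() for buffer B1 */
  t1 = b1 + j1;                        /* (1) arrival time in the sink */
  pt1 = t1 - t0;                       /* (3) upstream processing time */
  dr1 = (gdouble) pt1 / (gdouble) d1;  /* (4) datarate, the proportion */
  t0 = t1;                             /* remember for the next buffer */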
Element measurements
~~~~~~~~~~~~~~~~~~~~
In addition to the measurements of the datarate of the upstream element, a
typical element must also measure its own performance. Global pipeline
performance problems can indeed also be caused by the element itself when it
receives too much data it cannot process in time. The element is then said to
be overflowed.
Short term correction
---------------------
The timestamp and jitter serve as short term correction information
for upstream elements. Indeed, given arrival time T1 as given in (1)
we can be certain that buffers with a timestamp B2 < T1 will be too late
in the sink.
In case of a positive jitter we can therefore send a QoS event with
a timestamp B1, jitter J1 and proportion given by (4).
This allows an upstream element to not generate any data with timestamps
B2 < T1, where the element can derive T1 as B1 + J1.
This will effectively result in frame drops.
The element can even do a better estimation of the next valid timestamp it
should output.
Indeed, given that the element generated a buffer with timestamp B0 that arrived
in time in the sink but then received a QoS event stating that B1 arrived J1
too late. This means generating B1 took (B1 + J1) - B0 = T1 - T0 = PT1, as
given in (3). Given the buffer B1 had a duration D1 and assuming that
generating a new buffer B2 will take the same amount of processing time,
a better estimation for B2 would then be:
B2 = T1 + D2 * DR1
expanding gives:
B2 = (B1 + J1) + D2 * (B1 + J1 - B0) / D1
assuming the durations of the frames are equal and thus D1 = D2:
B2 = (B1 + J1) + (B1 + J1 - B0)
B2 = 2 * (B1 + J1) - B0
also:
B0 = B1 - D1
so:
B2 = 2 * (B1 + J1) - (B1 - D1)
Which yields a more accurate prediction for the next buffer given as:
B2 = B1 + 2 * J1 + D1 (5)
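A sketch of how an upstream element could apply this short term correction
when receiving a QoS event (the qos_* variables stand for element-private
state and are assumptions):

  static gdouble qos_proportion = 1.0;
  static GstClockTime qos_earliest_time = GST_CLOCK_TIME_NONE;

  static gboolean
  src_event (GstPad * pad, GstObject * parent, GstEvent * event)
  {
    if (GST_EVENT_TYPE (event) == GST_EVENT_QOS) {
      GstQOSType type;
      gdouble proportion;
      GstClockTimeDiff jitter;
      GstClockTime timestamp;

      gst_event_parse_qos (event, &type, &proportion, &jitter, &timestamp);
      qos_proportion = proportion;
      if (jitter > 0 && GST_CLOCK_TIME_IS_VALID (timestamp)) {
        /* nothing with a timestamp before T1 = B1 + J1 can be on time */
        qos_earliest_time = timestamp + jitter;
      }
      gst_event_unref (event);
      return TRUE;
    }
    return gst_pad_event_default (pad, parent, event);
  }

The element can then skip producing any buffer whose timestamp would be
smaller than qos_earliest_time.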
Long term correction
--------------------
The datarate used to calculate (5) for the short term prediction is based
on a single observation. A more accurate datarate can be obtained by
creating a running average over multiple datarate observations.
This average is less susceptible to sudden changes that would only influence
the datarate for a very short period.
A running average is calculated over the observations given in (4) and is
used as the proportion member in the QoS event that is sent upstream.
Receivers of the QoS event should permanently reduce their datarate
as given by the proportion member. Failure to do so will certainly lead to
more dropped frames and a generally worse QoS.
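A sketch of such a running average, using an exponential moving average with
an arbitrarily chosen weight of 1/8:

  /* dr1 is a new observation from (4), avg_rate the running average */
  avg_rate += (dr1 - avg_rate) / 8.0;

avg_rate is then used as the proportion member in the QoS event.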
Throttling
----------
In throttle mode, the time distance between buffers is kept to a configurable
throttle interval. This means that effectively the buffer rate is limited
to 1 buffer per throttle interval. This can be used to limit the framerate,
for example.
When an element is configured in throttling mode (this is usually only
implemented on sinks) it should produce QoS events upstream with the jitter
field set to the throttle interval. This should instruct upstream elements to
skip or drop the remaining buffers in the configured throttle interval.
The proportion field is set to the desired slowdown needed to get the
desired throttle interval. Implementations can use the QoS Throttle type,
the proportion and the jitter member to tune their implementations.
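GstBaseSink exposes the throttle interval as the "throttle-time" property,
in nanoseconds. For example, limiting a video sink to roughly 10 frames per
second:

  /* 100ms between buffers, so at most ~10 buffers per second */
  g_object_set (sink, "throttle-time", (guint64) (100 * GST_MSECOND), NULL);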
QoS strategies
--------------
Several strategies exist to reduce processing delay that might affect
real time performance.
- lowering quality
- dropping frames (reduce CPU/bandwidth usage)
- switch to a lower decoding/encoding quality (reduce algorithmic
complexity)
- switch to a lower quality source (reduce network usage)
- increasing thread priorities
- switch to real-time scheduling
- assign more CPU cycles to critical pipeline parts
- assign more CPU(s) to critical pipeline parts
QoS implementations
-------------------
Here follows a small overview of how QoS can be implemented in a range of
different types of elements.
GstBaseSink
-----------
The primary implementor of QoS is GstBaseSink. It will calculate the following
values:
- upstream running average of processing time (5) in stream time.
- running average of buffer durations.
- running average of render time (in system time)
- rendered/dropped buffers
The processing time and the average buffer durations will be used to
calculate a proportion.
The processing time in system time is compared to render time to decide if
the majority of the time is spent upstream or in the sink itself. This value
is used to decide overflow or underflow.
The number of rendered and dropped buffers is used to query stats on the sink.
A QoS event with the most current values is sent upstream for each buffer
that was received by the sink.
Normally QoS is only enabled for video pipelines, the reason being that drops
in audio are more disturbing than dropped video frames. Video also generally
requires more processing than audio.
Normally there is a threshold for when buffers get dropped in a video sink. Frames
that arrive 20 milliseconds late are still rendered as the lateness is not
noticeable to the human eye.
A QoS message is posted whenever a (part of a) buffer is dropped.
In throttle mode, the sink sends a QoS event upstream with the timestamp set to
the running_time of the latest buffer and the jitter set to the throttle interval.
If the throttled buffer is late, the lateness is subtracted from the throttle
interval in order to keep the desired throttle interval.
GstBaseTransform
----------------
Transform elements can entirely skip the transform based on the timestamp and
jitter values of a recent QoS event, since those buffers will certainly arrive
too late.
With any intermediate element, the element should measure its performance to
decide if it is responsible for the quality problems or any upstream/downstream
element.
Some transforms can reduce the complexity of their algorithms. Depending on the
algorithm, the changes in quality may have a disturbing visual or audible effect
that should be avoided.
A QoS message should be posted when a frame is dropped or when the quality
of the filter is reduced. The quality member in the QOS message should reflect
the quality setting of the filter.
Video Decoders
--------------
A video decoder can, based on the codec in use, decide to not decode intermediate
frames. A typical codec can for example skip the decoding of B-frames to reduce
the CPU usage and framerate.
If each frame is independently decodable, any arbitrary frame can be skipped based
on the timestamp and jitter values of the latest QoS event. In addition, the
proportion member can be used to permanently skip frames.
It is suggested to adjust the quality field of the QoS message with the expected
amount of dropped frames (skipping B and/or P frames). This depends on the
particular spacing of B and P frames in the stream. If the quality control would
result in half of the frames to be dropped (typical B frame skipping), the
quality field would be set to 1000000 * 1/2 = 500000. If a typical I frame spacing
of 18 frames is used, skipping B and P frames would result in 17 dropped frames
or 1 decoded frame every 18 frames. The quality member should be set to
1000000 * 1/18 = 55555.
- skipping B frames: quality = 500000
- skipping P/B frames: quality = 55555 (for I-frame spacing of 18 frames)
Demuxers
--------
Demuxers usually cannot do much about QoS other than skipping frames up to the next
keyframe when a lateness QoS event arrives on a source pad.
A demuxer can however measure if the performance problems are upstream or downstream
and forward an updated QoS event upstream.
Most demuxers that have multiple output pads might need to combine the QoS
events on all the pads and derive an aggregated QoS event for the upstream element.
Sources
-------
QoS events only apply to push-based sources, since pull-based sources are entirely
controlled by a downstream element.
Sources can receive an overflow or underflow event that can be used to switch to
less demanding source material. In case of a network stream, a switch could be done
to a lower or higher quality stream or additional enhancement layers could be used
or ignored.
Live sources will automatically drop data when the data that the element pushes
out takes too long to process downstream.
Live sources should post a QoS message when data is dropped.
@ -1,100 +0,0 @@
Query
-----
Purpose
~~~~~~~
Queries are used to get information about the stream.
A query is started on a specific pad and travels up or downstream.
Requirements
~~~~~~~~~~~~
- multiple return values, grouped together when they make sense.
- one pad function to perform the query
- extensible queries.
Implementation
~~~~~~~~~~~~~~
- GstQuery extends GstMiniObject and contains a GstStructure (see GstMessage)
- some standard query types are defined below
- methods to create and parse the results in the GstQuery.
- define pad method:
gboolean (*GstPadQueryFunction) (GstPad *pad,
GstObject *parent,
GstQuery *query);
the pad returns the result in the query structure and TRUE as the return
value, or FALSE when the query is not supported.
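A minimal sketch of such a pad query function, handling only the position
query (current_position is an assumed element-private value):

  static gboolean
  my_query (GstPad * pad, GstObject * parent, GstQuery * query)
  {
    gboolean res = FALSE;

    switch (GST_QUERY_TYPE (query)) {
      case GST_QUERY_POSITION:{
        GstFormat format;

        gst_query_parse_position (query, &format, NULL);
        if (format == GST_FORMAT_TIME) {
          gst_query_set_position (query, format, current_position);
          res = TRUE;
        }
        break;
      }
      default:
        res = gst_pad_query_default (pad, parent, query);
        break;
    }
    return res;
  }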
Query types
~~~~~~~~~~~
- GST_QUERY_POSITION:
get info on current position of the stream in stream_time.
- GST_QUERY_DURATION:
get info on the total duration of the stream.
- GST_QUERY_LATENCY:
get amount of latency introduced in the pipeline. (See part-latency.txt)
- GST_QUERY_RATE:
get the current playback rate of the pipeline
- GST_QUERY_SEEKING:
get info on how seeking can be done
- getrange, with/without offset/size
- ranges where seeking is efficient (for caching network sources)
- flags describing seeking behaviour (forward, backward, segments,
play backwards, ...)
- GST_QUERY_SEGMENT:
get info about the currently configured playback segment.
- GST_QUERY_CONVERT:
convert format/value to another format/value pair.
- GST_QUERY_FORMATS:
return list of supported formats that can be used for GST_QUERY_CONVERT.
- GST_QUERY_BUFFERING:
query available media for efficient seeking (See part-buffering.txt)
- GST_QUERY_CUSTOM:
a custom query, the name of the query defines the properties of the query.
- GST_QUERY_URI:
query the uri of the source or sink element
- GST_QUERY_ALLOCATION:
the buffer allocation properties (See part-bufferpool.txt)
- GST_QUERY_SCHEDULING:
the scheduling properties (See part-scheduling.txt)
- GST_QUERY_ACCEPT_CAPS:
check if caps are supported (See part-negotiation.txt)
- GST_QUERY_CAPS:
get the possible caps (See part-negotiation.txt)
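From an application, some of these queries are available through convenience
API, e.g. (sketch, assuming a running pipeline):

  gint64 pos, dur;

  if (gst_element_query_position (pipeline, GST_FORMAT_TIME, &pos) &&
      gst_element_query_duration (pipeline, GST_FORMAT_TIME, &dur))
    g_print ("%" GST_TIME_FORMAT " / %" GST_TIME_FORMAT "\n",
        GST_TIME_ARGS (pos), GST_TIME_ARGS (dur));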
@ -1,491 +0,0 @@
Object relation types
---------------------
This document describes the relations between objects that exist in GStreamer.
It will also describe the way of handling the relation wrt locking and
refcounting.
parent-child relation
~~~~~~~~~~~~~~~~~~~~~
+---------+ +-------+
| parent | | child |
*--->| *----->| |
| F1|<-----* 1|
+---------+ +-------+
- properties
- parent has references to multiple children
- child has reference to parent
- reference fields protected with LOCK
- the reference held by each child to the parent is
NOT reflected in the refcount of the parent.
- the parent removes the floating flag of the child when taking
ownership.
- the application has valid reference to parent
- creation/destruction requires two unnested locks and 1 refcount.
- usage in GStreamer
GstBin -> GstElement
GstElement -> GstRealPad
- lifecycle
a) object creation
The application creates two objects and holds a pointer
to each. The objects are initially FLOATING with a refcount
of 1.
+---------+ +-------+
*--->| parent | *--->| child |
| * | | |
| F1| | * F1|
+---------+ +-------+
b) establishing the parent-child relationship
The application then calls a method on the parent object to take
ownership of the child object. The parent performs the following
actions:
result = _set_parent (child, parent);
if (result) {
LOCK (parent);
ref_pointer = child;
.. update other data structures ..
UNLOCK (parent);
}
else {
.. child had parent ..
}
The _set_parent() method performs the following actions:
LOCK (child);
if (child->parent != NULL) {
UNLOCK (child);
return FALSE;
}
if (IS_FLOATING (child)) {
UNSET (child, FLOATING);
}
else {
_ref (child);
}
child->parent = parent;
UNLOCK (child);
_signal (PARENT_SET, child, parent);
return TRUE;
The function atomically checks if the child has no parent yet
and will set the parent if not. It will also sink the child, meaning
all floating references to the child are invalid now as it takes
over the refcount of the object.
Visually:
after _set_parent() returns TRUE:
+---------+ +-------+
*---->| parent | *-//->| child |
| * | | |
| F1|<-------------* 1|
+---------+ +-------+
after parent updates ref_pointer to child.
+---------+ +-------+
*---->| parent | *-//->| child |
| *--------->| |
| F1|<---------* 1|
+---------+ +-------+
- only one parent is able to _sink the same object because the
_set_parent() method is atomic.
- since only one parent is able to _set_parent() the object, only
one will add a reference to the object.
- since the parent can hold multiple references to children, we don't
need to lock the parent when locking the child. Many threads can
call _set_parent() on the children with the same parent, the parent
can then add all those to its lists.
Note that the signal is emitted before the parent has added the
element to its internal data structures. This is not a problem
since the parent usually has its own signal to inform the app that
the child was reffed. One possible solution would be to update the
internal structure first and then perform a rollback if the _set_parent()
failed. This is not a good solution as iterators might grab the
'half-added' child too soon.
c) using the parent-child relationship
- since the initial floating reference to the child object became
invalid after giving it to the parent, any reference to a child
has at least a refcount > 1.
- this means that unreffing a child object cannot decrease the refcount
to 0. In fact, only the parent can destroy and dispose the child
object.
- given a reference to the child object, the parent pointer is only
valid when holding the child LOCK. Indeed, after unlocking the child
LOCK, the parent can unparent the child or the parent could even become
disposed. To avoid the parent dispose problem, when obtaining the
parent pointer, it should be reffed before releasing the child LOCK.
1) getting a reference to the parent.
- a reference is held to the child, so it cannot be disposed.
LOCK (child);
parent = _ref (child->parent);
UNLOCK (child);
.. use parent ..
_unref (parent);
2) getting a reference to a child
- a reference to a child can be obtained by reffing it before
adding it to the parent or by querying the parent.
- when requesting a child from the parent, a reference is held to
the parent so it cannot be disposed. The parent will use its
internal data structures to locate the child element and will
return a reference to it with an incremented refcount. The
requester should _unref() the child after usage.
d) destroying the parent-child relationship
- only the parent can actively destroy the parent-child relationship;
this typically happens when a method is called on the parent to release
ownership of the child.
- a child shall never remove itself from the parent.
- since calling a method on the parent with the child as an argument
requires the caller to obtain a valid reference to the child, the child
refcount is at least > 1.
- the parent will perform the following actions:
LOCK (parent);
if (ref_pointer == child) {
ref_pointer = NULL;
.. update other data structures ..
UNLOCK (parent);
_unparent (child);
}
else {
UNLOCK (parent);
.. not our child ..
}
The _unparent() method performs the following actions:
LOCK (child);
if (child->parent != NULL) {
child->parent = NULL;
UNLOCK (child);
_signal (PARENT_UNSET, child, parent);
_unref (child);
}
else {
UNLOCK (child);
}
Since the _unparent() method unrefs the child object, it is possible that
the child pointer is invalid after this function. If the parent wants to
perform other actions on the child (such as signal emission) it should
_ref() the child first.
single-reffed relation
~~~~~~~~~~~~~~~~~~~~~~
+---------+ +---------+
*--->| object1 | *--->| object2 |
| *--------->| |
| 1| | 2|
+---------+ +---------+
- properties
- one object has a reference to another
- reference field protected with LOCK
- the reference held by the object is reflected in the
refcount of the other object.
- typically the other object can be shared among multiple
other objects where each ref is counted for in the
refcount.
- no object has ownership of the other.
- either shared state or copy-on-write.
- creation/destruction requires one lock and one refcount.
- usage
GstRealPad -> GstCaps
GstBuffer -> GstCaps
GstEvent -> GstCaps
GstEvent -> GstObject
GstMessage -> GstCaps
GstMessage -> GstObject
- lifecycle
a) Two objects exist unlinked.
+---------+ +---------+
*--->| object1 | *--->| object2 |
| * | | |
| 1| | 1|
+---------+ +---------+
b) establishing the single-reffed relationship
The second object is attached to the first one using a method
on the first object. The second object is reffed and a pointer
is updated in the first object using the following algorithm:
LOCK (object1);
if (object1->pointer)
_unref (object1->pointer);
object1->pointer = _ref (object2);
UNLOCK (object1);
After releasing the lock on the first object, it is not guaranteed that
object2 is still reffed from object1.
+---------+ +---------+
*--->| object1 | *--->| object2 |
| *--------->| |
| 1| | 2|
+---------+ +---------+
c) using the single-reffed relationship
The only way to access object2 is by holding a ref to it or by
getting the reference from object1.
Reading the object pointed to by object1 can be done like this:
LOCK (object1);
object2 = object1->pointer;
_ref (object2);
UNLOCK (object1);
.. use object2 ...
_unref (object2);
Depending on the type of the object, modifications can be done either
with copy-on-write or directly into the object.
Copy on write can practically only be done like this:
LOCK (object1);
object2 = object1->pointer;
object2 = _copy_on_write (object2);
... make modifications to object2 ...
UNLOCK (object1);
With the lock released, there is only a very small window in which the
copy_on_write does not actually perform a copy:
LOCK (object1);
object2 = object1->pointer;
_ref (object2);
UNLOCK (object1);
.. object2 now has at least 2 refcounts making the next
copy-on-write make a real copy, unless some other thread
writes another object2 to object1 here ...
object2 = _copy_on_write (object2);
.. make modifications to object2 ...
LOCK (object1);
if (object1->pointer != object2) {
if (object1->pointer)
_unref (object1->pointer);
object1->pointer = gst_object_ref (object2);
}
UNLOCK (object1);
d) destroying the single-reffed relationship
The following algorithm removes the single-reffed link between
object1 and object2.
LOCK (object1);
_unref (object1->pointer);
object1->pointer = NULL;
UNLOCK (object1);
Which yields the following initial state again:
+---------+ +---------+
*--->| object1 | *--->| object2 |
| * | | |
| 1| | 1|
+---------+ +---------+
unreffed relation
~~~~~~~~~~~~~~~~~
+---------+ +---------+
*--->| object1 | *--->| object2 |
| *--------->| |
| 1|<---------* 1|
+---------+ +---------+
- properties
- two objects have references to each other
- both objects can only have 1 reference to another object.
- reference fields protected with LOCK
- the references held by each object are NOT reflected in the
refcount of the other object.
- no object has ownership of the other.
- typically each object is owned by a different parent.
- creation/destruction requires two nested locks and no refcounts.
- usage
- This type of link is used when the link is less important than
the existence of the objects. If one of the objects is disposed, so
is the link.
GstRealPad <-> GstRealPad (srcpad lock taken first)
- lifecycle
a) Two objects exist unlinked.
+---------+ +---------+
*--->| object1 | *--->| object2 |
| * | | |
| 1| | * 1|
+---------+ +---------+
b) establishing the unreffed relationship
Since we need to take two locks, the order in which these locks are
taken is very important or we might cause deadlocks. This lock order
must be defined for all unreffed relations. In these examples we always
lock object1 first and then object2.
LOCK (object1);
LOCK (object2);
object2->refpointer = object1;
object1->refpointer = object2;
UNLOCK (object2);
UNLOCK (object1);
c) using the unreffed relationship
Reading requires taking one of the locks and reading the corresponding
object. Again we need to ref the object before releasing the lock.
LOCK (object1);
object2 = _ref (object1->refpointer);
UNLOCK (object1);
.. use object2 ..
_unref (object2);
d) destroying the unreffed relationship
Because of the lock order we need to be careful when destroying this
relation.
When only a reference to object1 is held:
LOCK (object1);
LOCK (object2);
object1->refpointer->refpointer = NULL;
object1->refpointer = NULL;
UNLOCK (object2);
UNLOCK (object1);
When only a reference to object2 is held we need to get a handle to the
other object first so that we can lock it first. There is a window where
we need to release all locks and the relation could be invalid. To solve
this we check the relation after grabbing both locks and retry if the
relation changed.
retry:
LOCK (object2);
object1 = _ref (object2->refpointer);
UNLOCK (object2);
.. things can change here ..
LOCK (object1);
LOCK (object2);
if (object1 == object2->refpointer) {
/* relation unchanged */
object1->refpointer->refpointer = NULL;
object1->refpointer = NULL;
}
else {
/* relation changed.. retry */
UNLOCK (object2);
UNLOCK (object1);
_unref (object1);
goto retry;
}
UNLOCK (object2);
UNLOCK (object1);
_unref (object1);
When references are held to both objects: note that it is not possible to
get references to both objects with the locks released since when the
references are taken and the locks are released, a concurrent update might
have changed the link, making the references not point to linked objects.
LOCK (object1);
LOCK (object2);
if (object1->refpointer == object2) {
object2->refpointer = NULL;
object1->refpointer = NULL;
}
else {
.. objects are not linked ..
}
UNLOCK (object2);
UNLOCK (object1);
double-reffed relation
~~~~~~~~~~~~~~~~~~~~~~
+---------+ +---------+
*--->| object1 | *--->| object2 |
| *--------->| |
| 2|<---------* 2|
+---------+ +---------+
- properties
- two objects have references to each other
- reference fields protected with LOCK
- the references held by each object are reflected in the
refcount of the other object.
- no object has ownership of the other.
- typically each object is owned by a different parent.
- creation/destruction requires two locks and two refcounts.
- usage
Not used in GStreamer.
- lifecycle
@ -1,252 +0,0 @@
Scheduling
----------
The scheduling in GStreamer is based on pads actively pushing (producing) data or
pad pulling in data (consuming) from other pads.
Pushing
~~~~~~~
A pad can produce data and push it to the next pad. A pad that behaves this way
exposes a loop function that will be called repeatedly until it returns false.
The loop function is allowed to block whenever it wants. When the pad is deactivated
the loop function should unblock though.
A pad operating in the push mode can only produce data to a pad that exposes a
chain function. This chain function will be called with the buffer produced by
the pushing pad.
This method of producing data is called the streaming mode since the producer
produces a constant stream of data.
Pulling
~~~~~~~
Pads that operate in pulling mode can only pull data from a pad that exposes the
pull_range function. In this case, the sink pad exposes a loop function that will be
called repeatedly until the task is stopped.
After pulling data from the peer pad, the loop function will typically call the
push function to push the result to the peer sinkpad.
Deciding the scheduling mode
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
When a pad is activated, the _activate() function is called. The pad can then
choose to activate itself in push or pull mode depending on upstream
capabilities.
The GStreamer core will by default activate pads in push mode when there is no
activate function for the pad.
The chain function
~~~~~~~~~~~~~~~~~~
The chain function will be called when an upstream element performs a _push() on the pad.
The upstream element can be another chain based element or a pushing source.
The getrange function
~~~~~~~~~~~~~~~~~~~~~
The getrange function is called when a peer pad performs a _pull_range() on the pad. This
downstream pad can be a pulling element or another _pull_range() based element.
Scheduling Query
~~~~~~~~~~~~~~~~
A sinkpad can ask the upstream srcpad for its scheduling attributes. It does
this with the SCHEDULING query.
(out) "modes", G_TYPE_ARRAY (default NULL)
- an array of GST_TYPE_PAD_MODE enums. Contains all the supported
scheduling modes.
(out) "flags", GST_TYPE_SCHEDULING_FLAGS (default 0)
typedef enum {
GST_SCHEDULING_FLAG_SEEKABLE = (1 << 0),
GST_SCHEDULING_FLAG_SEQUENTIAL = (1 << 1),
GST_SCHEDULING_FLAG_BANDWIDTH_LIMITED = (1 << 2)
} GstSchedulingFlags;
_SEEKABLE:
- the offset of a pull operation can be specified, if this flag is false,
the offset should be -1,
_SEQUENTIAL:
- suggest sequential access to the data. If _SEEKABLE is specified,
seeks are allowed but should be avoided. This is common for network
streams.
_BANDWIDTH_LIMITED:
- suggest the element supports buffering data for downstream to
cope with bandwidth limitations. If this flag is on the
downstream element might ask for more data than necessary for
normal playback. This use-case is interesting for on-disk
buffering scenarios for instance. Seek operations might be
slow as well so downstream elements should take this into
consideration.
(out) "minsize", G_TYPE_INT (default 1)
- the suggested minimum size of pull requests
(out) "maxsize", G_TYPE_INT (default -1, unlimited)
- the suggested maximum size of pull requests
(out) "align", G_TYPE_INT (default 0)
- the suggested alignment for the pull requests.
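A sketch of how a pull-capable srcpad could answer this query, and how a
sinkpad could use the result to decide on an activation mode:

  /* srcpad side: fill in the scheduling query */
  gst_query_set_scheduling (query, GST_SCHEDULING_FLAG_SEEKABLE, 1, -1, 0);
  gst_query_add_scheduling_mode (query, GST_PAD_MODE_PULL);
  gst_query_add_scheduling_mode (query, GST_PAD_MODE_PUSH);

  /* sinkpad side: check if the peer supports pull mode */
  query = gst_query_new_scheduling ();
  if (gst_pad_peer_query (sinkpad, query) &&
      gst_query_has_scheduling_mode (query, GST_PAD_MODE_PULL)) {
    /* activate in pull mode */
  }
  gst_query_unref (query);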
Plug-in techniques
~~~~~~~~~~~~~~~~~~
Multi-sink elements
^^^^^^^^^^^^^^^^^^^
Elements with multiple sinks can either expose a loop function on each of the pads to
actively pull_range data or they can expose a chain function on each pad.
Implementing a chain function is usually easy and allows for all possible scheduling
methods.
Pad select
----------
If the chain based sink wants to wait for one of the pads to receive a buffer, just
implement the action to perform in the chain function. Be aware that the action could
be performed in different threads and possibly simultaneously so grab the STREAM_LOCK.
Collect pads
------------
If the chain based sink pads all require one buffer before the element can operate on
the data, collect all the buffers in the chain function and perform the action when
all chain pads have received a buffer.
In this case you probably also don't want to accept more data on a pad that has a buffer
queued. This can easily be done with the following code snippet:
static GstFlowReturn _chain (GstPad *pad, GstBuffer *buffer)
{
LOCK (mylock);
while (pad->store != NULL) {
WAIT (mycond, mylock);
}
pad->store = buffer;
SIGNAL (mycond);
UNLOCK (mylock);
return GST_FLOW_OK;
}
static void _pull (GstPad *pad, GstBuffer **buffer)
{
LOCK (mylock);
while (pad->store == NULL) {
WAIT (mycond, mylock);
}
*buffer = pad->store;
pad->store = NULL;
SIGNAL (mycond);
UNLOCK (mylock);
}
Cases
~~~~~
The braces below the pads state which functions the pad supports:
l: exposes a loop function, so it can act as a pushing source.
g: exposes a getrange function
c: exposes a chain function
The following scheduling decisions are made based on the scheduling
methods exposed by the pads:
(g) - (l): sinkpad will pull data from src
(l) - (c): srcpad actively pushes data to sinkpad
() - (c): srcpad will push data to sinkpad.
() - () : not schedulable.
() - (l): not schedulable.
(g) - () : not schedulable.
(g) - (c): not schedulable.
(l) - () : not schedulable.
(l) - (l): not schedulable
() - (g): impossible
(g) - (g): impossible.
(l) - (g): impossible
(c) - () : impossible
(c) - (g): impossible
(c) - (l): impossible
(c) - (c): impossible
+---------+ +------------+ +-----------+
| filesrc | | mp3decoder | | audiosink |
| src--sink src--sink |
+---------+ +------------+ +-----------+
(l-g) (c) () (c)
When activating the pads:
* audiosink has a chain function and the peer pad has no
loop function, no scheduling is done.
* mp3decoder and filesrc expose an (l) - (c) connection,
a thread is created to call the srcpad loop function.
+---------+ +------------+ +----------+
| filesrc | | avidemuxer | | fakesink |
| src--sink src--sink |
+---------+ +------------+ +----------+
(l-g) (l) () (c)
* fakesink has a chain function and the peer pad has no
loop function, no scheduling is done.
* avidemuxer and filesrc expose an (g) - (l) connection,
a thread is created to call the sinkpad loop function.
+---------+ +----------+ +------------+ +----------+
| filesrc | | identity | | avidemuxer | | fakesink |
| src--sink src--sink src--sink |
+---------+ +----------+ +------------+ +----------+
(l-g) (c) () (l) () (c)
* fakesink has a chain function and the peer pad has no
loop function, no scheduling is done.
* avidemuxer and identity expose no schedulable connection so
this pipeline is not schedulable.
+---------+ +----------+ +------------+ +----------+
| filesrc | | identity | | avidemuxer | | fakesink |
| src--sink src--sink src--sink |
+---------+ +----------+ +------------+ +----------+
(l-g) (c-l) (g) (l) () (c)
* fakesink has a chain function and the peer pad has no
loop function, no scheduling is done.
* avidemuxer and identity expose an (g) - (l) connection,
a thread is created to call the sinkpad loop function.
* identity knows the srcpad is getrange based and uses the
thread from avidemux to getrange data from filesrc.
+---------+ +----------+ +------------+ +----------+
| filesrc | | identity | | oggdemuxer | | fakesink |
| src--sink src--sink src--sink |
+---------+ +----------+ +------------+ +----------+
(l-g) (c) () (l-c) () (c)
* fakesink has a chain function and the peer pad has no
loop function, no scheduling is done.
* oggdemuxer and identity expose an () - (l-c) connection,
oggdemux has to operate in chain mode.
* identity can only work chain-based, so filesrc creates
a thread to push data to identity.
@ -1,251 +0,0 @@
Seeking
-------
Seeking in GStreamer means configuring the pipeline for playback of the
media between a certain start and stop time, called the playback segment.
By default a pipeline will play from position 0 to the total duration of the
media at a rate of 1.0.
A seek is performed by sending a seek event to the sink elements of a
pipeline. Sending the seek event to a bin will by default forward
the event to all sinks in the bin.
When performing a seek, the start and stop values of the segment can be
specified as absolute positions or relative to the currently configured
playback segment. Note that it is not possible to seek relative to the current
playback position. To seek relative to the current playback position, one must
query the position first and then perform an absolute seek to the desired
position.
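For example (sketch), seeking 10 seconds forward relative to the current
position:

  gint64 pos;

  if (gst_element_query_position (pipeline, GST_FORMAT_TIME, &pos))
    gst_element_seek_simple (pipeline, GST_FORMAT_TIME,
        GST_SEEK_FLAG_FLUSH | GST_SEEK_FLAG_KEY_UNIT,
        pos + 10 * GST_SECOND);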
Feedback of the seek operation can be made immediate by using the GST_SEEK_FLAG_FLUSH
flag. With this flag, all pending data in the pipeline is discarded and playback
starts from the new position immediately.
When the FLUSH flag is not set, the seek will be queued and executed as
soon as possible, which might be after all queues are emptied.
Seeking can be performed in different formats such as time, frames
or samples.
The seeking can be performed to a nearby key unit or to the exact
(estimated) unit in the media (GST_SEEK_FLAG_KEY_UNIT). See below for more
details on this.
The seeking can be performed by using an estimated target position or in an
accurate way (GST_SEEK_FLAG_ACCURATE). For some formats this can result in
having to scan the complete file in order to accurately find the target unit.
See below for more details on this.
Non-segment seeking will make the pipeline emit EOS when the configured
segment has been played.
Segment seeking (using the GST_SEEK_FLAG_SEGMENT) will not emit an EOS at
the end of the playback segment but will post a SEGMENT_DONE message on the
bus. This message is posted by the element driving the playback in the
pipeline, typically a demuxer. After receiving the message, the application
can reconnect the pipeline or issue other seek events in the pipeline.
Since the message is posted as early as possible in the pipeline, the
application has some time to issue a new seek to make the transition seamless.
Typically the allowed delay is defined by the buffer sizes of the sinks as well
as the size of any queues in the pipeline.
The seek can also change the playback speed of the configured segment.
A speed of 1.0 is normal speed, 2.0 is double speed. Negative values
mean backward playback.
When performing a seek with a playback rate different from 1.0, the
GST_SEEK_FLAG_SKIP flag can be used to instruct decoders and demuxers that they
are allowed to skip decoding. This can be useful when resource consumption is
more important than accurately producing all frames.
Seeking in push based elements
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Generating seeking events
~~~~~~~~~~~~~~~~~~~~~~~~~
A seek event is created with gst_event_new_seek ().
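For example (sketch), a flushing seek to the segment from second 1 to
second 5 at normal rate:

  GstEvent *seek;

  seek = gst_event_new_seek (1.0, GST_FORMAT_TIME, GST_SEEK_FLAG_FLUSH,
      GST_SEEK_TYPE_SET, 1 * GST_SECOND,
      GST_SEEK_TYPE_SET, 5 * GST_SECOND);
  gst_element_send_event (pipeline, seek);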
Seeking variants
~~~~~~~~~~~~~~~~
The different kinds of seeking methods and their internal workings are
described below.
FLUSH seeking
^^^^^^^^^^^^^
This is the most common way of performing a seek in a playback application.
The application issues a seek on the pipeline and the new media is immediately
played after the seek call returns.
seeking without FLUSH
^^^^^^^^^^^^^^^^^^^^^
This seek type is typically performed after issuing segment seeks to finish
the playback of the pipeline.
Performing a non-flushing seek in a PAUSED pipeline blocks until the pipeline
is set to playing again since all data passing is blocked in the prerolled
sinks.
segment seeking with FLUSH
^^^^^^^^^^^^^^^^^^^^^^^^^^
This seek is typically performed when starting seamless looping.
segment seeking without FLUSH
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This seek is typically performed when continuing seamless looping.
========================================================================
Demuxer/parser behaviour and SEEK_FLAG_KEY_UNIT and SEEK_FLAG_ACCURATE
========================================================================
This section aims to explain the behaviour expected by an element with regard
to the KEY_UNIT and ACCURATE seek flags using the example of a parser or
demuxer.
1. DEFAULT BEHAVIOUR:
When a seek to a certain position is requested, the demuxer/parser will
do two things (ignoring flushing and segment seeks, and simplified for
illustration purposes):
- send a segment event with a new start position
- start pushing data/buffers again
To ensure that the data corresponding to the requested seek position
can actually be decoded, a demuxer or parser needs to start pushing data
from a keyframe/keyunit at or before the requested seek position.
Unless requested differently (via the KEY_UNIT flag), the start of the
segment event should be the requested seek position.
So by default a demuxer/parser will then start pushing data from
position DATA and send a segment event with start position SEG_START,
and DATA <= SEG_START.
If DATA < SEG_START, a well-behaved video decoder will start decoding frames
from DATA, but take into account the segment configured by the demuxer via
the segment event, and only actually output decoded video frames from
SEG_START onwards, dropping all decoded frames that are before the
segment start and adjusting the timestamp/duration of the buffer that
overlaps the segment start ("clipping"). A not-so-well-behaved video decoder
will start decoding frames from DATA and push decoded video frames out
starting from position DATA, in which case the frames that are before
the configured segment start will usually be dropped/clipped downstream
(e.g. by the video sink).
2. GST_SEEK_FLAG_KEY_UNIT:
If the KEY_UNIT flag is specified, the demuxer/parser should adjust the
segment start to the position of the key frame closest to the requested
seek position and then start pushing out data from there. The nearest
key frame may be before or after the requested seek position, but many
implementations will only look for the closest keyframe before the
requested position.
Most media players and thumbnailers do (and should be doing) KEY_UNIT seeks
by default, for performance reasons, to ensure almost-instant responsiveness
when scrubbing (dragging the seek slider in PAUSED or PLAYING mode). This
works well for most media, but results in suboptimal behaviour for a small
number of 'odd' files (e.g. files that only have one keyframe at the very
beginning, or only a few keyframes throughout the entire stream). At the
time of writing, a solution for this still needs to be found, but could be
implemented demuxer/parser-side, e.g. make demuxers/parsers ignore the
KEY_UNIT flag if the position adjustment would be larger than 1/10th of
the duration or somesuch.
Flags can be used to influence snapping direction for those cases where it
matters. SNAP_BEFORE will select the preceding position to the seek target,
and SNAP_AFTER will select the following one. If both flags are set, the
nearest one to the seek target will be used. If none of these flags are set,
the seeking implementation is free to select whichever it wants.
Summary:
- if the KEY_UNIT flag is *not* specified, the demuxer/parser should
start pushing data from a key unit preceding the seek position
(or from the seek position if that falls on a key unit), and
the start of the new segment should be the requested seek position.
- if the KEY_UNIT flag is specified, the demuxer/parser should start
pushing data from the key unit nearest the seek position (or from
the seek position if that falls on a key unit), and
the start of the new segment should be adjusted to the position of
that key unit which was nearest the requested seek position (i.e.
the new segment start should be the position from which data is
pushed).
3. GST_SEEK_FLAG_ACCURATE:
If the ACCURATE flag is specified in a seek request, the demuxer/parser
is asked to do whatever it takes (!) to make sure that the position seeked
to is accurate in relation to the beginning of the stream. This means that
it is not acceptable to just approximate the position (e.g. using an average
bitrate). The achieved position must be exact. In the worst case, the demuxer
or parser needs to push data from the beginning of the file and let downstream
clip everything before the requested segment start.
The ACCURATE flag does not affect what the segment start should be in
relation to the requested seek position. Only the KEY_UNIT flag (or its
absence) has any effect on that.
Video editors and frame-stepping applications usually use the ACCURATE flag.
Summary:
- if the ACCURATE flag is *not* specified, it is up to the demuxer/parser
to decide how exact the seek should be. If the flag is not specified,
the expectation is that the demuxer/parser does a reasonable best-effort
attempt, trading speed for accuracy. In the absence of an index, the
seek position may be approximated.
- if the ACCURATE flag is specified, absolute accuracy is required, and
speed is of no concern. It is not acceptable to just approximate the
seek position in that case.
- the ACCURATE flag does not imply that the segment starts at the
requested seek position or should be adjusted to the nearest keyframe,
only the KEY_UNIT flag determines that.
4. ACCURATE and KEY_UNIT combinations:
All combinations of these two flags are valid:
- neither flag specified: segment starts at seek position, send data
from preceding key frame (or earlier), feel free to approximate the
seek position
- only KEY_UNIT specified: segment starts from position of nearest
keyframe, send data from nearest keyframe, feel free to approximate the
seek position
- only ACCURATE specified: segment starts at seek position, send data
from preceding key frame (or earlier), do not approximate the seek
position under any circumstances
- ACCURATE | KEY_UNIT specified: segment starts from position of nearest
keyframe, send data from nearest key frame, do not approximate the seek
position under any circumstances
@ -1,109 +0,0 @@
Segments
--------
A segment in GStreamer denotes a set of media samples that must be
processed. A segment has a start time, a stop time and a processing
rate.
A media stream has a start and a stop time. The start time is
always 0 and the stop time is the total duration (or -1 if unknown,
for example a live stream). We call this the complete media stream.
The segment of the complete media stream can be played by issuing a seek
on the stream. The seek has a start time, a stop time and a processing rate.
complete stream
+------------------------------------------------+
0 duration
segment
|--------------------------|
start stop
The playback of a segment starts with a source or demuxer element pushing a
segment event containing the start time, stop time and rate of the segment.
The purpose of this segment is to inform downstream elements of the
requested segment positions. Some elements might produce buffers that fall
outside of the segment and that might therefore be discarded or clipped.
Use case: FLUSHING seek
~~~~~~~~~~~~~~~~~~~~~~~
ex.
filesrc ! avidemux ! videodecoder ! videosink
When doing a seek in this pipeline for a segment 1 to 5 seconds, avidemux
will perform the seek.
Avidemux starts by sending a FLUSH_START event downstream and upstream. This
will cause its streaming task to pause because _pad_pull_range() and
_pad_push() will return FLUSHING. It then waits for the STREAM_LOCK,
which will be unlocked when the streaming task pauses. At this point no
streaming is happening anymore in the pipeline and a FLUSH_STOP is sent
upstream and downstream.
When avidemux starts playback of the segment from second 1 to 5, it pushes
out a segment with 1 and 5 as start and stop times. The stream_time in
the segment is also 1 as this is the position we seek to.
The video decoder stores these values internally and forwards them to the
next downstream element (videosink, which also stores the values)
Since second 1 does not contain a keyframe, the avi demuxer starts sending
data from the previous keyframe which is at timestamp 0.
The video decoder decodes the keyframe but knows it should not push the
video frame yet as it falls outside of the configured segment.
When the video decoder receives the frame with timestamp 1, it is able to
decode this frame as it received and decoded the data up to the previous
keyframe. It then continues to decode and push frames with timestamps >= 1.
When it reaches timestamp 5, it does not decode and push frames anymore.
The video sink receives a frame of timestamp 1. It takes the start value of
the previous segment and applies the following (simplified) formula:
render_time = BUFFER_TIMESTAMP - segment_start + element->base_time
It then syncs against the clock with this render_time. Note that
BUFFER_TIMESTAMP is always >= segment_start or else it would fall outside of
the configured segment.
Videosink reports its current position as (simplified):
current_position = clock_time - element->base_time + segment_time
See part-synchronisation.txt for a more detailed and accurate explanation of
synchronisation and position reporting.
Since after a flushing seek the stream_time is reset to 0, the new buffer
will be rendered immediately after the seek and the current_position will be
the stream_time of the seek that was performed.
The stop time is important when the video format contains B frames. The
video decoder receives a P frame first, which it can decode but not push yet.
When it receives a B frame, it can decode the B frame and push the B frame
followed by the previously decoded P frame. If the P frame is outside of the
segment, the decoder knows it should not send the P frame.
Avidemux stops sending data after pushing a frame with timestamp 5 and
returns GST_FLOW_EOS from the chain function to make the upstream
elements perform the EOS logic.
Use case: live stream
~~~~~~~~~~~~~~~~~~~~~
Use case: segment looping
~~~~~~~~~~~~~~~~~~~~~~~~~
Consider the case of a wav file with raw audio.
filesrc ! wavparse ! alsasink
@ -1,91 +0,0 @@
Seqnums (Sequence numbers)
--------------------------
Seqnums are integers associated to events and messages. They are used to
identify a group of events and messages as being part of the same 'operation'
over the pipeline.
Whenever a new event or message is created, a seqnum is set into them. This
seqnum is created from an ever increasing source (starting from 0 and it
might wrap around), so each new event and message gets a new and hopefully
unique seqnum.
Suppose an element receives an event A and, as part of the logic of handling
the event A, creates a new event B. B should have its seqnum set to the same as A's,
because they are part of the same operation. The same logic applies if this
element had to create multiple events or messages, all of those should have
the seqnum set to the value on the received event. For example, when a sink
element receives an EOS event and creates a new EOS message to post, it
should copy the seqnum from the event to the message because the EOS message
is a consequence of the EOS event being received.
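In code that could look like this (sketch, inside the sink's event handler;
event and sink are assumptions):

  GstMessage *msg;

  msg = gst_message_new_eos (GST_OBJECT (sink));
  gst_message_set_seqnum (msg, gst_event_get_seqnum (event));
  gst_element_post_message (GST_ELEMENT (sink), msg);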
Preserving the seqnums across related events and messages allows the elements
and applications to identify a set of events/messages as being part of a single
operation on the pipeline. For example, flushes, segments and EOS that are
related to a seek event started by the application.
Seqnums are also useful for elements to discard duplicated events, avoiding
handling them again.
Below are some scenarios as examples of how to handle seqnums when receiving
events:
Forcing EOS on the pipeline
---------------------------
The application has a pipeline running and does a gst_element_send_event
to the pipeline with an EOS event. All the sources in the pipeline will
have their send_event handlers called and will receive the event from
the application.
When handling this event, the sources will push either the same EOS downstream
or create their own EOS event and push that. In the latter case, the source should
copy the seqnum from the original EOS to the newly created one. This same logic
applies to all elements that receive the EOS downstream, either push the
same event or, if creating a new one, copy the seqnum.
When the EOS reaches the sink, it will create an EOS message, copy the
seqnum to the message and post to the bus. The application receives the
message and can compare the seqnum of the message with the one from the
original event sent to the pipeline. If they match, it knows that this
EOS message was caused by the event it pushed and not from other reason
(input finished or configured segment was over).
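A sketch of the application side:

  GstEvent *eos = gst_event_new_eos ();
  guint32 seqnum = gst_event_get_seqnum (eos);

  /* send_event takes ownership of the event, so get the seqnum first */
  gst_element_send_event (pipeline, eos);

  /* later, in the bus handler */
  if (GST_MESSAGE_TYPE (msg) == GST_MESSAGE_EOS &&
      gst_message_get_seqnum (msg) == seqnum) {
    /* this EOS was caused by our event */
  }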
Seeking
-------
A seek event sent to the pipeline is forwarded to all sinks in it. Those
sinks, then, push the seek event upstream until they reach an element
that is capable of handling it. If the element handling the seek has
multiple source pads (typically a demuxer is handling the seek) it might
receive the same seek event on all pads. To prevent handling the same
seek event multiple times, the seqnum can be used to identify those
events as being the same and only handle the first received.
Also, when handling the seek, the element might push flush-start, flush-stop
and a segment event. All those events should have the same seqnum as the seek
event received. When this segment is over and an EOS/segment-done event is
going to be pushed, it should also have the same seqnum as the seek that
originated the segment being played.
Having the same seqnum as the seek on the segment-done or EOS events is
important for the application to identify that the segment requested
by its seek has finished playing.
Questions
---------
A) What happens if the application has sent a seek to the pipeline and,
while the segment relative to this seek is playing, it sends an EOS
event? Should the EOS pushed by the source have the seqnum of the
segment or the EOS from the application?
If the EOS was received from the application before the segment ended, it
should have the seqnum of the application's EOS event. If the segment ends before
the application event is received/handled, it should have the seek/segment
seqnum.
@ -1,106 +0,0 @@
DRAFT Sparse Streams
--------------------
Introduction
~~~~~~~~~~~~
In 0.8, there was some support for Sparse Streams through the use of
FILLER events. These were used to mark gaps between buffers so that downstream
elements could know not to expect any more data for that gap.
In 0.10, segment information conveyed through SEGMENT events can be used
for the same purpose.
In 1.0, there is a GAP event that works in a similar fashion as the FILLER
event in 0.8.
Use cases
~~~~~~~~~
1) Sub-title streams
Sub-title information from muxed formats such as Matroska or MPEG consist of
irregular buffers spaced far apart compared to the other streams
(audio and video). Since these usually only appear when someone speaks or
some other action in the video/audio needs describing, they can be anywhere
from 1-2 seconds to several minutes apart.
Downstream elements that want to mix sub-titles and video (and muxers)
have no way of knowing whether to process a video packet or wait a moment
for a corresponding sub-title to be delivered on another pad.
2) Still frame/menu support
In DVDs (and other formats), there are still-frame regions where the current
video frame should be retained and no audio played for a period. In DVD,
these are described either as a fixed duration, or infinite duration still
frame.
3) Avoiding processing silence from audio generators
Imagine a source that from time to time produces empty buffers (silence
or blank images). If the pipeline has many downstream elements, it is better to
avoid processing this redundant data. Examples of such sources
are sound generators (simsyn in gst-buzztard) or a source in a VoIP
application that uses noise-gating (to save bandwidth).
Details
~~~~~~~
1) Sub-title streams
The main requirement here is to avoid stalling the pipeline between sub-title
packets, which effectively means updating the minimum timestamp for that stream.
A demuxer can do this by sending an 'update' SEGMENT with a new start time
to the subtitle pad. For example, every time the SCR in MPEG data
advances more than 0.5 seconds, the MPEG demuxer can issue a SEGMENT with
(update=TRUE, start=SCR ). Downstream elements can then be aware not to
expect any data older than the new start time.
The same holds true for any element that knows the current position in the
stream - once the element knows that there is no more data to be presented
until time 'n' it can advance the start time of the current segment to 'n'.
This technique can also be used, for example, to represent a stream of
MIDI events spaced to a clock period. When there is no event present for
a clock time, a SEGMENT update can be sent in its place.
2) Still frame/menu support
Still frames in DVD menus are not the same, in that they do not introduce
a gap in the timestamps of the data. Instead, they represent a pause in the
presentation of a stream. Correctly performing the wait requires some
synchronisation with downstream elements.
In this scenario, an upstream element that wants to execute a still frame
performs the following steps:
* Send all data before the still frame wait
* Send a DRAIN event to ensure that all data has been played downstream.
* wait on the clock for the required duration, possibly interrupting
if necessary due to an intervening activity (such as a user navigation)
* FLUSH the pipeline using a normal flush sequence (FLUSH_START,
chain-lock, FLUSH_STOP)
* Send a SEGMENT to restart playback with the next timestamp in the
stream.
The upstream element performing the wait must only do so when in the PLAYING
state. During PAUSED, the clock will not be running, and may not even have
been distributed to the element yet.
DRAIN is a new event that will block on a src pad until all data downstream
has been played out.
Flushing after completing the still wait is to ensure that data after the wait
is played correctly. Without it, sinks will consider the first buffers
(x seconds, where x is the duration of the wait that occurred) to be
arriving late at the sink, and they will be discarded instead of played.
3) For audio, this is the same case as 1) - there is a 'gap' in the audio data
that needs to be presented, and this can be done by sending a SEGMENT
update that moves the start time of the segment to the next timestamp when
data will be sent.
For video, however it is slightly different. Video frames are typically
treated at the moment as continuing to be displayed after their indicated
duration if no new frame arrives. In 3), it is desired to display a blank
frame instead, in which case at least one blank frame should be sent before
updating the start time of the segment.
@ -1,54 +0,0 @@
Ownership of dynamic objects
----------------------------
Any object-oriented system or language that doesn't have automatic garbage
collection has many potential pitfalls as far as the pointers go. Therefore,
some standards must be adhered to as far as who owns what.
Strings
~~~~~~~
Arguments passed into a function are owned by the caller, and the function
will make a copy of the string for its own internal use. The string should
be const gchar *. Strings returned from a function are always a copy of the
original and should be freed after usage by the caller.
ex:
name = gst_element_get_name (element); /* copy of name is made */
.. use name ..
g_free (name); /* free after usage */
Objects
~~~~~~~
Objects passed into a function are owned by the caller, any additional
reference held to the object after leaving the function should increase the
refcount of that object.
Objects returned from a function are owned by the caller. This means that the
caller should _free() or _unref() the object after usage.
ex:
peer = gst_pad_get_peer (pad); /* peer with increased refcount */
if (peer) {
.. use peer ..
gst_object_unref (GST_OBJECT (peer)); /* unref peer after usage */
}
Iterators
~~~~~~~~~
When retrieving multiple objects from an object an iterator should be used.
The iterator allows you to access the objects one after another while making
sure that the set of objects retrieved remains consistent.
Each object retrieved from an iterator has its refcount increased or is a
copy of the original. In any case the object should be unreffed or freed
after usage.
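A minimal sketch of such an iteration, assuming the GValue-based GstIterator
API and an element whose source pads we want to visit:
ex:
  GstIterator *it;
  GValue item = G_VALUE_INIT;
  gboolean done = FALSE;

  it = gst_element_iterate_src_pads (element);
  while (!done) {
    switch (gst_iterator_next (it, &item)) {
      case GST_ITERATOR_OK:
      {
        GstPad *pad = GST_PAD (g_value_get_object (&item));
        .. use pad ..
        g_value_reset (&item); /* releases the reference held in the GValue */
        break;
      }
      case GST_ITERATOR_RESYNC:
        gst_iterator_resync (it); /* concurrent update, restart iteration */
        break;
      case GST_ITERATOR_ERROR:
      case GST_ITERATOR_DONE:
        done = TRUE;
        break;
    }
  }
  g_value_unset (&item);
  gst_iterator_free (it);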

View file

@ -1,406 +0,0 @@
States
------
Both elements and pads can be in different states. The states of the pads are
linked to the state of the element so the design of the states is mainly
focused around the element states.
An element can be in 4 states. NULL, READY, PAUSED and PLAYING. When an element
is initially instantiated, it is in the NULL state.
State definitions
~~~~~~~~~~~~~~~~~
- NULL: This is the initial state of an element.
- READY: The element should be prepared to go to PAUSED.
- PAUSED: The element should be ready to accept and process data. Sink
elements however only accept one buffer and then block.
- PLAYING: The same as PAUSED except for live sources and sinks. Sinks accept
and render data. Live sources produce data.
We call the sequence NULL->PLAYING an upwards state change and PLAYING->NULL
a downwards state change.
State transitions
~~~~~~~~~~~~~~~~~
The following state changes are possible:
NULL -> READY
- The element must check if the resources it needs are available.
Device sinks and sources typically try to probe the device to constrain
their caps.
- The element opens the device, this is needed if the previous step requires
the device to be opened.
READY -> PAUSED
- The element pads are activated in order to receive data in PAUSED.
Streaming threads are started.
- Some elements might need to return ASYNC and complete the state change
when they have enough information. It is a requirement for sinks to
return ASYNC and complete the state change when they receive the first
buffer or EOS event (preroll). Sinks also block the dataflow when in PAUSED.
- A pipeline resets the running_time to 0.
- Live sources return NO_PREROLL and don't generate data.
PAUSED -> PLAYING
- Most elements ignore this state change.
- The pipeline selects a clock and distributes this to all the children
before setting them to PLAYING. This means that it is only allowed to
synchronize on the clock in the PLAYING state.
- The pipeline uses the clock and the running_time to calculate the base_time.
The base_time is distributed to all children when performing the state
change.
- Sink elements stop blocking on the preroll buffer or event and start
rendering the data.
- Sinks can post the EOS message in the PLAYING state. It is not allowed to
post EOS when not in the PLAYING state.
- While streaming in PAUSED or PLAYING, elements can create and remove
sometimes pads.
- Live sources start generating data and return SUCCESS.
PLAYING -> PAUSED
- Most elements ignore this state change.
- The pipeline calculates the running_time based on the last selected clock
and the base_time. It stores this information to continue playback when
going back to the PLAYING state.
- Sinks unblock any clock wait calls.
- When a sink does not have a pending buffer to play, it returns ASYNC from
this state change and completes the state change when it receives a new
buffer or an EOS event.
- Any queued EOS messages are removed since they will be reposted when going
back to the PLAYING state. The EOS messages are queued in GstBins.
- Live sources stop generating data and return NO_PREROLL.
PAUSED -> READY
- Sinks unblock any waits in the preroll.
- Elements unblock any waits on devices
- Chain or get_range functions return FLUSHING.
- The element pads are deactivated so that streaming becomes impossible and
all streaming threads are stopped.
- The sink forgets all negotiated formats
- Elements remove all sometimes pads
READY -> NULL
- Elements close devices
- Elements reset any internal state.
State variables
~~~~~~~~~~~~~~~
An element has 4 state variables that are protected with the object LOCK:
- STATE
- STATE_NEXT
- STATE_PENDING
- STATE_RETURN
The STATE always reflects the current state of the element.
The STATE_NEXT reflects the next state the element will go to.
The STATE_PENDING always reflects the required state of the element.
The STATE_RETURN reflects the last return value of a state change.
The STATE_NEXT and STATE_PENDING can be VOID_PENDING if the element is in
the right state.
An element has a special lock to protect against concurrent invocations of
_set_state(), called the STATE_LOCK.
Setting state on elements
~~~~~~~~~~~~~~~~~~~~~~~~~
The state of an element can be changed with _element_set_state(). When changing
the state of an element all intermediate states will also be set on the element
until the final desired state is set.
The _set_state() function can return 4 possible values:
GST_STATE_FAILURE: The state change failed for some reason. The plugin should
have posted an error message on the bus with information.
GST_STATE_SUCCESS: The state change is completed successfully.
GST_STATE_ASYNC: The state change will complete later on. This can happen
when the element needs a long time to perform the state
change or for sinks that need to receive the first buffer
before they can complete the state change (preroll).
GST_STATE_NO_PREROLL: The state change is completed successfully but the element
will not be able to produce data in the PAUSED state.
In the case of an ASYNC state change, it is possible to proceed to the next
state before the current state change completed; however, the element will only
get to this next state after completing the previous ASYNC state change.
After receiving an ASYNC return value, you can use _element_get_state() to poll
the status of the element. If the polling returns SUCCESS, the element completed
the state change to the last requested state with _set_state().
When setting the state of an element, the STATE_PENDING is set to the required
state. Then the state change function of the element is called and the result of
that function is used to update the STATE, STATE_NEXT, STATE_PENDING and
STATE_RETURN fields. If the function returned ASYNC, this result is immediately
returned to the caller.
Getting state of elements
~~~~~~~~~~~~~~~~~~~~~~~~~
The _get_state() function takes 3 arguments: two pointers that will hold the
current and pending state, and one GstClockTime that holds a timeout value. The
function returns a GstElementStateReturn (a short usage sketch follows the list
below).
- If the element returned SUCCESS to the previous _set_state() function, this
function will return the last state set on the element and VOID_PENDING in
the pending state value. The function returns GST_STATE_SUCCESS.
- If the element returned NO_PREROLL to the previous _set_state() function, this
function will return the last state set on the element and VOID_PENDING in
the pending state value. The function returns GST_STATE_NO_PREROLL.
- If the element returned FAILURE to the previous _set_state() call, this
function will return FAILURE with the state set to the current state of
the element and the pending state set to the value used in the last call
of _set_state().
- If the element returned ASYNC to the previous _set_state() call, this function
will wait for the element to complete its state change up to the amount of time
specified in the GstClockTime.
* If the element does not complete the state change in the specified amount of
time, this function will return ASYNC with the state set to the current state
and the pending state set to the pending state.
* If the element completes the state change within the specified timeout, this
function returns the updated state and VOID_PENDING as the pending state.
* If the element aborts the ASYNC state change due to an error within the
specified timeout, this function returns FAILURE with the state set to last
successful state and pending set to the last attempt. The element should
also post an error message on the bus with more information about the problem.
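A minimal sketch of this polling; note that it uses the full GST_STATE_CHANGE_*
names for the return values that this document abbreviates as SUCCESS/ASYNC/etc.:
ex:
  GstStateChangeReturn ret;
  GstState current, pending;

  ret = gst_element_set_state (element, GST_STATE_PAUSED);
  if (ret == GST_STATE_CHANGE_ASYNC) {
    /* wait up to one second for the ASYNC state change to complete */
    ret = gst_element_get_state (element, &current, &pending, GST_SECOND);
  }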
States in GstBin
~~~~~~~~~~~~~~~~
A GstBin manages the state of its children. It does this by propagating the state
changes performed on it to all of its children. The _set_state() function on a
bin will call the _set_state() function on all of its children that are
not already in the target state or already changing to the target state.
The children are iterated from the sink elements to the source elements. This makes
sure that when changing the state of an element, the downstream elements are in
the correct state to process the eventual buffers. In the case of a downwards
state change, the sink elements will shut down first which makes the upstream
elements shut down as well since the _push() function returns a GST_FLOW_FLUSHING
error.
If all the children return SUCCESS, the function returns SUCCESS as well.
If one of the children returns FAILURE, the function returns FAILURE as well. In
this state it is possible that some elements successfully changed state. The
application can check which elements have a changed state, which were in error
and which were not affected by iterating the elements and calling _get_state()
on the elements.
If after calling the state function on all children, one of the children returned
ASYNC, the function returns ASYNC as well.
If after calling the state function on all children, one of the children returned
NO_PREROLL, the function returns NO_PREROLL as well.
If both NO_PREROLL and ASYNC children are present, NO_PREROLL is returned.
The current state of the bin can be retrieved with _get_state().
If the bin is performing an ASYNC state change, it will automatically update its
current state fields when it receives state messages from the children.
Implementing states in elements
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
READY
^^^^^
upward state change
~~~~~~~~~~~~~~~~~~~
Upward state changes always return ASYNC, whether the STATE_PENDING state
is reached or not.
Element:
A -> B => SUCCESS
- commit state
A -> B => ASYNC
- no commit state
- element commits state ASYNC
A -> B while ASYNC
- update STATE_PENDING state
- no commit state
- no change_state called on element
Bin:
A->B: all elements SUCCESS
- commit state
A->B: some elements ASYNC
- no commit state
- listen for commit messages on bus
- for each commit message, poll elements, this happens in another
thread.
- if no ASYNC elements, commit state, continue state change
to STATE_PENDING
downward state change
~~~~~~~~~~~~~~~~~~~~~
Downward state changes only return ASYNC if the change to the final state
returns ASYNC. This makes sure that there is no need to wait for an element to
complete its preroll or other ASYNC state changes when one only wants to
shut down an element.
Element:
A -> B => SUCCESS
- commit state
A -> B => ASYNC not final state
- commit state on behalf of element
A -> B => ASYNC final state
- element will commit ASYNC
Bin:
A -> B -> SUCCESS
- commit state
A -> B -> ASYNC not final state
- commit state on behalf of element, continue state change
A -> B => ASYNC final state
- no commit state
- listen for commit messages on bus
- for each commit message, poll elements
- if no ASYNC elements, commit state
Locking overview (element)
~~~~~~~~~~~~~~~~~~~~~~~~~~
* Element committing SUCCESS
- STATE_LOCK is taken in set_state
- change state is called if SUCCESS, commit state is called
- commit state calls change_state to next state change.
- if final state is reached, stack unwinds and result is returned to
set_state and caller.
set_state(element) change_state (element) commit_state
| | |
| | |
STATE_LOCK | |
| | |
|------------------------>| |
| | |
| | |
| | (do state change) |
| | |
| | |
| | if SUCCESS |
| |---------------------->|
| | | post message
| | |
| |<----------------------| if (!final) change_state (next)
| | | else SIGNAL
| | |
| | |
| | |
|<------------------------| |
| SUCCESS
|
STATE_UNLOCK
|
SUCCESS
* Element committing ASYNC
- STATE_LOCK is taken in set_state
- change state is called and returns ASYNC
- ASYNC returned to the caller.
- element takes LOCK in streaming thread.
- element calls commit_state in streaming thread.
- commit state calls change_state to next state change.
set_state(element) change_state (element) stream_thread commit_state (element)
| | | |
| | | |
STATE_LOCK | | |
| | | |
|------------------------>| | |
| | | |
| | | |
| | (start_task) | |
| | | |
| | STREAM_LOCK |
| | |... |
|<------------------------| | |
| ASYNC STREAM_UNLOCK |
STATE_UNLOCK | |
| .....sync........ STATE_LOCK |
ASYNC |----------------->|
| |
| |---> post_message()
| |---> if (!final) change_state (next)
| | else SIGNAL
|<-----------------|
STATE_UNLOCK
|
STREAM_LOCK
| ...
STREAM_UNLOCK
Remarks
~~~~~~~
set_state cannot be called from multiple threads at the same time. The STATE_LOCK
prevents this.
State variables are protected with the LOCK.
Calling set_state while get_state is in progress should unblock the get_state
with an error. The cookie will do that.
set_state(element)
STATE_LOCK
LOCK
update current, next, pending state
cookie++
UNLOCK
change_state
STATE_UNLOCK

View file

@ -1,611 +0,0 @@
Stream selection
----------------
History
v0.1: Jun 11th 2015
Initial Draft
v0.2: Sep 18th 2015
Update to reflect design changes
v1.0: Jun 28th 2016
Pre-commit revision
This document describes the events and objects involved in stream
selection in GStreamer pipelines, elements and applications
0. Background
----------------
This new API is intended to address the use cases described in
this section:
1) As a user/app I want an overview and control of the media streams
that can be configured within a pipeline for processing, even
when some streams are mutually exclusive or logical constructs only.
2) The user/app can entirely disable streams it is not interested
in, so that they don't occupy memory or processing power and are
discarded as early as possible in the pipeline. The user/app can
also (re-)enable them at a later time.
3) If the set of possible stream configurations is changing,
the user/app should be aware of the pending change and
be able to make configuration choices for the new set of streams,
as well as possibly still reconfiguring the old set
4) Elements that have some other internal mechanism for triggering
stream selections (DVD, or maybe some scripted playback
playlist) should be able to trigger 'selection' of some particular
stream.
5) Indicate known relationships between streams - for example that
2 separate video feeds represent the 2 views of a stereoscopic
view, or that certain streams are mutually exclusive.
Note: the streams that are "available" are not automatically
the ones active, or present in the pipeline as pads. Think HLS/DASH
alternate streams.
Use case examples:
1) Playing an MPEG-TS multi-program stream, we want to tell the
app that there are multiple programs that could be extracted
from the incoming feed. Further, we want to provide a mechanism
for the app to select which program(s) to decode, and once
that is known to further tell the app which elementary streams
are then available within those program(s) so the app/user can
choose which audio track(s) to decode and/or use.
2) A new PMT arrives for an MPEG-TS stream, due to a codec or
channel change. The pipeline will need to reconfigure to
play the desired streams from new program. Equally, there
may be multiple seconds of content buffered from the old
program and it should still be possible to switch (for example)
subtitle tracks responsively in the draining out data, as
well as selecting which subs track to play from the new feed.
This same scenario applies when doing gapless transition to a
new source file/URL, except that likely the element providing
the list of streams also changes as a new demuxer is installed.
3) When playing a multi-angle DVD, the DVD Virtual Machine needs to
extract 1 angle from the data for presentation. It can publish
the available angles as logical streams, even though only one
stream can be chosen.
4) When playing a DVD, the user can make stream selections from the
DVD menu to choose audio or sub-picture tracks, or the DVD VM
can trigger automatic selections. In addition, the player UI
should be able to show which audio/subtitle tracks are available
and allow direct selection in a GUI the same as for normal
files with subtitle tracks in them.
5) Playing a SCHC (3DTV) feed, where one view is MPEG-2 and the other
is H.264 and they should be combined for 3D presentation, or
not bother decoding 1 stream if displaying 2D.
(bug https://bugzilla.gnome.org/show_bug.cgi?id=719333)
*) FIXME - need some use cases indicating what alternate streams in
HLS might require - what are the possibilities?
1. Design Overview
-----------
Stream selection in GStreamer is implemented in several parts:
1) Objects describing streams : GstStream
2) Objects describing a collection of streams : GstStreamCollection
3) Events from the app allowing selection and activation of some streams:
GST_EVENT_SELECT_STREAMS
4) Messages informing the user/application about the available
streams and current status:
GST_MESSAGE_STREAM_COLLECTION
GST_MESSAGE_STREAMS_SELECTED
2. GstStream objects
--------------------
API: GstStream
API: gst_stream_new(..)
API: gst_stream_get_*(...)
API: gst_stream_set_*()
API: gst_event_set_stream(...)
API: gst_event_parse_stream(...)
A GstStream object is a high-level convenience object containing
information regarding a possible data stream that can be exposed by
GStreamer elements.
It is mostly an aggregation of information present in other
GStreamer components (the STREAM_START, CAPS and TAGS events) but is not
tied to the presence of a GstPad, and for some use-cases provides
information that the existing components don't provide.
The various properties of a GstStream object are:
- stream_id (from the STREAM_START event)
- flags (from the STREAM_START event)
- caps
- tags
- type (high-level type of stream: Audio, Video, Container,...)
GstStream objects can be subclassed so that they can be re-used by
elements already using the notion of stream (which is common for
example in demuxers).
Elements that create GstStream should also set it on the
GST_EVENT_STREAM_START event of the relevant pad. This helps
downstream elements to have all information in one location.
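A minimal sketch of the above for a demuxer audio pad (caps, tags and srcpad
are placeholders for the element's own objects):
ex:
  GstStream *stream;
  GstEvent *event;

  stream = gst_stream_new ("audio-french", caps,
      GST_STREAM_TYPE_AUDIO, GST_STREAM_FLAG_NONE);
  gst_stream_set_tags (stream, tags);

  /* attach the GstStream to the STREAM_START event of the pad */
  event = gst_event_new_stream_start ("audio-french");
  gst_event_set_stream (event, stream);
  gst_pad_push_event (srcpad, event);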
3. Exposing collections of streams
----------------------------------
API: GstStreamCollection
API: gst_stream_collection_new(...)
API: gst_stream_collection_add_stream(...)
API: gst_stream_collection_get_size(...)
API: gst_stream_collection_get_stream(...)
API: GST_MESSAGE_STREAM_COLLECTION
API: gst_message_new_stream_collection(...)
API: gst_message_parse_stream_collection(...)
API: GST_EVENT_STREAM_COLLECTION
API: gst_event_new_stream_collection(...)
API: gst_event_parse_stream_collection(...)
Elements that create new streams (such as demuxers) or can create
new streams (like the HLS/DASH alternative streams) can list the
streams they can make available with the GstStreamCollection object.
Other elements that might generate GstStreamCollections are the
DVD-VM, which handles internal switching of tracks, or parsebin and
decodebin3 when it aggregates and presents multiple internal stream
sources as a single configurable collection.
The GstStreamCollection object is a flat listing of GstStream objects.
The various properties of a GstStreamCollection are:
- 'identifier'
- the identifier of the collection (unique name)
- Generated from the 'upstream stream id' (or stream ids, plural)
- the list of GstStreams in the collection.
- (Not implemented) : Flags -
For now, the only flag is 'INFORMATIONAL' - used by container parsers to
publish information about detected streams without allowing selection of
the streams.
- (Not implemented yet) : The relationship between the various streams
This specifies which streams are exclusive (can not be selected at the
same time), are related (such as LINKED_VIEW or ENHANCEMENT), or need to
be selected together.
An element will inform outside components about that collection via:
* a GST_MESSAGE_STREAM_COLLECTION message on the bus.
* a GST_EVENT_STREAM_COLLECTION on each source pads.
Applications and container bin elements can listen and collect the
various stream collections to know the full range of streams
available within a bin/pipeline.
Once posted on the bus, a GstStreamCollection is immutable. It is
updated by subsequent messages with a matching identifier.
If the element that provided the collection goes away, there is no way
to know that the streams are no longer valid (without having the
user/app track that element). The exception to that is if the bin
containing that element (such as parsebin or decodebin3) informs that
the next collection is a replacement of the former one.
The mutual exclusion and relationship lists use stream-ids
rather than GstStream references in order to avoid circular
referencing problems.
3.1 Usage from elements
-----------------------
When a demuxer knows the list of streams it can expose, it
creates a new GstStream for each stream it can provide with the
appropriate information (stream id, flag, tags, caps, ...).
The demuxer then creates a GstStreamCollection object in which it
will put the list of GstStream it can expose. That collection is
then both posted on the bus (via a GST_MESSAGE_COLLECTION) and on
each pad (via a GST_EVENT_STREAM_COLLECTION).
That new collection must be posted on the bus *before* the changes
are made available, i.e. before pads corresponding to that selection
are added/removed.
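A minimal sketch of this sequence (demux, caps and srcpad are placeholders for
the element's own objects):
ex:
  GstStreamCollection *collection;
  GstStream *stream;

  collection = gst_stream_collection_new ("upstream-id");
  stream = gst_stream_new ("audio-french", caps,
      GST_STREAM_TYPE_AUDIO, GST_STREAM_FLAG_NONE);
  gst_stream_collection_add_stream (collection, stream);

  /* post on the bus before the corresponding pads are added ... */
  gst_element_post_message (GST_ELEMENT (demux),
      gst_message_new_stream_collection (GST_OBJECT (demux), collection));
  /* ... and send as an event on each source pad */
  gst_pad_push_event (srcpad,
      gst_event_new_stream_collection (collection));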
In order to be backwards-compatible and support elements that don't
create streams/collection yet, the new 'parsebin' element used by
decodebin3 will automatically create those if not provided.
3.2 Usage from application
--------------------------
Applications can know what streams are available by listening to the
GST_MESSAGE_STREAM_COLLECTION messages posted on the bus.
The application can list the available streams per-type (such as all
the audio streams, or all the video streams) by iterating the
streams available in the collection by GST_STREAM_TYPE.
The application will also be able to use this stream information to
decide which streams should be activated or not (see the stream
selection event below).
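A minimal sketch of a bus callback listing all audio streams of a posted
collection:
ex:
  case GST_MESSAGE_STREAM_COLLECTION:
  {
    GstStreamCollection *collection = NULL;
    guint i;

    gst_message_parse_stream_collection (message, &collection);
    for (i = 0; i < gst_stream_collection_get_size (collection); i++) {
      GstStream *stream = gst_stream_collection_get_stream (collection, i);

      if (gst_stream_get_stream_type (stream) & GST_STREAM_TYPE_AUDIO)
        g_print ("audio stream: %s\n", gst_stream_get_stream_id (stream));
    }
    gst_object_unref (collection);
    break;
  }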
3.3 Backwards compatibility
---------------------------
Not all demuxers will create the various GstStream and
GstStreamCollection objects. In order to remain backwards
compatible, a parent bin (parsebin in decodebin3) will create the
GstStream and GstStreamCollection based on the pads being
added/removed from an element.
This allows providing stream listing/selection for any demuxer-like
element even if it doesn't implement the GstStreamCollection usage.
4. Stream selection event
-------------------------
API: GST_EVENT_SELECT_STREAMS
API: gst_event_new_select_streams(...)
API: gst_event_parse_select_streams(...)
Stream selection events are generated by the application and
sent into the pipeline to configure the streams.
The event carries:
* List of GstStreams to activate - a subset of the GstStreamCollection
* (Not implemented) - List of GstStreams to be kept discarded - a
subset of streams for which hot-swapping will not be desired,
allowing elements (such as decodebin3, demuxers, ...) to not parse or
buffer those streams at all.
4.1. Usage from application
---------------------------
There are two use-cases where an application needs to specify in a
generic fashion which streams it wants in output:
1) When there are several streams present of which it only wants a
subset (such as one audio, one video and one subtitle
stream). Those streams are demuxed and present in the pipeline.
2) When the streams the user wants require some element to undertake
some action to expose them in the pipeline (such as
DASH/HLS alternative streams).
From the point of view of the application, those two use-cases are
treated identically. The streams are all available through the
GstStreamCollection posted on the bus, and it will select a subset.
The application can select the streams it wants by creating a
GST_EVENT_SELECT_STREAMS event with the list of stream-id of the
streams it wants. That event is then sent on the pipeline,
eventually traveling all the way upstream from each sink.
In some cases, selecting one stream may trigger the availability of
other dependent streams, resulting in new GstStreamCollection
messages. This can happen in the case where choosing a different DVB
channel would create a new single-program collection.
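A minimal sketch, reusing the example stream-ids from section 5.3 below:
ex:
  GList *streams = NULL;

  streams = g_list_append (streams, (gchar *) "video-main");
  streams = g_list_append (streams, (gchar *) "audio-french");
  /* the event takes a copy of the list of stream-ids */
  gst_element_send_event (pipeline,
      gst_event_new_select_streams (streams));
  g_list_free (streams);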
4.2. Usage in elements
----------------------
Elements that receive the GST_EVENT_SELECT_STREAMS event and that
can activate/deactivate streams need to look at the list of
stream-ids contained in the event and decide if they need to take some
action.
In the standard demuxer case (demuxing and exposing all streams),
there is nothing to do by default.
In decodebin3, activating or deactivating streams is taken care of by
linking only the streams present in the event to decoders and output
ghostpad.
In the case of elements that can expose alternate streams that are
not present in the pipeline as pads, they will take the appropriate
action to add/remove those streams.
Containers that receive the event should pass it to any elements
with no downstream peers, so that streams can be configured during
pre-roll before a pipeline is completely linked down to sinks.
5. decodebin3 usage and example
-------------------------------
This is an example of how decodebin3 works by using the
above-mentioned objects/events/messages.
For clarity/completeness, we will consider an mpeg-ts stream that has
multiple audio streams. Furthermore, that stream might change
at some point (switching video codec, or adding/removing audio
streams).
5.1. Initial differences
------------------------
decodebin3 is different, compared to decodebin2, in the sense that, by
default:
* it will only expose one output ghost source pad per stream
type (one audio, one video, ...).
* it will only decode the exposed streams.
The multiqueue element is still used and takes in all elementary
(non-decoded) streams. If parsers are needed/present they are placed
before the multiqueue. This is needed in order for multiqueue to
work only with packetized and properly timestamped streams.
Note that the whole typefinding of streams, and optional depayloading,
demuxing and parsing are done in a new 'parsebin' element.
Just like the current implementation, demuxers will expose all
streams present within a program as source pads. They will connect
to parsers and multiqueue.
Initial setup. 1 video stream, 2 audio streams.
+---------------------+
| parsebin |
| --------- | +-------------+
| | demux |--[parser]-+-| multiqueue |--[videodec]---[
]-+-| |--[parser]-+-| |
| | |--[parser]-+-| |--[audiodec]---[
| --------- | +-------------+
+---------------------+
5.2. GstStreamCollection
------------------------
When parsing the initial PAT/PMT, the demuxer will:
1) create the various GstStream objects for each stream.
2) create the GstStreamCollection for that initial PMT
3) post the GST_MESSAGE_STREAM_COLLECTION
Decodebin will intercept that message and know what the demuxer will
be exposing.
4) The demuxer creates the various pads and sends the corresponding
STREAM_START event (with the same stream-id as the corresponding
GstStream objects), CAPS event, and TAGS event.
parsebin will add all relevant parsers and expose those streams.
Decodebin will be able to correlate, based on STREAM_START event
stream-id, what pad corresponds to which stream. It links each stream
from parsebin to multiqueue.
Decodebin knows all the streams that will be available. Since by
default it is configured to only expose a stream of each type, it
will pick one stream of each type for which it will complete the
auto-plugging (finding a decoder and then exposing that stream as a
source ghostpad).
Note:
If the demuxer doesn't create/post the GstStreamCollection,
parsebin will create it by itself, as explained in section 3.3
above.
5.3. Changing the active selection from the application
-------------------------------------------------------
The user wants to change the audio track. The application received
the GST_MESSAGE_STREAM_COLLECTION containing the list of available
streams. For clarity, we will assume those stream-ids are
"video-main", "audio-english" and "audio-french".
The user prefers to use the french soundtrack (which it knows based
on the language tag contained in the GstStream objects).
The application will create and send a GST_EVENT_SELECT_STREAMS event
containing the list of streams: "video-main", "audio-french".
That event gets sent on the pipeline, the sinks send it upstream and
eventually reach decodebin.
Decodebin compares:
* The currently active selection ("video-main", "audio-english")
* The available stream collection ("video-main", "audio-english",
"audio-french")
* The list of streams in the event ("video-main", "audio-french")
Decodebin determines that no change is required for "video-main",
but sees that it needs to deactivate "audio-english" and activate
"audio-french".
It unlinks the multiqueue source pad connected to the audiodec. Then
it queries audiodec, using GST_QUERY_ACCEPT_CAPS, whether it can
accept the caps of the "audio-french" stream as-is.
1) If it does, the multiqueue source pad corresponding to
"audio-french" is linked to the decoder.
2) If it does not, the existing audio decoder is removed,
a new decoder is selected (like during initial
auto-plugging), and replaces the old audio decoder element.
The newly selected stream gets decoded and output through the same
pad as the previous audio stream.
Note:
The default behaviour would be to only expose one stream of each
type. But nothing prevents decodebin from outputting more/fewer of
each type if the GST_EVENT_SELECT_STREAMS event specifies that. This
allows covering more use-cases than the simple playback one.
Such examples could be :
* Wanting just a video stream or just an audio stream
* Wanting all decoded streams
* Wanting all audio streams
...
5.4. Changes coming from upstream
---------------------------------
At some point in time, a PMT change happens. Let's assume a change
in video-codec and/or PID.
The demuxer creates a new GstStream for the changed/new stream,
creates a new GstStreamCollection for the updated PMT and posts it.
Decodebin sees the new GstStreamCollection message.
The demuxer (and parsebin) then adds and removes pads.
1) decodebin will match the new pads to GstStreams in the "new"
GstStreamCollection the same way it did for the initial pads in
section 5.2 above.
2) decodebin will see whether the new stream can re-use a multiqueue
slot used by a stream of the same type no longer present (it
compares the old collection to the new collection).
In this case, decodebin sees that the new video stream can re-use
the same slot as the previous video stream.
3) If the new stream is going to be active by default (in this case
it will be, because we are replacing the only video stream, which was
active), it will check whether the caps are compatible with the
existing videodec (in the same way it was done for the audio
decoder switch in section 5.3).
Eventually, the stream that switched will be decoded and output
through the same pad as the previous video stream in a gapless fashion.
5.5. Further examples
---------------------
5.5.1. HLS alternates
---------------------
There is a main (multi-bitrate or not) stream with audio and
video interleaved in mpeg-ts. The manifest also indicates the
presence of alternate language audio-only streams.
HLS would expose one collection containing:
1) The main A+V CONTAINER stream (mpeg-ts), initially active,
downloaded and exposed as a pad
2) The alternate A-only streams, initially inactive and not
exposed as pads
the tsdemux element connected to the first stream will also
expose a collection containing
1.1) A video stream
1.2) An audio stream
[ Collection 1 ] [ Collection 2 ]
[ (hlsdemux) ] [ (tsdemux) ]
[ upstream:nil ] /----[ upstream:main]
[ ] / [ ]
[ "main" (A+V) ]<-/ [ "video" (V) ] viddec1 : "video"
[ "fre" (A) ] [ "eng" (A) ] auddec1 : "eng"
[ "kor" (A) ] [ ]
The user might want to use the Korean audio track instead of the
default English one.
=> SELECT_STREAMS ("video", "kor")
1) decodebin3 receives and sends the event further upstream
2) tsdemux sees that "video" is part of its current upstream,
so adds the corresponding stream-id ("main") to the event
and sends it upstream ("main", "video", "kor")
3) hlsdemux receives the event
=> It activates "kor" in addition to "main"
4) The event travels back to decodebin3 which will remember the
requested selection. If "kor" is already present it will switch
the "eng" stream from the audio decoder to the "kor" stream.
If it appears a bit later, it will wait until that "kor" stream
is available before switching
5.5.2 multi-program MPEG-TS
---------------------------
Assuming the case of a mpeg-ts stream which contains multiple
programs.
There would be three "levels" of collection:
1) The collection of programs present in the stream
2) The collection of elementary streams present in a program
3) The collection of streams decodebin can expose
Initially tsdemux exposes the first program present (default)
[ Collection 1 ] [ Collection 2 ] [ Collection 3 ]
[ (tsdemux) ] [ (tsdemux) ] [ (decodebin) ]
[ id:Programs ]<-\ [ id:BBC1 ]<-\ [ id:BBC1-decoded ]
[ upstream:nil ] \-----[ upstream:Programs] \----[ upstream:BBC1 ]
[ ] [ ] [ ]
[ "BBC1" (C) ] [ id:"bbcvideo"(V) ] [ id:"bbcvideo"(V)]
[ "ITV" (C) ] [ id:"bbcaudio"(A) ] [ id:"bbcaudio"(A)]
[ "NBC" (C) ] [ ] [ ]
At some point the user wants to switch to ITV (of which we do not
know the topology at this point in time). A SELECT_STREAMS event
is sent with "ITV" in it and a pointer to Collection 1.
1) The event travels up the pipeline until tsdemux receives it
and begins the switch.
2) tsdemux publishes a new 'Collection 2a/ITV' and marks 'Collection 2/BBC'
as replaced.
2a) App may send a SELECT_STREAMS event configuring which demuxer output
streams should be selected (parsed)
3) tsdemux adds/removes pads as needed (flushing pads as it removes them?)
4) Decodebin feeds new pad streams through existing parsers/decoders as
needed. As data from the new collection arrives out each decoder,
decodebin sends new GstStreamCollection messages to the app so it
can know that the new streams are now switchable at that level.
4a) As new GstStreamCollections are published, the app may override
the default decodebin stream selection to expose more/fewer streams.
The default is to decode and output 1 stream of each type.
Final state:
[ Collection 1 ] [ Collection 4 ] [ Collection 5 ]
[ (tsdemux) ] [ (tsdemux) ] [ (decodebin) ]
[ id:Programs ]<-\ [ id:ITV ]<-\ [ id:ITV-decoded ]
[ upstream:nil ] \-----[ upstream:Programs] \----[ upstream:ITV ]
[ ] [ ] [ ]
[ "BBC1" (C) ] [ id:"itvvideo"(V) ] [ id:"itvvideo"(V)]
[ "ITV" (C) ] [ id:"itvaudio"(A) ] [ id:"itvaudio"(A)]
[ "NBC" (C) ] [ ] [ ]
6.0 TODO
--------
* Add missing implementation
- Add flags to GstStreamCollection
- Add mutual-exclusion and relationship API to GstStreamCollection
* Add helper API to figure out whether a collection is a replacement of another
or a completely new one. This will require a more generic system to know whether
a certain stream-id is a replacement of another or not.
7.0 OPEN QUESTIONS
------------------
* Is a FLUSHING flag for stream-selection required or not ?
This would make the handler of the SELECT_STREAMS event send FLUSH START/STOP
before switching to the other streams.
This is tricky when dealing with situations where we keep some streams and
only switch some others. Do we flush all streams? Do we only flush the new
streams, potentially resulting in a delay to fully switch?
Furthermore, due to efficient buffering in decodebin3, the switching time has
been minimized extensively, to the point where flushing might not bring a
noticeable improvement.
* Store the stream collection in bins/pipelines ?
A Bin/Pipeline could store all active collection internally, so that it
could be queried later on. This could be useful to then get, on any pipeline,
at any point in time, the full list of collections available without having
to listen to all COLLECTION messages on the bus.
This would require fixing the "is a collection a replacement or not" issue first.
* When switching to new collections, should decodebin3 make any effort
to 'map' corresponding streams from the old to new PMT - that is,
try and stick to the 'english' language audio track, for example?
Alternatively, should it rely on the app to do such smarts with stream-select
messages?

View file

@ -1,108 +0,0 @@
Stream Status
-------------
This document describes the design and use cases for the stream status
messages.
STREAM_STATUS messages are posted on the bus when the state of a streaming
thread changes. The purpose of this message is to allow the application to
interact with the streaming thread properties, such as the thread priority or
the threadpool to use.
We accommodate the following requirements:
- Application is informed when a streaming thread is about to be created. It
should be possible for the application to suggest a custom GstTaskPool.
- Application is informed when the status of a streaming thread is changed.
This can be interesting for GUI applications that want to visualize the status
of the streaming threads (playing/paused/stopped).
- Application is informed when a streaming thread is destroyed.
We allow for the following scenarios:
- Elements require a specific (internal) streaming thread to operate or the
application can create/specify a thread for the element.
- Elements allow the application to configure a priority on the threads.
Use cases
~~~~~~~~~
* boost the priority of the udp receiver streaming thread
.--------. .-------. .------. .-------.
| udpsrc | | depay | | adec | | asink |
| src->sink src->sink src->sink |
'--------' '-------' '------' '-------'
- when going from READY to PAUSED state, udpsrc will require a streaming
thread for pushing data into the depayloader. It will post a STREAM_STATUS
message indicating its requirement for a streaming thread.
- The application will usually react to the STREAM_STATUS messages with a sync
bus handler.
- The application can configure the GstTask with a custom GstTaskPool to
manage the streaming thread or it can ignore the message which will make
the element use its default GstTaskPool.
- The application can react to the ENTER/LEAVE stream status messages to
configure the thread right before it is started/stopped. This can be used to
configure the thread priority (see the sketch after this list).
- Before the GstTask is changed state (start/pause/stop) a STREAM_STATUS
message is posted that can be used by the application to keep track of
the running streaming threads.
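A minimal sketch of such a sync bus handler, raising the thread priority from
the ENTER notification (the setpriority() call from <sys/resource.h> is only
one possible mechanism):
ex:
  static GstBusSyncReply
  sync_bus_handler (GstBus * bus, GstMessage * message, gpointer user_data)
  {
    if (GST_MESSAGE_TYPE (message) == GST_MESSAGE_STREAM_STATUS) {
      GstStreamStatusType type;
      GstElement *owner;

      gst_message_parse_stream_status (message, &type, &owner);
      /* ENTER is posted from the streaming thread itself, so we can
       * configure the current thread here */
      if (type == GST_STREAM_STATUS_TYPE_ENTER)
        setpriority (PRIO_PROCESS, 0, -10);
    }
    return GST_BUS_PASS;
  }

  gst_bus_set_sync_handler (bus, sync_bus_handler, NULL, NULL);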
Messages
~~~~~~~~
The existing STREAM_STATUS message will be further defined and implemented in
(selected) elements. The following fields will be contained in the message:
- "type", GST_TYPE_STREAM_STATUS_TYPE
- a set of types to control the lifecycle of the thread:
GST_STREAM_STATUS_TYPE_CREATE: a new streaming thread is going to be
created. The application has the chance to configure a custom thread.
GST_STREAM_STATUS_TYPE_ENTER: the streaming thread is about to enter its
loop function for the first time.
GST_STREAM_STATUS_TYPE_LEAVE: the streaming thread is about to leave its
loop.
GST_STREAM_STATUS_TYPE_DESTROY: a streaming thread is destroyed
- A set of types to control the state of the threads:
GST_STREAM_STATUS_TYPE_START: a streaming thread is started
GST_STREAM_STATUS_TYPE_PAUSE: a streaming thread is paused
GST_STREAM_STATUS_TYPE_STOP: a streaming thread is stopped
- "owner", GST_TYPE_ELEMENT
The owner element of the thread. The message source will contain the pad
(or one of the pads) that this thread produces data on. If this thread
does not produce data on a pad, the message source will contain the owner
as well. The idea is that the application should be able to see from the
element/pad what function this thread has in the context of the
application and configure the thread appropriately.
- "object", G_TYPE, GstTask/GThread
A GstTask/GThread controlling this streaming thread.
- "flow-return", GstFlowReturn
A status code for why the thread state changed. When threads are created
and started, this is usually GST_FLOW_OK, but when they are stopping it
contains the reason code for why they stopped.
- "reason", G_TYPE_STRING
A string describing the reason why the thread started/stopped/paused.
Can be NULL if no reason is given.
Events
~~~~~~

View file

@ -1,82 +0,0 @@
Streams
-------
This document describes the objects that are passed from element to
element in the streaming thread.
Stream objects
~~~~~~~~~~~~~~
The following objects are to be expected in the streaming thread:
- events
- STREAM_START (START)
- SEGMENT (SEGMENT)
- EOS * (EOS)
- TAG (T)
- buffers * (B)
Objects marked with * need to be synchronised to the clock in sinks
and live sources.
Typical stream
~~~~~~~~~~~~~~
A typical stream starts with a stream start event that marks the
start of the stream, followed by a segment event that marks the
buffer timestamp range. After that buffers are sent one after the
other. After the last buffer an EOS marks the end of the stream. No
more buffers are to be processed after the EOS event.
+-----+-------+ +-++-+ +-+ +---+
|START|SEGMENT| |B||B| ... |B| |EOS|
+-----+-------+ +-++-+ +-+ +---+
1) STREAM_START
- marks the start of a stream; unlike the SEGMENT event, there
will be no STREAM_START event after flushing seeks.
2) SEGMENT, rate, start/stop, time
- marks valid buffer timestamp range (start, stop)
- marks stream_time of buffers (time). This is the stream time of buffers
with a timestamp of S.start.
- marks playback rate (rate). This is the required playback rate.
- marks applied rate (applied_rate). This is the already applied playback
rate. (See also part-trickmodes.txt)
- marks running_time of buffers. This is the time used to synchronize
against the clock.
3) N buffers
- displayable buffers are between start/stop of the SEGMENT (S). Buffers
outside the segment range should be dropped or clipped.
- running_time:
if (S.rate > 0.0)
running_time = (B.timestamp - S.start) / ABS (S.rate) + S.base
else
running_time = (S.stop - B.timestamp) / ABS (S.rate) + S.base
* a monotonically increasing value that can be used to synchronize
against the clock (See also part-synchronisation.txt).
- stream_time:
stream_time = (B.timestamp - S.start) * ABS (S.applied_rate) + S.time
* current position in stream between 0 and duration (a sketch using the
GstSegment helpers follows this list).
4) EOS
- marks the end of data, nothing is to be expected after EOS, elements
should refuse more data and return GST_FLOW_EOS. A FLUSH_STOP
event clears the EOS state of an element.
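In practice these calculations are rarely done by hand; a minimal sketch using
the GstSegment helpers (assuming a time-formatted segment):
ex:
  GstClockTime running_time, stream_time;

  running_time = gst_segment_to_running_time (&segment,
      GST_FORMAT_TIME, GST_BUFFER_PTS (buffer));
  stream_time = gst_segment_to_stream_time (&segment,
      GST_FORMAT_TIME, GST_BUFFER_PTS (buffer));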
Elements
~~~~~~~~
These events are generated typically either by the GstBaseSrc class for
sources operating in push mode, or by a parser/demuxer operating in pull-mode
and pushing parsed/demuxed data downstream.

View file

@ -1,237 +0,0 @@
Synchronisation
---------------
This document outlines the techniques used for doing synchronised playback of
multiple streams.
Synchronisation in a GstPipeline is achieved using the following 3 components:
- a GstClock, which is global for all elements in a GstPipeline.
- Timestamps on a GstBuffer.
- the SEGMENT event preceding the buffers.
A GstClock
~~~~~~~~~~
This object provides a counter that represents the current time in nanoseconds.
This value is called the absolute_time.
Different sources exist for this counter:
- the system time (with g_get_current_time() and with microsecond accuracy)
- monotonic time (with g_get_monotonic_time () with microsecond accuracy)
- an audio device (based on number of samples played)
- a network source based on packets received + timestamps in those packets (a
typical example is an RTP source)
- ...
In GStreamer any element can provide a GstClock object that can be used in the
pipeline. The GstPipeline object will select a clock from all the providers and
will distribute it to all other elements (see part-gstpipeline.txt).
A GstClock always counts time upwards and does not necessarily start at 0.
While it is possible, it is not recommended to create a clock derived from the
contents of a stream (for example, create a clock from the PCR in an mpeg-ts
stream).
Running time
~~~~~~~~~~~~
After a pipeline has selected a clock, it will maintain the running_time based on the
selected clock. This running_time represents the total time spent in the PLAYING
state and is calculated as follows:
- If the pipeline is NULL/READY, the running_time is undefined.
- In PAUSED, the running_time remains at the time when it was last
PAUSED. When the stream is PAUSED for the first time, the running_time
is 0.
- In PLAYING, the running_time is the delta between the absolute_time
and the base time. The base time is defined as the absolute_time minus
the running_time at the time when the pipeline is set to PLAYING.
- after a flushing seek, the running_time is set to 0 (see part-seeking.txt).
This is accomplished by redistributing a new base_time to the elements that
got flushed.
This algorithm captures the running_time when the pipeline is set from PLAYING
to PAUSED and restores this time based on the current absolute_time when going
back to PLAYING. This allows for both clocks that progress when in the PAUSED
state (systemclock) and clocks that don't (audioclock).
The clock and pipeline now provide a running_time to all elements that want to
perform synchronisation. Indeed, the running time can be observed in each
element (during the PLAYING state) as:
C.running_time = absolute_time - base_time
We note C.running_time as the running_time obtained by looking at the clock.
This value is monotonically increasing at the rate of the clock.
Timestamps
~~~~~~~~~~
The GstBuffer timestamps and the preceding SEGMENT event (See
part-streams.txt) define a transformation of the buffer timestamps to
running_time as follows:
The following notation is used:
B: GstBuffer
- B.timestamp = buffer timestamp (GST_BUFFER_PTS or GST_BUFFER_DTS)
S: SEGMENT event preceding the buffers.
- S.start: start field in the SEGMENT event. This is the lowest allowed
timestamp.
- S.stop: stop field in the SEGMENT event. This is the highest allowed
timestamp.
- S.rate: rate field of SEGMENT event. This is the playback rate.
- S.base: the base time for the segment. This is the total elapsed running_time of any
previous segments.
- S.offset: an offset to apply to S.start or S.stop. This is the amount that
has already been elapsed in the segment.
Valid buffers for synchronisation are those with B.timestamp between S.start
and S.stop (after applying the S.offset). All other buffers outside this range
should be dropped or clipped to these boundaries (see also part-segments.txt).
The following transformation to running_time exist:
if (S.rate > 0.0)
B.running_time = (B.timestamp - (S.start + S.offset)) / ABS (S.rate) + S.base
=>
B.timestamp = (B.running_time - S.base) * ABS (S.rate) + S.start + S.offset
else
B.running_time = ((S.stop - S.offset) - B.timestamp) / ABS (S.rate) + S.base
=>
B.timestamp = S.stop - S.offset - ((B.running_time - S.base) * ABS (S.rate))
We write B.running_time as the running_time obtained from the SEGMENT event
and the buffers of that segment.
The first displayable buffer will yield a value of 0 (since B.timestamp ==
S.start, S.offset == 0 and S.base == 0).
For S.rate > 1.0, the timestamps will be scaled down to increase the playback
rate. Likewise, a rate between 0.0 and 1.0 will slow down playback.
For negative rates, timestamps are received from S.stop down to S.start so that
the first buffer received will be transformed into a B.running_time of 0
(B.timestamp == S.stop and S.base == 0).
This makes it so that B.running_time is always monotonically increasing
starting from 0 with both positive and negative rates.
Synchronisation
~~~~~~~~~~~~~~~
As we have seen, we can get a running_time:
- using the clock and the element's base_time with:
C.running_time = absolute_time - base_time
- using the buffer timestamp and the preceding SEGMENT event as (assuming
positive playback rate):
B.running_time = (B.timestamp - (S.start + S.offset)) / ABS (S.rate) + S.base
We prefix C. and B. before the two running times to note how they were
calculated.
The task of synchronized playback is to make sure that we play a buffer with
B.running_time at the moment when the clock reaches the same C.running_time.
Thus the following must hold:
B.running_time = C.running_time
expanding:
B.running_time = absolute_time - base_time
or:
absolute_time = B.running_time + base_time
The absolute_time when a buffer with B.running_time should be played is noted
with B.sync_time. Thus:
B.sync_time = B.running_time + base_time
One then waits for the clock to reach B.sync_time before rendering the buffer in
the sink (See also part-clocks.txt).
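A minimal sketch of that wait, assuming the sink has already calculated
B.running_time as above:
ex:
  GstClockTime base_time = gst_element_get_base_time (element);
  GstClockID id;

  /* B.sync_time = B.running_time + base_time */
  id = gst_clock_new_single_shot_id (clock, running_time + base_time);
  gst_clock_id_wait (id, NULL); /* blocks until B.sync_time is reached */
  gst_clock_id_unref (id);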
For multiple streams this means that buffers with the same running_time are to
be displayed at the same time.
A demuxer must make sure that the SEGMENT events it emits on its output pads
yield the same running_time for buffers that should be played synchronized. This
usually means sending the same SEGMENT on all pads and making sure that the
synchronized buffers have the same timestamps.
Stream time
~~~~~~~~~~~
The stream time is also known as the position in the stream and is a value
between 0 and the total duration of the media file.
It is the stream time that is used for:
- the position reported by the POSITION query in the pipeline
- the position used in seek events/queries
- the position used to synchronize controller values
Additional fields in the SEGMENT are used:
- S.time: time field in the SEGMENT event. This is the stream-time of S.start
- S.applied_rate: The rate already applied to the segment.
Stream time is calculated using the buffer times and the preceding SEGMENT
event as follows:
stream_time = (B.timestamp - S.start) * ABS (S.applied_rate) + S.time
=> B.timestamp = (stream_time - S.time) / ABS(S.applied_rate) + S.start
For negative rates, B.timestamp will go backwards from S.stop to S.start,
making the stream time go backwards:
stream_time = (S.stop - B.timestamp) * ABS(S.applied_rate) + S.time
=> B.timestamp = S.stop - (stream_time - S.time) / ABS(S.applied_rate)
In the PLAYING state, it is also possible to use the pipeline clock to derive
the current stream_time.
Given the two formulas above that match the clock times with buffer timestamps,
we can rewrite the above formula for stream_time (for positive rates):
C.running_time = absolute_time - base_time
B.running_time = (B.timestamp - (S.start + S.offset)) / ABS (S.rate) + S.base
=>
(B.timestamp - (S.start + S.offset)) / ABS (S.rate) + S.base = absolute_time - base_time;
=>
(B.timestamp - (S.start + S.offset)) / ABS (S.rate) = absolute_time - base_time - S.base;
=>
(B.timestamp - (S.start + S.offset)) = (absolute_time - base_time - S.base) * ABS (S.rate)
=>
(B.timestamp - S.start) = S.offset + (absolute_time - base_time - S.base) * ABS (S.rate)
Filling (B.timestamp - S.start) into the above formula for stream_time
=>
stream_time = (S.offset + (absolute_time - base_time - S.base) * ABS (S.rate)) * ABS (S.applied_rate) + S.time
This last formula is typically used in sinks to report the current position in
an accurate and efficient way.
Note that the stream time is never used for synchronisation against the clock.

View file

@ -1,235 +0,0 @@
Implementing GstToc support in GStreamer elements
1. General info about GstToc structure
GstToc introduces a general way to handle chapters within multimedia formats.
A GstToc can be represented as a tree structure with arbitrary hierarchy. A tree
item can be of one of two types: sequence or alternative. Sequence entries act
like a part of the media data, for example an audio track in a CUE sheet, or a
part of the movie. Alternative entries act like a selection between different
versions of the media content, for example DVD angles.
GstToc has one constraint on the tree structure: it does not allow different
entry types on the same level of the hierarchy, i.e. you shouldn't
have editions and chapters mixed together. Here is an example of a valid TOC:
------- TOC -------
/ \
edition1 edition2
| |
-chapter1 -chapter3
-chapter2
Here are two editions (alternatives), the first contains two chapters (sequence
type), and the second has only one chapter. And here is an example of invalid
TOC:
------- TOC -------
/ \
edition1 chapter1
|
-chapter1
-chapter2
Here you have edition1 and chapter1 mixed on the same level of hierarchy,
and such TOC will be considered broken.
GstToc has an 'entries' field of GList type which consists of children items.
Each item is of type GstTocEntry. GstToc also has a list of tags and a
GstStructure called 'info'. Please use the GstToc.info and GstTocEntry.info
fields this way: create a GstStructure, put all info related to your element
there and put this structure into the 'info' field under the name of your
element. Some fields in the 'info' structure can be used for internal
purposes, so you should use it in the way described above so as not to
overwrite already existing fields.
Let's look at GstTocEntry a bit more closely. One of the most important fields
is 'uid', which must be unique for each item within the TOC. It is used
to identify each item inside the TOC, especially when an element receives a TOC
select event with a UID to seek to. The 'subentries' field of type GList contains
children items of type GstTocEntry, so an arbitrary hierarchy depth can be
achieved. The 'type' field can be either GST_TOC_ENTRY_TYPE_CHAPTER or
GST_TOC_ENTRY_TYPE_EDITION which corresponds to chapter or edition type of
item respectively. Field 'tags' is a list of tags related to the item. And field
'info' is similar to GstToc.info described above.
So, a little more about managing GstToc. Use gst_toc_new() and gst_toc_unref()
to create/free it. GstTocEntry can be created using gst_toc_entry_new().
While building a GstToc you can set start and stop timestamps for each item using
gst_toc_entry_set_start_stop() and loop_type and repeat_count using
gst_toc_entry_set_loop().
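A minimal sketch of building the valid TOC from the example above (note that in
the 1.0 API the start/stop call is named gst_toc_entry_set_start_stop_times()):
ex:
  GstToc *toc = gst_toc_new (GST_TOC_SCOPE_GLOBAL);
  GstTocEntry *edition, *chapter;

  edition = gst_toc_entry_new (GST_TOC_ENTRY_TYPE_EDITION, "edition1");
  chapter = gst_toc_entry_new (GST_TOC_ENTRY_TYPE_CHAPTER, "chapter1");
  gst_toc_entry_set_start_stop_times (chapter, 0, 10 * GST_SECOND);

  /* both calls take ownership of the appended entry */
  gst_toc_entry_append_sub_entry (edition, chapter);
  gst_toc_append_entry (toc, edition);
  .. use toc ..
  gst_toc_unref (toc);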
The best way to process already created GstToc is to recursively go through
the 'entries' and 'subentries' fields.
Applications and plugins should not rely on TOCs having a certain kind of
structure, but should allow for different alternatives. For example, a
simple CUE sheet embedded in a file may be presented as a flat list of
track entries, or could have a top-level edition node (or some other
alternative type entry) with track entries underneath that node; or even
multiple top-level edition nodes (or some other alternative type entries)
each with track entries underneath (in case the source file has extracted
a track listing from different sources).
2. TOC scope: global and current
There are two main consumers for TOC information: applications and elements
in the pipeline that are TOC writers (such as e.g. matroskamux).
Applications typically want to know the entire table of contents (TOC) with
all entries that can possibly be selected.
TOC writers in the pipeline, however, would not want to write a TOC for all
possible/available streams, but only for the current stream.
When transcoding a title from a DVD, for example, the application would still
want to know the entire TOC, with all titles, the chapters for each title,
and the available angles. When transcoding to a file, we only want the TOC
information that is relevant to the transcoded stream to be written into
the file structure, e.g. the chapters of the title being transcoded (or
possibly only chapters 5-7 if only those have been selected for playback/
transcoding).
This is why we may need to create two different TOCs for those two types
of consumers.
Elements that extract TOC information should send TOC events downstream.
Like with tags, sinks will post a TOC message on the bus for the application
with the global TOC, once a global TOC event reaches the sink.
3. Working with GstMessage
If a table of contents is available, applications will receive a TOC message
on the pipeline's GstBus.
A TOC message will be posted on the bus by sinks when they receive a TOC event
containing a TOC with global scope. Elements extracting TOCs should not post
a TOC message themselves, but send a TOC event downstream.
The reason for this is that there may be cascades of TOCs (e.g. a zip archive
containing multiple matroska files, each with a TOC).
A GstMessage with a GstToc can be created using gst_message_new_toc() and
parsed with gst_message_parse_toc(). The 'updated' parameter in these methods
indicates whether the TOC was just discovered (false) or whether a previously
found TOC has been updated (true). Such messages are typically posted to the
pipeline by sinks once TOC data has been discovered.
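On the application side, a hedged sketch of catching such messages; the
callback would be attached with gst_bus_add_watch():

/* attach with: gst_bus_add_watch (bus, bus_cb, NULL); */
static gboolean
bus_cb (GstBus * bus, GstMessage * msg, gpointer user_data)
{
  if (GST_MESSAGE_TYPE (msg) == GST_MESSAGE_TOC) {
    GstToc *toc;
    gboolean updated;

    gst_message_parse_toc (msg, &toc, &updated);
    g_print ("TOC %s with %u top-level entries\n",
        updated ? "updated" : "found",
        g_list_length (gst_toc_get_entries (toc)));
    gst_toc_unref (toc);
  }
  return TRUE;
}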
4. Working with GstEvent
There are two types of TOC-related events:
- downstream TOC events that contain TOC information and
travel downstream
- toc-select events that travel upstream and can be used to select
a certain TOC entry for playback (similar to seek events)
GstToc supports select events through the GstEvent infrastructure. The idea is
the following: when you receive a TOC select event, parse it with
gst_event_parse_toc_select() and, if the stream is seekable, seek to the
specified TOC UID (gst_toc_find_entry() can be used to look up an entry in the
TOC by its UID). To create a TOC select event use gst_event_new_toc_select().
The common reaction to such an event is to seek to the specified UID within
your element, as sketched below.
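A sketch of such handling inside an element; MyElement, element->toc and
my_element_seek_to_time() are hypothetical element-specific state and helpers:

static gboolean
my_element_handle_toc_select (MyElement * element, GstEvent * event)
{
  gchar *uid = NULL;
  GstTocEntry *entry;
  gint64 start;
  gboolean res = FALSE;

  gst_event_parse_toc_select (event, &uid);
  entry = gst_toc_find_entry (element->toc, uid);
  if (entry != NULL) {
    gst_toc_entry_get_start_stop_times (entry, &start, NULL);
    /* element-specific: reposition the stream at 'start' */
    res = my_element_seek_to_time (element, start);
  }
  g_free (uid);
  return res;
}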
5. Implementation coverage, Specifications, ...
Below is a list of container formats, links to documentation and a summary of
TOC-related features. Each section title also indicates whether reading/writing
a TOC is implemented. Hollow bullet points 'o' indicate no support and filled
bullets '*' indicate that the feature is handled.
AIFC: -/-
http://www-mmsp.ece.mcgill.ca/Documents/AudioFormats/AIFF/Docs/AIFF-1.3.pdf
o 'MARK'
o 'INST'
The 'MARK' chunk defines a list of (cue-id, position_in_samples, label).
The 'INST' chunk contains a sustainLoop and releaseLoop, each consisting of
(loop-type, cue-begin, cue-end)
FLAC: read/write
http://xiph.org/flac/format.html#metadata_block_cuesheet
* METADATA_BLOCK_CUESHEET
* CUESHEET_TRACK
o CUESHEET_TRACK_INDEX
Both CUESHEET_TRACK and CUESHEET_TRACK_INDEX have a (relative) offset in
samples. CUESHEET_TRACK has ISRC metadata.
MKV: read/write
http://matroska.org/technical/specs/chapters/index.html
* Chapters and Editions each having a uid
* Chapter have start/end time and metadata:
ChapString, ChapLanguage, ChapCountry
MP4:
* elst
The 'elst' atom contains a list of edits. Each edit consists of (length, start,
play-back speed).
OGG: -/-
https://wiki.xiph.org/Chapter_Extension
o VorbisComment fields called CHAPTERxxx and CHAPTERxxxNAME with xxx being a
number between 000 and 999.
WAV: read/write
http://www.sonicspot.com/guide/wavefiles.html
* 'cue '
o 'plst'
* 'adtl'
* 'labl'
* 'note'
o 'ltxt'
o 'smpl'
The 'cue ' chunk defines a list of markers in the stream with 'cue-id's. The
'smpl' chunk defines a list of regions in the stream with 'cue-id's in the same
namespace (?).
The various 'adtl' chunks: 'labl', 'note' and 'ltxt' refer to the 'cue-id's.
A 'plst' chunk defines a sequence of segments (cue-id, length_samples, repeats).
The 'smpl' chunk defines a list of loops (cue-id, beg, end, loop-type, repeats).
6. Conclusion/Ideas/Future work
Based on the data of chapter 5, a few thoughts and observations that can be used
to extend and refine our API. These things below are not reflecting the current
implementation.
All formats have a table of [cue-id, cue-start, (cue-end), (extra tags)]:
- cue-id is commonly represented as an unsigned 32-bit integer
- cue-end is optional
- extra tags could be represented as a structure/taglist
Many formats have metadata that references the cue-table.
- loops in instruments in wav, aifc
- edit lists in wav, mp4
For mp4.elst, wav.plst we could expose two editions.
1) the edit list is flattened: default, for playback
2) the stream has the raw data and the edit list is there as chapter markers:
useful for editing software
We might want to introduce a new GST_TOC_ENTRY_TYPE_MARKER or _CUE. This would
be a sequence entry-type and it would not be used for navigational purposes, but
to attach data to a point in time (envelopes, loops, ...).
API wise there is some overlap between:
- exposing multiple audio/video tracks as pads or as ToC editions. For ToC
editions, we have the TocSelect event.
- exposing subtitles as a sparse stream or as a ToC sequence of markers with
labels
@ -1,379 +0,0 @@
Tracing
=======
This subsystem will provide a mechanism to get structured tracing info from
GStreamer applications. This can be used for post-run analysis as well as for
live introspection.
Use cases
---------
* I'd like to get statistics from a running application.
* I'd like to understand which parts of my pipeline use how many resources.
* I'd like to know which parts of the pipeline use how much memory.
* I'd like to know about ref-counts of parts in the pipeline to find ref-count
issues.
Non use-cases
-------------
* Some element in the pipeline does not play by the rules; find out which
one. This could be done with generic tests.
Design
------
The system brings the following new items:
- core hooks: probes in the core api that expose internal state when tracing
  is in use
- tracers: plugin features that can process data from the hooks and emit a log
- tracing front-ends: applications that consume logs from tracers
Like logging, the tracer hooks can be compiled out; if compiled in, a local
condition checks whether tracing is active.
Certain GStreamer core functions (such as gst_pad_push or gst_element_add_pad)
will call into the tracer subsystem to dispatch into active tracing modules.
Developers will be able to select a list of plugins by setting an environment
variable, such as GST_TRACERS="meminfo;dbus". One can also pass parameters to
plugins: GST_TRACERS="log(events,buffers);stats(all)".
When the plugins are loaded, we'll add them to the hooks they are interested
in.
Right now tracing info is logged as GstStructures to the TRACE level.
Idea: Another env var GST_TRACE_CHANNEL could be used to send the tracing to a
file or a socket. See https://bugzilla.gnome.org/show_bug.cgi?id=733188 for
discussion on these environment variables.
Hook api
--------
We'll wrap interesting api calls with two macros, e.g. gst_pad_push():

GstFlowReturn
gst_pad_push (GstPad * pad, GstBuffer * buffer)
{
  GstFlowReturn res;

  g_return_val_if_fail (GST_IS_PAD (pad), GST_FLOW_ERROR);
  g_return_val_if_fail (GST_PAD_IS_SRC (pad), GST_FLOW_ERROR);
  g_return_val_if_fail (GST_IS_BUFFER (buffer), GST_FLOW_ERROR);

  /* dispatch to tracers attached to the "pad-push-pre" hook */
  GST_TRACER_PAD_PUSH_PRE (pad, buffer);
  res = gst_pad_push_data (pad,
      GST_PAD_PROBE_TYPE_BUFFER | GST_PAD_PROBE_TYPE_PUSH, buffer);
  /* dispatch to tracers attached to the "pad-push-post" hook */
  GST_TRACER_PAD_PUSH_POST (pad, res);
  return res;
}
TODO(ensonic): gcc has some magic for wrapping functions
- http://gcc.gnu.org/onlinedocs/gcc/Constructing-Calls.html
- http://www.clifford.at/cfun/gccfeat/#gccfeat05.c
TODO(ensonic): we should eval if we can use something like jump_label in the kernel
- http://lwn.net/Articles/412072/ + http://lwn.net/Articles/435215/
- http://lxr.free-electrons.com/source/kernel/jump_label.c
- http://lxr.free-electrons.com/source/include/linux/jump_label.h
- http://lxr.free-electrons.com/source/arch/x86/kernel/jump_label.c
TODO(ensonic): liblttng-ust provides such a mechanism for user-space
- but this is mostly about logging traces
- it is linux specific :/
In addition to api hooks we should also provide timer hooks. Interval timers are
useful to get e.g. resource usage snapshots. Also absolute timers might make
sense. All this could be implemented with a clock thread. We can use another
env-var GST_TRACE_TIMERS="100ms,75ms" to configure timers and then pass them to
the tracers like GST_TRACERS="rusage(timer=100ms);meminfo(timer=75ms)". Maybe
we can create them ad-hoc and avoid the GST_TRACE_TIMERS var.
Hooks (* already implemented)
-----
* gst_bin_add
* gst_bin_remove
* gst_element_add_pad
* gst_element_post_message
* gst_element_query
* gst_element_remove_pad
* gst_element_factory_make
* gst_pad_link
* gst_pad_pull_range
* gst_pad_push
* gst_pad_push_list
* gst_pad_push_event
* gst_pad_unlink
Tracer api
----------
Tracers are plugin features. They have a simple api:

class init
  Here the tracers describe the data they will emit.

instance init
  Tracers attach handlers to one or more hooks using
  gst_tracing_register_hook(). In case they are configurable, they can read
  the options from the 'params' property. This is the extra detail from the
  environment var.

hook functions
  Hooks marshal the parameters given to a trace hook into varargs and also
  add some extra info such as a timestamp. Hooks will be called from misc
  threads. The trace plugins should only consume (=read) the provided data.
  Expensive computation should be avoided so that the execution is not
  affected too much.

instance destruction
  Tracers can output results and release data. This would ideally be done at
  the end of the application, but gst_deinit() is not mandatory. gst_tracelib
  was using a gcc destructor. Ideally tracer modules log data as they have
  them and leave aggregation to a tool that processes the log.
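As a sketch of the instance-init part, here is a minimal tracer that hooks
into "pad-push-pre" (the hook dispatched by GST_TRACER_PAD_PUSH_PRE above);
the MyTracer GObject boilerplate is assumed to exist elsewhere:

#define GST_USE_UNSTABLE_API    /* the tracer API is still unstable */
#include <gst/gst.h>

/* called from the streaming threads: only read the passed data,
 * expensive work here would distort what is being measured */
static void
do_push_buffer_pre (GObject * self, GstClockTime ts, GstPad * pad,
    GstBuffer * buffer)
{
  GST_TRACE_OBJECT (pad, "%" GST_TIME_FORMAT ": pushing %" G_GSIZE_FORMAT
      " bytes", GST_TIME_ARGS (ts), gst_buffer_get_size (buffer));
}

static void
my_tracer_init (MyTracer * self)
{
  gst_tracing_register_hook (GST_TRACER (self), "pad-push-pre",
      G_CALLBACK (do_push_buffer_pre));
}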
tracer event classes
--------------------
Most tracers will log some kind of 'events': a data transfer, an event,
a message, a query or a measurement. Every tracer should describe the data
format. This way tools that process tracer logs can show the data in a
meaningful way without having to know about the tracer plugin.
One way would be to introspect the data from the plugin. This has the
disadvantage that the postprocessing app needs to load the plugins or talk to
the gstreamer registry. An alternative is to also log the format description
into the log. Right now we're logging several nested GstStructure from the
_tracer_class_init() function (except in the log tracer).
// the name is the value name + ".class"
// the content describes a single log record
gst_tracer_record_new ("thread-rusage.class",
    // value in the log record (order does not matter)
    // 'thread-id' is a 'key' to relate the record to something as indicated
    // by the 'scope' substructure
    "thread-id", GST_TYPE_STRUCTURE, gst_structure_new ("scope",
        "type", G_TYPE_GTYPE, G_TYPE_GUINT64,
        "related-to", GST_TYPE_TRACER_VALUE_SCOPE, GST_TRACER_VALUE_SCOPE_THREAD,
        NULL),
    // next value in the record
    // 'average-cpuload' is a measurement as indicated by the 'value'
    // substructure
    "average-cpuload", GST_TYPE_STRUCTURE, gst_structure_new ("value",
        // value type
        "type", G_TYPE_GTYPE, G_TYPE_UINT,
        // human readable description, that can be used as a graph label
        "description", G_TYPE_STRING, "average cpu usage per thread",
        // flags that help to use the right graph type
        // flags { aggregated, windowed, cumulative, ... }
        "flags", GST_TYPE_TRACER_VALUE_FLAGS, GST_TRACER_VALUE_FLAGS_AGGREGATED,
        // value range
        "min", G_TYPE_UINT, 0,
        "max", G_TYPE_UINT, 100,
        NULL),
    ...
    NULL);
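At runtime the tracer then logs records against this class with
gst_tracer_record_log(); a short sketch, assuming tr_rusage holds the
GstTracerRecord returned by the call above and the values were computed
elsewhere:

/* values are passed in the order in which the fields were declared */
gst_tracer_record_log (tr_rusage, (guint64) thread_id, avg_cpuload);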
A few ideas that are not yet in the above spec:
- it would be nice to describe the unit of values
  - putting it into the description is not flexible though, e.g. time would be
    a guint64 but a ui would reformat it to e.g. h:m:s.ms
  - other units are e.g.: percent, per-mille, or kbit/s
- we'd like to have some metadata on scopes
  - e.g. we'd like to log the thread-names, so that a UI can show that instead
    of thread-ids
- the stats tracer logs 'new-element' and 'new-pad' messages
  - they add a unique 'ix' to each instance, as the memory pointer can be
    reused for new instances; the data is attached to the objects as qdata
  - the latency tracer would like to also reference this metadata
- right now we log the classes as structures
  - this is important so that the log is self contained
  - it would be nice to add them to the registry, so that gst-inspect can show
    them
We could also consider to add each value as a READABLE gobject property. The
property has name/description. We could use qdata for scope and flags (or have
some new property flags).
We would also need a new "notify" signal, so that value-change notifications
would include a time-stamp. This way the tracers would not need to be aware of
the logging. The core tracer would register the notify handlers and emit the
log.
Or we just add a gst_tracer_class_install_event() and that mimics the
g_object_class_install_property().
Frontends can:
- do an events over time histogram
- plot curves of values over time or deltas
- show gauges
- collect statistics (min, max, avg, ...)
Plugins ideas
=============
We can have some under gstreamer/plugins/tracers/
latency
-------
- register to buffer and event flow
- send custom event on buffer flow at source elements
- catch events on event transfer at sink elements
meminfo (not yet implemented)
-------
- register to an interval-timer hook.
- call mallinfo() and log memory usage
rusage
------
- register to an interval-timer hook.
- call getrusage() and log resource usage
dbus (not yet implemented)
----
- provide a dbus iface to announce applications that are traced
- tracing UIs can use the dbus iface to find the channels where logging and
tracing is getting logged to
- one would start the tracing UI first; when an application is started with
  tracing activated, the dbus plugin announces the new application, upon
  which the tracing UI can start reading from the log channels. This avoids
  missing data.
topology (not yet implemented)
--------
- register to pipeline topology hooks
- tracing UIs can show a live pipeline graph
stats
-----
- register to buffer, event, message and query flow
- tracing apps can do e.g. statistics
refcounts (not yet implemented)
---------
- log ref-counts of objects
- just logging them outside of glib/gobject would still make it hard to detect
issues though
opengl (not yet implemented)
------
- upload/download times
- there is no hardware-agnostic way to get e.g. memory usage info
  (GL extensions)
memory (not yet implemented)
------
- trace live instances (and the pointers to the memory)
- use an atexit handler to dump leaked instances
https://bugzilla.gnome.org/show_bug.cgi?id=756760#c6
leaks
-----
- track creation/destruction of GstObject and GstMiniObject
- log those which are still alive when the app is exiting and raise an error
  if any
- if the GST_LEAKS_TRACER_SIG env variable is defined, the tracer will handle
  the following UNIX signals:
  - SIGUSR1: log alive objects
  - SIGUSR2: create a checkpoint and print a list of objects created and
    destroyed since the previous checkpoint
- if the GST_LEAKS_TRACER_STACK_TRACE env variable is defined, the tracer logs
  the creation stack trace of leaked objects. This may significantly increase
  memory consumption.
User interfaces
===============
gst-debug-viewer
----------------
gst-debug-viewer could be given the trace log in addition to the debug log (or a
combined log). Alternatively it would show a dialog that shows all local apps
(if the dbus plugin is loaded) and read the log streams from the sockets/files
that are configured for the app.
gst-tracer
----------
Counterpart of gst-tracelib-ui.
gst-stats
---------
A terminal app that shows summary/running stats like the summary gst-tracelib
shows at the end of a run. Currently it only shows an aggregated status.
live-graphers
-------------
Maybe we can even feed the log into existing live graphers, with a little driver
* https://github.com/dkogan/feedgnuplot
Problems / Open items
=====================
- should tracers log into the debug.log or into a separate log?
  - separate log
    - use a binary format?
    - worse performance (we're writing two logs at the same time)
    - need to be careful when people set GST_DEBUG_CHANNEL=stderr and
      GST_TRACE_CHANNEL=stderr (use a shared channel, but what about the
      formats?)
  - debug log
    - the tracer subsystem would need to log the GST_TRACE at a level that is
      active
    - should the tracer call gst_debug_category_set_threshold() to ensure
      things work, even though the levels don't make a lot of sense here
  - make logging a tracer (a hook in gst_debug_log_valist, move
    gst_debug_log_default() to the tracer module)
    - log all debug log to the tracer log, some of the current logging
      statements can be replaced by generic logging as shown in the log-tracer
    - add tools/gst-debug to extract a human readable debug log from the trace
      log
  - we could maintain a list of log functions, where gst_tracer_log_trace() is
    the default one. This way e.g. gst-validate could consume the traces
    directly.
- when hooking into a timer, should we just have some predefined intervals?
  - can we add a tracer module that registers the timer hook? then we could do
    GST_TRACERS="timer(10ms);rusage"; right now the tracer hooks are defined
    as an enum though.
- when connecting to a running app, we can't easily get the 'current' state
  if logging is using a socket, as past events are not explicitly stored. We
  could determine the current topology and emit events with
  GST_CLOCK_TIME_NONE as ts to indicate that the events are synthetic.
- we need stable ids for scopes (threads, elements, pads)
  - the address can be reused
  - we can use gst_util_seqnum_next()
  - something like gst_object_get_path_string() won't work as objects are
    initially without a parent
- right now the tracing-hooks are enabled/disabled at configure time with
  --{enable,disable}-gst-tracer-hooks. The tracer code and the plugins are
  still built though. We should add a --{enable,disable}-gst-tracer option to
  disable the whole system, although this is a bit confusing with the
  --{enable,disable}-trace option we have already.
Try it
======
GST_DEBUG="GST_TRACER:7,GST_BUFFER*:7,GST_EVENT:7,GST_MESSAGE:7" GST_TRACERS=log gst-launch-1.0 fakesrc num-buffers=10 ! fakesink
- traces for buffer flow in TRACE level
GST_DEBUG="GST_TRACER:7" GST_TRACERS="stats;rusage" GST_DEBUG_FILE=trace.log gst-launch-1.0 fakesrc num-buffers=10 sizetype=fixed ! queue ! fakesink
gst-stats-1.0 trace.log
- print some pipeline stats on exit
GST_DEBUG="GST_TRACER:7" GST_TRACERS="stats;rusage" GST_DEBUG_FILE=trace.log /usr/bin/gst-play-1.0 $HOME/Videos/movie.mp4
./scripts/gst-plot-traces.sh --format=png | gnuplot
eog trace.log.*.png
- get ts, average-cpuload, current-cpuload, time and plot
GST_DEBUG="GST_TRACER:7" GST_TRACERS=latency gst-launch-1.0 audiotestsrc num-buffers=10 ! audioconvert ! volume volume=0.7 ! autoaudiosink
- print processing latencies
GST_TRACERS="leaks" gst-launch-1.0 videotestsrc num-buffers=10 ! fakesink
- Raise a warning if a leak is detected
GST_DEBUG="GST_TRACER:7" GST_TRACERS="leaks(GstEvent,GstMessage)" gst-launch-1.0 videotestsrc num-buffers=10 ! fakesink
- check if any GstEvent or GstMessage is leaked and raise a warning
Performance
===========
run ./tests/benchmarks/tracing.sh <tracer(s)> <media>
egrep -c "(proc|thread)-rusage" trace.log
658618
grep -c "gst_tracer_log_trace" trace.log
823351
- we can optimize most of it by using quarks in structures or by eventually
  avoiding structures entirely
@ -1,216 +0,0 @@
Trickmodes
----------
GStreamer provides an API for performing various kinds of trick mode playback. This includes:
- server side trickmodes
- client side fast/slow forward playback
- client side fast/slow backwards playback
Server side trickmodes mean that a source (network source) can provide a
stream with different playback speed and direction. The client does not have to
perform any special algorithms to decode this stream.
Client side trickmodes mean that the decoding client (GStreamer) performs the
needed algorithms to change the direction and speed of the media file.
Seeking can be done both in a playback pipeline and in a transcoding pipeline.
General seeking overview
~~~~~~~~~~~~~~~~~~~~~~~~~
Consider a typical playback pipeline:
                          .---------.  .------.
              .-------.   | decoder |->| sink |
  .--------.  |       |-->'---------'  '------'
  | source |->| demux |
  '--------'  |       |-->.---------.  .------.
              '-------'   | decoder |->| sink |
                          '---------'  '------'
The pipeline is initially configured to play back at speed 1.0 starting from
position 0 and stopping at the total duration of the file.
When performing a seek, the following steps have to be taken by the application:
Create a seek event
^^^^^^^^^^^^^^^^^^^
The seek event contains:
- various flags describing:
  - where to seek to (KEY_UNIT)
  - how accurate the seek should be (ACCURATE)
  - how to perform the seek (FLUSH)
  - what to do when the stop position is reached (SEGMENT)
  - extra playback options (SKIP)
- a format to seek in; this can be time, bytes, units (frames, samples), ...
- a playback rate; 1.0 is normal playback speed, values bigger than 1.0 mean
  fast playback and negative values mean reverse playback. A playback rate of
  0.0 is not allowed (but is equivalent to PAUSING the pipeline).
- a start position; this value has to be between 0 and the total duration of
  the file. It can also be relative to the previously configured start value.
- a stop position; this value has to be between 0 and the total duration. It
  can also be relative to the previously configured stop value.
See also gst_event_new_seek().
Send the seek event
^^^^^^^^^^^^^^^^^^^
Send the new seek event to the pipeline with gst_element_send_event().
By default the pipeline will send the event to all sink elements.
By default an element will forward the event upstream on all sinkpads.
Elements can modify the format of the seek event. The most common format is
GST_FORMAT_TIME.
One element will actually perform the seek, this is usually the demuxer or
source element. For more information on how to perform the different seek
types see part-seeking.txt.
For client side trickmode a SEGMENT event will be sent downstream with
the new rate and start/stop positions. All elements prepare themselves to
handle the rate (see below). The applied rate of the SEGMENT event will
be set to 1.0 to indicate that no rate adjustment has been done.
For server side trick mode a SEGMENT event is sent downstream with a
rate of 1.0 and the start/stop positions. The elements will configure themselves
for normal playback speed since the server will perform the rate conversions.
The applied rate will be set to the rate that will be applied by the server. This
is done to ensure that the position reporting performed in the sink is aware
of the trick mode.
When the seek succeeds, the _send_event() function will return TRUE.
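Putting the two steps together, a sketch of a 2x fast-forward seek from the
start of the stream; 'pipeline' is assumed to be the application's pipeline
element:

GstEvent *event;

event = gst_event_new_seek (2.0, GST_FORMAT_TIME,
    GST_SEEK_FLAG_FLUSH | GST_SEEK_FLAG_KEY_UNIT,
    GST_SEEK_TYPE_SET, 0,
    GST_SEEK_TYPE_NONE, GST_CLOCK_TIME_NONE);

if (!gst_element_send_event (pipeline, event))
  g_warning ("seek failed");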
Server side trickmode
~~~~~~~~~~~~~~~~~~~~~
The source element operates in push mode. It can reopen a server connection requesting
a new byte or time position and a new playback speed. The capabilities can be queried
from the server when the connection is opened.
We assume the source element is derived from the GstPushSrc base class. The base source
should be configured with gst_base_src_set_format (src, GST_FORMAT_TIME).
The do_seek method will be called on the push src subclass with the seek information
passed in the GstSegment argument.
The rate value in the segment should be used to reopen the connection to the server
requesting data at the new speed and possibly a new playback position.
When the server connection was successfully reopened, set the rate of the segment
to 1.0 so that the client side trickmode is not enabled. The applied rate in the
segment is set to the rate transformation done by the server.
Alternatively, a combination of client side and server side trick mode can be
used: for example, if the server does not support certain rates, the client can
perform the rate conversion for the remainder.
            source                 server
  do_seek     |                      |
 ------------>|                      |
              |  reopen connection   |
              |--------------------->|
              |                      .
              |       success        .
              |<---------------------|
  modify      |                      |
  rate to 1.0 |                      |
              |                      |
  return      |                      |
  TRUE        |                      |
              |                      |
After performing the seek, the source will inform the downstream elements of the
new segment that is to be played back. Since the segment will have a rate of 1.0,
no client side trick modes are enabled. The segment will have an applied rate
different from 1.0 to indicate that the media contains data with non-standard
playback speed or direction.
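A hedged sketch of such a do_seek implementation in a GstPushSrc subclass;
my_src_reopen_connection() is a hypothetical helper that asks the server for
the new position and speed:

static gboolean
my_src_do_seek (GstBaseSrc * src, GstSegment * segment)
{
  MySrc *self = MY_SRC (src);

  /* hypothetical helper: ask the server for the new position/speed */
  if (!my_src_reopen_connection (self, segment->start, segment->rate))
    return FALSE;

  /* the server now delivers data at the requested speed: record the
   * server-applied rate and disable client-side trick mode */
  segment->applied_rate = segment->rate;
  segment->rate = 1.0;

  return TRUE;
}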
client side forward trickmodes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The seek happens as stated above. A SEGMENT event is sent downstream with a rate
different from 1.0. Plugins receiving the SEGMENT can decide to perform the
rate conversion of the media data (retimestamp video frames, resample audio, ...).
If a plugin decides to resample or retimestamp, it should modify the SEGMENT with
a rate of 1.0 and update the applied rate so that downstream elements don't resample
again but are aware that the media has been modified.
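As a simplified sketch of the retimestamping rule above (the element would
additionally scale buffer durations and update the SEGMENT's rate and
applied rate as described):

/* assumes forward playback and ts >= segment->start */
static void
retimestamp_buffer (GstBuffer * buf, const GstSegment * segment)
{
  GstClockTime ts = GST_BUFFER_PTS (buf);

  if (GST_CLOCK_TIME_IS_VALID (ts)) {
    /* compress (rate > 1.0) or stretch (rate < 1.0) the timeline */
    GST_BUFFER_PTS (buf) = segment->start +
        (GstClockTime) ((ts - segment->start) / segment->rate);
  }
}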
The GStreamer base audio and video sinks will resample automatically if they receive
a SEGMENT event with a rate different from 1.0. The position reporting in the
base audio and video sinks will also depend on the applied rate of the segment
information.
When the SKIP flag is set, frames can be dropped in the elements. If S is the
speedup factor, a good algorithm for implementing frame skipping is to send
audio in chunks of N ms (usually 300 ms works well) and then skip ((S-1) * N) ms
of audio data. For the video we send only the keyframes in the (S * N) ms
interval. For example, with S = 3 and N = 300 ms, every 900 ms of media yields
300 ms of audio plus the keyframes of that interval. In this case, the demuxer
would scale the timestamps and would set an applied rate of S.
client side backwards trickmode
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
For backwards playback the following rules apply:
- the rate in the SEGMENT is less than 0.0.
- the SEGMENT start position is less than the stop position, playback will
however happen from stop to start in reverse.
- the time member in the SEGMENT is set to the stream time of the start
position.
For plugins the following rules apply:
- A source plugin sends data in chunks starting from the last chunk of the
file. The actual bytes are not reversed. Each chunk that is not forward
continuous with the previous chunk is marked with a DISCONT flag.
- A demuxer accumulates the chunks. As soon as a keyframe is found, everything
starting from the keyframe up to the accumulated data is sent downstream.
Timestamps on the buffers are set starting from the stop position to start,
effectively going backwards. Chunks are marked with DISCONT when they are not
forward continuous with the previous buffer.
- A video decoder decodes and accumulates all decoded frames. If a buffer with
a DISCONT, SEGMENT or EOS is received, all accumulated frames are sent
downstream in reverse order (see the sketch below).
- An audio decoder decodes and accumulates all decoded audio. If a buffer with
a DISCONT, SEGMENT or EOS is received, all accumulated audio is sent
downstream in reverse order. Some audio codecs need the previous
data buffer to decode the current one, in that case, the previous DISCONT
buffer needs to be combined with the last non-DISCONT buffer to generate the
last bit of output.
- A sink reverses (for audio) and retimestamps (audio, video) the buffers
before playing them back. Retimestamping occurs relative to the stop
position, making the timestamps increase again and suitable for synchronizing
against the clock.
Audio sinks also have to perform simple resampling before playing the
samples.
- for transcoding, audio and video resamplers can be used to reverse, resample
and retimestamp the buffers. Any rate adjustments performed on the media must
be added to the applied_rate and subtracted from the rate members in the
SEGMENT event.
In SKIP mode, the same algorithm as for forward SKIP mode can be used.
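A much-simplified sketch of the video decoder rule above; MyDec with its
'queued' list, 'segment' and 'srcpad' fields, and the my_dec_decode() helper,
are hypothetical (a real decoder would also flush on SEGMENT and EOS events):

static GstFlowReturn
my_dec_flush_reverse (MyDec * dec)
{
  GstFlowReturn ret = GST_FLOW_OK;
  GList *l;

  /* frames were prepended as they were decoded, so walking from the
   * head pushes them in reverse display order */
  for (l = dec->queued; l != NULL; l = l->next) {
    if (ret == GST_FLOW_OK)
      ret = gst_pad_push (dec->srcpad, l->data);
    else
      gst_buffer_unref (l->data);
  }
  g_list_free (dec->queued);
  dec->queued = NULL;
  return ret;
}

static GstFlowReturn
my_dec_chain (GstPad * pad, GstObject * parent, GstBuffer * buf)
{
  MyDec *dec = MY_DEC (parent);
  GstBuffer *frame;

  if (dec->segment.rate < 0.0 && GST_BUFFER_IS_DISCONT (buf)) {
    /* a new backwards chunk starts: everything decoded so far belongs
     * to the previous chunk and goes out in reverse */
    my_dec_flush_reverse (dec);
  }

  frame = my_dec_decode (dec, buf);     /* hypothetical decode step */
  if (frame == NULL)
    return GST_FLOW_OK;

  if (dec->segment.rate < 0.0) {
    dec->queued = g_list_prepend (dec->queued, frame);
    return GST_FLOW_OK;
  }

  return gst_pad_push (dec->srcpad, frame);
}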
Notes
~~~~~
- The clock/running_time keeps running forward.
- backwards playback potentially uses a lot of memory as frames and undecoded
data get buffered.