docs: design: move most design docs to gst-docs module
parent 49653b058a
commit 46138b1b1d
13 changed files with 1 addition and 3652 deletions
@@ -2,16 +2,5 @@ SUBDIRS =

EXTRA_DIST = \
	design-audiosinks.txt \
	design-decodebin.txt \
	design-encoding.txt \
	design-orc-integration.txt \
	draft-hw-acceleration.txt \
	draft-keyframe-force.txt \
	draft-subtitle-overlays.txt \
	draft-va.txt \
	part-interlaced-video.txt \
	part-mediatype-audio-raw.txt \
	part-mediatype-text-raw.txt \
	part-mediatype-video-raw.txt \
	part-playbin.txt
	draft-va.txt
@@ -1,138 +0,0 @@

Audiosink design
----------------

Requirements:

 - must operate chain-based.
   Most simple playback pipelines will push audio from the decoders
   into the audio sink.

 - must operate getrange-based.
   Most professional audio applications will operate in a mode where
   the audio sink pulls samples from the pipeline. This is typically
   done in a callback from the audiosink requesting N samples. The
   callback is either scheduled from a thread or from an interrupt
   from the audio hardware device.

 - Exact sample-accurate clocks.
   The audiosink must be able to provide a clock that is sample
   accurate even if samples are dropped or when discontinuities are
   found in the stream.

 - Exact timing of playback.
   The audiosink must be able to play samples at their exact times.

 - Use DMA access when possible.
   When the hardware can do DMA we should use it. This should also
   work over bufferpools to avoid copying data to/from kernel space.


Design:

The design is based on a set of base classes and the concept of a
ringbuffer of samples.

  +-----------+       - provides preroll, rendering, timing
  + basesink  +       - caps negotiation
  +-----+-----+
        |
  +-----V----------+  - manages ringbuffer
  + audiobasesink  +  - manages scheduling (push/pull)
  +-----+----------+  - manages clock/query/seek
        |             - manages scheduling of samples in the ringbuffer
        |             - manages caps parsing
        |
  +-----V------+      - default ringbuffer implementation with a GThread
  + audiosink  +      - subclasses provide open/read/close methods
  +------------+

The ringbuffer is a contiguous piece of memory divided into segtotal
segments. Each segment has segsize bytes.

   play position
     v
  +---+---+---+-------------------------------------+----------+
  + 0 | 1 | 2 | ....                                 | segtotal |
  +---+---+---+-------------------------------------+----------+
  <--->
    segsize bytes = N samples * bytes_per_sample

The ringbuffer has a play position, which is expressed in
segments. The play position is where the device is currently reading
samples from the buffer.

The ringbuffer can be put into the PLAYING or STOPPED state.

In the STOPPED state no samples are played to the device and the play
pointer does not advance.

In the PLAYING state samples are written to the device and the ringbuffer
should call a configurable callback after each segment is written to the
device. In this state the play pointer is advanced after each segment is
written.

A write operation to the ringbuffer puts new samples in the ringbuffer.
If there is not enough space in the ringbuffer, the write operation
blocks. The playback of the buffer never stops, even if the buffer is
empty. When the buffer is empty, silence is played by the device.

The ringbuffer is implemented with lockfree atomic operations, especially
on the reading side, so that low-latency operation is possible.

Whenever new samples are to be put into the ringbuffer, the position of the
read pointer is taken. The required write position is taken and the
difference between the required and actual position is computed. If the
difference is < 0, the sample is too late. If the difference is bigger than
segtotal, the writing part has to wait for the play pointer to advance.
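
As a rough illustration of that write-position bookkeeping (a sketch only;
the type and field names below are made up for this example and do not match
the real GstRingBuffer implementation):

  #include <glib.h>

  /* Hypothetical ringbuffer bookkeeping for this sketch only. */
  typedef struct {
    volatile gint playseg;    /* segment currently being played */
    gint segtotal;            /* number of segments in the ringbuffer */
    gint samples_per_seg;     /* segsize / bytes_per_sample */
  } SketchRingBuffer;

  #define WRITE_TOO_LATE  -1  /* sample lies before the play pointer */
  #define WRITE_MUST_WAIT -2  /* too far ahead: wait for the play pointer */

  /* Decide where (if anywhere) a sample with the given offset can go. */
  static gint
  sketch_prepare_write (SketchRingBuffer * rb, guint64 sample_offset)
  {
    gint playseg = g_atomic_int_get (&rb->playseg);
    gint writeseg = (gint) (sample_offset / rb->samples_per_seg);
    gint diff = writeseg - playseg;

    if (diff < 0)
      return WRITE_TOO_LATE;
    if (diff >= rb->segtotal)
      return WRITE_MUST_WAIT;

    /* segment (modulo segtotal) that this sample belongs in */
    return writeseg % rb->segtotal;
  }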

Scheduling:

 - chain-based mode:

   In chain-based mode, bytes are written into the ringbuffer. This operation
   eventually blocks when the ringbuffer is filled.

   When no samples arrive in time, the ringbuffer plays silence. Each
   buffer that arrives is placed into the ringbuffer at the correct
   time. This means that dropping samples or inserting silence is done
   automatically, very accurately and independently of the play pointer.

   In this mode, the ringbuffer is usually kept as full as possible. When
   using a small buffer (small segsize and segtotal), the latency from audio
   arriving at the sink to it being played can be kept low, but at least
   one context switch has to be made between read and write.

 - getrange-based mode:

   In getrange-based mode, the audiobasesink uses the callback function
   of the ringbuffer to get segsize samples from the peer element. These
   samples are then placed in the ringbuffer at the next play position.
   It is assumed that the getrange function returns fast enough to fill the
   ringbuffer before the play pointer reaches the write pointer.

   In this mode, the ringbuffer is usually kept as empty as possible. There
   is no context switch needed between the elements that create the samples
   and the actual writing of the samples to the device.


DMA mode:

 - Elements that can do DMA-based access to the audio device have to subclass
   the GstAudioBaseSink class and wrap the DMA ringbuffer in a subclass
   of GstRingBuffer.

   The ringbuffer subclass should trigger a callback after writing or playing
   each segment to the device. This callback can be triggered from a thread or
   from a signal from the audio device.


Clocks:

The GstAudioBaseSink class uses the ringbuffer to act as a clock provider.
It can do this by using the play pointer and the delay to calculate the
clock time.
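
A minimal sketch of that calculation (assuming a hypothetical helper that is
handed the number of samples the device has actually played; the real base
class also accounts for the device delay and for discontinuities):

  #include <gst/gst.h>

  /* Sketch: derive a clock time from the number of samples played. */
  static GstClockTime
  sketch_ringbuffer_clock_time (guint64 samples_played, gint rate)
  {
    if (rate <= 0)
      return GST_CLOCK_TIME_NONE;

    /* time = samples / rate, scaled to nanoseconds without overflow */
    return gst_util_uint64_scale_int (samples_played, GST_SECOND, rate);
  }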

@@ -1,274 +0,0 @@

Decodebin design


GstDecodeBin
------------

Description:

 Autoplug and decode to raw media.

 Input  : single pad with ANY caps
 Output : dynamic pads

* Contents

 _ a GstTypeFindElement connected to the single sink pad

 _ optionally a demuxer/parser

 _ optionally one or more DecodeGroup

* Autoplugging

 The goal is to reach 'target' caps (by default raw media).

 This is done by using the GstCaps of a source pad and finding the available
 demuxer/decoder GstElements that can be linked to that pad.

 The process starts with the source pad of typefind and stops when no more
 non-target caps are left. It is commonly done while pre-rolling, but can also
 happen whenever a new pad appears on any element.

 Once target caps have been found, that pad is ghosted and the
 'pad-added' signal is emitted.

 If no compatible elements can be found for a GstCaps, the pad is ghosted and
 the 'unknown-type' signal is emitted.


* Assisted auto-plugging

 When starting the auto-plugging process for a given GstCaps, the following
 signals are emitted in order to allow the application/user to assist or
 fine-tune the process (a usage sketch follows the signal descriptions below).

 - 'autoplug-continue' :

   gboolean user_function (GstElement * decodebin, GstPad * pad, GstCaps * caps)

   This signal is fired at the very beginning with the source pad GstCaps. If
   the callback returns TRUE, the process continues normally. If the callback
   returns FALSE, then the GstCaps are considered as target caps and the
   autoplugging process stops.

 - 'autoplug-factories' :

   GValueArray * user_function (GstElement * decodebin, GstPad * pad,
       GstCaps * caps);

   Get a list of element factories for @pad with @caps. This function is used
   to instruct decodebin which elements it should try to autoplug. The default
   behaviour when this function is not overridden is to get all elements that
   can handle @caps from the registry, sorted by rank.

 - 'autoplug-select' :

   gint user_function (GstElement * decodebin, GstPad * pad, GstCaps * caps,
       GValueArray * factories);

   This signal is fired once autoplugging has got a list of compatible
   GstElementFactory objects. The signal is emitted with the GstCaps of the
   source pad and a pointer to the GValueArray of compatible factories.

   The callback should return the index of the element factory in @factories
   that should be tried next.

   If the callback returns -1, the autoplugging process will stop as if no
   compatible factories were found.

   The default implementation of this function will try to autoplug the first
   factory of the list.
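
For illustration, an application could bias the factory selection like this
(a sketch that follows the callback signature given above; the element name
"examplehwdec" is made up, and the signal signature that decodebin actually
ships with should be checked against its current documentation):

  #include <gst/gst.h>

  /* Prefer a hypothetical hardware decoder whenever it is offered. */
  static gint
  on_autoplug_select (GstElement * decodebin, GstPad * pad, GstCaps * caps,
      GValueArray * factories, gpointer user_data)
  {
    guint i;

    for (i = 0; i < factories->n_values; i++) {
      GstElementFactory *f =
          g_value_get_object (g_value_array_get_nth (factories, i));

      if (g_str_has_prefix (GST_OBJECT_NAME (f), "examplehwdec"))
        return i;               /* try this factory next */
    }
    return 0;                   /* default behaviour: first factory */
  }

  /* g_signal_connect (decodebin, "autoplug-select",
   *     G_CALLBACK (on_autoplug_select), NULL); */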

* Target Caps

 The target caps are a read/write GObject property of decodebin.

 By default the target caps are:

 _ Raw audio : audio/x-raw

 _ and raw video : video/x-raw

 _ and text : text/plain, text/x-pango-markup


* Media chain/group handling

 When autoplugging, all streams coming out of a demuxer will be grouped in a
 DecodeGroup.

 All new source pads created on that demuxer after it has emitted the
 'no-more-pads' signal will be put in another DecodeGroup.

 Only one DecodeGroup can be active at any given time. If a new DecodeGroup is
 created while another one exists, the new DecodeGroup will be set as blocking
 until the existing one has drained.



DecodeGroup
-----------

Description:

 Streams belonging to the same group/chain of a media file.

* Contents

 The DecodeGroup contains:

 _ a GstMultiQueue to which all streams of the media group are connected.

 _ any decoders which are autoplugged in order to produce the
   requested target pads.

* Proper group draining

 The DecodeGroup ensures that all the streams in the group are completely
 drained (EOS has come through all source ghost pads).

* Pre-roll and block

 The DecodeGroup has a global blocking feature. If enabled, all the ghosted
 source pads for that group will be blocked.

 A method is available to unblock all blocked pads for that group.



GstMultiQueue
-------------

Description:

 Multiple input-output data queue.

 GstMultiQueue achieves the same functionality as GstQueue, with a few
 differences:

 * Multiple streams handling.

   The element handles queueing data on more than one stream at once. To
   achieve such a feature it has request sink pads (sink_%u) and 'sometimes'
   src pads (src_%u).

   When requesting a given sink pad, the associated src pad for that stream
   will be created. E.g. requesting sink_1 will generate src_1.

 * Non-starvation on multiple streams.

   If more than one stream is used with the element, the streams' queues will
   be dynamically grown (up to a limit), in order to ensure that no stream is
   risking data starvation. This guarantees that at any given time there are
   at least N bytes queued and available for each individual stream.

   If an EOS event comes through a src pad, the associated queue should be
   considered as 'not-empty' in the queue-size-growing algorithm.

 * Non-linked srcpads graceful handling.

   A GstTask is started for all srcpads when going to GST_STATE_PAUSED.

   The tasks block on a GCond which is signalled in two different cases:

   _ When the associated queue has received a buffer.

   _ When the associated queue was previously declared as 'not-linked' and the
     first buffer of the queue is scheduled to be pushed synchronously in
     relation to the order in which it arrived globally in the element (see
     'Synchronous data pushing' below).

   When woken up by the GCond, the GstTask will try to push the next
   GstBuffer/GstEvent on the queue. If pushing the GstBuffer/GstEvent returns
   GST_FLOW_NOT_LINKED, then the associated queue is marked as 'not-linked'.
   If pushing the GstBuffer/GstEvent succeeds, the queue will no longer be
   marked as 'not-linked'.

   If pushing on all srcpads returns a GstFlowReturn different from
   GST_FLOW_OK, then all the srcpads' tasks are stopped and subsequent pushes
   on sinkpads will return GST_FLOW_NOT_LINKED.

 * Synchronous data pushing for non-linked pads.

   In order to better support dynamic switching between streams, the
   multiqueue (unlike the current GStreamer queue) continues to push buffers
   on non-linked pads rather than shutting down.

   In addition, to prevent a non-linked stream from very quickly consuming all
   available buffers and thus 'racing ahead' of the other streams, the element
   must ensure that buffers and inlined events for a non-linked stream are
   pushed in the same order as they were received, relative to the other
   streams controlled by the element. This means that a buffer cannot be
   pushed to a non-linked pad any sooner than buffers in any other stream
   which were received before it (a small sketch of this ordering rule
   follows below).
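
A small sketch of that ordering rule (not the actual multiqueue code; the
idea is simply that every item gets a global serial number on entry, and a
not-linked stream may only push its head item once no other non-empty stream
still holds an older item):

  #include <glib.h>

  typedef struct {
    guint32 head_serial;  /* serial of the oldest queued item in this stream */
    gboolean empty;       /* TRUE if this stream's queue is currently empty */
  } SketchStream;

  /* May the not-linked stream at 'idx' push its head item now? */
  static gboolean
  may_push_not_linked (const SketchStream * streams, guint n_streams, guint idx)
  {
    guint i;

    for (i = 0; i < n_streams; i++) {
      if (i == idx || streams[i].empty)
        continue;
      /* an older item is still pending on another stream: wait */
      if (streams[i].head_serial < streams[idx].head_serial)
        return FALSE;
    }
    return TRUE;
  }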

=====================================
 Parsers, decoders and auto-plugging
=====================================

This section has DRAFT status.

Some media formats come in different "flavours" or "stream formats". These
formats differ in the way the setup data and media data is signalled and/or
packaged. An example for this is H.264 video, where there is a bytestream
format (with codec setup data signalled inline and units prefixed by a sync
code and packet length information) and a "raw" format where codec setup
data is signalled out of band (via the caps) and the chunking is implicit
in the way the buffers were muxed into a container, to mention just two of
the possible variants.

Especially on embedded platforms it is common that decoders can only
handle one particular stream format, and not all of them.

Where there are multiple stream formats, parsers are usually expected
to be able to convert between the different formats. This will, if
implemented correctly, work as expected in a static pipeline such as

  ... ! parser ! decoder ! sink

where the parser can query the decoder's capabilities even before
processing the first piece of data, and configure itself to convert
accordingly, if conversion is needed at all.

In an auto-plugging context this is not so straightforward though,
because elements are plugged incrementally, and not before the previous
element has processed some data and decided what it will output exactly
(unless the template caps are completely fixed, in which case it can continue
right away; this is not always the case here though, see below). A
parser will thus have to decide on *some* output format so auto-plugging
can continue. It doesn't know anything about the available decoders and
their capabilities though, so it's possible that it will choose a format
that is not supported by any of the available decoders, or by the preferred
decoder.

If the parser had sufficiently concise but fixed source pad template caps,
decodebin could continue to plug a decoder right away, allowing the
parser to configure itself in the same way as it would with a static
pipeline. This is not an option, unfortunately, because often the
parser needs to process some data to determine e.g. the format's profile or
other stream properties (resolution, sample rate, channel configuration, etc.),
and there may be different decoders for different profiles (e.g. a DSP codec
for baseline profile and a software fallback for main/high profile; or a DSP
codec only supporting certain resolutions, with a software fallback for
unusual resolutions). So if decodebin just plugged the highest-ranking
decoder, that decoder might not be able to handle the actual stream later
on, which would yield an error (this is a data flow error then, which would
be hard to intercept and avoid in decodebin). In other words, we can't solve
this issue by plugging a decoder right away with the parser.

So decodebin needs to communicate to the parser the set of available decoder
caps (which would contain the relevant capabilities/restrictions such as
supported profiles, resolutions, etc.), after the usual "autoplug-*" signal
filtering/sorting of course.

This is done by plugging a capsfilter element right after the parser, and
constructing a set of filter caps from the list of available decoders (one
appends at the end just the name(s) of the caps structures from the parser
pad template caps to function as an 'ANY other' caps equivalent). This lets
the parser negotiate to a supported stream format in the same way as with
the static pipeline mentioned above, but of course incurs some overhead
through the additional capsfilter element.

@@ -1,571 +0,0 @@

Encoding and Muxing
-------------------

Summary
-------

A. Problems
B. Goals
1. EncodeBin
2. Encoding Profile System
3. Helper Library for Profiles
I. Use-cases researched


A. Problems this proposal attempts to solve
-------------------------------------------

* Duplication of pipeline code for GStreamer-based applications
  wishing to encode and/or mux streams, leading to subtle differences
  and inconsistencies across those applications.

* No unified system for describing encoding targets for applications
  in a user-friendly way.

* No unified system for creating encoding targets for applications,
  resulting in duplication of code across all applications,
  differences and inconsistencies that come with that duplication,
  and applications hardcoding element names and settings, resulting in
  poor portability.



B. Goals
--------

1. Convenience encoding element

   Create a convenience GstBin for encoding and muxing several streams,
   hereafter called 'EncodeBin'.

   This element will contain a single property, which is a profile.

2. Define an encoding profile system

3. Encoding profile helper library

   Create a helper library to:
   * create EncodeBin instances based on profiles, and
   * help applications to create/load/save/browse those profiles.




1. EncodeBin
------------

1.1 Proposed API
----------------

EncodeBin is a GstBin subclass.

It implements the GstTagSetter interface, by which it will proxy the
calls to the muxer.

It has only two introspectable properties (i.e. usable without extra API):
 * A GstEncodingProfile *
 * The name of the profile to use

When a profile is selected, encodebin will:
 * Add REQUEST sink pads for all the GstStreamProfiles
 * Create the muxer and expose the source pad

Whenever a request pad is created, encodebin will:
 * Create the chain of elements for that pad
 * Ghost the sink pad
 * Return that ghost pad

This allows reducing the code to the minimum for applications
wishing to encode a source for a given profile:

  ...

  encbin = gst_element_factory_make ("encodebin", NULL);
  g_object_set (encbin, "profile", "N900/H264 HQ", NULL);
  gst_element_link (encbin, filesink);

  ...

  vsrcpad = gst_element_get_static_pad (source, "src1");
  vsinkpad = gst_element_get_request_pad (encbin, "video_%u");
  gst_pad_link (vsrcpad, vsinkpad);

  ...


1.2 Explanation of the various stages in EncodeBin
--------------------------------------------------

This describes the various stages which can happen in order to end
up with a multiplexed stream that can then be stored or streamed.

1.2.1 Incoming streams

The streams fed to EncodeBin can be of various types:

 * Video
   * Uncompressed (but maybe subsampled)
   * Compressed
 * Audio
   * Uncompressed (audio/x-raw)
   * Compressed
 * Timed text
 * Private streams


1.2.2 Steps involved for raw video encoding

(0) Incoming stream

(1) Transform raw video feed (optional)

    Here we modify the various fundamental properties of a raw video
    stream to be compatible with the intersection of:
    * The encoder GstCaps and
    * The specified "Stream Restriction" of the profile/target

    The fundamental properties that can be modified are:
    * width/height
      This is done with a video scaler.
      The DAR (Display Aspect Ratio) MUST be respected.
      If needed, black borders can be added to comply with the target DAR.
    * framerate
    * format/colorspace/depth
      All of this is done with a colorspace converter.

(2) Actual encoding (optional for raw streams)

    An encoder (with some optional settings) is used.

(3) Muxing

    A muxer (with some optional settings) is used.

(4) Outgoing encoded and muxed stream


1.2.3 Steps involved for raw audio encoding

This is roughly the same as for raw video, except for (1).

(1) Transform raw audio feed (optional)

    We modify the various fundamental properties of a raw audio stream to
    be compatible with the intersection of:
    * The encoder GstCaps and
    * The specified "Stream Restriction" of the profile/target

    The fundamental properties that can be modified are:
    * Number of channels
    * Type of raw audio (integer or floating point)
    * Depth (number of bits required to encode one sample)


1.2.4 Steps involved for encoded audio/video streams

Steps (1) and (2) are replaced by a parser if a parser is available
for the given format.


1.2.5 Steps involved for other streams

Other streams will just be forwarded as-is to the muxer, provided the
muxer accepts the stream type.


2. Encoding Profile System
--------------------------

This work is based on:
 * The existing GstPreset system for elements [0]
 * The gnome-media GConf audio profile system [1]
 * The investigation done into device profiles by Arista and
   Transmageddon [2 and 3]

2.2 Terminology
---------------

* Encoding Target Category
  A Target Category is a classification of devices/systems/use-cases
  for encoding.

  Such a classification is required in order to:
  * Allow applications with a very specific use-case to limit the number of
    profiles they can offer the user. A screencasting application has
    no use for the online-services targets, for example.
  * Offer the user some initial classification in the case of a
    more generic encoding application (like a video editor or a
    transcoder).

  Ex:
    Consumer devices
    Online service
    Intermediate Editing Format
    Screencast
    Capture
    Computer

* Encoding Profile Target
  A Profile Target describes a specific entity for which we wish to
  encode.
  A Profile Target must belong to at least one Target Category.
  It will define at least one Encoding Profile.

  Ex (with category):
    Nokia N900 (Consumer device)
    Sony PlayStation 3 (Consumer device)
    Youtube (Online service)
    DNxHD (Intermediate editing format)
    HuffYUV (Screencast)
    Theora (Computer)

* Encoding Profile
  A specific combination of muxer, encoders, presets and limitations.

  Ex:
    Nokia N900/H264 HQ
    Ipod/High Quality
    DVD/Pal
    Youtube/High Quality
    HTML5/Low Bandwidth
    DNxHD

2.3 Encoding Profile
--------------------

An encoding profile requires the following information:

 * Name
   This string is not translatable and must be unique.
   A recommendation to guarantee uniqueness of the naming could be:
   <target>/<name>
 * Description
   This is a translatable string describing the profile.
 * Muxing format
   This is a string containing the GStreamer media-type of the
   container format.
 * Muxing preset
   This is an optional string describing the preset(s) to use on the
   muxer.
 * Multipass setting
   This is a boolean describing whether the profile requires several
   passes.
 * List of Stream Profiles

2.3.1 Stream Profiles

A Stream Profile consists of:

 * Type
   The type of stream profile (audio, video, text, private-data)
 * Encoding Format
   This is a string containing the GStreamer media-type of the encoding
   format to be used. If encoding is not to be applied, the raw media
   type will be used.
 * Encoding preset
   This is an optional string describing the preset(s) to use on the
   encoder.
 * Restriction
   This is an optional GstCaps containing the restriction of the
   stream that can be fed to the encoder.
   This will generally contain restrictions on video
   width/height/framerate or audio depth.
 * presence
   This is an integer specifying how many streams can be used in the
   containing profile. 0 means that any number of streams can be
   used.
 * pass
   This is an integer which is only meaningful if the multipass flag
   has been set in the profile. If it has been set, it indicates which
   pass this Stream Profile corresponds to.

2.4 Example profile
-------------------

The representation used here is XML only as an example. No decision is
made as to which format to use for storing targets and profiles.

  <gst-encoding-target>
    <name>Nokia N900</name>
    <category>Consumer Device</category>
    <profiles>
      <profile>Nokia N900/H264 HQ</profile>
      <profile>Nokia N900/MP3</profile>
      <profile>Nokia N900/AAC</profile>
    </profiles>
  </gst-encoding-target>

  <gst-encoding-profile>
    <name>Nokia N900/H264 HQ</name>
    <description>
      High Quality H264/AAC for the Nokia N900
    </description>
    <format>video/quicktime,variant=iso</format>
    <streams>
      <stream-profile>
        <type>audio</type>
        <format>audio/mpeg,mpegversion=4</format>
        <preset>Quality High/Main</preset>
        <restriction>audio/x-raw,channels=[1,2]</restriction>
        <presence>1</presence>
      </stream-profile>
      <stream-profile>
        <type>video</type>
        <format>video/x-h264</format>
        <preset>Profile Baseline/Quality High</preset>
        <restriction>
          video/x-raw,width=[16,800],height=[16,480],framerate=[1/1,30000/1001]
        </restriction>
        <presence>1</presence>
      </stream-profile>
    </streams>
  </gst-encoding-profile>
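
For illustration only, the <restriction> strings above are ordinary GStreamer
caps descriptions; in code they could be constructed like this (not part of
the proposed API):

  #include <gst/gst.h>

  /* Build the example restrictions from section 2.4 as GstCaps. */
  static void
  sketch_make_restrictions (GstCaps ** audio_restriction,
      GstCaps ** video_restriction)
  {
    *audio_restriction = gst_caps_from_string ("audio/x-raw,channels=[1,2]");
    *video_restriction = gst_caps_from_string ("video/x-raw,width=[16,800],"
        "height=[16,480],framerate=[1/1,30000/1001]");
  }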

2.5 API
-------
A proposed C API is contained in the gstprofile.h file in this directory.


2.6 Modifications required in the existing GstPreset system
-----------------------------------------------------------

2.6.1 Temporary presets

Currently a preset needs to be saved on disk in order to be used.

This makes it impossible to have temporary presets (that exist only
during the lifetime of a process), which might be required in the
newly proposed profile system.

2.6.2 Categorisation of presets

Currently presets are just aliases for a group of property/value pairs,
without any meaning or explanation as to how they exclude each other.

Take for example the H264 encoder. It can have presets for:
 * passes (1, 2 or 3 passes)
 * profiles (Baseline, Main, ...)
 * quality (Low, Medium, High)

In order to programmatically know which presets exclude each other,
we here propose the categorisation of these presets.

This can be done in one of two ways:
 1. in the name (by making the name be [<category>:]<name>)
    This would give for example: "Quality:High", "Profile:Baseline"
 2. by adding a new _meta key
    This would give for example: _meta/category:quality

2.6.3 Aggregation of presets

There can be more than one preset choice to be made for an
element (quality, profile, pass).

This means that one cannot currently describe the full configuration
of an element with a single string; several are needed.

The proposal here is to extend the GstPreset API to be able to set
all presets using one string and a well-known separator ('/').

This change only requires changes in the core preset handling code.

This would allow doing the following:

  gst_preset_load_preset (h264enc,
      "pass:1/profile:baseline/quality:high");

2.7 Points to be determined
---------------------------

This document hasn't yet determined how to solve the following
problems:

2.7.1 Storage of profiles

One proposal for storage would be to use a system-wide directory
(like $prefix/share/gstreamer-0.10/profiles) and store XML files for
every individual profile.

Users could then add their own profiles in ~/.gstreamer-0.10/profiles.

This poses some limitations as to what to do if some applications
want to have some profiles limited to their own usage.


3. Helper library for profiles
------------------------------

These helper methods could also be added to existing libraries (like
GstPreset, GstPbUtils, ...).

The various APIs proposed are in the accompanying gstprofile.h file.

3.1 Getting user-readable names for formats

This is already provided by GstPbUtils.

3.2 Hierarchy of profiles

The goal is for applications to be able to present to the user a list
of combo-boxes for choosing their output profile:

  [ Category ]        # optional, depends on the application
  [ Device/Site/... ] # optional, depends on the application
  [ Profile ]

Convenience methods are offered to easily get lists of categories,
devices, and profiles.

3.3 Creating Profiles

The goal is for applications to be able to easily create profiles.

Applications need a fast/efficient way to:
 * select a container format and see all compatible streams they can use
   with it.
 * select a codec format and see which container formats they can use
   with it.

The remaining parts concern the restrictions on encoder input.

3.4 Ensuring availability of plugins for Profiles

When an application wishes to use a Profile, it should be able to
query whether it has all the needed plugins to use it.

This part will use GstPbUtils to query, and if needed install, the
missing plugins through the installed distribution plugin installer.


I. Use-cases researched
-----------------------

This is a list of various use-cases where encoding/muxing is being
used.

* Transcoding

  The goal is to convert any input file for a target use with as little
  loss of quality as possible.
  A specific variant of this is transmuxing (see below).

  Example applications: Arista, Transmageddon

* Rendering timelines

  The incoming streams are a collection of various segments that need
  to be rendered.
  Those segments can vary in nature (i.e. the video width/height can
  change).
  This requires the use of identity with the single-segment property
  activated to transform the incoming collection of segments into a
  single continuous segment.

  Example applications: PiTiVi, Jokosher

* Encoding of live sources

  The major risk to take into account is the encoder not encoding the
  incoming stream fast enough. This is outside of the scope of
  encodebin, and should be solved by using queues between the sources
  and encodebin, as well as by implementing QoS in encoders and sources
  (the encoders emitting QoS events, and the upstream elements
  adapting themselves accordingly).

  Example applications: camerabin, cheese

* Screencasting applications

  This is similar to encoding of live sources.
  The difference is that, due to the nature of the source (size and
  amount/frequency of updates), one might want to do the encoding in
  two parts:
  * The actual live capture is encoded with an 'almost-lossless' codec
    (such as huffyuv)
  * Once the capture is done, the file created in the first step is
    then rendered to the desired target format.

  Fixing sources to only emit region updates and having encoders
  capable of encoding those streams would remove the need for the first
  step, but is outside of the scope of encodebin.

  Example applications: Istanbul, gnome-shell, recordmydesktop

* Live transcoding

  This is the case of an incoming live stream which will be
  broadcast/transmitted live.
  One issue to take into account is reducing the encoding latency to
  a minimum. This should mostly be done by picking low-latency
  encoders.

  Example applications: Rygel, Coherence

* Transmuxing

  Given a certain file, the aim is to remux the contents WITHOUT
  decoding into either a different container format or the same
  container format.
  Remuxing into the same container format is useful when the file was
  not created properly (for example, the index is missing).
  Whenever available, parsers should be applied on the encoded streams
  to validate and/or fix the streams before muxing them.

  Metadata from the original file must be kept in the newly created
  file.

  Example applications: Arista, Transmageddon

* Loss-less cutting

  Given a certain file, the aim is to extract a certain part of the
  file without going through the process of decoding and re-encoding
  that file.
  This is similar to the transmuxing use-case.

  Example applications: PiTiVi, Transmageddon, Arista, ...

* Multi-pass encoding

  Some encoders allow doing multi-pass encoding.
  The initial pass(es) are only used to collect encoding estimates and
  are not actually muxed and output.
  The final pass uses the previously collected information, and the
  output is then muxed and output.

* Archiving and intermediary format

  The requirement is to have lossless

* CD ripping

  Example applications: Sound-juicer

* DVD ripping

  Example application: Thoggen



* Research links

Some of these are still active documents, some are not.

[0] GstPreset API documentation
    http://gstreamer.freedesktop.org/data/doc/gstreamer/head/gstreamer/html/GstPreset.html

[1] gnome-media GConf profiles
    http://www.gnome.org/~bmsmith/gconf-docs/C/gnome-media.html

[2] Research on a Device Profile API
    http://gstreamer.freedesktop.org/wiki/DeviceProfile

[3] Research on defining presets usage
    http://gstreamer.freedesktop.org/wiki/PresetDesign

@@ -1,204 +0,0 @@

Orc Integration
===============

Sections
--------

- About Orc
- Fast memcpy()
- Normal Usage
- Build Process
- Testing
- Orc Limitations


About Orc
---------

Orc code can be in one of two forms: .orc files that are converted
by orcc to C code that calls liborc functions, or C code that calls
liborc to create complex operations at runtime. The former is mostly
for functions with predetermined functionality. The latter is for
functionality that is determined at runtime, where writing .orc
functions for all combinations would be prohibitive. Orc also has
a fast memcpy and memset which are useful independently.


Fast memcpy()
-------------

*** This part is not integrated yet. ***

Orc has built-in functions orc_memcpy() and orc_memset() that work
like memcpy() and memset(). These are meant for large copies only.
A reasonable cutoff for using orc_memcpy() instead of memcpy() is
if the number of bytes is generally greater than 100. DO NOT use
orc_memcpy() if the typical size is less than 20 bytes, especially
if the size is known at compile time, as these cases are inlined by
the compiler.

(Example: sys/ximage/ximagesink.c)

Add $(ORC_CFLAGS) to libgstximagesink_la_CFLAGS and $(ORC_LIBS) to
libgstximagesink_la_LIBADD. Then, in the source file, add:

  #ifdef HAVE_ORC
  #include <orc/orc.h>
  #else
  #define orc_memcpy(a,b,c) memcpy(a,b,c)
  #endif

Then switch relevant uses of memcpy() to orc_memcpy().

The above example works whether or not Orc is enabled at compile
time.


Normal Usage
------------

The following lines are added near the top of Makefile.am for plugins
that use Orc code in .orc files (this is for the volume plugin):

  ORC_BASE=volume
  include $(top_srcdir)/common/orc.mk

Also add the generated source file to the plugin build:

  nodist_libgstvolume_la_SOURCES = $(ORC_SOURCES)

And of course, add $(ORC_CFLAGS) to libgstvolume_la_CFLAGS, and
$(ORC_LIBS) to libgstvolume_la_LIBADD.

The value assigned to ORC_BASE does not need to be related to
the name of the plugin.


Advanced Usage
--------------

The Holy Grail of Orc usage is to programmatically generate Orc code
at runtime, have liborc compile it into binary code at runtime, and
then execute this code. Currently, the best example of this is in
Schroedinger. An example of how this would be used is audioconvert:
given an input format, channel position manipulation, dithering and
quantizing configuration, and output format, an Orc code generator
would create an OrcProgram, add the appropriate instructions to do
each step based on the configuration, and then compile the program.
Successfully compiling the program would return a function pointer
that can be called to perform the operation.

This sort of advanced usage requires structural changes to current
plugins (e.g., audioconvert) and will probably be developed
incrementally. Moreover, if such code is intended to be used without
Orc as a strict build/runtime requirement, two codepaths would need to
be developed and tested. For this reason, until GStreamer requires
Orc, I think it's a good idea to restrict such advanced usage to the
cog plugin in -bad, which requires Orc.
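
As a rough sketch of what such runtime construction looks like with liborc
(treat the exact calls and the "addssw" opcode as assumptions to be verified
against the Orc documentation; error handling is elided):

  #include <orc/orc.h>

  /* Build a tiny program that saturate-adds two arrays of 16-bit samples. */
  static OrcProgram *
  sketch_build_s16_add (void)
  {
    OrcProgram *p;
    OrcCompileResult result;

    orc_init ();

    p = orc_program_new ();
    orc_program_add_destination (p, 2, "d1");   /* 2-byte (s16) destination */
    orc_program_add_source (p, 2, "s1");        /* 2-byte (s16) source */
    orc_program_append_str (p, "addssw", "d1", "d1", "s1");

    result = orc_program_compile (p);
    /* check 'result' for success before using the program; an OrcExecutor
     * is then set up with the arrays and run over n samples */

    return p;
  }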

Build Process
-------------

The goal of the build process is to make Orc non-essential for most
developers and users. This is not to say you shouldn't have Orc
installed -- without it, you will get slow backup C code -- just that
people compiling GStreamer are not forced to switch from Liboil to
Orc immediately.

With Orc installed, the build process will use the Orc Compiler (orcc)
to convert each .orc file into a temporary C source (tmp-orc.c) and a
temporary header file (${base}orc.h if constructed from ${base}.orc).
The C source file is compiled and linked to the plugin, and the header
file is included by other source files in the plugin.

If 'make orc-update' is run in the source directory, the files
tmp-orc.c and ${base}orc.h are copied to ${base}orc-dist.c and
${base}orc-dist.h respectively. The -dist.[ch] files are automatically
disted via orc.mk. The -dist.[ch] files should be checked in to
git whenever the .orc source is changed and checked in. Example
workflow:

  edit .orc file
  ... make, test, etc.
  make orc-update
  git add volume.orc volumeorc-dist.c volumeorc-dist.h
  git commit

At 'make dist' time, all of the .orc files are compiled, then
copied to their -dist.[ch] counterparts, and then the -dist.[ch]
files are added to the dist directory.

Without Orc installed (or with --disable-orc given to configure), the
-dist.[ch] files are copied to tmp-orc.c and ${base}orc.h. When
compiled with Orc disabled, DISABLE_ORC is defined in config.h, and
the C backup code is compiled. This backup code is pure C, and
does not include orc headers or require linking against liborc.

The common/orc.mk build method is limited by the inflexibility of
automake. The file tmp-orc.c must be a fixed filename; using ORC_NAME
to generate the filename does not work because it conflicts with
automake's dependency generation. Building multiple .orc files
is not possible due to this restriction.


Testing
-------

If you create another .orc file, please add it to
tests/orc/Makefile.am. This causes automatic test code to be
generated and run during 'make check'. Each function in the .orc
file is tested by comparing the results of executing the run-time
compiled code and the C backup function.


Orc Limitations
---------------

audioconvert

  Orc doesn't have a mechanism for generating random numbers, which
  prevents its use as-is for dithering. One way around this is to
  generate suitable dithering values in one pass, then use those
  values in a second Orc-based pass.

  Orc doesn't handle 64-bit float, for no good reason.

  Irrespective of Orc handling 64-bit float, it would be useful to
  have a direct 32-bit float to 16-bit integer conversion.

  audioconvert is a good candidate for programmatically generated
  Orc code.

  audioconvert enumerates functions in terms of big-endian vs.
  little-endian. Orc's functions are "native" and "swapped".
  Programmatically generating code removes the need to worry about
  this.

  Orc doesn't handle 24-bit samples. Fixing this is not a priority
  (for ds).

videoscale

  Orc doesn't handle horizontal resampling yet. The plan is to add
  special sampling opcodes, for nearest, bilinear, and cubic
  interpolation.

videotestsrc

  Lots of code in videotestsrc needs to be rewritten to be SIMD
  (and Orc) friendly, e.g., stuff that uses oil_splat_u8().

  A fast low-quality random number generator in Orc would be useful
  here.

volume

  Many of the comments on audioconvert apply here as well.

  There are a bunch of FIXMEs in here that are due to misapplied
  patches.

@@ -1,91 +0,0 @@

Forcing keyframes
-----------------

Consider the following use case:

 We have a pipeline that performs video and audio capture from a live source,
 compresses and muxes the streams and writes the resulting data into a file.

 Inside the uncompressed video data we have a specific pattern inserted at
 specific moments that should trigger a switch to a new file, meaning we close
 the existing file we are writing to and start writing to a new file.

 We want the new file to start with a keyframe so that one can start decoding
 the file immediately.

Components:

1) We need an element that is able to detect the pattern in the video stream.

2) We need to inform the video encoder that it should start encoding a keyframe
   starting from exactly the frame with the pattern.

3) We need to inform the muxer that it should flush out any pending data and
   start creating the start of a new file with the keyframe as the first video
   frame.

4) We need to inform the sink element that it should start writing to the next
   file. This requires application interaction to instruct the sink of the new
   filename. The application should also be free to ignore the boundary and
   continue to write to the existing file. The application will typically use
   an event pad probe to detect the custom event.

Implementation:

The implementation would consist of generating a GST_EVENT_CUSTOM_DOWNSTREAM
event that marks the keyframe boundary. This event is inserted into the
pipeline by the application upon a certain trigger. In the above use case this
trigger would be given by the element that detects the pattern, in the form of
an element message.

The custom event would travel further downstream to instruct encoder, muxer and
sink about the possible switch.

The information passed in the event consists of (a construction sketch follows
the list below):

   name: GstForceKeyUnit

   (G_TYPE_UINT64)"timestamp"    : the timestamp of the buffer that
                                   triggered the event.
   (G_TYPE_UINT64)"stream-time"  : the stream position that triggered the
                                   event.
   (G_TYPE_UINT64)"running-time" : the running time of the stream when the
                                   event was triggered.
   (G_TYPE_BOOLEAN)"all-headers" : send all headers, including those in
                                   the caps or those sent at the start of
                                   the stream.

   ....                          : optional other data fields.
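
A sketch of how such an event could be constructed and injected (field names
follow this document; GStreamer later gained dedicated force-key-unit helper
API, so treat this as an illustration rather than the canonical way):

  #include <gst/gst.h>

  static GstEvent *
  sketch_force_key_unit_event (GstClockTime timestamp,
      GstClockTime stream_time, GstClockTime running_time,
      gboolean all_headers)
  {
    GstStructure *s;

    s = gst_structure_new ("GstForceKeyUnit",
        "timestamp", G_TYPE_UINT64, timestamp,
        "stream-time", G_TYPE_UINT64, stream_time,
        "running-time", G_TYPE_UINT64, running_time,
        "all-headers", G_TYPE_BOOLEAN, all_headers, NULL);

    return gst_event_new_custom (GST_EVENT_CUSTOM_DOWNSTREAM, s);
  }

  /* The application (or the pattern-detecting element) would then inject it,
   * e.g.: gst_pad_send_event (encoder_sinkpad,
   *           sketch_force_key_unit_event (ts, st, rt, TRUE)); */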

Note that this event is purely informational: no element is required to
perform an action, but it should forward the event downstream, just like any
other event it does not handle.

Elements understanding the event should behave as follows:

1) The video encoder receives the event before the next frame. Upon reception
   of the event it schedules the next frame to be encoded as a keyframe.
   Before pushing out the encoded keyframe it must push the GstForceKeyUnit
   event downstream.

2) The muxer receives the GstForceKeyUnit event and flushes out its current
   state, preparing to produce data that can be used as a keyunit. Before
   pushing out the new data it pushes the GstForceKeyUnit event downstream.

3) The application receives the GstForceKeyUnit event on a pad probe on the
   sink and reconfigures the sink to make it perform new actions after
   receiving the next buffer (see the probe sketch below).
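
A sketch of the application-side probe from step 3, using the GStreamer 1.0
probe API (the original draft predates that API, so the exact mechanism may
differ in older code):

  #include <gst/gst.h>

  static GstPadProbeReturn
  sink_event_probe (GstPad * pad, GstPadProbeInfo * info, gpointer user_data)
  {
    GstEvent *event = GST_PAD_PROBE_INFO_EVENT (info);

    if (GST_EVENT_TYPE (event) == GST_EVENT_CUSTOM_DOWNSTREAM &&
        gst_event_has_name (event, "GstForceKeyUnit")) {
      /* reconfigure the sink here, e.g. schedule a new file name to be
       * used for the data that follows */
    }
    return GST_PAD_PROBE_OK;
  }

  /* gst_pad_add_probe (sink_sinkpad, GST_PAD_PROBE_TYPE_EVENT_DOWNSTREAM,
   *     sink_event_probe, NULL, NULL); */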

Upstream
--------

When using RTP, packets can get lost or receivers can be added at any time;
they may request a new key frame.

A downstream element sends an upstream "GstForceKeyUnit" event up the
pipeline.

When an element produces some kind of key unit in output, but has
no such concept in its input (like an encoder that takes raw frames),
it consumes the event (doesn't pass it upstream), and instead sends
a downstream GstForceKeyUnit event and a new keyframe.
@ -1,546 +0,0 @@
|
|||
===============================================================
|
||||
Subtitle overlays, hardware-accelerated decoding and playbin
|
||||
===============================================================
|
||||
|
||||
Status: EARLY DRAFT / BRAINSTORMING
|
||||
|
||||
=== 1. Background ===
|
||||
|
||||
Subtitles can be muxed in containers or come from an external source.
|
||||
|
||||
Subtitles come in many shapes and colours. Usually they are either
|
||||
text-based (incl. 'pango markup'), or bitmap-based (e.g. DVD subtitles
|
||||
and the most common form of DVB subs). Bitmap based subtitles are
|
||||
usually compressed in some way, like some form of run-length encoding.
|
||||
|
||||
Subtitles are currently decoded and rendered in subtitle-format-specific
|
||||
overlay elements. These elements have two sink pads (one for raw video
|
||||
and one for the subtitle format in question) and one raw video source pad.
|
||||
|
||||
They will take care of synchronising the two input streams, and of
|
||||
decoding and rendering the subtitles on top of the raw video stream.
|
||||
|
||||
Digression: one could theoretically have dedicated decoder/render elements
|
||||
that output an AYUV or ARGB image, and then let a videomixer element do
|
||||
the actual overlaying, but this is not very efficient, because it requires
|
||||
us to allocate and blend whole pictures (1920x1080 AYUV = 8MB,
|
||||
1280x720 AYUV = 3.6MB, 720x576 AYUV = 1.6MB) even if the overlay region
|
||||
is only a small rectangle at the bottom. This wastes memory and CPU.
|
||||
We could do something better by introducing a new format that only
|
||||
encodes the region(s) of interest, but we don't have such a format yet, and
|
||||
are not necessarily keen to rewrite this part of the logic in playbin
|
||||
at this point - and we can't change existing elements' behaviour, so would
|
||||
need to introduce new elements for this.
|
||||
|
||||
Playbin2 supports outputting compressed formats, i.e. it does not
|
||||
force decoding to a raw format, but is happy to output to a non-raw
|
||||
format as long as the sink supports that as well.
|
||||
|
||||
In case of certain hardware-accelerated decoding APIs, we will make use
|
||||
of that functionality. However, the decoder will not output a raw video
|
||||
format then, but some kind of hardware/API-specific format (in the caps)
|
||||
and the buffers will reference hardware/API-specific objects that
|
||||
the hardware/API-specific sink will know how to handle.
|
||||
|
||||
|
||||
=== 2. The Problem ===
|
||||
|
||||
In the case of such hardware-accelerated decoding, the decoder will not
|
||||
output raw pixels that can easily be manipulated. Instead, it will
|
||||
output hardware/API-specific objects that can later be used to render
|
||||
a frame using the same API.
|
||||
|
||||
Even if we could transform such a buffer into raw pixels, we most
|
||||
likely would want to avoid that, in order to avoid the need to
|
||||
map the data back into system memory (and then later back to the GPU).
|
||||
It's much better to upload the much smaller encoded data to the GPU/DSP
|
||||
and then leave it there until rendered.
|
||||
|
||||
Currently playbin only supports subtitles on top of raw decoded video.
|
||||
It will try to find a suitable overlay element from the plugin registry
|
||||
based on the input subtitle caps and the rank. (It is assumed that we
|
||||
will be able to convert any raw video format into any format required
|
||||
by the overlay using a converter such as videoconvert.)
|
||||
|
||||
It will not render subtitles if the video sent to the sink is not
|
||||
raw YUV or RGB or if conversions have been disabled by setting the
|
||||
native-video flag on playbin.
|
||||
|
||||
Subtitle rendering is considered an important feature. Enabling
|
||||
hardware-accelerated decoding by default should not lead to a major
|
||||
feature regression in this area.
|
||||
|
||||
This means that we need to support subtitle rendering on top of
|
||||
non-raw video.
|
||||
|
||||
|
||||
=== 3. Possible Solutions ===
|
||||
|
||||
The goal is to keep knowledge of the subtitle format within the
|
||||
format-specific GStreamer plugins, and knowledge of any specific
|
||||
video acceleration API to the GStreamer plugins implementing
|
||||
that API. We do not want to make the pango/dvbsuboverlay/dvdspu/kate
|
||||
plugins link to libva/libvdpau/etc. and we do not want to make
|
||||
the vaapi/vdpau plugins link to all of libpango/libkate/libass etc.
|
||||
|
||||
|
||||
Multiple possible solutions come to mind:
|
||||
|
||||
(a) backend-specific overlay elements
|
||||
|
||||
e.g. vaapitextoverlay, vdpautextoverlay, vaapidvdspu, vdpaudvdspu,
|
||||
vaapidvbsuboverlay, vdpaudvbsuboverlay, etc.
|
||||
|
||||
This assumes the overlay can be done directly on the backend-specific
|
||||
object passed around.
|
||||
|
||||
The main drawback with this solution is that it leads to a lot of
|
||||
code duplication and may also lead to uncertainty about distributing
|
||||
certain duplicated pieces of code. The code duplication is pretty
|
||||
much unavoidable, since making textoverlay, dvbsuboverlay, dvdspu,
|
||||
kate, assrender, etc. available in form of base classes to derive
|
||||
from is not really an option. Similarly, one would not really want
|
||||
the vaapi/vdpau plugin to depend on a bunch of other libraries
|
||||
such as libpango, libkate, libtiger, libass, etc.
|
||||
|
||||
One could add some new kind of overlay plugin feature though in
|
||||
combination with a generic base class of some sort, but in order
|
||||
to accommodate all the different cases and formats one would end
|
||||
up with quite convoluted/tricky API.
|
||||
|
||||
(Of course there could also be a GstFancyVideoBuffer that provides
|
||||
an abstraction for such video accelerated objects and that could
|
||||
provide an API to add overlays to it in a generic way, but in the
|
||||
end this is just a less generic variant of (c), and it is not clear
|
||||
that there are real benefits to a specialised solution vs. a more
|
||||
generic one).
|
||||
|
||||
|
||||
(b) convert backend-specific object to raw pixels and then overlay
|
||||
|
||||
Even where possible technically, this is most likely very
|
||||
inefficient.
|
||||
|
||||
|
||||
(c) attach the overlay data to the backend-specific video frame buffers
|
||||
in a generic way and do the actual overlaying/blitting later in
|
||||
backend-specific code such as the video sink (or an accelerated
|
||||
encoder/transcoder)
|
||||
|
||||
In this case, the actual overlay rendering (i.e. the actual text
|
||||
rendering or decoding DVD/DVB data into pixels) is done in the
|
||||
subtitle-format-specific GStreamer plugin. All knowledge about
|
||||
the subtitle format is contained in the overlay plugin then,
|
||||
and all knowledge about the video backend in the video backend
|
||||
specific plugin.
|
||||
|
||||
The main question then is how to get the overlay pixels (and
|
||||
we will only deal with pixels here) from the overlay element
|
||||
to the video sink.
|
||||
|
||||
This could be done in multiple ways: One could send custom
|
||||
events downstream with the overlay data, or one could attach
|
||||
the overlay data directly to the video buffers in some way.
|
||||
|
||||
Sending inline events has the advantage that it is fairly
|
||||
transparent to any elements between the overlay element and
|
||||
the video sink: if an effects plugin creates a new video
|
||||
buffer for the output, nothing special needs to be done to
|
||||
maintain the subtitle overlay information, since the overlay
|
||||
data is not attached to the buffer. However, it slightly
|
||||
complicates things at the sink, since it would also need to
|
||||
look for the new event in question instead of just processing
|
||||
everything in its buffer render function.
|
||||
|
||||
If one attaches the overlay data to the buffer directly, any
|
||||
element between overlay and video sink that creates a new
|
||||
video buffer would need to be aware of the overlay data
|
||||
attached to it and copy it over to the newly-created buffer.
|
||||
|
||||
One would have to implement a special kind of new query
|
||||
(e.g. FEATURE query) that is not passed on automatically by
|
||||
gst_pad_query_default() in order to make sure that all elements
|
||||
downstream will handle the attached overlay data. (This is only
|
||||
a problem if we want to also attach overlay data to raw video
|
||||
pixel buffers; for new non-raw types we can just make it
|
||||
mandatory and assume support and be done with it; for existing
|
||||
non-raw types nothing changes anyway if subtitles don't work)
|
||||
(we need to maintain backwards compatibility for existing raw
|
||||
video pipelines like e.g.: ..decoder ! suboverlay ! encoder..)
|
||||
|
||||
Even though slightly more work, attaching the overlay information
|
||||
to buffers seems more intuitive than sending it interleaved as
|
||||
events. And buffers stored or passed around (e.g. via the
|
||||
"last-buffer" property in the sink when doing screenshots via
|
||||
playbin) always contain all the information needed.
|
||||
|
||||
|
||||
(d) create a video/x-raw-*-delta format and use a backend-specific videomixer
|
||||
|
||||
This possibility was hinted at already in the digression in
|
||||
section 1. It would satisfy the goal of keeping subtitle format
|
||||
knowledge in the subtitle plugins and video backend knowledge
|
||||
in the video backend plugin. It would also add a concept that
|
||||
might be generally useful (think ximagesrc capture with xdamage).
|
||||
However, it would require adding foorender variants of all the
|
||||
existing overlay elements, and changing playbin to that new
|
||||
design, which is somewhat intrusive. And given the general
|
||||
nature of such a new format/API, we would need to take a lot
|
||||
of care to be able to accommodate all possible use cases when
|
||||
designing the API, which makes it considerably more ambitious.
|
||||
Lastly, we would need to write videomixer variants for the
|
||||
various accelerated video backends as well.
|
||||
|
||||
|
||||
Overall (c) appears to be the most promising solution. It is the least
|
||||
intrusive and should be fairly straight-forward to implement with
|
||||
reasonable effort, requiring only small changes to existing elements
|
||||
and requiring no new elements.
|
||||
|
||||
Doing the final overlaying in the sink as opposed to a videomixer
|
||||
or overlay in the middle of the pipeline has other advantages:
|
||||
|
||||
- if video frames need to be dropped, e.g. for QoS reasons,
|
||||
we could also skip the actual subtitle overlaying and
|
||||
possibly the decoding/rendering as well, if the
|
||||
implementation and API allows for that to be delayed.
|
||||
|
||||
- the sink often knows the actual size of the window/surface/screen
|
||||
the output video is rendered to. This *may* make it possible to
|
||||
render the overlay image in a higher resolution than the input
|
||||
video, solving a long standing issue with pixelated subtitles on
|
||||
top of low-resolution videos that are then scaled up in the sink.
|
||||
This would of course require the rendering to be delayed instead
|
||||
of just attaching an AYUV/ARGB/RGBA blob of pixels to the video buffer
|
||||
in the overlay, but that could all be supported.
|
||||
|
||||
- if the video backend / sink has support for high-quality text
|
||||
rendering (clutter?) we could just pass the text or pango markup
|
||||
to the sink and let it do the rest (this is unlikely to be
|
||||
supported in the general case - text and glyph rendering is
|
||||
hard; also, we don't really want to make up our own text markup
|
||||
system, and pango markup is probably too limited for complex
|
||||
karaoke stuff).
|
||||
|
||||
|
||||
=== 4. API needed ===
|
||||
|
||||
(a) Representation of subtitle overlays to be rendered
|
||||
|
||||
We need to pass the overlay pixels from the overlay element to the
|
||||
sink somehow. Whatever the exact mechanism, let's assume we pass
|
||||
a refcounted GstVideoOverlayComposition struct or object.
|
||||
|
||||
A composition is made up of one or more overlays/rectangles.
|
||||
|
||||
In the simplest case an overlay rectangle is just a blob of
|
||||
RGBA/ABGR [FIXME?] or AYUV pixels with positioning info and other
|
||||
metadata, and there is only one rectangle to render.
|
||||
|
||||
We're keeping the naming generic ("OverlayFoo" rather than
|
||||
"SubtitleFoo") here, since this might also be handy for
|
||||
other use cases such as e.g. logo overlays or so. It is not
|
||||
designed for full-fledged video stream mixing though.
|
||||
|
||||
// Note: don't mind the exact implementation details, they'll be hidden
|
||||
|
||||
// FIXME: might be confusing in 0.11 though since GstXOverlay was
|
||||
// renamed to GstVideoOverlay in 0.11, but not much we can do,
|
||||
// maybe we can rename GstVideoOverlay to something better
|
||||
|
||||
struct GstVideoOverlayComposition
|
||||
{
|
||||
guint num_rectangles;
|
||||
GstVideoOverlayRectangle ** rectangles;
|
||||
|
||||
/* lowest rectangle sequence number still used by the upstream
|
||||
* overlay element. This way a renderer maintaining some kind of
|
||||
* rectangles <-> surface cache can know when to free cached
|
||||
* surfaces/rectangles. */
|
||||
guint min_seq_num_used;
|
||||
|
||||
/* sequence number for the composition (same series as rectangles) */
|
||||
guint seq_num;
|
||||
}
|
||||
|
||||
struct GstVideoOverlayRectangle
|
||||
{
|
||||
/* Position on video frame and dimension of output rectangle in
|
||||
* output frame terms (already adjusted for the PAR of the output
|
||||
* frame). x/y can be negative (overlay will be clipped then) */
|
||||
gint x, y;
|
||||
guint render_width, render_height;
|
||||
|
||||
/* Dimensions of overlay pixels */
|
||||
guint width, height, stride;
|
||||
|
||||
/* This is the PAR of the overlay pixels */
|
||||
guint par_n, par_d;
|
||||
|
||||
/* Format of pixels, GST_VIDEO_FORMAT_ARGB on big-endian systems,
|
||||
* and BGRA on little-endian systems (i.e. pixels are treated as
|
||||
* 32-bit values and alpha is always in the most-significant byte,
|
||||
* and blue is in the least-significant byte).
|
||||
*
|
||||
* FIXME: does anyone actually use AYUV in practice? (we do
|
||||
* in our utility function to blend on top of raw video)
|
||||
* What about AYUV and endianness? Do we always have [A][Y][U][V]
|
||||
* in memory? */
|
||||
/* FIXME: maybe use our own enum? */
|
||||
GstVideoFormat format;
|
||||
|
||||
/* Refcounted blob of memory, no caps or timestamps */
|
||||
GstBuffer *pixels;
|
||||
|
||||
// FIXME: how to express source like text or pango markup?
|
||||
// (just add source type enum + source buffer with data)
|
||||
//
|
||||
// FOR 0.10: always send pixel blobs, but attach source data in
|
||||
// addition (reason: if downstream changes, we can't renegotiate
|
||||
// that properly, if we just do a query of supported formats from
|
||||
// the start). Sink will just ignore pixels and use pango markup
|
||||
// from source data if it supports that.
|
||||
//
|
||||
// FOR 0.11: overlay should query formats (pango markup, pixels)
|
||||
// supported by downstream and then only send that. We can
|
||||
// renegotiate via the reconfigure event.
|
||||
//
|
||||
|
||||
/* sequence number: useful for backends/renderers/sinks that want
|
||||
* to maintain a cache of rectangles <-> surfaces. The value of
|
||||
* the min_seq_num_used in the composition tells the renderer which
|
||||
* rectangles have expired. */
|
||||
guint seq_num;
|
||||
|
||||
/* FIXME: we also need a (private) way to cache converted/scaled
|
||||
* pixel blobs */
|
||||
}
|
||||
|
||||
(a1) Overlay consumer API:
|
||||
|
||||
How would this work in a video sink that supports scaling of textures:
|
||||
|
||||
gst_foo_sink_render () {
|
||||
/* assume only one for now */
|
||||
if video_buffer has composition:
|
||||
composition = video_buffer.get_composition()
|
||||
|
||||
for each rectangle in composition:
|
||||
if rectangle.source_data_type == PANGO_MARKUP
|
||||
actor = text_from_pango_markup (rectangle.get_source_data())
|
||||
else
|
||||
pixels = rectangle.get_pixels_unscaled (FORMAT_RGBA, ...)
|
||||
actor = texture_from_rgba (pixels, ...)
|
||||
|
||||
.. position + scale on top of video surface ...
|
||||
}
|
||||
|
||||
(a2) Overlay producer API:
|
||||
|
||||
e.g. logo or subpicture overlay: got pixels, stuff into rectangle:
|
||||
|
||||
if (logoverlay->cached_composition == NULL) {
|
||||
comp = composition_new ();
|
||||
|
||||
rect = rectangle_new (format, pixels_buf,
|
||||
width, height, stride, par_n, par_d,
|
||||
x, y, render_width, render_height);
|
||||
|
||||
/* composition adds its own ref for the rectangle */
|
||||
composition_add_rectangle (comp, rect);
|
||||
rectangle_unref (rect);
|
||||
|
||||
/* buffer adds its own ref for the composition */
|
||||
video_buffer_attach_composition (comp);
|
||||
|
||||
/* we take ownership of the composition and save it for later */
|
||||
logoverlay->cached_composition = comp;
|
||||
} else {
|
||||
video_buffer_attach_composition (logoverlay->cached_composition);
|
||||
}
|
||||
|
||||
FIXME: also add some API to modify render position/dimensions of
|
||||
a rectangle (probably requires creation of new rectangle, unless
|
||||
we handle writability like with other mini objects).
|
||||
|
||||
(b) Fallback overlay rendering/blitting on top of raw video
|
||||
|
||||
Eventually we want to use this overlay mechanism not only for
|
||||
hardware-accelerated video, but also for plain old raw video,
|
||||
either at the sink or in the overlay element directly.
|
||||
|
||||
Apart from the advantages listed earlier in section 3, this
|
||||
allows us to consolidate a lot of overlaying/blitting code that
|
||||
is currently repeated in every single overlay element in one
|
||||
location. This makes it considerably easier to support a whole
|
||||
range of raw video formats out of the box, add SIMD-optimised
|
||||
rendering using ORC, or handle corner cases correctly.
|
||||
|
||||
(Note: side-effect of overlaying raw video at the video sink is
|
||||
that if e.g. a screenshotter gets the last buffer via the last-buffer
|
||||
property of basesink, it would get an image without the subtitles
|
||||
on top. This could probably be fixed by re-implementing the
|
||||
property in GstVideoSink though. Playbin2 could handle this
|
||||
internally as well).
|
||||
|
||||
void
|
||||
gst_video_overlay_composition_blend (GstVideoOverlayComposition * comp,
|
||||
GstBuffer * video_buf)
|
||||
{
|
||||
guint n;
|
||||
|
||||
g_return_if_fail (gst_buffer_is_writable (video_buf));
|
||||
g_return_if_fail (GST_BUFFER_CAPS (video_buf) != NULL);
|
||||
|
||||
... parse video_buffer caps into BlendVideoFormatInfo ...
|
||||
|
||||
for each rectangle in the composition: {
|
||||
|
||||
if (gst_video_format_is_yuv (video_buf_format)) {
|
||||
overlay_format = FORMAT_AYUV;
|
||||
} else if (gst_video_format_is_rgb (video_buf_format)) {
|
||||
overlay_format = FORMAT_ARGB;
|
||||
} else {
|
||||
/* FIXME: grayscale? */
|
||||
return;
|
||||
}
|
||||
|
||||
/* this will scale and convert AYUV<->ARGB if needed */
|
||||
pixels = rectangle_get_pixels_scaled (rectangle, overlay_format);
|
||||
|
||||
... clip output rectangle ...
|
||||
|
||||
__do_blend (video_buf_format, video_buf->data,
|
||||
overlay_format, pixels->data,
|
||||
x, y, width, height, stride);
|
||||
|
||||
gst_buffer_unref (pixels);
|
||||
}
|
||||
}
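
For illustration, an overlay element that decides to do the blending itself
on raw video could use this utility from its chain function roughly as
follows. This is only a sketch against the proposed (not final) API above;
the element, pad and field names are made up:

    static GstFlowReturn
    gst_foo_overlay_chain (GstPad * pad, GstBuffer * video_buf)
    {
      GstFooOverlay *overlay = GST_FOO_OVERLAY (GST_OBJECT_PARENT (pad));

      /* we are going to modify the video pixels, so make sure the
       * buffer is writable */
      video_buf = gst_buffer_make_writable (video_buf);

      /* blend the currently active composition (if any) on top of the
       * raw video; for non-raw video we would instead just attach the
       * composition to the buffer and let the sink render it */
      if (overlay->composition != NULL)
        gst_video_overlay_composition_blend (overlay->composition, video_buf);

      return gst_pad_push (overlay->srcpad, video_buf);
    }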
|
||||
|
||||
|
||||
(c) Flatten all rectangles in a composition
|
||||
|
||||
We cannot assume that the video backend API can handle any
|
||||
number of rectangle overlays, it's possible that it only
|
||||
supports one single overlay, in which case we need to squash
|
||||
all rectangles into one.
|
||||
|
||||
However, we'll just declare this a corner case for now, and
|
||||
implement it only if someone actually needs it. It's easy
|
||||
to add later API-wise. Might be a bit tricky if we have
|
||||
rectangles with different PARs/formats (e.g. subs and a logo),
|
||||
though we could probably always just use the code from (b)
|
||||
with a fully transparent video buffer to create a flattened
|
||||
overlay buffer.
|
||||
|
||||
(d) core API: new FEATURE query
|
||||
|
||||
For 0.10 we need to add a FEATURE query, so the overlay element
|
||||
can query whether the sink downstream and all elements between
|
||||
the overlay element and the sink support the new overlay API.
|
||||
Elements in between need to support it because the render
|
||||
positions and dimensions need to be updated if the video is
|
||||
cropped or rescaled, for example.
|
||||
|
||||
In order to ensure that all elements support the new API,
|
||||
we need to drop the query in the pad default query handler
|
||||
(so it only succeeds if all elements handle it explicitly).
|
||||
|
||||
Might want two variants of the feature query - one where
|
||||
all elements in the chain need to support it explicitly
|
||||
and one where it's enough if some element downstream
|
||||
supports it.
|
||||
|
||||
In 0.11 this could probably be handled via GstMeta and
|
||||
ALLOCATION queries (and/or we could simply require
|
||||
elements to be aware of this API from the start).
|
||||
|
||||
There appears to be no issue with downstream possibly
|
||||
not being linked yet at the time when an overlay would
|
||||
want to do such a query.
|
||||
|
||||
|
||||
Other considerations:
|
||||
|
||||
- renderers (overlays or sinks) may be able to handle only ARGB or only AYUV
|
||||
(for most graphics/hw-API it's likely ARGB of some sort, while our
|
||||
blending utility functions will likely want the same colour space as
|
||||
the underlying raw video format, which is usually YUV of some sort).
|
||||
We need to convert where required, and should cache the conversion.
|
||||
|
||||
- renderers may or may not be able to scale the overlay. We need to
|
||||
do the scaling internally if not (simple case: just horizontal scaling
|
||||
to adjust for PAR differences; complex case: both horizontal and vertical
|
||||
scaling, e.g. if subs come from a different source than the video or the
|
||||
video has been rescaled or cropped between overlay element and sink).
|
||||
|
||||
- renderers may be able to generate (possibly scaled) pixels on demand
|
||||
from the original data (e.g. a string or RLE-encoded data). We will
|
||||
ignore this for now, since this functionality can still be added later
|
||||
via API additions. The most interesting case would be to pass a pango
|
||||
markup string, since e.g. clutter can handle that natively.
|
||||
|
||||
- renderers may be able to write data directly on top of the video pixels
|
||||
(instead of creating an intermediary buffer with the overlay which is
|
||||
then blended on top of the actual video frame), e.g. dvdspu, dvbsuboverlay
|
||||
|
||||
However, in the interest of simplicity, we should probably ignore the
|
||||
fact that some elements can blend their overlays directly on top of the
|
||||
video (decoding/uncompressing them on the fly), even more so as it's
|
||||
not obvious that it's actually faster to decode the same overlay
|
||||
70-90 times (say) (ie. ca. 3 seconds of video frames) and then blend
|
||||
it 70-90 times instead of decoding it once into a temporary buffer
|
||||
and then blending it directly from there, possibly SIMD-accelerated.
|
||||
Also, this is only relevant if the video is raw video and not some
|
||||
hardware-acceleration backend object.
|
||||
|
||||
And ultimately it is the overlay element that decides whether to do
|
||||
the overlay right there and then or have the sink do it (if supported).
|
||||
It could decide to keep doing the overlay itself for raw video and
|
||||
only use our new API for non-raw video.
|
||||
|
||||
- renderers may want to make sure they only upload the overlay pixels once
|
||||
per rectangle if that rectangle recurs in subsequent frames (as part of
|
||||
the same composition or a different composition), as is likely. This caching
|
||||
of e.g. surfaces needs to be done renderer-side and can be accomplished
|
||||
based on the sequence numbers. The composition contains the lowest
|
||||
sequence number still in use upstream (an overlay element may want to
|
||||
cache created compositions+rectangles as well after all to re-use them
|
||||
for multiple frames); based on that, the renderer can expire cached
|
||||
objects. The caching needs to be done renderer-side because attaching
|
||||
renderer-specific objects to the rectangles won't work well given the
|
||||
refcounted nature of rectangles and compositions, making it unpredictable
|
||||
when a rectangle or composition will be freed or from which thread
|
||||
context it will be freed. The renderer-specific objects are likely bound
|
||||
to other types of renderer-specific contexts, and need to be managed
|
||||
in connection with those.
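
A rough sketch of such a renderer-side cache, keyed purely on the proposed
sequence numbers (the surface type and helper names are made up for
illustration; only the seq_num/min_seq_num_used fields come from the
proposal above):

    /* maps rectangle seq_num => FooBackendSurface (renderer-specific) */
    GHashTable *surface_cache;

    static gboolean
    cache_entry_expired (gpointer key, gpointer value, gpointer user_data)
    {
      /* entries below min_seq_num_used are no longer used upstream */
      return GPOINTER_TO_UINT (key) < GPOINTER_TO_UINT (user_data);
    }

    static void
    renderer_expire_cached_surfaces (GstVideoOverlayComposition * comp)
    {
      g_hash_table_foreach_remove (surface_cache, cache_entry_expired,
          GUINT_TO_POINTER (comp->min_seq_num_used));
    }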
|
||||
|
||||
- composition/rectangles should internally provide a certain degree of
|
||||
thread-safety. Multiple elements (sinks, overlay element) might access
|
||||
or use the same objects from multiple threads at the same time, and it
|
||||
is expected that elements will keep a ref to compositions and rectangles
|
||||
they push downstream for a while, e.g. until the current subtitle
|
||||
composition expires.
|
||||
|
||||
=== 5. Future considerations ===
|
||||
|
||||
- alternatives: there may be multiple versions/variants of the same subtitle
|
||||
stream. On DVDs, there may be a 4:3 version and a 16:9 version of the same
|
||||
subtitles. We could attach both variants and let the renderer pick the best
|
||||
one for the situation (currently we just use the 16:9 version). With totem,
|
||||
it's ultimately totem that adds the 'black bars' at the top/bottom, so totem
|
||||
also knows if it's got a 4:3 display and can/wants to fit 4:3 subs (which
|
||||
may render on top of the bars) or not, for example.
|
||||
|
||||
=== 6. Misc. FIXMEs ===
|
||||
|
||||
TEST: should these look (roughly) alike (note text distortion) - needs fixing in textoverlay
|
||||
|
||||
gst-launch-0.10 \
|
||||
videotestsrc ! video/x-raw,width=640,height=480,pixel-aspect-ratio=1/1 ! textoverlay text=Hello font-desc=72 ! xvimagesink \
|
||||
videotestsrc ! video/x-raw,width=320,height=480,pixel-aspect-ratio=2/1 ! textoverlay text=Hello font-desc=72 ! xvimagesink \
|
||||
videotestsrc ! video/x-raw,width=640,height=240,pixel-aspect-ratio=1/2 ! textoverlay text=Hello font-desc=72 ! xvimagesink
|
||||
|
||||
~~~ THE END ~~~
|
||||
|
|
@ -1,107 +0,0 @@
|
|||
Interlaced Video
|
||||
================
|
||||
|
||||
Video buffers have a number of states identifiable through a combination of caps
|
||||
and buffer flags.
|
||||
|
||||
Possible states:
|
||||
- Progressive
|
||||
- Interlaced
|
||||
- Plain
|
||||
- One field
|
||||
- Two fields
|
||||
- Three fields - this should be a progressive buffer with a repeated 'first'
|
||||
field that can be used for telecine pulldown
|
||||
- Telecine
|
||||
- One field
|
||||
- Two fields
|
||||
- Progressive
|
||||
- Interlaced (a.k.a. 'mixed'; the fields are from different frames)
|
||||
- Three fields - this should be a progressive buffer with a repeated 'first'
|
||||
field that can be used for telecine pulldown
|
||||
|
||||
Note: It can be seen that the difference between the plain interlaced and
|
||||
telecine states is that in the telecine state, buffers containing two fields may
|
||||
be progressive.
|
||||
|
||||
Tools for identification:
|
||||
- GstVideoInfo
|
||||
- GstVideoInterlaceMode - enum - GST_VIDEO_INTERLACE_MODE_...
|
||||
- PROGRESSIVE
|
||||
- INTERLEAVED
|
||||
- MIXED
|
||||
- Buffers flags - GST_VIDEO_BUFFER_FLAG_...
|
||||
- TFF
|
||||
- RFF
|
||||
- ONEFIELD
|
||||
- INTERLACED
|
||||
|
||||
|
||||
Identification of Buffer States
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Note that flags are not necessarily interpreted in the same way for all
|
||||
different states, nor are they necessarily required or meaningful in all cases.
|
||||
|
||||
|
||||
Progressive
|
||||
...........
|
||||
|
||||
If the interlace mode in the video info corresponding to a buffer is
|
||||
"progressive", then the buffer is progressive.
|
||||
|
||||
|
||||
Plain Interlaced
|
||||
................
|
||||
|
||||
If the video info interlace mode is "interleaved", then the buffer is plain
|
||||
interlaced.
|
||||
|
||||
GST_VIDEO_BUFFER_FLAG_TFF indicates whether the top or bottom field is to be
|
||||
displayed first. The timestamp on the buffer corresponds to the first field.
|
||||
|
||||
GST_VIDEO_BUFFER_FLAG_RFF indicates that the first field (indicated by the TFF flag)
|
||||
should be repeated. This is generally only used for telecine purposes but as the
|
||||
telecine state was added long after the interlaced state was added and defined,
|
||||
this flag remains valid for plain interlaced buffers.
|
||||
|
||||
GST_VIDEO_BUFFER_FLAG_ONEFIELD means that only the field indicated through the TFF
|
||||
flag is to be used. The other field should be ignored.
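
For illustration, a minimal sketch of how an element might inspect these
flags on a plain interlaced buffer (assuming the GstVideoInfo parsed from
the negotiated caps is in 'info'):

    if (GST_VIDEO_INFO_INTERLACE_MODE (&info) ==
        GST_VIDEO_INTERLACE_MODE_INTERLEAVED) {
      gboolean tff, rff, onefield;

      tff = GST_BUFFER_FLAG_IS_SET (buf, GST_VIDEO_BUFFER_FLAG_TFF);
      rff = GST_BUFFER_FLAG_IS_SET (buf, GST_VIDEO_BUFFER_FLAG_RFF);
      onefield = GST_BUFFER_FLAG_IS_SET (buf, GST_VIDEO_BUFFER_FLAG_ONEFIELD);

      /* tff: the top field is to be displayed first
       * rff: repeat the first field
       * onefield: only use the field indicated by tff */
      ...
    }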
|
||||
|
||||
|
||||
Telecine
|
||||
........
|
||||
|
||||
If video info interlace mode is "mixed" then the buffers are in some form of
|
||||
telecine state.
|
||||
|
||||
The TFF and ONEFIELD flags have the same semantics as for the plain interlaced
|
||||
state.
|
||||
|
||||
GST_VIDEO_BUFFER_FLAG_RFF in the telecine state indicates that the buffer contains
|
||||
only repeated fields that are present in other buffers and are as such
|
||||
unneeded. For example, in a sequence of three telecined frames, we might have:
|
||||
|
||||
AtAb AtBb BtBb
|
||||
|
||||
In this situation, we only need the first and third buffers as the second
|
||||
buffer contains fields present in the first and third.
|
||||
|
||||
Note that the following state can have its second buffer identified using the
|
||||
ONEFIELD flag (and TFF not set):
|
||||
|
||||
AtAb AtBb BtCb
|
||||
|
||||
The telecine state requires one additional flag to be able to identify
|
||||
progressive buffers.
|
||||
|
||||
The presence of the GST_VIDEO_BUFFER_FLAG_INTERLACED means that the buffer is an
|
||||
'interlaced' or 'mixed' buffer that contains two fields that, when combined
|
||||
with fields from adjacent buffers, allow reconstruction of progressive frames.
|
||||
The absence of the flag implies the buffer containing two fields is a
|
||||
progressive frame.
|
||||
|
||||
For example in the following sequence, the third buffer would be mixed (yes, it
|
||||
is a strange pattern, but it can happen):
|
||||
|
||||
AtAb AtBb BtCb CtDb DtDb
|
|
@ -1,76 +0,0 @@
|
|||
Media Types
|
||||
-----------
|
||||
|
||||
audio/x-raw
|
||||
|
||||
format, G_TYPE_STRING, mandatory
|
||||
The format of the audio samples, see the Formats section for a list
|
||||
of valid sample formats.
|
||||
|
||||
rate, G_TYPE_INT, mandatory
|
||||
The samplerate of the audio
|
||||
|
||||
channels, G_TYPE_INT, mandatory
|
||||
The number of channels
|
||||
|
||||
channel-mask, GST_TYPE_BITMASK, mandatory for more than 2 channels
|
||||
Bitmask of channel positions present. May be omitted for mono and
|
||||
stereo. May be set to 0 to denote that the channels are unpositioned.
|
||||
|
||||
layout, G_TYPE_STRING, mandatory
|
||||
The layout of channels within a buffer. Possible values are
|
||||
"interleaved" (for LRLRLRLR) and "non-interleaved" (LLLLRRRR)
|
||||
|
||||
Use GstAudioInfo and related helper API to create and parse raw audio caps.
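
For example (a minimal sketch; error handling omitted):

    GstAudioInfo info;
    GstCaps *caps;

    /* describe stereo 44.1 kHz S16 audio and turn it into caps ... */
    gst_audio_info_init (&info);
    gst_audio_info_set_format (&info, GST_AUDIO_FORMAT_S16, 44100, 2, NULL);
    caps = gst_audio_info_to_caps (&info);

    /* ... and parse caps back into a GstAudioInfo */
    if (!gst_audio_info_from_caps (&info, caps))
      g_warning ("could not parse audio caps");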
|
||||
|
||||
|
||||
Metadata
|
||||
--------
|
||||
|
||||
"GstAudioDownmixMeta"
|
||||
A matrix for downmixing multichannel audio to a lower number of channels.
|
||||
|
||||
|
||||
Formats
|
||||
-------
|
||||
|
||||
The following values can be used for the format string property.
|
||||
|
||||
"S8" 8-bit signed PCM audio
|
||||
"U8" 8-bit unsigned PCM audio
|
||||
|
||||
"S16LE" 16-bit signed PCM audio
|
||||
"S16BE" 16-bit signed PCM audio
|
||||
"U16LE" 16-bit unsigned PCM audio
|
||||
"U16BE" 16-bit unsigned PCM audio
|
||||
|
||||
"S24_32LE" 24-bit signed PCM audio packed into 32-bit
|
||||
"S24_32BE" 24-bit signed PCM audio packed into 32-bit
|
||||
"U24_32LE" 24-bit unsigned PCM audio packed into 32-bit
|
||||
"U24_32BE" 24-bit unsigned PCM audio packed into 32-bit
|
||||
|
||||
"S32LE" 32-bit signed PCM audio
|
||||
"S32BE" 32-bit signed PCM audio
|
||||
"U32LE" 32-bit unsigned PCM audio
|
||||
"U32BE" 32-bit unsigned PCM audio
|
||||
|
||||
"S24LE" 24-bit signed PCM audio
|
||||
"S24BE" 24-bit signed PCM audio
|
||||
"U24LE" 24-bit unsigned PCM audio
|
||||
"U24BE" 24-bit unsigned PCM audio
|
||||
|
||||
"S20LE" 20-bit signed PCM audio
|
||||
"S20BE" 20-bit signed PCM audio
|
||||
"U20LE" 20-bit unsigned PCM audio
|
||||
"U20BE" 20-bit unsigned PCM audio
|
||||
|
||||
"S18LE" 18-bit signed PCM audio
|
||||
"S18BE" 18-bit signed PCM audio
|
||||
"U18LE" 18-bit unsigned PCM audio
|
||||
"U18BE" 18-bit unsigned PCM audio
|
||||
|
||||
"F32LE" 32-bit floating-point audio
|
||||
"F32BE" 32-bit floating-point audio
|
||||
"F64LE" 64-bit floating-point audio
|
||||
"F64BE" 64-bit floating-point audio
|
||||
|
|
@ -1,28 +0,0 @@
|
|||
Media Types
|
||||
-----------
|
||||
|
||||
text/x-raw
|
||||
|
||||
format, G_TYPE_STRING, mandatory
|
||||
The format of the text, see the Formats section for a list of valid format
|
||||
strings.
|
||||
|
||||
Metadata
|
||||
--------
|
||||
|
||||
There are no common metas for this raw format yet.
|
||||
|
||||
Formats
|
||||
-------
|
||||
|
||||
"utf8" plain timed utf8 text (formerly text/plain)
|
||||
|
||||
Parsed timed text in utf8 format.
|
||||
|
||||
"pango-markup" plain timed utf8 text with pango markup (formerly text/x-pango-markup)
|
||||
|
||||
Same as "utf8", but text embedded in an XML-style markup language for
|
||||
size, colour, emphasis, etc.
|
||||
|
||||
See http://developer.gnome.org/pango/stable/PangoMarkupFormat.html
|
||||
|
File diff suppressed because it is too large
|
@ -1,69 +0,0 @@
|
|||
playbin
|
||||
--------
|
||||
|
||||
The purpose of this element is to decode and render the media contained in a
|
||||
given generic uri. The element extends GstPipeline and is typically used in
|
||||
playback situations.
|
||||
|
||||
Required features:
|
||||
|
||||
- accept and play any valid uri. This includes
|
||||
- rendering video/audio
|
||||
- overlaying subtitles on the video
|
||||
- optionally read external subtitle files
|
||||
- allow for hardware (non raw) sinks
|
||||
- selection of audio/video/subtitle streams based on language.
|
||||
- perform network buffering/incremental download
|
||||
- gapless playback
|
||||
- support for visualisations with configurable sizes
|
||||
- ability to reject files that are too big, or of a format that would require
|
||||
too much CPU/memory usage.
|
||||
- be very efficient with adding elements such as converters to reduce the
|
||||
amount of negotiation that has to happen.
|
||||
- handle chained oggs. This includes having support for dynamic pad add and
|
||||
remove from a demuxer.
|
||||
|
||||
Components
|
||||
----------
|
||||
|
||||
* decodebin2
|
||||
|
||||
- performs the autoplugging of demuxers/decoders
|
||||
- emits signals for steering the autoplugging
|
||||
- to decide if a non-raw media format is acceptable as output
|
||||
- to sort the possible decoders for a non-raw format
|
||||
- see also decodebin2 design doc
|
||||
|
||||
* uridecodebin
|
||||
|
||||
- combination of a source to handle the given uri, an optional queueing element
|
||||
and one or more decodebin2 elements to decode the non-raw streams.
|
||||
|
||||
* playsink
|
||||
|
||||
- handles display of audio/video/text.
|
||||
- has request audio/video/text input pads. There is only one sinkpad per type.
|
||||
The requested pads define the configuration of the internal pipeline.
|
||||
- allows for setting audio/video sinks or does automatic sink selection.
|
||||
- allows for configuration of visualisation element.
|
||||
- allows for enable/disable of visualisation, audio and video.
|
||||
|
||||
* playbin
|
||||
|
||||
- combination of one or more uridecodebin elements to read the uri and subtitle
|
||||
uri.
|
||||
- support for queuing new media to support gapless playback.
|
||||
- handles stream selection.
|
||||
- uses playsink to display.
|
||||
- selection of sinks and configuration of uridecodebin with raw output formats.
|
||||
|
||||
|
||||
Gapless playback
|
||||
----------------
|
||||
|
||||
playbin has an "about-to-finish" signal. The application should configure a new
|
||||
uri (and optional suburi) in the callback. When the current media finishes, this
|
||||
new media will be played next.
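
A minimal sketch of how an application could use this (get_next_uri() is a
hypothetical application-side helper):

    static void
    on_about_to_finish (GstElement * playbin, gpointer user_data)
    {
      const gchar *next_uri = get_next_uri (user_data);

      if (next_uri != NULL)
        g_object_set (playbin, "uri", next_uri, NULL);
    }

    ...
    g_signal_connect (playbin, "about-to-finish",
        G_CALLBACK (on_about_to_finish), app_data);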
|
||||
|
||||
|
||||
|
|
@ -1,278 +0,0 @@
|
|||
Design for Stereoscopic & Multiview Video Handling
|
||||
==================================================
|
||||
|
||||
There are two cases to handle:
|
||||
|
||||
* Encoded video output from a demuxer to parser / decoder or from encoders into a muxer.
|
||||
* Raw video buffers
|
||||
|
||||
The design below is somewhat based on the proposals from
|
||||
[bug 611157](https://bugzilla.gnome.org/show_bug.cgi?id=611157)
|
||||
|
||||
Multiview is used as a generic term to refer to handling both
|
||||
stereo content (left and right eye only) as well as extensions for videos
|
||||
containing multiple independent viewpoints.
|
||||
|
||||
Encoded Signalling
|
||||
------------------
|
||||
This is regarding the signalling in caps and buffers from demuxers to
|
||||
parsers (sometimes) or out from encoders.
|
||||
|
||||
For backward compatibility with existing codecs many transports of
|
||||
stereoscopic 3D content use normal 2D video with 2 views packed spatially
|
||||
in some way, and put extra new descriptions in the container/mux.
|
||||
|
||||
Info in the demuxer seems to apply to stereo encodings only. For all
|
||||
MVC methods I know, the multiview encoding is in the video bitstream itself
|
||||
and therefore already available to decoders. Only stereo systems have been retro-fitted
|
||||
into the demuxer.
|
||||
|
||||
Also, sometimes extension descriptions are in the codec (e.g. H.264 SEI FPA packets)
|
||||
and it would be useful to be able to put the info onto caps and buffers from the
|
||||
parser without decoding.
|
||||
|
||||
To handle both cases, we need to be able to output the required details on
|
||||
encoded video for decoders to apply onto the raw video buffers they decode.
|
||||
|
||||
*If there ever is a need to transport multiview info for encoded data the
|
||||
same system below for raw video or some variation should work*
|
||||
|
||||
### Encoded Video: Properties that need to be encoded into caps
|
||||
1. multiview-mode (called "Channel Layout" in bug 611157)
|
||||
* Whether a stream is mono, for a single eye, stereo, mixed-mono-stereo
|
||||
(switches between mono and stereo - mp4 can do this)
|
||||
* Uses a buffer flag to mark individual buffers as mono or "not mono"
|
||||
(single|stereo|multiview) for mixed scenarios. The alternative (not
|
||||
proposed) is for the demuxer to switch caps for each mono to not-mono
|
||||
change, and not use a 'mixed' caps variant at all.
|
||||
* _single_ refers to a stream of buffers that only contain 1 view.
|
||||
It is different from mono in that the stream is a marked left or right
|
||||
eye stream for later combining in a mixer or when displaying.
|
||||
* _multiple_ marks a stream with multiple independent views encoded.
|
||||
It is included in this list for completeness. As noted above, there's
|
||||
currently no scenario that requires marking encoded buffers as MVC.
|
||||
2. Frame-packing arrangements / view sequence orderings
|
||||
* Possible frame packings: side-by-side, side-by-side-quincunx,
|
||||
column-interleaved, row-interleaved, top-bottom, checker-board
|
||||
* bug 611157 - sreerenj added side-by-side-full and top-bottom-full but
|
||||
I think that's covered by suitably adjusting pixel-aspect-ratio. If
|
||||
not, they can be added later.
|
||||
* _top-bottom_, _side-by-side_, _column-interleaved_, _row-interleaved_ are as the names suggest.
|
||||
* _checker-board_, samples are left/right pixels in a chess grid +-+-+-/-+-+-+
|
||||
* _side-by-side-quincunx_. Side By Side packing, but quincunx sampling -
|
||||
1 pixel offset of each eye needs to be accounted when upscaling or displaying
|
||||
* there may be other packings (future expansion)
|
||||
* Possible view sequence orderings: frame-by-frame, frame-primary-secondary-tracks, sequential-row-interleaved
|
||||
* _frame-by-frame_, each buffer is left, then right view etc
|
||||
* _frame-primary-secondary-tracks_ - the file has 2 video tracks (primary and secondary), one is left eye, one is right.
|
||||
Demuxer info indicates which one is which.
|
||||
Handling this means marking each stream as all-left and all-right views, decoding separately, and combining automatically (inserting a mixer/combiner in playbin)
|
||||
-> *Leave this for future expansion*
|
||||
* _sequential-row-interleaved_ Mentioned by sreerenj in bug patches, I can't find a mention of such a thing. Maybe it's in MPEG-2
|
||||
-> *Leave this for future expansion / deletion*
|
||||
3. view encoding order
|
||||
* Describes how to decide which piece of each frame corresponds to left or right eye
|
||||
* Possible orderings left, right, left-then-right, right-then-left
|
||||
- Need to figure out how we find the correct frame in the demuxer to start decoding when seeking in frame-sequential streams
|
||||
- Need a buffer flag for marking the first buffer of a group.
|
||||
4. "Frame layout flags"
|
||||
* flags for view specific interpretation
|
||||
* horizontal-flip-left, horizontal-flip-right, vertical-flip-left, vertical-flip-right
|
||||
Indicates that one or more views has been encoded in a flipped orientation, usually due to camera with mirror or displays with mirrors.
|
||||
* This should be an actual flags field. Registered GLib flags types aren't generally well supported in our caps - the type might not be loaded/registered yet when parsing a caps string, so they can't be used in caps templates in the registry.
|
||||
* It might be better just to use a hex value / integer
|
||||
|
||||
Buffer representation for raw video
|
||||
-----------------------------------
|
||||
* Transported as normal video buffers with extra metadata
|
||||
* The caps define the overall buffer width/height, with helper functions to
|
||||
extract the individual views for packed formats
|
||||
* pixel-aspect-ratio adjusted if needed to double the overall width/height
|
||||
* video sinks that don't know about multiview extensions yet will show the packed view as-is
|
||||
For frame-sequence outputs, things might look weird, but just adding multiview-mode to the sink caps
|
||||
can disallow those transports.
|
||||
* _row-interleaved_ packing is actually just side-by-side memory layout with half frame width, twice
|
||||
the height, so can be handled by adjusting the overall caps and strides
|
||||
* Other exotic layouts need new pixel formats defined (checker-board, column-interleaved, side-by-side-quincunx)
|
||||
* _Frame-by-frame_ - one view per buffer, but with alternating metas marking which buffer is which left/right/other view and using a new buffer flag as described above
|
||||
to mark the start of a group of corresponding frames.
|
||||
* New video caps addition as for encoded buffers
|
||||
|
||||
### Proposed Caps fields
|
||||
Combining the requirements above and collapsing the combinations into mnemonics:
|
||||
|
||||
* multiview-mode =
|
||||
mono | left | right | sbs | sbs-quin | col | row | topbot | checkers |
|
||||
frame-by-frame | mixed-sbs | mixed-sbs-quin | mixed-col | mixed-row |
|
||||
mixed-topbot | mixed-checkers | mixed-frame-by-frame | multiview-frames | mixed-multiview-frames
|
||||
* multiview-flags =
|
||||
+ 0x0000 none
|
||||
+ 0x0001 right-view-first
|
||||
+ 0x0002 left-h-flipped
|
||||
+ 0x0004 left-v-flipped
|
||||
+ 0x0008 right-h-flipped
|
||||
+ 0x0010 right-v-flipped
|
||||
|
||||
### Proposed new buffer flags
|
||||
Add two new GST_VIDEO_BUFFER flags in video-frame.h and make it clear that those
|
||||
flags can apply to encoded video buffers too. wtay says that's currently the
|
||||
case anyway, but the documentation should say it.
|
||||
|
||||
**GST_VIDEO_BUFFER_FLAG_MULTIPLE_VIEW** - Marks a buffer as representing non-mono content, although it may be a single (left or right) eye view.
|
||||
**GST_VIDEO_BUFFER_FLAG_FIRST_IN_BUNDLE** - for frame-sequential methods of transport, mark the "first" of a left/right/other group of frames
|
||||
|
||||
### A new GstMultiviewMeta
|
||||
This provides a place to describe all provided views in a buffer / stream,
|
||||
and through Meta negotiation to inform decoders about which views to decode if
|
||||
not all are wanted.
|
||||
|
||||
* Logical labels/names and mapping to GstVideoMeta numbers
|
||||
* Standard view labels LEFT/RIGHT, and non-standard ones (strings)
|
||||
|
||||
GST_VIDEO_MULTIVIEW_VIEW_LEFT = 1
|
||||
GST_VIDEO_MULTIVIEW_VIEW_RIGHT = 2
|
||||
|
||||
struct GstVideoMultiviewViewInfo {
|
||||
guint view_label;
|
||||
guint meta_id; // id of the GstVideoMeta for this view
|
||||
|
||||
padding;
|
||||
}
|
||||
|
||||
struct GstVideoMultiviewMeta {
|
||||
guint n_views;
|
||||
GstVideoMultiviewViewInfo *view_info;
|
||||
}
|
||||
|
||||
The meta is optional, and probably only useful later for MVC
|
||||
|
||||
|
||||
Outputting stereo content
|
||||
-------------------------
|
||||
The initial implementation for output will be stereo content in glimagesink
|
||||
|
||||
### Output Considerations with OpenGL
|
||||
* If we have support for stereo GL buffer formats, we can output separate left/right eye images and let the hardware take care of display.
|
||||
* Otherwise, glimagesink needs to render one window with left/right in a suitable frame packing
|
||||
and that will only show correctly in fullscreen on a device set for the right 3D packing -> requires app intervention to set the video mode.
|
||||
* That video mode could be set manually on the TV, or via HDMI 1.4 by setting the right video mode for the screen to inform the TV. As a third option, we
|
||||
could support rendering to two separate overlay areas on the screen - one for the left eye, one for the right - which can be done using the 'splitter' element and 2 output sinks, or, better, by adding a 2nd window overlay for split stereo output
|
||||
* Intel hardware doesn't do stereo GL buffers - only nvidia and AMD, so initial implementation won't include that
|
||||
|
||||
## Other elements for handling multiview content
|
||||
* videooverlay interface extensions
|
||||
* __Q__: Should this be a new interface?
|
||||
* Element message to communicate the presence of stereoscopic information to the app
|
||||
* App needs to be able to override the input interpretation - ie, set multiview-mode and multiview-flags
|
||||
* Most videos I've seen are side-by-side or top-bottom with no frame-packing metadata
|
||||
* New API for the app to set rendering options for stereo/multiview content
|
||||
* This might be best implemented as a **multiview GstContext**, so that
|
||||
the pipeline can share app preferences for content interpretation and downmixing
|
||||
to mono for output, or in the sink, and have those preferences passed as far upstream/downstream as possible.
|
||||
* Converter element
|
||||
* convert different view layouts
|
||||
* Render to anaglyphs of different types (magenta/green, red/blue, etc) and output as mono
|
||||
* Mixer element
|
||||
* take 2 video streams and output as stereo
|
||||
* later take n video streams
|
||||
* share code with the converter, it just takes input from n pads instead of one.
|
||||
* Splitter element
|
||||
* Output one pad per view
|
||||
|
||||
### Implementing MVC handling in decoders / parsers (and encoders)
|
||||
Things to do to implement MVC handling
|
||||
|
||||
1. Parsing SEI in h264parse and setting caps (patches available in
|
||||
bugzilla for parsing, see below)
|
||||
2. Integrate gstreamer-vaapi MVC support with this proposal
|
||||
3. Help with [libav MVC implementation](https://wiki.libav.org/Blueprint/MVC)
|
||||
4. generating SEI in H.264 encoder
|
||||
5. Support for MPEG2 MVC extensions
|
||||
|
||||
## Relevant bugs
|
||||
[bug 685215](https://bugzilla.gnome.org/show_bug.cgi?id=685215) - codecparser h264: Add initial MVC parser
|
||||
[bug 696135](https://bugzilla.gnome.org/show_bug.cgi?id=696135) - h264parse: Add mvc stream parsing support
|
||||
[bug 732267](https://bugzilla.gnome.org/show_bug.cgi?id=732267) - h264parse: extract base stream from MVC or SVC encoded streams
|
||||
|
||||
## Other Information
|
||||
[Matroska 3D support notes](http://www.matroska.org/technical/specs/notes.html#3D)
|
||||
|
||||
## Open Questions
|
||||
|
||||
### Background
|
||||
|
||||
### Representation for GstGL
|
||||
When uploading raw video frames to GL textures, the goal is to implement:
|
||||
|
||||
2. Split packed frames into separate GL textures when uploading, and
|
||||
attach multiple GstGLMemory's to the GstBuffer. The multiview-mode and
|
||||
multiview-flags fields in the caps should change to reflect the conversion
|
||||
from one incoming GstMemory to multiple GstGLMemory, and change the
|
||||
width/height in the output info as needed.
|
||||
|
||||
This is (currently) targeted as 2 render passes - upload as normal
|
||||
to a single stereo-packed RGBA texture, and then unpack into 2
|
||||
smaller textures, output with GST_VIDEO_MULTIVIEW_MODE_SEPARATED, as
|
||||
2 GstGLMemory attached to one buffer. We can optimise the upload later
|
||||
to go directly to 2 textures for common input formats.
|
||||
|
||||
Separate output textures have a few advantages:
|
||||
|
||||
* Filter elements can more easily apply filters in several passes to each
|
||||
texture without fundamental changes to our filters to avoid mixing pixels
|
||||
from separate views.
|
||||
* Centralises the sampling of input video frame packings in the upload code,
|
||||
which makes adding new packings in the future easier.
|
||||
* Sampling multiple textures to generate various output frame-packings
|
||||
for display is conceptually simpler than converting from any input packing
|
||||
to any output packing.
|
||||
* In implementations that support quad buffers, having separate textures
|
||||
makes it trivial to do GL_LEFT/GL_RIGHT output
|
||||
|
||||
For either option, we'll need new glsink output API to pass more
|
||||
information to applications about multiple views for the draw signal/callback.
|
||||
|
||||
I don't know if it's desirable to support *both* methods of representing
|
||||
views. If so, that should be signalled in the caps too. That could be a
|
||||
new multiview-mode for passing views in separate GstMemory objects
|
||||
attached to a GstBuffer, which would not be GL specific.
|
||||
|
||||
### Overriding frame packing interpretation
|
||||
Most sample videos available are frame packed, with no metadata
|
||||
to say so. How should we override that interpretation?
|
||||
|
||||
* Simple answer: Use capssetter + new properties on playbin to
|
||||
override the multiview fields
|
||||
*Basically implemented in playbin, using a pad probe. Needs more work for completeness*
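
For example, until playbin gains such properties, the interpretation could be forced by hand with capssetter, e.g. (a sketch only; the exact multiview caps values depend on the GStreamer version, and the file name is made up):

    gst-launch-1.0 filesrc location=video-sbs.mp4 ! decodebin ! \
        capssetter caps="video/x-raw,multiview-mode=side-by-side" ! \
        queue ! autovideosink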
|
||||
|
||||
### Adding extra GstVideoMeta to buffers
|
||||
There should be one GstVideoMeta for the entire video frame in packed
|
||||
layouts, and one GstVideoMeta per GstGLMemory when views are attached
|
||||
to a GstBuffer separately. This should be done by the buffer pool,
|
||||
which knows the layout from the caps.
|
||||
|
||||
### videooverlay interface extensions
|
||||
GstVideoOverlay needs:
|
||||
|
||||
* A way to announce the presence of multiview content when it is
|
||||
detected/signalled in a stream.
|
||||
* A way to tell applications which output methods are supported/available
|
||||
* A way to tell the sink which output method it should use
|
||||
* Possibly a way to tell the sink to override the input frame
|
||||
interpretation / caps - depends on the answer to the question
|
||||
above about how to model overriding input interpretation.
|
||||
|
||||
### What's implemented
|
||||
* Caps handling
|
||||
* gst-plugins-base libgstvideo pieces
|
||||
* playbin caps overriding
|
||||
* conversion elements - glstereomix, gl3dconvert (needs a rename),
|
||||
glstereosplit.
|
||||
|
||||
### Possible future enhancements
|
||||
* Make GLupload split to separate textures at upload time?
|
||||
* Needs new API to extract multiple textures from the upload. Currently only outputs 1 result RGBA texture.
|
||||
* Make GLdownload able to take 2 input textures, pack them and colorconvert / download as needed.
|
||||
- currently done by packing then downloading, which is not acceptable overhead for RGBA download
|
||||
* Think about how we integrate GLstereo - do we need to do anything special,
|
||||
or can the app just render to stereo/quad buffers if they're available?
|