Mimetypes in GStreamer
======================

1) What is a mimetype
---------------------
A mimetype is a combination of two short strings, the content type and
the content subtype, that together describe the content type of a file.
In multimedia, mimetypes are used to describe the type of a media
stream. In GStreamer, we use mimetypes in the same way. They are part
of a GstCaps, which describes a media stream. Besides a mimetype, a
GstCaps also contains a name and stream properties (GstProps), which
are sets of key/value pairs.

An example of a mimetype is 'video/mpeg'. A corresponding GstCaps could
be created using:
  GstCaps *caps = gst_caps_new("video_mpeg_type",
                               "video/mpeg",
                               gst_props_new("width",  GST_PROPS_INT(384),
                                             "height", GST_PROPS_INT(288),
                                             NULL));
or using a macro:
  GstCaps *caps = GST_CAPS_NEW("video_mpeg_type",
                               "video/mpeg",
                                 "width",  GST_PROPS_INT(384),
                                 "height", GST_PROPS_INT(288)
                              );

Obviously, mimetypes and their corresponding properties are of major
importance in GStreamer for uniquely identifying media streams.

Official MIME media types are assigned by the IANA. Current
assignments are at http://www.iana.org/assignments/media-types/.

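To make the "content type / content subtype" pair concrete, here is a
minimal sketch in plain C that splits a mimetype string at the slash.
The function name split_mimetype is made up for this illustration and
is not part of GStreamer; it only shows the structure of the string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a GStreamer function): split "type/subtype"
 * into its two parts. Returns 0 on success, -1 if the string is not a
 * well-formed pair or a part does not fit into its buffer. */
int split_mimetype(const char *mime,
                   char *type, size_t type_len,
                   char *subtype, size_t subtype_len)
{
    const char *slash;
    size_t n_type, n_sub;

    assert(mime != NULL && type != NULL && subtype != NULL);

    slash = strchr(mime, '/');
    if (slash == NULL || slash == mime || slash[1] == '\0')
        return -1;                      /* missing separator or empty part */
    if (strchr(slash + 1, '/') != NULL)
        return -1;                      /* more than one separator */

    n_type = (size_t)(slash - mime);
    n_sub = strlen(slash + 1);
    if (n_type >= type_len || n_sub >= subtype_len)
        return -1;                      /* caller's buffers are too small */

    memcpy(type, mime, n_type);
    type[n_type] = '\0';
    memcpy(subtype, slash + 1, n_sub + 1);  /* copies the terminating NUL */
    return 0;
}
```

Called with "video/mpeg" and two sufficiently large buffers, this
yields the type "video" and the subtype "mpeg".
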
2) The problems
---------------
Some streams may have mimetypes or GstCaps that do not fully describe
the stream. In most cases, this is not a problem. For a stream that
contains Ogg/Vorbis data, for example, we don't need to know the
samplerate of the raw audio stream, since we can't play the compressed
data back directly anyway. The samplerate _is_ important for _raw_
audio, so a decoder would need to retrieve the samplerate from the
Ogg/Vorbis stream headers (which are part of the bytestream) in order
to pass it on in the GstCaps that belongs to the decoded audio
('audio/x-raw-int' or 'audio/x-raw-float').
However, other plugins *might* want to know such properties, even for
compressed streams. One such example is an AVI muxer, which does want
to know the samplerate of an audio stream, even when it is compressed.

Another problem is that many media types can be defined in multiple ways.
For example, MJPEG video can be defined as video/jpeg, video/mjpeg,
image/jpeg, video/avi with a compression of (fourcc) MJPG, etc. None of
these is really official, since there isn't an official mimetype for
encoded MJPEG video.

The main focus of this document is to propose a standardized set of
mimetypes and properties that will be used by the GStreamer plugins.

3) Different types of streams
-----------------------------
There are several types of media streams. The most important distinction
is between container formats, audio codecs and video codecs. Container
formats are bytestreams that contain one or more substreams, and don't
provide any direct media data themselves. Examples are Quicktime,
AVI or MPEG System Stream. They mostly consist of a set of headers that
define the media stream(s) packed inside the container, plus the
media data itself.
Video codecs and audio codecs describe encoded audio or video data.
Examples are MPEG-1 video, DivX video, MPEG-1 layer 3 (MP3) audio or
Ogg/Vorbis audio. Actually, Ogg is a container format too (for Vorbis
audio), but the two are usually used in conjunction with each other.

3a) Container formats
---------------------
1 - AVI (Microsoft RIFF/AVI)
    mimetype: video/avi

2 - Quicktime (Apple)
    mimetype: video/quicktime

3 - MPEG (MPEG LA)
    mimetype: video/mpeg
    properties: 'systemstream' = TRUE (BOOLEAN)

4 - ASF (Microsoft)
    mimetype: video/x-asf

5 - WAV (PCM)
    mimetype: audio/x-wav

6 - RealMedia (Real)
    mimetype: video/x-pn-realvideo
    properties: 'systemstream' = TRUE (BOOLEAN)

7 - DV (Digital Video)
    mimetype: video/x-dv
    properties: 'systemstream' = TRUE (BOOLEAN)

8 - Ogg (Xiph)
    mimetype: application/ogg

9 - Matroska
    mimetype: video/x-mkv

10 - Shockwave (Macromedia)
    mimetype: application/x-shockwave-flash

11 - AU audio (Sun)
    mimetype: audio/x-au

12 - Mod audio
    mimetype: audio/x-mod

13 - FLX video (?)
    mimetype: video/x-fli

14 - Monkeyaudio
    mimetype: application/x-ape

15 - AIFF audio
    mimetype: audio/x-aiff

16 - SID audio
    mimetype: audio/x-sid

Please note that we try to keep these mimetypes as similar as possible
to what's used as standard mimetypes in Gnome (Gnome-VFS/Nautilus) and
KDE (Konqueror).

Current problems: there's a very thin line between audio codecs and
audio containers (take mp3 vs. sid, etc.). For now this is decided
case by case, and it needs to be documented further.

3b) Video codecs
----------------
For convenience, the fourcc codes used in the AVI container format are
listed along with the mimetype and optional properties.

Preface - (optional) properties for all video formats:
    'width' = X (INT)
    'height' = X (INT)
    'pixel_width' and 'pixel_height' = X (2x INT, together the pixel
                                          aspect ratio)
    'framerate' = X (FLOAT)

1 - Raw Video (YUV/YCbCr)
    mimetype: video/x-raw-yuv
    properties: 'format' = 'XXXX' (fourcc)
    known fourccs: YUY2, I420, Y41P, YVYU, UYVY, etc.
    properties 'width' and 'height' are required

    Note: some raw video formats have implicit alignment rules. We should
          discuss this more.
    Note: some formats have multiple fourccs (e.g. IYUV/I420 or YUY2/YUYV).
          For each of these, we only use one (e.g. I420 and YUY2).

    Currently recognized formats:
    YUY2: packed, Y-U-Y-V order, U/V hor 2x subsampled (YUV-4:2:2, 16 bpp)
    YVYU: packed, Y-V-Y-U order, U/V hor 2x subsampled (YUV-4:2:2, 16 bpp)
    UYVY: packed, U-Y-V-Y order, U/V hor 2x subsampled (YUV-4:2:2, 16 bpp)
    Y41P: packed, UYVYUYVYYYYY order, U/V hor 4x subsampled (YUV-4:1:1, 12 bpp)

    Y42B: planar, Y-U-V order, U/V hor 2x subsampled (YUV-4:2:2, 16 bpp)
    YV12: planar, Y-V-U order, U/V hor+ver 2x subsampled (YUV-4:2:0, 12 bpp)
    I420: planar, Y-U-V order, U/V hor+ver 2x subsampled (YUV-4:2:0, 12 bpp)
    Y41B: planar, Y-U-V order, U/V hor 4x subsampled (YUV-4:1:1, 12 bpp)
    YUV9: planar, Y-U-V order, U/V hor+ver 4x subsampled (YUV-4:1:0, 9 bpp)
    YVU9: planar, Y-V-U order, U/V hor+ver 4x subsampled (YUV-4:1:0, 9 bpp)

    Y800: one-plane (Y-only, YUV-4:0:0, 8 bpp)

    See http://www.fourcc.org/ for more information.

    Note: YUV-4:4:4 (both planar and packed, in multiple orders) are missing.

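The bpp figures in the raw YUV format list above translate directly
into a nominal frame size. The following is a rough sketch in plain C,
assuming no row or plane padding at all (the implicit alignment rules
mentioned in the note are deliberately ignored). The function names are
made up for this example and are not GStreamer API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Bits per pixel for the fourccs listed in this document;
 * returns 0 for a fourcc that is not in the list. */
unsigned yuv_bits_per_pixel(const char *fourcc)
{
    static const struct { const char *fourcc; unsigned bpp; } table[] = {
        { "YUY2", 16 }, { "YVYU", 16 }, { "UYVY", 16 }, { "Y42B", 16 },
        { "Y41P", 12 }, { "YV12", 12 }, { "I420", 12 }, { "Y41B", 12 },
        { "YUV9",  9 }, { "YVU9",  9 }, { "Y800",  8 },
    };
    size_t i;

    assert(fourcc != NULL);
    for (i = 0; i < sizeof table / sizeof table[0]; i++)
        if (strcmp(table[i].fourcc, fourcc) == 0)
            return table[i].bpp;
    return 0;
}

/* Nominal frame size in bytes. Assumption: no stride or plane padding,
 * and width/height are multiples of the subsampling factors. */
size_t yuv_frame_size(const char *fourcc, unsigned width, unsigned height)
{
    return (size_t)width * height * yuv_bits_per_pixel(fourcc) / 8;
}
```

For the 384x288 frame used as an example earlier, this gives 165888
bytes for I420 and 221184 bytes for YUY2. Real buffers may be larger
once alignment is taken into account.
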
2 - Raw Video (RGB)
    mimetype: video/x-raw-rgb
    properties: 'endianness' = 1234/4321 (INT)        <- endianness
                'depth' = 15/16/24 (INT)              <- bits per pixel (depth)
                'bpp' = 16/24/32 (INT)                <- bits per pixel (in memory)
                'red_mask' = bitmask (0x..) (INT)     <- red pixel mask
                'green_mask' = bitmask (0x..) (INT)   <- green pixel mask
                'blue_mask' = bitmask (0x..) (INT)    <- blue pixel mask
    properties 'width' and 'height' are required

    'bpp' is the number of bits of memory used for each pixel. 'depth'
    is the color depth, i.e. the number of those bits that carry color
    information.

    24 and 32 bit RGB should always be specified as big endian, since
    any little endian format can be transformed into big endian by
    rearranging the color masks. 15 and 16 bit formats should generally
    have the same byte order as the CPU.

    A color component is extracted by loading 'bpp' bits according to
    the 'endianness' rule, masking the loaded value with the color mask
    and shifting the result down. A 24-bit value cannot be loaded
    directly, but one can perform an equivalent operation.

    Examples:
              msb .. lsb
    - memory: RRRRRRRR GGGGGGGG BBBBBBBB RRRRRRRR GGGGGGGG ...
               'bpp'        = 24
               'depth'      = 24
               'endianness' = 4321 (G_BIG_ENDIAN)
               'red_mask'   = 0xff0000
               'green_mask' = 0x00ff00
               'blue_mask'  = 0x0000ff

    - memory: xRRRRRGG GGGBBBBB xRRRRRGG GGGBBBBB xRRRRRGG ...
               'bpp'        = 16
               'depth'      = 15
               'endianness' = 4321 (G_BIG_ENDIAN)
               'red_mask'   = 0x7c00
               'green_mask' = 0x03e0
               'blue_mask'  = 0x001f

    - memory: GGGBBBBB xRRRRRGG GGGBBBBB xRRRRRGG GGGBBBBB ...
               'bpp'        = 16
               'depth'      = 15
               'endianness' = 1234 (G_LITTLE_ENDIAN)
               'red_mask'   = 0x7c00
               'green_mask' = 0x03e0
               'blue_mask'  = 0x001f

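The load, mask and shift rule for RGB color masks can be sketched in
plain C as follows. This is only an illustration of the rule as
described in this document, not code from any plugin. The names
load_pixel, extract_component and the RGB_*_ENDIAN constants are
assumptions made for this example; the constants carry the same values
(1234/4321) as the 'endianness' property. Assembling the value byte by
byte is one way to perform the "equivalent operation" for 24 bits.

```c
#include <assert.h>
#include <stdint.h>

#define RGB_LITTLE_ENDIAN 1234   /* value used by the 'endianness' property */
#define RGB_BIG_ENDIAN    4321

/* Load one pixel of 'bpp' bits (16, 24 or 32) from memory, byte by byte,
 * so that 24-bit pixels need no special casing. */
uint32_t load_pixel(const uint8_t *p, int bpp, int endianness)
{
    int nbytes = bpp / 8, i;
    uint32_t v = 0;

    assert(p != 0);
    assert(bpp == 16 || bpp == 24 || bpp == 32);
    assert(endianness == RGB_LITTLE_ENDIAN || endianness == RGB_BIG_ENDIAN);

    for (i = 0; i < nbytes; i++) {
        if (endianness == RGB_BIG_ENDIAN)
            v = (v << 8) | p[i];              /* first byte is most significant */
        else
            v |= (uint32_t)p[i] << (8 * i);   /* first byte is least significant */
    }
    return v;
}

/* Mask out one color component and shift it down to bit 0. */
uint32_t extract_component(uint32_t pixel, uint32_t mask)
{
    assert(mask != 0);
    pixel &= mask;
    while ((mask & 1) == 0) {   /* shift until the mask's lowest bit is bit 0 */
        mask >>= 1;
        pixel >>= 1;
    }
    return pixel;
}
```

With the 15-bit layout from the examples, the two bytes 0x7c 0x01 in
big endian order and 0x01 0x7c in little endian order both load as
0x7c01, and extract_component(0x7c01, 0x7c00) returns a red value
of 31.
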
3 - MPEG-1, -2 and -4 video (ISO/LA MPEG)
    mimetype: video/mpeg
    properties: 'systemstream' = FALSE (BOOLEAN)
                'mpegversion' = 1/2/4 (INT)
    known fourccs: MPEG, MPGI

4 - DivX 3.x, 4.x and 5.x video (divx.com)
    mimetype: video/x-divx
    optional properties: 'divxversion' = 3/4/5 (INT)
    known fourccs: DIV3, DIV4, DIV5, DIVX, DX50, divx

5 - Microsoft MPEG 4.1, 4.2 and 4.3
    mimetype: video/x-msmpeg
    optional properties: 'msmpegversion' = 41/42/43 (INT)
    known fourccs: MPG4, MP42, MP43

6 - Motion-JPEG (official and extended)
    mimetype: video/x-jpeg
    known fourccs: MJPG (YUY2 MJPEG), JPEG (any), PIXL (Pinnacle/Miro), VIXL

7 - Sorenson (Quicktime - SVQ1/SVQ3)
    mimetype: video/x-svq
    properties: 'svqversion' = 1/3 (INT)

8 - H263 and related codecs
    mimetype: video/x-h263
    known fourccs: H263, i263, M263, x263, VDOW, VIVO

9 - RealVideo (Real)
    mimetype: video/x-pn-realvideo
    properties: 'systemstream' = FALSE (BOOLEAN)
    known fourccs: RV10, RV20, RV30

10 - Digital Video (DV)
    mimetype: video/x-dv
    properties: 'systemstream' = FALSE (BOOLEAN)
    known fourccs: DVSD, dvsd

11 - Windows Media Video 1 and 2 (WMV)
    mimetype: video/x-wmv
    properties: 'wmvversion' = 1/2 (INT)

12 - XviD (xvid.org)
    mimetype: video/x-xvid
    known fourccs: xvid, XVID

13 - 3IVX (3ivx.com)
    mimetype: video/x-3ivx
    known fourccs: 3IV0, 3IV1, 3IV2

14 - Ogg/Tarkin (Xiph)
    mimetype: video/x-tarkin

15 - VP3
    mimetype: video/x-vp3

16 - Ogg/Theora (Xiph, VP3-like)
    mimetype: video/x-theora

17 - Huffyuv
    mimetype: video/x-huffyuv
    known fourccs: HFYU

18 - FF Video 1 (FFMPEG)
    mimetype: video/x-ffv
    properties: 'ffvversion' = 1 (INT)

19 - H264
    mimetype: video/x-h264

20 - Indeo 3 (Intel)
    mimetype: video/x-indeo
    properties: 'indeoversion' = 3 (INT)

21 - Portable Network Graphics (PNG)
    mimetype: video/x-png

TODO: subsampling information for YUV?

TODO: colorspace identifications for MJPEG? How?

TODO: how to distinguish MJPEG-A/B (Quicktime) and lossless JPEG?

TODO: divx4/divx5/xvid/3ivx/mpeg-4 - how to make them overlap? (all
      ISO MPEG-4 compatible)

3c) Audio Codecs
----------------
For convenience, the two-byte hexcodes (as used for identification
in AVI files) are also given where known.

Preface - (optional) properties for all audio formats:
    'rate' = X (INT)     <- sampling rate
    'channels' = X (INT) <- number of audio channels

1 - Raw Audio (integer format)
    mimetype: audio/x-raw-int
    properties: 'width' = X (INT)     <- memory bits per sample
                'depth' = X (INT)     <- used bits per sample
                'signed' = X (BOOLEAN)
                'endianness' = 1234/4321 (INT)

2 - Raw Audio (floating point format)
    mimetype: audio/x-raw-float
    properties: 'depth' = X (INT)     <- 32=float, 64=double
                'endianness' = 1234/4321 (INT) <- use G_BIG/LITTLE_ENDIAN!
                'slope' = X (FLOAT, normally 1.0)
                'intercept' = X (FLOAT, normally 0.0)

3 - Alaw Raw Audio
    mimetype: audio/x-alaw

4 - Mulaw Raw Audio
    mimetype: audio/x-mulaw

5 - MPEG-1 layer 1/2/3 audio
    mimetype: audio/mpeg
    properties: 'mpegversion' = 1 (INT)
                'layer' = 1/2/3 (INT)

6 - Ogg/Vorbis
    mimetype: audio/x-vorbis

7 - Windows Media Audio 1 and 2 (WMA)
    mimetype: audio/x-wma
    properties: 'wmaversion' = 1/2 (INT)

8 - AC3
    mimetype: audio/x-ac3

9 - FLAC (Free Lossless Audio Codec)
    mimetype: audio/x-flac

10 - MACE 3/6 (Quicktime audio)
    mimetype: audio/x-mace
    properties: 'maceversion' = 3/6 (INT)

11 - MPEG-4 AAC
    mimetype: audio/mpeg
    properties: 'mpegversion' = 4 (INT)

12 - (IMA) ADPCM (Quicktime/WAV/Microsoft/4XM)
    mimetype: audio/x-adpcm
    properties: 'layout' = "quicktime"/"wav"/"microsoft"/"4xm" (STRING)

    Note: the difference between each of these is the number of
          samples packed together per channel. For WAV, for
          example, each sample is 4 bit, and 8 samples are packed
          together per channel in the bytestream. For the others,
          refer to technical documentation.
          We probably want to distinguish these differently, but
          I don't know how, yet.

13 - RealAudio (Real)
    mimetype: audio/x-pn-realaudio
    properties: 'bitrate' = 14400/28800 (INT)

14 - DV Audio
    mimetype: audio/x-dv

15 - GSM Audio
    mimetype: audio/x-gsm

16 - Speex audio
    mimetype: audio/x-speex

TODO: adpcm/dv needs confirmation from someone with knowledge...

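For the raw integer audio format (item 1 in the list above), it is
'width' and not 'depth' that determines how much memory a sample
occupies. As a small sketch in plain C, assuming samples always occupy
whole bytes, the data rate of such a stream follows from 'rate',
'channels' and 'width'. The function name is made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes per second of an audio/x-raw-int stream.
 * 'width' is the number of memory bits per sample; 'depth' (the used
 * bits) does not influence the size of the data. */
size_t raw_int_bytes_per_second(unsigned rate, unsigned channels,
                                unsigned width)
{
    assert(width % 8 == 0);   /* assumption: samples occupy whole bytes */
    return (size_t)rate * channels * (width / 8);
}
```

For example, 16-bit stereo audio at 44100 Hz takes 176400 bytes per
second, and a mono stream at 48000 Hz with 'depth' = 24 stored in
'width' = 32 takes 192000 bytes per second.
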
3d) Plugin Guidelines
---------------------
So, a short bit on what plugins should do. Above, I've stated that
audio properties like "channels" and "rate" or video properties like
"width" and "height" are all optional. This doesn't mean you can
simply omit them and everything will still work!

An example is the best way to explain all this. AVI needs the width,
height, rate and channels for the AVI header. So if these properties
are missing, avimux cannot work. On the other hand, MPEG doesn't have
such properties in its header, so mpegdemux would need to parse the
stream in order to find them out; we don't want that either (a plugin
does one job). So normally, mpegdemux and avimux wouldn't allow
transcoding. To solve this problem, there are stream parser elements
(such as mpegaudioparse, ac3parse and mpeg1videoparse).

The conclusion to draw from this: a plugin provides the information it
can obtain as part of its own job. If it can't provide a property,
other elements might still need it, and a stream parser needs to be
written if one doesn't already exist.

On element properties that duplicate one of the caps properties above
(such as 'width', 'height', 'fps', etc.): they're forbidden and should
be handled using filtered caps.

4) Status of this document
--------------------------
Not all plugins strictly follow these guidelines yet, but these are the
official types. Plugins not following these specs either use extensions
that should be documented, or are buggy (and should be fixed).

Blame Ronald Bultje <rbultje@ronald.bitfreak.net> aka BBB for any mistakes
in this document.