I'll use this doc to describe how I think media info should work from the
perspective of the application developer and end user, and from that
extrapolate what we need to provide that.

RATIONALE
---------
One of the strong points of GStreamer is that it abstracts library dependencies
away. A user is free to use whatever plug-ins he has installed, and a developer
can code to the general API that GStreamer provides without having to deal
with the underlying codecs.

It is important that GStreamer also handles media info well and efficiently,
since more often than not reading it requires the same libraries as decoding.
So to avoid applications depending on these libs just to do the media info,
we should make sure GStreamer provides a reasonable and fast abstraction
for this as well.

GOALS
-----
- quickly read and write "tags"
- quickly read stream metadata (technical properties, length, audio props, ...)
- cache both kinds of data transparently
- (possibly) provide bins that do this
- provide a simple API to do this

DEFINITION OF TERMS
-------------------
The user or developer using GStreamer is interested in all information that
describes the stream. The library handles the different kinds of information
differently, however, so I will use the following terms to describe them :

- metadata :
  every kind of information that is tied to the "concept" of the stream,
  and not tied to the actual encoding or representation of the stream.
  - it can be altered without transcoding the stream
  - it would stay the same for different encodings of the file
  - describes properties of the information encoded into the stream
  - examples:
    - artist, title, author
    - year, track order, album
    - comments

- mediainfo :
  every kind of information that is tied to the "codec" used.
  - cannot be altered without transcoding
  - is the set of parameters the stream has been encoded with
  - describes properties of the encoded stream itself
  - examples:
    - bitrate targets (e.g. nominal), encoding mode (e.g. joint stereo)
  - to this we also add "bitrate", but we query this through the pad_query
    interface

- format :
  every kind of information that is tied to the "raw" bitstream
  - cannot be altered without decoding and changing the raw bitstream
  - examples:
    - samplerate, bit depth/width, channels
    - length in time
    - video size, frames per second, colorspace used
  - the format is queried by getting the GstCaps of the pad that sources
    the buffers

  - length in time and tracks for the whole stream
    - gotten through pad queries
    - stored in variables in the struct

- immediate info :
  - examples:
    - position in time
    - current bitrate

- tracks :
  a media file or stream can contain multiple consecutive streams, which
  we will call "tracks". GStreamer also has a "track" format that is used in
  querying and seeking.
  A track should be thought of as the whole of one single piece of media
  inside a physical stream.
  A track can have at most one set of tags, and has fixed "raw" properties.

EXAMPLE PIPELINES
-----------------
reading metadata : filesrc ! id3v1
  - would read metadata from file
  - id3v1 immediately causes filesrc to seek until it has found either
    - the (first) metadata
    - that there is no metadata present
  - id3v1 sends out a property notification with name "metadata" and
    a GstCaps structure

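To give an idea of what the id3v1 element has to find, here is a minimal
sketch of parsing an ID3v1 tag. It assumes the commonly documented layout: a
128-byte block at the end of the file that starts with "TAG", followed by
fixed-width title (30), artist (30), album (30), year (4) and comment (30)
fields and a one-byte genre. Id3v1Tag, copy_field and id3v1_parse are
hypothetical names for illustration, not existing GStreamer API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ID3V1_SIZE 128

/* Hypothetical container for the fields of an ID3v1 tag. */
typedef struct {
  char title[31];
  char artist[31];
  char album[31];
  char year[5];
  char comment[31];
  unsigned char genre;
} Id3v1Tag;

/* Copy a fixed-width field and strip trailing padding (NULs or spaces). */
static void
copy_field (char *dst, const unsigned char *src, size_t len)
{
  memcpy (dst, src, len);
  dst[len] = '\0';
  while (len > 0 && (dst[len - 1] == ' ' || dst[len - 1] == '\0'))
    dst[--len] = '\0';
}

/* Parse the last 128 bytes of a file.
 * Returns 1 if a tag was found, 0 if there is no metadata present. */
static int
id3v1_parse (const unsigned char block[ID3V1_SIZE], Id3v1Tag *tag)
{
  assert (block != NULL && tag != NULL);

  if (memcmp (block, "TAG", 3) != 0)
    return 0;

  copy_field (tag->title, block + 3, 30);
  copy_field (tag->artist, block + 33, 30);
  copy_field (tag->album, block + 63, 30);
  copy_field (tag->year, block + 93, 4);
  copy_field (tag->comment, block + 97, 30);
  tag->genre = block[127];
  return 1;
}
```

The two return values match the two outcomes listed above: either the tag is
found, or the element learns that no metadata is present.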

resetting and writing content metadata :
  id3v1 reset=true artist="Arid" ! filesink

  - effect: clear the current tag and reset it to only have Arid as artist
  - id3v1 seeks to the right location, clears the tag, and writes the new one

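The "clear the tag and write the new one" step could look roughly like the
sketch below. It assumes the usual 128-byte ID3v1 layout (artist at offset 33,
30 bytes wide) and the common convention that genre 255 means "no genre".
id3v1_reset_with_artist is a hypothetical helper, and seeking to the right
location in the file is left out.

```c
#include <assert.h>
#include <string.h>

#define ID3V1_SIZE 128

/* Build a fresh ID3v1 block in which only the artist field is set.
 * All other text fields are cleared to NUL padding. */
static void
id3v1_reset_with_artist (unsigned char block[ID3V1_SIZE], const char *artist)
{
  size_t len;

  assert (block != NULL && artist != NULL);

  /* clear whatever tag was there before */
  memset (block, 0, ID3V1_SIZE);
  memcpy (block, "TAG", 3);

  /* the artist field is 30 bytes wide; longer names are truncated */
  len = strlen (artist);
  if (len > 30)
    len = 30;
  memcpy (block + 33, artist, len);

  /* assumption: 255 is used to signal "no genre" */
  block[127] = 255;
}
```

The resulting block would then be written over the last 128 bytes of the
file, or appended if the file had no tag yet.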

COST
----
Querying media info can be expensive.
Any application querying for media info should take this into account and
make sure that it doesn't block the app unnecessarily while the querying
happens.

The app should create an object, hand it a bunch of locations to query,
and connect to the signal the object is going to send out.

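One way to keep the app responsive is to let the object handle a single
location per call, for example from an idle handler, and report each result
through a callback. The sketch below only shows that shape. MediaQuery,
media_query_step and the plain function-pointer callback (standing in for a
real signal) are assumptions, and the actual query is left out.

```c
#include <assert.h>
#include <stddef.h>

#define QUERY_MAX 32

/* Stand-in for the signal: called once per queried location. */
typedef void (*InfoCallback) (const char *location, void *user_data);

typedef struct {
  const char *locations[QUERY_MAX];
  size_t n_locations;
  size_t next;                  /* index of the next location to query */
  InfoCallback callback;
  void *user_data;
} MediaQuery;

static void
media_query_init (MediaQuery *q, InfoCallback callback, void *user_data)
{
  assert (q != NULL && callback != NULL);
  q->n_locations = 0;
  q->next = 0;
  q->callback = callback;
  q->user_data = user_data;
}

/* Queue a location; returns 0 when the queue is full. */
static int
media_query_add (MediaQuery *q, const char *location)
{
  assert (q != NULL && location != NULL);
  if (q->n_locations == QUERY_MAX)
    return 0;
  q->locations[q->n_locations++] = location;
  return 1;
}

/* Handle at most one pending location. Returns 1 while work remains. */
static int
media_query_step (MediaQuery *q)
{
  assert (q != NULL);
  if (q->next >= q->n_locations)
    return 0;
  /* a real implementation would do the (possibly slow) query here */
  q->callback (q->locations[q->next], q->user_data);
  q->next++;
  return q->next < q->n_locations;
}

/* Example callback: counts the results that were delivered. */
static void
count_result (const char *location, void *user_data)
{
  (void) location;
  (*(int *) user_data)++;
}
```

The app would call media_query_step until it returns 0, doing its own work in
between.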

In most cases, querying content data should be fast since it doesn't involve
decoding.

Technical data could be harder to get and thus might be better queried only
when needed.

CACHE
-----
Getting media info can be an expensive operation. It makes sense to cache
the media info queried on-disk to provide rapid access to this data.
It is important however that this is done transparently - the system should
be able to keep working without it, or keep working when you delete this cache.

The API would provide a function like
  gst_media_info_read_cached (media_info, location,
                              GST_MEDIA_INFO_METADATA,
                              GST_MEDIA_INFO_READ_CACHED);

to try and get the cached metadata using the media info object.

- check if the file is cached in the media info cache
- if no, then read the media info and store it in the cache
- if yes, then check the file against its timestamp (or (part of) md5sum ?)
  - if it was changed, force a new read and store it in the cache
  - if it wasn't changed, just return the cached media info

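The steps above can be sketched as follows, using only a timestamp for the
"was it changed" check. This is an in-memory stand-in under stated
assumptions: MediaInfoCache, cache_get, the ReadFunc callback and dummy_read
are hypothetical, the media info is reduced to a string, and the caller passes
in the file's current modification time.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

#define CACHE_MAX 16

/* Hypothetical cache entry: location, timestamp at read time, cached info. */
typedef struct {
  char location[256];
  time_t mtime;
  char info[64];
} CacheEntry;

typedef struct {
  CacheEntry entries[CACHE_MAX];
  int n_entries;
  int n_reads;                  /* how often the expensive read was needed */
} MediaInfoCache;

/* Stand-in for the expensive, uncached read of the media info. */
typedef void (*ReadFunc) (const char *location, char *info, size_t size);

static const char *
cache_get (MediaInfoCache *cache, const char *location, time_t mtime,
    ReadFunc read_info)
{
  CacheEntry *entry = NULL;
  int i;

  assert (cache != NULL && location != NULL && read_info != NULL);

  /* check if the file is cached */
  for (i = 0; i < cache->n_entries; i++) {
    if (strcmp (cache->entries[i].location, location) == 0) {
      entry = &cache->entries[i];
      break;
    }
  }

  /* cached and unchanged: just return the cached media info */
  if (entry != NULL && entry->mtime == mtime)
    return entry->info;

  if (entry == NULL) {
    /* not cached yet: claim a slot (this sketch gives up when full) */
    if (cache->n_entries == CACHE_MAX)
      return NULL;
    entry = &cache->entries[cache->n_entries++];
    strncpy (entry->location, location, sizeof (entry->location) - 1);
    entry->location[sizeof (entry->location) - 1] = '\0';
  }

  /* new or changed file: force a read and store it in the cache */
  read_info (location, entry->info, sizeof (entry->info));
  entry->mtime = mtime;
  cache->n_reads++;
  return entry->info;
}

/* Example reader used for illustration only. */
static void
dummy_read (const char *location, char *info, size_t size)
{
  snprintf (info, size, "info for %s", location);
}
```

Deleting the cache only costs a re-read here, which is the transparency asked
for above.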

For optimizations, it might also make sense to do
  GList * gst_metadata_read_many (media_info, GList *locations, ...)

which would allow the back-end to implement this more efficiently.
Suppose an application loads a playlist, for example; this playlist
could then be handed to this function, and a GList of metadata types could
be returned.

Possible implementations :
- one large XML file : would end up being too heavy
- one XML file per dir on system : good compromise; would still make sense
  to keep this in memory instead of reading and writing it all the time.
  Also, odds are good that users mostly use files from same dir in one app
  (but not necessarily)

Possible extra niceties :
- matching of moved files, and a similar move of metadata (through user-space
  tool ?)

!!! For speed reasons, it might make sense to somehow keep the cache in memory
    instead of reparsing the same cache file each time.

!!! For disk space reasons, it might make sense to have a system cache.
    Not sure if the complexity added is worth it though.

!!! For disk space reasons, we might want to add an upper limit on the size of
    the cache. For that we might need a timestamp on last retrieval of metadata,
    so that we can drop the old ones.

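Dropping the old ones based on a last-retrieval timestamp could be as simple
as the sketch below. CachedItem and cache_trim are hypothetical names, and the
limit is expressed as a number of items rather than bytes to keep the example
short.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical cache item: only the fields needed for trimming. */
typedef struct {
  const char *location;
  time_t last_retrieved;        /* when this metadata was last asked for */
} CachedItem;

/* Remove the least recently retrieved items until at most max_items
 * remain. Returns the new number of items. Order is not preserved. */
static size_t
cache_trim (CachedItem *items, size_t n_items, size_t max_items)
{
  assert (items != NULL || n_items == 0);

  while (n_items > max_items) {
    size_t oldest = 0, i;

    for (i = 1; i < n_items; i++)
      if (items[i].last_retrieved < items[oldest].last_retrieved)
        oldest = i;

    /* overwrite the oldest item with the last one and shrink the array */
    items[oldest] = items[n_items - 1];
    n_items--;
  }
  return n_items;
}
```

This does a linear scan per dropped item, which is fine for a sketch; a real
cache would probably keep the items ordered by retrieval time.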

The cache should use standard glibc.
FIXME: is it worth it to use gnome-vfs for this ?

STANDARDIZATION OF MEDIAINFO
----------------------------
Different file formats have different "tags". It is not always possible
to map metadata to tags. Some level of agreement on metadata names is also
required.

For media info, the names or properties should be fairly standard.
We also use the same names as used for properties and capabilities in
GStreamer.

This means we use
- encoded audio
  - "bitrate" (which is bits per second - use the most correct one,
    i.e. average bitrate for VBR for example)

- raw audio
  - "samplerate" - sampling frequency
  - "channels"
  - "bitwidth" - how wide is the audio in bits

- encoded video
  - "bitrate"

- raw video
  (FIXME: I don't know enough about video, are these correct ?)
  - "width"
  - "height"
  - "colordepth"
  - "colorspace"
  - "fps"
  - "aspectratio"

We must find a way to avoid name collisions. A system stream can contain both
audio and video (so "bitrate" alone would be ambiguous), or multiple audio or
video streams. One way to do this might be to make a metadata set for a stream
a GList of metadata for elementary streams.

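A minimal sketch of that idea: each elementary stream gets its own set of
properties, so "bitrate" is always looked up within one stream. Plain arrays
are used instead of a GList to keep the example free of dependencies, and
Property, ElementaryInfo and stream_get are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical name/value pair; values are plain integers for brevity. */
typedef struct {
  const char *name;
  long value;
} Property;

/* Media info of one elementary stream inside a system stream. */
typedef struct {
  const char *id;               /* identifier of the elementary stream */
  const Property *props;
  size_t n_props;
} ElementaryInfo;

/* Look up a property of one elementary stream.
 * Returns 1 and stores the value when found, 0 otherwise. */
static int
stream_get (const ElementaryInfo *streams, size_t n_streams,
    const char *id, const char *name, long *value)
{
  size_t i, j;

  assert (streams != NULL && id != NULL && name != NULL && value != NULL);

  for (i = 0; i < n_streams; i++) {
    if (strcmp (streams[i].id, id) != 0)
      continue;
    for (j = 0; j < streams[i].n_props; j++) {
      if (strcmp (streams[i].props[j].name, name) == 0) {
        *value = streams[i].props[j].value;
        return 1;
      }
    }
  }
  return 0;
}
```

With this layout the audio and video bitrates of the same system stream can
coexist under the same property name.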

For metadata and tags, the standards are less clear.
Some nice ones to standardize on might be
- artist
- title
- author
- year
- genre (messy though)
- RMS, inpoint, outpoint (calculated through some formula, used for mixing)

TESTING
-------
It is important to write a decent testsuite for this and do speed comparisons
between using the underlying library directly and the GStreamer implementation.


API
---
struct GstMetadata
{
  gchar *location;
  GstMetadataType type;

  GList *streams;
  GHashTable *values;
};

(streams would be a GList of (again) GstMetadata's.
 "location" would then be reused to indicate an identifier in the stream.
 FIXME: is that evil ?)

GstMetadataType - technical, content
GstMetadataReadType - cached, raw

GstMetadata *
gst_metadata_read (const char *location,
                   GstMetadataType type,
                   GstMetadataReadType read_type);
GstMetadata *
gst_metadata_read_props (const char *location,
                         GList *names,
                         GstMetadataType type,
                         GstMetadataReadType read_type);
GstMetadata *
gst_metadata_read_cached (const char *location,
                          GstMetadataType type,
                          GstMetadataReadType read_type);

GstMetadata *
gst_metadata_read_props_cached (...)

GList *
gst_metadata_read_cached_many (GList *locations,
                               GstMetadataType type,
                               GstMetadataReadType read_type);

GList *
gst_metadata_read_props_cached_many (GList *locations,
                                     GList *names,
                                     GstMetadataType type,
                                     GstMetadataReadType read_type);

GList *
gst_metadata_content_write (const char *location,
                            GstMetadata *metadata);


SOME USEFUL RESOURCES
---------------------
http://www.chin.gc.ca/English/Standards/metadata_multimedia.html
  - describes multimedia data for images
    distinction between content (descriptive), technical and
    administrative metadata