gstreamer/docs/random/thomasvs/metadata

I'll use this doc to describe how I think metadata should work from the
perspective of the application developer and end user, and from that
extrapolate what we need to provide that.

RATIONALE
---------
One of the strong points of GStreamer is that it abstracts library dependencies
away.  A user is free to choose whatever plug-ins he has, and a developer
can code to the general API that GStreamer provides without having to deal
with the underlying codecs.

It is important that GStreamer also handles metadata well and efficiently,
since more often than not the same libraries are needed to do this.  So
to avoid applications depending on these libs just to do the metadata,
we should make sure GStreamer provides a reasonable and fast abstraction
for this as well.

GOALS
-----
- quickly read and write content metadata
- quickly read stream metadata
- cache both kinds of data transparently
- (possibly) provide bins that do this
- provide a simple API to do this

DEFINITION OF TERMS
-------------------
The user or developer using GStreamer is interested in all information that
describes the stream.  The library handles two types of such information
differently, however, so I will use the following terms to describe them:

- content metadata
  every kind of information that is tied to the "concept" of the stream,
  and not tied to the actual encoding or representation of the stream.
  - it can be altered without transcoding the stream
  - it would stay the same for different encodings of the file
  - describes properties of the information encoded into the stream
  - examples:
    - artist, title, author
    - year, track order, album
    - comments

- stream metadata
  every kind of information that is tied to the "codec" or "representation"
  used.
  - cannot be altered without transcoding
  - is the set of parameters the stream has been encoded with
  - describes properties of the stream itself
  - examples:
    - samplerate, bit depth/width, channels
    - bitrate, encoding mode (e.g. joint stereo)
    - video size, frames per second, colorspace used
    - length in time
    

EXAMPLE PIPELINES
-----------------
reading content metadata : filesrc ! id3v1
  - would read metadata from file
  - id3v1 immediately causes filesrc to seek until it has found either
    - the (first) metadata, or
    - that there is no metadata present
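
For ID3v1 in particular, "seeking until it has found the metadata" comes down
to looking at the last 128 bytes of the file, which start with "TAG" when a
tag is present.  A minimal sketch of that check in plain C, outside of any
GStreamer element; Id3v1Info and id3v1_parse are made-up names for this
example, and the caller is assumed to have already read the last 128 bytes
of the file into a buffer:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ID3V1_TAG_SIZE 128

/* Hypothetical holder for the fields this sketch extracts. */
typedef struct
{
  char title[31];
  char artist[31];
  char album[31];
  char year[5];
} Id3v1Info;

/* Copy a fixed-width field and strip trailing padding (NULs or spaces). */
static void
copy_field (char *dst, const unsigned char *src, size_t len)
{
  memcpy (dst, src, len);
  dst[len] = '\0';
  while (len > 0 && (dst[len - 1] == ' ' || dst[len - 1] == '\0'))
    dst[--len] = '\0';
}

/* Returns 1 and fills info if the 128 bytes hold an ID3v1 tag, else 0. */
int
id3v1_parse (const unsigned char *tail, Id3v1Info *info)
{
  assert (tail != NULL && info != NULL);

  if (memcmp (tail, "TAG", 3) != 0)
    return 0;                   /* no metadata present */

  copy_field (info->title, tail + 3, 30);
  copy_field (info->artist, tail + 33, 30);
  copy_field (info->album, tail + 63, 30);
  copy_field (info->year, tail + 93, 4);
  return 1;
}
```

Both outcomes listed above map onto the return value: 1 for "found the
metadata", 0 for "there is no metadata present".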

resetting and writing content metadata :
	filesrc ! id3v1 reset=true artist="Arid" ! filesink

  - effect: clear the current tag and reset it to only have Arid as artist
  - id3v1 seeks to the right location, clears the tag, and writes the new one
  - filesrc might not be necessary here
  - this probably only works when doing an in-place edit
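
What reset=true artist="Arid" would have to produce can be sketched
independently of the element.  The function below is only an illustration
(id3v1_reset_with_artist is a made-up name): it builds a fresh 128-byte ID3v1
tag in which every field except the artist is cleared.  Writing it back over
the old tag, or appending it when there was none, is left to the caller.

```c
#include <assert.h>
#include <string.h>

#define ID3V1_TAG_SIZE 128

/* Build a fresh ID3v1 tag that carries only an artist; all other
 * fields are cleared.  Artists longer than 30 bytes are truncated. */
void
id3v1_reset_with_artist (unsigned char tag[ID3V1_TAG_SIZE], const char *artist)
{
  size_t len;

  assert (tag != NULL && artist != NULL);

  memset (tag, 0, ID3V1_TAG_SIZE);      /* clear the current tag */
  memcpy (tag, "TAG", 3);               /* tag identifier */

  len = strlen (artist);
  if (len > 30)
    len = 30;
  memcpy (tag + 33, artist, len);       /* artist field, bytes 33..62 */

  /* Genre byte: 0 would mean "Blues", 255 is commonly used for "unset". */
  tag[127] = 255;
}
```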

COST
----
Querying metadata can be expensive.
Any application querying for metadata should take this into account and
make sure that it doesn't block the app unnecessarily while the querying
happens.

In most cases, querying content metadata should be fast since it doesn't
involve decoding.

Technical (stream) metadata could be more expensive to obtain, and thus might
be better queried only when needed.

CACHE
-----
Getting metadata can be an expensive operation.  It makes sense to cache
the metadata queried on-disk to provide rapid access to this data.
It is important however that this is done transparently - the system should
be able to keep working without it, or keep working when you delete this cache.

The API would provide a function like 
	gst_metadata_content_read_cached (location) 
or even 
	gst_metadata_read_cached (location, GST_METADATA_CONTENT, GST_METADATA_READ_CACHED)
to try and get the cached metadata.

- check if the file is cached in the metadata cache
  - if no, then read the metadata and store it in the cache
  - if yes, then check the file against its timestamp (or (part of) md5sum ?)
    - if it was changed, force a new read and store it in the cache
    - if it wasn't changed, just return the cached metadata
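
The steps above can be sketched as follows.  This is not a proposal for the
real implementation: the cache is a small fixed-size in-memory table, only an
"artist" value is stored, read_raw is a stand-in for the expensive read, and
all names are invented for the example.  The file's modification time is
passed in by the caller (a real caller would get it from stat()) so that the
logic can be followed without touching the disk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

#define CACHE_SLOTS 16

typedef struct
{
  char location[256];
  time_t mtime;                 /* file timestamp at the time of caching */
  char artist[64];              /* the only cached value in this sketch */
  int used;
} CacheEntry;

typedef struct
{
  CacheEntry slots[CACHE_SLOTS];
  int raw_reads;                /* counts how often the slow path ran */
} MetadataCache;

/* Stand-in for the expensive read of metadata from the file itself. */
static void
read_raw (MetadataCache *cache, const char *location, char *artist, size_t size)
{
  cache->raw_reads++;
  snprintf (artist, size, "artist of %s", location);
}

const char *
cache_read (MetadataCache *cache, const char *location, time_t file_mtime)
{
  CacheEntry *entry = NULL;
  CacheEntry *free_slot = NULL;
  size_t i;

  assert (cache != NULL && location != NULL);

  /* check if the file is cached */
  for (i = 0; i < CACHE_SLOTS; i++) {
    CacheEntry *e = &cache->slots[i];

    if (e->used && strcmp (e->location, location) == 0)
      entry = e;
    else if (!e->used && free_slot == NULL)
      free_slot = e;
  }

  /* cached and unchanged: just return the cached metadata */
  if (entry != NULL && entry->mtime == file_mtime)
    return entry->artist;

  /* not cached, or changed: force a new read and store it */
  if (entry == NULL)
    entry = free_slot != NULL ? free_slot : &cache->slots[0];
  snprintf (entry->location, sizeof entry->location, "%s", location);
  entry->mtime = file_mtime;
  entry->used = 1;
  read_raw (cache, location, entry->artist, sizeof entry->artist);
  return entry->artist;
}
```

A zero-initialized MetadataCache is an empty cache, so deleting the cache
just means every lookup takes the slow path once.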

For optimizations, it might also make sense to do 
	GList * gst_metadata_read_many (GList *locations, ...)
which would allow the back-end to implement this more efficiently.
Suppose an application loads a playlist, for example; this playlist could
then be handed to this function, and a GList of GstMetadata structures could
be returned.
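
To illustrate where the saving could come from, here is a sketch that assumes
a cache file per directory.  It does not read any metadata at all; it only
counts how many times a directory's cache would have to be loaded when the
whole playlist is handed over at once.  metadata_read_many and dir_length are
names made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Length of the directory part of a path, excluding the final '/'. */
static size_t
dir_length (const char *location)
{
  const char *slash = strrchr (location, '/');

  return slash != NULL ? (size_t) (slash - location) : 0;
}

/* Walk a playlist and return how many per-directory caches had to be
 * loaded.  A cache is only (re)loaded when the directory differs from
 * that of the previous entry. */
int
metadata_read_many (const char **locations, size_t n)
{
  const char *cur_dir = NULL;
  size_t cur_len = 0;
  int loads = 0;
  size_t i;

  assert (locations != NULL || n == 0);

  for (i = 0; i < n; i++) {
    size_t len = dir_length (locations[i]);

    if (cur_dir == NULL || len != cur_len
        || strncmp (cur_dir, locations[i], len) != 0) {
      cur_dir = locations[i];
      cur_len = len;
      loads++;                  /* stand-in for parsing that cache file */
    }
    /* ... look up locations[i] in the currently loaded cache here ... */
  }
  return loads;
}
```

For a playlist with two files from one directory followed by one file from
another, this loads two caches, where three separate single-file calls that
share no state would load three.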

Possible implementations :
- one large XML file : would end up being too heavy
- one XML file per dir on system : good compromise; would still make sense
  to keep this in memory instead of reading and writing it all the time
  Also, odds are good that users mostly use files from same dir in one app
  (but not necessarily)

Possible extra niceties :
- matching of moved files, and a similar move of metadata (through user-space
  tool ?)

!!! For speed reasons, it might make sense to somehow keep the cache in memory
instead of reparsing the same cache file each time.

!!! For disk space reasons, it might make sense to have a system cache.
Not sure if the complexity added is worth it though.

!!! For disk space reasons, we might want to add an upper limit on the size of
the cache.  For that we might need a timestamp on last retrieval of metadata,
so that we can drop the old ones.
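
A sketch of such a limit, assuming the limit is expressed as a number of
entries rather than bytes and that every entry carries the time of its last
retrieval.  CacheEntry and cache_trim are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

typedef struct
{
  const char *location;
  time_t last_retrieved;        /* when this metadata was last asked for */
  int used;
} CacheEntry;

/* Drop the least recently retrieved entries until at most max_entries
 * remain in use.  Returns the number of entries dropped. */
size_t
cache_trim (CacheEntry *entries, size_t n, size_t max_entries)
{
  size_t used = 0, dropped = 0, i;

  assert (entries != NULL || n == 0);

  for (i = 0; i < n; i++)
    if (entries[i].used)
      used++;

  while (used > max_entries) {
    CacheEntry *oldest = NULL;

    /* used > max_entries >= 0, so at least one entry is in use */
    for (i = 0; i < n; i++)
      if (entries[i].used
          && (oldest == NULL
              || entries[i].last_retrieved < oldest->last_retrieved))
        oldest = &entries[i];

    oldest->used = 0;
    used--;
    dropped++;
  }
  return dropped;
}
```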

The cache should use standard glibc.
FIXME: is it worth it to use gnome-vfs for this ?

STANDARDIZATION OF METADATA
---------------------------
Different file formats have different "tags".  It is not always possible
to map metadata to tags.  Some level of agreement on metadata names is also
required.

For technical metadata, the names or properties should be fairly standard.
We also use the same names as used for properties and capabilities in 
GStreamer.

This means we use
  - encoded audio
    - "bitrate" (which is bits per second - use the most correct one,
             i.e. average bitrate for VBR for example)

  - raw audio
    - "samplerate" - sampling frequency
    - "channels"
    - "bitwidth" - how wide is the audio in bits

  - encoded video
    - "bitrate"

  - raw video
    (FIXME: I don't know enough about video, are these correct ?)
    - "width"
    - "height"
    - "colordepth"
    - "colorspace"
    - "fps"
    - "aspectratio"

We must find a way to avoid collision.  A system stream can contain both
audio and video (-> bitrate) or multiple audio or video streams.  One way
to do this might be to make a metadata set for a stream a GList of metadata
for elementary streams.
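
A sketch of that idea, with plain arrays standing in for the GList and
integer-only values to keep it short.  Property, StreamMetadata and
stream_get are names invented for this example; the "id" field plays the
role of an identifier of the elementary stream inside the system stream.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct
{
  const char *name;             /* e.g. "bitrate" */
  long value;
} Property;

typedef struct
{
  const char *id;               /* identifier of the elementary stream */
  const Property *props;
  size_t n_props;
} StreamMetadata;

/* Look up a property on one elementary stream.  Returns 1 and stores
 * the value if the stream has that property, 0 otherwise. */
int
stream_get (const StreamMetadata *streams, size_t n_streams,
            const char *id, const char *name, long *value)
{
  size_t i, j;

  assert (id != NULL && name != NULL && value != NULL);

  for (i = 0; i < n_streams; i++) {
    if (strcmp (streams[i].id, id) != 0)
      continue;
    for (j = 0; j < streams[i].n_props; j++) {
      if (strcmp (streams[i].props[j].name, name) == 0) {
        *value = streams[i].props[j].value;
        return 1;
      }
    }
  }
  return 0;
}
```

Because "bitrate" is always asked for on a specific elementary stream, the
audio and video bitrates of one system stream no longer collide.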

For content metadata, the standards are less clear.
Some nice ones to standardize on might be
  - artist
  - title
  - author
  - year
  - genre (touchy though)
  - RMS, inpoint, outpoint (calculated through some formula, used for mixing)
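
Agreeing on names means each format needs a mapping from its own tags to the
agreed names.  A small sketch of such a table follows; the ID3v2 frame IDs
and Vorbis comment field names are the real ones, but the format labels
("id3v2", "vorbiscomment"), the table and metadata_standard_name are made up
for this example.  A tag with no agreed name simply has no entry.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct
{
  const char *format;           /* hypothetical label for the tag format */
  const char *tag;              /* tag as the format itself names it */
  const char *name;             /* agreed metadata name */
} TagMapping;

static const TagMapping tag_map[] = {
  { "id3v2", "TPE1", "artist" },
  { "id3v2", "TIT2", "title" },
  { "id3v2", "TCON", "genre" },
  { "vorbiscomment", "ARTIST", "artist" },
  { "vorbiscomment", "TITLE", "title" },
  { "vorbiscomment", "GENRE", "genre" },
};

/* Returns the agreed name for a format-specific tag, or NULL when the
 * tag cannot be mapped. */
const char *
metadata_standard_name (const char *format, const char *tag)
{
  size_t i;

  assert (format != NULL && tag != NULL);

  for (i = 0; i < sizeof tag_map / sizeof tag_map[0]; i++)
    if (strcmp (tag_map[i].format, format) == 0
        && strcmp (tag_map[i].tag, tag) == 0)
      return tag_map[i].name;
  return NULL;
}
```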

TESTING
-------
It is important to write a decent testsuite for this and do speed comparisons
between the library used and the GStreamer implementation.


API
---
struct GstMetadata
{	
	gchar *location;
	GstMetadataType type;

	GList *streams;
	GHashTable *values;
};

(streams would be a GList of (again) GstMetadata structures.
 "location" would then be reused to indicate an identifier in the stream.
 FIXME: is that evil ?)

GstMetadataType - technical, content
GstMetadataReadType - cached, raw

GstMetadata *
gst_metadata_read (const char *location,
                   GstMetadataType type,
                   GstMetadataReadType read_type);
GstMetadata *
gst_metadata_read_props (const char *location,
                         GList *names,
                         GstMetadataType type,
                         GstMetadataReadType read_type);
GstMetadata *
gst_metadata_read_cached (const char *location, 
                          GstMetadataType type,
                          GstMetadataReadType read_type);

GstMetadata *
gst_metadata_read_props_cached (...)

GList *
gst_metadata_read_cached_many (GList *locations, 
                          GstMetadataType type,
                          GstMetadataReadType read_type);

GList *
gst_metadata_read_props_cached_many (GList *locations,
				     GList *names,
                                     GstMetadataType type,
                                     GstMetadataReadType read_type);

GList *
gst_metadata_content_write (const char *location,
                            GstMetadata *metadata);


SOME USEFUL RESOURCES
---------------------
http://www.chin.gc.ca/English/Standards/metadata_multimedia.html
- describes multimedia data for images
  distinction between content (descriptive), technical and 
  administrative metadata

Original commit message from CVS (2002-10-28 02:22:01 +00:00):
some ideas on how to do metadata in gst