gstreamer/docs/pwg/advanced-tagging.xml
Thomas Vander Stichele 55b67f084d fix non-validating docbook make sure validation gets checked before building
Original commit message from CVS:
fix non-validating docbook
make sure validation gets checked before building
2004-01-29 17:25:18 +00:00

288 lines
11 KiB
XML

<chapter id="chapter-advanced-tagging">
<title>Tagging (Metadata and Streaminfo)</title>
<para>
Tags are pieces of information stored in a stream that are not the content
itself, butthey rather <emphasis>describe</emphasis> the content. Most
media container formats support tagging in one way or another. Ogg uses
VorbisComment for this, MP3 uses ID3, AVI and WAV use RIFF's INFO list
chunk, etc. GStreamer provides a general way for elements to read tags from
the stream and expose this to the user. The tags (at least the metadata)
will be part of the stream inside the pipeline. The consequence of this is
that transcoding of files from one format to another will automatically
preserve tags, as long as the input and output format elements both support
tagging.
</para>
<para>
Tags are separated in two categories in GStreamer, even though applications
won't notice anything of this. The first are called <emphasis>metadata</emphasis>,
the second are called <emphasis>streaminfo</emphasis>. Metadata are tags
that describe the non-technical parts of stream content. They can be
changed without needing to re-encode the stream completely. Examples are
<quote>author</quote>, <quote>title</quote> or <quote>album</quote>. The
container format might still need to be re-written for the tags to fit in,
though. Streaminfo, on the other hand, are tags that describe the stream
contents technically. To change them, the stream needs to be re-encoded.
Examples are <quote>codec</quote> or <quote>bitrate</quote>. Note that some
container formats (like ID3) store various streaminfo tags as metadata in
the file container, which means that they can be changed so that they don't
match the content in the file anymore. Still, they are called metadata
because <emphasis>technically</emphasis>, they can be changed without
re-encoding the whole stream, even though that makes them invalid. Files
with such metadata tags will have the same tag twice: once as metadata,
once as streaminfo.
</para>
<para>
A tag reading element is called <classname>TagGetter</classname> in
&GStreamer;. A tag writer is called <classname>TagSetter</classname>. An
element supporting both can be used in a tag editor for quick tag changing.
</para>
<sect1 id="section-tagging-read" xreflabel="Reading Tags from Streams">
<title>Reading Tags from Streams</title>
<para>
The basic object for tags is a <classname>GstTagList</classname>. An
element that is reading tags from a stream should create an empty taglist
and fill this with individual tags. Empty tag lists can be created with
<function>gst_tag_list_new ()</function>. Then, the element can fill the
list using <function>gst_tag_list_add_values ()</function>. Note that
an element probably reads metadata as strings, but values might not
necessarily be strings. Be sure to use <function>gst_value_transform ()</function>
to make sure that your data is of the right type. After data reading, the
application can be notified of the new taglist by calling
<function>gst_element_found_tags ()</function>. The tags should also be
part of the datastream, so they should be pushed over all source pads.
The function <function>gst_event_new_tag ()</function> creates an event
from a taglist. This can be pushed over source pads using
<function>gst_pad_push ()</function>. Simple elements with only one
source pad can combine all these steps all-in-one by using the function
<function>gst_element_found_tags_for_pad ()</function>.
</para>
<para>
The following example program will parse a file and parse the data as
metadata/tags rathen than as actual content-data. It will parse each
line as <quote>name:value</quote>, where name is the type of metadata
(title, author, ...) and value is the metadata value. The
<function>_getline ()</function> is the same as the one given in
<xref linkend="section-reqpad-sometimes"/>.
</para>
<programlisting>
<![CDATA[
static void
gst_my_filter_loopfunc (GstElement *element)
{
GstMyFilter *filter = GST_MY_FILTER (element);
GstBuffer *buf;
GstTagList *taglist = gst_tag_list_new ();
/* get each line and parse as metadata */
while ((buf = gst_my_filter_getline (filter))) {
gchar *line = GST_BUFFER_DATA (buf), *colon_pos, *type = NULL;a
/* get the position of the ':' and go beyond it */
if (!(colon_pos = strchr (line, ':')))
goto next:
/* get the string before that as type of metadata */
type = g_strndup (line, colon_pos - line);
/* content is one character beyond the ':' */
colon_pos = &colon_pos[1];
if (*colon_pos == '\0')
goto next;
/* get the metadata category, it's value type, store it in that
* type and add it to the taglist. */
if (gst_tag_exists (type)) {
GValue from = { 0 }, to = { 0 };
GType to_type;
to_type = gst_tag_get_type (type);
g_value_init (&from, G_TYPE_STRING);
g_value_set_string (&from, colon_pos);
g_value_init (&to, to_type);
g_value_transform (&from, &to);
g_value_unset (&from);
gst_tag_list_add_values (taglist, GST_TAG_MERGE_APPEND,
type, &to, NULL);
g_value_unset (&to);
}
next:
g_free (type);
gst_buffer_unref (buf);
}
/* signal metadata */
gst_element_found_tags_for_pad (element, filter->srcpad, 0, taglist);
gst_tag_list_free (taglist);
/* send EOS */
gst_pad_send_event (filter->srcpad, GST_DATA (gst_event_new (GST_EVENT_EOS)));
gst_element_set_eos (element);
}
]]>
</programlisting>
<para>
We currently assume the core to already <emphasis>know</emphasis> the
mimetype (<function>gst_tag_exists ()</function>). You can add new tags
to the list of known tags using <function>gst_tag_register ()</function>.
If you think the tag will be useful in more cases than just your own
element, it might be a good idea to add it to <filename>gsttag.c</filename>
instead. That's up to you to decide. If you want to do it in your own
element, it's easiest to register the tag in one of your class init
functions, preferrably <function>_class_init ()</function>.
</para>
<programlisting>
<![CDATA[
static void
gst_my_filter_class_init (GstMyFilterClass *klass)
{
[..]
gst_tag_register ("my_tag_name", GST_TAG_FLAG_META,
G_TYPE_STRING,
_("my own tag"),
_("a tag that is specific to my own element"),
NULL);
[..]
}
]]>
</programlisting>
</sect1>
<sect1 id="section-tagging-write" xreflabel="Writing Tags to Streams">
<title>Writing Tags to Streams</title>
<para>
Tag writers are the opposite of tag readers. Tag writers only take
metadata tags into account, since that's the only type of tags that have
to be written into a stream. Tag writers can receive tags in three ways:
internal, application and pipeline. Internal tags are tags read by the
element itself, which means that the tag writer is - in that case - a tag
reader, too. Application tags are tags provided to the element via the
TagSetter interface (which is just a layer). Pipeline tags are tags
provided to the element from within the pipeline. The element receives
such tags via the <classname>GST_EVENT_TAG</classname> event, which means
that tags writers should automatically be event aware. The tag writer is
responsible for combining all these three into one list and writing them
to the output stream.
</para>
<para>
The example below will receive tags from both application and pipeline,
combine them and write them to the output stream. It implements the tag
setter so applications can set tags, and retrieves pipeline tags from
incoming events.
</para>
<programlisting>
<![CDATA[
GType
gst_my_filter_get_type (void)
{
[..]
static const GInterfaceInfo tag_setter_info = {
NULL,
NULL,
NULL
};
[..]
g_type_add_interface_static (my_filter_type,
GST_TYPE_TAG_SETTER,
&tag_setter_info);
[..]
}
static void
gst_my_filter_init (GstMyFilter *filter)
{
GST_FLAG_SET (filter, GST_ELEMENT_EVENT_AWARE);
[..]
}
/*
* Write one tag.
*/
static void
gst_my_filter_write_tag (const GstTagList *taglist,
const gchar *tagname,
gpointer data)
{
GstMyFilter *filter = GST_MY_FILTER (data);
GstBuffer *buffer;
guint num_values = gst_tag_list_get_tag_size (list, tag_name), n;
const GValue *from;
GValue to = { 0 };
g_value_init (&to, G_TYPE_STRING);
for (n = 0; n < num_values; n++) {
from = gst_tag_list_get_value_index (taglist, tagname, n);
g_value_transform (from, &to);
buf = gst_buffer_new ();
GST_BUFFER_DATA (buf) = g_strdup_printf ("%s:%s", tagname,
g_value_get_string (&to));
GST_BUFFER_SIZE (buf) = strlen (GST_BUFFER_DATA (buf));
gst_pad_push (filter->srcpad, GST_DATA (buf));
}
g_value_unset (&to);
}
static void
gst_my_filter_loopfunc (GstElement *element)
{
GstMyFilter *filter = GST_MY_FILTER (element);
GstTagSetter *tagsetter = GST_TAG_SETTER (element);
GstData *data;
GstEvent *event;
gboolean eos = FALSE;
GstTagList *taglist = gst_tag_list_new ();
while (!eos) {
data = gst_pad_pull (filter->sinkpad);
/* We're not very much interested in data right now */
if (GST_IS_BUFFER (data))
gst_buffer_unref (GST_BUFFER (data));
event = GST_EVENT (data);
switch (GST_EVENT_TYPE (event)) {
case GST_EVENT_TAG:
gst_tag_list_insert (taglist, gst_event_tag_get_list (event),
GST_TAG_MERGE_PREPEND);
gst_event_unref (event);
break;
case GST_EVENT_EOS:
eos = TRUE;
gst_event_unref (event);
break;
default:
gst_pad_event_default (filter->sinkpad, event);
break;
}
}
/* merge tags with the ones retrieved from the application */
if (gst_tag_setter_get_list (tagsetter)) {
gst_tag_list_insert (taglist,
gst_tag_setter_get_list (tagsetter),
gst_tag_setter_get_merge_mode (tagsetter));
}
/* write tags */
gst_tag_list_foreach (taglist, gst_my_filter_write_tag, filter);
/* signal EOS */
gst_pad_push (filter->srcpad, GST_DATA (gst_event_new (GST_EVENT_EOS)));
gst_element_set_eos (element);
}
]]>
</programlisting>
<para>
Note that normally, elements would not read the full stream before
processing tags. Rather, they would read from each sinkpad until they've
received data (since tags usually come in before the first data buffer)
and process that.
</para>
</sect1>
</chapter>