Tagging (Metadata and Streaminfo)

Tags are pieces of information stored in a stream that are not the content itself but describe it. Most media container formats support tagging in one way or another: Ogg uses VorbisComment, MP3 uses ID3, AVI and WAV use RIFF's INFO list chunk, and so on. GStreamer provides a general way for elements to read tags from the stream and expose them to the user. The tags (at least the metadata) are part of the stream inside the pipeline. As a consequence, transcoding a file from one format to another automatically preserves tags, as long as both the input and output format elements support tagging.

Tags are separated into two categories in GStreamer, although applications will not notice the distinction. The first category is called metadata, the second streaminfo. Metadata are tags that describe the non-technical parts of the stream content. They can be changed without re-encoding the stream. Examples are author, title or album. The container format might still need to be rewritten for the tags to fit in, though. Streaminfo, on the other hand, are tags that describe the stream contents technically. To change them, the stream needs to be re-encoded. Examples are codec or bitrate.

Note that some container formats (like ID3) store various streaminfo tags as metadata in the file container, which means that they can be changed so that they no longer match the content of the file. They are still called metadata because, technically, they can be changed without re-encoding the whole stream, even though doing so makes them invalid. Files with such metadata tags will have the same tag twice: once as metadata, once as streaminfo.

A tag reading element is called TagGetter in GStreamer. A tag writer is called TagSetter. An element supporting both can be used in a tag editor for quick tag changing.

Reading Tags from Streams

The basic object for tags is a GstTagList.
An element that is reading tags from a stream should create an empty tag list and fill it with individual tags. Empty tag lists can be created with gst_tag_list_new (). The element can then fill the list using gst_tag_list_add_values (). Note that an element probably reads metadata as strings, but tag values are not necessarily strings. Use gst_value_transform () to make sure that your data is of the right type.

After reading the data, the element can notify the application of the new tag list by calling gst_element_found_tags (). The tags should also be part of the datastream, so they should be pushed over all source pads. The function gst_event_new_tag () creates an event from a tag list. This event can be pushed over source pads using gst_pad_push (). Simple elements with only one source pad can combine all these steps by using the function gst_element_found_tags_for_pad ().

The following example program reads a file and parses its data as metadata/tags rather than as actual content data. It parses each line as name:value, where name is the type of metadata (title, author, ...) and value is the metadata value. The _getline () is the same as the one given in .

    [...] srcpad, 0, taglist);
      gst_tag_list_free (taglist);

      /* send EOS */
      gst_pad_send_event (filter->srcpad,
          GST_DATA (gst_event_new (GST_EVENT_EOS)));
      gst_element_set_eos (element);
    }

The example assumes that the core already knows the tag type (this can be checked with gst_tag_exists ()). You can add new tags to the list of known tags using gst_tag_register (). If you think the tag will be useful in more cases than just your own element, it might be a good idea to add it to gsttag.c instead. That is up to you to decide. If you want to register it in your own element, it is easiest to do so in one of your class init functions, preferably _class_init ().

Writing Tags to Streams

Tag writers are the opposite of tag readers.
Tag writers only take metadata tags into account, since that is the only type of tag that has to be written into a stream. Tag writers can receive tags in three ways: internal, application and pipeline.

Internal tags are tags read by the element itself, which means that the tag writer is, in that case, a tag reader too. Application tags are tags provided to the element via the TagSetter interface (which is just a layer). Pipeline tags are tags provided to the element from within the pipeline. The element receives such tags via the GST_EVENT_TAG event, which means that tag writers should automatically be event aware. The tag writer is responsible for combining all three into one list and writing them to the output stream.

The example below receives tags from both application and pipeline, combines them and writes them to the output stream. It implements the tag setter so applications can set tags, and retrieves pipeline tags from incoming events.

    [...] srcpad, GST_DATA (buf));
      }

      g_value_unset (&to);
    }

    static void
    gst_my_filter_loopfunc (GstElement *element)
    {
      GstMyFilter *filter = GST_MY_FILTER (element);
      GstTagSetter *tagsetter = GST_TAG_SETTER (element);
      GstData *data;
      GstEvent *event;
      gboolean eos = FALSE;
      GstTagList *taglist = gst_tag_list_new ();

      while (!eos) {
        data = gst_pad_pull (filter->sinkpad);

        /* We're not very much interested in data right now:
         * drop buffers and go on to the next item */
        if (GST_IS_BUFFER (data)) {
          gst_buffer_unref (GST_BUFFER (data));
          continue;
        }

        event = GST_EVENT (data);
        switch (GST_EVENT_TYPE (event)) {
          case GST_EVENT_TAG:
            gst_tag_list_insert (taglist, gst_event_tag_get_list (event),
                GST_TAG_MERGE_PREPEND);
            gst_event_unref (event);
            break;
          case GST_EVENT_EOS:
            eos = TRUE;
            gst_event_unref (event);
            break;
          default:
            gst_pad_event_default (filter->sinkpad, event);
            break;
        }
      }

      /* merge tags with the ones retrieved from the application */
      if (gst_tag_setter_get_tag_list (tagsetter)) {
        gst_tag_list_insert (taglist,
            gst_tag_setter_get_tag_list (tagsetter),
            gst_tag_setter_get_tag_merge_mode (tagsetter));
      }

      /* write tags */
      gst_tag_list_foreach (taglist, gst_my_filter_write_tag, filter);

      /* signal EOS */
      gst_pad_push (filter->srcpad,
          GST_DATA (gst_event_new (GST_EVENT_EOS)));
      gst_element_set_eos (element);
    }

Note that normally, elements would not read the full stream before processing tags. Rather, they would read from each sinkpad until they have received data (since tags usually come in before the first data buffer) and process the tags collected up to that point.
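
The closing note can be illustrated without any GStreamer dependency. The following is only a minimal sketch in plain C: Tag, TagList, StreamItem, tag_list_set (), collect_leading_tags () and merge_tags () are hypothetical stand-ins invented for this illustration and are not part of the GStreamer API. The "replace" flag is likewise an assumed, simplified substitute for a real merge mode. The sketch collects tags only until the first non-tag item arrives and then merges in tags supplied by the application.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for a tag list and for items arriving on a
 * sink pad. None of these names belong to GStreamer. */
#define MAX_TAGS 16

typedef struct { const char *name; const char *value; } Tag;
typedef struct { Tag tags[MAX_TAGS]; size_t n_tags; } TagList;
typedef enum { ITEM_TAG, ITEM_BUFFER, ITEM_EOS } ItemType;
typedef struct { ItemType type; Tag tag; } StreamItem; /* tag used for ITEM_TAG only */

void
tag_list_init (TagList *list)
{
  assert (list != NULL);
  list->n_tags = 0;
}

/* Return the value stored under name, or NULL if the tag is absent. */
const char *
tag_list_get (const TagList *list, const char *name)
{
  size_t i;

  assert (list != NULL && name != NULL);
  for (i = 0; i < list->n_tags; i++)
    if (strcmp (list->tags[i].name, name) == 0)
      return list->tags[i].value;
  return NULL;
}

/* Store a tag. An existing value is overwritten only when replace is
 * nonzero. Returns 0 on success, -1 when the list is full. */
int
tag_list_set (TagList *list, const char *name, const char *value, int replace)
{
  size_t i;

  assert (list != NULL && name != NULL && value != NULL);
  for (i = 0; i < list->n_tags; i++) {
    if (strcmp (list->tags[i].name, name) == 0) {
      if (replace)
        list->tags[i].value = value;
      return 0;
    }
  }
  if (list->n_tags == MAX_TAGS)
    return -1;
  list->tags[list->n_tags].name = name;
  list->tags[list->n_tags].value = value;
  list->n_tags++;
  return 0;
}

/* Collect tags until the first non-tag item (buffer or EOS) and return
 * the index at which collection stopped. Later tags overwrite earlier
 * ones with the same name (an assumed policy). */
size_t
collect_leading_tags (const StreamItem *items, size_t n_items, TagList *out)
{
  size_t i;

  assert (items != NULL && out != NULL);
  for (i = 0; i < n_items; i++) {
    if (items[i].type != ITEM_TAG)
      break;
    tag_list_set (out, items[i].tag.name, items[i].tag.value, 1);
  }
  return i;
}

/* Merge every tag of from into into, using the given replace policy. */
void
merge_tags (TagList *into, const TagList *from, int replace)
{
  size_t i;

  assert (into != NULL && from != NULL);
  for (i = 0; i < from->n_tags; i++)
    tag_list_set (into, from->tags[i].name, from->tags[i].value, replace);
}
```

A caller would first run collect_leading_tags () over the items pulled so far, then call merge_tags () with the application's list, and only then write the combined list to the output. Tags that arrive after the first buffer are deliberately ignored in this sketch.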