Caps negotiation

Caps negotiation is the process by which elements configure themselves and each other to stream a particular media format over their pads. Because different types of elements have different requirements for the media formats they can negotiate, this process has to be generic and has to handle all of those use cases correctly.

In this chapter, we will discuss downstream negotiation and upstream negotiation from a pipeline perspective, including the responsibilities of the different types of elements in a pipeline, and we will introduce the concept of fixed caps.

Caps negotiation use cases

Let's take the case of a file source, linked to a demuxer, linked to a decoder, linked to a converter with a caps filter and finally an audio output. When data flow first starts, the demuxer parses the file header (e.g. the Ogg headers) and notices that there is, for example, a Vorbis stream in this Ogg file. It then creates an output pad for the Vorbis elementary stream, sets Vorbis caps on it and, lastly, adds the pad. From this point on, the pad is ready to stream data, and the Ogg demuxer's part in the negotiation is done. This pad is not renegotiable, since the type of the data stream is embedded within the data.

The Vorbis decoder will decode the Vorbis headers and the Vorbis data coming in on its sinkpad. Some decoders can produce several output formats, for example both 16-bit integer and floating-point output, whereas others can decode into only one specific format, e.g. only 32-bit floating-point audio. These two cases have consequences for how caps negotiation should be implemented in the decoder element. A decoder with a single output format can use fixed caps, and it is done. A decoder with several output formats, however, should support renegotiation, which is the possibility for the data format to be changed to another format at some point in the future.
We will discuss how to do this in one of the sections further on in this chapter.

Applications can use the filter to force, for example, a specific channel configuration (5.1/surround or 2.0/stereo) on the pipeline, so that the user can enjoy sound coming from all of their speakers. The audio sink, in this example, is a standard ALSA output element (alsasink). The converter element supports any-to-any conversion, and the filter makes sure that only the wanted channel configuration streams through this link (as given by the user's channel configuration preference). When this preference changes while the pipeline is running, some elements have to renegotiate. This is done through upstream caps renegotiation, which is also discussed in detail in a section further below.

In order for caps negotiation on non-fixed links to work correctly, pads can optionally implement a query function that tells peer elements which formats the pad supports and/or prefers. This becomes important when upstream renegotiation is triggered.

Downstream elements are notified of newly set caps with a GST_EVENT_CAPS on the sinkpad. So when the Vorbis decoder sets caps on its source pad (to configure the output format), the converter will receive a caps event.

When an element receives a buffer, it should check whether it has previously received all needed format information in a CAPS event. If it hasn't, it should return an error (typically GST_FLOW_NOT_NEGOTIATED) from the chain function.

Fixed caps

The simplest way to do caps negotiation is to set fixed caps on a pad. After fixed caps have been set, the pad cannot be renegotiated from the outside. The only way to reconfigure the pad is for the element owning the pad to set new fixed caps on it. Fixed caps is a setup property for pads, set when creating the pad:

    [..]
      pad = gst_pad_new_from_static_template (..);
      gst_pad_use_fixed_caps (pad);
    [..]

The fixed caps can then be set on the pad by calling gst_pad_set_caps ().

    [..]
        caps = gst_caps_new_simple ("audio/x-raw",
            "format", G_TYPE_STRING, GST_AUDIO_NE(F32),
            "rate", G_TYPE_INT, <samplerate>,
            "channels", G_TYPE_INT, <num-channels>, NULL);
        if (!gst_pad_set_caps (pad, caps)) {
          GST_ELEMENT_ERROR (element, CORE, NEGOTIATION, (NULL),
              ("Some debug information here"));
          return GST_FLOW_ERROR;
        }
    [..]

Elements that could implement fixed caps (on their source pads) are, in general, all elements that are not renegotiable. Examples include:

- A typefinder, since the type found is part of the actual data stream and can thus not be re-negotiated.
- Pretty much all demuxers, since the contained elementary data streams are defined in the file headers, and thus not renegotiable.
- Some decoders, where the format is embedded in the data stream and not part of the peercaps and where the decoder itself is not reconfigurable, too.

All other elements that need to be configured for the format should implement full caps negotiation, which will be explained in the next few sections.

Downstream caps negotiation

Downstream negotiation takes place when a format needs to be set on a source pad to configure the output format, but this element allows renegotiation because its format is configured on the sinkpad caps, or because it supports multiple formats. The requirements for doing the actual negotiation differ slightly.

Negotiating caps embedded in input caps

Many elements, particularly effects and converters, will be able to parse the format of the stream from their input caps, and decide the output format right at that time already. For those elements, all (downstream) caps negotiation can be done from the _event () function when a GST_EVENT_CAPS is received on the sinkpad. This CAPS event is received whenever the format changes or when no format was negotiated yet. It will always be called before you receive the buffer in the format specified in the CAPS event.

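The ordering rule above (format first, buffers afterwards) and the check a chain function is expected to make can be modelled without GStreamer. The following is a minimal sketch in plain C, not GStreamer code: ToyFilter, ToyFlow, toy_filter_handle_caps () and toy_filter_chain () are hypothetical names introduced only for illustration, and TOY_FLOW_NOT_NEGOTIATED merely plays the role that GST_FLOW_NOT_NEGOTIATED plays in a real element.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for illustration; none of these are GStreamer types. */
typedef enum {
  TOY_FLOW_OK,
  TOY_FLOW_NOT_NEGOTIATED
} ToyFlow;

typedef struct {
  bool negotiated;    /* TRUE once a format has been received */
  int samplerate;
  int channels;
  int buffers_seen;   /* number of buffers accepted so far */
} ToyFilter;

/* Models the CAPS event handler: store the format and remember that
 * the element is now configured. */
static void
toy_filter_handle_caps (ToyFilter *filter, int samplerate, int channels)
{
  assert (samplerate > 0 && channels > 0);

  filter->samplerate = samplerate;
  filter->channels = channels;
  filter->negotiated = true;
}

/* Models the chain function: refuse data until a format is known. */
static ToyFlow
toy_filter_chain (ToyFilter *filter)
{
  if (!filter->negotiated)
    return TOY_FLOW_NOT_NEGOTIATED;

  filter->buffers_seen++;
  return TOY_FLOW_OK;
}
```

In this model, calling toy_filter_chain () on a freshly zeroed ToyFilter is rejected, and the same call succeeds once toy_filter_handle_caps () has run.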
In the _event () function, the element can forward the CAPS event to the next element and, if that pad accepts the format too, parse the relevant parameters from the caps and configure itself internally. The caps passed to this function are always a subset of the template caps, so there is no need for extensive safety checking. The following example shows how such a function can be implemented:

    static gboolean
    gst_my_filter_sink_event (GstPad    *pad,
                              GstObject *parent,
                              GstEvent  *event)
    {
      gboolean ret;
      GstMyFilter *filter = GST_MY_FILTER (parent);

      switch (GST_EVENT_TYPE (event)) {
        case GST_EVENT_CAPS:
        {
          GstCaps *caps;
          GstStructure *s;

          /* the caps stay owned by the event */
          gst_event_parse_caps (event, &caps);

          /* forward-negotiate */
          ret = gst_pad_set_caps (filter->srcpad, caps);
          if (ret) {
            /* negotiation succeeded, so now configure ourselves */
            s = gst_caps_get_structure (caps, 0);
            gst_structure_get_int (s, "rate", &filter->samplerate);
            gst_structure_get_int (s, "channels", &filter->channels);
          }

          /* the event itself is not forwarded, so release it here */
          gst_event_unref (event);
          break;
        }
        default:
          ret = gst_pad_event_default (pad, parent, event);
          break;
      }
      return ret;
    }

There may also be cases where the filter is actually able to change the format of the stream. In those cases, it will negotiate a new format. The element should first attempt to configure pass-through, which means that it does not change the stream's format. If that fails, it should call gst_pad_get_allowed_caps () on its source pad to get the list of formats supported on the output, and pick the first. The return value of that function is guaranteed to be a subset of the template caps, or NULL when there is no peer.

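The decision rule just described (try pass-through first, otherwise take the first allowed format) can be sketched in isolation. This is a toy model in plain C, not GStreamer code: ToyFormat and toy_choose_output () are hypothetical names, and each format is a single fixed rate/channels pair, whereas real caps can hold ranges and lists that still need fixating.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a fully fixed audio format; not a GStreamer type. */
typedef struct {
  int rate;
  int channels;
} ToyFormat;

static bool
toy_format_equal (const ToyFormat *a, const ToyFormat *b)
{
  return a->rate == b->rate && a->channels == b->channels;
}

/* allowed[] models the formats accepted downstream, best first.
 * Returns false when nothing is allowed. Otherwise stores the chosen
 * output format in *out and sets *passthrough to whether the input
 * can be forwarded unchanged. */
static bool
toy_choose_output (const ToyFormat *in, const ToyFormat *allowed,
    size_t n_allowed, ToyFormat *out, bool *passthrough)
{
  size_t i;

  assert (in != NULL && out != NULL && passthrough != NULL);

  if (n_allowed == 0)
    return false;

  /* 1. try pass-through: is the input format acceptable as-is? */
  for (i = 0; i < n_allowed; i++) {
    if (toy_format_equal (in, &allowed[i])) {
      *out = *in;
      *passthrough = true;
      return true;
    }
  }

  /* 2. otherwise convert to the first (most preferred) allowed format */
  *out = allowed[0];
  *passthrough = false;
  return true;
}
```

With allowed formats 48000/2 and 44100/2, an input of 44100/2 is passed through unchanged, while an input of 32000/2 is converted to 48000/2, the first entry.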
Let's look at the example of an element that can convert between sample rates, where the input and output sample rates don't have to be the same:

    static gboolean
    gst_my_filter_setcaps (GstMyFilter *filter,
                           GstCaps     *caps)
    {
      /* first try pass-through: offer the input caps downstream unchanged */
      if (gst_pad_set_caps (filter->srcpad, caps)) {
        filter->passthrough = TRUE;
      } else {
        GstCaps *othercaps, *newcaps;
        GstStructure *s = gst_caps_get_structure (caps, 0), *others;

        /* no passthrough, setup internal conversion */
        gst_structure_get_int (s, "channels", &filter->channels);
        othercaps = gst_pad_get_allowed_caps (filter->srcpad);
        if (othercaps == NULL)
          return FALSE;         /* no peer, nothing to negotiate with */
        if (gst_caps_is_empty (othercaps)) {
          gst_caps_unref (othercaps);
          return FALSE;
        }

        /* work on a private copy of the first (preferred) structure */
        newcaps = gst_caps_copy_nth (othercaps, 0);
        gst_caps_unref (othercaps);
        others = gst_caps_get_structure (newcaps, 0);
        gst_structure_set (others,
            "channels", G_TYPE_INT, filter->channels, NULL);

        /* now, the samplerate value can optionally have multiple values, so
         * we "fixate" it, which means that one fixed value is chosen */
        newcaps = gst_caps_fixate (newcaps);
        if (!gst_pad_set_caps (filter->srcpad, newcaps)) {
          gst_caps_unref (newcaps);
          return FALSE;
        }

        /* we are now set up, configure internally */
        filter->passthrough = FALSE;
        gst_structure_get_int (s, "rate", &filter->from_samplerate);
        others = gst_caps_get_structure (newcaps, 0);
        gst_structure_get_int (others, "rate", &filter->to_samplerate);
        gst_caps_unref (newcaps);
      }

      return TRUE;
    }

    static gboolean
    gst_my_filter_sink_event (GstPad    *pad,
                              GstObject *parent,
                              GstEvent  *event)
    {
      gboolean ret;
      GstMyFilter *filter = GST_MY_FILTER (parent);

      switch (GST_EVENT_TYPE (event)) {
        case GST_EVENT_CAPS:
        {
          GstCaps *caps;

          gst_event_parse_caps (event, &caps);
          ret = gst_my_filter_setcaps (filter, caps);
          /* the event itself is not forwarded, so release it here */
          gst_event_unref (event);
          break;
        }
        default:
          ret = gst_pad_event_default (pad, parent, event);
          break;
      }
      return ret;
    }

    static GstFlowReturn
    gst_my_filter_chain (GstPad    *pad,
                         GstObject *parent,
                         GstBuffer *buf)
    {
      GstMyFilter *filter = GST_MY_FILTER (parent);
      GstBuffer *out;

      /* push on if in passthrough mode */
      if (filter->passthrough)
        return gst_pad_push (filter->srcpad, buf);

      /* convert, push */
      out = gst_my_filter_convert (filter, buf);
      gst_buffer_unref (buf);

      return gst_pad_push (filter->srcpad, out);
    }

Parsing and setting caps

Other elements, such as certain types of decoders, cannot derive the output format from their input caps, simply because the input caps do not yet contain the information required to know the output format; the data headers need to be parsed, too. In many cases, fixed caps will be enough, but in some cases, particularly where such decoders are renegotiable, it is also possible to use full caps negotiation. Fortunately, the code required to do so is very similar to the last code example above, with the difference being that the caps are selected in the _chain () function rather than in the _event () function. The rest, such as getting all allowed caps from the source pad and fixating them, is the same. Renegotiation, which is handled in the next section, is very different for such elements, though.

Upstream caps (re)negotiation

The primary use of upstream negotiation is to renegotiate (part of) an already-negotiated pipeline to a new format. Practical examples include selecting a different video size because the size of the video window changed and the video output itself is not capable of rescaling, or selecting a different audio format because the audio channel configuration changed.

Upstream caps renegotiation is requested by sending a GST_EVENT_RECONFIGURE event upstream. The idea is that it instructs the upstream element to reconfigure its caps by doing a new query for the allowed caps and then choosing new caps. The element that sends out the RECONFIGURE event can influence the selection of the new caps by returning the new preferred caps from its GST_QUERY_CAPS query function. The RECONFIGURE event sets the GST_PAD_FLAG_NEED_RECONFIGURE flag on all pads that it travels over.

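The flag mechanism can be pictured with a small model. The sketch below is plain C rather than GStreamer code: ToyPad, toy_send_reconfigure () and toy_pad_check_reconfigure () are hypothetical names, and the check-and-clear behaviour is modelled on what gst_pad_check_reconfigure () does with the NEED_RECONFIGURE flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a pad; only the reconfigure flag is modelled. */
typedef struct {
  bool need_reconfigure;
} ToyPad;

/* Models a RECONFIGURE event travelling upstream over a chain of pads:
 * every pad it passes gets its flag set. */
static void
toy_send_reconfigure (ToyPad *pads, size_t n_pads)
{
  size_t i;

  assert (pads != NULL || n_pads == 0);

  for (i = 0; i < n_pads; i++)
    pads[i].need_reconfigure = true;
}

/* Check-and-clear: reports whether renegotiation was requested and
 * resets the flag, so that one request triggers one renegotiation. */
static bool
toy_pad_check_reconfigure (ToyPad *pad)
{
  bool was_set = pad->need_reconfigure;

  pad->need_reconfigure = false;
  return was_set;
}
```

In this model, an element would call toy_pad_check_reconfigure () on its source pad before producing output and renegotiate only when it returns true. A second call without a new event returns false.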
It is important to note that different elements have different responsibilities here:

- Elements that can be reconfigured on the srcpad should check the pad's NEED_RECONFIGURE flag with gst_pad_check_reconfigure () and start renegotiation when the function returns TRUE.
- Elements that want to propose a new format upstream need to send a RECONFIGURE event and be prepared to answer the CAPS query with the new preferred format. Note that when there is no upstream element that can (or wants to) renegotiate, the element needs to deal with the currently configured format.

Implementing a CAPS query function

A _query () function with the GST_QUERY_CAPS query type is called when a peer element would like to know which formats this pad supports, and in what order of preference. The return value should be all formats that this element supports, taking into account limitations of peer elements further downstream or upstream, sorted by order of preference, highest preference first.

    static gboolean
    gst_my_filter_query (GstPad *pad, GstObject * parent, GstQuery * query)
    {
      gboolean ret;
      GstMyFilter *filter = GST_MY_FILTER (parent);

      switch (GST_QUERY_TYPE (query)) {
        case GST_QUERY_CAPS:
        {
          GstPad *otherpad;
          /* "filt" is the query's filter caps; it must not shadow "filter" */
          GstCaps *temp, *caps, *filt, *tcaps;
          guint i;

          otherpad = (pad == filter->srcpad) ? filter->sinkpad :
                                               filter->srcpad;
          caps = gst_pad_get_allowed_caps (otherpad);
          /* no peer on the other side means no restriction from there */
          if (caps == NULL)
            caps = gst_caps_new_any ();
          caps = gst_caps_make_writable (caps);

          gst_query_parse_caps (query, &filt);

          /* We support *any* samplerate, regardless of the samplerate
           * supported by the linked elements on both sides. */
          for (i = 0; i < gst_caps_get_size (caps); i++) {
            GstStructure *structure = gst_caps_get_structure (caps, i);

            gst_structure_remove_field (structure, "rate");
          }

          /* make sure we only return results that intersect our
           * padtemplate */
          tcaps = gst_pad_get_pad_template_caps (pad);
          if (tcaps) {
            temp = gst_caps_intersect (caps, tcaps);
            gst_caps_unref (caps);
            gst_caps_unref (tcaps);
            caps = temp;
          }
          /* filter against the query filter when needed */
          if (filt) {
            temp = gst_caps_intersect (caps, filt);
            gst_caps_unref (caps);
            caps = temp;
          }
          gst_query_set_caps_result (query, caps);
          gst_caps_unref (caps);
          ret = TRUE;
          break;
        }
        default:
          ret = gst_pad_query_default (pad, parent, query);
          break;
      }
      return ret;
    }

Using all the knowledge you've acquired by reading this chapter, you should be able to write an element that does correct caps negotiation. If in doubt, look at other elements of the same type in our git repository to get an idea of how they do what you want to do.
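As a closing illustration of the CAPS query contract (return what this element supports, restricted by the peer and by the optional query filter, most preferred first), here is a toy model in plain C. It is not GStreamer code: toy_query_caps () and toy_list_contains () are hypothetical names, formats are reduced to plain strings, and keeping the element's own preference order is an assumption made for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 when name occurs in list[0..n-1], 0 otherwise. */
static int
toy_list_contains (const char *const *list, size_t n, const char *name)
{
  size_t i;

  for (i = 0; i < n; i++)
    if (strcmp (list[i], name) == 0)
      return 1;
  return 0;
}

/* Writes to result[] every format from own[] (sorted best first) that
 * the peer also supports and that passes the optional filter
 * (filter == NULL means "no filter"). The order of own[] is kept, so
 * the answer is still sorted by this element's preference.
 * result[] must have room for n_own entries. Returns the count. */
static size_t
toy_query_caps (const char *const *own, size_t n_own,
    const char *const *peer, size_t n_peer,
    const char *const *filter, size_t n_filter,
    const char **result)
{
  size_t i, n = 0;

  assert (result != NULL);

  for (i = 0; i < n_own; i++) {
    if (!toy_list_contains (peer, n_peer, own[i]))
      continue;                 /* peer cannot handle it */
    if (filter != NULL && !toy_list_contains (filter, n_filter, own[i]))
      continue;                 /* excluded by the query filter */
    result[n++] = own[i];
  }
  return n;
}
```

For an element preferring F32LE, then S32LE, then S16LE, with a peer that supports S16LE and F32LE, the model answers F32LE followed by S16LE. Adding a filter that contains only S16LE narrows the answer to that single format.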