# How to write GStreamer Elements in Rust Part 2: A raw audio sine wave source
In this part, a raw audio sine wave source element is going to be written. The final code can be found [here](https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/blob/master/gst-plugin-tutorial/src/sinesrc.rs).
The first part here will be all the boilerplate required to set up the element. You can safely [skip](#caps-negotiation) this if you remember all this from the [previous tutorial](tutorial-1.md).
Our sine wave element is going to produce raw audio, with any number of channels and any possible sample rate, with both 32 bit and 64 bit floating point samples. It will produce a simple sine wave with a configurable frequency, volume/mute and number of samples per audio buffer. In addition it will be possible to configure the element in (pseudo) live mode, meaning that it will only produce data in real-time according to the pipeline clock. And it will be possible to seek to any time/sample position on our source element. It will basically be a simpler version of the [`audiotestsrc`](https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-base-plugins/html/gst-plugins-base-plugins-audiotestsrc.html) element from gst-plugins-base.
So let's get started with all the boilerplate. This time our element will be based on the [`PushSrc`](https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gstreamer-libs/html/GstPushSrc.html) base class instead of [`BaseTransform`](https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gstreamer-libs/html/GstBaseTransform.html). `PushSrc` is a subclass of the [`BaseSrc`](https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gstreamer-libs/html/GstBaseSrc.html) base class that only works in push mode, i.e. creates buffers as they arrive instead of allowing downstream elements to explicitly pull them.
If any of this needs explanation, please see the [previous tutorial](tutorial-1.md) and the comments in the code. The explanation for all the struct fields and what they're good for will follow in the next sections.
With all of the above and a small addition to `src/lib.rs`, this should compile now.
```rust
mod sinesrc;
[...]

fn plugin_init(plugin: &gst::Plugin) -> bool {
    [...]
    sinesrc::register(plugin);
    true
}
```
Also, a couple of new crates have to be added to `Cargo.toml` and `src/lib.rs`, but it is best to check the code in the [repository](https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/tree/master/gst-plugin-tutorial) for details.
### Caps Negotiation
The first part that we have to implement, just like last time, is caps negotiation. We already notified the base class about any caps that we can potentially handle via the caps in the pad template in `class_init`, but there are still two more steps of behaviour left that we have to implement.
First of all, we need to get notified whenever the caps that our source is configured for are changing. This will happen once in the very beginning and then whenever the pipeline topology or state changes and new caps would be more optimal for the new situation. This notification happens via the `BaseSrc::set_caps` virtual method.
```rust
[...]
    // If we have no caps yet, any old sample_offset and sample_stop will be
    // in nanoseconds
    let old_rate = match state.info {
        Some(ref info) => info.rate() as u64,
        None => gst::SECOND_VAL,
    };

    // Update sample offset and accumulator based on the previous values and the
    // sample rate change, if any
    let old_sample_offset = state.sample_offset;
    let sample_offset = old_sample_offset
        .mul_div_floor(info.rate() as u64, old_rate)
        .unwrap();

    let old_sample_stop = state.sample_stop;
    let sample_stop =
        old_sample_stop.map(|v| v.mul_div_floor(info.rate() as u64, old_rate).unwrap());

    let accumulator =
        (sample_offset as f64).rem(2.0 * PI * (settings.freq as f64) / (info.rate() as f64));

    *state = State {
        info: Some(info),
        sample_offset: sample_offset,
        sample_stop: sample_stop,
        accumulator: accumulator,
    };

    drop(state);

    let _ = element.post_message(&gst::Message::new_latency().src(Some(element)).build());

    true
}
```
In here we parse the caps into an [`AudioInfo`](https://slomo.pages.freedesktop.org/rustdocs/gstreamer/gstreamer_audio/struct.AudioInfo.html) and then store that in our internal state, while updating various fields. We tell the base class about the number of bytes each buffer is usually going to hold, and update our current sample position, the stop sample position (when a seek with stop position happens, we need to know when to stop) and our accumulator. This happens by scaling both positions by the old and new sample rate. If we don't have an old sample rate, we assume nanoseconds (this will make more sense once seeking is implemented). The scaling is done with the help of the [`muldiv`](https://crates.io/crates/muldiv) crate, which implements scaling of integer types by a fraction with protection against overflows by doing up to 128 bit integer arithmetic for intermediate values.
The accumulator is then updated based on the current phase of the sine wave at the current sample position.
As a last step we post a new `LATENCY` message on the bus whenever the sample rate has changed. Our latency (in live mode) is going to be the duration of a single buffer, but more about that later.
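The position scaling described above can be sketched with the standard library alone. This is not the `muldiv` crate's implementation, only a minimal illustration of the same idea under the assumption that a 128 bit intermediate is enough; the helper names `mul_div_floor` (as a free function) and `rescale_offset` are made up for this example.

```rust
/// Nanoseconds per second, used as the "rate" while no caps are known yet.
const NANOS_PER_SECOND: u64 = 1_000_000_000;

/// Scale `value` by `num / denom`, rounding down. The multiplication is done
/// in 128 bits so that `value * num` cannot overflow.
/// Returns `None` if `denom` is zero or the result does not fit into a `u64`.
fn mul_div_floor(value: u64, num: u64, denom: u64) -> Option<u64> {
    if denom == 0 {
        return None;
    }
    let scaled = (value as u128) * (num as u128) / (denom as u128);
    if scaled > u64::MAX as u128 {
        None
    } else {
        Some(scaled as u64)
    }
}

/// Convert a position expressed in the old sample rate into the new one.
/// Without an old rate the position is interpreted as nanoseconds.
fn rescale_offset(old_offset: u64, old_rate: Option<u32>, new_rate: u32) -> Option<u64> {
    let old_rate = old_rate.map(u64::from).unwrap_or(NANOS_PER_SECOND);
    mul_div_floor(old_offset, u64::from(new_rate), old_rate)
}
```

For example, one second of audio at 44.1kHz (`44100` samples) becomes `48000` samples at 48kHz, and a position of two seconds stored in nanoseconds becomes `96000` samples once a rate of 48kHz is known.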
`BaseSrc` is by default already selecting possible caps for us, if there are multiple options. However these defaults might not be (and often are not) ideal and we should override the default behaviour slightly. This is done in the `BaseSrc::fixate` virtual method.
Here we take the caps that are passed in, truncate them (i.e. remove all but the very first [`Structure`](https://slomo.pages.freedesktop.org/rustdocs/gstreamer/gstreamer/structure/struct.Structure.html)) and then manually fixate the sample rate to the closest value to 48kHz. By default, caps fixation would result in the lowest possible sample rate but this is usually not desired.
For good measure, we also fixate the number of channels to the closest value to 1, but this would already be the default behaviour anyway. Then we chain up to the parent class' implementation of `fixate`, which for now basically does the same as `caps.fixate()`. After this, the caps are fixated, i.e. there is only a single `Structure` left and all fields have concrete values (no ranges or sets).
Now we have everything in place for a working element, apart from the virtual method to actually generate the raw audio buffers with the sine wave. From a high-level perspective, `BaseSrc` works by calling the `create` virtual method over and over again to let the subclass produce a buffer until it returns an error or signals the end of the stream.
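To make "fixating to the closest value" more concrete, here is a small model of the idea in plain Rust. It does not use the GStreamer API: `IntField` and `fixate_nearest` are hypothetical names for this sketch, and details such as how GStreamer breaks ties between equally close list entries are not modelled.

```rust
/// Simplified stand-in for an integer caps field that may not be fixed yet.
enum IntField {
    /// Already a concrete value.
    Fixed(i32),
    /// Any value between `min` and `max`, both inclusive.
    Range { min: i32, max: i32 },
    /// One of several concrete values.
    List(Vec<i32>),
}

/// Pick the value allowed by `field` that is closest to `target`.
/// Returns `None` if the field allows no value at all.
fn fixate_nearest(field: &IntField, target: i32) -> Option<i32> {
    match field {
        IntField::Fixed(value) => Some(*value),
        IntField::Range { min, max } => {
            if min > max {
                None
            } else {
                Some(target.clamp(*min, *max))
            }
        }
        IntField::List(values) => values
            .iter()
            .copied()
            // Compare in 64 bits so the difference cannot overflow.
            .min_by_key(|value| (i64::from(*value) - i64::from(target)).abs()),
    }
}
```

With a rate range of `[1, i32::MAX]` and a target of `48000` this yields `48000`, while a device that only offers `8000`, `44100` and `96000` would end up at `44100`.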
Let's first talk about how to generate the sine wave samples themselves. As we want to operate on 32 bit and 64 bit floating point numbers, we implement a generic function for generating samples and storing them in a mutable byte slice. This is done with the help of the [`num_traits`](https://crates.io/crates/num-traits) crate, which provides all kinds of useful traits for abstracting over numeric types. In our case we only need the [`Float`](https://docs.rs/num-traits/0.2.0/num_traits/float/trait.Float.html) and [`NumCast`](https://docs.rs/num-traits/0.2.0/num_traits/cast/trait.NumCast.html) traits.
Instead of writing a generic implementation with those traits, it would also be possible to do the same with a simple macro that generates a function for both types. Which approach is nicer is a matter of taste in the end; the compiler output should be equivalent for both cases.
```rust
fn process<F: Float + FromByteSlice>(
    data: &mut [u8],
    accumulator_ref: &mut f64,
    freq: u32,
    rate: u32,
    channels: u32,
    vol: f64,
) {
    use std::f64::consts::PI;

    // Reinterpret our byte-slice as a slice containing elements of the type
    // we're interested in. GStreamer requires for raw audio that the alignment
    // of memory is correct, so this will never ever fail unless there is an
    // actual bug elsewhere.
    let data = data.as_mut_slice_of::<F>().unwrap();

    // Convert all our parameters to the target type for calculations
    let vol: F = NumCast::from(vol).unwrap();
    let freq = freq as f64;
    let rate = rate as f64;
    let two_pi = 2.0 * PI;

    // We're carrying an accumulator with up to 2pi around instead of working
    // on the sample offset. High sample offsets cause too much inaccuracy when
    // converted to floating point numbers and then iterated over in 1-steps
    let mut accumulator = *accumulator_ref;
    let step = two_pi * freq / rate;

    for chunk in data.chunks_mut(channels as usize) {
        let value = vol * F::sin(NumCast::from(accumulator).unwrap());
        for sample in chunk {
            *sample = value;
        }

        accumulator += step;
        if accumulator >= two_pi {
            accumulator -= two_pi;
        }
    }

    *accumulator_ref = accumulator;
}
```
This function takes the mutable byte slice from our buffer as argument, as well as the current value of the accumulator and the relevant settings for generating the sine wave.
As a first step, we "cast" the byte slice to a slice of the target type (f32 or f64) with the help of the [`byte_slice_cast`](https://crates.io/crates/byte-slice-cast) crate. This ensures that alignment and sizes are all matching and returns a mutable slice of our target type if successful. In case of GStreamer, the buffer alignment is guaranteed to be big enough for our types here and we allocate the buffer of a correct size later.
Now we convert all the parameters to the types we will use later, and store them together with the current accumulator value in local variables. Then we iterate over the whole floating point number slice in chunks with all channels, and fill each channel with the current value of our sine wave.
The sine wave itself is calculated by `val = volume * sin(2 * PI * frequency * (i + accumulator) / rate)`, but we actually calculate it by simply increasing the accumulator by `2 * PI * frequency / rate` for every sample instead of doing the multiplication for each sample. We also make sure that the accumulator always stays between `0` and `2 * PI` to prevent floating point inaccuracies from affecting the samples we produce.
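The equivalence between the per-sample multiplication and the running accumulator can be checked with a small standalone sketch. It works on `f64` only, starts at phase zero and uses made-up function names (`sine_direct`, `sine_accumulated`), so it illustrates the technique and is not the element's code.

```rust
use std::f64::consts::PI;

/// Compute each sample from its index: vol * sin(2 * PI * freq * i / rate).
fn sine_direct(n: usize, freq: f64, rate: f64, vol: f64) -> Vec<f64> {
    (0..n)
        .map(|i| vol * (2.0 * PI * freq * (i as f64) / rate).sin())
        .collect()
}

/// Compute the same samples by advancing a phase accumulator by a fixed
/// step per sample and wrapping it so that it stays within [0, 2 * PI).
fn sine_accumulated(n: usize, freq: f64, rate: f64, vol: f64) -> Vec<f64> {
    let two_pi = 2.0 * PI;
    let step = two_pi * freq / rate;
    let mut accumulator = 0.0_f64;
    let mut samples = Vec::with_capacity(n);

    for _ in 0..n {
        samples.push(vol * accumulator.sin());

        accumulator += step;
        if accumulator >= two_pi {
            accumulator -= two_pi;
        }
    }

    samples
}
```

For one second of a 440Hz tone at 48kHz both functions agree to well within audible precision. The accumulator variant has the advantage that its state is a single small number that can be carried from one buffer to the next.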
Now that this is done, we need to implement the `PushSrc::create` virtual method for actually allocating the buffer, setting timestamps and other metadata on it, and calling our above function.
Just like last time, we start with creating a copy of our properties (settings) and keeping a mutex guard of the internal state around. If the internal state has no `AudioInfo` yet, we error out. This would mean that no caps were negotiated yet, which is something we can't handle and is not really possible in our case.
Next we calculate how many samples we have to generate. If a sample stop position was set by a seek event, we have to generate samples up to at most that point. Otherwise we create at most the number of samples per buffer that were set via the property. Then we allocate a buffer of the corresponding size, with the help of the `bpf` field of the `AudioInfo`, and then set its metadata and fill the samples.
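As a rough sketch of this calculation, the following helpers decide how many samples the next buffer holds and how many bytes that requires. The names are hypothetical, and capping the count at the configured samples per buffer even when a stop position is set is an assumption of this sketch; check the repository for what the element actually does.

```rust
/// Number of samples for the next buffer, or `None` once the stop position
/// has been reached (which would be reported as end-of-stream).
fn samples_to_produce(
    sample_offset: u64,
    sample_stop: Option<u64>,
    samples_per_buffer: u32,
) -> Option<u64> {
    let per_buffer = u64::from(samples_per_buffer);
    match sample_stop {
        // Nothing left before the stop position
        Some(stop) if sample_offset >= stop => None,
        // Assumption: never exceed the configured buffer size
        Some(stop) => Some(per_buffer.min(stop - sample_offset)),
        None => Some(per_buffer),
    }
}

/// Buffer size in bytes. `bytes_per_frame` plays the role of `bpf`, i.e.
/// the size of one sample for all channels together.
fn buffer_size_bytes(n_samples: u64, bytes_per_frame: u32) -> usize {
    (n_samples as usize) * (bytes_per_frame as usize)
}
```

For stereo `f32` audio `bytes_per_frame` is 8, so a buffer of 1024 samples needs 8192 bytes.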
The metadata that is set is the timestamp (PTS), and the duration. The duration is calculated from the difference of the following buffer's timestamp and the current buffer's. By this we ensure that rounding errors do not cause the next buffer's timestamp to differ from the sum of the current buffer's timestamp and its duration. While this would not be much of a problem in GStreamer (inaccurate and jittery timestamps are handled just fine), we can prevent it here and do so.
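The following sketch shows why deriving the duration from two timestamps keeps consecutive buffers gapless. It uses plain nanosecond integers instead of GStreamer's clock time type, the function names are invented for the example, and a non-zero sample rate is assumed.

```rust
const NANOS_PER_SECOND: u64 = 1_000_000_000;

/// Convert a sample position to nanoseconds, rounding down.
/// Assumes `rate` is not zero.
fn samples_to_ns(samples: u64, rate: u32) -> u64 {
    ((samples as u128) * (NANOS_PER_SECOND as u128) / (rate as u128)) as u64
}

/// Returns `(pts, duration)` in nanoseconds for a buffer that starts at
/// `sample_offset` and contains `n_samples` samples.
fn buffer_times(sample_offset: u64, n_samples: u64, rate: u32) -> (u64, u64) {
    let pts = samples_to_ns(sample_offset, rate);
    // Timestamp the following buffer is going to get
    let next_pts = samples_to_ns(sample_offset + n_samples, rate);
    (pts, next_pts - pts)
}
```

At 44.1kHz a buffer of 1024 samples lasts about 23.22ms, which is not a whole number of nanoseconds. Because each duration is a difference of two rounded timestamps, individual durations vary by a nanosecond here and there, but `pts + duration` of one buffer always equals the `pts` of the next.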
Afterwards we call our previously defined function on the writably mapped buffer and fill it with the sample values.
With all this, the element should already work just fine in any GStreamer-based application, for example `gst-launch-1.0`. Don't forget to set the `GST_PLUGIN_PATH` environment variable correctly like last time. Before running this, make sure to turn down the volume of your speakers/headphones a bit.
Many audio (and video) sources can actually only produce data in real-time and data is produced according to some clock. So far our source element can produce data as fast as downstream is consuming data, but we optionally can change that. We simulate a live source here now by waiting on the pipeline clock, but with a real live source you would only ever be able to have the data in real-time without any need to wait on a clock. And usually that data is produced according to a different clock than the pipeline clock, in which case translation between the two clocks is needed but we ignore this aspect for now. For details check the [GStreamer documentation](https://gstreamer.freedesktop.org/documentation/application-development/advanced/clocks.html).
For working in live mode, we have to add a few different parts in various places. First of all, we implement waiting on the clock in the `create` function.
```rust
fn create(...)
    [...]
    state.sample_offset += n_samples;
    drop(state);

    // If we're live, we are waiting until the time of the last sample in our buffer has
    // arrived. This is the very reason why we have to report that much latency.
    // A real live-source would of course only allow us to have the data available after
    // that latency, e.g. when capturing from a microphone, and no waiting from our side
    // would be necessary.
    //
    // Waiting happens based on the pipeline clock, which means that a real live source
    // with its own clock would require various translations between the two clocks.
    // This is out of scope for the tutorial though.
    if element.is_live() {
        let clock = match element.get_clock() {
            None => return Ok(buffer),
            Some(clock) => clock,
        };

        let segment = element
            .get_segment()
            .downcast::<gst::format::Time>()
            .unwrap();
        let base_time = element.get_base_time();
        let running_time = segment.to_running_time(buffer.get_pts() + buffer.get_duration());

        // The last sample's clock time is the base time of the element plus the
        // running time of the last sample
        let wait_until = running_time + base_time;
        if wait_until.is_none() {
            return Ok(buffer);
        }

        let id = clock.new_single_shot_id(wait_until).unwrap();
        [...]
```
To be able to wait on the clock, we first of all need to calculate the clock time until when we want to wait. In our case that will be the clock time right after the end of the last sample in the buffer we just produced, simply because you can't capture a sample before it was produced.
We calculate the running time from the PTS and duration of the buffer with the help of the currently configured segment and then add the base time of the element to this to get the clock time as result. Please check the [GStreamer documentation](https://gstreamer.freedesktop.org/documentation/application-development/advanced/clocks.html) for details, but in short the running time of a pipeline is the time since the start of the pipeline (or the last reset of the running time) and the running time of a buffer can be calculated from its PTS and the segment, which provides the information to translate between the two. The base time is the clock time when the pipeline went to the `Playing` state, so just an offset.
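The arithmetic behind this can be written down without any GStreamer types. The sketch below assumes forward playback at rate 1.0 and a segment described only by its start and base values; `clock_time_of_buffer_end` is a name made up for this illustration, and all values are nanoseconds.

```rust
/// Clock time at which the last sample of a buffer has been "captured".
/// Returns `None` if the buffer ends before the segment start or if any
/// intermediate value overflows.
fn clock_time_of_buffer_end(
    pts_ns: u64,
    duration_ns: u64,
    segment_start_ns: u64,
    segment_base_ns: u64,
    base_time_ns: u64,
) -> Option<u64> {
    // Stream time of the end of the buffer
    let end = pts_ns.checked_add(duration_ns)?;

    // Running time: position relative to the segment start, plus the
    // running time that had accumulated before this segment
    let running_time = end
        .checked_sub(segment_start_ns)?
        .checked_add(segment_base_ns)?;

    // Clock time: running time shifted by the element's base time
    running_time.checked_add(base_time_ns)
}
```

A buffer with a PTS of 1s and a duration of 0.5s in a segment starting at 0, with a base time of 10s, would therefore be waited for until the clock shows 11.5s.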
Next we wait and then return the buffer as before.
Now we also have to tell the base class that we're running in live mode now. This is done by calling `set_live(true)` on the base class before changing the element state from `Ready` to `Paused`. For this we override the `Element::change_state` virtual method.
And as a last step, we also need to notify downstream elements about our [latency](https://gstreamer.freedesktop.org/documentation/application-development/advanced/clocks.html#latency). Live elements always have to report their latency so that synchronization can work correctly. As the clock time of each buffer is equal to the time when it was created, all buffers would otherwise arrive late in the sinks (they would appear as if they should've been played already at the time when they were created). So all the sinks will have to compensate for the latency that it took from capturing to the sink, and they have to do that in a coordinated way (otherwise audio and video would be out of sync if both have different latencies). For this the pipeline is querying each sink for the latency on its own branch, and then configures a global latency on all sinks according to that.
The latency that we report is the duration of a single audio buffer, because we're simulating a real live source here. A real live source won't be able to output the buffer before the last sample of it is captured, and the difference between when the first and last sample were captured is exactly the latency that we add here. Other elements further downstream that introduce further latency would then add their own latency on top of this.
Inside the latency query we also signal that we are indeed a live source, and additionally how much buffering we can do (in our case, infinite) until data would be lost. The last part is important if e.g. the video branch has a higher latency, causing the audio sink to have to wait some additional time (so that audio and video stay in sync), which would then require the whole audio branch to buffer some data. As we have an artificial live source, we can always generate data for the next time but a real live source would only have a limited buffer and if no data is read and forwarded once that runs full, data would get lost.
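The values reported in the latency query follow directly from the settings. As a hedged sketch with invented names (`LatencyReport`, `latency_report`) and plain nanosecond integers, where `None` as maximum stands for "unlimited buffering":

```rust
const NANOS_PER_SECOND: u64 = 1_000_000_000;

#[derive(Debug, PartialEq)]
struct LatencyReport {
    /// Whether the source is live
    live: bool,
    /// Minimum latency in nanoseconds: the duration of one buffer
    min_ns: u64,
    /// Maximum latency in nanoseconds, `None` meaning no limit
    max_ns: Option<u64>,
}

/// Build the latency answer for a given configuration.
/// Returns `None` while the sample rate is not known yet (modelled as 0).
fn latency_report(is_live: bool, samples_per_buffer: u32, rate: u32) -> Option<LatencyReport> {
    if rate == 0 {
        return None;
    }
    let min_ns =
        ((samples_per_buffer as u128) * (NANOS_PER_SECOND as u128) / (rate as u128)) as u64;
    Some(LatencyReport {
        live: is_live,
        min_ns,
        // An artificial source can always generate data later, so it never
        // loses data however long downstream makes it wait
        max_ns: None,
    })
}
```

With 1024 samples per buffer at 48kHz the reported minimum latency is roughly 21.3ms.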
You can test this again with e.g. `gst-launch-1.0` by setting the `is-live` property to true. It should write in the output now that the pipeline is live.
The `audiotestsrc` element also implements live mode, but does so via the `get_times` virtual method. As that approach is only really useful for pseudo live sources like this one, we decided to explain instead how waiting on the clock can be achieved correctly and, even more importantly, how that relates to the next section.
### Unlocking
With the addition of the live mode, the `create` function is now blocking and waiting on the clock for some time. This is suboptimal, as for example a (flushing) seek would now have to wait until the clock waiting is done, and so would shutting down the application.
To prevent this, all waiting/blocking in GStreamer streaming threads should be interruptible/cancellable when requested. And for example the `ClockId` that we got from the clock for waiting can be cancelled by calling `unschedule()` on it. We only have to do it from the right place and keep it accessible. The right place is the `BaseSrc::unlock` virtual method.
```rust
struct ClockWait {
    clock_id: Option<gst::ClockId>,
    flushing: bool,
}

struct SineSrc {
    settings: Mutex<Settings>,
    state: Mutex<State>,
    clock_wait: Mutex<ClockWait>,
}

[...]

fn unlock(&self, element: &BaseSrc) -> bool {
    // This should unblock the create() function ASAP, so we
    // unschedule the clock ID here, if any
    let mut clock_wait = self.clock_wait.lock().unwrap();
    if let Some(clock_id) = clock_wait.clock_id.take() {
        clock_id.unschedule();
    }
    clock_wait.flushing = true;

    true
}
```
We store the clock ID in our struct, together with a boolean to signal whether we're supposed to flush already or not. Then, inside `unlock`, we unschedule the clock ID and set this boolean flag to true.
Once everything is unlocked, we need to reset things again so that data flow can happen in the future. This is done in the `unlock_stop` virtual method.
The important part in this code is that we first have to check if we are already supposed to unlock, before even starting to wait. Otherwise we would start waiting without anybody ever being able to unlock. Then we need to store the clock ID in the struct and make sure to drop the mutex guard so that the `unlock` function can take it again for unscheduling the clock ID. And once waiting is done, we need to remove the clock ID from the struct again and in case of `ClockReturn::Unscheduled` we directly return `FlowReturn::Flushing` instead of the error.
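The same check-then-wait pattern can be demonstrated with the standard library only. The sketch below replaces the GStreamer clock ID with a `Condvar` and a timeout, so it is an analogy and not the element's code; `CancellableWait` and `WaitResult` are names invented for this example.

```rust
use std::sync::{Condvar, Mutex};
use std::time::Duration;

#[derive(Debug, PartialEq)]
enum WaitResult {
    /// The full wait time passed without interruption.
    Elapsed,
    /// The wait was cancelled via `unlock`.
    Flushing,
}

/// Hypothetical stand-in for the `ClockWait` state plus the clock itself.
struct CancellableWait {
    flushing: Mutex<bool>,
    cond: Condvar,
}

impl CancellableWait {
    fn new() -> Self {
        CancellableWait {
            flushing: Mutex::new(false),
            cond: Condvar::new(),
        }
    }

    /// Blocks for up to `timeout`, like the clock wait in `create`.
    fn wait(&self, timeout: Duration) -> WaitResult {
        let guard = self.flushing.lock().unwrap();
        // Check before waiting: `unlock` may already have been called
        if *guard {
            return WaitResult::Flushing;
        }
        // The mutex is released while sleeping, so `unlock` can take it
        let (guard, _) = self
            .cond
            .wait_timeout_while(guard, timeout, |flushing| !*flushing)
            .unwrap();
        if *guard {
            WaitResult::Flushing
        } else {
            WaitResult::Elapsed
        }
    }

    /// Counterpart of `unlock`: wake up any waiter and keep refusing waits.
    fn unlock(&self) {
        *self.flushing.lock().unwrap() = true;
        self.cond.notify_all();
    }

    /// Counterpart of `unlock_stop`: allow waiting again.
    fn unlock_stop(&self) {
        *self.flushing.lock().unwrap() = false;
    }
}
```

A thread blocked in `wait` for 30 seconds returns `WaitResult::Flushing` almost immediately once another thread calls `unlock`, and any `wait` started before `unlock_stop` is called returns right away as well.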
Similarly, when using other blocking APIs it is important that they are woken up in a similar way when `unlock` is called. Otherwise the application developer's, and thus the user's, experience will be far from ideal.
### Seeking
As a last feature we implement seeking on our source element. In our case that only means that we have to update the `sample_offset` and `sample_stop` fields accordingly; other sources might have to do more work than that.
Seeking is implemented in the `BaseSrc::do_seek` virtual method, and signalling whether we can actually seek in the `is_seekable` virtual method.
Currently no support for reverse playback is implemented here, that is left as an exercise for the reader. So as a first step we check if the segment has a negative rate, in which case we just fail and return false.
Afterwards we again take a copy of the settings, keep a mutable mutex guard of our state and then start handling the actual seek.
If no caps are known yet, i.e. the `AudioInfo` is `None`, we assume a rate of 1 billion. That is, we just store the time in nanoseconds for now and let the `set_caps` function take care of that (which we already implemented accordingly) once the sample rate is known.
Then, if a `Time` seek is performed, we convert the segment start and stop position from time to sample offsets and save them. And then update the accumulator in a similar way as in the `set_caps` function. If a seek is in `Default` format (i.e. sample offsets for raw audio), we just have to store the values and update the accumulator but only do so if the sample rate is known already. A sample offset seek does not make any sense until the sample rate is known, so we just fail here to prevent unexpected surprises later.
Try the following pipeline for testing seeking. You should be able to see the current time drawn over the video, and with the left/right cursor keys you can seek. This also shows that we create quite a nice sine wave.