-*- outline -*-

* Pro Audio with GStreamer

This file attempts to document the use of GStreamer for so-called "pro
audio"[0]. Two audiences are considered: programmers who are
considering GStreamer for their pro-audio app, and GStreamer developers
interested in which parts of GStreamer pro audio uses.

[0] I actually don't like this term, because it's elitist. Of course
other audio applications are not inferior, but they are different.
I'll stick with the term out of established practice.

** What GStreamer Offers the Pro Audio Developer

Choosing GStreamer for your application gives you lots of things for
free.

*** A high penetration into POSIX desktops

GStreamer is included with Gnome, so you'll find it already installed on
an increasing number of desktops. That makes it easier for a user to
install your app. However, you still have to check for the individual
plugins that you depend on.

*** An extremely flexible signal flow graph

You have elements, connection points (pads), different kinds of
processing functions, schedulers, etc. You can subclass just about
everything, or replace whole subsystems as you need to.

All of this you would otherwise have to implement somehow. The downside
is, of course, that it's extremely flexible. The graph isn't run by
clock ticks -- any delays are carried out by the timekeeping element (if
any) when execution reaches it. Scheduling is cooperative, rather than
dictator-style as in Jack. Once all the problems have been worked out it
runs smoothly, but one poorly coded element can stall the whole graph.

Restricting graph operation to clock ticks and using buses instead, as
SuperCollider 3 does, would simplify scheduling and the like
considerably, I would think. However, you'd still have to implement your
signal-flow infrastructure from scratch if you decided to go it alone.

I might revise the above paragraph, though. I like GStreamer's level of
flexibility a bit too much :)
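
To make the clock-tick-and-buses alternative mentioned above a little
more concrete, here is a toy sketch in Python. It is not GStreamer or
SuperCollider code; the names (BLOCK, constant, gain, tick) and the bus
numbering are made up for illustration. Every unit runs exactly once per
tick and exchanges fixed-size blocks through numbered buses, so the
"scheduler" collapses into a plain loop.

```python
BLOCK = 4  # frames processed per tick (arbitrary value for this sketch)


def constant(value, out_bus):
    """Hypothetical source unit: writes a block of `value` to a bus."""
    def run(buses):
        buses[out_bus] = [value] * BLOCK
    return run


def gain(factor, in_bus, out_bus):
    """Hypothetical filter unit: scales the block found on `in_bus`."""
    def run(buses):
        buses[out_bus] = [factor * s for s in buses[in_bus]]
    return run


def tick(units, buses):
    # The whole scheduler: run every unit, in order, once per tick.
    for unit in units:
        unit(buses)


# Two buses, each holding one block of silence to start with.
buses = {0: [0.0] * BLOCK, 1: [0.0] * BLOCK}
units = [constant(0.5, out_bus=0), gain(2.0, in_bus=0, out_bus=1)]
tick(units, buses)
```

The simplification comes at the price described above: the order of the
units and the block size are fixed up front, which is exactly the kind
of flexibility a GStreamer graph does not ask you to give up.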

*** A wide variety of existing plugins

This includes sources (inputs) like ALSA, OSS, sndfile, etc., as well as
their corresponding sinks (outputs). Then there are the network
transports, and the sound servers (including Jack). You get LADSPA
plugins for free. There are some DSP elements, but admittedly not too
many -- this is an area for future expansion.

*** Generic plugin behavior

Of course you still have to know some specifics about the plugins you
use (which properties they have, for example), but in general the
elements of a "pipeline" (a signal flow graph -- and no, it doesn't have
to look like a pipe) are replaceable. Your user can choose between ALSA
or OSS or even ESD (shudder), and it's simple to implement.
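
As a rough illustration of that replaceability, the sketch below builds
a gst-launch style pipeline description in which only the sink name
changes. It is plain string handling in Python, not a call into
GStreamer. The helper name pipeline_description is hypothetical, the
source name "sinesrc" is just a placeholder, and the sink names are
assumed to match the audio sink elements installed on your system.

```python
# Sink element names the user may pick from (assumed to be installed).
KNOWN_SINKS = ("alsasink", "osssink", "esdsink")


def pipeline_description(source, sink="alsasink"):
    """Return a "source ! sink" description with a user-chosen sink."""
    if sink not in KNOWN_SINKS:
        raise ValueError("unknown sink: %s" % sink)
    # Only the last element differs; the rest of the graph is untouched.
    return "%s ! %s" % (source, sink)


description = pipeline_description("sinesrc", sink="osssink")
```

The point is only that the choice of output can live in a preference
setting rather than in the structure of your program.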

*** Easy threads

Adding threads to your signal flow graph does take some thought, but
once you've decided how to set things up it's reasonably easy.
Unfortunately realtime threads aren't implemented yet, but that should
be an easy project, knock on wood.

*** Other Stuff

GStreamer is big these days. I wouldn't say bloated, but there are a lot
of subsystems relating to "media" that just aren't applicable to
processing float data. There's a whole system (called "caps") that deals
with negotiating common formats between elements, when all pro audio has
to deal with is the sample rate and the number of frames per buffer.
There's a typefinding and pipeline autoplugging subsystem. There are
"tags", like those from ID3.

You might find uses for these things, and thankfully these uses blur the
lines between "pro" and "consumer" audio. To an extent, these features
complicate GStreamer programming. But mostly they stay out of your way
-- besides caps, they only bother you when you ask them to :-)

** Pro Audio for GStreamer Programmers

Pro audio is a restricted, almost purely mathematical domain. There's
not that much to worry about. Each channel is separate from the rest
(never interleaved). All data is in float format and native byte order.
The sample rate is typically the same throughout the system, and so is
the number of frames in a buffer.
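
For readers coming from "normal" audio, the sketch below shows what
getting into this format can look like: interleaved 16-bit samples are
split into one native-endian float array per channel. The function name
is made up, and both the 16-bit input and the scaling by 32768 are
assumptions chosen for the example rather than anything GStreamer
mandates.

```python
from array import array


def deinterleave_to_float(samples, channels):
    """Split interleaved 16-bit samples into one float array per channel.

    `samples` is a sequence of signed 16-bit integers laid out as
    L R L R ... (for two channels). The result is a list of array("f")
    objects, which store C floats in native byte order.
    """
    if len(samples) % channels:
        raise ValueError("sample count is not a multiple of channels")
    return [
        # Every `channels`-th sample belongs to the same channel.
        array("f", (s / 32768.0 for s in samples[ch::channels]))
        for ch in range(channels)
    ]


left, right = deinterleave_to_float([16384, -16384, 0, 32767], 2)
```

After this step each channel can be treated as an independent mono
stream, which is the only kind of stream the rest of this section talks
about.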

So it's simple, but it's different from "normal" audio processing (a
whole mess of variables to synchronise and convert between, interleaved
data, codecs, etc.). It's sufficiently different that in the past we've
had discussions every 8 months or so about why things are implemented in
such-and-such a way, and why we don't change them, and so on. So this
part of the document is aimed at GStreamer developers, as a kind of
documentation for the whole float-caps space.

*** The Format

Pro audio deals with floats. I'm not really worried about doubles --
although LADSPA carefully defines its own sample type (LADSPA_Data) so
that it could be changed, everything's in float.

There are two variables to be concerned about. One is the sample rate,
which is pretty obvious. The not-so-obvious one is buffer-frames, which
specifies the number of frames that will come in each buffer. If a
buffer has fewer frames, that indicates that EOS is coming on the next
pull. This property is an optimization: it allows easy chaining of
buffers in multi-pad elements, prevents deadlocks in circular pipelines,
and complies with systems like Jack that operate on clock ticks.
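
The buffer-frames convention can be sketched in a few lines of Python.
This is a toy model, not GStreamer code: fixed_buffers and consume are
hypothetical names, and a Python generator stands in for pulling
buffers from a pad.

```python
def fixed_buffers(samples, buffer_frames):
    """Yield buffers of exactly `buffer_frames` frames; only the last
    one may be shorter."""
    for start in range(0, len(samples), buffer_frames):
        yield samples[start:start + buffer_frames]


def consume(buffers, buffer_frames):
    """Count frames and report whether a short buffer announced EOS."""
    total = 0
    eos_announced = False
    for buf in buffers:
        total += len(buf)
        if len(buf) < buffer_frames:
            # A short buffer means the next pull would return EOS.
            eos_announced = True
            break
    return total, eos_announced


result = consume(fixed_buffers([0.0] * 10, 4), 4)
```

Note that when the stream length is an exact multiple of buffer-frames
no short buffer appears; in this sketch the end of iteration plays the
role of EOS in that case.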

*** Channels

One variable that pro audio does not have is the number of channels in a
stream. Streams are always mono. All DSP algorithms expect to receive
mono data. Multichannel processing is done via multiple inputs. This is
the complicated part of pro audio for GStreamer, because it means lots
of multi-pad elements and complicated pipelines, which are a pain to
code for (if you're not coding in Scheme, of course ;). So yes, it's
kind of a pain, but it's a flexibility that's necessary.
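
A small Python sketch of what "multichannel via multiple mono streams"
means in practice: a panner has one mono input and two mono outputs, and
a mixer has several mono inputs and one mono output. The function names
and the linear pan law are assumptions for the example, and plain lists
stand in for buffers arriving on pads.

```python
def pan(mono, position):
    """One mono input, two mono outputs.

    `position` runs from 0.0 (hard left) to 1.0 (hard right); a simple
    linear pan law is assumed here.
    """
    left = [s * (1.0 - position) for s in mono]
    right = [s * position for s in mono]
    return left, right


def mix(*inputs):
    """Any number of mono inputs, one mono output (frame-wise sum)."""
    return [sum(frame) for frame in zip(*inputs)]


left, right = pan([1.0, 0.5], 0.25)
mono_again = mix(left, right)
```

Each of these would be a multi-pad element in a real pipeline, which is
where the extra plumbing comes from.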

*** Stability

DSP routines written years back still work, because all you need to do
to use them is link with -lm. GStreamer is a step towards DLL hell. And
audio developers are a funny bunch. Look at Paul Davis's Ardour CVS, for
instance. He has a local copy of every library ever coded, ever. No
joke.

If our platform is to remain attractive to this group, we need to start
stabilizing the way GStreamer works. Of course the API and ABI change;
we're young. But outside of media-related work, the core is pretty
stable. When we move to change things after 0.8, the changes should be
well documented.

That's all pretty normal, but there is one special consideration. DSP
involves lots of custom plugins, maintained outside the GStreamer tree.
So just because you grep the tree and don't find an instance of function
X or whatever, it doesn't necessarily mean the feature/behaviour is
unused. This will be increasingly true for other GStreamer users in the
future, but it's true now for DSP. I'm talking about myself here ;)

OK, enough rambling. Hope this clarifies things a bit.

Andy Wingo, 24 Jan 2004.