Optimize LE<->BE conversion by adding a dedicated fast path instead of
using the generic converter. Implement the transform_ip function to do the
endian swap in place.
This avoids allocating a buffer for the intermediate format and performs the
conversion in one step instead of unpack-convert-pack.
For all bit widths the naive algorithm is implemented, which provides the best
performance when compiled with -O3. ORC was considered but eventually removed
as it requires a dedicated function for in-place conversion (due to the
"restrict" parameters).
A more complex algorithm for the 24-bit conversion with unrolled loop and
32-bit processing is implemented in the #if 0 section. It performs better if
compiled with -O2. With -O3 however the naive algorithm performs better.
https://bugzilla.gnome.org/show_bug.cgi?id=773073
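For illustration, the naive in-place swap described above could look roughly
like the sketch below. The helper name swap_endian_inplace() and its
signature are hypothetical and not taken from the patch, which may well use
one function per bit width instead of a single generic one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: reverse the byte order of n samples of `width`
 * bytes each (2, 3 or 4), in place, one sample at a time. */
void
swap_endian_inplace (uint8_t *data, size_t n, size_t width)
{
  assert (width >= 2 && width <= 4);
  assert (data != NULL || n == 0);

  for (size_t i = 0; i < n; i++) {
    uint8_t *s = data + i * width;
    /* Swap the outer bytes; for 24-bit samples the middle byte stays put. */
    uint8_t tmp = s[0];
    s[0] = s[width - 1];
    s[width - 1] = tmp;
    if (width == 4) {
      /* 32-bit samples also need the two inner bytes swapped. */
      tmp = s[1];
      s[1] = s[2];
      s[2] = tmp;
    }
  }
}
```

No intermediate buffer is needed because every sample is rewritten in the
memory it already occupies.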
It is not necessary to store a pointer to every single chain element to free it.
Instead walk the channel list backwards and free the chain elements one by one.
Rename GstAudioConverter->chain_pack to chain_end.
https://bugzilla.gnome.org/show_bug.cgi?id=773073
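A minimal sketch of the idea, assuming each chain element keeps a pointer to
its predecessor. The Chain struct and the chain_new()/chain_free_all() helpers
are made up for this example and are not the real GstAudioConverter internals.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical chain element that only knows its predecessor. */
typedef struct _Chain Chain;
struct _Chain
{
  Chain *prev;
};

/* Append a new element after `prev` and return it as the new chain end. */
Chain *
chain_new (Chain *prev)
{
  Chain *c = calloc (1, sizeof (Chain));
  assert (c != NULL);
  c->prev = prev;
  return c;
}

/* Walk backwards from the last element and free each one.
 * Returns the number of elements freed. */
size_t
chain_free_all (Chain *chain_end)
{
  size_t n = 0;

  while (chain_end != NULL) {
    Chain *prev = chain_end->prev;      /* remember before freeing */
    free (chain_end);
    chain_end = prev;
    n++;
  }
  return n;
}
```

Only the pointer to the last element (chain_end) has to be stored; every
other element is reachable from it.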
Refuse to answer BYTES queries ourselves. The only
time they make sense is on raw elementary streams,
in which case upstream would already have answered.
They especially don't make sense for encoders to answer
based on upstream values - although perhaps later
we could make it do TIME->BYTES conversion on the source
pad based on bitrate.
https://bugzilla.gnome.org/show_bug.cgi?id=757631
gst_audio_buffer_reorder_channels() was always mapping the buffer read-write
regardless of whether any reordering was needed. If the from and to channel
orders are identical, return immediately without remapping the buffer.
Add a small helper function gst_audio_channel_positions_equal() which is used
in both gst_audio_reorder_channels() and gst_audio_buffer_reorder_channels().
https://bugzilla.gnome.org/show_bug.cgi?id=773833
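The early return could be sketched as follows. This is a simplified
stand-alone version that works on plain interleaved int16 samples and integer
position ids; positions_equal() and reorder_channels() are illustrative names,
not the GStreamer functions, and the 64-channel limit is an assumption of the
sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CHANNELS 64         /* assumed limit for this sketch */

/* True if both position arrays list the same positions in the same order. */
bool
positions_equal (const int *from, const int *to, size_t channels)
{
  assert (from != NULL && to != NULL);
  for (size_t i = 0; i < channels; i++) {
    if (from[i] != to[i])
      return false;
  }
  return true;
}

/* Reorder interleaved samples in place. Returns false without touching
 * the samples when the orders are identical, true after reordering. */
bool
reorder_channels (int16_t *samples, size_t frames, size_t channels,
    const int *from, const int *to)
{
  size_t map[MAX_CHANNELS];
  int16_t tmp[MAX_CHANNELS];

  assert (channels <= MAX_CHANNELS);

  if (positions_equal (from, to, channels))
    return false;               /* nothing to do, skip all the work */

  /* map[i] is the output index of input channel i */
  for (size_t i = 0; i < channels; i++) {
    size_t j = 0;
    while (j < channels && to[j] != from[i])
      j++;
    assert (j < channels);      /* every position must exist in `to` */
    map[i] = j;
  }

  for (size_t f = 0; f < frames; f++) {
    int16_t *frame = samples + f * channels;
    for (size_t i = 0; i < channels; i++)
      tmp[map[i]] = frame[i];
    for (size_t i = 0; i < channels; i++)
      frame[i] = tmp[i];
  }
  return true;
}
```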
All the GstAudioClock method declarations took a GstClock object
as their first argument, but in fact required a GstAudioClock object (checked
at runtime in the function body). Instead of checking the type at runtime, we
can change the function declarations to accept only GstAudioClock objects. The
runtime check is then no longer necessary, since a GstAudioClock object
is always passed to the function.
https://bugzilla.gnome.org/show_bug.cgi?id=756628
Seen on the Jenkins CI:
FAILED: subprojects/gst-plugins-base/gst-libs/gst/audio/audio_resampler_sse41@sta/audio-resampler-x86-sse41.c.o
ccache cc '-Isubprojects/gst-plugins-base/gst-libs/gst/audio/audio_resampler_sse41@sta' '-fdiagnostics-color=always' '-I../subprojects/gst-plugins-base/gst-libs/gst/audio' '-Isubprojects/gst-plugins-base/gst-libs/gst/audio' '-Isubprojects/gst-plugins-base/.' '-I../subprojects/gst-plugins-base/.' '-Isubprojects/gst-plugins-base/gst-libs' '-I../subprojects/gst-plugins-base/gst-libs' '-Isubprojects/gstreamer/libs' '-I../subprojects/gstreamer/libs' '-Isubprojects/gstreamer/.' '-I../subprojects/gstreamer/.' '-pipe' '-Wall' '-Winvalid-pch' '-DHAVE_CONFIG_H' '-msse4.1' '-fPIC' '-O0' '-g' '-fPIC' '-I/usr/include/glib-2.0' '-I/usr/lib/glib-2.0/include' '-pthread' '-Isubprojects/gstreamer/gst' '-MMD' '-MQ' 'subprojects/gst-plugins-base/gst-libs/gst/audio/audio_resampler_sse41@sta/audio-resampler-x86-sse41.c.o' '-MF' 'subprojects/gst-plugins-base/gst-libs/gst/audio/audio_resampler_sse41@sta/audio-resampler-x86-sse41.c.o.d' -o 'subprojects/gst-plugins-base/gst-libs/gst/audio/audio_resampler_sse41@sta/audio-resampler-x86-sse41.c.o' -c ../subprojects/gst-plugins-base/gst-libs/gst/audio/audio-resampler-x86-sse41.c
In file included from ../subprojects/gst-plugins-base/gst-libs/gst/audio/audio-resampler.h:24:0,
from ../subprojects/gst-plugins-base/gst-libs/gst/audio/audio-resampler-private.h:23,
from ../subprojects/gst-plugins-base/gst-libs/gst/audio/audio-resampler-macros.h:25,
from ../subprojects/gst-plugins-base/gst-libs/gst/audio/audio-resampler-x86-sse41.h:23,
from ../subprojects/gst-plugins-base/gst-libs/gst/audio/audio-resampler-x86-sse41.c:24:
../subprojects/gst-plugins-base/gst-libs/gst/audio/audio.h:26:39: fatal error: gst/audio/audio-enumtypes.h: No such file or directory
#include <gst/audio/audio-enumtypes.h>
^
compilation terminated.
This makes sure that we only build files that need explicit SIMD support
with the relevant CFLAGS. This allows the rest of the code to be built
without, and specific SSE* code is only called after runtime checks for
CPU features.
https://bugzilla.gnome.org/show_bug.cgi?id=729276
https://github.com/mesonbuild/meson
With contributions from:
Tim-Philipp Müller <tim@centricular.com>
Jussi Pakkanen <jpakkane@gmail.com> (original port)
Highlights of the features provided are:
* Faster builds on Linux (~40-50% faster)
* The ability to build with MSVC on Windows
* Generate Visual Studio project files
* Generate XCode project files
* Much faster builds on Windows (on-par with Linux)
* Seriously fast configure and building on embedded
... and many more. For more details see:
http://blog.nirbheek.in/2016/05/gstreamer-and-meson-new-hope.html
http://blog.nirbheek.in/2016/07/building-and-developing-gstreamer-using.html
Building with Meson should work on both Linux and Windows, but may
need a few more tweaks on other operating systems.
Elements that inherit from GstAudioDecoder, support PLC and introduce
delay produce invalid timestamps. A good example is opusdec with in-band FEC
enabled. After receiving a GAP event it delays the audio concealment until
the next buffer arrives. The next buffer will have the DISCONT flag set, which
causes GstAudioDecoder to reset its internal state, thus forgetting
the timestamp of the GAP event. As a result the concealed audio will have the
timestamp of the next buffer (with the DISCONT flag) rather than the timestamp
from the event.
As stated in its documentation, GST_AUDIO_CHANNEL_POSITION_NONE is meant to be used
for "position-less channels, e.g. from a sound card that records 1024
channels; mutually exclusive with any other channel position".
But at the moment using such positions would raise a
'g_return_if_reached' warning as gst_audio_get_channel_reorder_map()
would reject it.
Fix this by preventing any attempt to reorder in such a case, as that's not
what we want anyway.
https://bugzilla.gnome.org/show_bug.cgi?id=763799
We currently don't log much about channel positions, making debugging
harder than it should be. This is the first step in my attempt to improve
this.
https://bugzilla.gnome.org/show_bug.cgi?id=763985
There is a small window of time where the audio ringbuffer thread
can access the parent thread variable, before it's initialized
by the parent thread. The patch replaces this variable use by
g_thread_self().
https://bugzilla.gnome.org/show_bug.cgi?id=764865
Since the allocation query caps contain the memory size and the pad's caps
contain the display size, an audio encoder or decoder might need to allocate
a different buffer size than the size negotiated in the caps.
This patch separates the two sets of caps for audiodecoder and audioencoder.
Thus a user who needs different allocation caps should set them through
gst_audio_{encoder,decoder}_set_allocation_caps() before calling the negotiate()
vmethod. Otherwise the allocation caps will be the same as the caps on the
src pad.
https://bugzilla.gnome.org/show_bug.cgi?id=764421
Store the filter in the desired sample format so that we can simply do a
linear or cubic interpolation to get the new filter instead of having to
go through gdouble and then convert.
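As a rough illustration of the linear case, assuming the oversampled filter is
stored as consecutive phases of float taps: the function name and parameters
below are invented for the example, and the real code also covers other
sample formats and cubic interpolation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch: build the taps for a fractional phase by linear
 * interpolation between two adjacent oversampled phases `a` and `b`,
 * staying in the sample format (float here) the whole time.
 * `frac` is the position between the two phases, in [0, 1]. */
void
interpolate_taps_linear (const float *a, const float *b, float frac,
    float *out, size_t n_taps)
{
  assert (a != NULL && b != NULL && out != NULL);
  assert (frac >= 0.0f && frac <= 1.0f);

  for (size_t i = 0; i < n_taps; i++)
    out[i] = a[i] + frac * (b[i] - a[i]);
}
```

Since the stored taps are already in the target format, no conversion to
gdouble and back is needed to produce the new filter.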
Remove some unused variables from the inner product functions.
Make filter coefficients by interpolating if required.
Rename some fields.
Try hard to not recalculate filters when just changing the rate.
Add more properties to audioresample.
Rearrange the oversampled taps in memory to make it easier to use
SIMD instructions on them. This simplifies some SSE code.
Add some more optimizations
Improve int16 resampling by using pmaddwd
Use intrinsics to scale and pack int16 samples
Align the coefficients so that we can use aligned loads
Add padding to taps and samples so that we don't have to use partial
loads for the remainder of the loops.
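A simplified sketch of an int16 inner product built on pmaddwd
(_mm_madd_epi16), assuming the taps and samples are padded to a multiple of 8
elements so no remainder loop is needed. The function name is made up, the
scaling and packing of the result are left out, and unaligned loads are used
here so that the example places no alignment requirement on the caller. A
scalar fallback is included for builds without SSE2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical sketch: sum of a[i] * b[i] over len int16 elements.
 * len must be a multiple of 8, which the padding is assumed to ensure. */
int32_t
inner_product_s16 (const int16_t *a, const int16_t *b, size_t len)
{
  int32_t sum = 0;
  size_t i = 0;

  assert (len % 8 == 0);

#if defined(__SSE2__)
  __m128i acc = _mm_setzero_si128 ();

  for (; i < len; i += 8) {
    __m128i va = _mm_loadu_si128 ((const __m128i *) (a + i));
    __m128i vb = _mm_loadu_si128 ((const __m128i *) (b + i));
    /* pmaddwd: multiply 8 int16 pairs, add neighbours into 4 int32 lanes */
    acc = _mm_add_epi32 (acc, _mm_madd_epi16 (va, vb));
  }
  /* horizontal add of the 4 int32 lanes */
  acc = _mm_add_epi32 (acc, _mm_shuffle_epi32 (acc, _MM_SHUFFLE (1, 0, 3, 2)));
  acc = _mm_add_epi32 (acc, _mm_shuffle_epi32 (acc, _MM_SHUFFLE (2, 3, 0, 1)));
  sum = _mm_cvtsi128_si32 (acc);
#else
  for (; i < len; i++)
    sum += (int32_t) a[i] * b[i];
#endif

  return sum;
}
```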
Remove copy_n, we can reuse the plain copy function with some new
parameters.
Align and pad the sample array.
Remove the consumed/produced output fields from the resampler and
converter. Let the caller specify the right number of input/output
samples so we can be more optimal.
Use just one function to update the converter configuration.
Simplify some things internally.
Make it possible to use writable input as temp space in audioconvert.
If we don't have writable memory, make sure to make a copy of the input
samples into a temporary (writable) buffer, even if we are dealing with
a native intermediate format that we don't need to call the unpack
function for.
Fixes https://bugzilla.gnome.org/show_bug.cgi?id=761655
gst_pad_get_allowed_caps() will return NULL if the srcpad has no peer.
In that case, use gst_pad_peer_query_caps() with the template caps as filter
so that output caps are properly negotiated before forwarding the GAP event.
https://bugzilla.gnome.org/show_bug.cgi?id=761218
It's useful enough already to be used in other elements for audio aggregation,
let's give people the opportunity to use it and give it some API testing.
https://bugzilla.gnome.org/show_bug.cgi?id=760733
It's quite unexpected behaviour that various subclass settings are just
reset before set_format(). Unfortunately changing this now has the risk
of breaking existing code but we should reconsider this for 2.0.
When the input and output formats are the same and in a possible
intermediate format, avoid unpack and pack.
Never do passthrough channel mixing.
Only do dithering and noise shaping in S32 format
Add support for float and int16 mixing
Remove in-place processing, this simplifies things as we won't be using it.
Don't do clipping for float audio formats
Process as many samples as we can from the input and return the number
of processed samples from the chain. This simplifies some code.
Fix the IN_WRITABLE handling, don't overwrite the flags.
Pass flags in _converter_new() so that we can configure ourselves
differently depending on some options.
SOURCE_WRITABLE -> IN_WRITABLE because the array is called 'in'
Simplify the API, we don't need the consumed and produced output
arguments. The caller needs to use the _get_in_frames/get_out_frames API
to check how much input is needed and how much output will be produced.
We did not take the sample size into account. Rearrange the tests to have more
conversion tests and an extra test case for passthrough operations.
Fixes #759890
Rename samples to num_samples, since we also have samples in chain, but that is
the data pointer. Always use gsize for num_samples. Make the log output a bit
more homogeneous.
Rework the main processing loop. We now create an audio processing
chain from small core functions. This is very similar to how the
video-converter core works and allows us to statically calculate an
optimal allocation strategy for all possible combinations of operations.
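The general shape of such a chain, reduced to a toy example: each stage is a
small function and knows only the stage before it. The Stage struct, the two
stage functions and chain_run() are invented for this sketch and do not mirror
the actual converter code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical processing stage: a small core function plus a link to
 * the previous stage in the chain. */
typedef struct _Stage Stage;
struct _Stage
{
  Stage *prev;                  /* upstream stage, NULL for the first one */
  void (*process) (Stage *self, float *samples, size_t n);
  float param;
};

/* Example core function: multiply every sample by param. */
void
stage_scale (Stage *self, float *samples, size_t n)
{
  for (size_t i = 0; i < n; i++)
    samples[i] *= self->param;
}

/* Example core function: add param to every sample. */
void
stage_offset (Stage *self, float *samples, size_t n)
{
  for (size_t i = 0; i < n; i++)
    samples[i] += self->param;
}

/* Run the whole chain by starting at its end: each stage first lets its
 * predecessors process the samples, then applies its own function. */
void
chain_run (Stage *chain_end, float *samples, size_t n)
{
  if (chain_end == NULL)
    return;
  assert (chain_end->process != NULL);
  chain_run (chain_end->prev, samples, n);
  chain_end->process (chain_end, samples, n);
}
```

Because the set of stages is fixed when the chain is built, decisions such as
which stage may work in place can be made once up front rather than per
buffer.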
Make sure we support non-interleaved data everywhere.
Add functions to calculate in and out frames and latency.