Optimize LE<->BE conversion by adding a dedicated fast path instead of
using the generic converter. Implement transform_ip function in order to do the
endian swap in place.
This saves buffer allocation for the intermediate format, can be done in place
and also performs the conversion in one step instead of unpack-convert-pack.
For all bit widths the naive algorithm is implemented, which provides the best
performance when compiled with -O3. ORC was considered but eventually removed
as it requires a dedicated function for in-place conversion (due to the
"restrict" parameters).
A more complex algorithm for the 24-bit conversion with unrolled loop and
32-bit processing is implemented in the #if 0 section. It performs better if
compiled with -O2. With -O3 however the naive algorithm performs better.
https://bugzilla.gnome.org/show_bug.cgi?id=773073
find_suitable_mask() had complexity O(n^2) on the number of bits.
For common case like 2-channel audio the mask was calculated in about 4k loop
cycles.
Optimize both n_bits_set() and find_suitable_mask() to O(n) where n is the
number of bits set in the mask.
https://bugzilla.gnome.org/show_bug.cgi?id=772864
https://github.com/mesonbuild/meson
With contributions from:
Tim-Philipp Müller <tim@centricular.com>
Jussi Pakkanen <jpakkane@gmail.com> (original port)
Highlights of the features provided are:
* Faster builds on Linux (~40-50% faster)
* The ability to build with MSVC on Windows
* Generate Visual Studio project files
* Generate XCode project files
* Much faster builds on Windows (on-par with Linux)
* Seriously fast configure and building on embedded
... and many more. For more details see:
http://blog.nirbheek.in/2016/05/gstreamer-and-meson-new-hope.htmlhttp://blog.nirbheek.in/2016/07/building-and-developing-gstreamer-using.html
Building with Meson should work on both Linux and Windows, but may
need a few more tweaks on other operating systems.
Pass flags in _converter_new() so that we can configure ourselves
differently depending on some options.
SOURCE_WRITABLE -> IN_WRITABLE because the array is called 'in'
Simplify the API, we don't need the consumed and produced output
arguments. The caller needs to use the _get_in_frames/get_out_frames API
to check how much input is needed and how much output will be produced.
In this specific case it wouldn't cause problems as we only ever access the
first array element, but let's make explicit what is happening here.
CID 1346530 and 1346529
Remove the format and layout from the mix_samples function and use the
format when creating the channel mixer object. Also use a flag to handle
the unlikely case of non-interleaved samples like we do elsewhere.
Add docs for the internal audioconvert object before moving it to the
audio library.
Remove get_sizes and implement the trivial logic in the element.
Remove some unused orc functions
Move the audio quantize code from audioconvert to the audio library.
work on making an audio converter helper function similar to the video
converter.
Fold fastrandom directly into the quantizer, add some ORC code to
optimize this later.
Rename _get_default_mask() to _get_fallback_mask() to make it more
clear that the function only provides a fallback if nothing else can be
done. Also clarify this in the documentation.
API: gst_audio_channel_get_fallback_mask()
Add a TRUNCATE_RANGE flag for unpack functions to fill the least
significate bits with 0 (as did the old code). Also add functions
that don't truncate. Use the TRUNC flag in audioconvert for
backwards compatibility for now.
Use (1 << 31) as the multiplier for int<->float conversions. This makes
sure that int->float conversions always end up with floats between
[-1.0, 1.0].
For the conversion from float to int, this multiplier will give the complete
int range after we perform clipping.
Change the unit test to take this into consideration.
Fixes https://bugzilla.gnome.org/show_bug.cgi?id=755301
Rewrite audioconvert to try to make it more clear what steps are
executed during conversion.
Add passthrough step that just does a memcpy when possible.
Add ORC optimized dither and quantization functions.
Implement noise-shaping on S32 samples only and allow for arbitrary
noise shaping coefficients if we want this later.
Rework the converter to use the pack/unpack functions
Because the unpack functions can only unpack to 1 format, add a separate
conversion step for doubles when the unpack function produces int.
Do conversion to S32 in the quantize function directly.
Tweak the conversion factor for doing float->int conversion slightly to
get the full range of negative samples, use clamp to make sure we don't
exceed our int range on the positive axis (see also #755301)
This is a fixup for b2db18cda2
audioconvert: avoid float calculations when mixing integer-formatted channels
The int matrix was using gint and gint32 synonymously, which can theoretically
cause problems if gint and gint32 are actually different types.
https://bugzilla.gnome.org/show_bug.cgi?id=747005
The patch calculates a second channel mixing matrix from the current one. The
matrix contains the original values * (2^10) as integers. This matrix is used
when integer-formatted channels are mixed.
On a ARM Cortex-A8, single core, 800MHz this improves performance in a
testcase from 29s to 9s for downmixing 6 channels to stereo.
https://bugzilla.gnome.org/show_bug.cgi?id=747005
audio_convert_convert unpacks to default format (signed) before calling
quantize, and the unsigned variants were equivalent to signed anyway,
so we just get rid of them.
Since range size is always 2^n, we can simply use modulo (implemented
with a bitmask).
The previous implementation used 64-bit integer division, which is
done in software on ARMv7. Although the divisor was constant, the
division could not be transformed into "multiplication by magic number"
since the dividend was 64-bit.
The now-unused and not-so-fast gst_fast_random_(u)int32_range functions
were removed.
Also, implementing bug fixes:
1) ADD_DITHER_TPDF_HF_I no longer discards bias.
2) We change TPDF's noise range to be the same as RPDF's. Previously,
RPDF's noise ranged:
{ bias - dither, bias + dither }
while TPDF's noise ranged:
{ bias/2 - dither/2, bias/2 + dither/2 - 1 } +
{ bias/2 - dither/2, bias/2 + dither/2 - 1 } =
{ bias - dither, bias + dither - 2 }
Now, both range:
{ bias - dither, bias + dither - 1 }
https://bugzilla.gnome.org/show_bug.cgi?id=746661