When nsamples_out is larger than nsamples_in, using unsigned
ints leads to an overflow, and the resulting value is wrong and
way too large for allocating a buffer. Use signed integers
and return immediately when that happens.
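
A minimal sketch of the failure mode described above, not the actual patch.
The function names are hypothetical, and it is assumed that the buffer size
is derived from the difference nsamples_in - nsamples_out; the message does
not say which expression is used.

```c
#include <stdint.h>

/* Hypothetical helper: unsigned arithmetic wraps around when
 * nsamples_out > nsamples_in and yields a huge, bogus sample count. */
uint32_t
samples_left_unsigned (uint32_t nsamples_in, uint32_t nsamples_out)
{
  return nsamples_in - nsamples_out;
}

/* Hypothetical helper: with signed arithmetic the difference goes
 * negative, so the caller can return before allocating anything.
 * Returns -1 when there is nothing sensible to allocate. */
int64_t
samples_left_signed (int64_t nsamples_in, int64_t nsamples_out)
{
  int64_t diff = nsamples_in - nsamples_out;

  if (diff < 0)
    return -1;

  return diff;
}
```

With nsamples_in = 10 and nsamples_out = 12 the unsigned variant returns
4294967294, while the signed variant reports -1 and lets the caller bail out.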
Use a more efficient formula that uses fewer multiplies.
Reduce the amount of scalar code, use MMX to calculate the desired
alpha value.
Unroll and handle 2 pixels in one iteration for improved pairing.
Convert the alpha value to 0->255 when setting and to 0->256 when using as
a scaling factor. This makes sure we can reach the full opacity value of 0xff in
all cases.
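
To illustrate why the 0->256 range matters, here is a small sketch. The
mapping alpha + (alpha >> 7) is an assumption chosen for illustration (the
message does not state how the conversion is done), and all function names
are hypothetical.

```c
#include <stdint.h>

/* Assumed mapping (not taken from the patch): widen 0..255 to 0..256
 * so that full opacity becomes an exact factor of 1.0 in 8.8 fixed point. */
unsigned
alpha_to_scale (uint8_t alpha)
{
  return alpha + (alpha >> 7);
}

/* Scale an 8-bit component by an alpha stored in the 0..255 range,
 * using the widened 0..256 factor. */
uint8_t
scale_component (uint8_t value, uint8_t alpha)
{
  return (uint8_t) ((value * alpha_to_scale (alpha)) >> 8);
}

/* For comparison: using 0..255 directly as the factor never yields 0xff. */
uint8_t
scale_component_naive (uint8_t value, uint8_t alpha)
{
  return (uint8_t) ((value * alpha) >> 8);
}
```

With value = 0xff and alpha = 0xff the naive variant gives 0xfe, whereas the
widened factor gives 0xff.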
Fix some comments, clarify some FIXMEs
Remove the on-ntp-stop signal check now that the jitterbuffer is in
-good and we know that it supports this signal.
A seek in a multi-sink pipeline typically leads to several seek events in a row,
which could lead to sending several newsegments in a row without intermediate
flushing. These would then accumulate, distort rendering times and as such
lead to 'hanging'.
Use GstRTPBaseAudioPayload as the base class. This saves a lot of code and fixes
a bunch of problems that were already solved in the base class.
Fixes #853367
Don't make copies in the getter and setter for SDES in the RTPSource. This
avoids a couple of copies of the SDES structure when generating RTCP
packets.
Add a new spspps-interval property to instruct the payloader to insert
SPS and PPS at periodic intervals in the stream.
Rework the SPS/PPS handling so that bytestream and AVC sample code both use the
same code paths to handle sprop-parameter-sets. This also allows the AVC
code to insert SPS/PPS like the bytestream code.
Fixes #604913
For some reason the latest gcc/binutils accept movzxb here while
movzbl would be correct and is the only thing accepted by older
gcc/binutils.
Fixes bug #604679.
This provides another 7% speedup for the time domain convolution and 1.5%
speedup for the FFT convolution on Mono input.
This optimization assumes that the compiler simplifies calculations
and conditions on constant numbers and unrolls loops with a constant
number of repeats.
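
The kind of code that relies on this assumption might look like the sketch
below. It is not the actual implementation: the function names and the
zero-padded handling of the buffer start are hypothetical. The generic
routine takes the channel count as a parameter, and thin wrappers pass it
as a literal so the compiler can fold the inner loop.

```c
#include <stddef.h>

/* Direct (time-domain) FIR over interleaved samples.  When 'channels'
 * is a compile-time constant at the call site, the compiler is expected
 * to unroll the channel loop and simplify the index arithmetic. */
static inline void
convolve_time_domain (const float *in, float *out, size_t frames,
    const float *kernel, size_t kernel_len, size_t channels)
{
  size_t i, j, c;

  for (i = 0; i < frames; i++) {
    for (c = 0; c < channels; c++) {
      float acc = 0.0f;

      /* Samples before the start of the buffer are treated as silence. */
      for (j = 0; j < kernel_len && j <= i; j++)
        acc += kernel[j] * in[(i - j) * channels + c];
      out[i * channels + c] = acc;
    }
  }
}

/* Specialisations with a constant channel count. */
void
convolve_mono (const float *in, float *out, size_t frames,
    const float *kernel, size_t kernel_len)
{
  convolve_time_domain (in, out, frames, kernel, kernel_len, 1);
}

void
convolve_stereo (const float *in, float *out, size_t frames,
    const float *kernel, size_t kernel_len)
{
  convolve_time_domain (in, out, frames, kernel, kernel_len, 2);
}
```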
This will always use time-domain convolution, which lowers the latency.
With FFT convolution the latency is always a multiple of the kernel length;
with time-domain convolution it is only the pre-latency of the filter kernel.
This provides a great speedup; in particular, the relationship between kernel
length and processing time is now logarithmic instead of linear. Below a
kernel size of 32 it's a bit slower, above that it's much faster:
17 0.788000 -> 0.950000
33 1.208000 -> 1.146000
65 2.166000 -> 1.146000
...
4097 107.444000 -> 1.508000
For sizes smaller than 32 the normal time-domain convolution is chosen,
for larger sizes the FFT convolution is automatically used.
Fixes bug #594381.
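
The selection rule can be sketched as follows. All names are hypothetical,
and two details are assumptions: that a kernel length of exactly 32 goes to
the FFT path, and that the "always time-domain" behaviour is controlled by a
boolean low_latency flag.

```c
#include <stddef.h>

typedef enum
{
  CONVOLVE_TIME_DOMAIN,
  CONVOLVE_FFT
} ConvolveMode;

/* Kernel length from which the FFT path is assumed to pay off. */
#define FFT_THRESHOLD 32

/* Pick the convolution strategy.  'low_latency' is a hypothetical flag
 * that forces the time-domain path regardless of the kernel length. */
ConvolveMode
choose_convolve_mode (size_t kernel_len, int low_latency)
{
  if (low_latency || kernel_len < FFT_THRESHOLD)
    return CONVOLVE_TIME_DOMAIN;

  return CONVOLVE_FFT;
}
```

With this rule a kernel of length 17 stays on the time-domain path, a kernel
of length 4097 uses the FFT path, and any length stays in the time domain
when low latency is requested.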