Improve int16 resampling by using pmaddwd
Use intrinsics to scale and pack int16 samples
Align the coefficients so that we can use aligned loads
Add padding to taps and samples so that we don't have to use partial
loads for the remainder of the loops.
Remove copy_n, we can reuse the plain copy function with some new
parameters.
Align and pad the sample array.
Remove the consumed/produced output fields from the resampler and
converter. Let the caler specify the right number of input/output
samples so we can be more optimal.
Use just one function to update the converter configuration.
Simplify some things internally.
Make it possible to use writable input as temp space in audioconvert.