Rework the audio caps along the lines of the video caps. Remove the
width/depth/endianness/signed fields and replace them with a simple
format string and the media type audio/x-raw.
Create a GstAudioInfo and some helper methods to parse caps.
Remove duplicate code from the ringbuffer and use GstAudioInfo instead.
Use GstAudioInfo in the base audio filter class.
Port elements to new API.
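
To illustrate how a single format string can carry what the four removed
fields used to express, here is a minimal sketch in plain C. It is not the
GstAudioInfo implementation: the SketchAudioFormat type and
sketch_parse_format() are hypothetical names invented for this example, and
the grammar it accepts (sign/float letter, depth, optional "_width",
optional byte order) is an assumption modelled on strings such as "S16LE",
"U8" and "S24_32BE".

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical illustration type, NOT the real GstAudioInfo. */
typedef struct {
  bool is_float;       /* 'F' formats */
  bool is_signed;      /* 'S' and 'F' formats */
  int width;           /* bits of storage per sample */
  int depth;           /* significant bits per sample */
  bool little_endian;  /* byte order; meaningless for 8-bit samples */
} SketchAudioFormat;

/* Decode a format string such as "S16LE", "U8" or "S24_32BE".
 * Returns true on success; on failure *out may be partially written. */
static bool
sketch_parse_format (const char *s, SketchAudioFormat *out)
{
  char *end;
  long depth, width;

  if (s == NULL || out == NULL)
    return false;

  /* First letter: sample type (this replaces the old "signed" field). */
  switch (s[0]) {
    case 'S': out->is_float = false; out->is_signed = true;  break;
    case 'U': out->is_float = false; out->is_signed = false; break;
    case 'F': out->is_float = true;  out->is_signed = true;  break;
    default:  return false;
  }

  /* First number: depth. An optional "_<n>" suffix gives the width. */
  if (!isdigit ((unsigned char) s[1]))
    return false;
  depth = strtol (s + 1, &end, 10);
  if (depth <= 0 || depth > 64)
    return false;
  width = depth;
  if (*end == '_') {
    const char *w = end + 1;
    if (!isdigit ((unsigned char) *w))
      return false;
    width = strtol (w, &end, 10);
    if (width < depth || width > 64)
      return false;
  }

  /* Trailing byte order (this replaces the old "endianness" field). */
  if (strcmp (end, "LE") == 0)
    out->little_endian = true;
  else if (strcmp (end, "BE") == 0)
    out->little_endian = false;
  else if (*end == '\0' && width == 8)
    out->little_endian = true;  /* arbitrary: byte order is moot here */
  else
    return false;

  out->width = (int) width;
  out->depth = (int) depth;
  return true;
}
```

With this sketch, "S24_32BE" decodes to signed, depth 24, width 32,
big-endian, while "S16" is rejected because multi-byte samples need a byte
order.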
Input timestamps often jitter slightly around the 'perfect time'. Instead
of adding and dropping samples on every such deviation, optionally allow
some tolerance before correcting.
API: GstAudioRate:tolerance
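
The idea behind the tolerance can be sketched as a small decision function.
This is an illustration only, not the audiorate code: SketchAction and
sketch_decide() are hypothetical names, and expressing all three values in
nanoseconds is an assumption made for the example.

```c
#include <stdint.h>

/* Hypothetical result type for this sketch. */
typedef enum {
  SKETCH_PASS,  /* deviation within tolerance: leave the buffer alone */
  SKETCH_FILL,  /* buffer is too late: add samples to cover the gap */
  SKETCH_DROP   /* buffer is too early: drop the overlapping samples */
} SketchAction;

/* Compare the timestamp a buffer actually carries with the 'perfect time'
 * at which it was expected, and only correct when the deviation exceeds
 * the tolerance. All arguments are assumed to be in nanoseconds. */
static SketchAction
sketch_decide (uint64_t expected_ns, uint64_t actual_ns,
    uint64_t tolerance_ns)
{
  if (actual_ns >= expected_ns) {
    /* Late or on time; unsigned subtraction is safe in this branch. */
    if (actual_ns - expected_ns <= tolerance_ns)
      return SKETCH_PASS;
    return SKETCH_FILL;
  }

  /* Early. */
  if (expected_ns - actual_ns <= tolerance_ns)
    return SKETCH_PASS;
  return SKETCH_DROP;
}
```

A tolerance of 0 in this sketch reproduces the strict behaviour: any
deviation at all leads to adding or dropping samples.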