Add a max-threads property, which defaults to '0 = auto'
Add a utility function taken from libschroedinger which sets
the ffmpeg worker thread count to match the computer processor
count by default.
Add a new function new_aligned_buffer() which creates a GstBuffer of
the requested size/caps, with the memory being allocated/freed by ffmpeg's
av_malloc/av_free which guarantees properly aligned memory.
Added a can_allocate_aligned internal property which we use to figure out
whether downstream can provide us with 128bit aligned buffers.