gst_buffer_set_qdata() will leak the structure passed to it
when called incorrectly (e.g. on a non-metadata-writable buffer).
This is expected, but we must avoid doing that in valgrind.
Try to avoid floating point maths for each pixel to be blended in
inner loop, and try to avoid the multiplication entirely for the
most common case of the global alpha being 1. Could probably be
refactored a bit more.