Pull video frame fields into local variables. Without this, the
compiler must assume that they could have changed on every use and
reload them from memory each time.
This reduces the inner loop from 6 memory reads per pixel to 4; the
number of writes stays at 3.
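
A minimal sketch of the idea, not the actual element code. The `Frame` struct and both function names are hypothetical stand-ins for the real video frame type, and the 4-bytes-per-pixel layout is an assumption. Because the pixel stores go through a `uint8_t` pointer, which may alias any object, the compiler may have to reload the struct fields after each write unless they are copied into locals first. The exact read counts depend on the compiler and the real loop body.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a mapped video frame (not the GStreamer type). */
typedef struct {
  uint8_t *data;   /* first byte of the first row */
  int width;       /* pixels per row */
  int height;      /* number of rows */
  int stride;      /* bytes per row */
} Frame;

/* Before: fields are read through the pointer inside the loops. Each store
 * to p[] might alias *frame, so the fields may be reloaded per pixel. */
void invert_rgb_reload(Frame *frame)
{
  for (int y = 0; y < frame->height; y++) {
    for (int x = 0; x < frame->width; x++) {
      uint8_t *p = frame->data + (size_t) y * frame->stride + (size_t) x * 4;
      p[0] = (uint8_t) (255 - p[0]);
      p[1] = (uint8_t) (255 - p[1]);
      p[2] = (uint8_t) (255 - p[2]);   /* p[3] is left untouched */
    }
  }
}

/* After: fields are copied into locals once. Locals whose address is never
 * taken cannot be changed by the stores, so they can stay in registers. */
void invert_rgb_locals(Frame *frame)
{
  uint8_t *data = frame->data;
  const int width = frame->width;
  const int height = frame->height;
  const int stride = frame->stride;

  for (int y = 0; y < height; y++) {
    uint8_t *row = data + (size_t) y * stride;
    for (int x = 0; x < width; x++) {
      uint8_t *p = row + (size_t) x * 4;
      p[0] = (uint8_t) (255 - p[0]);
      p[1] = (uint8_t) (255 - p[1]);
      p[2] = (uint8_t) (255 - p[2]);   /* p[3] is left untouched */
    }
  }
}
```

Both functions produce identical output; only the number of loads the compiler is forced to emit in the inner loop differs.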
This allows us to use the GstVideoOverlayComposition API and correctly
handle pre-multiplied alpha, and performs the alpha conversion only
once for the whole frame instead of twice.
Later, if downstream supports it, we can attach the overlay composition
meta to the buffer instead of blending ourselves.
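
For reference, a small sketch of what pre-multiplied alpha means arithmetically. This is plain C for illustration and not the GstVideoOverlayComposition implementation. The helper names `div255`, `premultiply` and `blend_premultiplied` are hypothetical, and the rounding choice is an assumption. With pre-multiplied pixels, the source colour has already been scaled by its alpha, so blending over an opaque destination only needs to scale the destination.

```c
#include <stdint.h>

/* Rounded division by 255 for values in the range 0..65025. */
static inline uint8_t div255(unsigned v)
{
  return (uint8_t) ((v + 127) / 255);
}

/* Convert one straight-alpha colour channel to pre-multiplied form. */
uint8_t premultiply(uint8_t c, uint8_t a)
{
  return div255((unsigned) c * a);
}

/* Blend one pre-multiplied source channel over an opaque destination
 * channel: result = src_pm + dst * (255 - a) / 255.
 * Assumes src_pm <= a; the clamp guards against invalid input. */
uint8_t blend_premultiplied(uint8_t src_pm, uint8_t a, uint8_t dst)
{
  unsigned sum = src_pm + div255((unsigned) dst * (255u - a));
  return (uint8_t) (sum > 255 ? 255 : sum);
}
```

If the overlay pixels are already pre-multiplied, `premultiply` is never needed on the blending path, which is why declaring the format correctly avoids a redundant conversion pass.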
https://bugzilla.gnome.org/show_bug.cgi?id=797091
Allows applications to connect to the "draw" signal of
the element and do their custom drawing there.
Includes an example application demonstrating usage.
Fixes: https://bugzilla.gnome.org/show_bug.cgi?id=595520