Decoder bounded CUDA memory is allocated by driver and the pool size
is fixed. Since we don't know how many buffers would be held by
downstream non-CUDA element, we should download such CUDA memory
and release it back to decoder.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/4810>