This implementation is similar to what we've done for nvcodec plugin.
Since supported resolution, profiles, and formats are device dependent ones,
single template caps cannot represent them, so this modification
will help autoplugging and fallback.
Note that the legacy gpu list and list of resolution to query were
taken from chromium's code.
Source texture (decoder view) might be larger than destination (staging) texture.
In that case, D3D11_BOX structure should be passed to CopySubresourceRegion method
in order to specify the exact target area.
Use consistent memory layout between dxva and other shader use case.
For example, use DXGI_FORMAT_NV12 texture format instead of
two textures with DXGI_FORMAT_R8_UNORM and DXGI_FORMAT_R8G8_UNORM.
This reverts commit ddd13fc7c0
Dynamic usage can reduce the number of copy per frame but make
things complicated and the benefit seems to not significant.
Also since we don't provide _map() method for the dynamic usage,
application cannot read buffers which make "last-sample" property
unusable in case of d3d11videosink.