... and various code cleanup.
* Move scattered decoding API calls into one method
Previously, the decoding flow of most codecs was:
- Call DecoderBeginFrame() on start_picture()
- Call {Get,Release}DecoderBuffer() on decode_slice()
- Call SubmitDecoderBuffers() and DecoderEndFrame() on end_picture()
Such scattered API calls make it hard to keep track of the decoding
state. Now all of these calls are made at once in a new method.
* Drop code for non-zero wBadSliceChopping
When the bitstream buffer provided by the driver is not large enough
to hold the compressed bitstream data, the host decoder needs to use
wBadSliceChopping so that the driver knows the data is split across
multiple bitstream buffers. But that case is a bit unrealistic and
not tested. Since FFmpeg's DXVA implementation doesn't support it,
we might be able to ignore the case for now.
* Make code more portable
Consider common logic of GstCodecs -> DXVA translation for all D3D APIs
(i.e., D3D9, D3D11, and D3D12).
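
The consolidated flow described in the first item can be sketched as below. This is only an illustration under stated assumptions: MockDecoder, PictureData, and decode_picture() are hypothetical names that stand in for the real DXVA decoder and the data collected during start_picture()/decode_slice(); they are not the actual GStreamer code.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a DXVA decoder. It only records which call
// was made, so the ordering of the consolidated flow can be inspected.
struct MockDecoder {
  std::vector<std::string> calls;
  std::vector<uint8_t> bitstream;

  bool DecoderBeginFrame() { calls.push_back("DecoderBeginFrame"); return true; }
  uint8_t *GetDecoderBuffer(std::size_t size) {
    calls.push_back("GetDecoderBuffer");
    bitstream.resize(size);
    return bitstream.data();
  }
  void ReleaseDecoderBuffer() { calls.push_back("ReleaseDecoderBuffer"); }
  bool SubmitDecoderBuffers() { calls.push_back("SubmitDecoderBuffers"); return true; }
  bool DecoderEndFrame() { calls.push_back("DecoderEndFrame"); return true; }
};

// Hypothetical container for slice data accumulated before submission.
struct PictureData {
  std::vector<uint8_t> slices;
};

// All decoder API calls for one picture happen here, in one place.
bool decode_picture(MockDecoder &dec, const PictureData &pic) {
  if (!dec.DecoderBeginFrame())
    return false;

  uint8_t *dst = dec.GetDecoderBuffer(pic.slices.size());
  std::copy(pic.slices.begin(), pic.slices.end(), dst);
  dec.ReleaseDecoderBuffer();

  bool ok = dec.SubmitDecoderBuffers();
  // The frame is always ended once it has been begun.
  ok = dec.DecoderEndFrame() && ok;
  return ok;
}
```

With every call in a single function, an early failure cannot leave the decoder between DecoderBeginFrame() and DecoderEndFrame() across separate callbacks.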
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/2525>
Users can get the required buffer size from the buffer pool config.
Since the d3d11 implementation is a candidate for a public library in the
future, we need to hide as much as possible from the headers.
Note that the total size of d3d11 texture memory allocated by the GPU is not
a controllable factor. It depends on hardware-specific alignment/padding
requirements. So the GstD3D11 implementation determines the actual buffer
size by allocating a D3D11 texture, since there's no way to get the
CPU-accessible memory size without allocating a real D3D11 texture.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/2482>
* Remove unnecessary upcasting. We are now dealing with C++ class objects
and don't need explicit C-style casts in C++ code
* Use the helper macro IID_PPV_ARGS() everywhere. It makes the code
a little shorter.
* Use the ComPtr smart pointer instead of calling IUnknown::Release() manually
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/2461>
The current implementation for translating between native coordinates and
video coordinates is wrong because d3d11videosink doesn't know the
native HWND's coordinates. That should be handled
by the GstD3D11Window implementation as an enhancement.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/2450>
Inspired by MR https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/2382
The idea is that we can safely make use of MoveWindow() in the WIN32
d3d11window implementation, because it creates an internal HWND even when
an external HWND is set, and subclassing is then used to draw on the
internal HWND in any case. So the coordinates passed to MoveWindow()
will be relative to the parent HWND, which matches the concept of
set_render_rectangle().
On MoveWindow(), a WM_SIZE event will be generated by the OS and then the
GstD3D11WindowWin32 implementation will update the render area, including
the swapchain, accordingly, as if it were a normal window move/resize.
But in the case of UWP (CoreWindow or SwapChainPanel), we need more research
to achieve the expected behavior of set_render_rectangle()
Fixes: https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/issues/1416
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/2450>
Qualcomm GPUs work fine with the current implementation now.
The noticeable difference between the time it was disabled and the current
d3d11 implementation is that we now support a GstD3D11Memory
pool, so the decoder surface is no longer re-bound frequently.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/2377>
With this commit, the following formats are newly supported by d3d11 elements:
* Y444_{8, 12, 16}LE formats:
Similar to other planar formats. Such Y444 variants are not supported
by Direct3D11 natively, but we can simply map each plane by
using R8 and/or R16 texture.
* P012_LE:
It is no different from P016_LE, but P012 and P016 are defined separately
for more explicit signalling. Note that DXVA uses P016 textures
for 12-bit encoded bitstreams.
* GRAY:
This format is required for some codecs (e.g., AV1) if monochrome
is supported
* 4:2:0 planar 12-bit (I420_12LE) and 4:2:2 planar 8-, 10-, and 12-bit
formats (Y42B, I422_10LE, and I422_12LE)
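
The per-plane mapping mentioned for the Y444 variants can be illustrated roughly as follows. This is a sketch, not the element's code: PlaneLayout, plane_format_for_depth(), and y444_planes() are hypothetical names, and the format strings merely name the single-channel R8/R16 texture kinds referred to above.

```cpp
#include <array>
#include <string>

// Hypothetical description of one plane mapped to its own texture.
struct PlaneLayout {
  unsigned width;
  unsigned height;
  std::string format;  // name of the single-channel texture format
};

// Assumption for this sketch: 8-bit samples use an R8 texture and
// anything deeper (12 or 16 bits) uses an R16 texture.
std::string plane_format_for_depth(unsigned bits) {
  return bits <= 8 ? "R8_UNORM" : "R16_UNORM";
}

// 4:4:4 has no chroma subsampling, so Y, U, and V planes all share
// the full frame resolution and the same per-plane format.
std::array<PlaneLayout, 3> y444_planes(unsigned width, unsigned height,
                                       unsigned bits) {
  const std::string fmt = plane_format_for_depth(bits);
  return {{{width, height, fmt}, {width, height, fmt}, {width, height, fmt}}};
}
```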
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/2346>
Before creating the output duplication interface, call SetThreadDesktop()
with HDESK of the current input desktop in case a desktop switch has
occurred.
This allows d3d11desktopdupsrc to capture Windows User Account Control
(UAC) prompts, which appear on a separate secure desktop. Otherwise
IDXGIOutput1::DuplicateOutput() will return E_ACCESSDENIED and the
element won't produce any frames as long as the UAC screen is active.
Note that in order to access the secure desktop the application still has to
run with LOCAL_SYSTEM privileges. For GStreamer applications running with
regular user privileges this change has no effect.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/2209>
Enable zero-copy if downstream proposed a pool, so that the decoder
knows the number of buffers required by downstream.
Otherwise the decoder will copy when our DPB pool does not have enough
buffers for later decoding operations.
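
One way to read the rule above is sketched below. The function name and its parameters are hypothetical and only model the decision described here; the real decoder tracks this state differently.

```cpp
// Hypothetical model of the output decision:
// - if downstream proposed a pool, the decoder sized its pool with
//   downstream's needs in mind, so output can be zero-copy;
// - otherwise, copy only when the DPB pool would run short of buffers
//   needed for later decoding.
bool can_output_zero_copy(bool downstream_proposed_pool,
                          unsigned free_dpb_buffers,
                          unsigned buffers_needed_for_decoding) {
  if (downstream_proposed_pool)
    return true;
  return free_dpb_buffers >= buffers_needed_for_decoding;
}
```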
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/2097>
Major changes:
* GstD3D11Allocator: This allocator is now a device-independent object
which can allocate GstD3D11Memory objects for any GstD3D11Device.
Users can get this object via gst_allocator_find(GST_D3D11_MEMORY_NAME)
* GstD3D11PoolAllocator: A new allocator implementation for texture pool.
From now on GstD3D11BufferPool will make use of this memory pool allocator
to avoid frequent texture reallocation. That usually happens because
of buffer copy (gst_buffer_make_writable for example)
In addition to that, GstD3D11BufferPool will provide GstBuffer with
GstVideoMeta, because CPU access to a GstD3D11Memory without GstVideoMeta
is almost impossible since GPU drivers need padding for stride alignment.
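
To show why the stride information matters for CPU access, here is a minimal sketch of reading a padded plane. copy_plane_tight() is a hypothetical helper; in practice the stride would come from the GstVideoMeta attached to the buffer.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Copy a plane whose rows are `stride` bytes apart in memory into a
// tightly packed buffer that holds only the `row_bytes` of visible data
// per row. Without knowing the stride, rows would be read from the
// wrong offsets.
std::vector<uint8_t> copy_plane_tight(const uint8_t *src, std::size_t stride,
                                      std::size_t row_bytes, std::size_t rows) {
  if (row_bytes == 0 || rows == 0)
    return {};

  std::vector<uint8_t> dst(row_bytes * rows);
  for (std::size_t y = 0; y < rows; ++y)
    std::memcpy(dst.data() + y * row_bytes, src + y * stride, row_bytes);
  return dst;
}
```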
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/2097>
We've been retrying with a 1ms sleep if DecoderBeginFrame()
returned E_PENDING, which means the application should call
DecoderBeginFrame() again because the GPU is busy.
The 1ms sleep() during retry would usually result in a delay of about
15ms in reality because of poor clock precision on Windows.
To improve throughput, this commit enables the
high-precision clock only for NVIDIA platforms, since
DecoderBeginFrame() calls on other GPU vendors seem to
succeed without retry.
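
The retry pattern described above can be sketched in portable C++ as follows. BeginResult and begin_frame_with_retry() are hypothetical names, and the callback stands in for DecoderBeginFrame(). The sketch does not raise the timer resolution itself; how the commit enables the high-precision clock is not shown here.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical result type: Pending models E_PENDING (GPU busy, retry).
enum class BeginResult { Ok, Pending, Error };

// Call begin_frame until it stops reporting Pending, sleeping briefly
// between attempts. The requested delay is only a lower bound: with a
// coarse system timer the actual sleep can be much longer.
bool begin_frame_with_retry(
    const std::function<BeginResult()> &begin_frame, unsigned max_retries,
    std::chrono::milliseconds delay = std::chrono::milliseconds(1)) {
  for (unsigned attempt = 0; attempt <= max_retries; ++attempt) {
    switch (begin_frame()) {
      case BeginResult::Ok:
        return true;
      case BeginResult::Error:
        return false;
      case BeginResult::Pending:
        if (attempt < max_retries)
          std::this_thread::sleep_for(delay);
        break;
    }
  }
  return false;  // still pending after all retries
}
```

Because each retry costs at least one sleep, the effective length of that sleep directly bounds how quickly a busy decoder can resume.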
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/2099>