Adding a property to control the number of in-flight GPU commands
(default is unlimited). Note that actual maximum number is defined
in d3d12device's direct command queue object which is 32 now,
thus total number of scheduled GPU commands cannot exceed 32.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7444>
Adding a new videosink element for Windows composition API based
applications. Unlike d3d12videosink, this element will create only
DXGI swapchain by using IDXGIFactory2::CreateSwapChainForComposition()
without actual window handle, so that video scene can be composed
via Windows native composition API, such as DirectComposition.
Note that this videosink does not support GstVideoOverlay interface
because of the design.
The swapchain created by this element can be used with
* DirectComposition's IDCompositionVisual in Win32 app
* WinRT and WinUI3's UI.Composition in Win32/UWP app
* UWP and WinUI3 XAML's SwapChainPanel
See also examples in this commit which show usage of the videosink
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7287>
A fence configured in GstD3D12Memory should be used only for
write access to be completed. And because d3d12 -> d3d11 copy path
is read access to d3d12 resource, we should not set fence to
memory. Otherwise another read access to the d3d12 resource
will wait for d3d11 device context's copy operation although
simultaneous read access is allowed.
Use background thread to keep d3d12 resource and wait for d3d11 device's
copy operation instead.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7243>
null event NT handle to ID3D12Fence::SetEventOnCompletion()
will block the calling CPU thread already, thus it has no point that
creating an event NT handle in order to immediate wait for fence at CPU-side.
Note that passing a valid event NT handle to the fence API might be useful
when we need to wait for the fence value later (or timeout is required),
or want to wait for multiple fences at once via WaitForMultipleObjects().
But it's not a considered use case for now.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7176>
GstD3D12Window.priv.input_info is referenced by mouse event handler
in order to calculate corresponding original position
if scene is rotated/flipped by the videosink.
Fixing regression introduced by recent d3d12videosink refactoring
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7177>
If a d3d12 memory holds non-direct-queue fence but the fence was
created with D3D12_FENCE_FLAG_SHARED flag, use the fence instead of
waiting for fence at CPU side. Note that d3d12ipcsrc or
d3d12screencapture elements will hold such sharable fence.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7139>
Since a buffer resource will occupy at least 64KB,
allocating upload resource per decoding command might not be
an optimal approach. Instead, use sub-region of a upload resource
for multiple decoding command if sub-regions are not overlapped
each other.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7108>
Because DXGI flip mode swapchain will disallow GDI operation
to a HWND once swapchain is configured, videosink has been creating
child window of application's window. However, since window creation
would take a few milliseconds, it can cause performance issue such as
UI freezing. Adding a property so that videosink can attach
DXGI swapchain diretly to application's window in order to improve
performance.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7013>
A large refactoring commit for adding features and improve performance
* Reuse internal converter and overlay compositor:
Converter can be reused as long as input and display formats are not
changed. Also overlay compositor reconstruction is required only if
display format is changed
* Don't wait for full GPU flush on resize or close:
D3D12 swapchain requires GPU idle in order to resize backbuffer.
Thus CPU side waiting is required for swapchain related commands
to be finished. However, don't need to wait for full GPU flushing.
* Support multiple sink on a single external window
Keep installed subclass window procedure even if there's no associated
our internal HWND. This will make window procedure hooking less racy.
Then parent HWND's message will be transferred to our internal HWNDs
if needed.
* Adding support for window handle update
Application can change target HWND even when videosink is playing or
paused state. So, users can call gst_video_overlay_set_window_handle()
against d3d12videosink anytime. The videosink will be able to update
internal state and setup resource upon requested.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7013>
Observed Intel GPU driver crash when multiple decoders are
configured in a process. It might be because of frequent
command queue alloc/free or too many in-flight decoding commands.
In order to make command queue persistent and limit the number of
in-flight command lists, holds global decoding command queue.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7019>
Adding a property to control error reporting behavior when output
window is closed in playing or paused state. This can be useful
for apps where an app wants to close window even if it's playing
a stream, and the closed window is expected.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6939>
Conceptually identical to the present signal of d3d11videosink.
This signal will be emitted with current render target
(i.e., swapchain backbuffer) and command queue. Signal handler
can record GPU commands for an overlay image or to blend
an image to the render target.
In addition to d3d12 resources, videosink will send
d3d11 and d2d resources depending on "overlay-mode"
property, so that signal handler can render by using
preferred/required DirectX API.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6838>
The idea of using separate command queue per videosink was that
swapchain is bound to a command queue and we need to flush the
command queue when window size is changed. But the separate
queue does not seem to improve performance a lot.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6838>
Address below message reported by SDK debug layer.
ID3D12Device::CheckFeatureSupport: Unsupported Decode Profile Specified.
Use ID3D12VideoDevice::CheckFeatureSupport with D3D12_FEATURE_VIDEO_DECODE_PROFILES
to retrieve a list of supported profiles
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6839>
Fixing deadlock in below case
* GC lock is taken by background thread, and the background thread calls
gst_d3d12_ipc_client_release_imported_data() which takes ipc lock
* ipc lock is already taken in ipc thread and trying to pushing GC data
via gst_d3d12_command_queue_set_notify()
* gst_d3d12_command_queue_set_notify() is trying to take GC lock
but it's already taken by background thread
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6749>
Ideally, GPU waiting should be scheduled just before executing command list.
But handling the case outside of converter is a bit complicated.
Under an assumption that constructed command list will be executed
immediately, schedules GPU-side waiting inside of conversion method
to simplify the flow.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6749>