Commit graph

167 commits

Author SHA1 Message Date
Seungha Yang
88efaa35a8 d3d12download: Do not overwrite fence of non-writable memory
A fence configured in GstD3D12Memory should be used only for
write access to be completed. And because d3d12 -> d3d11 copy path
is read access to d3d12 resource, we should not set fence to
memory. Otherwise another read access to the d3d12 resource
will wait for d3d11 device context's copy operation although
simultaneous read access is allowed.

Use background thread to keep d3d12 resource and wait for d3d11 device's
copy operation instead.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7243>
2024-07-27 03:23:22 +09:00
Seungha Yang
800961cf28 d3d12decoder: Add support for d3d11 output again
Although d3d12download supports d3d12 to d3d11 texture copy,
this feature might be useful if an application is not ready to d3d12
support and it expects output type of decodebin(3) is d3d11.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7208>
2024-07-22 23:17:25 +09:00
Seungha Yang
8c2bbd8760 meson: d3d12: Use configuration file
Move defines to config header

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7208>
2024-07-22 23:02:48 +09:00
Seungha Yang
72895ed0fa d3d12: Always allocate output texture using shared heap
... if downstream preference is unknown (e.g., no proposed
buffer pool by downstream), so that produced textures can be
shareable with other APIs such as d3d11 or vulkan, or other processes

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7193>
2024-07-18 14:17:43 +00:00
Seungha Yang
b37bfc02f5 d3d12: Remove unnecessary event handles
null event NT handle to ID3D12Fence::SetEventOnCompletion()
will block the calling CPU thread already, thus it has no point that
creating an event NT handle in order to immediate wait for fence at CPU-side.
Note that passing a valid event NT handle to the fence API might be useful
when we need to wait for the fence value later (or timeout is required),
or want to wait for multiple fences at once via WaitForMultipleObjects().
But it's not a considered use case for now.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7176>
2024-07-16 19:17:15 +00:00
Seungha Yang
783e1ea327 d3d12videosink: Fix mouse event handling
GstD3D12Window.priv.input_info is referenced by mouse event handler
in order to calculate corresponding original position
if scene is rotated/flipped by the videosink.
Fixing regression introduced by recent d3d12videosink refactoring

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7177>
2024-07-16 17:02:24 +00:00
Seungha Yang
dbc8a1397d d3d12compositor: Fix transparent background mode with YUV output
In case of YUV format without alpha channel, zero clear value
for each channle will result in green color. Use calculated black
background color with alpha=0 for transparent background mode instead.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7181>
2024-07-16 14:18:12 +00:00
Seungha Yang
2a6967a8cd d3d12videosink: Clear cached buffer on format change
Otherwise converter will try to read memory of which layout/format
might be different from configured converter pipeline

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7167>
2024-07-15 10:20:46 +00:00
Seungha Yang
e692bf0fb9 d3d12memorycopy: Enhance d3d12 to d3d11 copy
If a d3d12 memory holds non-direct-queue fence but the fence was
created with D3D12_FENCE_FLAG_SHARED flag, use the fence instead of
waiting for fence at CPU side. Note that d3d12ipcsrc or
d3d12screencapture elements will hold such sharable fence.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7139>
2024-07-08 09:57:04 +00:00
Seungha Yang
c0a02affa4 d3d12: Add support for resource copy between d3d11 and d3d12
If driver can support cross-api resource sharing, use device-to-device
resource copy in d3d12upload/download elements.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7119>
2024-07-01 20:18:42 +00:00
Seungha Yang
c013e88295 d3d12decoder: Use sub-allocated bitstream buffer
Since a buffer resource will occupy at least 64KB,
allocating upload resource per decoding command might not be
an optimal approach. Instead, use sub-region of a upload resource
for multiple decoding command if sub-regions are not overlapped
each other.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7108>
2024-06-29 20:47:39 +09:00
Seungha Yang
69ea20b739 d3d12videosink: Present on GstVideoOverlay::expose()
... so that updated backbuffer can be swapped and presented

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7079>
2024-06-24 08:46:24 +00:00
Seungha Yang
dd4d85272e d3d12converter: Fix Y410 conversion
Adding format conversion helper and use compute shader in case that
output format does not support RTV.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7088>
2024-06-23 10:53:18 +00:00
Seungha Yang
327fb9924c d3d12basefilter: Add adapter property
Allows initial GPU adapter selection

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7066>
2024-06-20 00:03:28 +09:00
Seungha Yang
2ec7d82bd7 d3d12: Move fence setter helper method to gst-libs
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7057>
2024-06-19 13:19:48 +00:00
Seungha Yang
fa7c4a2e39 d3d12converter: Update API signature
Always use device's main direct queue, and control gpu waiting
behavior by using boolean value

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7057>
2024-06-19 13:19:48 +00:00
Seungha Yang
f6eb3a01c0 d3d12memory: Hide fence value from header
Instead of exposing fence value to wait in header, user setter/getter
methods.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7057>
2024-06-19 13:19:47 +00:00
Seungha Yang
49c4247cb3 d3d12device: Add helper method for getting fence handle
Add get_fence_handle() method so that caller can get command queue's
dedicated fence handle from device

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7057>
2024-06-19 13:19:47 +00:00
Seungha Yang
289bc1d440 d3d12: Remove notify_com and notify_mini_object helper methods
Use private macros instead of exposing multiple APIs for the same thing

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7057>
2024-06-19 13:19:47 +00:00
Seungha Yang
6942868bda d3d12commandqueue: Update API name and arguments
Accepts multiple fences since single command list may have
multiple dependent resources which are associated with
different GPU engines

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7057>
2024-06-19 13:19:47 +00:00
Seungha Yang
713ea313f8 d3d12device: Use HRESULT return code if possible
Make function signature consistent with that of command queue

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7057>
2024-06-19 13:19:47 +00:00
Seungha Yang
339c81bac8 d3d12: Promote decoder and videosink rank to primary
It's proven that d3d12 performs better than d3d11 while
consumes less resources in various cases.
Assign primary+ rank to decoder and videosink in case of Windows10/11,
so that it can be tested widely

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7045>
2024-06-18 14:40:37 +00:00
Seungha Yang
e5d2dd8e8d d3d12videosink: Add direct-swapchain property
Because DXGI flip mode swapchain will disallow GDI operation
to a HWND once swapchain is configured, videosink has been creating
child window of application's window. However, since window creation
would take a few milliseconds, it can cause performance issue such as
UI freezing. Adding a property so that videosink can attach
DXGI swapchain diretly to application's window in order to improve
performance.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7013>
2024-06-17 16:05:00 +00:00
Seungha Yang
dd1d2f5ff7 d3d12videosink: Add external-window-only property
Adding a new property in order to avoid unintended interanl window
creation.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7013>
2024-06-17 16:05:00 +00:00
Seungha Yang
37e1847464 d3d12videosink: Add support for window handle update
A large refactoring commit for adding features and improve performance

* Reuse internal converter and overlay compositor:
  Converter can be reused as long as input and display formats are not
  changed. Also overlay compositor reconstruction is required only if
  display format is changed

* Don't wait for full GPU flush on resize or close:
  D3D12 swapchain requires GPU idle in order to resize backbuffer.
  Thus CPU side waiting is required for swapchain related commands
  to be finished. However, don't need to wait for full GPU flushing.

* Support multiple sink on a single external window
  Keep installed subclass window procedure even if there's no associated
  our internal HWND. This will make window procedure hooking less racy.
  Then parent HWND's message will be transferred to our internal HWNDs
  if needed.

* Adding support for window handle update
  Application can change target HWND even when videosink is playing or
  paused state. So, users can call gst_video_overlay_set_window_handle()
  against d3d12videosink anytime. The videosink will be able to update
  internal state and setup resource upon requested.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7013>
2024-06-17 16:05:00 +00:00
Seungha Yang
c00c36e33b d3d12overlaycompositor: Remove unused parameter
Don't need to check fence value of overlay buffer since
window uses global direct command queue

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7013>
2024-06-17 16:05:00 +00:00
Seungha Yang
f91be03550 d3d12videosink: Calculate display resolution only per caps change
Don't need to calculate it per window property update

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7013>
2024-06-17 16:05:00 +00:00
Seungha Yang
a2df44da7d d3d12: Workaround for Intel iGPU decoder crash
Observed Intel GPU driver crash when multiple decoders are
configured in a process. It might be because of frequent
command queue alloc/free or too many in-flight decoding commands.
In order to make command queue persistent and limit the number of
in-flight command lists, holds global decoding command queue.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7019>
2024-06-17 15:15:07 +00:00
Seungha Yang
20852ba028 d3d12videosink: Disconnect window signal handler on dispose as intended
Fixing typo

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7012>
2024-06-10 23:33:26 +09:00
Seungha Yang
20db3369d3 d3d12videosink: Add error-on-closed property
Adding a property to control error reporting behavior when output
window is closed in playing or paused state. This can be useful
for apps where an app wants to close window even if it's playing
a stream, and the closed window is expected.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6939>
2024-06-09 16:48:41 +00:00
Seungha Yang
96cf3d7063 d3d12encoder: Do not print error log for not-supported feature
gst_d3d12_result() will print message with ERROR level if failed.
Use FAILED/SUCCEEDED macros instead, since not-supported feature
is not a critical error

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6955>
2024-05-29 10:45:13 +00:00
Seungha Yang
84aecab150 d3d12videosink: Add overlay signal to support d3d12/d3d11/d2d overlay
Conceptually identical to the present signal of d3d11videosink.
This signal will be emitted with current render target
(i.e., swapchain backbuffer) and command queue. Signal handler
can record GPU commands for an overlay image or to blend
an image to the render target.

In addition to d3d12 resources, videosink will send
d3d11 and d2d resources depending on "overlay-mode"
property, so that signal handler can render by using
preferred/required DirectX API.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6838>
2024-05-24 15:55:17 +00:00
Seungha Yang
4d779d7de8 d3d12videosink: Use device's main direct queue
The idea of using separate command queue per videosink was that
swapchain is bound to a command queue and we need to flush the
command queue when window size is changed. But the separate
queue does not seem to improve performance a lot.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6838>
2024-05-24 15:55:17 +00:00
Seungha Yang
9d23c26027 d3d12decoder: Fix SDK debug layer warning
Address below message reported by SDK debug layer.

ID3D12Device::CheckFeatureSupport: Unsupported Decode Profile Specified.
Use ID3D12VideoDevice::CheckFeatureSupport with D3D12_FEATURE_VIDEO_DECODE_PROFILES
to retrieve a list of supported profiles

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6839>
2024-05-13 19:42:52 +00:00
Sebastian Dröge
0ef396359c gst: Move GstQueueArray as GstVecDeque to core
And change lengths and indices from guint to gsize for a more correct type.

Also deprecate GstQueueArray and implement it in terms of GstVecDeque.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6779>
2024-05-06 18:25:42 +00:00
Seungha Yang
cb20a371c2 d3d12decoder: Fix d3d12 resource copy
It was copying to self resource

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6753>
2024-04-28 15:31:19 +00:00
Seungha Yang
46131f0cb0 d3d12ipcclient: Fix deadlock when copying texture
Fixing deadlock in below case
* GC lock is taken by background thread, and the background thread calls
  gst_d3d12_ipc_client_release_imported_data() which takes ipc lock
* ipc lock is already taken in ipc thread and trying to pushing GC data
  via gst_d3d12_command_queue_set_notify()
* gst_d3d12_command_queue_set_notify() is trying to take GC lock
  but it's already taken by background thread

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6749>
2024-04-28 12:49:07 +00:00
Seungha Yang
19932cf178 d3d12ipcsink: Handle external fence
Waits external fence before sending frame to peer.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6749>
2024-04-28 12:49:07 +00:00
Seungha Yang
b7844ef307 d3d12decoder: Remove CPU-side waiting
Sets decoder command queue's fence to memory instead of waiting
from decoder's output thread. CPU-side waiting will happen
only if download is required.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6749>
2024-04-28 12:49:07 +00:00
Seungha Yang
a05961ab7b d3d12screencapturesrc: Fix output to non-d3d12 element
Configures upload/download flags to memory after write

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6749>
2024-04-28 12:49:07 +00:00
Seungha Yang
e29655e9ca d3d12screencapturesrc: Release and flush d3d11 objects before d3d12
Fixing device-removed error when closing pipeline

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6749>
2024-04-28 12:49:07 +00:00
Seungha Yang
2c203e0d40 d3d12encoder: Handle external fence explicitly
Waits for external fence if any

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6749>
2024-04-28 12:49:07 +00:00
Seungha Yang
2a14793ee1 d3d12converter: Add support for GPU-side external fence waiting
Ideally, GPU waiting should be scheduled just before executing command list.
But handling the case outside of converter is a bit complicated.
Under an assumption that constructed command list will be executed
immediately, schedules GPU-side waiting inside of conversion method
to simplify the flow.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6749>
2024-04-28 12:49:07 +00:00
Seungha Yang
478e49dd73 d3d12: Update copy_texture_region() method
Pass external fence value if any and allow passing fence
data so that dependent resources can be released
once copy is done

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6749>
2024-04-28 12:49:07 +00:00
Seungha Yang
4ac46ce82b d3d12screencapturesrc: Performance improvement
Process captured frame using d3d11 instead of d3d12, and use shared
fence when copying processed d3d11 texture to d3d12 resource.
In this way, capture CPU thread does not need to wait for fence signal.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6733>
2024-04-25 22:51:01 +00:00
Seungha Yang
cecb0f2148 d3d12decoder: Lock DPB while building command
Since DPB resource can be modified in output thread, protect
it when building command list.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6709>
2024-04-23 10:08:19 +00:00
Seungha Yang
27c02a0b80 d3d12decoder: Hold reference pictures in fence data
Keep reference pictures alive during executing decoding commands

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6709>
2024-04-23 10:08:19 +00:00
Seungha Yang
0f5f170a40 d3d12vp9dec: Disallow resolution change to larger size on non-keyframe
Intel GPU seems to be crashing if the case happens.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6709>
2024-04-23 10:08:18 +00:00
Seungha Yang
700c00eda3 d3d12decoder: Fix potential use after free
A DPB buffer held by codec picture object may not be writable
at the moment, then gst_buffer_make_writable() will unref passed buffer.

Specifically, the use after free or double free can happen if:
* Crop meta of buffer copy is required because of non-zero
  top-left crop position
* zero-copy is possible with crop meta
* A picture was duplicated, interlaced h264 stream for example

Interlaced h264 stream with non-zero top-left crop position
is not very common but it's possible configuration in theory.

Thus gst_buffer_make_writable() should be called with
GstVideoCodecFrame.output_buffer directly.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6706>
2024-04-22 13:28:06 +00:00
Seungha Yang
5179cbccfa d3d12testsrc: Use shared 11on12 device
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6697>
2024-04-20 04:16:48 +09:00