Because DXGI flip mode swapchain will disallow GDI operation
to a HWND once swapchain is configured, videosink has been creating
child window of application's window. However, since window creation
would take a few milliseconds, it can cause performance issue such as
UI freezing. Adding a property so that videosink can attach
DXGI swapchain diretly to application's window in order to improve
performance.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7013>
A large refactoring commit for adding features and improve performance
* Reuse internal converter and overlay compositor:
Converter can be reused as long as input and display formats are not
changed. Also overlay compositor reconstruction is required only if
display format is changed
* Don't wait for full GPU flush on resize or close:
D3D12 swapchain requires GPU idle in order to resize backbuffer.
Thus CPU side waiting is required for swapchain related commands
to be finished. However, don't need to wait for full GPU flushing.
* Support multiple sink on a single external window
Keep installed subclass window procedure even if there's no associated
our internal HWND. This will make window procedure hooking less racy.
Then parent HWND's message will be transferred to our internal HWNDs
if needed.
* Adding support for window handle update
Application can change target HWND even when videosink is playing or
paused state. So, users can call gst_video_overlay_set_window_handle()
against d3d12videosink anytime. The videosink will be able to update
internal state and setup resource upon requested.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7013>
Observed Intel GPU driver crash when multiple decoders are
configured in a process. It might be because of frequent
command queue alloc/free or too many in-flight decoding commands.
In order to make command queue persistent and limit the number of
in-flight command lists, holds global decoding command queue.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7019>
This was already being used in handle_frame() for errors that happen when queueing a frame for decoding,
let's do the same when a frame is flagged with an error in the output callback.
From quick testing, this makes seeking more reliable (previously, it would sometimes cause a decoding error
and shut the whole decoder down due to GST_FLOW_ERROR).
Also manually sets the max error count to actually stop processing if too many errors occur.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6446>
ReferenceMissingErr is not critical and the simplest solution is to just ignore it. The frame has
the FrameDropped flag set when it occurs, so we can just drop it as usual.
BadDataErr is also not immediately critical, but in its case let's set the ERROR flag,
so the output loop can use GST_VIDEO_DECODER_ERROR to count and error out if it happens too many times.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6446>
Musl does not implement GNU basename and have fixed a bug where the
prototype was leaked into string.h [1], which resullts in compile errors
with GCC-14 and Clang-17+
| sys/uvcgadget/configfs.c:262:21: error: call to undeclared function 'basename'
ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
| 262 | const char *v = basename (globbuf.gl_pathv[i]);
| | ^
Use glib function instead makes it portable across musl and glibc on
linux
[1] https://git.musl-libc.org/cgit/musl/commit/?id=725e17ed6dff4d0cd22487bb64470881e86a92e7a
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7006>
It can be seen as a WA in the case of multi-channel transcoding (like
decoder output to two channels, one for encoder and one for vpp).
Normally, encoder sets min pts of a huge value to avoid negative dts,
while vpp set pts without this addtional huge value, which are likely to
cause input surface pts does not fit with encoder (since both encoder
and vpp accept the same buffer from decoder, means they modify the timestamp
of one mfx surface). So we add this huge value to vpp to ensure enc and
vpp set the same value to input mfx surface meanwhile does not break
encoder's setting min pts for dts protection.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6971>
Adding a property to control error reporting behavior when output
window is closed in playing or paused state. This can be useful
for apps where an app wants to close window even if it's playing
a stream, and the closed window is expected.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6939>
drm_fourcc.h should be picked up via the pkgconfig include, not the
system includedir directly.
Also consolidate the libdrm usage in va and msdk.
All this allows it to be picked up consistently (via the subproject,
for example).
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6932>
This patch fixes this critical warning when registering MSDK:
_dma_fmt_to_dma_drm_fmts: assertion 'fmt != GST_VIDEO_FORMAT_UNKNOWN' failed
It was because the HEVC string with possible output formats has an extra space
that could not be parsed correctly.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6853>
Adds a separate vtenc_h265a element (with a _hw variant as usual) for the HEVCWithAlpha codec type.
Decided to go with a separate element to not break existing uses of the normal HEVC encoder.
The preserve_alpha property is still only used for ProRes, no need for it here because we explicitly say we want alpha
when using the new element.
For now, the HEVCWithAlpha has an issue where it does not throttle the amount of input frames queued internally.
I added a quick workaround where encode_frame() will block until enqueue_frame() callback notifies it that some space
has been freed up in the internal queue. The limit was set to 5, which should be enough I guess? Hopefully this is not
too prone to race conditions.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6664>
Conceptually identical to the present signal of d3d11videosink.
This signal will be emitted with current render target
(i.e., swapchain backbuffer) and command queue. Signal handler
can record GPU commands for an overlay image or to blend
an image to the render target.
In addition to d3d12 resources, videosink will send
d3d11 and d2d resources depending on "overlay-mode"
property, so that signal handler can render by using
preferred/required DirectX API.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6838>
The idea of using separate command queue per videosink was that
swapchain is bound to a command queue and we need to flush the
command queue when window size is changed. But the separate
queue does not seem to improve performance a lot.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6838>
The prime_fds for multi planes may be the same. For example, on Intel's
platform, the NV12 surface may have the same FD for the plane0 and the
plane1. Then, the DRM_IOCTL_GEM_CLOSE will close the same handle twice
and get an "Invalid argument 22" error the second time.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6914>
The encoder can specify the a preferred_output_delay value to get better throughput
performance. The higher delay may get better HW performance, but it may increases
the encoder and pipeline latency.
When the output queue length is smaller than preferred_output_delay, the encoder
will not block to waiting for the encoding output. It will continue to prepare and
send more commands to GPU, which may improve the encoder throughput performance.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/4359>