When the conversion is only caps feature from memory:VAMemory to system memory,
it's possible to optimize by doing a pseudo pass-through since the va-backed
buffers are the same for system memory buffers.
This change will also mitigates #2940
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6174>
If the allocation query received from downstream doesn't handle GstVideoMeta but
it requests memory:DMABuf caps feature, it's incomplete, so we rather reject the
negotiation.
Both in base decoder, base transform and compositor.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6155>
This is a simplification of the venerable
gst_va_base_dec_get_preferred_format_and_caps_features() function, which
predates since gstreamer-vaapi. It's used to select the format and the
capsfeature to use when setting the output state. It was complex and hard to
follow. This refactor simplifies a lot the algorithm.
The first thing to remove _downstream_has_video_meta() since, most of the time
it will be called before the caps negotiation, and allocation queries make sense
only after caps negotiation. It might work during renegotiation but, in that
case, caps feature change is uncommon. Better a simple and common approach.
Also, for performance, instead of dealing with caps features as strings, GQuarks
are used.
The refactor works like this:
1. If peer pad returns any caps, the returned caps feature is system memory and
looks for a proper format in the allowed caps.
2. The allowed caps are traversed at most 3 times: one per each valid caps
feature. First VAMemory, later DMABuf, and last system memory. The first to
match in allowed caps is picked, and the first format matching with the
chroma is picked too.
Notice that, right now, using playbin videoconvert never return any.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6154>
This reverts questionable commit 009bc15f33
which looks completely wrong.
The GstWasapi2RingBuffer:buffer_size variable is used to
calculate available buffer size we can write
(i.e., available size = buffer_size - padding_size).
But the commit makes the size to be exactly same as buffer period.
Then, it can confuse this element as if the endpoint buffer is full on
I/O event callback (if padding size is equal to buffer period)
but it's not true.
Fixes: https://gitlab.freedesktop.org/gstreamer/gstreamer/-/issues/2870
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6132>
The global semaphore was never closed/unlinked, causing permission
denied issue if the device is later used by another user. Properly
removing the semaphore when stopping the pipeline would still leave it
open in case of a crash.
With a GStreamer specific name, it was also not preventing other apps to access
the device concurrently.
Finally, if the system has multiple cards, the lock should be per card
and not global (to be confirmed).
Fixes: #3283.
Sponsored-by: Netflix Inc.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6117>
According to recommendation from MS, IDXGIOutputDuplication::ReleaseFrame()
needs to be called just before IDXGIOutputDuplication::AcquireNextFrame()
for performance reasons, so that driver can accumulate dirty rects
and update texture at once. But it seems to cause choppy output.
Do release acquired frame immediately once processing done,
like d3d11 implementation does.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6092>
Fence data could hold GstD3D12Device directly or indirectly.
Then if it's holding last refcount, the device object will
be released from the device object's internal thread,
and will try join self thread.
Delegates it to other global background thread to avoid
self thread joining.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6042>
The output of VP9 and AV1 encoder is a little different from the H264
and H265 encoder, it may contain repeat frames and so the output frame
number may be more than the input. We need to call finish_subframe()
when some frame will be repeated later. So we need to extend the
current prepare_output() virtual function.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/3015>
Explicitly calls gst_vtenc_pause_output_loop when going PAUSED->READY to make sure GST_PAD_STREAM_LOCK is not taken.
Before this change, a deadlock would occur if pipeline got stopped right after one output buffer was generated by vtenc.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5933>
Modify the fix_output_format in vpp to directly generate caps with
negotiated src caps, and we have the correct dma caps negotiation in
fix_output_format function. And thus, we can remove the redundant
negotiation of using function pad_accept_memory in vpp.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5845>
The pool currently defaults to performing a layout transition to
VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL, with some special exceptions for
video usages. This may not be a legal transition depending on the usage.
Provide an API to explicitly control the initial image layout.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5881>
When implementing NDK media support, it would be useful to also have JNI
implementation in the same binary as NDK media compatibility is lower.
As such, implement a rudimentary vtable system for gstamc-codec and
gstamc-format, and allow choosing the implementation at static_init()
time.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/4115>
This allows the implementations to do custom logic behind the hood. For
example, when NDK implementation is added, the entrypoint can chooses to
statically initialize the NDK implementations or the JNI one.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/4115>
With this patch, the caps is registered in the order of memory features
as: VAMemory, DMABuf then raw caps in linux path, and D3D11Memory then
raw caps in windows path. It helps to prioritize the video memory for all
msdk elements when doing negotiation.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5898>
Fixing below debug layer report
ID3D12Device::CreateCommittedResource: Ignoring InitialState D3D12_RESOURCE_STATE_COPY_DEST.
Buffers are effectively created in state D3D12_RESOURCE_STATE_COMMON.
Buffer resource will be automatically promoted to D3D12_RESOURCE_STATE_COPY_DEST
at the very first COPY operation time.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5895>
Since DXGI desktop duplication API does not work with Direct3D12 device,
this element will use Direct3D11 device to acquire frame.
Then other rendering operations (e.g., texture copy, render pipeline) will
happen using Direct3D12 API
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5883>
In case of tier 1 decoder, always use reference-only picture to avoid
fixed-size pool limitation and output decoded picture without
copy even for negative rate. Also do not use copy queue for GPU to GPU
copy. Copy queue is specialized for upload/download and may occupy
PCIE bandwidth. Use direct queue as recommended by vendors.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5877>
On flush event, baseclass will discard all pictures from DPB
but there can be still in-flight commands not finished yet.
Use our command queue, allocator and fence data helper objects
to keep resource available during command execution.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5877>
Shader visible descriptors occupy GPU resource and there are hardware
limits. Thus, in order to minimize the amount of shader visible heaps,
only non shader visible descriptor heap (staging) will be held by d3d12memory.
Then converter will copy the staging descriptor to shader visible
descriptor heap per draw.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5875>
Zero initialization would have overhead and it's not required
most cases except for textures. Use CREATE_NOT_ZEROED flag
in case of buffer resource or if a texture will be rendered without any
prior read operation.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5875>
Conversion will happen when constructed command list is executed,
not by converter element. Thus this object should not map output buffer
with write flag which will result in error if multiple threads
are building commands for the same output target frame.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5875>