Major changes:
* GstD3D11Allocator: This allocator is now a device-independent object
which can allocate GstD3D11Memory objects for any GstD3D11Device.
Users can get this object via gst_allocator_find(GST_D3D11_MEMORY_NAME).
* GstD3D11PoolAllocator: A new allocator implementation for texture pools.
From now on GstD3D11BufferPool will make use of this memory pool allocator
to avoid frequent texture reallocation, which usually happens because
of buffer copies (gst_buffer_make_writable() for example).
In addition, GstD3D11BufferPool will provide GstBuffer with
GstVideoMeta, because CPU access to a GstD3D11Memory without GstVideoMeta
is almost impossible since GPU drivers need padding for stride alignment.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/2097>
We've been retrying with a 1ms sleep whenever DecoderBeginFrame()
returned E_PENDING, which means the application should call
DecoderBeginFrame() again because the GPU is busy.
In practice the 1ms sleep() during retry usually results in a delay of
about 15ms because of poor clock precision on Windows.
To improve throughput, this commit enables the
high precision clock only for NVIDIA platforms, since
DecoderBeginFrame() calls on other GPU vendors seem to
succeed without retry.
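The retry described above can be sketched in plain C as follows. This is only an illustration: the Status enum, the callback types and FakeDecoder are hypothetical stand-ins for the real DXVA call and its HRESULTs, and the sleep function is injected so the timer behavior stays outside the loop.

```c
/* Hypothetical status codes standing in for the HRESULTs named above. */
typedef enum { STATUS_OK, STATUS_PENDING, STATUS_ERROR } Status;

typedef Status (*BeginFrameFunc) (void *user_data);
typedef void (*SleepFunc) (unsigned int milliseconds);

/* Calls begin_frame() until it stops reporting STATUS_PENDING or
 * max_retries retries have been used. sleep_ms() is called once per
 * retry; how long it really sleeps depends on the platform timer. */
static Status
begin_frame_with_retry (BeginFrameFunc begin_frame, void *user_data,
    SleepFunc sleep_ms, unsigned int max_retries)
{
  Status status = begin_frame (user_data);
  unsigned int retries = 0;

  while (status == STATUS_PENDING && retries < max_retries) {
    sleep_ms (1);
    retries++;
    status = begin_frame (user_data);
  }

  return status;
}

/* Example stand-in: reports "pending" a fixed number of times. */
typedef struct
{
  unsigned int pending_left;
  unsigned int calls;
} FakeDecoder;

static Status
fake_begin_frame (void *user_data)
{
  FakeDecoder *dec = (FakeDecoder *) user_data;

  dec->calls++;
  if (dec->pending_left > 0) {
    dec->pending_left--;
    return STATUS_PENDING;
  }
  return STATUS_OK;
}

/* Example stand-in: does not sleep at all. */
static void
no_sleep (unsigned int milliseconds)
{
  (void) milliseconds;
}
```

With a coarse timer each sleep_ms(1) may block far longer than 1ms, which is why the timer precision matters more than the retry count.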
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/2099>
After the VA filter is created, when changing the element's state from NULL
to READY, immediately check for any filter operation requested by the user.
If there is one, passthrough mode is disabled early, so there's no need for a
future renegotiation.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/2094>
When we transform the caps from sink to src, or vice versa, the
"caps" passed to us may contain only part of the features. This
makes our vpp lose some features in the caps and causes a negotiation error.
The correct way is to:
Clear the format and resolution of those caps, add all VA and
DMA features to them to make full-feature caps, and then clip them
with the pad template.
fixes: #1551
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/2081>
For reverse playback, we always copy the decoded
frame to a downstream buffer, so the pool size can be,
and needs to be, large enough.
For forward playback, however, we need to restrict
the max pool size for performance reasons. Otherwise the decoder
will keep copying decoded textures to the downstream buffer pool
if decoding is faster than downstream throughput
and there are queue elements between them.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/2083>
The decoder might copy a decoded texture to another buffer pool
during playback, depending on context. In that case, the copy
has no D3D11_BIND_DECODER bind flag.
If we previously used ID3D11VideoProcessor for decoder textures,
and the incoming texture supports ID3D11VideoProcessor as well even though it has no
D3D11_BIND_DECODER flag (having D3D11_BIND_RENDER_TARGET for example),
allow zero-copy instead of using our fallback texture.
Frequently switching the conversion tool (between ID3D11VideoProcessor and generic shader)
might result in inconsistent image quality.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/2084>
... instead of QueryInterface-ing per element. Note that
ID3D11VideoDevice and ID3D11VideoContext objects might not be available
if the device doesn't support the video interface.
So the GstD3D11Device object will create those objects only when requested.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/2079>
Direct3D11 objects are COM, and most COM C APIs are verbose
(C++ is a little better). So, by using C++ APIs, we can make code
shorter and more readable.
Moreover, the "ComPtr" helper class (which is C++ only) can be
used, which is very helpful for avoiding error-prone COM refcounting
issues and leaks.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/2077>
Added the helper function _update_passthrough(), which defines and sets
the pass-through mode of the filter. It either reconfigures both
pads, just marks the src pad for renegotiation, or does nothing at
all.
There are cases where both pads have to be reconfigured (direction
changed, for example), others where only the src pad has to be (filters
updated), and others where neither does (changing to READY state).
The requirement of renegotiation depends on the need to enable/disable
its VA buffer pools.
This patch sets pass-through mode by default, so the buffer pools
aren't allocated if no filtering/direction operations are defined,
which is the correct behavior.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/2074>
... instead of the largest we have ever seen.
Note that the d3d11h264dec element holds the previously configured DPB size
to decide later whether the decoder object must be re-opened.
This fixes the case below:
1) Initial SPS, required DPB size is 6
- decoder object is opened with DPB size 6
- max_dpb_size is now 6
2) SPS update with resolution change, required DPB size is 1
- decoder object is re-opened with DPB size 1
- max_dpb_size should be updated to 1, but it didn't happen (BUG)
3) SPS update without resolution change, only required DPB size is updated to 6
- decoder object should be re-opened but didn't happen
because we didn't update max_dpb_size at 2).
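The three-step scenario above can be replayed with a small C sketch. DecoderState and handle_sps() are hypothetical names for illustration, not the element's actual code; the point is only that the stored size follows the most recent configuration.

```c
#include <stdbool.h>

/* Hypothetical decoder state; the field names are illustrative only. */
typedef struct
{
  int width;
  int height;
  int configured_dpb_size;      /* size used for the last (re-)open */
  int open_count;               /* how many times the decoder was opened */
} DecoderState;

/* Returns true if the decoder object was (re-)opened for this SPS. */
static bool
handle_sps (DecoderState * dec, int width, int height, int required_dpb_size)
{
  bool resolution_changed = dec->width != width || dec->height != height;
  bool dpb_grew = required_dpb_size > dec->configured_dpb_size;

  if (!resolution_changed && !dpb_grew)
    return false;

  dec->width = width;
  dec->height = height;
  /* Track the size of the most recent configuration,
   * not the largest one ever seen. */
  dec->configured_dpb_size = required_dpb_size;
  dec->open_count++;

  return true;
}
```

If configured_dpb_size were only ever raised, step 3 would compare 6 against the stale value 6 and skip the re-open even though the decoder currently holds a single texture.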
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/2056>
The new H.264 uAPI requires all drivers to treat the scaling
matrix as optional: it only needs to be passed when a non-flat
scaling matrix is provided in the bitstream headers.
Take advantage of this and avoid passing the scaling
matrix if not needed.
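A minimal sketch of such a check, assuming that "provided in the bitstream headers" can be detected from the SPS and PPS scaling-matrix-present flags. HeaderFlags and needs_scaling_matrix() are hypothetical names, not the plugin's API.

```c
#include <stdbool.h>

/* Hypothetical subset of the parsed header flags. */
typedef struct
{
  bool sps_scaling_matrix_present;
  bool pps_scaling_matrix_present;
} HeaderFlags;

/* Returns true when a scaling matrix was signalled in either header,
 * i.e. when the matrix control would have to be passed to the driver. */
static bool
needs_scaling_matrix (const HeaderFlags * flags)
{
  return flags->sps_scaling_matrix_present
      || flags->pps_scaling_matrix_present;
}
```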
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/1624>
Frame-based decoding mode doesn't require SLICE_PARAMS and
PRED_WEIGHTS controls.
Moreover, if the driver doesn't support these two controls, trying
to set them will fail. Fix this by only setting these on
slice-based decoding mode.
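The mode-dependent selection could look roughly like this. The enums and collect_controls() are hypothetical, and CONTROL_FRAME_LEVEL is a placeholder for whatever controls are set in both modes.

```c
#include <stddef.h>

typedef enum
{
  DECODE_MODE_FRAME_BASED,
  DECODE_MODE_SLICE_BASED
} DecodeMode;

/* Hypothetical control identifiers, for illustration only. */
typedef enum
{
  CONTROL_FRAME_LEVEL,          /* placeholder for controls set in both modes */
  CONTROL_SLICE_PARAMS,
  CONTROL_PRED_WEIGHTS
} ControlId;

/* Fills out[] with the controls to set for the given mode and returns how
 * many were written. The two slice-level controls are only added in
 * slice-based mode. */
static size_t
collect_controls (DecodeMode mode, ControlId * out, size_t capacity)
{
  size_t n = 0;

  if (n < capacity)
    out[n++] = CONTROL_FRAME_LEVEL;

  if (mode == DECODE_MODE_SLICE_BASED) {
    if (n < capacity)
      out[n++] = CONTROL_SLICE_PARAMS;
    if (n < capacity)
      out[n++] = CONTROL_PRED_WEIGHTS;
  }

  return n;
}
```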
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/1624>
To convert a decoded texture into another format, downstream would use
the video processor instead of a shader. In order for downstream to
be able to use the video processor even if we copied the decoded texture
into the downstream pool, we should set this bind flag. Otherwise,
downstream would keep switching between video processor and shader
to convert the format, which would result in inconsistent image quality.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/2051>
The AV1 codec needs to support the film grain feature. When the film
grain feature is enabled, we need two surfaces as the output of the
decoded picture, one without film grain effect and the other one with
it. The first one acts as the reference and is needed for later pictures'
reconstruction, and the second one is the real display output.
So we need to attach another aux surface to the gst buffer/mem and make
that aux surface the target of vaBeginPicture.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/1636>
AS-IS:
D3D11Convert class is baseclass of D3D11ColorConvert and D3D11Scale
* GstD3D11Convert
|_ GstD3D11ColorConvert
|_ GstD3D11Scale
TO-BE:
Introducing a new base class for color conversion and/or rescale elements
* GstD3D11BaseConvert
|_ GstD3D11Convert
|_ GstD3D11ColorConvert
|_ GstD3D11Scale
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/2029>
In 6ae24948 the destruction of the pipeline buffer was removed on the assumption
that it wasn't required. Nonetheless, while debugging the code it looked like a
buffer leak in the iHD driver, since the ID of the buffer kept increasing.
The difference now is that the filter buffers are destroyed first
and the pipeline buffer later.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/2023>
Just like the decoder, vapostproc also needs to copy the output
buffer to a raw buffer if downstream elements only support raw caps
and do not support the video meta.
A pipeline like:
gst-launch-1.0 filesrc location=xxxx ! h264parse ! vah264dec ! \
vapostproc ! capsfilter caps=video/x-raw,width=55,height=128 ! \
filesink location=xxx
needs this logic to dump the data correctly.
fixes: #1523
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/2026>
Both the wasapi2 and wasapi plugins use the WASAPI API, so "device.api=wasapi"
would make sense for the wasapi2 plugin as well. But people would be
confused by an identical "device.api=wasapi" property if the intended
plugin is wasapi, not wasapi2. This change makes them distinguishable
via the "device.api" device property.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/2024>
One problem the va dmabuf allocator had was preparing a buffer from
dmabuf memories in the allocator pool, especially when a buffer is composed of
several memories. These memories have to be of a certain number and in a certain
order.
This patch stores the number of memories and their addresses, in order, when a
dmabuf-based buffer is created; when preparing a buffer, it is reconstructed
with this info.
Finally, instead of pushing the memories as soon as they are unrefed, they are
held until GstVaBufferSurface's ref_mems_count reaches zero (all the memories
related to that buffer/surface are unrefed). Once that happens, all the
memories are pushed back into the queue, locked, ensuring that all the memories
related to a single buffer (with the same surface) remain contiguous, so the
buffer reconstruction is assured.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/2013>
Instead of removing memories from buffers at reset_buffer()/release_buffer() the
bufferpool operation is kept as originally designed, still the allocator pool is
used too. Thus, this patch restores the buffer size configuration while removing
release_buffer(), reset_buffer() and acquire_buffer() vmethods overloads.
Then, when the bufferpool base class decides to discard a buffer, the VA
surface-based memory is returned to the allocator pool when its last reference
is freed, and later reused if a new buffer is allocated again.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/2013>
Add a new element d3d11deinterlace to support deinterlacing.
Similar to d3d11videosink and d3d11compositor, this element is
a wrapper bin around a set of child elements, including helper
conversion elements (upload/download and color convert),
so that it can be used between non-d3d11 elements.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/2016>
Since our decoder DPB texture pool cannot be grown once it's
configured, we should pre-allocate sufficient number of textures
for zero-copy playback (but not too many).
The "min buffers" allocation query parameter can be a hint for
the number of required textures in addition to DPB size.
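As a rough sketch of that sizing rule: the function below adds the "min buffers" hint to the DPB size and clamps the result. The max_textures cap is an assumption introduced here to model "but not too many"; the commit does not state an actual limit.

```c
/* Number of textures to pre-allocate: the DPB itself plus the downstream
 * "min buffers" hint, clamped to a caller-supplied upper bound. */
static unsigned int
compute_texture_count (unsigned int dpb_size, unsigned int min_buffers,
    unsigned int max_textures)
{
  unsigned int wanted = dpb_size + min_buffers;

  return wanted > max_textures ? max_textures : wanted;
}
```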
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/2017>
There were two problems with frame copy:
1. The input video info came from the color format, not from the allocated VA
surface. The sink video info needs to be updated according to the
allocator's data.
2. The parameters of `gst_video_frame_copy()` were backwards.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/2007>
transform_size() basetransform vmethod is used when there's no output buffer
pool and allocates a system memory buffer. With VA this cannot be allowed, since
it needs VASurfaces to process.
Thus transform_size() is not required, but to play it safe let's return FALSE.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/2007>
Unlike other stateless decoder implementations (e.g., VA),
our DPB pool cannot be grown since we are using
texture array (pre-allocated, fixed-size d3d11 texture pool).
So, if there's no more available texture to use,
there's no way other than copying it to downstream's
d3d11 buffer pool. Otherwise deadlock will happen.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/2003>
This adds a non-thread-safe refcount to GstV4l2Request. This will
allow holding on to more than one request in order to implement render
delay. It is made non-thread-safe for speed, as we know this will all
happen on the same streaming thread.
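A non-atomic refcount of this kind can be sketched as below. The Request struct and its functions are hypothetical and much simpler than GstV4l2Request. The counter is a plain int, which is only safe when every ref and unref happens on the same thread.

```c
#include <stdlib.h>

/* Hypothetical request object with a plain (non-atomic) reference count. */
typedef struct
{
  int ref_count;
  int id;
} Request;

static Request *
request_new (int id)
{
  Request *req = calloc (1, sizeof (Request));

  if (req == NULL)
    return NULL;

  req->ref_count = 1;
  req->id = id;
  return req;
}

/* No atomics and no locking: single-thread use only. */
static Request *
request_ref (Request * req)
{
  req->ref_count++;
  return req;
}

/* Returns 1 when the last reference was dropped and the request freed. */
static int
request_unref (Request * req)
{
  if (--req->ref_count > 0)
    return 0;

  free (req);
  return 1;
}
```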
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/1881>
Starting from this patch, all queue and dequeue operations
on V4L2 are abstracted with the request. Buffers are dequeued
automatically when pending requests are marked done, and only one in-flight
request is now used.
Along with fixing issues with requests not being reused with slice
decoders, this change reduces the memory footprint by allocating only
two bitstream buffers.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/1881>
* Don't warn for live objects, since ID3D11Debug itself seems to be
holding a refcount of ID3D11Device at the moment we call
ID3D11Debug::ReportLiveDeviceObjects(). It would always report a live
object.
* The device might not be able to support some formats (e.g., P010),
especially in the case of a WARP device. We don't need to warn about that.
* gst_d3d11_device_new() can be used for device enumeration, so don't warn
even if we cannot create a D3D11 device with the given adapter index.
* Don't warn for HLSL compiler warnings. They are just noise and
should not be critical at all.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/1986>
Add a way to support drawing on the application's texture instead of
the usual window handle.
To make use of this new feature, the application should follow the steps below.
1) Enable this feature by using the "draw-on-shared-texture" property
2) Watch the "begin-draw" signal
3) In the "begin-draw" signal handler, the application can request drawing
by using the "draw" signal action. Note that the "draw" signal action
should happen before the "begin-draw" signal handler returns
NOTE 1) For texture sharing, creating a texture with the
D3D11_RESOURCE_MISC_SHARED_KEYEDMUTEX flag is strongly recommended
if possible, because we cannot ensure synchronization of a texture
which was created with D3D11_RESOURCE_MISC_SHARED,
and it would cause glitches with the ID3D11VideoProcessor use case.
NOTE 2) Direct3D9Ex doesn't support sharing a texture which was
created with D3D11_RESOURCE_MISC_SHARED_KEYEDMUTEX. In other words,
D3D11_RESOURCE_MISC_SHARED is the only option for Direct3D11/Direct3D9Ex interop.
NOTE 3) Because of missing synchronization around ID3D11VideoProcessor,
if the shared texture was created with D3D11_RESOURCE_MISC_SHARED,
d3d11videosink might use a fallback texture to convert the DXVA texture
to a normal Direct3D texture. The converted texture will then be
copied to the user-provided shared texture.
* Why not use generic appsink approach?
In order for application to be able to store video data
which was produced by GStreamer in application's own texture,
there would be two possible approaches,
one is copying our texture into application's own texture,
and the other is drawing on application's own texture directly.
The former (appsink way) cannot be a zero-copy by nature.
In order to support zero-copy processing, we need to draw on
application's own texture directly.
For example, assume that application wants RGBA texture.
Then we can imagine following case.
"d3d11h264dec ! d3d11convert ! video/x-raw(memory:D3D11Memory),format=RGBA ! appsink"
^
|_ allocate new Direct3D texture for RGBA format
In the above case, d3d11convert will allocate new texture(s) for RGBA format
and then the application will copy our RGBA texture again into the
application's own texture. So one texture allocation plus a per-frame GPU copy will happen
in that case.
Moreover, in order for application to be able to access
our texture, we need to allocate texture with additional flags for
application's Direct3D11 device to be able to read texture data.
That would be another implementation burden on our side.
But with this MR, we can configure the pipeline this way:
"d3d11h264dec ! d3d11videosink".
That way, we can save at least one texture allocation and
a per-frame texture copy, since d3d11videosink will convert the incoming texture
into the application's texture format directly without a copy.
* What if we expose texture without conversion and application does
conversion by itself?
As mentioned above, for application to be able to access our texture
from application's Direct3D11 device, we need to allocate texture
in a special form. But in some cases, that might not be possible.
Also, if a texture belongs to the decoder DPB, exposing such a texture
to the application is unsafe, and a usual Direct3D11 shader cannot handle
such a texture. To convert the format, the ID3D11VideoProcessor API needs to
be used, but that would be an implementation burden for the application.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/1873>
gstd3d11window_corewindow.cpp(408): warning C4189:
'storage': local variable is initialized but not referenced
gstd3d11window_corewindow.cpp(490): warning C4189:
'self': local variable is initialized but not referenced
gstd3d11window_swapchainpanel.cpp(481): warning C4189:
'self': local variable is initialized but not referenced
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/1962>
Some GPUs (especially NVIDIA) complain that the GPU is still busy
even after we retried 50 times with a 1ms sleep per failure.
Because DXVA/D3D11 doesn't provide an API for a "GPU-IS-READY-TO-DECODE"
like signal, there still seems to be no better solution than sleeping.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/1913>
The vabasedec's display and decoder are created/destroyed between
the gst_va_base_dec_open/close pair. All the data and event handling
functions run between this pair, so access to these pointers
is safe there. But the query function can be called at any time. So we need to:
1. Make operations on these pointers in open/close and query atomic.
2. Hold an extra ref during the query function to avoid them being destroyed.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/1957>
Initial support for d3d11 texture so that encoder can copy
upstream d3d11 texture into encoder's own texture pool without
downloading memory.
This implementation requires MFTEnum2() API for creating
MFT (Media Foundation Transform) object for specific GPU but
the API is Windows 10 desktop only. So UWP is not target
of this change.
See also https://docs.microsoft.com/en-us/windows/win32/api/mfapi/nf-mfapi-mftenum2
Note that, for MF plugin to be able to support old OS versions
without breakage, this commit will load MFTEnum2() symbol
by using g_module_open()
Summary of required system environment:
- Needs Windows 10 (probably at least RS 1 update)
- GPU should support ExtendedNV12SharedTextureSupported feature
- Desktop application only (UWP is not supported yet)
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/1903>
Move d3d11 device, memory, buffer pool and minimal methods
to gst-libs so that other plugins can access d3d11 resources.
Since Direct3D is the primary graphics API on Windows, we need
this infrastructure so that various plugins can share GPU resources
without downloading GPU memory.
Note that this implementation is public only for -bad scope
for now.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/464>
Add the helper function _get_surface_id() which extracts the
VASurfaceID from the passed picture. This function gets the surface of
the next and previous reference picture.
Instead of if-statements, this refactor uses a switch-statement with a
fall-through for P-type pictures, making the code a bit more readable.
It also adds quirks for the gallium driver, which cannot handle invalid
surfaces as forward or backward references, so the function fails.
iHD cannot handle them either, but to avoid failing, the current picture
is used as self-reference.
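The switch with fall-through and the two quirk behaviors can be sketched like this. All type and function names are hypothetical. INVALID_SURFACE mirrors the value of VA_INVALID_ID, and the self_reference parameter is an assumption that models the per-driver quirk as a simple flag.

```c
#include <stdbool.h>
#include <stdint.h>

#define INVALID_SURFACE 0xffffffffu     /* same value as VA_INVALID_ID */

typedef enum
{
  PICTURE_TYPE_I,
  PICTURE_TYPE_P,
  PICTURE_TYPE_B
} PictureType;

/* Hypothetical picture description. */
typedef struct
{
  PictureType type;
  uint32_t current;             /* surface being decoded */
  uint32_t prev_ref;            /* INVALID_SURFACE when missing */
  uint32_t next_ref;            /* INVALID_SURFACE when missing */
} Picture;

typedef struct
{
  uint32_t forward;
  uint32_t backward;
} References;

/* Replaces a missing reference by the current surface when self_reference
 * is set; otherwise reports failure. */
static bool
resolve_surface (uint32_t candidate, uint32_t current, bool self_reference,
    uint32_t * out)
{
  if (candidate == INVALID_SURFACE) {
    if (!self_reference)
      return false;
    candidate = current;
  }
  *out = candidate;
  return true;
}

static bool
fill_references (const Picture * pic, bool self_reference, References * refs)
{
  refs->forward = INVALID_SURFACE;
  refs->backward = INVALID_SURFACE;

  switch (pic->type) {
    case PICTURE_TYPE_B:
      if (!resolve_surface (pic->next_ref, pic->current, self_reference,
              &refs->backward))
        return false;
      /* fall through: B pictures also need the forward reference */
    case PICTURE_TYPE_P:
      if (!resolve_surface (pic->prev_ref, pic->current, self_reference,
              &refs->forward))
        return false;
      break;
    case PICTURE_TYPE_I:
      break;
  }

  return true;
}
```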
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/1939>
When reference frames are missing, we should not just discard the current
frame. Some streams have a group of pictures header. It is an optional header
that can be used immediately before a coded I-frame to indicate to the decoder
whether the first consecutive B-pictures immediately following the coded I-frame can
be reconstructed properly in the case of a random access.
In that case, the B frames may miss the previous reference and can still be
correctly decoded. We also notice that the second field of the I frame may
be set to P type, and it only references its first field.
We should not skip all those frames, and even if the frame really misses the
reference frame, some method such as inserting a grey picture should be used
to handle these cases.
The driver crashes when it needs to access the reference picture while we set
forward_reference_picture or backward_reference_picture to VA_INVALID_ID. We
now set it to the current picture to avoid this. This is just a temporary workaround.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/1929>
The behavior for a zero AVEncMPVGOPSize value varies
depending on the GPU vendor implementation, and some
GPUs will produce a keyframe only once at the beginning of encoding.
That is unlikely to be the result users expect.
To make this property behave consistently among various GPUs,
this commit changes the default value of the "gop-size" property to -1,
which means "auto". When "gop-size" is unspecified,
mfvideoenc will calculate the GOP size based on framerate,
like our x264enc implementation does.
See also
https://docs.microsoft.com/en-us/windows/win32/directshow/avencmpvgopsize-property
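An "auto" GOP size derived from the framerate might be computed as in the sketch below. The 10-second window and the fallback of 250 frames are illustrative assumptions, not values confirmed by this commit, and resolve_gop_size() is a hypothetical helper.

```c
/* gop_size < 0 means "auto". ASSUMPTION: auto picks roughly 10 seconds
 * worth of frames and falls back to 250 frames when the framerate is
 * unknown. Never returns 0 in auto mode, since a zero GOP size is the
 * vendor-dependent case described above. */
static unsigned int
resolve_gop_size (int gop_size, int fps_n, int fps_d)
{
  long long frames;

  if (gop_size >= 0)
    return (unsigned int) gop_size;

  if (fps_n <= 0 || fps_d <= 0)
    return 250;

  frames = ((long long) fps_n * 10) / fps_d;

  return frames < 1 ? 1u : (unsigned int) frames;
}
```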
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/1911>
Add a new video source element "d3d11desktopdupsrc" for capturing desktop image
via Desktop Duplication based on Microsoft's Desktop Duplication sample available at
https://github.com/microsoft/Windows-classic-samples/tree/master/Samples/DXGIDesktopDuplication
This element is expected to be a replacement of existing dxgiscreencapsrc
element in winscreencap plugin.
Currently this element supports the following (which dxgiscreencapsrc cannot):
- Copying the captured D3D11 texture to the output buffer without download
- Desktop session transitions,
e.g., it can capture the desktop without error even during
"Lock desktop" and "Permission dialog"
- Multiple d3d11desktopdupsrc elements can capture the same monitor
Not yet implemented features
- Cropping rect is not implemented, but that can be handled by downstream
- Multi-monitor is not supported, but that can also be implemented by
downstream elements, for example via multiple d3d11desktopdupsrc elements
with d3d11compositor
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/1855>
Hide most of the symbols of the GstD3D11Memory object.
GstD3D11Memory is one of the primary resources for the upcoming d3d11 library
and it's expected to be an extensible feature.
Hiding implementation details will be helpful for later use cases.
Summary of this commit:
* Now all native Direct3D11 resources are private to GstD3D11Memory.
To access native resources, getter methods need to be used
or the generic map API (e.g., gst_memory_map) should be called,
apart from some exceptional cases such as d3d11decoder.
* Various helper methods are added for GstBuffer related operations
and in order to remove duplicated code.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/1892>
... instead of READY state. READY state is too early for setting the
overlay window handle, especially in the playbin/playsink scenario,
since playsink will set the given overlay handle on videosink once the
READY state change of videosink is ensured.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/1893>
Unlike software MFT (Media Foundation Transform), which is synchronous
in terms of processing input and output data, hardware MFT works
in asynchronous mode. Output data might not be available right after
we push one input into the MFT.
Note that an async MFT will fire two events: one is "METransformNeedInput",
which happens when the MFT can accept more input data,
and the other is "METransformHaveOutput", which signals that
there's pending data which can be output immediately.
To listen for the events, we can wait synchronously via
IMFMediaEventGenerator::GetEvent() or make use of an IMFAsyncCallback
object, which is the asynchronous way; the event will be notified
from Media Foundation's internal worker queue thread.
To handle such asynchronous operation, the previous working flow was
as follows (IMFMediaEventGenerator::GetEvent() was used until now):
- Check if there is pending output data and push the data toward downstream.
- Pulling events (from streaming thread) until there's at least
one pending "METransformNeedInput" event
- Then, push one data into MFT from streaming thread
- Check if there is pending "METransformHaveOutput" again.
If there is, push new output data to downstream
(unlikely there is pending output data at this moment)
The above flow was processed from the upstream streaming thread. That means
even if output data was available, it could only be output later,
when the next buffer was pushed from the upstream streaming thread.
That would introduce at least one frame of latency in the case of a live stream.
To reduce such latency, this commit modifies the flow to be fully
asynchronous, as hardware MFT was designed, and to be able to
output encoded data whenever it's available. More specifically,
an IMFAsyncCallback object will be used for handling
"METransformNeedInput" and "METransformHaveOutput" events from
Media Foundation's internal thread, and new output data will
also be output from Media Foundation's thread.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/1520>
Add a new property "render-stats" to allow rendering statistics
data on window for debugging and/or development purpose.
Text rendering will be accelerated by GPU since this implementation
uses Direct2D/DirectWrite API and Direct3D inter-op for minimal overhead.
Specifically, text data will be rendered on swapchain backbuffer
directly without any copy/allocation of extra texture.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/1830>
Since GstVaDecodePicture is destroyed completely with its free() function and
it's used as destroy notify by codecs picture, there's no need to call
gst_va_decoder_destroy_buffers() externally, since the codecs base classes
destroy the codec picture when it's required.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/1841>
The current GstVaDecodePicture finalize will leak some
resources such as parameter buffers and slice data.
It deliberately leaves the release of these resources
to va decoder related functions and triggers a warning if we free the
GstVaDecodePicture without releasing these resources.
But in practice, sometimes, you do not have the chance to release
these resources before the picture is freed. For example, H264/Mpeg2
support multiple slice NALs/packets for one frame. It is possible that
we already succeeded in parsing and generating the first several slices'
data by _decode_slice(), but then we get a wrong slice NAL/packet
and fail to parse it. We decide to discard the whole frame in the
decoder's base class, which just frees the current picture and does not
call the subclass's functions again. In this kind of case, we do
not have the chance to clean up the resources, and they will
be leaked.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/1841>
Even if resolution and/or bitdepth is not updated, required
DPB size can be changed per SPS update and it could be even
larger than previously configured size of DPB. If so, we need
to reconfigure DPB d3d11 texture pool again.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/1839>
In order to honor GST_BUFFER_POOL_ACQUIRE_FLAG_DONTWAIT in VA pool, allocators'
wait_for_memory() has to be decoupled from their prepare_buffer() so it can be
called in pools' acquire_buffer() if the flag is not set.
wait_for_memory() functions are blocking, so with multithreaded calls the received
memories are assigned to the first requested buffer. A new mutex was
added for this.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/1815>
An application, using for example appsink, can hold buffers from any
va allocator after setting the pipeline to NULL. We need to destroy
the allocator when that memory is unrefed.
This patch juggles a bit with the allocator reference count in
memories in order to achieve this:
1. When memory is created no alloc ref is modified
2. When memory is released, alloc ref is decreased
3. When memory is reassigned to a buffer, alloc ref is increased
4. When memory is flushed, alloc ref is increased because it is going
to be decreased in gst_memory_unref()
Also this patch moves the deallocation of member variables to
finalize() rather than dispose()
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/1815>
In case that "pic_order_cnt_type" is equal to zero, ref picture
list for B slice should not include non-existing picture
as per spec 8.2.4.2.3. And, the second field is not needed
for the process of frame picture reference list construction
since it needs to be frame unit, not field picture in that case.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/1812>
force_videometa should mean that the buffer must use video meta to
map correctly. When the stride or the offset of the alloc_info is
different from the src caps, downstream must use video meta.
So this flag should not be tied to RAW caps only. All kinds of
caps (memory:VAMemory, memory:DMABuf) should have this flag.
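The stride/offset comparison behind that rule can be sketched as follows. PlaneLayout and needs_video_meta() are hypothetical; in the plugin the two layouts would come from the allocator's info and from the info derived from the src caps.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_PLANES 4

/* Hypothetical per-plane layout description. */
typedef struct
{
  unsigned int n_planes;
  size_t offset[MAX_PLANES];
  int stride[MAX_PLANES];
} PlaneLayout;

/* Returns true when the allocated layout differs from the layout implied
 * by the caps, i.e. when the buffer cannot be mapped correctly without
 * video meta, regardless of which caps feature is in use. */
static bool
needs_video_meta (const PlaneLayout * alloc_info,
    const PlaneLayout * caps_info)
{
  unsigned int i;

  if (alloc_info->n_planes != caps_info->n_planes)
    return true;

  for (i = 0; i < alloc_info->n_planes && i < MAX_PLANES; i++) {
    if (alloc_info->stride[i] != caps_info->stride[i]
        || alloc_info->offset[i] != caps_info->offset[i])
      return true;
  }

  return false;
}
```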
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/1711>
When the downstream element reports ANY caps and also fails to
support VideoMeta, we should fall back to system memory.
Note: basetransform-like elements never return a valid allocation
query before set_caps(). So, if a basetransform returns ANY sink
caps, we always fall back to system memory for it.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/1711>
Allocate a GArray which is used to fill
VAPictureParameterBufferH264.ReferenceFrames (called per frame),
instead of alloc/free per frame.
Also this commit is to fix the condition where long-term reference
picture is needed for VAPictureParameterBufferH264.ReferenceFrames
entry.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/1813>
We don't need to preserve the input color range for the transformed target
color space. Also, some GPUs don't seem to be happy with the 16-235
color range for RGB color space.
Also, since our default display target color space is
DXGI_COLOR_SPACE_RGB_FULL_G22_NONE_P709, choosing full color range
would make more sense.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/1814>
When gst_va_dmabuf_allocator_setup_buffer_full() receives info (not NULL), it is
assumed that this buffer is not part of the allocator pool, so it has to be
de-allocated as soon as it is freed.
This patch sets the destroy notify of the assigned GstVaBufferSurface if info is
not NULL.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/1811>
Managing the reference picture type with two variables
(ref and long_term) is redundant; it can be
represented by a single enum value.
This syncs the implementation with gstreamer-vaapi, which
makes comparison between the two easier and
minimizes the changes required for subclasses to
support interlaced streams.
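
A minimal sketch of the idea, with hypothetical names (PictureRef and
the helpers below are illustrative, not the actual API). It assumes
that the old long_term flag was only meaningful when ref was set.

```c
#include <stdbool.h>

/* One value replaces the (ref, long_term) pair. */
typedef enum {
  PICTURE_REF_NONE = 0,
  PICTURE_REF_SHORT_TERM,
  PICTURE_REF_LONG_TERM
} PictureRef;

/* Maps the old two-variable representation to the enum.
 * Assumption: long_term is ignored for non-reference pictures. */
static PictureRef
picture_ref_from_flags (bool ref, bool long_term)
{
  if (!ref)
    return PICTURE_REF_NONE;
  return long_term ? PICTURE_REF_LONG_TERM : PICTURE_REF_SHORT_TERM;
}

/* Equivalent of the old "ref" check. */
static bool
picture_is_ref (PictureRef r)
{
  return r != PICTURE_REF_NONE;
}

/* Equivalent of the old "ref && long_term" check. */
static bool
picture_is_long_term_ref (PictureRef r)
{
  return r == PICTURE_REF_LONG_TERM;
}
```

With a single value, the invalid combination (long_term set on a
non-reference picture) can no longer be represented.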
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/1534>
As per spec 7.4.3 Slice header semantics, the flag value is derived as
MbaffFrameFlag = (mb_adaptive_frame_field_flag && !field_pic_flag)
and DXVA uses the value.
Regarding FrameNumList, in case of a long-term ref, the FrameNumList[i]
value should be long_term_frame_idx, not long_term_pic_num.
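
Both rules can be written down as a short sketch. RefPic and the
function names are hypothetical; using frame_num for short-term
references is an assumption here, since this message only covers the
long-term case.

```c
#include <stdbool.h>
#include <stdint.h>

/* MbaffFrameFlag as derived in 7.4.3 Slice header semantics. */
static bool
mbaff_frame_flag (bool mb_adaptive_frame_field_flag, bool field_pic_flag)
{
  return mb_adaptive_frame_field_flag && !field_pic_flag;
}

/* Hypothetical subset of a reference picture's fields. */
typedef struct {
  bool long_term;
  uint16_t frame_num;
  uint16_t long_term_frame_idx;
  uint16_t long_term_pic_num;   /* deliberately NOT used below */
} RefPic;

/* Value to store in FrameNumList[i] for reference picture i. */
static uint16_t
frame_num_list_entry (const RefPic *pic)
{
  if (pic->long_term)
    return pic->long_term_frame_idx;
  return pic->frame_num;        /* assumption for short-term refs */
}
```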
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/1780>
This adds HEVC decoding support to the new VA plugin. This implementation has
been tested using the ITU conformance tests (through fluster). It fails all
MAIN10 tests, as this is not implemented yet, along with the following:
CONFWIN_A_Sony_1 (looks fine, but md5sum is incorrect)
PICSIZE_A_Bossen_1 (height too high)
PICSIZE_B_Bossen_1 (same)
VPSSPSPPS_A_MainConcept_1 (parser issue)
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/1714>
This causes no changes to the profile but keeps the existing settings.
The profile can also be changed from e.g. the card's configuration
application and in that case probably should be left alone.
The default is the new value as it keeps the profile setting as it is,
which is consistent with the previous behaviour in 1.18.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/1721>
In our va implementation, we just use the image's info to map the buffer.
The padding info only acts as a placeholder to expand the
allocation size in caps when the decoding size is bigger than the display
size. So using padding_right or padding_left does not change the result.
But we find that with padding_left it is hard to meet the requirement
of gst_video_meta_validate_alignment() when the video meta's stride
is different from the allocation width.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/1698>
We have already done the job in gst_va_base_dec_decide_allocation(),
so there is no need to call the base class' decide_allocation() again. The base
class' decide_allocation() would call set_format() again and make us do the
image/surface testing again, which is slow and not needed.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/1698>
Use this standalone function to update the allocator info and keep
the ensure_image() and mem_alloc() APIs clean.
We also change the default way of using images. The non-derived
image is now the default, and if it fails, we fall back
to the derived image.
On a lot of platforms, the derived image is not cached, so
read and write operations have very low performance.
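
The new ordering can be sketched as follows. This is an illustration
only: the callbacks stand in for the real image setup paths, and all
names (ImageSetupFunc, ImageMode, ensure_image_sketch and the stubs)
are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback: returns true if the image could be set up. */
typedef bool (*ImageSetupFunc) (void *surface);

typedef enum {
  IMAGE_NONE,
  IMAGE_CREATED,      /* non-derived image, the new default */
  IMAGE_DERIVED       /* derived image, now only the fallback */
} ImageMode;

/* Tries the non-derived path first; the derived path is attempted
 * only when the first one fails. */
static ImageMode
ensure_image_sketch (void *surface, ImageSetupFunc create_image,
    ImageSetupFunc derive_image)
{
  if (create_image (surface))
    return IMAGE_CREATED;
  if (derive_image (surface))
    return IMAGE_DERIVED;
  return IMAGE_NONE;
}

/* Stubs for demonstration only. */
static bool
stub_ok (void *surface)
{
  (void) surface;
  return true;
}

static bool
stub_fail (void *surface)
{
  (void) surface;
  return false;
}
```

For example, ensure_image_sketch (NULL, stub_fail, stub_ok) yields
IMAGE_DERIVED, while any call whose first callback succeeds never
touches the derived path.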
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/1698>
Move the parameter testing and setting from allocator_alloc_full()
to allocator_try(). allocator_alloc_full() is called every
time we need to allocate new memory, but all these parameters, such
as the surface and the image format, rt_format, etc., are unchanged during
the whole allocator lifetime. Just setting them in set_format() is enough.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/1698>
Output texture of d3d11 decoder cannot have the bind flag
D3D11_BIND_SHADER_RESOURCE (meaning that it cannot be used as a shader
input resource). So d3d11convert (and its subclasses) was copying
the texture into another internal texture to use the d3d11 shader.
This is obvious overhead, and we can avoid the texture copy for
colorspace conversion or resizing via ID3D11VideoProcessor
as it supports decoder output textures.
This commit would be a visible optimization for the d3d11 decoder with
d3d11compositor use case because we can avoid a texture copy per frame.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/1718>
A GstMemory object could be disposed if the GstBuffer was not allocated
by GstD3D11BufferPool, for example when it comes from gst_buffer_copy()
and/or gst_buffer_make_writable(). So attaching qdata on the GstMemory
object would cause unnecessary view alloc/free.
By using the view pool which is implemented in GstD3D11Allocator,
we can avoid redundant view alloc/free.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/1716>
In order to know the chroma format, besides profile, subsampling_x and
subsampling_y are needed (Spec 7.2.2 Color config semantics). These values are
in GstVp9Parser but not in GstVp9Framehdr.
Also, bit_depth is available in the parser but not in the frame header. Moreover,
those values are copied to the picture structure later.
In the case of VA-API, to configure the pipeline, it is required to know the
chroma format and depth.
It is possible to know chroma and depth through caps coming from vp9parser, but
it requires string parsing. It would be less error prone to get these values
through the parser structure at new_sequence() virtual method.
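
As an illustration of why the two subsampling values are enough once
they are exposed, the mapping to a chroma format can be sketched like
this. ChromaFormat and chroma_from_subsampling() are hypothetical
names; the mapping follows the usual reading of the VP9 color config
(both set: 4:2:0, only x: 4:2:2, only y: 4:4:0, none: 4:4:4).

```c
/* Hypothetical chroma format identifiers. */
typedef enum {
  CHROMA_420,
  CHROMA_422,
  CHROMA_440,
  CHROMA_444
} ChromaFormat;

/* Derives the chroma format from subsampling_x and subsampling_y. */
static ChromaFormat
chroma_from_subsampling (unsigned subsampling_x, unsigned subsampling_y)
{
  if (subsampling_x && subsampling_y)
    return CHROMA_420;
  if (subsampling_x)
    return CHROMA_422;
  if (subsampling_y)
    return CHROMA_440;
  return CHROMA_444;
}
```

A subclass could then pick its VA configuration from this value and
bit_depth directly, with no caps string parsing involved.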
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/1700>
Add a new video composition element which is equivalent to the compositor
and glvideomixer elements. When d3d11 decoder elements are used,
d3d11compositor can do efficient graphics memory handling
(zero copying or at least copying memory within GPU memory space).
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/1323>