The output of VP9 and AV1 encoder is a little different from the H264
and H265 encoder, it may contain repeat frames and so the output frame
number may be more than the input. We need to call finish_subframe()
when some frame will be repeated later. So we need to extend the
current prepare_output() virtual function.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/3015>
Explicitly calls gst_vtenc_pause_output_loop when going PAUSED->READY to make sure GST_PAD_STREAM_LOCK is not taken.
Before this change, a deadlock would occur if pipeline got stopped right after one output buffer was generated by vtenc.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5933>
Modify the fix_output_format in vpp to directly generate caps with
negotiated src caps, and we have the correct dma caps negotiation in
fix_output_format function. And thus, we can remove the redundant
negotiation of using function pad_accept_memory in vpp.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5845>
The pool currently defaults to performing a layout transition to
VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL, with some special exceptions for
video usages. This may not be a legal transition depending on the usage.
Provide an API to explicitly control the initial image layout.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5881>
When implementing NDK media support, it would be useful to also have JNI
implementation in the same binary as NDK media compatibility is lower.
As such, implement a rudimentary vtable system for gstamc-codec and
gstamc-format, and allow choosing the implementation at static_init()
time.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/4115>
This allows the implementations to do custom logic behind the hood. For
example, when NDK implementation is added, the entrypoint can chooses to
statically initialize the NDK implementations or the JNI one.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/4115>
With this patch, the caps is registered in the order of memory features
as: VAMemory, DMABuf then raw caps in linux path, and D3D11Memory then
raw caps in windows path. It helps to prioritize the video memory for all
msdk elements when doing negotiation.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5898>
Fixing below debug layer report
ID3D12Device::CreateCommittedResource: Ignoring InitialState D3D12_RESOURCE_STATE_COPY_DEST.
Buffers are effectively created in state D3D12_RESOURCE_STATE_COMMON.
Buffer resource will be automatically promoted to D3D12_RESOURCE_STATE_COPY_DEST
at the very first COPY operation time.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5895>
Since DXGI desktop duplication API does not work with Direct3D12 device,
this element will use Direct3D11 device to acquire frame.
Then other rendering operations (e.g., texture copy, render pipeline) will
happen using Direct3D12 API
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5883>
In case of tier 1 decoder, always use reference-only picture to avoid
fixed-size pool limitation and output decoded picture without
copy even for negative rate. Also do not use copy queue for GPU to GPU
copy. Copy queue is specialized for upload/download and may occupy
PCIE bandwidth. Use direct queue as recommended by vendors.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5877>
On flush event, baseclass will discard all pictures from DPB
but there can be still in-flight commands not finished yet.
Use our command queue, allocator and fence data helper objects
to keep resource available during command execution.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5877>
Shader visible descriptors occupy GPU resource and there are hardware
limits. Thus, in order to minimize the amount of shader visible heaps,
only non shader visible descriptor heap (staging) will be held by d3d12memory.
Then converter will copy the staging descriptor to shader visible
descriptor heap per draw.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5875>
Zero initialization would have overhead and it's not required
most cases except for textures. Use CREATE_NOT_ZEROED flag
in case of buffer resource or if a texture will be rendered without any
prior read operation.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5875>
Conversion will happen when constructed command list is executed,
not by converter element. Thus this object should not map output buffer
with write flag which will result in error if multiple threads
are building commands for the same output target frame.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5875>
Add macro which converts picture frame number to suitable timestamp in
nanoseconds for use in V4L2 VB2 buffer lookup. Since multiple codecs do
the same operation and almost all got it wrong, do it in one place so it
can be fixed in one place again, if needed.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5791>
The GstMpeg2Picture system_frame_number is guint32, constant 1000 is guint32,
GstV4l2CodecMpeg2Dec *_ref_ts multiplication result is u64 .
```
u64 result = (u32)((u32)system_frame_number * (u32)1000);
```
behaves the same as
```
u64 result = (u32)(((u32)system_frame_number * (u32)1000) & 0xffffffff);
```
so in case `system_frame_number > 4294967295 / 1000`, the `result` will
wrap around. Since the `result` is really used as a cookie used to look
up V4L2 buffers related to the currently decoded frame, this wraparound
leads to visible corruption during MPEG2 decoding. At 30 FPS this occurs
after cca. 40 hours of playback .
Fix this by changing the 1000 from u32 to u64, i.e.:
```
u64 result = (u64)((u32)system_frame_number * (u64)1000ULL);
```
this way, the wraparound is prevented and the correct cookie is used.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5791>
The GstAV1Picture system_frame_number is guint32, constant 1000 is guint32,
GstV4l2CodecAV1Dec v4l2_av1_frame.*_frame_ts multiplication result is u64 .
```
u64 result = (u32)((u32)system_frame_number * (u32)1000);
```
behaves the same as
```
u64 result = (u32)(((u32)system_frame_number * (u32)1000) & 0xffffffff);
```
so in case `system_frame_number > 4294967295 / 1000`, the `result` will
wrap around. Since the `result` is really used as a cookie used to look
up V4L2 buffers related to the currently decoded frame, this wraparound
leads to visible corruption during AV1 decoding. At 30 FPS this occurs
after cca. 40 hours of playback .
Fix this by changing the 1000 from u32 to u64, i.e.:
```
u64 result = (u64)((u32)system_frame_number * (u64)1000ULL);
```
this way, the wraparound is prevented and the correct cookie is used.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5791>
The GstVp9Picture system_frame_number is guint32, constant 1000 is guint32,
GstV4l2CodecVp9Dec v4l2_vp9_frame.*_frame_ts multiplication result is u64 .
```
u64 result = (u32)((u32)system_frame_number * (u32)1000);
```
behaves the same as
```
u64 result = (u32)(((u32)system_frame_number * (u32)1000) & 0xffffffff);
```
so in case `system_frame_number > 4294967295 / 1000`, the `result` will
wrap around. Since the `result` is really used as a cookie used to look
up V4L2 buffers related to the currently decoded frame, this wraparound
leads to visible corruption during VP9 decoding. At 30 FPS this occurs
after cca. 40 hours of playback .
Fix this by changing the 1000 from u32 to u64, i.e.:
```
u64 result = (u64)((u32)system_frame_number * (u64)1000ULL);
```
this way, the wraparound is prevented and the correct cookie is used.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5791>
The GstVp8Picture system_frame_number is guint32, constant 1000 is guint32,
GstV4l2CodecVp8Dec v4l2_vp8_frame.*_frame_ts multiplication result is u64 .
```
u64 result = (u32)((u32)system_frame_number * (u32)1000);
```
behaves the same as
```
u64 result = (u32)(((u32)system_frame_number * (u32)1000) & 0xffffffff);
```
so in case `system_frame_number > 4294967295 / 1000`, the `result` will
wrap around. Since the `result` is really used as a cookie used to look
up V4L2 buffers related to the currently decoded frame, this wraparound
leads to visible corruption during VP8 decoding. At 30 FPS this occurs
after cca. 40 hours of playback .
Fix this by changing the 1000 from u32 to u64, i.e.:
```
u64 result = (u64)((u32)system_frame_number * (u64)1000ULL);
```
this way, the wraparound is prevented and the correct cookie is used.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5791>
The input of the vacompositor may be DMA buffers. And in this case, the input
caps has the format=DMA_DRM, which can not be recognized by base video
aggregator class' find_best_format() function. So we need to override the
update_caps() virtual function.
Also we consider the DMA kind caps in negotiated_src_caps() for output.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5160>
To achieve maximum throughput, waiting on command commit thread
is not ideal. And render-delay will introduce unwanted latency.
Best is to split thread and wait finished decoding job in a dedicated
output thread
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5812>
When creating a new VA pool set config size to zero because it's not used.
Also, given the potential different sizes from software buffer pools and VA
buffer pools, this patch handle that potential different values.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5805>
When creating a new VA pool set config size to zero because it's not used.
Also, given the potential different sizes from software buffer pools and VA
buffer pools, this patch handle that potential different values.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5805>
When creating a new VA pool set config size to zero because it's not used.
Also, given the potential different sizes from software buffer pools and VA
buffer pools, this patch handle that potential different values.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5805>
VA drivers allocate surfaces given their properties, so there's no need to
provide a buffer size to the VA pool.
Though, the buffer size is provided by the driver, or the canonical size
is used for single planed surfaces.
This patch removes the need to provide a size for the function
gst_va_pool_new_with_config() and adds a helper method to retrieve the surface
size, gst_va_pool_get_buffer_size(). Also change the callers accordingly.
Changes for custom VA pool creation will be addressed in the following commits.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5805>
In multi-card scenario, user can set GST_MSDK_DRM_DEVICE env variable to
choose the device. This patch can align vpl's queried results with the
users' choice by passing deviceID when creating mfx implementation.
Co-authored-by: Víctor Manuel Jáquez Leal <vjaquez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5697>
Direct3D feature level 10 supported GPUs were released
more than 15 years ago, around the time when Windows
Vista / 7 were released. Also our d3d11 plugin/library
does not support feature level 9.x very well already.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5709>
We were previously always querying index 0, and while the number of planes per
buffer will never change, it seems more proper to query the right buffer rather
than always the first one.
This was found while reading strace logs, and wondering why the
V4L2_BUF_FLAG_MAPPED flag was present on all ¬0 indices even though that
happened before VIDIOC_EXPBUF.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5647>
Even if IDXGIOutput6 says current display colorspace is HDR,
captured texture via IDXGIOutputDuplication::AcquireNextFrame()
is converted frame by OS unless we use IDXGIOutput5::DuplicateOutput1()
with DXGI_FORMAT_R16G16B16A16_FLOAT format, in order for captured
frame to be scRGB color space. Then application should perform
tonemap operation based on reported display white level, color primaries, etc.
Since we don't have any tonemapping implementation, ignores colorimetry
reported by IDXGIOutput6.
Fixes: https://gitlab.freedesktop.org/gstreamer/gstreamer/-/issues/3128
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5671>
d2d runtime seems to execute pending GPU command list
when DXGI ID2D1RenderTarget is being released, and it will invoke
d3d11 immediate context APIs. Should protect all rendering operations
and DXGI resources with lock.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5659>
In the case of encoders and filters when importing a DMABuf, use
GstVideoInfoDmaDrm to get the drm fourcc and modifier.
In both cases, instead of keeping the original GstVideoInfoDmaDrm from caps, the
GstVideoInfo part of the structure is converted as canonical one, given the
format from the fourcc. It's kept in the way to handle V4L2 linear DMABufs and
to avoid too many changes in the current code.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5264>
Instead of guessing the DRM format and modifier, pass a DRM video info to
gst_va_dmabuf_memories_setup().
Still, it checks for the DRM parameters in DRM info, if they are not available,
as in the case of V4L2 buffers, the part of the video info is used.
This is an API breakage, but since the plugin is still in stage, it's still
allowed.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5264>
Should fix auto-generated follow-up sections like "Hierarchy" or
"Factory details" to be listed under the element name in the
table-of-contents of the document, instead of a stand-alone
"Duplex-Mode" section.
Also cleanup some spurious colon suffix after section names.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5625>
Now that nvidia-vaapi-driver appeared and isn't yet supported by GstVA, we've to
add an allowed list of supported drivers.
This patch implements it adding a environment variable to disable this driver
check: GST_VA_ALL_DRIVERS
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5616>
When decoding stream using hardware V4L2 decoder element, in any of the
currently supported formats, the decoding will fail once frame number
1000000 is reached. The reported error clearly indicates a wrap-around
occured, instead of receiving decoded frame 1000000, frame 0 is received
from the hardware V4L2 decoder driver.
The problem is actually not in the driver itself, but rather in gstreamer,
which uses `struct v4l2_buffer` member `.timestamp` in a special way. The
timestamp of buffers with encoded data added to the SINK (input) queue of
the driver is copied by the driver into matching buffers with decoded data
added to the SOURCE (output) queue of the driver. In fact, the timestamp
is not a timestamp at all, but rather in this special case, only part of
it is used as an incrementing frame counter.
The `.timestamp` is of type `struct timeval`, which is defined in
`sys/time.h` [1]. Only the `tv_usec` member of this structure is used
for the incrementing frame counter. However, suseconds_t tv_usec [2]
may be limited to range [-1, 1000000]:
"
[XSI] The type suseconds_t shall be a signed integer type capable of
storing values at least in the range [-1, 1000000].
"
Therefore, once frame 1000000 is reached, a rollover occurs and decoding
fails.
Fix this by using both `struct timeval` members, `.tv_sec` and `.tv_usec`
with matching modular arithmetic, this way the failure would occur again
just short of 2^84 frames, which should be plenty.
[1] https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/sys_time.h.html
[2] https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/sys_types.h.html
A test case using stateless hardware h264 decoder, the WARN/ERROR output
in gstreamer log indicates a failure occurred. With this change, that
error no longer occurs and the WARN/ERROR are not present:
```
pc$ gst-launch-1.0 videotestsrc num-buffers=1001001 pattern=6 ! \
video/x-raw,width=16,height=16,format=I420 ! \
x264enc ! filesink location=/tmp/test.h264
dut$ GST_DEBUG="*:3" gst-launch-1.0 filesrc location=/tmp/test.h264 ! \
h264parse ! v4l2slh264dec ! fakesink
...
0:03:51.393677606 12111 0x370df400 WARN \
v4l2codecs-decoder gstv4l2decoder.c:1157:gst_v4l2_request_set_done:<v4l2decoder2> \
Requested frame 1000000, but driver returned frame 0.
0:03:51.394140597 12111 0x370df400 WARN \
v4l2codecs-decoder gstv4l2decoder.c:1157:gst_v4l2_request_set_done:<v4l2decoder2> \
Requested frame 1000001, but driver returned frame 1.
0:03:51.394425216 12111 0x370df400 WARN \
v4l2codecs-decoder gstv4l2decoder.c:1157:gst_v4l2_request_set_done:<v4l2decoder2> \
Requested frame 1000002, but driver returned frame 2.
0:03:51.394665211 12111 0x370df400 WARN \
v4l2codecs-decoder gstv4l2decoder.c:1157:gst_v4l2_request_set_done:<v4l2decoder2> \
Requested frame 1000003, but driver returned frame 3.
0:03:51.394785833 12111 0x370df400 WARN \
v4l2codecs-h264dec gstv4l2codech264dec.c:1059:gst_v4l2_codec_h264_dec_output_picture:<v4l2slh264dec0> \
error: Failed to decode frame 1000000
ERROR: from element /GstPipeline:pipeline0/v4l2slh264dec:v4l2slh264dec0: Failed to decode frame 1000000
```
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5598>