Add macro which converts picture frame number to suitable timestamp in
nanoseconds for use in V4L2 VB2 buffer lookup. Since multiple codecs do
the same operation and almost all got it wrong, do it in one place so it
can be fixed in one place again, if needed.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5791>
The GstMpeg2Picture system_frame_number is guint32, constant 1000 is guint32,
GstV4l2CodecMpeg2Dec *_ref_ts multiplication result is u64 .
```
u64 result = (u32)((u32)system_frame_number * (u32)1000);
```
behaves the same as
```
u64 result = (u32)(((u32)system_frame_number * (u32)1000) & 0xffffffff);
```
so in case `system_frame_number > 4294967295 / 1000`, the `result` will
wrap around. Since the `result` is really used as a cookie used to look
up V4L2 buffers related to the currently decoded frame, this wraparound
leads to visible corruption during MPEG2 decoding. At 30 FPS this occurs
after cca. 40 hours of playback .
Fix this by changing the 1000 from u32 to u64, i.e.:
```
u64 result = (u64)((u32)system_frame_number * (u64)1000ULL);
```
this way, the wraparound is prevented and the correct cookie is used.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5791>
The GstAV1Picture system_frame_number is guint32, constant 1000 is guint32,
GstV4l2CodecAV1Dec v4l2_av1_frame.*_frame_ts multiplication result is u64 .
```
u64 result = (u32)((u32)system_frame_number * (u32)1000);
```
behaves the same as
```
u64 result = (u32)(((u32)system_frame_number * (u32)1000) & 0xffffffff);
```
so in case `system_frame_number > 4294967295 / 1000`, the `result` will
wrap around. Since the `result` is really used as a cookie used to look
up V4L2 buffers related to the currently decoded frame, this wraparound
leads to visible corruption during AV1 decoding. At 30 FPS this occurs
after cca. 40 hours of playback .
Fix this by changing the 1000 from u32 to u64, i.e.:
```
u64 result = (u64)((u32)system_frame_number * (u64)1000ULL);
```
this way, the wraparound is prevented and the correct cookie is used.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5791>
The GstVp9Picture system_frame_number is guint32, constant 1000 is guint32,
GstV4l2CodecVp9Dec v4l2_vp9_frame.*_frame_ts multiplication result is u64 .
```
u64 result = (u32)((u32)system_frame_number * (u32)1000);
```
behaves the same as
```
u64 result = (u32)(((u32)system_frame_number * (u32)1000) & 0xffffffff);
```
so in case `system_frame_number > 4294967295 / 1000`, the `result` will
wrap around. Since the `result` is really used as a cookie used to look
up V4L2 buffers related to the currently decoded frame, this wraparound
leads to visible corruption during VP9 decoding. At 30 FPS this occurs
after cca. 40 hours of playback .
Fix this by changing the 1000 from u32 to u64, i.e.:
```
u64 result = (u64)((u32)system_frame_number * (u64)1000ULL);
```
this way, the wraparound is prevented and the correct cookie is used.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5791>
The GstVp8Picture system_frame_number is guint32, constant 1000 is guint32,
GstV4l2CodecVp8Dec v4l2_vp8_frame.*_frame_ts multiplication result is u64 .
```
u64 result = (u32)((u32)system_frame_number * (u32)1000);
```
behaves the same as
```
u64 result = (u32)(((u32)system_frame_number * (u32)1000) & 0xffffffff);
```
so in case `system_frame_number > 4294967295 / 1000`, the `result` will
wrap around. Since the `result` is really used as a cookie used to look
up V4L2 buffers related to the currently decoded frame, this wraparound
leads to visible corruption during VP8 decoding. At 30 FPS this occurs
after cca. 40 hours of playback .
Fix this by changing the 1000 from u32 to u64, i.e.:
```
u64 result = (u64)((u32)system_frame_number * (u64)1000ULL);
```
this way, the wraparound is prevented and the correct cookie is used.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5791>
The input of the vacompositor may be DMA buffers. And in this case, the input
caps has the format=DMA_DRM, which can not be recognized by base video
aggregator class' find_best_format() function. So we need to override the
update_caps() virtual function.
Also we consider the DMA kind caps in negotiated_src_caps() for output.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5160>
To achieve maximum throughput, waiting on command commit thread
is not ideal. And render-delay will introduce unwanted latency.
Best is to split thread and wait finished decoding job in a dedicated
output thread
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5812>
When creating a new VA pool set config size to zero because it's not used.
Also, given the potential different sizes from software buffer pools and VA
buffer pools, this patch handle that potential different values.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5805>
When creating a new VA pool set config size to zero because it's not used.
Also, given the potential different sizes from software buffer pools and VA
buffer pools, this patch handle that potential different values.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5805>
When creating a new VA pool set config size to zero because it's not used.
Also, given the potential different sizes from software buffer pools and VA
buffer pools, this patch handle that potential different values.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5805>
VA drivers allocate surfaces given their properties, so there's no need to
provide a buffer size to the VA pool.
Though, the buffer size is provided by the driver, or the canonical size
is used for single planed surfaces.
This patch removes the need to provide a size for the function
gst_va_pool_new_with_config() and adds a helper method to retrieve the surface
size, gst_va_pool_get_buffer_size(). Also change the callers accordingly.
Changes for custom VA pool creation will be addressed in the following commits.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5805>
In multi-card scenario, user can set GST_MSDK_DRM_DEVICE env variable to
choose the device. This patch can align vpl's queried results with the
users' choice by passing deviceID when creating mfx implementation.
Co-authored-by: Víctor Manuel Jáquez Leal <vjaquez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5697>
Direct3D feature level 10 supported GPUs were released
more than 15 years ago, around the time when Windows
Vista / 7 were released. Also our d3d11 plugin/library
does not support feature level 9.x very well already.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5709>
We were previously always querying index 0, and while the number of planes per
buffer will never change, it seems more proper to query the right buffer rather
than always the first one.
This was found while reading strace logs, and wondering why the
V4L2_BUF_FLAG_MAPPED flag was present on all ¬0 indices even though that
happened before VIDIOC_EXPBUF.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5647>
Even if IDXGIOutput6 says current display colorspace is HDR,
captured texture via IDXGIOutputDuplication::AcquireNextFrame()
is converted frame by OS unless we use IDXGIOutput5::DuplicateOutput1()
with DXGI_FORMAT_R16G16B16A16_FLOAT format, in order for captured
frame to be scRGB color space. Then application should perform
tonemap operation based on reported display white level, color primaries, etc.
Since we don't have any tonemapping implementation, ignores colorimetry
reported by IDXGIOutput6.
Fixes: https://gitlab.freedesktop.org/gstreamer/gstreamer/-/issues/3128
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5671>
d2d runtime seems to execute pending GPU command list
when DXGI ID2D1RenderTarget is being released, and it will invoke
d3d11 immediate context APIs. Should protect all rendering operations
and DXGI resources with lock.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5659>
In the case of encoders and filters when importing a DMABuf, use
GstVideoInfoDmaDrm to get the drm fourcc and modifier.
In both cases, instead of keeping the original GstVideoInfoDmaDrm from caps, the
GstVideoInfo part of the structure is converted as canonical one, given the
format from the fourcc. It's kept in the way to handle V4L2 linear DMABufs and
to avoid too many changes in the current code.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5264>
Instead of guessing the DRM format and modifier, pass a DRM video info to
gst_va_dmabuf_memories_setup().
Still, it checks for the DRM parameters in DRM info, if they are not available,
as in the case of V4L2 buffers, the part of the video info is used.
This is an API breakage, but since the plugin is still in stage, it's still
allowed.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5264>
Should fix auto-generated follow-up sections like "Hierarchy" or
"Factory details" to be listed under the element name in the
table-of-contents of the document, instead of a stand-alone
"Duplex-Mode" section.
Also cleanup some spurious colon suffix after section names.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5625>
Now that nvidia-vaapi-driver appeared and isn't yet supported by GstVA, we've to
add an allowed list of supported drivers.
This patch implements it adding a environment variable to disable this driver
check: GST_VA_ALL_DRIVERS
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5616>
When decoding stream using hardware V4L2 decoder element, in any of the
currently supported formats, the decoding will fail once frame number
1000000 is reached. The reported error clearly indicates a wrap-around
occured, instead of receiving decoded frame 1000000, frame 0 is received
from the hardware V4L2 decoder driver.
The problem is actually not in the driver itself, but rather in gstreamer,
which uses `struct v4l2_buffer` member `.timestamp` in a special way. The
timestamp of buffers with encoded data added to the SINK (input) queue of
the driver is copied by the driver into matching buffers with decoded data
added to the SOURCE (output) queue of the driver. In fact, the timestamp
is not a timestamp at all, but rather in this special case, only part of
it is used as an incrementing frame counter.
The `.timestamp` is of type `struct timeval`, which is defined in
`sys/time.h` [1]. Only the `tv_usec` member of this structure is used
for the incrementing frame counter. However, suseconds_t tv_usec [2]
may be limited to range [-1, 1000000]:
"
[XSI] The type suseconds_t shall be a signed integer type capable of
storing values at least in the range [-1, 1000000].
"
Therefore, once frame 1000000 is reached, a rollover occurs and decoding
fails.
Fix this by using both `struct timeval` members, `.tv_sec` and `.tv_usec`
with matching modular arithmetic, this way the failure would occur again
just short of 2^84 frames, which should be plenty.
[1] https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/sys_time.h.html
[2] https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/sys_types.h.html
A test case using stateless hardware h264 decoder, the WARN/ERROR output
in gstreamer log indicates a failure occurred. With this change, that
error no longer occurs and the WARN/ERROR are not present:
```
pc$ gst-launch-1.0 videotestsrc num-buffers=1001001 pattern=6 ! \
video/x-raw,width=16,height=16,format=I420 ! \
x264enc ! filesink location=/tmp/test.h264
dut$ GST_DEBUG="*:3" gst-launch-1.0 filesrc location=/tmp/test.h264 ! \
h264parse ! v4l2slh264dec ! fakesink
...
0:03:51.393677606 12111 0x370df400 WARN \
v4l2codecs-decoder gstv4l2decoder.c:1157:gst_v4l2_request_set_done:<v4l2decoder2> \
Requested frame 1000000, but driver returned frame 0.
0:03:51.394140597 12111 0x370df400 WARN \
v4l2codecs-decoder gstv4l2decoder.c:1157:gst_v4l2_request_set_done:<v4l2decoder2> \
Requested frame 1000001, but driver returned frame 1.
0:03:51.394425216 12111 0x370df400 WARN \
v4l2codecs-decoder gstv4l2decoder.c:1157:gst_v4l2_request_set_done:<v4l2decoder2> \
Requested frame 1000002, but driver returned frame 2.
0:03:51.394665211 12111 0x370df400 WARN \
v4l2codecs-decoder gstv4l2decoder.c:1157:gst_v4l2_request_set_done:<v4l2decoder2> \
Requested frame 1000003, but driver returned frame 3.
0:03:51.394785833 12111 0x370df400 WARN \
v4l2codecs-h264dec gstv4l2codech264dec.c:1059:gst_v4l2_codec_h264_dec_output_picture:<v4l2slh264dec0> \
error: Failed to decode frame 1000000
ERROR: from element /GstPipeline:pipeline0/v4l2slh264dec:v4l2slh264dec0: Failed to decode frame 1000000
```
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5598>
If users update geometry related properties very frequently
for a stream to be animated, redrawing on every update
can make rendering choppy or can be a performance bottleneck.
To address the issue, adding a property to control the behavior
of redrawing scene when geometry related properties are updated.
Also, do not resize swapchain on such property update, since
re-allocating backbuffer and multi-sampled render target is
unnecessary in that case.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5575>
Other Windows applications allow window switching even when
an application window is in fullscreen mode. Also fixing
regression introduced in 15248d8b84
which makes restored window is always located at topmost
since we do not call SetWindowPos() anymore when restoring
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5574>
ISimpleAudioVolume controls volume of corresponding audio session
and there would be only single input/output audio session
in case of share-mode, which means that it controls audio volume of the
process. Instead, use IAudioStreamVolume interface which controls
volume of the stream.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5549>
Ignore alpha component of source (mouse cursor texture)
when blending alpha channel, otherwise the background area of source
(which has zeros) will be written to render target. Then it will result
in black rectangle if output texture is converted to premultiplied alpha
texture
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5566>
This was wrongly calling the base class method, which unnecessairly took the stream lock, already taken by
handle_frame(). The drain() call in negotiate() would then wait for the output loop to pause, while that loop
is stuck waiting to take the stream lock, thus causing a deadlock.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5521>
While adding arbitrary tile support, a round up operation was badly
converter. This caused the Y component of the stride to be 0. This
eventually lead to a crash in glupoad preceded by the following
assertion.
gst_gl_buffer_allocation_params_new: assertion 'alloc_size > 0' failed
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5458>
The caps that were sent by the caps event can be retrieved from the sinkpad
using gst_pad_get_current_caps(). This is more reliable than using cur_caps as
we know exactly which caps upstream selected when the UVC host didn't select a
format, yet.
This further allows to simplify the check, if the uvcsink has to wait for the
caps event before switching to the internal v4l2sink.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/4994>
The probe passes all events except the EVENT_CAPS. Installing and removing the
probe doesn't provide any additional value.
Install an event function and always handle EVENT_CAPS. Use the caps_changed
field, to decide, if the element has to do anything special on a EVENT_CAPS.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/4994>
Move the sanity checks to the beginning of the function. Make the actual effect
of the function more obvious and reset the flags in the end.
This should make it easier to understand what this function is doing.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/4994>
The probe that installs the buffer probe is already on the correct pad. There is
no need for a separate function to install the probe.
While at it, change the signature of the probe functions to GstPadProbeCallback
to avoid the cast when installing the probes.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/4994>
The uvcsink calculates the caps for the format that the UVC host selected. The
gst_uvc_sink_parse_cur_caps() sets these caps as cur_caps as a side effect. This
behavior is surprising as cur_caps is later updated to reflect the actually used
caps.
Just return the configured caps to avoid side effects. This makes the function
easier to understand. Update the function name to reflect the new behavior.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/4994>
The only job of the event peer probe is to catch the upcoming caps event
and be able to react with the sink change. All other events that are
passing the pad shall be passed and ignored.
Since the probe is a blocking probe, there is no use in returning
with GST_PAD_PROBE_OK on other events. Otherwise the event would just
be blocked.
Since we are handling the probe removal of the probe already in the
event switch, we can remove the second explicit probe removal.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/4994>
Since DXVA does not support some profiles such as HEVC RExt,
vendor specific decoding API is still required.
When decoder is negotiated with d3d11 caps, decoder will convert
semi-planar frame to planar since semi-planar format (e.g.,
DXGI_FORMAT_NV12) is not supported by CUDA/D3D11 interop.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5409>
Moves outputting frames to a task on the source pad, bringing vtdec in line with vtenc.
This brings possible performance improvements thanks to decoupling queueing new frames from outputting processed ones.
The queue length is limited to `2*DBP` to prevent decoding too far ahead compared to what we're pushing downstream.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5163>
This was easy to trigger when testing with e.g. vtenc ! vtdec ! glimagesink and closing the sink via window button,
causing GST_FLOW_ERROR to be received by the output loop, stopping it with the queue still full. This made the
enqueue_buffer() callback to lock waiting for space in our queue, while handle_frame() was waiting for the internal
VideoToolbox queue to free up, so that VTCompressionSessionEncodeFrame could finish. As the output loop was not
running, both functions waited forever.
Fixed by 1) immediately emptying our queue when GST_FLOW_ERROR is received (like we already did with _FLUSHING)
and 2) unconditionally setting the flushing flag in finish_encoding() when it sees the output loop stopped because
of GST_FLOW_ERROR, so that enqueue_buffer() will immediately discard any new frames coming out of VideoToolbox.
Both of those make sure we never run into the both-queues-full scenario.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5303>
As a short-term solution before full d3d12 rendering feature,
copy decoded d3d12 texture to shared d3d11 texture in order to use
existing various d3d11 implementations such as conversion, resizing,
and videosink.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5356>
Even if decoder is negotiated with CUDA memory feature, if downstream
proposed no buffer pool, assume that the pool size is unknown.
And disable zero-copy if there's no more free output surface.
Or, in case of reverse playback, always copy frames.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5338>
Issue is that when amc was producing a codec-data buffer, a
GstVideoCodecFrame was being popped off the internal queue. This meant
that the codec-data was being associated with the first input frame and
the second (first encoded buffer) output buffer with the second input
frame. At the end (assuming one input produces one output which seems
to hold in my testing and how the encoder is currently implemented)
there would be an input frame missing and would be pushed without any
timing information. This would lead to e.g. muxers rejecting the buffer
without PTS and failing to mux.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5330>
This patch removes the code duplication of input buffer importation, in all the
va elements that import video frames. It defines a synthetic object whose
members are required to create a new input buffer and do the importation of the
upstream buffer.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5257>
When we consider the DMA kind caps as input, the input_state->info
only contains the video format of GST_VIDEO_FORMAT_DMA_DRM, which
is not enough for va plugins. The new info in base encoder contains
the correct video info after the DMA caps parsing.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5189>
Since d3d11convert and its variant elements does not enable basetransform's
passthrough, passthrough allocation query needs to be handled
manually in order to respect downstream element's min/max buffer
requirement.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5255>
Adding cudaipc{src,sink} element for CUDA IPC support.
Implementation note:
* For the communication between end points, Win32 named-pipe
and unix domain socket will be used on Windows and Linux respectively.
* cudaipcsink behaves as a server, and all GPU resources will be owned by
the server process and exported for other processes, then cudaipcsrc
(client) will import each exported handle.
* User can select IPC mode via "ipc-mode" property of cudaipcsink.
There are two IPC mode, one is "legacy" which uses legacy CUDA IPC
method and the other is "mmap" which uses CUDA virtual memory API
with OS's resource handle sharing method such as DuplicateHandle()
on Windows. The "mmap" mode might be better than "legacy" in terms
of stability since it relies on OS's resource management but
it would consume more GPU memory than "legacy" mode.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/4510>
If glyphrun unit is changed in a single line, there could be
overlapped background area which result in drawing background
twice. Adding geometry combine so that background geometry objects
with the same color can be merged and rendered at once
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5179>
Latest MSYS2 MinGW provides these now, so we don't need to define them
if they're already present in the header.
The AudioClient3 GUID requires the Windows 10 SDK, so it's only
available in the latest MinGW, and the MinGW in Cerbero is too old.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5155>
VA decoders implementation has been verified from 1.18 through 1.22
development cycles and also via the Fluster test framework. Similar
to other cases, we can prefer hardware over software in most cases.
At the same time, GStreamer-VAAPI decoders are demoted to NONE to
avoid collisions. The first step to their deprecation.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/2312>
According to libva API description, cu_qp_delta in VAConfigAttribValEncHEVCFeatures
is supposed to be used as a flag not the value of depth. And if flag enabled,
diff_cu_qp_delta_depth should be decided by log2_diff_max_min_luma_coding_block_size.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5068>