In case of tier 1 decoder, always use reference-only picture to avoid
fixed-size pool limitation and output decoded picture without
copy even for negative rate. Also do not use copy queue for GPU to GPU
copy. Copy queue is specialized for upload/download and may occupy
PCIE bandwidth. Use direct queue as recommended by vendors.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5877>
On flush event, baseclass will discard all pictures from DPB
but there can be still in-flight commands not finished yet.
Use our command queue, allocator and fence data helper objects
to keep resource available during command execution.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5877>
Shader visible descriptors occupy GPU resource and there are hardware
limits. Thus, in order to minimize the amount of shader visible heaps,
only non shader visible descriptor heap (staging) will be held by d3d12memory.
Then converter will copy the staging descriptor to shader visible
descriptor heap per draw.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5875>
Zero initialization would have overhead and it's not required
most cases except for textures. Use CREATE_NOT_ZEROED flag
in case of buffer resource or if a texture will be rendered without any
prior read operation.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5875>
Conversion will happen when constructed command list is executed,
not by converter element. Thus this object should not map output buffer
with write flag which will result in error if multiple threads
are building commands for the same output target frame.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5875>
Add macro which converts picture frame number to suitable timestamp in
nanoseconds for use in V4L2 VB2 buffer lookup. Since multiple codecs do
the same operation and almost all got it wrong, do it in one place so it
can be fixed in one place again, if needed.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5791>
The GstMpeg2Picture system_frame_number is guint32, constant 1000 is guint32,
GstV4l2CodecMpeg2Dec *_ref_ts multiplication result is u64 .
```
u64 result = (u32)((u32)system_frame_number * (u32)1000);
```
behaves the same as
```
u64 result = (u32)(((u32)system_frame_number * (u32)1000) & 0xffffffff);
```
so in case `system_frame_number > 4294967295 / 1000`, the `result` will
wrap around. Since the `result` is really used as a cookie used to look
up V4L2 buffers related to the currently decoded frame, this wraparound
leads to visible corruption during MPEG2 decoding. At 30 FPS this occurs
after cca. 40 hours of playback .
Fix this by changing the 1000 from u32 to u64, i.e.:
```
u64 result = (u64)((u32)system_frame_number * (u64)1000ULL);
```
this way, the wraparound is prevented and the correct cookie is used.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5791>
The GstAV1Picture system_frame_number is guint32, constant 1000 is guint32,
GstV4l2CodecAV1Dec v4l2_av1_frame.*_frame_ts multiplication result is u64 .
```
u64 result = (u32)((u32)system_frame_number * (u32)1000);
```
behaves the same as
```
u64 result = (u32)(((u32)system_frame_number * (u32)1000) & 0xffffffff);
```
so in case `system_frame_number > 4294967295 / 1000`, the `result` will
wrap around. Since the `result` is really used as a cookie used to look
up V4L2 buffers related to the currently decoded frame, this wraparound
leads to visible corruption during AV1 decoding. At 30 FPS this occurs
after cca. 40 hours of playback .
Fix this by changing the 1000 from u32 to u64, i.e.:
```
u64 result = (u64)((u32)system_frame_number * (u64)1000ULL);
```
this way, the wraparound is prevented and the correct cookie is used.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5791>
The GstVp9Picture system_frame_number is guint32, constant 1000 is guint32,
GstV4l2CodecVp9Dec v4l2_vp9_frame.*_frame_ts multiplication result is u64 .
```
u64 result = (u32)((u32)system_frame_number * (u32)1000);
```
behaves the same as
```
u64 result = (u32)(((u32)system_frame_number * (u32)1000) & 0xffffffff);
```
so in case `system_frame_number > 4294967295 / 1000`, the `result` will
wrap around. Since the `result` is really used as a cookie used to look
up V4L2 buffers related to the currently decoded frame, this wraparound
leads to visible corruption during VP9 decoding. At 30 FPS this occurs
after cca. 40 hours of playback .
Fix this by changing the 1000 from u32 to u64, i.e.:
```
u64 result = (u64)((u32)system_frame_number * (u64)1000ULL);
```
this way, the wraparound is prevented and the correct cookie is used.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5791>
The GstVp8Picture system_frame_number is guint32, constant 1000 is guint32,
GstV4l2CodecVp8Dec v4l2_vp8_frame.*_frame_ts multiplication result is u64 .
```
u64 result = (u32)((u32)system_frame_number * (u32)1000);
```
behaves the same as
```
u64 result = (u32)(((u32)system_frame_number * (u32)1000) & 0xffffffff);
```
so in case `system_frame_number > 4294967295 / 1000`, the `result` will
wrap around. Since the `result` is really used as a cookie used to look
up V4L2 buffers related to the currently decoded frame, this wraparound
leads to visible corruption during VP8 decoding. At 30 FPS this occurs
after cca. 40 hours of playback .
Fix this by changing the 1000 from u32 to u64, i.e.:
```
u64 result = (u64)((u32)system_frame_number * (u64)1000ULL);
```
this way, the wraparound is prevented and the correct cookie is used.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5791>
The input of the vacompositor may be DMA buffers. And in this case, the input
caps has the format=DMA_DRM, which can not be recognized by base video
aggregator class' find_best_format() function. So we need to override the
update_caps() virtual function.
Also we consider the DMA kind caps in negotiated_src_caps() for output.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5160>
To achieve maximum throughput, waiting on command commit thread
is not ideal. And render-delay will introduce unwanted latency.
Best is to split thread and wait finished decoding job in a dedicated
output thread
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5812>