There is a case where there are no lines in the temp cache, and
it's possible to skip straight to the request line and not generate
intermediate ones. This is really only beneficial when doing
nearest-neighbour downscaling, as other methods generally require
all input lines sequentially to generate the output. In that case,
this change has no effect and all lines are generated and cached
as before.
As a side effect however, this fixes corruption when downscaling
using nearest-neighbour, as interactions with the pass_alloc flag
and reuse of temporary lines causes the unecessarily-generated
cache lines to overwrite the final output.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-base/-/merge_requests/919>
Rename remaining `gst_video_color_transfer_{encode,decode}` functions on
the `GstVideoTransferFunction` enumeration to
`gst_video_transfer_function_{encode,decode}` permitting
gobject-introspection to turn these into associated functions and place
them under the respective `<enumeration>` block in gir XML files.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-base/-/merge_requests/805>
These then don't require going through the generic code path via AYUV64
first but can be converted directly.
This speeds up processing of
videotestsrc ! v210 ! videoconvert ! other_format ! fakesink
by a factor of 1.55 for I420/YV12 and 1.40 for the other destination
formats and reduces memory pressure considerably.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-base/-/merge_requests/775>
The type is called GstVideoTransferFunction so the function names should
match, otherwise gobject-introspection is keeping the functions as
global functions instead of methods on the type.
The same mistake was also made in lots of other APIs over the years, but
here we can at least fix it for 1.18 still.
Thanks to Marijn Suijten for noticing.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-base/-/merge_requests/807>
This adds linear 32x32 NV12 based tiles. This format is notably used by
Allwinner VCU and exposed in V4L2 as being "SUNXI Tiled" format. In this
patch we generalize the plane info calculation so we can share this part
with the 4L4 variant.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-base/-/merge_requests/754>
Since we are using VideoMeta, the converter (similarly to the video_frame_copy
utility) should have no issue dealing with frames that are slightly larger.
This situation occure as some element will use padded width/height for
allocation, which results in a VideoMeta width/height being larger then the
display width/height found in the negotiated caps.
Fixes#790
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-base/-/merge_requests/747>
For example, BT709, BT601, and BT2020_10 all have theoretically
different transfer functions, but the same function in practice. In
these cases, we should use the fast path for negotiating. Also,
BT2020_12 is essentially the same as the other three, just with one more
decimal point, so it gives the same result for fewer bits. This is now
also aliased to the former three.
Also make videoconvert do passthrough if the caps have equivalent
transfer functions but are otherwise matching.
As of the previous commit, we write the correct transfer function for
BT601, instead of the (functionally identical but different ISO code)
transfer function for BT709. Files created using GStreamer prior to that
commit write the wrong transfer function for BT601 and are, strictly
speaking, 2:4:5:4 instead. However, this commit takes care of
negotiation, so that conversions from/to the same transfer function are
done using the fast path.
Fixes#783
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-base/-/merge_requests/724>
If the cropping or scaling input or output rects put us completely
outside the input/output frame respectively, we can't draw anything
except black safely. Check for those conditions and don't set up a
configuration that attempts to access out of bounds memory outside
the input/output framebuffers.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-base/-/merge_requests/696>
If the frames passed in to gst_video_converter_frame()
have a different layout than was configured for, the
conversion code might go out of bounds and crash.
Do a sanity check on each frame passed in, and in the
absence of a return value in the API, just
refuse the conversion in invalid cases and leave the
destination frame untouched so it's obvious to
users that it was broken.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-base/-/merge_requests/696>
The matrices were in the wrong order.
Instead of the conversion matrix being
_ XYZ_TO_RGB_output * RGB_TO_XYZ_input * input_RGB
It was
_ RGB_TO_XYZ_input * XYZ_TO_RGB_output * input_RGB
Without this, scaling e.g. interlaced UYVY causes corrupted output with
lines as follows: f1 f1 f2 f2, i.e. two lines of each field and only
then the other field.
Packed 10 bits per each R, G and B channel with MSB 2bits alpha channel.
This format is mapped to Windows' DXGI_FORMAT_R10G10B10A2_UNORM format which is
required for 10bits HDR rendering.
Note that this RGB10A2_LE format is R - B channel swapped version of BGR10A2_LE
We make an allocator for temporary lines and then use this for all
the steps in the conversion that can do in-place processing.
Keep track of the number of lines each step needs and use this to
allocate the right number of lines.
Previously we would not always allocate enough lines and we would
end up with conversion errors as lines would be reused prematurely.
Fixes#350
This pixel format is a fully packed variant of NV12_10LE32,
a luma pixel would take 10bits in memory, without any
filled bits between pixels in a stride. The color range
follows the BT.2020 standard.
In order to get a better performance in hardware memory
operation, it may expend the stride, append zero data at the
end of echo lines.
Pack function by Nicolas Dufresne.
https://bugzilla.gnome.org/show_bug.cgi?id=795462
Signed-off-by: Nicolas Dufresne <nicolas@ndufresne.ca>
Signed-off-by: ayaka <ayaka@soulik.info>
The problem is that even though the functions we are calling are
in-place transformation, orc automatically puts the restrict keyword
on all arguments. To silence that warning just create yet-another
variable containing the same value.
https://bugzilla.gnome.org/show_bug.cgi?id=795765
This pixel format is a fully packed variant of NV12, a luma
pixel would take 10bits in memory, without any filled bits
between pixels in a stride. The color range follows
the BT.2020 standard.
In order to get a performance in hardware memory
operation, it may expend the stride, append zero data at the
end of echo lines.
Signed-off-by: ayaka <ayaka@soulik.info>
https://bugzilla.gnome.org/show_bug.cgi?id=795462
This adds a 10 bit variant for NV16 packed into 32 bits little endian
words. The MSB 2 bits are padding. This format is used on Xilinx SoC and
identified with the FOURCC XV20.
https://bugzilla.gnome.org/show_bug.cgi?id=789876
This add a 10bit variant of gray scale packed into 32bits little endian
words. The MSB 2 bits are padding and should be ignored. This format is
used on Xilinx SoC and is identified with the FOURCC XV10.
https://bugzilla.gnome.org/show_bug.cgi?id=789876
This adds a 10bit variant for NV12 which packs 3 10bit components
into little endian 32bit words. The MSB 2 bits are padding and should be
ignored. This format is used on Xilinx SoC and is identified with there
with the FOURCC XV15
https://bugzilla.gnome.org/show_bug.cgi?id=789876
This adds a property to select the maximum number of threads to use for
conversion and scaling. During processing, each plane is split into
an equal number of consecutive lines that are then processed by each
thread.
During tests, this gave up to 1.8x speedup with 2 threads and up to 3.2x
speedup with 4 threads when converting e.g. 1080p to 4k in v210.
https://bugzilla.gnome.org/show_bug.cgi?id=778974
Fix problem with the line cache where it would forget the first line in
the cache in some cases.
Keep as much backlog as we have taps. This generally works better and we
could do even better by calculating the overlap in all taps.
Allocated enough lines for the line cache.
Use only half the number of taps for the interlaced lines because we
only have half the number of lines.
The pixel shift should be relative to the new output pixel size so scale
it.
Fixes: https://bugzilla.gnome.org/show_bug.cgi?id=767921