Commit graph

9 commits

Author SHA1 Message Date
Seungha Yang abe1f5044d cuda: Prefer CUBIN over PTX
System installed NVRTC library might be newer version than
driver, then generate PTX can be incompatible with the driver.
Instead of the intermediate code PTX, use actual assembly code
directly.

Fixes: https://gitlab.freedesktop.org/gstreamer/gstreamer/-/issues/3108
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5639>
2024-01-02 10:10:09 +00:00
Seungha Yang c818906236 cuda: Add support for I420_12LE format
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5409>
2023-09-29 12:36:01 +00:00
Seungha Yang a80f542f66 cuda: Add support for P012_LE and Y444/GBR high bitdepth formats
Adding P012, Y444_10, Y444_12, GBR_10, GBR_12 and GBR_16 formats support

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/5375>
2023-09-23 13:12:55 +00:00
Seungha Yang 7a3be74b63 cudaconvertscale: Add support for flip/rotation
Similar to the d3d11convert element, colorspace conversion, resizing and
flip/rotation operations can be done in a single kernel function call

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/4640>
2023-05-16 19:24:36 +00:00
Seungha Yang 67764a1579 cudaconverter: Rename CUDA kernel function
Changing its name (was too generic) to help GPU tracing via Nsight tool

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/4029>
2023-02-21 20:07:31 +00:00
Seungha Yang fa2bb42fda cudaconverter: Use cached texture
... instead of per conversion texture alloc/free

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/3884>
2023-02-16 17:49:54 +00:00
Seungha Yang 1cb47d549b cudaconverter: Don't sync per conversion
Caller should take care of synchronization

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/3629>
2023-02-03 15:27:42 +00:00
Seungha Yang 661b5f60c6 cuda: Provide single header include entry point
Add "gstcuda.h" header file

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/3629>
2023-02-03 15:27:42 +00:00
Seungha Yang c11f8fa930 cuda: Rewrite colorspace/rescale object
Rewriting GstCudaConverter object, since the old implementation was not
well organized and it's hard to add new features.
Moreover, the conversion operations were not very optimized.

Major change of this implementation:
* Remove redundant intermediate conversion operations such as
  any RGB -> ARGB(64) conversion or any YUV -> Y444 (or 16bits Y444).
  That's not required most of cases. The only required case is
  converting 24bits (such as RGB/BGR) packed format to 32bits format
  because CUDA texture object does not support sampling 24bits format
* Use normalized sample fetching (i.e., [0, 1] range float value)
  and also normalized coordinates system for CUDA texture.
  It's consistent with the other graphics APIs such as Direct3D
  and OpenGL, that makes sampling operations much easier.
* Support a kind of viewport and adopt math for colorspace conversion
  from GstD3D11 implementation

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/3389>
2022-11-15 16:25:44 +00:00