13 commits

Author SHA1 Message Date
Mathieu Duponchelle
2d0effd781 speechmaticstranscriber: add properties for speaker detection
diarization=speaker can be set to enable speaker detection, and
max-speakers can be set to control the maximum number of detected
speakers.

An event is then forwarded downstream upon speaker changes.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/2055>
2025-02-10 11:16:44 +00:00
Mathieu Duponchelle
9da6dff1a9 speechmaticstranscriber: post messages with raw results
This deprecates the buffers pushed on the unsynced pads, which should
be removed prior to release.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/2055>
2025-02-10 11:16:44 +00:00
Mathieu Duponchelle
484275b350 speechmaticstranscriber: output items as early as possible
There is no reason to delay the output of items until the deadline.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/2055>
2025-02-10 11:16:44 +00:00
Mathieu Duponchelle
0ed3f833ac speechmaticstranscriber: add new max-delay property
This allows controlling the requested delay independently from the
latency of the element.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/2055>
2025-02-10 11:16:44 +00:00
Mathieu Duponchelle
c51a65d973 awstranscriber, speechmatics: store language tags on translation source pads
In order to do so, we need to activate the pad as soon as it is added,
which means we can no longer start the task at that point. Instead, we
now wait for stream-start to do so.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/2029>
2025-01-20 14:27:05 +00:00
Mathieu Duponchelle
0376cd2752 speechmatics: expose properties for controlling punctuation joining
By default, the transcriber attempts to join punctuation with the
preceding word. Expose a property to control that.

As Speechmatics sometimes outputs the punctuation for a sentence in the
next transcript, it sometimes arrives too late for joining. To work
around this behavior, a lower max-delay is used by default. That may not
always be desirable, especially if low latency is a concern.

Expose a property to disable the hack.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1909>
2024-12-09 17:29:47 +00:00
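
The punctuation joining described in 0376cd2752 can be pictured with a small, self-contained Rust sketch. This is not the element's code: the `Item` type and the `join_punctuation` function are hypothetical names introduced only for illustration, and the real element works on timed transcript items rather than plain strings. The second call shows the late-arrival case the commit message mentions, where punctuation shows up in the next transcript and can no longer be attached to the word before it.

```rust
/// Hypothetical transcript item; the real element's types are not shown in the log.
#[derive(Debug, Clone, PartialEq)]
enum Item {
    Word(String),
    Punctuation(String),
}

/// Join each punctuation item with the preceding word of the same batch.
/// Punctuation with no preceding word in the batch (it arrived "too late")
/// is kept as a standalone item.
fn join_punctuation(items: &[Item]) -> Vec<String> {
    let mut out: Vec<String> = Vec::new();
    for item in items {
        match item {
            Item::Word(w) => out.push(w.clone()),
            Item::Punctuation(p) => match out.last_mut() {
                Some(prev) => prev.push_str(p),
                None => out.push(p.clone()),
            },
        }
    }
    out
}

fn main() {
    // Punctuation arrives together with its word: it gets joined.
    let same_batch = vec![
        Item::Word("hello".into()),
        Item::Punctuation(",".into()),
        Item::Word("world".into()),
        Item::Punctuation(".".into()),
    ];
    println!("{:?}", join_punctuation(&same_batch));

    // Punctuation arrives at the start of the next batch: nothing to join with.
    let late = vec![Item::Punctuation(".".into()), Item::Word("Next".into())];
    println!("{:?}", join_punctuation(&late));
}
```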
Mathieu Duponchelle
4e722d6dcc speechmatics: expose unsynced pads on transcriber
This can be used for storing original transcripts for editing after the
fact.

Modeled on the aws transcriber, to be usable from transcriberbin.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1963>
2024-12-06 09:38:39 +00:00
Mathieu Duponchelle
849ae7c845 speechmatics: fix hang when one source pad errors out
We still want to push translations / transcripts on the other pads.
Prior to this patch, the pad only paused itself: it kept its mpsc
channel alive but stopped reading from it, which blocked further
messages from being processed by the other source pads.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1925>
2024-11-20 12:52:17 +01:00
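
The failure mode described in 849ae7c845 can be reproduced in miniature with the standard library. The sketch below is an assumption-laden illustration, not the element's code: it uses `std::sync::mpsc::sync_channel` where the element may use a different mpsc implementation, and the `Delivery`, `deliver`, and `simulate` names are hypothetical. It shows that a bounded queue whose receiver is alive but no longer read fills up, so a blocking send would stall the dispatcher, while a dropped receiver is reported as disconnected and can be skipped. Dropping the receiver is one plausible remedy; the log does not say exactly how the patch resolves it.

```rust
use std::sync::mpsc::{sync_channel, SyncSender, TrySendError};

/// Outcome of handing one message to one pad's queue (hypothetical type).
#[derive(Debug, PartialEq)]
enum Delivery {
    Sent,
    WouldBlock, // a blocking send() would hang here
    PadGone,    // receiver dropped: safe to skip this pad
}

/// Non-blocking delivery so the sketch never actually hangs.
fn deliver(tx: &SyncSender<String>, msg: &str) -> Delivery {
    match tx.try_send(msg.to_string()) {
        Ok(()) => Delivery::Sent,
        Err(TrySendError::Full(_)) => Delivery::WouldBlock,
        Err(TrySendError::Disconnected(_)) => Delivery::PadGone,
    }
}

/// Returns (errored pad while paused, errored pad after drop, healthy pad).
fn simulate() -> (Delivery, Delivery, Delivery) {
    let (tx_ok, rx_ok) = sync_channel::<String>(1);
    let (tx_bad, rx_bad) = sync_channel::<String>(1);

    // First message fits in both queues.
    deliver(&tx_ok, "first");
    deliver(&tx_bad, "first");

    // The healthy pad drains its queue; the errored pad has paused
    // and no longer reads, but its receiver is still alive.
    rx_ok.recv().unwrap();
    let paused = deliver(&tx_bad, "second");

    // Dropping the receiver lets the sender notice the pad is gone.
    drop(rx_bad);
    let dropped = deliver(&tx_bad, "second");
    let healthy = deliver(&tx_ok, "second");

    (paused, dropped, healthy)
}

fn main() {
    println!("{:?}", simulate());
}
```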
Mathieu Duponchelle
dc1d63419e speechmaticstranscriber: store and use a start time
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1869>
2024-10-23 13:48:46 +00:00
Sebastian Dröge
7e59c3f0fd Remove once_cell dependency
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1868>
2024-10-21 17:53:18 +00:00
Mathieu Duponchelle
867408b1c0 speechmaticstranscriber: add debug
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1831>
2024-10-02 11:16:02 +00:00
Sebastian Dröge
c505d9a418 Update to async-tungstenite 0.28
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1772>
2024-09-10 09:19:18 +03:00
Mathieu Duponchelle
170e769812 audio: add speechmatics transcriber
Element implemented around the Speechmatics API:

<https://docs.speechmatics.com/rt-api-ref>

The element also comes with translation support, and offers a similar
interface to the one exposed by `awstranscriber`.

The Speechmatics service has good accuracy, and can be deployed on
premises, offering an advantage over AWS transcribe.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1665>
2024-08-21 17:43:02 +00:00