The original awstranscribe element grew too complex once translations
were integrated, for reasons that in retrospect were wrong. As
awstranscribe outputs words one by one, I decided to perform
translations there, on larger sentences when available. An alternative
design, where a separate translation element is composed downstream, is
also possible, as long as that element accumulates words and enough
latency is set on the transcriber.
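The word accumulation mentioned above can be sketched as follows. This is
only an illustration, not the code of the new elements: the
`SentenceAccumulator` type, its methods and the choice of sentence
terminators are all hypothetical, and a real element would also flush on
a latency deadline.

```rust
/// Hypothetical helper: queues single words and releases full sentences.
#[derive(Default)]
struct SentenceAccumulator {
    words: Vec<String>,
}

impl SentenceAccumulator {
    /// Queue one word; return the sentence once a terminator is seen.
    fn push(&mut self, word: &str) -> Option<String> {
        self.words.push(word.to_string());
        // Assumption: a sentence ends on '.', '?' or '!'.
        if word.ends_with(&['.', '?', '!'][..]) {
            Some(self.drain())
        } else {
            None
        }
    }

    /// Release whatever is queued, e.g. when the latency budget runs out.
    fn flush(&mut self) -> Option<String> {
        if self.words.is_empty() {
            None
        } else {
            Some(self.drain())
        }
    }

    /// Join the queued words with spaces and empty the queue.
    fn drain(&mut self) -> String {
        let sentence = self.words.join(" ");
        self.words.clear();
        sentence
    }
}
```

The more latency is configured on the transcriber, the more often `push`
gets to complete a sentence before `flush` has to cut it short.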
An important difference is that the new elements do not expose unsynced
pads; this use case is now served by simple messages on the bus instead.
The elements should otherwise be at feature parity with the original
element.
A higher-level bin is also provided for convenience (and usage within
transcriberbin): translationbin.
A transcriber element can be provided to this bin, which exposes an
always audio sink pad and an always text source pad (for the
transcripts).
Additional source pads can be requested for translations. For now the
bin always uses `awstranslate` as the translator, but this can be made
configurable.
This element is usable as a transcriber in `transcriberbin`.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/2055>
This adds support for direct encoding of common formats into ISO base media file
format.
There are unit tests for formats that are not completely supported, to
check that those functions work correctly, and to ease future extension.
End-to-end testing currently requires use of gpac to validate files.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1990>
Brings support for multiple streams of each kind to fallbacksrc.
Usage past 1video/1audio stream now requires using the stream selection
API.
fallbacksrc will expose its own collection of streams, which will be
mapped to streams from the main and fallback source automatically.
This mapping can be changed via the map-streams signal.
The number of streams exposed by fallbacksrc is dictated by the
main source.
CustomSource has been updated to also support multi-stream scenarios,
both for stream-aware elements and for simple bins without such
functionality.
Co-authored-by: Sebastian Dröge <sebastian@centricular.com>
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1832>
By default the transcriber attempts to join punctuation with the
preceding word. Expose a property to control that.
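As a rough illustration of what joining punctuation means here, the
sketch below appends punctuation-only items to the previous word. The
`join_punctuation` function is hypothetical and works on plain strings;
the element itself operates on timed transcript items.

```rust
/// Hypothetical helper: append punctuation-only items to the previous word.
fn join_punctuation(items: &[&str]) -> Vec<String> {
    let mut out: Vec<String> = Vec::new();
    for &item in items {
        // Assumption: an item made only of ASCII punctuation is punctuation.
        let is_punct =
            !item.is_empty() && item.chars().all(|c| c.is_ascii_punctuation());
        if is_punct {
            if let Some(prev) = out.last_mut() {
                prev.push_str(item);
                continue;
            }
        }
        // Regular word, or punctuation with nothing before it to join to.
        out.push(item.to_string());
    }
    out
}
```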
As Speechmatics sometimes outputs the punctuation for a sentence in the
next transcript, it will sometimes arrive too late for joining. To
work around this behavior, a lower max-delay is used by default. That
may not always be desirable, especially if low latency is a concern.
Expose a property to disable the hack.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1909>
As the application expects to have the bin buffer the audio stream
internally and output it again unchanged, and transcribers might
expect a set number of channels, we need to expose a property to
let the user control how to downmix the audio stream teed through
the transcriber.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1969>
With standard voices, AWS Polly supports passing a max-duration
attribute.
When the element gets raw text passed in, it can wrap it as SSML and set
the max-duration attribute, to make sure synthesized speech does not
overlap.
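A minimal sketch of such a wrapper, assuming the `amazon:max-duration`
attribute of the SSML `prosody` tag and a value expressed in
milliseconds (worth checking against the Polly SSML documentation). The
`escape_xml` and `wrap_as_ssml` names are hypothetical and not taken
from the element.

```rust
use std::time::Duration;

/// Escape the characters that are reserved in XML text content.
fn escape_xml(text: &str) -> String {
    let mut out = String::with_capacity(text.len());
    for c in text.chars() {
        match c {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            '"' => out.push_str("&quot;"),
            '\'' => out.push_str("&apos;"),
            _ => out.push(c),
        }
    }
    out
}

/// Wrap raw text as SSML, capping the duration of the synthesized speech.
fn wrap_as_ssml(text: &str, max_duration: Duration) -> String {
    format!(
        "<speak><prosody amazon:max-duration=\"{}ms\">{}</prosody></speak>",
        max_duration.as_millis(),
        escape_xml(text)
    )
}
```

Passing the duration of the input buffer as `max_duration` is one way to
keep consecutive utterances from overlapping.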
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1930>
This commit adds a new "synthesis-languages" property. Users can set it
to define a map of languages (typically translations) that should then
be routed through a "synthesis" bin, with its description specifiable
as the value of the map.
The output of this bin is then exposed as a new pad on the top-level
bin.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1930>
When the transcriber is used in a live situation, it can be useful
to save a transcript for editing after the fact when producing a
VOD.
Each source pad now gets an "unsynced_" counterpart. That unsynced pad
is pushed to from the context of the "live" source pad task. Flow
returns from the unsynced pads are ignored; we simply check the
last flow return before attempting to push the next transcript.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1915>
Similar to their non-threadshare counterparts, ts-udpsink can accept
only one multicast interface, and ts-udpsrc can accept a list of
interfaces to listen on for multicast.
Use the getifaddrs crate to get the available network interfaces and
filter the desired interfaces from them.
Reuse a custom API written for the PTP helper to join and leave
multicast groups for IPv4 addresses. Continue to use UdpSocket's
_multicast_v6 methods to join and leave an IPv6 multicast group.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1420>
We want to enable passthrough internally, and only notify that it has
been enabled once the transcriber has been unlinked.
This way applications connected to the notify handler can synchronously
update the properties and attempt to disable passthrough again.
Doing so properly requires a refactoring of the transition to the
passthrough state, with the currently set passthrough mode maintained
separately from the target passthrough state.
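The split between the current and the target state can be pictured with
the sketch below. It is a simplified model under assumed names
(`PassthroughState`, `request`, `commit`), not the structure used in
transcriberbin.

```rust
/// Hypothetical model: the requested mode is tracked apart from the
/// mode that is actually in effect.
#[derive(Debug, Default)]
struct PassthroughState {
    /// What the application asked for through the property.
    target: bool,
    /// What the pipeline is currently doing.
    current: bool,
}

impl PassthroughState {
    /// Record a new request without touching the effective state.
    fn request(&mut self, passthrough: bool) {
        self.target = passthrough;
    }

    /// Apply the request once relinking is done. Returns true when the
    /// effective state changed, i.e. when a notification should be emitted.
    fn commit(&mut self) -> bool {
        let changed = self.current != self.target;
        self.current = self.target;
        changed
    }
}
```

Because the notification is tied to `commit`, a handler that runs at that
point sees a consistent state and can call `request(false)` straight away.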
This commit also finishes the work left incomplete in
17d7997137 by moving the passthrough
property to the sink pad class, making each transcriber passthrough
state independent from the others.
Also adds an example to demonstrate the behavior.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1856>
The event-types property defaults to Eos. Setting an
array of additional, serialized event types results in
calling `producer.set_forward_events` with those types,
so that the events are forwarded to any consumers.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1875>