92 commits

Author SHA1 Message Date
Mathieu Duponchelle
6346d5608e net/aws/transcriber: track discont offset in input stream
and add it up to subsequent transcripts.

This ensures synchronization is maintained even after the input stream
experiences a discontinuity and a gap in its timestamps.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1230>
2023-06-02 08:55:11 +00:00
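
The discontinuity handling described in 6346d5608e can be sketched roughly as follows. This is a simplified illustration, not the element's code: the `DiscontTracker` type and its fields are hypothetical, and it assumes the service reports transcript times relative to the amount of audio it was sent.

```rust
use std::time::Duration;

/// Hypothetical tracker (not the element's actual state struct).
#[derive(Debug, Default)]
struct DiscontTracker {
    /// Timestamp at which the next input buffer is expected to start.
    expected_next: Option<Duration>,
    /// Sum of all timestamp gaps observed in the input so far.
    discont_offset: Duration,
}

impl DiscontTracker {
    /// Record an input buffer; grow the offset if it starts later than expected.
    fn push_input(&mut self, pts: Duration, duration: Duration) {
        if let Some(expected) = self.expected_next {
            if pts > expected {
                self.discont_offset += pts - expected;
            }
        }
        self.expected_next = Some(pts + duration);
    }

    /// Map a service-relative transcript time back onto the input timeline.
    fn transcript_pts(&self, service_time: Duration) -> Duration {
        service_time + self.discont_offset
    }
}
```

With two contiguous one-second buffers followed by a buffer at 5 s, the offset becomes 3 s and is added to every later transcript time.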
Sebastian Dröge
a27be7d054 net: Update to AWS SDK 0.28
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1224>
2023-05-25 13:23:49 +03:00
François Laignel
7ba0073052 use Pad builders for optional name definition
Also, apply auto-naming in the following cases:

* When building from a non wildcard-named template, the name of the template is
  automatically assigned to the Pad. The user can override it with a specific
  name by calling `name()` on the `PadBuilder`.
* When building with a target and no name was provided via the above, the
  GhostPad is named after the target.

See https://gitlab.freedesktop.org/gstreamer/gstreamer-rs/-/issues/448
Auto-naming discussion: https://gitlab.freedesktop.org/gstreamer/gstreamer-rs/-/merge_requests/1255#note_1891181

Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1197>
2023-05-12 12:55:31 +02:00
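
The auto-naming rules listed in 7ba0073052 amount to a precedence order. The sketch below only models that order with plain strings. `resolve_pad_name` is a hypothetical helper, not part of the gstreamer-rs API, and treating any template name containing `%` as a wildcard is an assumption.

```rust
/// Hypothetical helper modelling the naming precedence:
/// explicit name, then non-wildcard template name, then target name.
fn resolve_pad_name(
    explicit_name: Option<&str>,
    template_name: &str,
    target_name: Option<&str>,
) -> Option<String> {
    if let Some(name) = explicit_name {
        // A name set through the builder always wins.
        return Some(name.to_string());
    }
    // Assumption: wildcard templates look like "src_%u" and contain '%'.
    if !template_name.contains('%') {
        return Some(template_name.to_string());
    }
    // GhostPad case: fall back to the name of the target pad, if any.
    // `None` means the name is left to be generated automatically.
    target_name.map(str::to_string)
}
```

For example, `resolve_pad_name(None, "src_%u", Some("src"))` yields `Some("src")`, while a template named `sink` yields `Some("sink")` when no explicit name is given.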
Sebastian Dröge
cb5b527d74 Update to AWS SDK 0.27 and async-tungstenite 0.22
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1199>
2023-05-02 15:30:00 +03:00
Sebastian Dröge
5451035215 Update async-tungstenite and AWS SDK dependencies
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1187>
2023-04-21 10:48:10 +00:00
Mathieu Duponchelle
f366c20869 awstranscriber: fix what we send over for translations
Prior to this commit, we were sending over words concatenated together
with no separators, for instance "Idon'twanttobeanemperor".

The translation service seems clever enough to translate the contents
anyway, but there is no reason to make its task harder than necessary,
and it didn't re-add separators when the target language was the same as
the source language, which resulted in less than ideal output.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1171>
2023-04-10 20:47:12 +00:00
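
The separator problem fixed in f366c20869 can be illustrated with a small sketch. The `Item` type and `join_for_translation` function are hypothetical names. The rule used here (a space before every word, none before punctuation) is an assumption about what reasonable input for the translation service looks like.

```rust
/// Hypothetical transcript item; the real element uses its own types.
struct Item<'a> {
    content: &'a str,
    is_punctuation: bool,
}

/// Join items into one string, separating words with spaces and
/// attaching punctuation directly to the preceding word.
fn join_for_translation(items: &[Item<'_>]) -> String {
    let mut out = String::new();
    for item in items {
        if !out.is_empty() && !item.is_punctuation {
            out.push(' ');
        }
        out.push_str(item.content);
    }
    out
}
```

Fed the items from the commit message's example, this produces "I don't want to be an emperor" rather than the concatenated form.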
Mathieu Duponchelle
408fd2030c awstranscriber: slight debug improvement
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1171>
2023-04-10 20:47:12 +00:00
Guillaume Desmottes
403004a85e fix typos
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1170>
2023-04-10 13:35:32 +02:00
Seungha Yang
762fb86ce7 awstranscriber: Reset start_time per task
Otherwise a wrong start time can be assigned if the element is
reused after a state change.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1159>
2023-04-05 18:22:59 +00:00
Tim-Philipp Müller
8845f6a4c6 git: replace LICENSE file symlinks with copies
Git will de-duplicate the contents for us anyway, and
symlinks can cause problems with some versions of git
and also on Windows.

https://github.com/mesonbuild/meson/issues/11646
https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/4326

Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1157>
2023-04-04 14:26:37 +01:00
Seungha Yang
4000d60305 awstranscriber: Avoid too large initial GAP event
Initialized GstSegment.position is always zero

Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1154>
2023-04-03 13:05:15 +00:00
Sebastian Dröge
6fe806c2b5 aws: Update to AWS SDK 0.55/0.25
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1152>
2023-03-31 09:12:26 +00:00
François Laignel
2b32d00589 net/aws/transcriber: use two queues for sending transcript items
* A queue dedicated to transcript items not intended for translation.
* A queue dedicated to transcript items intended for translation. The items are
  enqueued after a separator is detected or translate-lookahead was reached.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1137>
2023-03-16 20:29:31 +01:00
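
A rough sketch of the two-queue arrangement from 2b32d00589, under stated assumptions: `Item`, `ItemQueues` and their fields are hypothetical. Every item is pushed to both queues here for simplicity, and "translate-lookahead reached" is interpreted as the time span covered by the pending items.

```rust
use std::collections::VecDeque;
use std::time::Duration;

/// Hypothetical transcript item.
#[derive(Debug, Clone, PartialEq)]
struct Item {
    content: String,
    pts: Duration,
    is_separator: bool,
}

/// Hypothetical pair of queues: one for plain transcripts,
/// one holding chunks that are ready to be translated.
#[derive(Debug, Default)]
struct ItemQueues {
    transcript: VecDeque<Item>,
    pending: Vec<Item>,
    to_translate: VecDeque<Vec<Item>>,
}

impl ItemQueues {
    fn push(&mut self, item: Item, translate_lookahead: Duration) {
        // Items not intended for translation are available immediately.
        self.transcript.push_back(item.clone());
        self.pending.push(item);

        let first_pts = self.pending[0].pts;
        let last = &self.pending[self.pending.len() - 1];
        let lookahead_reached =
            last.pts.saturating_sub(first_pts) >= translate_lookahead;
        let flush = last.is_separator || lookahead_reached;

        // Hand a chunk to translation on a separator or when the
        // lookahead is reached, whichever comes first.
        if flush {
            self.to_translate.push_back(std::mem::take(&mut self.pending));
        }
    }
}
```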
François Laignel
5a5ca76d9d net/aws/transcriber: disambiguate SrcPad output items queue
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1137>
2023-03-16 12:41:07 +01:00
François Laignel
162db2f3b9 net/aws/transcriber: fix translate lookahead
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1137>
2023-03-16 12:39:15 +01:00
François Laignel
d5d6a4daf9 net/aws/transcriber: rename prop transcript-lookahead & TranslationSrcPad
... as translate-lookahead and TranslateSrcPad.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1137>
2023-03-16 12:37:31 +01:00
François Laignel
3b3f0c1a29 net/aws/transcriber: fix transcript-lookahead prop nick
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1136>
2023-03-14 21:11:33 +01:00
François Laignel
299e25ab3c net/aws/transcriber: translate: optional experimental translation tokenization
This commit adds an optional experimental translation tokenization feature.
It can be activated using the `translation_src_%u` pad property
`tokenization-method`. For the moment, the feature is deactivated by default.

The Translate ws accepts '<span></span>' tags in the input and adds matching
tags in the output. When an 'id' is also provided as an attribute of the
'span', the matching output tag also uses this 'id'.

In the context of closed captions, the 'id's are of little use. However, we can
take advantage of the spans in the output to identify translation chunks, which
more or less reflect the rhythm of the input transcript.

This commit adds simple spans (no 'id') to the input Transcript Items and
parses the resulting spans in the translated output, assigning the timestamps
and durations sequentially from the input Transcript Items. Edge cases such as
absent or nested spans were observed and are handled here. Similarly,
mismatches between the number of input and output items are taken care of by
some sort of reconciliation.

Note that this is still experimental and requires further testing.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1109>
2023-03-14 13:48:32 +00:00
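
The span-based tokenization described in 299e25ab3c could look roughly like the sketch below. It is a deliberately reduced illustration: `split_spans` and `assign_timestamps` are hypothetical names, nested spans are not handled, and the reconciliation rule (extra chunks reuse the last input timing) is an assumption, since the commit message does not spell out the actual strategy.

```rust
use std::time::Duration;

/// Extract the text of each `<span>...</span>` in the translated output.
/// If no span is found, the whole text becomes a single chunk.
fn split_spans(translated: &str) -> Vec<String> {
    const OPEN: &str = "<span>";
    const CLOSE: &str = "</span>";
    let mut chunks = Vec::new();
    let mut rest = translated;
    while let Some(start) = rest.find(OPEN) {
        let after = &rest[start + OPEN.len()..];
        match after.find(CLOSE) {
            Some(end) => {
                let chunk = after[..end].trim();
                if !chunk.is_empty() {
                    chunks.push(chunk.to_string());
                }
                rest = &after[end + CLOSE.len()..];
            }
            None => break, // unterminated span: stop parsing
        }
    }
    if chunks.is_empty() {
        let text = translated.trim();
        if !text.is_empty() {
            chunks.push(text.to_string());
        }
    }
    chunks
}

/// Assign (pts, duration) pairs from the input items to the chunks in order.
/// Assumed reconciliation: surplus chunks reuse the last available timing.
fn assign_timestamps(
    chunks: &[String],
    timings: &[(Duration, Duration)],
) -> Vec<(String, Duration, Duration)> {
    if timings.is_empty() {
        return Vec::new();
    }
    chunks
        .iter()
        .enumerate()
        .map(|(i, chunk)| {
            let (pts, duration) = timings[i.min(timings.len() - 1)];
            (chunk.clone(), pts, duration)
        })
        .collect()
}
```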
François Laignel
743e97738f net/aws/transcriber: add translation request src pads
This commit adds an optional transcript translation feature implemented as
request src Pads.

When requesting a src Pad, the user can specify the translation language code
using the Pad property 'language-code'.

The following properties are defined on the Element:

- 'transcribe-latency': formerly 'latency', defines the expected latency for
  the Transcribe webservice.
- 'translate-latency': defines the expected latency for the Translate
  webservice.
- 'transcript-lookahead': maximum transcript duration to send to translation
  when a transcript is hitting its deadline and no punctuation was found.

When the input and output languages are the same, only the 'transcribe-latency'
is used for the Pad. Otherwise, the resulting latency is the addition of
'transcribe-latency' and 'translate-latency'.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1109>
2023-03-14 13:48:32 +00:00
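
The per-pad latency rule stated in 743e97738f is simple enough to write down directly. `pad_latency` is a hypothetical function name, and comparing language codes as plain strings is an assumption.

```rust
use std::time::Duration;

/// Latency reported for a src pad: transcription only when no translation
/// is needed, otherwise transcription plus translation.
fn pad_latency(
    input_language: &str,
    pad_language: &str,
    transcribe_latency: Duration,
    translate_latency: Duration,
) -> Duration {
    if input_language == pad_language {
        transcribe_latency
    } else {
        transcribe_latency + translate_latency
    }
}
```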
Sebastian Dröge
4eccd30ce2 Revert "aws: Temporarily enable the default features of the test-with crate"
This reverts commit 42116b5bce.
2023-03-14 13:28:28 +02:00
Sebastian Dröge
42116b5bce aws: Temporarily enable the default features of the test-with crate
Version 0.9.4 fails compiling without them enabled.

See https://github.com/yanganto/test-with/pull/57
2023-03-14 09:19:26 +02:00
François Laignel
b9cd71d8eb net/aws/transcriber: fix eos not being sent
For eos to be sent from the srcpad task loop, we need to go through `dequeue`.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1122>
2023-03-09 13:07:03 +01:00
François Laignel
2ea9f147ab net/aws/transcriber: fix deadlock when the pipeline is interrupted
... also makes sure to abort the taks_iter Future.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1122>
2023-03-09 13:07:03 +01:00
Sebastian Dröge
3ef8a48ded Fix a few new clippy warnings
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1120>
2023-03-07 08:47:01 +00:00
François Laignel
4a988aaeb8 net/aws/transcriber: use a TranscriberLoop struct
This helps gather together the details related to the `TranscriberLoop`.
One difference with the previous implementation is that the ws `Client` is
built each time the loop is started instead of being reused. With the new
approach, we don't keep the connection open after EOS and we should be
more resistant in case of a connection failure.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1104>
2023-03-01 08:47:58 +00:00
François Laignel
f1a080c94e net/aws/transcriber: own transcription items
So that we can avoid copying the content.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1104>
2023-03-01 08:47:58 +00:00
François Laignel
36ae29d746 net/aws: enqueue transcribed buffers within the ws loop
Instead of sending transcription events to the src pad loop, this commit
enqueues the transcribed buffers immediately in the ws loop, then notifies
the src pad loop. The src pad loop is only in charge of dequeuing the buffers.

This should help with upcoming evolutions.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1104>
2023-03-01 08:47:58 +00:00
François Laignel
00153754bb net/aws: use aws-sdk-transcribestreaming
Switch from manual webservice client impl to `aws-sdk-transcribestreaming`.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1104>
2023-03-01 08:47:58 +00:00
François Laignel
57f365979c net/aws: remove aws_ from the aws_transcribe* folder names
Those folders reside under `aws`, so there shouldn't be any confusion.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1104>
2023-03-01 08:47:58 +00:00
Sebastian Dröge
9fc1404415 Update minimum supported Rust version to 1.66
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1096>
2023-02-20 11:09:01 +02:00
Sebastian Dröge
ac8afc4ac0 Update to async-tungstenite 0.20
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1087>
2023-02-10 13:03:07 +02:00
Sebastian Dröge
1e13dbb99c Update versions to 0.11.0-alpha.1
2023-02-10 00:23:56 +02:00
rajneeshksoni
994c79569e awss3sink: Add properties to set content-Type and content-disposition.
For uploaded objects, the default content-type is set to binary/octet-stream,
which is correct.
Metadata cannot be used to set content-type and content-disposition, because
setting metadata adds the prefix x-amz-meta to the key:
e.g. setting metadata "content-type=video/mp4" actually sets the value as
x-amz-meta-content-type. So these have to be separate properties.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1085>
2023-02-09 19:04:07 +00:00
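
To make the reasoning in 994c79569e concrete, the sketch below builds the header list that would accompany an upload. `build_headers` is a hypothetical helper and does not use the AWS SDK. It only shows why user metadata cannot stand in for the dedicated properties.

```rust
use std::collections::BTreeMap;

/// Build HTTP-style headers for an upload. User metadata keys always get
/// the `x-amz-meta-` prefix, so they can never set `Content-Type` itself.
fn build_headers(
    metadata: &BTreeMap<String, String>,
    content_type: Option<&str>,
    content_disposition: Option<&str>,
) -> Vec<(String, String)> {
    let mut headers = vec![(
        "Content-Type".to_string(),
        // Default used when no content type is provided.
        content_type.unwrap_or("binary/octet-stream").to_string(),
    )];
    if let Some(disposition) = content_disposition {
        headers.push(("Content-Disposition".to_string(), disposition.to_string()));
    }
    for (key, value) in metadata {
        headers.push((format!("x-amz-meta-{}", key), value.clone()));
    }
    headers
}
```

Passing `content-type=video/mp4` as metadata only produces an `x-amz-meta-content-type` entry, and the real `Content-Type` stays at its default unless the dedicated argument is set.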
Sanchayan Maity
6006a0ba36 aws/s3hlssink: Fix deadlock on EOS
In the state change to NULL, we take the state lock and call stop. When stop
is called, we try to upload the queued segments in the S3 request thread.
That thread tries to take the state lock again and deadlocks.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1076>
2023-02-03 19:09:18 +05:30
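
The message of 6006a0ba36 describes the deadlock but not the fix. One common way to avoid this pattern, shown here purely as an illustration with hypothetical `State` and `stop` names, is to move the queued work out of the state and release the lock before waiting on the thread that needs it.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical element state.
#[derive(Debug, Default)]
struct State {
    queued: Vec<String>,
    uploaded: Vec<String>,
}

/// Stop without holding the state lock while the upload thread runs.
fn stop(state: &Arc<Mutex<State>>) {
    // Take the queued segments; the guard is dropped at the end of this block.
    let queued = {
        let mut guard = state.lock().unwrap();
        std::mem::take(&mut guard.queued)
    };

    // The worker can now take the state lock itself without deadlocking.
    let worker_state = Arc::clone(state);
    let worker = thread::spawn(move || {
        for segment in queued {
            worker_state.lock().unwrap().uploaded.push(segment);
        }
    });
    worker.join().unwrap();
}
```

Had `stop` kept its guard alive across `worker.join()`, the worker's `lock()` call would block forever, which is the situation the commit message describes.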
Sanchayan Maity
41aa1e51da aws/s3hlssink: Use factory name when checking name of child element
Commit ad3f1cf fixed the name of hlssink child element to be the same
for hlssink2 and hlssink3. However, we rely on element name to return
boolean in case of hlssink3 or None in case of hlssink2 as the return
value of the delete-fragment closure.

Fix this by using the factory name instead of the element name.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1076>
2023-02-03 19:08:40 +05:30
Sebastian Dröge
a1cce9b796 aws: Update to AWS SDK 0.54/0.24
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1066>
2023-01-27 22:10:23 +02:00
Sebastian Dröge
3b4c48d9f5 Fix various new clippy warnings
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1062>
2023-01-25 10:31:19 +02:00
Arun Raghavan
ad3f1cf534 aws: s3hlssink: Fix the name of the hlssink child element
It's easier to set child element properties if the name doesn't depend
on the factory.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1061>
2023-01-24 18:56:46 +00:00
Sebastian Dröge
4582ae91ab Move remaining plugins to ParamSpec builders
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1054>
2023-01-21 18:34:55 +02:00
Sebastian Dröge
458b2386ed Update for glib API changes
2023-01-21 18:13:48 +02:00
Sebastian Dröge
0c954135a3 aws: Update to AWS SDK 0.53/0.23
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1047>
2023-01-14 18:58:30 +02:00
Sebastian Dröge
781fd1df9a aws: Update to test-with 0.9
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1035>
2023-01-05 12:35:42 +02:00
rajneeshksoni
d846f527af awss3hlssink: Add stats property.
Applications can monitor the progress of HLS segment generation
and upload.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1022>
2023-01-04 12:36:13 +00:00
Sebastian Dröge
4e444a066c aws: Update to AWS SDK 0.52/0.22
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1020>
2022-12-18 07:54:30 +00:00
Sebastian Dröge
3f904553ea Fix various new clippy warnings
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1011>
2022-12-13 11:43:16 +02:00
Sebastian Dröge
fb42cd8a0f net: Update to async-tungstenite 0.19
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/1005>
2022-12-11 12:54:24 +02:00
Sebastian Dröge
0e2a00cbc8 aws: Update to env_logger 0.10 for the tests
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/984>
2022-11-25 11:08:19 +02:00
Arun Raghavan
3abd13e57b aws: s3sink: Treat stopping without EOS as an error for multipart upload
This allows us to try to clean up based on configuration (abort /
complete / do nothing) if the pipeline is shut down without an EOS.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/970>
2022-11-15 02:28:35 +00:00
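
The behaviour described in 3abd13e57b can be summarised as a small decision function. The enum and function names below are hypothetical and do not mirror the element's real property types. The sketch assumes the upload has already been completed normally when EOS was received.

```rust
/// Hypothetical configuration for what to do with a multipart upload on error.
#[derive(Debug, Clone, Copy, PartialEq)]
enum OnError {
    Abort,
    Complete,
    DoNothing,
}

/// Hypothetical action taken when the sink is stopped.
#[derive(Debug, PartialEq)]
enum StopAction {
    /// EOS was seen, so the upload was already completed normally.
    NothingLeftToDo,
    AbortUpload,
    CompleteUpload,
    LeaveUploadAsIs,
}

/// Stopping without EOS is treated like an error and follows the configuration.
fn action_on_stop(eos_received: bool, on_error: OnError) -> StopAction {
    if eos_received {
        return StopAction::NothingLeftToDo;
    }
    match on_error {
        OnError::Abort => StopAction::AbortUpload,
        OnError::Complete => StopAction::CompleteUpload,
        OnError::DoNothing => StopAction::LeaveUploadAsIs,
    }
}
```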
Arun Raghavan
54c84a7211 aws: Skip s3 test on Windows until we figure out why it times out
2022-11-02 13:14:08 -04:00
Sebastian Dröge
a8250abbf1 Fix various new clippy warnings
2022-11-01 10:27:48 +02:00