mirror of https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs.git synced 2024-11-23 12:01:01 +00:00

gst-plugin-aws

This is a GStreamer plugin to interact with Amazon Web Services. We currently have elements to interact with S3 and Transcribe.

AWS Credentials

AWS credentials are picked up using the mechanism described by the AWS SDK. At the moment, these are:

Environment variables: AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY
AWS credentials file. Usually located at ~/.aws/credentials.
IAM instance profile. Will only work if running on an EC2 instance with an instance profile/role.

An example credentials file might look like:

[default]
aws_access_key_id = ...
aws_secret_access_key = ...
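
To see which of the sources listed above is visible in a given shell, a small check along the following lines may help. This is only a sketch: `aws_credential_source` is a hypothetical helper, it does not call the AWS SDK, and checking the sources in the order they are listed is an assumption rather than documented SDK behaviour.

```shell
# Hypothetical helper: report which credential source from the list above
# is visible in the current environment. It only inspects the environment
# and the file system; it never contacts AWS.
aws_credential_source() {
    if [ -n "${AWS_ACCESS_KEY_ID:-}" ] && [ -n "${AWS_SECRET_ACCESS_KEY:-}" ]; then
        echo "environment"
    elif [ -f "${HOME:-}/.aws/credentials" ]; then
        echo "credentials-file"
    else
        # Nothing was found locally; on EC2 an instance profile may still apply.
        echo "instance-profile-or-none"
    fi
}

aws_credential_source
```

Running it before launching a pipeline gives a quick hint about why an element might fail to authenticate, for example when the variables were exported in a different shell.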

s3src

Reads from a given S3 (region, bucket, object, version?) tuple. The version may be omitted, in which case the default behaviour of fetching the latest version applies.

$ gst-launch-1.0 \
    s3src uri=s3://ap-south-1/my-bucket/my-object-key/which-can-have-slashes?version=my-optional-version ! \
    filesink location=my-object.out
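
The URI in the example packs the whole tuple into one string. The sketch below shows how such a URI could be split back into its parts, assuming the layout is always `s3://region/bucket/key?version=...` as in the example. `split_s3_uri` is a hypothetical helper written for illustration; it is not how the element itself parses the URI, and it does no validation or percent-decoding.

```shell
# Hypothetical helper: split an s3src-style URI into region, bucket, key and
# optional version using only POSIX parameter expansion.
# Note: plain sh has no local variables, so these names are set globally.
split_s3_uri() {
    uri=$1
    rest=${uri#s3://}                      # drop the scheme
    version=""
    case $rest in
        *\?version=*)
            version=${rest##*\?version=}   # text after the last "?version="
            rest=${rest%\?version=*}       # text before it
            ;;
    esac
    region=${rest%%/*}                     # first path component
    rest=${rest#*/}
    bucket=${rest%%/*}                     # second path component
    key=${rest#*/}                         # everything else, slashes included
    printf 'region=%s\nbucket=%s\nkey=%s\nversion=%s\n' \
        "$region" "$bucket" "$key" "$version"
}

split_s3_uri 's3://ap-south-1/my-bucket/my-object-key/which-can-have-slashes?version=my-optional-version'
```

The object key is taken as everything after the bucket, which is why it can contain slashes, and the version comes out empty when the query part is omitted.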

s3sink

Writes data to a specified S3 (region, bucket, object, version?) tuple. The version may be omitted.

$ gst-launch-1.0 \
    videotestsrc ! \
    theoraenc ! \
    oggmux ! \
    s3sink uri=s3://us-west-1/example-bucket/my/file.ogv?version=my-optional-version

s3hlssink

Writes a single-variant HLS stream directly to a specified S3 (region, bucket, path prefix) tuple. Takes the encoded audio and video streams as input, and uses hlssink3 if available, else hlssink2. HLS stream parameters such as playlist length and segment duration can be tweaked by accessing the underlying sink through the hlssink property.

awstranscriber

Transcribes audio to text.