Commit graph

102 commits

Author SHA1 Message Date
Arun Raghavan
724c6d6e32 rusoto: s3sink: Make remaining requests bounded in time
This implements a default timeout and retry duration for the remaining
S3 requests that were still able to be blocked indefinitely. There are 3
classes of operations: multipart upload creation/abort (should not take
too long), uploads (duration depends on part size), multipart upload
completion (can take several minutes according to documentation).

We currently only expose the part upload times as configurable, and hard
code the rest. If it seems sensible, we can expose the other two sets of
parameters as well.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/690>
2022-03-21 13:50:07 +05:30
Arun Raghavan
7b8d3acf10 s3src: Consolidate stream reading into get object retries
Previously, the actual reading from the streaming body of a GetObject
request was not within the same timeout/retry path as the dispatch of
the HTTP request itself. We consolidate these two into a single async
block and create a sum type to encapsulate the rusoto and std library
error paths within that future.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/690>
2022-03-21 13:50:07 +05:30
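The sum type mentioned in 7b8d3acf10 could look roughly like the sketch below. It is synchronous for brevity, whereas the commit describes an async block. `ServiceError`, `GetObjectError` and `get_object` are hypothetical names for illustration, not the plugin's actual types.

```rust
use std::fmt;
use std::io::{self, Read};

/// Hypothetical stand-in for a rusoto service error.
#[derive(Debug)]
pub struct ServiceError(pub String);

/// One error type covering both failure paths of a GetObject attempt.
#[derive(Debug)]
pub enum GetObjectError {
    /// Dispatching the HTTP request failed.
    Request(ServiceError),
    /// Reading the streaming body failed part-way through.
    BodyRead(io::Error),
}

impl From<ServiceError> for GetObjectError {
    fn from(err: ServiceError) -> Self {
        GetObjectError::Request(err)
    }
}

impl From<io::Error> for GetObjectError {
    fn from(err: io::Error) -> Self {
        GetObjectError::BodyRead(err)
    }
}

impl fmt::Display for GetObjectError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            GetObjectError::Request(e) => write!(f, "request failed: {}", e.0),
            GetObjectError::BodyRead(e) => write!(f, "body read failed: {}", e),
        }
    }
}

/// Dispatch and body read happen in one fallible step, so a surrounding
/// timeout/retry wrapper sees failures from either stage.
pub fn get_object<B: Read>(
    dispatch: impl FnOnce() -> Result<B, ServiceError>,
) -> Result<Vec<u8>, GetObjectError> {
    let mut body = dispatch()?; // ServiceError -> GetObjectError::Request
    let mut data = Vec::new();
    body.read_to_end(&mut data)?; // io::Error -> GetObjectError::BodyRead
    Ok(data)
}
```

With both conversions in place, the `?` operator funnels either failure into the same error type, which is what lets a single retry path handle them.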
Arun Raghavan
1ad277a410 rusoto: s3src: Implement timeout and retries
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/690>
2022-03-21 13:50:07 +05:30
Arun Raghavan
930f51edbc rusoto: s3sink, s3src: Retry on server errors
We can retry on 500/503 and other errors that might be recoverable,
instead of bailing.
2022-03-21 13:50:07 +05:30
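As a rough illustration of 930f51edbc, a retry decision could be keyed on the HTTP status. Treating the whole 5xx range as retryable is an assumption of this sketch. The commit names only 500, 503 and "other errors", and `is_retryable_status` is a hypothetical helper.

```rust
/// Returns true for HTTP statuses this sketch treats as worth retrying.
/// Assumption: any 5xx server error is considered potentially recoverable,
/// while client errors (4xx) are not.
pub fn is_retryable_status(status: u16) -> bool {
    (500..=599).contains(&status)
}
```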
Ray Tiley
cab33768e2 awstranscribe - increase presigned url duration to 5 mins from 60s
Have seen a few times where requests fail with `RequestExpired` errors even on machines that are in perfect time sync with a good source.

https://docs.aws.amazon.com/transcribe/latest/dg/CommonErrors.html

While not perfect, bumping to five minutes gives more of a chance that the signed requests to start streaming won't have expired.
2022-03-21 13:50:07 +05:30
Sebastian Dröge
b9887e1057 Update versions to 0.8.3 2022-03-08 19:49:16 +02:00
Sebastian Dröge
1c2d4d4350 rusoto: Update async-tungstenite dependency to 0.17 2022-03-08 19:23:26 +02:00
Arun Raghavan
249b0ac4c1 rusoto: s3sink: Implement timeout/retry for part uploads
Rusoto does not implement timeouts or retries for any of its HTTP
requests. This is particularly problematic for the part upload stage of
multipart uploads, as a blip in the network could cause part uploads to
freeze for a long duration and eventually bail.

To avoid this, for part uploads, we add (a) (configurable) timeouts for
each request, and (b) retries with exponential backoff, up to a
configurable duration.

It is not clear if/how we want to do this for other types of requests.
The creation of a multipart upload should be relatively quick, but the
completion of an upload might take several minutes, so there is no
one-size-fits-all configuration, necessarily.

It would likely make more sense to implement some sensible hard-coded
defaults for these other sorts of requests.
2022-03-08 19:23:26 +02:00
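The retry strategy described in 249b0ac4c1 (exponential backoff, bounded by a total retry duration) can be sketched as follows. This is a blocking, std-only illustration, not the plugin's async code. `AttemptError` and `retry_with_backoff` are hypothetical, and the per-request timeout is assumed to be enforced inside the `attempt` closure, surfacing as a `Transient` error.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Outcome of a failed attempt (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
pub enum AttemptError {
    /// Worth retrying: timeout, dropped connection, server error.
    Transient(String),
    /// Not worth retrying: e.g. access denied.
    Fatal(String),
}

/// Calls `attempt` until it succeeds, fails fatally, or the next backoff
/// sleep would exceed `retry_duration`. The delay doubles after each
/// transient failure.
pub fn retry_with_backoff<T, F>(
    mut attempt: F,
    initial_delay: Duration,
    retry_duration: Duration,
) -> Result<T, AttemptError>
where
    F: FnMut() -> Result<T, AttemptError>,
{
    let started = Instant::now();
    let mut delay = initial_delay;
    loop {
        match attempt() {
            Ok(value) => return Ok(value),
            Err(AttemptError::Fatal(msg)) => return Err(AttemptError::Fatal(msg)),
            Err(AttemptError::Transient(msg)) => {
                // Give up once the next sleep would exceed the overall budget.
                if started.elapsed() + delay > retry_duration {
                    return Err(AttemptError::Transient(msg));
                }
                sleep(delay);
                delay = delay.saturating_mul(2);
            }
        }
    }
}
```

Bounding retries by elapsed time rather than by attempt count matches the commit's wording ("up to a configurable duration"); the initial delay here is a free parameter of the sketch.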
Sebastian Dröge
4ef0fcd22e Update versions to 0.8.2 2022-02-21 12:51:40 +02:00
fb1cbe1a4c rusoto: Export AwsTranscriberResultStability enum 2022-02-20 20:46:32 +02:00
Sebastian Dröge
f0add79b7d Update versions to 0.8.1 2022-02-04 18:46:12 +02:00
Sebastian Dröge
81a571bd8b Replace Foo::from_instance(foo) with foo.imp() 2022-01-18 15:48:28 +02:00
Sebastian Dröge
00172c0485 rusoto: Add missing license file 2022-01-16 13:53:04 +02:00
Sebastian Dröge
818a508ac5 Re-license LGPL-2.1 plugins to MPL-2
Fixes https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/issues/168
2022-01-15 21:44:39 +02:00
Sebastian Dröge
8ffb6c1584 Switch to 0.15 branches of gtk-rs, and 0.18 of gstreamer-rs and provide a version 2022-01-15 20:31:40 +02:00
Sebastian Dröge
ab14c50d1c Ignore clippy::non_send_fields_in_send_ty lint
It's useless in its current shape and wrongly triggering on all types.

See https://github.com/rust-lang/rust-clippy/issues/8045
2022-01-14 12:09:57 +02:00
Sebastian Dröge
81f5f0f60c Fix various clippy warnings 2022-01-12 19:51:08 +02:00
Sanchayan Maity
099a3f2114 rusoto: s3sink: Support aborting or completing multipart upload on error
A multipart upload should either be completed or aborted on error. In
the current state of things, a multipart upload would neither be
completed nor aborted, putting the onus on an external entity to take
care of finishing incomplete uploads or relying on a sane bucket
life cycle policy configured to abort incomplete multipart uploads.

An incomplete multipart upload still contributes to the storage costs as
long as it exists.

We introduce a property here to allow the user to select either aborting
or completing multipart uploads on error. Aborting the upload causes
all of the data to be discarded, and the same upload ID can no longer
be used for uploading more parts.

Completing an incomplete multipart upload can be useful in situations
like having a streamable MP4 where one might want to complete the upload
and have part of the data which was uploaded be preserved.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/618>
2021-12-07 18:29:52 +05:30
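The choice described in 099a3f2114 can be pictured as a small policy enum mapped to a cleanup action. `OnError`, `Cleanup` and `cleanup_for` are hypothetical names for this sketch. Falling back to an abort when no part has been uploaded is also an assumption of the sketch, not something the commit states.

```rust
/// What the user asked for when an error occurs (hypothetical enum).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum OnError {
    Abort,
    Complete,
}

/// What to do with the in-flight multipart upload (hypothetical enum).
#[derive(Debug, PartialEq)]
pub enum Cleanup {
    /// Discard all uploaded parts; the upload ID becomes unusable.
    AbortUpload,
    /// Keep the parts uploaded so far as the final object.
    CompleteUpload,
}

/// Maps the policy to an action. Assumption: completing only makes sense
/// when at least one part exists, so the sketch aborts otherwise.
pub fn cleanup_for(policy: OnError, uploaded_parts: usize) -> Cleanup {
    match policy {
        OnError::Complete if uploaded_parts > 0 => Cleanup::CompleteUpload,
        _ => Cleanup::AbortUpload,
    }
}
```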
Sebastian Dröge
c46901d150 Fix or silence various new 1.57 clippy warnings 2021-11-30 16:31:50 +02:00
Mathieu Duponchelle
97e6a89cac aws_transcriber: sanity check alternative length
The design of the element is based on the assumption that when
receiving a partial result, the following result will contain
at least as many items as there were stable items in the previous
result.

This patch adds a sanity check to make sure our "partial index"
isn't larger than the new received result, and errors out otherwise.

partial_index will eventually be reset to 0 once we receive a
new non-partial result.
2021-11-24 13:10:00 +00:00
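A minimal sketch of the sanity check from 97e6a89cac: before slicing a new result at the partial index, verify that the index does not exceed the result length. `pending_items` is a hypothetical helper; the element's real code works on AWS result items rather than a generic slice.

```rust
/// Returns the items not yet handled, or an error if the new result holds
/// fewer items than were already consumed from the previous one.
pub fn pending_items<T>(items: &[T], partial_index: usize) -> Result<&[T], String> {
    if partial_index > items.len() {
        return Err(format!(
            "partial index {} is larger than result length {}",
            partial_index,
            items.len()
        ));
    }
    Ok(&items[partial_index..])
}
```

Without the check, the slice expression would panic on an out-of-range index instead of producing an error the element can report.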
Guillaume Desmottes
0b348406ef s3sink: add metadata property
This property can be used to set metadata on the S3 storage object.
2021-11-22 17:03:24 +01:00
Guillaume Desmottes
11bef9066c s3sink: log when setting properties 2021-11-22 16:52:04 +01:00
Sebastian Dröge
86f422592b Update for glib::Enum / glib::Boxed / glib::flags! macro renames 2021-11-22 11:04:26 +02:00
Sebastian Dröge
55aad51141 Update for glib constructor renames
See https://github.com/gtk-rs/gtk-rs-core/pull/384
2021-11-20 14:31:06 +02:00
Sebastian Dröge
f817f6e9b9 Update to rav1e 0.5 and async-tungstenite 0.16
Also add an asm feature to rav1e, which requires nasm to be in place.
2021-11-17 10:10:00 +02:00
Sebastian Dröge
d9bda62a47 Update for GLib/GStreamer API changes
And clean up a lot of related property/caps/structure code.
2021-11-06 09:34:10 +02:00
Sebastian Dröge
0a7d1639e7 Update to Rust edition 2021 and minimum supported Rust version to 1.56 2021-10-31 17:40:05 +02:00
Sebastian Dröge
b9541b2ca4 Update for GstObjectImpl API change 2021-10-23 12:31:33 +03:00
François Laignel
27b9f0d868 Improve usability thanks to opt-ops
The crate option-operations simplifies usage when dealing with
`Option`s, which is often the case with `ClockTime`.
2021-10-18 15:09:47 +02:00
Sebastian Dröge
69bb09f7ad rusoto/s3: Allow passing custom AWS-compatible regions
For the region property this would be provided as
    `region-name+https://region.end/point`
while for the URI this unfortunately has to be base32 encoded to allow
usage as the host part of the URI.
2021-09-28 06:23:07 +00:00
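The region property format from 69bb09f7ad (`region-name+https://region.end/point`) can be split on the first `+`. The sketch below covers only the property form, not the base32-encoded URI form. `RegionSpec` and `parse_region` are hypothetical names.

```rust
/// Region as parsed from the property string (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum RegionSpec {
    /// A plain region name, e.g. a standard AWS region.
    Named(String),
    /// A custom AWS-compatible region with an explicit endpoint.
    Custom { name: String, endpoint: String },
}

/// Splits `name+endpoint` on the first '+'; a value without '+' is
/// treated as a plain region name.
pub fn parse_region(value: &str) -> RegionSpec {
    match value.split_once('+') {
        Some((name, endpoint)) => RegionSpec::Custom {
            name: name.to_string(),
            endpoint: endpoint.to_string(),
        },
        None => RegionSpec::Named(value.to_string()),
    }
}
```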
Sebastian Dröge
502b336361 rusoto: Implement auth via explicit access-key/secret-access-key properties
This allows passing them explicitly as strings to the elements instead
of relying on system/per-user configuration.
2021-09-27 17:00:36 +03:00
Sebastian Dröge
f4613bfc07 Use Buffer::from_mut_slice() in more places
This allows downstream to map the memory mutable.
2021-09-18 11:58:59 +03:00
Sebastian Dröge
ea394fb06e rusoto: Update to async-tungstenite 0.15 2021-09-11 08:44:32 +03:00
Sebastian Dröge
96d86eaa06 Clean up clippy warnings and CI configuration
Put clippy overrides into the source files instead of the CI
configuration, and fix various warnings / clean up code.
2021-09-08 12:35:41 +00:00
Mathieu Duponchelle
626df03961 aws_transcriber: fix CRC check
This was broken when porting to crc 2, based on:

https://github.com/mrhooray/crc-rs/issues/62#issuecomment-850591181

> CRC_32_BZIP2 is a different algorithm from CRC_32_IEEE, try CRC_32_ISO_HDLC instead.

The correct algorithm for replacing checksum_ieee is not CRC_32_BZIP2.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/555>
2021-09-03 23:37:14 +02:00
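To see why the algorithm choice in 626df03961 matters, the two CRC-32 variants can be written out bit by bit. This is a std-only illustration rather than the `crc` crate's API. Both use the same polynomial, but CRC-32/ISO-HDLC (the classic "IEEE" CRC-32) processes bits in reflected order while CRC-32/BZIP2 does not, so they yield different checksums.

```rust
/// CRC-32/ISO-HDLC: reflected input/output, polynomial 0x04C11DB7
/// (0xEDB88320 in reflected form), init and final XOR 0xFFFFFFFF.
pub fn crc32_iso_hdlc(data: &[u8]) -> u32 {
    let mut crc = 0xFFFF_FFFFu32;
    for &byte in data {
        crc ^= byte as u32;
        for _ in 0..8 {
            crc = if crc & 1 != 0 {
                (crc >> 1) ^ 0xEDB8_8320
            } else {
                crc >> 1
            };
        }
    }
    !crc
}

/// CRC-32/BZIP2: same polynomial, init and final XOR, but not reflected.
pub fn crc32_bzip2(data: &[u8]) -> u32 {
    let mut crc = 0xFFFF_FFFFu32;
    for &byte in data {
        crc ^= (byte as u32) << 24;
        for _ in 0..8 {
            crc = if crc & 0x8000_0000 != 0 {
                (crc << 1) ^ 0x04C1_1DB7
            } else {
                crc << 1
            };
        }
    }
    !crc
}
```

For the standard check input `b"123456789"`, the ISO-HDLC variant gives `0xCBF43926` and the BZIP2 variant gives `0xFC891918`, so a message checksummed with one will fail validation with the other.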
Mathieu Duponchelle
1a4e6d58f4 net/rusoto: implement parser for AWS transcription file
AWS can generate JSON files containing a full transcript, implement
a simple push parser to support the format.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/547>
2021-08-27 19:53:57 +00:00
Sebastian Dröge
4a870af19c Update various dependencies 2021-08-26 09:44:43 +03:00
Sebastian Dröge
848b296390 Add capi feature to all plugin crates
This fixes the build with cargo-c 0.9.2.
2021-08-11 20:51:36 +03:00
Sebastian Dröge
052365ba1a Fix various needless-borrow clippy warnings and others 2021-07-30 13:53:35 +03:00
Mathieu Duponchelle
a051127cb1 aws_transcriber: expose lateness property
The default behavior for the transcriber is to output text buffers
synchronized with the input stream, introducing a configurable
latency.

For use cases where synchronization is not crucial, but latency
is, the lateness property can be used instead of or in combination
with the latency property, in order to introduce a configurable
offset with the input stream.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/534>
2021-07-28 00:48:14 +02:00
Ruben Gonzalez
54d8c5f6a9 Delete minimum GStreamer required version for some plugins
Tested building the plugins with cargo-c and running gst-inspect-1.0
on Ubuntu Xenial 16.04 LTS, which contains GStreamer 1.8.3.
2021-07-20 21:49:24 +02:00
Sebastian Dröge
24ec79cd1a Update versions to 0.8.0 for the master branch 2021-07-09 13:49:33 +03:00
Sebastian Dröge
1c3ae0f89a Update versions to 0.7.0 2021-07-09 13:49:21 +03:00
Mathieu Duponchelle
9415c50200 awstranscriber: further decouple output from input
As awstranscriber might in theory push out gap events without
any flow of input data, it needs to send its mandatory events
(stream-start, caps, segment) independently.

In addition, track a start time and use it to offset the 0-based
timestamps returned by AWS in order to output buffers timestamped
in the running-time domain, and perform item timing adjustment
only when dequeuing, instead of when queuing.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/525>
2021-06-26 00:46:28 +02:00
Arun Raghavan
3a2d16f00c rusoto: s3sink: Bring back bucket, key and region properties
We don't want to drop these entirely while introducing the URI handler,
as that would break backwards compatibility.
2021-06-21 06:58:06 -04:00
Mathieu Duponchelle
640ce43fee awstranscriber: make use of new result stability AWS API option
<https://aws.amazon.com/blogs/machine-learning/amazon-transcribe-now-supports-partial-results-stabilization-for-streaming-audio/>

Amazon seem to have realized the previous iteration of their API
made it difficult to identify items from one result to the next,
which made the element much more complicated than it should have
been. With that new "stability" option, we can enqueue items as
soon as they stabilize, and simply rely on the current index in
the transcript to output them exactly once.

This also means the "use_partial_results" property is now useless, as
there will be no difference in accuracy between a non-partial result
and any of its stable items that might have been pushed from previous
partial versions of the result.

The property is removed, instead a new option is exposed to let
users control how fast results should stabilize.

This greatly simplifies the code, and also improves the output as
punctuation doesn't need to be randomly discarded anymore.
2021-06-19 14:45:22 +02:00
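The "output each item exactly once" idea from 640ce43fee can be sketched with an index into the current transcript. `Item`, `Tracker` and `push_result` are hypothetical names. Flushing every remaining item when a non-partial result arrives, and resetting the index afterwards, are assumptions of this sketch.

```rust
/// One transcript item with its stability flag (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
pub struct Item {
    pub content: String,
    pub stable: bool,
}

/// Remembers how many items of the current result were already output.
#[derive(Debug, Default)]
pub struct Tracker {
    partial_index: usize,
}

impl Tracker {
    /// Returns the items that became ready since the previous call.
    pub fn push_result(&mut self, items: &[Item], is_partial: bool) -> Vec<String> {
        let mut out = Vec::new();
        for item in items.iter().skip(self.partial_index) {
            // In a partial result, stop at the first item that may still change.
            if is_partial && !item.stable {
                break;
            }
            out.push(item.content.clone());
            self.partial_index += 1;
        }
        // A final result ends the utterance; the next one starts from index 0.
        if !is_partial {
            self.partial_index = 0;
        }
        out
    }
}
```

Because stable items keep their position from one partial result to the next, the index alone is enough to avoid duplicates, with no need to match items by content.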
Mathieu Duponchelle
d15e97efb8 awstranscriber: expose optional session-id property
When set, it can be used to identify transcription sessions
a posteriori.
2021-06-17 00:54:14 +02:00
François Laignel
5439f14e57 fix clippy warnings 2021-06-05 10:36:22 +02:00
François Laignel
2c4c35deba net: migrate to new ClockTime design 2021-06-05 10:36:21 +02:00
François Laignel
8dfc872544 use gst::glib where applicable 2021-06-03 20:53:16 +02:00