This new video filter is able to detect the dominant color in a video frame.
When the color has changed from the previous frame, the filter posts an Element
message on the bus. The associated structure is named `colordetect` and has two
fields:
* a string field named `dominant-color`
* a list field containing the whole color palette, stored as uint values, sorted
by dominance, with more dominant colors first
There can be a small race where the transcription-bin is linked with
the tee but the state change of the transcription-bin is not finished,
and at the same time upstream pushes an event/buffer to the
transcription-bin. Do the state change first, then link, to avoid
this condition.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/716>
Zero-padding is not specified for the indices but all time components
need to be zero-padded (3 digits for fractional seconds, 2 digits for
everything else).
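The padding rule can be sketched as follows. This is a minimal illustration, not the element's actual code: the function name `format_timestamp` and the `,` separator before the fractional seconds are assumptions.

```rust
/// Formats a timestamp given in milliseconds with zero-padded components:
/// 2 digits for hours, minutes and seconds, 3 digits for fractional seconds.
/// The `,` separator is an assumption made for this sketch.
fn format_timestamp(total_ms: u64) -> String {
    let ms = total_ms % 1_000;
    let s = (total_ms / 1_000) % 60;
    let m = (total_ms / 60_000) % 60;
    let h = total_ms / 3_600_000;
    format!("{:02}:{:02}:{:02},{:03}", h, m, s, ms)
}
```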
If transcription runs slow or has issues, the queue can fill up and block
all audio processing. This gives the queue a sufficient buffer and allows
it to drop audio if it eventually fills up. This was most noticeable with
bad internet connections using the `awstranscriber`, where it would take
quite a while for the websocket to eventually time out and the bin to
enter `passthrough=true`.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/688>
By using this new property, the application can select an exclusive caption
source. There are three source types:
- Both: Inband and transcription captions are combined if both exist.
This is the default behavior.
- Inband: Transcription buffers will be dropped
- Transcription: Caption meta of each video buffer will be dropped
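The three source types can be modelled roughly like this. The enum and method names are hypothetical and only illustrate which captions each mode keeps or drops; they are not the property's real type.

```rust
/// Hypothetical model of the three caption source types.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
enum CaptionSource {
    /// Inband and transcription captions are combined (default).
    #[default]
    Both,
    /// Transcription buffers are dropped.
    Inband,
    /// Caption meta on video buffers is dropped.
    Transcription,
}

impl CaptionSource {
    /// Whether caption meta already attached to video buffers is kept.
    fn keeps_inband(self) -> bool {
        !matches!(self, CaptionSource::Transcription)
    }

    /// Whether buffers coming from the transcriber are kept.
    fn keeps_transcription(self) -> bool {
        !matches!(self, CaptionSource::Inband)
    }
}
```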
In this version, transcriberbin doesn't provide any hint to help
the application decide on a caption source. That decision can be based
on the application's own strategy, the passthrough status or probing
inband caption meta, for example.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/684>
Fix race between latency query handler and setup_transcription()
method.
Locking order of setup_transcription() is
state lock -> setup_transcription() -> settings lock
So taking the state lock inside of the settings lock in src_query()
can cause a deadlock.
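A minimal sketch of the ordering rule, using plain `std::sync::Mutex`. The `Element`, `State` and `Settings` types and their fields are hypothetical stand-ins, not the element's real structures. The query path copies what it needs from the settings, releases that lock, and only then takes the state lock, so the state lock is never acquired while the settings lock is held.

```rust
use std::sync::Mutex;

// Hypothetical stand-ins for the element's state and settings.
struct State {
    active_latency_ms: Option<u64>,
}

struct Settings {
    latency_ms: u64,
}

struct Element {
    state: Mutex<State>,
    settings: Mutex<Settings>,
}

impl Element {
    /// Documented order: state lock first, then settings lock.
    fn setup(&self) {
        let mut state = self.state.lock().unwrap();
        let settings = self.settings.lock().unwrap();
        state.active_latency_ms = Some(settings.latency_ms);
    }

    /// Never takes the state lock while the settings lock is held.
    fn query_latency(&self) -> u64 {
        // The temporary guard is dropped at the end of this statement.
        let configured = self.settings.lock().unwrap().latency_ms;
        let state = self.state.lock().unwrap();
        state.active_latency_ms.unwrap_or(configured)
    }
}
```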
As a side effect this allows us also to handle errors more gracefully
and to reduce memory load by outputting decoded frames immediately.
Also the code was changed a bit to reduce the number of redundant mutex
lock/unlocks.
This plugin takes I420/YUV and appends an alpha plane to give YUVA/A420,
rounding the corners analogous to border-radius in CSS. Other video
formats like NV12 are not supported yet. Support for other planar formats
will follow.
Not all ways of specifying border-radius as in CSS are implemented at
the moment. Currently, we only support specifying it in pixels and it
gets applied uniformly to all corners.
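As an illustration of a uniform pixel radius, the alpha plane could be computed along these lines. This is a sketch under assumptions: the function names are hypothetical, the radius is clamped to half the smaller dimension, and the mask is binary (no anti-aliasing), which may differ from what the plugin actually does.

```rust
/// Alpha for the pixel at (x, y): 0 outside the rounded corner, 255 otherwise.
/// Binary mask without anti-aliasing; an assumption made for this sketch.
fn corner_alpha(x: u32, y: u32, width: u32, height: u32, radius: u32) -> u8 {
    let r = f64::from(radius.min(width / 2).min(height / 2));
    let (w, h) = (f64::from(width), f64::from(height));
    // Use pixel centres for the distance test.
    let (cx, cy) = (f64::from(x) + 0.5, f64::from(y) + 0.5);
    // How far the pixel centre reaches into a corner square on each axis.
    let dx = (r - cx).max(cx - (w - r)).max(0.0);
    let dy = (r - cy).max(cy - (h - r)).max(0.0);
    if dx > 0.0 && dy > 0.0 && dx * dx + dy * dy > r * r {
        0
    } else {
        255
    }
}

/// Builds a full-resolution alpha plane, one byte per pixel, row by row.
fn alpha_plane(width: u32, height: u32, radius: u32) -> Vec<u8> {
    (0..height)
        .flat_map(|y| (0..width).map(move |x| corner_alpha(x, y, width, height, radius)))
        .collect()
}
```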
I hadn't really tested the element with pop-on mode, and the row
for each line in the input text was hardcoded to 13, which was
clearly wrong.
Switch to incrementing it properly.
C.9 Automatic Caption Erasure (Preferred)
[...]
Some manufacturers have suggested building automatic timeout into their
decoders. They propose that if no data are received for the selected caption
channel within a given time, the decoder should automatically erase the
caption. Such erasure may supersede the intentions of the caption service
providers and institute one maximum display time for all captioning services.
If such a timeout is deemed necessary, however, the time limit should be no less
than 16 seconds, an amount of time said by caption service providers to be longer
than their most enduring caption. It is preferred, when automatic caption erasure
is used in a decoder, that only displayed memory be erased, since some caption
service providers may, contrary to recommended practice (see Section B.8.3), send
pop-on style caption data to non-displayed memory more than 16 seconds before
sending the EOC command which causes the caption to display.
In this mode, cues are output as soon as they are ready for
display, without a duration. This can be useful in live mode,
when downstream is OK with determining the duration after the
fact, through clear=True.
The consequence of this is that the current roll-up window will
be output repeatedly; it is up to downstream to deal with that
however it prefers.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/554>
There is no point to that, the code is already factored in such
a way that erase_display_memory is inserted at the correct time,
including while loading the next pop-on captions in non displayed
memory.
Locking order of state and settings was inconsistent, and causing
deadlocks. Fix and document it, consistently drop locks before
chaining up events / pushing and avoid sequentially unlocking /
relocking settings in the same local code path.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/539>
This new element puts together some of the elements we've written
in recent times (awstranscriber, tttocea608, textwrap, cccombiner)
into a convenience high-level element.
The design of the element is AV in -> AV (+ CC metas) out.
The element exposes a property to set and unset a "passthrough" mode,
during which the transcriber element's state is set to NULL but kept
in the bin, in order for the user to be able to set properties on
sub elements no matter what the current mode is, using the
GstChildProxy interface.
In addition, the element ensures that the latency it reports stays
fixed so that playback continues uninterrupted.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/528>
As stated in the spec:
> In addition, the user must have the capability to select a black
> background over which the captioned letters are displayed.
The property is MUTABLE_PLAYING
In roll-up modes, we open new lines when the last column is reached.
This commit implements lookahead on a word basis, in order to avoid
splitting words unless absolutely necessary (when a word won't fit
on a full row)
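The word-based lookahead can be sketched like this. It is a simplified, standalone illustration (the function name `wrap_words` is hypothetical, and it works on plain strings rather than on the element's internal state): a word that does not fit in the rest of the row opens a new row, and a word is only split when it is longer than a full row.

```rust
/// Wraps `text` into rows of at most `columns` characters, splitting a word
/// only when it cannot fit on a full row by itself.
fn wrap_words(text: &str, columns: usize) -> Vec<String> {
    assert!(columns > 0, "a row needs at least one column");
    let mut rows = Vec::new();
    let mut current = String::new();
    for word in text.split_whitespace() {
        let mut word: Vec<char> = word.chars().collect();
        loop {
            let used = current.chars().count();
            // Look ahead: space needed if the whole word is appended.
            let needed = if used == 0 { word.len() } else { used + 1 + word.len() };
            if needed <= columns {
                if used > 0 {
                    current.push(' ');
                }
                current.extend(word.iter());
                break;
            }
            if used > 0 {
                // The word does not fit in the rest of the row: open a new row.
                rows.push(std::mem::take(&mut current));
                continue;
            }
            // Empty row and the word is still too long: split it.
            let rest = word.split_off(columns);
            rows.push(word.iter().collect());
            word = rest;
        }
    }
    if !current.is_empty() {
        rows.push(current);
    }
    rows
}
```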
Trying to write "" in order to erase characters in the caption
frame simply fails silently, the proper way to implement
delete_to_end_of_row and backspace was to memset the relevant
cells.
This element outputs the same format expected by tttocea608 in
json mode.
It notably differs from cea608tott in that it only uses libcaption's
low-level API, as it needs to maintain its own view of the current
state of the screen, and make fine-grained decisions as to when
to output data and how to timestamp it.
It covers a large portion of the 608 spec, with the exception of
a few features that probably haven't ever seen widespread usage,
those are listed in a TODO list at the top.
It has been tested with a reference file produced by CEA and covers
all the features it demonstrates.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/480>
Up to now, tttocea608 supported text/utf8, and no interface to
control the positioning of closed captions apart from new lines
in the input text.
CEA 608 supports a larger set of features than that, such as
positioning CC precisely in its 32 x 15 grid, styling text,
switching from one mode to another, resetting the base row
in roll-up mode, etc.
A custom, JSON-based format is now supported by the element
(caps application/x-json, format=cea608), allowing users to
control those features in a pretty advanced manner.
A side effect of this is that the approach previously used
by the element to ensure frame-accurate CC display is now
untenable: where we knew before that an input buffer would
at most span 74 buffers and calculate a somewhat reasonable
latency based on that, this is no longer possible. Instead
we pick the approach most CC encoders seem to pick, and
accept a certain latency at display time: for example the
flipping of the back buffer to the display buffer for a
10-character text buffer will occur 7 frames after its
PTS. This has obvious benefits in terms of code complexity
and should generally be acceptable.
+ Removes a now irrelevant test, updates other tests
+ Extracts the Mode enum to the root of the crate, it will
be used by another element in a follow-up commit
style preambles look like:
|P|0|0|1|C|0|ROW| |P|1|N|0|STYLE|U|
and column preambles look like:
|P|0|0|1|C|0|ROW| |P|1|N|1|CURSR|U|
Both preambles go through eia608_row_pramble(), the value they
pass as the x parameter is supposed to hold 4 bits, either
0|STYLE
or 1|CURSR
This value then gets bit-shifted by 1 and or'd in the second byte.
The value is also and'ed with 0x1E to ensure it can't leak into
the upper bits.
The previous code resulted in x being a 5-bit value, 0x10 (0b10000).
This resulted in outputting a style preamble, as 0x10 << 1 & 0x1E
is 0b00000. When the indent was 0 (the usual case), this went
undetected, but with any other value it resulted in no indent being
applied, but the text getting colored or italicized.
This patch fixes x to have the correct value of 0x8 | indent.
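The bit arithmetic can be checked with a small standalone sketch. The helper names are hypothetical, and `old_x` assumes the previous code computed `0x10 | indent`, which matches the described symptoms but is not confirmed here.

```rust
/// Shift and mask applied to `x` before it is or'd into the second byte.
fn preamble_x_bits(x: u16) -> u16 {
    (x << 1) & 0x1E
}

/// Assumed shape of the old value: a 5-bit quantity whose top bit is masked away.
fn old_x(indent: u16) -> u16 {
    0x10 | indent
}

/// Corrected value: 4 bits, with the high bit selecting a column preamble.
fn fixed_x(indent: u16) -> u16 {
    0x8 | indent
}

/// After the shift, bit 4 distinguishes column preambles from style preambles.
fn is_column_preamble(bits: u16) -> bool {
    bits & 0x10 != 0
}
```

With an indent of 2, `old_x` yields bits that read as a style preamble carrying the indent as its style, while `fixed_x` yields a column preamble carrying the indent.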
cargo-c will produce a pkg-config file making it easier to statically
link plugins.
Also add 'static' features for plugins depending on GStreamer < 1.14, as
1.14 is the minimum version required for static linking because of ABI
changes in core.
Various SCC files have invalid drop frame timecodes.
Every full minute the first two timecodes are skipped, except for every
tenth minute, which means that e.g. "00:01:00;00" is not a valid
timecode and the next valid timecode would be "00:01:00;02".
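The rule can be expressed as a small check. This is a sketch with a hypothetical function name; range checks on seconds and frames are left out for brevity.

```rust
/// Returns false for frame numbers that drop-frame timecode skips:
/// frames 0 and 1 of the first second of every minute not divisible by ten.
fn is_valid_drop_frame(minutes: u32, seconds: u32, frames: u32) -> bool {
    let skipped = seconds == 0 && frames < 2 && minutes % 10 != 0;
    !skipped
}
```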
There is no way to dynamically ask Cargo to build a static or dynamic lib,
so we have to build both and pick the one we care about when doing the meson
processing.
Fix #88
We don't need JPEG, GIF, etc. support, so pulling in their whole
dependency chain is unnecessary and only wastes CPU time.
As a result we can remove the gif crate exception in deny.toml.
Only 64k are allowed for the sum of all private instance structs in the
class hierarchy, as well as for the public instance structs.
The CdgInterpreter itself is huge and adding just another two integers
to GstVideoDecoderPrivate in libgstvideo is causing the limit to be
reached, so let's allocate it in a separate memory area.
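The effect of moving the large value into its own allocation can be illustrated with `Box`. The types below are hypothetical placeholders (the 64 KiB array stands in for a large interpreter); they only show that the owning struct shrinks to the size of a pointer.

```rust
use std::mem::size_of;

/// Hypothetical stand-in for a very large interpreter state.
struct BigInterpreter {
    framebuffer: [u8; 64 * 1024],
}

/// Stores the interpreter inline: the instance struct is as large as the interpreter.
struct InstanceInline {
    interpreter: BigInterpreter,
}

/// Stores the interpreter in a separate heap allocation: only a pointer stays inline.
struct InstanceBoxed {
    interpreter: Box<BigInterpreter>,
}

/// Returns (inline size, boxed size) in bytes.
fn instance_sizes() -> (usize, usize) {
    (size_of::<InstanceInline>(), size_of::<InstanceBoxed>())
}
```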