In roll-up mode, when no more timed text comes in, the closed
captions may remain displayed on screen indefinitely (unless the
decoder implements a timeout, but that is not mandatory).
Expose a property to erase the display memory after a configurable
amount of time has elapsed instead.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/754>
There can be small race where transcription-bin is linked with
tee but state change of the transcription-bin is not finished.
And at the same time, upstream pushes event/buffer to the
transcription-bin. Do state change first then link to avoid
the condition
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/716>
Zero-padding is not specified for the indices but all time components
need to be zero-padded (3 digits for fractional seconds, 2 digits for
everything else).
If transcription runs slow or has issues the queue can fill up and block
all audio processing. This gives the queue a sufficent buffer and allows
it to drop audio if it eventually fills up. This was most noticable with
bad internet connections using the `awstrnascriber` where it would take
quite a while for the websocket to eventually timeout and the bin to
enter `passthrough=true`.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/688>
By using this new property, application can select exclusive caption
source. There are three source types
- Both: Inband and transcription captions are combined if exist.
This is default behavior.
- Inband: Transcription buffers will be dropped
- Transcription: Caption meta of each video buffer will be dropped
In this version, transcriberbin doesn't provide any hint
for application to help caption source decision. That can be done
by application's strategy, passthrough status or probing inband
caption meta for example.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/684>
Fix race between latency query handler and setup_transcription()
method.
Locking order of setup_transcription() is
state lock -> setup_transcription() -> settings lock
So taking state lock inside of setting lock in src_query()
can cause deadlock.
I hadn't really tested the element with pop-on mode, and the row
for each line in the input text was hardcoded to 13, which was
clearly wrong.
Switch to incrementing it properly.
C.9 Automatic Caption Erasure (Preferred)
[...]
Some manufacturers have suggested building automatic timeout into their
decoders. They propose that if no data are received for the selected caption
channel within a given time, the decoder should automatically erase the
caption. Such erasure may supersede the intentions of the caption service
providers and institute one maximum display time for all captioning services.
If such a timeout is deemed necessary, however, the time limit should be no less
than 16 seconds, an amount of time said by caption service providers to be longer
than their most enduring caption. It is preferred, when automatic caption erasure
is used in a decoder, that only displayed memory be erased, since some caption
service providers may, contrary to recommended practice (see Section B.8.3), send
pop-on style caption data to non-displayed memory more than 16 seconds before
sending the EOC command which causes the caption to display.
In this mode, cues are output as soon as they are ready for
display, without a duration. This can be useful in live mode,
when downstream is OK with determining the duration after the
fact, through clear=True.
The consequence of this is that the current roll-up window will
be output repetitively, it is up to downstream to deal with that
how it prefers.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/554>
There is no point to that, the code is already factored in such
a way that erase_display_memory is inserted at the correct time,
including while loading the next pop-on captions in non displayed
memory.
Locking order of state and settings was inconsistent, and causing
deadlocks. Fix and document it, consistently drop locks before
chaining up events / pushing and avoid sequentially unlocking /
relocking settings in the same local code path.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/539>
This new element puts together some of the elements we've written
in recent times (awstranscriber, tttocea608, textwrap, cccombiner)
into a convenience high-level element.
The design of the element is AV in -> AV (+ CC metas) out.
The element exposes property to set and unset a "passthrough" mode,
during which the transcriber element's state is set to NULL but kept
in the bin, in order for the user to be able to set properties on
sub elements no matter what the current mode is, using the
GstChildProxy interface.
In addition, the element ensures that the latency it reports stays
fixed so that playback continues uninterrupted.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/528>
As stated in the spec:
> In addition, the user must have the capability to select a black
> background over which the captioned letters are displaced.
The property is MUTABLE_PLAYING
In roll-up modes, we open new lines when the last column is reached.
This commit implements lookahead on a word basis, in order to avoid
splitting words unless absolutely necessary (when a word won't fit
on a full row)
Trying to write "" in order to erase characters in the caption
frame simply fails silently, the proper way to implement
delete_to_end_of_row and backspace was to memset the relevant
cells.
This element outputs the same format expected by tttocea608 in
json mode.
It notably differs from cea608tott in that it only uses libcaption's
low-level API, as it needs to maintain its own view of the current
state of the screen, and make fine-grained decisions as to when
to output data and how to timestamp it.
It covers a large portion of the 608 spec, with the exception of
a few features that probably haven't ever seen widespread usage,
those are listed in a TODO list at the top.
It has been tested with a reference file produced by CEA and covers
all the features it demonstrates.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/480>
Up to now, tttocea608 supported text/utf8, and no interface to
control the positioning of closed captions apart from new lines
in the input text.
CEA 608 supports a larger set of features than that, such as
positioning CC precisely in its 32 x 15 grid, styling text,
switching from one mode to another, resetting the base row
in roll-up mode etc ..
A custom, JSON-based format is now supported by the element
(caps application/x-json, format=cea608), allowing users to
control those features in a pretty advanced manner.
A side effect of this is that the approach previously used
by the element to ensure frame-accurate CC display is now
untenable: where we knew before that an input buffer would
at most span 74 buffers and calculate a somewhat reasonable
latency based on that, this is no longer possible. Instead
we pick the approach most CC encoders seem to pick, and
accept a certain latency at display time: for example the
flipping of the back buffer to the display buffer for a
10-character text buffer will occur 7 frames after its
PTS. This has obvious benefits in terms of code complexity
and should generally be acceptable.
+ Removes a now irrelevant test, updates other tests
+ Extracts the Mode enum to the root of the crate, it will
be used by another element in a follow-up commit
style preambles look like:
|P|0|0|1|C|0|ROW| |P|1|N|0|STYLE|U|
and column preambles look like:
|P|0|0|1|C|0|ROW| |P|1|N|1|CURSR|U|
Both preambles go through eia608_row_pramble(), the value they
pass as the x parameter is supposed to hold 4 bits, either
0|STYLE
or 1|CURSR
This value then gets bit-shifted by 1 and or'd in the second byte.
The value is also and' with 0x1E to ensure it can't leak into
the upper bits.
The previous code resulted in x being a 5-bit value, 0x10 (0b10000).
This resulted in outputting a style preamble, as 0x10 << 1 & 0x1E
is 0b00000. When the indent was 0 (the usual case), this went
undetected, but with any other value it resulted in no indent being
applied, but the text getting colored or italicized.
This patch fixes x to have the correct value of 0x8 | indent.
cargo-c will produce a pkg-config file making it easier to statically
link plugins.
Also add 'static' features for plugins depending on < 1.14 as this is the
minimal required version to use static linking because of ABI changes in
core.
Various SCC files have invalid drop frame timecodes.
Every full minute the first two timecodes are skipped, except for every
tenth minute, which means that e.g. "00:01:00;00" is not a valid
timecode and the next valid timecode would be "00:01:00;02".
There is no way to dynamically ask Cargo to build static or dynamic lib
so we have to build both and pick the one we care when doing the meson
processing.
Fix#88
Only two uses of unsafely setting the pad functions is left:
- fallbacksrc for overriding the chain function of the proxy pad of a
ghost pad
- threadshare for overriding the pad functions after creationg, which
probably needs some fixing at some point
I thought I could spare some bandwidth by letting renderers pick
the base row, but it turns out this triggers some unwanted behaviours
with compliant renderers.
Instead, we now follow the protocol laid out in EIA/CEA-608-B,
section B.8.1
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/355>
Useful complement to cea708overlay, that can only render native
708.
The element isn't an aggregator, and simply parses and renders
closed caption meta on its input video buffers.
No property is exposed, the rendering is done using a monospace
font, over a 32 x 15 grid with the font size fitted to fill as
much of the viewport as possible.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/343>
In roll-up mode, the element expects input text without layout
(eg new lines), and the characters it outputs are displayed
immediately, without double-buffering as in pop-on mode.
Once the last column is reached, the element simply outputs
a carriage return and the text scrolls up, potentially splitting
words with no hyphenation.
The main advantage of this mode is its simplicity and the near-zero
latency it introduces.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/347>
- Report a latency:
By design, tttocea608 will output buffers in the "past" when
receiving an input buffer: we want the second to last buffer
in the buffer list that we output to have the same pts as the
input buffer, as it contains the end_of_caption control code
which determines when the current closed caption actually gets
displayed in pop_on mode. The previous buffers have timestamps
decreasing as a function of the framerate, for up to potentially
74 byte pairs (the breakdown is detailed in a comment).
The element thus has to report a latency, at 30 frames per second
it represents around 2.5 seconds.
- Refactor timestamping:
Stop using a frame duration, but rather base our timestamps on
a scaled frame index. This is to avoid rounding errors, and
allow for exactly one byte pair per buffer if the proper framerate
is set on the closed caption branch, and the video branch has
perfect timestamps, eg videorate. In practice, that one byte
pair per frame requirement should only matter for line 21 encoding,
but we have to think about this use case too.
- Splice in erase_display_memory:
When there is a gap between the end of a buffer and the start
of the next one, we want to erase the display memory (this
is unnecessary otherwise, as the end_of_caption control code
will in effect ensure that the display is erased when the
new caption is displayed). The previous implementation only
supported this imperfectly, as it could cause timestamps to
go backwards.
- Output last erase_display_memory:
The previous implementation was missing the final
erase_display_memory on EOS
- Output gaps
- Write more tests
Part-of: <https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/merge_requests/314>
Contrary to what one might believe, this actually reduces the size of
the structs due to alignment constraints. On Linux x86-64 clang/gcc it
reduces the size of the caption_frame_t struct from 7760 bytes to 6800
bytes, on Windows x86-64 MSVC from 11600 bytes to 6800 bytes.
It also causes simpler and potentially faster assembly to be generated
as the values can be directly accessed as uint8_t instead of having to
extract the corresponding bits with bitwise operations.
It also gives us the same ABI with clang/gcc and MSVC.
There is some broken software out there not inserting the empty lines
and we don't really need them for proper parsing. Only require an empty
line between header and the first caption line.