Subtitles
=========

1. Problem
GStreamer currently does not support subtitles.

2. Proposed solution
  - Elements
  - Text-overlay
  - Autoplugging
  - Scheduling
  - Stream selection

The first thing we'll need is subtitle awareness. I'll focus on AVI/MKV/OGM
here, because I know how those work. The same methods apply to DVD subtitles
as well. The matroska demuxer (and the Ogg one) will need subtitle awareness.
For AVI, this is not needed. We'll also need subtitle stream parsers (for
all popular subtitle formats) that can handle both parsed streams (MKV, OGM)
and .sub file chunks (AVI). Sample code is available in
gst-sandbox/textoverlay/.
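
To make the ".sub file chunks" case concrete, here is a minimal sketch of a
parser that accepts data in arbitrary chunks and only emits a cue once its
line is complete. It is not the gst-sandbox code. It assumes the MicroDVD
flavour of .sub ("{start_frame}{end_frame}text", with "|" as line break) and
a known frame rate; ChunkedSubParser, Cue and the fps parameter are
hypothetical names.

```python
import re
from dataclasses import dataclass

# Assumption: MicroDVD-style lines, "{start_frame}{end_frame}text".
CUE_RE = re.compile(r"^\{(\d+)\}\{(\d+)\}(.*)$")


@dataclass
class Cue:
    start: float     # seconds
    duration: float  # seconds
    text: str


class ChunkedSubParser:
    """Hypothetical parser that accepts arbitrary byte chunks."""

    def __init__(self, fps=25.0):
        self.fps = fps          # assumed to be known from the video stream
        self._pending = b""     # bytes of a line that is not terminated yet

    def push(self, chunk):
        """Feed one chunk; return the cues completed by it."""
        lines = (self._pending + chunk).split(b"\n")
        self._pending = lines.pop()   # the last piece may be incomplete
        return [cue for cue in map(self._parse_line, lines) if cue]

    def flush(self):
        """Call at end of stream to parse a final unterminated line."""
        line, self._pending = self._pending, b""
        cue = self._parse_line(line)
        return [cue] if cue else []

    def _parse_line(self, raw):
        m = CUE_RE.match(raw.decode("utf-8", errors="replace").strip())
        if m is None:
            return None   # blank or unrecognised line: skip it
        start, end = int(m.group(1)), int(m.group(2))
        return Cue(start / self.fps, (end - start) / self.fps,
                   m.group(3).replace("|", "\n"))
```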

Secondly, we'll need a textoverlay filter that takes text and video and
blits the text onto the video. We have several such elements (e.g. the
cairo-based element) in gst-plugins already. Those might need some updates
to work exactly as expected.

Thirdly, playbin will need to handle all that. We expect subtitle streams
to end up as subimages or plain text (or xhtml text). Note that playbin
should also allow access to the unblitted subtitle as text (if available)
for accessibility purposes.

A problem popping up is that subtitles are not continuous streams. This is
especially noticeable in the MKV/OGM case, because there the arrival of
subtitle data depends on the other streams, so an element will only notice
a gap once it has received the next data chunk. There are two possible
solutions: using timestamped filler events, or using decoupled subtitle
overlay elements (bins, probably). The difficulty with the first is that it
only works well in the AVI/.sub case, where we will notice discontinuities
before they become problematic. The second is more difficult to implement,
but works for both cases.

A) fillers
Imagine two consecutive subtitles with 10 seconds of no data in between.
When parsing a .sub file, we would notice the gap immediately and could send
a filler event (or empty data) carrying the timestamp and duration of the
gap.
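
The filler idea can be sketched roughly as follows, assuming the cues are
already parsed and sorted by start time. Item and with_fillers are
hypothetical names, and a filler is modelled here as an item without text
rather than as a real GStreamer event.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional


@dataclass
class Item:
    start: float          # seconds
    duration: float       # seconds
    text: Optional[str]   # None marks a filler (nothing to render)


def with_fillers(cues: Iterable[Item],
                 stream_start: float = 0.0) -> Iterator[Item]:
    """Yield the cues in order, with a filler covering every gap."""
    position = stream_start   # end of the data sent so far
    for cue in cues:
        if cue.start > position:
            # Gap detected: tell downstream that nothing happens until
            # the next cue starts, so it does not have to wait for data.
            yield Item(position, cue.start - position, None)
        yield cue
        position = max(position, cue.start + cue.duration)
```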

B) decoupled
Imagine this text element:
            ------------------------------
video ------|---\                        |
            |    [ actual element ]------|-- out
text  - - - |- -/                        |
            ------------------------------
where the text pad is decoupled, like a queue. When no text data is available,
the pad will have received no data, and the element will render no subtitles.
The actual element can be a bin here, containing another subtitle rendering
element. Disadvantage: it requires threading, and the element itself is (in
concept) kinda gross. The element can be embedded in playbin to hide this
fact (i.e. it would not be available outside the scope of playbin).
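
A rough model of the decoupled text pad, assuming the subtitle side runs in
its own thread and hands cues over through a thread-safe queue. The video
side never waits on that queue: if nothing has arrived, it simply renders
nothing. DecoupledOverlay, push_text and render are hypothetical names; this
is only a sketch of the idea, not an actual element.

```python
import queue


class DecoupledOverlay:
    """Hypothetical overlay whose text input never blocks the video path."""

    def __init__(self):
        self._text_queue = queue.Queue()   # filled from the text thread
        self._current = None               # (start, end, text) or None

    def push_text(self, start, duration, text):
        """Called from the subtitle side; returns immediately."""
        self._text_queue.put((start, start + duration, text))

    def render(self, video_ts):
        """Called per video frame; returns the text to blit, or None."""
        while True:
            # Keep the current subtitle as long as it has not expired.
            if self._current is not None and video_ts < self._current[1]:
                break
            try:
                # Expired or nothing held: take the next cue without waiting.
                self._current = self._text_queue.get_nowait()
            except queue.Empty:
                self._current = None
                break
        if self._current is None:
            return None   # no text data available: render no subtitles
        start, _end, text = self._current
        return text if start <= video_ts else None
```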

Whichever solution we take, it'll require effort from the implementer.
Scheduling (process, not implementation) knowledge is assumed.

Stream selection is a problem that audio has, too. We'll need a solution for
this at the playback bin level, e.g. in playbin. This is easily solved by
muting all unused streams and dynamically unmuting the selected one. Note
that synchronization needs to be checked in this case. The solution is not
hard, but someone has to do it.
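
The mute/unmute approach could look something like this. All streams stay
connected and keep delivering data, and everything from a muted stream is
dropped, so switching only flips a flag. StreamSelector, select and accept
are hypothetical names, and the synchronization check mentioned above is not
modelled here.

```python
class StreamSelector:
    """Hypothetical selector: all streams stay linked, at most one is unmuted."""

    def __init__(self, stream_ids):
        self._muted = {sid: True for sid in stream_ids}   # all muted at first
        self.selected = None

    def select(self, stream_id):
        """Unmute stream_id and mute all others; None mutes everything."""
        if stream_id is not None and stream_id not in self._muted:
            raise KeyError(f"unknown stream: {stream_id!r}")
        for sid in self._muted:
            self._muted[sid] = sid != stream_id
        self.selected = stream_id

    def accept(self, stream_id, buffer):
        """Return the buffer if its stream is unmuted, else drop it (None)."""
        return None if self._muted[stream_id] else buffer
```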

3. Written by
Ronald S. Bultje <rbultje@ronald.bitfreak.net>, Dec. 25th, 2004