diff --git a/docs/design/part-seeking.txt b/docs/design/part-seeking.txt
index 8fddb91f3f..01788dde05 100644
--- a/docs/design/part-seeking.txt
+++ b/docs/design/part-seeking.txt
@@ -28,11 +28,13 @@ Seeking can be performed in different formats such as time, frames
 or samples.
 
 The seeking can be performed to a nearby key unit or to the exact
-(estimated) unit in the media (GST_SEEK_FLAG_KEY_UNIT).
+(estimated) unit in the media (GST_SEEK_FLAG_KEY_UNIT). See below for more
+details on this.
 
 The seeking can be performed by using an estimated target position or in an
 accurate way (GST_SEEK_FLAG_ACCURATE). For some formats this can result in
 having to scan the complete file in order to accurately find the target unit.
+See below for more details on this.
 
 Non segment seeking will make the pipeline emit EOS when the configured
 segment has been played.
@@ -109,3 +111,135 @@ segment seeking without FLUSH
 
 This seek is typically performed when continuing seamless looping.
 
+
+
+========================================================================
+ Demuxer/parser behaviour and SEEK_FLAG_KEY_UNIT and SEEK_FLAG_ACCURATE
+========================================================================
+
+This section aims to explain the behaviour expected by an element with regard
+to the KEY_UNIT and ACCURATE seek flags using the example of a parser or
+demuxer.
+
+1. DEFAULT BEHAVIOUR:
+
+When a seek to a certain position is requested, the demuxer/parser will
+do two things (ignoring flushing and segment seeks, and simplified for
+illustration purposes):
+
+ - send a newsegment event with a new start position
+
+ - start pushing data/buffers again
+
+To ensure that the data corresponding to the requested seek position
+can actually be decoded, a demuxer or parser needs to start pushing data
+from a keyframe/keyunit at or before the requested seek position.
+
+Unless requested differently (via the KEY_UNIT flag), the start of the
+newsegment event should be the requested seek position.
+
+So by default a demuxer/parser will then start pushing data from
+position DATA and send a newsegment event with start position SEG_START,
+and DATA <= SEG_START.
+
+If DATA < SEG_START, a well-behaved video decoder will start decoding frames
+from DATA, but take into account the segment configured by the demuxer via
+the newsegment event, and only actually output decoded video frames from
+SEG_START onwards, dropping all decoded frames that are before the
+segment start and adjusting the timestamp/duration of the buffer that
+overlaps the segment start ("clipping"). A not-so-well-behaved video decoder
+will start decoding frames from DATA and push decoded video frames out
+starting from position DATA, in which case the frames that are before
+the configured segment start will usually be dropped/clipped downstream
+(e.g. by the video sink).
+
+
+2. GST_SEEK_FLAG_KEY_UNIT:
+
+If the KEY_UNIT flag is specified, the demuxer/parser should adjust the
+segment start to the position of the key frame closest to the requested
+seek position and then start pushing out data from there. The nearest
+key frame may be before or after the requested seek position, but many
+implementations will only look for the closest keyframe before the
+requested position.
+
+Most media players and thumbnailers do (and should be doing) KEY_UNIT seeks
+by default, for performance reasons, to ensure almost-instant responsiveness
+when scrubbing (dragging the seek slider in PAUSED or PLAYING mode). This
+works well for most media, but results in suboptimal behaviour for a small
+number of 'odd' files (e.g. files that only have one keyframe at the very
+beginning, or only a few keyframes throughout the entire stream). At the
+time of writing, a solution for this still needs to be found, but could be
+implemented demuxer/parser-side, e.g. make demuxers/parsers ignore the
+KEY_UNIT flag if the position adjustment would be larger than 1/10th of
+the duration or somesuch.
+
+Summary:
+
+ - if the KEY_UNIT flag is *not* specified, the demuxer/parser should
+   start pushing data from a key unit preceding the seek position
+   (or from the seek position if that falls on a key unit), and
+   the start of the new segment should be the requested seek position.
+
+ - if the KEY_UNIT flag is specified, the demuxer/parser should start
+   pushing data from the key unit nearest the seek position (or from
+   the seek position if that falls on a key unit), and
+   the start of the new segment should be adjusted to the position of
+   that key unit which was nearest the requested seek position (i.e.
+   the new segment start should be the position from which data is
+   pushed).
+
+
+3. GST_SEEK_FLAG_ACCURATE:
+
+If the ACCURATE flag is specified in a seek request, the demuxer/parser
+is asked to do whatever it takes (!) to make sure that the position seeked
+to is accurate in relation to the beginning of the stream. This means that
+it is not acceptable to just approximate the position (e.g. using an average
+bitrate). The achieved position must be exact. In the worst case, the demuxer
+or parser needs to push data from the beginning of the file and let downstream
+clip everything before the requested segment start.
+
+The ACCURATE flag does not affect what the segment start should be in
+relation to the requested seek position. Only the KEY_UNIT flag (or its
+absence) has any effect on that.
+
+Video editors and frame-stepping applications usually use the ACCURATE flag.
+
+Summary:
+
+ - if the ACCURATE flag is *not* specified, it is up to the demuxer/parser
+   to decide how exact the seek should be. If the flag is not specified,
+   the expectation is that the demuxer/parser does a reasonable best effort
+   attempt, trading accuracy for speed. In the absence of an index, the
+   seek position may be approximated.
+
+ - if the ACCURATE flag is specified, absolute accuracy is required, and
+   speed is of no concern. It is not acceptable to just approximate the
+   seek position in that case.
+
+ - the ACCURATE flag does not imply that the segment starts at the
+   requested seek position or should be adjusted to the nearest keyframe,
+   only the KEY_UNIT flag determines that.
+
+
+4. ACCURATE and KEY_UNIT combinations:
+
+All combinations of these two flags are valid:
+
+ - neither flag specified: segment starts at seek position, send data
+   from preceding key frame (or earlier), feel free to approximate the
+   seek position
+
+ - only KEY_UNIT specified: segment starts from position of nearest
+   keyframe, send data from nearest keyframe, feel free to approximate the
+   seek position
+
+ - only ACCURATE specified: segment starts at seek position, send data
+   from preceding key frame (or earlier), do not approximate the seek
+   position under any circumstances
+
+ - ACCURATE | KEY_UNIT specified: segment starts from position of nearest
+   keyframe, send data from nearest key frame, do not approximate the seek
+   position under any circumstances
+
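Illustration (not part of the patch): the four flag combinations described
in the new section can be modelled as a small pure function. This is only a
sketch under stated assumptions, not how any GStreamer demuxer is
implemented. SeekFlag and plan_seek are hypothetical names invented for this
example, and the model assumes an exact, sorted keyframe index is available.
Under that assumption ACCURATE cannot change the outcome, which matches the
text: only KEY_UNIT moves the segment start.

```python
from bisect import bisect_right
from enum import IntFlag


class SeekFlag(IntFlag):
    """Hypothetical stand-ins for the KEY_UNIT and ACCURATE seek flags."""
    NONE = 0
    KEY_UNIT = 1
    ACCURATE = 2


def plan_seek(keyframes, target, flags=SeekFlag.NONE):
    """Return (data_start, segment_start) for a seek to `target`.

    `keyframes` is a sorted list of exact key unit positions (assumed to
    come from an index), and `target` must not precede the first of them.
    """
    if not keyframes or target < keyframes[0]:
        raise ValueError("target must be at or after the first key unit")

    # Last key unit at or before the requested position.
    i = bisect_right(keyframes, target) - 1
    before = keyframes[i]

    if flags & SeekFlag.KEY_UNIT:
        # Nearest key unit, before or after; ties go to the earlier one.
        candidates = [before]
        if i + 1 < len(keyframes):
            candidates.append(keyframes[i + 1])
        nearest = min(candidates, key=lambda k: abs(k - target))
        # Segment start is adjusted to where the data starts.
        return nearest, nearest

    # Default: data from the preceding key unit, segment at the request,
    # so that data_start <= segment_start (DATA <= SEG_START).
    return before, target


if __name__ == "__main__":
    index = [0, 10, 20]
    for flags in (SeekFlag.NONE, SeekFlag.KEY_UNIT,
                  SeekFlag.ACCURATE, SeekFlag.ACCURATE | SeekFlag.KEY_UNIT):
        print(int(flags), plan_seek(index, 14, flags))
```

In a real element without an index, the ACCURATE flag would decide whether
the position lookup may be estimated (e.g. from an average bitrate) or must
be found exactly. The model skips that step by assuming the index is exact.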