Problem:
multiple aggregator elements (audiomixer, compositor) in a live
pipeline use a lot of CPU waiting each other up. This is because
of the previously unused clock entry unscheduling during regular
operation.
Clock entry unscheduling has the potential to wake up every clock entry
waiting using the system clock which may be a large number.
Solution:
Implement waiting per entry and only wakeup the unscheduled entry.
While this may be possible using GCond, theoretically GCond only gives
us microsecond accuracy and uses relative waits in a number of places.
We can unfortunately do better poking at the platform specifics
ourselves by using futexes on linux and pthread on other unix. Windows
may have a possible implementation using Waitable timers but that is
not implemented here and instead falls back to the GCond implementation.
GCond waits on Windows is still as accurate as the previous GstPoll-based
implementation.
When a live pipeline goes to PLAYING, its change_state method
is called twice for PAUSED_TO_PLAYING: the first time is
from GstElement, when NO_PREROLL is returned, the second
is from GstBin, after all async_done messages have been
collected.
base_time selection is done only the first time, through
comparisons with start_time.
On the other hand, when this live pipeline gets flush seeked,
even though start_time is reset by the sink upon reception
of flush_stop(reset_time=TRUE), PAUSED_TO_PLAYING only occurs
once, from GstBin, after all async_done messages have been
collected. This causes the base_time to be off by <latency>.
This commit addresses this by mimicing the behaviour of
GstElement on NO_PREROLL, and calling the change_state
method manually when the following conditions are met:
* The pipeline is live
* The target state is PLAYING
The returned "stats" structure contains, for now, one array called
"queues" with one GstStructure per internal queue, containing said
queue's current level of bytes, buffers, and time.
Even when pulling a new 64KB buffer from upstream, don't return
more data than was asked for in the pull_range() method and then
return less later, as that confused subclasses like h264parse.
Add a unit test that when a subclass asks for more data, it always
receives a larger buffer on the next iteration, never less.
Fixes https://gitlab.freedesktop.org/gstreamer/gstreamer/-/issues/530
It is not explicitly specified anywhere in the docs that 0% buffering is
at low-watermark and 100% buffering is at high-watermark. It was
specified only in the sources.
This new API allow resuming a task if it was paused, while leaving it to
stopped stated if it was stopped or not started yet. This new API can be
useful for callback driver workflow, where you basically want to pause and
resume the task when buffers are notified while avoiding the race with a
gst_task_stop() coming from another thread.
Going through each state on the way back down to GST_STATE_NULL
can cause deadlocks, for example:
gst-launch-1.0 audiotestsrc ! valve drop=true ! autoaudiosink
ctrl + C
Hangs forever when going to PAUSED, because the "final" state is
ASYNC, and the sink blocks waiting for a preroll buffer.
Going straight to NULL addresses this issue, and also helps
making teardown faster when piping sparse streams to a
sync sink.
When running in pull mode (for e.g. mp3 reading),
baseparse currently reads 64KB from upstream, then mp3parse
consumes typically around 417/418 bytes of it. Then
on the next loop, it will read a full fresh 64KB again,
which is a big waste.
Fix the read loop to use the available cache buffer first
before going for more data, until the cache drops to < 1KB.
Fixes https://gitlab.freedesktop.org/gstreamer/gstreamer/issues/518
current_buf_mem_idx stands for the index of memory of the corresponding
buffer which is scheduled to be written in the next iteration.
If all memory objects were scheduled to be written in the current
iteration, reset the index to zero so that starting from the first
memory object of the next buffer.
Previously the default and full modes were the same. Now the default
mode is like before: it accumulates all buffers in a buffer list until
the threshold is reached and then writes them all out, potentially in
multiple writes.
The new full mode works by always copying memory to a single memory area
and writing everything out with a single write once the threshold is
reached.
If buffer lists with too many buffers would be written before, a stack
overflow would happen because of memory linear with the number of
GstMemory would be allocated on the stack. This could happen for example
when filesink is configured with a very big buffer size.
Instead now move the buffer and buffer list writing into the helper
functions and at most write IOV_MAX memories at once. Anything bigger
than that wouldn't be passed to writev() anyway and written differently
in the previous code, so this also potentially speeds up writing for
these cases.
For example the following pipeline would crash with a stackoverflow:
gst-launch-1.0 audiotestsrc ! filesink buffer-size=1073741824 location=/dev/null
What may happen is that during the course of processing a buffer,
all of the pads in a flow combiner may disappear. In this case, we
would return NOT_LINKED. Instead return whatever the input flow return
was.
The old code would leave a dangling pointer in oldstr_ptr if two threads
attempted to take the same structure into the same location at the same
time:
1. First "oldstr == newstr" check (before the loop) fails.
2. Compare-and-exchange fails, due to a second thread completing the
same gst_structure_take.
3. Second "oldstr == newstr" check (in the loop) succeeds, loop breaks.
4. "oldstr" check succeeds, old structure gets freed.
5. oldstr_ptr now contains a dangling pointer.
This shouldn't happen in code that handles ownership sanely, so check
that we don't try to do this and complain loudly.
Also simplify the function by using a do-while loop, like
gst_mini_object_take.
https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/413
When this `_strnlen` internal method was added, strnlen (in glibc)
was not available yet (appeared in 2.10 it was released that same
year).
If available, use the much more optimized strnlen
The type checks at the end of `gst_value_intersect` to call the flagset
intersection are relatively expensive.
If we already know that:
* There was a compare function but it didn't return GST_VALUE_EQUAL
* AND none of the registered intersect functions failed
Then we know they can't intersect and can return early.
Trims ~20% of the instruction calls
For subtracting a list from another, the previous implementation would
do a double subtraction of one from another (which would create temporary
arrays/values which would then be discarded). Instead iterate and do
the comparision directly.
For intersecting a list with another, we can directly iterate both at
once and therefore avoid doing a *full* check of all values of the list
against all other values of the list.
This tries to inline as much as possible array/list and its contents
in order to avoid double allocation/freeing. This also improves the
locality of data.
The internal value is still API/ABI compatible with the *public*
GArray structure. This allows READ-ONLY backwards compatibility with
any external users that assume that the content of a list/array value
is backed by a GArray.
In case the buffer is not writable, the parent (the BufferList) is not
removed before calling func. So if it is changed, the parent (the BufferList)
of the previous buffer should be removed after calling func.
For all the structure creation using valist/varargs we calculate
the number of fields we will need to store. This ensures all callers
will end up with a single allocation.
Instead of having 3 allocations:
* One for GstStructure
* One for GArray
* One for the array *within* GArray
We try to limit this to a single allocation, inlining everything. This
reduces the number of micro-allocations and improves locality of data
access.
Before that commit `{test, }` wouldn't be accepted as an array
because of the trailing coma, the commit fixes that.
At the same time, the code has been refactored to avoid special casing
the first element of the list, making `{,}` or `<,>` valid lists.
We kept the start time around and subtracted it everywhere for "easy of
debugging", but we don't do anything like this anywhere else and it
only complicates the code unnecessarily.
fixate() will return empty caps if it gets empty caps passed and assert
early if any caps are provided as there's no meaningful way of fixating
any caps.
truncate() and simplify() will return the input caps in case of
any/empty caps as before, but slightly optimized and as documented
behaviour.
Also add tests for this and a few other operations behaviour on
empty/any caps.