This fix a very slow rendering rate regression that only
happens when using gst-launch, i.e. in the case where
the main thread does not run any NSApp loop.
Git bisect reported it has been introduced by the commit
e10d2417e2:
"move to CGL and CAOpenGLLayer for rendering".
Then the commit 7d46357627:
"gstglwindow_cocoa: fix slow render rate" attempted to fix
the slow rendering rate problem when using gst-launch.
At least for me it does not work. I tried several
combinations, for example to flush CA transactions in the
custom app loop, as mentioned in the doc, but the only solution
that fixes the slow rendering is by reducing the loop latency.
From what I tested, no need to put less than 60ms, even if the
framerate has an interval much lower (16.6ms for 60 fps).
Anytime else, we have no idea how to match up map and unmaps.
We also don't know exactly how the calling code is using us.
Also fixes the case where we're trying to transfer while someone else
is accessing our data pointer or texture resulting in mismatched video
frames.
https://bugzilla.gnome.org/show_bug.cgi?id=744839
One has to use the src_lock anyway to protect the min/max/live so they
can be notified atomically to the src thread to wake it up on changes,
such as property changes. So no point in having a second lock.
Also, the object lock was being held across a call to
GST_ELEMENT_WARNING, guaranteeing a deadlock.
While gst_aggregator_iterate_sinkpads() makes sure that every pad is only
visited once, even when the iterator has to resync, this is not all we have
to do for querying the latency. When the iterator resyncs we actually have
to query all pads for the latency again and forget our previous results. It
might have happened that a pad was removed, which influenced the result of
the latency query.
It was between another function and its helper function before, which was
confusing when reading the code as it had nothing to do with the other
functions.
This lock is not what is commonly known as a "stream lock" in GStremer,
it's not recursive and it's taken from the non-serialized FLUSH_START event.
https://bugzilla.gnome.org/show_bug.cgi?id=742684
Move the property from subclasses to adaptivedemux, it allows
selecing the percentage of the measured bitrate to be used when
selecting stream bitrates
Allows to set a bitrate directly instead of measuring it internally
based on the received chunks. The connection-speed was removed from
mssdemux and hlsdemux as it is now in the base class
Don't use private GMutex implementation details to check
whether it has been freed already or not. Just clear mutex
and GCond unconditionally in free function, they are always
inited anyway, and the free function can't be called multiple
times either.
steal_buffer() + unref seems to be a wide-spread idiom
(which perhaps indicates that something is not quite
right with the way aggregator pad works currently).
Where possible, use the _OBJECT variants in order to track better from
which object the debug statement is coming from
Define (and use) GST_CAT_DEFAULT where applicable
Use GST_PTR_FORMAT where applicable
And use the average to go up in resolution, and the last fragment
bitrate to go down.
This allows the demuxer to react rapidly to bitrate loss, and
be conservative for bitrate improvements.
+ Add a construct only property to define the number of fragments
to consider when calculating the average moving bitrate.
https://bugzilla.gnome.org/show_bug.cgi?id=742979
If the src framerate and videoaggreator's output framerate were
different, then we were taking every single buffer that had duration=-1
as it came in regardless of the buffer's start time. This caused the src
to possibly run at a different speed to the output frames.
https://bugzilla.gnome.org/show_bug.cgi?id=744096
In gst_gl_filter_fixate_caps () it can goto done without freeing the memory of
the tmp GstStructure. This makes it go out of scope and leak.
CID #1265765
Allows finer grain decisions about formats and features at each
stage of the pipeline.
Also provide propose_allocation for glupload besed on the supported
methods.
In gst_gl_window_cocoa_draw we used to just call setNeedsDisplay:YES. That was
creating an implicit CA transaction which was getting committed at the next
runloop iteration. Since we don't know how often the main runloop is running,
and when we run it implicitly (from gst_gl_window_cocoa_nsapp_iteration) we only
do so every 200ms, use an explicit CA transaction instead and commit it
immediately. CA transactions nest and debounce automatically so this will never
result in extra work.
Make GstGLMemory hold the texture target (tex_target) the texture it represents
(tex_id) is bound to. Modify gst_gl_memory_wrapped_texture and
gst_gl_download_perform_with_data to take the texture target as an argument.
This change is needed to support wrapping textures created outside libgstgl,
which might be bound to a target other than GL_TEXTURE_2D. For example on OSX
textures coming from VideoToolbox have target GL_TEXTURE_RECTANGLE.
With this change we still keep (and sometimes imply) GL_TEXTURE_2D as the
target of textures created with libgstgl.
API: modify GstGLMemory
API: modify gst_gl_memory_wrapped_texture
API: gst_gl_download_perform_with_data
Don't call glClear && glClearColor at each draw since we're going to draw the
whole viewport anyway. Gets rid of a glFlush triggered by glClear on OSX.
Instead of using the GST_OBJECT_LOCK we should have
a dedicated mutex for the pad as it is also associated
with the mutex on the EVENT_MUTEX on which we wait
in the _chain function of the pad.
The GstAggregatorPad.segment is still protected with the
GST_OBJECT_LOCK.
Remove the gst_aggregator_pad_peak_unlocked method as it does not make
sense anymore with a private lock.
https://bugzilla.gnome.org/show_bug.cgi?id=742684
Some members sometimes used atomic access, sometimes where not locked at
all. Instead consistently use a mutex to protect them, also document
that.
https://bugzilla.gnome.org/show_bug.cgi?id=742684
Reduce the number of locks simplify code, what is protects
is exposed, but the lock was not.
Also means adding an _unlocked version of gst_aggregator_pad_steal_buffer().
https://bugzilla.gnome.org/show_bug.cgi?id=742684
Whenever a GCond is used, the safest paradigm is to protect
the variable which change is signalled by the GCond with the same
mutex that the GCond depends on.
https://bugzilla.gnome.org/show_bug.cgi?id=742684
This can happen if this is a live pipeline and no source produced any buffer
and sent no caps until an output buffer should've been produced according to the
latency.
This fix is similar in spirit to commit be7034d1 by Sebastian for audiomixer.
In order to use pbo's efficiently, the transfer operation has to
be separated from the use of the downloaded data which requires some
rearchitecturing around glcolorconvert/gldownload and elements
Unset out buffer in clip function when we unref the buffer to be
clipped, otherwise aggregator will continue to use the already-
freed buffer. Fixes crash when buffers without timestamps are
being fed to aggregator. Partly because aggregator ignores the
error flow return.
https://bugzilla.gnome.org/show_bug.cgi?id=743334
READ_UE_ALLOWED was almost exclusively used with min == 0, which doesn't
make much point for unsigned integers.
Add a READ_UE_MAX variant and use that instead. Also replaced two usages
of CHECK_ALLOWED (a,0,something) by CHECK_ALLOWED_MAX (a, something)
Depending on the platform, it was only ever implemented to 1) set a
default surface size, 2) resize based on the video frame or 3) nothing.
Instead, provide a set_preferred_size () that elements/applications
can use to request a certain size which may be ignored for
videooverlay/other cases.
Removes the use of NSOpenGL* variety and functions. Any Cocoa
specific functions that took/returned a NSOpenGL* object now
take/return the CGL equivalents.
Add more power to the chunk_received function (renamed to data_received)
and also to the fragment_finish function.
The data_received function must parse/decrypt the data if necessary and
also push it using the new push_buffer function that is exposed now. The
default implementation gets data from the stream adapter (all available)
and pushes it.
The fragment_finish function must also advance the fragment. The default
implementation only advances the fragment.
This allows the subsegment handling in dashdemux to continuously download
the same file from the server instead of stopping at every subsegment
boundary and starting a new request
If we say it is the first segment after a new period it will resync
the segment.start value and all buffers will be late for the new period
we are trying to play. Otherwise we want to keep the segment.start with
the previous value to allow the running time to smoothly increase
Check if there is a next fragment before advancing to avoid causing
a bitrate switch (and maybe exposing new pads) only to push EOS.
This causes playback to stop with an error instead of properly
finishing with EOS message.
The subsegment boundary return tells the adaptivedemux that it can
try to switch to another representation as the stream is at a suitable
position for starting from another bitrate.
In order to get some subsegment information, subclasses might want
to download only the headers to have enough data (the index)
to decide where to start downloading from the subsegment.
This allows the subclasses to know if the chunks that are downloaded are
part of the header or of the index and will parse the parts that are
of their interest.
Ensure that we do not trust the bitstream when filling a table
with a fixed max size.
Additionally, the code was not quite matching what the spec says:
- a value of 3 broke from the loop before adding an entry
- an unhandled value did not add an entry
The reference algorithm does these things differently (7.3.3.1
in ITU-T Rec. H.264 (05/2003)).
This plays (apparently correctly) the original repro file, with
no stack smashing.
Based on a patch and bug report by André Draszik <git@andred.net>
The hack causes deadlocks and other interesting problems and it really
can only be fixed properly inside GLib. We will include a patch for
GLib in our builds for now that handles this, and hopefully at some
point GLib will also merge a proper solution.
A proper solution would first require to refactor the polling in
GMainContext to only provide a single fd, e.g. via epoll/kqueue
or a thread like the one added by our patch. Then this single
fd could be retrieved from the GMainContext and directly integrated
into a NSRunLoop.
https://bugzilla.gnome.org/show_bug.cgi?id=741450https://bugzilla.gnome.org/show_bug.cgi?id=704374
Soon after setting two variables to 1, the code checks if their values are
different from each other. This would never be true. Removing this.
CID 1226443
No need to use an iterator for this which creates a temporary
structure every time and also involves taking and releasing the
object lock many times in the course of iterating. Not to mention
all that GList handling in gst_aggregator_iterate_sinkpads().
The minimum latency is the latency we have to wait at least
to guarantee that all upstreams have produced data. The maximum
latency has no meaning like that and shouldn't be used for waiting.
When iterating sink pads to collect some data, we should take the stream lock so
we don't get stale data and possibly deadlock because of that. This fixes
a definitive deadlock in _wait_and_check() that manifests with high max
latencies in a live pipeline, and fixes other possible race conditions.
Segment start needs only to be updated when starting the streams
or after a seek, doing it during bitrate changes will cause the
running time to go discontinuous (jump back to a previous ts)
and QOS will drop buffers
This simplifies the code and also makes sure that we don't forget to check all
conditions for waiting.
Also fix a potential deadlock caused by not checking if we're actually still
running before starting to wait.
Actually we should always recalculate buffer size since our buffer size
even when not-padded is smaller for many sub-sampled formats. This is
because we don't add padding between the planes.
https://bugzilla.gnome.org/show_bug.cgi?id=740900
Problem was that if buffer was mapped READWRITE (state of buffers from
libav right now), mapping it READ/GL will not upload. This is because the
flag is only set when the buffer is unmapped. We can fix this by setting
the flags in map. This result in already mapped buffer that get mapped
to be read in GL will be uploaded. The problem is that if the write
mapper makes modification afterward, the modification will never get
uploaded.
https://bugzilla.gnome.org/show_bug.cgi?id=740900
When this is TRUE, we really have to produce output. This happens
in live mixing mode when we have to output something for the current
time, no matter if we have enough input or not.
This removes the uses of GAsyncQueue and replaces it with explicit
GMutex, GCond and wakeup count which is used for the non-live case.
For live pipelines, the aggregator waits on the clock until either
data arrives on all sink pads or the expected output buffer time
arrives plus the timeout/latency at which time, the subclass
produces a buffer.
https://bugzilla.gnome.org/show_bug.cgi?id=741146
To avoid race conditions with gst_task_stop(); gst_task_join() with
another thread doing gst_task_pause(), the joining thread would be
waiting for the task to stop but it would never happen. So just
use gst_task_stop() everywhere to prevent more mutexes