No mutex is locked while calling any OpenMAX functions anymore
and everything from the OpenMAX callbacks is inserted into a message
queue and handled from outside the callbacks.
Also there's only a single mutex and condition variable per component
now for handling anything from OpenMAX callbacks and a single mutex
for keeping our component/port state sane.
Between lock() and begin_recursion() it was possible for another thread to
try to do a recursive_lock(). This would block because the mutex was already
locked(), but not ready for recursive locking yet. unlock() would never
happen in the original thread because it was waiting for the other thread
to finish first.
Happened on the Raspberry Pi.
According to the OMX specification, implementations are allowed to call
callbacks in the context of their function calls. However, our callbacks
take locks and this causes deadlocks if the unerlying OMX implementation
uses this kind of in-context calls.
A solution to the problem would be a recursive mutex. However, a normal
recursive mutex does not fix the problem because it is not guaranteed
that the callbacks are called from the same thread. What we see in Broadcom's
implementation for example is:
- OMX_Foo is called
- OMX_Foo waits on a condition
- A callback is executed in a different thread
- When the callback returns, its calling function
signals the condition that OMX_Foo waits on
- OMX_Foo wakes up and returns
The solution I came up with here is to take a second lock inside the callback,
but only if recursion is expected to happen. Therefore, all calls to OMX
functions are guarded by calls to gst_omx_rec_mutex_begin_recursion() / _end_recursion(),
which effectively tells the mutex that at this point we want to allow calls
to _recursive_lock() to succeed, although we are still holding the master lock.
This happens on the Galaxy Nexus, and causes the pipeline to hang waiting
endlessly for a drain. The hack replaces the wait with a wait + 500ms timeout.
gst_omx_port_set_flushing() calls OMX_FillThisBuffer at the end of a flush
without releasing the port lock, and this can cause a deadlock with the
EventHandler. This patches fixes this by dropping the lock for the duration of
the fill buffer call.