We can't just fixate to whatever caps happen to be closest to what we
receive right now; we can only support exactly the caps we receive. So
check the format of each frame and negotiate exactly those caps as
needed when receiving frames. Also re-negotiate whenever the caps change.
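
The per-frame check described above could be sketched roughly as follows. This is a minimal illustration and not the plugin's actual code: `FrameFormat` and `Negotiator` are hypothetical names, and a real element would build GStreamer caps from the frame instead of comparing a plain struct.

```rust
/// Hypothetical description of the format of one received video frame.
#[derive(Debug, Clone, PartialEq, Eq)]
struct FrameFormat {
    width: u32,
    height: u32,
    fps_n: u32,
    fps_d: u32,
}

/// Remembers the format that was negotiated last (hypothetical helper).
#[derive(Debug, Default)]
struct Negotiator {
    current: Option<FrameFormat>,
}

impl Negotiator {
    /// Returns `true` when the caller has to (re-)negotiate caps, i.e. on
    /// the first frame and whenever the format differs from the last one.
    fn update(&mut self, frame: &FrameFormat) -> bool {
        if self.current.as_ref() == Some(frame) {
            return false;
        }
        // Store exactly what was received; nothing is fixated to a
        // "close enough" format.
        self.current = Some(frame.clone());
        true
    }
}
```

Calling `update` once per received frame keeps negotiation tied to the exact format of the incoming data.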
The latter must only be called on memory that was allocated by Rust for
a Vec; otherwise it will cause crashes, depending on the platform.
It would also free the memory as if a Vec had been allocated, which
would free memory that we don't own to begin with.
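
To illustrate the ownership rule above, here is a minimal sketch. It assumes the problematic call is something like `Vec::from_raw_parts` applied to a buffer owned by a C library (the message does not name the function here), and the helper name `copy_foreign_buffer` is hypothetical. Borrowing the memory as a slice and copying it avoids ever freeing memory that Rust did not allocate.

```rust
use std::slice;

/// Copies `len` bytes out of a buffer owned by foreign code into a new Vec.
///
/// # Safety
/// `data` must be non-null and valid for reads of `len` bytes for the
/// duration of the call.
unsafe fn copy_foreign_buffer(data: *const u8, len: usize) -> Vec<u8> {
    // Borrow the foreign memory as a slice. Dropping a slice never frees
    // anything, so ownership stays with whoever allocated the buffer.
    let borrowed = unsafe { slice::from_raw_parts(data, len) };
    // The copy is allocated by Rust, so it is safe for the Vec to free it.
    borrowed.to_vec()
}
```

The cost is one extra copy per buffer; whether that is acceptable depends on the buffer sizes involved.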
Because it is possible to connect to two or more streams simultaneously whose clocks are not synchronized with each other, the timestamp calculation has to be improved to handle this.
Prior to this commit, we saved the first timestamp that arrived and used it to calculate the running time (the pts field of the GStreamer buffer) for all later frames in all of the streams. This led to problems when connecting to multiple streams on multiple computers whose clocks were not correctly synchronized.
To fix this, we now save a separate initial timestamp for each stream.
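
A rough sketch of the per-stream bookkeeping described above, under stated assumptions: `StreamClocks` and the `u32` stream id are hypothetical, and timestamps are assumed to be in 100 ns units. The real element stores this state differently.

```rust
use std::collections::HashMap;
use std::time::Duration;

/// Keeps one initial timestamp per stream (hypothetical helper).
#[derive(Debug, Default)]
struct StreamClocks {
    // Stream id -> first timestamp seen on that stream, in 100 ns units.
    initial: HashMap<u32, i64>,
}

impl StreamClocks {
    /// Returns the running time of a frame relative to the first frame of
    /// the same stream, so streams with unrelated clocks do not interfere.
    fn running_time(&mut self, stream_id: u32, timestamp: i64) -> Duration {
        // The first timestamp seen on a stream becomes its reference point.
        let first = *self.initial.entry(stream_id).or_insert(timestamp);
        // Clamp to zero in case a timestamp precedes the reference.
        let elapsed = timestamp.saturating_sub(first).max(0) as u64;
        // Convert 100 ns units to nanoseconds.
        Duration::from_nanos(elapsed.saturating_mul(100))
    }
}
```

With a single shared reference, the second stream's first frame would get a running time equal to the offset between the two sender clocks instead of zero.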
See: https://github.com/FFmpeg/FFmpeg/blob/master/libavdevice/libndi_newtek_common.h#L27
From NDI SDK Documentation:
This is the timecode of this frame in 100ns intervals. This is generally not used internally by the SDK, but is passed
through to applications who may interpret it as they wish. When sending data, a value of
NDIlib_send_timecode_synthesize can be specified (and should be the default), the operation of this value is
documented in the sending section of this documentation. NDIlib_send_timecode_synthesize will yield UTC
time in 100ns intervals since the Unix Time Epoch 1/1/1970 00:00. When interpreting this timecode a receiving
application may choose to localise the time of day based on time zone offset which can optionally be communicated by
the sender in connection metadata. Since timecode is stored in UTC within NDI, communicating timecode time of day for
non UTC time zones requires a translation
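
As an illustration of the 100 ns timecode unit quoted above, a receiver could convert a synthesized timecode to a UTC point in time roughly like this. The function name `timecode_to_system_time` is hypothetical and only the standard library is used; it is a sketch, not part of the NDI SDK or the plugin.

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// Interprets `timecode` as 100 ns intervals since the Unix epoch (UTC).
/// Returns `None` for negative or unrepresentable values.
fn timecode_to_system_time(timecode: i64) -> Option<SystemTime> {
    if timecode < 0 {
        return None;
    }
    let ticks = timecode as u64;
    // 10_000_000 intervals of 100 ns make up one second.
    let secs = ticks / 10_000_000;
    let nanos = (ticks % 10_000_000) * 100;
    UNIX_EPOCH.checked_add(Duration::new(secs, nanos as u32))
}
```

Localising the result to a time zone is left to the application, as the quoted documentation notes.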
We could also go via the glib::Type but that requires more steps unless
we also add a getter from the registered type to the audio/video source
modules.