This patch includes:
1\ Implements MsdkDmaBufAllocator and allocation of msdk dmabuf memroy.
2\ Each msdk dmabuf memory include its own msdk surface kept by GQuark.
3\ Adds new option GST_BUFFER_POOL_OPTION_MSDK_USE_DMABUF
https://bugzilla.gnome.org/show_bug.cgi?id=793707
There needs to be generalized for the parameter from
GstVideoMsdkVideoMemory to GstMemory.
Thus we can call these functions if using DMABuf memory.
https://bugzilla.gnome.org/show_bug.cgi?id=793707
For example, if framerate 0/1 is provided from upstream, the driver
fails to configure and complain about it.
We can let it go and make the driver assuming framerate itself.
https://bugzilla.gnome.org/show_bug.cgi?id=789752
There is no log of gst_decklink_com_thread () which initializes COM.
The initialization part is not valid with #ifdef MSC_VER.
Windows binaries are built with gcc.
As with other codes, it was avoidable by setting it to G_OS_WIN32
instead of MSC_VER.
https://bugzilla.gnome.org/show_bug.cgi?id=794652
There was not handling the end of encoding sequence in encoder.
This patch does drain any remaining internal streams while decoder
already does this.
Document says:
"To mark the end of the encoding sequence, call this function with a
NULL surface
pointer. Repeat the call to drain any remaining internally cached
bitstreams—one
frame at a time—until MFX_ERR_MORE_DATA is returned."
https://bugzilla.gnome.org/show_bug.cgi?id=793236
Sometimes parent context is released before its children get released.
In this case MFXClose of parent session fails.
To make sure that child sessions are closed before closing a parent
session,
Parent context needs to manage child sessions and close them first when
it's released.
https://bugzilla.gnome.org/show_bug.cgi?id=793412
Currently a gst buffer has one mfxFrameSurface when it's allocated and
can't be changed.
This is based on that the life of gst buffer and mfxFrameSurface would
be same.
But it's not true. Sometimes even if a gst buffer of a frame is finished
on downstream,
mfxFramesurface coupled with the gst buffer is still locked, which means
it's still being used in the driver.
So this patch does this.
Every time a gst buffer is acquired from the pool, it confirms if the
surface coupled with the buffer is unlocked.
If not, replace it with new unlocked one.
In this way, user(decoder or encoder) doesn't need to manage gst buffers
including locked surface.
To do that, this patch includes the following:
1. GstMsdkContext
- Manages MSDK surfaces available, used, locked respectively as the
following:
1\ surfaces_avail : surfaces which are free and unused anywhere
2\ surfaces_used : surfaces coupled with a gst buffer and being used
now.
3\ surfaces_locked : surfaces still locked even after the gst buffer
is released.
- Provide an api to get MSDK surface available.
- Provide an api to release MSDK surface.
2. GstMsdkVideoMemory
- Gets a surface available when it's allocated.
- Provide an api to get an available surface with new unlocked one.
- Provide an api to release surface in the msdk video memory.
3. GstMsdkBufferPool
- In acquire_buffer, every time a gst buffer is acquired, get new
available surface from the list.
- In release_buffer, it confirms if the buffer's surface is unlocked or
not.
- If unlocked, it is put to the available list.
- If still locked, it is put to the locked list.
This also fixes bug #793525.
https://bugzilla.gnome.org/show_bug.cgi?id=793413https://bugzilla.gnome.org/show_bug.cgi?id=793525
Directsoundsrc/sink have multiple issues, most of which cannot be
fixed at all because the API is deprecated and is implemented as a
compatibility wrapper around WASAPI since Vista.
Users and developers should now use the wasapisrc/sink elements, and
future development efforts should go towards that.
The low-latency property is *always* safe to enable, so applications
that do realtime communication should set it, and the elements will
automatically configure WASAPI to use the lowest possible device
period, and the audioringbuffer in audiobasesink will also be
configured accordingly.
Applications can also use exclusive mode during capture and playback
for the lowest possible latency if they know that the device will not
be used by any other application.
In this mode, the latency-time and buffer-time properties will be
completely ignored.
The AudioClient3 API is only available on Windows 10, and we will
automatically detect when it is available and use it.
However, using it for capturing audio with low latency and without
glitches seems to require setting the realtime priority of the entire
pipeline to "critical", which we cannot do from inside the element.
Hence, we can only enable that by default for wasapisink since
apps should be able to safely set the low-latency property to TRUE if
they need low-latency capture or playback.
This allows us to request ultra-low-latency device periods even in
shared mode. However, this requires good drivers and Windows 10, so
we only enable this when we detect that we are running on Windows 10
at runtime.
You can forcibly disable this feature on Windows 10 by setting
GST_WASAPI_DISABLE_AUDIOCLIENT3=1 in the environment.
Since there is already an "adaptive-B" option, just
use boolean property for B-pyramid enabling.
Fixme: Not sure whether this can be supported in vp8 and vp9.
It could be possible through GPB (b without backward ref) but
can't verify currently. We can move this as common property
once verified with vp8 and vp9 without breaking any backward
compatibility.
https://bugzilla.gnome.org/show_bug.cgi?id=791637
Add a new property "trellis" to enable trellis quantization.
Keeping trellis as a flag value (which is boolean for gst x264 enc element)
since it is possible to enable/disable this seperately for
I,P and B frames through MediaSDK ext option headers.
The subclass implementations always need to inform base-encoder
if it requires the inclusion of Extend Header buffers (mfxExtCodingOption2
and mfxExtCodingOption3).
https://bugzilla.gnome.org/show_bug.cgi?id=791637
This option controls down sampling in look ahead bitrate
control mode. According to spec it is only supported in AVC.
Fixme: Probably HEVC also have support for this in recent
MSDK versions. We could move the enumeration types to common
header usable for multiple codecs.
https://bugzilla.gnome.org/show_bug.cgi?id=791637
MediaSDK has support for a number of rate control algorithms.
Adding all possible options to the property rate-control.
Fixme1: In case of failure, currently we don't have a proper method
to show which rate-control has been failed. It could be better
to add some extensive validation on EncQuery output in case of error.
Unfortunately, not all ratecontrol methods are supported by every codecs
and we don't have the dynamic detection of supported ratecontrol methods yet.
https://bugzilla.gnome.org/show_bug.cgi?id=791637
We have the property "i-frames" to set the IDR interval in a
gop. Unfortunately MSDK HEVC encoder behaves bit differently
for IdrInterval field, IdrInteval == 1 indicate every
I-frame should be an IDR (which is IdrInterval == 0 for other codecs),
IdrInteval == 2 means every other I-frame is an IDR
(which is IdrInterval == 1 for other codecs) etc.
So we generalize the behaviour of property "i-frames" by
incrementing the value by one in each case (only for HEVC).
https://bugzilla.gnome.org/show_bug.cgi?id=791637
The base encoder common properties are not valid for
mjpeg encoder where there is no motion compensation or rate control.
Delaying the property installation on the base gobject
untill the subclass class_init get invoked.
https://bugzilla.gnome.org/show_bug.cgi?id=791637
The gst-msdk decoders prefer packetized streams as input
and in this case we can avoid unnecessary input bitstream copy
to mfxBitstream. This works fine for codecs like h264 where
we only support byte-stream with au alignment. Other format
conversions should be done thorugh parsers. But this won't work
for codecs like vc1 where we don't have an autoplugged parser.
Even the parser is not capable to do format conversions.
Packetizing through base decoders parse() routine will bring a
lot of uncecessary of complexities and codecparser libraray dependency.
So we just use an interal gst_adaper to keep track of bitstream
which is not consumed by msdk durig AsynchronusDecoding.
This adapter will get used only if subclass implementations
set the "is_packetized" to FALSE for msdk base encoder.
https://bugzilla.gnome.org/show_bug.cgi?id=792589
Adding Simple and Main profiles decode support.
Currently msdkvc1dec is not capable to handle the codec_data,
only instream headers are supported. Also msdk vc1 decoder
expecting instream with Sequence header as per SMPTE 421M Annex L.
Most of the decdoebin/playbin pipeline won't work with the above
constraints
because vc1parse is still not an autoplug element.
Only way to make mskdvc1dec work is by connecting a vc1parse
as an upstream element.
https://bugzilla.gnome.org/show_bug.cgi?id=792589
Use drm render node as the first choice of device node file.
Fall backs to use drm primary (/dev/dri/card[0-9])
if there is no render node available
Basic logic is inherited from gstreamer-vaapi, but using
gudev API rather than libudev directly.
Added gudev library as dependency for msdk.
https://bugzilla.gnome.org/show_bug.cgi?id=791599
1\ If downstream's pool is MSDK bufferpool,
2\ If there's shared GstMsdkContext in the pipeline,
a decoder decides to use video memory.
This policy should be improved to handle more cases.
https://bugzilla.gnome.org/show_bug.cgi?id=790752
In case that pipeline is like ".. ! decoder ! encoder ! ..." with using
video memory,
decoder needs to know the async depth of the following msdk element so
that it could
allocate the correct number of video memory.
Otherwise, decoder's memory is exhausted while processing.
https://bugzilla.gnome.org/show_bug.cgi?id=790752
How to share/create GstMsdkcontext is the following:
- Search GstMsdkContext if there's in the pipeline.
- If found, check if it's decoder, encoder or vpp by job type.
- If it's same job type, it creates another instance of
GstMsdkContext
with joined-session.
- Otherwise just use the shared GstMsdkContext.
- If not found, just creates new instance of GstMsdkContext.
https://bugzilla.gnome.org/show_bug.cgi?id=790752
According to the driver's instruction, if there are two or more encoders
or decoders in a process, the session should be joined by
MFXJoinSession.
To achieve this successfully by GstContext, this patch adds job type
specified if it's encoder, decoder or vpp.
If a msdk element gets to know if joining session is needed by the
shared context,
it should create another instance of GstContext with joined session,
which
is not shared.
https://bugzilla.gnome.org/show_bug.cgi?id=790752
1\ In decide_allocation, it makes its own msdk bufferpool.
- If downstream supports video meta, it just replace it with the msdk
bufferpool.
- If not, it uses the msdk bufferpool as a side pool, which will be
decoded into.
and will copy it to downstream's bufferpool.
2\ Decide if using video memory or system memory.
- This is not completed in this patch.
- It might be decided in update_src_caps.
- But tested for both system memory and video memory cases.
https://bugzilla.gnome.org/show_bug.cgi?id=790752
1\ Proposes msdk bufferpool to upstream.
- If upstream has accepted the proposed msdk bufferpool,
encoder can get msdk surface from the buffer directly.
- If not, encoder get msdk surface its own msdk bufferpool
and copy from upstream's frame to the surface.
2\ Replace arrays of surfaces with msdk bufferpool.
3\ In case of using VPP, there should be another msdk bufferpool
with NV12 info so that it could convert first and encode.
Calls gst_msdk_set_frame_allocator and uses video memory only on linux.
and uses system memory on Windows until d3d allocator is implemented.
https://bugzilla.gnome.org/show_bug.cgi?id=790752
Implements 2 memory allocators:
1\ GstMsdkSystemAllocator: This will allocate system memory.
2\ GstMsdkVideoAllocator: This will allocate device memory depending
on the platform. (eg. VASurface)
Currently GstMsdkBufferPool uses video allocator currently by default
only on linux. On Windows, we should use system memory until d3d
allocator
is implemented.
https://bugzilla.gnome.org/show_bug.cgi?id=790752
Implements msdk frame allocator which is required from the driver.
Also makes these functions global so that GstMsdkAllocator could use
the allocated video memory later and couple with GstMsdkMemory.
GstMsdkContext keeps allocation information such as mfxFrameAllocRequest
and mfxFrameAllocResponse after allocation.
https://bugzilla.gnome.org/show_bug.cgi?id=790752
Makes GstMsdkContext to be a descendant of GstObject so that
we could track the life-cycle of the session of the driver.
Also replaces MsdkContext with this one.
Keeps msdk_d3d.c alive for the future.
https://bugzilla.gnome.org/show_bug.cgi?id=790752
Same changes as done for wasapisink in cbe2fc40a. Turns out this is
sometimes also needed for capture. Reported by Mathieu_Du.
Also improve logging in that case for easier debugging.
Sometimes the minimum period advertised by a card results in an
unaligned buffer size error during initialization in exclusive mode.
In that case, we can fetch the actual buffer size in frames and
calculate the period from that.
We can't do this pre-emptively because we can't call GetBufferSize
till Initialize has been called at least once.
https://bugzilla.gnome.org/show_bug.cgi?id=793289
This reduces the chances of startup glitches, and also reduces the
chances that we'll get garbled output due to driver bugs.
Recommended by the WASAPI documentation.
https://bugzilla.gnome.org/show_bug.cgi?id=793289
So far, we have been completely discarding the values of latency-time
and buffer-time and trying to always open the device in the lowest
latency mode possible. However, sometimes this is a bad idea:
1. When we want to save power/CPU and don't want low latency
2. When the lowest latency setting causes glitches
3. Other audio-driver bugs
Now we will try to follow the user-set values of latency-time and
buffer-time in shared mode, and only latency-time in exclusive mode (we
have no control over the hardware buffer size, and there is no use in
setting GstAudioRingBuffer size to something larger).
The elements will still try to open the devices in the lowest latency
mode possible if you set the "low-latency" property to "true".
https://bugzilla.gnome.org/show_bug.cgi?id=793289
This requires using allocated strings, but it's the best option. For
instance, a call could fail because CoInitialize() wasn't called, or
because some other thing in the stack failed.
https://bugzilla.gnome.org/show_bug.cgi?id=793289
This is particularly important when running in exclusive mode because
any delays will immediately cause glitching.
The MinGW version in Cerbero is too old, so we can only enable this when
building with MSVC or when people build GStreamer for MSYS2 or other
MinGW-based distributions.
To force-enable this code when building with MinGW, build with
CFLAGS="-DGST_FORCE_WIN_AVRT -lavrt".
https://bugzilla.gnome.org/show_bug.cgi?id=793289
This provides much lower latency compared to opening in shared mode,
but it also means that the device cannot be opened by any other
application. The advantage is that the achievable latency is much
lower.
In shared mode, WASAPI's engine period is 10ms, and so that is the
lowest latency achievable.
In exclusive mode, the limit is the device period itself, which in my
testing with USB DACs, on-board PCI sound-cards, and HDMI cards is
between 2ms and 3.33ms.
We set our audioringbuffer limits to match the device, so the
achievable sink latency is 6-9ms. Further improvements can be made if
needed.
https://bugzilla.gnome.org/show_bug.cgi?id=793289
We will use ->device for storing a pointer to the IMMDevice structure
which is needed for fetching the caps supported by devices in
exclusive mode.
https://bugzilla.gnome.org/show_bug.cgi?id=793289
This will set the actual-latency-time and actual-buffer-time of the sink
and source.
We completely ignore the latency-time/buffer-time values set
on the element because WASAPI is happiest when it is reading/writing at
the default period. Improving this will likely require the use of the
IAudioClient3 interfaces which are not available in MinGW yet.
https://bugzilla.gnome.org/show_bug.cgi?id=792897
Currently only does probing and does not handle messages from
endpoints/devices. In the future we want to do proper monitoring which
is well-supported in WASAPI.
https://bugzilla.gnome.org/show_bug.cgi?id=792897
We need to parse the WAVEFORMATEXTENSIBLE structure, figure out what
positions the channels have (if they are positional), and reorder them
as necessary.
https://bugzilla.gnome.org/show_bug.cgi?id=792897
There is no fixed limitation for the number of devices on the
decklink API side according to BlackMagic. Many PC motherboards
are able support 6 decklink cards each with up to 8 inputs so
a limit of 16 might well be too low.
https://bugzilla.gnome.org/show_bug.cgi?id=777239
Both the source and the sink elements were broken in a number of ways:
* prepare() was assuming that the format was always S16LE 2ch 44.1KHz.
We now probe the preferred format with GetMixFormat().
* Device initialization was done with the wrong buffer size
(buffer_time is in microseconds, not nanoseconds).
* sink_write() and src_read() were just plain wrong and would never
write or read anything useful.
* Some functions in prepare() were always returning FALSE which meant
trying to use the elements would *always* fail.
* get_caps() and delay() were not implemented at all.
TODO: support for >2 channels
TODO: pro-audio low-latency
TODO: SPDIF and other encoded passthroughs
Three new properties are now implemented: role, mute, and device.
* 'role' designates the stream role of the initialized device, see:
https://msdn.microsoft.com/en-us/library/windows/desktop/dd370842(v=vs.85).aspx
* 'device' is a system-wide GUIDesque string for a specific device.
* 'mute' is a sink property and simply mutes it.
On my Windows 8.1 system, the lowest latency that works is:
wasapisrc buffer-time=20000
wasapisink buffer-time=10000
aka, 20ms and 10ms respectively. These values are close to the lowest
possible with the IAudioClient interface. Further improvements require
porting to IAudioClient2 or IAudioClient3.
https://docs.microsoft.com/en-us/windows-hardware/drivers/audio/low-latency-audio
Sometimes we might get an audio packet without a corresponding video
frame. In these cases, the stream and hardware reference timestamps
would be missing, because they're called on the video frame. Instead of
potentially breaking stuff downstream that might depend on these, we now
extrapolate them.
https://bugzilla.gnome.org/show_bug.cgi?id=792042
When we receive a video or audio buffer, we calculate the next stream
time based on the current stream time + buffer duration. If the next
buffer's stream time is after that, we issue a warning.
This happens because the stream time incoming from Decklink should be
really constant and without gaps. If there is a gap, it means that
something went wrong, e.g. the internal buffer pool is empty (too many
buffers queued up downstream).
https://bugzilla.gnome.org/show_bug.cgi?id=781776
Sometimes we might get an audio packet without a corresponding video
frame. In these cases, the stream and hardware reference timestamps
would be missing, because they're called on the video frame. Instead of
potentially breaking stuff downstream that might depend on these, we now
extrapolate them.
https://bugzilla.gnome.org/show_bug.cgi?id=792042
The correct behaviour of anything stuck in the ->render() function
between ->unlock() and ->unlock_stop() is to call
gst_base_sink_wait_preroll() and only return an error if this returns an
error, otherwise, it must continue where it left off!
https://bugzilla.gnome.org/show_bug.cgi?id=774950
Not only if the video sink is set to PLAYING so far. Also give more
useful debug output about why we don't start, and don't start if already
started.
Also refactor the function to early-return instead of having a huge
if-else block over the whole function.
https://bugzilla.gnome.org/show_bug.cgi?id=790114
The Decklink and GstAudioBaseSink APIs don't fit very well together,
which causes various problems due to inaccuracies in the clock
calculations and the actual ringbuffer and GStreamer's copy getting of
sync.
Problems are audio drop-outs and A/V sync getting wrong after
pausing/seeking.
https://bugzilla.gnome.org/show_bug.cgi?id=790114
When we cannot scale, we need to enforce the pixel aspect ratio.
This was partly implemented in the previous patch. Doing this
simplify some of the code.
https://bugzilla.gnome.org/show_bug.cgi?id=784599
1. Similar to 880f3d8, don't consider not getting an output buffer as
an error during flushing. I've seen the following sometimes when
encoding:
W GStreamer+amcvideoenc: java.lang.IllegalStateException
W GStreamer+amcvideoenc: at android.media.MediaCodec.getBuffer(Native Method)
W GStreamer+amcvideoenc: at android.media.MediaCodec.getOutputBuffer(MediaCodec.java:2886)
2. For amcvideodec/enc, call _find_nearest_frame (which grabs a fresh
reference on a GstVideoCodecFrame) after we have an output buffer,
so as to not leak the reference, in case getting an output buffer
fails.
Otherwise, if we get an error grabbing the output buffer, we leak
the reference to the frame. This can cause issues with a
v4l2bufferpool feeding the encoder not being able to clean itself
up properly due to buffers still being marked as in-use.
https://bugzilla.gnome.org/show_bug.cgi?id=791258
This is to be used with gst_video_overlay_set_render_rectangle()
so the application can calculate a rectangle that fits inside
the display. The property changes are notify in a way that you
can watch either notify::display-width or notify::display-height
and both will be up-to-data when this is called back. Before the
element is started, the size will be 0x0.
https://bugzilla.gnome.org/show_bug.cgi?id=784599
Implement videooverlay interface in kmssink, divided into two cases:
when driver supports scale, then we do refresh in show_frame(); if
not, send a reconfigure event to upstream and re-negotiate, using the
new size.
https://bugzilla.gnome.org/show_bug.cgi?id=784599
If the driver requires more data, just unref the frame at the moment
then retreive/finish the frame after encoding is finished.
This also fixes a memory leak.
https://bugzilla.gnome.org/show_bug.cgi?id=790312