In gst-msdk, a mfx session may be shared between different gst
elements, each element tries to set the frame allocator. However, per
the MSDK documation[1], the behavior is undefined if reset the frame
allocator while the previous allocator is in use. Fortunately all
elements use the same frame allocator, so we can avoid to call
MFXVideoCORE_SetFrameAllocator again.
[1]: https://software.intel.com/en-us/node/628430#MFXVideoCORE3
If so, BGRA is the preferred output format hence BGRA will be selected
as input format by default, e.g. in the pipleline below, BGRA instead of
NV12 is selected without renegotiation, so we can avoid the NV12 issue
(see commit 3f2314a) by default.
gst-launch-1.0 videotestsrc ! msdkvpp ! glimagesink
Otherwise MFXVideoVPP_Init will fail because it is called twice without
a close.
Example pipeline:
gst-launch-1.0 videotestsrc ! msdkvpp ! glimagesink
Sometimes glimagesink emits GST_EVENT_RECONFIGURE event which results
in that MFXVideoVPP_Init is called twice, then get the negotiation
failure below:
0:00:00.093715518 21218 0x558ef56231e0 ERROR msdkvpp
gstmsdkvpp.c:995:gst_msdkvpp_initialize:<msdkvpp0> Init failed
(undefined behavior)
WARNING: from element /GstPipeline:pipeline0/GstMsdkVPP:msdkvpp0: not
negotiated
After applying this commit, the pipeline above may run without
negotiation failure, however NV12 layout in dmabuf mode is selected in
renegotiation, the display image is corrupted due to the NV12 issue which
was mentioned in commit 3f2314a. Some other fixes are needed to avoid
renegotiation by default
In general, we should assume any unhandled error is
non-recoverable.
In the flush frames loop, some error states can cause us
to never increment the task and therefore we get stuck
in an infinite loop and generate GST_ELEMENT_ERROR
over and over again. This eventually consumes all
system memory and triggers OOM. Thus, assume the worst
and break out of the loop upon the first "unhandled" error.
https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/issues/859
We will add more profiles in the sink caps of msdkh265enc, so let
msdkh265enc re-add the sink pad template. Note this change doesn't
impact any capability
CodecProfile will be set in MFXVideoDECODE_DecodeHeader() to match
the input stream. Setting the hard-coded profile here will mislead
user that msdkh265dec supports a special profile only.
Previously alloc_info is initialized when both thiz->initialized
and thiz->allocation_caps are true, but only thiz->initialized is
checked when alloc_info is used.
MediaSDK has been released as open source [1], but the directories
where it installs its files, are different from the binary only
distribution.
This patch adds to the libraries path the directory /lib. Also it
is defined in meson if the include directory has the mfx/ prefix,
something that is already handled in autotools.
1. https://github.com/Intel-Media-SDK/MediaSDK
the 2018.3.1 intel sdk release places libraries into /lib64 instead of
/lib/lin_x64 or /lib/x64, this commit adds /lib64 to the libdir
locations list
Fixes#815
Use async_depth for latency calcuation instead of
the length of Tasks array which could be NULL since we
don't do the msdk decoder init in set_format().
According to MediaSDK specification,
Width must be a multiple of 16 and Height must be a multiple
of 16 for progressive frame sequence and a multiple of 32 otherwise.
This patch sets a 16 bit alignment for width and 32 bit alignment
for height as default.
https://bugzilla.gnome.org/show_bug.cgi?id=796566
In cases where we do hard resest, the current code destroys the frame
which has new resolution bit early and this causes buffer_unmap
warnings. Keep an extra ref to the frame internally to avoid this.
The gst-msdk decoders only support packetized formats for
all codecs except VC1. For VC1, it supports codec_data for advanced
profiles and this codec_data wan't submitting to MSDK's DecodeHeader APIs.
Make sure the subclass deocders correctly configured so that
the codec_data buffers are in place in the internal adapter for
MediaSDK's DecoderHeader usage.
Currently we use the gst_video_decoder_get_oldest_frame()
to get the old pending frame to output. But this is not correct
if pts re-ordering required. This patch uses a custom made
get_old_frame() which accounts the PTS too similar to the
v4l2decoder.
https://bugzilla.gnome.org/show_bug.cgi?id=796699
The patch adds a serios of changes to support dynamic resolution
change and efficient utilization of resources.
Major changes:
-- Use MSDK's apis to retrieve the headers instead of only relying
on upsteram notification. For eg: avc decoder requires SEI header
information for dpb count calculation which we don't get from caps.
-- For all codecs other than VP9, we force the reset of decoder
if resoultion changes to fit with gstreamer flow. VP9 enfource
the hard reset only if the new resolution is bigger.
-- delay the src caps setting till msdk api's invokation in
handle_frame to avoid caching multiple configuration values
-- ensure pool negotiation is based on decoder's allocation_caps.
--dynamic resoluttion change use an explicit allocation_query
to reclaim the buffers before closing the decoder (thanks to v4l2dec)
--In case if we don't get upstream notification of res change (for eg,
this can can happen for vp9 frames with ivfheader where ivfparse
is not able to notify the dynamic changes), we handle the the case
based on MFX_ERR_INCOMPATIBLE_VIDEO_PARAM which is the return value
of MFXVideoDECODE_DecodeFrameAsync
-- calculate the minimum surfaces to be preallocated based on
msdk suggestion, downstream requirement, async depth and scratch surface
count for smooth display.
https://bugzilla.gnome.org/show_bug.cgi?id=796566
According to msdk spec, there are two ways to enable filters:
1: Filters can be enabled by adding a filter ID
to mfxExtVPPDoUse. In this case, default filter parameters are used
2: Add filter configuration structures directly to mfxVideoParam.
Using 1 with 2 is optional but legal. Unfortunately it won't work
with some specific use cases like Detail/EdgeEnhancement.
Let's stick with option2 which works fine for all VPP operations.
https://bugzilla.gnome.org/show_bug.cgi?id=796468
Since we do the MSDK initializing in set_caps(), a FALSE
return may still cause the invokation of set_caps() again
and this will leads to buffer allocation and other mess-up.
So make sure the msdk initialized correctly before trying
to do any buffer allocation.
https://bugzilla.gnome.org/show_bug.cgi?id=796465
Make sure all the enabled filter structures are added in the
mfxVideoParm before doing the VPPQuery so that msdk
can do the input param validation
https://bugzilla.gnome.org/show_bug.cgi?id=796465
Using NV12 layout in dmabuf mode giving mis-aligned
VPP output with the media-driver. Keep the NV12 support
(so that we can file the bug agianst msdk or mediadriver),
but lower the ordering so that BGRA picks as default.
NV12 issue can be reproduced with explicit capfilter:
vidoetestsrc ! msdkvpp ! video/x-raw\(memory:DMABuf\),format=NV12 ! glimagesink
Added a utility method to replace the MemID (interanl VASurfaceID)
associated with the mfxFrameSurface. This is usefull for dmabuf-import
where we need to replace the memID dynamically
https://bugzilla.gnome.org/show_bug.cgi?id=794817
Exporting DRM_PRIME fd to VASurface requires direct
invocation of VA api VACreateSurface with
VASurfaceAttribExternalBufferDescriptor and other
necessary surface attributes.
https://bugzilla.gnome.org/show_bug.cgi?id=794817
The new property "output-order" can be set to either "display" order
which is the default where frames will be outputting in display order,
or "decoded-order" which will be outputting the frames in decoded order.
The "decoded order" output is generally useful for debugging. But there
are few
customers who use it for low-latency streaming. For eg if the customer
already knows that the stream doesn't have b-frames (which means no
algorithm requires for display order calculation), then they can use
"decoded-order"
output to skip some of the DPB logic to avoid the frame accumulation at
start-up.
The root cause of the above issue is a bit of unclarity in h264 spec +
lazy implementation of many H264 encoders; This is well handled in
gstreamer-vaapi using "low-latency" property:
https://bugzilla.gnome.org/show_bug.cgi?id=762509https://bugzilla.gnome.org/show_bug.cgi?id=795783
For packetized input, inform the msdk that the buffer has
a complete frame or complementary field pairs. For decoding,
this means that the decoder can proceed with this buffer without
waiting for the start of the next frame, which effectively reduces
decoding latency.
https://bugzilla.gnome.org/show_bug.cgi?id=795783
Currently we use an async depth of 4 as default (based on
recommendations
in msdk apps), which indicates how many asynchronous operations an
application performs
before the application explicitly synchronizes the result. As a result,
we
queue four frames in decoder which might not be good approach for
live streaming.
This patch reset the async-depth to 1 as default so that we do sync for
each frame we decode without queuing. Customer can play with already
exposed "async-depth" property for other use cases
https://bugzilla.gnome.org/show_bug.cgi?id=795783
So far msdk produced dmabuf fds are non-mappable.
If user wants to download the content of underlined surfaces,
dmabufcapsfeature negotiated pipeline will fail. So if the input surface
is dmabuf and downstream doesn't have support for dmabuf capsfeatures,
we do the vpp (no passthrough) and produce the mappable videomemory
buffers.
https://bugzilla.gnome.org/show_bug.cgi?id=794946
prpose_allocation:
-- always instantiate a pool for for upstream
-- use async_depth + 1 as min buffer count
decide_allocation:
-- always create a new bufferpool for source pad.
Each of the msdk element has to create it's own mfxsurfacepool
which is an msdk contraint. For eg: Each Msdk component (vpp, dec and
enc)
will invoke the external Frame allocator for video-memory usage
So sharing the pool between gst-msdk elements might not be a good idea.
https://bugzilla.gnome.org/show_bug.cgi?id=793705
Using the default value (InterleavedDec == MFX_SCANTYPE_UNKNOWN)
causing issues with non-interleaved sample decode. Ideally the usage
of MFXVideoDECODE_DecodeHeader should fix these type of issue, but
it seems to be not. But hardcoding the InterleaveDec to
MFX_SCANTYPE_NONINTERLEAVED
is fixing the problem and fortunately msdk seems to be taking care of
Interleaved samples
too .So let's hardcode it for now.
https://bugzilla.gnome.org/show_bug.cgi?id=793787
This patch includes:
1\ Implements MsdkDmaBufAllocator and allocation of msdk dmabuf memroy.
2\ Each msdk dmabuf memory include its own msdk surface kept by GQuark.
3\ Adds new option GST_BUFFER_POOL_OPTION_MSDK_USE_DMABUF
https://bugzilla.gnome.org/show_bug.cgi?id=793707
There needs to be generalized for the parameter from
GstVideoMsdkVideoMemory to GstMemory.
Thus we can call these functions if using DMABuf memory.
https://bugzilla.gnome.org/show_bug.cgi?id=793707
For example, if framerate 0/1 is provided from upstream, the driver
fails to configure and complain about it.
We can let it go and make the driver assuming framerate itself.
https://bugzilla.gnome.org/show_bug.cgi?id=789752
There was not handling the end of encoding sequence in encoder.
This patch does drain any remaining internal streams while decoder
already does this.
Document says:
"To mark the end of the encoding sequence, call this function with a
NULL surface
pointer. Repeat the call to drain any remaining internally cached
bitstreams—one
frame at a time—until MFX_ERR_MORE_DATA is returned."
https://bugzilla.gnome.org/show_bug.cgi?id=793236
Sometimes parent context is released before its children get released.
In this case MFXClose of parent session fails.
To make sure that child sessions are closed before closing a parent
session,
Parent context needs to manage child sessions and close them first when
it's released.
https://bugzilla.gnome.org/show_bug.cgi?id=793412
Currently a gst buffer has one mfxFrameSurface when it's allocated and
can't be changed.
This is based on that the life of gst buffer and mfxFrameSurface would
be same.
But it's not true. Sometimes even if a gst buffer of a frame is finished
on downstream,
mfxFramesurface coupled with the gst buffer is still locked, which means
it's still being used in the driver.
So this patch does this.
Every time a gst buffer is acquired from the pool, it confirms if the
surface coupled with the buffer is unlocked.
If not, replace it with new unlocked one.
In this way, user(decoder or encoder) doesn't need to manage gst buffers
including locked surface.
To do that, this patch includes the following:
1. GstMsdkContext
- Manages MSDK surfaces available, used, locked respectively as the
following:
1\ surfaces_avail : surfaces which are free and unused anywhere
2\ surfaces_used : surfaces coupled with a gst buffer and being used
now.
3\ surfaces_locked : surfaces still locked even after the gst buffer
is released.
- Provide an api to get MSDK surface available.
- Provide an api to release MSDK surface.
2. GstMsdkVideoMemory
- Gets a surface available when it's allocated.
- Provide an api to get an available surface with new unlocked one.
- Provide an api to release surface in the msdk video memory.
3. GstMsdkBufferPool
- In acquire_buffer, every time a gst buffer is acquired, get new
available surface from the list.
- In release_buffer, it confirms if the buffer's surface is unlocked or
not.
- If unlocked, it is put to the available list.
- If still locked, it is put to the locked list.
This also fixes bug #793525.
https://bugzilla.gnome.org/show_bug.cgi?id=793413https://bugzilla.gnome.org/show_bug.cgi?id=793525
Since there is already an "adaptive-B" option, just
use boolean property for B-pyramid enabling.
Fixme: Not sure whether this can be supported in vp8 and vp9.
It could be possible through GPB (b without backward ref) but
can't verify currently. We can move this as common property
once verified with vp8 and vp9 without breaking any backward
compatibility.
https://bugzilla.gnome.org/show_bug.cgi?id=791637
Add a new property "trellis" to enable trellis quantization.
Keeping trellis as a flag value (which is boolean for gst x264 enc element)
since it is possible to enable/disable this seperately for
I,P and B frames through MediaSDK ext option headers.
The subclass implementations always need to inform base-encoder
if it requires the inclusion of Extend Header buffers (mfxExtCodingOption2
and mfxExtCodingOption3).
https://bugzilla.gnome.org/show_bug.cgi?id=791637
This option controls down sampling in look ahead bitrate
control mode. According to spec it is only supported in AVC.
Fixme: Probably HEVC also have support for this in recent
MSDK versions. We could move the enumeration types to common
header usable for multiple codecs.
https://bugzilla.gnome.org/show_bug.cgi?id=791637
MediaSDK has support for a number of rate control algorithms.
Adding all possible options to the property rate-control.
Fixme1: In case of failure, currently we don't have a proper method
to show which rate-control has been failed. It could be better
to add some extensive validation on EncQuery output in case of error.
Unfortunately, not all ratecontrol methods are supported by every codecs
and we don't have the dynamic detection of supported ratecontrol methods yet.
https://bugzilla.gnome.org/show_bug.cgi?id=791637
We have the property "i-frames" to set the IDR interval in a
gop. Unfortunately MSDK HEVC encoder behaves bit differently
for IdrInterval field, IdrInteval == 1 indicate every
I-frame should be an IDR (which is IdrInterval == 0 for other codecs),
IdrInteval == 2 means every other I-frame is an IDR
(which is IdrInterval == 1 for other codecs) etc.
So we generalize the behaviour of property "i-frames" by
incrementing the value by one in each case (only for HEVC).
https://bugzilla.gnome.org/show_bug.cgi?id=791637
The base encoder common properties are not valid for
mjpeg encoder where there is no motion compensation or rate control.
Delaying the property installation on the base gobject
untill the subclass class_init get invoked.
https://bugzilla.gnome.org/show_bug.cgi?id=791637
The gst-msdk decoders prefer packetized streams as input
and in this case we can avoid unnecessary input bitstream copy
to mfxBitstream. This works fine for codecs like h264 where
we only support byte-stream with au alignment. Other format
conversions should be done thorugh parsers. But this won't work
for codecs like vc1 where we don't have an autoplugged parser.
Even the parser is not capable to do format conversions.
Packetizing through base decoders parse() routine will bring a
lot of uncecessary of complexities and codecparser libraray dependency.
So we just use an interal gst_adaper to keep track of bitstream
which is not consumed by msdk durig AsynchronusDecoding.
This adapter will get used only if subclass implementations
set the "is_packetized" to FALSE for msdk base encoder.
https://bugzilla.gnome.org/show_bug.cgi?id=792589
Adding Simple and Main profiles decode support.
Currently msdkvc1dec is not capable to handle the codec_data,
only instream headers are supported. Also msdk vc1 decoder
expecting instream with Sequence header as per SMPTE 421M Annex L.
Most of the decdoebin/playbin pipeline won't work with the above
constraints
because vc1parse is still not an autoplug element.
Only way to make mskdvc1dec work is by connecting a vc1parse
as an upstream element.
https://bugzilla.gnome.org/show_bug.cgi?id=792589
Use drm render node as the first choice of device node file.
Fall backs to use drm primary (/dev/dri/card[0-9])
if there is no render node available
Basic logic is inherited from gstreamer-vaapi, but using
gudev API rather than libudev directly.
Added gudev library as dependency for msdk.
https://bugzilla.gnome.org/show_bug.cgi?id=791599
1\ If downstream's pool is MSDK bufferpool,
2\ If there's shared GstMsdkContext in the pipeline,
a decoder decides to use video memory.
This policy should be improved to handle more cases.
https://bugzilla.gnome.org/show_bug.cgi?id=790752
In case that pipeline is like ".. ! decoder ! encoder ! ..." with using
video memory,
decoder needs to know the async depth of the following msdk element so
that it could
allocate the correct number of video memory.
Otherwise, decoder's memory is exhausted while processing.
https://bugzilla.gnome.org/show_bug.cgi?id=790752
How to share/create GstMsdkcontext is the following:
- Search GstMsdkContext if there's in the pipeline.
- If found, check if it's decoder, encoder or vpp by job type.
- If it's same job type, it creates another instance of
GstMsdkContext
with joined-session.
- Otherwise just use the shared GstMsdkContext.
- If not found, just creates new instance of GstMsdkContext.
https://bugzilla.gnome.org/show_bug.cgi?id=790752
According to the driver's instruction, if there are two or more encoders
or decoders in a process, the session should be joined by
MFXJoinSession.
To achieve this successfully by GstContext, this patch adds job type
specified if it's encoder, decoder or vpp.
If a msdk element gets to know if joining session is needed by the
shared context,
it should create another instance of GstContext with joined session,
which
is not shared.
https://bugzilla.gnome.org/show_bug.cgi?id=790752
1\ In decide_allocation, it makes its own msdk bufferpool.
- If downstream supports video meta, it just replace it with the msdk
bufferpool.
- If not, it uses the msdk bufferpool as a side pool, which will be
decoded into.
and will copy it to downstream's bufferpool.
2\ Decide if using video memory or system memory.
- This is not completed in this patch.
- It might be decided in update_src_caps.
- But tested for both system memory and video memory cases.
https://bugzilla.gnome.org/show_bug.cgi?id=790752
1\ Proposes msdk bufferpool to upstream.
- If upstream has accepted the proposed msdk bufferpool,
encoder can get msdk surface from the buffer directly.
- If not, encoder get msdk surface its own msdk bufferpool
and copy from upstream's frame to the surface.
2\ Replace arrays of surfaces with msdk bufferpool.
3\ In case of using VPP, there should be another msdk bufferpool
with NV12 info so that it could convert first and encode.
Calls gst_msdk_set_frame_allocator and uses video memory only on linux.
and uses system memory on Windows until d3d allocator is implemented.
https://bugzilla.gnome.org/show_bug.cgi?id=790752
Implements 2 memory allocators:
1\ GstMsdkSystemAllocator: This will allocate system memory.
2\ GstMsdkVideoAllocator: This will allocate device memory depending
on the platform. (eg. VASurface)
Currently GstMsdkBufferPool uses video allocator currently by default
only on linux. On Windows, we should use system memory until d3d
allocator
is implemented.
https://bugzilla.gnome.org/show_bug.cgi?id=790752
Implements msdk frame allocator which is required from the driver.
Also makes these functions global so that GstMsdkAllocator could use
the allocated video memory later and couple with GstMsdkMemory.
GstMsdkContext keeps allocation information such as mfxFrameAllocRequest
and mfxFrameAllocResponse after allocation.
https://bugzilla.gnome.org/show_bug.cgi?id=790752
Makes GstMsdkContext to be a descendant of GstObject so that
we could track the life-cycle of the session of the driver.
Also replaces MsdkContext with this one.
Keeps msdk_d3d.c alive for the future.
https://bugzilla.gnome.org/show_bug.cgi?id=790752