Experimental ffmpeg-vaapi plugin

w23

New Member
So I really wanted to stream Clustertruck in 1440p60, so I spent a whole weekend reading ffmpeg sources instead.
As a result, there's a very naive ffmpeg-vaapi plugin (basically a copy of ffmpeg-nvenc with VAAPI-specific hw frame upload added) in the obs-ffmpeg module in this branch: https://github.com/w23/obs-studio/tree/ffmpeg-vaapi
The exact commit adding it: https://github.com/w23/obs-studio/commit/9c70ee2347285c4d7e087106c565ba5b5bbe16a6

It is in a rather early stage:
  • no GUI controls
  • display/device name is hardcoded to ":0" (see the sketch after this list)
  • memory management/leaks were not considered at all
  • actual performance wasn't measured
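
About the hardcoded display: here's a minimal sketch of how I imagine making it configurable. This is not what the plugin does today, just an assumption built on FFmpeg's hwcontext API, so that either an X11 display name like ":0" or a DRM render node like "/dev/dri/renderD128" could be passed in:
Code:
#include <stdio.h>
#include <libavutil/error.h>
#include <libavutil/hwcontext.h>

/* Open a VAAPI device from a user-supplied string (hypothetical helper). */
static AVBufferRef *create_vaapi_device(const char *device)
{
    AVBufferRef *hw_device = NULL;
    int err = av_hwdevice_ctx_create(&hw_device, AV_HWDEVICE_TYPE_VAAPI,
                                     device, NULL, 0);
    if (err < 0) {
        char msg[AV_ERROR_MAX_STRING_SIZE];
        av_strerror(err, msg, sizeof(msg));
        fprintf(stderr, "failed to open VAAPI device '%s': %s\n", device, msg);
        return NULL;
    }
    /* The caller owns the reference; release with av_buffer_unref(). */
    return hw_device;
}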

I'd appreciate any kind of feedback if anyone is interested.

My experience playing with it for a few hours so far:
  1. FFmpeg@master, Mesa 13.0.3, VA-API 0.39 (libva 1.7.3), kernel 4.8.12 on an AMD Radeon R9 Fury X: there are weird issues with the video it produces. Basically, ffmpeg+h264_vaapi emits packets very rarely, only a couple per second. That could be fine by itself, but this packet rate equals the apparent framerate of the produced video (1440p2 is not what I wanted!). Also, I don't visually see any P-frames at all. Tuning gop_size or other parameters doesn't improve the packet rate. This hardware also requires b-frames to be set to zero and the VAAPI_DISABLE_INTERLACE=1 environment variable to be set.
  2. On other hardware (an Intel iGPU in a Dell XPS 13 from 2015) that I've tested very briefly, the stream is also low-fps and jittery, but not as badly (likely VAAPI is fine and the machine itself is just rather weak). And P-frames are clearly visible.

The thing is that this jitter is not OBS-specific. For example, running a screen capture with ffmpeg itself produces the same (if not worse) result:
Code:
ffmpeg \
    -loglevel debug \
    -f x11grab -video_size 2560x1440 -framerate 60 -x 1920 -i :0.0 \
    -vaapi_device ":0" \
    -vf 'format=nv12,hwupload' -map 0:0 -threads 8 -aspect 16:9 -y -f mp4 \
    -bf 0 -qp 42 -quality 8 \
    -vcodec h264_vaapi -profile 100 \
    test-vaapi.mp4

Testing instructions.
0. libva is obviously needed.
1. A rather fresh ffmpeg with h264_vaapi encoding support is required. I took the latest master (and I probably shouldn't have done that! It would be ironic if the issues above are due to a dev-unstable ffmpeg).
If you need to compile ffmpeg yourself, you'd need at least the following options for its ./configure:
Code:
--enable-shared --enable-pic --disable-static \
--enable-hwaccel=h264_vaapi \
--enable-filter=hwupload,scale \
--enable-encoder=h264_vaapi,aac \
--enable-muxer=h264,mp4,flv,md5 \
--enable-protocol=file,rtmp \
--enable-decoder=rawvideo

Also for cmdline ffmpeg testing:
Code:
--enable-indev=v4l2,x11grab_xcb,xcbgrab \
--enable-parser=mjpeg \
--enable-decoder=mjpeg

Also also: don't forget to set the envvar PKG_CONFIG_PATH=<where-you-installed-ffmpeg>/lib/pkgconfig before you run cmake on OBS.
2. Build OBS and run it as usual. Go to the advanced output settings and pick the VAAPI encoder.
 
Last edited:

Lain

Forum Admin
Lain
Forum Moderator
Developer
Just popping in to say that it's awesome that you wrote this. Unfortunately, my dedicated Linux machine is down at the moment, so I can't test, but I'm going to have some other people try this out in the meantime and see how it runs for them on different hardware.
 

w23

New Member
Thanks! I have spent a lot more time on this HW-encoding-on-Linux problem. Here's what I found.

TL;DR: I could get AMD GPU to encode h264 only using gstreamer-vaapi.

Mesa (as of 13.0.3) does support hardware h264 encoding on AMD GPUs. However, there are limitations: the vaDeriveImage() function always fails, and there is no support for B-frames or packed headers. These are pretty much hardcoded in the Mesa VAAPI state tracker, so the hardware is not even asked about its capabilities. I have no idea what the lack of packed headers implies (I haven't read the MPEG-4 AVC spec yet), and I'm also not sure about the implications of not having B-frames for streaming games. No vaDeriveImage is also not fatal, but it means that we can't directly map GPU memory, so there's a performance loss from yet another memcpy (and a more complicated codepath).
And another thing: there's a bug where the AMD driver treats everything as interlaced (despite what it's told via the libva API), so one should always have VAAPI_DISABLE_INTERLACE=1 in the environment.
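
For reference, this is roughly how a client like FFmpeg can ask the driver what it actually advertises through libva. It's a sketch I wrote for illustration (not code from the plugin); it assumes a VADisplay was already set up with vaGetDisplay()/vaInitialize(), and it reads the B-frame limit out of the upper half of the EncMaxRefFrames attribute:
Code:
#include <stdio.h>
#include <va/va.h>

static void probe_h264_encode_caps(VADisplay dpy)
{
    VAConfigAttrib attribs[2] = {
        { .type = VAConfigAttribEncPackedHeaders },
        { .type = VAConfigAttribEncMaxRefFrames },
    };

    /* Ask the driver to fill in .value for the H.264 encode entrypoint. */
    VAStatus st = vaGetConfigAttributes(dpy, VAProfileH264High,
                                        VAEntrypointEncSlice, attribs, 2);
    if (st != VA_STATUS_SUCCESS) {
        fprintf(stderr, "vaGetConfigAttributes: %s\n", vaErrorStr(st));
        return;
    }

    /* Packed headers: a bitmask of VA_ENC_PACKED_HEADER_* flags, 0 = none. */
    if (attribs[0].value == VA_ATTRIB_NOT_SUPPORTED || attribs[0].value == 0)
        printf("driver accepts no packed headers\n");

    /* Low 16 bits = max list0 refs, high 16 bits = list1 refs (B-frames). */
    if (attribs[1].value == VA_ATTRIB_NOT_SUPPORTED ||
        ((attribs[1].value >> 16) & 0xffff) == 0)
        printf("no B-frame support, keep b-frames at 0\n");
}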

I couldn't make any version of FFmpeg use VAAPI correctly. The release (3.2.2) version just crashes. Current master complains that packed headers aren't there and produces the low-framerate output I talked about above.

The official libva tests/samples also don't work, because they expect vaDeriveImage to work.

There was already another VAAPI plugin for OBS made a few years ago: https://github.com/reboot/obs-studio/tree/vaapi-h264/plugins/linux-vaapi. Making it compile and load with contemporary OBS is trivial, but I couldn't make it work on my driver: it expects h264 packed headers, which aren't there.

The only thing that does work, and seems to work sufficiently well, is gstreamer-vaapi. This command produces a valid 1440p60 video while using under 15% CPU:
Code:
gst-launch-1.0 -e ximagesrc display-name=:0 use-damage=0 startx=1920 starty=0 endx=$((1920+2560-1)) endy=1439 !\
    multiqueue ! video/x-raw,format=BGRx,framerate=60/1 ! videoconvert ! video/x-raw,format=I420,framerate=60/1 !\
    multiqueue ! vaapih264enc dct8x8=true ! h264parse ! multiqueue ! matroskamux name=muxer muxer. ! progressreport name=Rec_time !\
    filesink location=/tmp/gstreamer-video.mkv
However:
- I haven't tried to use it for longer than a few minutes.
- Capturing frames still interferes with games. E.g. Clustertruck (which triggered this whole endeavor!) still drops frames when capturing. This needs to be profiled.

VAAPI seems to have no way of accessing the hardware framebuffer. It can use a GLX context and texture for output, but not for input. Maybe it's possible to use lower-level DRI2 APIs to do something like that, but this is way beyond my immediate capabilities.

Or maybe it would be possible to write a special Xorg compositor that could capture frames at a lower level and more efficiently. I have no idea.

So, a conclusion:
1. The only way forward for me is to make yet another VAAPI plugin for OBS, this time based on gstreamer. From the looks of it, gstreamer seems to be documented and sane (yes, I am looking at you, FFmpeg), so maybe this or next weekend I will come up with something (a rough sketch of the idea follows this list).
2. I need to profile the hell out of all this if I ever want to share with my friends how bad I am at Clustertruck.
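
To make item 1 a bit more concrete, here's a very rough sketch of the direction I have in mind (nothing I've actually built or tested yet; the pipeline string, element names and frame size are just assumptions): push raw frames into vaapih264enc through appsrc and pull the encoded packets back out through appsink.
Code:
#include <gst/gst.h>
#include <gst/app/gstappsrc.h>
#include <gst/app/gstappsink.h>

int main(int argc, char **argv)
{
    gst_init(&argc, &argv);

    GError *err = NULL;
    GstElement *pipe = gst_parse_launch(
        "appsrc name=src is-live=true format=time "
        "caps=video/x-raw,format=I420,width=2560,height=1440,framerate=60/1 "
        "! vaapih264enc ! h264parse ! appsink name=sink", &err);
    if (!pipe) {
        g_printerr("pipeline error: %s\n", err->message);
        return 1;
    }

    GstElement *src  = gst_bin_get_by_name(GST_BIN(pipe), "src");
    GstElement *sink = gst_bin_get_by_name(GST_BIN(pipe), "sink");
    gst_element_set_state(pipe, GST_STATE_PLAYING);

    /* Push a single dummy I420 frame (2560*1440*3/2 bytes of zeros). */
    GstBuffer *buf = gst_buffer_new_allocate(NULL, 2560 * 1440 * 3 / 2, NULL);
    GST_BUFFER_PTS(buf) = 0;
    GST_BUFFER_DURATION(buf) = gst_util_uint64_scale(1, GST_SECOND, 60);
    gst_app_src_push_buffer(GST_APP_SRC(src), buf);
    gst_app_src_end_of_stream(GST_APP_SRC(src));

    /* Pull whatever the encoder produced until EOS. */
    GstSample *sample;
    while ((sample = gst_app_sink_pull_sample(GST_APP_SINK(sink)))) {
        GstBuffer *out = gst_sample_get_buffer(sample);
        g_print("encoded packet: %" G_GSIZE_FORMAT " bytes\n",
                gst_buffer_get_size(out));
        gst_sample_unref(sample);
    }

    gst_element_set_state(pipe, GST_STATE_NULL);
    gst_object_unref(src);
    gst_object_unref(sink);
    gst_object_unref(pipe);
    return 0;
}
In a real OBS encoder plugin the frames would of course come from OBS's raw video callbacks instead of a dummy buffer.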
 

ZombieMeat

New Member
You are the best! Just created an account to tell you that.

I've been able to test it out on a Dell XPS 13 (Kaby Lake), with a few extra lines added so the per-encoder options can be tweaked.

I would say my experience has been good, although I haven't used it for a long period. The only thing of note is that a higher value for "quality" means faster encoding at lower quality, per "ffmpeg -h encoder=h264_vaapi". And most values are not even supported on the Intel GPU: only 0 and 1 work, while other values cause crashes.
 

Attachments

  • vaapi-options.txt
    2.9 KB · Views: 405

beniwtv

New Member
Nice work! Going to try that out this weekend - I've been hoping someone would implement this!
I'm going to try with an AMD RX 480 8GB - Mesa 13.0.4.
 

ZombieMeat

New Member
Been playing around a bit. It seems like the most significant parameter is QP. Also, specifying a bitrate made FFmpeg behave a bit funky: it forces the bitrate to stay at the specified value even when it doesn't need to. Moreover, when the image changes drastically it needs more bits to encode, so the bitrate rises and the buffer overflows, which seems to degrade the encoding quality quite a bit.

I'm all new to this, so I'm just conjecturing, but instead of setting a bitrate, what if we only set the buffer size to reduce the chance of overflow? Right now the QP and buffer-size combination needs to be tested empirically.
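
If it helps anyone poking at the plugin code, this is roughly what that idea looks like against FFmpeg's API. Just a sketch of my conjecture, not something taken from the plugin; "qp" is h264_vaapi's private constant-QP option, and whether the driver still honours the buffer size in constant-QP mode is exactly the thing that needs testing:
Code:
#include <libavcodec/avcodec.h>
#include <libavutil/opt.h>

/* "enc" is the AVCodecContext allocated for the h264_vaapi encoder. */
static void configure_qp_only(AVCodecContext *enc, int qp, int vbv_bits)
{
    enc->bit_rate       = 0;        /* no target bitrate at all */
    enc->rc_buffer_size = vbv_bits; /* only cap the VBV buffer, in bits */
    enc->max_b_frames   = 0;        /* Mesa's encoder offers no B-frames */

    /* Constant QP via the encoder's private option. */
    av_opt_set_int(enc->priv_data, "qp", qp, 0);
}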

In case anyone is interested, I made a PKGBUILD (Arch Linux) for my test setup.
 

Attachments

  • obs-vaapi.zip
    6.3 KB · Views: 227

Xaymar

Active Member
The easiest way to explain that is to know that the VCE part responds to certain parameters. There are several which affect what values are picked on the hardware, but let's go with the absolute default one (Usage: Transcoding). Since I don't know what Mesa's VAAPI integration actually sets, this is mostly from actual usage on Windows, which should be identical since, in my experience, it maps almost directly to the hardware (plus some GPU transfer/conversion stuff).

There are three main Rate Control Methods that VCE has: Constant QP, Constant Bitrate and Variable Bitrate. Variable Bitrate has a Peak Constrained and a Latency Constrained version (the latter is great for recording with no impact, the former is great for actual quality). Constant Bitrate is the only one of these where VCE uses Filler Data, though normally only if enabled. If Constant Bitrate is used without Filler Data, it behaves like Peak Constrained Variable Bitrate, except that the Target Bitrate acts as the Peak Bitrate and the Peak Bitrate setting is ignored.

So in order to actually get Variable Bitrate behaviour, FFmpeg would need to be configured to use VBR mode. I'm not sure if Mesa's VAAPI exposes this.

As for the Buffer Size (VBV Buffer Size): if you want your bitrate to be matched closely, you'd want a value between 1/FPS*Bitrate and 8/FPS*Bitrate. The lower you go, the less space an individual packet can take up (which directly affects I-, P- and B-frame quality).
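
To make the numbers concrete, a quick example with made-up values (6 Mbit/s at 60 FPS; the 1 to 8 frame window is my rule of thumb, not a hard constant):
Code:
#include <stdio.h>

int main(void)
{
    const long long bitrate = 6000000; /* 6 Mbit/s target, example value */
    const int fps = 60;

    long long vbv_min = bitrate / fps;     /* ~1 frame's worth: 100000 bits */
    long long vbv_max = 8 * bitrate / fps; /* ~8 frames' worth: 800000 bits */

    printf("VBV buffer between %lld and %lld bits\n", vbv_min, vbv_max);
    return 0;
}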
 

beniwtv

New Member
So I got around to testing this now, and after spending the whole day on it I have to report success!

I used the PKGBUILD files that @ZombieMeat provided above - although adapted for Docker and Ubuntu.

With FFmpeg 3.2.2 it crashes - just like @w23 reported. It also complains about no B-frames and crashes even if you set them to 0.
With FFmpeg master from today (5 Feb 2017), it no longer complains about B-frames and does not crash.

Actually, it seems to encode just fine - and CPU usage does stay low - the same as if you're not encoding :)
I did notice some text artifacts - but I have not yet played around with the options provided (I just left them at their defaults).

So my final configuration was:
MESA 17.0.1-devel from Padoka ppa
Docker using Ubuntu 16.04 as base
AMD RX 480

I have attached my Docker files in case anyone wants to have a quick way of testing :)
NOTE: The start script is currently hard-coded for display :0 and UID 1000

EDIT: Made a video
https://www.youtube.com/watch?v=m8OBFLaNl5Q
 

Attachments

  • OBS.zip
    6.9 KB · Views: 179
Last edited:

w23

New Member
My apologies for the long absence and lack of progress on my plugin.
The thing is, I figured out that VAAPI doesn't help with the capture performance problems I have on my system; the actual bottleneck is somewhere else (likely XSHM). Therefore, I don't think I will be making any progress here soon. If anyone wants to take the plugin and make it production-ready, be my guest. I believe the only major thing left to do is to add proper GUI controls. The license is whatever license OBS is under.

I want to do thorough Linux screen-capture performance research (including things like custom compositors, Wayland and friends) in the coming months. But I cannot promise anything, as there is just too much stuff on my plate already.
 
Tried to compile, but it fails with the following error:
Code:
/home/user/Downloads/obs-studio-vaapi-h264/plugins/linux-vaapi/surface-queue.h:3:19: fatal error: va/va.h: No such file or directory
Makefile:149: recipe for target 'all' failed
make: *** [all] Error 2

This is on Ubuntu 16.04 with ffmpeg 3.2.4.

Fixed: I forgot to install libva-dev.
 
Last edited:

Bleuzen

New Member
Works very well on Arch with an i7-7700 iGPU (Intel® HD Graphics 630). I just had to change the display/device name from ":0" to "/dev/dri/renderD128".
Thanks! Now I can record/stream without much CPU usage.
Please implement this in OBS (with a GUI) ... that would be great ;D
 
Going to try this out with Ubuntu 16.10 tonight. It has to be better than the horrible experience I had trying to get QSV support working. At least this doesn't require compiling a new kernel.
 
I managed to get the reboot/vaapi-h264 version working with the master branch of obs-studio. No changes were necessary to get it to compile besides what was in his base branch. It would be fantastic if somebody could get this incorporated into the main obs-studio code base. I can create a branch and pull request, but we really should contact Reboot to get permission to include his work in the main distribution.

My CPU usage went from 19-20 percent down to between 5% and 8% using the VAAPI-H264 encoder settings.

Update: I created a simple blog entry with a link to Reboot's code working with the current master branch (19.x), in case anybody is interested. No problems merging the code in and getting it to work. Here is the link.

https://wordpress.com/post/intellectualcramps.wordpress.com/1151
 
Last edited:

Arjen

New Member
I tried the merged git repo, and it all seems to go quite well until I press 'stop recording'. The terminal then shows the following error message:

Code:
info: [ffmpeg muxer: 'adv_file_output'] Writing file '/home/arjen/Videos/2017-06-16_16-09-32.mkv'...
error: [VAAPI encoder]: "vaEndPicture(q->display, q->context)": invalid parameter
error: [VAAPI encoder]: unable to encode frame
error: Error encoding with encoder 'streaming_h264'

My vainfo output is as follows:

Code:
vainfo: VA-API version: 0.40 (libva )
vainfo: Driver version: Intel i965 driver for Intel(R) Haswell Mobile - 1.8.2
vainfo: Supported profile and entrypoints
  VAProfileMPEG2Simple  :   VAEntrypointVLD
  VAProfileMPEG2Simple  :   VAEntrypointEncSlice
  VAProfileMPEG2Main  :   VAEntrypointVLD
  VAProfileMPEG2Main  :   VAEntrypointEncSlice
  VAProfileH264ConstrainedBaseline:   VAEntrypointVLD
  VAProfileH264ConstrainedBaseline:   VAEntrypointEncSlice
  VAProfileH264Main  :   VAEntrypointVLD
  VAProfileH264Main  :   VAEntrypointEncSlice
  VAProfileH264High  :   VAEntrypointVLD
  VAProfileH264High  :   VAEntrypointEncSlice
  VAProfileH264MultiviewHigh  :   VAEntrypointVLD
  VAProfileH264MultiviewHigh  :   VAEntrypointEncSlice
  VAProfileH264StereoHigh  :   VAEntrypointVLD
  VAProfileH264StereoHigh  :   VAEntrypointEncSlice
  VAProfileVC1Simple  :   VAEntrypointVLD
  VAProfileVC1Main  :   VAEntrypointVLD
  VAProfileVC1Advanced  :   VAEntrypointVLD
  VAProfileNone  :   VAEntrypointVideoProc
  VAProfileJPEGBaseline  :   VAEntrypointVLD

Any ideas?
 

cRaZy-bisCuiT

New Member
David Carver said:
Update: I created a simple blog entry with a link to Reboot's code working with the current master branch (19.x), in case anybody is interested. No problems merging the code in and getting it to work. Here is the link.

https://wordpress.com/post/intellectualcramps.wordpress.com/1151
Unfortunately I can't read that blog entry: Wordpress asks me for my login credentials. Could you check that address please?

Also, has someone else managed to merge the patch with the current master of obs? Is there a tutorial somewhere? Thanks to all girls & guys participating here! :)
 

RytoEX

Forum Admin
Forum Moderator
Developer
cRaZy-bisCuiT said:
Unfortunately I can't read that blog entry: Wordpress asks me for my login credentials. Could you check that address please?

Also, has someone else managed to merge the patch with the current master of obs? Is there a tutorial somewhere? Thanks to all girls & guys participating here! :)

I assume @David Carver meant this URL: https://intellectualcramps.wordpress.com/2017/06/08/obs-studio-and-hardware-encoding-for-linux/

There is a pull request on GitHub, which I was able to compile successfully in a VM, but I didn't test it extensively. As far as I know, it's currently waiting on some pretty substantial rewrites.
 