Why your kernel’s drm.edid_firmware parameter doesn’t work anymore in libvirt environments

The enterprise world is one that is fond of funny, crappy hardware gadgets such as KVM (Keyboard-Video-Mouse) switches, often in the matrix variant: you have one or more consoles connected matrix-style to different computers, for the obvious benefit of not having to connect keyboard, mouse and monitor when needed, let alone have a set connected to each computer all the time. More advanced examples of this species even allow remote access via IP, maybe even in the modern variant through a Web browser, obsoleting the need for a dedicated proprietary console. But advanced as they may be, they usually all fail when it comes down to one elementary feature: DDC simulation.

Display Data Channel (DDC) is an I2C-based means of communication between a computer and a connected monitor to determine its capabilities in implementing “plug ‘n play”. Part of the information transmitted is Extended Display Identification Data (EDID) which contains modelines, information required by a graphics driver in order to implement different screen resolutions with the right timings. This is what, in the ideal case, makes your computer drive a connected display automatically in its native screen resolution.
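To make the EDID part a bit more tangible, here is a minimal Python sketch that pulls the resolution of the first detailed timing descriptor (the "preferred" mode) out of a 128-byte EDID base block. The function name preferred_resolution is made up for this illustration; the byte offsets follow the EDID 1.3/1.4 base block layout as I understand it, and only the header and checksum are validated.

```python
from __future__ import annotations

# Fixed 8-byte signature that starts every EDID base block.
EDID_HEADER = bytes([0x00, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00])


def preferred_resolution(edid: bytes) -> tuple[int, int]:
    """Return (width, height) from the first detailed timing descriptor.

    Hypothetical helper for illustration; it only looks at the base block.
    """
    if len(edid) < 128:
        raise ValueError("EDID base block must be at least 128 bytes")
    base = edid[:128]
    if base[:8] != EDID_HEADER:
        raise ValueError("missing EDID header")
    # All 128 bytes of the base block must sum to 0 modulo 256.
    if sum(base) % 256 != 0:
        raise ValueError("bad EDID checksum")

    dtd = base[54:72]  # first of four 18-byte descriptors
    if dtd[0] == 0 and dtd[1] == 0:
        # A zero pixel clock marks a display descriptor, not a timing.
        raise ValueError("first descriptor is not a detailed timing")

    # Active pixels/lines: low 8 bits plus 4 high bits in a shared byte.
    width = dtd[2] | ((dtd[4] & 0xF0) << 4)
    height = dtd[5] | ((dtd[7] & 0xF0) << 4)
    return width, height
```

On a running system you could feed it the bytes of a connector's edid file in sysfs, for example /sys/class/drm/card0-Virtual-1/edid (the connector name here is only an example, and the file may be empty when no EDID was read).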

Now if you insert a KVM switch and accessories between a computer and a monitor, the monitor is not necessarily connected at the very moment when the graphics driver wants to configure the “right” resolution. And, since there can be multiple users with different monitors that may connect to a computer at any time, the KVM switch must perform some sort of DDC emulation if the operating system is to be able to successfully configure the screen. Simply forwarding the DDC information from the connected monitor each time you switch to a computer is not really going to help, since you can’t expect the computers to dynamically reconfigure their video output. Some KVM switches offer dongles as an accessory that emulate a given EDID set with a given resolution, thus locking a computer’s resolution at e.g. 1280×1024 pixels no matter the capabilities of the actually connected monitor. But these dongles are not necessarily cheap, and even if they are, this is a solution that does not really scale as the number of connected systems grows.

Linux wouldn’t be Linux if there wasn’t a cheaper and actually more elegant way in the form of the drm.edid_firmware parameter (in kernels <4.15: drm_kms_helper.edid_firmware). Using this parameter you can specify the path to an alternative EDID table that will override whatever the monitor or, in our case, the connected KVM switch will supply. You can use your own custom EDID table if you place it in initrd/initramfs, but there’s also a set of built-in override tables named after the desired resolution. For example, specifying drm.edid_firmware=edid/1280x1024.bin (a magic pseudo path, this file does not actually exist in initrd/initramfs) will give you a 1280×1024 pixels console framebuffer and Xorg screen.
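If you want to check from a script which override a system was booted with, a rough Python sketch like the following can pick the parameter out of a kernel command line. The function name edid_firmware_overrides is hypothetical. The optional "connector:" prefix and the comma-separated list are how I recall the kernel documentation describing the parameter format, so treat that part as an assumption.

```python
from __future__ import annotations

# Current parameter name and the pre-4.15 name of the same option.
PARAM_NAMES = ("drm.edid_firmware", "drm_kms_helper.edid_firmware")


def edid_firmware_overrides(cmdline: str) -> dict[str, str]:
    """Map connector name ('' means all connectors) to the requested EDID file.

    Hypothetical helper; assumes the value format [connector:]file[,...].
    """
    overrides: dict[str, str] = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        if not sep or name not in PARAM_NAMES:
            continue
        for entry in value.split(","):
            # rpartition yields an empty separator when no ':' is present.
            connector, colon, path = entry.rpartition(":")
            overrides[connector if colon else ""] = path
    return overrides
```

Typical use would be edid_firmware_overrides(open("/proc/cmdline").read()), which for the example above should yield a single entry mapping the empty connector name to edid/1280x1024.bin.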

This conjunction is important if you look at the apparent alternative of using the video= parameter. video= allows you to configure the screen resolution of the console framebuffer in a graphics driver-independent way. However, this will have no effect on Xorg because Xorg looks at the obtained modelines and, given no manual configuration, by default uses the first modeline to determine its resolution. In other words, since video=, in contrast to drm.edid_firmware=, does not change the EDID table and thus the modelines, X will stick to whatever deficient EDID information your KVM switch supplies (unless you specify a different resolution in an xorg.conf snippet). The whole fun behind the built-in EDID override tables is that they differ in this very first modeline: edid/1280x1024.bin will have a 1280×1024 resolution in the first modeline, edid/1920x1080.bin a 1920×1080 resolution and so on. You can verify this easily by looking at Xorg’s logfile.

Up to and including kernel version 4.14 drm.edid_firmware even worked when testing a deployment in a virtualized (libvirt/qemu) environment: even if the actual timings in the EDID table had little effect you could test whether you got the desired resolution. This is what it would look like in dmesg:

[drm] Got built-in EDID base block and 0 extensions from "edid/1280x1024.bin" for connector "Virtual-1"
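For automated deployment tests, a small Python sketch can scan the kernel log for this message. The function name loaded_edid_overrides is made up, and the pattern is modelled on the single log line shown above; the "external" variant is what I would expect for a file loaded from initramfs, but that is an assumption.

```python
from __future__ import annotations

import re

# Pattern modelled on the dmesg line quoted above.
EDID_LOADED = re.compile(
    r'Got (?P<kind>built-in|external) EDID base block and \d+ extensions? '
    r'from "(?P<file>[^"]+)" for connector "(?P<connector>[^"]+)"'
)


def loaded_edid_overrides(log_text: str) -> list[tuple[str, str]]:
    """Return (connector, file) pairs for every EDID override found in the log."""
    return [
        (m.group("connector"), m.group("file"))
        for m in EDID_LOADED.finditer(log_text)
    ]
```

An empty result on a system booted with the parameter set is exactly the symptom described next.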

I used this in a SLES12 SP2 (kernel 4.4.21) deployment that was to be updated to SLES12 SP5 (kernel 4.12.14) — where suddenly the kernel parameter did not seem to work anymore. The dmesg output would not show the EDID table even being loaded anymore. Huh ?!

The hours I spent digging into this issue and the involved kernels’ sources are the reason for this blog post. The culprit, if you want to call it that, is the patch “drm: handle override and firmware EDID at drm_do_get_edid() level” by Jani Nikula of Intel fame, dated 2017-09-12, which SUSE backported from kernel 4.15 into their SP5 kernel. Jani writes very usable commit messages, but I’ll nevertheless detail what changed in the relevant code paths:

Prior to the patch:

drm_probe_helper.c’s drm_helper_probe_single_connector_modes() function would always try to load any EDID override table specified with the edid_firmware kernel command line parameter.
If that didn’t yield any modelines (e.g. because the parameter wasn’t specified), it would look for an EDID override table passed via the sysfs edid_override node (which I will not detail here).
If that still didn’t yield any modelines, it would call the active graphics driver’s get_modes() function, which, in turn, usually called drm_edid’s drm_get_edid() function, which implemented the physical EDID reading.

After the patch, i.e. since upstream kernel 4.15 and also SLES12 SP5, the responsibilities have been moved around a bit:

drm_probe_helper.c’s drm_helper_probe_single_connector_modes() now always calls the graphics driver’s get_modes() function.
As above, almost all graphics drivers call drm_edid’s drm_get_edid() function.
It is now drm_get_edid() which (together with helper functions), prior to doing physical EDID reading, checks for an EDID override table supplied via either the edid_firmware kernel command line parameter or the sysfs edid_override node.

The emphasis is on almost — there is a group of graphics drivers that do not call drm_get_edid(): drivers for virtualized graphics such as bochs_drm (used for libvirt’s “VGA” and “Bochs” models), qxl (“QXL”) and virtio (“Virtio”). Since there is no bus to read data from it never made sense for them to call that function. And arguably supplying a custom EDID file should not be necessary for virtual graphics anyway… except if you want to lock a certain screen resolution and not use the combination of video= and a xorg.conf snippet.
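The difference between the two code paths is easier to see in a toy model. The following Python sketch is not kernel code: all function names except the ones borrowed from the text are invented, the two override sources (edid_firmware and sysfs edid_override) are folded into a single override argument, and EDID tables are represented by plain strings.

```python
from __future__ import annotations

from typing import Callable, Optional

GetModes = Callable[[Optional[str], Optional[str], bool], Optional[str]]


def drm_get_edid(monitor: Optional[str], override: Optional[str],
                 patched: bool) -> Optional[str]:
    """Toy drm_get_edid(): only the patched version checks the override."""
    if patched and override:
        return override
    return monitor  # stands in for the physical DDC read


def physical_get_modes(monitor, override, patched):
    """Driver for real hardware: delegates to drm_get_edid()."""
    return drm_get_edid(monitor, override, patched)


def virtual_get_modes(monitor, override, patched):
    """Virtual driver (bochs_drm, qxl, virtio in the text): no bus, no drm_get_edid()."""
    return "driver default modes"


def probe(get_modes: GetModes, monitor: Optional[str],
          override: Optional[str], patched: bool) -> Optional[str]:
    """Toy drm_helper_probe_single_connector_modes()."""
    if not patched and override:
        # Before the patch the probe helper loaded the override itself.
        return override
    # After the patch everything is left to the driver's get_modes().
    return get_modes(monitor, override, patched)
```

With override set to "edid/1280x1024.bin", probe() returns the override for physical_get_modes in both variants and for virtual_get_modes with patched=False, but falls back to "driver default modes" for virtual_get_modes with patched=True, which mirrors what I saw after the update.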

It might be possible that in later kernel versions this code path was changed again so that EDID loading works with virtualized graphics drivers once more, but in this particular project I’m stuck with SLES12 SP5 for now, so I looked further into how I could still get my desired 1280×1024 resolution in virtual environments as well. Turns out the bochs_drm driver has two module options, defx and defy, which can be used to achieve the same effect as edid_firmware: to change which modeline is returned first. Adding bochs_drm.defx=1280 bochs_drm.defy=1024 to the kernel command line fixes the problem as long as you can live with a “VGA” video adapter model instead of the usually preferable “Virtio” or “QXL”; neither of these two knows similar parameters.
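To confirm that the two options actually arrived at the driver, you could read them back from sysfs. This Python sketch assumes that bochs_drm exposes defx and defy as readable files under /sys/module/bochs_drm/parameters, which I have not verified for every kernel version; the function name bochs_default_mode is hypothetical, and it simply returns None when the files are missing.

```python
from __future__ import annotations

from pathlib import Path
from typing import Optional


def bochs_default_mode(
    params_dir: str = "/sys/module/bochs_drm/parameters",
) -> Optional[tuple[int, int]]:
    """Return (defx, defy) as seen by the driver, or None if unavailable.

    The default path is an assumption about where the module parameters
    show up; pass another directory if your kernel names the module differently.
    """
    base = Path(params_dir)
    try:
        defx = int((base / "defx").read_text().strip())
        defy = int((base / "defy").read_text().strip())
    except (OSError, ValueError):
        return None
    return defx, defy
```

After booting with bochs_drm.defx=1280 bochs_drm.defy=1024, the expectation would be a result of (1280, 1024).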

I’ll close with a short glimpse at those still using VirtualBox: you’ll probably never see this phenomenon of edid_firmware not working anymore because there are no virtualized graphics drivers for VirtualBox. It’s been a long time since I used it, but I remember that it has to emulate “real”, ancient hardware. I wonder if it simulates DDC/EDID, though.