Bug 1178360 - X11 Pixel Garbage in VirtualBox
Status: RESOLVED WORKSFORME
Classification: openSUSE
Product: openSUSE Tumbleweed
Component: Kernel
Version: Current
Hardware: x86-64 openSUSE Tumbleweed
Priority: P5 - None
Severity: Major
Assigned To: openSUSE Kernel Bugs
Reported: 2020-11-02 13:53 UTC by Stefan Hundhammer
Modified: 2021-03-10 12:53 UTC (History)



Attachments
X11 log at /var/log/Xorg.0.log (26.74 KB, text/plain)
2020-11-02 13:55 UTC, Stefan Hundhammer
sudo journalctl -b (131.62 KB, text/plain)
2020-11-02 13:56 UTC, Stefan Hundhammer
Screenshot: Pixel garbage (945.87 KB, image/png)
2020-11-02 14:01 UTC, Stefan Hundhammer
Screenshot: Correct Xfce4 Desktop (358.32 KB, image/png)
2020-11-02 14:13 UTC, Stefan Hundhammer
Output of rpm -qa (78.47 KB, text/plain)
2020-11-10 16:57 UTC, Stefan Hundhammer
Output of rpm -qa --last (174.90 KB, text/plain)
2020-11-10 16:58 UTC, Stefan Hundhammer
Output of sudo journalctl -b (117.98 KB, text/plain)
2020-11-10 16:59 UTC, Stefan Hundhammer
Output of sudo journalctl -b -1 (booting with kernel 5.8.18; no pixel garbage) (148.32 KB, text/plain)
2020-11-10 17:00 UTC, Stefan Hundhammer
Output of lsmod (3.04 KB, text/plain)
2020-11-10 17:04 UTC, Stefan Hundhammer
Output of lsmod (3.41 KB, text/plain)
2020-11-10 17:10 UTC, Stefan Hundhammer
Screenshot: VirtualBox display settings (57.22 KB, image/png)
2020-11-11 09:05 UTC, Stefan Hundhammer
Screenshot: VirtualBox 6.1.10 display settings (84.80 KB, image/png)
2020-11-11 09:46 UTC, Stefan Hundhammer

Description Stefan Hundhammer 2020-11-02 13:53:31 UTC
With the latest Tumbleweed (at least since the middle of last week), I get half a screen full of pixel garbage in a VirtualBox VM, and the screen coordinates seem to be off. Keyboard input is also not always echoed correctly; it arrives at the application, but I only see the first one or two key presses.

The machine seems to work fine otherwise; I can ssh into it and work normally. But the console displays garbage, and that garbage is also mirrored in the little "preview" in the VirtualBox console (see screenshot).

It does not make a difference if 3D acceleration is on or off in VirtualBox, or how much video RAM I allocate for the VM.

It worked fine for months until early last week. After a "zypper dup" last Wednesday it broke, and I reverted the VM to a previous snapshot (and it worked again). "zypper dup" right now gave me the same pixel garbage.

When I open a window (Alt-F2 "xterm", PrintScreen key), it does not appear centered as it should, but near the bottom of the screen. Notice how the Xfce screenshooter shows the correct screen content, while the VirtualBox preview shows the pixel garbage as well.


[sh @ balrog-tw-dev] ~ 1 % cat /etc/os-release 
NAME="openSUSE Tumbleweed"
# VERSION="20201030"
ID="opensuse-tumbleweed"
ID_LIKE="opensuse suse"
VERSION_ID="20201030"
PRETTY_NAME="openSUSE Tumbleweed"
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:opensuse:tumbleweed:20201030"
BUG_REPORT_URL="https://bugs.opensuse.org"
HOME_URL="https://www.opensuse.org/"
DOCUMENTATION_URL="https://en.opensuse.org/Portal:Tumbleweed"
LOGO="distributor-logo"

[sh @ balrog-tw-dev] ~ 2 % rpm -qa | grep X
libXtst6-1.2.3-2.1.x86_64
libXft2-2.3.3-1.7.x86_64
perl-XML-Twig-3.52-2.2.noarch
libXau6-1.0.9-1.7.x86_64
libXfont2-2-2.0.4-1.5.x86_64
libXmuu1-1.1.3-1.7.x86_64
xorg-x11-libX11-ccache-7.6-21.7.x86_64
perl-XML-Simple-2.25-1.8.noarch
perl-XML-Dumper-0.81-69.13.x86_64
libXmu6-1.1.3-1.7.x86_64
libXaw7-1.0.13-2.9.x86_64
perl-X11-Protocol-0.56-15.4.noarch
libXt6-1.2.0-1.5.x86_64
libXRes1-1.2.0-1.9.x86_64
xorg-x11-Xvnc-1.10.1-6.1.x86_64
xorg-x11-server-Xvfb-1.20.9-3.1.x86_64
perl-XML-SAX-Expat-0.51-4.7.noarch
libZXing1-1.1.0-3.1.x86_64
perl-X500-DN-0.29-109.3.x86_64
libX11-xcb1-1.6.12-1.1.x86_64
libXfixes3-5.0.3-1.11.x86_64
perl-XML-NamespaceSupport-1.12-1.10.noarch
typelib-1_0-XApp-1_0-1.6.10-4.1.x86_64
libXrandr-devel-1.5.2-1.7.x86_64
libXrandr2-1.5.2-1.7.x86_64
libQt5Xml-devel-5.15.1-4.1.x86_64
perl-XML-LibXML-2.0206-1.1.x86_64
libXdamage1-1.1.5-1.7.x86_64
libXext6-1.3.4-1.7.x86_64
libX11-devel-1.6.12-1.1.x86_64
libX11-6-1.6.12-1.1.x86_64
libQt5XmlPatterns5-5.15.1-1.1.x86_64
libXcursor1-1.2.0-1.5.x86_64
libXi6-1.7.10-1.5.x86_64
libXdmcp6-1.1.3-1.7.x86_64
libXrender1-0.9.10-1.12.x86_64
libXxf86vm1-1.1.4-1.17.x86_64
libXinerama1-1.1.4-1.8.x86_64
perl-XML-SAX-Base-1.09-1.11.noarch
perl-Cpanel-JSON-XS-4.25-1.1.x86_64
libXfontcache1-1.0.5-12.18.x86_64
perl-XML-Parser-2.46-1.4.x86_64
libXpm4-3.5.13-1.4.x86_64
libXrender-devel-0.9.10-1.12.x86_64
libXext-devel-1.3.4-1.7.x86_64
libXvnc1-1.10.1-6.1.x86_64
libXv1-1.0.11-1.11.x86_64
libXaw3d8-1.6.3-1.8.x86_64
libXcomposite1-0.4.5-1.5.x86_64
libXau-devel-1.0.9-1.7.x86_64
xorg-x11-Xvnc-module-1.10.1-6.1.x86_64
perl-XML-Writer-0.900-1.1.noarch
libXss1-1.2.3-1.9.x86_64
libX11-data-1.6.12-1.1.noarch
libQt5X11Extras5-5.15.1-1.1.x86_64
perl-XML-SAX-1.02-1.4.noarch
libXpresent1-1.0.0-2.3.x86_64
libQt5Xml5-5.15.1-4.1.x86_64


[sh @ balrog-tw-dev] /etc/X11 5 % ls -l
total 16
drwxr-xr-x 2 root root 4096 Okt 19 17:59 xdm
drwxr-xr-x 3 root root 4096 Okt 15 15:12 xinit
drwxr-xr-x 2 root root 4096 Okt 15 15:12 xorg.conf.d
-rw-r--r-- 1 root root  938 Mär 23  2020 xorg.conf.install

[sh @ balrog-tw-dev] /etc/X11 6 % cd xorg.conf.d 

[sh @ balrog-tw-dev] .../etc/X11/xorg.conf.d 7 % ls -l
total 4
-rw-r--r-- 1 root root 440 Mär 23  2020 00-keyboard.conf
Comment 1 Stefan Hundhammer 2020-11-02 13:55:09 UTC
Created attachment 843220 [details]
X11 log at /var/log/Xorg.0.log
Comment 2 Stefan Hundhammer 2020-11-02 13:56:23 UTC
Created attachment 843221 [details]
sudo journalctl -b
Comment 3 Stefan Hundhammer 2020-11-02 13:58:19 UTC
The VirtualBox host is using VirtualBox 5.2.42.
Comment 4 Stefan Hundhammer 2020-11-02 14:01:49 UTC
Created attachment 843222 [details]
Screenshot: Pixel garbage
Comment 5 Stefan Hundhammer 2020-11-02 14:13:01 UTC
Created attachment 843223 [details]
Screenshot: Correct Xfce4 Desktop

That's what it should look like (after reverting to the previous VirtualBox snapshot).
Comment 6 Stefan Hundhammer 2020-11-02 14:15:14 UTC
Also notice the remnants of green text console messages in the pixel garbage; this looks like "OK" messages from started services during booting.
Comment 7 Stefan Dirsch 2020-11-02 14:44:04 UTC
I could not spot anything obvious, but the kernel was updated from 5.8 to 5.9.1 with TW 20201026. It would be worth testing again with a 5.8 kernel. Unfortunately, I don't know where 5.8 kernel packages can still be found. :-(
Comment 8 Stefan Hundhammer 2020-11-02 15:24:00 UTC
Since I reverted to the previous VM snapshot that still has kernel 5.8.15, I can try to set the kernel to "protected" and upgrade all other packages.

Let me try that.
Comment 9 Stefan Hundhammer 2020-11-02 16:33:12 UTC
After locking those packages and doing a "zypper dup" with everything else, it still works fine:


[sh @ balrog-tw-dev] ~ 8 % sudo zypper locks

# | Name                   | Type    | Repository
--+------------------------+---------+-----------
1 | kernel-default         | package | (any)
2 | kernel-default-base    | package | (any)
3 | virtualbox-dmp-default | package | (any)
4 | virtualbox-guest*      | package | (any)
5 | virtualbox-guest-tools | package | (any)
6 | virtualbox-guest-x11   | package | (any)
7 | virtualbox-kmp-default | package | (any)

[sh @ balrog-tw-dev] ~ 9 % rpm -qa "kernel-default*" "virtualbox*" | sort -u

kernel-default-5.8.14-1.2.x86_64
kernel-default-5.8.15-1.2.x86_64
virtualbox-guest-tools-6.1.14-2.1.x86_64
virtualbox-guest-x11-6.1.14-2.1.x86_64
virtualbox-kmp-default-6.1.14_k5.8.15_1-2.5.x86_64
Comment 10 Stefan Hundhammer 2020-11-02 16:45:00 UTC
After upgrading to the new kernel and rebooting and just locking the virtualbox* packages, it still works fine.

[sh @ balrog-tw-dev] ~ 24 % sudo zypper locks

# | Name                   | Type    | Repository
--+------------------------+---------+-----------
1 | virtualbox-guest*      | package | (any)
2 | virtualbox-kmp-default | package | (any)


[sh @ balrog-tw-dev] ~ 25 % sudo zypper dup

Loading repository data...
Reading installed packages...
Warning: You are about to do a distribution upgrade with all enabled repositories. Make sure these repositories are compatible before you continue. See 'man zypper' for more information about this command.
Computing distribution upgrade...

The following 5 items are locked and will not be changed by any action:
 Available:
  virtualbox-guest-desktop-icons virtualbox-guest-source
 Installed:
  virtualbox-guest-tools virtualbox-guest-x11 virtualbox-kmp-default

Nothing to do.
Comment 11 Stefan Hundhammer 2020-11-02 16:54:34 UTC
After upgrading the last remaining package virtualbox-kmp-default-6.1.14_k5.9.1_1-2.6.x86_64 it STILL works.

But I noticed that zypper also downloaded a new kernel package while I was upgrading (see comment #10) despite having cached everything before ("sudo zypper dup --download-only"). I now have kernel-default-5.9.1-1.2.x86_64 in that VM.

So maybe it was that latest kernel that fixed it. I have no other explanation to offer. !?!?
Comment 12 Stefan Hundhammer 2020-11-10 16:56:54 UTC
It just came back after "zypper dup" today.

When I boot the old kernel 5.8.18 from the Grub2 prompt, it's working okay.
When I boot the new kernel 5.9.1, I get the same pixel garbage.

I tried recreating the initrd ("sudo mkinitrd"), but that did not change anything.
Comment 13 Stefan Hundhammer 2020-11-10 16:57:56 UTC
Created attachment 843465 [details]
Output of   rpm -qa
Comment 14 Stefan Hundhammer 2020-11-10 16:58:24 UTC
Created attachment 843466 [details]
Output of rpm -qa --last
Comment 15 Stefan Hundhammer 2020-11-10 16:59:16 UTC
Created attachment 843467 [details]
Output of   sudo journalctl -b
Comment 16 Stefan Hundhammer 2020-11-10 17:00:42 UTC
Created attachment 843468 [details]
Output of   sudo journalctl -b -1   (booting with kernel 5.8.18; no pixel garbage)

This is not after resetting the VM to the previous state, just selecting the previous kernel 5.8.18 from the Grub2 boot menu.
Comment 17 Stefan Hundhammer 2020-11-10 17:01:41 UTC
Booting kernel 5.9.1 with safe settings from the Grub2 boot menu also resulted in pixel garbage.
Comment 18 Stefan Hundhammer 2020-11-10 17:04:02 UTC
Created attachment 843469 [details]
Output of  lsmod
Comment 19 Stefan Hundhammer 2020-11-10 17:08:51 UTC
So AFAICS this might be a problem with that new kernel 5.9.1 or maybe with the virtualbox-guest* packages

  virtualbox-guest-x11-6.1.14-2.1.x86_64
  virtualbox-guest-tools-6.1.14-2.1.x86_64
Comment 20 Stefan Hundhammer 2020-11-10 17:10:22 UTC
Created attachment 843471 [details]
Output of  lsmod

(The previously attached "lsmod" output here was from the host system after the ssh session to the VM had died unexpectedly)
Comment 21 Stefan Hundhammer 2020-11-10 17:14:26 UTC
When I shut down that VM ("sudo halt" in an ssh session), the VirtualBox window also remains open after "Stopping disk". ACPI problem?
Comment 22 Stefan Dirsch 2020-11-10 17:23:08 UTC
Let's better reassign to kernel component.
Comment 23 Stefan Dirsch 2020-11-10 17:23:58 UTC
Also adding maintainer of virtualbox package
Comment 24 Stefan Hundhammer 2020-11-10 18:25:37 UTC
I tried to force-reinstall the virtualbox packages which appears to rebuild the kernel module and runs "dracut", but that also doesn't help.

  sudo zypper in -f virtualbox-guest-x11 virtualbox-guest-tools virtualbox-kmp-default
Comment 25 robert spitzenpfeil 2020-11-10 21:49:07 UTC
You might want to change the VM's graphic driver to "VMSVGA".
Comment 26 robert spitzenpfeil 2020-11-10 21:51:03 UTC
On the host, in the VM's "Display" settings.
Comment 27 Larry Finger 2020-11-10 22:04:42 UTC
Using VMSVGA is what I was about to suggest.

I am unable to reproduce your problem, but my setup is different. My host is TW running the 5.9.1 kernel, and my guest is TW, also running a 5.9.1 kernel. Both host and guest are fully updated. My desktop on both is KDE Plasma.

What host are you using that runs VB 5.2.42? As I recall, that version has some exploits noted by CVE that were fixed in 5.2.44.
Comment 28 Stefan Hundhammer 2020-11-11 09:04:06 UTC
There is no such setting in that version of VirtualBox (5.2.42); will attach screenshot.

The host system is Xubuntu 18.04.5 LTS; I had to revert to that one after experiencing a ton of problems on Xubuntu 20.04 LTS, and then I had to change that parameter back to something that the previous VirtualBox could cope with.

Right now it's

      <Display VRAMSize="20" accelerate3D="true"/>

(3D on or off, or changing the VRAMSize, does not make a difference)

https://github.com/shundhammer/huha-linux-tips/blob/master/doc/virtualbox-tips.md#no-graphics-in-guest-after-virtualbox-downgrade


It worked well with that for months; as written here, it also works with the previous kernel 5.8.18 with the exact same setup, just booting the older kernel from the Grub2 menu.
Comment 29 Stefan Hundhammer 2020-11-11 09:05:13 UTC
Created attachment 843484 [details]
Screenshot: VirtualBox display settings
Comment 30 Stefan Hundhammer 2020-11-11 09:07:53 UTC
Notice that it was VirtualBox itself that changed my previous

  <Display controller="VBoxVGA" VRAMSize="20" accelerate3D="true"/>

to

  <Display VRAMSize="20" accelerate3D="true"/>

i.e. it removed the "controller=..." attribute (probably after creating a snapshot or me editing display settings).
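
To check which graphics controller a saved VM definition actually records, the Display element can be read programmatically. This is a minimal Python sketch, assuming the settings are available as XML text; the helper name display_settings is made up for illustration and is not part of VirtualBox. It only reports what is written in the file and makes no claim about which controller VirtualBox falls back to when the attribute is missing.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional


def display_settings(xml_text: str) -> Optional[Dict[str, str]]:
    """Return the attributes of the first Display element, or None if absent."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        # Compare only the local tag name so a namespaced settings file works too.
        if element.tag.rsplit("}", 1)[-1] == "Display":
            return dict(element.attrib)
    return None


# The two variants quoted in this comment.
old = display_settings('<Display controller="VBoxVGA" VRAMSize="20" accelerate3D="true"/>')
new = display_settings('<Display VRAMSize="20" accelerate3D="true"/>')

print(old.get("controller"))  # the controller recorded in the earlier settings
print(new.get("controller"))  # None, because the attribute was removed
```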
Comment 31 Stefan Hundhammer 2020-11-11 09:46:01 UTC
I just experimented some more with booting Xubuntu 20.04 LTS which has VirtualBox 6.1.10:

It has that "Graphics Controller" field in the display settings, of course, and it wants to force it to "VMSVGA". When I try to change it back to "VBoxVGA", a warning "Invalid settings detected" appears in the status line with a tooltip saying to use "VMSVGA" only "if you have a reason".

Booting the VM with that configuration shows the exact same pixel garbage, and after a short while the VM crashes (!); the VM status in the VirtualBox main window says "Aborted".

Switching to "VMSVGA" indeed seems to fix the problem: No pixel garbage, correct display.


But all that really doesn't help me. I can't work with that Xubuntu 20.04 LTS for any period of time because the NVidia driver keeps crashing randomly; that was the main reason to go back to the previous 18.04. And that system works fine; it has been working perfectly all the time, even with that older VirtualBox. And even that older VirtualBox works fine with the older kernel 5.8.18; just not with the latest kernel 5.9.1.


Also consider that there will be a lot of users out there who need to run our latest kernel on their older host systems; after all, that's the whole point of having virtualization systems.

We can't force them all to upgrade their host systems to something newer because our latest kernel doesn't work well with the older VirtualBox version that comes with their host system.
Comment 32 Stefan Hundhammer 2020-11-11 09:46:41 UTC
Created attachment 843486 [details]
Screenshot: VirtualBox 6.1.10 display settings
Comment 33 Larry Finger 2020-11-11 16:51:52 UTC
In all my testing, there are no display problems with the graphics controller set to VMSVGA. Of course, all that is with openSUSE hosts and various guests. My standard test machines are Windows XP, 7, and 10 as well as openSUSE TW, Leap 15.1 and Leap 15.2. I do have one test machine that gets other Linux distros if there are complaints. If you had a problem with a Ubuntu guest on an openSUSE host, I could look at that.

Note that even VB 6.1.10 is pretty old. With 6.1.12, Oracle fixed 25 CVE exploits - the ones I mentioned earlier. I would not want to run such a system five months after those exploits have been published.

I have no idea what change there might be in kernel 5.9 that would break graphics on an old host system. At least it works with the recommended VMSVGA virtual controller.

It may be true that "We can't force them all to upgrade their host systems to something newer", but it is definitely true that I have little enthusiasm for debugging Ubuntu problems. I have too many problems already, including that VB may never run on hosts with kernel 5.10+.

My inclination is to close this with a WONT FIX.
Comment 34 Stefan Hundhammer 2020-11-11 17:44:20 UTC
I really don't care about CVEs for this; this is purely for test machines and development. 

What I do care for is a stable working platform for my daily work - which Tumbleweed unfortunately is not; and Leap is too old to bother with (and the desktops unfortunately poorly maintained for lack of desktop teams).

Thus the other platform. I can live perfectly well with an older VirtualBox manager on my workstation.

But for developing for our next release, I do need Tumbleweed; that's what I have that VM for. And it served me well ever since the Corona crisis with subsequent home office for all of our R&D began. And at home I am the master of my own hardware, and I use what works best for me.


Sure, you can close this as WONTFIX. But rest assured that in that case this will be my last bug report EVER for everything beyond my direct area of responsibility. The vast majority of bugs that I have reported in all my years at SUSE (since 1999) ended up in some unsatisfactory state anyway (WORKSFORME, WONTFIX or in unfixed limbo forever).


This is not "an Ubuntu problem". This is a VirtualBox problem or (less likely) a problem with that latest kernel 5.9.1.

What makes you so sure that our Leap or SLE users won't have the exact same problem when they try to run a VM with that kernel (or a later one)?
Comment 35 Larry Finger 2020-11-11 20:28:33 UTC
Tumbleweed is stable for me, even though one does not choose a rolling release for stability.

I understand your frustration, but what do you expect me to do? I just installed Xubuntu 18.04 LTS in a spare partition of my host, updated it, and installed VirtualBox. Running my TW KDE VM resulted in a graphics screen with a cursor only. That was with both 5.9.1 and 5.8.4 kernels. The only thing that changed was Xubuntu instead of a TW host. Why is this an openSUSE problem?
Comment 36 Stefan Hundhammer 2020-11-12 09:39:49 UTC
I don't know what's going on there. But I do know that the problem started with the latest kernel 5.9.1, and it's the same with VirtualBox 6.1.10 using the VBoxVGA graphics driver there.

When I look at that pixel garbage (see screenshot), I can see remnants of the framebuffer console with kernel and systemd messages with green [success] messages. The pixel lines are of course not aligned to make a proper image (different resolution), but it looks very much like what used to be the framebuffer during the boot process.

At the same time, the X11 screen content appears to be offset downwards and to the right (which probably explains why keyboard echo does not appear where it should).

So, knowing nothing about how the kernel, the framebuffer or the graphics drivers work, it appears to me as if the start pointer of the graphics pixel buffer is offset in that scenario.

From previous ventures into low-level graphics stuff back in the early 90s, this reminds me of cases where the base parameters of the graphics setup were wrong; back then it was bit order vs. byte order, pixel line padding and similar things. Maybe that newer kernel departed from previous settings of that kind that were always taken for granted (and thus hard-coded into that VBoxVGA driver on the VirtualBox side). But that is pure speculation, of course.

My point is when I have that problem, others will have it as well; this may very well be a ticking time bomb.

I personally can live for a while (weeks or months) with simply not updating that kernel, and when I have to (which sooner or later will happen), try my luck with another virtualization solution like libvirt or VMware.


Paying SLE customers may see that differently, though, and insist on a solution (with filing L3 support calls). I don't know how far L3 support for VirtualBox in SLE goes, but customers may argue that it's a SUSE problem, not their VirtualBox host.
Comment 37 Stefan Hundhammer 2020-11-12 09:54:09 UTC
Please look at that screenshot again:

https://bugzilla.opensuse.org/attachment.cgi?id=843222

The first green pixel line starts with a black portion. That is exactly how much the rest of the X11 screen is offset to the right. In the vertical dimension, the black part is exactly how much it's offset downwards.

When you look at the visible part of the Xfce screenshot utility at the bottom part of that X11 screen, you will see that the screenshot it took inside that VM is correct; there is no pixel garbage, and the icons are lined up correctly at the left top corner.

So for the X server inside the VM, everything is alright (AFAICS). It must be somewhere in the communication protocol between the VM guest and the host.
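
The arithmetic behind that observation can be sketched as follows. This is only an illustration in Python, assuming a linear framebuffer without line padding and 4 bytes per pixel; the function name and all numbers are hypothetical and were not measured from the screenshot. It shows how a single wrong start offset into the pixel buffer would produce both a horizontal and a vertical shift at once.

```python
from typing import Tuple


def scanout_shift(byte_offset: int, width_px: int, bytes_per_pixel: int = 4) -> Tuple[int, int]:
    """Convert a byte offset into a linear framebuffer to a (dx, dy) pixel shift."""
    pitch = width_px * bytes_per_pixel      # bytes per pixel line, assuming no padding
    dy, remainder = divmod(byte_offset, pitch)
    dx = remainder // bytes_per_pixel       # leftover bytes become a horizontal shift
    return dx, dy


# Hypothetical example: a 1024 pixel wide screen whose start pointer is
# 100 full lines plus 40 pixels too far into the buffer.
offset = 100 * (1024 * 4) + 40 * 4
print(scanout_shift(offset, width_px=1024))
```

If the stale boot console occupies the region before the shifted start, the display would show old console pixels first and the X11 image moved down and to the right, which matches the description above. Whether that is what really happens here is not established.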
Comment 38 Takashi Iwai 2020-11-12 10:02:32 UTC
What happened between your comment 11 and comment 12?  You showed the problem was introduced (again) there.  What packages have been updated exactly and what action done?

The information until now doesn't show whether it's a problem in the kernel update itself or the VB module.

BTW, no VB package is provided on SLE.
Comment 39 Stefan Hundhammer 2020-11-12 13:00:20 UTC
(In reply to Takashi Iwai from comment #38)
> What happened between your comment 11 and comment 12?  

Another "zypper dup" with more updated packages.

> You showed the
> problem was introduced (again) there.  What packages have been updated
> exactly and what action done?

Unfortunately, I only have a partial package list of the previous state (see comment #0). For the new state, it's the attachment from comment #13.

The attachment from comment #14 contains the output of "rpm -qa --last" which includes timestamps, however:

https://bugzilla.opensuse.org/attachment.cgi?id=843466
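
For answering the "what changed exactly" question in future, two saved "rpm -qa" listings can be compared mechanically. This is a minimal Python sketch, assuming both listings exist as plain lists of name-version-release.arch strings (only a partial earlier list exists in this report); the function names are invented for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def versions_by_name(packages: Iterable[str]) -> Dict[str, Set[str]]:
    """Map each package name to the set of installed version-release.arch strings."""
    table: Dict[str, Set[str]] = defaultdict(set)
    for package in packages:
        package = package.strip()
        if not package:
            continue
        # rpm names have the form name-version-release.arch; the name may contain dashes.
        name, version, release_arch = package.rsplit("-", 2)
        table[name].add(f"{version}-{release_arch}")
    return table


def changed_packages(before: Iterable[str], after: Iterable[str]) -> Dict[str, Tuple[List[str], List[str]]]:
    """Return {name: (old versions, new versions)} for every package that differs."""
    old, new = versions_by_name(before), versions_by_name(after)
    return {
        name: (sorted(old.get(name, ())), sorted(new.get(name, ())))
        for name in sorted(set(old) | set(new))
        if old.get(name, set()) != new.get(name, set())
    }


before = ["kernel-default-5.9.1-1.2.x86_64", "libX11-6-1.6.12-1.1.x86_64"]
after = ["kernel-default-5.9.1-2.2.x86_64", "libX11-6-1.6.12-1.1.x86_64"]
print(changed_packages(before, after))
```

Sets are used per name because several kernel versions can be installed side by side, as seen earlier in this report.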
Comment 40 Stefan Hundhammer 2020-11-12 13:02:33 UTC
A complete shot in the dark (and, as noted before, I know nothing about kernel stuff and how all that works):

https://build.opensuse.org/package/view_file/openSUSE:Factory/virtualbox/fixes_for_5.9.patch?expand=1


Index: VirtualBox-6.1.14/src/VBox/Additions/linux/drm/vbox_ttm.c
===================================================================
--- VirtualBox-6.1.14.orig/src/VBox/Additions/linux/drm/vbox_ttm.c
+++ VirtualBox-6.1.14/src/VBox/Additions/linux/drm/vbox_ttm.c
@@ -445,7 +445,11 @@ err_free_vboxbo:
 
 static inline u64 vbox_bo_gpu_offset(struct vbox_bo *bo)
 {
+#if RTLNX_VER_MAX(5, 9, 0)
 	return bo->bo.offset;
+#else
+	return bo->offset;
+#endif
 }


This affects an offset, and the function name suggests it's related to the GPU. That might be a hint.
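
To make clear which branch of that guard applies to which kernel, here is a small Python sketch. It assumes that RTLNX_VER_MAX(a, b, c) is true when the kernel being built against is older than a.b.c, which is how the patch reads but is not confirmed in this report, and it uses the classic KERNEL_VERSION encoding from the kernel headers. The Python function names are made up for illustration.

```python
from typing import Tuple

Version = Tuple[int, int, int]


def kernel_version_code(major: int, minor: int, patch: int) -> int:
    """Classic KERNEL_VERSION(a, b, c) encoding: (a << 16) + (b << 8) + c."""
    return (major << 16) + (minor << 8) + patch


def older_than(running: Version, limit: Version) -> bool:
    """Assumed meaning of RTLNX_VER_MAX: the kernel is older than the given limit."""
    return kernel_version_code(*running) < kernel_version_code(*limit)


def offset_field(running: Version) -> str:
    """Name the struct member the patched vbox_bo_gpu_offset() would return."""
    return "bo->bo.offset" if older_than(running, (5, 9, 0)) else "bo->offset"


for version in [(5, 8, 18), (5, 9, 1)]:
    print(version, offset_field(version))
```

Under that assumption, the working 5.8.18 kernel takes the old code path and the 5.9.1 kernel takes the new one, so the two kernels compared in this report do differ in where the offset comes from.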
Comment 41 Stefan Hundhammer 2020-11-12 13:16:50 UTC
My last good snapshot (to which I reverted in the meantime) runs with


  Linux balrog-tw-dev.fritz.box 5.9.1-1-default #1 SMP 
  Mon Oct 26 07:02:23 UTC 2020 (435e92d) x86_64 x86_64 x86_64 GNU/Linux

i.e. kernel-default-5.9.1-1.2.x86_64

"zypper dup" would update to kernel-default-5.9.1-2.2.x86_64 .
Comment 42 Stefan Hundhammer 2020-11-12 14:29:58 UTC
It keeps getting weirder and weirder.

I did more experiments with kernel-default locked and doing "zypper dup" for all the rest. I had not realized that this would also install kernel-default-base which zypper installed to resolve the dependencies of virtualbox-kmp.

That led to being restricted to a 640x480 (standard VGA) screen resolution because virtualbox-kmp could not be properly initialized:

Nov 12 14:21:12 localhost kernel:

  vboxvideo: Unknown symbol ttm_bo_mmap (err -2)
  vboxvideo: Unknown symbol ttm_bo_manager_func (err -2)
  vboxvideo: Unknown symbol ttm_bo_glob (err -2)
  vboxvideo: Unknown symbol ttm_bo_device_release (err -2)
  vboxvideo: Unknown symbol ttm_bo_kunmap (err -2)
  vboxvideo: Unknown symbol ttm_bo_device_init (err -2)
  vboxvideo: Unknown symbol ttm_bo_init_mm (err -2)
  vboxvideo: Unknown symbol ttm_bo_dma_acc_size (err -2)
  vboxvideo: Unknown symbol ttm_tt_init (err -2)
  vboxvideo: Unknown symbol ttm_bo_kmap (err -2)
  vboxvideo: Unknown symbol ttm_bo_init (err -2)
  vboxvideo: Unknown symbol ttm_bo_validate (err -2)
  vboxvideo: Unknown symbol ttm_bo_move_to_lru_tail (err -2)
  vboxvideo: Unknown symbol ttm_bo_put (err -2)
  vboxvideo: Unknown symbol ttm_tt_fini (err -2)
  vboxvideo: Unknown symbol ttm_bo_eviction_valuable (err -2)

But there was no pixel garbage.
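
Messages like these can be summarized quickly to see what the module was missing. This is a minimal Python sketch that assumes the journal text has the "module: Unknown symbol name (err N)" shape shown above; the helper names are hypothetical. In this case every missing symbol starts with ttm_, which suggests (without proving it) that the kernel's TTM memory manager module was not available to vboxvideo in that package combination.

```python
import re
from collections import Counter
from typing import Dict, Iterable, List

# Matches e.g. "vboxvideo: Unknown symbol ttm_bo_mmap (err -2)", with or
# without a leading timestamp/host prefix on the same line.
UNKNOWN_SYMBOL = re.compile(r"(\w+): Unknown symbol (\w+) \(err (-?\d+)\)")


def missing_symbols(log_text: str) -> Dict[str, List[str]]:
    """Return {module: [unresolved symbol names]} found in the log text."""
    result: Dict[str, List[str]] = {}
    for line in log_text.splitlines():
        match = UNKNOWN_SYMBOL.search(line)
        if match:
            result.setdefault(match.group(1), []).append(match.group(2))
    return result


def symbol_prefixes(symbols: Iterable[str]) -> Counter:
    """Count symbols by the part before the first underscore."""
    return Counter(symbol.split("_", 1)[0] for symbol in symbols)


sample = """\
Nov 12 14:21:12 localhost kernel: vboxvideo: Unknown symbol ttm_bo_mmap (err -2)
  vboxvideo: Unknown symbol ttm_tt_init (err -2)
  some unrelated line
"""
found = missing_symbols(sample)
print(found)
print(symbol_prefixes(found["vboxvideo"]))
```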

Later I unlocked kernel-default, uninstalled kernel-default-base (which also uninstalled the virtualbox-* packages) and then force-reinstalled the new kernel and the virtualbox-* packages:

  zypper in --force kernel-default virtualbox-guest-x11 virtualbox-guest-tools virtualbox-kmp-default

...and now everything is working again: No pixel garbage, not restricted to 640x480.


Somehow I feel this is a bootstrap problem of the involved packages or a binary compatibility problem: Just like previously, upgrading the packages step by step appears to work fine, while upgrading everything at once has weird side effects.

virtualbox-kmp-default rebuilds the kernel module in its post-install script (and  recreates the initrd with dracut). Is it possible or plausible that in certain situations it does that with the wrong kernel headers or something like that? Using the ones of the currently running kernel rather than the one that is also being installed in the same "zypper dup" run?

Is it plausible that this might be only a missing dependency in the virtualbox .spec file; or a post-install script being executed at an inappropriate time, before the complete upgrade process is complete?
Comment 43 Stefan Hundhammer 2020-11-12 14:33:39 UTC
The installation / upgrade sequence of those four critical packages was:

kernel-default-5.9.1-2.2.x86_64               Do 12 Nov 2020 15:09:09 CET
virtualbox-kmp-default-6.1.14_k5.9.1_2-2.8.x86_64 Do 12 Nov 2020 15:09:46 CET
virtualbox-guest-x11-6.1.14-2.1.x86_64        Do 12 Nov 2020 15:11:03 CET
virtualbox-guest-tools-6.1.14-2.1.x86_64      Do 12 Nov 2020 15:11:04 CET

This resulted in a correctly working system.
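
Whether an installed virtualbox-kmp-default package belongs to a given kernel package can be checked from the names alone. This is a minimal Python sketch, assuming the "_k<kernel version>_<release>" tag in the KMP name mirrors the kernel's version and release, as the package names quoted in this report suggest; the function names are made up for illustration and this is not how zypper or rpm resolve dependencies.

```python
import re


def kernel_of_kmp(kmp_package: str) -> str:
    """Extract 'version-release' of the kernel a KMP package name refers to."""
    match = re.search(r"_k(\d+(?:\.\d+)*)_(\d+)-", kmp_package)
    if match is None:
        raise ValueError(f"no _k<version>_<release> tag in {kmp_package!r}")
    return f"{match.group(1)}-{match.group(2)}"


def kmp_matches_kernel(kmp_package: str, kernel_package: str) -> bool:
    """True if the kernel-default package has the version the KMP name refers to."""
    prefix = "kernel-default-"
    if not kernel_package.startswith(prefix):
        raise ValueError(f"not a kernel-default package: {kernel_package!r}")
    kernel_version = kernel_package[len(prefix):]
    wanted = kernel_of_kmp(kmp_package)
    # The kernel release carries an extra build counter, e.g. 5.9.1-2.2 for tag 5.9.1_2.
    return kernel_version == wanted or kernel_version.startswith(wanted + ".")


kmp = "virtualbox-kmp-default-6.1.14_k5.9.1_2-2.8.x86_64"
print(kernel_of_kmp(kmp))
print(kmp_matches_kernel(kmp, "kernel-default-5.9.1-2.2.x86_64"))
print(kmp_matches_kernel(kmp, "kernel-default-5.9.1-1.2.x86_64"))
```

With the names from this comment, the KMP and the kernel agree; the older 5.9.1-1.2 kernel from the last good snapshot would not match this KMP build.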
Comment 44 Larry Finger 2020-11-12 15:59:13 UTC
(In reply to Stefan Hundhammer from comment #42)
> virtualbox-kmp-default rebuilds the kernel module in its post-install script
> (and  recreates the initrd with dracut). Is it possible or plausible that in
> certain situations it does that with the wrong kernel headers or something
> like that? Using the ones of the currently running kernel rather than the
> one that is also being installed in the same "zypper dup" run?

If you build a kernel module with the wrong headers, the resulting code will simply not load, thus your supposition is wrong!

> 
> Is it plausible that this might be only a missing dependency in the
> virtualbox .spec file; or a post-install script being executed at an
> inappropriate time, before the complete upgrade process is complete?

The VirtualBox modules do not need to be in the initrd as they are not loaded until the file system is up and running. Nonetheless, the install system puts them in initrd. The spec process is rather complicated; however, I think it is doing the right thing otherwise.
Comment 45 Larry Finger 2020-11-12 16:04:05 UTC
(In reply to Stefan Hundhammer from comment #40)
> A complete shot in the dark (and, as noted before, I know nothing about
> kernel stuff and how all that works):
> 
> https://build.opensuse.org/package/view_file/openSUSE:Factory/virtualbox/fixes_for_5.9.patch?expand=1

I am the author of that patch. All of those changes are needed because the kernel developers changed the API in a number of places in the 5.9 kernel. Without those changes, the modules would not build.
Comment 46 Larry Finger 2020-11-12 16:08:58 UTC
(In reply to Takashi Iwai from comment #38)
> 
> BTW, no VB package is provided on SLE.

I do ensure that the VB packages will build on SLE. Thus if a user wants to install VMs using VB, it would work. They would, of course, blow their warranty. Unfortunately, VB is rather insecure!
Comment 47 Larry Finger 2020-11-12 19:17:00 UTC
(In reply to Stefan Hundhammer from comment #42)
> virtualbox-kmp-default rebuilds the kernel module in its post-install script
> (and  recreates the initrd with dracut). Is it possible or plausible that in
> certain situations it does that with the wrong kernel headers or something
> like that? Using the ones of the currently running kernel rather than the
> one that is also being installed in the same "zypper dup" run?


If you build a kernel module with the wrong headers, the resulting code will simply not load, thus your supposition is wrong!

> 
> Is it plausible that this might be only a missing dependency in the
> virtualbox .spec file; or a post-install script being executed at an
> inappropriate time, before the complete upgrade process is complete?


The VirtualBox modules do not need to be in the initrd as they are not loaded until the file system is up and running. Nonetheless, the install system puts them in initrd. The spec process is rather complicated; however, I think it is doing the right thing otherwise.
Comment 48 Stefan Hundhammer 2021-03-10 12:53:12 UTC
I don't know what fixed it, but after a long-postponed "zypper dup" today which gave me kernel 5.11.2-1-default (among 2433 other new packages) it's now working again.