Bug 1218231

Summary: Hw virtualization enabled in BIOS of Intel Skylake i5-6200U, but lscpu says "VMX" disabled
Product: [openSUSE] openSUSE Distribution
Reporter: ell1e <el>
Component: KVM
Assignee: E-mail List <kvm-bugs>
Status: RESOLVED INVALID
QA Contact: E-mail List <qa-bugs>
Severity: Normal
Priority: P5 - None
CC: aginies, claudio.fontana, dfaggioli, el, tiwai
Version: Leap 16.0
Target Milestone: ---
Hardware: Other
OS: Other
Whiteboard:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---
Attachments: lscpu output in full

Description ell1e 2023-12-19 17:59:32 UTC
I boot with these kernel options for CPU security: "mitigations=auto,nosmt mds=full,nosmt", on a laptop with Skylake i5-6200U. However, in the BIOS/UEFI options I enabled all virtualization features of this CPU. This means VT-x/VMX should be available, since Skylake supports this according to product pages. The kernel's iTLB multihit mitigation documentation does NOT suggest that any mitigation option will ever disable VT-x/VMX fully. Therefore, I would strongly expect it to be enabled on this system with this configuration.

Now here comes the problem:

$ lscpu | grep VMX
Vulnerability Itlb multihit:     KVM: Mitigation: VMX disabled
Vulnerability L1tf:              Mitigation; PTE Inversion; VMX conditional cache flushes, SMT disabled
$ cat /sys/devices/system/cpu/vulnerabilities/itlb_multihit
KVM: Mitigation: VMX disabled

Something seems to be wrong here. As a semi naive uninformed user, I am guessing the following possible options for what happened:

Option A: VMX is actually disabled by the kernel through some mitigation option. In this case this is a documentation issue, since the kernel help explaining this "KVM: Mitigation: VMX disabled" text ( https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/multihit.html ) reads as if no mitigation would actively disable it, suggesting "VMX disabled" is only meant to show when the BIOS/UEFI turned it off. So the help page should then say what may interfere with VMX at the kernel level beyond BIOS/UEFI.

Option B: VMX is actually enabled, and the iTLB mitigation is set to "Split huge pages" (this would as far as I can tell be the "correct" option for this system). In this case, there seems to be some mostly benign iTLB info output bug.

Option C: VMX is actually enabled, the iTLB mitigation is NOT active, and the device is vulnerable. This potential outcome is why I marked this as SECURITY ISSUE until it can be ruled out: if it is true, then lscpu clearly tells me I'm safe while I'm actually not. So this outcome could trick admins into making expensive hardening missteps.

Option D: Maybe I'm just not smart enough to use my BIOS/UEFI, lol, or I misunderstand VMX/VT-x fundamentally. I'm not a kernel developer, after all.

Vaguely related, this server fault post suggests this may also happen on many distributions: https://serverfault.com/questions/1073247/if-kvm-is-working-why-does-vmx-show-as-disabled (Not entirely sure though if it's the same cause)

First seen with Fedora's 5.13.15-200.fc34.x86_64, still present with openSUSE Slowroll's 6.5.9-1-default.

Steps to reproduce:

1. Boot an Intel Skylake computer with Fedora installed. Ensure all virtualization features are enabled in the UEFI/BIOS options.

2. Once Linux has booted, change /etc/default/grub to add in "mitigations=auto,nosmt mds=full,nosmt" and update your generated grub2 files, and reboot.

3. Once inside Linux again, check "lscpu" output and /sys/devices/system/cpu/vulnerabilities/itlb_multihit output. Expected output would be something other than "VMX disabled".
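
Before looking at the output in step 3, it may help to confirm that the options from step 2 actually reached the running kernel. A minimal sketch, assuming /proc/cmdline is readable; `cmdline_has` is a hypothetical helper written for this report, not an existing tool:

```sh
#!/bin/sh
# cmdline_has OPTION [FILE]: succeed if OPTION appears as a whole word in FILE.
# FILE defaults to /proc/cmdline, the command line of the running kernel.
# (Hypothetical helper, for illustration only.)
cmdline_has() {
    want=$1
    src=${2:-/proc/cmdline}
    # Put each option on its own line so "mds=full" cannot match "mds=full,nosmt".
    tr -s ' \t' '\n\n' < "$src" | grep -Fqx -- "$want"
}

for o in "mitigations=auto,nosmt" "mds=full,nosmt"; do
    if cmdline_has "$o"; then
        echo "present: $o"
    else
        echo "missing: $o"
    fi
done
```

If an option shows up as missing, the grub2 configuration was probably not regenerated, and the sysfs output in step 3 would not reflect the intended setup.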
Comment 1 ell1e 2023-12-19 18:02:15 UTC
Created attachment 871454
lscpu output in full
Comment 2 ell1e 2023-12-19 18:04:50 UTC
This bug was originally filed at https://bugzilla.redhat.com/show_bug.cgi?id=2005094 where it has been sitting hidden from the public with no action for years, so I decided maybe it would be better to make it public.
Comment 3 Takashi Iwai 2024-01-18 14:28:14 UTC
Let's toss this to the virtualization team.
Comment 4 Antoine Ginies 2024-01-18 15:01:45 UTC
"1. Boot an Intel Skylake computer with Fedora installed. Ensure all virtualization features are enabled in the UEFI/BIOS options."

Is Fedora or Leap installed on the system?
Comment 5 ell1e 2024-01-18 16:44:16 UTC
Currently it's the latest Slowroll! My apologies, I just forgot to update this text, since I initially ran into this on Fedora but openSUSE is also affected. I abandoned Fedora a while ago.
Comment 6 Dario Faggioli 2024-01-24 17:11:50 UTC
(In reply to ell1e from comment #0)
> Steps to reproduce:
> 
> 1. Boot an Intel Skylake computer with Fedora installed. Ensure all
> virtualization features are enabled in the UEFI/BIOS options.
> 
> 2. Once Linux has booted, change /etc/default/grub to add in
> "mitigations=auto,nosmt mds=full,nosmt" and update your generated grub2
> files, and reboot.
> 
> 3. Once inside Linux again, check "lscpu" output and
> /sys/devices/system/cpu/vulnerabilities/itlb_multihit output. Expected
> output would be something else than "VMX disabled".
>
Err... So, can you:
- check if the KVM module is loaded (`lsmod |grep kvm`)
- start a VM (making sure that it uses KVM, so `--enable-kvm` on the QEMU command line, or whatever is necessary for that in VirtManager, which should be the default anyway)
- Check again with `lscpu |grep VMX` and/or `cat /sys/devices/system/cpu/vulnerabilities/itlb_multihit`

?
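
The checks above could be wrapped in a small script and run once before and once after starting the VM. This is only a sketch: `kvm_modules` and `itlb_status` are made-up helper names, and it assumes /proc/modules (the data `lsmod` formats) and the sysfs file are readable.

```sh
#!/bin/sh
# kvm_modules [FILE]: print loaded modules whose name starts with "kvm".
# FILE defaults to /proc/modules. (Hypothetical helper.)
kvm_modules() {
    awk '$1 ~ /^kvm/ { print $1 }' "${1:-/proc/modules}"
}

# itlb_status [FILE]: print the kernel's current iTLB multihit status line.
# (Hypothetical helper.)
itlb_status() {
    cat "${1:-/sys/devices/system/cpu/vulnerabilities/itlb_multihit}"
}

# Errors are silenced so the script still runs on systems lacking these files.
echo "KVM modules:   $(kvm_modules 2>/dev/null | tr '\n' ' ')"
echo "iTLB multihit: $(itlb_status 2>/dev/null)"
```

Comparing the two runs shows whether the status line changes once a KVM guest is running.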
Comment 7 Dario Faggioli 2024-01-24 17:23:34 UTC
FTR, my situation is as follows:

- I have "nx_huge_pages=off" on the command line
- Right after boot, I see:

> cat /sys/devices/system/cpu/vulnerabilities/itlb_multihit 
> KVM: Mitigation: VMX disabled

- Still, VMX is there (`lscpu|grep Flags|grep vmx`) and kvm_intel is loaded
- As soon as I start a KVM VM, I see:

> cat /sys/devices/system/cpu/vulnerabilities/itlb_multihit 
> KVM: Vulnerable

So, the first "VMX disabled", AFAIUI, only really means that VMX is not in use at that time (e.g., there's no VM running!), not that it's really disabled. In fact, you can start using it and you should get "KVM: Split huge pages".

I agree it's a bit confusing, but that's how it is right now. If we want it differently, this should be changed in the kernel.
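
To make that reading concrete, here is a rough sketch that maps the three status strings seen in this report to the interpretation given above. The mapping reflects only the AFAIUI explanation in this comment, not an authoritative kernel definition, and `explain_itlb` is a made-up helper name.

```sh
#!/bin/sh
# explain_itlb STATUS: print this comment's reading of an itlb_multihit line.
# Only the strings observed in this report are covered. (Hypothetical helper.)
explain_itlb() {
    case $1 in
        "KVM: Mitigation: VMX disabled")
            echo "VMX not in use right now (no KVM VM running)" ;;
        "KVM: Mitigation: Split huge pages")
            echo "VMX in use, mitigation active" ;;
        "KVM: Vulnerable")
            echo "VMX in use, mitigation off (e.g. nx_huge_pages=off)" ;;
        *)
            echo "not covered here, see the kernel multihit documentation" ;;
    esac
}

f=/sys/devices/system/cpu/vulnerabilities/itlb_multihit
if [ -r "$f" ]; then
    explain_itlb "$(cat "$f")"
fi
```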
Comment 8 ell1e 2024-01-24 18:15:47 UTC
I can confirm the output makes more sense once a VM is actually launched: it then lists the mitigations as expected. Here's the output after I launched a VM:

$ lscpu
Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         39 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  4
  On-line CPU(s) list:   0,1
  Off-line CPU(s) list:  2,3
Vendor ID:               GenuineIntel
  Model name:            Intel(R) Core(TM) i5-6200U CPU @ 2.30GHz
    CPU family:          6
    Model:               78
    Thread(s) per core:  1
    Core(s) per socket:  2
    Socket(s):           1
    Stepping:            3
    CPU(s) scaling MHz:  96%
    CPU max MHz:         2800.0000
    CPU min MHz:         0.0000
    BogoMIPS:            4801.00
    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mc
                         a cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss 
                         ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art
                          arch_perfmon pebs bts rep_good nopl xtopology nonstop_
                         tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cp
                         l vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1
                          sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsav
                         e avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault
                          epb pti ssbd ibrs ibpb stibp tpr_shadow flexpriority e
                         pt vpid ept_ad fsgsbase tsc_adjust sgx bmi1 avx2 smep b
                         mi2 erms invpcid mpx rdseed adx smap clflushopt intel_p
                         t xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pt
                         s hwp hwp_notify hwp_act_window hwp_epp vnmi md_clear f
                         lush_l1d arch_capabilities
Virtualization features: 
  Virtualization:        VT-x
Caches (sum of all):     
  L1d:                   64 KiB (2 instances)
  L1i:                   64 KiB (2 instances)
  L2:                    512 KiB (2 instances)
  L3:                    3 MiB (1 instance)
NUMA:                    
  NUMA node(s):          1
  NUMA node0 CPU(s):     0,1
Vulnerabilities:         
  Gather data sampling:  Vulnerable: No microcode
  Itlb multihit:         KVM: Mitigation: Split huge pages
  L1tf:                  Mitigation; PTE Inversion; VMX conditional cache flushe
                         s, SMT disabled
  Mds:                   Mitigation; Clear CPU buffers; SMT disabled
  Meltdown:              Mitigation; PTI
  Mmio stale data:       Mitigation; Clear CPU buffers; SMT disabled
  Retbleed:              Mitigation; IBRS
  Spec rstack overflow:  Not affected
  Spec store bypass:     Mitigation; Speculative Store Bypass disabled via prctl
  Spectre v1:            Mitigation; usercopy/swapgs barriers and __user pointer
                          sanitization
  Spectre v2:            Mitigation; IBRS, IBPB conditional, RSB filling, PBRSB-
                         eIBRS Not affected
  Srbds:                 Mitigation; Microcode
  Tsx async abort:       Not affected

There are two points that still confuse me:

1. The "gather data sampling" line looks surprising to me as well. Doesn't openSUSE ship microcode updates, or are they lagging behind a little? Or did Intel just not care to address it?

2. Regarding this:

   > I agree it's a bit confusing, but that's how it is right now. If we want it 
   > differently, this should be changed in the kernel.

   I think a less confusing wording would be "VMX unused" instead of "VMX
   disabled", while no VM is running and no mitigation active.
Comment 9 Dario Faggioli 2024-01-29 11:30:47 UTC
(In reply to ell1e from comment #8)
> There are two points that still confuse me:
> 
> 1. The "gather data sampling" line looks surprising to me as well. Doesn't
> openSUSE ship microcode updates, or are they lagging behind a little? Or did
> Intel just not care to address it?
> 
We do ship microcode updates and, while it's always possible that some lagging behind would happen, we usually manage to stay pretty current.

In fact, on this box, I see:

> lscpu | grep sampling
> Vulnerability Gather data sampling: Mitigation; Microcode

So, it seems it's actually mitigated in microcode for me. Maybe there's no microcode (from Intel, I mean) for your CPU yet?

I have:

>  Model name:            Intel(R) Xeon(R) W-2125 CPU @ 4.00GHz
>    BIOS Model name:     Intel(R) Xeon(R) W-2125 CPU @ 4.00GHz  CPU @ 4.0GHz
>    BIOS CPU family:     179
>    CPU family:          6
>    Model:               85

And:

> cat /proc/cpuinfo | grep -i microcode | uniq
> microcode	: 0x2007006
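
For comparing machines, the two values above can be collected in one go. A minimal sketch, assuming the kernel is recent enough to expose the gather_data_sampling file in sysfs; `microcode_revs` is a made-up helper name:

```sh
#!/bin/sh
# microcode_revs [FILE]: print the distinct microcode revisions listed in FILE.
# FILE defaults to /proc/cpuinfo. (Hypothetical helper.)
microcode_revs() {
    awk -F': ' '/^microcode/ { print $2 }' "${1:-/proc/cpuinfo}" | sort -u
}

echo "microcode: $(microcode_revs 2>/dev/null)"

gds=/sys/devices/system/cpu/vulnerabilities/gather_data_sampling
if [ -r "$gds" ]; then
    echo "gather data sampling: $(cat "$gds")"
fi
```

Note that a loaded microcode revision alone does not say whether a fix for a given CPU model exists; the status line is what the kernel derives from it.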

>    > I agree it's a bit confusing, but that's how it is right now. If we want it
>    > differently, this should be changed in the kernel.
> 
>    I think a less confusing wording would be "VMX unused" instead of "VMX
>    disabled", while no VM is running and no mitigation active.
>
Yeah, I do agree. But we need someone to write a kernel patch for that (which I wish I had the time for, but that's not the case right now :-( ).
Comment 10 Dario Faggioli 2024-01-29 11:32:08 UTC
(In reply to Dario Faggioli from comment #9)
> In fact, on this box, I see:
> 
> > lscpu | grep sampling
> > Vulnerability Gather data sampling: Mitigation; Microcode
> 
> So, it seems it's actually mitigated in microcode for me. Maybe there's no
> microcode (from Intel, I mean) for your CPU yet?
> 
> I have:
> 
> >  Model name:            Intel(R) Xeon(R) W-2125 CPU @ 4.00GHz
> >    BIOS Model name:     Intel(R) Xeon(R) W-2125 CPU @ 4.00GHz  CPU @ 4.0GHz
> >    BIOS CPU family:     179
> >    CPU family:          6
> >    Model:               85
> 
> And:
> 
> > cat /proc/cpuinfo | grep -i microcode | uniq
> > microcode	: 0x2007006
> 
Oh, by the way, just for the record, I am on Tumbleweed.
Comment 11 ell1e 2024-02-05 15:00:31 UTC
>  Maybe there's no microcode (from Intel, I mean) for your CPU yet?

Can you elaborate on the "yet", is there info whether one will be available?

I tried to contact Intel about this over the past few days, but their support seems to think I'm asking some warranty question and, due to the CPU's age, refuses to answer. I was also sent to the "OS developer", which would be here I guess, but I assume only Intel can provide these microcode updates, so that seems nonsensical to me. Maybe I got their answers wrong, but it didn't really seem like Intel support understood the workflow with microcode updates on Linux. In any case, I would be curious to know if anyone has any insight on this. (And my apologies, I know this would be more of a question for Intel, but oh well.)
Comment 12 ell1e 2024-02-05 15:07:57 UTC
(I guess it makes sense to close this for now, since the VMX mitigations seem to work, and the initial bug was about them possibly not working.)
Comment 13 ell1e 2024-02-06 17:15:22 UTC
After more poking, I actually managed to get a concrete response from Intel. I was told that if you have a Skylake Core i5-6200U, apparently there won't be a fix, and from my understanding they expect you to either accept that or literally throw away your working hardware. Good on the kernel for being transparent here.