Bug 1224574 (CVE-2024-35980)

Summary: VUL-0: CVE-2024-35980: kernel: arm64: tlb: Fix TLBI RANGE operand
Product: [Novell Products] SUSE Security Incidents Reporter: SMASH SMASH <smash_bz>
Component: IncidentsAssignee: Security Team bot <security-team>
Status: RESOLVED FIXED QA Contact: Security Team bot <security-team>
Severity: Normal    
Priority: P3 - Medium CC: gianluca.gabrielli, ivan.ivanov, mhocko
Version: unspecified   
Target Milestone: ---   
Hardware: Other   
OS: Other   
URL: https://smash.suse.de/issue/406709/
Whiteboard:
Found By: Security Response Team Services Priority:
Business Priority: Blocker: ---
Marketing QA Status: --- IT Deployment: ---

Description SMASH SMASH 2024-05-20 14:39:53 UTC
In the Linux kernel, the following vulnerability has been resolved:

arm64: tlb: Fix TLBI RANGE operand

KVM/arm64 relies on TLBI RANGE feature to flush TLBs when the dirty
pages are collected by VMM and the page table entries become write
protected during live migration. Unfortunately, the operand passed
to the TLBI RANGE instruction isn't correctly sorted out due to the
commit 117940aa6e5f ("KVM: arm64: Define kvm_tlb_flush_vmid_range()").
It leads to crash on the destination VM after live migration because
TLBs aren't flushed completely and some of the dirty pages are missed.

For example, I have a VM where 8GB memory is assigned, starting from
0x40000000 (1GB). Note that the host has 4KB as the base page size.
In the middile of migration, kvm_tlb_flush_vmid_range() is executed
to flush TLBs. It passes MAX_TLBI_RANGE_PAGES as the argument to
__kvm_tlb_flush_vmid_range() and __flush_s2_tlb_range_op(). SCALE#3
and NUM#31, corresponding to MAX_TLBI_RANGE_PAGES, isn't supported
by __TLBI_RANGE_NUM(). In this specific case, -1 has been returned
from __TLBI_RANGE_NUM() for SCALE#3/2/1/0 and rejected by the loop
in the __flush_tlb_range_op() until the variable @scale underflows
and becomes -9, 0xffff708000040000 is set as the operand. The operand
is wrong since it's sorted out by __TLBI_VADDR_RANGE() according to
invalid @scale and @num.

Fix it by extending __TLBI_RANGE_NUM() to support the combination of
SCALE#3 and NUM#31. With the changes, [-1 31] instead of [-1 30] can
be returned from the macro, meaning the TLBs for 0x200000 pages in the
above example can be flushed in one shoot with SCALE#3 and NUM#31. The
macro TLBI_RANGE_MASK is dropped since no one uses it any more. The
comments are also adjusted accordingly.

References:
http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2024-35980
https://www.cve.org/CVERecord?id=CVE-2024-35980
https://git.kernel.org/stable/c/944db7b536baaf49d7e576af36a94f4719552b07
https://git.kernel.org/stable/c/ac4ad513de4fba18b4ac0ace132777d0910e8cfa
https://git.kernel.org/stable/c/e3ba51ab24fddef79fc212f9840de54db8fd1685
https://git.kernel.org/pub/scm/linux/security/vulns.git/plain/cve/published/2024/CVE-2024-35980.mbox
Comment 2 Michal Hocko 2024-05-23 15:27:26 UTC
none of our kernels is affected.
Comment 3 Andrea Mattiazzo 2024-06-12 07:38:58 UTC
All done, closing.