Bug 1227942 (CVE-2022-48802)

Summary: VUL-0: CVE-2022-48802: kernel: fs/proc: task_mmu.c: don't read mapcount for migration entry
Product: [Novell Products] SUSE Security Incidents Reporter: SMASH SMASH <smash_bz>
Component: IncidentsAssignee: Kernel Bugs <kernel-bugs>
Status: NEW --- QA Contact: Security Team bot <security-team>
Severity: Normal    
Priority: P3 - Medium CC: gianluca.gabrielli
Version: unspecified   
Target Milestone: ---   
Hardware: Other   
OS: Other   
URL: https://smash.suse.de/issue/414206/
Whiteboard:
Found By: Security Response Team Services Priority:
Business Priority: Blocker: ---
Marketing QA Status: --- IT Deployment: ---

Description SMASH SMASH 2024-07-16 13:43:40 UTC
Description
===========

In the Linux kernel, the following vulnerability has been resolved:

fs/proc: task_mmu.c: don't read mapcount for migration entry

The syzbot reported the below BUG:

  kernel BUG at include/linux/page-flags.h:785!
  invalid opcode: 0000 [#1] PREEMPT SMP KASAN
  CPU: 1 PID: 4392 Comm: syz-executor560 Not tainted 5.16.0-rc6-syzkaller #0
  Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
  RIP: 0010:PageDoubleMap include/linux/page-flags.h:785 [inline]
  RIP: 0010:__page_mapcount+0x2d2/0x350 mm/util.c:744
  Call Trace:
    page_mapcount include/linux/mm.h:837 [inline]
    smaps_account+0x470/0xb10 fs/proc/task_mmu.c:466
    smaps_pte_entry fs/proc/task_mmu.c:538 [inline]
    smaps_pte_range+0x611/0x1250 fs/proc/task_mmu.c:601
    walk_pmd_range mm/pagewalk.c:128 [inline]
    walk_pud_range mm/pagewalk.c:205 [inline]
    walk_p4d_range mm/pagewalk.c:240 [inline]
    walk_pgd_range mm/pagewalk.c:277 [inline]
    __walk_page_range+0xe23/0x1ea0 mm/pagewalk.c:379
    walk_page_vma+0x277/0x350 mm/pagewalk.c:530
    smap_gather_stats.part.0+0x148/0x260 fs/proc/task_mmu.c:768
    smap_gather_stats fs/proc/task_mmu.c:741 [inline]
    show_smap+0xc6/0x440 fs/proc/task_mmu.c:822
    seq_read_iter+0xbb0/0x1240 fs/seq_file.c:272
    seq_read+0x3e0/0x5b0 fs/seq_file.c:162
    vfs_read+0x1b5/0x600 fs/read_write.c:479
    ksys_read+0x12d/0x250 fs/read_write.c:619
    do_syscall_x64 arch/x86/entry/common.c:50 [inline]
    do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
    entry_SYSCALL_64_after_hwframe+0x44/0xae

The reproducer was trying to read /proc/$PID/smaps when calling
MADV_FREE at the mean time.  MADV_FREE may split THPs if it is called
for partial THP.  It may trigger the below race:

           CPU A                         CPU B
           -----                         -----
  smaps walk:                      MADV_FREE:
  page_mapcount()
    PageCompound()
                                   split_huge_page()
    page = compound_head(page)
    PageDoubleMap(page)

When calling PageDoubleMap() this page is not a tail page of THP anymore
so the BUG is triggered.

This could be fixed by elevated refcount of the page before calling
mapcount, but that would prevent it from counting migration entries, and
it seems overkilling because the race just could happen when PMD is
split so all PTE entries of tail pages are actually migration entries,
and smaps_account() does treat migration entries as mapcount == 1 as
Kirill pointed out.

Add a new parameter for smaps_account() to tell this entry is migration
entry then skip calling page_mapcount().  Don't skip getting mapcount
for device private entries since they do track references with mapcount.

Pagemap also has the similar issue although it was not reported.  Fixed
it as well.

[shy828301@gmail.com: v4]
  Link: https://lkml.kernel.org/r/20220203182641.824731-1-shy828301@gmail.com
[nathan@kernel.org: avoid unused variable warning in pagemap_pmd_range()]
  Link: https://lkml.kernel.org/r/20220207171049.1102239-1-nathan@kernel.org

The Linux kernel CVE team has assigned CVE-2022-48802 to this issue.


Affected and fixed versions
===========================

	Issue introduced in 4.5 with commit e9b61f19858a and fixed in 5.10.102 with commit db3f3636e4ae
	Issue introduced in 4.5 with commit e9b61f19858a and fixed in 5.15.25 with commit a8dd0cfa3779
	Issue introduced in 4.5 with commit e9b61f19858a and fixed in 5.16.10 with commit 05d3f8045efa
	Issue introduced in 4.5 with commit e9b61f19858a and fixed in 5.17 with commit 24d7275ce279

Please see https://www.kernel.org for a full list of currently supported
kernel versions by the kernel community.

Unaffected versions might change over time as fixes are backported to
older supported kernel versions.  The official CVE entry at
	https://cve.org/CVERecord/?id=CVE-2022-48802
will be updated if fixes are backported, please check that for the most
up to date information about this issue.


Affected files
==============

The file(s) affected by this issue are:
	fs/proc/task_mmu.c


Mitigation
==========

The Linux kernel CVE team recommends that you update to the latest
stable kernel version for this, and many other bugfixes.  Individual
changes are never tested alone, but rather are part of a larger kernel
release.  Cherry-picking individual commits is not recommended or
supported by the Linux kernel community at all.  If however, updating to
the latest release is impossible, the individual changes to resolve this
issue can be found at these commits:
	https://git.kernel.org/stable/c/db3f3636e4aed2cba3e4e7897a053323f7a62249
	https://git.kernel.org/stable/c/a8dd0cfa37792863b6c4bf9542975212a6715d49
	https://git.kernel.org/stable/c/05d3f8045efa59457b323caf00bdb9273b7962fa
	https://git.kernel.org/stable/c/24d7275ce2791829953ed4e72f68277ceb2571c6

References:
http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2022-48802
https://git.kernel.org/pub/scm/linux/security/vulns.git/plain/cve/published/2022/CVE-2022-48802.mbox
https://git.kernel.org/stable/c/db3f3636e4aed2cba3e4e7897a053323f7a62249
https://git.kernel.org/stable/c/a8dd0cfa37792863b6c4bf9542975212a6715d49
https://git.kernel.org/stable/c/05d3f8045efa59457b323caf00bdb9273b7962fa
https://git.kernel.org/stable/c/24d7275ce2791829953ed4e72f68277ceb2571c6