Bug 1219593 - CONFIG_READ_ONLY_THP_FOR_FS enabled on Leap/SLE, but not on Tumbleweed
Status: RESOLVED FIXED
Alias: None
Product: openSUSE Tumbleweed
Classification: openSUSE
Component: Kernel
Version: Current
Hardware: Other openSUSE Tumbleweed
Importance: P5 - None : Enhancement
Target Milestone: ---
Assignee: Vlastimil Babka
QA Contact: E-mail List
Reported: 2024-02-05 23:57 UTC by Aaron Puchert
Modified: 2024-03-23 18:29 UTC

Description Aaron Puchert 2024-02-05 23:57:16 UTC
While I'm aware that CONFIG_READ_ONLY_THP_FOR_FS is still experimental, I found it odd that it seems to be enabled on Leap/SLE kernels but not on Tumbleweed (6.7.2-1-default). Is there a reason for that, or is it just an oversight?

My reason for asking is that I'd like to try out section alignment for LLVM, since some have reported noticeable performance gains from mapping code into huge pages. (See https://easyperf.net/blog/2022/09/01/Utilizing-Huge-Pages-For-Code. LLVM has a large .text and tends to jump around wildly, leaving a trail of L1 TLB misses that can significantly slow it down. And it's large enough that the alignment shouldn't increase its size too much.)

I'd expect that some other large binaries, such as browsers, could also benefit from this. Since they tend to be long-running, we wouldn't even need MADV_COLLAPSE and could just wait for the kernel to replace the pages with THPs automatically.
Comment 1 Takashi Iwai 2024-02-06 16:47:21 UTC
The option was disabled on SLE15-SP4, too.  It caused a regression that led to file corruption when building packages (a SLE15-SP4 bug, bsc#1195774).

So, disabling CONFIG_READ_ONLY_THP_FOR_FS is intentional, and the fact that it is still enabled on SLE15-SP5 was rather an oversight.  The config change should have been carried over from SP4, but apparently that didn't happen...
Comment 2 Vlastimil Babka 2024-02-08 10:59:04 UTC
While SP5 appears as enabled, it's effectively disabled by a patch, see bug 1195774 comment 58.

Whether to enable it for Tumbleweed is thus a decision independent of SLES/Leap at this moment. I think with all that's going on upstream currently towards large folios support, it would make sense. CCing Mel and Michal.
Comment 3 Michal Hocko 2024-02-09 08:29:22 UTC
(In reply to Vlastimil Babka from comment #2)
> While SP5 appears as enabled, it's effectively disabled by a patch, see bug
> 1195774 comment 58.
> 
> Whether to enable it for Tumbleweed is thus a decision independent of
> SLES/Leap at this moment. I think with all that's going on upstream
> currently towards large folios support, it would make sense. CCing Mel and
> Michal.

No strong objection from me. The code should be more mature now.
Comment 4 Mel Gorman 2024-02-12 11:21:01 UTC
(In reply to Michal Hocko from comment #3)
> (In reply to Vlastimil Babka from comment #2)
> > While SP5 appears as enabled, it's effectively disabled by a patch, see bug
> > 1195774 comment 58.
> > 
> > Whether to enable it for Tumbleweed is thus a decision independent of
> > SLES/Leap at this moment. I think with all that's going on upstream
> > currently towards large folios support, it would make sense. CCing Mel and
> > Michal.
> 
> No strong objection from me. The code should be more mature now.

Agreed.
Comment 5 Vlastimil Babka 2024-02-12 11:22:44 UTC
Thanks, will do then.
Comment 6 Aaron Puchert 2024-03-23 18:29:43 UTC
Accidentally stumbled upon this today:

/usr/lib/modules/6.8.1-1-default/config:CONFIG_READ_ONLY_THP_FOR_FS=y
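
The line above reads like grep output over the installed kernel config. As a minimal sketch of the same check in Python (the helper names `config_value` and `installed_config_path` are hypothetical, and the path layout is simply taken from the output above, so it may differ on other distributions):

```python
import os


def config_value(config_text, option):
    """Return the value assigned to `option` in kernel config text, or None."""
    prefix = option + "="
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith(prefix):
            return line[len(prefix):]
    # Disabled options only show up as "# CONFIG_FOO is not set" comments.
    return None


def installed_config_path():
    # Assumption: same layout as in the grep output quoted in this comment.
    return f"/usr/lib/modules/{os.uname().release}/config"
```

Reading the file at `installed_config_path()` and passing its contents to `config_value(text, "CONFIG_READ_ONLY_THP_FOR_FS")` should give `"y"` on a kernel like the one quoted here, and `None` where the option is not set.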

And it does seem to work:

7f00ff200000-7f0102826000 r--p 00000000 fe:00 1723864 /usr/lib64/libLLVM.so.18.1
Size:              55448 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB
Rss:               12160 kB
[...]
FilePmdMapped:      8192 kB
[...]
THPeligible:           1

Apparently the base address is already aligned, which makes the initial segment eligible without any section alignment. I'm not aware of the dynamic loader aligning anything without section alignment, so it must be the kernel. Pretty nice!
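
The fields quoted above come from /proc/<pid>/smaps. For anyone who wants to collect them for every mapping of a process, here is a rough sketch of a parser (the function names `parse_smaps` and `thp_summary` are hypothetical, and only numeric fields such as Size, FilePmdMapped and THPeligible are kept):

```python
import re

# Mapping header, e.g. "7f00ff200000-7f0102826000 r--p 00000000 fe:00 1723864 /path"
HEADER = re.compile(r"^([0-9a-f]+)-([0-9a-f]+) (\S+) ([0-9a-f]+) (\S+) (\d+)\s*(.*)$")
# Numeric field, e.g. "FilePmdMapped:      8192 kB" or "THPeligible:           1"
FIELD = re.compile(r"^(\w+):\s+(\d+)(?: kB)?$")


def parse_smaps(text):
    """Split smaps text into one dict per mapping, with its numeric fields."""
    mappings = []
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            start, end, perms, offset, _dev, _inode, path = header.groups()
            mappings.append({
                "start": int(start, 16),
                "end": int(end, 16),
                "perms": perms,
                "offset": int(offset, 16),
                "path": path,
                "fields": {},
            })
            continue
        field = FIELD.match(line.strip())
        if field and mappings:
            mappings[-1]["fields"][field.group(1)] = int(field.group(2))
    return mappings


def thp_summary(text):
    """Return (path, perms, FilePmdMapped in kB, THPeligible) per mapping."""
    return [
        (m["path"], m["perms"],
         m["fields"].get("FilePmdMapped", 0), m["fields"].get("THPeligible"))
        for m in parse_smaps(text)
    ]
```

Feeding it the contents of /proc/self/smaps (or another process's file, given sufficient permissions) gives a quick overview of which file mappings actually ended up PMD-mapped.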

The second segment is not aligned and thus not eligible:

7f0102826000-7f010623c000 r-xp 03625000 fe:00 1723864 /usr/lib64/libLLVM.so.18.1
[...]
THPeligible:           0

This layout is specific to ld.lld. Binaries produced by ld.bfd have their guard page after the executable segment due to a different section layout:

7f0118000000-7f0118064000 r--p 00000000 fe:00 2230928 /usr/lib64/firefox/libxul.so
Size:                400 kB
[...]
THPeligible:           0
7f0118064000-7f011d9ec000 r-xp 00064000 fe:00 2230928 /usr/lib64/firefox/libxul.so
Size:              91680 kB
[...]
FilePmdMapped:     10240 kB
[...]
THPeligible:           1
7f011d9ec000-7f01200d6000 r--p 059ec000 fe:00 2230928 /usr/lib64/firefox/libxul.so
Size:              39848 kB
[...]
FilePmdMapped:      8192 kB
[...]
THPeligible:           1
7f01200d6000-7f01205aa000 r--p 080d5000 fe:00 2230928 /usr/lib64/firefox/libxul.so
Size:               4944 kB
[...]
THPeligible:           0

The first segment is correctly aligned but still not eligible, probably because it's too small (only 400 kB). The next two segments (code and read-only data) are eligible and have huge pages. Then comes the unaligned relocation segment, but that is dirty and thus probably not eligible anyway.
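
The alignment reasoning here can be written down as a small check. This is only a heuristic sketch, not the kernel's actual eligibility logic. It assumes a 2 MiB PMD size (as on x86-64), and the name `looks_thp_suitable` is made up for illustration. It treats a file mapping as suitable when its virtual address is congruent to its file offset modulo the huge page size and the range contains at least one fully aligned huge page. It ignores everything else, such as dirtiness or whether the file is open for writing.

```python
HPAGE_PMD_SIZE = 2 * 1024 * 1024  # assumption: 2 MiB PMD-sized huge pages


def looks_thp_suitable(start, end, offset, hpage=HPAGE_PMD_SIZE):
    """Heuristic: could [start, end) at file `offset` hold a huge page?"""
    # Virtual address and file offset must agree modulo the huge page size.
    if (start - offset) % hpage != 0:
        return False
    # Round start up to the next huge page boundary ...
    first = -(-start // hpage) * hpage
    # ... and require one whole huge page to fit before the end.
    return first + hpage <= end


# The libxul.so mappings quoted above, as (start, end, file offset):
segments = [
    (0x7f0118000000, 0x7f0118064000, 0x00000000),
    (0x7f0118064000, 0x7f011d9ec000, 0x00064000),
    (0x7f011d9ec000, 0x7f01200d6000, 0x059ec000),
    (0x7f01200d6000, 0x7f01205aa000, 0x080d5000),
]
results = [looks_thp_suitable(*seg) for seg in segments]
```

For the four libxul.so mappings this gives False, True, True, False, which agrees with the THPeligible values shown, and it also agrees with the two libLLVM.so.18.1 mappings. That match does not prove the kernel applies exactly this rule.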

I haven't done any performance measurements yet, but having it widely used without any intervention is already pretty nice.

Thanks!