Bug 1217030

Summary: can't set 'lz4' to /sys/block/zram0/comp_algorithm
Product: [openSUSE] PUBLIC SUSE Linux Enterprise Server 15 SP6
Reporter: WEI GAO <wegao>
Component: Kernel
Assignee: Kernel Bugs <kernel-bugs>
Status: VERIFIED FIXED
QA Contact:
Severity: Normal
Priority: P2 - High
CC: jan.stehlik, martin.doucha, petr.vorel, rtsvetkov, tiwai, wegao
Version: unspecified
Flags: jan.stehlik: SHIP_STOPPER+
Target Milestone: ---
Hardware: Other
OS: Other
URL: https://openqa.suse.de/tests/12774874/modules/zram01/steps/7
See Also: https://bugzilla.suse.com/show_bug.cgi?id=1207933
Whiteboard:
Found By: openQA
Services Priority:
Business Priority:
Blocker: Yes
Marketing QA Status: ---
IT Deployment: ---

Description WEI GAO 2023-11-10 13:46:39 UTC
Failed openQA case:
openQA test in scenario sle-15-SP6-Online-x86_64-ltp_kernel_misc@64bit fails in
[zram01](https://openqa.suse.de/tests/12774874/modules/zram01/steps/7)


ver: 6.4.0-150600.1-default

Manual reproduce:

modprobe zram num_devices=7
cat /sys/block/zram0/comp_algorithm
lzo [lzo-rle] lz4 lz4hc 842 zstd
echo lz4 > /sys/block/zram0/comp_algorithm
-bash: echo: write error: Invalid argument  <<<<<<<<


Test case failed log:
zram01 2 TINFO: test that we can set compression algorithm
zram01 2 TINFO: supported algs: lzo lzo-rle lz4 lz4hc 842 zstd 
zram01 2 TINFO: /sys/block/zram0/comp_algorithm = 'lzo'
zram01 2 TINFO: /sys/block/zram0/comp_algorithm = 'lzo-rle'
/opt/ltp/testcases/bin/zram_lib.sh: line 143: echo: write error: Invalid argument
zram01 2 TFAIL: can't set 'lz4' to /sys/block/zram0/comp_algorithm
Comment 1 WEI GAO 2023-11-10 13:50:39 UTC
On Power, the same test shows that lz4hc cannot be set.
https://openqa.suse.de/tests/12789340#step/zram01/6

zram01 1 TINFO: timeout per run is 3h 30m 0s
zram01 1 TINFO: create '7' zram device(s)
zram01 1 TPASS: all zram devices (/dev/zram0~6) successfully created
zram01 1 TCONF: The device attribute max_comp_streams was introduced in kernel 3.15 and deprecated in 4.7
zram01 2 TINFO: test that we can set compression algorithm
zram01 2 TINFO: supported algs: lzo lzo-rle lz4hc 842 zstd 
zram01 2 TINFO: /sys/block/zram0/comp_algorithm = 'lzo'
zram01 2 TINFO: /sys/block/zram0/comp_algorithm = 'lzo-rle'
/opt/ltp/testcases/bin/zram_lib.sh: line 143: echo: write error: Invalid argument
zram01 2 TFAIL: can't set 'lz4hc' to /sys/block/zram0/comp_algorithm
Comment 2 Takashi Iwai 2023-11-10 14:02:51 UTC
Is it a regression from SLE15-SP5?

The lz4 module is found in kernel-default-optional, hence it's available only for Leap.

Similarly, lz4hc is included in kernel-default-extra.

The situation above doesn't change since SP5.
Comment 3 WEI GAO 2023-11-11 01:02:33 UTC
(In reply to Takashi Iwai from comment #2)
> Is it a regression from SLE15-SP5?
> 
> The lz4 module is found in kernel-default-optional, hence it's available
> only for Leap.
> 
> Similarly, lz4hc is included in kernel-default-extra.
> 
> The situation above doesn't change since SP5.

I cannot find an old result for SLE15-SP5 in the current openQA, so I need to install it manually in a local environment. I can do this later.

The result was good on 15-SP6 build 26.14 (28 days ago); the issue appears in later builds.
sle-15-SP6-Online-x86_64-Build26.14-ltp_kernel_misc@64bit
https://openqa.suse.de/tests/12506466#step/zram01/6

The test first reads the list of supported algorithms and then tries to set each one.
I think that if lz4 is not installed on the system, it should not appear in the output of the following command:
cat /sys/block/zram0/comp_algorithm
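
The test logic described above can be sketched roughly as follows. This is only an illustration, not the actual code from zram_lib.sh; the function name try_all_algs is hypothetical, and it assumes the attribute file is passed as an argument (normally /sys/block/zram0/comp_algorithm, as root, with the zram module loaded).

```shell
#!/bin/sh
# Try every algorithm listed in a comp_algorithm attribute.
# ATTR is normally /sys/block/zram0/comp_algorithm (assumption: the zram
# module is loaded and the caller is root).
try_all_algs() {
    attr=$1
    failed=0
    # The active algorithm is shown in brackets; strip them to get names.
    algs=$(tr -d '[]' < "$attr")
    for alg in $algs; do
        # Group the write so that its error message can be silenced.
        if { echo "$alg" > "$attr"; } 2>/dev/null; then
            echo "set '$alg'"
        else
            echo "can't set '$alg'"
            failed=1
        fi
    done
    return "$failed"
}
```

The list is read once before the loop, so an algorithm that is advertised but rejected on write is reported as "can't set", which is the failure seen here.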
Comment 4 Takashi Iwai 2023-11-13 09:36:09 UTC
AFAIK, it's a sort of behavior fix (improvement) in the recent kernel.

On SLE15-SP5, it was possible to set the compression algorithm even if the module wasn't present.  E.g. lz4hc is shown in the list because it was enabled at build time, but the modules are included in kernel-default-extra and kernel-default-optional, hence they are actually unusable when you install only kernel-default.  When the modules are available, they are (auto-)loaded at the time of zram swap device creation, not at the time of the sysfs write.

OTOH, on SLE15-SP6, the write to the sysfs actually tries to switch the algorithm and load the needed modules via crypto API.  When the modules aren't available, it gives the error now, instead of at the time of swap device creation.

So, the question is whether you really could create a zram swap device with the given algorithm.  If it used to work only with kernel-default and lz4hc, something must be wrong.

Meanwhile, we may "fix" this bug by moving the corresponding crypto compression modules to kernel-default.  I believe it's a sensible move.
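
To see whether the modules behind an algorithm are installed at all, something like the sketch below might help. It relies on modinfo returning non-zero when no module is found for the running kernel. The function names are hypothetical; lz4 and lz4_compress are the module names mentioned in comment 15, while the lz4hc names are assumed by analogy.

```shell
#!/bin/sh
# Report whether a kernel module can be found for the running kernel.
# modinfo exits non-zero when no matching module is installed.
have_module() {
    modinfo "$1" >/dev/null 2>&1
}

# Module names assumed for each algorithm: lz4 and lz4_compress are named
# in comment 15; the lz4hc names are an assumption by analogy.
check_alg_modules() {
    case $1 in
        lz4)   mods="lz4 lz4_compress" ;;
        lz4hc) mods="lz4hc lz4hc_compress" ;;
        *)     mods=$1 ;;
    esac
    for m in $mods; do
        if have_module "$m"; then
            echo "$m: available"
        else
            echo "$m: missing"
        fi
    done
}
```

For example, `check_alg_modules lz4` prints one line per module, so a system with only kernel-default installed would be expected to report both as missing.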
Comment 5 Martin Doucha 2023-11-13 10:02:39 UTC
A similar problem was reported and fixed on SLE-15SP4 as bug 1207933.

(In reply to Takashi Iwai from comment #2)
> Is it a regression from SLE15-SP5?

Yes, the lz4 algorithm can be successfully set to zram devices on the latest SLE-15SP5 kernel update and KOTD:
https://openqa.suse.de/tests/12745388#step/zram01/8
https://openqa.suse.de/tests/12743904#step/zram01/8

> The lz4 module is found in kernel-default-optional, hence it's available
> only for Leap.
> 
> Similarly, lz4hc is included in kernel-default-extra.

kernel-default-extra is installed on the test VM:
https://openqa.suse.de/tests/12774874#step/boot_ltp/80
Comment 6 Takashi Iwai 2023-11-13 10:06:12 UTC
(In reply to Martin Doucha from comment #5)
> A similar problem was reported and fixed on SLE-15SP4 as bug 1207933.
> 
> (In reply to Takashi Iwai from comment #2)
> > Is it a regression from SLE15-SP5?
> 
> Yes, the lz4 algorithm can be successfully set to zram devices on the latest
> SLE-15SP5 kernel update and KOTD:
> https://openqa.suse.de/tests/12745388#step/zram01/8
> https://openqa.suse.de/tests/12743904#step/zram01/8

And, the swap device can be created with lz4 on SLE15-SP5?
It should fail unless the corresponding crypto module is present.
Comment 7 Martin Doucha 2023-11-13 10:21:43 UTC
(In reply to Takashi Iwai from comment #6)
> (In reply to Martin Doucha from comment #5)
> > A similar problem was reported and fixed on SLE-15SP4 as bug 1207933.
> > 
> > (In reply to Takashi Iwai from comment #2)
> > > Is it a regression from SLE15-SP5?
> > 
> > Yes, the lz4 algorithm can be successfully set to zram devices on the latest
> > SLE-15SP5 kernel update and KOTD:
> > https://openqa.suse.de/tests/12745388#step/zram01/8
> > https://openqa.suse.de/tests/12743904#step/zram01/8
> 
> And, the swap device can be created with lz4 on SLE15-SP5?
> It should fail unless the corresponding crypto module is present.

Test output from https://openqa.suse.de/tests/12745388#step/zram01/8 (other algorithms omitted):

zram01 2 TINFO: test that we can set compression algorithm
zram01 2 TINFO: supported algs: lzo lzo-rle lz4 lz4hc 842 zstd 
zram01 2 TINFO: /sys/block/zram0/comp_algorithm = 'lz4'
zram01 2 TINFO: /sys/block/zram0/comp_algorithm = 'lz4hc'
zram01 2 TINFO: /sys/block/zram1/comp_algorithm = 'lz4'
zram01 2 TINFO: /sys/block/zram1/comp_algorithm = 'lz4hc'
zram01 2 TINFO: /sys/block/zram2/comp_algorithm = 'lz4'
zram01 2 TINFO: /sys/block/zram2/comp_algorithm = 'lz4hc'
zram01 2 TINFO: /sys/block/zram3/comp_algorithm = 'lz4'
zram01 2 TINFO: /sys/block/zram3/comp_algorithm = 'lz4hc'
zram01 2 TINFO: /sys/block/zram4/comp_algorithm = 'lz4'
zram01 2 TINFO: /sys/block/zram4/comp_algorithm = 'lz4hc'
zram01 2 TINFO: /sys/block/zram5/comp_algorithm = 'lz4'
zram01 2 TINFO: /sys/block/zram5/comp_algorithm = 'lz4hc'
zram01 2 TINFO: /sys/block/zram6/comp_algorithm = 'lz4'
zram01 2 TINFO: /sys/block/zram6/comp_algorithm = 'lz4hc'
zram01 2 TPASS: test succeeded
Comment 8 Takashi Iwai 2023-11-13 10:33:01 UTC
Weird.  If you don't install kernel-default-optional on SLE15-SP5, lz4 isn't available for zram, and setting the disksize or mkswap must fail.

That said, if it succeeded, something could be wrong in the test itself.
Comment 9 Martin Doucha 2023-11-13 10:50:04 UTC
(In reply to Takashi Iwai from comment #8)
> Weird.  If you don't install kernel-default-optional on SLE15-SP5, lz4 isn't
> available for zram, and setting the disksize or mkswap must fail.
> 
> That said, if it succeeded, something could be wrong in the test itself.

When the test sets the compression algorithm, the device doesn't have a disksize yet. The test first cycles all zram devices through all algorithms and leaves each one set to the last one, which is zstd. Then it assigns disksize and mem_limit and creates a filesystem. Would that explain why lz4 can be set on SLE-15SP5 even though the module is missing?
Comment 10 Takashi Iwai 2023-11-13 10:56:10 UTC
(In reply to Martin Doucha from comment #9)
> (In reply to Takashi Iwai from comment #8)
> > Weird.  If you don't install kernel-default-optional on SLE15-SP5, lz4 isn't
> > available for zram, and setting the disksize or mkswap must fail.
> > 
> > That said, if it succeeded, something could be wrong in the test itself.
> 
> When the test sets the compression algorithm, the device doesn't have
> disksize yet. The test will first cycle all zram devices through all
> algorithms and set everything to the last one, which is zstd. Then it
> assigns disksize and mem_limit and creates a filesystem. Would that explain
> why on SLE-15SP5, lz4 can be set even though the module is missing?

Yes.  As mentioned in comment 4, it's a sort of behavior improvement with the recent kernel; the availability of crypto backend is verified at the time of write to comp_algorithm.  On SLE15-SP5, it wasn't checked at switching comp_algorithm but the kernel spews errors at the actual disksize change.
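
A rough way to see at which step the failure surfaces is to apply the settings in the order from comment 9 and stop at the first write that is rejected. This is only a sketch; configure_zram is a hypothetical helper, and the device directory is passed in (normally /sys/block/zram0, as root).

```shell
#!/bin/sh
# Apply zram settings in the order comp_algorithm, disksize, mem_limit
# and report the first step that fails. SYSDIR is normally
# /sys/block/zram0 (assumption: the device exists and the caller is root).
configure_zram() {
    sysdir=$1 alg=$2 disksize=$3 mem_limit=$4
    for step in "comp_algorithm=$alg" "disksize=$disksize" "mem_limit=$mem_limit"; do
        attr=${step%%=*}
        value=${step#*=}
        # A rejected write makes echo (or the redirection) fail.
        if ! { echo "$value" > "$sysdir/$attr"; } 2>/dev/null; then
            echo "failed at: $attr"
            return 1
        fi
    done
    echo "configured"
}
```

Going by the explanation above, with a missing crypto module one would expect "failed at: comp_algorithm" on SLE15-SP6, whereas on SLE15-SP5 the first write would be accepted and the problem would only show up at the disksize step.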
Comment 11 Martin Doucha 2023-11-13 11:08:36 UTC
(In reply to Takashi Iwai from comment #10)
> Yes.  As mentioned in comment 4, it's a sort of behavior improvement with
> the recent kernel; the availability of crypto backend is verified at the
> time of write to comp_algorithm.  On SLE15-SP5, it wasn't checked at
> switching comp_algorithm but the kernel spews errors at the actual disksize
> change.

Maybe it'd be even better to load the compression modules on the first open() of comp_algorithm so that reading from it doesn't show unavailable algs?
Comment 12 Takashi Iwai 2023-11-13 11:16:30 UTC
OTOH, you'd like to see which algorithm can be potentially usable with the available module, too.

I'm inclined to make lz4-related modules available for the default SLE installation.  They are not included just because nobody has requested them, but a report like this indicates that the modules are actually in demand, and it would make more sense to include them in kernel-default.  LZ4-* may be used not only for zram but also for module, squashfs and pstore.
Comment 13 Martin Doucha 2023-11-13 11:26:56 UTC
Adding lz4 to kernel-default-extra would be sufficient. We install it in all tests.
Comment 14 Takashi Iwai 2023-11-13 11:29:31 UTC
OK, that sounds safer.
Comment 15 Takashi Iwai 2023-11-13 11:41:25 UTC
The fix has been pushed to the SLE15-SP6 PR.
lz4 and lz4_compress modules are included in kernel-default-extra.
Comment 29 WEI GAO 2024-01-25 02:58:44 UTC
The x86 case now passes.
https://openqa.suse.de/tests/13319824

For Power and aarch64, kernel-default-extra still does not seem to be installed in openQA, so the result still shows a failure. I suppose some fix is needed on the openQA side.
Comment 30 Martin Doucha 2024-01-29 15:54:50 UTC
It seems that kernel-default-extra belongs to SLE-Product-WE repo which only exists for x86_64.
Comment 31 Petr Vorel 2024-01-29 17:52:03 UTC
(In reply to Martin Doucha from comment #30)
> It seems that kernel-default-extra belongs to SLE-Product-WE repo which only
> exists for x86_64.

Yes. And also, because of JeOS everything basic should be in kernel-default-base (JeOS does not install kernel-default).
Comment 32 Takashi Iwai 2024-01-30 07:14:01 UTC
I think we can include the lz4 modules in kernel-default as mentioned in comment 12, then.  It'd also be useful for modules other than zram.

The handling in kernel-default-base is another story, though; you'd need to update the list of the included modules in addition to the update of kernel-default.
Comment 33 Takashi Iwai 2024-01-30 08:58:25 UTC
Now pushed the change.  I'm afraid that it slipped from beta3, though.
Comment 38 WEI GAO 2024-03-14 05:36:53 UTC
openQA now shows a passing result.