Bugzilla – Bug 1185570
New kernel breaks bcache mounted rootfs
Last modified: 2023-01-18 16:47:28 UTC
kernel-devel-5.12.0-1.2.noarch completely breaks the bcache-mounted rootfs. I verified that rolling back to 5.11.16 makes bcache work again.
Could you elaborate on what exactly is broken and how? In any case, adding the bcache maintainer to Cc.
Created attachment 848980
dmesg file carrying a single kernel oops

OK, sorry, but it is very difficult to get a complete kernel dump because the system starts printing kernel panics at full speed. Normally, if I boot into runlevel 1 the system seems to start correctly; as soon as I switch to another runlevel (e.g. 3) it starts printing kernel panics on the console and it is not possible to capture any information. One time, when I booted into runlevel 1, there was a kernel panic in the dmesg, which I am attaching (dmesg.gz).
Created attachment 848981
systemd-journal file showing a bunch of kernel oops while changing from runlevel 1 to runlevel 3

View it with journalctl --file=./system.journal
Thanks! The Oops in comment 2 indicates that bcache tries to call bio_alloc_bioset() with too many nr_vecs. In the 5.11.x kernel, bio_alloc_bioset() returned NULL in such a case without complaint, but now it hits BUG() and panics instead. The BUG() call is intentional, but it doesn't look like the most helpful way to handle it... The call comes via cached_dev_cache_miss(), which calculates nr_vecs as DIV_ROUND_UP(s->insert_bio_sectors, PAGE_SECTORS), and this is likely over BIO_MAX_VECS (=256). Dropping the BUG() call in bio.c should restore the old behavior (although there is still another WARN_ON()), but I suppose the real fix is needed on the caller side, in the bcache code.
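To make the arithmetic concrete, here is a minimal userspace sketch in C of the nr_vecs calculation described in this comment. It is not kernel code. The constants are assumptions for illustration: 512-byte sectors and 4 KiB pages (so PAGE_SECTORS is 8), with BIO_MAX_VECS set to the 256 quoted above. The helper names nr_vecs_for() and exceeds_bio_limit() are hypothetical and do not exist in bcache.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Assumed values for illustration: 4 KiB pages, 512-byte sectors. */
#define PAGE_SECTORS 8u    /* sectors per page (assumption) */
#define BIO_MAX_VECS 256u  /* limit quoted in this report */

/* Same rounding as the kernel's DIV_ROUND_UP() macro. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Hypothetical helper: pages (bio vecs) needed for a cache-miss read. */
static unsigned int nr_vecs_for(unsigned int insert_bio_sectors)
{
	/* Guard the rounding addition against unsigned wrap-around. */
	assert(insert_bio_sectors <= UINT_MAX - (PAGE_SECTORS - 1));
	return DIV_ROUND_UP(insert_bio_sectors, PAGE_SECTORS);
}

/* True when the request needs more vecs than a single bio can hold. */
static bool exceeds_bio_limit(unsigned int insert_bio_sectors)
{
	return nr_vecs_for(insert_bio_sectors) > BIO_MAX_VECS;
}
```

Under these assumptions a 1 MiB read (2048 sectors) needs exactly 256 vecs and still fits, while a read of 2049 sectors rounds up to 257 vecs and would exceed the limit.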
A test kernel with the BUG() call dropped is being built in the OBS home:tiwai:bsc1185570 repo. It will be available later at http://download.opensuse.org/repositories/home:/tiwai:/bsc1185570/standard/ Please give it a try once it is published. Note that the kernel will likely show a WARNING with a stack trace once in your case. That is expected behavior, and this kernel isn't meant to be the proper fix. The only point here is to check whether it can get past the BUG() call.
Created attachment 849061
dmesg booting from bsc1185570 kernel

As expected, I now get only a single kernel oops.
Thanks. It's not an Oops but a normal kernel warning with a stack trace, as expected. So far, so good. Usually this can be fixed by capping nr_iovecs via bio_max_segs(). But as I don't know the details of bcache, I am reassigning this bug to Coly.
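As a rough illustration of what "capping nr_iovecs" means, the following userspace sketch models the clamp. It assumes bio_max_segs() behaves as a simple minimum against the per-bio vec limit; bio_max_segs_model() is a hypothetical stand-in written for this note, not the kernel helper itself, and the value 256 is taken from the earlier comment.

```c
#define BIO_MAX_VECS 256u  /* per-bio vec limit quoted in this report */

/*
 * Userspace model of the clamp (assumption: the kernel helper is a
 * plain min() against the limit). Never request more vecs than one
 * bio can hold.
 */
static unsigned int bio_max_segs_model(unsigned int nr_segs)
{
	return nr_segs < BIO_MAX_VECS ? nr_segs : BIO_MAX_VECS;
}
```

Clamping the vec count only bounds the allocation. The caller still has to make sure the request it builds is no larger than the clamped bio can carry, which is why the bcache side needs its own change.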
(In reply to Takashi Iwai from comment #7)
> Thanks. It's no Oops but the normal kernel warning with stack trace, as
> expected. So far, so good.
>
> Usually this can be fixed by capping nr_iovecs via bio_max_segs(). But as I
> don't know the details of bcache, I reassign this bug to Coly.

I am back from the public holidays and am now looking into the bcache part.

Thanks.

Coly Li
(In reply to Coly Li from comment #8)
> (In reply to Takashi Iwai from comment #7)
> > Thanks. It's no Oops but the normal kernel warning with stack trace, as
> > expected. So far, so good.
> >
> > Usually this can be fixed by capping nr_iovecs via bio_max_segs(). But as I
> > don't know the details of bcache, I reassign this bug to Coly.
>
> I am back from public days, now I look into the bcache part.

There are similar reports on the mailing list. A test patch has been posted to the linux-bcache mailing list for other reporters to test and verify.

Coly Li
5.12.2 has been released by openSUSE, but it has the same problem. The difference is that I now get only a single kernel oops, but the system slows down to the point of being unusable.
Coly, given the severity of the bug, could you at least put a temporary workaround into the stable branch (e.g. just drop the BUG() call)? Once the proper upstream fix arrives, we can replace the workaround with it.
(In reply to Takashi Iwai from comment #11)
> Coly, given the severity of the bug, could you put a temporary workaround to
> stable branch at least (e.g. just drop BUG() call)? Once after the proper
> upstream fix arrives, we can replace with it.

The current bcache code exceeds two size limits in the cache miss code path. I have been working on the fixes over the past days, and today a better solution seems to have taken proper shape. It looks like this:

diff --git a/drivers/md/bcache/request.c b/drivers/md/bcache/request.c
index 29c231758293..cd0431fd9d20 100644
--- a/drivers/md/bcache/request.c
+++ b/drivers/md/bcache/request.c
@@ -515,18 +515,25 @@ static int cache_lookup_fn(struct btree_op *op, struct btree *b, struct bkey *k)
 	struct search *s = container_of(op, struct search, op);
 	struct bio *n, *bio = &s->bio.bio;
 	struct bkey *bio_key;
-	unsigned int ptr;
+	unsigned int ptr, max_cache_miss_size;

 	if (bkey_cmp(k, &KEY(s->iop.inode, bio->bi_iter.bi_sector, 0)) <= 0)
 		return MAP_CONTINUE;

+	/*
+	 * Make sure the cache missing size won't exceed the restrictions of
+	 * max bkey size and max bio's bi_max_vecs.
+	 */
+	max_cache_miss_size = min_t(uint64_t,
+		(1 << KEY_SIZE_BITS) - 1, BIO_MAX_VECS * PAGE_SECTORS);
+
 	if (KEY_INODE(k) != s->iop.inode ||
 	    KEY_START(k) > bio->bi_iter.bi_sector) {
 		unsigned int bio_sectors = bio_sectors(bio);
 		unsigned int sectors = KEY_INODE(k) == s->iop.inode
-			? min_t(uint64_t, INT_MAX,
+			? min_t(uint64_t, max_cache_miss_size,
 				KEY_START(k) - bio->bi_iter.bi_sector)
-			: INT_MAX;
+			: max_cache_miss_size;
 		int ret = s->d->cache_miss(b, s, bio, sectors);

 		if (ret != MAP_CONTINUE)
@@ -547,7 +554,7 @@ static int cache_lookup_fn(struct btree_op *op, struct btree *b, struct bkey *k)
 	if (KEY_DIRTY(k))
 		s->read_dirty_data = true;

-	n = bio_next_split(bio, min_t(uint64_t, INT_MAX,
+	n = bio_next_split(bio, min_t(uint64_t, max_cache_miss_size,
 			KEY_OFFSET(k) - bio->bi_iter.bi_sector),
 		GFP_NOIO, &s->d->bio_split);

But I need 1 or 2 days to test and verify it. I hope it can be finalized in the next 2 days.

P.S. The previous workaround just avoids the panic but won't cache the missing data; this is why I didn't post it.

Coly Li
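The cap introduced by this draft patch can be worked through in userspace. The sketch below is only a model under stated assumptions: KEY_SIZE_BITS is taken to be 16 (the width of the bkey size field), PAGE_SECTORS is 8 (4 KiB pages, 512-byte sectors), and BIO_MAX_VECS is 256 as quoted earlier in this report. The helpers min_u64(), max_cache_miss_size() and cache_miss_sectors() are hypothetical names for this illustration, not bcache functions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration only. */
#define KEY_SIZE_BITS 16u  /* width of the bkey size field (assumption) */
#define BIO_MAX_VECS  256u /* per-bio vec limit quoted in this report */
#define PAGE_SECTORS  8u   /* 4 KiB page / 512-byte sector (assumption) */

static uint64_t min_u64(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}

/* Largest cache-miss read, in sectors, that respects both limits. */
static unsigned int max_cache_miss_size(void)
{
	uint64_t bkey_limit = (1u << KEY_SIZE_BITS) - 1;
	uint64_t bio_limit = (uint64_t)BIO_MAX_VECS * PAGE_SECTORS;

	return (unsigned int)min_u64(bkey_limit, bio_limit);
}

/*
 * Sectors to request on a cache miss when the next key starts at
 * key_start and the bio is currently at bio_sector.
 */
static unsigned int cache_miss_sectors(uint64_t key_start, uint64_t bio_sector)
{
	assert(key_start > bio_sector);
	return (unsigned int)min_u64(max_cache_miss_size(),
				     key_start - bio_sector);
}
```

With these values the bkey limit is 65535 sectors and the bio limit is 2048 sectors, so the bio limit is the binding one and a cache-miss read is capped at 2048 sectors (1 MiB), where the old code allowed up to INT_MAX.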
Created attachment 849484
bcache: avoid oversized read request in cache missing code path

This is the patch I posted to linux-bcache (Cc linux-block and linux-kernel) as a fix for the reported issue.

It is also related to another bug, and our customer is testing it now. The patch survived my stress testing; once we have a positive response from the customer (or from community users), I will do the backport.

Coly Li
(In reply to Coly Li from comment #13)
> Created attachment 849484 [details]
> bcache: avoid oversized read request in cache missing code path
>
> The is the patch I posted to linux-bcache (Cc linux-block and linux-kernel)
> as a fix for the reported issue.
>
> This is also related to another bug and our customer is testing now. This
> patch survived from my pressure testing, once we have positive response from
> customer (or community users as well), I will do the back port.

So far this fix seems to work. Although there are code review comments asking for a better patch, IMHO we can take this patch now and replace it with the upstream version later, once the final patch is merged into the mainline kernel.

Coly Li
I'd be a potential tester, too.
(In reply to Bodo Eggert from comment #15)
> I'd be a potential tester, too.

I will add the quick fix to our kernel very soon. It will be replaced with the upstream version once that is finally merged into the kernel.

Coly Li
New kernel 5.12.4, same issue so we are waiting
(In reply to Diego Ercolani from comment #17)
> New kernel 5.12.4, same issue so we are waiting

OK, I am working on the quick-fix backport now. Please note: this is not the final upstream version.

Coly Li
(In reply to Coly Li from comment #18)
> (In reply to Diego Ercolani from comment #17)
> > New kernel 5.12.4, same issue so we are waiting
>
> OK, working on the fast fix backport now. Please notice: this is not final
> upstream version.

The patches have been submitted and accepted into the SLE15-SP3 kernel.

Coly Li
I installed kernel-default-5.12.12-1.1 from the SUSE repositories via the "zypper dup" command, and the issue seems resolved. Thank you.
openSUSE-SU-2021:2184-1: An update that solves four vulnerabilities and has 107 fixes is now available. Category: security (important) Bug References: 1087082,1152489,1154353,1174978,1176447,1176771,1177666,1178134,1178378,1178612,1179610,1182999,1183712,1184259,1184436,1184631,1185195,1185428,1185497,1185570,1185589,1185675,1185701,1186155,1186286,1186460,1186463,1186472,1186501,1186672,1186677,1186681,1186752,1186885,1186928,1186949,1186950,1186951,1186952,1186953,1186954,1186955,1186956,1186957,1186958,1186959,1186960,1186961,1186962,1186963,1186964,1186965,1186966,1186967,1186968,1186969,1186970,1186971,1186972,1186973,1186974,1186976,1186977,1186978,1186979,1186980,1186981,1186982,1186983,1186984,1186985,1186986,1186987,1186988,1186989,1186990,1186991,1186992,1186993,1186994,1186995,1186996,1186997,1186998,1186999,1187000,1187001,1187002,1187003,1187038,1187039,1187050,1187052,1187067,1187068,1187069,1187072,1187143,1187144,1187167,1187334,1187344,1187345,1187346,1187347,1187348,1187349,1187350,1187351,1187357,1187711 CVE References: CVE-2020-26558,CVE-2020-36385,CVE-2020-36386,CVE-2021-0129 JIRA References: Sources used: openSUSE Leap 15.3 (src): kernel-64kb-5.3.18-59.10.1, kernel-debug-5.3.18-59.10.1, kernel-default-5.3.18-59.10.1, kernel-default-base-5.3.18-59.10.1.18.4.2, kernel-docs-5.3.18-59.10.1, kernel-kvmsmall-5.3.18-59.10.1, kernel-obs-build-5.3.18-59.10.1, kernel-obs-qa-5.3.18-59.10.1, kernel-preempt-5.3.18-59.10.1, kernel-source-5.3.18-59.10.1, kernel-syms-5.3.18-59.10.1, kernel-zfcpdump-5.3.18-59.10.1
SUSE-SU-2021:2184-1: An update that solves four vulnerabilities and has 107 fixes is now available. Category: security (important) Bug References: 1087082,1152489,1154353,1174978,1176447,1176771,1177666,1178134,1178378,1178612,1179610,1182999,1183712,1184259,1184436,1184631,1185195,1185428,1185497,1185570,1185589,1185675,1185701,1186155,1186286,1186460,1186463,1186472,1186501,1186672,1186677,1186681,1186752,1186885,1186928,1186949,1186950,1186951,1186952,1186953,1186954,1186955,1186956,1186957,1186958,1186959,1186960,1186961,1186962,1186963,1186964,1186965,1186966,1186967,1186968,1186969,1186970,1186971,1186972,1186973,1186974,1186976,1186977,1186978,1186979,1186980,1186981,1186982,1186983,1186984,1186985,1186986,1186987,1186988,1186989,1186990,1186991,1186992,1186993,1186994,1186995,1186996,1186997,1186998,1186999,1187000,1187001,1187002,1187003,1187038,1187039,1187050,1187052,1187067,1187068,1187069,1187072,1187143,1187144,1187167,1187334,1187344,1187345,1187346,1187347,1187348,1187349,1187350,1187351,1187357,1187711 CVE References: CVE-2020-26558,CVE-2020-36385,CVE-2020-36386,CVE-2021-0129 JIRA References: Sources used: SUSE Linux Enterprise Workstation Extension 15-SP3 (src): kernel-default-5.3.18-59.10.1, kernel-preempt-5.3.18-59.10.1 SUSE Linux Enterprise Module for Live Patching 15-SP3 (src): kernel-default-5.3.18-59.10.1, kernel-livepatch-SLE15-SP3_Update_2-1-7.5.1 SUSE Linux Enterprise Module for Legacy Software 15-SP3 (src): kernel-default-5.3.18-59.10.1 SUSE Linux Enterprise Module for Development Tools 15-SP3 (src): kernel-docs-5.3.18-59.10.1, kernel-obs-build-5.3.18-59.10.1, kernel-preempt-5.3.18-59.10.1, kernel-source-5.3.18-59.10.1, kernel-syms-5.3.18-59.10.1 SUSE Linux Enterprise Module for Basesystem 15-SP3 (src): kernel-64kb-5.3.18-59.10.1, kernel-default-5.3.18-59.10.1, kernel-default-base-5.3.18-59.10.1.18.4.2, kernel-preempt-5.3.18-59.10.1, kernel-source-5.3.18-59.10.1, kernel-zfcpdump-5.3.18-59.10.1 SUSE Linux Enterprise High Availability 
15-SP3 (src): kernel-default-5.3.18-59.10.1 NOTE: This line indicates an update has been released for the listed product(s). At times this might be only a partial fix. If you have questions please reach out to maintenance coordination.
SUSE-SU-2021:2202-1: An update that solves four vulnerabilities and has 98 fixes is now available. Category: security (important) Bug References: 1152489,1154353,1174978,1176447,1176771,1178134,1178612,1179610,1183712,1184259,1184436,1184631,1185195,1185570,1185589,1185675,1185701,1186155,1186286,1186463,1186472,1186672,1186677,1186752,1186885,1186928,1186949,1186950,1186951,1186952,1186953,1186954,1186955,1186956,1186957,1186958,1186959,1186960,1186961,1186962,1186963,1186964,1186965,1186966,1186967,1186968,1186969,1186970,1186971,1186972,1186973,1186974,1186976,1186977,1186978,1186979,1186980,1186981,1186982,1186983,1186984,1186985,1186986,1186987,1186988,1186989,1186990,1186991,1186992,1186993,1186994,1186995,1186996,1186997,1186998,1186999,1187000,1187001,1187002,1187003,1187038,1187039,1187050,1187052,1187067,1187068,1187069,1187072,1187143,1187144,1187167,1187334,1187344,1187345,1187346,1187347,1187348,1187349,1187350,1187351,1187357,1187711 CVE References: CVE-2020-26558,CVE-2020-36385,CVE-2020-36386,CVE-2021-0129 JIRA References: Sources used: SUSE Linux Enterprise Module for Public Cloud 15-SP3 (src): kernel-azure-5.3.18-38.8.1, kernel-source-azure-5.3.18-38.8.1, kernel-syms-azure-5.3.18-38.8.1 NOTE: This line indicates an update has been released for the listed product(s). At times this might be only a partial fix. If you have questions please reach out to maintenance coordination.
openSUSE-SU-2021:2202-1: An update that solves four vulnerabilities and has 98 fixes is now available. Category: security (important) Bug References: 1152489,1154353,1174978,1176447,1176771,1178134,1178612,1179610,1183712,1184259,1184436,1184631,1185195,1185570,1185589,1185675,1185701,1186155,1186286,1186463,1186472,1186672,1186677,1186752,1186885,1186928,1186949,1186950,1186951,1186952,1186953,1186954,1186955,1186956,1186957,1186958,1186959,1186960,1186961,1186962,1186963,1186964,1186965,1186966,1186967,1186968,1186969,1186970,1186971,1186972,1186973,1186974,1186976,1186977,1186978,1186979,1186980,1186981,1186982,1186983,1186984,1186985,1186986,1186987,1186988,1186989,1186990,1186991,1186992,1186993,1186994,1186995,1186996,1186997,1186998,1186999,1187000,1187001,1187002,1187003,1187038,1187039,1187050,1187052,1187067,1187068,1187069,1187072,1187143,1187144,1187167,1187334,1187344,1187345,1187346,1187347,1187348,1187349,1187350,1187351,1187357,1187711 CVE References: CVE-2020-26558,CVE-2020-36385,CVE-2020-36386,CVE-2021-0129 JIRA References: Sources used: openSUSE Leap 15.3 (src): kernel-azure-5.3.18-38.8.1, kernel-source-azure-5.3.18-38.8.1, kernel-syms-azure-5.3.18-38.8.1
The fixes are in the stable kernel and in our products, and people have confirmed that the reported issue is fixed. I am closing this report.
Hello, last upgrade (kernel vmlinuz-5.15.2-1-default) broke bcache again
(In reply to Diego Ercolani from comment #33)
> Hello, last upgrade (kernel vmlinuz-5.15.2-1-default) broke bcache again

This comes from a different regression. My current solution has three places to fix:
1, Revert commit 2fd3e5efe791946be0957c8e1eed9560b541fe46
2, Revert commit f8b679a070c536600c64a78c83b96aa617f8fa71
3, Do the following change in drivers/md/bcache.c,

@@ -885,9 +885,9 @@ static void bcache_device_free(struct bcache_device *d)
 		bcache_device_detach(d);

 	if (disk) {
-		blk_cleanup_disk(disk);
 		ida_simple_remove(&bcache_device_idx,
 				  first_minor_to_idx(disk->first_minor));
+		blk_cleanup_disk(disk);
 	}

Coly Li
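The reordering in item 3 reads like a use-after-free fix: disk->first_minor is still dereferenced by the ida_simple_remove() call, so the disk should not be torn down before that read. That interpretation is an assumption drawn from the diff itself, since the comment does not state the reason. The userspace sketch below models only the ordering; struct fake_disk, idx_in_use and both functions are hypothetical stand-ins, with free() playing the role of blk_cleanup_disk() and a boolean array playing the role of the bcache_device_idx IDA.

```c
#include <stdbool.h>
#include <stdlib.h>

#define MAX_IDX 16

/* Stand-in for the bcache_device_idx IDA (hypothetical). */
static bool idx_in_use[MAX_IDX];

/* Stand-in for the gendisk; only the field the diff touches. */
struct fake_disk {
	int first_minor;
};

static struct fake_disk *fake_disk_alloc(int idx)
{
	struct fake_disk *disk;

	if (idx < 0 || idx >= MAX_IDX)
		return NULL;
	disk = malloc(sizeof(*disk));
	if (!disk)
		return NULL;
	disk->first_minor = idx;
	idx_in_use[idx] = true;
	return disk;
}

/* Release the index while disk is still valid, then free the disk. */
static void fake_device_free(struct fake_disk *disk)
{
	if (!disk)
		return;
	idx_in_use[disk->first_minor] = false; /* last read of disk */
	free(disk); /* stands in for blk_cleanup_disk() */
}
```

With the two statements in fake_device_free() swapped, the index lookup would read memory that has already been freed, which is the pattern the patch appears to remove.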
(In reply to Diego Ercolani from comment #33)
> Hello, last upgrade (kernel vmlinuz-5.15.2-1-default) broke bcache again

Hi Diego,

Does the suggested fix in comment #34 work?

Coly Li
(In reply to Coly Li from comment #35)

Hello,
I didn't understand that you were suggesting I recompile the kernel.
By the way, with kernel subreleases -2 and -3 (vmlinuz-5.15.2-3-default) and with 5.15.5-1-default the problem disappeared...
I have not had time to investigate or verify the log details, but there seems to be no evidence of an oops:

5.15.5-1-default dmesg:
[ 20.739222] bcache: register_bcache() error : device already registered
[ 20.770803] bcache: register_bcache() error : device already registered
[ 24.760848] e1000e 0000:00:19.0 eno1: NIC Link is Up 100 Mbps Full Duplex, Flow Control: Rx/Tx
[ 24.760855] e1000e 0000:00:19.0 eno1: 10/100 speed: disabling TSO
[ 24.760898] IPv6: ADDRCONF(NETDEV_CHANGE): eno1: link becomes ready
[ 24.888044] NET: Registered PF_PACKET protocol family
[ 641.887295] BTRFS info (device bcache1): qgroup scan completed (inconsistency flag cleared)
[11498.952152] perf: interrupt took too long (2517 > 2500), lowering kernel.perf_event_max_sample_rate to 79250
[17032.148265] perf: interrupt took too long (3159 > 3146), lowering kernel.perf_event_max_sample_rate to 63250
[18126.766311] perf: interrupt took too long (4011 > 3948), lowering kernel.perf_event_max_sample_rate to 49750
[19895.734321] perf: interrupt took too long (5039 > 5013), lowering kernel.perf_event_max_sample_rate to 39500

I am attaching the boot log since yesterday evening.
Created attachment 854277
boot log 5.15.5

NAME     FSTYPE  FSVER  LABEL   UUID                                  FSAVAIL  FSUSE%  MOUNTPOINTS
sda
sda1
sda2     bcache                 7d42c6d3-4087-405e-842c-483103d627f4
sda3     bcache                 e36884b0-361e-43ff-8534-08834789484b
sda4     ext4    1.0    bootfs  8396a4bf-5694-4afb-97c9-5649e4dd461e     4.3G      2%  /boot
sdb
sdb2     swap    1              26c8a8e3-a3eb-4fe0-aded-c2d60abcea50                   [SWAP]
sdb4     bcache                 8b269e3d-bf8a-47af-9884-6aa432385c15
sdb5     bcache                 6dc60da4-e73c-4e64-8b52-5f19f0cd7d92
sr0
bcache0  btrfs          homefs  b3c77e21-e124-46fa-855b-90b5b75fe166    95.8G      1%  /home
bcache1  btrfs          rootfs  2d0a1196-d6e5-42bf-b251-523a4c32d586    70.8G     23%  /root
                                                                                       /opt
                                                                                       /var
                                                                                       /usr/local
                                                                                       /srv
                                                                                       /.snapshots
                                                                                       /
(In reply to Diego Ercolani from comment #36)
> (In reply to Coly Li from comment #35)
> Hello,
> I didn't understood that you was suggesting me recompile the kernel.
> By the way with kernel subrelease -2 & -3 (vmlinuz-5.15.2-3-default) and
> 5.15.5-1-default the problem disappeared...
> I had not time to investigate or verify log details but it seem
> there are no oops evidences:
>
> 5.15.5-1-default dmesg:
> [ 20.739222] bcache: register_bcache() error : device already registered
> [ 20.770803] bcache: register_bcache() error : device already registered
> [ 24.760848] e1000e 0000:00:19.0 eno1: NIC Link is Up 100 Mbps Full
> Duplex, Flow Control: Rx/Tx
> [ 24.760855] e1000e 0000:00:19.0 eno1: 10/100 speed: disabling TSO
> [ 24.760898] IPv6: ADDRCONF(NETDEV_CHANGE): eno1: link becomes ready
> [ 24.888044] NET: Registered PF_PACKET protocol family
> [ 641.887295] BTRFS info (device bcache1): qgroup scan completed
> (inconsistency flag cleared)
> [11498.952152] perf: interrupt took too long (2517 > 2500), lowering
> kernel.perf_event_max_sample_rate to 79250
> [17032.148265] perf: interrupt took too long (3159 > 3146), lowering
> kernel.perf_event_max_sample_rate to 63250
> [18126.766311] perf: interrupt took too long (4011 > 3948), lowering
> kernel.perf_event_max_sample_rate to 49750
> [19895.734321] perf: interrupt took too long (5039 > 5013), lowering
> kernel.perf_event_max_sample_rate to 39500
>
> I attach the boot log since yesterday evening

OK, maybe the fixes are in the stable kernel now. Since you no longer encounter the panic, I plan to close this report again.

Thanks.

Coly Li