Bugzilla – Full Text Bug Listing
| Summary: | Filesystem corruption found for filesystems on top of raid -> luks -> lvm | | |
|---|---|---|---|
| Product: | [openSUSE] openSUSE Tumbleweed | Reporter: | Marcus Rückert <mrueckert> |
| Component: | Kernel | Assignee: | Coly Li <colyli> |
| Status: | NEW --- | QA Contact: | E-mail List <qa-bugs> |
| Severity: | Critical | | |
| Priority: | P1 - Urgent | CC: | ailiopoulos, antonio.feijoo, aschnell, asn, colyli, hare |
| Version: | Current | | |
| Target Milestone: | --- | | |
| Hardware: | Other | | |
| OS: | Other | | |
| Whiteboard: | | | |
| Found By: | --- | Services Priority: | |
| Business Priority: | | Blocker: | --- |
| Marketing QA Status: | --- | IT Deployment: | --- |
Attachments:
- The kernel from the hetzner rescue system is a custom build 6.3.7
- requested lvm dump
Created attachment 868247 [details]
The kernel from the hetzner rescue system is a custom build 6.3.7

That is the config.gz file from them.
I only went into the rescue system on the machine with ext4 because the generated initrd with dracut-sshd was not working correctly.

For the first machine where I saw the broken filesystem: when I browse the lost+found directory to maybe find a few important files, I see this in dmesg (again with the 6.3.7 kernel):

```
[Fr Jul 14 01:39:23 2023] XFS (dm-1): Corruption detected. Unmount and run xfs_repair
[Fr Jul 14 01:39:23 2023] XFS (dm-1): Internal error ir.loaded != ifp->if_nextents at line 1169 of file fs/xfs/libxfs/xfs_bmap.c. Caller xfs_iread_extents+0xaf/0x113 [xfs]
[Fr Jul 14 01:39:23 2023] CPU: 3 PID: 13788 Comm: grep Tainted: G O 6.3.7 #1
[Fr Jul 14 01:39:23 2023] Hardware name: Hetzner /B450D4U-V1L, BIOS L1.02W 07/09/2020
[Fr Jul 14 01:39:23 2023] Call Trace:
[Fr Jul 14 01:39:23 2023]  <TASK>
[Fr Jul 14 01:39:23 2023]  dump_stack_lvl+0x45/0x5e
[Fr Jul 14 01:39:23 2023]  xfs_corruption_error+0x64/0x85 [xfs]
[Fr Jul 14 01:39:23 2023]  xfs_iread_extents+0xdf/0x113 [xfs]
[Fr Jul 14 01:39:23 2023]  ? xfs_iread_extents+0xaf/0x113 [xfs]
[Fr Jul 14 01:39:23 2023]  xfs_bmapi_read+0x11e/0x277 [xfs]
[Fr Jul 14 01:39:23 2023]  xfs_read_iomap_begin+0xc1/0x17c [xfs]
[Fr Jul 14 01:39:23 2023]  ? kmem_cache_alloc_lru+0x13a/0x163
[Fr Jul 14 01:39:23 2023]  iomap_iter+0x1a3/0x261
[Fr Jul 14 01:39:23 2023]  iomap_read_folio+0xbd/0x133
[Fr Jul 14 01:39:23 2023]  filemap_read_folio+0x22/0x70
[Fr Jul 14 01:39:23 2023]  filemap_get_pages+0x19b/0x51a
[Fr Jul 14 01:39:23 2023]  ? terminate_walk+0x1d/0x6f
[Fr Jul 14 01:39:23 2023]  filemap_read+0xc1/0x26f
[Fr Jul 14 01:39:23 2023]  ? kmem_cache_free+0xc5/0x152
[Fr Jul 14 01:39:23 2023]  ? slab_post_alloc_hook+0x3e/0x177
[Fr Jul 14 01:39:23 2023]  ? rwsem_read_trylock+0x40/0x4d
[Fr Jul 14 01:39:23 2023]  ? down_read+0x27/0x5c
[Fr Jul 14 01:39:23 2023]  xfs_file_buffered_read+0x6b/0x8e [xfs]
[Fr Jul 14 01:39:23 2023]  xfs_file_read_iter+0x8b/0xd8 [xfs]
[Fr Jul 14 01:39:23 2023]  vfs_read+0x108/0x1a3
[Fr Jul 14 01:39:23 2023]  ksys_read+0x76/0xc3
[Fr Jul 14 01:39:23 2023]  do_syscall_64+0x73/0x8a
[Fr Jul 14 01:39:23 2023]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
[Fr Jul 14 01:39:23 2023] RIP: 0033:0x7fe87a8a903d
[Fr Jul 14 01:39:23 2023] Code: 31 c0 e9 c6 fe ff ff 50 48 8d 3d a6 55 0a 00 e8 39 fe 01 00 66 0f 1f 84 00 00 00 00 00 80 3d a1 25 0e 00 00 74 17 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 5b c3 66 2e 0f 1f 84 00 00 00 00 00 48 83 ec
[Fr Jul 14 01:39:23 2023] RSP: 002b:00007fff9fb978f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[Fr Jul 14 01:39:23 2023] RAX: ffffffffffffffda RBX: 0000000000001000 RCX: 00007fe87a8a903d
[Fr Jul 14 01:39:23 2023] RDX: 0000000000226000 RSI: 00007fe879d2e000 RDI: 0000000000000003
[Fr Jul 14 01:39:23 2023] RBP: 0000000000226000 R08: 0000000000000000 R09: 0000000000000000
[Fr Jul 14 01:39:23 2023] R10: 0000000000001000 R11: 0000000000000246 R12: 00007fe879d2e000
[Fr Jul 14 01:39:23 2023] R13: 0000000000000003 R14: 00007fe879d2d010 R15: 0000000000000003
[Fr Jul 14 01:39:23 2023]  </TASK>
[Fr Jul 14 01:39:23 2023] XFS (dm-1): Corruption detected. Unmount and run xfs_repair
```
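[Editor's note: when triaging corruption like the above, a read-only pass keeps the evidence intact before any repair. A minimal sketch, assuming the affected device is /dev/dm-1 as in the trace; this is general xfs_repair usage, not a step from the thread:]

```
# xfs_repair refuses to touch a mounted filesystem, so unmount first
umount /dev/dm-1

# -n = no-modify mode: report what would be fixed without writing anything
xfs_repair -n /dev/dm-1

# only after reviewing the -n output run the real repair; if the log is
# dirty, xfs_repair will ask for a mount/unmount cycle first (or -L,
# which zeroes the log and is destructive)
xfs_repair /dev/dm-1
```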
(In reply to Marcus Rückert from comment #0)
> [Sun Jul 16 16:25:33 2023] EXT4-fs (dm-2): mounted filesystem
> 93180584-c8c2-4f30-8433-61c69b93199e with ordered data mode. Quota mode: none.
> [Sun Jul 16 16:25:39 2023] EXT4-fs error (device dm-2): ext4_lookup:1851:
> inode #393231: comm bash: iget: checksum invalid
> [Sun Jul 16 16:25:45 2023] EXT4-fs (md126): mounted filesystem
> 2e9067bb-136c-4d1e-aa58-e20a0fe31456 with ordered data mode. Quota mode: none.

Which block device is dm-2 specifically? lvmsys-root? I am curious whether you can also reproduce this on md126, which doesn't go through dm-crypt and LVM, e.g. by trying to access any files within /boot from the rescue kernel.

> This is a fresh installation done on Sunday 2023-07-16. with the latest TW (
> 20230714 )

So the whole setup was done and the filesystems were freshly formatted on v6.4.2, correct? And while accessing the files there on TW v6.4.2 doesn't show any issues, when you boot with the hetzner rescue kernel (v6.3.7) you get the cksum errors on ext4, and only on the lvmsys-root partition, right?

Could you share the output of the following from both kernels (rescue v6.3.7 and then TW v6.4.2):
- mdadm -D /dev/md125
- lvmdump output tarball
- tune2fs -l /dev/mapper/lvmsys-root

@Coly, anything else? Perhaps partitioning/alignment could also affect this? I've tried to reproduce by creating md-raid1+dm-crypt+lvm(linear)+ext4 on v6.4.2, and going back to v6.3.7 it is able to read it without issues, but there may be many more variables at play.

```
mdadm -D /dev/md125
/dev/md125:
Version : 1.0
Creation Time : Sun Jul 16 15:38:58 2023
Raid Level : raid1
Array Size : 1874062464 (1787.25 GiB 1919.04 GB)
Used Dev Size : 1874062464 (1787.25 GiB 1919.04 GB)
Raid Devices : 2
Total Devices : 2
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Mon Jul 17 13:24:59 2023
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Consistency Policy : bitmap
Name : any:raidlvm
UUID : 88810fa3:01805871:6e1316dc:0a341041
Events : 1971
Number Major Minor RaidDevice State
0 259 7 0 active sync /dev/nvme1n1p3
1 259 4 1 active sync /dev/nvme0n1p3
```

```
tune2fs -l /dev/mapper/lvmsys-root
tune2fs 1.47.0 (5-Feb-2023)
Filesystem volume name: <none>
Last mounted on: /
Filesystem UUID: 93180584-c8c2-4f30-8433-61c69b93199e
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal ext_attr resize_inode dir_index orphan_file filetype needs_recovery extent 64bit flex_bg metadata_csum_seed sparse_super large_file huge_file dir_nlink extra_isize metadata_csum orphan_present
Filesystem flags: signed_directory_hash
Default mount options: user_xattr acl
Filesystem state: clean with errors
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 1048576
Block count: 4194304
Reserved block count: 209715
Overhead clusters: 109857
Free blocks: 3522328
Free inodes: 979057
First block: 0
Block size: 4096
Fragment size: 4096
Group descriptor size: 64
Reserved GDT blocks: 1024
Blocks per group: 32768
Fragments per group: 32768
Inodes per group: 8192
Inode blocks per group: 512
RAID stride: 32
RAID stripe width: 32
Flex block group size: 16
Filesystem created: Sun Jul 16 15:39:05 2023
Last mount time: Sun Jul 16 16:25:33 2023
Last write time: Mon Jul 17 12:16:49 2023
Mount count: 5
Maximum mount count: -1
Last checked: Sun Jul 16 15:39:05 2023
Check interval: 0 (<none>)
Lifetime writes: 3214 MB
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 256
Required extra isize: 32
Desired extra isize: 32
Journal inode: 8
Default directory hash: half_md4
Directory Hash Seed: e819d649-e411-4d2d-b721-9d013ffee44a
Journal backup: inode blocks
FS Error count: 83
First error time: Sun Jul 16 16:25:40 2023
First error function: ext4_lookup
First error line #: 1851
First error inode #: 393231
First error err: EFSBADCRC
Last error time: Mon Jul 17 12:16:49 2023
Last error function: ext4_lookup
Last error line #: 1853
Last error inode #: 279321
Last error err: EFSCORRUPTED
Checksum type: crc32c
Checksum: 0xfdee151b
Checksum seed: 0x9681dfd4
Orphan file inode: 12
```
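[Editor's note: the error fields at the bottom of that tune2fs output name concrete inodes. For inspecting one of them offline, the standard e2fsprogs tooling would look roughly like this — a sketch using the first flagged inode from above, assuming the filesystem is unmounted:]

```
# read-only dump of the inode the kernel flagged with EFSBADCRC
debugfs -R 'stat <393231>' /dev/mapper/lvmsys-root

# map the inode number back to its pathname(s)
debugfs -R 'ncheck 393231' /dev/mapper/lvmsys-root
```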
Created attachment 868270 [details]
requested lvm dump
> and while accessing the files there on TW v6.4.2 doesn't show any issues,
> when you boot with the hetzner rescue kernel (v6.3.7) you get the cksum
> errors on ext4, and only on the lvmsys-root partition, right?
The only reason I even accessed the whole setup from the rescue system was the failed boot after adding dracut-sshd to the setup, so it might be that something broke while building the initrd. While investigating the broken initrd I found the FS trouble and focused my debugging on that before continuing the dracut investigation.
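[Editor's note: verifying what actually landed in the generated initrd is possible with lsinitrd, which ships with dracut. A sketch, assuming the usual TW initrd path; not a step Marcus describes taking:]

```
# list the dracut modules included in the current initrd
lsinitrd /boot/initrd-$(uname -r) | head

# check that the sshd module and the network config files made it in
lsinitrd /boot/initrd-$(uname -r) | grep -E 'sshd|10-external'

# rebuild with verbose output if something is missing
dracut --force --verbose
```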
A friend of mine also ran into this problem. For him it is just

raid1 -> luks -> FS

so LVM is out of the equation here.

(In reply to Marcus Rückert from comment #8)
> a friend of mine also ran into this problem:
>
> for him it is just
>
> raid1 -> luks -> FS
>
> so lvm is out of the equation here.

Check whether this bug can still be reproduced after snapshot 20230717 (if not, it'd be a duplicate of bsc#1213227).

Can you check if commit 219580eea1ee ("iomap: update ki_pos in iomap_file_buffered_write") is present? That might be a possible fix here.
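[Editor's note: checking whether a given commit is contained in a kernel version can be done against a clone of the mainline tree; a sketch, assuming a local linux.git checkout with tags fetched:]

```
# which release tags already contain the suspected fix?
git tag --contains 219580eea1ee | head

# or test one version directly; exit status 0 means "is an ancestor"
git merge-base --is-ancestor 219580eea1ee v6.4.2 \
  && echo "in v6.4.2" || echo "not in v6.4.2"
```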
This will affect the ALP kernel as well. Raising to P1 as it's a corruption issue.

(In reply to Hannes Reinecke from comment #10)
> Can you check if commit 219580eea1ee ("iomap: update ki_pos in
> iomap_file_buffered_write") is present? That might be a possible fix here.

We have now seen this also with 2 machines that use ext4. Anthony said ext4 wouldn't use iomap_file_buffered_write.

(In reply to Antonio Feijoo from comment #9)
> Check if this bug cannot be reproduced after snapshot 20230717 (so it'd be a
> duplicate of bsc#1213227)

I will reinstall my 2nd server (the one with the broken ext4) and see if it breaks again like last time.

For the server with the broken XFS: it currently runs in a TW rescue system with kernel 6.3.9; all raids are assembled and the luks opened. Would the theory then be that the broken raid during the earlier boot led to some breakage when the raid was reassembled? Or what would be the relation between the 2 bugs?

(In reply to Marcus Rückert from comment #5)
> mdadm -D /dev/md125
> /dev/md125:
> Version : 1.0
> Creation Time : Sun Jul 16 15:38:58 2023
[...]
> tune2fs -l /dev/mapper/lvmsys-root
[...]
> Filesystem created: Sun Jul 16 15:39:05 2023
> Last mount time: Sun Jul 16 16:25:33 2023
> Last write time: Mon Jul 17 12:16:49 2023
> Mount count: 5
> Maximum mount count: -1
> Last checked: Sun Jul 16 15:39:05 2023
> Check interval: 0 (<none>)
> Lifetime writes: 3214 MB
[...]
> FS Error count: 83
> First error time: Sun Jul 16 16:25:40 2023
> First error function: ext4_lookup
> First error line #: 1851
> First error inode #: 393231
> First error err: EFSBADCRC
> Last error time: Mon Jul 17 12:16:49 2023
> Last error function: ext4_lookup
> Last error line #: 1853
> Last error inode #: 279321
> Last error err: EFSCORRUPTED
> Checksum type: crc32c
> Checksum: 0xfdee151b
> Checksum seed: 0x9681dfd4
> Orphan file inode: 12

Focusing on this one a bit to see if we can narrow things down: this ext4 fs was created just after md assembly, and less than 1 hour later it started reporting crc errors, according to the above timestamps. On what kernel was this fs created? What happened during this hour after creation and before the first error? The fs had been mounted 5 times at that point.

(In reply to Marcus Rückert from comment #13)
> (In reply to Antonio Feijoo from comment #9)
> > Check if this bug cannot be reproduced after snapshot 20230717 (so it'd be a
> > duplicate of bsc#1213227)
>
> I will reinstall my 2nd server ( the one with the broken ext4 ) and see if
> it breaks again like last time.
>
> for the server with the broken XFS it currently runs in a TW rescue system
> with kernel 6.3.9. all raids are assembled and the luks opened. Would the
> theory then be that the broken raid during the earlier boot led to some
> breakages when the raid was reassembled? or what would be the relation
> between the 2 bugs?

The temporal coincidence of bug reports (this would be the 3rd) with such a specific setup (raid+luks) is the reason for asking, just to rule that out, although the previous two did not show any fs corruption.
(In reply to Marcus Rückert from comment #5)
> Filesystem created: Sun Jul 16 15:39:05 2023
> Last mount time: Sun Jul 16 16:25:33 2023
> Last write time: Mon Jul 17 12:16:49 2023
> Mount count: 5
> Lifetime writes: 3214 MB
> First error time: Sun Jul 16 16:25:40 2023

Regarding the above, I wonder if we can assume that the installer created, mounted and wrote into the filesystem, and the "last mount time" is actually the first time the booted kernel mounted it (which is also the first time that errors were encountered). In other words, whether this is some issue between how the installation medium sets up and sees this storage stack vs. how the actual installed kernel that boots up after the installation does it.

Marcus, can you comment on the above, and also mention which exact installation medium/iso you used for this, and what kernel you booted into after installation?

Nope. I booted once into the system, did the necessary changes to use dracut-sshd, and when I wanted to test that working setup it failed to boot and then showed the above errors.

```
cat << EOF > /etc/systemd/network/10-external.link
[Match]
MACAddress=de:ad:be:ef:de:ad

[Link]
Description=External router
Name=external
EOF

cat << EOF > /etc/systemd/network/10-external.network
[Match]
MACAddress=de:ad:be:ef:de:ad

[Network]
DHCP=yes
IPv6AcceptRA=yes
EOF

cat << EOF > /etc/dracut.conf.d/90-network.conf
add_dracutmodules+=" systemd-networkd sshd "
install_items+=" /etc/systemd/network/10-external.link /etc/systemd/network/10-external.network "
EOF

systemctl enable systemd-networkd.service
dracut --force
```

These are the needed changes. For pixls.us I have pretty much the same setup running on 15.5 atm.

Marcus, trying to sum up what occurred for your ext4 corruption:
- installed TW snapshot 20230714, installer did the md-raid1/dm-crypt/dm-linear setup plus mkfs.ext4 etc.
- booted the first time into the system without observing any issues with the filesystem
- made the changes for dracut-sshd and rebooted
- encountered the corruption report from ext4 for the first time

Is that correct? Any important details missing? Can you reproduce this with the same steps (from the same TW snapshot)?
I have unsuccessfully attempted to reproduce, trying to retrace Marcus' steps, using:

https://download.opensuse.org/history/20230714/tumbleweed/repo/oss/boot/x86_64/loader/initrd
https://download.opensuse.org/history/20230714/tumbleweed/repo/oss/boot/x86_64/loader/linux

```
qemu-system-x86_64 -enable-kvm -m 64G -cpu host -smp 64 -M q35 -nographic -kernel linux -initrd initrd \
  -append 'console=ttyS0 consoleblank=0 nomodeset mitigations=auto quiet systemd.show_status=1 elevator=deadline crashkernel=712M,high crashkernel=72M,low install=https://download.opensuse.org/history/20230714/tumbleweed/repo/oss/ lang=en' \
  -drive format=raw,file=disk.raw,if=virtio,cache=none,werror=report
```

and going through the text-mode installation, choosing the server profile, and going over the expert partitioner setting up the following:

Changes to partitioning:
* Create GPT on /dev/sda
* Create partition /dev/sda1 (8.00 MiB) as BIOS Boot Partition
* Create partition /dev/sda2 (50.00 GiB) as Linux RAID
* Create partition /dev/sda3 (50.00 GiB) as Linux RAID
* Create encrypted RAID1 /dev/md0 (49.83 GiB) from /dev/sda2 (50.00 GiB) and /dev/sda3 (50.00 GiB)
* Create volume group vg00 (49.82 GiB) with /dev/mapper/cr_md0 (49.82 GiB)
* Create LVM logical volume /dev/vg00/swap (24.82 GiB) on volume group vg00 for swap
* Create LVM logical volume /dev/vg00/root (25.00 GiB) on volume group vg00 for / with ext4

Booting into the installed system, `zypper in -y dracut-sshd systemd-network`, then applying Marcus' changes verbatim (comment #18), and rebooting. Note that the installer kernel is 6.4.2-1-default (b97b894) and the system boots for the first time into 6.4.3-1-default (5ab030f).

I have not encountered any ext4 errors whatsoever after stressing the filesystem in various ways and rebooting multiple times; the filesystem is also completely clean in fsck. There are probably some partitioning differences that matter; hopefully Marcus can more faithfully reproduce his initial setup.

Rudi and I looked into this some more this morning after I tried to restore the last VM over the weekend. When I ran xfs_repair on the data disk I noticed all files are back to the state of 2023-06-26. That was the time of the last reboot before the one that broke it. So that made me think... it is like it used an old disk or so:

```
lsblk -o '+UUID' | grep 2c8ff24e-c5d8-41e5-8c82-17a7d9c54933
└─sda3        8:3    0   7.3T  0 part  2c8ff24e-c5d8-41e5-8c82-17a7d9c54933
  └─md128     9:128  0   7.3T  0 raid1 2c8ff24e-c5d8-41e5-8c82-17a7d9c54933
└─sdb3        8:19   0   7.3T  0 part  2c8ff24e-c5d8-41e5-8c82-17a7d9c54933
  └─md128     9:128  0   7.3T  0 raid1 2c8ff24e-c5d8-41e5-8c82-17a7d9c54933
```

The crypttab points to UUID=2c8ff24e-c5d8-41e5-8c82-17a7d9c54933.

crypttab:
```
cat ./102
cr_md-uuid-3e1721eb:02a60ecc:6ba577d0:37933902 UUID=2c8ff24e-c5d8-41e5-8c82-17a7d9c54933

stat 102
  File: 102
  Size: 90          Blocks: 8          IO Block: 4096   regular file
Device: 253,1       Inode: 102         Links: 1
Access: (0600/-rw-------)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2023-07-24 09:55:17.900258737 +0000
Modify: 2020-10-14 20:15:42.816007642 +0000
Change: 2020-10-14 20:15:42.816007642 +0000
 Birth: 2020-10-14 20:15:42.816007642 +0000
```

So this was written like this by yast.
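[Editor's note: the symptom Marcus greps for above — the same UUID visible on the raid members and on the assembled array — can be checked for generically. A small sketch, not from the thread; it flags any UUID that lsblk reports on more than one block device:]

```
# list every block device with its UUID, then flag UUIDs that appear
# more than once; on a healthy stack a filesystem/LUKS UUID should show
# up on exactly one device, not on raid members AND the array
lsblk -rno NAME,UUID | awk 'NF == 2 { n[$2]++; d[$2] = d[$2] " " $1 }
  END { for (u in n) if (n[u] > 1) print u ":" d[u] }'
```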
mdadm.conf:
```
0:mcp:/mnt/lost+found # cat 101
DEVICE containers partitions
ARRAY /dev/md/boot UUID=c7becf36:4dcdbf24:99e0de1e:10f14249
ARRAY /dev/md/system UUID=3e1721eb:02a60ecc:6ba577d0:37933902
```

On my desktop:
```
lsblk -o '+UUID' | grep d9515e3e-438e-4a0c-9bcc-e665bc146893
└─nvme0n1p2 259:3    0  931,3G  0 part  d9515e3e-438e-4a0c-9bcc-e665bc146893
└─nvme1n1p2 259:5    0  931,3G  0 part  d9515e3e-438e-4a0c-9bcc-e665bc146893

grep d9515e3e-438e-4a0c-9bcc-e665bc146893 /etc/crypttab
cr_sysraid UUID=d9515e3e-438e-4a0c-9bcc-e665bc146893
```

So yeah... it is definitely a bad idea for yast to use the partition UUID instead of the mdraid UUID for the crypttab. I guess this, combined with bug 1213227, broke all those setups. Though I am not sure yet how that broke the ext4 during the fresh install. Now doing a full backup of my desktop.

Spoke to Andreas via phone quickly; his system was also set back to the time of the last boot, so pretty much the same issue. Luckily for him, he rebooted more regularly.

Let me take a look at this bug.

(In reply to Coly Li from comment #23)
> Let me take a look at this bug.

I will be on vacation from Aug 1 to 6; if it is very urgent during my vacation, please take over as bug owner during my vacation days. Thanks.

Coly Li

(In reply to Coly Li from comment #24)
> (In reply to Coly Li from comment #23)
> > Let me take a look at this bug.
>
> I will be on vacation from Aug 1 to 6; if it is very urgent during my
> vacation, please take over as bug owner during my vacation days.

I am back from vacation and will continue to handle this bug.

I ran into an issue with a similar setup:

raid1 (mdraid) -> luks -> ext4

After a resume from suspend, the machine didn't react and I had to force-poweroff the machine. After the reboot it looked like I was on an old version of my data. For example, a git repository was missing several branches I had created the day before. I got everything back from remote git servers. However, the next boot was the same again, and then after the 3rd boot the ext4 filesystem was corrupted. The system stopped in an emergency console and complained that ext4 wants a filesystem check.

I guess somehow the mdraid mounted old data and the out-of-sync data somehow destroyed the ext4 filesystem when the raid tried to repair. I can't say for sure. I think I ran fsck about 50 times trying to fix the system. I gave up, copied the data to another disk and recreated the ext4 filesystem.

Finally, with help from Marcus Rückert, I made a similar setup which I hadn't done before.

With detailed explanation from Marcus, I realize this is the regression mentioned in Bug #1213227, and it should be fixed in Tumbleweed snapshot 20230717.

In the situation Marcus experienced, the UUIDs of the md raid1 array and its component partitions were identical, filled in automatically by the YaST installer. Because LUKS was built on top of the md raid1 array, with the identical-UUID issue the component partitions/disks are recognized by LUKS rather than the assembled md raid1 array.

When I redid all the steps with Marcus' help and with openSUSE-Tumbleweed-DVD-x86_64-Snapshot20230806-Media.iso, the raid UUID information is handled correctly and no identical UUIDs existed, so the reported corruption is not reproduced.

Now, with the latest Tumbleweed snapshot, this reported problem should be fixed. But Marcus suggested, and I agree, that something in mdadm should be fixed as well.
If mdadm can detect that the UUID being assigned to the raid array is identical to that of the component disks, it should print an error message and refuse to assemble the array, which may otherwise cause consequential chaos. Then such a hidden problem can be caught at the md raid layer and won't leak into the upper layers.

(In reply to Coly Li from comment #27)
> Finally with help from Marcus Ruckert, I make a similar setup which I didn't
> do before.
> [...]
> Now with the latest Tumbleweed snapshot, this reported problem should be
> fixed. But Marcus suggested and I agreed that something from mdadm should be
> fixed.
> [...]

Also from Bug #1213227: if the UUID of the md raid array is not filled correctly, LUKS will contribute to similar chaos by using the UUID of the component disks of the raid array. Although I don't understand how LUKS works with block device UUIDs, rejecting identical UUIDs between the md raid array and its component disks is still a good idea, to avoid unnecessary confusion and chaos.
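[Editor's note: to illustrate the kind of guard being proposed — purely a hypothetical sketch, not an existing mdadm or cryptsetup feature — a pre-flight check could refuse to open a LUKS device by UUID when that UUID also resolves to a raid member:]

```
# hypothetical pre-flight check before activating a crypttab entry:
# refuse if any device carrying the UUID also holds an md superblock
# (i.e. it is a raid member, not the assembled array)
uuid=2c8ff24e-c5d8-41e5-8c82-17a7d9c54933   # example UUID from this bug
for dev in $(blkid -t UUID="$uuid" -o device); do
    if mdadm --examine "$dev" >/dev/null 2>&1; then
        echo "refusing: $dev has UUID $uuid but carries an md superblock" >&2
        exit 1
    fi
done
cryptsetup open UUID="$uuid" cr_system
```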
I have a setup like this:

```
lsblk
NAME                MAJ:MIN RM    SIZE RO TYPE  MOUNTPOINTS
loop0                 7:0    0      3G  1 loop
nvme1n1             259:0    0    1.7T  0 disk
├─nvme1n1p1         259:5    0    256M  0 part
│ └─md127             9:127  0  255.9M  0 raid1 /boot/efi
├─nvme1n1p2         259:6    0      1G  0 part
│ └─md126             9:126  0 1023.9M  0 raid1 /boot
└─nvme1n1p3         259:7    0    1.7T  0 part
  └─md125             9:125  0    1.7T  0 raid1
    └─crypto        253:0    0    1.7T  0 crypt
      ├─lvmsys-swap 253:1    0      4G  0 lvm
      └─lvmsys-root 253:2    0     16G  0 lvm   /
nvme0n1             259:1    0    1.7T  0 disk
├─nvme0n1p1         259:2    0    256M  0 part
│ └─md127             9:127  0  255.9M  0 raid1 /boot/efi
├─nvme0n1p2         259:3    0      1G  0 part
│ └─md126             9:126  0 1023.9M  0 raid1 /boot
└─nvme0n1p3         259:4    0    1.7T  0 part
  └─md125             9:125  0    1.7T  0 raid1
    └─crypto        253:0    0    1.7T  0 crypt
      ├─lvmsys-swap 253:1    0      4G  0 lvm
      └─lvmsys-root 253:2    0     16G  0 lvm   /
```

The whole setup is on a server at Hetzner (a big server hoster in Germany). When using their rescue system I noticed the following in dmesg:

```
[Sun Jul 16 16:25:33 2023] EXT4-fs (dm-2): mounted filesystem 93180584-c8c2-4f30-8433-61c69b93199e with ordered data mode. Quota mode: none.
[Sun Jul 16 16:25:39 2023] EXT4-fs error (device dm-2): ext4_lookup:1851: inode #393231: comm bash: iget: checksum invalid
[Sun Jul 16 16:25:39 2023] EXT4-fs error (device dm-2): ext4_lookup:1851: inode #393231: comm bash: iget: checksum invalid
[Sun Jul 16 16:25:45 2023] EXT4-fs (md126): mounted filesystem 2e9067bb-136c-4d1e-aa58-e20a0fe31456 with ordered data mode. Quota mode: none.
[Sun Jul 16 16:25:55 2023] EXT4-fs error (device dm-2): ext4_lookup:1851: inode #393232: comm vim-nox11: iget: checksum invalid
[Sun Jul 16 16:26:13 2023] EXT4-fs error (device dm-2): ext4_lookup:1851: inode #393232: comm vim-nox11: iget: checksum invalid
[Sun Jul 16 16:26:13 2023] EXT4-fs error (device dm-2): ext4_lookup:1851: inode #393232: comm vim-nox11: iget: checksum invalid
[Sun Jul 16 16:26:13 2023] EXT4-fs error (device dm-2): ext4_lookup:1851: inode #393232: comm vim-nox11: iget: checksum invalid
[Sun Jul 16 16:28:26 2023] EXT4-fs error (device dm-2): ext4_lookup:1851: inode #393232: comm vim-nox11: iget: checksum invalid
[Sun Jul 16 16:28:28 2023] EXT4-fs error (device dm-2): ext4_lookup:1851: inode #393232: comm vim-nox11: iget: checksum invalid
[Sun Jul 16 16:28:28 2023] EXT4-fs error (device dm-2): ext4_lookup:1851: inode #393232: comm vim-nox11: iget: checksum invalid
[Sun Jul 16 16:28:28 2023] EXT4-fs error (device dm-2): ext4_lookup:1851: inode #393232: comm vim-nox11: iget: checksum invalid
[Sun Jul 16 16:29:06 2023] EXT4-fs error (device dm-2): ext4_lookup:1851: inode #393232: comm vim-nox11: iget: checksum invalid
[Sun Jul 16 16:29:37 2023] EXT4-fs error (device dm-2): ext4_lookup:1851: inode #393232: comm vim-nox11: iget: checksum invalid
[Sun Jul 16 16:29:37 2023] EXT4-fs error (device dm-2): ext4_lookup:1851: inode #393232: comm vim-nox11: iget: checksum invalid
[Sun Jul 16 16:29:37 2023] EXT4-fs error (device dm-2): ext4_lookup:1851: inode #393232: comm vim-nox11: iget: checksum invalid
[Sun Jul 16 16:29:38 2023] EXT4-fs error (device dm-2): ext4_lookup:1851: inode #393232: comm vim-nox11: iget: checksum invalid
[Sun Jul 16 16:29:39 2023] EXT4-fs error (device dm-2): ext4_lookup:1851: inode #393232: comm vim-nox11: iget: checksum invalid
[Sun Jul 16 16:29:39 2023] EXT4-fs error (device dm-2): ext4_lookup:1851: inode #393232: comm vim-nox11: iget: checksum invalid
[Sun Jul 16 16:29:39 2023] EXT4-fs error (device dm-2): ext4_lookup:1851: inode #393232: comm vim-nox11: iget: checksum invalid
[Sun Jul 16 16:30:06 2023] EXT4-fs error (device dm-2): ext4_lookup:1851: inode #393232: comm vim-nox11: iget: checksum invalid
[Sun Jul 16 16:30:06 2023] EXT4-fs error (device dm-2): ext4_lookup:1851: inode #393232: comm vim-nox11: iget: checksum invalid
[Sun Jul 16 16:30:10 2023] EXT4-fs error (device dm-2): ext4_lookup:1851: inode #393232: comm vim-nox11: iget: checksum invalid
[Sun Jul 16 16:30:10 2023] EXT4-fs error (device dm-2): ext4_lookup:1851: inode #393232: comm vim-nox11: iget: checksum invalid
[Sun Jul 16 16:30:10 2023] EXT4-fs error (device dm-2): ext4_lookup:1851: inode #393232: comm vim-nox11: iget: checksum invalid
[Sun Jul 16 16:30:19 2023] EXT4-fs error (device dm-2): ext4_lookup:1851: inode #393227: comm systemd-tmpfile: iget: checksum invalid
[Sun Jul 16 16:30:19 2023] EXT4-fs error (device dm-2): ext4_lookup:1851: inode #393227: comm systemd-tmpfile: iget: checksum invalid
[Sun Jul 16 16:30:19 2023] EXT4-fs error (device dm-2): ext4_lookup:1851: inode #393227: comm systemd-tmpfile: iget: checksum invalid
[Sun Jul 16 16:30:20 2023] EXT4-fs error (device dm-2): ext4_lookup:1851: inode #282413: comm systemctl: iget: checksum invalid
[Sun Jul 16 16:30:20 2023] EXT4-fs error (device dm-2): ext4_lookup:1851: inode #282414: comm systemctl: iget: checksum invalid
[Sun Jul 16 16:30:20 2023] EXT4-fs error (device dm-2): ext4_lookup:1851: inode #282415: comm systemctl: iget: checksum invalid
[Sun Jul 16 16:30:20 2023] EXT4-fs error (device dm-2): ext4_lookup:1851: inode #282416: comm systemctl: iget: checksum invalid
[Sun Jul 16 16:30:20 2023] EXT4-fs error (device dm-2): ext4_lookup:1851: inode #282417: comm systemctl: iget: checksum invalid
[Sun Jul 16 16:30:20 2023] EXT4-fs error (device dm-2): ext4_lookup:1851: inode #282413: comm systemctl: iget: checksum invalid
[Sun Jul 16 16:30:20 2023] EXT4-fs error (device dm-2): ext4_lookup:1851: inode #282414: comm systemctl: iget: checksum invalid
[Sun Jul 16 16:31:49 2023] EXT4-fs error: 35 callbacks suppressed
[Sun Jul 16 16:31:49 2023] EXT4-fs error (device dm-2): ext4_lookup:1853: inode #265272: comm rpm: deleted inode referenced: 282389
```

This is a fresh installation done on Sunday 2023-07-16 with the latest TW (20230714), and it comes after I lost another filesystem with a very similar raid+luks+lvm partitioning stack where the XFS root fs disintegrated completely (all files are in lost+found now)...

This makes me a bit worried that we have some mdraid/luks/lvm regression in the current TW kernel. The old root FS that I lost on Friday was upgraded from a 6.3.9 kernel to 6.4.2; the new server which I installed on Sunday started out with 6.4.2 right away.

I have a similar setup on basically all my machines. Though... I have not noticed any weirdness on a machine which only has 1 NVMe SSD in it and as such is just partition -> luks -> lvm, without the raid layer in between.
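[Editor's note: the raid1 -> luks -> lvm -> ext4 stack described above can be approximated with loop devices for testing, much like the qemu reproduction attempt earlier in the thread. A throwaway sketch — all device names, the passphrase, and /dev/md100 are made-up examples; run only as root in a scratch VM, as it writes to the files and devices it creates:]

```
# build raid1 -> luks -> lvm -> ext4 from two sparse files (test VM only!)
truncate -s 2G disk0.img disk1.img
d0=$(losetup --find --show disk0.img)
d1=$(losetup --find --show disk1.img)

# metadata 1.0 puts the md superblock at the END of the members, which is
# what leaves the LUKS header at the start of the raw partitions visible
mdadm --create /dev/md100 --run --level=1 --metadata=1.0 \
      --raid-devices=2 "$d0" "$d1"

echo -n testpass | cryptsetup -q luksFormat /dev/md100 -
echo -n testpass | cryptsetup open --key-file - /dev/md100 cr_test

pvcreate /dev/mapper/cr_test
vgcreate vgtest /dev/mapper/cr_test
lvcreate -n root -L 1G vgtest
mkfs.ext4 /dev/vgtest/root

# the interesting check: does the LUKS UUID leak through to the members?
blkid "$d0" "$d1" /dev/md100
```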