Bugzilla – Full Text Bug Listing
| Summary: | fdisk creates broken partition table | ||
|---|---|---|---|
| Product: | [openSUSE] openSUSE Distribution | Reporter: | Volker Kuhlmann <bugz57> |
| Component: | Installation | Assignee: | Stanislav Brabec <sbrabec> |
| Status: | CONFIRMED --- | QA Contact: | Jiri Srain <jsrain> |
| Severity: | Normal | ||
| Priority: | P5 - None | CC: | aschnell, bugz57, snwint |
| Version: | Leap 15.5 | ||
| Target Milestone: | --- | ||
| Hardware: | Other | ||
| OS: | Other | ||
| Whiteboard: | |||
| Found By: | --- | Services Priority: | |
| Business Priority: | Blocker: | --- | |
| Marketing QA Status: | --- | IT Deployment: | --- |
| Attachments: | Tar file containing sparse 2TB disk image; Partition table (fdisk -l); Hex dump of 2TB disk image; Reproducer shell script | ||
Created attachment 874129 [details]
Partition table (fdisk -l)
parted crashes reading from a disk with this partition table. This file was created with fdisk -l and the table was created with fdisk (leap 15.4 or 15.5).
Created attachment 874130 [details]
Hex dump of 2TB disk image
Ah, Arvin, you seem to be the one taking care of parted. Reassigning.

The crash can be avoided by moving the end of the previous partition down by 1 sector (512 bytes). I've seen parted creating partitions starting on an odd(!!) sector number, which is plain stupid.

For each logical partition there is an extended boot record (EBR). Typically this EBR is placed between the logical partitions. That is why, AFAIS, parted needs a gap of at least one sector between logical partitions. Since there is no gap between the logical partitions of your partition table, it is at least not typical. But AFAIS it is even broken in that the EBR is located *inside* the partition.
I have gathered the data by placing extra logging in parted since I am not aware of a program showing the EBR locations. Partition 5 spans sectors 86507520 to 1135083519. The EBR of partition 6 is located at 1135081472 (so at the end of partition 5 but *inside* partition 5).
To further investigate the program, could you please provide the exact steps to produce the partition table?

Steps to create this partition table:
* Create 2TB sparse file with truncate.
* fdisk of openSUSE 15.4 - enter the numbers of this table.
Alignment is critical for SSD performance. Partitions, or maybe more importantly filesystems, should be at least 1MiB aligned - or whatever the disk's internal block size is. Equally important is erase block size, but manufacturers treat this as secret. It may be in the order of 128MiB now. One could test with e.g. flashbench.
If the yast partitioner could meet all these constraints, maybe with a user-settable alignment size, and optionally operate in sectors so one can be sure and/or verify, that'd be great. Until then I'll need to partition manually with other tools.
flashbench https://github.com/bradfa/flashbench

Then fdisk is creating a broken partition table. I cannot say whether fdisk expects the user to make sure there is space for the EBRs or not.
You can use hexdump on partition 5 and see the EBR for partition 6, esp. the DOS signature "55 aa".
With that partition table data loss is imminent. E.g. copy from /dev/zero to the last GiB of partition 5 and fdisk can no longer find partitions 6 and 7 (without a warning). parted complains about the broken signature. Simply running fstrim might cause the same data loss.
BTW: YaST has taken care of 1 MiB alignment for at least 10 years. And with GPT the whole problem of placing EBRs does not exist.

> Alignment is critical for SSD performance.
Your partitions are aligned perfectly. They just overlap. While this
is great for performance, it is not so great for the data.
It's a bug in fdisk that it allows you to do this.
The problem is that you leave a gap before the first logical partition.
So after you created the 1st logical partition, fdisk keeps
suggesting a range that starts at the first available block
in the original extended partition and ends at the end of free space.
This is clearly not true (since there's already a partition in between).
This allows you to enter basically any random partition borders without
fdisk validating them properly.
And yes, as Arvin already mentioned - it is technically impossible
to have logical partitions seamlessly one after another. There MUST be
space in between them. And if you keep the start values fdisk is
suggesting (that is, no artificial gap), everything is fine.
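The overlap described here can be checked with a little arithmetic. The following is only a minimal sketch, not code from fdisk or parted: the function name `ebr_conflicts` and the tuple layout are made up for illustration. The sector numbers for partition 5 and the EBR of partition 6 are the ones quoted in this report; the EBR location of partition 5 and the end of partition 6 are assumptions, marked in the comments.

```python
def ebr_conflicts(logicals):
    """Return (partition, ebr_sector, owner) for every EBR inside another logical partition.

    `logicals` is a list of (number, ebr_sector, first_sector, last_sector)
    tuples; this layout is a hypothetical one chosen for the example.
    """
    conflicts = []
    for number, ebr, _first, _last in logicals:
        for other, _ebr, first, last in logicals:
            if other != number and first <= ebr <= last:
                conflicts.append((number, ebr, other))
    return conflicts


layout = [
    # Partition 5: range as quoted in the report; its EBR location is an
    # assumption (2048 sectors before the partition start).
    (5, 86507520 - 2048, 86507520, 1135083519),
    # Partition 6: EBR location as quoted in the report; the last sector is
    # taken from the later reproduction attempt and may differ from the
    # reporter's table.
    (6, 1135081472, 1135083520, 3904897023),
]
print(ebr_conflicts(layout))
```

With these values the only conflict reported is the EBR of partition 6 lying inside partition 5, which is exactly the sector that a write to the end of partition 5 would destroy.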
OK so I didn't take care of a correct layout, and fdisk didn't show a warning (as long as it's a warning, not an error).
That's still no reason for parted to crash (and installation to fail)!
For that matter, parted can't validate a partition table either. It only has align-checks for min and opt, which all pass.
Yast may align partitions to 1MiB but AFAIR it doesn't show useful information that it does so ("166.5GB" is completely useless for checking what it's doing re alignment).
And as I explained, alignment to anything less than erase block size (which is much larger than block size) will degrade performance. Yast partitioner isn't able to take care of this. With today's disk sizes, losing a few tens of MB between partitions doesn't matter, performance does, and aiming a bit higher than needed doesn't hurt. 1MiB alignment is no longer good enough.
So fdisk, parted, yast all need fixing...
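Since the yast partitioner does not show sector numbers, the alignment of a given start sector can be verified by hand. Below is a minimal sketch assuming 512-byte sectors as in the fdisk output of this report; the helper names `is_aligned` and `align_up` are made up for illustration, and the 128 MiB figure is the erase-block guess discussed in this thread, not a known property of any particular SSD.

```python
SECTOR_SIZE = 512  # bytes per sector, as in the fdisk output of this report
MIB = 1024 * 1024


def is_aligned(sector, alignment_bytes, sector_size=SECTOR_SIZE):
    """True if the byte offset of `sector` is a multiple of `alignment_bytes`."""
    return (sector * sector_size) % alignment_bytes == 0


def align_up(sector, alignment_bytes, sector_size=SECTOR_SIZE):
    """Smallest sector >= `sector` that is aligned to `alignment_bytes`.

    Assumes `alignment_bytes` is a multiple of `sector_size`.
    """
    step = alignment_bytes // sector_size
    return -(-sector // step) * step  # ceiling division, then scale back


# Start sectors of the two logical partitions quoted in this report.
for start in (86507520, 1135083520):
    print(start, is_aligned(start, 1 * MIB), is_aligned(start, 128 * MIB))
```

Both start sectors pass the 1 MiB and the 128 MiB check, which matches the observation that the partitions are well aligned and that alignment is not what is wrong with this table.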
The start values fdisk suggests for logical partitions are not useful for increasing alignment because the start value it suggests is the beginning of a 35MB gap between the second last and last logical partition, or similar, so one is forced to manually put in a value after the end of the last logical partition.

> as long as it's a warning, not an error).
I know I'm talking to a wall but I'll try one last time: your partitions overlap and you will see data corruption eventually. Your layout is wrong and you should fix it asap if you value your data.
> 1MiB alignment is no longer good enough.
One person's alignment gap is the other person's free usable space. So in light of this, am I correct assuming you tried to create a layout with 256 MiB manually with fdisk? Why aren't you using GPT where you can put your partitions one after another without gap, perfectly aligned?
> So fdisk, parted, yast all need fixing...
Yes. And world peace, too!
There's two real issues: (1) fdisk should not allow such partitioning and (2) parted should probably not just crash when it sees it.

> I know I'm talking to a wall but I'll try one last time:
Calm down. I am fixing it with urgency.
> One person's alignment gap is the other person's free usable space.
Correct. Yast doesn't give options (it doesn't even say what it's doing), hence using other tools. Bad luck fdisk didn't flag user error. That's unix - everyone's allowed to be an idiot but doesn't have to be. With Redmond OS you must be an idiot, big brother always knows what you need.
> So in light of this, am I correct assuming you tried to create a layout
> with 256 MiB manually with fdisk?
128MiB alignment, yes. Information I found suggests erase block sizes are much larger than 1MiB and flashbench testing indicates 128MiB to be a good choice.
> Why aren't you using GPT where you can
> put your partitions one after another without gap, perfectly aligned?
If I knew everything, I could tell whether OtherOS on the earlier partitions still boots with GPT, but I don't. DOS tables are supposed to work on a 2TB disk so that seems the safe option. Plus it's always worked so far, so no change is safest.
> There's two real issues: (1) fdisk should not allow such partitioning and
I prefer a warning, not an error. There is often unexpected use for tools that ultimately do what they are told. But whatever.
> (2) parted should probably not just crash when it sees it.
Nothing is allowed to crash on bad input. Especially when it causes install failure. Show a warning, fine; a completely blank result with no reason why is not useful.
I can recover now regardless of whether anything gets fixed. Thanks for looking at it, I appreciate it.

Well, I have checked the fdisk output.
The first EBR is placed in the first sector mentioned in the line with Type == Extended.
The EBR of the next partition inside the extended space is not shown by fdisk.
Anyway, trying to reproduce on 15.6 with randomly sized partitions does not trigger the bug. I get a sensible proposed range, and entering a bad number reports "Value out of range.".
But there is apparently a different problem that allows overlapping partitions to be created:
truncate -s 2000398934016 SSD-sparse-crash.diskimg2
fdisk SSD-sparse-crash.diskimg2
n
p
200703
n
p
84107519
n
p
84148224
86245375
n
e
86245376
n
86507520
1135083519
Up to this point the result is correct:
Command (m for help): n
All primary partitions are in use.
Adding logical partition 5
First sector (86247424-3907029167, default 86247424): 86507520
Last sector, +/-sectors or +/-size{K,M,G,T,P} (86507520-3907029167, default 3907029167): 1135083519
Created a new partition 5 of type 'Linux' and of size 500 GiB.
But now the error appears:
Command (m for help): n
All primary partitions are in use.
Adding logical partition 6
First sector (86247424-3907029167, default 86247424): 1135083520
Last sector, +/-sectors or +/-size{K,M,G,T,P} (1135083520-3907029167, default 3907029167): 3904897023
Created a new partition 6 of type 'Linux' and of size 1.3 TiB.
The offered range allows fully overlapping partitions to be created, as if the previously allocated sectors were ignored.
But this happens specifically with the reporter's data. Creating a random layout with different sizes correctly sets the lower limit.
But if I try to abuse this bug and create totally overlapping partitions, it does not work:
Command (m for help): n
All primary partitions are in use.
Adding logical partition 6
First sector (86247424-3907029167, default 86247424): 86507520
Sector 86507520 is already allocated.
So there is something wrong with the suggested lowest sector, and something else wrong with the sector that is actually accepted.
And yet another attempt:
Command (m for help): n
All primary partitions are in use.
Adding logical partition 6
First sector (86247424-3907029167, default 86247424): 86507520
Sector 86507520 is already allocated.
First sector (1135085568-3907029167, default 1135085568): 1135083520
Value out of range.
So entering a bad sector number forces fdisk to recompute the lowest sector, and then it proposes a working value.
Created attachment 875682 [details]
Reproducer shell script
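The lower limits in the transcripts above are consistent with fdisk keeping 2048 sectors (1 MiB) free ahead of each logical partition for its EBR: 86245376 + 2048 = 86247424 for the first logical partition, and 1135083519 + 1 + 2048 = 1135085568 once the range has been recomputed. This is an inference from the numbers in this report, not something taken from the fdisk source. Under that assumption, the expected lower limit for a logical partition appended after the existing ones can be sketched as follows (the function name `lowest_first_sector` and the constant `EBR_GAP` are hypothetical):

```python
EBR_GAP = 2048  # sectors; inferred from the transcripts, not from fdisk's source


def lowest_first_sector(extended_start, logical_ranges, gap=EBR_GAP):
    """Lowest first sector for a new logical partition placed after all existing ones.

    `logical_ranges` is a list of (first_sector, last_sector) pairs for the
    logical partitions already present inside the extended partition.
    Free gaps before or between existing logical partitions are ignored here.
    """
    if not logical_ranges:
        return extended_start + gap
    highest_end = max(last for _first, last in logical_ranges)
    return highest_end + 1 + gap


print(lowest_first_sector(86245376, []))
print(lowest_first_sector(86245376, [(86507520, 1135083519)]))
```

The second value is the limit fdisk only offers after a rejected input; the initial prompt for partition 6 still shows the first value, which is what lets 1135083520 through.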
Created attachment 874128 [details]
Tar file containing sparse 2TB disk image
I always partition my disks as it works best for me, and these disks last multiple re-installations of openSUSE versions. In yast I allocate filesystems manually to the existing partitions. Yast is not allowed to change the partition layout.
Leap 15.5 cannot be installed on this already partitioned disk (15.4 had no problems) because when yast is told to read the partition table nothing shows up in yast. Yast runs parted -l -s -m (or similar) for this, and parted crashes while reading the partition table(!) and doesn't produce any output yast can use.
To reproduce the crash it is sufficient to create a sparse file "disk" and partition it. The disk has this size:
Disk SSD-sparse-crash.diskimg: 1.82 TiB, 2000398934016 bytes, 3907029168 sectors
Partition with fdisk, as per the attached table created after the fact with fdisk -l. All filesystem boundaries in that table are intended to start at a multiple of a large number of MiB. Partitions overlap and there's no error in the table.
Incredibly, parted can be made to not crash just by introducing a very small gap between partition boundaries. It's still a parted bug.
I'll try and attach a tar file (20k) of the 2TB sparse disk image.
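Since the thread notes that no common tool shows the EBR locations, here is a minimal sketch of how they could be listed from a disk image such as the attached one. It assumes the conventional DOS layout: 512-byte sectors, the partition table at byte offset 446 of a sector, the "55 aa" signature at offset 510, the first EBR entry relative to the EBR itself and the second entry (the link) relative to the start of the extended partition. The function names are made up for illustration and the code has not been run against the attached image.

```python
import struct

SECTOR_SIZE = 512
EXTENDED_TYPES = {0x05, 0x0F, 0x85}  # common extended partition type IDs


def read_entries(image, lba):
    """Return the four (type, start_lba, sector_count) entries of the table in sector `lba`."""
    image.seek(lba * SECTOR_SIZE)
    sector = image.read(SECTOR_SIZE)
    if len(sector) < SECTOR_SIZE or sector[510:512] != b"\x55\xaa":
        raise ValueError(f"no 55 aa signature in sector {lba}")
    entries = []
    for i in range(4):
        raw = sector[446 + 16 * i: 462 + 16 * i]
        start, count = struct.unpack_from("<II", raw, 8)  # little-endian LBA fields
        entries.append((raw[4], start, count))
    return entries


def walk_ebrs(image):
    """Yield (ebr_lba, first_sector, last_sector) for each logical partition."""
    extended = [e for e in read_entries(image, 0) if e[0] in EXTENDED_TYPES]
    if not extended:
        return
    base = extended[0][1]  # start of the extended partition, i.e. the first EBR
    ebr = base
    seen = set()
    while ebr not in seen:  # guard against a looping chain
        seen.add(ebr)
        logical, link = read_entries(image, ebr)[:2]
        if logical[2]:
            first = ebr + logical[1]  # relative to this EBR
            yield ebr, first, first + logical[2] - 1
        if link[0] not in EXTENDED_TYPES:
            break  # no further EBR in the chain
        ebr = base + link[1]  # relative to the start of the extended partition
```

It would be used by opening the image in binary mode, for example `open("SSD-sparse-crash.diskimg", "rb")`, and iterating over `walk_ebrs(image)`. An EBR whose sector number falls inside the range of another logical partition is the situation described in this report, and a destroyed EBR shows up as the `ValueError` about the missing signature.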