Bug 117691

Summary: Shrinking ReiserFS during install makes fsck fail on reboot
Product: [openSUSE] SUSE LINUX 10.0
Reporter: Jens Benecke <jens-novell>
Component: YaST2
Assignee: Thomas Fehr <fehr>
Status: RESOLVED FIXED
QA Contact: Klaus Kämpf <kkaempf>
Severity: Critical
Priority: P5 - None
Version: RC 1
Target Milestone: SUSE Linux 10.1
Hardware: 32bit
OS: All
Whiteboard:
Found By: Other
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---
Attachments: YaST2 installation log archive

Description Jens Benecke 2005-09-18 20:21:14 UTC
Hello,  
  
I had a system with two Windows partitions and two Linux (swap + root)
partitions. The Linux partition was formatted as ReiserFS and contained SuSE 9.1.
Because of several conflicts trying to upgrade this version to 10.0rc1 (from a
running system as well as from a network-install ISO) I decided to reinstall,
but _not_ reformat the existing ReiserFS partition, because /home (hda6)
contained lots of data.  Instead, I deleted everything on the partition except
for /home.
 
During install I shrank the ReiserFS partition hda6 from 85GB to 75GB and
created a new 11GB partition hda7 to install SuSE 10.0rc1 into, because the
installer complained that not re-formatting ReiserFS partitions could "create
problems" (what problems? why?). This went through without any error message,
as did the rest of the installation.
  
disk-hda from /var/log/YaST2 before repartitioning:
 
SizeK: 160086528 
Partition: 1 /dev/hda1 10482381 3 1 0 1305 c primary 
Partition: 2 /dev/hda2 61890412 3 2 1305 7705 7 primary 
Partition: 3 /dev/hda3 87698835 3 3 9010 10918 f extended boot 
Partition: 5 /dev/hda5 1028128 3 5 9010 128 82 logical 
Partition: 6 /dev/hda6 86670643 3 6 9138 10790 83 logical 
 
after:  
Partition: 1 /dev/hda1 10482381 3 1 0 1305 c primary 
Partition: 2 /dev/hda2 61890412 3 2 1305 7705 7 primary 
Partition: 3 /dev/hda3 87698835 3 3 9010 10918 f extended boot 
Partition: 5 /dev/hda5 1028128 3 5 9010 128 82 logical 
Partition: 6 /dev/hda6 75505469 3 6 9138 9401 83 logical 
Partition: 7 /dev/hda7 11165143 3 7 18538 1390 83 logical 
 
 
However, upon the first reboot, I ended up with an emergency root shell 
because reiserfsck complained about the shrunk partition - it claimed that the 
superblock said the file system was bigger than the partition. I could mount 
the partition fine, however, and copy data off of it fine as well. And it had 
been checked before (last two columns in fstab were "1 1" so reiserfsck 
checked it on every boot) and under SuSE 9.1 it did not show any problems. 
 
I am attaching an archive of /var/log/YaST2 for you to take apart. ;) 
 
Thanks, 
 
Jens
Comment 1 Jens Benecke 2005-09-18 20:22:47 UTC
Created attachment 50254
YaST2 installation log archive
Comment 2 Jens Benecke 2005-09-18 20:34:14 UTC
Here is the reiserfsck output from the partition: 
 
Do you want to run this program?[N/Yes] (note need to type Yes if you do):Yes 
bread: Cannot read the block (18876374): (Invalid argument). 
 
reiserfs_open: Your partition is not big enough to contain the 
filesystem of (18876374) blocks as was specified in the found super block. 
 
Failed to open the filesystem. 
 
If the partition table has not been changed, and the partition is 
valid  and  it really  contains  a reiserfs  partition,  then the 
superblock  is corrupted and you need to run this utility with 
--rebuild-sb. 
 
Aborted 
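
The block count in this output can be compared with the size YaST logged for hda6 after the resize (75505469 in the disk-hda dump). The following is only a rough sketch of that arithmetic. It assumes the logged size is in KiB and that the filesystem uses 4096-byte blocks, which is the usual ReiserFS block size but is not stated anywhere in this report.

```python
# Values taken from this report.
FS_BLOCKS = 18876374      # block count reiserfsck found in the superblock
PARTITION_KIB = 75505469  # hda6 size from the disk-hda dump after resizing

# Assumptions (not stated in the report): 4096-byte blocks, size logged in KiB.
BLOCK_SIZE = 4096


def overshoot_kib(fs_blocks: int, partition_kib: int, block_size: int = BLOCK_SIZE) -> int:
    """Return how far the filesystem extends past the partition end, in KiB."""
    fs_kib = fs_blocks * block_size // 1024
    return fs_kib - partition_kib


if __name__ == "__main__":
    extra = overshoot_kib(FS_BLOCKS, PARTITION_KIB)
    print(f"filesystem extends {extra} KiB past the end of the partition")
```

Under these assumptions the superblock describes a filesystem 27 KiB larger than the partition, which is enough for reiserfsck to refuse to open it.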
 
Comment 3 Michael Gross 2005-09-19 12:56:57 UTC
This does not sound good. Raising severity to critical.

It looks as if the resize tool does not update the superblock of the resized
filesystem correctly!
Comment 4 Thomas Fehr 2005-09-27 15:40:21 UTC
Problem can happen if resized partition does not start on a cylinder boundary.
Fixed in current SVN head. Fix will be available in SL 10.1.
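
A test for the condition named here can be sketched in a few lines. This is not the YaST fix: the helper names are hypothetical, and the geometry (255 heads, 63 sectors per track, 512-byte sectors) and start sectors are illustrative assumptions rather than values from the attached logs.

```python
# Illustrative CHS geometry; assumed, not taken from the report's logs.
HEADS = 255
SECTORS_PER_TRACK = 63
SECTOR_BYTES = 512


def sectors_per_cylinder(heads: int = HEADS, spt: int = SECTORS_PER_TRACK) -> int:
    """Number of sectors in one cylinder for the given geometry."""
    return heads * spt


def starts_on_cylinder_boundary(start_sector: int) -> bool:
    """True if the partition's first sector is the first sector of a cylinder."""
    return start_sector % sectors_per_cylinder() == 0


def offset_into_cylinder_bytes(start_sector: int) -> int:
    """Distance in bytes from the preceding cylinder boundary to the partition start."""
    return (start_sector % sectors_per_cylinder()) * SECTOR_BYTES


if __name__ == "__main__":
    aligned = 100 * sectors_per_cylinder()       # hypothetical start sector
    unaligned = aligned + SECTORS_PER_TRACK      # hypothetical: one track later
    for start in (aligned, unaligned):
        print(start, starts_on_cylinder_boundary(start), offset_into_cylinder_bytes(start))
```

With this assumed geometry, a start one track (63 sectors) past a boundary lies 32256 bytes into the cylinder. The offset is always smaller than one cylinder.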
Comment 5 Jens Benecke 2005-09-27 17:35:24 UTC
Hi, 
 
Thanks a lot for fixing this bug ... but ... are you leaving a critical,
file-system-destroying bug that effectively prevents an upgrade in the
released product (10.0)?
 
I suspect I don't see the "whole picture" behind this bug, but I can see the 
news headlines already: "SuSE Linux upgrade destroys harddisks" ... ;-) 
 
Isn't it possible to check for the described case where the partition might 
get treated wrongly, and then refuse to resize (for 10.0)? IMHO this would be 
much preferable. 
 
Thanks!  
 
Jens 
Comment 6 Thomas Fehr 2005-09-28 12:43:54 UTC
Backporting the fix to 10.0 is not the problem. I will do this and will make
an online update available. Unfortunately this will not help much, since most
people will resize during installation and the buggy version is on the DVD.

Fortunately the problem is not as bad as it looks at first sight. Effectively
the last few megabytes of the fs get lost (the loss is smaller than a disk
cylinder, in your case it was 32k). One can rebuild the corrupted reiserfs
pretty easily in the following way:
 1) reiserfsck --rebuild-sb -y <device>
    answer "y" to all questions asked and do not change the suggested size
 2) reiserfsck --check -y <device>
    this will tell you if anything is corrupted and how to call reiserfsck
    next. In my test cases I had to call reiserfsck with option "--fix-fixable"
 3) reiserfsck --fix-fixable -y <device>
 4) mount fixed reiserfs
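
For anyone who wants to script the steps above, here is a minimal sketch that only builds the three reiserfsck command lines and prints them, leaving execution to the user. The function name `repair_commands` is hypothetical. The sketch assumes the partition is unmounted, and step 1 may still ask questions interactively.

```python
import shlex
from typing import List


def repair_commands(device: str) -> List[List[str]]:
    """Return the reiserfsck invocations from steps 1-3, in order."""
    return [
        ["reiserfsck", "--rebuild-sb", "-y", device],   # step 1: rebuild the superblock
        ["reiserfsck", "--check", "-y", device],        # step 2: report what is damaged
        ["reiserfsck", "--fix-fixable", "-y", device],  # step 3: what --check asked for in the test cases
    ]


if __name__ == "__main__":
    # /dev/hda6 is the partition from this report; substitute your own device.
    for cmd in repair_commands("/dev/hda6"):
        print(shlex.join(cmd))
```

The third command is the one the check step recommended in the test cases. If --check recommends something else, follow that instead.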

I ran the following test cases, starting with a filesystem of 2.0 GB holding
about 200000 files that was 71% full:
- resized to about 88% full, losing the last 5 MB of the fs
  --> no files lost/corrupted
- resized to about 93% full, losing the last 6 MB of the fs
  --> 13 files lost/corrupted
- resized to about 96% full, losing the last 6 MB of the fs
  --> no files lost/corrupted
- resized to about 99% full, losing the last 6 MB of the fs
  --> 231 files lost/corrupted
Comment 7 Jens Benecke 2005-09-30 08:35:50 UTC
Hi, 
 
I understand the DVD images are already frozen. OK, I see there is not much 
one can do about this (maybe provide a "driver disk" with a fix that is loaded 
during install?) 
 
Thank you for the instructions about how to fix my partition. :-) 
Thank you also for tracking this bug down and doing the testcases! 
 
I'm gonna buy a 10.0 box when it's released... 
 
Jens