Bug 1214173 - unable to install on NFS root
Summary: unable to install on NFS root
Status: RESOLVED WONTFIX
Alias: None
Product: openSUSE Distribution
Classification: openSUSE
Component: YaST2
Version: Leap 15.5
Hardware: Other Other
Importance: P5 - None : Normal
Target Milestone: ---
Assignee: E-mail List
QA Contact: Jiri Srain
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-08-11 08:09 UTC by Per Jessen
Modified: 2023-09-06 12:40 UTC
CC List: 2 users

See Also:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Attachments
y2logs as saved by save_y2logs (1015.09 KB, application/x-compressed-tar)
2023-08-11 08:24 UTC, Per Jessen
Description Per Jessen 2023-08-11 08:09:12 UTC
When installing on an NFS-mounted root, YaST at first complains about a lack of disk space.  I get the option to continue anyway, which is great.

Out of disk space!
"/" needs 391.6 MiB more disk space.
You can choose to install anyway, but you risk getting a corrupted system.

When I am ready to install, I am told:
Not enough disk space. Remove some packages in the single selection. 

Error
The proposal contains an error that must be resolved before continuing.
Comment 1 Per Jessen 2023-08-11 08:23:53 UTC
Oh, yast did also complain about chrony not being installed, I had selected ntp instead.  I did try adding chrony, but that had no effect.
Comment 2 Per Jessen 2023-08-11 08:24:34 UTC
Created attachment 868758
y2logs as saved by save_y2logs
Comment 3 Per Jessen 2023-08-11 08:51:29 UTC
FWIW, I am now trying with Leap 15.4 and I did not have any such problem.  Seems to be a regression in 15.5.
Comment 4 Per Jessen 2023-08-11 13:53:58 UTC
I compared the dracut output between 15.4 and 15.5 - one key difference seems to be that systemd-network was included on 15.5.  I uninstalled "systemd-network", rebuilt the initrd - et voilà, it works.
Comment 5 Stefan Hundhammer 2023-08-14 14:14:45 UTC
Unfortunately, there is not much logging about that NFS.

In y2log, I see it being mounted and then immediately unmounted again:

2023-08-11 08:51:10 <1> heron(5117)
> [Ruby] nfs/routines.rb(FormatHostnameForFstab):162 FormatHostnameForFstab: hostname=rootserver
> [Ruby] nfs/routines.rb(CheckHostName):70 CheckHostName: hostname=rootserver
> [Ruby] lib/cheetah.rb(record_commands):160 Executing "/usr/bin/mount -o ro rootserver:/srv/nfsroot/mirage /mnt".
> [Ruby] lib/cheetah.rb(record_status):180 Status: 0
> [Ruby] lib/cheetah.rb(record_commands):160 Executing "/usr/bin/umount -R /mnt".
> [Ruby] lib/cheetah.rb(record_status):180 Status: 0


storage-inst/05-actions.txt:

  2023-08-11 08:51:19 +0100
  
  Mount NFS rootserver:/srv/nfsroot/mirage on /
Comment 6 Stefan Hundhammer 2023-08-14 14:27:16 UTC
2023-08-11 09:53:18

> <1> heron(5117) [Ruby] modules/Packages.rb(CheckDiskSize):420 Resetting space calculation
> <1> heron(5117) [Ruby] modules/SpaceCalculation.rb(EstimateTargetUsage):327 EstimateTargetUsage([])
> <3> heron(5117) [Ruby] modules/SpaceCalculation.rb(EstimateTargetUsage):331 Invalid input: []
> <1> heron(5117) [Ruby] modules/SpaceCalculation.rb(get_partition_info):692 get_partition_info: part []
> <1> heron(5117) [Pkg] modules/SpaceCalculation.rb:693 Pkg Builtin called: TargetInitDU
> <1> heron(5117) [Pkg] Target_DU.cc(TargetInitDU):193 Initializing Disk Usage couter from the system
> <1> heron(5117) [zypp++] DiskUsageCounter.cc(detectMountPoints):171 Discard mount point : proc /proc proc rw,relatime 0 0
> <1> heron(5117) [zypp++] DiskUsageCounter.cc(detectMountPoints):171 Discard mount point : sysfs /sys sysfs rw,relatime 0 0
> <1> heron(5117) [zypp++] DiskUsageCounter.cc(detectMountPoints):254 Filter ro mount point : /dev/loop0 /parts/mp_0000 squashfs ro,relatime,errors=continue 0 0
> <1> heron(5117) [zypp++] DiskUsageCounter.cc(detectMountPoints):254 Filter ro mount point : /dev/loop1 /parts/mp_0001 squashfs ro,relatime,errors=continue 0 0
> <1> heron(5117) [zypp++] DiskUsageCounter.cc(detectMountPoints):171 Discard mount point : devtmpfs /dev devtmpfs rw,relatime,size=1917104k,nr_inodes=479276,mode=755,inode64 0 0
> <1> heron(5117) [zypp++] DiskUsageCounter.cc(detectMountPoints):171 Discard mount point : devpts /dev/pts devpts rw,relatime,mode=600,ptmxmode=000 0 0
> <1> heron(5117) [zypp++] DiskUsageCounter.cc(detectMountPoints):171 Discard mount point : rpc_pipefs /var/lib/nfs/rpc_pipefs rpc_pipefs rw,relatime 0 0
> <1> heron(5117) [zypp++] DiskUsageCounter.cc(detectMountPoints):235 Filter mount point : /dev/loop2 /mounts/mp_0000 squashfs ro,relatime,errors=continue 0 0
> <1> heron(5117) [zypp++] DiskUsageCounter.cc(detectMountPoints):235 Filter mount point : /dev/loop3 /mounts/mp_0001 squashfs ro,relatime,errors=continue 0 0
> <1> heron(5117) [Ruby] modules/SpaceCalculation.rb(GetPartitionInfo):722 INIT done, SpaceCalculation - partitions: []
> <1> heron(5117) [zypp++] DiskUsageCounter.cc(detectMountPoints):235 Filter mount point : /dev/loop5 /mounts/mp_0003 squashfs ro,relatime,errors=continue 0 0
> <1> heron(5117) [zypp++] DiskUsageCounter.cc(detectMountPoints):235 Filter mount point : /dev/loop6 /mounts/mp_1002 squashfs ro,relatime,errors=continue 0 0
> <1> heron(5117) [Pkg] modules/SpaceCalculation.rb:868 Pkg Builtin called: TargetGetDU
> <1> heron(5117) [Ruby] modules/SpaceCalculation.rb(block in CheckDiskSize):869 /: [1030828, 562116, 1444086, 0]
> <2> heron(5117) [Ruby] modules/SpaceCalculation.rb(block in CheckDiskSize):875 Partition "/" needs 403.57 MiB more disk space.
> <1> heron(5117) [Ruby] modules/SpaceCalculation.rb(CheckDiskSize):886 Total used space (kB): 1444086, fits ?: false

> 2023-08-11 09:53:21 <1> heron(5117) [Ruby] modules/Packages.rb(AddFailedMounts):496 Proposal summary:
> $["warning":"Not enough disk space. Remove some packages in the single selection.", "warning_level":`blocker]


I doubt that this takes the NFS share into account at all.
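
For what it's worth, the 403.57 MiB in the warning can be reproduced from the numbers in the "/:" line. This is a minimal sketch, assuming the four values are total size, currently used space, used space after installation and a read-only flag, all in KiB. That field order is my reading of the log, not something the log states.

```ruby
# Values copied from the "/: [...]" log line above, in KiB.
# Assumed field order: [total, used_now, used_after_install, read_only].
total_kib, used_now_kib, used_after_kib, _read_only = [1030828, 562116, 1444086, 0]

# Space still missing once all selected packages are installed.
missing_kib = used_after_kib - total_kib
missing_mib = (missing_kib / 1024.0).round(2)

puts "total:   #{(total_kib / 1024.0).round(2)} MiB"
puts "used:    #{(used_now_kib / 1024.0).round(2)} MiB"
puts "missing: #{missing_mib} MiB"
```

Under that assumption the root filesystem being checked is only about 1 GiB in total, which might point at the running installation system rather than the NFS export.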
Comment 7 Stefan Hundhammer 2023-08-14 14:41:33 UTC
On my Leap 15.5, that part looks like this:

> 2023-07-12 08:48:37 <1> balrog(4312) [Pkg] modules/SpaceCalculation.rb:174 Pkg Builtin called: TargetInitDU
> 2023-07-12 08:48:37 <1> balrog(4312) [Pkg] Target_DU.cc(TargetInitDU):286 Adding /
> 2023-07-12 08:48:37 <1> balrog(4312) [Ruby] modules/SpaceCalculation.rb(GetPartitionInfo):722
>   INIT done, SpaceCalculation - partitions:
>   [$["filesystem":"ext4", "free":15193124, "growonly":false, "name":"/", "used":12956544]]

Notice "Adding /" and details about free and used disk space for the root filesystem. I see none of that in comment #6:

> 2023-08-11 09:53:18 <1> heron(5117) [Ruby] modules/SpaceCalculation.rb(GetPartitionInfo):722
>  INIT done, SpaceCalculation - partitions: []
Comment 8 Stefan Hundhammer 2023-08-14 15:16:36 UTC
I tried to look up the steps for this kind of installation in the official reference documentation, but I didn't find any, neither for openSUSE Leap 15.5 nor for SLE 15 SP5.

https://doc.opensuse.org/documentation/leap/startup/single-html/book-startup/
https://documentation.suse.com/sles/15-SP5/single-html/SLES-deployment/

This raises the question of whether this is even a supported installation method. I dimly recall that in the distant past there was something about NFS root; but is that just history, or is this officially supported?

Whenever the documentation talks about NFS, it's about NFS installation sources or repos, or about importing or exporting NFS shares. I couldn't find a single word about NFS root.

So Per, please, if you know some official documentation about this, please point me to it. I wouldn't discount the possibility that this worked in the past by accident.
Comment 9 Stefan Hundhammer 2023-08-14 15:39:24 UTC
https://github.com/yast/yast-storage-ng/blob/master/src/lib/y2storage/boot_requirements_strategies/nfs_root.rb#L24-L31

> # Strategy to calculate the boot requirements for a system in which the root
> # filesystem is an NFS share.
> #
> # This actually checks nothing on top of the basic checks of the base class,
> # users installing on top of NFS are supposed to know what they are doing and
> # are on their own.
> class NfsRoot < Base
> end


In all the NFS client code (which is also used in the partitioner), there were no significant code changes during the last two years:

https://github.com/yast/yast-nfs-client/tree/master/src

https://github.com/yast/yast-nfs-client/pulls?q=is%3Apr+is%3Aclosed


So frankly, I don't know what might have changed so that it worked before but no longer does.
Comment 10 Per Jessen 2023-08-14 17:32:55 UTC
> So Per, please, if you know some official documentation about this, please
> point me to it. I wouldn't discount the possibility that this worked in the
> past by accident.

Hi Stefan,
you get plus points for being so open :-) 

"official documentation" - no, I don't even know where to start looking (other than Google).  I have been running systems with root on NFS for years and years, but I've never felt the need to look it up.  I am sure I have sometimes installed on local disks and then later copied to the NFS server, but I am also certain I have sometimes installed directly onto NFS (because I remember it takes forever).  Booting with an NFS root has definitely been supported since 7.x or 8.x, but I agree booting != installing. 

https://www.suse.com/support/kb/doc/?id=000016166

For me, it is clear that YaST supported/permitted it in 15.4 - somehow an extra check was introduced in 15.5.
Comment 11 Per Jessen 2023-08-14 17:36:57 UTC
(In reply to Stefan Hundhammer from comment #9)

> So frankly, I don't know what might have changed so that it worked before
> but no longer does.

It has to be that final step before the installation goes ahead, some check that there are no error conditions.  I did wonder if it was the chrony vs ntp issue, but I resolved that by installing chrony.  
The "Not enough disk space" warning has to be the issue - afaict, the calculation of the disk space on NFS looks a bit weird.
Comment 12 Stefan Hundhammer 2023-08-15 08:30:45 UTC
Actually, it's not the final step since it's already the software proposal that fails: You create the setup for the NFS root in the partitioner, at which point it does mount the NFS share to test if it can be accessed; and immediately after that, it unmounts it again since the test was successful. Now it's not mounted anymore.

Then the other proposals are executed one by one, including the software proposal. For that, SpaceCalculation injects information about the planned storage setup into libzypp so libzypp can keep track of the required disk space in case there are separate filesystems for the directories where packaged files go to.

And in your scenario, the NFS root does not seem to be taken into account; see comment #7. AFAICS the list of partitions / filesystems that it uses for checking is completely empty, so of course you get an error about disk space.
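
To illustrate why an empty list would produce exactly this error, here is a toy model of the check. It is a hypothetical sketch, not YaST or libzypp code: `Filesystem`, `filesystems_to_check` and `missing_kib` are made-up names, the fallback to the live mounts is inferred from the "Initializing Disk Usage" log line in comment #6, and the size of the NFS export is invented.

```ruby
# Hypothetical model, not YaST code. Sizes are in KiB.
Filesystem = Struct.new(:mount_point, :total_kib, :used_kib)

# `planned` is the list from the storage proposal, `live_mounts` is what
# the running system currently has mounted.
def filesystems_to_check(planned, live_mounts)
  # With an empty plan there is nothing to inject, so the check
  # ends up looking at the live system instead.
  planned.empty? ? live_mounts : planned
end

# How much space is missing on `fs` if `needed_kib` gets installed there.
def missing_kib(fs, needed_kib)
  [fs.used_kib + needed_kib - fs.total_kib, 0].max
end

live   = [Filesystem.new("/", 1_030_828, 562_116)] # numbers from comment #6
nfs    = [Filesystem.new("/", 50_000_000, 0)]      # made-up size for the NFS export
needed = 1_444_086 - 562_116                        # growth taken from comment #6

[[], nfs].each do |planned|
  fs = filesystems_to_check(planned, live).first
  puts "planned=#{planned.size} missing=#{missing_kib(fs, needed)} KiB"
end
```

With the empty plan the model reports the same 413258 KiB (403.57 MiB) shortfall as the log. With the NFS share in the plan nothing is missing.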
Comment 13 Stefan Hundhammer 2023-08-15 09:48:47 UTC
Asking around in the team, the general consensus was that it was never an officially supported feature, so it received some degree of support in YaST mostly for internal users of the "need to know what they are doing" type. But I guess that type includes you as well. ;-)

I am sure you also know about the "nfsroot" start mode for the network interface, right?

https://github.com/search?q=org%3Ayast+nfsroot+language%3ARuby&type=code&l=Ruby

(just asking)
Comment 14 Per Jessen 2023-08-15 16:36:28 UTC
(In reply to Stefan Hundhammer from comment #13)
> Asking around in the team, the general consensus was that it was never an
> officially supported feature, so it received some degree of support in YaST
> mostly for internal users of the "need to know what they are doing" type.
> But I guess that type includes you as well. ;-)

Maybe .... 😀 

> I am sure you also know about the "nfsroot" start mode for the network
> interface, right?

Oh yes 😀  YaST also knows about it and would set it correctly, in 15.4 for instance.
Comment 15 Stefan Hundhammer 2023-08-16 14:01:19 UTC
The deeper I look into this, the more I am convinced that the problem is that the list of partitions that are planned to be used is empty. All over the y2log I see an empty list [] where there should be the list of partitions:

> 2023-08-11 09:53:18 <1> heron(5117) [Ruby] modules/SpaceCalculation.rb(EstimateTargetUsage):327 EstimateTargetUsage([])
> 2023-08-11 09:53:18 <3> heron(5117) [Ruby] modules/SpaceCalculation.rb(EstimateTargetUsage):331 Invalid input: []
> 2023-08-11 09:53:18 <1> heron(5117) [Ruby] modules/SpaceCalculation.rb(get_partition_info):692 get_partition_info: part []
> 2023-08-11 09:53:18 <1> heron(5117) [Ruby] modules/SpaceCalculation.rb(GetPartitionInfo):722 INIT done, SpaceCalculation - partitions: []
> 2023-08-11 09:55:11 <1> heron(5117) [Ruby] modules/SpaceCalculation.rb(EstimateTargetUsage):327 EstimateTargetUsage([])
> 2023-08-11 09:55:11 <3> heron(5117) [Ruby] modules/SpaceCalculation.rb(EstimateTargetUsage):331 Invalid input: []
> 2023-08-11 09:55:11 <1> heron(5117) [Ruby] modules/SpaceCalculation.rb(get_partition_info):692 get_partition_info: part []
> 2023-08-11 09:55:11 <1> heron(5117) [Ruby] modules/SpaceCalculation.rb(GetPartitionInfo):722 INIT done, SpaceCalculation - partitions: []
> 2023-08-11 10:04:32 <1> heron(5117) [Ruby] modules/SpaceCalculation.rb(EstimateTargetUsage):327 EstimateTargetUsage([])
> 2023-08-11 10:04:32 <3> heron(5117) [Ruby] modules/SpaceCalculation.rb(EstimateTargetUsage):331 Invalid input: []
> 2023-08-11 10:04:32 <1> heron(5117) [Ruby] modules/SpaceCalculation.rb(get_partition_info):692 get_partition_info: part []
> 2023-08-11 10:04:32 <1> heron(5117) [Ruby] modules/SpaceCalculation.rb(GetPartitionInfo):722 INIT done, SpaceCalculation - partitions: []
> 2023-08-11 10:06:47 <1> heron(5117) [Ruby] modules/SpaceCalculation.rb(EstimateTargetUsage):327 EstimateTargetUsage([])
> 2023-08-11 10:06:47 <3> heron(5117) [Ruby] modules/SpaceCalculation.rb(EstimateTargetUsage):331 Invalid input: []
> 2023-08-11 10:06:47 <1> heron(5117) [Ruby] modules/SpaceCalculation.rb(get_partition_info):692 get_partition_info: part []
> 2023-08-11 10:06:47 <1> heron(5117) [Ruby] modules/SpaceCalculation.rb(GetPartitionInfo):722 INIT done, SpaceCalculation - partitions: []
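
Anyone who wants to check their own y2log for the same symptom could use something like the following. It is a rough sketch: the pattern is taken from the log lines quoted above, the method name is made up, and it only catches entries where the empty list is on the same line as the message.

```ruby
# Sketch: find the timestamps at which SpaceCalculation logged an empty
# partition list. The pattern is taken from the log lines quoted above.
EMPTY_PARTITIONS =
  /SpaceCalculation\.rb\(GetPartitionInfo\):\d+ INIT done, SpaceCalculation - partitions: \[\]/

def empty_partition_timestamps(lines)
  lines.grep(EMPTY_PARTITIONS).map do |line|
    line[/\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}/]
  end
end

sample = [
  '2023-08-11 09:53:18 <1> heron(5117) [Ruby] modules/SpaceCalculation.rb(GetPartitionInfo):722 INIT done, SpaceCalculation - partitions: []',
  '2023-08-11 09:53:18 <3> heron(5117) [Ruby] modules/SpaceCalculation.rb(EstimateTargetUsage):331 Invalid input: []',
  '2023-08-11 09:55:11 <1> heron(5117) [Ruby] modules/SpaceCalculation.rb(GetPartitionInfo):722 INIT done, SpaceCalculation - partitions: []'
]

puts empty_partition_timestamps(sample)
# To scan a real log file instead of the sample:
#   empty_partition_timestamps(File.readlines("y2log", chomp: true))
```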
Comment 16 Stefan Hundhammer 2023-08-16 14:03:09 UTC
But where this information gets lost I haven't found out yet. A comparison of YaST source repositories for relevant parts didn't yield anything meaningful so far.

https://github.com/yast/yast-packager/compare/SLE-15-SP4...SLE-15-SP5
https://github.com/yast/yast-installation/compare/SLE-15-SP4...SLE-15-SP5
https://github.com/yast/yast-storage-ng/compare/SLE-15-SP4...SLE-15-SP5

https://github.com/openSUSE/libstorage-ng/compare/SLE-15-SP4...master
Comment 17 Stefan Hundhammer 2023-08-16 14:05:05 UTC
Arvin, does this ring any bells on your side? Where could that NFS share that was planned to be used for the root filesystem get lost?

Even though it was never an officially supported feature, using an NFS root worked in SLE-15-SP4, but no longer in SLE-15-SP5.
Comment 18 Stefan Hundhammer 2023-09-05 08:45:09 UTC
Arvin, this is waiting for your feedback.
Comment 19 Arvin Schnell 2023-09-05 13:43:39 UTC
I was out of office.

Unfortunately I cannot say why the NFS shares get lost. Maybe comparing
logs from 15.4 and 15.5 could help.
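
For such a comparison the raw logs are noisy, because every line carries its own timestamp, host name and PID. A possible sketch for normalizing lines before running a diff follows. The prefix pattern is derived from the log excerpts in this report, and dropping the source line numbers is my own choice, since they shift between versions.

```ruby
# Sketch: strip timestamp, host name, PID and source line numbers from
# y2log lines so that logs from two installations can be diffed.
PREFIX = /\A\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} (<\d>) \S+\(\d+\) /
SOURCE_LINE = /(\.(?:rb|cc)(?:\([^)]*\))?):\d+/

def normalize(line)
  line.sub(PREFIX, '\1 ').sub(SOURCE_LINE, '\1')
end

a = normalize('2023-08-11 09:53:18 <1> heron(5117) [Pkg] modules/SpaceCalculation.rb:693 Pkg Builtin called: TargetInitDU')
b = normalize('2023-07-12 08:48:37 <1> balrog(4312) [Pkg] modules/SpaceCalculation.rb:174 Pkg Builtin called: TargetInitDU')

puts a
puts b
puts(a == b ? "identical after normalization" : "still different")
```

Lines that still differ after this step are the ones worth reading.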
Comment 20 Stefan Hundhammer 2023-09-06 12:40:26 UTC
I already did what I could; see comment #16.

There is a limit to how much time we can spend on something like this, especially since it was never an official feature to begin with. Sorry.