Bugzilla – Bug 1214540
NFS mount doesn't succeed any more
Last modified: 2023-10-16 01:25:03 UTC
After one of the last Tumbleweed updates, the nfs kernel server doesn't work correctly any more. An nfs mount command from another machine doesn't return; it hangs forever and nothing is mounted.

Old working configuration
-------------------------
nfs-client-2.6.2-38.1
nfs-kernel-server-2.6.2-38.1

Syslog:

Aug 23 17:08:36 orion rpc.mountd[19233]: v4.0 client attached: 0x131b2e4a64e620dc from "192.168.1.21:991"

The mount command on the client machine returns immediately and the remote directory is mounted correctly.

New non-working configuration
-----------------------------
nfs-client-2.6.3-39.4
nfs-kernel-server-2.6.3-39.4

Syslog:

Aug 23 16:47:37 orion rpc.mountd[18307]: auth_unix_ip: inbuf 'nfsd 192.168.1.21'
Aug 23 16:47:37 orion rpc.mountd[18307]: successful authentication for IP 192.168.1.21 as 192.168.1.0/24
Aug 23 16:47:37 orion rpc.mountd[18307]: auth_unix_ip: client 0x5619032d0770 '192.168.1.0/24'
Aug 23 16:47:37 orion rpc.mountd[18307]: v4.0 client attached: 0x131b2e4764e61c00 from "192.168.1.21:717"
Aug 23 16:47:37 orion rpc.mountd[18307]: nfsd_fh: inbuf '192.168.1.0/24 1 \x00000000'
Aug 23 16:47:37 orion rpc.mountd[18307]: nfsd_fh: found 0x5619032d2410 path /

The mount command on the client machine doesn't return and nothing is mounted. After killing the mount command, this line appears in the server's syslog:

Aug 23 16:53:55 orion rpc.mountd[18307]: v4.0 client detached: 0x131b2e4764e61c00 from "192.168.1.21:717"

There's no difference whether NFSv4 or NFSv3 is used, and there's no message in the syslog of the client machine.
Interesting ... I'm using Tumbleweed with that version of nfs-client and server, and I don't see a problem. A lot of NFS support is in the kernel rather than in the nfs-kernel-server package, so for completeness could you please report the kernel version (uname -a) where it works and where it doesn't.

Also please collect tracing logs on the server. To do this run

  trace-cmd record -e sunrpc:svc_xdr\* -e sunrpc:svcsock\* -e nfsd:nfsd_read\*

wait until it says "Hit Ctrl^C to stop recording" and then attempt the mount. When you are sure it isn't working, hit Ctrl-C. This should create "trace.dat". Compress that file and attach it to this bug.

It might also help to capture a packet trace during the attempt:

  tcpdump -s 0 -w /tmp/dump.pcap port 2049

then attempt the mount, then interrupt the tcpdump, then compress and attach /tmp/dump.pcap.
Created attachment 869138 [details] trace-cmd output
Created attachment 869139 [details] tcpdump output
Neil, thanks for looking into this issue. It's really strange, because it doesn't occur on a second machine with the same OS version. It started on the affected machine after a Tumbleweed update that installed nfs-kernel-server-2.6.3-39.4. The kernel was also updated, but I don't think it's a kernel issue, because it works with the new kernel after downgrading nfs-kernel-server to version 2.6.2-38.1.

This is the output of uname -a:

Linux orion.lopa.de 6.4.9-1-default #1 SMP PREEMPT_DYNAMIC Wed Aug 9 05:07:55 UTC 2023 (5b9ad20) x86_64 x86_64 x86_64 GNU/Linux

I've attached the compressed outputs of trace-cmd and tcpdump.
Thanks. The significant detail is in the tcpdump capture. The client sends a PUTROOTFH request and gets the reply NFS4ERR_DELAY. That means the kernel asked rpc.mountd for some information about exported filesystems and is waiting for an answer. A few seconds later the client re-sends the request (as expected) and gets the same reply. This suggests that rpc.mountd isn't working properly.

So please add the two lines

  [mountd]
  debug=all

to /etc/nfs.conf.local (create the file if needed). Then restart mountd:

  systemctl restart nfs-mountd

then attempt the mount. Then run

  journalctl -u nfs-mountd

and collect the output. Maybe add "--since -10m" to only get the last 10 minutes of output, or something like that. Then you can remove the two lines from nfs.conf.local and restart nfs-mountd. Hopefully that log will provide some more hints.
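The DELAY/retry behaviour described above can be sketched in Python. This is a hypothetical illustration of the protocol-level loop, not real client code: send_request stands in for an NFSv4 compound call, and the retry count and back-off are made up for the example.

```python
import time

def call_with_delay_retry(send_request, max_retries=5, backoff=0.01):
    """On NFS4ERR_DELAY the client waits and re-sends the same request.
    An rpc.mountd that never answers the kernel's upcall therefore looks
    like an endless DELAY/retry loop: the mount hangs with no error."""
    for attempt in range(max_retries):
        status = send_request()
        if status != "NFS4ERR_DELAY":
            return status
        time.sleep(backoff * (attempt + 1))  # back off before retrying
    return "NFS4ERR_DELAY"
```

This is why the client's syslog stays silent: from its point of view nothing has failed yet, it is simply still retrying.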
(In reply to Neil Brown from comment #5) My bug description already included the rpc.mountd syslog debug messages. I've now attached a file containing the messages in the good and the bad case. The first 6 messages are identical, but the new server stops after the message "nfsd_fh: found 0x56002b75d290 path /" and the next message "nfsd_export: inbuf '192.168.1.0/24 /'" doesn't occur. The "client detached" message comes in the good case after the umount command and in the bad case after killing the mount command on the client.
Created attachment 869233 [details] rpc.mountd debug messages
Thanks for those debug messages. I think they help.

I should have asked earlier: could you please provide your "/etc/exports" file, and /etc/nfs.conf.local if there is anything in that? Also, what "mount" command are you using? i.e. what filesystem are you trying to mount from the server?

I found a change in 2.6.3 which looks questionable, but I really want to be able to reproduce the problem - and I cannot yet. Hopefully if I can copy your config, I can do that. Sorry this is taking so long - I've had some distractions.
(In reply to Neil Brown from comment #8)

Hi Neil,

> I should have asked earlier: Could you please provide your "/etc/exports"
> file, and /etc/nfs.conf.local if there is anything in that.

I've attached my server's (orion) /etc/exports. /etc/nfs.conf.local contains only

[mountd]
debug=all

> Also what "mount" command are you using? i.e. what filesystem are you
> trying to mount from the server.

It's the ext4 filesystem /video that I want to mount, and the client has it listed in /etc/fstab. But this causes the boot sequence to hang forever, so I commented out the line in /etc/fstab to be able to boot the client. A simple

  mount orion:/video /mnt

on the client doesn't succeed and hangs forever until it is killed.

> Sorry this is taking so long - I've had some distractions.

No problem, because the nfs mount succeeds again after downgrading nfs-kernel-server to version 2.6.2-38.1. I only have to ignore the dependency on nfs-client = 2.6.2 or downgrade this package, too.

Thanks and best regards
Lothar
Created attachment 869571 [details] /etc/exports
(In reply to Lothar Paltins from comment #10)
> Created attachment 869571 [details]
> /etc/exports

Thanks. Could you please try removing:

/ 192.168.1.0/24(fsid=0,crossmnt,rw,no_root_squash,sync,no_subtree_check)

That line shouldn't be needed (but it shouldn't cause a hang either....). If that does work around the problem, then I have little doubt that I'll find the code fix fairly quickly (once I make time).
(In reply to Neil Brown from comment #11)
> Thanks. Could you please try removing:
>
> / 192.168.1.0/24(fsid=0,crossmnt,rw,no_root_squash,sync,no_subtree_check)

Yes, this helps. But without this line I can only mount the exported sub-filesystems, not the root filesystem. This is ok for mounting /video, but it's not a general workaround.

I played a little bit with different options in /etc/exports and it seems to be the crossmnt option that causes the hang. Without it, I can leave the line for / in /etc/exports. This is now the content:

/video 192.168.1.0/24(rw,no_root_squash,sync,no_subtree_check)
/home 192.168.1.0/24(rw,no_root_squash,sync,no_subtree_check)
/home/lothar/work 192.168.1.0/24(rw,no_root_squash,sync,no_subtree_check)
/ 192.168.1.0/24(fsid=0,rw,no_root_squash,sync,no_subtree_check)

This works, but there's one thing I don't understand. If I execute

  mount orion:/ /mnt

on the client, then the root directory of orion is mounted, but all other exported directories from orion are also mounted. I thought that the crossmnt option was needed for this behavior.
I still cannot reproduce this problem. I might need to provide you an nfs-kernel-server package with lots of extra debugging so I can see in more detail what is happening. But maybe we can get by with something simpler. Could you please reproduce the problem again, but this time run

  strace -o /tmp/mountd.out -s 200 -p `pidof rpc.mountd`

on the server before attempting the mount. Then after the test, interrupt the strace and attach the /tmp/mountd.out file here. Thanks.

The "crossmnt" flag is, to some extent, assumed when NFSv4 is used. The man page ("man exports") says this at the end of the "nohide" section:

> This option is not relevant when NFSv4 is used. NFSv4 never hides subordinate filesystems. Any filesystem that is exported will be visible where expected when using NFSv4.

Then the "crossmnt" section starts "This option is similar to nohide...", so you might say this behaviour is documented.
> Could you please reproduce the problem again, but this time run
>
> strace -o /tmp/mountd.out -s 200 -p `pidof rpc.mountd`
>
> on the server before attempting the mount.

Of course, I should have thought about that earlier. I've attached two trace files: mountd-bad.out with the crossmnt option and mountd-good.out without it. The first 71 lines are almost identical, but beginning with line 72 they are totally different.
Created attachment 869754 [details] mountd-bad.out
Created attachment 869755 [details] mountd-good.out
Thanks. In mountd-bad.out, the failure to stat /run/user/1000/doc is, I think, significant. Could you get me a "good" trace using the same /etc/exports configuration as the bad trace, but using the older nfs-kernel-server package? I would think that the /run/user/1000/doc failure would affect the older code too, but it seems that it doesn't.
(In reply to Neil Brown from comment #17)
> Thanks. In mountd-bad.out, the failure to stat /run/user/1000/doc is, I
> think, significant.

This looks very strange, indeed. Something's wrong in this system directory. 1000 is my user id (lothar):

# ls -l /run/user/1000
ls: cannot access '/run/user/1000/gvfs': Permission denied
ls: cannot access '/run/user/1000/doc': Permission denied
total 4
-rw-r--r-- 1 lothar users  85 Sep 26 12:19 KSMserver__0
drwx------ 2 lothar users  60 Sep 26 12:19 at-spi
srw-rw-rw- 1 lothar users   0 Sep 26 12:19 bus
drwx------ 3 lothar users  60 Sep 26 12:19 dbus-1
drwx------ 2 lothar users  60 Sep 26 12:19 dconf
d????????? ? ?      ?       ?            ? doc
srw------- 1 lothar users   0 Sep 26 12:19 drkonqi-coredump-launcher
drwx------ 2 lothar users 120 Sep 26 12:19 gnupg
d????????? ? ?      ?       ?            ? gvfs
drwx------ 2 lothar users  40 Sep 26 12:19 gvfsd
drwx------ 2 lothar users  60 Sep 26 12:19 keyring
srwxr-xr-x 1 lothar users   0 Sep 26 12:19 kwallet5.socket
srw-rw-rw- 1 lothar users   0 Sep 26 12:19 pipewire-0
-rw-r----- 1 lothar users   0 Sep 26 12:19 pipewire-0.lock
drwx------ 2 lothar users  80 Sep 26 12:19 pulse
drwxr-xr-x 7 lothar users 180 Sep 26 12:19 systemd

But that's probably a different bug; it shouldn't affect rpc.mountd.

> Could you get me a "good" trace using the same /etc/exports configuration as
> the bad trace, but using the older nfs-kernel-server package?

It's attached as mountd-old.out.

> I would think that the /run/user/1000/doc failure would affect the older
> code too, but it seems that it doesn't.

Both the new code without the crossmnt option and the old code, independent of the crossmnt option, don't try to access this path.
Created attachment 869769 [details] mountd-old.out
It's strange, but the directories that seemingly are corrupted only look corrupted for the user root. From my user account, everything seems to be correct:

lothar@orion:~> ll /run/user/1000
total 4
-rw-r--r-- 1 lothar users  85 Sep 27 08:30 KSMserver__0
drwx------ 2 lothar users  60 Sep 27 08:30 at-spi
srw-rw-rw- 1 lothar users   0 Sep 27 08:30 bus
drwx------ 3 lothar users  60 Sep 27 08:30 dbus-1
drwx------ 2 lothar users  60 Sep 27 12:19 dconf
dr-x------ 2 lothar users   0 Jan  1  1970 doc
srw------- 1 lothar users   0 Sep 27 08:30 drkonqi-coredump-launcher
drwx------ 2 lothar users 120 Sep 27 08:30 gnupg
dr-x------ 2 lothar users   0 Sep 27 08:30 gvfs
drwx------ 2 lothar users  40 Sep 27 08:30 gvfsd
drwx------ 2 lothar users  60 Sep 27 08:30 keyring
srwxr-xr-x 1 lothar users   0 Sep 27 08:30 kwallet5.socket
srw-rw-rw- 1 lothar users   0 Sep 27 08:30 pipewire-0
-rw-r----- 1 lothar users   0 Sep 27 08:30 pipewire-0.lock
drwx------ 2 lothar users  80 Sep 27 08:30 pulse
drwxr-xr-x 7 lothar users 180 Sep 27 08:30 systemd
lothar@orion:~> ll /run/user/1000/doc
total 0
dr-x------. 2 lothar users 0 Jan  1  1970 9a2a1c13
dr-x------  2 lothar users 0 Jan  1  1970 by-app
lothar@orion:~> ll /run/user/1000/gvfs
total 0

This is the output of mount for these file systems:

tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size=3210296k,nr_inodes=802574,mode=700,uid=1000,gid=100,inode64)
portal on /run/user/1000/doc type fuse.portal (rw,nosuid,nodev,relatime,user_id=1000,group_id=100)
gvfsd-fuse on /run/user/1000/gvfs type fuse.gvfsd-fuse (rw,nosuid,nodev,relatime,user_id=1000,group_id=100)

It seems to be a problem of the filesystem types fuse.portal and fuse.gvfsd-fuse.
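For anyone wanting to check their own server for mounts like these, here is a minimal Python sketch based on this thread's diagnosis (it is not code from rpc.mountd): it walks /proc/mounts-style data and reports mountpoints that stat() rejects. The stat parameter is injectable so the behaviour can be demonstrated without an actual fuse mount; octal escapes in mountpoint names are ignored for simplicity.

```python
import os

def unstatable_mountpoints(mounts_text, stat=os.stat):
    """Return (mountpoint, fstype, error) for every mount in
    /proc/mounts-style text whose mountpoint cannot be stat()ed --
    the situation rpc.mountd hit with fuse.portal above."""
    bad = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mountpoint, fstype = fields[1], fields[2]
        try:
            stat(mountpoint)
        except OSError as e:
            bad.append((mountpoint, fstype, e.strerror))
    return bad

if __name__ == "__main__":
    # On a Linux system, scan the real mount table (run as root to see
    # the failures root sees).
    with open("/proc/mounts") as f:
        for mp, fstype, err in unstatable_mountpoints(f.read()):
            print(f"{mp} ({fstype}): {err}")
```

Run as root on the affected server, this would be expected to list /run/user/1000/doc and /run/user/1000/gvfs, matching the ls errors shown above.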
Thanks for all the details. I have a fairly good idea what is happening now, and most of why.

These two "fuse" filesystems are probably working as designed. There is no need for root to be able to see their contents, and it is perfectly reasonable for them to reject any access by root. The problem is very much in rpc.mountd, which either shouldn't look at them or shouldn't respond this way to the error. I don't entirely understand the piece of code that responds to that error, and I've emailed the author to hopefully get an explanation.

The change in nfs-utils 2.6.3 which resulted in the mount failures was actually a bug-fix: rpc.mountd wasn't handling "/" exports quite properly. Fixing that exposed this other problem, that strange filesystems under a crossmnt export cause problems.

As you have a stable work-around I won't rush a fix until I fill in the missing parts of the story, but it's likely that a proper fix will be out within a month.
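The shape of fix implied here is to treat an unexaminable filesystem as "skip it" rather than "stall the answer". A minimal sketch of that defensive pattern follows; it is hypothetical, not the actual nfs-utils patch, and crossmnt_candidates is an invented name for illustration.

```python
import os

def crossmnt_candidates(submounts, stat=os.stat):
    """Given mountpoints beneath a crossmnt export, return the ones that
    can actually be examined.  A filesystem that rejects stat() (like the
    fuse.portal mount above) is skipped, instead of blocking the reply
    to the kernel's upcall and hanging the client's mount forever."""
    usable = []
    for mp in submounts:
        try:
            stat(mp)
        except OSError:
            continue  # unexaminable: leave it out, don't abort
        usable.append(mp)
    return usable
```

The design point is that one hostile or root-denying filesystem under / should cost at most its own visibility, never the whole export answer.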
(In reply to Neil Brown from comment #21)

I'm not familiar with the implementation of rpc.mountd, but I'm wondering why it now tries to access each and every mounted filesystem under /, even if it's completely irrelevant for the actual mount request. The client wants to mount /video, not /, and there's no other mounted filesystem under /video. And you said that the crossmnt option is now the default, so why is the behavior totally different with and without this option?

> As you have a stable work-around I won't rush a fix until I fill in the
> missing parts of the story, but it's likely that a proper fix will be out
> within a month.

Yes, just leaving out the redundant crossmnt option is an easy workaround.

Thanks and best regards
Lothar
Hello,

I'm not sure if this is related or not, but it seems like the latest nfs-client update broke the nfs client in some cases.

Context: I'm running multiple Kubernetes clusters on top of OpenSUSE MicroOS using the following Terraform module: https://github.com/kube-hetzner/terraform-hcloud-kube-hetzner

I'm using Rancher Longhorn for volumes in the cluster, and Longhorn's RWX volumes use NFS under the hood. Yesterday, a scheduled transactional-update upgraded all my nodes (including the nfs-client package from 2.6.3-39.4 to 2.6.3-39.5) and the RWX volumes started to fail. The error message in the journal was the following:

---
Oct 11 07:42:23 dev-worker-1 k3s[1294]: Mounting command: /usr/local/sbin/nsmounter
Oct 11 07:42:23 dev-worker-1 k3s[1294]: Mounting arguments: mount -t nfs -o vers=4.1,noresvport,intr,hard 10.43.207.185:/pvc-13538170-4278-4467-b2b0-1f1ba6f54a4c /var/lib/kubelet/plugins/kubernetes.io/csi/driver.longhorn.io/185c34f566c2eca6e8c7c6a2ede2094c076d7d25ddae286dc633eeef80551af0/globalmount
Oct 11 07:42:23 dev-worker-1-autoscaled-small-19baf778f50efd8c k3s[1294]: Output: mount.nfs: Protocol not supported for 10.43.207.185:/pvc-13538170-4278-4467-b2b0-1f1ba6f54a4c on /var/lib/kubelet/plugin
---

I also enabled debug logging in the kernel; you can find those logs in the related bug report. Anyway, after a few hours I decided to use MicroOS rollback, going back to the latest snapshot from two days ago, and the issue is solved now on that node.

Related bug report on GitHub: https://github.com/kube-hetzner/terraform-hcloud-kube-hetzner/issues/1016

Please let me know if I can provide you any additional information about this issue.

Regards,
Janos Miko
I've submitted a patch that fixes this problem. It will probably take a little while to get into Tumbleweed.

BTW, the crossmnt option isn't entirely redundant. There are some situations where it can be needed. If the server is rebooted while a client is accessing a filesystem that is not explicitly exported, the client may not handle the server reboot properly without the crossmnt option. So when the new nfs-kernel-server package lands (rpm -q --changelog nfs-kernel-server will mention this bug number), please put the crossmnt option back.
This is an autogenerated message for OBS integration: This bug (1214540) was mentioned in https://build.opensuse.org/request/show/1117914 Factory / nfs-utils