Bug 1214540 - NFS mount doesn't succeed any more
Summary: NFS mount doesn't succeed any more
Status: IN_PROGRESS
Alias: None
Product: openSUSE Tumbleweed
Classification: openSUSE
Component: Network (show other bugs)
Version: Current
Hardware: x86-64 openSUSE Tumbleweed
: P5 - None : Normal (vote)
Target Milestone: ---
Assignee: Neil Brown
QA Contact: E-mail List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-08-23 15:56 UTC by Lothar Paltins
Modified: 2023-10-16 01:25 UTC (History)
1 user (show)

See Also:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Attachments
trace-cmd output (444 bytes, application/octet-stream)
2023-08-30 21:49 UTC, Lothar Paltins
Details
tcpdump output (1.39 MB, application/octet-stream)
2023-08-30 21:50 UTC, Lothar Paltins
Details
rpc.mountd debug messages (2.07 KB, text/plain)
2023-09-04 11:13 UTC, Lothar Paltins
Details
/etc/exports (273 bytes, text/plain)
2023-09-18 10:37 UTC, Lothar Paltins
Details
mountd-bad.out (13.09 KB, text/plain)
2023-09-26 12:42 UTC, Lothar Paltins
Details
mountd-good.out (95.73 KB, text/plain)
2023-09-26 12:42 UTC, Lothar Paltins
Details
mountd-old.out (96.55 KB, text/plain)
2023-09-26 18:33 UTC, Lothar Paltins
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Lothar Paltins 2023-08-23 15:56:57 UTC
After one of the last Tumbleweed updates, the nfs kernel server doesn't work correctly any more. An nfs mount command from another machine doesn't return, it hangs forever and nothing is mounted.

Old working configuration
-------------------------
nfs-client-2.6.2-38.1
nfs-kernel-server-2.6.2-38.1

Syslog:
Aug 23 17:08:36 orion rpc.mountd[19233]: v4.0 client attached: 0x131b2e4a64e620dc from "192.168.1.21:991"

The mount command on the client machine returns immediately and the remote directory is mounted correctly.


New non working configuration
-----------------------------
nfs-client-2.6.3-39.4
nfs-kernel-server-2.6.3-39.4

Syslog:
Aug 23 16:47:37 orion rpc.mountd[18307]: auth_unix_ip: inbuf 'nfsd 192.168.1.21'
Aug 23 16:47:37 orion rpc.mountd[18307]: successful authentication for IP 192.168.1.21 as 192.168.1.0/24
Aug 23 16:47:37 orion rpc.mountd[18307]: auth_unix_ip: client 0x5619032d0770 '192.168.1.0/24'
Aug 23 16:47:37 orion rpc.mountd[18307]: v4.0 client attached: 0x131b2e4764e61c00 from "192.168.1.21:717"
Aug 23 16:47:37 orion rpc.mountd[18307]: nfsd_fh: inbuf '192.168.1.0/24 1 \x00000000'
Aug 23 16:47:37 orion rpc.mountd[18307]: nfsd_fh: found 0x5619032d2410 path /


The mount command on the client machine doesn't return and nothing is mounted. After killing the mount command, this line appears in the servers syslog:
Aug 23 16:53:55 orion rpc.mountd[18307]: v4.0 client detached: 0x131b2e4764e61c00 from "192.168.1.21:717"

There's no difference, whether NFSv4 or NFSv3 is used and there's no message in the syslog of the client machine.
Comment 1 Neil Brown 2023-08-30 06:57:00 UTC
Interesting ... I'm using tumbleweed with that version of nfs-client and server, and I don't see a problem.
A lot of NFS support is in the kernel rather than on the nfs-kernel-server package, so for completeness could you please report kernel version (uname -a) where it works and where it doesn't.

Also please collecting tracing logs on the server.
To do this run

  trace-cmd record -e sunrpc:svc_xdr\* -e sunrpc:svcsock\* -e nfsd:nfsd_read\*

wait until it says "Hit Ctrl^C to stop recording" and then attempt the mount.
When you are sure it isn't working, hit ctrl-C.
This should create "trace.dat".
Compress that file and attach it to this bug.

It might also help to capture a packet trace during the attempt.

  tcpdump -s 0 -w /tmp/dump.pcap port 2049

then attempt the mount, then interrupt the tcpdump, then compress and attach /tmp/dump.pcap
Comment 2 Lothar Paltins 2023-08-30 21:49:48 UTC
Created attachment 869138 [details]
trace-cmd output
Comment 3 Lothar Paltins 2023-08-30 21:50:48 UTC
Created attachment 869139 [details]
tcpdump output
Comment 4 Lothar Paltins 2023-08-30 22:05:04 UTC
Neil, thanks for looking into this issue. It's really strange, because it doesn't occur on a second machine with the same OS version. It started on the affected machine after a tumbleweed update that installed nfs-kernel-server-2.6.3-39.4. The kernel was also updated, but I don't think that it's a kernel issue, because it works with the new kernel after downgrading nfs-kernel-server to version 2.6.2-38.1.

This is the output of uname -a:

Linux orion.lopa.de 6.4.9-1-default #1 SMP PREEMPT_DYNAMIC Wed Aug  9 05:07:55 UTC 2023 (5b9ad20) x86_64 x86_64 x86_64 GNU/Linux

I've attached the compressed outputs of trace-cmd and tcpdump.
Comment 5 Neil Brown 2023-09-04 01:42:08 UTC
Thanks.
The significant detail is in the tcpdump capture.  The client sends a PUTROOTFH request, and gets a reply NFS4ERR_DELAY.
That means that the kernel asked rpc.mountd for some information about exported filesystems, and is waiting for an answer.
Then a few seconds later the client re-sends the request (as expected) and gets the same reply.
This suggests that rpc.mountd isn't working properly.

So please add the two lines:
 [mountd]
 debug=all

to /etc/nfs.conf.local (create the file if needed).
Then restart mountd:
  systemctl restart nfs-mountd

then attempt the mount.
Then
  journalctl -u nfs-mountd
and collect the output.  Maybe add "--since -10m" to only get the last 10 minutes of output, or something like that.

Then you can remove the two lines from nfs.conf.local and restart nfs-mountd.

Hopefully that log will provide some more hints.
Comment 6 Lothar Paltins 2023-09-04 11:12:21 UTC
(In reply to Neil Brown from comment #5)
My bug description already included the rpc.mountd syslog debug messages. I've now attached a file containing the messages in the good and the bad case. The first 6 messages are identical, but the new server stops after the message "nfsd_fh: found 0x56002b75d290 path /" and the next message "nfsd_export: inbuf '192.168.1.0/24 /'" doesn't occur. The "client detached" message comes in the good case after the umount command and in the bad case after killing the mount command on the client.
Comment 7 Lothar Paltins 2023-09-04 11:13:37 UTC
Created attachment 869233 [details]
rpc.mountd debug messages
Comment 8 Neil Brown 2023-09-17 23:41:57 UTC
Thanks for those debug messages.  I think they help.

I should have asked earlier:  Could you please provide your "/etc/exports" file, and /etc/nfs.conf.local if there is anything in that.
Also what "mount" command are you using?  i.e. what filesystem are you trying to mount from the server.

I found a change in 2.6.3 which looks questionable, but I really want to be able to reproduce the problem - but I cannot yet.  Hopefully if I can copy your config, and I can do it.

Sorry this is taking so long - I've had some distractions.
Comment 9 Lothar Paltins 2023-09-18 10:35:56 UTC
(In reply to Neil Brown from comment #8)

Hi Neil,

> I should have asked earlier:  Could you please provide your "/etc/exports"
> file, and /etc/nfs.conf.local if there is anything in that.

I've attached my servers (orion) /etc/exports. /etc/nfs.conf.local contains only

[mountd]
debug=all

> Also what "mount" command are you using?  i.e. what filesystem are you
> trying to mount from the server.

It's the ext4 filesystem /video that I want to mount and the client has it listed in /etc/fstab. But this causes the boot sequence to hang forever. So I commented out the line in /etc/fstab to be able to boot the client. A simple

mount orion:/video /mnt

on the client doesn't succeed and hangs forever until it is killed.

> Sorry this is taking so long - I've had some distractions.

No problem, because the nfs mount succeeds again after downgrading nfs-kernel-server to version 2.6.2-38.1. I only have to ignore the dependency on nfs-client = 2.6.2 ot to downgrade this package, too.

Thanks and best regards
Lothar
Comment 10 Lothar Paltins 2023-09-18 10:37:13 UTC
Created attachment 869571 [details]
/etc/exports
Comment 11 Neil Brown 2023-09-19 00:20:24 UTC
(In reply to Lothar Paltins from comment #10)
> Created attachment 869571 [details]
> /etc/exports

Thanks.  Could you please try removing:

/	192.168.1.0/24(fsid=0,crossmnt,rw,no_root_squash,sync,no_subtree_check)

That line shouldn't be needed (but it shouldn't cause a hang either....).  If that does work-around the problem, the I have little doubt that I'll find the code fix fairly quickly (once I make time).
Comment 12 Lothar Paltins 2023-09-19 10:53:37 UTC
(In reply to Neil Brown from comment #11)
> Thanks.  Could you please try removing:
> 
> /	192.168.1.0/24(fsid=0,crossmnt,rw,no_root_squash,sync,no_subtree_check)

Yes, this helps. But without this line I can only mount the exported sub-filesystems, but not the root filesystem. This is ok for mounting /video, but it's not a general workaround.

I played a little bit with different options in /etc/exports and it seems to be the crossmnt option that causes the hang. Without it, I can leave the line for / in /etc/exports. This is now the content:

/video 192.168.1.0/24(rw,no_root_squash,sync,no_subtree_check)
/home  192.168.1.0/24(rw,no_root_squash,sync,no_subtree_check)
/home/lothar/work      192.168.1.0/24(rw,no_root_squash,sync,no_subtree_check)
/	192.168.1.0/24(fsid=0,rw,no_root_squash,sync,no_subtree_check)

This works, but one thing I don't understand. If I execute

mount orion:/ /mnt

on the client, then the root directory of orion is mounted but all other exported directories from orion are also mounted. I thought, that the crossmnt option is needed for this behavior.
Comment 13 Neil Brown 2023-09-25 05:34:46 UTC
I still cannot reproduce this problem.

I might need to provide you an nfs-kernel-server package with lots of extra debugging so I can see in more detail what is happening.
But maybe we can get by with something simpler.

Could you please reproduce the problem again, but this time run

   strace -o /tmp/mountd.out -s 200 -p `pidof rpc.mountd`

on the server before attempting the mount.
Then after the test, interrupt the strace, and attach the /tmp/monntd.out file here.
Thanks.

The "crossmnt" flag is, to so extent, assumed when NFSv4 is used.
The man page for "man exports" says this at the end of the "nohide" section:


> This option is not relevant when NFSv4 is use.  NFSv4 never hides  subordinate  filesystems.   Any filesystem that is exported will be visible where expected when using NFSv4.

Then the "crossmnt" section starts "This  option  is  similar to nohide..." so you might say this behaviour is documented.
Comment 14 Lothar Paltins 2023-09-26 12:40:52 UTC
> Could you please reproduce the problem again, but this time run
> 
>    strace -o /tmp/mountd.out -s 200 -p `pidof rpc.mountd`
> 
> on the server before attempting the mount.

Of course, I should have thought about that earlier. I've attached 2 trace files, mountd-bad.out with the crossmnt option and mountd-good.out without it. The first 71 lines are almost identical, but beginning with line 72 they are totally different.
Comment 15 Lothar Paltins 2023-09-26 12:42:24 UTC
Created attachment 869754 [details]
mountd-bad.out
Comment 16 Lothar Paltins 2023-09-26 12:42:51 UTC
Created attachment 869755 [details]
mountd-good.out
Comment 17 Neil Brown 2023-09-26 13:00:26 UTC
Thanks.  In mountd-bad.out, the failure to stat /run/user/1000/doc is, I think, significant.

Could you get me a "good" trace using the same /etc/exports configuration as the bad trace, but using the older nfs-kernel-server package?
I would think that the /run/user/1000/doc failure would affect the older code too, but it seems that it doesn't.
Comment 18 Lothar Paltins 2023-09-26 18:32:42 UTC
(In reply to Neil Brown from comment #17)
> Thanks.  In mountd-bad.out, the failure to stat /run/user/1000/doc is, I
> think, significant.

This looks very strange, indeed. Something's wrong in this system directory. 1000 is my user id (lothar):

# ls -l /run/user/1000
ls: cannot access '/run/user/1000/gvfs': Permission denied
ls: cannot access '/run/user/1000/doc': Permission denied
total 4
-rw-r--r-- 1 lothar users  85 Sep 26 12:19 KSMserver__0
drwx------ 2 lothar users  60 Sep 26 12:19 at-spi
srw-rw-rw- 1 lothar users   0 Sep 26 12:19 bus
drwx------ 3 lothar users  60 Sep 26 12:19 dbus-1
drwx------ 2 lothar users  60 Sep 26 12:19 dconf
d????????? ? ?      ?       ?            ? doc
srw------- 1 lothar users   0 Sep 26 12:19 drkonqi-coredump-launcher
drwx------ 2 lothar users 120 Sep 26 12:19 gnupg
d????????? ? ?      ?       ?            ? gvfs
drwx------ 2 lothar users  40 Sep 26 12:19 gvfsd
drwx------ 2 lothar users  60 Sep 26 12:19 keyring
srwxr-xr-x 1 lothar users   0 Sep 26 12:19 kwallet5.socket
srw-rw-rw- 1 lothar users   0 Sep 26 12:19 pipewire-0
-rw-r----- 1 lothar users   0 Sep 26 12:19 pipewire-0.lock
drwx------ 2 lothar users  80 Sep 26 12:19 pulse
drwxr-xr-x 7 lothar users 180 Sep 26 12:19 systemd

But that's probably a different bug, it shouldn't affect rpc.mountd.

> Could you get me a "good" trace using the same /etc/exports configuration as
> the bad trace, but using the older nfs-kernel-server package?

It's attached as mountd-old.out.

> I would think that the /run/user/1000/doc failure would affect the older
> code too, but it seems that it doesn't.

Both the new code without the crossmnt option and the old code independent of the crossmnt option don't try to access this path.
Comment 19 Lothar Paltins 2023-09-26 18:33:33 UTC
Created attachment 869769 [details]
mountd-old.out
Comment 20 Lothar Paltins 2023-09-27 21:15:41 UTC
It's strange, but the directories, that are seemingly corrupted, are only looking corrupted for the user root. From my user account, everything seems to be correct:

lothar@orion:~> ll /run/user/1000
total 4
-rw-r--r-- 1 lothar users  85 Sep 27 08:30 KSMserver__0
drwx------ 2 lothar users  60 Sep 27 08:30 at-spi
srw-rw-rw- 1 lothar users   0 Sep 27 08:30 bus
drwx------ 3 lothar users  60 Sep 27 08:30 dbus-1
drwx------ 2 lothar users  60 Sep 27 12:19 dconf
dr-x------ 2 lothar users   0 Jan  1  1970 doc
srw------- 1 lothar users   0 Sep 27 08:30 drkonqi-coredump-launcher
drwx------ 2 lothar users 120 Sep 27 08:30 gnupg
dr-x------ 2 lothar users   0 Sep 27 08:30 gvfs
drwx------ 2 lothar users  40 Sep 27 08:30 gvfsd
drwx------ 2 lothar users  60 Sep 27 08:30 keyring
srwxr-xr-x 1 lothar users   0 Sep 27 08:30 kwallet5.socket
srw-rw-rw- 1 lothar users   0 Sep 27 08:30 pipewire-0
-rw-r----- 1 lothar users   0 Sep 27 08:30 pipewire-0.lock
drwx------ 2 lothar users  80 Sep 27 08:30 pulse
drwxr-xr-x 7 lothar users 180 Sep 27 08:30 systemd
lothar@orion:~> ll /run/user/1000/doc
total 0
dr-x------. 2 lothar users 0 Jan  1  1970 9a2a1c13
dr-x------  2 lothar users 0 Jan  1  1970 by-app
lothar@orion:~> ll /run/user/1000/gvfs
total 0

This is the output of mount for these file systems:

tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size=3210296k,nr_inodes=802574,mode=700,uid=1000,gid=100,inode64)
portal on /run/user/1000/doc type fuse.portal (rw,nosuid,nodev,relatime,user_id=1000,group_id=100)
gvfsd-fuse on /run/user/1000/gvfs type fuse.gvfsd-fuse (rw,nosuid,nodev,relatime,user_id=1000,group_id=100)

It seems to be a problem of the file system types fuse.portal and fuse.gvfsd-fuse.
Comment 21 Neil Brown 2023-09-27 22:23:35 UTC
Thanks for all the details.  I have a fairly good idea what is happening now, and most of why.

These two "fuse" filesystem are probably working as designed.  There is no need for root to be able to see the contents and it is perfectly reasonable for them to reject any access by root.  The problem is very much in rpc.mountd which either shouldn't look at them or shouldn't respond this way to the error.
I don't entirely understand the piece of code that responds to that error and I've email the author to hopefully get an explanation.

The change in nfs-utils 2.6.3 which made resulted in the mount failures was actually a bug-fix.  rpc.mountd wasn't handing "/" exports quite properly.  Fixing that exposed this other problem that strange filesystems under a crossmnt export cause problems.

As you have a stable work-around I won't rush a fix until I fill in them missing parts of the story, but it's likely that a proper fix will be out within a month.
Comment 22 Lothar Paltins 2023-09-28 08:01:03 UTC
(In reply to Neil Brown from comment #21)

I'm not familiar with the implementation of rpc.mountd, but I'm wondering, why it now tries to access each and every mounted filesystem under /, even if it's absolutely unimportant for the actual mount request. The client wants to mount /video and not / and there's no other mounted filesystem under /video. And you said, that the crossmnt option is now the default, so why is the behavior totally different with or without this option?

> As you have a stable work-around I won't rush a fix until I fill in them
> missing parts of the story, but it's likely that a proper fix will be out
> within a month.

Yes, just leaving out the redundant crossmnt option is an easy workaround.

Thanks and best regards
Lothar
Comment 23 Janos Miko 2023-10-11 20:46:49 UTC
Hello,

I'm not sure if this is related or not, but it seems like the latest nfs-client update broke the nfs client in some cases.

Context: I'm running multiple Kubernetes clusters on top of OpenSUSE MicroOS using  the following Terraform module:
https://github.com/kube-hetzner/terraform-hcloud-kube-hetzner

I'm using Rancher Longhorn for volumes in the cluster and Longhorn's RWX volumes are using nfs under the hood.

Yesterday, a transactional-update scheduled update upgraded all my nodes (including nfs-client package from 2.6.3-39.4 to 2.6.3-39.5) and the RWX volumes started to fail.

The error message was the following in the journal:
---
Oct 11 07:42:23 dev-worker-1 k3s[1294]: Mounting command: /usr/local/sbin/nsmounter
Oct 11 07:42:23 dev-worker-1 k3s[1294]: Mounting arguments: mount -t nfs -o vers=4.1,noresvport,intr,hard 10.43.207.185:/pvc-13538170-4278-4467-b2b0-1f1ba6f54a4c /var/lib/kubelet/plugins/kubernetes.io/csi/driver.longhorn.io/185c34f566c2eca6e8c7c6a2ede2094c076d7d25ddae286dc633eeef80551af0/globalmount
Oct 11 07:42:23 dev-worker-1-autoscaled-small-19baf778f50efd8c k3s[1294]: Output: mount.nfs: Protocol not supported for 10.43.207.185:/pvc-13538170-4278-4467-b2b0-1f1ba6f54a4c on /var/lib/kubelet/plugin
---

I also enabled debug logging in the kernel, you can find those logs in the related bug report.

Anyway, after a few hours I decided to use MicroOS rollback going back to the latest snapshot from 2 days ago and the issue is solved now on that node.

Related bug report on Github:
https://github.com/kube-hetzner/terraform-hcloud-kube-hetzner/issues/1016

Please let me know if I can provide you any additional information about this issue.

Regards,
Janos Miko
Comment 24 Neil Brown 2023-10-16 01:03:37 UTC
I've submitted a patch that fixes this problem. It will probably take a little while to get into Tumbleweed.

BTW the crossmnt option isn't entirely redundant.  There are some situations where it can be needed.  If the server is rebooted while a client is accessing a filesystem that is not explicitly exported, then the client may not handle the server reboot properly without the crossmnt option.

So when the new nfs-kernel-server package lands (rpm -q --changelog nfs-kernel-server will mention this bug number) please put the crossmnt option back.
Comment 25 OBSbugzilla Bot 2023-10-16 01:25:03 UTC
This is an autogenerated message for OBS integration:
This bug (1214540) was mentioned in
https://build.opensuse.org/request/show/1117914 Factory / nfs-utils