Bugzilla – Full Text Bug Listing

| Summary: | Can't lock file on NFS from openSUSE-11.0 | ||
|---|---|---|---|
| Product: | [openSUSE] openSUSE 11.0 | Reporter: | Petr Mladek <pmladek> |
| Component: | Basesystem | Assignee: | Neil Brown <nfbrown> |
| Status: | RESOLVED FIXED | QA Contact: | E-mail List <qa-bugs> |
| Severity: | Blocker | ||
| Priority: | P5 - None | CC: | coolo, mabrand, mmeeks, nfbrown |
| Version: | Beta 1 | ||
| Target Milestone: | --- | ||
| Hardware: | All | ||
| OS: | openSUSE 11.0 | ||
| Whiteboard: | |||
| Found By: | --- | Services Priority: | |
| Business Priority: | | Blocker: | --- |
| Marketing QA Status: | --- | IT Deployment: | --- |
| Bug Depends on: | |||
| Bug Blocks: | 383390 | ||
| Attachments: | Testcase; strace from 10.3; strace from SLED10-SP1; strace from 11.0; output from ls -lR /var/lib/nfs; Patch to start statd properly | ||

Description
Petr Mladek
2008-04-28 19:05:43 UTC
Created attachment 210952 [details]
Testcase.
You might try the following steps:
gcc test-lock.c
echo hello >~/test.txt
./a.out ~/test.txt
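The attached test-lock.c is not reproduced in this listing. As a rough, hypothetical stand-in, a test of this kind might request a non-blocking POSIX write lock with fcntl(F_SETLK) and report the errno on failure. The function name try_write_lock and the LOCK_DEMO_MAIN guard are invented here, and the real testcase may differ:

```c
/* Hypothetical stand-in for the attached test-lock.c (not the real file). */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Try to take a non-blocking POSIX write lock on the whole file.
 * Returns 0 on success, or the errno value of the call that failed. */
int try_write_lock(const char *path)
{
    struct flock fl;
    int fd, err = 0;

    fd = open(path, O_RDWR);
    if (fd < 0)
        return errno;

    memset(&fl, 0, sizeof(fl));
    fl.l_type = F_WRLCK;     /* exclusive (write) lock */
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;            /* 0 means "up to end of file" */

    if (fcntl(fd, F_SETLK, &fl) == -1)
        err = errno;

    close(fd);               /* closing the descriptor also drops the lock */
    return err;
}

#ifdef LOCK_DEMO_MAIN
int main(int argc, char **argv)
{
    int err;

    if (argc != 2) {
        fprintf(stderr, "usage: %s FILE\n", argv[0]);
        return 2;
    }
    err = try_write_lock(argv[1]);
    if (err != 0) {
        fprintf(stderr, "failed: %s\n", strerror(err));
        return 1;
    }
    puts("lock acquired");
    return 0;
}
#endif
```

Built with gcc -DLOCK_DEMO_MAIN and run as ./a.out ~/test.txt, this prints "lock acquired" when the lock is granted; otherwise it prints the error text for whichever errno open() or fcntl() set.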
Created attachment 210953 [details]
strace from 10.3
The locking did not work.
Created attachment 210954 [details]
strace from SLED10-SP1
I actually just booted SLED10-SP1 on the same machine. Then I chrooted into the 11.0 system and started exactly the same binary on exactly the same file from exactly the same nfs server.
Created attachment 210955 [details]
strace from 11.0
Urgh, please ignore the strace from 10.3. It was actually the strace from 11.0. I just put the wrong version in the file name and comment :-(
It works on 10.3 the same way as on SLED10-SP1. Only 11.0 does not work.

From the trace, the problem is that on openSUSE 11.0 you are getting the error 'ENOLCK' when trying to get a lock. If the server is known to work correctly (as seems to be the case), this suggests that 'statd' isn't running on your openSUSE 11.0 client. Please check if statd is running:
ps axgu | grep statd
rpcinfo -p
ls -lR /var/lib/nfs
Thanks.

Good catch, rpc.statd really was not running on 11.0:
root@golem:/> ps axgu | grep statd
root 3273 0.0 0.0 2288 792 pts/2 S+ 11:39 0:00 grep statd
root@golem:/> rpcinfo -p
program verz proto port
100000 2 tcp 111 portmapper
100000 2 udp 111 portmapper
100021 1 tcp 50936 nlockmgr
100021 3 tcp 50936 nlockmgr
100021 4 tcp 50936 nlockmgr
I wonder if it might be somewhat related to the new installation image magic.
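The rpcinfo listing above has entries for portmapper and nlockmgr but none for statd, which registers as RPC program 100024 (service name "status"). A minimal sketch of that check, assuming rpcinfo -p style text on a stream; the function name statd_registered is made up for illustration:

```c
#include <stdio.h>

/* RPC program number of the "status" service provided by rpc.statd. */
#define NSM_PROGRAM 100024UL

/* Scan `rpcinfo -p`-style output and report whether statd is registered.
 * Returns 1 if any line starts with program number 100024, else 0. */
int statd_registered(FILE *in)
{
    char line[256];
    unsigned long prog;
    int found = 0;

    while (fgets(line, sizeof(line), in) != NULL) {
        /* The header line does not start with a number, so sscanf
         * converts nothing and the line is skipped. */
        if (sscanf(line, "%lu", &prog) == 1 && prog == NSM_PROGRAM)
            found = 1;
    }
    return found;
}
```

The stream could come from popen("rpcinfo -p", "r"). Fed the listing above, the function returns 0, which matches the missing statd.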
Created attachment 211089 [details]
output from ls -lR /var/lib/nfs
OOo and the locking work correctly after I started "rpc.statd --no-notify" by hand.

Ok, closing out, this is not a kernel bug...

I agree that it is not a kernel bug but we still need to find why rpc.statd was not running => REOPENING
There is a note in /etc/init.d/nfs that statd should get started by mount.nfs when needed. Also the rpcinfo output looks suspicious. I am not an expert in this area, so I am not sure what to check, ...

really REOPEN

reassigning to a different group then, as this isn't a kernel issue...

Added Neil to CC because he maintains nfs-client. /etc/init.d/nfs and mount.nfs are part of this package, ...

Statd should be started when you first mount an NFS filesystem. The mount.nfs program will run
/usr/sbin/start-statd
Could you please check that this script is installed and executable? If you kill statd and then run /usr/sbin/start-statd, does statd start?
Thanks,

Everything seems to be fine. /usr/sbin/start-statd is on the system and is executable. If I kill statd and run start-statd, statd is started again.

So maybe mount isn't running start-statd like it should...
Can you kill statd, unmount the NFS filesystem, then mount it again and see if statd gets started?
Can you tell me more about the NFS filesystem that is causing problems? Is it automounted, or mounted by /etc/fstab, or mounted by hand? What are the mount options?

I tried to mount it via yast and it did not start statd. I tried it by hand with "mount -t nfs nfs.suse.cz:/home /home" and it did not start statd either. I did not use any special mount options.

OK, I've figured it out.
There are two quite separate branches of code in mount.nfs. One performs the mount using the 'old style' binary data structure to pass options to the kernel. The other uses the 'new style' text string to pass options to the kernel.
The code for checking and starting statd was only in the 'old style' branch.
I have committed a patch to STABLE which moves that code into common code.
I'll attach the patch for completeness. It has been sent upstream.

Created attachment 212522 [details]
Patch to start statd properly
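The fix described above follows a common pattern: a precondition that lived in only one of two code paths is hoisted into the shared caller. The following is a toy model of that pattern, not the actual nfs-utils code or the attached patch. Every name in it (do_mount, ensure_statd_running, the needs_locking flag) is hypothetical, and the idea that the check applies only to mounts that need locking is an assumption of this sketch:

```c
#include <stdio.h>

/* Toy model only: all names are hypothetical, not taken from nfs-utils. */

static int statd_checks;  /* counts how often the statd check ran */

/* Stand-in for "check whether statd is running and start it if not". */
static void ensure_statd_running(void)
{
    statd_checks++;
}

/* 'Old style' path: options passed as a binary structure (stubbed out). */
static int mount_binary_options(void) { return 0; }

/* 'New style' path: options passed as a text string (stubbed out). */
static int mount_string_options(void) { return 0; }

/* With the check in common code, it runs before the branch is chosen,
 * so neither path can skip it. */
int do_mount(int use_string_options, int needs_locking)
{
    if (needs_locking)
        ensure_statd_running();

    return use_string_options ? mount_string_options()
                              : mount_binary_options();
}
```

In the buggy layout the call would sit inside the binary-options function only, so a mount taking the string-options path would never trigger it.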

*** Bug 385289 has been marked as a duplicate of this bug. ***

It works for me on 11.0-beta2 => FIXED

*** Bug 221193 has been marked as a duplicate of this bug. ***