Bugzilla – Bug 134366
Nfsv4 : PyNFS tests failing - LOCK with newlockowner's lockseqid=1 should return NFS4ERR_BAD_SEQID, instead got NFS4_OK
Last modified: 2005-12-21 05:06:49 UTC
Pynfs (now called newpynfs), a suite of several Python tools for NFS4 is used for testing NFSv4 RFC compliance. It can be downloaded here http://www.citi.umich.edu/projects/nfsv4/pynfs/ Steps to reproduce the bug: 1. Setup NFSv4 server 2. Install newpynfs 3. Run the pynfs testserver.py script ./testserver.py --maketree <hostname> all -v or python testserver.py --maketree <hostname> all -v 4. The results will be store in out_last file 5. Redirect results to a text file ./showresults.py out_last > SUSE10.1.txt 6. The following test is failing LOCK8c st_lock.testNonzeroLockSeqid : FAILURE LOCK with newlockowner's lockseqid=1 should return NFS4ERR_BAD_SEQID, instead got NFS4_OK Note: These tests are passing in other SUSE Linux Professional 9.3 and other distors like Debian & RHEL 4.0 AS
Hi Neil - is it OK to assign NFSv4 bugs to you?
Yes, I'll take server Bugs at least ... maybe even a few client bugs. I'll explore on Monday and see what I can find.
Hmm. looks like it was broken inmMid June, and fixed in mid September, and 10.1 is based on 2.6.13 which is between those dates. I'll attach a patch which has gone upstream and may well fix this problem (though I haven't understood the code sufficiently yet to be sure exactly what is causing the problem). IF you could test it and report back, I'd appreciate it. Thanks, NeilBrown
Created attachment 57821 [details] upstream patch which mackes changes to lock_seqid Please test this patch in SL10.1 and report back.
(Grumble ... why do I have to go to a separate page to attach things, and so require 3 comments - one for comment, one for attachment, one to change status...)
Have tested the patch. The patch is not fixing the problem. In fact this patch introduced a new failure, CID4 st_setclientid.testAllCases : FAILURE OP_SETCLIENTID should return NFS4_OK, instead got NFS4ERR_INVAL which was passing earlier. Thanks, Suresh
Ok, thanks. I will look more deeply and see what I can find. Meanwhile, are you able to test the lastest kernel.org kernel - e.g. 2.6.15-rc2. That would help me know if it has been fixed somewhere in recent patches. Thanks.
I've just been reading through RFC3530 and I can find no suggestion that the lock_seqid must start at 0, as the failing test implies. The closest I can find is: The starting value of the seqid field is undefined at the end of section 2.2. It could be argued that this is talking about a different seqid, but if it is, there are no statements at all about the initial value of lock_seqid. So I think this is a problem with newpynfs rather than the nfs server. However I am still interested in test results against 2.6.15-rc2. Thanks.
*** Bug 134368 has been marked as a duplicate of this bug. ***
RFC doesn't have much info about lock_seqid. I think the info. present in section 2.2 is related to the lock_seqid in question. I shall send a mail to Peter Astrand (author of newpynfs) asking for clarification and get back. Couldn't find 2.5.15-rc2 kernel on http://kernel.org. Have just downloaded from other mirror. Will test and report back.
Pasting the response from Trond. *** See the 1st paragraph of section 8.1.5. "Sequencing of Lock Requests". The first request issued for any given lock_owner is issued with a sequence number of zero. Note that this only applies to lock_owners, not to open_owners. ***
The Bug is present in the latest 2.6.15-rc2 kernel also. The test is failing. However the tests are passing in SUSE 9.3 professional which has 2.6.11.4-20a. My guess is that the bug could have been introduced somewhere between the 2.6.11 and 2.6.13 Thanks, Suresh
Created attachment 58720 [details] lock seqid provided by the client should be zero. RFC section 8.1.5 Out of interest, I started looking at the server code. Created a small patch (attached) which fixes the problem. As Iam not familiar with the code, I would you and Olaf to review the patch before I can move this to RESOLVED FIXED. Neil & Olaf, please comment on the patch.
Have tested this patch. It's fixing the problem and didn't introduce any new failures.
You patch adds back code that was in 2.6.11. Given this hint (thanks) I went looking to find out why that code was removed. The comment with the patch which rmeoved this code is below. I tend to agree that the check really don't add any value to the server. It would certainly be worth checking that lock reclaim from the Linux client still works... Are you able to test that? ---------------- nfsd4: relax new lock seqid check We're insisting that the lock sequence id field passed in the open_to_lockowner struct always be zero. This is probably thanks to the sentence in rfc3530: "The first request issued for any given lock_owner is issued with a sequence number of zero." But there doesn't seem to be any problem with allowing initial sequence numbers other than zero. And currently this is causing lock reclaims from the Linux client to fail. In the spirit of "be liberal in what you accept, conservative in what you send", we'll relax the check (and patch the Linux client as well).
Facing some issues in testing lock reclaim behaviour. However, Iam following the discussions in the NFSv4 mailing list regarding new lock_seqid issue. http://linux-nfs.org/pipermail/nfsv4/2005-November/002978.html Bruce says "Reverting the patch would not cause any problems". Iam assuming the only advantage Bruce's patch has is that the security advantage in randomising the new lock_seqid. Shall we wait for the discussion on nfsv4@ietf.org to complete and then decide on closing the issue? BTW, Iam not subscribed to nfsv4@ietf.org. Would it be helpful to do so? Thanks, Suresh
Yes, it probably would be a good idea to join nfsv4@ietf.org The general directory of the discussion has been that insisting on the seqid starting a 0 is probably in appropriate. However on other server implementation does reject sequences that don't start at 0. I don't think there is real value in rejecting such sequences and suggest we closed this bug as wont-fix. Would you be happy with that?
As there is no significant value addition, you can go ahead and close this bug as won't fix. However, how should we proceed in making newpynfs not to complain about the non-zero new lock_seqid? Does it require change in the RFC itself? If yes, how to take it forward? Meanwhile, I shall send a mail to nfsv4-wg@citi.umich.edu to report the newpynfs behaviour.
Comment from Fredric Isaman: (NewPyNFS maintainer) "I maintain newpynfs. I agree the requirement is not useful, but 8.1.5 IS in the RFC, and I think the tests should match the RFC specs. However, there are several tests like this one (the utf-8 tests for example) that are in some sense "lower priority", and can drown out more critical failures. I will soon add some sort of priority filter to the tests so it will be easier to ignore these failures, but I would still say the are indeed failures." Neil, you can close this bug as "Won't fix". We may have to try and convince newpynfs maintainer to ignore this check in future.
I think the goal with newpynfs would be to make this check report a warning rather than an error. Closed. Thanks, NeilBrown
Moving this Bug to NFSv4 Bugzilla(on newpynfs) for tracking the changes in newpynfs. http://developer.osdl.org/dev/nfsv4/bugzilla-new/show_bug.cgi?id=76 Closed. Thanks, Suresh