Bug 134366 - Nfsv4 : PyNFS tests failing - LOCK with newlockowner's lockseqid=1 should return NFS4ERR_BAD_SEQID, instead got NFS4_OK
Summary: Nfsv4 : PyNFS tests failing - LOCK with newlockowner's lockseqid=1 should ret...
Status: VERIFIED WONTFIX
: 134368 (view as bug list)
Alias: None
Product: SUSE Linux 10.1
Classification: openSUSE
Component: Kernel (show other bugs)
Version: Alpha 2
Hardware: Other SuSE Linux 10.0
: P5 - None : Normal (vote)
Target Milestone: ---
Assignee: Neil Brown
QA Contact: E-mail List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2005-11-18 09:59 UTC by Forgotten User b5BnQSUi71
Modified: 2005-12-21 05:06 UTC (History)
0 users

See Also:
Found By: Other
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Attachments
upstream patch which mackes changes to lock_seqid (3.84 KB, patch)
2005-11-20 22:42 UTC, Neil Brown
Details | Diff
lock seqid provided by the client should be zero. RFC section 8.1.5 (756 bytes, patch)
2005-11-25 11:49 UTC, Forgotten User b5BnQSUi71
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Forgotten User b5BnQSUi71 2005-11-18 09:59:05 UTC
Pynfs (now called newpynfs), a suite of several Python tools for NFS4 is used for testing NFSv4 RFC compliance. It can be downloaded here http://www.citi.umich.edu/projects/nfsv4/pynfs/

Steps to reproduce the bug:
1. Setup NFSv4 server
2. Install newpynfs
3. Run the pynfs testserver.py script
./testserver.py --maketree <hostname> all -v
        or
python testserver.py --maketree <hostname> all -v
4. The results will be store in out_last file
5. Redirect results to a text file
./showresults.py out_last > SUSE10.1.txt
6. The following test is failing 
LOCK8c   st_lock.testNonzeroLockSeqid                     : FAILURE
    LOCK with newlockowner's lockseqid=1 should return
    NFS4ERR_BAD_SEQID, instead got NFS4_OK

Note: These tests are passing in other SUSE Linux Professional 9.3 and other distors like Debian & RHEL 4.0 AS
Comment 1 Olaf Kirch 2005-11-18 10:23:26 UTC
Hi Neil - is it OK to assign NFSv4 bugs to you?
Comment 2 Neil Brown 2005-11-18 10:36:09 UTC
Yes, I'll take server Bugs at least ... maybe even a few client bugs.  I'll explore on Monday and see what I can find.
Comment 3 Neil Brown 2005-11-20 22:40:48 UTC
Hmm. looks like it was broken inmMid June, and fixed in mid September, and
10.1 is based on 2.6.13 which is between those dates.

I'll attach a patch which has gone upstream and may well fix this problem (though 
I haven't understood the code sufficiently yet to be sure exactly what is
causing the problem).

IF you could test it and report back, I'd appreciate it.

Thanks,
NeilBrown
Comment 4 Neil Brown 2005-11-20 22:42:08 UTC
Created attachment 57821 [details]
upstream patch which mackes changes to lock_seqid

Please test this patch in SL10.1 and report back.
Comment 5 Neil Brown 2005-11-20 22:43:24 UTC
(Grumble ... why do I have to go to a separate page to attach things,
 and so require 3 comments - one for comment, one for attachment, one to change status...)
Comment 6 Forgotten User b5BnQSUi71 2005-11-21 07:14:37 UTC
Have tested the patch. The patch is not fixing the problem. 
In fact this patch introduced a new failure,
CID4     st_setclientid.testAllCases                        : FAILURE
       OP_SETCLIENTID should return NFS4_OK, instead got
       NFS4ERR_INVAL
which was passing earlier.

Thanks,
Suresh
Comment 7 Neil Brown 2005-11-21 22:08:27 UTC
Ok, thanks.  I will look more deeply and see what I can find.

Meanwhile, are you able to test the lastest kernel.org kernel - e.g. 2.6.15-rc2. That would help me know if it has been fixed somewhere in recent patches.

Thanks.
Comment 8 Neil Brown 2005-11-21 22:34:07 UTC
I've just been reading through RFC3530 and I can find no suggestion that
the lock_seqid must start at 0, as the failing test implies.

The closest I can find is:
    The starting value of the seqid field is undefined
at the end of section 2.2.  It could be argued that this is talking about a different seqid, but if it is, there are no statements at all about the
initial value of lock_seqid.
So I think this is a problem with newpynfs rather than the nfs server.
However I am still interested in test results against 2.6.15-rc2.
Thanks.
Comment 9 Forgotten User b5BnQSUi71 2005-11-23 11:16:33 UTC
*** Bug 134368 has been marked as a duplicate of this bug. ***
Comment 10 Forgotten User b5BnQSUi71 2005-11-23 12:41:24 UTC
RFC doesn't have much info about lock_seqid. I think the info. present in section 2.2 is related to the lock_seqid in question. I shall send a mail to Peter Astrand (author of newpynfs) asking for clarification and get back.

Couldn't find 2.5.15-rc2 kernel on http://kernel.org. Have just downloaded from other mirror. Will test and report back.
Comment 11 Forgotten User b5BnQSUi71 2005-11-23 13:30:15 UTC
Pasting the response from Trond.
***
See the 1st paragraph of section 8.1.5. "Sequencing of Lock Requests".

        The first request issued for any given lock_owner is issued with
        a sequence number of zero.

Note that this only applies to lock_owners, not to open_owners.
***
Comment 12 Forgotten User b5BnQSUi71 2005-11-25 04:05:34 UTC
The Bug is present in the latest 2.6.15-rc2 kernel also. The test is failing.
However the tests are passing in SUSE 9.3 professional which has 2.6.11.4-20a.
My guess is that the bug could have been introduced somewhere between the 2.6.11 and 2.6.13

Thanks,
Suresh
Comment 13 Forgotten User b5BnQSUi71 2005-11-25 11:49:02 UTC
Created attachment 58720 [details]
lock seqid provided by the client should be zero. RFC section 8.1.5

Out of interest, I started looking at the server code. Created a small patch (attached) which fixes the problem. As Iam not familiar with the code, I would you and Olaf to review the patch before I can move this to RESOLVED FIXED.

Neil & Olaf, please comment on the patch.
Comment 14 Forgotten User b5BnQSUi71 2005-11-25 11:50:06 UTC
Have tested this patch. It's fixing the problem and didn't introduce any new failures.
Comment 15 Neil Brown 2005-11-28 00:16:47 UTC
You patch adds back code that was in 2.6.11.  Given this hint (thanks) I went looking to find out why that code was removed.  The comment with the patch which rmeoved this code is below.  I tend to agree that the check really don't add any value to the server. 
It would certainly be worth checking that lock reclaim from the Linux client still works...
Are you able to test that?

----------------
nfsd4: relax new lock seqid check

We're insisting that the lock sequence id field passed in the open_to_lockowner
struct always be zero.  This is probably thanks to the sentence in rfc3530:
"The first request issued for any given lock_owner is issued with a sequence
number of zero."

But there doesn't seem to be any problem with allowing initial sequence numbers
other than zero.  And currently this is causing lock reclaims from the Linux
client to fail.

In the spirit of "be liberal in what you accept, conservative in what you
send", we'll relax the check (and patch the Linux client as well).
Comment 16 Forgotten User b5BnQSUi71 2005-11-30 06:25:29 UTC
Facing some issues in testing lock reclaim behaviour.
 
However, Iam following the discussions in the NFSv4 mailing list regarding new lock_seqid issue.
http://linux-nfs.org/pipermail/nfsv4/2005-November/002978.html
Bruce says "Reverting the patch would not cause any problems". Iam assuming the only advantage Bruce's patch has is that the security advantage in randomising the new lock_seqid. Shall we wait for the discussion on nfsv4@ietf.org to complete and then decide on closing the issue? 

BTW, Iam not subscribed to nfsv4@ietf.org. Would it be helpful to do so?

Thanks,
Suresh
Comment 17 Neil Brown 2005-12-04 23:11:27 UTC
Yes, it probably would be a good idea to join nfsv4@ietf.org

The general directory of the discussion has been that insisting on the
seqid starting a 0 is probably in appropriate.  However on other 
server implementation does reject sequences that don't start at 0.

I don't think there is real value in rejecting such sequences and
suggest we closed this bug as wont-fix.  Would you be happy with that?

Comment 18 Forgotten User b5BnQSUi71 2005-12-05 09:23:21 UTC
As there is no significant value addition, you can go ahead and close this bug as won't fix. However, how should we proceed in making newpynfs not to complain about the non-zero new lock_seqid? Does it require change in the RFC itself? If yes, how to take it forward?

Meanwhile, I shall send a mail to nfsv4-wg@citi.umich.edu to report the newpynfs behaviour.
Comment 19 Forgotten User b5BnQSUi71 2005-12-09 06:09:09 UTC
Comment from Fredric Isaman: (NewPyNFS maintainer)

"I maintain newpynfs.  I agree the requirement is not useful, but 8.1.5 IS
in the RFC, and I think the tests should match the RFC specs.  However,
there are several tests like this one (the utf-8 tests for example) that
are in some sense "lower priority", and can drown out more critical
failures.  I will soon add some sort of priority filter to the tests so it
will be easier to ignore these failures, but I would still say the are
indeed failures."

Neil, you can close this bug as "Won't fix". We may have to try and convince newpynfs maintainer to ignore this check in future.
Comment 20 Neil Brown 2005-12-11 22:19:16 UTC
I think the goal with newpynfs would be to make this check report a warning rather than an error.

Closed.

Thanks,
NeilBrown
Comment 21 Forgotten User b5BnQSUi71 2005-12-21 05:06:49 UTC
Moving this Bug to NFSv4 Bugzilla(on newpynfs) for tracking the changes in newpynfs.

http://developer.osdl.org/dev/nfsv4/bugzilla-new/show_bug.cgi?id=76

Closed.

Thanks,
Suresh