Bug 1217488 - pynfs CID* tests on NFS v4.0 fail: OP_SETCLIENTID should return NFS4_OK, instead got NFS4ERR_DELAY
Status: IN_PROGRESS
Alias: None
Product: openSUSE Tumbleweed
Classification: openSUSE
Component: Kernel:Filesystems
Version: Current
Hardware: Other Other
Importance: P5 - None : Normal
Target Milestone: ---
Assignee: Neil Brown
QA Contact: Petr Vorel
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-11-24 19:58 UTC by Petr Vorel
Modified: 2024-05-10 06:51 UTC
CC List: 3 users

See Also:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Description Petr Vorel 2023-11-24 19:58:34 UTC
Although we started (because of #1217128) testing with cdmackay/pynfs.git, which has a fix for "nfs4lib.BadCompoundRes: operation OP_SETCLIENTID should return NFS4_OK, instead got NFS4ERR_DELAY" [1], we still get 9 tests failing with this error on Tumbleweed. Any idea what could be wrong now?

Or I can ask Calum Mackay on linux-nfs if you're busy with more important stuff.

[1] https://git.linux-nfs.org/?p=cdmackay/pynfs.git;a=commit;h=0d4d3fd0bb7a63860b46f3fed9e9ebf287ea51f8
Comment 1 Petr Cervinka 2024-04-17 09:35:45 UTC
This has now reached 15-SP6 as well, in the latest build 80.1: https://openqa.suse.de/tests/14048454#step/CID5/2


nfs4lib.BadCompoundRes: operation OP_SETCLIENTID should return NFS4_OK, instead got NFS4ERR_DELAY


Traceback (most recent call last):
  File "/root/pynfs/nfs4.0/lib/testmod.py", line 234, in run
    self.runtest(self, environment)
  File "/root/pynfs/nfs4.0/servertests/st_setclientid.py", line 363, in testLotsOfClients
    c.init_connection(id)
  File "/root/pynfs/nfs4.0/nfs4lib.py", line 407, in init_connection
    check_result(res)
  File "/root/pynfs/nfs4.0/nfs4lib.py", line 918, in check_result
    raise BadCompoundRes(resop, res.status, msg)
nfs4lib.BadCompoundRes: operation OP_SETCLIENTID should return NFS4_OK, instead got NFS4ERR_DELAY
Comment 2 Neil Brown 2024-04-22 02:03:04 UTC
pynfs only waits for 10 seconds for the DELAY error to go away.  I guess that isn't long enough.
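For illustration only, the kind of bounded retry described above can be sketched as follows. This is not the pynfs code: `retry_on_delay` and `send_op` are hypothetical names, and the 10 second timeout and 1 second interval are assumptions taken from the description rather than from the pynfs source. Only the NFS4ERR_DELAY value (10008) comes from the NFSv4 protocol definition.

```python
import time

NFS4_OK = 0
NFS4ERR_DELAY = 10008  # protocol error code for "try again later"


def retry_on_delay(send_op, timeout=10.0, interval=1.0,
                   clock=time.monotonic, sleep=time.sleep):
    """Call send_op() until it stops returning NFS4ERR_DELAY or time runs out.

    send_op is a hypothetical callable that sends one SETCLIENTID and
    returns the status code. The last status seen is returned, so a
    caller still sees NFS4ERR_DELAY if the server never recovers.
    """
    deadline = clock() + timeout
    status = send_op()
    while status == NFS4ERR_DELAY and clock() < deadline:
        sleep(interval)      # back off before asking the server again
        status = send_op()
    return status
```

If the server keeps answering NFS4ERR_DELAY for longer than the timeout, the caller gets NFS4ERR_DELAY back, which is what surfaces as BadCompoundRes in the traceback above.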

I think that failing the OP_SETCLIENTID just because there are already lots of clients is a bad choice. Certainly fail if there is a real shortage of memory, but not otherwise. Certainly look for idle clients to clean up, but don't fail.
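As a rough model of the policy proposed above (and explicitly not the kernel patch itself), the decision could look like this. Every name here (`admit_client`, `soft_limit`, `idle_after`, `memory_short`) is hypothetical, and the client table is simplified to a dict of last-activity timestamps.

```python
NFS4_OK = 0
NFS4ERR_DELAY = 10008


def admit_client(clients, new_id, now, soft_limit, idle_after, memory_short):
    """Decide whether a new client is admitted under the proposed policy.

    clients maps client id -> last-activity timestamp and is updated in place.
    Being over soft_limit only triggers cleanup of idle clients; the request
    is refused solely when memory_short is true.
    """
    if len(clients) >= soft_limit:
        # Reap clients that have been idle for at least idle_after seconds.
        idle = [cid for cid, last in clients.items() if now - last >= idle_after]
        for cid in idle:
            del clients[cid]
    if memory_short:
        return NFS4ERR_DELAY  # genuine resource shortage: ask the client to retry
    clients[new_id] = now
    return NFS4_OK
```

The point of the sketch is that the client count alone never produces NFS4ERR_DELAY; it only prompts a cleanup pass.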

I'll post a patch upstream and see what they think.
Comment 3 Petr Vorel 2024-04-22 04:59:02 UTC
Neil's patch on the mailing list: https://lore.kernel.org/linux-nfs/171375175915.7600.6526208866216039031@noble.neil.brown.name/

Thanks, Neil!
Comment 4 Petr Vorel 2024-04-23 15:16:20 UTC
Based on the upstream maintainer's comment that 1 GB is not enough [1], I tested with more RAM (QEMURAM=3600), which solved the problem [2]. Let's see whether Neil's v2 fix [3] gets merged upstream.

[1] https://lore.kernel.org/linux-nfs/ZiZnbV+htcvGuGQl@tissot.1015granger.net/
[2] http://quasar.suse.cz/tests/3237
[3] https://lore.kernel.org/linux-nfs/171385732687.7600.2864936377155228614@noble.neil.brown.name/