Bug 151133 - NFS lock reclaiming is not working
Summary: NFS lock reclaiming is not working
Status: RESOLVED INVALID
Alias: None
Product: SUSE LINUX 10.0
Classification: openSUSE
Component: Network (show other bugs)
Version: unspecified
Hardware: i586 SuSE Pro 9.2
: P5 - None : Normal
Target Milestone: ---
Assignee: E-mail List
QA Contact: E-mail List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-02-15 11:28 UTC by Asha yarangatta
Modified: 2006-02-16 14:56 UTC (History)
1 user (show)

See Also:
Found By: Other
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Asha yarangatta 2006-02-15 11:28:49 UTC
The problem is that after the NFS sever machine reboots its sm-notify sends a
notification to all NFS clients that had locking activity but the clients
fail to reclaim their locks. So /var/lib/nfs/sm directory is not getting updated.

I tried it with SUSE9 SP2. The debugging messages of lockd on client are:

Feb 15 22:36:03 sgmlx1 kernel: lockd: nlm_host_rebooted("sgmlx2")
Feb 15 22:36:03 sgmlx1 kernel: nlmsvc_retry_blocked(00000000, when=0)
Comment 1 Michael Gross 2006-02-15 12:29:42 UTC
> I tried it with SUSE9 SP2.
You mean SLES 9 SP2 here?

Please give more information about the configuration and add more of your syslog (about 200 lines).
Comment 2 Asha yarangatta 2006-02-16 06:18:55 UTC

(In reply to comment #1)
> > I tried it with SUSE9 SP2.
> You mean SLES 9 SP2 here?
> Please give more information about the configuration and add more of your
> syslog (about 200 lines).

My NFS Server and NFS client are on SLES9 SP2. My NFS server is sgmlx1(15.70.191.172) and NFS client is sgmlx2(15.70.191.173). I mounted NFS file system on the client(mount 15.70.191.172:/test /test.). Then on the client, I acquired a lock on NFS shared file using fcntl. On NFS server, an entry was made for the client in sm directory. I restarted NFS server and then started sm-notify application. The client failed to reclaim the lock and I could acquire the lock on the same file from another client. 

The reclaiming of locks was working fine on base SLES9. It started failing after I updated SLES9 with SP2.

I have captured lockd debugging messages and network tetheral trace on client and posting the same.

Debugging Messages of lockd on client:

Feb 16 11:04:44 sgmlx2 kernel: NFS lockd/statd started (ver 0.5).
Feb 16 11:04:50 sgmlx2 kernel: lockd: nlm_lookup_host(0f46bfac, p=6, v=4, my role=client, name=15.70.191.172)
Feb 16 11:04:50 sgmlx2 kernel: lockd: host garbage collection
Feb 16 11:04:50 sgmlx2 kernel: lockd: nlmsvc_mark_resources
Feb 16 11:04:50 sgmlx2 kernel: lockd: nlm_bind_host(0f46bfac)
Feb 16 11:05:35 sgmlx2 last message repeated 4 times
Feb 16 11:05:35 sgmlx2 kernel: lockd: release host 15.70.191.172
Feb 16 11:05:35 sgmlx2 kernel: lockd: get host 15.70.191.172
Feb 16 11:06:34 sgmlx2 kernel: lockd: request from 0f46bfac
Feb 16 11:06:34 sgmlx2 kernel: lockd: nlm_host_rebooted("sgmlx1")
Feb 16 11:06:34 sgmlx2 kernel: nlmsvc_retry_blocked(00000000, when=0)
Feb 16 11:11:43 sgmlx2 kernel: device eth0 left promiscuous mode
Feb 16 11:11:46 sgmlx2 kernel: lockd: nlm_lookup_host(0f46bfac, p=6, v=4, my role=client, name=15.70.191.172)
Feb 16 11:11:46 sgmlx2 kernel: lockd: host garbage collection
Feb 16 11:11:46 sgmlx2 kernel: lockd: nlmsvc_mark_resources
Feb 16 11:11:46 sgmlx2 kernel: nlm_gc_hosts skipping 15.70.191.172 (cnt 1 use 0 exp 2355525)
Feb 16 11:11:46 sgmlx2 kernel: lockd: get host 15.70.191.172
Feb 16 11:11:46 sgmlx2 kernel: lockd: nlm_bind_host(0f46bfac)
Feb 16 11:12:01 sgmlx2 kernel: lockd: release host 15.70.191.172
Feb 16 11:12:01 sgmlx2 kernel: lockd: release host 15.70.191.172

Tethereal network trace of what client is seeing:

 1   0.000000 15.70.191.172 -> 224.0.0.251  IGMP V2 Membership Report
  2   0.792683 15.70.191.172 -> 224.0.1.22   IGMP V2 Membership Report
  3   1.386484 FoundryN_70:ab:00 -> Broadcast    ARP Who has 15.70.191.172?  Tell 15.70.191.1
  4  12.614506 FoundryN_70:ab:00 -> Broadcast    ARP Who has 15.70.191.172?  Tell 15.70.191.1
  5  36.068268 15.70.191.173 -> 15.70.191.172 TCP phonebook > sunrpc [SYN] Seq=0 Ack=0 Win=5840 Len=0 MSS=1460 TSV=2183646 TSER=0 WS=0
  6  36.068577 15.70.191.172 -> 15.70.191.173 TCP sunrpc > phonebook [SYN, ACK] Seq=0 Ack=1 Win=5792 Len=0 MSS=1460 TSV=226815 TSER=2183646 WS=0
  7  36.068599 15.70.191.173 -> 15.70.191.172 TCP phonebook > sunrpc [ACK] Seq=1 Ack=1 Win=5840 Len=0 TSV=2183646 TSER=226815
  8  36.068670 15.70.191.173 -> 15.70.191.172 Portmap V2 DUMP Call
  9  36.068824 15.70.191.172 -> 15.70.191.173 TCP sunrpc > phonebook [ACK] Seq=1 Ack=45 Win=5792 Len=0 TSV=226815 TSER=2183646
 10  36.069313 15.70.191.172 -> 15.70.191.173 Portmap V2 DUMP Reply (Call In 8)[Unreassembled Packet]
 11  36.069320 15.70.191.173 -> 15.70.191.172 TCP phonebook > sunrpc [ACK] Seq=45 Ack=401 Win=6432 Len=0 TSV=2183647 TSER=226815
 12  36.069481 15.70.191.172 -> 15.70.191.173 RPC Continuation
 13  36.069486 15.70.191.173 -> 15.70.191.172 TCP phonebook > sunrpc [ACK] Seq=45 Ack=497 Win=6432 Len=0 TSV=2183647 TSER=226816
 14  36.069508 15.70.191.173 -> 15.70.191.172 TCP phonebook > sunrpc [FIN, ACK] Seq=45 Ack=497 Win=6432 Len=0 TSV=2183647 TSER=226816
 15  36.069560 15.70.191.173 -> 15.70.191.172 TCP 768 > 812 [SYN] Seq=0 Ack=0 Win=5840 Len=0 MSS=1460 TSV=2183647 TSER=0 WS=0
 16  36.069782 15.70.191.172 -> 15.70.191.173 TCP 812 > 768 [SYN, ACK] Seq=0 Ack=1 Win=5792 Len=0 MSS=1460 TSV=226816 TSER=2183647 WS=0
 17  36.069793 15.70.191.173 -> 15.70.191.172 TCP 768 > 812 [ACK] Seq=1 Ack=1 Win=5840 Len=0 TSV=2183647 TSER=226816
 18  36.069810 15.70.191.172 -> 15.70.191.173 TCP sunrpc > phonebook [FIN, ACK] Seq=497 Ack=46 Win=5792 Len=0 TSV=226816 TSER=2183647
 19  36.069820 15.70.191.173 -> 15.70.191.172 TCP phonebook > sunrpc [ACK] Seq=46 Ack=498 Win=6432 Len=0 TSV=2183647 TSER=226816
 20  36.069848 15.70.191.173 -> 15.70.191.172 MOUNT V3 MNT Call
 21  36.070005 15.70.191.172 -> 15.70.191.173 TCP 812 > 768 [ACK] Seq=1 Ack=89 Win=5792 Len=0 TSV=226816 TSER=2183647
 22  36.119549 15.70.191.172 -> 15.70.191.173 MOUNT V3 MNT Reply (Call In 20)
 23  36.119556 15.70.191.173 -> 15.70.191.172 TCP 768 > 812 [ACK] Seq=89 Ack=61 Win=5840 Len=0 TSV=2183697 TSER=226866
 24  36.119648 15.70.191.173 -> 15.70.191.172 TCP cadlock > sunrpc [SYN] Seq=0 Ack=0 Win=5840 Len=0 MSS=1460 TSV=2183697 TSER=0 WS=0
 25  36.119865 15.70.191.172 -> 15.70.191.173 TCP sunrpc > cadlock [SYN, ACK] Seq=0 Ack=1 Win=5792 Len=0 MSS=1460 TSV=226866 TSER=2183697 WS=0
 26  36.119877 15.70.191.173 -> 15.70.191.172 TCP cadlock > sunrpc [ACK] Seq=1 Ack=1 Win=5840 Len=0 TSV=2183697 TSER=226866
 27  36.119908 15.70.191.173 -> 15.70.191.172 Portmap V2 GETPORT Call NFS(100003) V:3 TCP
 28  36.120067 15.70.191.172 -> 15.70.191.173 TCP sunrpc > cadlock [ACK] Seq=1 Ack=61 Win=5792 Len=0 TSV=226866 TSER=2183698
 29  36.120273 15.70.191.172 -> 15.70.191.173 Portmap V2 GETPORT Reply (Call In 27) Port:2049
 30  36.120278 15.70.191.173 -> 15.70.191.172 TCP cadlock > sunrpc [ACK] Seq=61 Ack=33 Win=5840 Len=0 TSV=2183698 TSER=226866
 31  36.120298 15.70.191.173 -> 15.70.191.172 TCP cadlock > sunrpc [FIN, ACK] Seq=61 Ack=33 Win=5840 Len=0 TSV=2183698 TSER=226866
 32  36.120335 15.70.191.173 -> 15.70.191.172 TCP 768 > 812 [FIN, ACK] Seq=89 Ack=61 Win=5840 Len=0 TSV=2183698 TSER=226866
 33  36.120509 15.70.191.172 -> 15.70.191.173 TCP sunrpc > cadlock [FIN, ACK] Seq=33 Ack=62 Win=5792 Len=0 TSV=226867 TSER=2183698
 34  36.120527 15.70.191.173 -> 15.70.191.172 TCP cadlock > sunrpc [ACK] Seq=62 Ack=34 Win=5840 Len=0 TSV=2183698 TSER=226867
 35  36.120553 15.70.191.172 -> 15.70.191.173 TCP 812 > 768 [FIN, ACK] Seq=61 Ack=90 Win=5792 Len=0 TSV=226867 TSER=2183698
36  36.120559 15.70.191.173 -> 15.70.191.172 TCP 768 > 812 [ACK] Seq=90 Ack=62 Win=5840 Len=0 TSV=2183698 TSER=226867
 37  36.167056 15.70.191.173 -> 15.70.191.172 TCP 1023 > nfs [SYN] Seq=0 Ack=0 Win=5840 Len=0 MSS=1460 TSV=2183745 TSER=0 WS=0
 38  36.167287 15.70.191.172 -> 15.70.191.173 TCP nfs > 1023 [SYN, ACK] Seq=0 Ack=1 Win=5792 Len=0 MSS=1460 TSV=226913 TSER=2183745 WS=0
 39  36.167311 15.70.191.173 -> 15.70.191.172 TCP 1023 > nfs [ACK] Seq=1 Ack=1 Win=5840 Len=0 TSV=2183745 TSER=226913
 40  36.167323 15.70.191.173 -> 15.70.191.172 NFS V3 FSINFO Call, FH:0x14d80402
 41  36.167482 15.70.191.172 -> 15.70.191.173 TCP nfs > 1023 [ACK] Seq=1 Ack=93 Win=5792 Len=0 TSV=226914 TSER=2183745
 42  36.187813 15.70.191.172 -> 15.70.191.173 NFS V3 FSINFO Reply (Call In 40)
 43  36.187820 15.70.191.173 -> 15.70.191.172 TCP 1023 > nfs [ACK] Seq=93 Ack=85 Win=5840 Len=0 TSV=2183765 TSER=226934
 44  36.187848 15.70.191.173 -> 15.70.191.172 NFS V3 GETATTR Call, FH:0x14d80402
 45  36.188058 15.70.191.172 -> 15.70.191.173 NFS V3 GETATTR Reply (Call In 44)
 46  36.228420 15.70.191.173 -> 15.70.191.172 TCP 1023 > nfs [ACK] Seq=185 Ack=201 Win=5840 Len=0 TSV=2183806 TSER=226934
 47  42.005867 15.70.191.173 -> 15.70.191.172 NFS V3 ACCESS Call, FH:0x14d80402
 48  42.006151 15.70.191.172 -> 15.70.191.173 NFS V3 ACCESS Reply (Call In 47)
 49  42.006170 15.70.191.173 -> 15.70.191.172 TCP 1023 > nfs [ACK] Seq=281 Ack=325 Win=5840 Len=0 TSV=2189585 TSER=232753
 50  42.006210 15.70.191.173 -> 15.70.191.172 NFS V3 LOOKUP Call, DH:0x14d80402/hi
 51  42.016430 15.70.191.172 -> 15.70.191.173 NFS V3 LOOKUP Reply (Call In 50), FH:0x7608060e
 52  42.016461 15.70.191.173 -> 15.70.191.172 NFS V3 ACCESS Call, FH:0x7608060e
 53  42.016683 15.70.191.172 -> 15.70.191.173 NFS V3 ACCESS Reply (Call In 52)
 54  42.056037 15.70.191.173 -> 15.70.191.172 TCP 1023 > nfs [ACK] Seq=501 Ack=697 Win=6432 Len=0 TSV=2189635 TSER=232764
 55  42.810338 15.70.191.173 -> 15.70.191.172 TCP 1022 > sunrpc [SYN] Seq=0 Ack=0 Win=5840 Len=0 MSS=1460 TSV=2190389 TSER=0 WS=0
 56  42.810617 15.70.191.172 -> 15.70.191.173 TCP sunrpc > 1022 [SYN, ACK] Seq=0 Ack=1 Win=5792 Len=0 MSS=1460 TSV=233558 TSER=2190389 WS=0
 57  42.810642 15.70.191.173 -> 15.70.191.172 TCP 1022 > sunrpc [ACK] Seq=1 Ack=1 Win=5840 Len=0 TSV=2190389 TSER=233558
 58  42.810668 15.70.191.173 -> 15.70.191.172 Portmap V2 GETPORT Call NLM(100021) V:4 TCP
 59  42.810823 15.70.191.172 -> 15.70.191.173 TCP sunrpc > 1022 [ACK] Seq=1 Ack=61 Win=5792 Len=0 TSV=233558 TSER=2190389
 60  42.811094 15.70.191.172 -> 15.70.191.173 Portmap V2 GETPORT Reply (Call In 58) Port:32769
 61  42.811103 15.70.191.173 -> 15.70.191.172 TCP 1022 > sunrpc [ACK] Seq=61 Ack=33 Win=5840 Len=0 TSV=2190390 TSER=233558
 62  42.811143 15.70.191.173 -> 15.70.191.172 TCP 1022 > sunrpc [FIN, ACK] Seq=61 Ack=33 Win=5840 Len=0 TSV=2190390 TSER=233558
 63  42.811247 15.70.191.173 -> 15.70.191.172 TCP 1021 > filenet-rpc [SYN] Seq=0 Ack=0 Win=5840 Len=0 MSS=1460 TSV=2190390 TSER=0 WS=0
 64  42.811454 15.70.191.172 -> 15.70.191.173 TCP sunrpc > 1022 [FIN, ACK] Seq=33 Ack=62 Win=5792 Len=0 TSV=233559 TSER=2190390
 65  42.811469 15.70.191.173 -> 15.70.191.172 TCP 1022 > sunrpc [ACK] Seq=62 Ack=34 Win=5840 Len=0 TSV=2190390 TSER=233559
 66  42.811490 15.70.191.172 -> 15.70.191.173 TCP filenet-rpc > 1021 [SYN, ACK] Seq=0 Ack=1 Win=5792 Len=0 MSS=1460 TSV=233559 TSER=2190390 WS=0
 67  42.811503 15.70.191.173 -> 15.70.191.172 TCP 1021 > filenet-rpc [ACK] Seq=1 Ack=1 Win=5840 Len=0 TSV=2190390 TSER=233559
68  42.811519 15.70.191.173 -> 15.70.191.172 NLM V4 LOCK Call FH:0x7608060e svid:6954 pos:0-0
 69  42.811702 15.70.191.172 -> 15.70.191.173 TCP filenet-rpc > 1021 [ACK] Seq=1 Ack=189 Win=5604 Len=0 TSV=233559 TSER=2190390
 70  42.811746 15.70.191.172 -> 15.70.191.173 NLM V4 LOCK Reply (Call In 68) NLM_DENIED_GRACE_PERIOD
 71  42.811752 15.70.191.173 -> 15.70.191.172 TCP 1021 > filenet-rpc [ACK] Seq=189 Ack=41 Win=5840 Len=0 TSV=2190390 TSER=233559
 72  50.328078 FoundryN_70:ab:00 -> Broadcast    ARP Who has 15.70.191.172?  Tell 15.70.191.1
 73  57.809375 15.70.191.173 -> 15.70.191.172 NLM V4 LOCK Call FH:0x7608060e svid:6954 pos:0-0
 74  57.809672 15.70.191.172 -> 15.70.191.173 NLM V4 LOCK Reply (Call In 73) NLM_DENIED_GRACE_PERIOD
 75  57.809688 15.70.191.173 -> 15.70.191.172 TCP 1021 > filenet-rpc [ACK] Seq=377 Ack=81 Win=5840 Len=0 TSV=2205391 TSER=248560
 76  65.135773 FoundryN_70:ab:00 -> Broadcast    ARP Who has 15.70.191.172?  Tell 15.70.191.1
 77  72.807840 15.70.191.173 -> 15.70.191.172 NLM V4 LOCK Call FH:0x7608060e svid:6954 pos:0-0
 78  72.808138 15.70.191.172 -> 15.70.191.173 NLM V4 LOCK Reply (Call In 77) NLM_DENIED_GRACE_PERIOD
 79  72.808154 15.70.191.173 -> 15.70.191.172 TCP 1021 > filenet-rpc [ACK] Seq=565 Ack=121 Win=5840 Len=0 TSV=2220391 TSER=263561
 80  87.805307 15.70.191.173 -> 15.70.191.172 NLM V4 LOCK Call FH:0x7608060e svid:6954 pos:0-0
 81  87.845402 15.70.191.172 -> 15.70.191.173 TCP filenet-rpc > 1021 [ACK] Seq=121 Ack=753 Win=5604 Len=0 TSV=278601 TSER=2235391
 82  87.939016 15.70.191.172 -> 15.70.191.173 NLM V4 LOCK Reply (Call In 80)
 83  87.939036 15.70.191.173 -> 15.70.191.172 TCP 1021 > filenet-rpc [ACK] Seq=753 Ack=161 Win=5840 Len=0 TSV=2235525 TSER=278694
 84  99.789128 FoundryN_70:ab:00 -> Broadcast    ARP Who has 15.70.191.172?  Tell 15.70.191.1
 85 100.787646 FoundryN_70:ab:00 -> Broadcast    ARP Who has 15.70.191.172?  Tell 15.70.191.1
 86 121.309035 FoundryN_70:ab:00 -> Broadcast    ARP Who has 15.70.191.172?  Tell 15.70.191.1
 87 125.271852 15.70.191.172 -> 224.0.1.22   IGMP V2 Membership Report
 88 132.657551 15.70.191.172 -> 15.70.191.173 TCP filenet-rpc > 1021 [FIN, ACK] Seq=161 Ack=753 Win=5604 Len=0 TSV=323421 TSER=2235525
 89 132.657595 15.70.191.172 -> 15.70.191.173 TCP nfs > 1023 [FIN, ACK] Seq=697 Ack=501 Win=5792 Len=0 TSV=323421 TSER=2189635
 90 132.697683 15.70.191.173 -> 15.70.191.172 TCP 1021 > filenet-rpc [ACK] Seq=753 Ack=162 Win=5840 Len=0 TSV=2280290 TSER=323421
 91 132.697688 15.70.191.173 -> 15.70.191.172 TCP 1023 > nfs [ACK] Seq=501 Ack=698 Win=6432 Len=0 TSV=2280290 TSER=323421
 92 137.696503 CompaqHp_50:5f:d3 -> CompaqHp_2d:70:6e ARP Who has 15.70.191.172?  Tell 15.70.191.173
 93 137.696634 CompaqHp_2d:70:6e -> CompaqHp_50:5f:d3 ARP 15.70.191.172 is at 00:0b:cd:2d:70:6e
 94 142.173489 FoundryN_70:ab:00 -> Broadcast    ARP Who has 15.70.191.172?  Tell 15.70.191.1
 95 146.171834 15.70.191.172 -> 15.70.191.173 Portmap V2 GETPORT Call STAT(100024) V:1 UDP
 96 146.172124 15.70.191.173 -> 15.70.191.172 Portmap V2 GETPORT Reply (Call In 95) Port:32775
 97 146.172373 15.70.191.172 -> 15.70.191.173 STAT V1 NOTIFY Call
 98 146.172426 15.70.191.173 -> 15.70.191.172 STAT V1 NOTIFY Reply (Call In 97)
 99 155.659246 FoundryN_70:ab:00 -> Broadcast    ARP Who has 15.70.191.172?  Tell 15.70.191.1
100 203.565599 FoundryN_70:ab:00 -> Broadcast    ARP Who has 15.70.191.172?  Tell 15.70.191.1


Comment 3 Michael Gross 2006-02-16 11:45:50 UTC
So this problem relates to SLES, not to SUSE Linux?
Comment 4 Asha yarangatta 2006-02-16 12:00:05 UTC
(In reply to comment #3)
> So this problem relates to SLES, not to SUSE Linux?

Yes, this problem is related to SUSE LINUX Enterprise Server 9 SP2 (i586)
Comment 5 Michael Gross 2006-02-16 14:56:47 UTC
Sorry, but bugs related to SLES cannot filed against the openSUSE bugzilla. For SLES use the official support channels or interfaces to bugzilla provided to you.