Bugzilla – Bug 139529
Access NFS Server Freezes the machine
Last modified: 2006-04-27 02:21:53 UTC
I have SUSE 10.0 with the latest kernel updates. It is running on an AMD processor on an ASUS k8vSE motherboard. The machine has 1.5 GB of RAM. It is running the 64-bit version of the operating system. I installed nfs-utils from the boxed version of SUSE 10.0. Autofs has been enabled with auto.net. With the original version, I have a Cyrix 933 machine with basically the same configuration, except it has only 1 GB of RAM. The Cyrix has /export/home exported with the default parameters. The AMD64 machine has /export/home and /export/home0. I have all my users automounted to their respective directories. On the Cyrix machine I could do an ls /net/crab/export/home0 and get a listing with no problem. Likewise I could do ls /net/sirus and the directories would appear with no problem. After the kernel update a few weeks ago, whenever the Cyrix machine does an ls /net/crab/export/home0, it freezes the AMD64 machine. The only way to regain control is to do a reset. The Cyrix machine has none of these symptoms. It seems to be specific to the 64-bit code.
Joseph, could you enable verbose kernel logging to the console on the NFS server (klogconsole -l8 -r0) and switch to the virtual console (Alt-F1)? I'd like to know whether the kernel prints any oops messages. Neil, could you take care of this one, please?
Yes, kernel messages would be helpful. Also, a few more details about what works and what doesn't would help. You say that 'ls /net/crab/export/home0' will freeze the AMD64 machine (the server). Does any NFS access to that server cause it to freeze, or just that particular access? What about 'ls /net/crab/export/home'? What if you mount manually rather than using automount? E.g. on sirus (the Cyrix machine, I assume):
  mkdir /test
  mount crab:/export/home0 /test
  ls -l /test
  echo hello > /test/testfile
  rm /test/testfile
Does any or all of that work? If not, at which point does 'crab' stop responding? Thanks, NeilBrown
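For anyone repeating the manual test above, here is a minimal sketch that runs the same steps one at a time and prints each step's label before running it, so that a hang or failure can be tied to a specific step. The function names `run_step` and `nfs_manual_check` and the `SERVER`, `EXPORT` and `MNT` variables are hypothetical helpers introduced for illustration; the host and paths are simply the ones mentioned in this report.

```shell
#!/bin/sh
# Assumed defaults taken from this report; override via the environment.
SERVER=${SERVER:-crab}
EXPORT=${EXPORT:-/export/home0}
MNT=${MNT:-/test}

# Print the step label first, then run the command given after it.
# If the command never returns, the label is the last thing on screen.
run_step() {
    label=$1
    shift
    printf 'step: %s ... ' "$label"
    if "$@"; then
        echo ok
        return 0
    else
        echo FAILED
        return 1
    fi
}

# The manual sequence from the comment above, stopping at the first failure.
nfs_manual_check() {
    run_step mkdir  mkdir -p "$MNT" &&
    run_step mount  mount "$SERVER:$EXPORT" "$MNT" &&
    run_step list   ls -l "$MNT" &&
    run_step write  sh -c 'echo hello > "$1/testfile"' sh "$MNT" &&
    run_step remove rm "$MNT/testfile"
}
```

Nothing is mounted until `nfs_manual_check` is called explicitly (as root, on the client). If the server freezes, the last `step:` line without a result shows where it stopped responding.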
Comment 1: The screen dump was extremely long, so I took some snippets of the screen:
Unable to handle kernel NULL pointer dereference at 0000000000000098 RIP:
<ffffffff8815375a>{:sk98lin:FreeTxDescriptors+186}
PGD 0
Oops: 0000 [1]
CPU 0
Modules linked in: nls_utf8 nvidia hfsplus vfat fat subfs freq_table ipv6 autofs4 nfsd exportfs snd_pcm_oss snd_mixer_oss button battery ac snd_seq ...
Pid: 0, comm: swapper Tainted: PF U 2.6.13-15.7-default
RIP: 0010:[<ffffffff8815375a>] <ffffffff8815375a>{:sk98lin:FreeTxDescriptors+186}
Call Trace: <IRQ> <ffffffff88153888>{:sk98lin:XmitFrame+72} <ffffffff88153>{:sk98lin:SkGeXmit+89}
<ffffffff802e93d7>{ip_finish_output+455} <ffffffff802c5d62>{dev_queue_xmit+242}
<ffffffff802e6b9d>{ip_dst_output+109} <ffffffff802e9799>{ip_output+201}
As for comment 2: that also freezes the machine.
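In the trace above, some symbols are written as `{:module:function+offset}` and others as plain `{function+offset}`; the former appear to name a loadable module, the latter the core kernel. As a rough sketch, assuming that notation, the following hypothetical helper `oops_modules` lists the modules named in a saved copy of the trace. It relies on `grep -o`, which is a GNU/BSD extension.

```shell
#!/bin/sh
# Read an oops trace on stdin and print each module named in a
# {:module:function+offset} symbol once. Plain {function+offset}
# symbols are skipped because they do not match the pattern.
oops_modules() {
    grep -o '{:[^:}]*:' | tr -d '{:' | sort -u
}
```

For example, `oops_modules < oops.txt` would print one module name per line, where `oops.txt` is an assumed file holding the transcribed trace.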
"If not, at which point does 'crab' stop responding?" ???? I'm guessing at the mount command. Is that correct? The stack trace seems to point to a problem in the network driver rather than in NFS. Could you try to confirm that by trying various other network accesses to 'crab', e.g.
  ping
  ping -s 2000
  ssh
and anything else that you happen to have a server for on 'crab'. If none of them fail, try to explicitly set the transport protocol that NFS is using with
  mount -o udp ....
and
  mount -o tcp ....
and see if one works better than the other. Also, please report what network card you have on 'crab'. Finally, if you have a digital camera available, a photograph of the dump messages may be very helpful. Thanks,
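The checks suggested above could be collected into a small script along these lines. This is only a sketch: `probe_network`, `mount_cmd` and `try_transport` are hypothetical names, and the export path and mount point are assumptions carried over from the earlier comments. The `ping -s 2000` payload is larger than a typical 1500-byte Ethernet MTU, so it should exercise fragmented packets as well as small ones.

```shell
#!/bin/sh
# Assumed defaults taken from this report; override via the environment.
SERVER=${SERVER:-crab}
EXPORT=${EXPORT:-/export/home0}
MNT=${MNT:-/test}

# Non-NFS traffic: small pings, large pings, then a trivial ssh command.
probe_network() {
    ping -c 3 "$SERVER" &&
    ping -c 3 -s 2000 "$SERVER" &&
    ssh "$SERVER" true
}

# Show the mount command for one transport (udp or tcp) without running it.
mount_cmd() {
    echo "mount -o $1 $SERVER:$EXPORT $MNT"
}

# Mount with the given transport, list the directory, then unmount.
try_transport() {
    mount -o "$1" "$SERVER:$EXPORT" "$MNT" &&
    ls -l "$MNT" >/dev/null &&
    umount "$MNT"
}
```

Running `probe_network` first, then `try_transport udp` and `try_transport tcp` (as root), would show whether the freeze depends on NFS at all and whether one transport behaves better than the other.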
I have bad news on the bug. It just went away for no apparent reason. The only thing that is different is that I rebooted my sirus computer. It had been running for several days with a BitTorrent client. I do have one comment: I never had a problem with the other TCP/IP services at all. I regularly use ssh between the two machines.
Ok, let's close this as 'WORKSFORME' seeing it works-for-you now... Reopen if it happens again.