Bugzilla – Full Text Bug Listing

| Summary: | Samba server locks machine up after large data transfer | ||
|---|---|---|---|
| Product: | [openSUSE] SUSE LINUX 10.0 | Reporter: | Phil Stopford <phil> |
| Component: | Basesystem | Assignee: | The 'Opening Windows to a Wider World' guys <samba-maintainers> |
| Status: | RESOLVED INVALID | QA Contact: | E-mail List <qa-bugs> |
| Severity: | Normal | ||
| Priority: | P5 - None | ||
| Version: | Beta 2 | ||
| Target Milestone: | --- | ||
| Hardware: | i686 | ||
| OS: | SuSE Pro 9.3 | ||
| Whiteboard: | |||
| Found By: | Other | Services Priority: | |
| Business Priority: | Blocker: | --- | |
| Marketing QA Status: | --- | IT Deployment: | --- |
|
Description
Phil Stopford
2005-08-29 00:04:06 UTC
This is almost certainly not a Samba problem. We are a *userspace* application. The machine locking up like that sounds like a hardware problem to me. At the very worst it would be a kernel bug. I'd suggest downloading and running memtest86 on this box. See http://www.memtest86.com/ for details.

Jeremy.

This has occurred on multiple boxes (desktops, laptops, etc.) and across both wireless and wired networks. The same hardware running Windows doesn't show the issue. The issue is also client-independent: Windows, Mac or SuSE. Given these factors, I'm not convinced in any way that a hardware fault is to blame. Logs don't show anything obvious - the transfer just stalls and the server fails to respond to any input, made locally or otherwise.

Ok, then it's a kernel bug - there is *NO WAY* an smbd server as a user application can lock up the box so you have to reboot. But you're going to have to do a lot more triage to even show evidence of a kernel bug given the vagueness of your report.

Jeremy.

If I had a starting point, I'd possibly be able to do something. With no obvious information in the logs, no error on screen and no information to hand to help me make a start (or even get a remote login set up), as a regular user I'm left to make 'vague reports'. I can only do what I can do *shrug*

Phil: Can you attach a null modem cable to the server, configure a serial console, and provide a memory and task dump of the machine when the problem occurs again? I suggest using screen to attach to the serial console, as it can then grab all data (ctrl+a + ctrl+h). Also enable sysrq (/etc/sysctl.conf or set /etc/sysconfig/sysctl:ENABLE_SYSRQ="yes"). If you have this and have

    serial --unit=0 --speed=38400 --word=8 --parity=no --stop=1
    terminal --timeout=5 serial console
    kernel ... console=tty0 console=ttyS0,38400

in /boot/grub/menu.lst of grub, then it's possible to send sysrqs via the serial console (e.g. sync ctrl+a + ctrl+s + s).
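The two prerequisites named above (sysrq enabled, and a serial console on the kernel command line) can be checked before waiting for the next lock-up. The following is a minimal sketch, not part of any SUSE tooling: the function names `describe_sysrq` and `has_serial_console` are hypothetical, and the reading of values other than 0 and 1 as a restricting bitmask is an assumption based on current kernels that may not hold for the 9.3 kernel.

```shell
#!/bin/sh
# Sketch only: both helper names are hypothetical.

# Describe a value read from /proc/sys/kernel/sysrq.
# 0 = disabled, 1 = all functions enabled.
# Assumption: any other number is a bitmask of allowed functions.
describe_sysrq() {
    case "$1" in
        0) echo "disabled" ;;
        1) echo "enabled" ;;
        ''|*[!0-9]*) echo "unknown" ;;
        *) echo "restricted" ;;
    esac
}

# Succeed if a kernel command line (passed as one string) names a
# serial console such as console=ttyS0,38400.
has_serial_console() {
    case " $1 " in
        *" console=ttyS"[0-9]*) return 0 ;;
        *) return 1 ;;
    esac
}

# Possible use on the affected server:
#   describe_sysrq "$(cat /proc/sys/kernel/sysrq)"
#   has_serial_console "$(cat /proc/cmdline)" && echo "serial console configured"
```

The helpers take their input as arguments rather than reading /proc directly, so they can be tried on any machine without touching the server.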
The screen example command line was missing:

    screen /dev/ttyS0 38400,cs8

I'll give it a whirl, but it will very likely be Monday before I get time. The problem surfaces after a couple of large transfers, so the only tricky thing would be getting a serial console set up, but hopefully your instructions will help me out there. The tightly compressed beta schedule of SuSE makes it very tricky (for me at least) to get time to look at these things before another release appears, so I cannot predict whether it will be beta 4 or RC1 that gets the testing.

I have now been shifting data via samba on 10.0beta4 for the full day. Conservatively, between 40 and 50 GB of data has been moved, consisting of several large ~2 GB files and lots of smaller files. No problems have been seen. Clients have been Windows, SUSE 9.3 and SUSE 10.0, with a single Windows client pulling a full 15 GB of data without issue.

9.3 does continue to show problems, but I'm uncertain how to proceed - the sysrq instructions above don't seem to dump data to the serial console, and the serial console does not seem to pass/take input from the keyboard at that console. There does appear to be information in /var/log/messages, but it might not be usable/relevant. Advice would be welcome, assuming 9.3 is a maintenance target for SuSE. I am not changing the status of this bug as 9.3 is affected, but won't complain if this is changed. :)

    lmuelle@gab:~> cat /proc/sys/kernel/sysrq
    1

If you got a '0', please call echo "1">/proc/sys/kernel/sysrq as user root. See /usr/src/linux/Documentation/sysrq.txt for more details on sysrq handling. Please check if Alt+Sysrq+s results in syslog messages.

No additional information has been provided for quite some time. |
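The last request in the thread, checking whether a sysrq sync shows up in syslog, can be sketched as a small log filter. This assumes the kernel logs the sync request with the words "SysRq" and "Emergency Sync" on one line; the exact wording can differ between kernel versions, and the function name `log_has_emergency_sync` is hypothetical.

```shell
#!/bin/sh
# Sketch only: the helper name is hypothetical.

# Succeed if the log text on stdin contains a sysrq sync message.
# Assumption: the kernel writes "SysRq" and "Emergency Sync" on the same
# line; the match is case-insensitive to tolerate wording differences.
log_has_emergency_sync() {
    grep -qi 'sysrq.*emergency sync'
}

# Possible use as root, after pressing Alt+Sysrq+s on the console:
#   log_has_emergency_sync < /var/log/messages && echo "sysrq reached syslog"
```

If the filter finds nothing after the key combination has been pressed, the /proc/sys/kernel/sysrq value shown above is the first thing to recheck.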