Bug 148343 - Random system freezes with activated Xen and some running domUs
Summary: Random system freezes with activated Xen and some running domUs
Status: RESOLVED WONTFIX
Alias: None
Product: SUSE LINUX 10.0
Classification: openSUSE
Component: Xen
Version: Final
Hardware: i586 Other
Importance: P5 - None : Major
Target Milestone: ---
Assignee: E-mail List
QA Contact: E-mail List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-02-06 10:20 UTC by Jan Brinkmann
Modified: 2008-06-25 09:53 UTC (History)
0 users

See Also:
Found By: Other
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Attachments

Description Jan Brinkmann 2006-02-06 10:20:46 UTC
I installed Xen changeset 8259 on SuSE 10.0 because the 6xxx changeset that ships with SuSE 10.0 had a bug that prevented the domUs from rebooting: they never came back after a reboot (see also #143266). That problem is fixed in changeset 8259. However, we are now seeing system freezes from time to time. The system locks up completely, no special log messages are generated, and the only way to get it running again is to hit the reset button.

We had read about problems with Hyperthreading enabled, so we disabled it in the BIOS, without success. We then disabled USB, because we had also read about USB-related problems. The system ran for about 1.5 weeks after that, but then it locked up again.

I'll try compiling a newer Xen version (3.0.1) myself and report back on whether that fixes it. However, since the system only locks up about once a week, this can take some time. Is this problem known, and is it perhaps fixed in a newer changeset?

Our Xen host is an IBM xSeries 306 with a 3.20 GHz P4 and 3 GB of memory. We already swapped the RAM modules in case bad hardware was the cause, but that did not help. The RAM module we tested definitely works fine in another system. Any help, ideas, or a working solution would be appreciated.
Comment 1 Michael Gross 2006-02-06 10:57:28 UTC
Have you tried the latest available beta version of SL (10.1 Beta 3)? If you boot from its boot CD, there is a diagnostic tool (memtest86) that can be started by choosing "memory test" in the boot menu. It should also be available on an installed system. This tool runs extensive memory tests; let it run for at least 2 days. If it reports no errors, the hardware is most likely OK.

Before we do anything: Please try the latest version of SL. Fixing bugs which are non-reproducible is almost impossible.
Comment 2 Charles Coffing 2006-02-10 21:31:37 UTC
You may also want to try running the debug version of Xen (which is included in our xen RPM). It logs messages to the console if the dom0 kernel tries to do something stupid. Here's an excerpt from the latest README explaining this:

To debug Xen or dom0 Linux crashes or hangs, it may be useful to use the debug-enabled hypervisor, and to prevent automatic rebooting.  Change your Grub configuration from something like this:
    kernel (hd0,5)/xen.gz
To something like this:
    kernel (hd0,5)/xen-dbg.gz noreboot
After rebooting, the Xen hypervisor will write any error messages directly to the text console.
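
The edit described in the README excerpt can also be sketched as a small script. This is only an illustration, not part of the xen RPM: the helper name `to_debug_kernel_line` is hypothetical, and it assumes the hypervisor line in the GRUB configuration has the form `kernel <path>/xen.gz [options]` as in the example above.

```python
import re

# Hypothetical helper (not shipped with Xen or SUSE): rewrite one GRUB
# "kernel" line so it boots the debug hypervisor and never auto-reboots.
# The line is expected without a trailing newline.
def to_debug_kernel_line(line: str) -> str:
    # Only touch lines of the form "kernel <path>/xen.gz [options]".
    match = re.match(r"^kernel\s+(\S*/)?xen\.gz(\s+.*)?$", line.strip())
    if match is None:
        return line  # not the hypervisor line; leave it unchanged

    prefix = match.group(1) or ""
    options = (match.group(2) or "").split()
    if "noreboot" not in options:
        options.append("noreboot")

    # Keep the original indentation of the line.
    indent = line[: len(line) - len(line.lstrip())]
    return f"{indent}kernel {prefix}xen-dbg.gz {' '.join(options)}"


if __name__ == "__main__":
    print(to_debug_kernel_line("    kernel (hd0,5)/xen.gz"))
```

Lines that do not load `xen.gz` (for example the `module` lines for the dom0 kernel and initrd) are returned unchanged, so the function could be applied to every line of a boot entry. Back up the GRUB configuration before changing it by script.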
Comment 3 Michael Gross 2006-02-16 12:14:13 UTC
Jan: Please reopen this bug if you can provide more information.
Comment 4 Stephan Kulow 2008-06-25 09:35:22 UTC
mass reopening all SuSE Linux bugs that are set to REMIND+LATER to change the resolution to WONTFIX (adapting to new policy)
Comment 7 Stephan Kulow 2008-06-25 09:53:32 UTC
Closing old LATER+REMIND bugs as WONTFIX - if you still plan to work on it, feel free to reopen and set to ASSIGNED.

In case the report saw repeated reopen comments, it's due to bugzilla timing out on the huge request ;(