|
Bugzilla – Full Text Bug Listing |
| Summary: | x86_64/mm/init.c:146 bad pte ..., while suspending to disk APIC enabled | ||
|---|---|---|---|
| Product: | [openSUSE] SUSE LINUX 10.0 | Reporter: | Markus Walser <markus.walser> |
| Component: | Kernel | Assignee: | Pavel Machek <pavel> |
| Status: | RESOLVED DUPLICATE | QA Contact: | E-mail List <qa-bugs> |
| Severity: | Normal | ||
| Priority: | P5 - None | ||
| Version: | unspecified | ||
| Target Milestone: | --- | ||
| Hardware: | x86-64 | ||
| OS: | SuSE Linux 10.0 | ||
| Whiteboard: | |||
| Found By: | Customer | Services Priority: | |
| Business Priority: | Blocker: | --- | |
| Marketing QA Status: | --- | IT Deployment: | --- |
|
Description
Markus Walser
2005-11-19 16:12:12 UTC
Pavel, I'm assigning this to you.

Is it reproducible? It looks like a duplicate of bug #119833 to me.

I tried three times to suspend and hit this bug every time. Note that this bug happened during suspend while bug #119833 happened during resume. Shall I try to apply the patch mentioned in bug #119833 or do you have a more recent one from Andi to test?

Stefan, have you seen something similar?

Can you try it with minimum drivers? init=/bin/bash.

I haven't seen this (but I don't have many x86_64 machines) and it looks pretty different to bug #119833 (to me :-). Maybe the "bad pte" is not really the fatal error but the driver for device 0000:00:13.0 is hanging on resume-during-suspend?

Hi, I just tried to suspend with init=/bin/bash and "echo 4 > /proc/acpi/sleep". (The only thing I did after booting was a "swapon /dev/hda2" and remounting /proc.) It ended with almost the same result: http://homepage.hispeed.ch/hb9xcg/suspend_with_init_bash.jpg Can you give me advice on how to find out what's behind 0000:00:13.0?

A "lspci | grep 13.0" would report:

    00:13.0 USB Controller: ATI Technologies Inc IXP SB400 USB Host Controller

BTW, trying to suspend to RAM with "echo 3 > /proc/acpi/sleep" switches off the notebook. Not that I could resume, but at least it suspends to RAM. Could it be a problem that I have 1.5GB RAM and only 1GB swap?

USB Host would be ehci_hcd, uhci_hcd or ohci_hcd, but since those are not loaded with init=/bin/bash, they are probably not the troublemakers. Also, the swap size does not really matter here. Sorry, I have no further ideas.

What are your current parameters on the kernel command line? Can you try adding noapic?

cmdline is:

    root=/dev/hda3 vga=0x342 selinux=0 resume=/dev/hda2 splash=0 init=/bin/bash

With an additional parameter such as noapic, pci=noacpi or acpi=off the machine doesn't boot and prints no messages at all. The only ACPI option I found which boots is acpi=oldboot, but the result is about the same after "echo shutdown >/sys/power/disk;echo disk>/sys/power/state": http://homepage.hispeed.ch/hb9xcg/suspend_with_acpi_oldboot.jpg

Do you use an SMP kernel by chance?

I can't verify it at the moment because the notebook is at home. But according to the suspend2disk.log it's SUSE's default kernel /boot/2.6.13-15-default, which doesn't have SMP support, I suppose.

It's definitely not SMP:

    turion:~ # cat /proc/config.gz | gunzip | grep SMP
    CONFIG_BROKEN_ON_SMP=y
    # CONFIG_SMP is not set
    turion:~ # uname -a
    Linux turion 2.6.13-15-default #1 Tue Sep 13 14:56:15 UTC 2005 x86_64 x86_64 x86_64 GNU/Linux

Complete config is here: http://homepage.hispeed.ch/hb9xcg/config

Okay, I guess we are seeing the same problem as 113886 -- APIC troubles. I guess testing a 32-bit kernel would be hard? Also try the latest vanilla kernel...

Hopefully it is a duplicate of 113886.

*** This bug has been marked as a duplicate of 113886 ***

I think it is more likely a duplicate of this bug: http://bugzilla.kernel.org/show_bug.cgi?id=5534

Hi, I tried suspend to disk again with the 2.6.16-rc1-mm3 kernel and got interesting results. Basically, resuming from disk worked for the first time on this nx6125, but with an ugly soft lockup on CPU0 during suspend to disk: http://homepage.hispeed.ch/hb9xcg/img_0410.jpg The resume that followed went well. Config was: http://homepage.hispeed.ch/hb9xcg/config-2.6.16-rc1-mm3 And System.map was: http://homepage.hispeed.ch/System.map-2.6.16-rc1-mm3-mw And the dmesg after resume: http://homepage.hispeed.ch/resume-2.6.16-rc1-mm3.log

Maybe there's a connection between this lockup and messages like:

    " osl-0822 [77] os_wait_semaphore : Failed to acquire semaphore[ffff81005733ad40|1|0], AE_TIME"

which I see very often in the log, for example when accessing the proc filesystem: "cat /proc/acpi/thermal_zone/TZ1/temperature". Any suggestions?

These seem to be different problems; please log them in bugzilla.kernel.org.

Two links in Comment #17 are wrong. They should be: http://homepage.hispeed.ch/hb9xcg/System.map-2.6.16-rc1-mm3-mw http://homepage.hispeed.ch/hb9xcg/resume-2.6.16-rc1-mm3.log |
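
Note on the SMP check used in this thread: a plain "grep SMP" also matches CONFIG_BROKEN_ON_SMP=y, so the output has to be read carefully. The following is only a minimal sketch of a stricter check. It assumes the kernel exposes its configuration in /proc/config.gz (CONFIG_IKCONFIG_PROC), as the reporter's kernel did; the helper name smp_status is hypothetical and not part of any existing tool.

```shell
# Sketch: classify a kernel config as SMP or uniprocessor (UP).
# Reads kernel config text on stdin.
# smp_status is a hypothetical helper name chosen for this example.
smp_status() {
    # Match only the exact option line, so that CONFIG_BROKEN_ON_SMP=y
    # or "# CONFIG_SMP is not set" is not mistaken for an SMP kernel.
    if grep -q '^CONFIG_SMP=y$'; then
        echo "SMP"
    else
        echo "UP"
    fi
}
```

On a running system it could be used as "gunzip -c /proc/config.gz | smp_status"; for the config lines quoted in this report it would print UP.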