|
Bugzilla – Full Text Bug Listing |
| Summary: | kernel immediately crashes on amd quadcore | ||
|---|---|---|---|
| Product: | [openSUSE] openSUSE 11.2 | Reporter: | Marcus Meissner <meissner> |
| Component: | Kernel | Assignee: | Thomas Renninger <trenn> |
| Status: | RESOLVED DUPLICATE | QA Contact: | E-mail List <qa-bugs> |
| Severity: | Major | ||
| Priority: | P5 - None | CC: | fhuebner, joerg.roedel, michael.wright2001 |
| Version: | Final | ||
| Target Milestone: | --- | ||
| Hardware: | Other | ||
| OS: | Other | ||
| Whiteboard: | maint:released:11.2:29469 | ||
| Found By: | Development | Services Priority: | |
| Business Priority: | Blocker: | --- | |
| Marketing QA Status: | --- | IT Deployment: | --- |
| Attachments: |
screenshot.JPG
Full boot log and backtrace from the duplicate marked bug
Boot log of same kernel and same machine as comment #5, with noacpi passed: boots
acpidump with 32 bit kernel (2.6.25.20-0.5-pae) |
||
|
Description
Marcus Meissner
2009-12-01 21:32:20 UTC
Created attachment 330353 [details]
screenshot.JPG
(not sure how to extract acpi tables correctly for your help)

Is this a backtrace of the last called functions?
If yes it looks like a (apic) timer interrupt happened while trying to set up the Embedded Controller (EC) (acpi_ec_ecdt_probe()).
Only quick boot param test related to EC is:
ec_intr=0
Alexey, have you already seen something similar?
Unfortunately the reason for the panic is not visible on the screenshot (can't you scroll up a bit and take another picture?)
Is this always reproducible?
Can we be sure this is a regression from 11.1 (no HW change, no BIOS update, no HW breakage)?
> not sure how to extract acpi tables correctly for your help
Run: acpidump >/tmp/acpidump
But if there isn't something obvious in the ECDT table, I fear this info won't help much at this point (still it would be nice to have it in the bug for reference, e.g. if you still can boot 11.1)
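For reference, a minimal sketch of checking which tables a saved dump contains, for example whether an ECDT is present at all. It assumes the text format in which each table starts with a header line such as `ECDT @ 0x...`; the function names are hypothetical helpers, not part of any ACPI tool.

```shell
# List the table signatures found in an acpidump text file.
# Assumption: every table begins with a header line like "ECDT @ 0x...".
list_acpi_tables() {
    grep -E '^[A-Z0-9_]{4} @ ' "$1" | cut -d' ' -f1 | sort -u
}

# Succeed only if the dump file ($2) contains the named table ($1).
has_acpi_table() {
    list_acpi_tables "$2" | grep -qxF -- "$1"
}

# Usage, after running: acpidump >/tmp/acpidump
#   has_acpi_table ECDT /tmp/acpidump && echo "ECDT present"
```

If the header format of the installed acpidump differs, only the pattern in `list_acpi_tables` needs adjusting.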
*** Bug 548165 has been marked as a duplicate of this bug. ***

Created attachment 330469 [details]
Full boot log and backtrace from the duplicate marked bug
acpi_os_map_memory fails, this may not be related to acpi, but to memory management.
I wonder why this happens when setting up the EC, shouldn't this be used earlier already?
Do you have an IOMMU [enable/disable] or memory hole configuration in your BIOS?
Toggling this (better leave the mem hole and try IOMMU first or set it back if it's not that) might help.
If not, iommu=soft could help?
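When trying several boot parameters in a row (ec_intr=0, iommu=soft, ...), it is easy to lose track of which one a given boot actually used. A minimal sketch for checking this on a system that did come up; `has_boot_param` is a hypothetical helper name, and it only assumes that the kernel command line is readable from /proc/cmdline.

```shell
# Succeed if the given parameter appears as a whole word on the kernel
# command line. The second argument is the file to read and defaults to
# /proc/cmdline, so a saved copy can be checked as well.
has_boot_param() {
    tr -s '[:space:]' '\n' < "${2:-/proc/cmdline}" | grep -qxF -- "$1"
}

# Usage:
#   has_boot_param iommu=soft && echo "booted with iommu=soft"
```

The match is exact, so `iommu` alone does not match `iommu=soft`.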
Adding Joerg (and Rafael, not sure whether this one should be added to the official regressions, we take care that a possible fix hits a stable kernel here, may not be worth the overhead)
Any idea about this regression:
kernel BUG at /usr/src/packages/BUILD/kernel-default-2.6.31.5/linux-2.6.31/mm/vmalloc.c:1136!
invalid opcode: 0000 [#1] SMP
CPU 1
Pid: 1, comm: swapper Not tainted 2.6.31.5-0.1-default #1 GA-MA69G-S3H
RIP: 0010:[<ffffffff81139966>] [<ffffffff81139966>] __get_vm_area_node+0x1f6/0x260

(In reply to comment #6)
> Adding Joerg (and Rafael, not sure whether this one should be added to the
> official regressions, we take care that a possible fix hits a stable kernel
> here, may not be worth the overhead)
If this is a mainline regression and we don't have a fix shortly, it may be worth adding, although the mainline regressions introduced before 2.6.31.1 will not be tracked any more after 2.6.32 is released.

It's the very first thing the kernel does after grub... crash. I don't think I can scroll up.
I will do some more tries tomorrow.
I am using opensuse 11.0 (not 11.1) just fine on the same machine (typing from it right now).

Marcus: No need to scroll up, we have a full backtrace/boot log in comment #5.
The problem is that io memory remapping fails and ACPI is probably the one who calls ioremap first that early.
Can you try nommconf boot param. Not sure whether it may help, but there are two differences I could imagine are related in working (noacpi boot param) and not working dmesg:
noacpi                | default
----------------------+----------
mmconf not used       | used
mtrr get touched      | no msg
Frank: Do you also have an older kernel boot with acpi on working, lying around?
I wonder whether the mtrr settings got overridden there, too.
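The noacpi/default comparison above can be repeated on any two saved boot logs. A minimal sketch, assuming the logs are plain text files and that lines may carry a leading `[   12.345678]` timestamp; `filter_log` and `compare_logs` are hypothetical helper names, and the keyword pattern is only a guess at which lines are relevant.

```shell
# Print the lines of a boot log that mention MMCONFIG or MTRR,
# with any leading "[  12.345678] " timestamp removed.
filter_log() {
    sed 's/^\[ *[0-9.]*] *//' "$1" | grep -iE 'mmconf|mtrr' || true
}

# Diff the filtered lines of two boot logs.
# Returns 0 if they agree and 1 if they differ, like diff itself.
compare_logs() {
    a=$(mktemp)
    b=$(mktemp)
    filter_log "$1" > "$a"
    filter_log "$2" > "$b"
    status=0
    diff "$a" "$b" || status=$?
    rm -f "$a" "$b"
    return "$status"
}

# Usage, with hypothetical file names for the two saved logs:
#   compare_logs dmesg-default.txt dmesg-noacpi.txt
```

Stripping the timestamps keeps the diff limited to lines that differ in content instead of in timing.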
Created attachment 330701 [details]
Boot log of same kernel and same machine as comment #5, with noacpi passed: boots

This could be it:
git commit: de2a47cf2b3f59ef9664b277f4021b91af13598e
Subject: x86: Fix error return sequence in __ioremap_caller()
Currently building for x86_64 -default flavor:
stravinsky-trenn-45

Damn it, the patch is mainline.
I pushed it into 11.2 branch, please test a kernel from here:
ftp://ftp.suse.com/pub/projects/kernel/kotd/openSUSE-11.2/x86_64/
in some hours (best tomorrow).
Double check whether the commit is already in:
rpm -qp --changelog kernel-default.rpm |less
Thu Dec 3 15:40:01 CET 2009 - trenn@suse.de
- patches.arch/x86_fix_ioremap.patch: x86: Fix error return sequence in __ioremap_caller() (bnc#559680).
> I am using opensuse 11.0 (not 11.1) just fine on the same machine (typing from
> it right now).
Cool, so it shouldn't be that hard for you to give it a test on a 11.1 system with
rpm -ivh kernel-default{-base,}.rpm [--force]

(In reply to comment #9)
> Marcus: No need to scroll up, we have a full backtrace/boot log in comment #5.
> The problem is that io memory remapping fails and ACPI is probably the one who
> calls ioremap first that early.
> Can you try nommconf boot param. Not sure whether it may help, but there are
> two differences I could imagine are related in working (noacpi boot param) and
> not working dmesg:
> noacpi                | default
> ----------------------+----------
> mmconf not used       | used
> mtrr get touched      | no msg
>
> Frank: Do you also have an older kernel boot with acpi on working, lying
> around?
> I wonder whether the mtrr settings got overridden there, too.
Yes, I use 2.6.25.20-0.5-pae (actual 11.0 32 bit kernel). I am not sure whether you want me to send an acpidump with this kernel or with a 64 bit kernel (11.1 installation disk works). To speed up a bit I will add now an acpidump with 32 bit version. I hope this is what you want.
Anyway, as it's the first time in my life to help debugging the kernel, can you please describe what you want in more detail? It will help me send you the right information.

Created attachment 330864 [details]
acpidump with 32 bit kernel (2.6.25.20-0.5-pae)
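The changelog check mentioned earlier in the thread (rpm -qp --changelog ... |less) can also be done without paging through the output. A minimal sketch; `changelog_mentions` is a hypothetical helper name, and the search strings are the patch name and bug reference quoted in the thread.

```shell
# Read an RPM changelog on stdin and succeed if it contains the given
# fixed string, e.g. a patch file name or a bug reference.
changelog_mentions() {
    grep -qF -- "$1"
}

# Usage (kernel-default.rpm is the downloaded package file):
#   rpm -qp --changelog kernel-default.rpm \
#       | changelog_mentions 'patches.arch/x86_fix_ioremap.patch' \
#       && echo "fix is included"
```

Searching for `bnc#559680` instead should find the same changelog entry.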
Thanks Frank. I hope this is not necessary anymore.
For now it would be great if someone could give kernel-default.rpm (+ -base.rpm) from here a test:
ftp://ftp.suse.com/pub/projects/kernel/kotd/openSUSE-11.2/x86_64
(I double checked, the patch is included)
I found a possibly related patch which was sent some time ago, marked as urgent. I didn't see the initial discussion about it, but it could be it and I added it.
If above kernel boots fine, we are done. Eventually I could create a boot.iso with the new kernel and place it onto my ftp then for easy installation.

I tried 2.6.31.6-0.0.0.42.eab586f, it still gives the same panic for me. :/

Marcus: This machine is a private one, right?
If we'd had it here, I could prepare a git bisect via PXE boot which should not take that long...
Is that possible?

It's my home workstation :(
I will see if I can carry it to the office at some point, but not in this weather :/

This is a duplicate of bug #548108.
Please follow comment #30 and try out a fixed kernel Jeff is pointing to there.

*** This bug has been marked as a duplicate of bug 548108 ***

(In reply to comment #18)
> This is a duplicate of bug #548108.
> Please follow comment #30 and try out a fixed kernel Jeff is pointing to there.
>
> *** This bug has been marked as a duplicate of bug 548108 ***
I too upgraded my bios (posted in bug 548021#c26). Now I can boot with the original kernel. So from my side no need to try an updated kernel.
I do have a Gigabyte GA-MA69GM-S3H (in bug#548108 a GA-MA69-S2H), and tried bios release 7. After the bios update I chose "load optimized defaults" as settings, as described in the manual. The bios update solved an issue with usb too (in /var/log/messages: reset high speed USB device using ehci_hcd and address x).

Update released for: kernel-debug, kernel-debug-base, kernel-debug-base-debuginfo, kernel-debug-debuginfo, kernel-debug-debugsource, kernel-debug-devel, kernel-debug-devel-debuginfo, kernel-default, kernel-default-base, kernel-default-base-debuginfo, kernel-default-debuginfo, kernel-default-debugsource, kernel-default-devel, kernel-default-devel-debuginfo, kernel-desktop, kernel-desktop-base, kernel-desktop-base-debuginfo, kernel-desktop-debuginfo, kernel-desktop-debugsource, kernel-desktop-devel, kernel-desktop-devel-debuginfo, kernel-pae, kernel-pae-base, kernel-pae-base-debuginfo, kernel-pae-debuginfo, kernel-pae-debugsource, kernel-pae-devel, kernel-pae-devel-debuginfo, kernel-source, kernel-source-vanilla, kernel-syms, kernel-trace, kernel-trace-base, kernel-trace-base-debuginfo, kernel-trace-debuginfo, kernel-trace-debugsource, kernel-trace-devel, kernel-trace-devel-debuginfo, kernel-vanilla, kernel-vanilla-base, kernel-vanilla-base-debuginfo, kernel-vanilla-debuginfo, kernel-vanilla-debugsource, kernel-vanilla-devel, kernel-vanilla-devel-debuginfo, kernel-xen, kernel-xen-base, kernel-xen-base-debuginfo, kernel-xen-debuginfo, kernel-xen-debugsource, kernel-xen-devel, kernel-xen-devel-debuginfo, preload-kmp-default, preload-kmp-desktop Products: openSUSE 11.2 (debug, i586, x86_64) |