|
Bugzilla – Full Text Bug Listing |
| Summary: | Kernel Panic on AMD Opteron 4P/8P system with 2GB+ used by PCI devices (in some memory configurations) | ||
|---|---|---|---|
| Product: | [openSUSE] SUSE LINUX 10.0 | Reporter: | Brian Richardson <brianr> |
| Component: | Kernel | Assignee: | Andreas Kleen <ak> |
| Status: | RESOLVED INVALID | QA Contact: | E-mail List <qa-bugs> |
| Severity: | Normal | ||
| Priority: | P5 - None | CC: | david.keck, jacob.shin, mark.langsdorf |
| Version: | Final | ||
| Target Milestone: | --- | ||
| Hardware: | x86-64 | ||
| OS: | SuSE Linux 10.1 | ||
| Whiteboard: | |||
| Found By: | Third Party Developer/Partner | Services Priority: | |
| Business Priority: | Blocker: | --- | |
| Marketing QA Status: | --- | IT Deployment: | --- |
|
Description
Brian Richardson
2006-01-27 21:57:05 UTC
Please try our 10.1 beta kernels. If this still fails there, we can move forward with debugging it. ftp://ftp.suse.com/pub/projects/kernel/kotd/x86_64/HEAD/kernel-smp.x86_64.rpm

Adding David and Jacob as Mark is on sabbatical.

Identical failure on SLES10 Beta (2.6.15-git12-6-smp). Waiting for results on SLES10 Beta3.

Similar failure on Win2K3 Server.

Andi, Greg, any ideas? Which of you should I assign this to?

To me. I fixed various issues with this recently, so I would recommend testing the latest kernel (2.6.15-git12 is really old). But if you get a similar failure on Windows, maybe it's not the OS that is to blame? Does something simple like memtest86 work? If not, then it's likely a hardware/BIOS problem. We had this in the past when MTRRs were not correctly set up, or e820 RAM entries pointed to non-RAM, etc. Basically, we require that all RAM entries in e820 point to real, usable RAM, and if that's not the case there's nothing we can do from the OS side.

Thanks Andi. Handing over to you.

Still need information.

AMD has verified this problem on another manufacturer's hardware. A future version of their memory/CPU reference code (AGESA 1.32.01) will resolve this issue.

A patch provided by AMD was tested on the PANTA hardware using SuSE 10.0 x64. This resolves the issue described in this report.

Can you attach the patch?

It's a source-level change to AMD confidential code, and our BIOS structure doesn't use a patching system, so I can't provide anything here. The version of BIOS that resolves the issue is 0ABHQ013. Since the issue is resolved without changing any SuSE product code, what is the proper resolution to use when closing this bug?

Ah, thanks - I thought you were referring to a Linux kernel patch. If it's a BIOS bug we normally close it as INVALID, since it's not our bug. But could you please give me a quick summary of what is wrong and why it crashes, if you know? That's just so that I can more easily recognize the symptoms when a customer runs into it in the future and blames us. Thanks. |
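
For anyone triaging similar reports: the e820 requirement mentioned above (every RAM entry must point to real, usable RAM) can be checked roughly from a boot log. The following is only a minimal sketch, not a tool used in this bug. It assumes the 2.6-era dmesg line format `BIOS-e820: <start> - <end> (<type>)` with an exclusive end address; the sample map and the MMIO window are hypothetical values invented for illustration, and the helper names `parse_e820` and `usable_overlaps` are made up here.

```python
import re

# Assumed 2.6-era dmesg format, e.g.:
#  BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
E820_LINE = re.compile(
    r"BIOS-e820:\s*([0-9a-fA-F]+)\s*-\s*([0-9a-fA-F]+)\s*\((.+?)\)"
)


def parse_e820(text):
    """Return a list of (start, end, type) tuples; end is exclusive."""
    entries = []
    for line in text.splitlines():
        m = E820_LINE.search(line)
        if m:
            entries.append((int(m.group(1), 16), int(m.group(2), 16), m.group(3)))
    return entries


def usable_overlaps(entries, mmio_start, mmio_end):
    """Return usable RAM ranges that intersect [mmio_start, mmio_end)."""
    return [
        (start, end)
        for start, end, kind in entries
        if kind == "usable" and start < mmio_end and mmio_start < end
    ]


# Hypothetical e820 map, NOT taken from the affected system.
SAMPLE = """\
 BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
 BIOS-e820: 0000000000100000 - 00000000a0000000 (usable)
 BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved)
 BIOS-e820: 0000000100000000 - 0000000260000000 (usable)
"""

# Hypothetical window: 2 GB of PCI MMIO space at 0x80000000-0x100000000.
bad = usable_overlaps(parse_e820(SAMPLE), 0x80000000, 0x100000000)
for start, end in bad:
    print(f"usable RAM {start:#x}-{end:#x} overlaps the MMIO window")
```

A non-empty result would only suggest that the firmware reported RAM where devices are mapped; confirming it would still need the real PCI resource ranges (for example from `/proc/iomem`) and the MTRR settings.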