Bugzilla – Full Text Bug Listing

| Summary: | Won't boot after first reboot on initial install - blank screen - kernel panic | ||
|---|---|---|---|
| Product: | [openSUSE] SUSE LINUX 10.0 | Reporter: | craig gardner <cgardner> |
| Component: | Kernel | Assignee: | Andreas Kleen <ak> |
| Status: | RESOLVED FIXED | QA Contact: | E-mail List <qa-bugs> |
| Severity: | Critical | ||
| Priority: | P5 - None | CC: | acpi |
| Version: | Beta 4 | ||
| Target Milestone: | --- | ||
| Hardware: | x86-64 | ||
| OS: | All | ||
| Whiteboard: | |||
| Found By: | Development | Services Priority: | |
| Business Priority: | | Blocker: | --- |
| Marketing QA Status: | --- | IT Deployment: | --- |
| Attachments: | screen log | ||
|
Description
craig gardner
2005-09-08 15:16:24 UTC
The only difference is that installation uses a UP kernel, whereas the installed system tries to boot an SMP kernel.

*** Bug 115885 has been marked as a duplicate of this bug. ***

I've looked at several other similar bug reports and have tried a variety of things to work past this problem. For example, I first tried setting the following as grub boot options:

early_printk=vga vga=1

But that doesn't work, because it hangs before anything can get logged, or at least before I can switch to the log console. So I tried these grub boot options:

pci=noacpi noapic acpi_irq_balance

These also didn't change anything. So I looked at the options that are set by "Linux (failsafe)" and wanted to find the combination of options that would make it work. I guessed that it has something to do with power management, so I tried:

apm=off acpi=off

That worked! But I wanted the smallest number of options to make it easier to debug. So I tried:

apm=off

That didn't work. The server still hung. So I tried:

acpi=off

And that worked! acpi=off alone is the switch that makes the difference. I'm attaching /var/log/messages and dmesg.

Uh-oh, maybe this is one of the ACPI modules now loaded through initrd. I suspect the processor module... I didn't manage to write the code to disable them via boot parameter. Increasing severity; maybe there is still time for a fast hack (just declaring three global variables for __setup and returning early from the init function of the fan/thermal/processor modules ...).

Could you please try: boot into a working system, delete the thermal, fan and processor modules from

INITRD_MODULES="sata_promise sata_via via82cxxx processor thermal fan reiserfs ..."

in /etc/sysconfig/kernel, then invoke mkinitrd and see whether you can boot.

early_printk=vga vga=1 -> you should see something; delete the other vga=XXX and splash= options in /boot/grub/menu.lst and you should see something.

Removed thermal, fan and processor from INITRD_MODULES. Then ran mkinitrd. Rebooted.
No improvement. Still doesn't work. I got the early_printk to work, thanks to your help, by removing vga=XXX and splash= from menu.lst. Here's the abbreviated output:
ACPI: Looking for DSDT in initrd... not found!
not found!
Using local APIC timer interrupts.
Detected 12.528 MHz APIC timer.
cpu_up: attempt to bring up CPU 1 failed
Unable to handle kernel paging request at 0000006f812c5160 RIP:
<ffffffff8016f4cb>{free_block+123}
PGD 0
Oops: 0000 [1] SMP
All the register and callback data goes here.... I can include it if you
want/need.
<0> Kernel panic - not syncing: Attempted to kill init!
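The INITRD_MODULES edit tried in this comment can be sketched in a few lines of Python. This is only an illustration: the function name `drop_initrd_modules` is hypothetical, the sample line copies the module list quoted in this thread without its elided tail, and writing the result back to /etc/sysconfig/kernel and running mkinitrd are left to the reader.

```python
import re


def drop_initrd_modules(config_text, unwanted):
    """Return config_text with the unwanted names removed from INITRD_MODULES."""

    def rewrite(match):
        # Keep every module name that is not in the unwanted set.
        kept = [name for name in match.group(2).split() if name not in unwanted]
        return match.group(1) + " ".join(kept) + match.group(3)

    # Matches a line such as: INITRD_MODULES="sata_via processor reiserfs"
    pattern = re.compile(r'^(INITRD_MODULES=")([^"]*)(")', re.MULTILINE)
    return pattern.sub(rewrite, config_text)


# Module list as quoted in the thread (the "..." tail is not reproduced here).
sample = 'INITRD_MODULES="sata_promise sata_via via82cxxx processor thermal fan reiserfs"\n'
print(drop_initrd_modules(sample, {"processor", "thermal", "fan"}), end="")
```

Other lines in the file pass through unchanged, so the function can be applied to the whole text of the config file.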
Please include the full log. That's easiest if you use earlyprintk=serial,ttyS0,baud and a null modem cable to another machine.

I was hoping you wouldn't ask me to do that. ;-) But now that I've found a null modem cable, I've got the output:
time.c: Using 3.579545 MHz PM timer.
time.c: Detected 1804.115 MHz processor.
Console: colour VGA+ 80x25
Dentry cache hash table entries: 262144 (order: 9, 2097152 bytes)
Inode-cache hash table entries: 131072 (order: 8, 1048576 bytes)
Memory: 1019548k/1048544k available (2418k kernel code, 0k reserved, 932k data, 212k init)
Calibrating delay using timer specific routine.. 3615.70 BogoMIPS (lpj=7231418)
Security Framework v1.0.0 initialized
SELinux: Disabled at boot.
Mount-cache hash table entries: 256
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 1024K (64 bytes/line)
CPU 0(2) -> Node 0 -> Core 0
mtrr: v2.0 (20020519)
checking if image is initramfs... it is
ACPI: Looking for DSDT in initrd... not found!
not found!
Using local APIC timer interrupts.
Detected 12.528 MHz APIC timer.
cpu_up: attempt to bring up CPU 1 failed
Unable to handle kernel paging request at 000000ef81cb1fb0 RIP:
<ffffffff8016f4cb>{free_block+123}
PGD 0
Oops: 0000 [1] SMP
CPU 0
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.13-3-smp
RIP: 0010:[<ffffffff8016f4cb>] <ffffffff8016f4cb>{free_block+123}
RSP: 0000:ffff81003ffb1e58 EFLAGS: 00010012
RAX: 0000001e002fca9e RBX: ffff810002532100 RCX: 000f0017e54f0000
RDX: 0000000000000001 RSI: 0000000000000010 RDI: f000ff54f0000073
RBP: 0000000000000010 R08: ffff81003ffb0000 R09: 0000000000000001
R10: 0000000000019a28 R11: 0000000000000000 R12: ffff810002532508
R13: 0000000000000000 R14: 0000000000000001 R15: ffff810002532528
FS: 0000000000000000(0000) GS:ffffffff80508800(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 000000ef81cb1fb0 CR3: 0000000000101000 CR4: 00000000000006e0
Process swapper (pid: 1, threadinfo ffff81003ffb0000, task ffff81003ffaf500)
Stack: ffffffff7fffffff 0000000000000000 ffff810002532100 ffff810002532568
0000000000000001 0000000000000000 ffffffff804cb460 ffffffff80170167
0000000100000001 ffffffff803d3620
Call Trace:<ffffffff80170167>{cpuup_callback+455}
<ffffffff8014a56f>{notifier_call_chain+31}
<ffffffff80156a01>{cpu_up+225} <ffffffff8010c15c>{init+268}
<ffffffff8010f92e>{child_rip+8} <ffffffff8010c050>{init+0}
<ffffffff8010f926>{child_rip+0}
Code: 48 8b 14 c5 c0 ca 4c 80 48 8d 04 cd 00 00 00 00 48 c1 e1 06
RIP <ffffffff8016f4cb>{free_block+123} RSP <ffff81003ffb1e58>
CR2: 000000ef81cb1fb0
<0>Kernel panic - not syncing: Attempted to kill init!
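When comparing oops output like the above across boots, it can help to pull out just the faulting address, the RIP symbol and the call trace. The following Python sketch does that for the log format shown in this report. The function name `summarize_oops` and the regular expressions are assumptions made for illustration, not part of any kernel tooling, and the sample is a short excerpt of the log above.

```python
import re


def summarize_oops(log_text):
    """Extract the fault address, RIP symbol and call-trace symbols from an oops."""
    fault = re.search(
        r"Unable to handle kernel paging request at ([0-9a-f]+)", log_text
    )
    # The symbol follows "RIP:" either on the same line or on the next one.
    rip = re.search(r"RIP:?\s+<[0-9a-f]+>\{([^}]+)\}", log_text)

    trace = []
    if "Call Trace:" in log_text:
        # The trace runs from "Call Trace:" up to the "Code:" line.
        section = log_text.split("Call Trace:", 1)[1].split("Code:", 1)[0]
        trace = re.findall(r"\{([^}]+)\}", section)

    return {
        "fault_address": fault.group(1) if fault else None,
        "rip": rip.group(1) if rip else None,
        "call_trace": trace,
    }


sample = """Unable to handle kernel paging request at 000000ef81cb1fb0 RIP:
<ffffffff8016f4cb>{free_block+123}
Call Trace:<ffffffff80170167>{cpuup_callback+455}
<ffffffff8014a56f>{notifier_call_chain+31}
<ffffffff80156a01>{cpu_up+225} <ffffffff8010c15c>{init+268}
Code: 48 8b 14 c5
"""
print(summarize_oops(sample))
```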
Change status back to ASSIGNED.

Can you give me the full log starting with the beginning of the boot?

Sorry. Adding screen log. Attached.

Created attachment 49414
screen log
Your BIOS is somewhat broken and generates an invalid SRAT table. The fallback error code had a quirk that led to the crash later. It should work if you boot with numa=noacpi.

Probably didn't make RC2, but you can get a kotd later with that change and then drop the command line option again.

- patches.arch/srat-fallback: (#115891) Backport some x86-64 srat fallback fixes |
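For anyone applying the numa=noacpi workaround by editing /boot/grub/menu.lst, a minimal Python sketch of the text change follows. It assumes the GRUB legacy layout in which each boot entry has a line starting with `kernel`. The function name `add_kernel_option` and the sample entry (device, paths, root= value) are hypothetical and not taken from this report.

```python
def add_kernel_option(menu_lst_text, option):
    """Append option to every 'kernel' line that does not already carry it."""
    out = []
    for line in menu_lst_text.splitlines():
        words = line.split()
        if words and words[0] == "kernel" and option not in words:
            line = line.rstrip() + " " + option
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical menu.lst entry, for illustration only.
sample = """title SUSE LINUX 10.0
    kernel (hd0,0)/boot/vmlinuz root=/dev/sda2
    initrd (hd0,0)/boot/initrd
"""
print(add_kernel_option(sample, "numa=noacpi"), end="")
```

Because the function skips lines that already contain the option, running it twice leaves the file unchanged the second time. Once a kernel with the srat-fallback patch is installed, the option can be removed again by hand.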