Bugzilla – Full Text Bug Listing
| Summary: | Stop during install CD boot at "scanning usb..." | ||
|---|---|---|---|
| Product: | [openSUSE] SUSE LINUX 10.0 | Reporter: | Forgotten User N1m2whZ-xl <forgotten_N1m2whZ-xl> |
| Component: | Kernel | Assignee: | Thomas Renninger <trenn> |
| Status: | RESOLVED FIXED | QA Contact: | E-mail List <qa-bugs> |
| Severity: | Critical | ||
| Priority: | P5 - None | CC: | snwint |
| Version: | Beta 3 | ||
| Target Milestone: | --- | ||
| Hardware: | x86-64 | ||
| OS: | SUSE Other | ||
| Whiteboard: | |||
| Found By: | Beta-Customer | Services Priority: | |
| Business Priority: | Blocker: | --- | |
| Marketing QA Status: | --- | IT Deployment: | --- |
| Attachments: | acpidmp, dmesg | ||
Description
Forgotten User N1m2whZ-xl
2005-08-20 00:40:19 UTC
Here is my chance: with "manual=1" I get beyond the stopping point. But if YaST later wants to "modprobe usb-storage", I have to say "no". If I say "yes", the process "modprobe usb-storage" does not return and is not killable (not even with -9). After installation, reboot did hang within /etc/init.d/bluetooth.
The internal bluetooth device is connected via usb.
I had to "insserv -r bluetooth" to be able to boot.
Later I find in /var/log/messages:
Aug 20 19:30:43 turion kernel: irq 177: nobody cared (try booting with the "irqpoll" option)
Aug 20 19:30:43 turion kernel:
Aug 20 19:30:43 turion kernel: Call Trace: <IRQ> <ffffffff801580d5>{__report_bad_irq+53} <ffffffff801582e7>{note_interrupt+439}
Aug 20 19:30:44 turion kernel: <ffffffff80157c4f>{__do_IRQ+207} <ffffffff80111498>{do_IRQ+72}
Aug 20 19:30:44 turion kernel: <ffffffff8010eede>{ret_from_intr+0} <EOI> <ffffffff801920a8>{__d_lookup+104}
Aug 20 19:30:44 turion kernel: <ffffffff8018764c>{do_lookup+60} <ffffffff80187b4f>{__link_path_walk+847}
Aug 20 19:30:44 turion kernel: <ffffffff80136e24>{do_wait+2532} <ffffffff80188657>{link_path_walk+135}
Aug 20 19:30:44 turion kernel: <ffffffff80177cfa>{get_unused_fd+90} <ffffffff8018c6df>{filldir+127}
Aug 20 19:30:44 turion kernel: <ffffffff80188bdc>{path_lookup+380} <ffffffff8018a13c>{open_namei+172}
Aug 20 19:30:44 turion kernel: <ffffffff80178a4d>{filp_open+45} <ffffffff80177cfa>{get_unused_fd+90}
Aug 20 19:30:44 turion kernel: <ffffffff80178b02>{sys_open+82} <ffffffff8010e91a>{system_call+126}
Aug 20 19:30:44 turion kernel:
Aug 20 19:30:44 turion kernel: handlers:
Aug 20 19:30:44 turion kernel: [<ffffffff88117b20>] (usb_hcd_irq+0x0/0x70 [usbcore])
Aug 20 19:30:44 turion last message repeated 2 times
Aug 20 19:30:44 turion kernel: Disabling IRQ #177
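For anyone triaging similar reports, the key facts in a log like the one above (which IRQ went unhandled, which handler was registered, and whether the kernel disabled the line) can be pulled out mechanically. The following is only a rough sketch: the function name `summarize_unhandled_irqs` is hypothetical, the excerpt is copied from the log above with the wrapped first line joined, and the regular expressions assume the 2.6-era message format shown in this report.

```python
import re

# Excerpt taken from the /var/log/messages lines quoted above
# (the wrapped "nobody cared" line is joined into one line).
LOG = """\
Aug 20 19:30:43 turion kernel: irq 177: nobody cared (try booting with the "irqpoll" option)
Aug 20 19:30:44 turion kernel: handlers:
Aug 20 19:30:44 turion kernel: [<ffffffff88117b20>] (usb_hcd_irq+0x0/0x70 [usbcore])
Aug 20 19:30:44 turion kernel: Disabling IRQ #177
"""

# Assumed line formats, based only on the messages shown in this report.
NOBODY_CARED = re.compile(r"irq (\d+): nobody cared")
HANDLER = re.compile(r"\[<[0-9a-f]+>\] \((\w+)\+0x[0-9a-f]+/0x[0-9a-f]+ \[(\w+)\]\)")
DISABLED = re.compile(r"Disabling IRQ #(\d+)")


def summarize_unhandled_irqs(text):
    """Return {irq: {"handlers": [(function, module)], "disabled": bool}}."""
    report = {}
    current = None  # IRQ number of the most recent "nobody cared" message
    for line in text.splitlines():
        m = NOBODY_CARED.search(line)
        if m:
            current = int(m.group(1))
            report.setdefault(current, {"handlers": [], "disabled": False})
            continue
        m = HANDLER.search(line)
        if m and current is not None:
            report[current]["handlers"].append((m.group(1), m.group(2)))
            continue
        m = DISABLED.search(line)
        if m:
            irq = int(m.group(1))
            report.setdefault(irq, {"handlers": [], "disabled": False})
            report[irq]["disabled"] = True
    return report


if __name__ == "__main__":
    print(summarize_unhandled_irqs(LOG))
```

On this excerpt the only registered handler is `usb_hcd_irq` from `usbcore`, which matches the observation later in this report that all USB stops working once the line is disabled.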
Seems the kernel has problems with those modules. You can use brokenmodules=foo,bar,whatever to get past linuxrc and probably through yast, too. Marcel?
I have no idea. Never seen this before and the oops is not Bluetooth related. Send in the output of /proc/bus/usb/devices.
The bluetooth device is internally connected via usb.
turion:3 17:46:47 ~ # cat /proc/bus/usb/devices
T: Bus=03 Lev=00 Prnt=00 Port=00 Cnt=00 Dev#= 1 Spd=480 MxCh= 8
B: Alloc= 0/800 us ( 0%), #Int= 0, #Iso= 0
D: Ver= 2.00 Cls=09(hub ) Sub=00 Prot=01 MxPS= 8 #Cfgs= 1
P: Vendor=0000 ProdID=0000 Rev= 2.06
S: Manufacturer=Linux 2.6.13-rc6-git7-3-default ehci_hcd
S: Product=EHCI Host Controller
S: SerialNumber=0000:00:13.2
C:* #Ifs= 1 Cfg#= 1 Atr=c0 MxPwr= 0mA
I: If#= 0 Alt= 0 #EPs= 1 Cls=09(hub ) Sub=00 Prot=00 Driver=hub
E: Ad=81(I) Atr=03(Int.) MxPS= 2 Ivl=256ms
T: Bus=02 Lev=00 Prnt=00 Port=00 Cnt=00 Dev#= 1 Spd=12 MxCh= 4
B: Alloc= 57/900 us ( 6%), #Int= 2, #Iso= 2
D: Ver= 1.10 Cls=09(hub ) Sub=00 Prot=00 MxPS= 8 #Cfgs= 1
P: Vendor=0000 ProdID=0000 Rev= 2.06
S: Manufacturer=Linux 2.6.13-rc6-git7-3-default ohci_hcd
S: Product=OHCI Host Controller
S: SerialNumber=0000:00:13.1
C:* #Ifs= 1 Cfg#= 1 Atr=c0 MxPwr= 0mA
I: If#= 0 Alt= 0 #EPs= 1 Cls=09(hub ) Sub=00 Prot=00 Driver=hub
E: Ad=81(I) Atr=03(Int.) MxPS= 2 Ivl=255ms
T: Bus=02 Lev=01 Prnt=01 Port=01 Cnt=01 Dev#= 2 Spd=1.5 MxCh= 0
D: Ver= 1.10 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 8 #Cfgs= 1
P: Vendor=1241 ProdID=1166 Rev= 2.70
C:* #Ifs= 1 Cfg#= 1 Atr=a0 MxPwr=100mA
I: If#= 0 Alt= 0 #EPs= 1 Cls=03(HID ) Sub=01 Prot=02 Driver=usbhid
E: Ad=81(I) Atr=03(Int.) MxPS= 8 Ivl=10ms
T: Bus=02 Lev=01 Prnt=01 Port=02 Cnt=02 Dev#= 3 Spd=12 MxCh= 0
D: Ver= 1.10 Cls=e0(unk. ) Sub=01 Prot=01 MxPS=64 #Cfgs= 1
P: Vendor=0db0 ProdID=6855 Rev=15.00
S: Manufacturer=SiW
S: Product=SiW
S: SerialNumber=AA850B091100
C:* #Ifs= 2 Cfg#= 1 Atr=a0 MxPwr= 50mA
I: If#= 0 Alt= 0 #EPs= 3 Cls=e0(unk. ) Sub=01 Prot=01 Driver=hci_usb
E: Ad=81(I) Atr=03(Int.) MxPS= 16 Ivl=1ms
E: Ad=82(I) Atr=02(Bulk) MxPS= 64 Ivl=0ms
E: Ad=02(O) Atr=02(Bulk) MxPS= 64 Ivl=0ms
I: If#= 1 Alt= 0 #EPs= 2 Cls=e0(unk. ) Sub=01 Prot=01 Driver=hci_usb
E: Ad=83(I) Atr=01(Isoc) MxPS= 0 Ivl=1ms
E: Ad=03(O) Atr=01(Isoc) MxPS= 0 Ivl=1ms
I: If#= 1 Alt= 1 #EPs= 2 Cls=e0(unk. ) Sub=01 Prot=01 Driver=hci_usb
E: Ad=83(I) Atr=01(Isoc) MxPS= 9 Ivl=1ms
E: Ad=03(O) Atr=01(Isoc) MxPS= 9 Ivl=1ms
I: If#= 1 Alt= 2 #EPs= 2 Cls=e0(unk. ) Sub=01 Prot=01 Driver=hci_usb
E: Ad=83(I) Atr=01(Isoc) MxPS= 17 Ivl=1ms
E: Ad=03(O) Atr=01(Isoc) MxPS= 17 Ivl=1ms
I: If#= 1 Alt= 3 #EPs= 2 Cls=e0(unk. ) Sub=01 Prot=01 Driver=hci_usb
E: Ad=83(I) Atr=01(Isoc) MxPS= 25 Ivl=1ms
E: Ad=03(O) Atr=01(Isoc) MxPS= 25 Ivl=1ms
I: If#= 1 Alt= 4 #EPs= 2 Cls=e0(unk. ) Sub=01 Prot=01 Driver=hci_usb
E: Ad=83(I) Atr=01(Isoc) MxPS= 33 Ivl=1ms
E: Ad=03(O) Atr=01(Isoc) MxPS= 33 Ivl=1ms
I: If#= 1 Alt= 5 #EPs= 2 Cls=e0(unk. ) Sub=01 Prot=01 Driver=hci_usb
E: Ad=83(I) Atr=01(Isoc) MxPS= 49 Ivl=1ms
E: Ad=03(O) Atr=01(Isoc) MxPS= 49 Ivl=1ms
T: Bus=01 Lev=00 Prnt=00 Port=00 Cnt=00 Dev#= 1 Spd=12 MxCh= 4
B: Alloc= 0/900 us ( 0%), #Int= 0, #Iso= 0
D: Ver= 1.10 Cls=09(hub ) Sub=00 Prot=00 MxPS= 8 #Cfgs= 1
P: Vendor=0000 ProdID=0000 Rev= 2.06
S: Manufacturer=Linux 2.6.13-rc6-git7-3-default ohci_hcd
S: Product=OHCI Host Controller
S: SerialNumber=0000:00:13.0
C:* #Ifs= 1 Cfg#= 1 Atr=c0 MxPwr= 0mA
I: If#= 0 Alt= 0 #EPs= 1 Cls=09(hub ) Sub=00 Prot=00 Driver=hub
E: Ad=81(I) Atr=03(Int.) MxPS= 2 Ivl=255ms
turion:3 18:17:44 ~ #
To be honest this really looks like an USB bug and not like a Bluetooth problem. Maybe it is also a bug in the interrupt handling or a hardware problem.
Surely an USB bug. The whole USB is not working, not only the USB Bluetooth device. The kernel parameter "irqpoll" avoids this problem. So I like to suggest to take "irqpoll" into the append sequence for the "Failsafe" target.
The bug is still present with Beta3. Please assign this bug to the right person.
Well, I'm not the right person for USB stack bugs. ;-) Sounds more like a kernel issue.
As long as this bug is not resolved, "irqpoll" should be added to the Failsafe append parameters. Please do that for Beta4...
This sounds very much like a driver enabling a piece of hardware before calling request_irq. The first question that comes to my mind is: who is setting up irq 177? That should tell us rather quickly which module is broken.
turion:1 12:08:37 ~ # cat /proc/interrupts
CPU0
0: 60190297 local-APIC-edge timer
1: 449 IO-APIC-edge i8042
8: 0 IO-APIC-edge rtc
12: 925 IO-APIC-edge i8042
14: 278731 IO-APIC-edge ide0
15: 2115524 IO-APIC-edge ide1
169: 486454 IO-APIC-level acpi, ohci1394
177: 138025 IO-APIC-level ehci_hcd:usb1, ohci_hcd:usb2, ohci_hcd:usb3, yenta
185: 114416512 IO-APIC-level yenta, ath0
NMI: 1
LOC: 60193736
ERR: 837
MIS: 0
turion:1 12:08:44 ~ #
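The question "who is setting up irq 177?" can also be answered by listing every interrupt line that more than one driver has registered on. Below is a minimal sketch, assuming the single-CPU, 2.6-era `/proc/interrupts` layout shown above (one count column per CPU, then the controller type, then a comma-separated handler list). The function name `shared_irqs` is made up for illustration, and newer kernels print extra columns that this parser does not handle.

```python
# Sample copied from the /proc/interrupts output above (whitespace as quoted).
SAMPLE = """\
CPU0
0: 60190297 local-APIC-edge timer
1: 449 IO-APIC-edge i8042
8: 0 IO-APIC-edge rtc
12: 925 IO-APIC-edge i8042
14: 278731 IO-APIC-edge ide0
15: 2115524 IO-APIC-edge ide1
169: 486454 IO-APIC-level acpi, ohci1394
177: 138025 IO-APIC-level ehci_hcd:usb1, ohci_hcd:usb2, ohci_hcd:usb3, yenta
185: 114416512 IO-APIC-level yenta, ath0
NMI: 1
LOC: 60193736
ERR: 837
MIS: 0
"""


def shared_irqs(text):
    """Return {irq: [handler names]} for IRQ lines with more than one handler."""
    lines = text.strip().splitlines()
    ncpu = len(lines[0].split())  # header row lists one column per CPU
    shared = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        label = label.strip()
        if not label.isdigit():
            continue  # skip the NMI/LOC/ERR/MIS summary rows
        # ncpu count columns, one controller-type column, then the handler list
        fields = rest.split(None, ncpu + 1)
        if len(fields) < ncpu + 2:
            continue  # no handler registered on this line
        names = [name.strip() for name in fields[ncpu + 1].split(",")]
        if len(names) > 1:
            shared[int(label)] = names
    return shared


if __name__ == "__main__":
    for irq, names in sorted(shared_irqs(SAMPLE).items()):
        print(irq, names)
```

For the output above this reports IRQ 177 as shared by the EHCI controller, both OHCI controllers and the yenta CardBus bridge, so any of these four could be the device raising the unhandled interrupt.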
turion:1 12:12:34 ~ # cat /proc/cmdline
BOOT_IMAGE=suse100 ro root=303 resume=/dev/hda8 selinux=0 console=tty0 no_timer_check showopts irqpoll
turion:1 12:12:41 ~ #
This "irqpoll" gives an info line during boot (directly before initializing CPU0):
Misrouted IRQ fixup and polling support enabled
This may significantly impact system performance
and after CPU initializing:
ACPI Namespace successfully loaded at root ffffffff804f5580
evxfevnt-0096 [03] acpi_enable : Transition to ACPI mode successful
..MP-BIOS bug: 8254 timer not connected to IO-APIC
failed.
timer doesn't work through the IO-APIC - disabling NMI Watchdog!
Uhhuh. NMI received for unknown reason 3d.
Dazed and confused, but trying to continue
Do you have a strange power saving mode enabled?
works.
Using local APIC timer interrupts.
Detected 12.436 MHz APIC timer.
testing NMI watchdog ... CPU#0: NMI appears to be stuck (1->1)!
and later:
ACPI: PCI Interrupt 0000:02:04.0[A] -> GSI 19 (level, low) -> IRQ 177
...
pcie_portdrv_probe->Dev[5a34:1002] has invalid IRQ. Check vendor BIOS
...
ACPI: PCI Interrupt 0000:00:13.2[A] -> GSI 19 (level, low) -> IRQ 177
ehci_hcd 0000:00:13.2: EHCI Host Controller
ehci_hcd 0000:00:13.2: new USB bus registered, assigned bus number 1
ehci_hcd 0000:00:13.2: irq 177, io mem 0xfbdff000
ehci_hcd 0000:00:13.2: USB 2.0 initialized, EHCI 1.00, driver 10 Dec 2004
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 8 ports detected
APIC error on CPU0: 00(40)
APIC error on CPU0: 40(40)
APIC error on CPU0: 40(40)
irq 177: nobody cared (try booting with the "irqpoll" option)
Call Trace: <IRQ> <ffffffff801580d5>{__report_bad_irq+53} <ffffffff801582e7>{note_interrupt+439}
<ffffffff80157c4f>{__do_IRQ+207} <ffffffff80111498>{do_IRQ+72}
<ffffffff8010eede>{ret_from_intr+0} <ffffffff88004281>{:ide_core:ide_intr+273}
<ffffffff80157b4c>{handle_IRQ_event+44} <ffffffff80157c36>{__do_IRQ+182}
<ffffffff80111498>{do_IRQ+72} <ffffffff8010eede>{ret_from_intr+0}
<EOI> <ffffffff80223765>{copy_page+5} <ffffffff801681c9>{do_wp_page+521}
<ffffffff801209ed>{do_page_fault+1165} <ffffffff8010f295>{error_exit+0}
<ffffffff8010e91a>{system_call+126} <ffffffff8013ea14>{sys_rt_sigaction+148}
<ffffffff8010f295>{error_exit+0}
handlers:
[<ffffffff880f8b20>] (usb_hcd_irq+0x0/0x70 [usbcore])
Disabling IRQ #177
ohci_hcd: 2005 April 22 USB 1.1 'Open' Host Controller (OHCI) Driver (PCI)
ACPI: PCI Interrupt 0000:00:13.0[A] -> GSI 19 (level, low) -> IRQ 177
ohci_hcd 0000:00:13.0: OHCI Host Controller
ohci_hcd 0000:00:13.0: new USB bus registered, assigned bus number 2
ohci_hcd 0000:00:13.0: irq 177, io mem 0xfbdfd000
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 4 ports detected
pci_hotplug: PCI Hot Plug PCI Core version: 0.5
ACPI: PCI Interrupt 0000:00:13.1[A] -> GSI 19 (level, low) -> IRQ 177
ohci_hcd 0000:00:13.1: OHCI Host Controller
ohci_hcd 0000:00:13.1: new USB bus registered, assigned bus number 3
ohci_hcd 0000:00:13.1: irq 177, io mem 0xfbdfe000
hub 3-0:1.0: USB hub found
hub 3-0:1.0: 4 ports detected
shpchp: Standard Hot Plug PCI Controller Driver version: 0.4
ACPI: PCI Interrupt 0000:02:04.0[A] -> GSI 19 (level, low) -> IRQ 177
Yenta: CardBus bridge found at 0000:02:04.0 [1462:0291]
usb 3-2: new low speed USB device using ohci_hcd and address 2
...
Yenta: ISA IRQ mask 0x04b8, PCI irq 177
Socket status: 30000006
pcmcia: parent PCI bridge I/O window: 0xe000 - 0xefff
pcmcia: parent PCI bridge Memory window: 0xfbf00000 - 0xfbffffff
pcmcia: parent PCI bridge Memory window: 0x40000000 - 0x44ffffff
...
usbcore: registered new driver hiddev
input.c: calling hotplug without a hotplug agent defined
input: USB HID v1.10 Mouse [1241:1166] on usb-0000:00:13.1-2
usbcore: registered new driver usbhid
drivers/usb/input/hid-core.c: v2.01:USB HID core driver
Bluetooth: Core ver 2.7
NET: Registered protocol family 31
Bluetooth: HCI device and connection manager initialized
Bluetooth: HCI socket layer initialized
Bluetooth: HCI USB driver ver 2.8
usbcore: registered new driver hci_usb
...
Kernel is 2.6.13-rc6-git13-4-default.
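The boot log above routes several PCI devices to the same GSI and IRQ. To make that mapping explicit, the "ACPI: PCI Interrupt" lines can be grouped by IRQ number. This is only a sketch: `irq_routes` is a hypothetical helper, the input is a subset of the dmesg lines quoted above, and the regular expression assumes exactly the message format printed by this kernel.

```python
import re
from collections import defaultdict

# Subset of the dmesg lines quoted above.
DMESG = """\
ACPI: PCI Interrupt 0000:02:04.0[A] -> GSI 19 (level, low) -> IRQ 177
ACPI: PCI Interrupt 0000:00:13.2[A] -> GSI 19 (level, low) -> IRQ 177
ehci_hcd 0000:00:13.2: EHCI Host Controller
ACPI: PCI Interrupt 0000:00:13.0[A] -> GSI 19 (level, low) -> IRQ 177
ohci_hcd 0000:00:13.0: OHCI Host Controller
ACPI: PCI Interrupt 0000:00:13.1[A] -> GSI 19 (level, low) -> IRQ 177
ohci_hcd 0000:00:13.1: OHCI Host Controller
ACPI: PCI Interrupt 0000:02:04.0[A] -> GSI 19 (level, low) -> IRQ 177
"""

# Assumed format: "ACPI: PCI Interrupt <dev>[<pin>] -> GSI <n> (<trigger>, <polarity>) -> IRQ <n>"
ROUTE = re.compile(
    r"ACPI: PCI Interrupt (\S+)\[([A-D])\] -> GSI (\d+) \((\w+), (\w+)\) -> IRQ (\d+)"
)


def irq_routes(text):
    """Group routed PCI devices by IRQ: {irq: [(device, pin, gsi), ...]}."""
    routes = defaultdict(set)  # a set drops devices that are logged twice
    for match in ROUTE.finditer(text):
        device, pin, gsi, _trigger, _polarity, irq = match.groups()
        routes[int(irq)].add((device, pin, int(gsi)))
    return {irq: sorted(devices) for irq, devices in routes.items()}


if __name__ == "__main__":
    for irq, devices in irq_routes(DMESG).items():
        print(irq, devices)
```

On these lines, devices 0000:00:13.0, 0000:00:13.1, 0000:00:13.2 and 0000:02:04.0 all land on GSI 19 / IRQ 177, which agrees with the handler list for IRQ 177 in /proc/interrupts.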
Thanks, this looks like we can really pin it to ehci_hcd. Greg, could you have a look at this, please?
I've set this bug to ``assigned''. Please don't forget updating the status!
I don't think this is a usb issue, but an acpi one, right? The usb ehci driver isn't doing anything wrong, if a change in the boot option fixes it. I don't have any ideas, sorry.
The bug is still present in kernel 2.6.13-8 (from rc1).
It was not present with all 9.3 kernels.
During ehci_hcd initialization, an interrupt occurs, but "nobody cared".
The "irqpoll" kernel parameter was already set (necessary - if not, the system
freezes at this point):
ehci_hcd 0000:00:13.2: irq 177, io mem 0xfbdff000
ehci_hcd 0000:00:13.2: USB 2.0 initialized, EHCI 1.00, driver 10 Dec 2004
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 8 ports detected
irq 177: nobody cared (try booting with the "irqpoll" option)
Call Trace: <IRQ> <ffffffff80158015>{__report_bad_irq+53} <ffffffff80158227>{note_interrupt+439}
<ffffffff80157b8f>{__do_IRQ+207} <ffffffff801113d8>{do_IRQ+72}
<ffffffff8010eebc>{ret_from_intr+0} <ffffffff8029ed50>{cfq_queue_empty+0}
<ffffffff8029ed50>{cfq_queue_empty+0} <ffffffff88004281>{:ide_core:ide_intr+273}
<ffffffff80157a8c>{handle_IRQ_event+44} <ffffffff80157b76>{__do_IRQ+182}
<ffffffff801113d8>{do_IRQ+72} <ffffffff8010eebc>{ret_from_intr+0}
<EOI> <ffffffff8010e91a>{system_call+126}
handlers:
[<ffffffff88113b20>] (usb_hcd_irq+0x0/0x70 [usbcore])
Disabling IRQ #177
ohci_hcd: 2005 April 22 USB 1.1 'Open' Host Controller (OHCI) Driver (PCI)
But even when that message happens in the syslog, everything works properly, right?
Yes - as long as "irqpoll" is given. Without "irqpoll", USB does not work at all on the installed system. Trying to boot from CD1 and do "installation" freezes at:
Starting hardware detection...
Scanning USB devices...
So my suggestion for 10.0-final was at least to add "irqpoll" to the Failsafe target...
Ok, that's up to someone else to add that to the target, I can't do that...
Steffen, can you do that? It's somewhere in syslinux or whatever.
Aeh, what? Failsafe should include 'irqpoll'? Fine with me, if you kernel people think it's safe to add.
please add it, ak said ok.
added
Does pci=noacpi help? Please add whole dmesg and acpidmp output.
Created attachment 56803 [details]
acpidmp
Created attachment 56804 [details]
dmesg
pci=noacpi does not help (no USB working, and additionally cardbus WLAN card not working).
Please be sure you have installed the latest available BIOS for this machine. Can you also attach the full dmesg and /proc/interrupts output for both cases, booting with and without acpi=off?
Just a guess, but maybe it works with one of the boot params: enable_timer_pin_1, disable_timer_pin_1 or acpi_skip_timer_override?
As there is an acceptable workaround for 10.0, I am closing this one. Please try a recent openSUSE 10.1 version as early as possible and reopen if the bug is still valid for current kernels.