Bugzilla – Bug 119207
Kernel crash when pcmcia is stopped/ejected
Last modified: 2005-11-22 23:43:42 UTC
This is coming out of another bug, bug #116991. When powersave goes through its suspend process, it ejects PCMCIA cards. I have captured a /var/log/messages that contains a kernel crash in the pcmcia/yenta area that is triggered by this PCMCIA eject. I will attach the /var/log/ messages to this next. Please let me know what else I can provide for help in getting this fixed....
Created attachment 51049 [details] /var/log/messages /var/log/messages that contains the kernel crash that is triggered by the PCMCIA eject. The crash starts at Sep 27 17:36:41. My hardware is: IBM Thinkpad A31 Symbol V2.90-57 PCMCIA wireless network card (Symbol Wireless 4111 Spectrum card) Please let me know what else to provide. Thanks!
Pavel, I'm not sure if you're the right victim for this bug; but I don't know who deals with PCMCIA issues nowadays...
Well, unfortunately noone cares about PCMCIA these days, and big problem that it is *noone*, not *noone inside suse* :-(. OTOH I do not see any oops. I see a mishandled interrupt with a backtrace. Can you reproduce it manually, without suspending? Does it also happen with vanilla kernel?
Hi Pavel, Yes, I can reproduce this by simply running "pccardctl eject" while dhcpcd is running for the wireless card. Backtrace shown below: Oct 4 07:09:03 gandalf sudo: gideon : TTY=pts/5 ; PWD=/home/gideon ; USER=root ; COMMAND=/sbin/pccardctl eject Oct 4 07:09:06 gandalf kernel: irq 4: nobody cared (try booting with the "irqpoll" option) Oct 4 07:09:06 gandalf kernel: [<c013c17c>] __report_bad_irq+0x1c/0x70 Oct 4 07:09:06 gandalf kernel: [<c013c26b>] note_interrupt+0x6b/0xd0 Oct 4 07:09:06 gandalf kernel: [<c013bd0c>] __do_IRQ+0xbc/0xd0 Oct 4 07:09:06 gandalf kernel: [<c01055e8>] do_IRQ+0x38/0x60 Oct 4 07:09:06 gandalf kernel: [<c0103dea>] common_interrupt+0x1a/0x20 Oct 4 07:09:06 gandalf kernel: [<c013bc05>] handle_IRQ_event+0x15/0x60 Oct 4 07:09:06 gandalf kernel: [<c013bcd5>] __do_IRQ+0x85/0xd0 Oct 4 07:09:06 gandalf kernel: [<c01055e8>] do_IRQ+0x38/0x60 Oct 4 07:09:06 gandalf kernel: [<c0103dea>] common_interrupt+0x1a/0x20 Oct 4 07:09:06 gandalf kernel: [<c013bc05>] handle_IRQ_event+0x15/0x60 Oct 4 07:09:06 gandalf kernel: [<c013bcd5>] __do_IRQ+0x85/0xd0 Oct 4 07:09:06 gandalf kernel: [<c01055e8>] do_IRQ+0x38/0x60 Oct 4 07:09:06 gandalf kernel: [<c0103dea>] common_interrupt+0x1a/0x20 Oct 4 07:09:06 gandalf kernel: [<c013007b>] posix_cpu_timer_schedule+0xcb/ 0x480 Oct 4 07:09:06 gandalf kernel: [<c0120b91>] __do_softirq+0x31/0xa0 Oct 4 07:09:06 gandalf kernel: [<c0120c26>] do_softirq+0x26/0x30 Oct 4 07:09:06 gandalf kernel: [<c01055ed>] do_IRQ+0x3d/0x60 Oct 4 07:09:06 gandalf [powersave]: Info (increaseSpeed:120) current speed: 1200000KHz new speed: 1900000KHz Oct 4 07:09:06 gandalf kernel: [<c0103dea>] common_interrupt+0x1a/0x20 Oct 4 07:09:06 gandalf [powersave]: Info (setSpeed:90) Speed set to: 1900000KHz in file /sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed Oct 4 07:09:06 gandalf kernel: [<f9334331>] yenta_set_power+0x31/0x60 [yenta_socket] Oct 4 07:09:06 gandalf [powersave]: Info (adjustSpeed:202) Speed increased to (1900000) MHz Oct 4 07:09:06 gandalf dhcpcd[9517]: terminating on signal 15 Oct 4 07:09:06 gandalf kernel: [<f9334377>] yenta_set_socket+0x17/0x1e0 [yenta_socket] Oct 4 07:09:06 gandalf dhcpcd[9517]: dhcpStop: ioctl SIOCSIFADDR: No such device Oct 4 07:09:06 gandalf kernel: [<f9334981>] yenta_clear_maps+0x61/0xa0 [yenta_socket] Oct 4 07:09:06 gandalf dhcpcd[9517]: dhcpStop: ioctl SIOCSIFFLAGS: No such device Oct 4 07:09:06 gandalf kernel: [<f9334a5f>] yenta_sock_init+0x6f/0xa0 [yenta_socket] Oct 4 07:09:06 gandalf kernel: [<f9317586>] shutdown_socket+0x26/0xa0 [pcmcia_core] Oct 4 07:09:06 gandalf kernel: [<f9317b68>] socket_remove+0x8/0x40 [pcmcia_core] Oct 4 07:09:06 gandalf kernel: [<f9317fbc>] pcmcia_eject_card+0x4c/0x50 [pcmcia_core] Oct 4 07:09:06 gandalf kernel: [<f931a9cb>] pccard_store_eject+0x1b/0x30 [pcmcia_core] Oct 4 07:09:06 gandalf kernel: [<f931a9b0>] pccard_store_eject+0x0/0x30 [pcmcia_core] Oct 4 07:09:06 gandalf kernel: [<c025630b>] class_device_attr_store+0x1b/0x20 Oct 4 07:09:06 gandalf kernel: [<c01914ed>] flush_write_buffer+0x1d/0x30 Oct 4 07:09:06 gandalf kernel: [<c0191530>] sysfs_write_file+0x30/0x50 Oct 4 07:09:06 gandalf kernel: [<c0191500>] sysfs_write_file+0x0/0x50 Oct 4 07:09:06 gandalf kernel: [<c015935d>] vfs_write+0x8d/0x170 Oct 4 07:09:06 gandalf modify_resolvconf: restored /etc/resolv.conf.saved.by. dhcpcd.eth1 to /etc/resolv.conf Oct 4 07:09:06 gandalf kernel: [<c01594ec>] sys_write+0x3c/0x70 Oct 4 07:09:06 gandalf kernel: [<c0102d1b>] sysenter_past_esp+0x54/0x79 Oct 4 07:09:06 gandalf kernel: handlers: Oct 4 07:09:06 gandalf kernel: [<f92ac0d0>] (usb_hcd_irq+0x0/0x60 [usbcore]) Oct 4 07:09:06 gandalf kernel: [<f9334830>] (yenta_interrupt+0x0/0xc0 [yenta_socket]) Oct 4 07:09:06 gandalf kernel: [<f8ee7410>] (radeon_driver_irq_handler+0x0/0xa0 [radeon]) Oct 4 07:09:06 gandalf kernel: Disabling IRQ #4 As far as from a vanilla kernel.... I've not compiled a vanilla kernel in quite a while. =:) And I don't have room on my laptop to build it right now, unfortunately. Do you have a vanilla kernel RPM that I can install?
Not me, but I'm sure it is somewhere inside suse.
Well, um. I don't think there's any info that I need to provide, so I'm taking the "ASSIGNED"/NEEDINFO off of me, hopefully.
I still would like you to test with vanilla kernel. Compiling it yourself does not take *that* much space...
Hmmmmm. I've compiled a vanilla linux-2.6.13.4 kernel and could not reproduce it. Then I tried with the default linux-2.6.13-15 kernel from 10.0-final and couldn't reproduce it either. If you'd like to close this, that's fine with me and I'll re-open it if I see it again. Again, wishing there was a WTF resolution. =:/
ok. Let's assume it was a passing cosmic particle that crashed your machine ;-)
Gah. I spoke too soon. I just saw that this happened again this morning. This time, it was again as I was suspending my laptop in the morning to go to work. This is with the default 10.0-final kernel. An excerpt from /var/log/messages again: Oct 14 07:25:59 linux kernel: irq 4: nobody cared (try booting with the "irqpoll" option) Oct 14 07:25:59 linux kernel: [<c013c18c>] __report_bad_irq+0x1c/0x70 Oct 14 07:25:59 linux kernel: [<c013c27b>] note_interrupt+0x6b/0xd0 Oct 14 07:25:59 linux kernel: [<c013bd1c>] __do_IRQ+0xbc/0xd0 Oct 14 07:25:59 linux kernel: [<c01055e8>] do_IRQ+0x38/0x60 Oct 14 07:25:59 linux kernel: [<c0103dea>] common_interrupt+0x1a/0x20 Oct 14 07:25:59 linux kernel: [<c013007b>] posix_cpu_timer_schedule+0xbb/0x480 Oct 14 07:25:59 linux kernel: [<c0120ba1>] __do_softirq+0x31/0xa0 Oct 14 07:25:59 linux kernel: [<c0120c36>] do_softirq+0x26/0x30 Oct 14 07:25:59 linux kernel: [<c01055ed>] do_IRQ+0x3d/0x60 Oct 14 07:25:59 linux kernel: [<c0103dea>] common_interrupt+0x1a/0x20 Oct 14 07:25:59 linux kernel: [<f8f1d331>] yenta_set_power+0x31/0x60 [yenta_socket] Oct 14 07:25:59 linux kernel: [<f8f1d377>] yenta_set_socket+0x17/0x1e0 [yenta_socket] Oct 14 07:25:59 linux kernel: [<f8f1d981>] yenta_clear_maps+0x61/0xa0 [yenta_socket] Oct 14 07:25:59 linux kernel: [<f8f1da5f>] yenta_sock_init+0x6f/0xa0 [yenta_socket] Oct 14 07:25:59 linux kernel: [<f9031586>] shutdown_socket+0x26/0xa0 [pcmcia_core] Oct 14 07:26:00 linux kernel: [<f9031b68>] socket_remove+0x8/0x40 [pcmcia_core] Oct 14 07:26:00 linux kernel: [<f9031fbc>] pcmcia_eject_card+0x4c/0x50 [pcmcia_core] Oct 14 07:26:00 linux kernel: [<f90349cb>] pccard_store_eject+0x1b/0x30 [pcmcia_core] Oct 14 07:26:00 linux kernel: [<f90349b0>] pccard_store_eject+0x0/0x30 [pcmcia_core] Oct 14 07:26:00 linux kernel: [<c025643b>] class_device_attr_store+0x1b/0x20 Oct 14 07:26:00 linux kernel: [<c01914fd>] flush_write_buffer+0x1d/0x30 Oct 14 07:26:00 linux kernel: [<c0191540>] sysfs_write_file+0x30/0x50 Oct 14 07:26:00 linux kernel: [<c0191510>] sysfs_write_file+0x0/0x50 Oct 14 07:26:00 linux kernel: [<c015936d>] vfs_write+0x8d/0x170 Oct 14 07:26:00 linux kernel: [<c01594fc>] sys_write+0x3c/0x70 Oct 14 07:26:00 linux kernel: [<c0102d1b>] sysenter_past_esp+0x54/0x79 Oct 14 07:26:01 linux kernel: handlers: Oct 14 07:26:02 linux kernel: [<f8f1d830>] (yenta_interrupt+0x0/0xc0 [yenta_socket]) Oct 14 07:26:02 linux kernel: [<f8c35410>] (radeon_driver_irq_handler+0x0/0xa0 [radeon]) Oct 14 07:26:02 linux kernel: [<f8f000d0>] (usb_hcd_irq+0x0/0x60 [usbcore]) Oct 14 07:26:02 linux kernel: Disabling IRQ #4 Hm. I just noticed that immediately after my laptop resumed, my usb mouse went completely wonky. Seeing the "Disabling IRQ #4" now makes perfect sense, since this is what /proc/interrupts looks like: CPU0 0: 173346 XT-PIC timer 1: 2115 XT-PIC i8042 2: 0 XT-PIC cascade 4: 53977 XT-PIC uhci_hcd:usb1, yenta, radeon@pci:0000:01:00.0 5: 18451 XT-PIC Intel 82801CA-ICH3, yenta 7: 0 XT-PIC uhci_hcd:usb3 8: 8555 XT-PIC rtc 9: 0 XT-PIC uhci_hcd:usb2 11: 12616 XT-PIC 0.0 12: 110 XT-PIC i8042 14: 37246 XT-PIC ide0 15: 3682 XT-PIC ide1 NMI: 0 LOC: 0 ERR: 0 MIS: 0 Anyway, I'm reopening this since it did happen again, and I didn't see any cosmic particles this time. =;) What's the next step? Do I need to run the vanilla kernel I built and see if it happens again? If so, are there any problems with running a generic kernel with Suse 10.0-final? Will everything work? (hal, etc.?)
You should find a reproducible way to reproduce this one... yenta_set_power triggers an interrupt noone knows how to handle. That triggers disabling of irq4.
Is there reliable way to reproduce it?
grep -E "Disabling|yenta_set_power" /var/log/messages Oct 14 07:25:59 linux kernel: [<f8f1d331>] yenta_set_power+0x31/0x60 [yenta_socket] Oct 14 07:26:02 linux kernel: Disabling IRQ #4 Oct 19 13:45:02 linux kernel: [<f8f2c331>] yenta_set_power+0x31/0x60 [yenta_socket] Oct 19 13:45:03 linux kernel: Disabling IRQ #4 Oct 20 17:45:24 linux kernel: [<f8fed331>] yenta_set_power+0x31/0x60 [yenta_socket] Oct 20 17:45:26 linux kernel: Disabling IRQ #4 Well, yes there must be, since it's happened 3 times in the last ten days on my laptop. I've not had time to try to track it down yet. Also, I have had this in /etc/pcmcia/config.opts for a long time now... exclude irq 4 ... and apparently it is being ignored, since irq 4 is being used for PCMCIA? 4: 20329964 XT-PIC yenta, radeon@pci:0000:01:00.0, uhci_hcd:usb1
In needinfo for way too long...