Bugzilla – Bug 144623
Kerneloopses with 2.6.15-git12-6-smp
Last modified: 2006-02-28 17:11:29 UTC
After finishing the 10.1 beta1 32-bit installation and rebooting the system, I got the following oopses when X11 was starting. The system is an ASUS A8N32 with an Athlon64, 2 GB RAM, GeForce 7800GT graphics, and WinTV 350PVR and Technotrend C2300 DVB-C cards:

Jan 21 18:07:57 emil3 klogd: Unable to handle kernel NULL pointer dereference at virtual address 00000000
Jan 21 18:07:57 emil3 klogd: printing eip:
Jan 21 18:07:57 emil3 klogd: c108cfe1
Jan 21 18:07:57 emil3 klogd: *pde = 00000000
Jan 21 18:07:57 emil3 klogd: Oops: 0000 [#2]
Jan 21 18:07:57 emil3 klogd: SMP
Jan 21 18:07:57 emil3 klogd: Modules linked in: button battery ac snd_mpu401 snd_mpu401_uart snd_rawmidi snd_seq_device ns558 gameport sky2 ohci1394 ieee1394 stradis compat_ioctl32 videodev snd_intel8x0 ehci_hcd snd_ac97_codec snd_ac97_bus snd_pcm snd_timer snd ohci_hcd i2c_nforce2 soundcore snd_page_alloc i2c_core usbcore forcedeth generic shpchp pci_hotplug dm_mod parport_pc lp parport reiserfs fan thermal processor sata_sil24 sg sata_nv libata amd74xx sd_mod scsi_mod ide_disk ide_core
Jan 21 18:07:57 emil3 klogd: CPU: 0
Jan 21 18:07:57 emil3 klogd: EIP: 0060:[<c108cfe1>] Tainted: G U VLI
Jan 21 18:07:57 emil3 klogd: EFLAGS: 00010246 (2.6.15-git12-6-smp)
Jan 21 18:07:57 emil3 klogd: EIP is at sysfs_lookup+0x30/0x1a3
Jan 21 18:07:57 emil3 klogd: eax: 00000000 ebx: f7314198 ecx: c2563f68 edx: 00000004
Jan 21 18:07:57 emil3 klogd: esi: 00000000 edi: f73141f8 ebp: f71e6204 esp: c2563e54
Jan 21 18:07:57 emil3 klogd: ds: 007b es: 007b ss: 0068
Jan 21 18:07:57 emil3 klogd: Process hald (pid: 2949, threadinfo=c2562000 task=dfe92a90)
Jan 21 18:07:57 emil3 klogd: Stack: <0>f71e622c 00000001 c11fc140 f7314198 f71ce608 f71ce67c c1064cde c2563ec4
Jan 21 18:07:57 emil3 klogd: c2563eb8 c2563f68 dfd72f40 baf3f719 f71ce608 c265101e c2563f68 c1066832
Jan 21 18:07:57 emil3 klogd: c2651024 00000000 00000000 c11f88a8 000280d2 c11f88a8 00000000 c2289280
Jan 21 18:07:57 emil3 klogd: Call Trace:
Jan 21 18:07:57 emil3 klogd: [<c1064cde>] do_lookup+0xa3/0x135
Jan 21 18:07:57 emil3 klogd: [<c1066832>] __link_path_walk+0x7fd/0xc41
Jan 21 18:07:57 emil3 klogd: [<c1066cbf>] link_path_walk+0x49/0xbd
Jan 21 18:07:57 emil3 klogd: [<c1066fba>] path_lookup+0x145/0x17a
Jan 21 18:07:57 emil3 klogd: [<c106760c>] __user_walk+0x21/0x31
Jan 21 18:07:57 emil3 klogd: [<c10618a0>] sys_readlink+0x20/0x8c
Jan 21 18:07:57 emil3 klogd: [<c1003c2b>] sysenter_past_esp+0x54/0x79
Jan 21 18:07:57 emil3 klogd: Code: d3 83 ec 08 8b 42 18 8b 40 54 89 04 24 8b 68 0c e9 63 01 00 00 f6 45 18 2c 0f 84 56 01 00 00 89 e8 e8 cc ec ff ff 8b 7b 24 89 c6 <ac> ae 75 08 84 c0 75 f8 31 c0 eb 04 19 c0 0c 01 85 c0 0f 85 32
Jan 21 18:07:57 emil3 klogd: <6>BIOS EDD facility v0.16 2004-Jun-25, 3 devices found
Jan 21 18:07:57 emil3 klogd: Unable to handle kernel NULL pointer dereference at virtual address 00000030
Jan 21 18:07:57 emil3 klogd: printing eip:
Jan 21 18:07:57 emil3 klogd: f93e7488
Jan 21 18:07:57 emil3 klogd: *pde = 7f4b3067
Jan 21 18:07:57 emil3 klogd: Oops: 0000 [#3]
Jan 21 18:07:57 emil3 klogd: SMP
Jan 21 18:07:57 emil3 klogd: Modules linked in: edd button battery ac snd_mpu401 snd_mpu401_uart snd_rawmidi snd_seq_device ns558 gameport sky2 ohci1394 ieee1394 stradis compat_ioctl32 videodev snd_intel8x0 ehci_hcd snd_ac97_codec snd_ac97_bus snd_pcm snd_timer snd ohci_hcd i2c_nforce2 soundcore snd_page_alloc i2c_core usbcore forcedeth generic shpchp pci_hotplug dm_mod parport_pc lp parport reiserfs fan thermal processor sata_sil24 sg sata_nv libata amd74xx sd_mod scsi_mod ide_disk ide_core
Jan 21 18:07:57 emil3 klogd: CPU: 0
Jan 21 18:07:57 emil3 klogd: EIP: 0060:[<f93e7488>] Tainted: G U VLI
Jan 21 18:07:57 emil3 klogd: EFLAGS: 00013246 (2.6.15-git12-6-smp)
Jan 21 18:07:57 emil3 klogd: EIP is at video_open+0xb4/0x16a [videodev]
Jan 21 18:07:57 emil3 klogd: eax: 00000000 ebx: f93e8c20 ecx: f93fd520 edx: f6cbe000
Jan 21 18:07:57 emil3 klogd: esi: c1251e40 edi: f777242c ebp: 00000000 esp: f6cbfef8
Jan 21 18:07:57 emil3 klogd: ds: 007b es: 007b ss: 0068
Jan 21 18:07:57 emil3 klogd: Process X (pid: 3536, threadinfo=f6cbe000 task=dfee2560)
Jan 21 18:07:57 emil3 klogd: Stack: <0>00000000 f786e8c0 00000000 f777242c c1061169 c1251e40 00000000 c1251e40
Jan 21 18:07:57 emil3 klogd: f777242c 00000000 c1061043 c1058594 dfd72140 f70e042c c1251e40 f6cbff54
Jan 21 18:07:57 emil3 ifstatus: eth0 device: nVidia Corporation CK804 Ethernet Controller (rev a3)
Jan 21 18:07:57 emil3 klogd: b7ea3ff4 00000008 c10586dc c1251e40 00000000 00008002 c1058712 f70e042c
Jan 21 18:07:57 emil3 ifstatus: eth0 configuration: eth-bus-pci-0000:00:13.0
Jan 21 18:07:58 emil3 klogd: Call Trace:
Jan 21 18:07:58 emil3 klogd: [<c1061169>] chrdev_open+0x126/0x163
Jan 21 18:07:58 emil3 klogd: [<c1251e40>] trap_init+0x135/0x1c8
Jan 21 18:07:58 emil3 klogd: [<c1251e40>] trap_init+0x135/0x1c8
Jan 21 18:07:58 emil3 klogd: [<c1061043>] chrdev_open+0x0/0x163
Jan 21 18:07:58 emil3 klogd: [<c1058594>] __dentry_open+0xc7/0x1ab
Jan 21 18:07:58 emil3 klogd: [<c1251e40>] trap_init+0x135/0x1c8
Jan 21 18:07:58 emil3 klogd: [<c10586dc>] nameidata_to_filp+0x19/0x28
Jan 21 18:07:58 emil3 klogd: [<c1251e40>] trap_init+0x135/0x1c8
Jan 21 18:07:58 emil3 klogd: [<c1058712>] filp_open+0x27/0x2d
Jan 21 18:07:58 emil3 klogd: [<c1251e40>] trap_init+0x135/0x1c8
Jan 21 18:07:58 emil3 klogd: [<c1059499>] do_sys_open+0x33/0xa3
Jan 21 18:07:58 emil3 klogd: [<c1003c2b>] sysenter_past_esp+0x54/0x79
Jan 21 18:07:58 emil3 klogd: Code: 85 d2 74 1b b8 00 e0 ff ff 21 e0 83 3a 02 8b 40 10 74 11 c1 e0 07 8d 84 10 00 01 00 00 ff 00 8b 41 34 eb 02 31 c0 89 46 10 31 ed <8b> 48 30 85 c9 74 6b 89 f2 89 f8 ff d1 85 c0 89 c5 74 5f 8b 46
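As a reading aid for the second oops: on i386 the faulting bytes `<8b> 48 30` decode to a 4-byte load from `0x30(%eax)`, and the register dump shows eax as 00000000, which is consistent with the reported fault address 00000030. The following userspace C sketch only models that arithmetic. The struct `ops_table32` and its member names are hypothetical, chosen so that one member sits at offset 0x30; the log itself does not say which kernel structure was involved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-bit layout: twelve 4-byte slots precede the member
 * that the faulting instruction tried to read. */
struct ops_table32 {
    uint32_t slots[12];   /* offsets 0x00 .. 0x2c */
    uint32_t callback;    /* offset 0x30 */
};

/* Compile-time check that the model really places the member at 0x30. */
static_assert(offsetof(struct ops_table32, callback) == 0x30,
              "callback must sit at offset 0x30 in this model");

/* Address touched by a load of the form disp(%reg): base plus displacement. */
uint32_t fault_address(uint32_t base, uint32_t displacement)
{
    return base + displacement;
}

/* Offset of the modelled member, as the compiler lays it out. */
size_t callback_offset(void)
{
    return offsetof(struct ops_table32, callback);
}
```

With a NULL base pointer, `fault_address(0, 0x30)` gives 0x30, the address in the oops message. In other words, a small non-zero fault address usually means a member was read through a NULL structure pointer rather than NULL itself being dereferenced.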
Created attachment 64402 [details] Full boot.msg with crash After another reboot the system locked hard even before X11 was about to start. This is the saved boot.msg
Can you attach the output of 'hwinfo'? It looks like you have some bad driver issues :( Also, any chance you can test a kernel-of-the-day on this box?
Created attachment 64866 [details] hwinfo This is the hwinfo output, run from a SUSE 10.0 x86_64 installation with kernel 2.6.15-20060109195850-smp on that system.
Created attachment 64867 [details] boot.msg of kotd 2.6.16-rc1-git3-20060120194150-smp (X86_64) on SUSE 10.0 For test purposes I had already compiled and run 2.6.16-rc1-git3-20060120194150-smp on my SUSE 10.0 installation on that system. I installed the kernel-source.rpm and configured it using arch/x86_64/defconfig.smp. That kernel is not exactly comparable with the problematic kernel from 10.1 beta1: the architecture is x86_64, not i586; it was compiled with gcc-4.0.2 (SUSE 10.0), not gcc-4.1; the running system is SUSE 10.0, not 10.1; and the stradis driver, which seems to be the problem ("EIP is at stradis_probe+0x532/0x97f [stradis]" in the oops in the full boot.log), is not loaded. Tomorrow^WToday I will try to find time to install the current KOTD on the currently broken 10.1 installation.
Created attachment 64884 [details] Boot.msg 10.1b1 with kotd 2.6.16-rc1-git3-20060124182340-smp (i586) Seems that the kotd didn't fix the problem yet. It still oopses in stradis_probe.
I think I now have an idea of what happens: the stradis driver was recently changed to use the 2.6 PCI API (see http://lkml.org/lkml/2005/12/31/144). Both the Stradis MPEG output card and my Octal/Technotrend DVB-C card use a SAA7146 device. Now the stradis driver sees the SAA7146 on the Technotrend card and crashes when probing it.
It's definitely the stradis module that causes the problem. I've blacklisted it in /etc/modprobe.conf.local and now the oops no longer shows up. So the question now is why and where this module is configured to load on 10.1b1 without stradis hardware.
This problem also bit me when installing beta2. It caused an oops when restarting the system after installation of CD1. Workaround: blacklisting stradis.ko.
It's getting loaded because you have the hardware for that driver in your system. As for why it is crashing, I do not know. For beta3 there should be some more debugging information in the crash to help us track this down. Can you reopen this bug then with the new oops message?
No, I have no stradis hardware; I have that Technotrend DVB-C device. Sure, the Technotrend device also makes use of the SAA7146, but note that there are two variants of the 7146 driver: saa7146.ko and saa7146_vv.ko. The dvb_ttpci driver for the Technotrend device uses the saa7146_vv.ko module, while the stradis driver seems to use saa7146.ko. The differences between these drivers may be the reason for the crash. I think the stradis driver lacks proper hardware recognition that could distinguish between the stradis hardware and the Technotrend device. But I must admit that such a routine is a special case, which is normally not needed for PCI devices with their unique IDs.
As expected, there was no change with beta3. Unfortunately the first reboot during installation caused two kernel oopses with a hard lockup. No logfiles were written, so I can give you only a picture of the screen with the second oops (no scrollback possible). Then I rebooted into safe mode. This time a log was written.
Created attachment 66360 [details] Screenshot of oops during normal boot
Created attachment 66361 [details] Boot.msg 10.1b3, booting in safe mode
Same problem here: oops with beta3 kernel vmlinuz-2.6.16-rc1-git3-7-default .... EIP is at saa7146_irq+0x1f/0x515[stradis] .... and no possibility to page up, no logs. I could write down some lines if that is of interest. I have no stradis; I have a Hauppauge WinTV DVB-S card: 01:09.0 Multimedia controller: Philips Semiconductors SAA7146 (rev 01)
AFAIK the Hauppauge Nexus-S is identical to the Technotrend DVB-S card, like my Technotrend DVB-C is the same as the Hauppauge Nexus-CA. Both are "full featured" DVB cards with hardware mpeg2 decoder.
Created attachment 67766 [details] boot.msg with kotd kernel-debug-2.6.16_rc2_git8-20060210184420.i586 With the latest kernel, the stradis module no longer seems to interfere with my dvb-ttpci card; no kernel crash.
Well, I'm not sure that the latest kotd solves the problem. I just tested it: I removed the blacklist entry for stradis in /etc/modprobe.conf.local in my RC3 installation with the original kernel and rebooted twice. No stradis module was loaded. I also run the kotds on a SuSE 10.0 installation, and the problems with the stradis driver never showed up there. So only the conditions in the early installation phase of 10.1 seem to trigger the problem.
I modprobed the stradis driver on my 10.0 installation with the online-updated kernel, then removed it with rmmod. The system became inoperative, not immediately but about 30 seconds later, after I had switched to graphical mode and moved the mouse. To test your objection, I made a fresh install of 10.1 Beta3 and, at the reboot after CD1, booted into my 10.0 installation and installed the kotd in the 10.1 partition, then rebooted and continued the installation with CDs 2-5: no problem, and there "are" traces in boot.log that the stradis module is probed at boot time. vdr seems to work as expected. I think the problem is at least improving.
Gerd, can you take a quick look at this?
The device probing it does looks a bit scary: it grabs every saa7146 device instead of checking PCI subsystem IDs. That can't work and should be fixed. But I have no idea what the PCI subsystem IDs are. Google doesn't find me anything, and /usr/share/pci.ids doesn't have them either :-/ As far as I know that piece of hardware is quite old, was expensive and is probably very rare. So simply disabling the driver in the kernel config, or maybe better only blacklisting it by default, is the easiest way to deal with it. Blacklisting modules is done by module-init-tools these days, right? Marian, it's stradis.ko
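A minimal userspace sketch of the difference described above, assuming a simplified ID table rather than the kernel's real `struct pci_device_id`. The type `pci_ids`, the helper names and the `ID_ANY` wildcard are made up for illustration. Vendor/device 0x1131/0x7146 is the Philips SAA7146; every subsystem value below is a placeholder, since the real Stradis subsystem IDs are not known in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ID_ANY 0xFFFFu  /* wildcard, loosely modelled on the kernel's PCI_ANY_ID */

/* Simplified stand-in for a PCI ID table entry (hypothetical type). */
struct pci_ids {
    uint16_t vendor, device, subvendor, subdevice;
};

static bool field_matches(uint16_t wanted, uint16_t found)
{
    return wanted == ID_ANY || wanted == found;
}

/* True if any table entry matches the card on all four fields. */
bool driver_claims(const struct pci_ids *table, size_t n,
                   const struct pci_ids *card)
{
    assert(table != NULL && card != NULL);
    for (size_t i = 0; i < n; i++) {
        if (field_matches(table[i].vendor, card->vendor) &&
            field_matches(table[i].device, card->device) &&
            field_matches(table[i].subvendor, card->subvendor) &&
            field_matches(table[i].subdevice, card->subdevice))
            return true;
    }
    return false;
}

/* Chip-only match: claims every SAA7146 board, whoever built it. */
const struct pci_ids greedy_table[] = {
    { 0x1131, 0x7146, ID_ANY, ID_ANY },
};

/* Subsystem-aware match; 0xAAAA/0xBBBB are placeholder subsystem IDs. */
const struct pci_ids strict_table[] = {
    { 0x1131, 0x7146, 0xAAAA, 0xBBBB },
};

/* Two boards with the same chip but different (placeholder) subsystem IDs. */
const struct pci_ids stradis_card = { 0x1131, 0x7146, 0xAAAA, 0xBBBB };
const struct pci_ids dvb_card     = { 0x1131, 0x7146, 0x1111, 0x2222 };
```

In this model, `driver_claims(greedy_table, 1, &dvb_card)` is true, which mirrors the behavior reported here, while `driver_claims(strict_table, 1, &dvb_card)` is false and the DVB card is left for its own driver.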
*** Bug 150251 has been marked as a duplicate of this bug. ***
Added stradis to blacklist.
As "my" Bug 150251, "Kernel panic - not syncing: Fatal exception in interupt" on 10.1 beta2/beta3, has been marked as a duplicate of this bug, I'll continue here. To repeat: on this same hardware and disk I've installed and run SuSE Linux 9.0-9.3 Professional and jds2/SLES8 over the last years. Currently I'm running SuSE 10.0 and jds3/SLES9 (beta) beside Win2k, in a multiboot configuration without any hardware or driver problems. This kernel panic problem started recently when I tried to install 10.1 beta2 and continued with beta3 installed on the same hardware. As the bug looks to be related to the mentioned "Stradis" driver in 10.1 beta, I searched the web (Stradis home page http://www.stradis.com/ ), which told me that Stradis has long been a standard-definition MPEG and MPEG-2 video decoder of choice for applications requiring true broadcast quality. This made me think that the Linux Stradis driver problem in my case may be related to my Pinnacle DV500 DVD video capture and editing card (for Windows). Even though this 32-bit busmastering PCI card (plus a breakout box and IEEE 1394 DV cable) is now 5 years old, it was and still is quite capable of capturing both analog S-Video and DV, besides its MPEG-2 import. Some more description of the DV500 DVD card and software product is available at e.g. these links: http://www.videoguys.com/dv500DVD.html http://www.tomshardware.com/2001/08/01/building_a_digital_video_capture_system_/page13.html http://www.pinnaclesys.com/WebVideo/dv500dvd/English(US)/doc/DV500_Datasheet.pdf In case it can help with debugging this Stradis driver problem to compare hwinfo on 10.0 (which works OK) and 10.1b3 (once I was able to do a tty login, in failsafe mode I think; otherwise I only get a black login screen), I'll attach both here. I also attach the 10.1b3 /var/log/boot.msg and /var/log/boot.omsg. Terje J. Hanssen
I would like my comment above and my attachments below to be evaluated, as I think it looks too easy to just blacklist stradis, which has worked for all SuSE distros before 10.1? Terje J. Hanssen
Created attachment 69987 [details] 10.0 - hwinfo
Created attachment 69988 [details] 10.1b3 - hwinfo
Created attachment 69990 [details] 10.1b3 - /var/log/boot.msg
Created attachment 69991 [details] 10.1b3 - /var/log/boot.omsg
The stradis driver just didn't get autoloaded via hotplug on older distro versions (none of the v4l drivers were); that's why the bug didn't trigger. If you manually load the driver using "modprobe stradis" on the 10.0 installation, it likely blows up too. The problem is that the stradis driver actually tries to handle every card with an saa7146 chip on it, no matter whether it actually is a stradis card or something else using the same chip (like your Pinnacle card). Because of that behavior we simply can't autoload it, hence the blacklist entry. Owners of a stradis card can still load it manually and it probably even works. We can't verify that due to lack of hardware, though.
Although this doesn't pose any problem in my installation, I'd like to point out that stradis is not blacklisted as of beta5. So I am not reopening the bug. Still getting: <4>videodev: "SAA7146A" has no release callback. Please fix your driver for proper sysfs support, see http://lwn.net/Articles/36850/ <4>stradis0: config = 00 03 13 c2 26 0f ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff in /var/log/boot.msg