Bugzilla – Full Text Bug Listing
| Summary: | Kerneloopses with 2.6.15-git12-6-smp | ||
|---|---|---|---|
| Product: | [openSUSE] SUSE Linux 10.1 | Reporter: | Markus Koßmann <markus.kossmann> |
| Component: | Kernel | Assignee: | Gerd Hoffmann <kraxel> |
| Status: | RESOLVED FIXED | QA Contact: | E-mail List <qa-bugs> |
| Severity: | Normal | ||
| Priority: | P5 - None | CC: | michel.munnix, terjejhanssen |
| Version: | Beta 1 | ||
| Target Milestone: | --- | ||
| Hardware: | 32bit | ||
| OS: | Other | ||
| Whiteboard: | |||
| Found By: | Other | Services Priority: | |
| Business Priority: | | Blocker: | --- |
| Marketing QA Status: | --- | IT Deployment: | --- |
Attachments:
Full boot.msg with crash
hwinfo
boot.msg of kotd 2.6.16-rc1-git3-20060120194150-smp (X86_64) on SUSE 10.0
Boot.msg 10.1b1 with kotd 2.6.16-rc1-git3-20060124182340-smp (i586)
Screenshoot of oops during normal boot
Boot.msg 10.1b3, booting in safe mode
boot.msg with kotd kernel-debug-2.6.16_rc2_git8-20060210184420.i586
10.0 - hwinfo
10.1b3 - hwinfo
10.1b3 - /var/log/boot.msg
10.1b3 - /var/log/boot.omsg

Description
Markus Koßmann
2006-01-21 17:45:16 UTC
Created attachment 64402 [details]
Full boot.msg with crash
After another reboot the system locked hard even before X11 was about to start. This is the saved boot.msg.
Can you attach the output of 'hwinfo'? It looks like you have some bad driver issues :( Also, any chance you can test a kernel-of-the-day on this box?

Created attachment 64866 [details]
hwinfo
This is hwinfo, run from a SUSE 10.0 x86_64 installation with kernel 2.6.15-20060109195850-smp on that system.
Created attachment 64867 [details]
boot.msg of kotd 2.6.16-rc1-git3-20060120194150-smp (X86_64) on SUSE 10.0
For test purposes I had already compiled and run 2.6.16-rc1-git3-20060120194150-smp on my SUSE 10.0 installation on that system. I installed the kernel-source.rpm and configured it using arch/x86_64/defconfig.smp.
That kernel is not exactly comparable with the problematic kernel from 10.1beta:
It's x86_64 arch, not i586
It was compiled with gcc-4.0.2 (SuSE 10.0), not with gcc-4.1
SuSE 10.0 is started, not 10.1
And the stradis driver, which seems to be problematic ("EIP is at stradis_probe+0x532/0x97f [stradis]" in the oops in the full boot.msg), is not loaded.
Tomorrow^wToday I will try to find time to install the current KOTD on the currently broken 10.1 installation.
Created attachment 64884 [details]
Boot.msg 10.1b1 with kotd 2.6.16-rc1-git3-20060124182340-smp (i586)
Seems that the kotd didn't fix the problem yet. It still oopses in stradis_probe.
I think I now have an idea of what happens: The stradis driver was recently changed to use the 2.6 PCI API (see http://lkml.org/lkml/2005/12/31/144). Both the Stradis MPEG output card and my Octal/Technotrend DVB-C card make use of a SAA7146 device. Now the stradis driver sees the SAA7146 on the Technotrend card and crashes when probing it.

It's definitely the stradis module which causes the problem. I've blacklisted it in /etc/modprobe.conf.local and now the oops doesn't show up any more. So the question now is why and where this module is configured to load on 10.1b1 without stradis hardware.

This problem also bugged me when installing beta2. It caused an oops when restarting the system after installation of CD1. Workaround: blacklisting stradis.ko.

It's getting loaded because you have the hardware for that driver in your system. As for why it is crashing, I do not know. For beta3 there should be some more debugging information in the crash to help us track this down. Can you reopen this bug then with the new oops message?

No, I have no stradis hardware, I have that Technotrend DVB-C device. Sure, the Technotrend device also makes use of the SAA7146, but note there are two variants of the 7146 driver: saa7146.ko and saa7146_vv.ko. The dvb_ttpci driver for the Technotrend device makes use of the saa7146_vv.ko module; the stradis driver seems to use saa7146.ko. The differences between these drivers may be the reason for the crash. I think the stradis driver lacks proper hardware recognition that can tell the stradis hardware apart from the Technotrend device. But I must admit that such a routine is a special case, which is normally not needed for PCI devices with their unique IDs.

As expected, there was no change with beta3. Unfortunately the first reboot during installation caused two kernel oopses with hard lockup. No logfiles were written. So I can only give you a picture of the screen with the second oops (no scrollback possible).
Then I rebooted into safe mode. This time a log was written.

Created attachment 66360 [details]
Screenshoot of oops during normal boot
Created attachment 66361 [details]
Boot.msg 10.1b3, booting in safe mode
Same problem here: oops with beta3 kernel vmlinuz-2.6.16-rc1-git3-7-default .... EIP is at saa7146_irq+0x1f/0x515[stradis] .... and no possibility to do a page up, no logs. I could write down some lines if of interest. I have no stradis, I have a Hauppauge wintv dvb-s card:
01:09.0 Multimedia controller: Philips Semiconductors SAA7146 (rev 01)

AFAIK the Hauppauge Nexus-S is identical to the Technotrend DVB-S card, like my Technotrend DVB-C is the same as the Hauppauge Nexus-CA. Both are "full featured" DVB cards with hardware mpeg2 decoder.

Created attachment 67766 [details]
boot.msg with kotd kernel-debug-2.6.16_rc2_git8-20060210184420.i586
with the latest kernel, the stradis module seems no longer to interfere with my dvb-ttpci card, no kernel crash.
Well, I'm not sure that the latest kotd solves the problem. Just tested it: Removed the blacklist entry for stradis in /etc/modprobe.conf.local in my RC3 installation with original kernel and rebooted 2 times: No stradis module was loaded. And I also run the kotds on a SuSE 10.0 installation and the problems with the stradis driver never showed up. So only the conditions in the early installation phase of 10.1 seem to trigger the problem.

I modprobed the stradis driver on my 10.0 installation with online-updated kernel, then rmmod'ed it. The system became inoperative, not immediately but about 30 sec later, after I had switched to graphical mode and moved the mouse. To test your objection, I made a fresh install of 10.1 Beta3, and on reboot after CD1 I started my 10.0 installation and installed the kotd in that 10.1 partition, then rebooted and continued the installation with CDs 2-5: no problem, and there "are" traces in boot.log that the stradis module is probed at boot time. vdr seems to work as expected. I think the problem is at least improving.

Gerd, can you take a quick look at this?

The device probing it does looks a bit scary: it grabs every saa7146 device instead of checking PCI Subsystem IDs. That can't work and should be fixed. But I have no idea what the PCI subsystem IDs are. Google doesn't find me anything, and /usr/share/pci.ids doesn't have it either :-/

As far as I know that piece of hardware is quite old, was expensive and is probably very rare. So simply disabling the driver in the kernel config, or maybe better only blacklisting it by default, is the easiest way to deal with it.

Blacklisting modules is done by module-init-tools these days, right?

Marian, it's stradis.ko

*** Bug 150251 has been marked as a duplicate of this bug. ***

Added stradis to blacklist.

As "my" Bug 150251, "Kernel panic - not syncing: Fatal exception in interupt" on 10.1 beta2/beta3, has been marked as a duplicate of this bug, I'll continue here.
To repeat, on this same hardware and disk I've installed and run SuSE Linux 9.0-9.3 Professional and jds2/SLES8 over the last years. Currently I'm running SuSE 10.0 and jds3/SLES9 (beta) beside Win2k, in a multiboot configuration without any hardware or driver problems. This kernel panic problem started recently when I tried to install 10.1 beta2 and continued with beta3 installed on the same hardware.
As the bug looks to be related to the mentioned "Stradis" driver in 10.1 beta, I searched the web (Stradis home page http://www.stradis.com/), which told me that Stradis has long been a standard-definition MPEG and MPEG-2 video decoder of choice for applications requiring true broadcast quality. This made me think that the Linux Stradis driver problem in my case may be related to my Pinnacle DV500 DVD video capture and editing card (for Windows). Even though this 32-bit busmastering PCI card (+ a Breakout Box and IEEE 1394 DV cable) is now 5 years old, it was and still is quite capable of capturing both analog S-Video and DV, besides its MPEG-2 import. Some more description of the DV500 DVD card and software product is available at e.g. these links:
http://www.videoguys.com/dv500DVD.html
http://www.tomshardware.com/2001/08/01/building_a_digital_video_capture_system_/page13.html
http://www.pinnaclesys.com/WebVideo/dv500dvd/English(US)/doc/DV500_Datasheet.pdf
In case it can be of some help for debugging this Stradis driver problem to compare hwinfo on 10.0 (which works ok) and 10.1b3 (taken once I was able to do a tty login in failsafe mode, I think; otherwise I only get a black login screen), I'll attach both here. Besides, I also attach 10.1b3 /var/log/boot.msg and /var/log/boot.omsg.
Terje J. Hanssen

I would like my comment above and my attachments below to be evaluated, as I think it looks too easy to just blacklist stradis, which has worked on all SuSE releases before 10.1.
Terje J. Hanssen

Created attachment 69987 [details]
10.0 - hwinfo
Created attachment 69988 [details]
10.1b3 - hwinfo
Created attachment 69990 [details]
10.1b3 - /var/log/boot.msg
Created attachment 69991 [details]
10.1b3 - /var/log/boot.omsg
The stradis driver just didn't get autoloaded via hotplug on older distro versions (none of the v4l drivers were), that's why the bug didn't trigger. If you manually load the driver using "modprobe stradis" on the 10.0 installation it likely blows up too. The problem is that the stradis driver actually tries to handle every card with an saa7146 chip on it, no matter whether it actually is a stradis card or something else using the same chip (like your pinnacle card). And because of that behavior we simply can't autoload it, thus the blacklist entry. Owners of a stradis card can still load it manually and it probably even works. We can't verify that due to lack of hardware though.

Although this doesn't pose any problem in my installation, I'd like to signal that stradis is not blacklisted as of beta5. So I do not reopen the bug. Still getting:
<4>videodev: "SAA7146A" has no release callback. Please fix your driver for proper sysfs support, see http://lwn.net/Articles/36850/
<4>stradis0: config = 00 03 13 c2 26 0f ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
in /var/log/boot.msg
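
For readers following the diagnosis in this report (the driver claims every card carrying an SAA7146 instead of checking PCI subsystem IDs), here is a minimal Python sketch of the difference between a chip-only match and a subsystem-aware match. It is only an illustration of the matching idea, not the kernel's code and not the eventual fix. The vendor/device pair 0x1131/0x7146 is the Philips SAA7146 entry from the public pci.ids database; the names `PciDevice` and `MatchEntry` and all subsystem ID values are hypothetical placeholders, since the real Stradis subsystem IDs are not known in this thread.

```python
from typing import NamedTuple, Optional

# Philips SAA7146 vendor/device IDs (public pci.ids database).
SAA7146_VENDOR = 0x1131
SAA7146_DEVICE = 0x7146


class PciDevice(NamedTuple):
    """IDs a PCI function reports; subvendor/subdevice identify the board."""
    vendor: int
    device: int
    subvendor: int
    subdevice: int


class MatchEntry(NamedTuple):
    """One row of a driver's ID table; None acts as a wildcard."""
    vendor: int
    device: int
    subvendor: Optional[int] = None
    subdevice: Optional[int] = None

    def matches(self, dev: PciDevice) -> bool:
        # Chip IDs must be equal; subsystem IDs are compared only if given.
        return (
            self.vendor == dev.vendor
            and self.device == dev.device
            and self.subvendor in (None, dev.subvendor)
            and self.subdevice in (None, dev.subdevice)
        )


# Placeholder subsystem IDs, invented for this example only.
HYPOTHETICAL_STRADIS = PciDevice(SAA7146_VENDOR, SAA7146_DEVICE, 0xAAAA, 0x0001)
HYPOTHETICAL_DVB_CARD = PciDevice(SAA7146_VENDOR, SAA7146_DEVICE, 0xBBBB, 0x0002)

# Chip-only entry: claims any board that carries an SAA7146.
loose = MatchEntry(SAA7146_VENDOR, SAA7146_DEVICE)
# Subsystem-aware entry: claims only the one intended board.
strict = MatchEntry(SAA7146_VENDOR, SAA7146_DEVICE, 0xAAAA, 0x0001)

for name, dev in [("stradis", HYPOTHETICAL_STRADIS), ("dvb", HYPOTHETICAL_DVB_CARD)]:
    print(name, "loose:", loose.matches(dev), "strict:", strict.matches(dev))
```

With the chip-only entry both boards match, which corresponds to the situation described above where the driver gets bound to a DVB or capture card. With the subsystem-aware entry only the intended board matches, but that requires knowing the real subsystem IDs, which is why blacklisting was chosen here instead.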