Bug 113778

Summary: Acer Travelmate 8100 Bootstop
Product: [openSUSE] SUSE LINUX 10.0
Reporter: Daniel Gramsch <info>
Component: Kernel
Assignee: Olaf Hering <ohering>
Status: RESOLVED FIXED
QA Contact: E-mail List <qa-bugs>
Severity: Critical
Priority: P5 - None
CC: behlert, frank_fischer, jari
Version: Beta 3
Target Milestone: ---
Hardware: i586
OS: All
Whiteboard:
Found By: Beta-Customer
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---
Attachments:
hwinfo for acer-travelmate 8100
Last module is ahci.ko
error by module ahci.ko
1. Screen of Messages
2. Screen of Messages
3. Screen of Messages
1. Screen of Modules
2. Screen of Modules
Dmesg of kernel 091105
PATCH for yenta init
dmesg of pci-debug kernel 091205
lspci -vvvx output
dmesg of kernel 2.6.12.4-default
Dmesg of kernel-default-2.6.13-20050913_yenta_subordinate
yenta_subordinate_oops.patch

Description Daniel Gramsch 2005-08-29 09:41:30 UTC
Hello from Germany,

With SuSE 9.3 my "Acer TravelMate 8104WLMi" boots correctly. With SuSE
10.0 beta 1, 2 and 3, booting stops at the step "Searching for info
file...". The notebook hangs completely.

http://www.ateo.de/download/suse/suse10beta3-acer-travelmate-8100-bootstop.jpg

Once this problem is solved, I still need to test the support for "ATI Mobility™
Radeon® X700 with 128MB PCIe".

http://www.acer.de/acereuro/wr-resource/932727457/upload/E0Entity3/1/TM8100_de170205final.pdf

Best Regards - Mario
Comment 1 Stanislav Visnovsky 2005-08-29 14:23:24 UTC
Please, attach the full output of hwinfo.  
Comment 2 Daniel Gramsch 2005-08-30 16:07:12 UTC
Created attachment 48196 [details]
hwinfo for acer-travelmate 8100
Comment 3 Steffen Winterfeldt 2005-09-02 09:22:06 UTC
Very likely a kernel problem. 
 
Look at console 3 for the last module(s) linuxrc loads and console 4 for 
kernel messages. 
Comment 4 Daniel Gramsch 2005-09-04 22:23:18 UTC
Created attachment 48735 [details]
Last module is ahci.ko
Comment 5 Daniel Gramsch 2005-09-04 22:25:19 UTC
Created attachment 48736 [details]
error by module ahci.ko
Comment 6 Olaf Kirch 2005-09-05 07:30:26 UTC
Oops, there's an oops in the yenta_socket module. 
 
Can you please try to capture more of this oops? When you switch to 
console 3 before the oops happens, you should be able to scroll back by 
half a screenful using Shift+PageUp. This should give us some additional 
information about where it oopsed. 
Comment 8 Daniel Gramsch 2005-09-05 12:09:55 UTC
Created attachment 48784 [details]
1. Screen of Messages
Comment 9 Daniel Gramsch 2005-09-05 12:10:45 UTC
Created attachment 48786 [details]
2. Screen of Messages
Comment 10 Daniel Gramsch 2005-09-05 12:11:27 UTC
Created attachment 48788 [details]
3. Screen of Messages
Comment 11 Daniel Gramsch 2005-09-05 12:12:37 UTC
Created attachment 48789 [details]
1. Screen of Modules
Comment 12 Daniel Gramsch 2005-09-05 12:13:20 UTC
Created attachment 48790 [details]
2. Screen of Modules
Comment 13 Daniel Gramsch 2005-09-05 12:22:26 UTC
Is this bug responsible for the black screen and the complete crash when the
system is booting and ACPI is not disabled?
Comment 14 Chris L Mason 2005-09-05 12:25:27 UTC
Since you need to test other components with this machine, you can probably skip the  
pcmcia init by running insserv -r pcmcia.  
  
Then, when the machine is booted you can run 'rcpcmcia start' to do the pcmcia probe, 
or experiment with different pcmcia drivers. 
 
(this way we can at least work on the kernel side in parallel with your other testing). 
Comment 17 Daniel Gramsch 2005-09-05 13:01:16 UTC
Sorry, skipping the pcmcia init by running "insserv -r pcmcia" (CD boot options:
acpi=off insserv -r pcmcia) does not help. The system stops at the same step.

How can I merge RPMs into the beta 4 ISO?
Comment 18 Olaf Hering 2005-09-05 13:08:32 UTC
does it boot with safe settings? 

try to boot with this kernel cmdline option:

BrokenModules=yenta_socket

and pray that linuxrc and yast do really ignore them.
Comment 19 Christian Zoz 2005-09-05 13:27:28 UTC
And in the installed system you have to add yenta_socket to /etc/hotplug/blacklist.
There is no more /etc/init.d/pcmcia.
To trigger loading of pcmcia stuff later just modprobe yenta_socket.
Comment 20 Daniel Gramsch 2005-09-05 14:17:24 UTC
Booting with safe settings and the option "... BrokenModules=yenta_socket"
works. Tonight I will start downloading ISOs 2-5; then we will test the
other functions.

Do the patches from ftp.suse.com/pub/people/olh/kernel/bug115118/ have to be
installed after the complete system installation?
Comment 21 Olaf Hering 2005-09-05 14:19:29 UTC
yes, this is a replacement kernel for the one on the CDs.
Comment 22 Michael Gross 2005-09-06 09:10:13 UTC
This is a problem that affects only a small number of users, hence reducing
severity to `critical'.
Comment 23 Jens Axboe 2005-09-06 18:24:56 UTC
*** Bug 112941 has been marked as a duplicate of this bug. ***
Comment 24 Olaf Kirch 2005-09-07 09:05:19 UTC
It dies in yenta_config_init because dev->subordinate is NULL.
It's unclear to me how that could happen.
  
The boot messages for SL9.3 are also rather interesting - right after the 
yenta IRQ probe it reports "irq 11: nobody cared!" and disables that IRQ. 
It would seem cardbus cards never worked on that type of laptop. 
Comment 25 Hubert Mantel 2005-09-08 13:09:29 UTC
Olaf, do you have an idea who could continue investigating this one?
Comment 26 Olaf Kirch 2005-09-09 07:03:22 UTC
I think this belongs with the mobile devices team. 
 
Greg KH, something seems to go wrong when scanning the cardbus bridge. 
Can you give a hint where to look next? 
Comment 27 Christian Zoz 2005-09-09 07:38:30 UTC
There is no one in the mobile devices team with sufficient kernel knowledge to
investigate this kernel bug properly. Olaf, please assign it to any kernel
developer.
Comment 29 Olaf Hering 2005-09-09 08:58:24 UTC
argh, can the reporter please get the damn thing installed and upgrade to
ftp.suse.com/pub/projects/kernel/kotd/i386/HEAD/kernel-default.i586.rpm
I'm almost sure this will fix it.
Comment 30 Olaf Kirch 2005-09-09 09:29:22 UTC
Olaf H, will you track this please? Thanks! 
Comment 31 Frank Fischer 2005-09-12 04:28:34 UTC
Created attachment 49559 [details]
Dmesg of kernel 091105
Comment 32 Olaf Hering 2005-09-12 15:20:01 UTC
thanks, can you attach lspci -vvvvx?
Comment 33 Olaf Hering 2005-09-12 15:27:32 UTC
the only place where pci_dev->subordinate is set is in pci_alloc_child_bus().

either pci_alloc_child_bus wasn't called or pci_alloc_bus returned NULL.
Comment 34 Olaf Hering 2005-09-12 15:35:26 UTC
this is similar to http://lkml.org/lkml/2005/9/2/283
Comment 35 Marcus Wegner 2005-09-12 19:15:46 UTC
Created attachment 49673 [details]
PATCH for yenta init

I had the same problem with an ACER 8101 WLMI. I partly reverted the
initialization and it works now, but I don't know if it really fixes the bug.
Something seems to be wrong in the PCI init code.
Comment 36 Olaf Hering 2005-09-12 21:55:04 UTC
please try ftp.suse.com/pub/people/olh/kernel/bug113778/ , it has PCI_DEBUG=y.

a863cc8cce6e0d3d7939c19c388149cc  kernel-default-2.6.13-20050912_CONFIG_PCI_DEBUG.i586.rpm
3f593e7c428e058829f16ba353c2a224  kernel-default-2.6.13-20050912_CONFIG_PCI_DEBUG.ia64.rpm
dfd382a128344075a4302bec4fa52522  kernel-default-2.6.13-20050912_CONFIG_PCI_DEBUG.nosrc.rpm
d81768a9d84916a4bd73ee3264dc099e  kernel-default-2.6.13-20050912_CONFIG_PCI_DEBUG.ppc.rpm
79f2af93aac531e6fad8a96359048cc0  kernel-default-2.6.13-20050912_CONFIG_PCI_DEBUG.x86_64.rpm
321d2ae7be027237576eb63e1bcdcae6  kernel-default-debuginfo-2.6.13-20050912_CONFIG_PCI_DEBUG.i586.rpm
b78bf871b99ee3f4ecce22e30b1ded01  kernel-default-debuginfo-2.6.13-20050912_CONFIG_PCI_DEBUG.x86_64.rpm
233800c82908fc8fd4a039d43d999209  kernel-default-nongpl-2.6.13-20050912_CONFIG_PCI_DEBUG.i586.rpm
d61a342e5166cddce7e418b86b27d679  kernel-default-nongpl-2.6.13-20050912_CONFIG_PCI_DEBUG.ia64.rpm
7fc51282fbdf44a5b081f510e6f7bbe9  kernel-default-nongpl-2.6.13-20050912_CONFIG_PCI_DEBUG.ppc.rpm
4f2ac2c7bec3dd84bdfa363ff8365622  kernel-default-nongpl-2.6.13-20050912_CONFIG_PCI_DEBUG.x86_64.rpm
500d620e27c765cc5bd823117938202a  kernel-default.changes
Comment 37 Frank Fischer 2005-09-13 02:39:48 UTC
Created attachment 49715 [details]
dmesg of pci-debug kernel 091205
Comment 38 Frank Fischer 2005-09-13 02:42:36 UTC
Created attachment 49716 [details]
lspci -vvvx output
Comment 39 Frank Fischer 2005-09-13 03:16:48 UTC
Created attachment 49717 [details]
dmesg of kernel 2.6.12.4-default

This is the dmesg of my home-brewed kernel 2.6.12.4. There the yenta-socket
seems to initialize correctly, though I don't have any cardbus devices to test
it. With my 2.6.13-rc6-git11 it doesn't work. Maybe this information helps.
Comment 40 Olaf Kirch 2005-09-13 09:04:13 UTC
The commit that's being backed out by the patch in comment #35 is 
cc57450f5c044270d2cf1dd437c1850422262109 
 
[PATCH] acpi bridge hotadd: Prevent duplicate bus numbers when scanning PCI 
bridge 
  
 When hot-plugging a root bridge, as we try to assign bus numbers we may find
 that the hotplugged hierarchy has more PCI to PCI bridges (i.e. bus
 requirements) than available. Make sure we don't step over an existing bus
 when that happens.
 
 
Comment 41 Olaf Kirch 2005-09-13 09:13:33 UTC
In theory, 06:09.3 should have a secondary bus number of 9. But we have  
already assigned that number to 00:1c.0, so it doesn't get any bus id.  
  
06:09.3 CardBus bridge: O2 Micro, Inc. OZ711M3/MC3 4-in-1 MemoryCardBus Controller
	[...] 
	Bus: primary=00, secondary=00, subordinate=00, sec-latency=0  
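
For readers who want to check the same fields in a raw config-space dump (such as the hex output of lspci with -x), the following is a minimal sketch. It assumes the standard bridge header layout, where the primary, secondary and subordinate bus numbers and the secondary latency timer sit at offsets 0x18 to 0x1b. The struct and function names are hypothetical, and treating "secondary and subordinate both zero" as "unassigned" is a simplifying assumption for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Offsets of the bus-number registers in a bridge's config header. */
enum {
    CFG_PRIMARY_BUS     = 0x18,
    CFG_SECONDARY_BUS   = 0x19,  /* the CardBus bus number on CardBus bridges */
    CFG_SUBORDINATE_BUS = 0x1a,
    CFG_SEC_LATENCY     = 0x1b,
    CFG_MIN_LEN         = 0x1c   /* bytes needed to cover all four registers */
};

/* Hypothetical container for the four decoded bytes. */
struct bridge_buses {
    uint8_t primary;
    uint8_t secondary;
    uint8_t subordinate;
    uint8_t sec_latency;
};

/* Extract the bus-related bytes from a raw config-space dump. */
static bool read_bridge_buses(const uint8_t *cfg, size_t len,
                              struct bridge_buses *out)
{
    assert(cfg != NULL && out != NULL);
    if (len < (size_t)CFG_MIN_LEN)
        return false;            /* dump too short to contain the registers */
    out->primary     = cfg[CFG_PRIMARY_BUS];
    out->secondary   = cfg[CFG_SECONDARY_BUS];
    out->subordinate = cfg[CFG_SUBORDINATE_BUS];
    out->sec_latency = cfg[CFG_SEC_LATENCY];
    return true;
}

/* Assumption: a bridge with secondary=00 and subordinate=00 has no bus range. */
static bool buses_unassigned(const struct bridge_buses *b)
{
    return b->secondary == 0 && b->subordinate == 0;
}
```

Fed the bytes behind the `Bus:` line shown above, `buses_unassigned` would report true for this bridge.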
  
Comment 42 Olaf Kirch 2005-09-13 10:12:52 UTC
Does using "pci=assign-busses" on the kernel command line help? 
Comment 43 Frank Fischer 2005-09-13 16:37:50 UTC
pci=assign-buses makes the debug kernel stop at
PCI: Ignoring BAR0-3 of IDE controller 0000:00:1f.2
Comment 44 Olaf Hering 2005-09-13 16:41:55 UTC
there are new rpms on the ftp server, they just check if dev->subordinate is
NULL and return.

ae6d3e7ac7ae08c0d33f190bef4455ee  kernel-default-2.6.13-20050913_yenta_subordinate.i586.rpm
e84ba3542aa392e400abe942ec6f7ab3  kernel-default-2.6.13-20050913_yenta_subordinate.nosrc.rpm
0a73d58a4724dd4b27336fb3c46b27d6  kernel-default-2.6.13-20050913_yenta_subordinate.x86_64.rpm
2eb3aad6f228987c4829e7ffb290c889  kernel-default-debuginfo-2.6.13-20050913_yenta_subordinate.i586.rpm
42f7071b2a37602396b28d6aaf269692  kernel-default-debuginfo-2.6.13-20050913_yenta_subordinate.x86_64.rpm
2352d115e98fc31a7eca1bc53463b625  kernel-default-nongpl-2.6.13-20050913_yenta_subordinate.i586.rpm
420e96a1690749af876d062816e73a7f  kernel-default-nongpl-2.6.13-20050913_yenta_subordinate.x86_64.rpm
1152ba7c2e92d221385f9b656e9b51a7  kernel-default.changes
Comment 45 Frank Fischer 2005-09-14 01:34:04 UTC
Created attachment 49855 [details]
Dmesg of kernel-default-2.6.13-20050913_yenta_subordinate
Comment 46 Olaf Hering 2005-09-14 16:11:01 UTC
Created attachment 49920 [details]
yenta_subordinate_oops.patch

added this patch to cvs.
Comment 47 Olaf Hering 2005-09-14 16:11:44 UTC
added this patch, might now make it into 10.0 final.

+- add patches.fixes/yenta_subordinate_oops.patch
+  check for NULL pointer to fix crash when BIOS left some bus numbers
+  unassigned. (113778)
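
The idea behind the changelog entry can be sketched in user space as follows. This is not the attached patch: the types `mock_bus` and `mock_dev`, the function `mock_config_init` and the `-ENODEV` return value are all hypothetical, chosen only to show the shape of a NULL guard placed before the first dereference of `subordinate`.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical user-space mock-ups; the real structures live in the kernel. */
struct mock_bus {
    int number;
    int configured;
};

struct mock_dev {
    struct mock_bus *subordinate;  /* NULL when no child bus was created */
};

/*
 * Configure the child bus behind a bridge. Without the guard, the write
 * through dev->subordinate below would dereference NULL on machines
 * where the bridge never received a bus number.
 */
static int mock_config_init(struct mock_dev *dev)
{
    assert(dev != NULL);
    if (dev->subordinate == NULL)
        return -ENODEV;            /* nothing to configure; bail out early */
    dev->subordinate->configured = 1;
    return 0;
}
```

A guard like this only avoids the crash. The bridge still has no usable child bus, so cards behind it would not work until the bus numbering itself is sorted out.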