Bug 669556 - 11.4 m6 often crashing on boot
Summary: 11.4 m6 often crashing on boot
Status: RESOLVED INVALID
Alias: None
Product: openSUSE 11.4
Classification: openSUSE
Component: Kernel
Version: Milestone 6 of 6
Hardware: 32bit Other
Priority: P5 - None; Severity: Major; 1 vote
Target Milestone: ---
Assignee: E-mail List
QA Contact: E-mail List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-02-04 14:12 UTC by Roger Whittaker
Modified: 2011-04-20 09:08 UTC
CC List: 3 users

See Also:
Found By: Community User
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Attachments
photo of screen after crash on boot (730.99 KB, image/jpeg)
2011-02-04 14:20 UTC, Roger Whittaker
output of hwinfo (423.43 KB, application/octet-stream)
2011-02-04 14:26 UTC, Roger Whittaker
hwinfo for MSI S271 notebook (OpenSUSE 11.4 Factory) (274.31 KB, text/plain)
2011-02-10 16:45 UTC, Vadim Plessky
dmesg for MSI S271 notebook (boot with kernel 2.6.37-RC5) (49.30 KB, text/plain)
2011-02-10 16:49 UTC, Vadim Plessky
OpenSUSE 11.4 MS6 boot - core dump (photo) (933.03 KB, image/jpeg)
2011-02-11 12:23 UTC, Vadim Plessky
OpenSUSE 11.4 MS6 boot - core dump on Acer 6935G - photo (864.09 KB, image/jpeg)
2011-02-11 12:49 UTC, Vadim Plessky
Hardware info - Acer 6935G (notebook) (754.55 KB, text/plain)
2011-02-11 12:55 UTC, Vadim Plessky
dmesg from Rawhide (20110214.15) boot, kernel 2.6.38 (71.67 KB, text/plain)
2011-02-15 22:00 UTC, Vadim Plessky

Description Roger Whittaker 2011-02-04 14:12:37 UTC
Perhaps 3 times out of 4, with default boot options, m6 crashes during boot on my Samsung NP-N310 netbook.  I suspect i915 might be the problem as it doesn't seem to happen with the failsafe option where mode setting is disabled.  Unfortunately there's no serial port, so I can't get proper details, but I'll take a photo of the screen and upload shortly, as well as hwinfo output.
Comment 1 Roger Whittaker 2011-02-04 14:20:33 UTC
Created attachment 412281
photo of screen after crash on boot
Comment 2 Roger Whittaker 2011-02-04 14:22:41 UTC
When this happens, there's a wait after loading i915, followed by the crash.
Comment 3 Roger Whittaker 2011-02-04 14:26:23 UTC
Created attachment 412282
output of hwinfo
Comment 4 Roger Whittaker 2011-02-04 14:28:34 UTC
booting like this:

kernel /vmlinuz-2.6.37-20-default root=/dev/disk/by-id/ata-SAMSUNG_HM160HI_S1WWJH0S445408-part2 resume=/dev/disk/by-id/ata-SAMSUNG_HM160HI_S1WWJH0S445408-part3 splash=silent edd=off

I had switched off edd in the hope that it might improve matters.
Comment 5 Jeff Mahoney 2011-02-07 19:07:58 UTC
If you suspect modeset, can you boot with nomodeset?
Comment 6 Roger Whittaker 2011-02-08 09:12:57 UTC
Yes - after playing some more I'm fairly sure about this.  With nomodeset, it boots reliably.  Without, sometimes it boots OK, and sometimes it crashes (ratio of crashes to good boots maybe 3:1).  And the crash happens following a long wait after loading i915, just at the point where on a good boot the video mode would switch.
Comment 7 Vadim Plessky 2011-02-10 08:40:30 UTC
OpenSUSE 11.4 Milestone 6 was not booting at all for me (tested on an MSI S271 notebook and an Acer 6935G).

A couple of Factory releases after MS6 were crashing in the kernel during boot.

Downloaded today:
openSUSE-KDE-LiveCD-i686-Build1056-Media.iso
and tried it once more.

The kernel is crashing; I get into a continuous loop of core dumps on the console.

OpenSUSE 11.3 works fine on this notebook.
I also have an install of 11.4 Milestone 4/5 on this notebook.
It boots fine with kernel 2.6.37-rc5-16, but doesn't boot with 2.6.37-30 (Milestone 5).
Comment 8 Vadim Plessky 2011-02-10 08:47:34 UTC
Selecting "Kernel Safe Options" on the MSI S271 notebook with Build1056 results in a kernel crash:

 pid: 1428, comm: clicfs Not tainted 2.6.37-20-default
 EIP: 0060 [<c032bb9e>] EFLAGS: 00010246 CPU: 1
 EIP is at __mark_inode_dirty+0xfe/0x1c0

The stack and call trace then follow.
Comment 9 Vadim Plessky 2011-02-10 08:49:29 UTC
This problem has also been discussed in bug:
https://bugzilla.novell.com/show_bug.cgi?id=668141
Comment 10 Vadim Plessky 2011-02-10 08:58:07 UTC
With 'nomodeset' I also get a crash, but in a different place now:

 EXT3-fs (sdb4): I/O error while writing superblock
 EXT3-fs (sdb4): error in ext3_reserve_inode_write: IO failure
 EXT3-fs (sdb4): I/O error while writing superblock
 BUG: unable to handle kernel NULL pointer dereference at 00000010
 IP: [<c032bb9e>] __mark_inode_dirty+0xfe/0x1c0
 *pde = 00000000
 Oops: 0000 [#1] SMP
 last sysfs file:  /sys/devices/pci0000:00/0000:00:14.2/uevent
...
Comment 11 Vadim Plessky 2011-02-10 16:45:03 UTC
Created attachment 413358
hwinfo for MSI S271 notebook (OpenSUSE 11.4 Factory)


Booted with kernel:

Linux linux-msi-s271.site 2.6.37-rc5-16-default #1 SMP 2010-12-09 16:04:57 +0100 i686 athlon i386 GNU/Linux

This is the last bootable kernel from the recent (MS4-MS6) milestones.
Comment 12 Vadim Plessky 2011-02-10 16:49:02 UTC
Created attachment 413362
dmesg for MSI S271 notebook (boot with kernel 2.6.37-RC5)


dmesg for MSI S271 notebook (OpenSUSE 11.4 Factory) boot

:~> uname -a
Linux linux-msi-s271.site 2.6.37-rc5-16-default #1 SMP 2010-12-09 16:04:57 +0100 i686 athlon i386 GNU/Linux

The system boots with this kernel without problems;
networking and graphics work fine.
Comment 13 Jeff Mahoney 2011-02-10 19:07:51 UTC
Can you attach the full trace? Just __mark_inode_dirty isn't enough to debug it. A full capture would be ideal, but a photo would work as well.
Comment 14 Vadim Plessky 2011-02-10 19:42:13 UTC
(In reply to comment #13)
> Can you attach the full trace? Just __mark_inode_dirty isn't enough to debug
> it. A full capture would be ideal, but a photo would work as well.

Jeff, please advise how I can get the full trace. As OpenSUSE doesn't boot, I can't reach a command prompt.
Comment 15 Jeff Mahoney 2011-02-10 19:44:56 UTC
You quoted the beginning of it in comment #10. Boot without splash=silent or quiet.

Ideally a serial console could be used to capture it but failing that, a photo will do.
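
A minimal sketch of the edit described above, assuming the boot options are available as a single string: it drops splash=silent and quiet so that kernel messages, including the full oops, are printed on the console, and it can optionally append an option such as nomodeset. The function name edit_cmdline and the sample device paths are hypothetical and used only for illustration.

```python
def edit_cmdline(cmdline, drop=("splash=silent", "quiet"), add=()):
    """Return cmdline without the options in `drop`, with `add` appended once."""
    # Keep every whitespace-separated option that is not explicitly dropped.
    kept = [opt for opt in cmdline.split() if opt not in drop]
    # Append requested options, skipping any that are already present.
    for opt in add:
        if opt not in kept:
            kept.append(opt)
    return " ".join(kept)


if __name__ == "__main__":
    # Hypothetical options; the real ones come from the boot loader entry.
    original = "root=/dev/sda2 resume=/dev/sda3 splash=silent quiet edd=off"
    print(edit_cmdline(original))
    print(edit_cmdline(original, add=("nomodeset",)))
```

The same change can be made by hand by editing the kernel line in the boot loader before booting.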
Comment 16 Vadim Plessky 2011-02-11 12:23:01 UTC
Created attachment 413552
OpenSUSE 11.4 MS6 boot - core dump (photo)


OpenSUSE 11.4 MS6 boot - core dump (photo)
System: MSI S271 notebook
Boot parameters: nomodeset
Comment 17 Vadim Plessky 2011-02-11 12:49:55 UTC
Created attachment 413555
OpenSUSE 11.4 MS6 boot - core dump on Acer 6935G - photo


OpenSUSE 11.4 MS6 boot - core dump on Acer 6935G (notebook)

The Acer 6935G has an Intel CPU (T9400) and an Intel chipset,
so the boot problem is not related to the CPU or chipset.

(The MSI S271 in the previous photo has an AMD CPU and an AMD chipset.)
Comment 18 Vadim Plessky 2011-02-11 12:55:49 UTC
Created attachment 413556
Hardware info - Acer 6935G (notebook)


Hardware info - Acer 6935G (notebook) - for second photo (core dump)
taken from OpenSUSE 11.3
Comment 19 Vadim Plessky 2011-02-11 12:58:36 UTC
Comment on attachment 413556
Hardware info - Acer 6935G (notebook)


"Intel(R) Core(TM)2 Duo CPU T9400 @ 2.53GHz"
Comment 20 Vadim Plessky 2011-02-15 22:00:11 UTC
Created attachment 414288
dmesg from Rawhide (20110214.15) boot, kernel 2.6.38


To make sure that the problem is not related to recent kernels (2.6.37.x and up), I downloaded Fedora Rawhide 20110214.15 and booted it from a USB stick on the MSI S271 notebook.
Fedora Rawhide has kernel 2.6.38-0.rc4.git0.2.fc15.i686

I got as far as the graphical screen, where a flashing window says that there was a crash in 'mutter' (filed bug report: https://bugzilla.redhat.com/show_bug.cgi?id=677810 ).
But the boot process went through.

Hope this information is helpful.

Looking forward to testing a new snapshot with an updated kernel! :)
Comment 21 Jeff Mahoney 2011-02-15 22:10:54 UTC
By using rawhide, you're taking the fault out of the equation. The crash involved clicfs, which is a FUSE file system.

Please open a new report for this since it's different than the original issue.
Comment 22 Vadim Plessky 2011-02-16 07:56:23 UTC
Jeff, thanks for the explanation.
Will the problem with the clicfs/FUSE file system be fixed in OpenSUSE soon?

My understanding is that OpenSUSE is at the RC1 stage.
But it's not even possible to boot the system right now on the two computers I have for tests like this.
Comment 23 Vadim Plessky 2011-02-27 08:23:05 UTC
I successfully booted OpenSUSE 11.4 RC2 (x86) KDE LiveCD on Acer 6935G.
Comment 24 Vadim Plessky 2011-02-27 08:38:13 UTC
On the MSI S271, OpenSUSE 11.4 RC2 doesn't crash,
but it also doesn't boot to login:

...
 squashfs: version 4.0
 fuse init (API version 7.15)
 kjournald starting. Commit interval 15 sec.
 EXT3-fs (sdb4): mounted filesystem with ordered data mode
.. 
/etc/initscript: line 77: /etc/config/ulimit: Input/output error
Id "2" respawning too fast. disabled for 5 minutes
 
INIT: Id "3" respawning too fast. disabled for 5 minutes

Not sure whether I should file a separate bug report, or whether this is an already known issue that can be fixed ASAP.
Comment 25 Jeff Mahoney 2011-04-19 19:10:00 UTC
This report now has three separate bugs in it and is going to be tough to unravel. If any of them still exist, please open new reports.
Comment 26 Roger Whittaker 2011-04-20 08:38:39 UTC
I still see the behaviour I originally reported with the final 11.4 on the same hardware.  I'll open a new report as suggested.
Comment 27 Roger Whittaker 2011-04-20 09:08:42 UTC
https://bugzilla.novell.com/show_bug.cgi?id=688714