Bug 140089 - Cannot Shutdown or Reboot
Summary: Cannot Shutdown or Reboot
Status: RESOLVED INVALID
Alias: None
Product: SUSE LINUX 10.0
Classification: openSUSE
Component: Kernel (show other bugs)
Version: Final
Hardware: x86-64 SuSE Linux 10.0
Priority: P5 - None  Severity: Major
Target Milestone: ---
Assignee: E-mail List
QA Contact: E-mail List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2005-12-18 06:10 UTC by William Henson
Modified: 2006-03-06 14:38 UTC (History)
0 users

See Also:
Found By: Customer
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Attachments
hwinfo output piped to file as requested (203.93 KB, text/plain)
2005-12-21 01:47 UTC, William Henson

Description William Henson 2005-12-18 06:10:05 UTC
System Hardware Specs:

Celeron D 336 2.8 GHz (64-bit)
Asus P5LD2-VM mobo (using GMA 950 onboard video)
1 GB Corsair DDR2 RAM
74 GB WD Raptor SATA HDD
Linksys WUSB11 802.11/b
LiteOn CDRW/DVDRW Combo drive (PATA)
Generic wheel mouse (PS2)
Generic 104 keyboard (PS2)
LCD monitor

OS:
Suse 10 x86-64 (Retail DVD used for installation)
uname -a
Linux linux 2.6.13-15.7-default #1 Tue Nov 29 14:32:29 UTC 2005 x86_64 x86_64 x86_64 GNU/Linux


Problem/Bug


1) Normal shutdown or reboot does not work by any method. It gets as far as terminating the GUI, and then the system freezes at a black screen, totally unresponsive.
2) A command line "halt" has the same result.
3) A command line "init 0" has the same result.
4) Setting acpi=off renders the system dead: Suse will not load after POST.
5) The last kernel update specifically mentions including a fix "for some machines" for shutdown/reboot. Looks like maybe not all are covered by the bug fix.
6) Different distros, albeit 32-bit, shut down and reboot flawlessly on this same machine.
7) Removing the USB device (WUSB11) makes no difference. Shutdown/reboot crashes exactly the same.
8) I have NOT attempted a BIOS update yet. The motherboard is brand new and there is only ONE BIOS update newer than what is on the board. According to the Asus site, that update simply adds support for "new CPUs".
9) I've been all over the Suse forums and also searched the bug reports. This problem seems to be not uncommon and to have been around for a while. Shutdown/reboot issues may not be considered critical or even major, technically speaking; they are, however, very upsetting and, I would think, potentially damaging in some regard.

Thanks for any advice. Additional supporting data follows:

Logs

Here's an extract from boot.omsg (everything from when shutdown was ordered)

boot.omsg

Master Resource Control: previous runlevel: 5, switching to runlevel: 0
<notice>killproc: kill(6230,15)
<notice>killproc: kill(6291,15)
Shutting down cupsddone
<notice>killproc: kill(6314,15)
Shutting down Clam AntiVirus daemon done
Shutting down Clam AntiVirus database update daemon done
Shutting down mdnsd <notice>killproc: kill(5953,2)
done
<notice>killproc: kill(6119,15)
<notice>killproc: kill(6395,15)
Saving random seeddone
<notice>killproc: kill(6229,15)
Umount SMB/ CIFS File Systems done
Shutting down Name Service Cache Daemondone
<notice>killproc: kill(4830,15)
Shutting down SMPPPD

/var/log/messages

Dec 13 14:13:29 linux gconfd (root-25913): GConf server is not in use, shutting down.
Dec 13 14:13:29 linux gconfd (root-25913): Exiting
Dec 13 14:13:36 linux kernel: mtrr: 0xd0000000,0x10000000 overlaps existing 0xd0000000,0x400000
Dec 13 14:14:00 linux dhcpcd[25368]: terminating on signal 11
Dec 13 14:14:00 linux dhcpcd[25368]: terminating on signal 15
Dec 13 14:14:01 linux kernel: Unable to handle kernel paging request at ffff810014ed8000 RIP: 
Dec 13 14:14:01 linux kernel: <ffffffff802238e2>{clear_page+18}
Dec 13 14:14:01 linux kernel: PGD 8063 PUD 9063 PMD 8000000014e001e3 BAD
Dec 13 14:14:01 linux kernel: Oops: 000b [1] 
Dec 13 14:14:01 linux kernel: CPU 0 
Dec 13 14:14:01 linux kernel: Modules linked in: nls_iso8859_1 nls_cp437 nls_utf8 udf at76c503_rfmd firmware_class at76c503 at76_usbdfu joydev hfsplus vfat fat subfs ipt_pkttype ipt_LOG ipt_limit freq_table snd_pcm_oss snd_mixer_oss snd_seq snd_seq_device button battery ac af_packet ide_cd cdrom floppy i2c_i801 i2c_core e1000 ehci_hcd generic uhci_hcd usbcore edd snd_hda_intel snd_hda_codec snd_pcm snd_timer snd soundcore snd_page_alloc shpchp pci_hotplug ip6t_REJECT ipt_REJECT ipt_state iptable_mangle iptable_nat iptable_filter ip6table_mangle ip_conntrack parport_pc ip_tables lp parport ip6table_filter ip6_tables ipv6 dm_mod reiserfs fan thermal processor sg it821x ata_piix libata piix sd_mod scsi_mod ide_disk ide_core
Dec 13 14:14:01 linux kernel: Pid: 3658, comm: TakeDevices Tainted: G     U 2.6.13-15.7-default
Dec 13 14:14:01 linux kernel: RIP: 0010:[<ffffffff802238e2>] <ffffffff802238e2>{clear_page+18}
Dec 13 14:14:01 linux kernel: RSP: 0000:ffff810039805d00  EFLAGS: 00010216
Dec 13 14:14:01 linux kernel: RAX: 0000000000000000 RBX: 0000000000000001 RCX: 000000000000003f
Dec 13 14:14:01 linux kernel: RDX: ffff810001493f40 RSI: 0000000000000000 RDI: ffff810014ed8000
Dec 13 14:14:01 linux kernel: RBP: ffff810001493f78 R08: 0000000000000000 R09: 0000000000000000
Dec 13 14:14:01 linux kernel: R10: 0000000000000000 R11: 0000000000000001 R12: ffff810001493f40
Dec 13 14:14:01 linux kernel: R13: 0000000000000001 R14: 0000000000000000 R15: 00000000000080d2
Dec 13 14:14:01 linux kernel: FS:  0000000000000000(0000) GS:ffffffff8049b800(0000) knlGS:0000000000000000
Dec 13 14:14:01 linux kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Dec 13 14:14:01 linux kernel: CR2: ffff810014e006c0 CR3: 000000001c33a000 CR4: 00000000000006e0
Dec 13 14:14:01 linux kdm: :0[4774]: Cannot execute reset script "/etc/X11/xdm/Xreset"
Dec 13 14:14:01 linux kernel: Process TakeDevices (pid: 3658, threadinfo ffff810039804000, task ffff81003b0621b0)
Dec 13 14:14:01 linux kernel: Stack: ffffffff8015ddb8 00002aaaab293000 0000000000000000 0000000000000246 
Dec 13 14:14:01 linux kernel:        ffffffff8015dd51 0000000000000246 ffffffff803adc68 0000000000000000 
Dec 13 14:14:01 linux kernel:        0000000000000000 00000000000080d2 
Dec 13 14:14:01 linux kernel: Call Trace:<ffffffff8015ddb8>{buffered_rmqueue+568} <ffffffff8015dd51>{buffered_rmqueue+465}
Dec 13 14:14:01 linux kernel:        <ffffffff8015df40>{__alloc_pages+256} <ffffffff801684f8>{do_no_page+248}
Dec 13 14:14:01 linux kernel:        <ffffffff8012092d>{do_page_fault+1165} <ffffffff8016cd5d>{do_mprotect+1549}
Dec 13 14:14:01 linux kernel:        <ffffffff8010f1dd>{error_exit+0} 
Dec 13 14:14:01 linux kernel: 
Dec 13 14:14:01 linux kernel: Code: 48 89 07 48 89 47 08 48 89 47 10 48 89 47 18 48 89 47 20 48 
Dec 13 14:14:01 linux kernel: RIP <ffffffff802238e2>{clear_page+18} RSP <ffff810039805d00>
Dec 13 14:14:01 linux kernel: CR2: ffff810014ed8000
Dec 13 14:14:01 linux kernel:  <1>Unable to handle kernel paging request at ffff810014fd57b0 RIP: 
Dec 13 14:14:01 linux kernel: <ffffffff801215d0>{__change_page_attr+224}
Dec 13 14:14:01 linux kernel: PGD 8063 PUD 9063 PMD 8000000014e001e3 BAD
Dec 13 14:14:01 linux kernel: Oops: 0009 [2] 
Dec 13 14:14:01 linux kernel: CPU 0 
Dec 13 14:14:01 linux kernel: Modules linked in: nls_iso8859_1 nls_cp437 nls_utf8 udf at76c503_rfmd firmware_class at76c503 at76_usbdfu joydev hfsplus vfat fat subfs ipt_pkttype ipt_LOG ipt_limit freq_table snd_pcm_oss snd_mixer_oss snd_seq snd_seq_device button battery ac af_packet ide_cd cdrom floppy i2c_i801 i2c_core e1000 ehci_hcd generic uhci_hcd usbcore edd snd_hda_intel snd_hda_codec snd_pcm snd_timer snd soundcore snd_page_alloc shpchp pci_hotplug ip6t_REJECT ipt_REJECT ipt_state iptable_mangle iptable_nat iptable_filter ip6table_mangle ip_conntrack parport_pc ip_tables lp parport ip6table_filter ip6_tables ipv6 dm_mod reiserfs fan thermal processor sg it821x ata_piix libata piix sd_mod scsi_mod ide_disk ide_core
Dec 13 14:14:01 linux kernel: Pid: 3636, comm: X Tainted: G     U 2.6.13-15.7-default
Dec 13 14:14:01 linux kernel: RIP: 0010:[<ffffffff801215d0>] <ffffffff801215d0>{__change_page_attr+224}
Dec 13 14:14:01 linux kernel: RSP: 0018:ffff81001ef3bdc8  EFLAGS: 00010282
Dec 13 14:14:01 linux kernel: RAX: 0000000014fd57b0 RBX: ffff81003d8f6000 RCX: 0000000014fd5163
Dec 13 14:14:01 linux kernel: RDX: 0000000014fd5000 RSI: 000000000003d8f6 RDI: ffff810000000000
Dec 13 14:14:01 linux kernel: RBP: ffff81003d8f6000 R08: 03fffffffffff000 R09: 8000000000000163
Dec 13 14:14:01 linux kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
Dec 13 14:14:01 linux kernel: R13: ffff810014fd57b0 R14: 0000000000000f60 R15: 8000000000000163
Dec 13 14:14:01 linux kernel: FS:  00002aaaab36a6e0(0000) GS:ffffffff8049b800(0000) knlGS:0000000000000000
Dec 13 14:14:01 linux kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Dec 13 14:14:01 linux kernel: CR2: ffff810014e00ea8 CR3: 000000000d5ca000 CR4: 00000000000006e0
Dec 13 14:14:01 linux kernel: Process X (pid: 3636, threadinfo ffff81001ef3a000, task ffff81003b640b30)
Dec 13 14:14:01 linux kernel: Stack: ffff810037c56f70 ffffffff80101810 ffff8100011aab30 ffffffff00000000 
Dec 13 14:14:01 linux kernel:        0000000000000000 0000000000000292 0000000000000000 ffff81003d8f6000 
Dec 13 14:14:01 linux kernel:        000000000003d8f6 0000000000000000 
Dec 13 14:14:01 linux kernel: Call Trace:<ffffffff80101810>{init_level4_pgt+2064} <ffffffff801219b0>{change_page_attr_addr+160}
Dec 13 14:14:01 linux kernel:        <ffffffff8027ec35>{unmap_page_from_agp+21} <ffffffff8027f028>{agp_generic_destroy_page+88}
Dec 13 14:14:01 linux kernel:        <ffffffff8027f31f>{agp_free_memory+111} <ffffffff8027dff8>{agp_release+200}
Dec 13 14:14:01 linux kernel:        <ffffffff8017a622>{__fput+194} <ffffffff80177998>{filp_close+104}
Dec 13 14:14:01 linux kernel:        <ffffffff80177f70>{sys_close+112} <ffffffff8010e91a>{system_call+126}
Dec 13 14:14:01 linux kernel:        
Dec 13 14:14:01 linux kernel: 
Dec 13 14:14:01 linux kernel: Code: 49 8b 4d 00 f6 c1 81 0f 84 c4 02 00 00 4c 89 ea 48 b8 ff ff 
Dec 13 14:14:01 linux kernel: RIP <ffffffff801215d0>{__change_page_attr+224} RSP <ffff81001ef3bdc8>
Dec 13 14:14:01 linux kernel: CR2: ffff810014fd57b0
Dec 13 14:14:01 linux kernel:  <1>Unable to handle kernel paging request at ffff810014fab040 RIP: 
Dec 13 14:14:01 linux kernel: <ffffffff80160e7b>{free_block+139}
Dec 13 14:14:01 linux kernel: PGD 8063 PUD 9063 PMD 8000000014e001e3 BAD
Dec 13 14:14:01 linux kernel: Oops: 000b [3] 
Dec 13 14:14:01 linux kernel: CPU 0 
Dec 13 14:14:01 linux kernel: Modules linked in: nls_iso8859_1 nls_cp437 nls_utf8 udf at76c503_rfmd firmware_class at76c503 at76_usbdfu joydev hfsplus vfat fat subfs ipt_pkttype ipt_LOG ipt_limit freq_table snd_pcm_oss snd_mixer_oss snd_seq snd_seq_device button battery ac af_packet ide_cd cdrom floppy i2c_i801 i2c_core e1000 ehci_hcd generic uhci_hcd usbcore edd snd_hda_intel snd_hda_codec snd_pcm snd_timer snd soundcore snd_page_alloc shpchp pci_hotplug ip6t_REJECT ipt_REJECT ipt_state iptable_mangle iptable_nat iptable_filter ip6table_mangle ip_conntrack parport_pc ip_tables lp parport ip6table_filter ip6_tables ipv6 dm_mod reiserfs fan thermal processor sg it821x ata_piix libata piix sd_mod scsi_mod ide_disk ide_core
Dec 13 14:14:01 linux kernel: Pid: 3636, comm: X Tainted: G     U 2.6.13-15.7-default
Dec 13 14:14:01 linux kernel: RIP: 0010:[<ffffffff80160e7b>] <ffffffff80160e7b>{free_block+139}
Dec 13 14:14:01 linux kernel: RSP: 0018:ffff81001ef3bac8  EFLAGS: 00010012
Dec 13 14:14:01 linux kernel: RAX: ffff810014fab040 RBX: ffff81003f79da80 RCX: 000000000043d738
Dec 13 14:14:01 linux kernel: RDX: ffff810015f6a040 RSI: ffff810013621040 RDI: ffff810013621778
Dec 13 14:14:01 linux kernel: RBP: ffff81003f7960b0 R08: 00000000000000b0 R09: 0000000000000000
Dec 13 14:14:01 linux kernel: R10: 0000000000000034 R11: ffffffff8015edc0 R12: ffff81003f79da90
Dec 13 14:14:01 linux kernel: R13: 0000000000000014 R14: 000000000000003c R15: ffff81003f79dab0
Dec 13 14:14:01 linux kernel: FS:  0000000000000000(0000) GS:ffffffff8049b800(0000) knlGS:0000000000000000
Dec 13 14:14:01 linux kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Dec 13 14:14:01 linux kernel: CR2: ffff810014e00d58 CR3: 000000000d5ca000 CR4: 00000000000006e0
Dec 13 14:14:01 linux kernel: Process X (pid: 3636, threadinfo ffff81001ef3a000, task ffff81003b640b30)
Dec 13 14:14:01 linux kernel: Stack: ffff8100123da380 ffff81003f79eb20 000000000000003c ffff81003f796010 
Dec 13 14:14:01 linux kernel:        ffff81003f796000 ffff81003b640b30 ffff81001dc72480 ffffffff801611ce 
Dec 13 14:14:01 linux kernel:        ffff81003f796000 0000000000000292 
Dec 13 14:14:01 linux kernel: Call Trace:<ffffffff801611ce>{cache_flusharray+110} <ffffffff80160cb8>{kmem_cache_free+56}
Dec 13 14:14:01 linux kernel:        <ffffffff8016a83f>{exit_mmap+303} <ffffffff8013168b>{mmput+27}
Dec 13 14:14:01 linux kernel:        <ffffffff80135931>{do_exit+497} <ffffffff80277f27>{do_unblank_screen+119}
Dec 13 14:14:01 linux kernel:        <ffffffff80120bee>{do_page_fault+1870} <ffffffff8010f1dd>{error_exit+0}
Dec 13 14:14:01 linux kernel:        <ffffffff801215d0>{__change_page_attr+224} <ffffffff80101810>{init_level4_pgt+2064}
Dec 13 14:14:01 linux kernel:        <ffffffff801219b0>{change_page_attr_addr+160} <ffffffff8027ec35>{unmap_page_from_agp+21}
Dec 13 14:14:01 linux kernel:        <ffffffff8027f028>{agp_generic_destroy_page+88} <ffffffff8027f31f>{agp_free_memory+111}
Dec 13 14:14:01 linux kernel:        <ffffffff8027dff8>{agp_release+200} <ffffffff8017a622>{__fput+194}
Dec 13 14:14:01 linux kernel:        <ffffffff80177998>{filp_close+104} <ffffffff80177f70>{sys_close+112}
Dec 13 14:14:01 linux kernel:        <ffffffff8010e91a>{system_call+126} 
Dec 13 14:14:01 linux kernel: 
Dec 13 14:14:01 linux kernel: Code: 48 89 10 48 2b 7e 18 48 c7 06 00 01 10 00 48 c7 46 08 00 02 
Dec 13 14:14:01 linux kernel: RIP <ffffffff80160e7b>{free_block+139} RSP <ffff81001ef3bac8>
Dec 13 14:14:01 linux kernel: CR2: ffff810014fab040
Dec 13 14:14:01 linux kernel:  <1>Fixing recursive fault but reboot is needed!
Dec 13 14:14:01 linux kernel: Unable to handle kernel paging request at ffff81008d732b64 RIP: 
Dec 13 14:14:01 linux kernel: <ffffffff80160ea8>{free_block+184}
Dec 13 14:14:01 linux kernel: PGD 8063 PUD 0
Comment 1 Olaf Kirch 2005-12-19 11:01:41 UTC
Looks like something corrupts your system's memory quite badly when
you shut it down.

Can you please run hwinfo (as root) when the system is up and running, and
attach the output to this report? Thanks!
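
For anyone reading the log in the description: the hex value after "Oops:" is the x86 page-fault error code, and decoding it is one way to see why the traces point at corrupted page tables. The sketch below is only an illustration; the function and table names are made up for this note, and the bit meanings are the architectural x86 definitions, not anything specific to this kernel build.

```python
# x86 page-fault error code bits (architectural definition).
# Each entry: (mask, meaning if the bit is set, meaning if it is clear).
PF_BITS = [
    (0x01, "protection violation (page was present)", "page not present"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "kernel mode"),
    (0x08, "reserved bit set in a page-table entry", None),
    (0x10, "instruction fetch", None),
]


def decode_oops_code(hex_code):
    """Turn the hex code printed after 'Oops:' into readable flags."""
    code = int(hex_code, 16)
    flags = []
    for mask, if_set, if_clear in PF_BITS:
        if code & mask:
            flags.append(if_set)
        elif if_clear is not None:
            flags.append(if_clear)
    return flags


# The two codes that appear in the /var/log/messages extract.
for code in ("000b", "0009"):
    print(code, "->", ", ".join(decode_oops_code(code)))
```

Both codes in the log have the reserved-bit flag set, which fits the "PMD ... BAD" lines: the faults come from malformed page-table entries rather than from ordinary missing pages.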
Comment 2 William Henson 2005-12-21 01:47:39 UTC
Created attachment 61514 [details]
hwinfo output piped to file as requested

The RAM is Corsair DDR2 533/PC4200...never had issues with Corsair before, but who knows.

I did a BIOS update prior to running hwinfo, so the BIOS is now the most current, but there is no change in the shutdown problem.

The hwinfo output is quite long, so it's attached as a file named hardwareinfo.txt.

Thanks for the help. I love Suse and really hope this last issue gets resolved.
Comment 3 Olaf Kirch 2006-01-02 09:05:05 UTC
Two things stand out in your hwinfo: the wireless driver and the ITE 821x
RAID controller. Everything else looks like pretty common HW that I wouldn't
suspect of causing memory corruption on such a massive scale.

I would suspect the WLAN driver. Please try running without the WLAN card
(or move its kernel module out of the /lib/modules directory).
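
Moving the module out of the way could look roughly like the sketch below. It assumes the driver files share the at76c prefix seen in the "Modules linked in" line of the oops; the function name and the backup location are hypothetical, and it would have to be run as root.

```python
import shutil
from pathlib import Path


def stash_modules(module_dir, backup_dir, pattern="at76c*.ko"):
    """Move matching module files out of module_dir; return the moved names."""
    src, dst = Path(module_dir), Path(backup_dir)
    dst.mkdir(parents=True, exist_ok=True)
    moved = []
    for ko in sorted(src.glob(pattern)):
        # Move, do not delete, so the files can be restored later.
        shutil.move(str(ko), str(dst / ko.name))
        moved.append(ko.name)
    return moved


# Hypothetical usage; both paths are placeholders to adapt to the system:
# stash_modules("/lib/modules/<kernel-version>/extra", "/root/at76c-backup")
```

Running depmod -a afterwards is the usual way to refresh the module dependency list, and moving the files back restores the driver.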
Comment 4 William Henson 2006-01-02 22:46:43 UTC
The WLAN device is a Linksys WUSB11 configured to start when plugged in (the only way I could get it to work). I shut down the box (via Kmenu, but it went unclean of course), unplugged the USB wireless adapter, and started back up. The logs show no evidence that the driver is running, iwconfig shows no wireless extensions now, and route shows loopback and no gateway. To me, that would indicate the wireless device is out of the picture. Note that I have not removed the device from YaST yet.

Shutdown is still dirty, so I guess the next thing is your last instruction: move the kernel module out of the /lib/modules directory. I need to be clear on just what you mean. Certainly not moving the entire kernel? The wireless drivers are located in /lib/modules/2.6.13-15.7-default/extra. The drivers for this device (Atmel chipset) include the following:

-rw-r--r--  1 root root  11800 Nov 29 17:32 at76c503-i3861.ko
-rw-r--r--  1 root root  10152 Nov 29 17:32 at76c503-i3863.ko
-rw-r--r--  1 root root  10152 Nov 29 17:32 at76c503-rfmd-acc.ko
-rw-r--r--  1 root root  12208 Nov 29 17:32 at76c503-rfmd.ko
-rw-r--r--  1 root root 151616 Nov 29 17:32 at76c503.ko
-rw-r--r--  1 root root  10112 Nov 29 17:32 at76c505-rfmd.ko
-rw-r--r--  1 root root  10856 Nov 29 17:32 at76c505-rfmd2958.ko
-rw-r--r--  1 root root  10152 Nov 29 17:32 at76c505a-rfmd2958.ko

So do you mean move these, or something else? Also, do you think it would be a good idea to remove the wireless device from YaST?

As a final note, after the system boots and I'm just at the log-in screen (before any configurations are loaded), shutdown and reboot do work fine.

Thanks
Comment 5 William Henson 2006-01-02 22:53:40 UTC
BTW--Bug 115018 reads almost exactly the same and appears to have been temporarily solved by using a KOTD.
Comment 6 Olaf Kirch 2006-01-03 13:26:56 UTC
William, feel free to try a newer 10.0 KOTD, but I am not entirely convinced
this is the same as Bug 115018.

Just removing the card should be sufficient to prevent the driver from
loading. No further moving about of the module is required; it wasn't
clear to me that the card was removable.

One thing you may want to try is to find out whether this is related to
any video activity. I.e., boot into runlevel 3 and make sure nvidia-agp and
related modules are not loaded. If that goes well, compare the lists of
kernel modules loaded in the two runlevels.
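
A rough sketch of that comparison, assuming the output of lsmod has been saved to a text file in each runlevel. The function names and the sample captures are invented for illustration; only the "Module  Size  Used by" header layout is taken from real lsmod output.

```python
def loaded_modules(lsmod_text):
    """Return the set of module names found in saved `lsmod` output."""
    names = set()
    for line in lsmod_text.splitlines():
        fields = line.split()
        # Skip blank lines and the "Module  Size  Used by" header.
        if not fields or fields[0] == "Module":
            continue
        names.add(fields[0])
    return names


def module_diff(runlevel3_text, runlevel5_text):
    """Modules present only in the runlevel 5 capture, sorted by name."""
    return sorted(loaded_modules(runlevel5_text) - loaded_modules(runlevel3_text))


# Made-up sample captures; real ones would come from `lsmod > file`.
rl3 = (
    "Module                  Size  Used by\n"
    "e1000                 137512  0\n"
    "usbcore               131812  3 ehci_hcd,uhci_hcd\n"
)
rl5 = rl3 + "example_video          28160  1\n"

print(module_diff(rl3, rl5))  # prints ['example_video']
```

Any module that shows up only in the runlevel 5 list is a candidate to unload or blacklist before the next shutdown test.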
Comment 7 Olaf Kirch 2006-03-06 14:38:18 UTC
No feedback in 2 months, closing