Bug 566303

Summary: System hangs on default boot; probably due to "auditd"
Product: [openSUSE] openSUSE 11.2 Reporter: Hung Ming Tsoi <hungming.tsoi>
Component: Kernel Assignee: E-mail List <kernel-maintainers>
Status: RESOLVED WONTFIX QA Contact: E-mail List <qa-bugs>
Severity: Critical    
Priority: P5 - None CC: forgotten_qMyteedNxa, jeffm
Version: Final   
Target Milestone: ---   
Hardware: x86-64   
OS: openSUSE 11.2   
Whiteboard:
Found By: --- Services Priority:
Business Priority: Blocker: ---
Marketing QA Status: --- IT Deployment: ---
Attachments: Extracts from /var/log/messages
Log of several bootup tries ("yes" to prompt_for_confirm)
Log of pressing "alt+sysreq+w" after successful boot by "y" to every prompt

Description Hung Ming Tsoi 2009-12-20 17:52:00 UTC
User-Agent:       Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.5) Gecko/20091103 SUSE/3.5.5-1.1.2 Firefox/3.5.5

The boot problem arose immediately after a non-updating installation (booting was successful 2 or 3 times). When booting with failsafe, the system boots perfectly. If I use the prompt_for_confirm option in the normal boot and press "y" for every process (even "y" to auditd and apparmor), the system boots normally. The boot fails when the confirmation prompts are disabled.

Action taken: (re-)installed kernel-desktop and kernel-default, both 2.6.31.5-0.1.1, and tried to boot using the new kernels. The same problem persisted.

For failed boots, the last lines in /var/log/messages are always:
Dec 20 13:14:23 linux-hsgi kernel: [ 1459.938332] [drm] Num pipes: 3
Dec 20 13:14:23 linux-hsgi kernel: [ 1460.015624] mtrr: MTRR 2 not used
Dec 20 13:14:23 linux-hsgi shutdown[4224]: shutting down for system reboot
Dec 20 13:14:24 linux-hsgi init: Switching to runlevel: 6
Dec 20 13:14:26 linux-hsgi avahi-daemon[1427]: Got SIGTERM, quitting.
Dec 20 13:14:26 linux-hsgi avahi-daemon[1427]: Leaving mDNS multicast group on interface eth0.IPv4 with address 10.72.31.1.
Dec 20 13:14:26 linux-hsgi smartd[1990]: smartd received signal 15: Terminated
Dec 20 13:14:26 linux-hsgi smartd[1990]: Device: /dev/sda [SAT], state written to /var/lib/smartmontools/smartd.TOSHIBA_MK1637GSX-387MT1RVT.ata.state
Dec 20 13:14:26 linux-hsgi smartd[1990]: smartd is exiting (exit status 0)
Dec 20 13:14:26 linux-hsgi auditd[1292]: Error sending signal_info request (Operation not supported)
Dec 20 13:14:26 linux-hsgi auditd[1292]: The audit daemon is exiting.
Dec 20 13:14:26 linux-hsgi network: Shutting down the NetworkManager
Dec 20 13:14:26 linux-hsgi kernel: [ 1463.680092] sky2 eth0: disabling interface
Dec 20 13:14:28 linux-hsgi ifdown: wlan0 device: Atheros Communications Inc. AR5001 Wireless Network Adapter (rev 01)
Dec 20 13:14:29 linux-hsgi rpcbind: rpcbind terminating on signal. Restart with "rpcbind -w"
Dec 20 13:14:29 linux-hsgi kernel: Kernel logging (proc) stopped.
Dec 20 13:14:29 linux-hsgi rsyslogd: [origin software="rsyslogd" swVersion="4.4.1" x-pid="1276" x-info="http://www.rsyslog.com"] exiting on signal 15.


I have also tried disabling auditd and/or rpcbind at boot (i.e. pressing "n" at the confirmation prompt) and pressing "continue" after these 2 commands, but the system then halts without logging anything.

Reproducible: Always

Comment 1 Hung Ming Tsoi 2009-12-20 18:05:31 UTC
Created attachment 333587 [details]
Extracts from /var/log/messages
Comment 2 Jeff Mahoney 2009-12-22 21:04:55 UTC
Can you boot with sysrq=1 and then provide the output of alt+sysrq+w ?
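If the key combination is hard to capture, the same dump can usually be requested from a root shell by writing the command character to /proc/sysrq-trigger. The following is only a minimal sketch under that assumption; the helper names request_sysrq and dump_blocked_tasks are hypothetical and are not part of any openSUSE tool.

```python
import subprocess
from pathlib import Path

# Standard Linux procfs file that accepts a single SysRq command character.
SYSRQ_TRIGGER = Path("/proc/sysrq-trigger")


def request_sysrq(key, trigger=SYSRQ_TRIGGER):
    """Write one SysRq command character (e.g. "w") to the trigger file.

    On a real system this needs root. The trigger path is a parameter
    only so the function can be tried against an ordinary file.
    """
    if len(key) != 1 or not key.isalnum():
        raise ValueError("SysRq key must be a single alphanumeric character")
    Path(trigger).write_text(key)


def dump_blocked_tasks():
    """Request the 'w' dump (blocked tasks) and return the kernel log text.

    Assumes the dmesg command is available and readable by the caller.
    """
    request_sysrq("w")
    result = subprocess.run(["dmesg"], capture_output=True, text=True, check=True)
    return result.stdout
```

Run as root, dump_blocked_tasks() returns the kernel ring buffer, and the blocked-task report should appear near the end of it. The "w" key is the one that lists tasks in uninterruptible (blocked) state.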
Comment 3 Hung Ming Tsoi 2009-12-23 00:42:54 UTC
Created attachment 333992 [details]
Log of several bootup tries ("yes" to prompt_for_confirm)
Comment 4 Hung Ming Tsoi 2009-12-23 22:41:21 UTC
I think I have forgotten to comment.

In my boots, I always set the PROMPT_FOR_CONFIRM option (in /etc/sysconfig/boot) to "yes".
00:59 was a failed boot in which I pressed "c" at the first prompt, i.e. "n" was the answer when asked whether to confirm on startup. 01:01 was the same as 00:59.

01:04-01:12 was a normal boot for me until shutdown, i.e. "y" was the response to every prompt at startup. 01:13 was again the same as 00:59 and 01:01. 01:15 was again my "normal boot".
Comment 5 Hung Ming Tsoi 2009-12-24 23:00:18 UTC
Created attachment 334271 [details]
Log of pressing "alt+sysreq+w" after successful boot by "y" to every prompt
Comment 6 Jeff Mahoney 2010-01-04 14:23:34 UTC
That's the output of alt+sysrq+q, which shows timer info. I'm looking for blocked tasks.
Comment 7 Forgotten User qMyteedNxa 2010-01-23 00:40:39 UTC
I would try disabling smartd (insserv -r smartd) to see if that makes a difference.
I have a bug report open for a box that is crashed by smartd, and there is a chance that smartd is the culprit here.
Comment 8 Jeff Mahoney 2010-09-03 17:38:28 UTC
openSUSE 11.2 is in security-maintenance mode. Please reopen if this issue still occurs with 11.3 or Factory.