Bugzilla – Bug 555653
kernel: ... BUG: scheduling while atomic: cupsd ...
Last modified: 2009-12-01 14:59:13 UTC
User-Agent: Mozilla/5.0 (compatible; Konqueror/3.5; Linux) KHTML/3.5.10 (like Gecko) SUSE

> cups start
Starting cupsdcupsd: Child exited on signal 11!
startproc: exit status of parent of /usr/sbin/cupsd: 3

> tail -f /var/log/messages
Nov 15 18:42:42 linux-k7n1 kernel: [29409.355480] cupsd[2632]: segfault at 7fe59cfd7c10 ip 00007fe59cfd7c10 sp 00007fffafaf04e8 error 14
Nov 15 18:42:42 linux-k7n1 kernel: [29409.355507] note: cupsd[2632] exited with preempt_count 1

Reproducible: Always
Interestingly, cups starts fine if I set CUPSD_OPTIONS="-f" in /etc/sysconfig/cups. Nonetheless, installing my printer driver fails because lpinfo -v hangs.
Sometimes cupsd terminates unmotivatedly and has to be restarted.
Updated to the newest cups-1.3.11-25.1.x86_64 from the build service (repositories/Printing/openSUSE_11.2/). The error is still the same: cupsd only works with the -f or -F option. LogLevel debug is set in cupsd.conf. Without -f there is no message in error_log; with -F there is a SIGSEGV when trying to add a printer; see attachment.
Created attachment 327696 [details] cups error_log, loglevel debug
Do you again have AppArmor running like in bug #474403 or bug #539401? If yes, switch off AppArmor completely and retry.
Attachment #327696 [details] shows
------------------------------------------------------------------------
E [16/Nov/2009:14:01:09 +0100] PID 4183 (/usr/lib64/cups/daemon/cups-deviced) crashed on signal 11!
I [16/Nov/2009:14:01:40 +0100] Scheduler shutting down normally.
------------------------------------------------------------------------
so it is not cupsd which segfaults but cups-deviced, and because of this cupsd is "shutting down normally".

If the root cause is not AppArmor (i.e. if cups-deviced also segfaults without AppArmor running), find out more about cups-deviced by running it as root as
/usr/lib64/cups/daemon/cups-deviced 1 0 4 requested-attributes=all
and report its results.
Created attachment 327864 [details]
console output of cups-deviced

Hope that helps.
Do you again have AppArmor running like in bug #474403 or bug #539401?
As far as I see, attachment #327864 [details] doesn't show any error.

What does the following print for you?
/usr/lib64/cups/daemon/cups-deviced 1 0 4 \
    requested-attributes=all 1>/dev/null ; echo $?
No, apparmor is disabled for cups and cups-deviced.

> /usr/lib64/cups/daemon/cups-deviced 1 0 4 \
> requested-attributes=all 1>/dev/null ; echo $?
DEBUG: No address specified and no Address line in /etc/cups/snmp.conf...
DEBUG: [cups-deviced] Added device "beh"...
DEBUG: [cups-deviced] Added device "ipp"...
DEBUG: [cups-deviced] Added device "socket"...
DEBUG: [cups-deviced] Added device "hpfax"...
DEBUG: [cups-deviced] Added device "lpd"...
DEBUG: [cups-deviced] Added device "pipe"...
DEBUG: [cups-deviced] Added device "smb"...
DEBUG: [cups-deviced] Added device "hal"...
DEBUG: [cups-deviced] Added device "scsi"...
DEBUG: [cups-deviced] Added device "hp"...
DEBUG: [cups-deviced] Added device "http"...
0
Strange:
Comment #0: "cupsd ... segfault"
Comment #2: "Sometimes cupsd terminates unmotivatedly"
Comment #4, the attachment therein: "cupsd ... normally" but "cups-deviced crashed on signal 11"
Comment #10: cups-deviced works well

I have no idea what exactly goes wrong on your system, except that something seems to be wrong somehow...

Currently I can only set it back to you as "needinfo" so that you can describe how to reproduce a crash of cupsd or cups-deviced or anything related.
Is CUPS the only part which causes trouble on this system? Does the rest of the system (X, desktop, applications, ...) work stably and reliably (provided there is a reasonable amount of load on the rest of the system)?
Everything works perfectly, except cups and lpr (lpq, ...).
Are perhaps the packages cups-libs, cups, and cups-client mixed up, with different versions installed? It is crucial that cups and cups-client have exactly the same version as cups-libs. What does
rpm -q cups-libs cups-client cups
print?

I still need a description of how you can reproduce a crash of cupsd or cups-deviced or anything related like lpr/lpq/...

When you have a reproducible crash case on your system, provide a gdb backtrace as follows:

Prerequisite:
Install the cups-debuginfo package whose version exactly matches the version of your installed packages cups-libs, cups, and cups-client, so that
rpm -q cups-libs cups-client cups cups-debuginfo
shows the exact same version for all those packages. Without the cups-debuginfo package installed, a gdb backtrace is useless for me because it would not show the function name where the crash actually happened, so I could not find the point of interest in the sources.

Assuming it is cupsd which crashes, do the following:
Run /usr/sbin/cupsd in gdb with:
gdb /usr/sbin/cupsd
run -f
("-f" is passed as an argument to /usr/sbin/cupsd, so that "/usr/sbin/cupsd -f" is run here).
Wait for the crash.
Then enter at the gdb prompt "(gdb)":
where
to see a backtrace.
Post the gdb backtrace here.

Additionally, attach roughly the last 100 lines (i.e. the last part which looks of interest in relation to the crash) from /var/log/cups/error_log as MIME type "text/plain" to this bug, so that I have a chance to see which incident leads to the crash.

If it is not cupsd but e.g. lpstat which crashes, post a gdb backtrace for /usr/bin/lpstat plus the exact lpstat command which you called plus the last part of interest from /var/log/cups/error_log.
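The version check asked for above can be sketched as a small shell helper. This is only an illustration under assumptions: the function name versions_match is made up for this example, and the rpm query format shown in the comment is one possible way to get a "name version" line per package.

```shell
#!/bin/sh
# Hypothetical helper (not part of CUPS or rpm): reads "name version" lines
# on stdin and succeeds only if every line carries the same version string.
versions_match() {
    awk '{ seen[$2] = 1 }
         END { n = 0; for (v in seen) n++; exit (n == 1 ? 0 : 1) }'
}

# Possible usage on an RPM-based system (commented out because it needs rpm):
# rpm -q --qf '%{NAME} %{VERSION}-%{RELEASE}\n' \
#     cups-libs cups-client cups cups-debuginfo | versions_match \
#     && echo "all versions match" || echo "version mismatch"
```

A package that is not installed makes rpm print a "package ... is not installed" line instead, which this sketch also counts as a mismatch.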
Unfortunately I could not reproduce the crash when invoking cups with the -f option. However, without -f there seems to be no way to obtain a backtrace with gdb.

When running cups with -f my printer starts acting up, switching off and on all the time without being able to print (only disconnecting it from the PC or shutting down cups helps; though I think I had this bug once before, perhaps with 11.1). Horrible.

Even downgrades to the cups of RC1, MS7, os11.1 or os11.0 could not make cups run without crashing (though cups runs without -f on 11.1 and 11.0).

Do you think the errors pertain to the environment cups runs in, or might it help to let the buildservice build an older version of cups?
I am not enough of an expert to decide whether the root cause is in CUPS or in its environment, but when "/usr/sbin/cupsd -f" works while the default "/usr/sbin/cupsd" (i.e. run in the background as a "daemon") doesn't, it looks from my point of view very much like an error in the environment in your special case (but I have no idea how to find out what exactly makes your case so "special"). I think this in particular because if cupsd usually crashed in openSUSE 11.2, I would have seen many bug reports, but up to now yours is the only one.

How did you install openSUSE 11.2? Was it a new installation from scratch or an update? If the latter, from which older openSUSE version?

Please try the maximum LogLevel debug2 in /etc/cups/cupsd.conf to log all debugging information; perhaps this gives some helpful messages in the CUPS error_log.

As some kind of desperate attempt, finally try "Reinstalling the Printing System" according to
http://en.opensuse.org/SDB:CUPS_-_Reinstalling_the_Printing_System
Unfortunately, even with LogLevel debug2 nothing went into the error_log of cupsd before it crashed. After carefully following all steps at SDB:CUPS_-_Reinstalling_the_Printing_System, everything crashed as usual.

The thing is that I am working on a brand-new installation of openSUSE 11.2. I had not actually done anything before trying to run cups (except some system configuration tasks).

I have previously had problems with the printing system whenever my computer got cracked. The fact that the problem only seems to apply to my computer is no good sign either. This time there should not have been any possibility to crack it, since I kept it disconnected until I could enable AppArmor for all running services, even before I did the first update.

There must be a massive backdoor somewhere in the distro. It actually seems to be impossible to connect my computer to the internet without giving away control of it to these crackers. Tools like my self-developed checkroot can only help if a computer gets cracked later on.

I will need somebody who can help me, or otherwise my only possibility will be not to use Linux again and thereby to give up my commitment to openSUSE, Linux and my testing activity (although I don't like that idea at all).

Anyway, many thanks for your effort!
I don't think crackers are the root cause here.

When I Google for "exited with preempt_count 1" I get results related to kernel "Oops" issues.

What does
grep -i "oops" /var/log/messages
or more precisely
grep "Oops:" /var/log/messages
print for you?

If you have kernel oopses, the root cause is your kernel, more precisely your kernel on your particular hardware, which does not work reliably and stably, and as a side effect you see various crashes related to CUPS (see comment #11). But then it is a bit unexpected that the rest of your system could work "perfect" (see comment #12 and comment #13).
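A slightly broader search than the two grep calls above could look like the following sketch. The function name kernel_trouble is hypothetical, and the pattern list is only an assumption based on the messages quoted in this report ("Oops:", "BUG: ", "segfault at", "exited with preempt_count"); it is not a complete list of kernel error markers.

```shell
#!/bin/sh
# Hypothetical helper: print (with line numbers) the lines of a log file
# that contain one of the kernel trouble markers quoted in this report.
kernel_trouble() {
    grep -n -E 'Oops:|BUG: |segfault at|exited with preempt_count' "$1"
}

# Possible usage (commented out because it needs read access to the log):
# kernel_trouble /var/log/messages
```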
Created attachment 328434 [details]
tail -f /var/log/messages

No kernel oops; just segfault and register-dump messages.
As older versions of cups from 11.1 and 11.0 no longer compile without errors under 11.2 (buildservice), I have started to look for rpm-based distros featuring cups 1.4; perhaps the packages from Mandriva are worth a try.
Attachment #328434 [details] looks similar to issues like
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=301021

If I am right, the issue here is also a bug in the kernel, so I am changing the component of this bug report accordingly from "Printing" to "Kernel".
Please capture the output for a cupsd crash with
strace -f /usr/sbin/cupsd &>/tmp/strace-f.cupsd
and attach the resulting /tmp/strace-f.cupsd as MIME type text/plain to this bug.
Created attachment 328474 [details] strace -f cupsd
I personally can't see any similarity except that both programs have terminated unmotivatedly (called a kernel Oops, right?). For ifconfig the cause was a paging request; in our case it still seems to be unknown.
Just in case, can you attach the whole dmesg?
dmesg -s 1000000 >dmesg.out
Created attachment 328490 [details] dmesg -s 1000000
(In reply to comment #27)
> Created an attachment (id=328490) [details]
> dmesg -s 1000000

Unfortunately the kernel ring buffer had already been overwritten. The machine has been up for several hours with many suspend cycles. Could you
* find the last 0.000000 kernel line in /var/log/messages* and attach the file in which you find it and all files from that moment on (e.g. in one tarball)?
* retry after a reboot
(In reply to comment #28)
> * find last 0.000000 kernel line in /var/log/messages* and attach the one which
> you find it in and all from that moment on (e.g. in one tarball)?

Or just tar 'em all.
> grep 0[.]000000 /var/log/messages*
> # nothing found!

Why do you think that there has been nothing about the cups crash in the attached dmesg? I have let it crash immediately before taking the dmesg!
(In reply to comment #30)
> > grep 0[.]000000 /var/log/messages*
> > # nothing found!

Doesn't your logrotate setup gzip them?

> Why do you think that there has been nothing about the cups crash in the
> attached dmesg?

I don't think that. Cups is broken and touches memory which doesn't belong to it. Hence it crashes. But you hit a different problem before the cups crash. There has to be some bug triggered in the kernel, and I need the logs to find that.

> I have let it crash immediately before taking the dmesg!

Yup, you probably hit two bugs.
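Since a plain grep does not look inside rotated, compressed logs, the search for the boot marker could be sketched as below. The function name find_boot_logs is hypothetical, and the assumption that rotated files are named messages* and compressed with gzip or bzip2 depends on the local logrotate setup.

```shell
#!/bin/sh
# Hypothetical helper: list the files named messages* in a directory that
# contain a kernel boot timestamp such as "[    0.000000]", looking inside
# gzip- or bzip2-compressed files as well.
find_boot_logs() {
    for f in "$1"/messages*; do
        [ -e "$f" ] || continue
        case "$f" in
            *.gz)  gzip -dc "$f" 2>/dev/null ;;
            *.bz2) bzip2 -dc "$f" 2>/dev/null ;;
            *)     cat "$f" ;;
        esac | grep -q '\[ *0\.000000]' && echo "$f"
    done
    return 0
}

# Possible usage (commented out because it needs read access to the logs):
# find_boot_logs /var/log
```

The files printed by such a helper, plus all newer ones, would be the ones to put into the tarball.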
Created attachment 328508 [details]
/var/log/messages*

There is only a /var/log/messages, no messagesXY.gz or anything else that starts with 'mess'. By the way, what is the difference between viewing dmesg and /var/log/messages? I always look at the latter.
Elmar, can you try an updated kernel from http://ftp.suse.com/pub/projects/kernel/kotd/openSUSE-11.2/ ? I believe this is a bug I ran into and fixed independently yesterday.
Cups crashes are resolved with 2.6.31.6-0.0.0.23.273000e-desktop. Nonetheless, my proprietary Brother drivers don't seem to work with newer versions of cups.
I opened bug 557302 to explain this appropriately and make for easier searching. Closing this as a duplicate of that one.

*** This bug has been marked as a duplicate of bug 557302 ***
*** Bug 539401 has been marked as a duplicate of this bug. ***