Bugzilla – Bug 114171
iscsid consumes 100% CPU
Last modified: 2005-12-12 20:09:45 UTC
I am testing Xen on SUSE Linux 10.0 betas. I have an iSCSI target and several (the number varies) iSCSI initiators. On the initiators, I am running OCFS2 to create clustered storage, although I don't think that OCFS2 is relevant to this bug.

The initiators are running:
open-iscsi-0.3rc6-4
xen-3.0_6458
kernel-xen-2.6.13-2 (Kurt's private build, but I have seen this with previous kernels too)
ocfs2-tools-1.1.1-3

The target is running unh_iscsi-1.60-7. The target machine shows no unusual iSCSI messages in the log.

The bug: Sometimes the iSCSI connection seems to go bad on the initiator side (there are many error messages in /var/log/messages), and then iscsid starts to consume 100% of the CPU. I have left the machine in this state overnight, and it hangs hard and has to be powered off.

I will attach /var/log/messages and try to reproduce this with symbols.
Created attachment 48203: /var/log/messages
iscsid goes to 100% CPU shortly after 10:01 in the log.
I no longer believe that the hard hang is open-iscsi's fault. That appears to be a bug in OCFS2, triggered when it loses connectivity. So this bug is only about the 100% CPU usage by iscsid.
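For anyone re-testing, a rough way to put a number on the CPU usage is to sample the daemon's accumulated CPU time from /proc twice and compare. The following is only a sketch, assuming a Linux /proc filesystem; the function names are hypothetical and not part of open-iscsi, and the PID of iscsid has to be supplied by hand (for example from ps).

```python
import os
import sys
import time


def parse_cpu_ticks(stat_line):
    """Return utime + stime (in clock ticks) from a /proc/<pid>/stat line."""
    # Field 2 (the command name) is wrapped in parentheses and may itself
    # contain spaces or ')', so split on the LAST ')' before tokenizing.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state); utime is field 14 and stime is field 15.
    utime = int(rest[11])
    stime = int(rest[12])
    return utime + stime


def cpu_percent(ticks_before, ticks_after, interval_s, ticks_per_s):
    """CPU usage over the interval, as a percentage of one CPU."""
    return 100.0 * (ticks_after - ticks_before) / (ticks_per_s * interval_s)


def sample_pid(pid, interval_s=5.0):
    """Sample /proc/<pid>/stat twice and return the CPU percentage."""
    ticks_per_s = os.sysconf("SC_CLK_TCK")
    path = "/proc/%d/stat" % pid
    with open(path) as f:
        before = parse_cpu_ticks(f.read())
    time.sleep(interval_s)
    with open(path) as f:
        after = parse_cpu_ticks(f.read())
    return cpu_percent(before, after, interval_s, ticks_per_s)


if __name__ == "__main__":
    # Usage (hypothetical script name): python cpu_sample.py <pid-of-iscsid>
    print("%.1f%% CPU" % sample_pid(int(sys.argv[1])))
```

A value that stays near 100 across repeated samples would match the symptom described here; logging it alongside timestamps makes it easier to line up with the errors in /var/log/messages.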
This might be fixed by the NOOP IN handling fix which went in with the latest update to open-iscsi.

Nevertheless, the unh-iscsi target seems to be a bit dodgy:
- Apparently it does not support the caching mode page 8, hence the error 'sda: got wrong page'. Can you try to run 'sg_modes -p=8 -6 /dev/sda' and attach the output?
- Apparently both iscsi devices have the same target name (the cause of the message 'picking OUI'). Is that intended?
Please re-test with RC1 and re-open if the problem persists.