Bug 114171

Summary: iscsid consumes 100% CPU
Product: [openSUSE] SUSE LINUX 10.0
Reporter: Charles Coffing <ccoffing>
Component: Network
Assignee: Hannes Reinecke <hare>
Status: RESOLVED FIXED
QA Contact: E-mail List <qa-bugs>
Severity: Major
Priority: P5 - None
CC: xdl-novell-bugzilla
Version: Beta 3
Target Milestone: ---
Hardware: Other
OS: All
Whiteboard:
Found By: Other
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---
Attachments: /var/log/messages

Description Charles Coffing 2005-08-30 16:39:17 UTC
I am testing Xen on SUSE Linux 10.0 betas.  I have an iSCSI target, and several
(the number varies) iSCSI initiators.  On the initiators, I am running OCFS2 to
create clustered storage, although I don't think that OCFS2 is relevant to this bug.

The initiators are running:
open-iscsi-0.3rc6-4
xen-3.0_6458
kernel-xen-2.6.13-2 (Kurt's private build, but I have seen this with previous
kernels too)
ocfs2-tools-1.1.1-3

Target is running unh_iscsi-1.60-7.  The target machine shows no unusual iSCSI
messages in its log.

The bug:

Sometimes the iSCSI connection seems to go bad on the initiator side (there are
many error messages in /var/log/messages), and then iscsid starts to consume
100% of the CPU.  I have left the machine in this state overnight, and it hangs
hard and has to be powered off.

I will attach /var/log/messages, and try to reproduce this with symbols.
Comment 1 Charles Coffing 2005-08-30 16:42:55 UTC
Created attachment 48203
/var/log/messages

iscsid goes to 100% CPU shortly after 10:01 in the log
Comment 2 Charles Coffing 2005-08-30 20:45:38 UTC
I no longer believe that the hard hang is open-iscsi's fault.  That appears to
be a bug in OCFS2, triggered when it loses connectivity.

So this bug is only about the 100% CPU usage by iscsid.
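
To put a number on the symptom while reproducing, CPU time can be sampled from /proc. The following is only a minimal sketch, not something taken from this report: the function names are made up for illustration, and it assumes the Linux /proc/<pid>/stat layout, where utime and stime are fields 14 and 15, counted in clock ticks.

```python
import os
import time


def cpu_ticks(stat_line: str) -> int:
    """Return utime + stime (in clock ticks) from a /proc/<pid>/stat line."""
    # Field 2 (the command name) is parenthesised and may contain spaces,
    # so only the text after the last ')' is split.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field 14 (utime) is rest[11]
    # and field 15 (stime) is rest[12].
    return int(rest[11]) + int(rest[12])


def cpu_percent(ticks_before: int, ticks_after: int,
                interval_s: float, hz: int) -> float:
    """CPU usage over the interval, as a percentage of one CPU."""
    return 100.0 * (ticks_after - ticks_before) / (hz * interval_s)


def sample(pid: int, interval_s: float = 5.0) -> float:
    """Sample one process twice and return its CPU usage in percent."""
    hz = os.sysconf("SC_CLK_TCK")  # clock ticks per second
    path = f"/proc/{pid}/stat"
    with open(path) as f:
        before = cpu_ticks(f.read())
    time.sleep(interval_s)
    with open(path) as f:
        after = cpu_ticks(f.read())
    return cpu_percent(before, after, interval_s, hz)
```

Calling sample() with the PID of iscsid (for example as reported by pidof iscsid) should return a value close to 100 while the daemon is spinning on one CPU.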
Comment 3 Hannes Reinecke 2005-09-08 08:15:25 UTC
This might be fixed by the NOOP IN handling fix which went in with the latest
update to open-iscsi.
That said, the unh-iscsi target seems to be a bit dodgy:

- Apparently it does not support the caching mode page (page 8); hence the
error 'sda: got wrong page'. Can you run 'sg_modes -p=8 -6 /dev/sda' and attach
the output?

- Apparently both iSCSI devices have the same target name (the cause of the
message 'picking OUI'). Is that intended?
Comment 4 Hannes Reinecke 2005-10-26 06:48:46 UTC
Please re-test with RC1 and re-open if the problem persists.
Comment 5 NetApp Linux Engineering 2005-12-12 20:07:17 UTC
Test update - please ignore.
Comment 6 NetApp Linux Engineering 2005-12-12 20:09:45 UTC
Test update2 - please ignore.