Bug 117749 - IPsec crashes kernel
Summary: IPsec crashes kernel
Status: RESOLVED FIXED
: 115200 134731 (view as bug list)
Alias: None
Product: SUSE Linux 10.1
Classification: openSUSE
Component: Kernel (show other bugs)
Version: Beta 2
Hardware: i686 SUSE Other
: P5 - None : Major (vote)
Target Milestone: ---
Assignee: Jiri Bohac
QA Contact: E-mail List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2005-09-19 11:33 UTC by Lars Müller
Modified: 2006-06-14 15:33 UTC (History)
5 users (show)

See Also:
Found By: Development
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Attachments
Compressed /var/log/messages of the system with kernel 2.6.14-rc4-20051012095427-default (248.04 KB, application/x-bzip2)
2005-10-18 19:26 UTC, Lars Müller
Details
Output of ip xfrm policy list at the moment when I got 'connect: Resource temporarily unavailable'. (2.05 KB, text/plain)
2005-10-18 20:44 UTC, Lars Müller
Details
Output of ip xfrm policy list when everything is fine. (2.89 KB, text/plain)
2005-10-18 20:48 UTC, Lars Müller
Details
Test Patch (1.92 KB, patch)
2005-11-30 10:47 UTC, Olaf Kirch
Details | Diff
/var/log/messages (17.33 KB, text/plain)
2005-11-30 17:45 UTC, Lars Müller
Details
BUG: soft lockup detected... (2.81 MB, image/png)
2006-01-26 15:46 UTC, Christoph Thiel
Details
next try... (3.18 MB, image/jpeg)
2006-01-26 15:47 UTC, Christoph Thiel
Details
sysrq-p (2.72 MB, image/jpeg)
2006-01-26 15:48 UTC, Christoph Thiel
Details
xfrm_policy.c.rej (530 bytes, application/x-reject)
2006-01-26 17:40 UTC, Christoph Thiel
Details
Updated test patch (2.12 KB, patch)
2006-01-27 10:05 UTC, Olaf Kirch
Details | Diff
dmesg output as requested with comment #61. (17.61 KB, text/plain)
2006-01-27 14:39 UTC, Lars Müller
Details
Updated test patch (2.58 KB, patch)
2006-01-30 12:05 UTC, Olaf Kirch
Details | Diff
/var/log/messages of the system (322.16 KB, text/plain)
2006-02-02 13:07 UTC, Lars Müller
Details
serial console capture containing some Oopses while system is booting (10.93 KB, application/x-tar)
2006-02-27 13:02 UTC, Marius Tomaschewski
Details
serial console capture with sysrq-t/p output while system freeze (83.60 KB, application/x-tar)
2006-02-27 13:07 UTC, Marius Tomaschewski
Details
updated SUSE 10.0 patch (745 bytes, patch)
2006-06-01 10:53 UTC, Marius Tomaschewski
Details | Diff
proper patch for the 10.0 kernel (1011 bytes, patch)
2006-06-06 16:01 UTC, Jiri Bohac
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Lars Müller 2005-09-19 11:33:42 UTC
I still have problems with the ipsec and RC4.  After some time (10-15 minutes)
my system got frozen.  And this only happens if I use ipsec.
Comment 1 Lars Müller 2005-09-19 11:34:31 UTC
Nothing in the logs.  sysrq doesn't help.
Comment 2 Olaf Kirch 2005-09-20 08:17:19 UTC
Please enable logging to console (klogconsole -l8 -r0), switch to 
virtual console and wait for it to happen. 
Comment 3 Lars Müller 2005-09-21 19:33:36 UTC
I followed the instructions from comment #2 and the system had frozen again. 
But nothing on the console.
Comment 4 Olaf Kirch 2005-09-22 07:30:15 UTC
Okay, next try: please enable nmi_watchdog in addition and see if something 
shows up on the console 
Comment 5 Lars Müller 2005-09-22 11:38:07 UTC
With set to 1 and 2 the NMI count of /proc/interrupts is zero.

The CPU is Intel(R) Pentium(R) III Mobile 1200MHz.
Comment 6 Lars Müller 2005-10-01 18:31:39 UTC
Anything I can do?  I'm not in the office atm and need a working IPsec.
Comment 7 Olaf Kirch 2005-10-04 06:58:33 UTC
I'm currently at a loss on how to debug this. Do you have a machine where 
NMI watchdog works? 
Comment 8 Chris L Mason 2005-10-04 11:58:11 UTC
Well, there goes my theory that it might be x86-64 only.  What kind of nic do you have 
lars?  I'm a via rhine. 
Comment 9 Lars Müller 2005-10-04 13:13:10 UTC
To comment #7: Only the two systems of my girlfriend  And testing with this
systems might be to risky for my life.

To comment #8: lmuelle@gab:~> /sbin/lspci | grep -i ether
02:00.0 Ethernet controller: 3Com Corporation 3c905C-TX/TX-M [Tornado] (rev 78)
Comment 10 Lars Müller 2005-10-18 19:21:45 UTC
I've updated my kernel to 2.6.14-rc4-20051012095427-default from kotd.  After booting this kernel and starting SuSE RAS/ IPsec I no longer have the system freeze.  From time to time I have a non working IPsec conenction.  Sometimes the connection recovers and sometime not.  Then I have to restart it.
Comment 11 Lars Müller 2005-10-18 19:26:33 UTC
Created attachment 54633 [details]
Compressed /var/log/messages of the system with kernel 2.6.14-rc4-20051012095427-default
Comment 12 Lars Müller 2005-10-18 20:43:14 UTC
Now I have again such an unrecoverable situation.

lmuelle@gab:~> ping gray.suse.de
connect: Resource temporarily unavailable
Comment 13 Lars Müller 2005-10-18 20:44:33 UTC
Created attachment 54650 [details]
Output of ip xfrm policy list at the moment when I got 'connect: Resource temporarily unavailable'.
Comment 14 Lars Müller 2005-10-18 20:48:31 UTC
Created attachment 54652 [details]
Output of ip xfrm policy list when everything is fine.
Comment 15 Lars Müller 2005-10-20 13:26:43 UTC
While trying to do some work remote I have the same problem again even with this kernel version.
Comment 16 Dirk Mueller 2005-11-26 01:49:52 UTC
*** Bug 134731 has been marked as a duplicate of this bug. ***
Comment 17 Dirk Mueller 2005-11-26 01:53:09 UTC
I believe this is actually a duplicate of 115200 which contains a lot more information already. 

if you all agree please resolve as duplicate. 
Comment 18 Dirk Mueller 2005-11-26 01:53:53 UTC
ah, I also experience the issues now but currently can't add more information than was already given. 

Comment 19 Michael Matz 2005-11-28 06:41:20 UTC
As suggested by Olaf on my bug 134731 (now dup of this one) I tried various
things:
1) nmi_watchdog=1 : doesn't help, so the CPU is not locked in that way
2) working only from the text console so that I can look at sysrq:
   once the machine is locked sysrq itself still works, but it can't
   write any files anymore, hence any sysrq output more than 30 lines
   is lost for me (because I also have no serial console :-( ), so sysrq-t
   is useless (I see only a few tasks, and they simply sleep in read or
   similar harmless syscalls).  But the interesting thing is, that sysrq-p
   (show Pc) shows it spinning in the process "ping".  At that time only
   one ping is running, namely the one to keep the ipsec tunnel open (which
   btw doesn't seem to work that well, because sometimes the tunnel goes
   sort of down anyway, and I have to do another ping which then triggers
   a restart of the tunnel, and all continues to work.  So somehow the
   period of the normal ping (30 seconds) isn't enough to keep the tunnel
   working.  I have the _feeling_ (and it's only that) that the lockups
   happen when the tunnel is down in this way, and in the process of being
   retriggered.  Because the last log message in var/log/messages when
   such lockup happens is about the ipsec SA being expired (I have to
   verify if this is really the case every time).

   So, anyway, if the machine is locked up, sysrq-p points at ping running
   heavily.  I've noted the backtraces on a sheet of paper, which I forgot
   home, so will enter that some other time.  But it was spinning in
   functions having xfrm_ in their name.
3) tried two ways of crash dumping to no avail.  lkcd-stuff doesn't seem
   to be supported with the 10.0 kernel.  And the kexec based things I
   also didn't get to work.  I've had to compile the kexec-utils myself,
   because no rpm on 10.0, and as it needs a normal vmlinux (unpacked),
   I had to compile my own kernel as the secondary kernel, which is necessary
   anyway because of the necessary adjustment to the load address.
   Once I did that, and kexec-utils finally loaded that kernel sysrq-c
   (forced kexec) just locked the machine.

Yeah, that's it.  So still no really useful information :-(
Comment 20 Lars Müller 2005-11-28 09:41:16 UTC
I've updated my system to 10.1 post Alpha 3 and still have the crashes.

Marius and I before did also several tests with openswan 2.4.4 and kotd.

I now move this to critical as it is not only my bug and as it is in the 10.1 tree too.
Comment 21 Lars Müller 2005-11-28 10:06:56 UTC
Could someone consider if this is a duplicate of bug #115200?
Comment 22 Olaf Kirch 2005-11-28 10:09:59 UTC
We're getting somewhere, thanks to Michael's debugging help.

Michael, can you add the sysrq-p backtrace? That would be extremely
helpful.

Concerning sysrq-t: as long as you do not switch consoles, you can
scroll up using Shift-PageUp for a few pages worth of data. The scrollback
buffer is clobbered as soon as you switch consoles though.
Comment 23 Michael Matz 2005-11-28 10:14:42 UTC
No, I can't scroll back, because the keyboard doesn't react to anything,
not even to shift-pgup/pgdown.  It only reacts to sysrq-* .

Well, at least the _effect_ is that it doesn't react.  It might of course
also just be, that the scrolling of the console doesn't work in that state
anymore.
Comment 24 Michael Matz 2005-11-29 19:41:30 UTC
So, here come four different traces of sysrq-p when the machine is
frozen.  I should perhaps add that _all_ of these backtrace were in ping,
once the machine is frozen, where usually (when not frozen) most of these
sysrq-p traces are in the swapper task (IIRC).  Also sysrq-t shows at the
end a list of running (R) tasks; ping was included also in all of them.

Anyway, here the traces, from my sheet of paper, so hopefully all function
names are correct, or at least identifieable.  I'll write then in opposite
entry order, i.e. first the caller, then the callee.  First the common
part for all traces:

syscall_call
sys_gettimeofday
sys_socketcall
do_pollfd
sock_poll
schedule
__alloc_pages
buffered_runqueue (unsure, can't read it fully on the paper, definitely
                   buffered_run...)
sys_fadvise64_64
sys_sendmsg
verify_iovec
autoremove_wake_function
soft_cursor
sock_sendmsg
inet_sendmsg
raw_sendmsg
default_wake_function

Now the four rests of the backtraces, which just have to be added to the
above list to form the full backtraces:

1)
xfrm_lookup
xfrm_tmpl_resolve
xfrm_state_find

2)
xfrm_lookup
xfrm_find_bundle
__xfrm4_find_bundle
local_bh_enable

3)
xfrm_lookup
stale_bundle
xfrm_bundle_create
xfrm_bundle_ok

4)
xfrm_policy_lookup
xfrm_lookup
flow_cache_lookup
flow_hash_code

Yes, the first three have xfrm_lookup in the first position, the fourth
in the second.

I find the traces a bit strange, as I can't find a reason why
sys_gettimeofday should do a socketcall, or pollfd.  So perhaps the start
of the trace is inconsistent, or contains old data.  But this trace does
happen all the time when the machine is frozen (tested at two different
freezes), and changes only in the last four to six frames.
Comment 25 Olaf Kirch 2005-11-30 10:15:06 UTC
Ah! useful intelligence - thanks a lot!
Comment 26 Olaf Kirch 2005-11-30 10:46:32 UTC
There ought to be a ip_route_output_flow call between raw_sendmsg
and xfrm_lookup - this is how we end up in the xfrm code.

It seems we're looping in xfrm_lookup because the newly created
xfrm bundle is deemed stale.

Can you please try the attached test patch? If my theory is right,
the kernel should not lock up anymore, and print a bunch of diagnostics
instead.
Comment 27 Olaf Kirch 2005-11-30 10:47:20 UTC
Created attachment 59102 [details]
Test Patch
Comment 28 Lars Müller 2005-11-30 15:42:40 UTC
Built kernels for 10.0 and stable are at http://w3.suse.de/~lmuelle/kernel/
Comment 29 Lars Müller 2005-11-30 15:44:24 UTC
Adding Andreas as the patch mentioned in comment #27 might concern him too.
Comment 30 Lars Müller 2005-11-30 17:23:42 UTC
While testing the patched kernel I got:

Nov 30 17:39:09 gab kernel: xfrm_bundle_ok: 1151
Nov 30 17:48:46 gab kernel: xfrm_bundle_ok: 1151
Nov 30 17:48:54 gab last message repeated 3 times
Nov 30 17:58:46 gab kernel: xfrm_bundle_ok: 1151
Nov 30 17:58:54 gab last message repeated 5 times
Nov 30 18:08:46 gab kernel: xfrm_bundle_ok: 1151
Nov 30 18:08:54 gab last message repeated 7 times
Comment 31 Lars Müller 2005-11-30 17:45:00 UTC
Created attachment 59285 [details]
/var/log/messages

Systen is now up for around an hour.

My running rsync got stuck one time in between.  At that moment a ping to an internal host succeeded and the rsync process continued.
Comment 32 Lars Müller 2005-11-30 21:51:18 UTC
Four hours and 2 GB received data without a crash.
Comment 33 Michael Matz 2005-12-01 14:36:43 UTC
I've used RAS the whole evening yesterday (no mass transfers but continuous
working over ssh) and got no lockups, so ... YEAH!!! ... ;-)

Like Lars I also get occassional connection stucks, which I also got before,
and where I conjectured that these may be the points in time when lockups
actually occur.  Now I have such things in my messages:

Dec  1 14:08:50 hell6 pluto[9403]: "matz-suse" #48: IPsec SA expired (LATEST!)
Dec  1 14:09:15 hell6 pluto[9403]: initiate on demand from 10.204.0.110:0 to 10.0.9.7:0 proto=0 state: fos_start because: acquire
Dec  1 14:09:15 hell6 pluto[9403]: "matz-suse" #53: initiating Quick Mode RSASIG+ENCRYPT+TUNNEL+PFS+UP {using isakmp#49}
Dec  1 14:09:15 hell6 pluto[9403]: "matz-suse" #53: transition from state STATE_QUICK_I1 to state STATE_QUICK_I2
Dec  1 14:09:15 hell6 pluto[9403]: "matz-suse" #53: STATE_QUICK_I2: sent QI2, IPsec SA established {ESP=>0xb16fac6b <0xa100f7b7 xfrm=AES_0-HMAC_SHA1 NATD=195.135.221.4:4500 DPD=none}
Dec  1 14:09:45 hell6 kernel: xfrm_bundle_ok: 1153
Dec  1 14:19:45 hell6 kernel: xfrm_bundle_ok: 1153
Dec  1 14:19:45 hell6 kernel: xfrm_bundle_ok: 1153
Dec  1 14:29:46 hell6 kernel: xfrm_bundle_ok: 1153
Dec  1 14:29:46 hell6 last message repeated 2 times
Dec  1 14:39:46 hell6 kernel: xfrm_bundle_ok: 1153
Dec  1 14:39:46 hell6 last message repeated 3 times
Dec  1 14:49:46 hell6 kernel: xfrm_bundle_ok: 1153
Dec  1 14:49:46 hell6 last message repeated 4 times
Dec  1 14:54:44 hell6 pluto[9403]: "matz-novell1" #54: initiating Main Mode to replace #49

This happened quite frequently, every time after the STATE_QUICK_I2 message
happening after the "SA expired" message occuring roughly every 50 minutes.

I've left it run idly overnight with IPsec active (formerly there was
a high chance that it would lockup over night), and it still runs.  I just
look at it from the office, and it hasn't locked up, but RAS also
isn't working anymore, I only get "Resource temporarily unavailable" from
pings, without the connection ever being established.  In the log I see:

Dec  1 16:24:00 hell6 pluto[9403]: "matz-suse" #104: initiating Main Mode
Dec  1 16:24:00 hell6 pluto[9403]: "matz-suse" #104: received Vendor ID payload[draft-ietf-ipsec-nat-t-ike-03] method set to=108
Dec  1 16:24:00 hell6 pluto[9403]: "matz-suse" #104: enabling possible NAT-traversal with method RFC 3947 (NAT-Traversal)
Dec  1 16:24:00 hell6 pluto[9403]: "matz-suse" #104: transition from state STATE_MAIN_I1 to state STATE_MAIN_I2
Dec  1 16:24:00 hell6 pluto[9403]: "matz-suse" #104: STATE_MAIN_I2: sent MI2, expecting MR2
Dec  1 16:24:00 hell6 pluto[9403]: "matz-suse" #104: I did not send a certificate because I do not have one.
Dec  1 16:24:00 hell6 pluto[9403]: "matz-suse" #104: NAT-Traversal: Result using draft-ietf-ipsec-nat-t-ike-02/03: i am NATed
Dec  1 16:24:00 hell6 pluto[9403]: "matz-suse" #104: transition from state STATE_MAIN_I2 to state STATE_MAIN_I3
Dec  1 16:24:00 hell6 pluto[9403]: "matz-suse" #104: STATE_MAIN_I3: sent MI3, expecting MR3
Dec  1 16:24:00 hell6 pluto[9403]: "matz-suse" #104: ignoring informational payload, type INVALID_ID_INFORMATION
Dec  1 16:24:00 hell6 pluto[9403]: "matz-suse" #104: received and ignored informational message
Dec  1 16:24:10 hell6 pluto[9403]: "matz-suse" #104: discarding duplicate packet; already STATE_MAIN_I3
Dec  1 16:24:10 hell6 pluto[9403]: "matz-suse" #104: ignoring informational payload, type INVALID_ID_INFORMATION
Dec  1 16:24:10 hell6 pluto[9403]: "matz-suse" #104: received and ignored informational message
Dec  1 16:24:30 hell6 pluto[9403]: "matz-suse" #104: discarding duplicate packet; already STATE_MAIN_I3
Dec  1 16:24:30 hell6 pluto[9403]: "matz-suse" #104: ignoring informational payload, type INVALID_ID_INFORMATION
Dec  1 16:24:30 hell6 pluto[9403]: "matz-suse" #104: received and ignored informational message
Dec  1 16:24:46 hell6 pluto[9403]: initiate on demand from 10.204.0.110:0 to 10.0.9.7:0 proto=0 state: fos_start because: acquire
Dec  1 16:25:10 hell6 pluto[9403]: "matz-suse" #104: max number of retransmissions (2) reached STATE_MAIN_I3.  Possible authentication failure: no acceptable response to our first encrypted message

The above block is repeated with increasing connection IDs.  So it seems
that my machine can't authenticate against the server anymore.  Just before
these failing auth attempts started I see this in my log:

Dec  1 14:55:46 hell6 kernel: xfrm_bundle_ok: 1178
Dec  1 14:55:46 hell6 kernel: xfrm_lookup: newly created bundle is stale
Dec  1 14:55:46 hell6 kernel: xfrm_bundle_ok: 1178
Dec  1 14:55:46 hell6 kernel: xfrm_lookup: newly created bundle is stale
Dec  1 14:55:46 hell6 kernel: xfrm_bundle_ok: 1178
Dec  1 14:55:46 hell6 kernel: xfrm_lookup: newly created bundle is stale
Dec  1 14:55:46 hell6 kernel: xfrm_bundle_ok: 1178
Dec  1 14:55:46 hell6 kernel: xfrm_lookup: newly created bundle is stale
Dec  1 14:55:46 hell6 kernel: xfrm_bundle_ok: 1178
Dec  1 14:55:46 hell6 kernel: xfrm_lookup: newly created bundle is stale
Dec  1 14:55:46 hell6 kernel: xfrm_bundle_ok: 1178
Dec  1 14:55:46 hell6 kernel: xfrm_lookup: newly created bundle is stale

So, this is a different __LINE__ than the above messages from the patch,
and of course we have a new stale bundle.  OTOH this occured also two hours
before already, and thereafter the connection still worked and was able
to authenticate.  So the mere existence of a stale bundle is not the
reason for failing auth.  But still there might be a connection.

Many many thanks for your theory testing patch, finally I can work overnight
again comfortably ;-)
Comment 34 Olaf Kirch 2005-12-01 16:38:58 UTC
Well, I'm happy it works for you... but it's still a bit strange.

You said it usually locks up after 10 minutes. This matches exactly
the intervals between the xfrm_bundle_ok messages (10 mins). So
congrats, you bio clock is highly accurate!

But - if we are looping where I think we would loop, you would see
10 messages in a row saying "newly created bundle is stale". But there
are just one or two. AND you're seeing this message only when RAS dies!

The fact that we hit xfrm_bundle_ok/line 1153 means that the
dst cache entry we're looking at has been marked for destruction
by someone: dst->obsolete is set, which happens if someone calls 
dst_free() on it while others still hold a reference to it. Maybe
pluto is clearing out some policy entries at 10 minute intervals.

At any rate, this needs a new patch with a few more printk's.
Comment 35 Andreas Schwab 2005-12-01 17:08:43 UTC
My experience is that the lockups only happen with NAT-T.
Comment 36 Lars Müller 2005-12-02 16:46:55 UTC
At all my locations I use NAT-T too.
Comment 37 Forgotten User ZhJd0F0L3x 2005-12-03 06:07:39 UTC
i am seeing this (~lmuelle-kernel):

Dec  2 22:41:20 susi kernel: xfrm_bundle_ok: 1153

long after i stopped pppd and racoon:

Dec  2 20:41:22 susi pppd[9104]: Terminating on signal 15
Dec  2 20:41:22 susi pppd[9104]: Connect time 71.8 minutes.
Dec  2 20:41:22 susi pppd[9104]: Sent 713396 bytes, received 1505904 bytes.
Dec  2 20:41:22 susi kernel: xfrm_bundle_ok: 1153
Dec  2 20:41:22 susi kernel: xfrm_bundle_ok: 1153
Dec  2 20:41:22 susi kernel: xfrm_bundle_ok: 1153
Dec  2 20:41:22 susi kernel: xfrm_bundle_ok: 1153
Dec  2 20:41:22 susi kernel: xfrm_bundle_ok: 1153
Dec  2 20:41:22 susi kernel: xfrm_bundle_ok: 1153
Dec  2 20:41:22 susi kernel: xfrm_bundle_ok: 1153
Dec  2 20:41:22 susi kernel: xfrm_bundle_ok: 1153
Dec  2 20:41:22 susi kernel: xfrm_bundle_ok: 1153
Dec  2 20:41:22 susi kernel: xfrm_bundle_ok: 1153
Dec  2 20:41:22 susi kernel: xfrm_bundle_ok: 1153
Dec  2 20:41:22 susi kernel: xfrm_bundle_ok: 1153
Dec  2 20:41:22 susi kernel: xfrm_bundle_ok: 1153
Dec  2 20:41:22 susi kernel: xfrm_bundle_ok: 1153
Dec  2 20:41:22 susi kernel: xfrm_bundle_ok: 1153
Dec  2 20:41:22 susi pppd[9104]: Script /etc/ppp/ip-down started (pid 15366)
Dec  2 20:41:22 susi pppd[9104]: sent [LCP TermReq id=0x2 "User request"]
Dec  2 20:41:22 susi pppd[9104]: rcvd [LCP TermAck id=0x2]
Dec  2 20:41:22 susi pppd[9104]: Connection terminated.
Dec  2 20:41:23 susi modify_resolvconf: restored /etc/resolv.conf.saved.by.pppd.modemU to /etc/resolv.conf
Dec  2 20:41:23 susi pppd[9104]: Waiting for 1 child processes...
Dec  2 20:41:23 susi pppd[9104]:   script /etc/ppp/ip-down, pid 15366
Dec  2 20:41:24 susi pppd[9104]: Script /etc/ppp/ip-down finished (pid 15366), status = 0x0
Dec  2 20:41:24 susi pppd[9104]: Exit.
Dec  2 20:41:24 susi /etc/ppp/ip-down.d/racoon: VPNCLIENT-CLI-0272:Gateway IP Address is invalid
Dec  2 20:41:25 susi racoon: INFO: caught signal 15
Dec  2 20:41:25 susi racoon: ERROR: libipsec failed pfkey check (Invalid message type)
Dec  2 20:41:26 susi racoon: ERROR: sendmsg (Invalid argument)
Dec  2 20:41:26 susi racoon: ERROR: sendfromto failed
Dec  2 20:41:26 susi racoon: INFO: racoon shutdown
Dec  2 20:41:26 susi /etc/ppp/ip-down.d/racoon: Shutting down IPsec IKE daemon (racoon)..done

It did not lock up this time, i just wondered about the ipsec message (is it one?) long after usage.

My machine also practically never locks up while the tunnel is active but long after i stopped using ipsec and - i almost don't dare to say it - it seems to coincide with ACPI LID button events....
Comment 38 Lars Müller 2006-01-14 15:00:37 UTC
Olaf: How about adding this workaround to the current kernel.  I have to recompile kernels for each preview.
Comment 39 Michael Matz 2006-01-14 17:46:25 UTC
So, is any kernel or patch available which has some more printk's as
noted in comment #34?  Because, while the machine doesn't lockup anymore
IPSEC still doesn't work satisfyingly.  After some time the tunnel goes down
when no traffic goes over it (_although_ there is that 30 second ping,
so it seems this ping doesn't count as activity),  After this happens
there is a short time interval (about some minutes) where the tunnel
can be revived by another ping to the IPsec'ed network (yes, that's
strange).  If this is not done, and the tunnel is down for a longer period
than this, then it's in an unrevivable state (i.e. the pings return
just with 'connect: Resource temporarily unavailable').

Everything I said in comment #33 still applies.  I.e. after such
ping-to-revive pluto tries to recreate the connection but fails because of:

Jan 13 22:56:15 hell6 pluto[9662]: "matz-novell5" #24: max number of
 retransmissions (2) reached STATE_MAIN_I3.  Possible authentication failure:
 no acceptable response to our first encrypted message

Right before the _first_ attempt which fails like above I have this:

Jan 13 22:54:53 hell6 kernel: xfrm_bundle_ok: 1178
Jan 13 22:54:53 hell6 kernel: xfrm_lookup: newly created bundle is stale
Jan 13 22:54:53 hell6 kernel: xfrm_bundle_ok: 1178
Jan 13 22:54:53 hell6 kernel: xfrm_lookup: newly created bundle is stale
Jan 13 22:54:53 hell6 kernel: xfrm_bundle_ok: 1178
Jan 13 22:54:53 hell6 kernel: xfrm_lookup: newly created bundle is stale
Jan 13 22:54:53 hell6 kernel: xfrm_bundle_ok: 1178
Jan 13 22:54:53 hell6 kernel: xfrm_lookup: newly created bundle is stale
Jan 13 22:54:53 hell6 kernel: xfrm_bundle_ok: 1178
Jan 13 22:54:53 hell6 kernel: xfrm_lookup: newly created bundle is stale
Jan 13 22:54:53 hell6 kernel: xfrm_bundle_ok: 1178
Jan 13 22:54:53 hell6 kernel: xfrm_lookup: newly created bundle is stale

And this is at the same time the _only_ time I see this.  All other times
I see something from the printks in the patch it's in line 1153.

So it seems that something leads to the above messages in line 1178,
i.e. a stale bundle, and from then on the IPsec connection can't be
recreated correctly anymore.
Comment 40 Dirk Mueller 2006-01-16 00:05:20 UTC
did anyone try the current kotd? there were plenty of ipsec related changes in there, also with NAT. 

Comment 41 Forgotten User ZhJd0F0L3x 2006-01-16 08:32:37 UTC
i had today a "system hangs at connection initiation" with 2.6.15-git8-20060113093923 i don't know if this is the same bug, though.
Comment 42 Christoph Thiel 2006-01-25 16:01:54 UTC
I just had a crash/freeze on SUSE Linux 10.1 (almost)Beta2, kernel: 2.6.16-rc1-git3-4-default.
Comment 43 Olaf Kirch 2006-01-25 16:07:56 UTC
Please provide additional info - was it a crash or a freeze? Any
kernel oops? NMI watchdog? ping still working? what is the hwinfo?
anything visible if you enable kernel logging to the console, and
switch to a VC before the crash?
Comment 44 Christoph Thiel 2006-01-25 16:17:15 UTC
It was a complete lockup => freeze. I didn't had any chance of getting additional data :( As this is happening on my laptop and I'm en route ATM, I couldn't try to ping the maschine.

Any suggestions how to gather data on this to solve this problem?
Comment 45 Olaf Kirch 2006-01-26 11:09:25 UTC
If you can reproduce this, please switch to a VC, and enable kernel
logging to the console and enable sysrq

	klogconsole -l8 -r0
	echo 1 >/proc/sys/kernel/sysrq

If possible, attach a serial console.

Then work from the console and do whatever it takes to trigger the
lockup. Do several sysrq-p's in a row to see where it's looping. Try
sysrq-t for a backtrace of all tasks (this will only be fun if you
have a serial console, though)

You may also want to try nmi_watchdog=1
Comment 46 Christoph Thiel 2006-01-26 15:46:51 UTC
Created attachment 65244 [details]
BUG: soft lockup detected...
Comment 47 Christoph Thiel 2006-01-26 15:47:24 UTC
Created attachment 65245 [details]
next try...
Comment 48 Christoph Thiel 2006-01-26 15:48:42 UTC
Created attachment 65248 [details]
sysrq-p
Comment 49 Michael Matz 2006-01-26 15:53:11 UTC
So it's still in the xfrm_* stuff.  I'm really reluctant to try any
kernel except the one containing Olafs debug patch, because I don't want
to risk freezing machines.
Comment 50 Christoph Thiel 2006-01-26 15:56:38 UTC
That's it for the moment. If you need futher information, just ask! I was running the 10.1 beta2 kernel, but have rebuilt that one with a km_wlan in prefer rpms. The mbuild job should still be there: zion-cthiel-10. (I'v copied that one to ~cthiel/Export/ as well.)
Comment 51 Olaf Kirch 2006-01-26 16:41:25 UTC
Indeed it looks like we're still looping in the xfrm stuff.

Christoph, can you please test a kernel with the debug patch from
attachment #59102 [details]?

If that cures your problem, we need to add some more instrumentation to
find out what's going on there.
Comment 52 Christoph Thiel 2006-01-26 17:40:43 UTC
Created attachment 65282 [details]
xfrm_policy.c.rej
Comment 53 Christoph Thiel 2006-01-26 17:41:29 UTC
Olaf, could you please provide a fixed patch?
Comment 54 Dirk Mueller 2006-01-26 17:42:58 UTC
modprobe crasher could help as well. afaik it does some slab corruption debug stuff. 

Comment 55 Olaf Kirch 2006-01-27 10:05:09 UTC
Created attachment 65369 [details]
Updated test patch

Please try this patch
Comment 56 Lars Müller 2006-01-27 10:22:40 UTC
Can we please add this workaround to our kernel packages?  I have to rebuild packages after every system update to betaX or stable.
Comment 57 Olaf Kirch 2006-01-27 10:30:42 UTC
This patch will pollute syslog with lots of useless chatter.

Let's have some test feedback, and fix it for real.
Comment 58 Lars Müller 2006-01-27 10:51:19 UTC
Yes, a freezing system is politer than a polluted syslog.  I'll buy a new laptop or use a different OS. :)
Comment 60 Lars Müller 2006-01-27 13:26:32 UTC
kernel-default-2.6.16_rc1_git3-5 from stable plus the patch mentioned in comment #55 built, installed and in use since good one hour.

I got again the 'Resource temporarily unavailable' and one klogd messaage:

lmuelle@gab:~> ping gray.suse.de
connect: Resource temporarily unavailable
lmuelle@gab:~> ping gray.suse.de
PING gray.suse.de (10.10.2.19) 56(84) bytes of data.
64 bytes from 10.10.2.19: icmp_seq=1 ttl=62 time=2.86 ms
64 bytes from 10.10.2.19: icmp_seq=2 ttl=62 time=7.00 ms
64 bytes from 10.10.2.19: icmp_seq=3 ttl=62 time=3.09 ms

--- gray.suse.de ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2001ms
rtt min/avg/max/mdev = 2.869/4.321/7.004/1.899 ms
lmuelle@gab:~>

Message from syslogd@gab at Fri Jan 27 14:11:36 2006 ...
gab klogd:  lock: e17309a0, .magic: dead4eac, .owner: <none>/-1, .owner_cpu: -1
Comment 61 Olaf Kirch 2006-01-27 14:31:00 UTC
Please provide the complete dmesg output. This looks like the spinlock debug
code triggered.
Comment 62 Lars Müller 2006-01-27 14:39:44 UTC
Created attachment 65417 [details]
dmesg output as requested with comment #61.
Comment 63 Christoph Thiel 2006-01-27 15:51:43 UTC
I'v been using RAS for like 2:30h now and didn't hit a crash yet. Syslog only has a lot of:

  klogd: xfrm_bundle_ok: 1210


I'll report back as soon as I get a crash.
Comment 64 Christoph Thiel 2006-01-27 22:04:05 UTC
So, I'v really tried hard to reproduce a crash, but didn't get one so far.

Here are some other messages that I'v seen:

Jan 27 22:58:09 t41p klogd: xfrm_bundle_ok: 1235
Jan 27 22:58:09 t41p klogd: xfrm_lookup: newly created bundle is stale
Jan 27 22:58:09 t41p klogd: xfrm_bundle_ok: 1235
Jan 27 22:58:09 t41p klogd: xfrm_lookup: newly created bundle is stale
Jan 27 22:58:09 t41p klogd: xfrm_bundle_ok: 1235
Jan 27 22:58:09 t41p klogd: xfrm_lookup: newly created bundle is stale
Jan 27 22:58:09 t41p klogd: xfrm_bundle_ok: 1235
Jan 27 22:58:09 t41p klogd: xfrm_lookup: newly created bundle is stale
Jan 27 22:58:09 t41p klogd: xfrm_bundle_ok: 1235
Jan 27 22:58:09 t41p klogd: xfrm_lookup: newly created bundle is stale
Jan 27 22:58:09 t41p klogd: xfrm_bundle_ok: 1235
Jan 27 22:58:09 t41p klogd: xfrm_lookup: newly created bundle is stale
Jan 27 22:58:09 t41p klogd: xfrm_lookup bailing out after 11 loops

So, it looks like the patch helps indeed.
Comment 65 Lars Müller 2006-01-28 12:06:58 UTC
I had a crash after one hour real remote work with the patch from comment #55.
Comment 66 Olaf Kirch 2006-01-30 12:05:37 UTC
Created attachment 65615 [details]
Updated test patch

Here's another patch for testing. If we manage to pin-point the problem
a little closer I will include a patch for this tonight so it gets into
the next beta.
Comment 67 Olaf Kirch 2006-01-30 16:24:33 UTC
I committed a patch to CVS that may fix this problem. Please try the
upcoming KOTD.
Comment 68 Lars Müller 2006-01-31 21:16:07 UTC
I run kernel-default-2.6.16_rc1_git3-20060130162402 atm.  And I didn't had a freezing system since more than three hours.

But:  I have agaiun and again stocking connections if I had been idle.
Comment 69 Olaf Kirch 2006-02-02 12:55:35 UTC
Please provide the kernel debug output. The fact that it works doesn't
help me create a fix for this problem.

And what is a stocking connection?
Comment 70 Lars Müller 2006-02-02 13:04:45 UTC
Argl. I intended to say 'stagnant' connection.  The result are hanging connections if the remote connection is idle for some time.
Comment 71 Lars Müller 2006-02-02 13:07:18 UTC
Created attachment 66176 [details]
/var/log/messages of the system
Comment 72 Lars Müller 2006-02-02 13:08:06 UTC
Anything missing?
Comment 73 Michael Matz 2006-02-02 15:33:48 UTC
Olaf, see comment #39 for a description of what is meant with the stagnant
connection.  The symptoms haven't changed.  Somewhen the connection just
stops delivering traffic (in difference to Lars I seldomly also notice this
_while_ doing something over ssh).  Then I have to wait some time, do
a ping to some remote IPsec'ed host, get the "temporarily unavailable" message
repeat the ping some times, and then it's working again.  Including the
ssh connection which was stuck, and could _not_ be revived by just generating
more traffic (i.e. pressing more keys).
Comment 74 Forgotten User ZhJd0F0L3x 2006-02-04 17:08:16 UTC
yesterday i had this in the log with the Beta3 kernel (i have removed the ipv6.ko module):
Feb  3 18:11:23 strolchi klogd: xfrm_tmpl_resolve says EAGAIN, try again
Feb  3 18:11:23 strolchi klogd: xfrm_lookup: IPv4 route is stale (obsolete=2, loops=4)
Feb  3 18:11:23 strolchi klogd:  [<c0279deb>] xfrm_lookup+0x403/0x481
Feb  3 18:11:23 strolchi klogd:  [<c02517ac>] ip_output+0x1f4/0x240
Feb  3 18:11:23 strolchi klogd:  [<c01158f2>] default_wake_function+0x0/0xc
Feb  3 18:11:23 strolchi klogd:  [<c0278a09>] xfrm4_output_finish+0x8a/0x36a
Feb  3 18:11:23 strolchi klogd:  [<c024a244>] ip_route_output_flow+0x77/0x8b
Feb  3 18:11:23 strolchi klogd:  [<c0265ee3>] raw_sendmsg+0x2e6/0x70f
Feb  3 18:11:23 strolchi klogd:  [<c0265a89>] raw_recvmsg+0x6e/0x171
Feb  3 18:11:23 strolchi klogd:  [<c026cbf2>] inet_sendmsg+0x35/0x3f
Feb  3 18:11:23 strolchi klogd:  [<c022b9ce>] sock_sendmsg+0xd2/0xec
Feb  3 18:11:23 strolchi klogd:  [<c0127a03>] autoremove_wake_function+0x0/0x2d
Feb  3 18:11:23 strolchi klogd:  [<c0127a03>] autoremove_wake_function+0x0/0x2d
Feb  3 18:11:23 strolchi klogd:  [<c014bcce>] cache_alloc_debugcheck_after+0xb8/0xea
Feb  3 18:11:23 strolchi klogd:  [<c01addf7>] copy_from_user+0x54/0x80
Feb  3 18:11:24 strolchi klogd:  [<c0231626>] verify_iovec+0x3e/0x70
Feb  3 18:11:24 strolchi klogd:  [<c022bdf9>] sys_sendmsg+0x186/0x1df
Feb  3 18:11:24 strolchi klogd:  [<c022c662>] sys_recvmsg+0x1a1/0x1b5
Feb  3 18:11:24 strolchi klogd:  [<c013962c>] get_page_from_freelist+0x6f/0x2c4
Feb  3 18:11:24 strolchi klogd:  [<c01f70b3>] pty_write+0x2b/0x35
Feb  3 18:11:24 strolchi klogd:  [<c0282add>] _spin_unlock_irq+0x5/0x7
Feb  3 18:11:24 strolchi klogd:  [<c02818bf>] schedule+0x499/0x4f7
Feb  3 18:11:24 strolchi klogd:  [<c014bcce>] cache_alloc_debugcheck_after+0xb8/0xea
Feb  3 18:11:24 strolchi klogd:  [<c022ce26>] sys_socketcall+0x183/0x19e
Feb  3 18:11:24 strolchi klogd:  [<c0102a99>] syscall_call+0x7/0xb
Comment 75 Forgotten User ZhJd0F0L3x 2006-02-04 17:12:23 UTC
there is more than one of those in the logs. If you need them all, just tell me.
Comment 76 Olaf Kirch 2006-02-05 11:16:54 UTC
That was exactly what I was hoping for. Did you experience
any hangs?
Comment 77 Christoph Thiel 2006-02-05 12:17:21 UTC
I'v seen similar messages like seife:

e.g.

Feb  3 14:58:32 t41p klogd: xfrm_lookup: IPv4 route is stale (obsolete=2, loops=4)
[...]
Feb  3 15:48:18 t41p klogd: xfrm_lookup: IPv4 route is stale (obsolete=2, loops=4)
[...]
Feb  3 16:37:23 t41p klogd: xfrm_lookup: IPv4 route is stale (obsolete=2, loops=4)

Luckily I didn't had a single crash after the latests patch (that has been in the beta3 kernel). [But the usual "hangs", that Matz and Lars were talking about earlier. However, I think they are unrelated to this problem!]
Comment 78 Forgotten User ZhJd0F0L3x 2006-02-05 13:13:39 UTC
(In reply to comment #76)
> That was exactly what I was hoping for. Did you experience
> any hangs?

no. I just noticed the "oops" scrolling by and then had a closer look.
Do you need the other backtraces, too?

Christoph, with the beta3 kernel, you get similar messages if you (try to) use ipv6, this is why i removed the ipv6 module (well, to get name lookup fast to be honest, but it also got rid of the lots of backtraces)

I also think the hanging connections and spurious "resource temporarily not available" are a different problem. The tunnels are only created on-demand and the first connection always fails this way; maybe they are also tore down after a timeout and need to be re-established => resource temporarily not available?
Comment 79 Michael Matz 2006-02-06 01:21:17 UTC
If or if not the hanging connections are a different problem is not clear,
I wouldn't jump to conclusions yet.  Also with 'plutowait=yes' the connections
aren't started on-demand, but right when starting ipsec (then also the very
first ping to keep the connection open works).  Additionally there is
continuous traffic over the tunnels (and if only the 30-second-interval ping),
so there's nothing to timeout.  _Except_ if there is a bug, which I could
very well imagine being the same as the one which in the end also leads to
the crashes (the message above seems to indicate that something got stale,
i.e. unused/old in some way, which without the patch lead to crashes, and with
it to hanging connections).

It's perhaps also interesting to know that once such connection hangs I can
revive it by starting another ping, as already written.  But it seems that if
I don't do that soon enough (e.g. when I don't notice the connection is
hanging, overnight for instance) then not even that helps, and I have to
restart the whole ipsec to get out of the stuck situation.
Comment 80 Andreas Schwab 2006-02-06 10:07:29 UTC
(In reply to comment #79)
> But it seems that if
> I don't do that soon enough (e.g. when I don't notice the connection is
> hanging, overnight for instance) then not even that helps, and I have to
> restart the whole ipsec to get out of the stuck situation.

Which is most likely due to the RAS timeout.

Comment 81 Lars Müller 2006-02-15 14:58:39 UTC
I had a system freeze again with 2.6.16-rc2-git2-6-default kernel after a full working day using our RAS.
Comment 82 Christoph Thiel 2006-02-15 15:01:26 UTC
Are you sure that this was really an IPsec related crash/freeze?
Comment 83 Lars Müller 2006-02-15 15:19:19 UTC
Same symptoms as before and no problem at all while running the system without IPsec.
Comment 84 Olaf Kirch 2006-02-16 10:19:18 UTC
Hm. Please hook up your system to a serial console, and when the crash
happens next time, please try Sysrq-p (several times) and Sysrq-t and attach
the output to this bug.
Comment 85 Olaf Kirch 2006-02-16 15:54:30 UTC
*** Bug 115200 has been marked as a duplicate of this bug. ***
Comment 86 Olaf Kirch 2006-02-16 16:19:21 UTC
BTW I think the remaining problem is something entirely different.

I'm downgrading to major because this happens a lot less often
Comment 87 Marius Tomaschewski 2006-02-27 13:00:51 UTC
I've tested again using unpatched 10.0 kernel (kernel-smp-2.6.13-15.8)
and with a kernel containing your latest patch (attachmend id=65615)
using a NAT-T configuration.

As expected, the unpatched kernel causes a system freeze and sysrq-t/p
does not print anything unless I use sysrq-i (Terminate All Tasks).
After this, the system reacts again to sysrq-t/p. Let me know if you
want to get the logs captured on serial console.


The kernel containing your last patch seems to work fine since ca. 36
hours without a system freeze, even though I've got some Oopses while
the system was booting.

The "Resource temporarily unavailable" from ping and "xfrm_tmpl_resolve
says EAGAIN, try again" on the console happens somethimes while rekeying
takes place. Also if my external/dynamic IP changes and the other ipsec
side's filter blocks the traffic from the new one. After a reauth&restart
of IPSec (so the other side updates its ip filter) the tunnel works fine
again.

I've also a full serial console capture of this session, that I'll attach
to this bug.
Comment 88 Marius Tomaschewski 2006-02-27 13:02:50 UTC
Created attachment 70394 [details]
serial console capture containing some Oopses while system is booting

The kernel used here contains patch #65615.
Comment 89 Marius Tomaschewski 2006-02-27 13:07:43 UTC
Created attachment 70395 [details]
serial console capture with sysrq-t/p output while system freeze

Uses unpatched kernel-smp-2.6.13-15.8
Comment 90 Olaf Kirch 2006-02-27 13:17:17 UTC
Marius, this isn't really the latest patch - this seems to be the bad
patch that went into beta3. Please grab the latest patch from CVS HEAD
if you want to give it a try.

Lars any news on IPsec capturing sysrq info via serial console?
Comment 91 Marius Tomaschewski 2006-02-27 14:13:05 UTC
OK, I'll rebuild the kernel using the patch from CVS HEAD and retry
(this evening).
Comment 92 Marius Tomaschewski 2006-02-28 18:03:34 UTC
I'm using a 10.0 kernel-smp-2.6.13-15.9 + xfrm-endless-loop.diff from CVS
HEAD since ca. 24 hours (one reconnect because my external IP has changed)
and it seems to work fine.

Everytime pluto replaces the ipsec SA, the xfrm_lookup bailing out code is
executed, e.g.:

Feb 28 17:32:23 exodus pluto[8307]: "mt-suse" #43: IPsec SA expired (LATEST!)
Feb 28 17:32:27 exodus pluto[8307]: initiate on demand from 10.204.0.12:0 to 10.0.9.7:0 proto=0 state: fos_start because: acquire
Feb 28 17:32:27 exodus pluto[8307]: "mt-suse" #48: initiating Quick Mode RSASIG+ENCRYPT+TUNNEL+PFS+UP {using isakmp#44}
Feb 28 17:32:27 exodus pluto[8307]: "mt-suse" #48: transition from state STATE_QUICK_I1 to state STATE_QUICK_I2
Feb 28 17:32:27 exodus pluto[8307]: "mt-suse" #48: STATE_QUICK_I2: sent QI2, IPsec SA established  {ESP=>0x81661bee <0xe1b01b5f xfrm=AES_0-HMAC_SHA1 NATD=195.135.221.4:4500 DPD=none}
Feb 28 17:32:57 exodus kernel: xfrm_lookup bailing out after 11 loops
Feb 28 17:32:57 exodus kernel:  [<c03257df>] xfrm_lookup+0x2af/0x4a0
Feb 28 17:32:57 exodus kernel:  [<c0120cd0>] default_wake_function+0x0/0x10
Feb 28 17:32:57 exodus kernel:  [<c030b4de>] raw_sendmsg+0x17e/0x420
Feb 28 17:32:57 exodus kernel:  [<c0314b0a>] inet_sendmsg+0x3a/0x70
Feb 28 17:32:57 exodus kernel:  [<c02c0742>] sock_sendmsg+0xd2/0x150
Feb 28 17:32:57 exodus kernel:  [<c013c040>] autoremove_wake_function+0x0/0x50
Feb 28 17:32:57 exodus kernel:  [<c01068d9>] default_do_nmi+0x79/0x180
Feb 28 17:32:57 exodus kernel:  [<c013c040>] autoremove_wake_function+0x0/0x50
Feb 28 17:32:57 exodus kernel:  [<c02c2525>] sys_sendmsg+0x155/0x270
Feb 28 17:32:57 exodus kernel:  [<c0344a49>] schedule+0x5a9/0xc00
Feb 28 17:32:57 exodus kernel:  [<c0344a77>] schedule+0x5d7/0xc00
Feb 28 17:32:57 exodus kernel:  [<c012f370>] lock_timer_base+0x20/0x50
Feb 28 17:32:57 exodus kernel:  [<c012f627>] del_timer_sync+0x27/0x30
Feb 28 17:32:57 exodus kernel:  [<c02c1192>] sock_poll+0x12/0x20
Feb 28 17:32:57 exodus kernel:  [<c0185907>] do_pollfd+0x57/0xb0
Feb 28 17:32:57 exodus kernel:  [<c02c2afc>] sys_socketcall+0x27c/0x2a0
Feb 28 17:32:57 exodus kernel:  [<c010983f>] do_gettimeofday+0x1f/0xd0
Feb 28 17:32:57 exodus kernel:  [<c012ab08>] sys_gettimeofday+0x28/0x80
Feb 28 17:32:57 exodus kernel:  [<c01042ab>] sysenter_past_esp+0x54/0x79
Comment 93 Marius Tomaschewski 2006-02-28 18:06:17 UTC
BTW: the above kernel RPMs are in ~mt/Export/openswan/xfrm-endless-loop/
Comment 94 Marius Tomaschewski 2006-03-03 13:21:29 UTC
It still works on my 10.0 server - uptime: 2:18pm  an 3 days 18:20

Thanks Olaf!
Comment 95 Olaf Kirch 2006-03-03 14:12:10 UTC
Fine. I will downgrade the bug severity for now. I am not entirely happy
yet because the bailout message still triggers. This means there's still
something fishy going on.

I'm assigning this to Jiri Bohac - Jiri, maybe you could try to find out
why it's still looping in xfrm_lookup?
Comment 96 Jiri Bohac 2006-03-21 20:04:19 UTC
Olaf,

this part of your patch is upstream since linux-2.6.16-rc3:

- goto restart;
+
+ err = -EHOSTUNREACH;
+ goto error;

This was what caused the lockup. Now, there is only one "goto restart" left, which is called after sleeping on the km_waitq wait queue. This wait queue is woken from many places that touch the xfrm states.

The purpose of the loop is clearly to wait until a valid bundle is available and the number of iterations may be large, because the wait queue may be woken up many times by completely unrelated events.

I think that looping there 10 times does not mean anything is wrong and I'd just delete the bailout check/message. But if you think that this still may be an endless loop (which would not lockup the system because of the schedule() call), I suggest changing the number of iterations to something like 1000 instead of 10. Than we could wait if anyone ever sees the message.

Does this make sense?
Comment 97 Olaf Kirch 2006-03-23 11:15:15 UTC
If the number of wake-ups is really that large, then it may make more sense
to do something like this:

	unsigned long timeout = jiffies + 60 * HZ;

again:
	if (time_after(jiffies, timeout)) {
		complain and bail out
	}

Feel free to modify the patch either way.
Comment 98 Jiri Bohac 2006-03-28 13:32:17 UTC
Good idea. I updated the patch with the 60 second timeout instead of the maximum of 10 loop iterations. 

If anyone sees the message "xfrm_lookup bailing out after 60 seconds and %d loops"      in their syslog, please let me know. I still haven't reproduced it myself. 
Comment 99 Jiri Bohac 2006-04-20 13:51:21 UTC
I have not heard from anyone reporting about the above message in syslog. Closing for now. 

If anyone sees the message "xfrm_lookup bailing out after 60 seconds and %d
loops" in their syslog, please reopen.
Comment 100 Marius Tomaschewski 2006-06-01 10:53:35 UTC
Created attachment 86502 [details]
updated SUSE 10.0 patch
Comment 101 Marius Tomaschewski 2006-06-01 10:56:52 UTC
aj: what's about SUSE 10.0 kernel - the problem ocurrs there as well...
Comment 102 Andreas Jaeger 2006-06-01 11:36:20 UTC
Please add to our kernel CVS for 10.0 so that it can go out with the next security update.
Comment 103 Marius Tomaschewski 2006-06-02 12:10:25 UTC
Using the actual patch on a __10.0__ kernel, I got the bailing out
message once in last 24 hours (bellow).

With the old patch I seen them 3 - 6 times a day, that is on every
(re)establish. In both cases no crash happened.

Jun  1 18:03:25 exodus pluto[4191]: "mt-suse" #20: IPsec SA expired (LATEST!)
Jun  1 18:03:28 exodus pluto[4191]: initiate on demand from 10.204.0.12:0 ...
Jun  1 18:03:29 exodus pluto[4191]: "mt-suse" #27: transition from state
                                    STATE_QUICK_I1 to state STATE_QUICK_I2
Jun  1 18:03:29 exodus pluto[4191]: "mt-suse" #27: STATE_QUICK_I2: sent QI2,                
                                    IPsec SA established {...NATD...}

Jun  1 18:04:28 exodus kernel: 
xfrm_lookup bailing out after 60 seconds and 0 loops
 [<c032584c>] xfrm_lookup+0x2ac/0x490
 [<c0120cd0>] default_wake_function+0x0/0x10
 [<c030b54e>] raw_sendmsg+0x17e/0x420
 [<c0314b7a>] inet_sendmsg+0x3a/0x70
 [<c02c07a2>] sock_sendmsg+0xd2/0x150
 [<c013c010>] autoremove_wake_function+0x0/0x50
 [<c013c010>] autoremove_wake_function+0x0/0x50
 [<c012f9e2>] update_wall_time+0x12/0x50
 [<c012fd63>] do_timer+0x53/0xf0
 [<c02c2585>] sys_sendmsg+0x155/0x270
 [<c0344aa9>] schedule+0x5a9/0xc00
 [<c0344ad7>] schedule+0x5d7/0xc00
 [<c014e686>] __do_IRQ+0x106/0x150
 [<c012f370>] lock_timer_base+0x20/0x50
 [<c012f627>] del_timer_sync+0x27/0x30
 [<c02c11f2>] sock_poll+0x12/0x20
 [<c01858d7>] do_pollfd+0x57/0xb0
 [<c02c2b5c>] sys_socketcall+0x27c/0x2a0
 [<c010987f>] do_gettimeofday+0x1f/0xd0
 [<c012ab08>] sys_gettimeofday+0x28/0x80
 [<c01042cb>] sysenter_past_esp+0x54/0x79
Comment 104 Jiri Bohac 2006-06-06 16:01:59 UTC
Created attachment 87491 [details]
proper patch for the 10.0 kernel

This patch includes also includes the part that was missing in #100.
Comment 105 Jiri Bohac 2006-06-06 16:09:44 UTC
I fixed the patches.fixes/xfrm-endless-loop in HEAD, because the loop count for debugging message was not properly incremented (eeek!). So if this happens again with the SLES10 kernel, the debugging message will be more useful.

The patch from #100 does not fix the problem for 2.6.13 on its own. The more important part of the fix was meged upstream before 2.6.16. It is included in the patch from #104

Before we apply this to the 10.0 kernel as a security update, it would be good to have some testing. It is _probably_ safe, as it seems to work well in HEAD, but it would be nice to actually try to use it with 10.0. Marius, you're saying that you have been using the 10.0 kernel and you got the bailout messages. Do you think you can try with this patch to be sure it doesn't break anything?

Not commiting to CVS yet...
Comment 106 Dirk Mueller 2006-06-06 17:14:53 UTC
I'm using the STABLE kernel and still see connection hangs pretty regularly (about every 10  minutes). wasn't this supposed to get solved by this patch?
Comment 107 Jiri Bohac 2006-06-06 18:11:27 UTC
Dirk:
Well, this patch solves a hard lockup in the first place. If instead of the lockups the connection hangs for a while, this means that there might be a different bug somewhere else. Including the userspace IKE daemon.

Is it that the ipsec connection stops working for a short while and then starts again, without you doing anything?
How long does your connection remain hung?

Anything like "xfrm_lookup bailing out after 60 seconds..." in dmesg?
Are you using racoon or openswan for IKE? If it is openswan, can you please set plutodebug=all in ipsec.conf and attach piece of /var/log/messages around the time one of the hangs occurs?
If you're using racoon, please do the same, but put "log debug2" in racoon.conf.
Comment 108 Marius Tomaschewski 2006-06-06 19:35:18 UTC
(In reply to comment #105)
> I fixed the patches.fixes/xfrm-endless-loop in HEAD, because the loop count
> for debugging message was not properly incremented (eeek!). So if this
> happens again with the SLES10 kernel, the debugging message will be more
> useful.

Yes, it helped against the crashes / freezes (I guess because of the bailout
every 60 secs), but the ipsec connection got lost in many cases...

> The patch from #100 does not fix the problem for 2.6.13 on its own. The more
> important part of the fix was meged upstream before 2.6.16. It is included in
> the patch from #104

OK, thanks!

> Before we apply this to the 10.0 kernel as a security update, it would be good
> to have some testing. It is _probably_ safe, as it seems to work well in HEAD,
> but it would be nice to actually try to use it with 10.0. Marius, you're
> saying that you have been using the 10.0 kernel and you got the bailout
> messages. Do you think you can try with this patch to be sure it doesn't
> break anything?

Yes, of course - I'm going to use the #104 patch on my 2.6.13 machine now.
Comment 109 Marius Tomaschewski 2006-06-08 09:30:42 UTC
I'm running the #104 patch since 24 hours on my productive 10.0 gateway
(I'm working over it) and there is no machine lockup, no ipsec connection
drop occured, as well as no "xfrm_lookup bailing out" messages in the time.

The 10.0 (smp) kernel is in ~mt/export/openswan/xfrm-endless-loop/.
Comment 110 Marius Tomaschewski 2006-06-14 13:05:14 UTC
The last "xfrm_lookup bailing out after 11 loops" was on Jun 7 05:56:23;
no crash since 7 days 5:41 using the 10.0 smp kernel + patch from #104.

Sometimes, the (ssh) connection over the tunnel does not response for few
seconds (I guess while the rekeying is done), but continues then.
Comment 111 Marius Tomaschewski 2006-06-14 13:08:41 UTC
This means, the bailing out does not happen any more with patch from #104.
reboot   system boot  2.6.13-15.10-smp Wed Jun  7 09:19   (7+05:47)
Comment 112 Jiri Bohac 2006-06-14 15:33:59 UTC
OK, commited the patch from #104 to the 10.0 cvs branch.

Marking as FIXED for now; if anyone sees the bailout message, please reopen.