Bugzilla – Full Text Bug Listing
Description
Lars Müller
2005-09-19 11:33:42 UTC
Nothing in the logs. sysrq doesn't help.

Please enable logging to console (klogconsole -l8 -r0), switch to a virtual console and wait for it to happen.

I followed the instructions from comment #2 and the system froze again. But nothing on the console.

Okay, next try: please enable nmi_watchdog in addition and see if something shows up on the console.

With it set to 1 and 2 the NMI count in /proc/interrupts is zero. The CPU is an Intel(R) Pentium(R) III Mobile 1200MHz. Anything I can do? I'm not in the office atm and need a working IPsec.

I'm currently at a loss on how to debug this. Do you have a machine where the NMI watchdog works?

Well, there goes my theory that it might be x86-64 only. What kind of nic do you have, Lars? I'm on a via rhine.

To comment #7: Only the two systems of my girlfriend. And testing with these systems might be too risky for my life.

To comment #8:
lmuelle@gab:~> /sbin/lspci | grep -i ether
02:00.0 Ethernet controller: 3Com Corporation 3c905C-TX/TX-M [Tornado] (rev 78)

I've updated my kernel to 2.6.14-rc4-20051012095427-default from kotd. After booting this kernel and starting SuSE RAS/IPsec I no longer have the system freeze. From time to time I have a non-working IPsec connection. Sometimes the connection recovers and sometimes not. Then I have to restart it.

Created attachment 54633 [details]
Compressed /var/log/messages of the system with kernel 2.6.14-rc4-20051012095427-default
Now I have again such an unrecoverable situation.

lmuelle@gab:~> ping gray.suse.de
connect: Resource temporarily unavailable

Created attachment 54650 [details]
Output of ip xfrm policy list at the moment when I got 'connect: Resource temporarily unavailable'.
Created attachment 54652 [details]
Output of ip xfrm policy list when everything is fine.
While trying to do some remote work I have the same problem again, even with this kernel version.

*** Bug 134731 has been marked as a duplicate of this bug. ***

I believe this is actually a duplicate of 115200, which already contains a lot more information. If you all agree, please resolve as duplicate.

Ah, I also experience the issues now but currently can't add more information than was already given.

As suggested by Olaf on my bug 134731 (now a dup of this one) I tried various things:
1) nmi_watchdog=1: doesn't help, so the CPU is not locked in that way.
2) working only from the text console so that I can look at sysrq: once the machine is locked, sysrq itself still works, but it can't write any files anymore, hence any sysrq output of more than 30 lines is lost for me (because I also have no serial console :-( ), so sysrq-t is useless (I see only a few tasks, and they simply sleep in read or similar harmless syscalls). But the interesting thing is that sysrq-p (show Pc) shows it spinning in the process "ping". At that time only one ping is running, namely the one to keep the ipsec tunnel open (which btw doesn't seem to work that well, because sometimes the tunnel goes sort of down anyway, and I have to do another ping, which then triggers a restart of the tunnel, and all continues to work). So somehow the period of the normal ping (30 seconds) isn't enough to keep the tunnel working. I have the _feeling_ (and it's only that) that the lockups happen when the tunnel is down in this way and in the process of being retriggered, because the last log message in /var/log/messages when such a lockup happens is about the ipsec SA being expired (I have to verify whether this is really the case every time). So, anyway, if the machine is locked up, sysrq-p points at ping running heavily. I've noted the backtraces on a sheet of paper, which I forgot at home, so I will enter them some other time. But it was spinning in functions having xfrm_ in their names.
3) tried two ways of crash dumping, to no avail. The lkcd stuff doesn't seem to be supported with the 10.0 kernel. And the kexec-based things I also didn't get to work. I had to compile the kexec-utils myself, because there is no rpm on 10.0, and as it needs a normal vmlinux (unpacked), I had to compile my own kernel as the secondary kernel, which is necessary anyway because of the necessary adjustment to the load address. Once I did that, and kexec-utils finally loaded that kernel, sysrq-c (forced kexec) just locked the machine. Yeah, that's it. So still no really useful information :-(

I've updated my system to 10.1 post Alpha 3 and still have the crashes. Marius and I also did several tests before with openswan 2.4.4 and kotd. I now move this to critical as it is not only my bug and as it is in the 10.1 tree too.

Could someone consider whether this is a duplicate of bug #115200?

We're getting somewhere, thanks to Michael's debugging help. Michael, can you add the sysrq-p backtrace? That would be extremely helpful. Concerning sysrq-t: as long as you do not switch consoles, you can scroll up using Shift-PageUp for a few pages' worth of data. The scrollback buffer is clobbered as soon as you switch consoles, though.

No, I can't scroll back, because the keyboard doesn't react to anything, not even to Shift-PgUp/PgDown. It only reacts to sysrq-*. Well, at least the _effect_ is that it doesn't react. It might of course also just be that the scrolling of the console doesn't work in that state anymore.

So, here come four different traces of sysrq-p when the machine is
frozen. I should perhaps add that _all_ of these backtraces were in ping
once the machine is frozen, whereas usually (when not frozen) most of these
sysrq-p traces are in the swapper task (IIRC). Also sysrq-t shows at the
end a list of running (R) tasks; ping was included in all of them as well.
Anyway, here are the traces, from my sheet of paper, so hopefully all function
names are correct, or at least identifiable. I'll write them in opposite
entry order, i.e. first the caller, then the callee. First the common
part for all traces:
syscall_call
sys_gettimeofday
sys_socketcall
do_pollfd
sock_poll
schedule
__alloc_pages
buffered_runqueue (unsure, can't read it fully on the paper, definitely
buffered_run...)
sys_fadvise64_64
sys_sendmsg
verify_iovec
autoremove_wake_function
soft_cursor
sock_sendmsg
inet_sendmsg
raw_sendmsg
default_wake_function
Now the four rests of the backtraces, which just have to be added to the
above list to form the full backtraces:
1)
xfrm_lookup
xfrm_tmpl_resolve
xfrm_state_find
2)
xfrm_lookup
xfrm_find_bundle
__xfrm4_find_bundle
local_bh_enable
3)
xfrm_lookup
stale_bundle
xfrm_bundle_create
xfrm_bundle_ok
4)
xfrm_policy_lookup
xfrm_lookup
flow_cache_lookup
flow_hash_code
Yes, the first three have xfrm_lookup in the first position, the fourth
in the second.
I find the traces a bit strange, as I can't find a reason why
sys_gettimeofday should do a socketcall, or pollfd. So perhaps the start
of the trace is inconsistent, or contains old data. But this trace does
happen all the time when the machine is frozen (tested at two different
freezes), and changes only in the last four to six frames.
Ah! Useful intelligence - thanks a lot! There ought to be an ip_route_output_flow call between raw_sendmsg and xfrm_lookup - this is how we end up in the xfrm code. It seems we're looping in xfrm_lookup because the newly created xfrm bundle is deemed stale. Can you please try the attached test patch? If my theory is right, the kernel should not lock up anymore, and print a bunch of diagnostics instead.

Created attachment 59102 [details]
Test Patch
Built kernels for 10.0 and stable are at http://w3.suse.de/~lmuelle/kernel/

Adding Andreas, as the patch mentioned in comment #27 might concern him too.

While testing the patched kernel I got:
Nov 30 17:39:09 gab kernel: xfrm_bundle_ok: 1151
Nov 30 17:48:46 gab kernel: xfrm_bundle_ok: 1151
Nov 30 17:48:54 gab last message repeated 3 times
Nov 30 17:58:46 gab kernel: xfrm_bundle_ok: 1151
Nov 30 17:58:54 gab last message repeated 5 times
Nov 30 18:08:46 gab kernel: xfrm_bundle_ok: 1151
Nov 30 18:08:54 gab last message repeated 7 times

Created attachment 59285 [details]
/var/log/messages
System is now up for around an hour.
My running rsync got stuck once in between. At that moment a ping to an internal host succeeded and the rsync process continued.
Four hours and 2 GB of received data without a crash.

I've used RAS the whole evening yesterday (no mass transfers but continuous
working over ssh) and got no lockups, so ... YEAH!!! ... ;-)
Like Lars I also get occasionally stuck connections, which I also got before,
and where I conjectured that these may be the points in time when lockups
actually occur. Now I have such things in my messages:
Dec 1 14:08:50 hell6 pluto[9403]: "matz-suse" #48: IPsec SA expired (LATEST!)
Dec 1 14:09:15 hell6 pluto[9403]: initiate on demand from 10.204.0.110:0 to 10.0.9.7:0 proto=0 state: fos_start because: acquire
Dec 1 14:09:15 hell6 pluto[9403]: "matz-suse" #53: initiating Quick Mode RSASIG+ENCRYPT+TUNNEL+PFS+UP {using isakmp#49}
Dec 1 14:09:15 hell6 pluto[9403]: "matz-suse" #53: transition from state STATE_QUICK_I1 to state STATE_QUICK_I2
Dec 1 14:09:15 hell6 pluto[9403]: "matz-suse" #53: STATE_QUICK_I2: sent QI2, IPsec SA established {ESP=>0xb16fac6b <0xa100f7b7 xfrm=AES_0-HMAC_SHA1 NATD=195.135.221.4:4500 DPD=none}
Dec 1 14:09:45 hell6 kernel: xfrm_bundle_ok: 1153
Dec 1 14:19:45 hell6 kernel: xfrm_bundle_ok: 1153
Dec 1 14:19:45 hell6 kernel: xfrm_bundle_ok: 1153
Dec 1 14:29:46 hell6 kernel: xfrm_bundle_ok: 1153
Dec 1 14:29:46 hell6 last message repeated 2 times
Dec 1 14:39:46 hell6 kernel: xfrm_bundle_ok: 1153
Dec 1 14:39:46 hell6 last message repeated 3 times
Dec 1 14:49:46 hell6 kernel: xfrm_bundle_ok: 1153
Dec 1 14:49:46 hell6 last message repeated 4 times
Dec 1 14:54:44 hell6 pluto[9403]: "matz-novell1" #54: initiating Main Mode to replace #49
This happened quite frequently, every time after the STATE_QUICK_I2 message
that follows the "SA expired" message, which occurs roughly every 50 minutes.
I've left it running idly overnight with IPsec active (formerly there was
a high chance that it would lock up overnight), and it still runs. I'm just
looking at it from the office, and it hasn't locked up, but RAS also
isn't working anymore; I only get "Resource temporarily unavailable" from
pings, without the connection ever being established. In the log I see:
Dec 1 16:24:00 hell6 pluto[9403]: "matz-suse" #104: initiating Main Mode
Dec 1 16:24:00 hell6 pluto[9403]: "matz-suse" #104: received Vendor ID payload[draft-ietf-ipsec-nat-t-ike-03] method set to=108
Dec 1 16:24:00 hell6 pluto[9403]: "matz-suse" #104: enabling possible NAT-traversal with method RFC 3947 (NAT-Traversal)
Dec 1 16:24:00 hell6 pluto[9403]: "matz-suse" #104: transition from state STATE_MAIN_I1 to state STATE_MAIN_I2
Dec 1 16:24:00 hell6 pluto[9403]: "matz-suse" #104: STATE_MAIN_I2: sent MI2, expecting MR2
Dec 1 16:24:00 hell6 pluto[9403]: "matz-suse" #104: I did not send a certificate because I do not have one.
Dec 1 16:24:00 hell6 pluto[9403]: "matz-suse" #104: NAT-Traversal: Result using draft-ietf-ipsec-nat-t-ike-02/03: i am NATed
Dec 1 16:24:00 hell6 pluto[9403]: "matz-suse" #104: transition from state STATE_MAIN_I2 to state STATE_MAIN_I3
Dec 1 16:24:00 hell6 pluto[9403]: "matz-suse" #104: STATE_MAIN_I3: sent MI3, expecting MR3
Dec 1 16:24:00 hell6 pluto[9403]: "matz-suse" #104: ignoring informational payload, type INVALID_ID_INFORMATION
Dec 1 16:24:00 hell6 pluto[9403]: "matz-suse" #104: received and ignored informational message
Dec 1 16:24:10 hell6 pluto[9403]: "matz-suse" #104: discarding duplicate packet; already STATE_MAIN_I3
Dec 1 16:24:10 hell6 pluto[9403]: "matz-suse" #104: ignoring informational payload, type INVALID_ID_INFORMATION
Dec 1 16:24:10 hell6 pluto[9403]: "matz-suse" #104: received and ignored informational message
Dec 1 16:24:30 hell6 pluto[9403]: "matz-suse" #104: discarding duplicate packet; already STATE_MAIN_I3
Dec 1 16:24:30 hell6 pluto[9403]: "matz-suse" #104: ignoring informational payload, type INVALID_ID_INFORMATION
Dec 1 16:24:30 hell6 pluto[9403]: "matz-suse" #104: received and ignored informational message
Dec 1 16:24:46 hell6 pluto[9403]: initiate on demand from 10.204.0.110:0 to 10.0.9.7:0 proto=0 state: fos_start because: acquire
Dec 1 16:25:10 hell6 pluto[9403]: "matz-suse" #104: max number of retransmissions (2) reached STATE_MAIN_I3. Possible authentication failure: no acceptable response to our first encrypted message
The above block is repeated with increasing connection IDs. So it seems
that my machine can't authenticate against the server anymore. Just before
these failing auth attempts started I see this in my log:
Dec 1 14:55:46 hell6 kernel: xfrm_bundle_ok: 1178
Dec 1 14:55:46 hell6 kernel: xfrm_lookup: newly created bundle is stale
Dec 1 14:55:46 hell6 kernel: xfrm_bundle_ok: 1178
Dec 1 14:55:46 hell6 kernel: xfrm_lookup: newly created bundle is stale
Dec 1 14:55:46 hell6 kernel: xfrm_bundle_ok: 1178
Dec 1 14:55:46 hell6 kernel: xfrm_lookup: newly created bundle is stale
Dec 1 14:55:46 hell6 kernel: xfrm_bundle_ok: 1178
Dec 1 14:55:46 hell6 kernel: xfrm_lookup: newly created bundle is stale
Dec 1 14:55:46 hell6 kernel: xfrm_bundle_ok: 1178
Dec 1 14:55:46 hell6 kernel: xfrm_lookup: newly created bundle is stale
Dec 1 14:55:46 hell6 kernel: xfrm_bundle_ok: 1178
Dec 1 14:55:46 hell6 kernel: xfrm_lookup: newly created bundle is stale
So, this is a different __LINE__ than the above messages from the patch,
and of course we have a new stale bundle. OTOH this occurred two hours
earlier already, and thereafter the connection still worked and was able
to authenticate. So the mere existence of a stale bundle is not the
reason for failing auth. But there might still be a connection.
Many many thanks for your theory testing patch, finally I can work overnight
again comfortably ;-)
Well, I'm happy it works for you... but it's still a bit strange. You said it usually locks up after 10 minutes. This matches exactly the intervals between the xfrm_bundle_ok messages (10 mins). So congrats, your bio clock is highly accurate! But - if we were looping where I think we would loop, you would see 10 messages in a row saying "newly created bundle is stale". But there are just one or two. AND you're seeing this message only when RAS dies! The fact that we hit xfrm_bundle_ok/line 1153 means that the dst cache entry we're looking at has been marked for destruction by someone: dst->obsolete is set, which happens if someone calls dst_free() on it while others still hold a reference to it. Maybe pluto is clearing out some policy entries at 10-minute intervals. At any rate, this needs a new patch with a few more printk's.

My experience is that the lockups only happen with NAT-T.

At all my locations I use NAT-T too.

i am seeing this (~lmuelle-kernel):
Dec 2 22:41:20 susi kernel: xfrm_bundle_ok: 1153
long after i stopped pppd and racoon:
Dec 2 20:41:22 susi pppd[9104]: Terminating on signal 15
Dec 2 20:41:22 susi pppd[9104]: Connect time 71.8 minutes.
Dec 2 20:41:22 susi pppd[9104]: Sent 713396 bytes, received 1505904 bytes.
Dec 2 20:41:22 susi kernel: xfrm_bundle_ok: 1153
Dec 2 20:41:22 susi kernel: xfrm_bundle_ok: 1153
Dec 2 20:41:22 susi kernel: xfrm_bundle_ok: 1153
Dec 2 20:41:22 susi kernel: xfrm_bundle_ok: 1153
Dec 2 20:41:22 susi kernel: xfrm_bundle_ok: 1153
Dec 2 20:41:22 susi kernel: xfrm_bundle_ok: 1153
Dec 2 20:41:22 susi kernel: xfrm_bundle_ok: 1153
Dec 2 20:41:22 susi kernel: xfrm_bundle_ok: 1153
Dec 2 20:41:22 susi kernel: xfrm_bundle_ok: 1153
Dec 2 20:41:22 susi kernel: xfrm_bundle_ok: 1153
Dec 2 20:41:22 susi kernel: xfrm_bundle_ok: 1153
Dec 2 20:41:22 susi kernel: xfrm_bundle_ok: 1153
Dec 2 20:41:22 susi kernel: xfrm_bundle_ok: 1153
Dec 2 20:41:22 susi kernel: xfrm_bundle_ok: 1153
Dec 2 20:41:22 susi kernel: xfrm_bundle_ok: 1153
Dec 2 20:41:22 susi pppd[9104]: Script /etc/ppp/ip-down started (pid 15366)
Dec 2 20:41:22 susi pppd[9104]: sent [LCP TermReq id=0x2 "User request"]
Dec 2 20:41:22 susi pppd[9104]: rcvd [LCP TermAck id=0x2]
Dec 2 20:41:22 susi pppd[9104]: Connection terminated.
Dec 2 20:41:23 susi modify_resolvconf: restored /etc/resolv.conf.saved.by.pppd.modemU to /etc/resolv.conf
Dec 2 20:41:23 susi pppd[9104]: Waiting for 1 child processes...
Dec 2 20:41:23 susi pppd[9104]: script /etc/ppp/ip-down, pid 15366
Dec 2 20:41:24 susi pppd[9104]: Script /etc/ppp/ip-down finished (pid 15366), status = 0x0
Dec 2 20:41:24 susi pppd[9104]: Exit.
Dec 2 20:41:24 susi /etc/ppp/ip-down.d/racoon: VPNCLIENT-CLI-0272:Gateway IP Address is invalid
Dec 2 20:41:25 susi racoon: INFO: caught signal 15
Dec 2 20:41:25 susi racoon: ERROR: libipsec failed pfkey check (Invalid message type)
Dec 2 20:41:26 susi racoon: ERROR: sendmsg (Invalid argument)
Dec 2 20:41:26 susi racoon: ERROR: sendfromto failed
Dec 2 20:41:26 susi racoon: INFO: racoon shutdown
Dec 2 20:41:26 susi /etc/ppp/ip-down.d/racoon: Shutting down IPsec IKE daemon (racoon)..done

It did not lock up this time, i just wondered about the ipsec message (is it one?) long after usage.
My machine also practically never locks up while the tunnel is active, but long after i stopped using ipsec, and - i almost don't dare to say it - it seems to coincide with ACPI LID button events....

Olaf: How about adding this workaround to the current kernel? I have to recompile kernels for each preview.

So, is any kernel or patch available which has some more printk's as noted in comment #34? Because, while the machine doesn't lock up anymore, IPsec still doesn't work satisfyingly. After some time the tunnel goes down when no traffic goes over it (_although_ there is that 30-second ping, so it seems this ping doesn't count as activity). After this happens there is a short interval (some minutes) during which the tunnel can be revived by another ping to the IPsec'ed network (yes, that's strange). If this is not done, and the tunnel is down for longer than this, then it's in an unrevivable state (i.e. the pings just return 'connect: Resource temporarily unavailable'). Everything I said in comment #33 still applies. I.e. after such a ping-to-revive, pluto tries to recreate the connection but fails because of:
Jan 13 22:56:15 hell6 pluto[9662]: "matz-novell5" #24: max number of retransmissions (2) reached STATE_MAIN_I3. Possible authentication failure: no acceptable response to our first encrypted message

Right before the _first_ attempt that fails like the above I have this:
Jan 13 22:54:53 hell6 kernel: xfrm_bundle_ok: 1178
Jan 13 22:54:53 hell6 kernel: xfrm_lookup: newly created bundle is stale
Jan 13 22:54:53 hell6 kernel: xfrm_bundle_ok: 1178
Jan 13 22:54:53 hell6 kernel: xfrm_lookup: newly created bundle is stale
Jan 13 22:54:53 hell6 kernel: xfrm_bundle_ok: 1178
Jan 13 22:54:53 hell6 kernel: xfrm_lookup: newly created bundle is stale
Jan 13 22:54:53 hell6 kernel: xfrm_bundle_ok: 1178
Jan 13 22:54:53 hell6 kernel: xfrm_lookup: newly created bundle is stale
Jan 13 22:54:53 hell6 kernel: xfrm_bundle_ok: 1178
Jan 13 22:54:53 hell6 kernel: xfrm_lookup: newly created bundle is stale
Jan 13 22:54:53 hell6 kernel: xfrm_bundle_ok: 1178
Jan 13 22:54:53 hell6 kernel: xfrm_lookup: newly created bundle is stale

And this is at the same time the _only_ time I see this. All other times I see something from the printks in the patch it's in line 1153. So it seems that something leads to the above messages in line 1178, i.e. a stale bundle, and from then on the IPsec connection can't be recreated correctly anymore.

did anyone try the current kotd? there were plenty of ipsec related changes in there, also with NAT. i had today a "system hangs at connection initiation" with 2.6.15-git8-20060113093923. i don't know if this is the same bug, though.

I just had a crash/freeze on SUSE Linux 10.1 (almost) Beta2, kernel 2.6.16-rc1-git3-4-default.

Please provide additional info - was it a crash or a freeze? Any kernel oops? NMI watchdog? Is ping still working? What is the hwinfo? Anything visible if you enable kernel logging to the console and switch to a VC before the crash?

It was a complete lockup => freeze. I didn't have any chance of getting additional data :( As this is happening on my laptop and I'm en route ATM, I couldn't try to ping the machine.
Any suggestions how to gather data on this to solve the problem?

If you can reproduce this, please switch to a VC, enable kernel logging to the console, and enable sysrq:
klogconsole -l8 -r0
echo 1 >/proc/sys/kernel/sysrq
If possible, attach a serial console. Then work from the console and do whatever it takes to trigger the lockup. Do several sysrq-p's in a row to see where it's looping. Try sysrq-t for a backtrace of all tasks (this will only be fun if you have a serial console, though). You may also want to try nmi_watchdog=1.

Created attachment 65244 [details]
BUG: soft lockup detected...
Created attachment 65245 [details]
next try...
Created attachment 65248 [details]
sysrq-p
So it's still in the xfrm_* stuff. I'm really reluctant to try any kernel except the one containing Olaf's debug patch, because I don't want to risk freezing machines. That's it for the moment. If you need further information, just ask!

I was running the 10.1 beta2 kernel, but have rebuilt that one with km_wlan in prefer rpms. The mbuild job should still be there: zion-cthiel-10. (I've copied that one to ~cthiel/Export/ as well.)

Indeed it looks like we're still looping in the xfrm stuff.
Christoph, can you please test a kernel with the debug patch from
attachment #59102 [details]?
If that cures your problem, we need to add some more instrumentation to
find out what's going on there.
Created attachment 65282 [details]
xfrm_policy.c.rej
Olaf, could you please provide a fixed patch?

modprobe crasher could help as well. afaik it does some slab corruption debug stuff.

Created attachment 65369 [details]
Updated test patch
Please try this patch
Can we please add this workaround to our kernel packages? I have to rebuild packages after every system update to betaX or stable.

This patch will pollute syslog with lots of useless chatter. Let's have some test feedback, and fix it for real.

Yes, a freezing system is politer than a polluted syslog. I'll buy a new laptop or use a different OS. :)

kernel-default-2.6.16_rc1_git3-5 from stable plus the patch mentioned in comment #55 built, installed and in use for a good hour now.

I got again the 'Resource temporarily unavailable' and one klogd message:
lmuelle@gab:~> ping gray.suse.de
connect: Resource temporarily unavailable
lmuelle@gab:~> ping gray.suse.de
PING gray.suse.de (10.10.2.19) 56(84) bytes of data.
64 bytes from 10.10.2.19: icmp_seq=1 ttl=62 time=2.86 ms
64 bytes from 10.10.2.19: icmp_seq=2 ttl=62 time=7.00 ms
64 bytes from 10.10.2.19: icmp_seq=3 ttl=62 time=3.09 ms

--- gray.suse.de ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2001ms
rtt min/avg/max/mdev = 2.869/4.321/7.004/1.899 ms
lmuelle@gab:~>

Message from syslogd@gab at Fri Jan 27 14:11:36 2006 ...
gab klogd: lock: e17309a0, .magic: dead4eac, .owner: <none>/-1, .owner_cpu: -1

Please provide the complete dmesg output. This looks like the spinlock debug code triggered.

Created attachment 65417 [details]
dmesg output as requested in comment #61

I've been using RAS for like 2:30h now and didn't hit a crash yet. Syslog only has a lot of:
klogd: xfrm_bundle_ok: 1210
I'll report back as soon as I get a crash.

So, I've really tried hard to reproduce a crash, but didn't get one so far.
Here are some other messages that I've seen:
Jan 27 22:58:09 t41p klogd: xfrm_bundle_ok: 1235
Jan 27 22:58:09 t41p klogd: xfrm_lookup: newly created bundle is stale
Jan 27 22:58:09 t41p klogd: xfrm_bundle_ok: 1235
Jan 27 22:58:09 t41p klogd: xfrm_lookup: newly created bundle is stale
Jan 27 22:58:09 t41p klogd: xfrm_bundle_ok: 1235
Jan 27 22:58:09 t41p klogd: xfrm_lookup: newly created bundle is stale
Jan 27 22:58:09 t41p klogd: xfrm_bundle_ok: 1235
Jan 27 22:58:09 t41p klogd: xfrm_lookup: newly created bundle is stale
Jan 27 22:58:09 t41p klogd: xfrm_bundle_ok: 1235
Jan 27 22:58:09 t41p klogd: xfrm_lookup: newly created bundle is stale
Jan 27 22:58:09 t41p klogd: xfrm_bundle_ok: 1235
Jan 27 22:58:09 t41p klogd: xfrm_lookup: newly created bundle is stale
Jan 27 22:58:09 t41p klogd: xfrm_lookup bailing out after 11 loops

So, it looks like the patch helps indeed.

I had a crash after one hour of real remote work with the patch from comment #55.

Created attachment 65615 [details]
Updated test patch
Here's another patch for testing. If we manage to pin-point the problem
a little closer I will include a patch for this tonight so it gets into
the next beta.
I committed a patch to CVS that may fix this problem. Please try the upcoming KOTD.

I'm running kernel-default-2.6.16_rc1_git3-20060130162402 atm. And I haven't had a freezing system for more than three hours. But: I have again and again stocking connections after having been idle.

Please provide the kernel debug output. The fact that it works doesn't help me create a fix for this problem. And what is a stocking connection?

Argl. I intended to say 'stagnant' connection. The result is hanging connections if the remote connection is idle for some time.

Created attachment 66176 [details]
/var/log/messages of the system
Anything missing?

Olaf, see comment #39 for a description of what is meant by the stagnant connection. The symptoms haven't changed. At some point the connection just stops delivering traffic (in contrast to Lars, I seldom notice this _while_ doing something over ssh). Then I have to wait some time, do a ping to some remote IPsec'ed host, get the "temporarily unavailable" message, repeat the ping a few times, and then it's working again. Including the ssh connection which was stuck, and which could _not_ be revived by just generating more traffic (i.e. pressing more keys).

yesterday i had this in the log with the Beta3 kernel (i have removed the ipv6.ko module):
Feb 3 18:11:23 strolchi klogd: xfrm_tmpl_resolve says EAGAIN, try again
Feb 3 18:11:23 strolchi klogd: xfrm_lookup: IPv4 route is stale (obsolete=2, loops=4)
Feb 3 18:11:23 strolchi klogd: [<c0279deb>] xfrm_lookup+0x403/0x481
Feb 3 18:11:23 strolchi klogd: [<c02517ac>] ip_output+0x1f4/0x240
Feb 3 18:11:23 strolchi klogd: [<c01158f2>] default_wake_function+0x0/0xc
Feb 3 18:11:23 strolchi klogd: [<c0278a09>] xfrm4_output_finish+0x8a/0x36a
Feb 3 18:11:23 strolchi klogd: [<c024a244>] ip_route_output_flow+0x77/0x8b
Feb 3 18:11:23 strolchi klogd: [<c0265ee3>] raw_sendmsg+0x2e6/0x70f
Feb 3 18:11:23 strolchi klogd: [<c0265a89>] raw_recvmsg+0x6e/0x171
Feb 3 18:11:23 strolchi klogd: [<c026cbf2>] inet_sendmsg+0x35/0x3f
Feb 3 18:11:23 strolchi klogd: [<c022b9ce>] sock_sendmsg+0xd2/0xec
Feb 3 18:11:23 strolchi klogd: [<c0127a03>] autoremove_wake_function+0x0/0x2d
Feb 3 18:11:23 strolchi klogd: [<c0127a03>] autoremove_wake_function+0x0/0x2d
Feb 3 18:11:23 strolchi klogd: [<c014bcce>] cache_alloc_debugcheck_after+0xb8/0xea
Feb 3 18:11:23 strolchi klogd: [<c01addf7>] copy_from_user+0x54/0x80
Feb 3 18:11:24 strolchi klogd: [<c0231626>] verify_iovec+0x3e/0x70
Feb 3 18:11:24 strolchi klogd: [<c022bdf9>] sys_sendmsg+0x186/0x1df
Feb 3 18:11:24 strolchi klogd: [<c022c662>] sys_recvmsg+0x1a1/0x1b5
Feb 3 18:11:24 strolchi klogd: [<c013962c>] get_page_from_freelist+0x6f/0x2c4
Feb 3 18:11:24 strolchi klogd: [<c01f70b3>] pty_write+0x2b/0x35
Feb 3 18:11:24 strolchi klogd: [<c0282add>] _spin_unlock_irq+0x5/0x7
Feb 3 18:11:24 strolchi klogd: [<c02818bf>] schedule+0x499/0x4f7
Feb 3 18:11:24 strolchi klogd: [<c014bcce>] cache_alloc_debugcheck_after+0xb8/0xea
Feb 3 18:11:24 strolchi klogd: [<c022ce26>] sys_socketcall+0x183/0x19e
Feb 3 18:11:24 strolchi klogd: [<c0102a99>] syscall_call+0x7/0xb

there is more than one of those in the logs. If you need them all, just tell me.

That was exactly what I was hoping for. Did you experience any hangs?

I've seen messages similar to seife's, e.g.:
Feb 3 14:58:32 t41p klogd: xfrm_lookup: IPv4 route is stale (obsolete=2, loops=4)
[...]
Feb 3 15:48:18 t41p klogd: xfrm_lookup: IPv4 route is stale (obsolete=2, loops=4)
[...]
Feb 3 16:37:23 t41p klogd: xfrm_lookup: IPv4 route is stale (obsolete=2, loops=4)

Luckily I didn't have a single crash after the latest patch (the one that went into the beta3 kernel). [But the usual "hangs" that Matz and Lars were talking about earlier. However, I think they are unrelated to this problem!]

(In reply to comment #76)
> That was exactly what I was hoping for. Did you experience
> any hangs?
no. I just noticed the "oops" scrolling by and then had a closer look. Do you need the other backtraces, too?

Christoph, with the beta3 kernel you get similar messages if you (try to) use ipv6; this is why i removed the ipv6 module (well, to get name lookup fast, to be honest, but it also got rid of the lots of backtraces).

I also think the hanging connections and spurious "resource temporarily not available" are a different problem. The tunnels are only created on demand and the first connection always fails this way; maybe they are also torn down after a timeout and need to be re-established => resource temporarily not available?

Whether or not the hanging connections are a different problem is not clear; I wouldn't jump to conclusions yet.
Also, with 'plutowait=yes' the connections aren't started on demand, but right when starting ipsec (then also the very first ping to keep the connection open works). Additionally there is continuous traffic over the tunnels (if only the 30-second-interval ping), so there's nothing to time out. _Except_ if there is a bug, which I could very well imagine being the same as the one which in the end also leads to the crashes (the message above seems to indicate that something got stale, i.e. unused/old in some way, which without the patch led to crashes, and with it to hanging connections). It's perhaps also interesting to know that once such a connection hangs I can revive it by starting another ping, as already written. But it seems that if I don't do that soon enough (e.g. when I don't notice the connection is hanging, overnight for instance) then not even that helps, and I have to restart the whole ipsec to get out of the stuck situation.

(In reply to comment #79)
> But it seems that if
> I don't do that soon enough (e.g. when I don't notice the connection is
> hanging, overnight for instance) then not even that helps, and I have to
> restart the whole ipsec to get out of the stuck situation.
Which is most likely due to the RAS timeout.

I had a system freeze again with the 2.6.16-rc2-git2-6-default kernel after a full working day using our RAS.

Are you sure that this was really an IPsec-related crash/freeze?

Same symptoms as before and no problem at all while running the system without IPsec.

Hm. Please hook up your system to a serial console, and when the crash happens next time, please try sysrq-p (several times) and sysrq-t and attach the output to this bug.

*** Bug 115200 has been marked as a duplicate of this bug. ***

BTW I think the remaining problem is something entirely different.
I'm downgrading to major because this happens a lot less often.

I've tested again using the unpatched 10.0 kernel (kernel-smp-2.6.13-15.8) and a kernel containing your latest patch (attachment id=65615), using a NAT-T configuration. As expected, the unpatched kernel causes a system freeze, and sysrq-t/p does not print anything unless I use sysrq-i (Terminate All Tasks). After this, the system reacts to sysrq-t/p again. Let me know if you want to get the logs captured on the serial console. The kernel containing your last patch seems to have worked fine for ca. 36 hours without a system freeze, even though I got some Oopses while the system was booting. The "Resource temporarily unavailable" from ping and "xfrm_tmpl_resolve says EAGAIN, try again" on the console happen sometimes while rekeying takes place. Also if my external/dynamic IP changes and the other ipsec side's filter blocks the traffic from the new one. After a reauth & restart of IPsec (so the other side updates its ip filter) the tunnel works fine again. I also have a full serial console capture of this session, which I'll attach to this bug.

Created attachment 70394 [details]
serial console capture containing some Oopses while system is booting
The kernel used here contains patch #65615.
Created attachment 70395 [details]
serial console capture with sysrq-t/p output while system freeze
Uses unpatched kernel-smp-2.6.13-15.8
Marius, this isn't really the latest patch - this seems to be the bad patch that went into beta3. Please grab the latest patch from CVS HEAD if you want to give it a try.

Lars, any news on IPsec, capturing sysrq info via serial console?

OK, I'll rebuild the kernel using the patch from CVS HEAD and retry (this evening).

I've been using a 10.0 kernel-smp-2.6.13-15.9 + xfrm-endless-loop.diff from CVS HEAD for about 24 hours (one reconnect because my external IP changed) and it seems to work fine.
Every time pluto replaces the ipsec SA, the xfrm_lookup bailing-out code is
executed, e.g.:
Feb 28 17:32:23 exodus pluto[8307]: "mt-suse" #43: IPsec SA expired (LATEST!)
Feb 28 17:32:27 exodus pluto[8307]: initiate on demand from 10.204.0.12:0 to 10.0.9.7:0 proto=0 state: fos_start because: acquire
Feb 28 17:32:27 exodus pluto[8307]: "mt-suse" #48: initiating Quick Mode RSASIG+ENCRYPT+TUNNEL+PFS+UP {using isakmp#44}
Feb 28 17:32:27 exodus pluto[8307]: "mt-suse" #48: transition from state STATE_QUICK_I1 to state STATE_QUICK_I2
Feb 28 17:32:27 exodus pluto[8307]: "mt-suse" #48: STATE_QUICK_I2: sent QI2, IPsec SA established {ESP=>0x81661bee <0xe1b01b5f xfrm=AES_0-HMAC_SHA1 NATD=195.135.221.4:4500 DPD=none}
Feb 28 17:32:57 exodus kernel: xfrm_lookup bailing out after 11 loops
Feb 28 17:32:57 exodus kernel: [<c03257df>] xfrm_lookup+0x2af/0x4a0
Feb 28 17:32:57 exodus kernel: [<c0120cd0>] default_wake_function+0x0/0x10
Feb 28 17:32:57 exodus kernel: [<c030b4de>] raw_sendmsg+0x17e/0x420
Feb 28 17:32:57 exodus kernel: [<c0314b0a>] inet_sendmsg+0x3a/0x70
Feb 28 17:32:57 exodus kernel: [<c02c0742>] sock_sendmsg+0xd2/0x150
Feb 28 17:32:57 exodus kernel: [<c013c040>] autoremove_wake_function+0x0/0x50
Feb 28 17:32:57 exodus kernel: [<c01068d9>] default_do_nmi+0x79/0x180
Feb 28 17:32:57 exodus kernel: [<c013c040>] autoremove_wake_function+0x0/0x50
Feb 28 17:32:57 exodus kernel: [<c02c2525>] sys_sendmsg+0x155/0x270
Feb 28 17:32:57 exodus kernel: [<c0344a49>] schedule+0x5a9/0xc00
Feb 28 17:32:57 exodus kernel: [<c0344a77>] schedule+0x5d7/0xc00
Feb 28 17:32:57 exodus kernel: [<c012f370>] lock_timer_base+0x20/0x50
Feb 28 17:32:57 exodus kernel: [<c012f627>] del_timer_sync+0x27/0x30
Feb 28 17:32:57 exodus kernel: [<c02c1192>] sock_poll+0x12/0x20
Feb 28 17:32:57 exodus kernel: [<c0185907>] do_pollfd+0x57/0xb0
Feb 28 17:32:57 exodus kernel: [<c02c2afc>] sys_socketcall+0x27c/0x2a0
Feb 28 17:32:57 exodus kernel: [<c010983f>] do_gettimeofday+0x1f/0xd0
Feb 28 17:32:57 exodus kernel: [<c012ab08>] sys_gettimeofday+0x28/0x80
Feb 28 17:32:57 exodus kernel: [<c01042ab>] sysenter_past_esp+0x54/0x79
BTW: the above kernel RPMs are in ~mt/Export/openswan/xfrm-endless-loop/

It still works on my 10.0 server - uptime: 2:18pm up 3 days 18:20

Thanks Olaf!

Fine. I will downgrade the bug severity for now. I am not entirely happy yet because the bailout message still triggers. This means there's still something fishy going on. I'm assigning this to Jiri Bohac - Jiri, maybe you could try to find out why it's still looping in xfrm_lookup?

Olaf, this part of your patch is upstream since linux-2.6.16-rc3:

-	goto restart;
+
+	err = -EHOSTUNREACH;
+	goto error;

This was what caused the lockup. Now there is only one "goto restart" left, which is reached after sleeping on the km_waitq wait queue. This wait queue is woken from many places that touch the xfrm states. The purpose of the loop is clearly to wait until a valid bundle is available, and the number of iterations may be large, because the wait queue may be woken up many times by completely unrelated events. I think that looping there 10 times does not mean anything is wrong, and I'd just delete the bailout check/message. But if you think that this still may be an endless loop (which would not lock up the system, because of the schedule() call), I suggest changing the number of iterations to something like 1000 instead of 10. Then we could wait and see if anyone ever sees the message. Does this make sense?

If the number of wake-ups is really that large, then it may make more sense to do something like this:
	unsigned long timeout = jiffies + 60 * HZ;

again:
	if (time_after(jiffies, timeout)) {
		/* complain and bail out */
	}
Feel free to modify the patch either way.
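For reference, here is a minimal userspace C sketch of that suggestion - not the kernel patch itself. The deadline arithmetic and the wraparound-safe time_after() comparison mirror what the kernel does with jiffies, while try_resolve() and lookup_with_timeout() are hypothetical stand-ins for the real bundle lookup and the km_waitq sleep:

```c
/* Minimal sketch of the timeout-based bailout suggested above.
 * Assumptions: "jiffies" is modeled by a plain counter we advance by
 * hand; time_after() is written the way the kernel macro is, i.e. as a
 * signed difference so the test stays correct across counter wraparound.
 * try_resolve() and lookup_with_timeout() are hypothetical stand-ins,
 * not the real xfrm_lookup() code.
 */
#include <assert.h>
#include <stdio.h>

typedef unsigned long jiffies_t;

/* Wraparound-safe "a is later than b", like the kernel's time_after(). */
static int time_after(jiffies_t a, jiffies_t b)
{
	return (long)(b - a) < 0;
}

/* Retry until the resolver succeeds or the 60-second deadline passes;
 * returns the number of retries on success, -1 on bailout. */
static int lookup_with_timeout(jiffies_t now, jiffies_t hz,
			       int (*try_resolve)(void *), void *ctx)
{
	jiffies_t timeout = now + 60 * hz;
	int loops = 0;

	for (;;) {
		if (try_resolve(ctx))
			return loops;		/* got a valid "bundle" */
		if (time_after(now, timeout)) {	/* complain and bail out */
			fprintf(stderr,
				"bailing out after 60 seconds and %d loops\n",
				loops);
			return -1;
		}
		loops++;	/* count every retry (the bug fixed in comment #105) */
		now += hz;	/* stand-in for sleeping on the wait queue */
	}
}
```

Because the check is done on the signed difference, the deadline test keeps working even when the counter wraps past ULONG_MAX, which is why the kernel uses time_after() rather than a plain `jiffies > timeout` comparison.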
Good idea. I updated the patch with the 60-second timeout instead of the maximum of 10 loop iterations. If anyone sees the message "xfrm_lookup bailing out after 60 seconds and %d loops" in their syslog, please let me know. I still haven't reproduced it myself.

I have not heard from anyone reporting the above message in syslog. Closing for now. If anyone sees the message "xfrm_lookup bailing out after 60 seconds and %d loops" in their syslog, please reopen.

Created attachment 86502 [details]
updated SUSE 10.0 patch
aj: what about the SUSE 10.0 kernel - the problem occurs there as well... Please add to our kernel CVS for 10.0 so that it can go out with the next security update.

Using the current patch on a __10.0__ kernel, I got the bailing-out message once in the last 24 hours (below). With the old patch I saw them 3-6 times a day, that is, on every (re)establish. In both cases no crash happened.
Jun 1 18:03:25 exodus pluto[4191]: "mt-suse" #20: IPsec SA expired (LATEST!)
Jun 1 18:03:28 exodus pluto[4191]: initiate on demand from 10.204.0.12:0 ...
Jun 1 18:03:29 exodus pluto[4191]: "mt-suse" #27: transition from state
STATE_QUICK_I1 to state STATE_QUICK_I2
Jun 1 18:03:29 exodus pluto[4191]: "mt-suse" #27: STATE_QUICK_I2: sent QI2,
IPsec SA established {...NATD...}
Jun 1 18:04:28 exodus kernel:
xfrm_lookup bailing out after 60 seconds and 0 loops
[<c032584c>] xfrm_lookup+0x2ac/0x490
[<c0120cd0>] default_wake_function+0x0/0x10
[<c030b54e>] raw_sendmsg+0x17e/0x420
[<c0314b7a>] inet_sendmsg+0x3a/0x70
[<c02c07a2>] sock_sendmsg+0xd2/0x150
[<c013c010>] autoremove_wake_function+0x0/0x50
[<c013c010>] autoremove_wake_function+0x0/0x50
[<c012f9e2>] update_wall_time+0x12/0x50
[<c012fd63>] do_timer+0x53/0xf0
[<c02c2585>] sys_sendmsg+0x155/0x270
[<c0344aa9>] schedule+0x5a9/0xc00
[<c0344ad7>] schedule+0x5d7/0xc00
[<c014e686>] __do_IRQ+0x106/0x150
[<c012f370>] lock_timer_base+0x20/0x50
[<c012f627>] del_timer_sync+0x27/0x30
[<c02c11f2>] sock_poll+0x12/0x20
[<c01858d7>] do_pollfd+0x57/0xb0
[<c02c2b5c>] sys_socketcall+0x27c/0x2a0
[<c010987f>] do_gettimeofday+0x1f/0xd0
[<c012ab08>] sys_gettimeofday+0x28/0x80
[<c01042cb>] sysenter_past_esp+0x54/0x79
Created attachment 87491 [details]
proper patch for the 10.0 kernel
This patch also includes the part that was missing in #100.
I fixed patches.fixes/xfrm-endless-loop in HEAD, because the loop count for the debugging message was not properly incremented (eeek!). So if this happens again with the SLES10 kernel, the debugging message will be more useful.

The patch from #100 does not fix the problem for 2.6.13 on its own. The more important part of the fix was merged upstream before 2.6.16. It is included in the patch from #104.

Before we apply this to the 10.0 kernel as a security update, it would be good to have some testing. It is _probably_ safe, as it seems to work well in HEAD, but it would be nice to actually try to use it with 10.0. Marius, you're saying that you have been using the 10.0 kernel and you got the bailout messages. Do you think you can try with this patch to be sure it doesn't break anything? Not committing to CVS yet...

I'm using the STABLE kernel and still see connection hangs pretty regularly (about every 10 minutes). Wasn't this supposed to be solved by this patch?

Dirk: Well, this patch solves a hard lockup in the first place. If instead of the lockups the connection hangs for a while, this means that there might be a different bug somewhere else, including in the userspace IKE daemon. Is it that the ipsec connection stops working for a short while and then starts again, without you doing anything? How long does your connection remain hung? Anything like "xfrm_lookup bailing out after 60 seconds..." in dmesg? Are you using racoon or openswan for IKE? If it is openswan, can you please set plutodebug=all in ipsec.conf and attach the piece of /var/log/messages around the time one of the hangs occurs? If you're using racoon, please do the same, but put "log debug2" in racoon.conf.

(In reply to comment #105)
> I fixed patches.fixes/xfrm-endless-loop in HEAD, because the loop count
> for the debugging message was not properly incremented (eeek!). So if this
> happens again with the SLES10 kernel, the debugging message will be more
> useful.
Yes, it helped against the crashes/freezes (I guess because of the bailout every 60 secs), but the ipsec connection got lost in many cases...

> The patch from #100 does not fix the problem for 2.6.13 on its own. The more
> important part of the fix was merged upstream before 2.6.16. It is included in
> the patch from #104

OK, thanks!

> Before we apply this to the 10.0 kernel as a security update, it would be good
> to have some testing. It is _probably_ safe, as it seems to work well in HEAD,
> but it would be nice to actually try to use it with 10.0. Marius, you're
> saying that you have been using the 10.0 kernel and you got the bailout
> messages. Do you think you can try with this patch to be sure it doesn't
> break anything?

Yes, of course - I'm going to use the #104 patch on my 2.6.13 machine now.

I've been running the #104 patch for 24 hours on my productive 10.0 gateway (I'm working over it), and there has been no machine lockup, no ipsec connection drop, and no "xfrm_lookup bailing out" messages in that time. The 10.0 (smp) kernel is in ~mt/export/openswan/xfrm-endless-loop/.

The last "xfrm_lookup bailing out after 11 loops" was on Jun 7 05:56:23; no crash since 7 days 5:41 using the 10.0 smp kernel + patch from #104. Sometimes the (ssh) connection over the tunnel does not respond for a few seconds (I guess while the rekeying is done), but then continues. This means the bailing out does not happen any more with the patch from #104.

reboot   system boot  2.6.13-15.10-smp  Wed Jun 7 09:19  (7+05:47)

OK, committed the patch from #104 to the 10.0 CVS branch. Marking as FIXED for now; if anyone sees the bailout message, please reopen.