Bug 1166688 - shorewall traffic shaping locks/crashes kernel hard
Status: RESOLVED NORESPONSE
Classification: openSUSE
Product: openSUSE Distribution
Component: Kernel
Version: Leap 15.1
Hardware: Other Other
Priority: P5 - None
Severity: Normal
Assigned To: openSUSE Kernel Bugs
Reported: 2020-03-15 15:01 UTC by Jon Nelson
Modified: 2020-11-24 11:34 UTC (History)



Attachments
trace (5.54 KB, text/plain)
2020-03-16 12:36 UTC, Jon Nelson

Description Jon Nelson 2020-03-15 15:01:09 UTC
I've been playing with shorewall's built-in (aka "complex" or "internal" vs. "simple") traffic shaping.

https://shorewall.org/traffic_shaping.htm#Builtin

When I run 'shorewall restart', some percentage of the time (perhaps 30%) the process locks the kernel hard, requiring a reboot.

"Magic SysRq"-style reboots *do* work, so it's not totally locked. However, no other keys work.

Reproducibility: for me, very easy.
Worrisomeness: High (yes, worrisomeness is a made-up word)

This reproduces across all of the kernel versions of the last six to nine months (I have not gone back further than that).

Contents of selected files:

Obviously you'll need to turn on shorewall's traffic shaping in /etc/shorewall/shorewall.conf.

The contents of each of the other files appear below, between the '#################' delimiter lines:

/etc/shorewall/tcclasses:
#################
outside         1       5*full/10       full            1       tcp-ack,tos-minimize-delay
outside         2       3*full/10       9*full/10       2       default
outside         3       1*full/10       8*full/10       2
#################

(outside is the name of my internet-facing interface)

/etc/shorewall/tcfilters is empty other than comments.
/etc/shorewall/tcdevices:
#################
outside       200mbit         10mbit          hfsc
#################
/etc/shorewall/interfaces:
#################
inside          inside                  dhcp
outside         outside                 dhcp,rpfilter
#################


I would love to assist in any way I can to resolve this issue.
Comment 1 Denis Kirjanov 2020-03-16 09:55:06 UTC
Hi Jon, 

You can try to set up netconsole and dump the stack backtraces of the active CPUs over the network, or at least capture dmesg.
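
A minimal sketch of that suggestion, assuming a second machine on the same LAN is available to receive the log over UDP (for example with a netcat listener). Every address, port, the MAC and the interface name below are placeholders, not values taken from this report. The `netconsole=` parameter format follows the kernel's netconsole documentation. Without arguments the script only prints the command it would run.

```shell
#!/bin/sh
# Sketch only: all values below are placeholders and must be adapted.
SRC_PORT=6665                # assumed local UDP source port
SRC_IP=192.168.1.10          # assumed address of the affected machine
SRC_DEV=inside               # assumed LAN-facing interface name
DST_PORT=6666                # assumed UDP port on the log receiver
DST_IP=192.168.1.20          # assumed address of the log receiver
DST_MAC=00:11:22:33:44:55    # assumed MAC address of the log receiver

# Build the module parameter in the form
#   src-port@src-ip/dev,tgt-port@tgt-ip/tgt-mac
netconsole_param() {
    printf '%s@%s/%s,%s@%s/%s\n' \
        "$SRC_PORT" "$SRC_IP" "$SRC_DEV" "$DST_PORT" "$DST_IP" "$DST_MAC"
}

if [ "${1:-}" = "apply" ]; then
    # Must be run as root on the affected machine.
    modprobe netconsole "netconsole=$(netconsole_param)"
    # Enable all Magic SysRq functions so they can be used after a hang.
    echo 1 > /proc/sys/kernel/sysrq
else
    # Dry run: only show the command.
    echo "modprobe netconsole netconsole=$(netconsole_param)"
fi
```

Once the machine hangs, pressing Alt+SysRq+L on its keyboard asks the kernel to print a backtrace of all active CPUs, which should then reach the receiver through netconsole.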
Comment 2 Denis Kirjanov 2020-03-16 09:57:37 UTC
Can you reproduce the scenario on a virtual machine?
Comment 3 Jon Nelson 2020-03-16 12:34:34 UTC
I was able to reproduce the problem using a different bit of software that messes with traffic shaping. I put it in a loop and within seconds had a crash.

I used magic-sysrq keys to write out some info and - this time - I caught it.
Attaching a .txt file of the logs!
Comment 4 Jon Nelson 2020-03-16 12:36:20 UTC
Created attachment 832894 [details]
trace
Comment 5 Denis Kirjanov 2020-03-31 11:57:09 UTC
Could you please post a minimal setup to trigger it?
Comment 6 Jon Nelson 2020-04-07 12:26:04 UTC
I have seen the request for more info. Please allow me a bit of time to try to gather a useful test case.
Comment 7 Jon Nelson 2020-04-09 01:30:14 UTC
This is one way:

https://gist.github.com/bradoaks/940616

I then put that into a loop and start some traffic passing.

An alternative would be to use shorewall's traffic shaping support.
TC_ENABLED=Simple

and tcdevices:

<outside_device>  200mbit  10mbit  hfsc
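
The loop described above might look like the following sketch. The function name `repro_loop` and the iteration counts are assumptions made for illustration; the command passed to it stands in for whatever reapplies traffic shaping on the affected system (the shaping script from the gist, or `shorewall restart`). As written, the script performs only a harmless dry run with `true`.

```shell
#!/bin/sh
# Sketch only: repro_loop is a hypothetical helper, not part of shorewall.
# Usage: repro_loop COUNT COMMAND [ARGS...]
# Runs COMMAND up to COUNT times and prints a counter before each run, so
# the last line on the console shows which iteration was in progress if
# the machine hangs.
repro_loop() {
    count=$1
    shift
    i=1
    while [ "$i" -le "$count" ]; do
        echo "iteration $i"
        # Stop early if the command itself reports a failure.
        "$@" || return 1
        i=$((i + 1))
    done
}

# Harmless dry run. To attempt a real reproduction, run as root with
# traffic passing through the shaped interface, for example:
#   repro_loop 100 shorewall restart
repro_loop 3 true
```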
Comment 8 Bruno Friedmann 2020-04-09 13:48:05 UTC
In case you need it, shorewall 5.2.4 is packaged and available at
https://build.opensuse.org/package/show/security:netfilter/shorewall

I'm waiting for feedback from users before forwarding it to Factory.
Comment 9 Jon Nelson 2020-04-10 04:54:57 UTC
Giving 5.2.4 a try, but as to the core issue?
Is there anything I can do?
Comment 10 Miroslav Beneš 2020-07-30 11:09:43 UTC
Jon, sorry for the late answer. Is the issue still happening? Leap 15.2 was released a while ago with the 5.3 kernel, so there is a good chance the problem is fixed there.
Comment 11 Miroslav Beneš 2020-11-24 11:34:15 UTC
No response. Closing. Feel free to reopen if the issue persists and update the report with required feedback.