Bugzilla – Full Text Bug Listing
| Summary: | libvirt-routed firewalld zone not functional | ||
|---|---|---|---|
| Product: | [openSUSE] openSUSE Distribution | Reporter: | Robert Munteanu <rombert> |
| Component: | Virtualization:Other | Assignee: | James Fehlig <jfehlig> |
| Status: | NEW --- | QA Contact: | E-mail List <qa-bugs> |
| Severity: | Normal | ||
| Priority: | P5 - None | CC: | admin, jfehlig, marius.kittler, mohd.saquib, rombert |
| Version: | Leap 15.5 | Flags: | jfehlig: needinfo? (rombert) |
| Target Milestone: | --- | ||
| Hardware: | x86-64 | ||
| OS: | openSUSE Leap 15.5 | ||
| Whiteboard: | |||
| Found By: | --- | Services Priority: | |
| Business Priority: | Blocker: | --- | |
| Marketing QA Status: | --- | IT Deployment: | --- |
(In reply to Robert Munteanu from comment #0)
> After a recent update of my virtual machines host (sorry for being fuzzy) I
> started seeing connections being dropped between the VMs and the host,
> specifically NFS.

By "recent update", do you mean a package update of your 15.5 system, or a distro upgrade from 15.4? I assume the former.

> What is worrying me and pointing to a bug are the following messages from
> the system journal which point to the firewalld zone not being functional
>
> Aug 09 13:12:36 vmhost002 firewalld[17692]: ERROR: Calling pre func <bound
> method Firewall.full_check_config of <class
> 'firewall.core.fw.Firewall'>(True, True, True, 'RUNNING', False, 'public',
> {'nf_nat_tftp': 4}, [], True, True, True, False, 'all')>(()) failed:
> INVALID_ZONE: 'libvirt-routed' not among existing zones

Right, we need to get this resolved first. A firewalld update was recently released. If you encountered the issue after a package update, downgrading to the previous firewalld package would be a quick, easy test. Let's cc firewalld maintainer for other ideas about this error.

> By "recent update", do you mean a package update of your 15.5 system, or a distro upgrade from 15.4? I assume the former.

This was a package update.

> Right, we need to get this resolved first. A firewalld update was recently released. If you encountered the issue after a package update, downgrading to the previous firewalld package would be a quick, easy test. Let's cc firewalld maintainer for other ideas about this error.

# zypper se -s --match-exact firewalld
Loading repository data...
Reading installed packages...

S  | Name      | Type       | Version             | Arch   | Repository
---+-----------+------------+---------------------+--------+-------------------------------------------------------------
i+ | firewalld | package    | 0.9.3-150400.8.12.1 | noarch | Update repository with updates from SUSE Linux Enterprise 15
v  | firewalld | package    | 0.9.3-150400.8.9.1  | noarch | Main Repository
   | firewalld | srcpackage | 0.9.3-150400.8.12.1 | noarch | Update repository with updates from SUSE Linux Enterprise 15

# zypper in --force firewalld-0.9.3-150400.8.9.1
# systemctl reboot

I see the same errors in the log after a restart.

Aug 10 16:06:01 vmhost002 systemd[1]: Starting firewalld - dynamic firewall daemon...
Aug 10 16:06:01 vmhost002 systemd[1]: Started firewalld - dynamic firewall daemon.
Aug 10 16:06:32 vmhost002 firewalld[975]: ERROR: Calling pre func <bound method Firewall.full_check_config of <class 'firewall.core.fw.Firewall'>(True, True, True, 'RUNNING', False, 'public', {}, [], True, True, True, False, 'all')>(()) failed: INVALID_ZONE: 'libvirt-routed' not among existing zones
Aug 10 16:06:33 vmhost002 firewalld[975]: ERROR: Calling pre func <bound method Firewall.full_check_config of <class 'firewall.core.fw.Firewall'>(True, True, True, 'RUNNING', False, 'public', {'nf_nat_tftp': 1}, [], True, True, True, False, 'all')>(()) failed: INVALID_ZONE: 'libvirt-routed' not among existing zones

Thanks for checking the previous firewalld package. It would be nice to know what package caused the regression. Perhaps the kernel itself. I'm not sure how to proceed. firewalld knows about the libvirt-routed zone as you've shown in #0, yet it produces an error that it's not among existing zones. Hopefully the firewalld maintainer has some suggestions.

Hi,
firewalld maintainer here!

Yes there was a firewalld update recently but I highly doubt that this error is due to that. Anyway I'll double check. Meanwhile it would be great if I can be provided with a minimal reproducer for this error.

Thanks...

I don't yet have a reproducer, will probably take quite a bit more time. What I did discover is that the NFS timeout and the error logged by firewalld are highly unlikely to be related.
Inspecting the nft configuration created by firewalld, I discovered that there is an additional policy installed by libvirt that is applied before the zone settings:
# rpm -qf /usr/lib/firewalld/policies/libvirt-to-host.xml
libvirt-daemon-driver-network-9.0.0-150500.6.11.1.x86_64
# cat /usr/lib/firewalld/policies/libvirt-to-host.xml
<?xml version="1.0" encoding="utf-8"?>
<policy target="REJECT">
<short>libvirt-to-host</short>
<description>
This policy is used to filter traffic from virtual machines to the
host.
</description>
<ingress-zone name="libvirt-routed" />
<egress-zone name="HOST" />
<protocol value='icmp'/>
<protocol value='ipv6-icmp'/>
<service name='dhcp'/>
<service name='dhcpv6'/>
<service name='dns'/>
<service name='ssh'/>
<service name='tftp'/>
</policy>
Due to the default reject target of the policy I need to manually add services to it in order to permit access.
# firewall-cmd --policy=libvirt-to-host --add-service=nfs
So the logged issue seems to be cosmetic.
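
The command above changes only the runtime configuration. A minimal sketch of making the same addition persistent, assuming firewalld's `--permanent` option is acceptable for the setup. The helper name `policy_service_cmds` is hypothetical, and it only prints the commands so they can be reviewed before being run:

```shell
#!/bin/bash
# Hypothetical helper: print the firewall-cmd calls that would add the given
# services to a policy permanently, followed by a reload to activate them.
# Nothing is executed here; pipe the output to "sh" as root to apply it.
policy_service_cmds() {
    policy="$1"
    shift
    for svc in "$@"; do
        echo "firewall-cmd --permanent --policy=${policy} --add-service=${svc}"
    done
    echo "firewall-cmd --reload"
}

policy_service_cmds libvirt-to-host nfs
```

Note that a reload replaces the runtime configuration with the permanent one, so any other runtime-only changes would be discarded at that point.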
I wonder if the new policy was brought in by
* Tue Jul 25 2023 jfehlig@suse.com
- spec: Build library with support for modular daemons
bsc#1213352
Perhaps the 'support for modular daemons' change caused the policy to be pulled in, but it was not installed for me before?
(In reply to Robert Munteanu from comment #5)
> Due to the default reject target of the policy I need to manually add
> services to it in order to permit access.
>
> # firewall-cmd --policy=libvirt-to-host --add-service=nfs

Adding the 'nfs' service to the 'libvirt-to-host' policy resolved the dropped NFS connections?

> So the logged issue seems to be cosmetic.

And as you say a separate issue from the dropped NFS connections. But what could have caused it to suddenly appear?

> I wonder if the new policy was brought in by
>
> * Tue Jul 25 2023 jfehlig@suse.com
> - spec: Build library with support for modular daemons
> bsc#1213352
>
> Perhaps the 'support for modular daemons' change caused the policy to be
> pulled in, but it was not installed for me before?

That change builds the libvirt library with knowledge about how to connect to modular daemons, in addition to the monolithic libvirtd. It defines REMOTE_DRIVER_AUTOSTART_DIRECT, which is used in src/remote/remote_sockets.c when determining which daemon socket to connect. No packaging changes were introduced.

/usr/lib/firewalld/policies/libvirt-to-host.xml has been provided by the libvirt-daemon-driver-network package since it was introduced with commit 2a461957b1f in the libvirt 8.10.0 dev cycle. It was part of a larger set of changes that "allow incoming connections to guests on routed networks w/firewalld"

https://gitlab.com/libvirt/libvirt/-/commit/7f7a09a2d25a668092be98ed5abfaeec572f5104

I forgot to set needinfo to Robert for my question in #6...

(In reply to Mohd Saquib from comment #4)
> Hi,
> firewalld maintainer here!

Thanks for taking a look!

> Yes there was a firewalld update recently but I highly doubt that this error
> is due to that. Anyway I'll double check.

Robert already verified the issue was not caused by the firewalld update. Still, any help understanding the cause of "INVALID_ZONE: 'libvirt-routed' not among existing zones" error would be much appreciated.

(In reply to James Fehlig from comment #6)
> (In reply to Robert Munteanu from comment #5)
> > Due to the default reject target of the policy I need to manually add
> > services to it in order to permit access.
> >
> > # firewall-cmd --policy=libvirt-to-host --add-service=nfs
>
> Adding the 'nfs' service to the 'libvirt-to-host' policy resolved the
> dropped NFS connections?

Yes, that is correct.

> > So the logged issue seems to be cosmetic.
>
> And as you say a separate issue from the dropped NFS connections. But what
> could have caused it to suddenly appear?

Well, I take it back, it's not cosmetic. Whenever I run firewall-cmd --set-log-denied=... commands, the error is logged and the changes I made to the policy ( without adding --permanent ) are lost. So there is some impact from this

# firewall-cmd --info-policy=libvirt-to-host
libvirt-to-host (active)
  priority: -1
  target: REJECT
  ingress-zones: libvirt-routed
  egress-zones: HOST
  services: dhcp dhcpv6 dns mysql nfs ssh tftp
  ports:
  protocols: icmp ipv6-icmp
  masquerade: no
  forward-ports:
  source-ports:
  icmp-blocks:
  rich rules:
# firewall-cmd --set-log-denied=all
success
# firewall-cmd --info-policy=libvirt-to-host
libvirt-to-host (active)
  priority: -1
  target: REJECT
  ingress-zones: libvirt-routed
  egress-zones: HOST
  services: dhcp dhcpv6 dns ssh tftp
  ports:
  protocols: icmp ipv6-icmp
  masquerade: no
  forward-ports:
  source-ports:
  icmp-blocks:
  rich rules:

There is no hint in the console, but the journal contains the problematic entries

Aug 14 13:10:54 vmhost002 firewalld[975]: ERROR: Calling pre func <bound method Firewall.full_check_config of <class 'firewall.core.fw.Firewall'>(True, True, True, 'INIT', False, 'public', {}, [], True, True, True, False, 'all')>(()) failed: INVALID_ZONE: 'libvirt-routed' not among existing zones
Aug 14 13:10:54 vmhost002 firewalld[975]: ERROR: Calling pre func <bound method Firewall.full_check_config of <class 'firewall.core.fw.Firewall'>(True, True, True, 'INIT', False, 'public', {'nf_nat_tftp': 1}, [], True, True, True, False, 'all')>(()) failed: INVALID_ZONE: 'libvirt-routed' not among existing zones

> > I wonder if the new policy was brought in by
> >
> > * Tue Jul 25 2023 jfehlig@suse.com
> > - spec: Build library with support for modular daemons
> > bsc#1213352
> >
> > Perhaps the 'support for modular daemons' change caused the policy to be
> > pulled in, but it was not installed for me before?
>
> That change builds the libvirt library with knowledge about how to connect
> to modular daemons, in addition to the monolithic libvirtd. It defines
> REMOTE_DRIVER_AUTOSTART_DIRECT, which is used in src/remote/remote_sockets.c
> when determining which daemon socket to connect. No packaging changes were
> introduced.
>
> /usr/lib/firewalld/policies/libvirt-to-host.xml has been provided by the
> libvirt-daemon-driver-network package since it was introduced with commit
> 2a461957b1f in the libvirt 8.10.0 dev cycle. It was part of a larger set of
> changes that "allow incoming connections to guests on routed networks
> w/firewalld"
>
> https://gitlab.com/libvirt/libvirt/-/commit/
> 7f7a09a2d25a668092be98ed5abfaeec572f5104

Thanks for clarifying that.

(In reply to James Fehlig from comment #7)
> I forgot to set needinfo to Robert for my question in #6...
>
> (In reply to Mohd Saquib from comment #4)
> > Hi,
> > firewalld maintainer here!
>
> Thanks for taking a look!
>
> > Yes there was a firewalld update recently but I highly doubt that this error
> > is due to that. Anyway I'll double check.
>
> Robert already verified the issue was not caused by the firewalld update.
> Still, any help understanding the cause of "INVALID_ZONE: 'libvirt-routed'
> not among existing zones" error would be much appreciated.

I can share more information about my setup, if helpful. I am not sure I can create an actual reproducer, since I don't have another bare metal machine around.

The setup (loosely) is the following:

- NFS server running on the host
- libvirt managing 3 VMs defined via https://github.com/dmacvicar/terraform-provider-libvirt
- a kubernetes cluster provisioned using k3s is running on those machines
- a systemd unit opens up additional ports for the relevant firewalld zones

[Unit]
Description=Opens ports for libvirtd
Requires=libvirtd.service
After=libvirtd.service

[Service]
ExecStart=/usr/local/bin/libvirtd-open-ports.sh

[Install]
WantedBy=multi-user.targ

The script currently casts a very wide net because of my troubleshooting

#!/bin/bash -eu

zones="libvirt libvirt-routed"
services="rpc-bind mountd nfs http mysql"
ports="7090/tcp 9115/tcp 9427/tcp" # Motion webcam, blackbox_exporter, ping exporter

for zone in ${zones}; do
    for svc in ${services}; do
        firewall-cmd --zone="${zone}" --add-service="${svc}"
    done
    for port in ${ports}; do
        firewall-cmd --zone="${zone}" --add-port="${port}"
    done
done

I am using a script instead of passing '--permanent' to firewall-cmd invocations because it's easier for me to manage it with SaltStack.

I'm clearing needinfo in hope that it helps, feel free to request again.

> Well, I take it back, it's not cosmetic. Whenever I run firewall-cmd --set-log-denied=... commands,
> the error is logged and the changes I made to the policy
> ( without adding --permanent ) are lost. So there is some impact from this

As per the man page this is the expected behaviour:

> --set-log-denied=value
> Add logging rules right before reject and drop rules in the INPUT, FORWARD
> and OUTPUT chains for the default
> are: all, unicast, broadcast, multicast and off. The default setting is off, which disables the logging.
>
> This is a runtime and permanent change and will also reload the firewall to be able to add the logging rules.

If you don't add --permanent then you're changing your runtime config. and --set-log-denied then reloads the firewall and your runtime changes are lost.

firewalld noob here.
In my case I upgraded from 15.2 to 15.5, as after 2 years the libvirt and firewalld combo started to misbehave badly.
In my case the error manifests as
firewalld[6760]: ERROR: Calling pre func <bound method Firewall.full_check_config of <class 'firewall.core.fw.Firewall'>(True, True, True, 'INIT', False, 'public', {'nf_nat_ftp': 2}, [], True, True, True, False, 'off')>(()) failed: INVALID_ZONE: 'libvirt-routed' not among existing zones
And it looks like it is no longer possible in my case to have NAT port forwarding.
Rules like
firewall-cmd --permanent --add-forward-port=port=2302:proto=udp:toaddr=192.168.100.223:toport=2302
simply vanish when querying iptables-save (which is not showing many many other things btw), same for iptables -L -v -n -t nat.
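
One possible explanation for rules not appearing in iptables-save is that firewalld is using its nftables backend, in which case the installed rules are listed by `nft` rather than by the iptables tools. A minimal sketch for checking this, assuming the backend is selected by the `FirewallBackend` setting in /etc/firewalld/firewalld.conf and that an unset value means nftables. The function names `firewalld_backend` and `ruleset_cmd` are hypothetical:

```shell
#!/bin/bash
# Hypothetical helper: report the backend configured for firewalld.
# Assumes the setting lives in /etc/firewalld/firewalld.conf (or the file
# given as $1) and that an unset value means nftables.
firewalld_backend() {
    conf="${1:-/etc/firewalld/firewalld.conf}"
    backend="$(sed -n 's/^FirewallBackend=//p' "$conf" 2>/dev/null | tail -n 1)"
    echo "${backend:-nftables}"
}

# Suggest which command lists the rules firewalld actually installed.
ruleset_cmd() {
    case "$(firewalld_backend "$@")" in
        iptables) echo "iptables-save" ;;
        *)        echo "nft list ruleset" ;;
    esac
}

ruleset_cmd
```

If this prints `nft list ruleset`, running that command as root should show the forward-port rules that iptables-save does not.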
Sorry, forgot to add: I tried to update to libvirt-9.7.0-Virt.150500.1084.1.x86_64, but same error.

firewall-cmd --version
0.9.3

(In reply to Roy Bellingan from comment #11)
> And looks like is no longer possible in my case to have nat port forwarding.
>
> Rules like
>
> firewall-cmd --permanent
> --add-forward-port=port=2302:proto=udp:toaddr=192.168.100.223:toport=2302
>
> simply vanish when querying iptables-save (which is not showing many many
> other things btw), same for iptables -L -v -n -t nat.

This might be because firewalld is using the nftables backend.

Note that this is still an issue. We are seeing these errors a lot on openQA worker hosts (the infrastructure we run openQA tests on) and they are rather distracting (see https://progress.opensuse.org/issues/155848). Any ideas how to work around this? Is there an upstream issue about it?

When it comes to reproducing: I think all one has to do is have the `libvirt-daemon-driver-network` package installed while also having `firewalld.service` enabled/running. Then the errors are quite apparent in the service's journal.

(In reply to Marius Kittler from comment #14)
> When it comes to reproducing: I think all one has to do it having the
> `libvirt-daemon-driver-network` package installed while also have
> `firewalld.service` enabled/running. Then the errors are quite apparent in
> the service's journal.

I just checked a TW host and do not see any of the errors with libvirt-daemon-driver-network and libvirt-daemon-config-network installed, firewalld enabled and running, and libvirt's 'default' network started. So there must be additional conditions to reproduce. Or is this issue specific to Leap 15.5? Does anyone see it with TW? Does it exist whether using NetworkManager or wicked?

I am using Tumbleweed (last version as of 2024-0-04), and also tried on a 15.5 on my live server with the same behaviour. In short, with libvirt I redid the config several times, reset the system, whatever, and it never works.
LXD forwarding works on the first try.

****

I retried the network setup, and if I want to forward into a libvirtd-managed instance it keeps failing (currently bypassing the problem using socat, but it does not perform IP rewrite, so that is a problem).

I also tried to NAT into an LXD container and it is working fine for this one...

The command I use to create the NAT rule is the classic one (the one below is for the LXD container; for the other I just change the IP)

firewall-cmd --zone=public --add-rich-rule='rule family="ipv4" destination address="192.168.178.2" forward-port port="1201" protocol="tcp" to-port="1201" to-addr="10.29.49.148"' --permanent

When I try to access the libvirt one, wireshark reports an ICMP response: Destination unreachable (Port unreachable).

The response looks like it is generated NOT on the libvirt interface (if I put wireshark listening there I see nothing) but on the eth0 one.

If I remove the NAT rule (and start nc) it works fine. So is it the firewall that goes crazy when the rule is present...?

After a recent update of my virtual machines host (sorry for being fuzzy) I started seeing connections being dropped between the VMs and the host, specifically NFS.

The interfaces are assigned to the libvirt-routed zone contributed by libvirt-daemon-driver-network.

# firewall-cmd --get-active-zones
(...)
libvirt-routed
  interfaces: kubic-net-br virbr1

# rpm -qf /usr/lib/firewalld/zones/libvirt-routed.xml
libvirt-daemon-driver-network-9.0.0-150500.6.11.1.x86_64

After enabling firewalld's logging of denied packets, I see log entries such as

Aug 10 13:08:02 vmhost002 kernel: "filter_IN_policy_libvirt-to-host_REJECT: "IN=virbr1 OUT= MAC=52:54:00:33:68:12:52:54:00:ce:5c:25:08:00 SRC=10.25.1.6 DST=10.25.0.1 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=54291 DF PROTO=TCP SPT=943 DPT=2049 WINDOW=64240 RES=0x00 SYN URGP=0

The nfs service should be allowed

# firewall-cmd --info-zone=libvirt-routed | grep services
  services: http mountd mysql nfs rpc-bind
# firewall-cmd --info-service=nfs
nfs
  ports: 2049/tcp
  protocols:
  source-ports:
  modules:
  destination:
  includes:
  helpers:

What is worrying me and pointing to a bug are the following messages from the system journal which point to the firewalld zone not being functional

Aug 09 13:12:36 vmhost002 firewalld[17692]: ERROR: Calling pre func <bound method Firewall.full_check_config of <class 'firewall.core.fw.Firewall'>(True, True, True, 'RUNNING', False, 'public', {'nf_nat_tftp': 4}, [], True, True, True, False, 'all')>(()) failed: INVALID_ZONE: 'libvirt-routed' not among existing zones
Aug 09 13:12:36 vmhost002 firewalld[17692]: ERROR: Calling pre func <bound method Firewall.full_check_config of <class 'firewall.core.fw.Firewall'>(True, True, True, 'RUNNING', False, 'public', {'nf_nat_tftp': 4}, [], True, True, True, False, 'all')>(()) failed: INVALID_ZONE: 'libvirt-routed' not among existing zones
Aug 09 13:13:33 vmhost002 firewalld[17692]: ERROR: Calling pre func <bound method Firewall.full_check_config of <class 'firewall.core.fw.Firewall'>(True, True, True, 'INIT', False, 'public', {}, [], True, True, True, False, 'all')>(()) failed: INVALID_ZONE: 'libvirt-routed' not among existing zones
Aug 09 13:13:33 vmhost002 firewalld[17692]: ERROR: Calling pre func <bound method Firewall.full_check_config of <class 'firewall.core.fw.Firewall'>(True, True, True, 'INIT', False, 'public', {'nf_nat_tftp': 1}, [], True, True, True, False, 'all')>(()) failed: INVALID_ZONE: 'libvirt-routed' not among existing zones

Happy to provide more information if needed.
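
The rejected-packet log entry quoted in this report already names the responsible chain (`filter_IN_policy_libvirt-to-host_REJECT`) and the destination port (2049), which is what points at the policy rather than the zone. A small sketch for pulling those fields out of such lines, assuming the log format shown in this report; the function name `summarize_denied` is hypothetical:

```shell
#!/bin/bash
# Hypothetical helper: read kernel log lines on stdin and print
# "<chain prefix> <protocol> <destination port>" for each firewalld
# denied-packet entry. Lines without a DPT= field (e.g. ICMP) are skipped.
summarize_denied() {
    sed -n 's/.*\(filter_[A-Za-z0-9_-]*\): .*PROTO=\([A-Z0-9]*\).*DPT=\([0-9]*\).*/\1 \2 \3/p'
}

# Example: count denied packets per chain and port from the kernel log.
# journalctl -k | summarize_denied | sort | uniq -c
```

Grouping the output this way makes it easier to see which services would have to be added to which policy or zone.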