Bugzilla – Bug 1218376
conntrackd buffer size insufficient?
Last modified: 2024-01-03 15:00:27 UTC
Hi, we run a pair of openSUSE Leap machines in our openSUSE infrastructure acting as firewalls and use conntrackd on them. I noticed that after rebooting one node, its initial resync is aborted by systemd:

```
systemd[1]: conntrackd.service: Watchdog timeout (limit 1min)!
```

I tried increasing the `WatchdogSec` parameter in a systemd override, but even 240 seconds does not help:

```
Dec 24 04:46:54 asgard2 systemd[1]: Started Connection tracking userspace daemon.
Dec 24 04:46:54 asgard2 conntrack-tools[3196]: using user-space event filtering
Dec 24 04:46:54 asgard2 conntrack-tools[3196]: netlink event socket buffer size has been set to 262142 bytes
Dec 24 04:46:54 asgard2 conntrack-tools[3196]: initialization completed
Dec 24 04:46:54 asgard2 conntrack-tools[3196]: -- starting in console mode --
Dec 24 04:46:54 asgard2 conntrack-tools[3196]: resync requested
Dec 24 04:46:54 asgard2 conntrackd[3196]: [Sun Dec 24 04:46:54 2023] (pid=3196) [notice] netlink event socket buffer size has been doubled to 524284 bytes
Dec 24 04:46:54 asgard2 conntrack-tools[3196]: netlink event socket buffer size has been doubled to 524284 bytes
Dec 24 04:46:54 asgard2 conntrackd[3196]: [Sun Dec 24 04:46:54 2023] (pid=3196) [warning] netlink event socket buffer size cannot be doubled further since it will exceed NetlinkBufferSizeMaxGrowth. We are likely to be losing events, this may lead to unsynchronized replicas. Please, consider increasing netlink socket buffer size via NetlinkBufferSize and NetlinkBufferSizeMaxGrowth clauses in conntrackd.conf
Dec 24 04:46:54 asgard2 conntrack-tools[3196]: netlink event socket buffer size cannot be doubled further since it will exceed NetlinkBufferSizeMaxGrowth. We are likely to be losing events, this may lead to unsynchronized replicas. Please, consider increasing netlink socket buffer size via NetlinkBufferSize and NetlinkBufferSizeMaxGrowth clauses in conntrackd.conf
Dec 24 04:46:41 asgard2 systemd[1]: conntrackd.service: Watchdog timeout (limit 4min)!
Dec 24 04:46:41 asgard2 systemd[1]: conntrackd.service: Killing process 3196 (conntrackd) with signal SIGABRT.
Dec 24 04:46:41 asgard2 systemd[1]: conntrackd.service: Main process exited, code=dumped, status=6/ABRT
Dec 24 04:46:41 asgard2 systemd[1]: conntrackd.service: Failed with result 'watchdog'.
Dec 24 04:46:41 asgard2 systemd[1]: conntrackd.service: Scheduled restart job, restart counter is at 1.
Dec 24 04:46:41 asgard2 systemd[1]: Stopped Connection tracking userspace daemon.
Dec 24 04:46:41 asgard2 systemd[1]: Starting Connection tracking userspace daemon...
```

I'm unsure whether submitting a patch to conntrackd.service (in the conntrack-tools package) increasing `WatchdogSec` to a very high number is a good idea, as the long delay seems to only occur during the initial resync and not during regular operation, and I could not find a systemd setting for a dedicated timeout covering only unit startup. After the service automatically restarts, it does not time out a second time. Is there something that could be tuned here to avoid the initial startup failure?
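For reference, a drop-in override of the kind described above might look like this (the path and value are illustrative, not the exact file on the affected host):

```ini
# /etc/systemd/system/conntrackd.service.d/override.conf (illustrative path)
[Service]
WatchdogSec=240
```

followed by `systemctl daemon-reload` and a restart of the unit.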
Have you tried following the recommended setting changes from the log messages? What values have you set?
Hi Adam, I currently use:

```
NetlinkBufferSize 10064960
NetlinkBufferSizeMaxGrowth 20129920
```

This is about ten times more than what was set originally, and the issue persists.
> Dec 24 04:46:54 asgard2 systemd[1]: Started Connection tracking userspace daemon.
> Dec 24 04:46:41 asgard2 systemd[1]: conntrackd.service: Watchdog timeout (limit 4min)!

Sorry, we don't support universes where time goes backwards.
(In reply to Jan Engelhardt from comment #3)
> > Dec 24 04:46:54 asgard2 systemd[1]: Started Connection tracking userspace daemon.
> > Dec 24 04:46:41 asgard2 systemd[1]: conntrackd.service: Watchdog timeout (limit 4min)!
>
> Sorry, we don't support universes where time goes backwards.

Nice catch. I wonder if this is part of the problem. Could the RTC be wrong on bootup, causing this issue? Do you see cases where this happens when the clock is correct?
System clock got corrected during boot, in between the start of conntrackd and the watchdog kill:

```
Dec 24 04:46:59 asgard2 chronyd[3256]: Selected source 2a07:de40:b27e:1203::12 (ntp2.infra.opensuse.org)
Dec 24 04:46:59 asgard2 chronyd[3256]: System clock wrong by -256.858925 seconds
Dec 24 04:42:42 asgard2 chronyd[3256]: System clock was stepped by -256.858925 seconds
```
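To illustrate why the stepped clock matters: elapsed time computed from wall-clock readings goes negative when chronyd steps the clock backwards, whereas a monotonic clock is unaffected by stepping. This is a simulation using the step value from the chronyd log above, not a measurement from the affected host.

```python
# Simulated effect of a backwards clock step on elapsed-time calculations.
# STEP mirrors the chronyd log above; the timestamps are made up.
STEP = -256.858925  # seconds, from "System clock was stepped by -256.858925 seconds"

start_realtime = 1000.0                        # wall-clock reading at service start
now_realtime = start_realtime + 13.0 + STEP    # 13 real seconds later, after the step
elapsed_wall = now_realtime - start_realtime   # negative: "time went backwards"

start_mono = 500.0                             # monotonic reading at service start
now_mono = start_mono + 13.0                   # monotonic time is never stepped
elapsed_mono = now_mono - start_mono           # stays 13.0

print(elapsed_wall)   # -243.858925
print(elapsed_mono)   # 13.0
```

This matches the journal above, where the "Started" message carries a later wall-clock timestamp than the subsequent watchdog kill.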
Created attachment 871571 [details] journal from the 24th
Yes, I understand, but does the timeout behaviour also happen in other instances, where the clock is not wrong? If yes (still see timeouts), could you update to current version 1.4.8 to check if the problem persists? That would be in security:netfilter/conntrack-tools
> Yes, I understand, but does the timeout behaviour also happen in other instances, where the clock is not wrong?

Will try on the next boot and let you know.

> If yes (still see timeouts), could you update to current version 1.4.8 to check if the problem persists?

I can, if it is not a problem to run different versions on both nodes temporarily?
(In reply to Georg Pfuetzenreuter from comment #8)
> > If yes (still see timeouts), could you update to current version 1.4.8 to check if the problem persists?
>
> I can, if it is not a problem to run different versions on both nodes
> temporarily?

These versions should be compatible.
It seems the clock drift was caused by an unsynchronized hypervisor clock. I corrected it now, so the VM in question no longer performs clock stepping during boot. After this, I can no longer observe any watchdog timeouts; conntrackd starts on the first attempt:

```
# journalctl -S today -u conntrackd
Jan 03 10:53:21 asgard2 systemd[1]: Starting Connection tracking userspace daemon...
Jan 03 10:53:21 asgard2 conntrackd[3109]: [Wed Jan 3 10:53:21 2024] (pid=3109) [notice] disabling external cache
Jan 03 10:53:21 asgard2 conntrack-tools[3109]: disabling external cache
Jan 03 10:53:21 asgard2 conntrackd[3109]: [Wed Jan 3 10:53:21 2024] (pid=3109) [notice] using user-space event filtering
Jan 03 10:53:21 asgard2 conntrackd[3109]: [Wed Jan 3 10:53:21 2024] (pid=3109) [notice] netlink event socket buffer size has been set to 262142 bytes
Jan 03 10:53:21 asgard2 conntrackd[3109]: [Wed Jan 3 10:53:21 2024] (pid=3109) [notice] initialization completed
Jan 03 10:53:21 asgard2 conntrackd[3109]: [Wed Jan 3 10:53:21 2024] (pid=3109) [notice] -- starting in console mode --
Jan 03 10:53:21 asgard2 conntrackd[3109]: [Wed Jan 3 10:53:21 2024] (pid=3109) [notice] resync requested
Jan 03 10:53:21 asgard2 conntrack-tools[3109]: using user-space event filtering
Jan 03 10:53:21 asgard2 conntrack-tools[3109]: netlink event socket buffer size has been set to 262142 bytes
Jan 03 10:53:21 asgard2 conntrack-tools[3109]: initialization completed
Jan 03 10:53:21 asgard2 conntrack-tools[3109]: -- starting in console mode --
Jan 03 10:53:21 asgard2 conntrack-tools[3109]: resync requested
Jan 03 10:53:21 asgard2 systemd[1]: Started Connection tracking userspace daemon.
Jan 03 10:53:21 asgard2 conntrackd[3109]: [Wed Jan 3 10:53:21 2024] (pid=3109) [notice] netlink event socket buffer size has been doubled to 524284 bytes
Jan 03 10:53:21 asgard2 conntrack-tools[3109]: netlink event socket buffer size has been doubled to 524284 bytes
Jan 03 10:53:21 asgard2 conntrackd[3109]: [Wed Jan 3 10:53:21 2024] (pid=3109) [warning] netlink event socket buffer size cannot be doubled further since it will exceed NetlinkBufferSizeMaxGrowth. We are likely to be losing events, this may lead to unsynchronized replicas. Please, consider increasing netlink socket buffer size via NetlinkBufferSize and NetlinkBufferSizeMaxGrowth clauses in conntrackd.conf
Jan 03 10:53:21 asgard2 conntrack-tools[3109]: netlink event socket buffer size cannot be doubled further since it will exceed NetlinkBufferSizeMaxGrowth. We are likely to be losing events, this may lead to unsynchronized replicas. Please, consider increasing netlink socket buffer size via NetlinkBufferSize and NetlinkBufferSizeMaxGrowth clauses in conntrackd.conf
```

Thank you very much for pointing me in the right direction. I am still concerned about the NetlinkBufferSize warnings, but would say they're out of scope here.
(In reply to Georg Pfuetzenreuter from comment #2)
> Hi Adam,
>
> I currently use:
>
> ```
> NetlinkBufferSize 10064960
> NetlinkBufferSizeMaxGrowth 20129920
> ```
>
> This is about ten times more than what was set originally and the issue
> persists.

Add a zero to the MaxGrowth one and see what happens. The warning you are seeing is generated when BufferSize * 2 > BufferSizeMaxGrowth, which happens rather quickly. And sadly, this warning is displayed only once, even if the condition persists.

How many connections is it trying to sync?
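The growth condition described above can be sketched as follows. This is an illustrative model of the check, not the actual conntrackd source; the function name is made up.

```python
# Illustrative sketch of the netlink buffer growth check described above:
# the buffer is doubled until doubling would exceed NetlinkBufferSizeMaxGrowth,
# at which point the (one-time) warning is emitted instead.
def grow_buffer(size: int, max_growth: int) -> tuple[int, bool]:
    """Return (new_size, grew); refuse to double once size * 2 > max_growth."""
    if size * 2 > max_growth:
        return size, False  # the warning case: cannot double further
    return size * 2, True

# With the values from comment #2, exactly one doubling fits under MaxGrowth
# (10064960 * 2 == 20129920); the next attempt would trigger the warning.
size, grew = grow_buffer(10064960, 20129920)
print(size, grew)            # 20129920 True
print(grow_buffer(size, 20129920))  # (20129920, False) -> warning
```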
Added a 0, unfortunately no change:

```
NetlinkBufferSize 10064960
NetlinkBufferSizeMaxGrowth 201299200
```

```
Jan 03 11:41:07 asgard2 conntrackd[3094]: [Wed Jan 3 11:41:07 2024] (pid=3094) [notice] netlink event socket buffer size has been doubled to 524284 bytes
Jan 03 11:41:07 asgard2 conntrackd[3094]: [Wed Jan 3 11:41:07 2024] (pid=3094) [warning] netlink event socket buffer size cannot be doubled further since it will exceed NetlinkBufferSizeMaxGrowth. We are likely to be losing events, this may lead to unsynchronized replicas. Please, consider increasing netlink socket buffer size via NetlinkBufferSize and NetlinkBufferSizeMaxGrowth clauses in conntrackd.conf
```

Below are the statistics, including connection counts.

Primary:

```
asgard1 # conntrackd -i | wc -l
57521
asgard1 # wc -l /proc/net/nf_conntrack
28397 /proc/net/nf_conntrack
asgard1 (Firewall, Router):~ # conntrackd -s
cache internal:
current active connections:        61311
connections created:           358021997    failed:    14217661
connections updated:          2073835016    failed:           0
connections destroyed:         357960686    failed:           0

external inject:
connections created:            65759864    failed:           0
connections updated:            47787341    failed:           1
connections destroyed:            829393    failed:           0

traffic processed:
                   0 Bytes                         0 Pckts

multicast traffic (active device=os-asgard):
        557760735600 Bytes sent       16236662192 Bytes recv
          1741749095 Pckts sent          30735608 Pckts recv
                   0 Error send                 0 Error recv

message tracking:
                   0 Malformed msgs    260390609 Lost msgs
```

Secondary (that's trying to sync):

```
asgard2 # conntrackd -i | wc -l
48
asgard2 # wc -l /proc/net/nf_conntrack
28976 /proc/net/nf_conntrack
asgard2 # conntrackd -s
cache internal:
current active connections:           85
connections created:                  87    failed:           0
connections updated:                  95    failed:           0
connections destroyed:                 2    failed:           0

external inject:
connections created:               90157    failed:           0
connections updated:              686213    failed:           0
connections destroyed:             10934    failed:           0

traffic processed:
                   0 Bytes                         0 Pckts

multicast traffic (active device=os-asgard):
              140196 Bytes sent          72290060 Bytes recv
                1894 Pckts sent            431211 Pckts recv
                   0 Error send                 0 Error recv

message tracking:
                   0 Malformed msgs      1648886 Lost msgs
```
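As a rough sanity check on the primary's multicast counters above (my own arithmetic; treating "Lost msgs" as a fraction of multicast packets sent is an assumption about the counters' semantics, not documented behaviour):

```python
# Values copied from `conntrackd -s` on the primary (asgard1) above.
pckts_sent = 1_741_749_095   # multicast Pckts sent
lost_msgs = 260_390_609      # message tracking: Lost msgs

# Under the assumption that each lost message corresponds to a sent packet,
# roughly 15% of the sync traffic is being lost.
loss_ratio = lost_msgs / pckts_sent
print(f"{loss_ratio:.1%}")
```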
It's a little strange; it's not that many connections. You said this is running in a VM. What is the load on the host and in the VM? Would it be possible for this VM to have a higher scheduling priority, or maybe to pin it to some specific core(s)? If CPU load is not an issue, it would be interesting to see what the cause is here. Can you attach the config file for the VM, or the libvirt process command line showing the parameters it is started with?
Thanks for updating the bug for this. For load inside the VM, maybe the dashboard is helpful: https://monitor.opensuse.org/grafana/d/rYdddlPWk/node-metrics?orgId=1&var-DS_PROMETHEUS=default&var-job=nodes&var-node=asgard2.infra.opensuse.org

The hypervisor is not visualized there, but it is mostly idling. This VM is the only one running on it, and only the virtualization-relevant services run on the host system. Here is some output from said hypervisor (the qemu command line is wrapped for readability):

```
orbit21:~ # ps -eo pcpu,args | grep qemu
29.3 /usr/bin/qemu-system-x86_64
    -name guest=asgard2.infra.opensuse.org,process=qemu:asgard2.infra.opensuse.org,debug-threads=on
    -S
    -object {"qom-type":"secret","id":"masterKey0","format":"raw","file":"/var/lib/libvirt/qemu/domain-1-asgard2.infra.opensu/master-key.aes"}
    -blockdev {"driver":"file","filename":"/usr/share/qemu/ovmf-x86_64-opensuse-code.bin","node-name":"libvirt-pflash0-storage","auto-read-only":true,"discard":"unmap"}
    -blockdev {"node-name":"libvirt-pflash0-format","read-only":true,"driver":"raw","file":"libvirt-pflash0-storage"}
    -blockdev {"driver":"file","filename":"/data0/kvm/nvram/asgard2.infra.opensuse.org.fd","node-name":"libvirt-pflash1-storage","auto-read-only":true,"discard":"unmap"}
    -blockdev {"node-name":"libvirt-pflash1-format","read-only":false,"driver":"raw","file":"libvirt-pflash1-storage"}
    -machine pc-q35-7.1,usb=off,smm=on,dump-guest-core=off,memory-backend=pc.ram,pflash0=libvirt-pflash0-format,pflash1=libvirt-pflash1-format
    -accel kvm
    -cpu EPYC-Rome,ht=on,monitor=on,xsaves=on,cmp-legacy=on,extapic=on,ibs=on,skinit=on,wdt=on,tce=on,topoext=on,perfctr-nb=on,invtsc=on,ibrs=on,amd-ssbd=on,lbrv=on,svm-lock=on,tsc-scale=on,vmcb-clean=on,flushbyasid=on,decodeassists=on,pause-filter=on,pfthreshold=on,avic=on,v-vmsave-vmload=on,vgif=on
    -global driver=cfi.pflash01,property=secure,value=on
    -m 5860
    -object {"qom-type":"memory-backend-ram","id":"pc.ram","size":6144655360}
    -overcommit mem-lock=off
    -smp 4,maxcpus=6,sockets=6,cores=1,threads=1
    -uuid 4753886d-6569-4b40-b6b7-9818f5a66acd
    -display none
    -no-user-config
    -nodefaults
    -chardev socket,id=charmonitor,fd=26,server=on,wait=off
    -mon chardev=charmonitor,id=monitor,mode=control
    -rtc base=utc
    -no-shutdown
    -boot strict=on
    -device {"driver":"pcie-root-port","port":8,"chassis":1,"id":"pci.1","bus":"pcie.0","multifunction":true,"addr":"0x1"}
    -device {"driver":"pcie-root-port","port":9,"chassis":2,"id":"pci.2","bus":"pcie.0","addr":"0x1.0x1"}
    -device {"driver":"pcie-root-port","port":10,"chassis":3,"id":"pci.3","bus":"pcie.0","addr":"0x1.0x2"}
    -device {"driver":"pcie-root-port","port":11,"chassis":4,"id":"pci.4","bus":"pcie.0","addr":"0x1.0x3"}
    -device {"driver":"pcie-root-port","port":12,"chassis":5,"id":"pci.5","bus":"pcie.0","addr":"0x1.0x4"}
    -device {"driver":"pcie-root-port","port":13,"chassis":6,"id":"pci.6","bus":"pcie.0","addr":"0x1.0x5"}
    -device {"driver":"pcie-root-port","port":14,"chassis":7,"id":"pci.7","bus":"pcie.0","addr":"0x1.0x6"}
    -device {"driver":"qemu-xhci","id":"usb","bus":"pci.2","addr":"0x0"}
    -device {"driver":"virtio-serial-pci","id":"virtio-serial0","bus":"pci.3","addr":"0x0"}
    -blockdev {"driver":"file","filename":"/data0/kvm/disks/asgard2.infra.opensuse.org_root.qcow2","node-name":"libvirt-1-storage","cache":{"direct":true,"no-flush":false},"auto-read-only":true,"discard":"unmap"}
    -blockdev {"node-name":"libvirt-1-format","read-only":false,"cache":{"direct":true,"no-flush":false},"driver":"qcow2","file":"libvirt-1-storage","backing":null}
    -device {"driver":"virtio-blk-pci","bus":"pci.4","addr":"0x0","drive":"libvirt-1-format","id":"virtio-disk1","bootindex":1,"write-cache":"on"}
    -netdev tap,fd=27,vhost=on,vhostfd=29,id=hostnet0
    -device {"driver":"virtio-net-pci","netdev":"hostnet0","id":"net0","mac":"d2:e1:4b:98:46:2f","bus":"pci.1","addr":"0x0"}
    -chardev pty,id=charserial0
    -device {"driver":"isa-serial","chardev":"charserial0","id":"serial0","index":0}
    -chardev socket,id=charchannel0,fd=25,server=on,wait=off
    -device {"driver":"virtserialport","bus":"virtio-serial0.0","nr":1,"chardev":"charchannel0","id":"channel0","name":"org.qemu.guest_agent.0"}
    -audiodev {"id":"audio1","driver":"none"}
    -device {"driver":"virtio-balloon-pci","id":"balloon0","bus":"pci.5","addr":"0x0"}
    -object {"qom-type":"rng-random","id":"objrng0","filename":"/dev/urandom"}
    -device {"driver":"virtio-rng-pci","rng":"objrng0","id":"rng0","bus":"pci.6","addr":"0x0"}
    -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny
    -msg timestamp=on

orbit21:~ # free -h
              total        used        free      shared  buff/cache   available
Mem:          503Gi       5.3Gi       500Gi        16Mi       278Mi       498Gi
Swap:          31Gi          0B        31Gi

orbit21:~ # virsh nodeinfo
CPU model:           x86_64
CPU(s):              32
CPU frequency:       1500 MHz
CPU socket(s):       1
Core(s) per socket:  16
Thread(s) per core:  1
NUMA cell(s):        2
Memory size:         528090836 KiB

orbit21:~ # virsh vcpuinfo asgard2.infra.opensuse.org
VCPU:           0
CPU:            16
State:          running
CPU time:       159.3s
CPU Affinity:   yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy

VCPU:           1
CPU:            16
State:          running
CPU time:       1177.8s
CPU Affinity:   yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy

VCPU:           2
CPU:            17
State:          running
CPU time:       1760.1s
CPU Affinity:   yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy

VCPU:           3
CPU:            19
State:          running
CPU time:       54.1s
CPU Affinity:   yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy

orbit21:~ # virsh dommemstat asgard2.infra.opensuse.org
actual 4000000
swap_in 0
swap_out 0
major_fault 84
minor_fault 92503
unused 5597588
available 5770056
usable 5499912
last_update 1704282057
disk_caches 58184
hugetlb_pgalloc 0
hugetlb_pgfail 0
rss 1825204
```

The template for the Libvirt domain XML file can be found here: https://github.com/openSUSE/heroes-salt/blob/production/salt/files/libvirt/domains/common.xml.j2

I will also attach the live/dumpxml version shortly. I'm happy to allocate more resources to the VM, or to adjust its configuration for scheduling/core pinning, if you still think it could help based on the above.
Created attachment 871637 [details] asgard2 domain XML
(In reply to Georg Pfuetzenreuter from comment #14)
> The hypervisor is not visualized there, but it is mostly idling. This VM is
> the only one running on it and there are only the virtualization relevant
> services running on the host system.

Thank you for the information. It seems my guess that it could be CPU was not correct.

> I'm happy to allocate more resources to the VM, or to adjust its
> configuration for scheduling/core pinning, if you still think it could help
> based on the above.

It's probably not related at all. I'll try to duplicate this locally.
Appreciate the help, if you need any further information, let me know.