Bug 1217403

Summary: libvirtd+virtlockd restart killed VMs
Product: [openSUSE] openSUSE Distribution
Reporter: Bernhard Wiedemann <bwiedemann>
Component: Virtualization:Other
Assignee: virt-bugs list <virt-bugs>
Status: RESOLVED INVALID
QA Contact: E-mail List <qa-bugs>
Severity: Major
Priority: P5 - None
CC: bwiedemann, georg.pfuetzenreuter, jfehlig, pdostal, santiago.zarate
Version: Leap 15.5
Target Milestone: ---
Hardware: x86-64
OS: All
See Also: https://bugzilla.suse.com/show_bug.cgi?id=1216903
Whiteboard:
Found By: Development
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---
Attachments: journal from the KVM host

Description Bernhard Wiedemann 2023-11-22 13:09:16 UTC
Created attachment 870904
journal from the KVM host

Today we had a 12-hour outage of various openSUSE services. One of the issues we discovered was that virtlockd stopped our VMs during a virtlockd restart.

A bit of background:
In our openSUSE infra, we use the os-update package, which auto-updates every day and auto-restarts the services listed in zypper ps.

Tonight 02:07:47 zypper updated 
libopenssl1_1
openssl-1_1
libxml2-2
python3-setuptools

02:07:49 os-update triggered a restart of libvirtd
02:07:51 os-update triggered a restart of virtlockd

The logs in /var/log/libvirt/qemu/ then show
2023-11-22T02:07:51.500608Z qemu-system-x86_64: terminating on signal 15 from pid 29532 (/usr/sbin/virtlockd)
2023-11-22 02:07:51.779+0000: shutting down, reason=shutdown

for all 5 VMs that were running on the "squanchy" host.


Please investigate. We would like to avoid similar outages in the future.
Comment 1 Santiago Zarate 2023-11-22 13:27:52 UTC
I think this is related to https://bugzilla.suse.com/show_bug.cgi?id=1216903. For the general issue, I filed https://progress.opensuse.org/issues/151267
Comment 2 Georg Pfuetzenreuter 2023-11-22 14:21:49 UTC
Hi Santiago,

thanks for the input, but we use AppArmor, not SELinux.
Comment 3 James Fehlig 2023-11-22 18:37:32 UTC
(In reply to Bernhard Wiedemann from comment #0)
> Tonight 02:07:47 zypper updated 
> libopenssl1_1
> openssl-1_1
> libxml2-2
> python3-setuptools
> 
> 02:07:49 os-update triggered a restart of libvirtd
> 02:07:51 os-update triggered a restart of virtlockd

Why did virtlockd get restarted? It's not safe to restart it while it is protecting resources of running VMs. It is safe to re-exec. E.g., the spec file has the following posttrans snippet:

%posttrans daemon
%libvirt_logrotate_posttrans libvirtd
# virtlockd and virtlogd must not be restarted, particularly virtlockd since the
# locks it uses to protect VM resources would be lost. Both are safe to re-exec.
%{_bindir}/systemctl reload-or-try-restart virtlockd.service >/dev/null 2>&1 || :
%{_bindir}/systemctl reload-or-try-restart virtlogd.service >/dev/null 2>&1 || :
...
Comment 4 Bernhard Wiedemann 2023-11-23 09:08:12 UTC
Alright, a virtlockd restart will be avoided in the future with
https://github.com/openSUSE/os-update/pull/17
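
The distinction drawn in comment 3 (reload/re-exec is safe for virtlockd and virtlogd, a plain restart is not) can be sketched as a small shell helper. This is only an illustration under stated assumptions: the function names restart_verb and plan_restarts are hypothetical, using try-restart for every other service is an assumption, and this is not the change made in the os-update pull request. The sketch only prints the systemctl commands it would choose; it does not run them.

```shell
#!/bin/sh
# Hypothetical helper (not taken from os-update): pick the systemctl verb
# for a service that should pick up updated libraries.
restart_verb() {
    case "$1" in
        virtlockd|virtlockd.service|virtlogd|virtlogd.service)
            # Per the libvirt spec file: these must not be restarted while
            # VMs are running, but they are safe to re-exec via reload.
            echo "reload-or-try-restart" ;;
        *)
            # Assumption: any other service may simply be restarted.
            echo "try-restart" ;;
    esac
}

# Dry run: print the commands instead of executing them.
plan_restarts() {
    for svc in "$@"; do
        echo "systemctl $(restart_verb "$svc") $svc"
    done
}

plan_restarts libvirtd.service virtlockd.service
# prints:
#   systemctl try-restart libvirtd.service
#   systemctl reload-or-try-restart virtlockd.service
```

Keeping the verb selection in one function means the list of services that must never be restarted lives in a single place, mirroring the comment in the spec file's posttrans section.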