Bugzilla – Full Text Bug Listing
| Summary: | libvirtError: Unable to find 'memory' cgroups controller mount | | |
|---|---|---|---|
| Product: | [openSUSE] openSUSE Tumbleweed | Reporter: | Chris <chrisvte> |
| Component: | Virtualization:Other | Assignee: | virt-bugs list <virt-bugs> |
| Status: | RESOLVED WORKSFORME | QA Contact: | E-mail List <qa-bugs> |
| Severity: | Normal | | |
| Priority: | P3 - Medium | CC: | aginies, chrisvte, claudio.fontana, ericvanblokland, jfehlig, mkoutny |
| Version: | Current | | |
| Target Milestone: | --- | | |
| Hardware: | x86-64 | | |
| OS: | openSUSE Tumbleweed | | |
| Whiteboard: | | | |
| Found By: | --- | Services Priority: | |
| Business Priority: | | Blocker: | --- |
| Marketing QA Status: | --- | IT Deployment: | --- |
To mimic the minipc working system I installed these packages (with their dependencies):

- patterns-microos-cockpit
- cockpit
- cockpit-bridge
- cockpit-kdump
- cockpit-machines
- cockpit-networkmanager
- cockpit-packagekit
- cockpit-pcp
- cockpit-podman
- cockpit-storaged
- cockpit-system
- cockpit-ws

A few commands to make it work:

# systemctl enable --now cockpit.socket
# systemctl restart cockpit.socket

I accessed https://mylocalip:9090 with a regular user. I clicked on "Limited access" to switch to administrative access. In the "Podman containers" section, a message told me about enabling the podman service or something similar; I agreed, and now the LXC containers are working. I don't know how it is related...

It's probably due to the fact that this mount point was not accessible from the container (missing volume). This seems to be solved now.

Latest update applied to the AMD Ryzen 5900X machine and I still get the same behavior.

# uname -a
6.4.12-1-default #1 SMP PREEMPT_DYNAMIC Fri Aug 25 08:26:31 UTC 2023 (f5aa89b) x86_64 x86_64 x86_64 GNU/Linux

The AMD Ryzen 5600G machine with cockpit is working as expected, but I've been looking through the podman installation files and I didn't see anything relevant.

Sorry for the delay. We don't provide or support the libvirt LXC driver in SLE, so bug reports get lower attention.

(In reply to Chris from comment #0)
> After I updated to kernel 6.4.11-1-default, every time I booted a LXC I was
> getting this error:
> "libvirtError: internal error: Unable to find 'memory' cgroups controller
> mount"

What did you upgrade from? It's been a while since memory accounting was disabled. From the systemd-default-settings package:

* Fri Oct 23 2020 Franck Bui <fbui@suse.com>
- Import 0.3
  d299248 List drop-in directories in SUSE.list exclusively
  e4651a7 Disable memory accounting by default for all distros (jsc#PM-2229 jsc#PM-2230)

See /usr/lib/systemd/system.conf.d/__20-defaults-SUSE.conf.
Please check if DefaultMemoryAccounting=no. If so, provide an override to enable it.

I've run into this as well, but haven't had the time to look into it thoroughly. I think it is a permission issue. My current dirty workaround is:

sudo systemctl stop virtlxcd
sudo virtlxcd -f /etc/libvirt/virtlxcd.conf -d

(In reply to James Fehlig from comment #4)
> What did you upgrade from? It's been a while since memory accounting was
> disabled. From the systemd-default-settings package:

I don't remember the previous version of the kernel, but at worst it was a month older.

> See /usr/lib/systemd/system.conf.d/__20-defaults-SUSE.conf. Please check if
> DefaultMemoryAccounting=no. If so, provide an override to enable it.

I changed that file's content from "DefaultMemoryAccounting=no" to "DefaultMemoryAccounting=yes" and now LXC is working fine.

So, as a summary of what I checked:
- Clean installation -> LXC not working.
- Old installation -> LXC not working.
- Clean installation + cockpit with Podman containers -> LXC working.
- Old installation + DefaultMemoryAccounting=yes -> LXC working.

(In reply to Chris from comment #6)
> (In reply to James Fehlig from comment #4)
> > See /usr/lib/systemd/system.conf.d/__20-defaults-SUSE.conf. Please check if
> > DefaultMemoryAccounting=no. If so, provide an override to enable it.
>
> I changed that file content from "DefaultMemoryAccounting=no" to
> "DefaultMemoryAccounting=yes" and now LXC is working fine.

Yep, the libvirt lxc driver requires the memory controller, which is not available when DefaultMemoryAccounting=no.

BTW, as the comments say in /usr/lib/systemd/system.conf.d/__20-defaults-SUSE.conf, don't edit that file directly. But the comments are not exactly clear on where the drop-in needs to be placed.
By trial and error, I found that something like /etc/systemd/system.conf.d/80-defaults.conf containing the following works for me:

[Manager]
DefaultMemoryAccounting=yes

> So, as a summary of what I checked:
> - Clean installation -> LXC not working.
> - Old installation -> LXC not working.
> - Clean installation + cockpit with Podman containers -> LXC working.
> - Old installation + DefaultMemoryAccounting=yes -> LXC working.

Are there any cases where the memory controller is available but LXC is not working? Regardless, I think we can close this bug. Issues not related to "Unable to find 'memory' cgroups controller mount" can be handled in a new bug.

James,

Like you said, it has been a while since memory accounting was disabled. LXC worked just fine in Tumbleweed until one or two months ago.

For reasons I do not yet dare to understand, LXC starts working again when:

- I stop virtlxcd.service
- I manually start virtlxcd.service with "virtlxcd -f /etc/libvirt/virtlxcd.conf"

After manually running virtlxcd, I can close it and the systemd-managed virtlxcd will work again as well.

(In reply to James Fehlig from comment #7)
> Are there any cases where the memory controller is available but LXC is not
> working? Regardless, I think we can close this bug. Issues not related to
> "Unable to find 'memory' cgroups controller mount" can be handled in a new
> bug.

How can I check if the memory controller is available? Is it only available if I leave DefaultMemoryAccounting set to yes? What about the performance penalties?

(In reply to Eric van Blokland from comment #8)
> For reasons I do not yet dare to understand, LXC starts working again when:
>
> - I stop virtlxcd.service
> - I manually start virtlxcd.service with "virtlxcd -f
> /etc/libvirt/virtlxcd.conf"
>
> After manually running virtlxcd, I can close it and the systemd managed
> virtlxcd will work again as well.

Indeed, that's very strange.
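As a runnable sketch of the drop-in approach described above: the file content comes from the discussion, but `./demo-system.conf.d` is a stand-in for `/etc/systemd/system.conf.d` (which requires root), so the sketch can be exercised anywhere.

```shell
# Sketch: install a systemd drop-in enabling memory accounting.
# DROPIN_DIR is overridable; on a real system it would be
# /etc/systemd/system.conf.d (root required).
DROPIN_DIR="${DROPIN_DIR:-./demo-system.conf.d}"
mkdir -p "$DROPIN_DIR"
cat > "$DROPIN_DIR/80-defaults.conf" <<'EOF'
[Manager]
DefaultMemoryAccounting=yes
EOF
echo "wrote $DROPIN_DIR/80-defaults.conf"
# Then make the systemd manager re-read its configuration (or reboot):
#   systemctl daemon-reexec
```
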
Do you have settings in /etc/sysconfig/virtlxcd that conflict with those in the service file? Regardless, can we please handle this issue in a new bug?

(In reply to Chris from comment #9)
> How can I check if the memory controller is available? It is only available
> if I leave the DefaultMemoryAccounting to yes? What about the performance
> penalties?

For lxc, the memory controller needs to be available in /sys/fs/cgroup/machine.slice/. You can check the 'cgroup.controllers' file for controllers available at any point in the cgroup hierarchy. E.g.

# cat /sys/fs/cgroup/machine.slice/cgroup.controllers
memory pids

I'm not familiar enough with cgroups to comment on performance penalties associated with DefaultMemoryAccounting=yes. I can only say there are penalties, as per the comment in /usr/lib/systemd/system.conf.d/__20-defaults-SUSE.conf. Perhaps Michal can shed some light.

(In reply to James Fehlig from comment #7)
> BTW, as the comments say in
> /usr/lib/systemd/system.conf.d/__20-defaults-SUSE.conf, don't edit that file
> directly. But the comments are not exactly clear on where the drop-in needs
> to be placed.

See also (numeric prefix suggestions):

- https://en.opensuse.org/Systemd#Main_Configuration_files
- https://documentation.suse.com/sles/15-SP5/html/SLES-all/cha-systemd.html#sec-boot-systemd-custom-drop-in

(In reply to James Fehlig from comment #11)
> I can only say there are penalties as per the comment in
> /usr/lib/systemd/system.conf.d/__20-defaults-SUSE.conf. Perhaps Michal can
> shed some light.

Universal answer in [1].

As for the memory controller -- it's workload- and nr_cpus- (contention/parallelism) dependent, as always. Targeted microbenchmarks saw below a 10% drop. I believe in real workloads it will be diluted and negligible, unless you want to squeeze every last cycle and byte out of the machine (memcgs also need memory for themselves).
Partitioning memory into (accounting) cgroups affects reclaim, which may affect interference between jobs on the machine (again workload-dependent, but it generally makes sense to use memcgs between containers).

[1] https://documentation.suse.com/sles/15-SP5/html/SLES-all/cha-tuning-cgroups.html#sec-tuning-cgroups-accounting

(In reply to Michal Koutný from comment #12)
> (In reply to James Fehlig from comment #11)
> > I can only say there are penalties as per the comment in
> > /usr/lib/systemd/system.conf.d/__20-defaults-SUSE.conf. Perhaps Michal can
> > shed some light.
>
> Universal answer in [1].
> As for memory controller -- it's workload and nr_cpus
> (contention/parallelism) dependent (as always). Targeted microbenchmarks
> saw below 10% drop. I believe in real workloads it will be diluted and
> negligible unless you want to squeeze every last cycle and byte (memcgs also
> need memory for themselves) of the machine.
> Partitioning memory into (accounting) cgroups affects reclaim, that may
> affect interference between jobs on the machine (which would be again
> workload dependent but it generally makes sense using memcgs between
> containers).
>
> [1]
> https://documentation.suse.com/sles/15-SP5/html/SLES-all/cha-tuning-cgroups.html#sec-tuning-cgroups-accounting

Thanks Michal for your always insightful comments :-). I'm going to close this bug now as worksforme, since there were no actual fixes, only configuration changes.

(In reply to James Fehlig from comment #11)
> (In reply to Chris from comment #9)
> > How can I check if the memory controller is available? It is only available
> > if I leave the DefaultMemoryAccounting to yes? What about the performance
> > penalties?
>
> For lxc, the memory controller needs to be available in
> /sys/fs/cgroup/machine.slice/. You can check the 'cgroup.controllers' file
> for controllers available at any point in the cgroup hierarchy. E.g.
>
> # cat /sys/fs/cgroup/machine.slice/cgroup.controllers
> memory pids

"cat /sys/fs/cgroup/machine.slice/cgroup.controllers" is showing the same result when "DefaultMemoryAccounting" is set to "Yes" as when it is set to "No":

cpu memory pids

I tried what Eric said and it worked with DefaultMemoryAccounting set to "No". So what we have for now are a few workarounds to avoid the "Unable to find 'memory' cgroups controller" error, and I have no clue about the root of what is wrong. How could it be marked as solved if any new or old installation has this error?

(In reply to Chris from comment #14)
> "cat /sys/fs/cgroup/machine.slice/cgroup.controllers" is showing the same
> result when "DefaultMemoryAccounting" is set to "Yes" as when it is set to
> "No":
>
> cpu memory pids

I suppose you have it enabled via other means. By default on TW I only see 'pids'.

> I tried what Eric said and it worked with DefaultMemoryAccounting set to
> "No".
> So what we have for now are a few workarounds to avoid the "Unable to find
> 'memory' cgroups controller" error and I have no clue about the root of what
> is wrong.

DefaultMemoryAccounting=yes is not a workaround. It's a required config change for libvirt-lxc.

> How could it be marked as solved if any new or old installation has this
> error?

Are there any new or old installations that still give the "Unable to find 'memory' cgroups controller mount" error after the required config change?

Wild guess (because libvirt/lxc is weird) -- maybe the relevant change is which controllers the system.slice/virtlxcd.service cgroup has (as opposed to machine.slice/, but affected by DefaultMemoryAccounting=), more precisely the cgroup of the virtlxcd process (hence the effect of a manual start).
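The wild guess above can be probed directly: on cgroup v2, `/proc/<pid>/cgroup` gives a process's cgroup path, and the `cgroup.controllers` file at that path lists the enabled controllers. A sketch, defaulting to the current shell (on a real host one could set `PID=$(pidof virtlxcd)` to inspect the daemon instead):

```shell
# Sketch: show which cgroup v2 path a process is in, and which
# controllers are enabled there. PID defaults to "self".
PID="${PID:-self}"
CGPATH=$(sed -n 's/^0:://p' "/proc/$PID/cgroup")
echo "cgroup: ${CGPATH:-<no v2 entry>}"
CTRL="/sys/fs/cgroup${CGPATH}/cgroup.controllers"
if [ -r "$CTRL" ]; then
    echo "controllers: $(cat "$CTRL")"
else
    echo "cannot read $CTRL"
fi
```
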
> DefaultMemoryAccounting=yes is not a workaround. It's a required config
> change for libvirt-lxc.
James, could you clarify this requirement? Is it required by libvirt, lxc or openSUSE?
As Chris mentioned, the memory controller has always been available regardless of the DefaultMemoryAccounting setting. I have always been under the impression the controller just didn't do anything with DefaultMemoryAccounting=No set.
Ok, now I have a better understanding of the problem...

In my opinion, DefaultMemoryAccounting=yes is not and should not be required for lxc containers to run. Some change, presumably in systemd or its configuration, made this issue surface. At first glance I think this is an lxc/systemd interoperability issue.

If I read the source correctly, the libvirt LXC driver assumes it can check cgroup controller availability with its own cgroup. Under the current circumstances this assumption is wrong. When the LXC driver is run with systemd, the container process cgroup is moved under machine.slice, where the memory controller is available regardless of the DefaultMemoryAccounting setting.

If the current behaviour/configuration of systemd/cgroups is correct and desired, I could write a "crude" fix for the LXC driver to not check controller availability when started with systemd and the cgroupv2 backend.

(In reply to James Fehlig from comment #15)
> I suppose you have it enabled via other means. By default on TW I only see
> 'pids'.

I think the same. The computer which is running fine with the cockpit Podman containers activated is showing this:

cpuset cpu io memory hugetlb pids rdma misc

> DefaultMemoryAccounting=yes is not a workaround. It's a required config
> change for libvirt-lxc.

If it is required, it would be great if this config was applied automatically as part of the virtualization tools install, or written into the wiki.

> Are there any new or old installations that still give "Unable to find
> 'memory' cgroups controller mount" error after the required config change?

No, but neither do the other two "solutions". What I would like to know is what change introduced this problem, to understand it and fix it so no one else runs into it.

(In reply to Chris from comment #19)
> What I would like to know is what change introduced this problem, to
> understand it and fix it so no one else runs into it.
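The availability check James described earlier (reading `cgroup.controllers`) can be wrapped in a tiny helper; a sketch, where the helper takes the controller name and a file path so it can be pointed at any level of the hierarchy (`MACHINE_SLICE` is overridable because `machine.slice` only exists on a host running libvirt/systemd-machined):

```shell
# Sketch: test whether a named controller appears in a
# cgroup.controllers file ($1 = controller, $2 = file path).
has_controller() {
    [ -r "$2" ] && grep -qw -- "$1" "$2"
}

# On a real host, the libvirt lxc driver needs 'memory' under machine.slice:
MACHINE_SLICE="${MACHINE_SLICE:-/sys/fs/cgroup/machine.slice/cgroup.controllers}"
if has_controller memory "$MACHINE_SLICE"; then
    echo "memory controller available in machine.slice"
else
    echo "memory controller NOT available in machine.slice"
fi
```
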
FTR, this patch came to my mind [1], but it's been out for some time already. I don't recall any change in the kernel or systemd that would change cgroup layouts more recently than this.

[1] https://bugzilla.suse.com/show_bug.cgi?id=1183247#c48 (oh, I notice that Eric should be aware already :-)

(In reply to Michal Koutný from comment #20)
> (In reply to Chris from comment #19)
> > What I would like to know is
> > what change introduced this problem to understand it and fixed it so no one
> > else ran into it.
>
> FTR, this patch came to my mind [1] but it's been out for some time already.
> I don't recall any change in kernel nor systemd that would change cgroup
> layouts more recent than this.
>
> [1] https://bugzilla.suse.com/show_bug.cgi?id=1183247#c48 (oh, I notice that
> Eric should be aware already :-)

That patch applies to something that happened after the point where this new issue occurs. As I explained in an earlier comment, the libvirt lxc driver checks controller availability on a cgroup (I'm unsure which exactly at this time, but I'll look into that as soon as I get the chance) which has not been moved into machine.slice, where the required controllers are available regardless of the DefaultMemoryAccounting setting.

There is one other change I can think of that might be affecting this: at what point did Tumbleweed swap from the monolithic libvirtd to the modular one?

(In reply to Eric van Blokland from comment #17)
> > DefaultMemoryAccounting=yes is not a workaround. It's a required config
> > change for libvirt-lxc.
>
> James, could you clarify this requirement? Is it required by libvirt, lxc or
> opensuse?

As you've come to find out, apparently not. And you've likely concluded that I know little about the lxc driver and don't seem to care. And you'd be right in both cases :-).

As the sole maintainer of libvirt, my focus and energy is on SLE, where the lxc driver is disabled. The thing is enough of a distraction that I'd like to unconditionally disable it.
AFAICT, there's no user community, and being an upstream libvirt maintainer I can say it gets no attention or maintenance, only reactive fixes.

Why not use "modern" container technologies that have much larger user communities and active upstream development communities? What benefit does an unmaintained, little-used and little-tested technology like libvirt-lxc provide?

(In reply to James Fehlig from comment #22)
> (In reply to Eric van Blokland from comment #17)
> > > DefaultMemoryAccounting=yes is not a workaround. It's a required config
> > > change for libvirt-lxc.
> >
> > James, could you clarify this requirement? Is it required by libvirt, lxc or
> > opensuse?
>
> As you come to find out, apparently not. And you've likely concluded that I
> know little about the lxc driver, and don't seem to care. And you'd be right
> in both cases :-).
>
> As the sole maintainer of libvirt, my focus and energy is on SLE, where the
> lxc driver is disabled. The thing is enough of a distraction that I'd like
> to unconditionally disable it. AFAICT, there's no user community, and being
> an upstream libvirt maintainer I can say it gets no attention or
> maintenance, only reactive fixes.
>
> Why not use "modern" container technologies that have much larger user
> communities and active upstream development communities? What benefit does
> an unmaintained, little used and tested technology like libvirt-lxc provide?

Hey James,

I'm fully aware that the LXC driver is not supported by SUSE and that I have you to thank for any patches that get applied regardless. This is also why I don't mind spending some time to fix issues and actually submit a patch.

As to the why of "libvirt-lxc": I've worked with libvirt my entire professional life to run VMs, so it's tech that I know (that's why libvirt). For situations where an entire VM is a bit of overkill, I prefer to use containers. The fact that I can use the same CLI and UI tools for both my VMs and containers is extremely appealing.
Is there another libvirt container driver that would be more or less a drop-in replacement for LXC? I see a few others listed, but I'm not familiar with them and apparently can't assume they're actively maintained.

(In reply to Eric van Blokland from comment #23)
> I'm fully aware that the LXC driver is not supported by SUSE and that I have
> you to thank for any patches that get applied regardless. This is also why I
> don't mind to spend some time to fix issues and actually submit a patch.

Thanks for that!

> As to the why "libvirt-lxc": I've worked with libvirt my entire professional
> life to run vms. So it's tech that I know (that's why libvirt). For
> situations where an entire vm is a bit overkill I prefer to use containers.
> The fact I can use the same cli and ui tools for both my vms and containers
> is extremely appealing.

Right, one of the main benefits of libvirt. Okay, I'll leave the lxc driver enabled, with the caveat that the user community is also the maintainer of the driver :-). I'll encourage bug reporters to pursue a fix upstream, at which point I'm happy to integrate it in the downstream package.

> Is there another libvirt container driver that would be more or less a drop-in
> replacement for LXC? I see a few others listed, but I'm not familiar with
> them and apparently can't assume they're actively maintained.

lxc is the only container driver for libvirt.

(In reply to James Fehlig from comment #24)
> (In reply to Eric van Blokland from comment #23)
> > I'm fully aware that the LXC driver is not supported by SUSE and that I have
> > you to thank for any patches that get applied regardless. This is also why I
> > don't mind to spend some time to fix issues and actually submit a patch.
>
> Thanks for that!
>
> > As to the why "libvirt-lxc": I've worked with libvirt my entire professional
> > life to run vms. So it's tech that I know (that's why libvirt). For
> > situations where an entire vm is a bit overkill I prefer to use containers.
> > The fact I can use the same cli and ui tools for both my vms and containers
> > is extremely appealing.
>
> Right, one of the main benefits of libvirt. Okay, I'll leave the lxc driver
> enabled with the caveat that user community is also the maintainers of the
> driver :-). I'll encourage bug reporters to pursue a fix upstream, at which
> point I'm happy to integrate in the downstream package.
>
> > Is there another libvirt container driver that would be more or less a drop
> > in replacement for LXC? I see a few others listed, but I'm not familiar with
> > them and apparently can't assume they're actively maintained.
>
> lxc is the only container driver for libvirt.

Sorry for not replying earlier, I've been crazy busy. I don't mind doing some maintenance for the software I use. I will try to fix the issue in this report as soon as I can, but since there are a couple of workarounds, it's not at the top of my todo list at this time. Is there anyone in particular you think I could bother if I have questions about implementation details?

(In reply to Eric van Blokland from comment #25)
> Is there anyone in particular you think I could bother if I have questions
> about implementation details?

From an upstream perspective, Michal Privoznik (mprivozn@redhat.com) is kinda-sorta the maintainer of the lxc driver. He fixes issues on occasion, reviews contributions, and responds to lxc-related GitLab issues. But others in the community might have an opinion too, so it's best IMO to direct questions to devel@lists.libvirt.org. Thanks!
After I updated to kernel 6.4.11-1-default, every time I booted an LXC container I was getting this error:

"libvirtError: internal error: Unable to find 'memory' cgroups controller mount"

I replicated the installation on a minipc (Intel N5105) and it worked perfectly, so I did a new installation on the first computer (AMD Ryzen 5600G) and, again, I got the same error as at the beginning. I tried on two other computers (AMD Ryzen 5900X and Intel i5-1135G7) and got the same error. If I added "systemd.unified_cgroup_hierarchy=0" to the boot line, the error went away, but it was really unstable (containers stopped working after a while).

I usually install the virtualization tools through YaST (KVM Server and KVM tools). After that I install a few more packages: libvirt-client, libvirt-daemon, libvirt-daemon-lxc, libvirt-daemon-driver-lxc, system-user-libvirt-dbus. Finally, I add the user to the libvirt, qemu and kvm groups. Reboot and run.

The details of the error using Virt-Manager:

Traceback (most recent call last):
  File "/usr/share/virt-manager/virtManager/asyncjob.py", line 72, in cb_wrapper
    callback(asyncjob, *args, **kwargs)
  File "/usr/share/virt-manager/virtManager/asyncjob.py", line 108, in tmpcb
    callback(*args, **kwargs)
  File "/usr/share/virt-manager/virtManager/object/libvirtobject.py", line 57, in newfn
    ret = fn(self, *args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/share/virt-manager/virtManager/object/domain.py", line 1425, in startup
    self._backend.create()
  File "/usr/lib64/python3.11/site-packages/libvirt.py", line 1373, in create
    raise libvirtError('virDomainCreate() failed')
libvirt.libvirtError: Unable to find 'memory' cgroups controller mount

And using the command line:

# virsh -d 0 -c lxc:// start 101-PiHole
start: domain(optdata): 101-PiHole
start: found option <domain>: 101-PiHole
start: <domain> trying as domain NAME
error: Failed to start domain '101-PiHole'
error: internal error: Unable to find 'memory' cgroups controller mount

About the system:

# mount | grep cgroup
cgroup2 on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate,memory_recursiveprot)

# uname -a
Linux 6.4.11-1-default #1 SMP PREEMPT_DYNAMIC Thu Aug 17 04:57:43 UTC 2023 (2a5b3f6) x86_64 x86_64 x86_64 GNU/Linux
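A quicker equivalent of the `mount | grep cgroup` check above is to ask for the filesystem type of /sys/fs/cgroup directly; a sketch (on a unified-hierarchy system like the one in this report, GNU `stat` reports `cgroup2fs`, while `tmpfs` usually indicates legacy/hybrid mode):

```shell
# Sketch: report whether /sys/fs/cgroup is the unified (v2) hierarchy.
FSTYPE=$(stat -fc %T /sys/fs/cgroup 2>/dev/null || echo unknown)
echo "cgroup mount type: $FSTYPE"
```
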