Bug 1217029

Summary: [Build 20231109] openQA test fails in networkd_dhcp
Product: [openSUSE] openSUSE Tumbleweed Reporter: Dominique Leuenberger <dimstar>
Component: BasesystemAssignee: systemd maintainers <systemd-maintainers>
Status: NEW --- QA Contact: E-mail List <qa-bugs>
Severity: Normal    
Priority: P5 - None CC: dheidler, dimstar, fvogt, petr.vorel, santiago.zarate, thomas.blume
Version: Current   
Target Milestone: ---   
Hardware: Other   
OS: Other   
URL: https://openqa.opensuse.org/tests/3714912/modules/networkd_dhcp/steps/11
See Also: https://bugzilla.suse.com/show_bug.cgi?id=1220119
https://bugzilla.opensuse.org/show_bug.cgi?id=1225930
Whiteboard:
Found By: openQA Services Priority:
Business Priority: Blocker: Yes
Marketing QA Status: --- IT Deployment: ---

Description Dominique Leuenberger 2023-11-10 13:12:49 UTC
## Observation

The test spawns two containers that are then supposed to interact. Calling networkctl against node1 seems to 'time out' since recently

command 'systemd-run -tM node1 /bin/bash -c "networkctl"' timed out

Please help investigating what is going on here

openQA test in scenario opensuse-Tumbleweed-DVD-x86_64-systemd-networkd@64bit fails in
[networkd_dhcp](https://openqa.opensuse.org/tests/3714912/modules/networkd_dhcp/steps/11)

## Test suite description
Maintainer: dheidler@suse.de
Test for systemd-networkd. These tests run on a single testmachine and spawn systemd-nspawn containers.


## Reproducible

Fails since (at least) Build [20231109](https://openqa.opensuse.org/tests/3713820)


## Expected result

Last good: [20231108](https://openqa.opensuse.org/tests/3711613) (or more recent)


## Further details

Always latest result in this scenario: [latest](https://openqa.opensuse.org/tests/latest?arch=x86_64&distri=opensuse&flavor=DVD&machine=64bit&test=systemd-networkd&version=Tumbleweed)
Comment 1 Thomas Blume 2023-11-20 10:04:24 UTC
Current findings.
The command: "networkctl status" to nspawn container node1 hangs:

https://openqa.opensuse.org/tests/3738427#step/networkd_dhcp/15 

Tried on my private openQA setup, but couldn't reproduce the issue there (will retry on another machine).

The ps log on the container host shows:

-->
 1147 ?        Ss     0:00 login -- root
 1161 tty6     Ss     0:00  \_ -bash
 3171 tty6     S+     0:00      \_ systemd-run -tM node1 /bin/bash -c networkctl status
--<

inside the container, the status command returns correctly:

-->
Nov 18 01:45:51.308612 node1 systemd[1]: Started /bin/bash -c networkctl status.
Nov 18 01:45:51.458301 node1 systemd[1]: run-u16.service: Deactivated successfully.
--<

So, apparently the feedback from node1 to the container host is hanging.
Comment 2 Thomas Blume 2023-11-20 10:49:21 UTC
(In reply to Thomas Blume from comment #1)
> So, apparently the feedback from node1 to the container host is hanging.

Looks like there is a time keeping issue.
dmesg on the host shows:

-->
[  303.093179] clocksource: timekeeping watchdog on CPU0: Marking clocksource 'tsc' as unstable because the skew is too large:
[  303.094160] clocksource:                       'kvm-clock' wd_nsec: 480016638 wd_now: 47a9fb099a wd_last: 478d5e909c mask: ffffffffffffffff
[  303.095819] clocksource:                       'tsc' cs_nsec: 41654552965 cs_now: b3c0bc3362 cs_last: 9e6a91c09e mask: ffffffffffffffff
[  303.097036] clocksource:                       Clocksource 'tsc' skewed 41174536327 ns (41174 ms) over watchdog 'kvm-clock' interval of 480016638 ns (480 ms)
[  303.098189] clocksource:                       'kvm-clock' (not 'tsc') is current clocksource.
[  303.099101] tsc: Marking TSC unstable due to clocksource watchdog
--<

I can't see that on my private machine, where the test works.
Comment 3 Thomas Blume 2023-11-20 15:59:50 UTC
Also on my second machine, the test runs successfull.
Dominique, is this openQA instance running in a VM?
If so, that might be the difference, because mine are in containers on bare metal.
If in a VM, could you try using a different timekeeping for the VM?
Comment 4 Dominique Leuenberger 2023-11-22 08:10:04 UTC
(In reply to Thomas Blume from comment #3)
> Also on my second machine, the test runs successfull.
> Dominique, is this openQA instance running in a VM?
> If so, that might be the difference, because mine are in containers on bare
> metal.
> If in a VM, could you try using a different timekeeping for the VM?

basically, all openQA tests run in a VM (qemu/kvm with VNC output)

Let's see if Santiago (as a representative of openQA) has an idea about the clock source
Comment 7 Santiago Zarate 2023-11-22 10:02:21 UTC
(In reply to Dominique Leuenberger from comment #4)
> (In reply to Thomas Blume from comment #3)
> > Also on my second machine, the test runs successfull.
> > Dominique, is this openQA instance running in a VM?
> > If so, that might be the difference, because mine are in containers on bare
> > metal.
> > If in a VM, could you try using a different timekeeping for the VM?
> 
> basically, all openQA tests run in a VM (qemu/kvm with VNC output)
> 
> Let's see if Santiago (as a representative of openQA) has an idea about the
> clock source

I think I have an idea, IIRC... let me check in the meantime couple of things and try to set the clock/timekeeping and see if I can reproduce it by playing with the RTC parameter in qemu
Comment 9 Santiago Zarate 2023-11-23 10:23:07 UTC
it seems that the test might be hanging somewhere... lets see how it goes: https://openqa.opensuse.org/tests/3751459#live

Dimstar, can you show me or do you know the SR that introduced the latest change?
Comment 10 Thomas Blume 2023-11-27 10:28:10 UTC
Unfortunately, I cannot reproduce the issue, even in a VM.
The test just runs fine.
I don't see the error mentioned in comment#2 either:

-->
localhost:~ # dmesg | grep clocksource
[    0.000003] clocksource: kvm-clock: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
[    0.025394] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645519600211568 ns
[    0.090525] clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x1e33052aadd, max_idle_ns: 440795310221 ns
[    0.119165] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
[    0.770899] clocksource: Switched to clocksource kvm-clock
[    0.788281] clocksource: acpi_pm: mask: 0xffffff max_cycles: 0xffffff, max_idle_ns: 2085701024 ns
[    0.890714] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x1e33052aadd, max_idle_ns: 440795310221 ns
--<
Comment 11 Thomas Blume 2023-12-19 08:04:09 UTC
(In reply to Santiago Zarate from comment #9)
> it seems that the test might be hanging somewhere... lets see how it goes:
> https://openqa.opensuse.org/tests/3751459#live
> 
> Dimstar, can you show me or do you know the SR that introduced the latest
> change?

Santiago, it seems the test still hangs.
On the other hand I cannot reproduce the issue on my test setup.

Hence, I assume a time keeping issue in the test VM (see comment#2 and comment#10).
Reassigning to you for now, could you try to rerun the test on a VM where the time keeping issue doesn't appear?
If the issue still happens then, please assign back.
Comment 12 Petr Vorel 2024-02-20 15:45:02 UTC
FYI I also noticed some "Marking clocksource 'tsc'" issues triggered on x86_64 on my private openQA worker by ltp_net_tcp_cmds, triggered for 2 tests: arp and host:

OpenQA::run_ltp.pm: Starting ipneigh01_arp
[  410.818828][    C0] clocksource: timekeeping watchdog on CPU0: Marking clocksource 'tsc' as unstable because the skew is too large:
[  410.820365][    C0] clocksource:                       'kvm-clock' wd_nsec: 506724586 wd_now: 610448443a wd_last: 60e6144350 mask: ffffffffffffffff
[  410.821824][    C0] clocksource:                       'tsc' cs_nsec: 49302930542 cs_now: 17c1364ba0a cs_last: 153fdc32526 mask: ffffffffffffffff
[  410.823293][    C0] clocksource:                       Clocksource 'tsc' skewed 48796205956 ns (48796 ms) over watchdog 'kvm-clock' interval of 506724586 ns (506 ms)
[  410.824843][    C0] clocksource:                       'kvm-clock' (not 'tsc') is current clocksource.
[  410.825707][    C0] tsc: Marking TSC unstable due to clocksource watchdog
[  411.109710][T28135] Initializing XFRM netlink socket
OpenQA::run_ltp.pm: Starting ipneigh01_ip
OpenQA::run_ltp.pm: Starting arping01
OpenQA::run_ltp.pm: Starting clockdiff01
OpenQA::run_ltp.pm: Starting ftp
OpenQA::run_ltp.pm: Starting host
[  410.818845][    C0] clocksource: timekeeping watchdog on CPU0: Marking clocksource 'tsc' as unstable because the skew is too large:
[  410.820687][    C0] clocksource:                       'kvm-clock' wd_nsec: 506741379 wd_now: 61044885d3 wd_last: 60e6144350 mask: ffffffffffffffff
[  410.821794][    C0] clocksource:                       'tsc' cs_nsec: 804834172 cs_now: 154a546a082 cs_last: 153fdc32526 mask: ffffffffffffffff
[  410.822919][    C0] clocksource:                       Clocksource 'tsc' skewed 298092793 ns (298 ms) over watchdog 'kvm-clock' interval of 506741379 ns (506 ms)
[  410.824088][    C0] clocksource:                       'kvm-clock' (not 'tsc') is current clocksource.
[  410.824768][    C0] tsc: Marking TSC unstable due to clocksource watchdog
[  411.126759][T28138] Initializing XFRM netlink socket

I wondered, whether this is somehow related to the way openQA run qemu:

/usr/bin/qemu-system-x86_64 -device VGA,edid=on,xres=1024,yres=768 -only-migratable -chardev ringbuf,id=serial0,logfile=serial0,logappend=on -serial chardev:serial0 -audiodev none,id=snd0 -device intel-hda -device hda-output,audiodev=snd0 -global isa-fdc.fdtypeA=none -m 1536 -cpu host -netdev user,id=qanet0 -device virtio-net,netdev=qanet0,mac=52:54:00:12:34:56 -object rng-random,filename=/dev/urandom,id=rng0 -device virtio-rng-pci,rng=rng0 -boot order=c -device qemu-xhci -device usb-tablet -smp 1 -enable-kvm -no-shutdown -vnc :96,share=force-shared -device virtio-serial -chardev pipe,id=virtio_console,path=virtio_console,logfile=virtio_console.log,logappend=on -device virtconsole,chardev=virtio_console,name=org.openqa.console.virtio_console -chardev pipe,id=virtio_console_user,path=virtio_console_user,logfile=virtio_console_user.log,logappend=on -device virtconsole,chardev=virtio_console_user,name=org.openqa.console.virtio_console_user -chardev socket,path=qmp_socket,server=on,wait=off,id=qmp_socket,logfile=qmp_socket.log,logappend=on -qmp chardev:qmp_socket -S -device virtio-scsi-pci,id=scsi0 -blockdev driver=file,node-name=hd0-overlay0-file,filename=/var/lib/openqa/pool/6/raid/hd0-overlay0,cache.no-flush=on -blockdev driver=qcow2,node-name=hd0-overlay0,file=hd0-overlay0-file,cache.no-flush=on,discard=unmap -device virtio-blk,id=hd0-device,drive=hd0-overlay0,bootindex=0,serial=hd0 -blockdev driver=file,node-name=cd0-overlay0-file,filename=/var/lib/openqa/pool/6/raid/cd0-overlay0,cache.no-flush=on -blockdev driver=qcow2,node-name=cd0-overlay0,file=cd0-overlay0-file,cache.no-flush=on,discard=unmap -device scsi-cd,id=cd0-device,drive=cd0-overlay0,serial=cd0

Also, originally I noted this on ltp_net_tcp_cmds running on ppc64le on o3, where host test keeps failing (but arp test is working), I don't see it on x86_64 on o3.
Comment 14 Santiago Zarate 2024-02-21 05:48:02 UTC
Apologies, this completely skipped my radar; let me read a bit and come back to this bug

(In reply to Thomas Blume from comment #11)
> (In reply to Santiago Zarate from comment #9)
> > it seems that the test might be hanging somewhere... lets see how it goes:
> > https://openqa.opensuse.org/tests/3751459#live
> > 
> > Dimstar, can you show me or do you know the SR that introduced the latest
> > change?
> 
> Santiago, it seems the test still hangs.
> On the other hand I cannot reproduce the issue on my test setup.
> 
> Hence, I assume a time keeping issue in the test VM (see comment#2 and
> comment#10).
> Reassigning to you for now, could you try to rerun the test on a VM where
> the time keeping issue doesn't appear?
> If the issue still happens then, please assign back.
Comment 15 Fabian Vogt 2024-05-16 13:33:01 UTC
The timekeeping warnings happen when the VM is paused and unpaused, which happens when creating and restoring snapshots.
Comment 16 Fabian Vogt 2024-05-16 14:06:40 UTC
I had a look as well. Meanwhile it's the networkd_init module that fails: https://openqa.opensuse.org/tests/4194179#step/networkd_init/48

systemd-run -tM node1 /bin/bash -c "zypper lr -u" does not return.

While it appears like zypper got stuck, this is not the case. The zypper process finishes successfully (visible in logs) and inside the container it also reports that the run-uNR.service unit finished. So it appears to be the communication issue indicated by comment 1.
Comment 17 Fabian Vogt 2024-05-16 14:08:54 UTC
Might be https://github.com/systemd/systemd/issues/27189 upstream, no activity.
Comment 18 Thomas Blume 2024-06-04 10:09:46 UTC
(In reply to Fabian Vogt from comment #16)
> I had a look as well. Meanwhile it's the networkd_init module that fails:
> https://openqa.opensuse.org/tests/4194179#step/networkd_init/48

I'm trying to reproduce the issue, but the assets seem to have been deleted.
Even tried the latest test there:

https://openqa.opensuse.org/tests/4209853#

But openSUSE-Tumbleweed-DVD-x86_64-Snapshot20240520-Media.iso isn't available.

openQA colleagues, could you please reschedule the test so that I can get the assets for the reproducer?
Comment 19 Santiago Zarate 2024-06-04 10:14:09 UTC
(In reply to Thomas Blume from comment #18)
> (In reply to Fabian Vogt from comment #16)
> > I had a look as well. Meanwhile it's the networkd_init module that fails:
> > https://openqa.opensuse.org/tests/4194179#step/networkd_init/48
> 
> I'm trying to reproduce the issue, but the assets seem to have been deleted.
> Even tried the latest test there:
> 
> https://openqa.opensuse.org/tests/4209853#
> 
> But openSUSE-Tumbleweed-DVD-x86_64-Snapshot20240520-Media.iso isn't
> available.
> 
> openQA colleagues, could you please reschedule the test so that I can get
> the assets for the reproducer?

you can try out with https://openqa.opensuse.org/tests/4246330#downloads
Comment 20 Thomas Blume 2024-06-04 13:11:42 UTC
(In reply to Fabian Vogt from comment #16)
> I had a look as well. Meanwhile it's the networkd_init module that fails:
> https://openqa.opensuse.org/tests/4194179#step/networkd_init/48
> 
> systemd-run -tM node1 /bin/bash -c "zypper lr -u" does not return.
> 
> While it appears like zypper got stuck, this is not the case. The zypper
> process finishes successfully (visible in logs) and inside the container it
> also reports that the run-uNR.service unit finished. So it appears to be the
> communication issue indicated by comment 1.

With the hint in comment#19 I could reproduce the issue on my local openQA setup.
But unlike in the original test, where there is a timeout:

[2024-05-16T08:17:58.087198Z] [info] [pid:80064] ::: basetest::runtest: # Test died: command 'systemd-run -tM node1 /bin/bash -c "zypper lr -u"' timed out at /usr/lib/os-autoinst/testapi.pm line 926.
  	testapi::assert_script_run("systemd-run -tM node1 /bin/bash -c \"zypper lr -u\"") called at opensuse/lib/networkdbase.pm line 28


The log shows that the systemd-run command returns with failure:

-->
[2024-06-04T11:11:21.773323Z] [info] [pid:1987] ::: basetest::runtest: # Test died: command 'systemd-run -tM node1 /bin/bash -c "zypper lr -u"' failed at /usr/lib/os-autoinst/testapi.pm line 926.
  	testapi::assert_script_run("systemd-run -tM node1 /bin/bash -c \"zypper lr -u\"") called at opensuse/lib/networkdbase.pm line 28
--<

Not sure what makes this difference.
Investigating further.
Comment 21 Thomas Blume 2024-06-05 08:13:53 UTC
(In reply to Thomas Blume from comment #20)
> (In reply to Fabian Vogt from comment #16)
> > I had a look as well. Meanwhile it's the networkd_init module that fails:
> > https://openqa.opensuse.org/tests/4194179#step/networkd_init/48
> > 
> > systemd-run -tM node1 /bin/bash -c "zypper lr -u" does not return.
> > 
> > While it appears like zypper got stuck, this is not the case. The zypper
> > process finishes successfully (visible in logs) and inside the container it
> > also reports that the run-uNR.service unit finished. So it appears to be the
> > communication issue indicated by comment 1.
> 
> With the hint in comment#19 I could reproduce the issue on my local openQA
> setup.
> But unlike in the original test, where there is a timeout:
> 
> [2024-05-16T08:17:58.087198Z] [info] [pid:80064] ::: basetest::runtest: #
> Test died: command 'systemd-run -tM node1 /bin/bash -c "zypper lr -u"' timed
> out at /usr/lib/os-autoinst/testapi.pm line 926.
>   	testapi::assert_script_run("systemd-run -tM node1 /bin/bash -c \"zypper
> lr -u\"") called at opensuse/lib/networkdbase.pm line 28
> 
> 
> The log shows that the systemd-run command returns with failure:
> 
> -->
> [2024-06-04T11:11:21.773323Z] [info] [pid:1987] ::: basetest::runtest: #
> Test died: command 'systemd-run -tM node1 /bin/bash -c "zypper lr -u"'
> failed at /usr/lib/os-autoinst/testapi.pm line 926.
>   	testapi::assert_script_run("systemd-run -tM node1 /bin/bash -c \"zypper
> lr -u\"") called at opensuse/lib/networkdbase.pm line 28
> --<
> 
> Not sure what makes this difference.
> Investigating further.

systemd-run -tM node1 /bin/bash -c "zypper lr -u"

fails on my test setup due to a pty allocation error.
strace shows:

-->
[...]
openat(AT_FDCWD, "/dev/pts/1", O_RDWR|O_NOCTTY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
close(5)                                = 0
close(7)                                = 0
exit_group(1)                           = ?
+++ exited with 1 +++
--<

However, the similar command:

machinectl shell node1

succeeds.
According to the documentation at https://www.freedesktop.org/software/systemd/man/latest/systemd-run.html# 'systemd-run -tM' is for starting an interactive shell.
This is not required for the test and the manpage states:

-->
[...]
Note that machinectl(1)'s shell command is usually a better alternative for requesting a new, interactive login session on the local host or a local container.
[...]
--<

For the non-interactive test run, the command:

systemd-run -PM node1 /bin/bash -c "zypper lr -u"

can be used.

My test setup succeeds with the following patch:

-->
diff --git a/lib/networkdbase.pm b/lib/networkdbase.pm
index f763732fb..b90d61239 100644
--- a/lib/networkdbase.pm
+++ b/lib/networkdbase.pm
@@ -25,7 +25,7 @@ Run C<$script> in the systemd-nspawn container called C<$machine>.
 
 sub assert_script_run_container {
     my ($self, $machine, $script) = @_;
-    assert_script_run("systemd-run -tM $machine /bin/bash -c \"$script\"");
+    assert_script_run("systemd-run -PM $machine /bin/bash -c \"$script\"");
 }
 
 =head2 script_run_container
--<

Could you please try this?
Comment 22 Fabian Vogt 2024-06-05 08:37:49 UTC
(In reply to Thomas Blume from comment #21)
> (In reply to Thomas Blume from comment #20)
> > (In reply to Fabian Vogt from comment #16)
> > > I had a look as well. Meanwhile it's the networkd_init module that fails:
> > > https://openqa.opensuse.org/tests/4194179#step/networkd_init/48
> > > 
> > > systemd-run -tM node1 /bin/bash -c "zypper lr -u" does not return.
> > > 
> > > While it appears like zypper got stuck, this is not the case. The zypper
> > > process finishes successfully (visible in logs) and inside the container it
> > > also reports that the run-uNR.service unit finished. So it appears to be the
> > > communication issue indicated by comment 1.
> > 
> > With the hint in comment#19 I could reproduce the issue on my local openQA
> > setup.
> > But unlike in the original test, where there is a timeout:
> > 
> > [2024-05-16T08:17:58.087198Z] [info] [pid:80064] ::: basetest::runtest: #
> > Test died: command 'systemd-run -tM node1 /bin/bash -c "zypper lr -u"' timed
> > out at /usr/lib/os-autoinst/testapi.pm line 926.
> >   	testapi::assert_script_run("systemd-run -tM node1 /bin/bash -c \"zypper
> > lr -u\"") called at opensuse/lib/networkdbase.pm line 28
> > 
> > 
> > The log shows that the systemd-run command returns with failure:
> > 
> > -->
> > [2024-06-04T11:11:21.773323Z] [info] [pid:1987] ::: basetest::runtest: #
> > Test died: command 'systemd-run -tM node1 /bin/bash -c "zypper lr -u"'
> > failed at /usr/lib/os-autoinst/testapi.pm line 926.
> >   	testapi::assert_script_run("systemd-run -tM node1 /bin/bash -c \"zypper
> > lr -u\"") called at opensuse/lib/networkdbase.pm line 28
> > --<
> > 
> > Not sure what makes this difference.
> > Investigating further.
> 
> systemd-run -tM node1 /bin/bash -c "zypper lr -u"
> 
> fails on my test setup due to a pty allocation error.

It's possible you're testing a snapshot with broken systemd: bsc#1225943

Maybe try an older snapshot where the original issue occurs.
Comment 23 Thomas Blume 2024-06-05 08:48:20 UTC
(In reply to Fabian Vogt from comment #22)
> > systemd-run -tM node1 /bin/bash -c "zypper lr -u"
> > 
> > fails on my test setup due to a pty allocation error.
> 
> It's possible you're testing a snapshot with broken systemd: bsc#1225943
> 
> Maybe try an older snapshot where the original issue occurs.

I'd need the older asset for the test, but that isn't available anymore, see comment#18
Comment 24 Santiago Zarate 2024-06-05 08:57:58 UTC
The -P def makes more sense than -t... created also: https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/19457 with link to the VR


(In reply to Thomas Blume from comment #21)
> 
> My test setup succeeds with the following patch:
> 
> -->
> diff --git a/lib/networkdbase.pm b/lib/networkdbase.pm
> index f763732fb..b90d61239 100644
> --- a/lib/networkdbase.pm
> +++ b/lib/networkdbase.pm
> @@ -25,7 +25,7 @@ Run C<$script> in the systemd-nspawn container called
> C<$machine>.
>  
>  sub assert_script_run_container {
>      my ($self, $machine, $script) = @_;
> -    assert_script_run("systemd-run -tM $machine /bin/bash -c \"$script\"");
> +    assert_script_run("systemd-run -PM $machine /bin/bash -c \"$script\"");
>  }
>  
>  =head2 script_run_container
> --<
> 
> Could you please try this?
Comment 25 Fabian Vogt 2024-06-05 09:04:48 UTC
(In reply to Thomas Blume from comment #23)
> (In reply to Fabian Vogt from comment #22)
> > > systemd-run -tM node1 /bin/bash -c "zypper lr -u"
> > > 
> > > fails on my test setup due to a pty allocation error.
> > 
> > It's possible you're testing a snapshot with broken systemd: bsc#1225943
> > 
> > Maybe try an older snapshot where the original issue occurs.
> 
> I'd need the older asset for the test, but that isn't available anymore, see
> comment#18

https://openqa.opensuse.org/tests/4241474#downloads looks fine?
Comment 26 Thomas Blume 2024-06-05 09:12:13 UTC
(In reply to Fabian Vogt from comment #25)
> (In reply to Thomas Blume from comment #23)
> > (In reply to Fabian Vogt from comment #22)
> > > > systemd-run -tM node1 /bin/bash -c "zypper lr -u"
> > > > 
> > > > fails on my test setup due to a pty allocation error.
> > > 
> > > It's possible you're testing a snapshot with broken systemd: bsc#1225943
> > > 
> > > Maybe try an older snapshot where the original issue occurs.
> > 
> > I'd need the older asset for the test, but that isn't available anymore, see
> > comment#18
> 
> https://openqa.opensuse.org/tests/4241474#downloads looks fine?

Yepp, that works, thanks!

trying
Comment 27 Thomas Blume 2024-06-05 11:15:19 UTC
(In reply to Thomas Blume from comment #26)
> > https://openqa.opensuse.org/tests/4241474#downloads looks fine?
> 
> Yepp, that works, thanks!
> 
> trying

the test works with the patch from comment#24:

http://10.168.189.195/tests/150#step/networkd_init/48

Trying now without the patch.
Comment 28 Thomas Blume 2024-06-05 11:56:42 UTC
(In reply to Thomas Blume from comment #27)
> (In reply to Thomas Blume from comment #26)
> > > https://openqa.opensuse.org/tests/4241474#downloads looks fine?
> > 
> > Yepp, that works, thanks!
> > 
> > trying
> 
> the test works with the patch from comment#24:
> 
> http://10.168.189.195/tests/150#step/networkd_init/48
> 
> Trying now without the patch.

Unfortunately, it also works. :(

http://10.168.189.195/tests/152
Comment 29 Santiago Zarate 2024-06-05 12:00:35 UTC
(In reply to Thomas Blume from comment #28)
> (In reply to Thomas Blume from comment #27)
> > (In reply to Thomas Blume from comment #26)
> > > > https://openqa.opensuse.org/tests/4241474#downloads looks fine?
> > > 
> > > Yepp, that works, thanks!
> > > 
> > > trying
> > 
> > the test works with the patch from comment#24:
> > 
> > http://10.168.189.195/tests/150#step/networkd_init/48
> > 
> > Trying now without the patch.
> 
> Unfortunately, it also works. :(
> 
> http://10.168.189.195/tests/152

openqa-clone-job --repeat 100 maybe?
Comment 30 Thomas Blume 2024-06-06 06:24:31 UTC
(In reply to Santiago Zarate from comment #29)
> (In reply to Thomas Blume from comment #28)
> > (In reply to Thomas Blume from comment #27)
> > > (In reply to Thomas Blume from comment #26)
> > > > > https://openqa.opensuse.org/tests/4241474#downloads looks fine?
> > > > 
> > > > Yepp, that works, thanks!
> > > > 
> > > > trying
> > > 
> > > the test works with the patch from comment#24:
> > > 
> > > http://10.168.189.195/tests/150#step/networkd_init/48
> > > 
> > > Trying now without the patch.
> > 
> > Unfortunately, it also works. :(
> > 
> > http://10.168.189.195/tests/152
> 
> openqa-clone-job --repeat 100 maybe?

I ran 20 repetitions, which took about 13 hours, but couldn't reproduce the issue. :(