Bugzilla – Bug 1212627
NetworkManager: network interface not configured before SSH daemon is started
Last modified: 2023-07-03 12:31:02 UTC
Basically, SSH is not waiting for network to be setup before starting resulting in SSH that is not brought on boot. Jun 22 14:18:59 localhost sshd[845]: error: Bind to port 22 on 2a01:2f8:2012:9cdf:0:aead:1:aead failed: Cannot assign requested address. Jun 22 14:18:59 localhost systemd[1]: sshd.service: Main process exited, code=exited, status=255/EXCEPTION Jun 22 14:18:59 localhost sshd[845]: fatal: Cannot bind any address. Jun 22 14:18:59 localhost systemd[1]: sshd.service: Failed with result 'exit-code'. Jun 22 14:18:59 localhost systemd[1]: Failed to start OpenSSH Daemon. .... Jun 22 14:18:59 localhost.localdomain NetworkManager[741]: <info> [1687443539.8178] settings: (eth1): created default wired connection 'Wired connection 2' It seems that NetworkManager-wait-online.service is triggered after everything is configured, but SSH startup is prior to this.
Workaround was to add # /etc/systemd/system/sshd.service.d/01-after-network.conf [Unit] After=NetworkManager-wait-online.service [Service] Restart=always RestartSec=5s RestartPreventExitStatus= The changes to Unit were insufficient as sshd was still not able to bind to specific IP address assigned by NetworkManager Override of Restart* allowed the process to actually start.
NetworkManager was configured using `nmcli` and persisted.
Thanks for the bug report, Adam. Can you attach the output of `journalctl -b` on your system?
Created attachment 867858 [details] sshd can't bind to ip address Attached. You see in the journal that systemd started later, but that only happens since I modified sshd service file to always try to restart systemctl cat sshd # /usr/lib/systemd/system/sshd.service [Unit] Description=OpenSSH Daemon After=network.target [Service] Type=notify EnvironmentFile=-/etc/sysconfig/ssh ExecStartPre=/usr/sbin/sshd-gen-keys-start ExecStartPre=/usr/sbin/sshd -t $SSHD_OPTS ExecStart=/usr/sbin/sshd -D $SSHD_OPTS ExecReload=/bin/kill -HUP $MAINPID KillMode=process Restart=on-failure RestartPreventExitStatus=255 TasksMax=infinity [Install] WantedBy=multi-user.target # /etc/systemd/system/sshd.service.d/01-after-network.conf #[Unit] #After=NetworkManager-wait-online.service [Service] Restart=always RestartSec=5s RestartPreventExitStatus=
Created attachment 867859 [details] fails after NetworkManager With additional changes, # /etc/systemd/system/sshd.service.d/01-after-network.conf [Unit] After=NetworkManager-wait-online.service [Service] Restart=always RestartSec=5s RestartPreventExitStatus= SSHD comes up *most of the time* while still failing sometimes. That is, Network Maanger signals via NetworkManager-wait-online.service that it is complete, but the IP address is still not available for bind by SSHD. I've attached a log that shows exactly this.
(In reply to Adam Majer from comment #6) > Created attachment 867859 [details] > fails after NetworkManager > > With additional changes, > > # /etc/systemd/system/sshd.service.d/01-after-network.conf > [Unit] > After=NetworkManager-wait-online.service > > [Service] > Restart=always > RestartSec=5s > RestartPreventExitStatus= > > SSHD comes up *most of the time* while still failing sometimes. That is, > Network Maanger signals via NetworkManager-wait-online.service that it is > complete, but the IP address is still not available for bind by SSHD. I've > attached a log that shows exactly this. > sshd[1225]: error: Bind to port 22 on 2a01:4f8:c012:9cdf:0:aead:1:aead failed: Cannot assign requested address sshd waits for an ipv6 address, is this intentional? In this logs you provided, when sshd starts, there is already an ipv4 address available. But there is no ipv6 address assigned yet.
(In reply to Jonathan Kang from comment #7) > (In reply to Adam Majer from comment #6) > > Created attachment 867859 [details] > > fails after NetworkManager > > > > With additional changes, > > > > # /etc/systemd/system/sshd.service.d/01-after-network.conf > > [Unit] > > After=NetworkManager-wait-online.service > > > > [Service] > > Restart=always > > RestartSec=5s > > RestartPreventExitStatus= > > > > SSHD comes up *most of the time* while still failing sometimes. That is, > > Network Maanger signals via NetworkManager-wait-online.service that it is > > complete, but the IP address is still not available for bind by SSHD. I've > > attached a log that shows exactly this. > > > sshd[1225]: error: Bind to port 22 on 2a01:4f8:c012:9cdf:0:aead:1:aead failed: Cannot assign requested address > > sshd waits for an ipv6 address, is this intentional? In this logs you > provided, when sshd starts, there is already an ipv4 address available. But > there is no ipv6 address assigned yet. The IPv4 address is non-routable and configured via DHCP, and the only public address available is the IPv6. It's configured manually, so it's not like there is a long wait for RA or something, ipv6.method: manual ipv6.dns: 2a01:4ff:ff00::add:1,2a01:4ff:ff00::add:2 ipv6.dns-search: -- ipv6.dns-options: -- ipv6.dns-priority: 0 ipv6.addresses: 2a01:4f8:c012:9cdf::1/64, 2a01:4f8:c012:9cdf:0:aead:1:aead/64 ipv4.method: auto ipv4.dns: -- ipv4.dns-search: -- ipv4.dns-options: -- ipv4.dns-priority: 0 ipv4.addresses: -- In the "old days", all network interfaces should be configured as much as possible before the rest of the system is signaled that network is ready. So, passive configuration via RA for IPv6 would be out-of-scope for this wait but everything else should be blocking. Static IP assignments should definitely be all assigned prior to signaling network setup is successful. Basically, I would expect all configured network interfaces to be up and configured completely before proceeding. It's not OR condition here but AND. Optionally things could proceed after some timeout, like 2 or 3 minutes, if we are waiting for external resources that may be failing (eg. DHCP) Finally, an answer here that "it would work if you just listen on wildcard address" is not really a good answer here.
(In reply to Adam Majer from comment #8) > > The IPv4 address is non-routable and configured via DHCP, and the only > public address available is the IPv6. It's configured manually, so it's not > like there is a long wait for RA or something, > I'm guessing maybe the network link is not yet ready when sshd is started. This needs detailed NetworkManager logs. To collect them, you need add the following lines in /etc/NetworkManager/NetworkManager.conf. > [logging] > level=TRACE > > > In the "old days", all network interfaces should be configured as much as > possible before the rest of the system is signaled that network is ready. > So, passive configuration via RA for IPv6 would be out-of-scope for this > wait but everything else should be blocking. Static IP assignments should > definitely be all assigned prior to signaling network setup is successful. > > Basically, I would expect all configured network interfaces to be up and > configured completely before proceeding. It's not OR condition here but AND. > Optionally things could proceed after some timeout, like 2 or 3 minutes, if > we are waiting for external resources that may be failing (eg. DHCP) > > Finally, an answer here that "it would work if you just listen on wildcard > address" is not really a good answer here.
Created attachment 867884 [details] trace level log I've attached trace level log now
In your setup, NetworkManager logs "startup complete" when ipv4 succeeds, and this triggers NetworkManager-wait-online to quit. But at that moment, NetworkManager is still at the process of setting manual ipv6 address. This is caused by the "ipv6.may-fail" property defaulting to TRUE. So when dhcp4 succeeded, NetworkManager thought it's time to finish the startup. Can you try setting "ipv6.may-fail" to FALSE and see if this makes any difference? > nmcli connection modify Wired\ Connection\ 1 ipv6.may-fail no
(In reply to Jonathan Kang from comment #11) > In your setup, NetworkManager logs "startup complete" when ipv4 succeeds, > and this triggers NetworkManager-wait-online to quit. But at that moment, > NetworkManager is still at the process of setting manual ipv6 address. This > is caused by the "ipv6.may-fail" property defaulting to TRUE. So when dhcp4 > succeeded, NetworkManager thought it's time to finish the startup. > > Can you try setting "ipv6.may-fail" to FALSE and see if this makes any > difference? > > > nmcli connection modify Wired\ Connection\ 1 ipv6.may-fail no setting may-fail to false, seem to fix my problem. Thanks! The question is then, shouldn't manually defined settings automatically default to may-fail=no, if there is no specific value set in this field? I assume this probably should be discussed with upstream when may-fail should default to false and when to true. The current case when it always defaults to true is ... confusing.
(In reply to Adam Majer from comment #12) > > setting may-fail to false, seem to fix my problem. Thanks! > > The question is then, shouldn't manually defined settings automatically > default to may-fail=no, if there is no specific value set in this field? By default, NetworkManager set this property for each new connection to TRUE. This is a safe default for all kinds of connections. Setting this to FALSE may cause infinite waiting for scenarios where users don't have an IPv6 address. > > I assume this probably should be discussed with upstream when may-fail > should default to false and when to true. The current case when it always > defaults to true is ... confusing. I think this is more of a problem about NetworkManager deciding it's time to quit NetworkManager-wait-online when it got an IPv4 address(without internet access), but at that time the manual IPv6 address is still in the process of pending/configuring. It sure is a good idea to talk to upstream developers and see what they think about this issue.