Bug 1227887 - Installer can't be restarted
Summary: Installer can't be restarted
Status: RESOLVED WORKSFORME
Alias: None
Product: openSUSE Distribution
Classification: openSUSE
Component: Installation
Version: Leap 15.6
Hardware: Other Other
Importance: P5 - None : Normal
Target Milestone: ---
Assignee: E-mail List
QA Contact: Jiri Srain
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2024-07-16 08:07 UTC by Volker Kuhlmann
Modified: 2024-07-17 07:55 UTC

See Also:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Description Volker Kuhlmann 2024-07-16 08:07:02 UTC
An installation over ssh can't be restarted once the installer has been aborted.
The whole installation must be restarted from scratch.

- tftp boot over the LAN.
- select installation via ssh.
- From another Linux host, ssh to the host to be installed.
- Run yast.ssh.
- Watch the text mode installer crank up.
- Ooops, forgot the -Y with ssh.
- Abort text mode installer.
- ssh server disappears from the host to be installed.
- argh
- Go to host to be installed, see text mode setup fluff on screen. No doubt very useful when needed, what's missing is an option to restart installation via ssh!
- Just pull the plug and start again from scratch, it's faster.
- ARRRRRGGGGHHHHHH

I'm unsure whether this also applies to installation on text console but probably not.

I'd like an option to retry installation via ssh. This should be the default when yast.ssh exits. It should always be possible to use the console to reconfigure and restart a console installation.
Comment 1 Stefan Hundhammer 2024-07-16 12:05:27 UTC
When you hit the "Abort" button (no matter if ssh installation or otherwise), you will fall back to linuxrc, and your ssh session is over.

But you can also simply use normal shell job control key combinations from the ssh session; then the ssh session remains active.

Ctrl-C will normally not work, but Ctrl-Z does, and then   

    kill %1

which will send a SIGTERM to job 1, in this case the suspended yast.ssh. If you use the Qt UI, this may leave a defunct window behind; just use the "WM_CLOSE" icon (the [x] at the top right corner) to close it.

If you used the NCurses UI, and the terminal was left behind in an undefined weird state, use   

    stty sane   

to restore reasonable defaults.

Check with

    pstree

if you still have any YaST processes running, like here:

>>  1:install:~ # pstree
>>  
>>  init─┬─3*[bash]
>>       ├─dash───sleep
>>       ├─dbus-daemon
>>       ├─2*[gpg-agent]
>>       ├─haveged
>>       ├─init───inst_setup───sleep
>>       ├─klogd
>>       ├─nscd───8*[{nscd}]
>>       ├─rsyslogd───4*[{rsyslogd}]
>>       ├─sh
>>       ├─sshd─┬─sshd───sshd───bash───yast.ssh───YaST2.First-Sta───YaST2.call───Zypp-main───5*[{Zypp-main}]
>>       │      └─sshd───sshd───bash───pstree
>>       ├─udevd
>>       ├─wickedd
>>       ├─wickedd-auto4
>>       ├─wickedd-dhcp4
>>       ├─wickedd-dhcp6
>>       ├─wickedd-nanny
>>       └─wpa_supplicant

If everything is cleaned up correctly, it should look like this:

>>  1:install:~ # pstree
>>  
>>  init─┬─3*[bash]
>>       ├─dbus-daemon
>>       ├─haveged
>>       ├─init───inst_setup───sleep
>>       ├─klogd
>>       ├─nscd───8*[{nscd}]
>>       ├─rsyslogd───4*[{rsyslogd}]
>>       ├─sh
>>       ├─sshd─┬─sshd───sshd───bash───pstree
>>       │      └─sshd───sshd───bash
>>       ├─udevd
>>       ├─wickedd
>>       ├─wickedd-auto4
>>       ├─wickedd-dhcp4
>>       ├─wickedd-dhcp6
>>       ├─wickedd-nanny
>>       └─wpa_supplicant

I.e. no "yast.ssh", no "YaST2.First-Stage" (pstree truncates it to "YaST2.First-Sta"), no "Zypp-main".
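
The same check can be done without reading the tree by eye. The following is only a sketch: "yast_leftovers" is a hypothetical helper name, the patterns are taken from the pstree output above, and it assumes the grep in the inst-sys supports -o and -E.

```shell
# Hypothetical helper: read a process listing (e.g. pstree output) on stdin
# and print each YaST installer process name found, once per name.
# Empty output means nothing matching the patterns below is left.
yast_leftovers() {
    # -o prints only the matched names, -E enables the alternation syntax
    grep -o -E 'yast\.ssh|YaST2\.[A-Za-z]+(-[A-Za-z]+)*|Zypp-main' | sort -u
}

# Usage in the inst-sys shell:
#   pstree | yast_leftovers
```

If this prints anything, some part of the old installer is still running and should be terminated before yast.ssh is started again.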


Be advised that in any case, this will leave quite some mess in the log directory, so remember to clean it up in case you need to collect y2logs for a bug report:

    rm -rf /var/log/YaST2/*


Also remember that you can always start another ssh session from another terminal window (or tab). But make sure not to have multiple YaST installations running at the same time.

HTH
Comment 2 Volker Kuhlmann 2024-07-16 23:38:49 UTC
Thanks Stefan.

> When you hit the "Abort" button (no matter if ssh installation or
> otherwise), you will fall back to linuxrc, and your ssh session is over.

The "Abort" button seemed to be the only way to exit the text-based installer. Being returned to the shell is a reasonable expectation, I think?

Otherwise yes, shell job control (Ctrl-Z) is an alternative. However, at that point I wouldn't know whether the yast.ssh process exiting by any means causes a crash back to linuxrc, so abort and Ctrl-Z with cleanup (processes, terminal) seem equally likely to be a dead end.

Suggestion: Add some text like "use shell job control, e.g. Ctrl-Z, to quit if you intend to restart the installer" to the abort confirmation dialogue.

> Also remember that you can always start another ssh session from another
> terminal window (or tab).

Uhm no, the ssh server has quit after yast.ssh aborts. Opening a second ssh session to kill the first yast.ssh so that a second one can be started is not intuitive. The only intuitive part is not running two YaST instances at the same time.

Thanks again. Reopening for the suggestion, please close again if you disagree.

My use case for ssh install is that I can do something else while stuff is being unpacked.
Comment 3 Stefan Hundhammer 2024-07-17 07:55:19 UTC
Well, we have the same problem since forever when we have to debug anything in the inst-sys; and it's even worse: We typically have to ssh to the machine, scp some files from our development environment to /tmp on the RAM disk (because the rest is mounted read-only) and bind-mount individual files from there to their real destination.

As you can imagine, that is a royal PITA. If something shoots down that carefully hand-crafted environment just like that, everything was in vain because the RAM disk dies with it.

Still, we have to make do with that; one thing that we really don't want to do is to expose those gory details to the normal end user, e.g. in the confirmation dialog when the user hits the "Abort" button.

Worse, we'd even have to carry all those details over to the installation workflow because the ssh installation is a rare scenario, not the normal case, and we really don't want to confuse the hell out of all those users who have the normal case.

For a normal user, aborting the installation really means that it's time to reboot to get back to a well-defined state. And we also don't know what caused the user to abort, and what would be appropriate to get things going again. Restarting from there is definitely something for advanced users.

For cases where we really, really want to make sure that we don't get forced to reboot and get our hand-crafted inst-sys devel environment on the RAM disk killed, some of us also use the "startshell=1" boot parameter in addition to "ssh=1 sshpassword=foo".

That one goes to linuxrc as well and starts another shell (on the system console) that it falls back to, but it's sometimes a bit awkward to handle; and in the good case (no abort, installation went through) it doesn't reboot right away, but you have to end that shell with Ctrl-D before the installation continues, i.e. it reboots into the freshly installed system.

That can be helpful sometimes, but I tend not to do it unless I know that I have to do exhaustive bind-mounts in a very complex devel environment.
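
Whether a running inst-sys was actually booted with "startshell=1" can be checked from its shell before relying on that fallback. A minimal sketch, assuming the boot parameters were given on the kernel command line (so that they show up in /proc/cmdline) and spelled exactly as above; "has_boot_param" is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if parameter $1 (e.g. "startshell=1") appears
# as a whole, space-separated word in the command line string passed as $2.
# Exact, case-sensitive match only.
has_boot_param() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage in the inst-sys shell:
#   has_boot_param startshell=1 "$(cat /proc/cmdline)" && echo "startshell is set"
```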


Choose your poison. Nobody ever said being a YaST developer working in the inst-sys is fun. And when you do things like this, for all intents and purposes you have elevated yourself to that status. ;-)