Bug 1214306 - Install the system with role transactional_server via autoyast, service "YaST2 Second Stage" service fails to start randomly
Summary: Install the system with role transactional_server via autoyast, service "YaST...
Status: RESOLVED INVALID
Alias: None
Product: openSUSE Distribution
Classification: openSUSE
Component: Other
Version: Leap 15.4
Hardware: x86-64 openSUSE Leap 15.4
Importance: P5 - None : Normal
Target Milestone: ---
Assignee: E-mail List
QA Contact: E-mail List
URL: https://openqa.opensuse.org/tests/350...
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-08-16 06:53 UTC by Richard Fan
Modified: 2023-10-13 08:49 UTC

See Also:
Found By: openQA
Services Priority:
Business Priority:
Blocker: Yes
Marketing QA Status: ---
IT Deployment: ---


Attachments
Autoyast configuration file (7.99 KB, text/xml)
2023-08-16 06:54 UTC, Richard Fan
full serial log (37.86 KB, text/plain)
2023-08-16 06:55 UTC, Richard Fan
error screen shot (75.90 KB, image/png)
2023-08-16 06:56 UTC, Richard Fan
full yast logs (12.94 MB, application/x-tar)
2023-10-10 02:54 UTC, Richard Fan

Description Richard Fan 2023-08-16 06:53:15 UTC
## Description

According to openQA test results, the system randomly fails to boot after installation via AutoYaST; re-running the same test can pass.

## Steps to reproduce the issue:

Install Leap 15.4 via AutoYaST; see the attached AutoYaST configuration file.

## Logs

I can't see any issue during the installation phase, but after installation the OS randomly fails to boot in openQA tests. [Please see the attached picture] and [openQA page https://openqa.opensuse.org/tests/3508058#step/installation/14]

The serial log reports the following: [Please see the attached log for more details]

[  401.759633][    T1] reboot: Restarting system
[    4.072635][    T1] systemd[1]: Failed to start Remount Root and Kernel File Systems.
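
To triage serial logs like the excerpt above, the failing units can be extracted with a short script. This is only a minimal sketch: the function name `failed_units` and the regular expression are assumptions derived from the two lines quoted here, not from any openQA tooling.

```python
import re

# Matches kernel-timestamped systemd failure lines in the form seen in the
# excerpt above. The pattern is an assumption based on those lines only.
FAILED_UNIT = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\[\s*T\d+\]\s+"
    r"systemd\[1\]: Failed to start (?P<unit>.+?)\.?\s*$"
)


def failed_units(serial_log: str) -> list[tuple[float, str]]:
    """Return (timestamp, unit description) for each 'Failed to start' line."""
    hits = []
    for line in serial_log.splitlines():
        match = FAILED_UNIT.match(line.strip())
        if match:
            hits.append((float(match.group("ts")), match.group("unit")))
    return hits


sample = """\
[  401.759633][    T1] reboot: Restarting system
[    4.072635][    T1] systemd[1]: Failed to start Remount Root and Kernel File Systems.
"""

if __name__ == "__main__":
    for timestamp, unit in failed_units(sample):
        print(f"{timestamp:>12.6f}  {unit}")
```

The small timestamp on the failure line (about 4 s, after the 401 s reboot line) indicates that the failure happens early in the first boot of the installed system.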

-------------------------------------------------------
I suspect there might be a service start dependency issue, but I can't collect more logs from the openQA side.

Please let me know if you need any further logs. I will try to reproduce it on my local setup and collect them.
 


## openQA Observation

openQA test in scenario opensuse-15.4-DVD-Updates-x86_64-create_hdd_leap_transactional_server_autoyast@64bit fails in
[installation](https://openqa.opensuse.org/tests/3508058/modules/installation/steps/14)

## Test suite description
maintainer: richard.fan@suse.com


## Reproducible

Fails since (at least) Build [20230815-5](https://openqa.opensuse.org/tests/3508058) (current job)


## Expected result

Last good: [20230815-4](https://openqa.opensuse.org/tests/3507959) (or more recent)


## Further details

Always latest result in this scenario: [latest](https://openqa.opensuse.org/tests/latest?arch=x86_64&distri=opensuse&flavor=DVD-Updates&machine=64bit&test=create_hdd_leap_transactional_server_autoyast&version=15.4)
Comment 1 Richard Fan 2023-08-16 06:54:57 UTC
Created attachment 868822 [details]
Autoyast configuration file
Comment 2 Richard Fan 2023-08-16 06:55:19 UTC
Created attachment 868823 [details]
full serial log
Comment 3 Richard Fan 2023-08-16 06:56:02 UTC
Created attachment 868824 [details]
error screen shot
Comment 4 Stefan Hundhammer 2023-10-09 10:03:25 UTC
I am very sceptical if that is supported at all.

So you are installing a machine with AutoYaST with role "transactional server", which makes the root filesystem of the target system mounted read-only. And yet you request a second stage of YaST/AutoYaST in your profile, which of course still needs to work on that root filesystem, which is now mounted read-only?

I don't think that can work even in theory. That's conflicting requirements: Transactional server vs. second stage.
Comment 5 Stefan Hundhammer 2023-10-09 16:24:24 UTC
Did this ever work before? I don't see a "last good" test, so this looks like a completely new test to me.

Also, the bootloader screenshots say "Leap 15.2" both at the initial boot and after the reboot. That can't be right.

And there are no y2logs at all. 


Is there a business case for this scenario?
Comment 6 Richard Fan 2023-10-10 02:14:13 UTC
(In reply to Stefan Hundhammer from comment #5)
> Did this ever work before? I don't see a "last good" test, so this looks
> like a completely new test to me.

It is a new test; we need to install Leap 15.4 via AutoYaST.
> 
> Also, the bootloader screenshots say "Leap 15.2" both at the initial boot
> and after the reboot. That can't be right.
> 

Good point, it is Leap 15.4 rather than Leap 15.2. However, openQA does not match the needle 100%; I will try to add a new needle to make sure openQA shows the right screenshot there.
> And there are no y2logs at all. 
> 
The latest test result can be found at https://openqa.opensuse.org/tests/3632230#next_previous

I will try to collect the logs for you and attach them.
> 
> Is there a business case for this scenario?

TBH, I don't know. However, I think we should support installing Leap 15.4 with different roles via AutoYaST, am I right?
Comment 7 Richard Fan 2023-10-10 02:54:05 UTC
Created attachment 870022 [details]
full yast logs
Comment 8 Stefan Hundhammer 2023-10-12 10:52:41 UTC
(In reply to Richard Fan from comment #6)
> > Is there a business case for this scenario?
> 
> TBH, I don't know. However, I think we should support installing Leap 15.4
> with different roles via AutoYaST, am I right?

In general, yes, if the scenario is supported.

But the "transactional server" role is special; it pretty much violates every concept known to the Linux/Unix world. While some users may find the concept useful (I am not among them), all kinds of weird things can happen; and they do.

A read-only root filesystem breaks a lot of things that run smoothly on any traditional Linux system; in particular pretty much everything that makes a system management tool like YaST useful. Those tools are all about modifying the system, and that means write access to the root filesystem. If that doesn't work, all bets are off.

And that is the case for a transactional server: The root filesystem is read-only, and you have to use special tools to enable write access and to lock it down to read-only again after those write operations are done.
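
Whether the root filesystem is actually mounted read-only can be checked from the mount table. The following is a minimal sketch that parses text in the `/proc/mounts` format; the helper names and the sample entries are hypothetical and only serve as illustration.

```python
def root_mount_options(mounts_text: str) -> list[str]:
    """Return the mount options of the last entry mounted at '/'."""
    options: list[str] = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Field order: device, mount point, fs type, options, dump, pass.
        if len(fields) >= 4 and fields[1] == "/":
            # A later entry for '/' overrides an earlier one.
            options = fields[3].split(",")
    return options


def root_is_read_only(mounts_text: str) -> bool:
    """True if the root filesystem entry carries the 'ro' option."""
    return "ro" in root_mount_options(mounts_text)


# Hypothetical sample entries, not taken from the attached logs.
TRANSACTIONAL = "/dev/vda2 / btrfs ro,relatime,space_cache 0 0\n"
TRADITIONAL = "/dev/vda2 / btrfs rw,relatime,space_cache 0 0\n"

if __name__ == "__main__":
    print("transactional sample read-only:", root_is_read_only(TRANSACTIONAL))
    print("traditional sample read-only:", root_is_read_only(TRADITIONAL))
```

On a running system, the same function could be fed the contents of `/proc/mounts`.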

This affects pretty much every operation of such a tool.

So, if you install with YaST, everything will run as usual in the first stage, because by then YaST will create the root filesystem, mount it read/write as usual, write data to it (install RPMs, write the configuration to /etc and other places) and finally reboot into the newly installed system.

And then the transactional part takes over, and the root filesystem is mounted read-only.

That means that normal YaST operations can no longer work. That is expected. And that affects things like a second stage of the installation as well.

So no, this is not supported. Either second stage or transactional server, but not both.
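
A test plan could catch this combination before an installation is even started. The sketch below checks an AutoYaST profile for an enabled second stage together with a transactional setup. It rests on assumptions that should be verified against the real profile: that the second stage counts as enabled unless `general/mode/second_stage` is explicitly `false`, and that the role shows up as a pattern named `transactional_server` (a hypothetical heuristic; the attached profile may select the role differently).

```python
import xml.etree.ElementTree as ET

# Default namespace used by AutoYaST profiles.
NS = {"y": "http://www.suse.com/1.0/yast2ns"}


def second_stage_enabled(root: ET.Element) -> bool:
    # Assumption: enabled unless explicitly set to "false".
    node = root.find("y:general/y:mode/y:second_stage", NS)
    return node is None or (node.text or "").strip().lower() != "false"


def uses_transactional_pattern(root: ET.Element) -> bool:
    # Assumption (hypothetical heuristic): the role is visible as a
    # pattern named "transactional_server" in the software section.
    patterns = root.iterfind("y:software/y:patterns/y:pattern", NS)
    return any((p.text or "").strip() == "transactional_server" for p in patterns)


def has_conflict(profile_xml: str) -> bool:
    """True if the profile requests both a second stage and a transactional setup."""
    root = ET.fromstring(profile_xml)
    return second_stage_enabled(root) and uses_transactional_pattern(root)


# Hypothetical minimal profile used only to exercise the check.
PROFILE = """\
<profile xmlns="http://www.suse.com/1.0/yast2ns"
         xmlns:config="http://www.suse.com/1.0/configns">
  <general><mode>
    <second_stage config:type="boolean">{second_stage}</second_stage>
  </mode></general>
  <software><patterns config:type="list">
    <pattern>transactional_server</pattern>
  </patterns></software>
</profile>
"""

if __name__ == "__main__":
    for value in ("true", "false"):
        print(value, "->", has_conflict(PROFILE.format(second_stage=value)))
```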
Comment 9 Richard Fan 2023-10-13 01:27:15 UTC
(In reply to Stefan Hundhammer from comment #8)
> (In reply to Richard Fan from comment #6)
> > > Is there a business case for this scenario?
> > 
> > TBH, I don't know. However, I think we should support installing Leap 15.4
> > with different roles via AutoYaST, am I right?
> 
> In general, yes, if the scenario is supported.
> 
> But the "transactional server" role is special; it pretty much violates
> every concept known to the Linux/Unix world. While some users may find the
> concept useful (I am not among them), all kinds of weird things can happen;
> and they do.
> 
> A read-only root filesystem breaks a lot of things that run smoothly on any
> traditional Linux system; in particular pretty much everything that makes a
> system management tool like YaST useful. Those tools are all about modifying
> the system, and that means write access to the root filesystem. If that
> doesn't work, all bets are off.
> 
> And that is the case for a transactional server: The root filesystem is
> read-only, and you have to use special tools to enable write access and to
> lock it down to read-only again after those write operations are done.
> 
> This affects pretty much every operation of such a tool.
> 
> So, if you install with YaST, everything will run as usual in the first
> stage, because by then YaST will create the root filesystem, mount it
> read/write as usual, write data to it (install RPMs, write the configuration
> to /etc and other places) and finally reboot into the newly installed system.
> 
> And then the transactional part takes over, and the root filesystem is
> mounted read-only.
> 
> That means that normal YaST operations can no longer work. That is expected.
> And that affects things like a second stage of the installation as well.
> 
> So no, this is not supported. Either second stage or transactional server,
> but not both.

Thanks for your kind help on this issue!

I will fix our test plan to use the original interactive mode to install the system and publish the qcow2 image.
Comment 10 Lukas Ocilka 2023-10-13 08:49:55 UTC
There is a document, written by Ancor, which describes the YaST behavior on a transactional system: https://github.com/ancorgs/alp-system-management/blob/main/transactional.md

Although it is about installing additional packages, it clearly describes what can and what can't be done on a transactional system. YaST simply can't run RW operations on an RO system; that would be strictly against the idea of the transactional mode.

We plan to replace YaST-based installation with Agama. So although there could be some YaST Second Stage followed by another reboot into the configured system, that would be a huge investment in an old technology. Instead, such an investment should be made in Agama (if needed), but then a real use case (and business case) needs to be filed through a Jira feature request.