Bug 1212393 - Cannot complete SUSE ALP installation via PXE install
Summary: Cannot complete SUSE ALP installation via PXE install
Status: RESOLVED WONTFIX
Alias: None
Product: Granite
Classification: SUSE ALP - SUSE Adaptable Linux Platform
Component: Installation
Version: unspecified
Hardware: x86-64 Linux
Priority: P5 - None, Severity: Normal
Target Milestone: ---
Assignee: Tomáš Bažant
QA Contact:
URL:
Whiteboard: https://jira.suse.com/browse/DOCTEAM-...
Keywords:
Depends on:
Blocks:
 
Reported: 2023-06-15 04:51 UTC by Conie Chang
Modified: 2024-04-19 08:16 UTC

See Also:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---
tbazant: needinfo?
tbazant: needinfo? (msuchanek)


Description Conie Chang 2023-06-15 04:51:05 UTC
After applying the PXE configuration below and starting a PXE installation of SUSE ALP, the installation cannot complete.

I have also tried SLES 15.5, and its PXE installation completes successfully.

Steps:
1. Copy linux and initrd from "d-installer-live.x86_64-ALP.iso" into the tftpboot folder.
2. Configure the PXE grub.cfg as shown below.
3. Boot the system; the log below is shown.
4. Even when we use NFS, the system cannot complete the SUSE ALP installation and enters emergency mode (see pxe grub.cfg configuration 2).

1.1)====pxe grub.cfg configuration 1===
menuentry "*SUSE ALP x64" {
     echo "Loading Kernel .."
     linuxefi /SUSE/Beta/linuxsusealp install=ftp://192.168.0.254/pub/SLES/susealp ifcfg=*=dhcp dhcptimeout=120
     echo "Loading Initial ramdisk .."
     initrdefi /SUSE/Beta/initrdsusealp
}



1.2)===serial console log=======
[    9.618655][    T1] List of all partitions:
[    9.623793][    T1] No filesystem could mount root, tried:
[    9.623795][    T1]
[    9.633445][    T1] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
[    9.643808][    T1] CPU: 18 PID: 1 Comm: swapper/0 Not tainted 6.1.12-3-default #1 openSUSE Tumbleweed (unreleased) 969da743660a662b4655bf4f79107b1449b21a0d
[    9.662078][    T1] Hardware name: Lenovo SE455V3 Mont Blanc MB Planar/None, BIOS MBE101Q-1.10 05/17/2023
[    9.673715][    T1] Call Trace:
[    9.677516][    T1]  <TASK>
[    9.680861][    T1]  dump_stack_lvl+0x44/0x5c
[    9.685940][    T1]  panic+0x10b/0x2bc
[    9.690291][    T1]  mount_block_root+0x1c6/0x1d9
[    9.695706][    T1]  prepare_namespace+0x136/0x165
[    9.701210][    T1]  kernel_init_freeable+0x25c/0x286
[    9.707004][    T1]  ? rest_init+0xd0/0xd0
[    9.711715][    T1]  kernel_init+0x16/0x130
[    9.716501][    T1]  ret_from_fork+0x22/0x30
[    9.721384][    T1]  </TASK>
[    9.724948][    T1] Kernel Offset: 0x7600000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[    9.748004][    T1] Rebooting in 90 seconds..


2.1) ====pxe grub.cfg configuration 2===
setparams 'Install SLES 16 Pre-Beta UEFI'

	linuxefi kernel/sles/sles16prebeta/linux ip-dhcp initsys=nfs://192.168.8.20:/install/nfs_share/sles/sles16prebeta/boot/x86_64/root install=nfs://192.168.8.20:/install/nfs_share/sles/sles16prebeta
	initrdefi kernel/sles/sles16prebeta/initrd

2.2)===log=======
[5.973773][ T832] sda:
[5.984148][ T830] sdc:
[5.985466][ T832] sd 8:2:0:0: [sda] Attached SCSI disk
[5.986545][ T830] sd 9:0:2:0: [sdc] Attached SCSI disk
[6.075699][ T838] usb 1-2: New USB device found, idVendor=046d, idProduct=c077, bcdDevice=72.00
[6.076467][ T838] usb 1-2: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[6.076792][ T838] usb 1-2: Product: USB Optical Mouse
[6.077072][ T838] usb 1-2: Manufacturer: Logitech
[6.089315][ T1014] usb 3-1: New USB device found, idVendor=1d6b, idProduct=0107, bcdDevice= 1.00
[6.089667][ T1014] usb 3-1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[6.089937][ T1014] usb 3-1: Product: USB Virtual Hub
[6.090186][ T1014] usb 3-1: Manufacturer: Aspeed
[6.090427][ T1014] usb 3-1: Serial Number: 00000000
[6.157752][ T1014] hub 3-1:1.0: USB hub found
[6.166491][ T1014] hub 3-1:1.0: 7 ports detected
[6.220770][ C78] mlx5_core 0000:02:00.1: Port module event: module 1, Cable unplugged
[ OK ] Stopped Rule-based Manager for Device Events and Files.
[ OK ] Closed udev Control Socket.
[ OK ] Closed udev Kernel Socket.
[ OK ] Stopped dracut pre-udev hook.
[ OK ] Stopped dracut cmdline hook.
[ OK ] Stopped dracut ask for additional cmdline parameters.
       Starting Cleanup udev Database...
[ OK ] Stopped Create Static Device Nodes in /dev.
[ OK ] Stopped Create List of Static Device Nodes.
[ OK ] Stopped Setup Virtual Console.
[ OK ] Finished Cleanup udev Database.
[ OK ] Reached target Switch Root.
       Starting Switch Root...
[FAILED] Failed to start Switch Root.
See 'systemctl status initrd-switch-root.service' for details.
[6.306565][ T1014] usb 3-2: new low-speed USB device number 3 using xhci_hcd
[6.496741][ T1014] usb 3-2: New USB device found, idVendor=04b3, idProduct=300a, bcdDevice= 1.00
[6.497045][ T1014] usb 3-2: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[6.497292][ T1014] usb 3-2: Product: IBM USB Keyboard
[6.497526][ T1014] usb 3-2: Manufacturer: Silitek
Generating "/run/initramfs/rdsosreport.txt"
Entering emergency mode. Exit the shell to continue.
Type "journalctl" to view system logs.
You might want to save "/run/initramfs/rdsosreport.txt" to a USB stick or /boot after mounting them and attach it to a bug report.
Press Enter for maintenance
(or press Control-D to continue):
Comment 1 Conie Chang 2023-06-15 04:53:14 UTC
Is there any advice or documentation for PXE installation of SUSE ALP?

Thank you.
Comment 2 Stefan Hundhammer 2023-06-15 06:58:12 UTC
Please do not change the bug priorities. That is not for the bug reporter to decide, but for the project / release managers.

Resetting to default.
Comment 3 Stefan Hundhammer 2023-06-15 06:59:28 UTC
Since this doesn't even get to the installer part, changing to component "bootable images".
Comment 4 Conie Chang 2023-06-15 07:01:53 UTC
(In reply to Stefan Hundhammer from comment #2)
> Please do not change the bug priorities. That is not for the bug reporter to
> decide, but for  the project / release managers.
> 
> Resetting to default.

Thank you. I will know better next time.
Comment 5 Conie Chang 2023-06-20 03:44:16 UTC
Is there any update for this issue?
Comment 6 Hui-Zhi Zhao 2023-06-26 07:34:10 UTC
Hi Frederic,

Could you take a look at this issue?
Comment 7 Frederic Crozat 2023-06-28 13:36:26 UTC
I doubt the Agama installer supports PXE installation at the moment.
Comment 8 Stefan Hundhammer 2023-06-28 15:41:39 UTC
I see a kernel panic in the very first comment.

Maybe PXE isn't supported for ALP (yet?). I don't know. But since it doesn't even get as far as starting the installation, it cannot possibly be an installer bug.
Comment 9 Michal Suchanek 2023-06-28 15:59:42 UTC
To the kernel it generally does not matter how it was booted, so long as both the kernel and the initrd are loaded successfully - ISO, PXE, whatever.

From the error message it looks like the initrd was either not loaded, or what was loaded as the initrd was not recognized by the kernel (wrong/corrupted file).

Unfortunately, the 'serial console log' is cut and does not contain enough information to determine a more precise cause.

In the latter case (configuration 2) the initrd is loaded but fails to boot the system. The console advises running journalctl and saving rdsosreport.txt, but neither output is provided.

As already said, Agama does not officially support network boot, so for now it might be necessary to put the installer on a drive that appears local.

When the installer initrd is in fact loaded but fails to boot, it is very much an installer bug or missing feature, or a failure to pass some (not yet documented) options to the installer on the kernel command line.
Comment 10 Stefan Hundhammer 2023-06-28 16:03:54 UTC
.
Comment 11 Conie Chang 2023-07-21 04:14:11 UTC
(In reply to Michal Suchanek from comment #9)
> To the kernel it generally does not matter how it was booted so long as both
> the kernel and initrd is loaded successfully - ISO, PXE, whatever.
> 
> From the error message it looks like the initrd was either not loaded or
> what was loaded as initrd was not recognized by the kernel (wrong/corrupted
> file).
> 
> Unfortunately, the 'serial console log' is cut and does not contain enough
> information to determine more precise cause.
> 
> In latter case the initrd is loaded but fails to boot the system. There is
> advice to run journalctl and save rdsosreport but neither output is provided.
> 
> As already said Agama does not officially support network boot so it might
> be necessary to put the installer on a drive that appears local for now.
> 
> When the installer initrd is in fact loaded but fails to boot is is very
> much installer bug/lack of features or failure to pass some (not yet
> documented) options to the installer on the kernel commandline.

Hi Michal,

Is there any plan to document that this feature is not officially supported? We think customers also need this information.

Reopening the bug to clarify the official documentation.
Comment 12 Michal Suchanek 2023-07-21 08:58:06 UTC
I don't think the intention is to leave network installation unsupported.

The current ALP version is a prototype: things that are documented as working should work; other things are to be decided later.
Comment 13 Knut Alejandro Anderssen González 2023-07-21 09:44:35 UTC
(In reply to Michal Suchanek from comment #12)
> I think it is not the intention to not support network installation.
> 
> The current ALP version is a prototype, things that are documented as
> working should work, other things are to be decided later.

Network installation should be supported. At least booting the live image over the network is supported, and we already use it for s390x. Therefore I would expect it to work with PXE as well; otherwise it is a bug that we should look into.

See https://documentation.suse.com/alp/micro/html/alp-micro/concept-alp-deployment.html#alp-zseries-prepare-iso-image

Just take into account that linuxrc is no longer there, so the kernel command line options it handled are no longer available or supported; only the options supported directly by dracut are (https://man7.org/linux/man-pages/man7/dracut.cmdline.7.html).
Comment 14 Steffen Winterfeldt 2023-07-21 10:28:32 UTC
Indeed. ALP is entirely unrelated to SLES and none of the SLE
installation boot options apply to ALP.

For ALP network installations the option you are looking for is

  root=live:http://example.com/some_dir/squashfs.img

where squashfs.img is LiveOS/squashfs.img from the ALP install ISO.
Comment 15 Steffen Winterfeldt 2023-07-21 10:49:01 UTC
Ok, I just learned you can point directly to the ISO.

  root=live:http://example.com/some_dir/alp.iso

is enough.
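
As a minimal sketch of what that could look like in a PXE grub.cfg (not verified on ALP): the kernel and initrd paths below are the ones from configuration 1 in the description, and the URL is the placeholder from above, so all three are assumptions about the actual setup.

```
menuentry "SUSE ALP x64 (network live root)" {
     echo "Loading Kernel .."
     # root=live:<URL> is evaluated by dracut inside the initrd, which then
     # downloads the ISO over HTTP. The URL here is only a placeholder.
     linuxefi /SUSE/Beta/linuxsusealp root=live:http://example.com/some_dir/alp.iso
     echo "Loading Initial ramdisk .."
     initrdefi /SUSE/Beta/initrdsusealp
}
```

The SLE-style options from the original entry (install=, ifcfg=, dhcptimeout=) are left out here because they were handled by linuxrc, which ALP no longer ships.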
Comment 16 Frederic Crozat 2023-07-27 11:41:41 UTC
Tomas, could you update the ALP Dolomite documentation to mention how to handle PXE installation, based on the latest comments here? Thanks!
Comment 17 Tomáš Bažant 2023-09-19 12:27:07 UTC
(In reply to Frederic Crozat from comment #16)
> Tomas, could you update ALP Dolomite documentation to mention how to handle
> PXE install based on the latest comments here ? Thanks !

OK, but where do I enter this `root=live:http://example.com/some_dir/alp.iso`? On the GRUB command line?
Comment 18 Michal Suchanek 2023-09-19 12:38:43 UTC
It's a kernel parameter.

The linked document imprecisely states that it is edited on the GRUB command line, when they probably mean in the GRUB menu.

Kernel parameters are specified with the linux command in grub.
Comment 19 Tomáš Bažant 2023-09-19 13:03:54 UTC
(In reply to Michal Suchanek from comment #18)
> It's a kernel parameter.
> 
> The document linked document imprecisely states it's edited on grub
> commandline when they probably mean in grub menu.
> 
> Kernel parameters are specified with the linux command in grub.

Thanks, clear now. BTW, does this apply to the Agama image only, or to RAW disk installation as well?
Comment 20 Michal Suchanek 2023-09-19 14:27:38 UTC
I am not really familiar with the design.

The option comes from upstream dracut, so it should work provided the needed tools happen to be in the ramdisk, but there is no guarantee this will be the case for a raw disk image.
Comment 21 Tomáš Bažant 2023-09-20 08:14:52 UTC
Progress is tracked in https://github.com/SUSE/doc-modular/pull/196/files
@Michal, can you please check the wording there and possibly approve?
Comment 22 Michal Suchanek 2023-09-20 14:48:24 UTC
That looks generally fine if you are running the ISO.

However, the point of network boot is to alleviate the need to boot the ISO image on the target system.

To that end, the SLE deployment guide says where in the ISO image to find the kernel image and ramdisk, what to use to load them on the target system, how to pass kernel parameters, and which parameters to pass.

I don't think advising users to point the GRUB running from the ISO image to another one lying around on the internet will be reliable. You still get the kernel from the local ISO image, and if it does not match the remote installer it will likely fail.
Comment 23 Tomáš Bažant 2023-09-25 11:35:59 UTC
(In reply to Michal Suchanek from comment #22)
> That looks generally fine if you are running the iso.
> 
> However, the point of network boot is to alleviate the need to boot the iso
> image on the target system.
> 
> To that end the SLE deployment guide says where to pull the kernel image and
> ramdisk from the iso image, what to use to load them on the target system,
> and how to pass the kernel parameters then, and what parameters.
> 
> I don't think advising to point the grub running from the iso image to
> another one lying around on the internet will be reliable. You still get the
> kernel from the local iso image, and if it does not match the remote
> installer it will likely fail.

You are right, my suggestion is not very useful. Is the following the way to do it?
Generally, the steps are as follows:

1) Set up a repository - for example HTTP - to hold installation files from the media (https://documentation.suse.com/sles/15-SP5/html/SLES-all/cha-deployment-instserver.html#sec-deployment-instserver-http)
2) Set up the network boot environment: DHCP and TFTP server (https://documentation.suse.com/sles/15-SP5/html/SLES-all/cha-deployment-prep-pxe.html)
3) Configure the target system to boot from the network (https://documentation.suse.com/sles/15-SP5/html/SLES-all/cha-deployment-prep-pxe.html#sec-deployment-prep-boot-pxeprep)
Comment 24 Michal Suchanek 2023-09-25 11:54:14 UTC
The new thing with ALP is that it leverages a recent upstream dracut feature for expanding ISO images.

So what you need are three files: the Linux image and the ramdisk, which are typically downloaded over TFTP (although some clients like iPXE can support HTTP as well), and the ISO image (or squashfs file), which is downloaded once the Linux image and ramdisk are loaded. That means that 'hosting the files from the ISO image' is not needed; hosting the ISO image as-is is sufficient.
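
As a rough illustration of that split (a sketch, not a tested setup): the entry below names the TFTP server explicitly for the two boot files and leaves the ISO to dracut over HTTP. The server address and the /SUSE/Beta/ paths are taken from configuration 1 in the description. The /alp/ directory on an HTTP server at the same address is an assumption made only for this example, as is `ip=dhcp`, a documented dracut option that may or may not be needed here.

```
menuentry "SUSE ALP x64 (kernel and initrd via TFTP, ISO via HTTP)" {
     # File 1: the kernel, fetched by GRUB from the TFTP server.
     # File 3: the unmodified ISO, named in root=live: and fetched later
     # by dracut from inside the initrd.
     linuxefi (tftp,192.168.0.254)/SUSE/Beta/linuxsusealp ip=dhcp root=live:http://192.168.0.254/alp/d-installer-live.x86_64-ALP.iso
     # File 2: the ramdisk, also fetched by GRUB over TFTP.
     initrdefi (tftp,192.168.0.254)/SUSE/Beta/initrdsusealp
}
```

Nothing from inside the ISO needs to be unpacked on the HTTP server; only the kernel and ramdisk have to be copied out for TFTP.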
Comment 25 Michal Suchanek 2023-09-25 11:59:08 UTC
> zypper in tftpboot-installation-SLE-OS_VERSION-ARCHITECTURE

We don't have this step for ALP. In earlier times (e.g. SLE 11) this package was not available, and the guide advised pulling the Linux kernel and the ramdisk from the installation ISO.
Comment 26 Michal Suchanek 2023-09-28 13:11:28 UTC
And there seems to be work underway to provide such an image in bug 1215732.
Comment 27 Hui-Zhi Zhao 2024-04-19 08:16:15 UTC
Closing, please rerun the test on SLES16 once it's available.