Bug 1165907 - AutoYast crashes (SIGSEGV) during RAID+LVM disk setup
Status: RESOLVED FIXED
Classification: SUSE Linux Enterprise Server
Product: Public Beta SUSE Linux Enterprise Server 15 SP2
Component: AutoYaST
Version: Beta
Hardware: x86-64 SLES 15
Priority: P5 - None  Severity: Normal
Assigned To: Imobach Gonzalez Sosa
URL: https://trello.com/c/UdsMdRcY
Reported: 2020-03-06 10:28 UTC by Peter Stark
Modified: 2020-03-19 07:52 UTC
CC: 10 users



Attachments
- /dev/sd* attempt (198.84 KB, application/x-compressed), uploaded 2020-03-06 10:28 UTC by Peter Stark
- /dev/pmem* attempt (198.54 KB, application/x-compressed), uploaded 2020-03-06 10:29 UTC by Peter Stark
- save_y2logs profile and tmp logs (203.58 KB, application/x-compressed), uploaded 2020-03-11 12:14 UTC by Peter Stark
- save_y2logs and profile etc (249.27 KB, application/x-compressed), uploaded 2020-03-12 09:29 UTC by Peter Stark
- Screen shot (64.11 KB, image/jpeg), uploaded 2020-03-12 09:30 UTC by Peter Stark

Description Peter Stark 2020-03-06 10:28:41 UTC
Created attachment 832101 [details]
/dev/sd* attempt

We've tried to use the routines we use for SLES 15 SP1 to set up two media (2x M.2) in a mirror configuration via /dev/md126 and /dev/md127.
It fails, and the y2log file shows a Ruby dump.
I've attached two tarballs. Each contains the save_y2logs output, the XML profile, and tmp/YaST*.
One uses the /dev/sd* devices; the other is an attempt using DCPMM (/dev/pmem0s and /dev/pmem1s) which had been prepared earlier.

Note: Also in the /dev/sd* attempt the /dev/pmem* devices were present.
Comment 1 Peter Stark 2020-03-06 10:29:21 UTC
Created attachment 832102 [details]
/dev/pmem* attempt
Comment 2 Jiri Srain 2020-03-06 11:55:09 UTC
The bug is reported against SP2 but speaks about SP1. Please clarify the exact build.
Comment 3 Peter Stark 2020-03-06 13:38:56 UTC
(In reply to Jiri Srain from comment #2)
> The bug is reported about SP2, but speaks about SP1, please, clarify the
> exact build.

I only referred to SP1 in the sense that we've used the same routines. We've used SLES 15 SP2 PublicBeta (Full) ISO image for this.
Comment 4 Martin Vidner 2020-03-06 16:14:57 UTC
Thanks for the report!
Comment 5 Peter Stark 2020-03-10 11:30:04 UTC
Today we tested with the newer Beta4 Full ISO. There, the issue no longer occurs. You may want to close this bug.
Comment 6 Imobach Gonzalez Sosa 2020-03-10 12:38:50 UTC
Hi,

That's weird, because I am able to reproduce the problem even with Beta 4.

After performing some tests, I have spotted a problem in the profile (at least in the original one). In the second partition of "/dev/pmem0s", both "raid_name" and "lvm_group" are specified. The two are mutually exclusive, because you cannot use the partition as a RAID member and as an LVM PV at the same time. Simply removing the "lvm_group" fixed the issue for me.

I think we should improve AutoYaST to warn the user and handle this situation nicely instead of crashing (or silently ignoring one of them).

Regards,
Imo
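
For illustration, the conflicting combination described above might look roughly like this in the profile (a sketch, not the reporter's actual profile; the device and volume group names are taken from elsewhere in this report, and most required elements are omitted):

###
    <partition>
      <create config:type="boolean">true</create>
      <!-- partition used as a RAID member ... -->
      <raid_name>/dev/md127</raid_name>
      <!-- ... and as an LVM PV at the same time: invalid, remove this -->
      <lvm_group>system</lvm_group>
    </partition>
###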
Comment 7 Peter Stark 2020-03-11 08:08:28 UTC
(In reply to Imobach Gonzalez Sosa from comment #6)
> After performing some tests, I have spotted a problem in the profile (at
> least in the original one). In the second partition of the "/dev/pmem0s",
> "raid_name" and "lvm_group" are specified. Both of them are exclusive
> because you cannot use the partition as a RAID member and as an LVM PV at
> the same time. Simply removing the "lvm_group" fixed the issue for me.

Well, we discussed this with Daniel Rahn (see CC), who was our TAM at that time. We use this for many customers on SLES 12 (there with sd* instead of pmem*, but the same concept). We could not find a way to make LVM use the two partitions directly (without the md mirror).

To handle this later in the running system, we have to modify the filter= entry in /etc/lvm/lvm.conf, as we otherwise get duplicates (the md device and the raw sd* devices). But with the filter in place, the construct of an LVM volume on top of the md device works fine.

Having said this, I would still prefer a more direct approach (e.g. LVM RAID-1 on those two partitions, without md). If you could enlighten me or point me to a description, that would be great.
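
For reference, a filter of the kind mentioned might look like this in /etc/lvm/lvm.conf (a sketch only, not the reporter's actual configuration; the accept/reject patterns depend on the local device naming):

###
devices {
    # accept the assembled MD devices; reject the raw member devices that
    # would otherwise be scanned and reported as duplicate PVs
    filter = [ "a|^/dev/md.*|", "r|^/dev/sd.*|", "r|.*|" ]
}
###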
Comment 8 Peter Stark 2020-03-11 09:13:32 UTC
(In reply to Imobach Gonzalez Sosa from comment #6)
> That's weird because I am able to reproduce the problem even using the Beta
> 4.
On that, well... I tried again and it failed again. The difference is that I test with both a physical server (with PMEM installed) and a KVM VM that uses /dev/sda and /dev/sdb. Our pre-script adjusts to the different names. It seems to work on the VM but not on the physical server. Though it might just be a timing issue.

Here is the result of such a configuration on the VM:
slesvm2:~ # lsblk
NAME              MAJ:MIN RM  SIZE RO TYPE  MOUNTPOINT
sda                 8:0    0   70G  0 disk  
├─sda1              8:1    0  150M  0 part  
│ └─md126           9:126  0  150M  0 raid1 /boot/efi
└─sda2              8:2    0 69.9G  0 part  
  └─md127           9:127  0 69.9G  0 raid1 
    ├─system-swap 254:0    0    2G  0 lvm   [SWAP]
    └─system-root 254:1    0 67.7G  0 lvm   /
sdb                 8:16   0   70G  0 disk  
├─sdb1              8:17   0  150M  0 part  
│ └─md126           9:126  0  150M  0 raid1 /boot/efi
└─sdb2              8:18   0 69.9G  0 part  
  └─md127           9:127  0 69.9G  0 raid1 
    ├─system-swap 254:0    0    2G  0 lvm   [SWAP]
    └─system-root 254:1    0 67.7G  0 lvm   /
sr0                11:0    1 1024M  0 rom
Comment 9 Ancor Gonzalez Sosa 2020-03-11 09:26:18 UTC
(In reply to Peter Stark from comment #7)
> We could not find a way to make LVM use the two partitions directly
> (without md-mirror).

What you are trying to achieve is still totally unclear to me.

According to comment #8, you want to partition two disks into two partitions each.

Then you want the first partition of each disk to be combined into a RAID1 that is then used as /boot/efi.

And the second partitions of the two disks are combined into another RAID1 that is then used as the only PV for an LVM volume group containing two logical volumes (swap and /).

That's pretty normal and totally doable with a regular AutoYaST profile without pre-scripts and without tweaking the LVM filters (except in a non-updated SLE-15-SP1 in which the default LVM filters are indeed broken).

So I fail to see what you mean by "make LVM use the two partitions directly (without md-mirror)".
Comment 10 Peter Stark 2020-03-11 09:42:55 UTC
(In reply to Ancor Gonzalez Sosa from comment #9)
> That's pretty normal and totally doable with a regular AutoYaST profile
> without pre-scripts and without tweaking the LVM filters (except in a
> non-updated SLE-15-SP1 in which the default LVM filters are indeed broken).
Ok, great. I think I may have misread your comment #6:
> In the second partition of the "/dev/pmem0s", "raid_name" and "lvm_group" are
> specified. Both of them are exclusive because you cannot use the partition as a 
> RAID member and as an LVM PV at the same time. Simply removing the "lvm_group"
> fixed the issue for me.
The way I read it was that I should not combine md and LVM. Reading it again, I don't know how my brain took that turn. ;-) Must be an age thing.

I'll remove the "lvm_group" as you've suggested and test again (and clear the need info then).

> So I fail to see what you mean by "make LVM use the two partitions directly
> (without md-mirror)".
AFAIK LVM can do RAID directly (without md). It would reduce the complexity of the setup and get rid of the md layer if we could tell YaST to make a RAID-1 with LVM, not md.
Comment 11 Imobach Gonzalez Sosa 2020-03-11 10:14:25 UTC
(In reply to Peter Stark from comment #10)

[..]

> The way I've read it was that I should not combine md and LVM. Reading this
> again I don't know how my brain took that turn. ;-) Must be an age thing.

No problem :-)

> I'll remove the "lvm_group" as you've suggested and test again (and clear
> the need info then).

Thanks in advance.

> > So I fail to see what you mean by "make LVM use the two partitions directly
> > (without md-mirror)".
> AFAIK LVM can do RAID directly (without md). So, it would reduce the
> complexity of the setup and get rid of the md-layer if we could tell Yast to
> make a RAID-1 with LVM, not md.

YaST does not support using LVM mirroring capabilities, so you need to use the RAID+LVM approach to achieve what you want in this case.
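
In profile terms, the RAID+LVM layering referred to here is typically expressed in three drive sections. The following is a heavily abbreviated sketch (many required elements are omitted; the device and volume group names are taken from elsewhere in this report, and the second disk would mirror the first):

###
  <!-- each disk contributes one partition to the RAID via raid_name -->
  <drive>
    <device>/dev/sda</device>
    <partitions config:type="list">
      <partition>
        <!-- RAID member only: no lvm_group here -->
        <raid_name>/dev/md127</raid_name>
      </partition>
    </partitions>
  </drive>
  <!-- the assembled RAID is the LVM physical volume -->
  <drive>
    <device>/dev/md</device>
    <type config:type="symbol">CT_MD</type>
    <partitions config:type="list">
      <partition>
        <lvm_group>system</lvm_group>
      </partition>
    </partitions>
  </drive>
  <!-- the volume group carries the logical volumes -->
  <drive>
    <device>/dev/system</device>
    <type config:type="symbol">CT_LVM</type>
    <partitions config:type="list">
      <partition>
        <lv_name>root</lv_name>
        <mount>/</mount>
      </partition>
    </partitions>
  </drive>
###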
Comment 12 Peter Stark 2020-03-11 12:14:57 UTC
Created attachment 832546 [details]
save_y2logs profile and tmp logs

Without the lvm_group element we'll get another error:

Caller:  /mounts/mp_0003/usr/share/YaST2/lib/y2storage/planned/has_size.rb:109:in `distribute_space'

Details: RuntimeError
Comment 13 Imobach Gonzalez Sosa 2020-03-11 12:40:08 UTC
Hi Peter,

I would say that YaST did not find any LVM physical volume to build the volume group from. I have checked the autoinst.xml file included in the tarball you provided (which is the result of running the pre-script), and it looks like the "lvm_group" element is missing entirely.

You need to remove "lvm_group" only from the partitions you are using as RAID members (so they do not conflict), that is, from the /dev/sda and /dev/sdb definitions.

However, you need to tell AutoYaST to use the RAID as the LVM PV, so you have to keep the "lvm_group" in the first partition of the second "/dev/md" drive specification. Something like this (I have omitted many elements to simplify the example):

###
    <drive>
      <device>/dev/md</device>
      <partitions config:type="list">
        <partition>
          <create config:type="boolean">true</create>
          <!-- tell AutoYaST to use this partition as LVM PV -->
          <lvm_group>system</lvm_group>
        </partition>
      </partitions>
      <type config:type="symbol">CT_MD</type>
      <use>all</use>
    </drive>
###

Please, let me know if it does not work for you or if something is still unclear. AutoYaST profiles are not always easy to read/write :)

PS: when the fix for bug 1162043 lands into the installer, you will get a meaningful error message in a situation like this.
Comment 14 Peter Stark 2020-03-11 14:12:37 UTC
(In reply to Imobach Gonzalez Sosa from comment #13)
Hi Imo,
> ###
>     <drive>
>       <device>/dev/md</device>
>       <partitions config:type="list">
>         <partition>
>           <create config:type="boolean">true</create>
>           <!-- tell AutoYaST to use this partition as LVM PV -->
>           <lvm_group>system</lvm_group>
>         </partition>
>       </partitions>
>       <type config:type="symbol">CT_MD</type>
>       <use>all</use>
>     </drive>
> ###
Thanks for explaining. I've tested it and it works fine. Still, I wonder why it worked in 15.1. We will probably have to test 15.1 with these changes, too.

> AutoYaST profiles are not always easy to read/write :)
Yes. I've been suffering from AutoYaST and its changes for more than 10 years now. I must say that RHEL's Kickstart is more stable across minor versions.

> PS: when the fix for bug 1162043 lands into the installer, you will get a
> meaningful error message in a situation like this.
Great.
Comment 15 Peter Stark 2020-03-11 14:12:56 UTC
Forgot to clear needinfo
Comment 16 Imobach Gonzalez Sosa 2020-03-11 16:45:27 UTC
I'm glad to hear that it finally worked. Perhaps SLE 15 SP1 simply ignored the lvm_group and did not crash (I would have to check).

For the time being, I am keeping this bug report open because we are adapting AutoYaST to display a message explaining the situation (instead of just crashing) when this problem is detected.
Comment 17 Peter Stark 2020-03-12 09:29:49 UTC
Created attachment 832613 [details]
save_y2logs and profile etc

Well, with sd* everything was OK (as mentioned above). Now I used /dev/pmem*s again, and it crashes.
Comment 18 Peter Stark 2020-03-12 09:30:41 UTC
Created attachment 832614 [details]
Screen shot

I managed to capture the error which was briefly shown at the console, too.
Comment 19 Peter Stark 2020-03-12 09:50:39 UTC
Please ignore the last two comments. I somehow reintroduced the lvm_group (doh!). As you may have seen in the pre-script, we need to modify the XML to suit multiple configurations. We need the lvm_group in another setup. That pre-script did not work as it should. Sorry for the confusion.
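
A pre-script step of the kind described (conditionally stripping "lvm_group" from RAID-member partitions) could be sketched as follows. The demo profile below is illustrative only; in a real AutoYaST pre-script the adjusted profile would be written to /tmp/profile/modified.xml so the installer picks it up.

```shell
# Sketch of a pre-script fragment: strip the <lvm_group> element from
# partitions that should only be RAID members (hypothetical demo file;
# a real pre-script would write /tmp/profile/modified.xml instead).
profile=$(mktemp)
cat > "$profile" <<'EOF'
<partition>
  <create config:type="boolean">true</create>
  <raid_name>/dev/md127</raid_name>
  <lvm_group>system</lvm_group>
</partition>
EOF
# drop the conflicting element so the partition is a RAID member only
sed -i '/<lvm_group>/d' "$profile"
cat "$profile"
```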
Comment 20 Imobach Gonzalez Sosa 2020-03-12 16:05:40 UTC
No problem. We have improved AutoYaST to display a warning in these situations (and, by the way, to not crash), so I am closing this bug report.

https://github.com/yast/yast-storage-ng/pull/1062

Thanks!