Bug 998663 - aarch64: Upgrade path from 42.1 broken with btrfs (64K vs. 4K page size)
Summary: aarch64: Upgrade path from 42.1 broken with btrfs (64K vs. 4K page size)
Status: RESOLVED WONTFIX
Alias: None
Product: openSUSE Distribution
Classification: openSUSE
Component: Upgrade Problems
Version: Leap 42.2
Hardware: aarch64 openSUSE 42.1
Importance: P2 - High : Major
Target Milestone: Leap 42.2 Beta 2
Assignee: Alexander Graf
QA Contact: Jiri Srain
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: 1009123 1014196
Reported: 2016-09-13 16:54 UTC by Andreas Färber
Modified: 2016-12-22 14:11 UTC
CC List: 9 users

See Also:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---
lnussel: SHIP_STOPPER-


Description Andreas Färber 2016-09-13 16:54:00 UTC
Leap 42.1 ships with a 64K page size kernel, whereas Leap 42.2 inherited the 4K page size from SLES12 SP2.

It has been reported that, because btrfs (the default filesystem) apparently uses the kernel's native page size as its block size, this leads to:

a) the offline upgrade (.iso) not working (installation kernel can't mount the filesystem),

b) the online upgrade (zypper dup) working but resulting in a non-bootable system when using the 42.2 kernel.
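
As a rough illustration of the mismatch behind a) and b), the following Python sketch compares a filesystem's btrfs sector size with the page size of the running kernel. It assumes the superblock has already been dumped to text (for example with `btrfs inspect-internal dump-super`, or `btrfs-show-super` in older btrfs-progs) and that the dump contains a line of the form `sectorsize 65536`. The function names are hypothetical, and the "sizes must be equal" rule only reflects the behaviour reported in this bug, not every kernel version.

```python
import os
import re


def parse_sectorsize(dump_text):
    """Return the sectorsize in bytes from superblock dump text, or None."""
    match = re.search(r"^sectorsize\s+(\d+)\s*$", dump_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def kernel_can_mount(sectorsize, page_size):
    """Assumed rule from this report: sector size must equal page size."""
    return sectorsize == page_size


def running_page_size():
    """Page size of the currently running kernel, in bytes."""
    return os.sysconf("SC_PAGE_SIZE")


if __name__ == "__main__":
    # Sample text standing in for a real superblock dump of a 42.1 system.
    sample = "sectorsize\t\t65536\nnodesize\t\t65536\n"
    sectorsize = parse_sectorsize(sample)
    page_size = running_page_size()
    print("filesystem sectorsize:", sectorsize)
    print("running page size:", page_size)
    print("mountable here:", kernel_can_mount(sectorsize, page_size))
```

Run on the installation kernel before an offline upgrade, a check like this would flag case a) up front; run on the old system with the target kernel's page size passed in by hand, it would flag case b).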

Our intent is to fix this in multiple steps:

1) For Beta 2 switch kernel back to 64K page size.

This will immediately unbreak the upgrade path for any SoftIron customers and other 42.1 users, but will break the Beta 1 upgrade path.
It will also affect whether any semi-official, not yet prepared/built 42.2 JeOS images will be able to boot on low-RAM systems, due to the large number of files in the Kiwi initrd.

2) Investigate a btrfs KMP that adds support for non-native page sizes.

That may allow us to switch the page size to 4K for real, if we want.
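
The trade-off in step 1 can be sketched as a small compatibility table. This is only an illustration: the page sizes are the ones stated in this report for aarch64, the entry for Beta 2 is the proposal rather than a shipped fact, and `upgrade_boots` is a hypothetical helper that assumes a btrfs root is usable only when both kernels share a page size.

```python
KIB = 1024

# Kernel page sizes as stated in this thread (aarch64 only).
PAGE_SIZE = {
    "Leap 42.1": 64 * KIB,
    "Leap 42.2 Beta 1": 4 * KIB,
    "Leap 42.2 Beta 2 (proposed)": 64 * KIB,
}


def upgrade_boots(source, target, page_size=PAGE_SIZE):
    """True if a btrfs root created under `source` is assumed mountable by `target`."""
    return page_size[source] == page_size[target]


if __name__ == "__main__":
    target = "Leap 42.2 Beta 2 (proposed)"
    for source in ("Leap 42.1", "Leap 42.2 Beta 1"):
        verdict = "ok" if upgrade_boots(source, target) else "broken"
        print(f"{source} -> {target}: {verdict}")
```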
Comment 1 Andrew Wafaa 2016-09-13 20:56:43 UTC
I have serious reservations about deviating from SLE parity. It would be good to know how many systems have been shipped with 42.1, as OD1000 systems have not shipped yet.

Provided we can supply clear documentation on upgrading (effectively a re-install), I see no problem in inflicting the inconvenience. One of the big advantages of Leap is the upsell opportunities it provides for SLE. From a technical perspective 4K is a better choice and we just need to accept the original mistake.

I'll discuss further with Jeff and Alex.
Comment 2 Ludwig Nussel 2016-09-14 08:43:13 UTC
could we provide a second kernel with 64k page size but install the 4k one on new installations?
Comment 3 Andreas Färber 2016-09-14 08:48:43 UTC
(In reply to Ludwig Nussel from comment #2)
> could we provide a second kernel with 64k page size but install the 4k one
> on new installations?

Has already been discussed: We would still need the default flavor to be 64K.
Comment 4 Takashi Iwai 2016-09-14 08:56:09 UTC
One significant problem with switching from 64k to 4k after the release is that it would break kABI.  This means that KMPs built against the 42.2 GA kernel would no longer work after the switch to 4k.  So, if we want to keep KMP compatibility, switching after the release won't work well, unfortunately.

Of course, we could force all KMPs to be rebuilt after the switch happens.  But we have no control over 3rd party stuff, so we can't guarantee it.
Comment 5 Ludwig Nussel 2016-09-14 09:06:14 UTC
3rd party KMPs? How many "customers" does Leap 42.1 aarch64 have? I thought it's a tech preview, experiment, unsupported port, whatever? From my PoV I'd agree with Andrew. Yes, there was a mistake with 42.1 on aarch64. Yes, it sucks. Still, I'm sure there are ways to dump the OS as a tarball or something, boot with the new kernel, recreate the FS, unpack and then upgrade.
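
The dump-and-unpack idea could look roughly like the sketch below. It is only an outline under several assumptions: `dump_tree` and `restore_tree` are hypothetical helpers, the archive is stored outside the tree being dumped, and the filesystem is recreated by other means between the two calls. Python's `tarfile` is used for illustration only; as written it does not carry over btrfs subvolumes, snapshots or extended attributes, so a real migration would need more care.

```python
import os
import tarfile


def dump_tree(root, archive_path, skip=("proc", "sys", "dev", "run", "tmp")):
    """Pack `root` into a gzip tarball, skipping top-level pseudo-filesystems."""
    with tarfile.open(archive_path, "w:gz") as tar:
        for entry in sorted(os.listdir(root)):
            if entry in skip:
                continue
            # Store paths relative to `root` so they unpack under any new root.
            tar.add(os.path.join(root, entry), arcname=entry)


def restore_tree(archive_path, new_root):
    """Unpack the tarball into a freshly created filesystem mounted at `new_root`."""
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(new_root)
```

The intended order would be: call `dump_tree` while the old 64K kernel is still running, boot the new kernel, recreate and mount the filesystem, call `restore_tree`, and only then run the upgrade.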
Comment 7 Andreas Färber 2016-09-14 09:35:36 UTC
(In reply to Ludwig Nussel from comment #5)
> Yes, there was a mistake with 42.1 on aarch64.

No, it's not "a mistake with 42.1". We have the same 64K setting in Tumbleweed, it was fully deliberate at the time and aligned with ppc64le. Tumbleweed will likely continue to have 64K as long as the btrfs patches are not mainline.

So 4K also breaks "downgrading" from Tumbleweed to Leap 42.2. And that path is relevant as long as Tumbleweed keeps having publishing problems due to openQA, see the recent question from MPSA on the opensuse-arm mailing list (which was about Gigabyte, not SoftIron hardware - there's more than one).
Comment 8 Takashi Iwai 2016-09-27 08:08:21 UTC
Reassign this one to Alex, who already worked on TW.
Comment 9 Alexander Graf 2016-09-27 08:18:20 UTC
The plan is to provide a 64k kernel flavor for 42.2 and refuse any upgrade without installing that one first.
Comment 10 Ludwig Nussel 2016-11-07 12:23:02 UTC
so what is the current status?

If there's something to be documented please add it to the release notes.
https://github.com/openSUSE/release-notes-openSUSE/blob/Leap_42.2/xml/release-notes.xml

and refer to it from https://en.opensuse.org/Upgrade if needed
Comment 11 Alexander Graf 2016-11-14 09:42:48 UTC
(In reply to Ludwig Nussel from comment #10)
> so what is the current status?
> 
> If there's something to be documented please add it to the release notes.
> https://github.com/openSUSE/release-notes-openSUSE/blob/Leap_42.2/xml/release-notes.xml
> 
> and refer to it from https://en.opensuse.org/Upgrade if needed

I sent a PR on GitHub to add the respective documentation.
Comment 12 Alexander Graf 2016-11-14 09:45:09 UTC
It's too late to add the 64k flavor. Let's just make it a hard cut instead.
Comment 13 Swamp Workflow Management 2016-12-22 14:11:20 UTC
openSUSE-RU-2016:3235-1: An update that has 7 recommended fixes can now be installed.

Category: recommended (moderate)
Bug References: 1009123,1009275,1009493,1010575,1014686,995062,998663
CVE References: 
Sources used:
openSUSE Leap 42.2 (src):    release-notes-openSUSE-42.2.20161212-3.1