Bug 1206814 - [Build 20230104-1] openQA test fails in snapper_cleanup, 'snapper cleanup number' gets error 'Failure (error.something)'
Status: CONFIRMED
Alias: None
Product: openSUSE Distribution
Classification: openSUSE
Component: Kernel
Version: Leap 15.5
Hardware: x86-64 openSUSE Leap 15.5
Importance: P5 - None : Normal
Target Milestone: ---
Assignee: openSUSE Kernel Bugs
QA Contact: E-mail List
URL: https://openqa.opensuse.org/tests/301...
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-01-04 03:13 UTC by Richard Fan
Modified: 2024-05-07 08:48 UTC (History)
CC List: 4 users

See Also:
Found By: openQA
Services Priority:
Business Priority:
Blocker: Yes
Marketing QA Status: ---
IT Deployment: ---


Attachments
snapper.log (18.01 KB, text/x-log)
2023-01-04 03:21 UTC, Richard Fan
strace snapperd (24.51 KB, text/plain)
2023-01-04 03:21 UTC, Richard Fan
strace snapper cleanup number (34.58 KB, text/plain)
2023-01-04 03:22 UTC, Richard Fan

Description Richard Fan 2023-01-04 03:13:55 UTC
## Observation

openQA test in scenario opensuse-15.4-DVD-Updates-x86_64-extra_tests_filesystem@64bit fails in
[snapper_cleanup](https://openqa.opensuse.org/tests/3010736/modules/snapper_cleanup/steps/65)

## Test suite description
Maintainer: QE Core

Filesystem related tests, for example snapper and btrfs features.


## Reproducible

Fails since (at least) Build [20230102-3](https://openqa.opensuse.org/tests/3006470)


## Expected result

Last good: [20230102-2](https://openqa.opensuse.org/tests/3005830) (or more recent)


## Further details

Always latest result in this scenario: [latest](https://openqa.opensuse.org/tests/latest?arch=x86_64&distri=opensuse&flavor=DVD-Updates&machine=64bit&test=extra_tests_filesystem&version=15.4)
Comment 1 Richard Fan 2023-01-04 03:15:02 UTC
I tried to collect some logs manually. Please see the attached files.
Comment 2 Richard Fan 2023-01-04 03:21:15 UTC
Created attachment 863819 [details]
snapper.log
Comment 3 Richard Fan 2023-01-04 03:21:43 UTC
Created attachment 863820 [details]
strace snapperd
Comment 4 Richard Fan 2023-01-04 03:22:06 UTC
Created attachment 863821 [details]
strace snapper cleanup number
Comment 5 Richard Fan 2023-01-04 03:23:04 UTC
susetest:/tmp # rpm -qf /usr/bin/snapper
snapper-0.8.16-1.1.x86_64
susetest:/tmp # uname -r
5.14.21-150400.24.38-default
susetest:/tmp # cat /etc/*release
NAME="openSUSE Leap"
VERSION="15.4"
ID="opensuse-leap"
ID_LIKE="suse opensuse"
VERSION_ID="15.4"
PRETTY_NAME="openSUSE Leap 15.4"
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:opensuse:leap:15.4"
BUG_REPORT_URL="https://bugs.opensuse.org"
HOME_URL="https://www.opensuse.org/"
DOCUMENTATION_URL="https://en.opensuse.org/Portal:Leap"
LOGO="distributor-logo-Leap"
Comment 6 Arvin Schnell 2023-02-22 10:27:59 UTC
Looking at openQA it seems as if the test sometimes succeeds and
sometimes fails.

Also, there hasn't been a change to snapper for more than a year
(since GA) in Leap 15.4. So likely some other component is causing
trouble.
Comment 7 Arvin Schnell 2023-02-22 13:47:38 UTC
Anyway, I could not reproduce the problem. The newer snapper version at
https://build.opensuse.org/project/show/filesystems:snapper has improved
error logging. Running the test with that version might help find the
problem. Access to a machine with the problem would also be good.
Comment 8 Richard Fan 2023-02-23 01:56:23 UTC
(In reply to Arvin Schnell from comment #7)
> Anyway, I could not reproduce the problem. The newer snapper version at
> https://build.opensuse.org/project/show/filesystems:snapper has improved
> error logging. Running the test with that version might help find the
> problem. Access to a machine with the problem would also be good.

A VM is ready for you. I will send the access info via mail.
Comment 9 Richard Fan 2023-02-23 02:44:00 UTC
(In reply to Richard Fan from comment #8)
> (In reply to Arvin Schnell from comment #7)
> > Anyway, I could not reproduce the problem. The newer snapper version at
> > https://build.opensuse.org/project/show/filesystems:snapper has improved
> > error logging. Running the test with that version might help find the
> > problem. Access to a machine with the problem would also be good.
> 
> A VM is ready for you. I will send the access info via mail.

Bad news: I can't reproduce the issue after a VM reset. I will try to find a new VM.
Comment 10 Arvin Schnell 2023-02-23 09:47:58 UTC
QA has now provided a machine and I can reproduce the problem there.
Comment 11 Arvin Schnell 2023-02-23 09:56:31 UTC
There is a btrfs quota rescan stuck on the machine for more than
half an hour.

The handling in snapper is not good. I will improve that but it
will cause snapper to be stuck until the btrfs rescan is done.
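
For anyone trying to confirm the same condition before running 'snapper cleanup number', the following is a minimal sketch, not snapper's actual handling. It assumes that `btrfs quota rescan -s <path>` prints a status line containing "rescan operation running" while a rescan is active. The helper names `rescan_running`, `query_status` and `wait_for_rescan`, the default timeout and the poll interval are all hypothetical.

```python
import subprocess
import time


def rescan_running(status_text: str) -> bool:
    """Return True if the status text reports an active quota rescan.

    Assumption: `btrfs quota rescan -s` prints a line containing
    "rescan operation running" while a rescan is in progress.
    """
    return "rescan operation running" in status_text


def query_status(path: str) -> str:
    """Run `btrfs quota rescan -s <path>` and return its stdout."""
    result = subprocess.run(
        ["btrfs", "quota", "rescan", "-s", path],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


def wait_for_rescan(path="/", timeout=600.0, poll=5.0, query=query_status) -> bool:
    """Poll until no rescan is reported or the timeout expires.

    Returns True if the filesystem reports no running rescan,
    False if a rescan was still reported when the timeout was reached.
    """
    deadline = time.monotonic() + timeout
    while True:
        if not rescan_running(query(path)):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
```

A caller could skip or postpone the cleanup when `wait_for_rescan("/")` returns False, rather than blocking indefinitely on a rescan that never finishes.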
Comment 12 Fabian Vogt 2023-07-18 14:59:29 UTC
This fails reproducibly in openQA still.

cat /proc/(pid of btrfs quota rescan kworker)/task/*/stack shows that it's idle/stuck in rescuer_thread.
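
The same check can be scripted. This is a rough sketch assuming a Linux /proc layout and root privileges, which reading kernel stacks normally requires. The pid of the kworker still has to be found by hand, and the function names `read_task_stacks` and `tasks_in_function` are hypothetical.

```python
from pathlib import Path


def read_task_stacks(pid: int, proc_root: str = "/proc") -> dict[str, str]:
    """Return {tid: kernel stack text} for every task of the given pid.

    Equivalent in spirit to `cat /proc/<pid>/task/*/stack`.
    """
    stacks = {}
    for stack_file in sorted(Path(proc_root, str(pid), "task").glob("*/stack")):
        tid = stack_file.parent.name
        try:
            stacks[tid] = stack_file.read_text()
        except OSError as exc:
            # Typically a permission error when not running as root.
            stacks[tid] = f"<unreadable: {exc}>"
    return stacks


def tasks_in_function(stacks: dict[str, str], function_name: str) -> list[str]:
    """Return the tids whose stack text mentions the given kernel function."""
    return [tid for tid, text in stacks.items() if function_name in text]
```

For example, `tasks_in_function(read_task_stacks(pid), "rescuer_thread")` lists the threads whose stack mentions rescuer_thread.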
Comment 13 Fabian Vogt 2023-07-18 15:00:52 UTC
(In reply to Fabian Vogt from comment #12)
> This fails reproducibly in openQA still.

Edit: On 15.5 now: https://openqa.opensuse.org/tests/3438103

Both 15.4 and 15.5 are affected, I'll set the version to the more recent one.