Bug 1218257

Summary: [Build 45.1] openQA test fails in hawk_gui: no "/sys/fs/cgroup/docker/cpuset.cpus.effective"
Product: [openSUSE] PUBLIC SUSE Linux Enterprise Server 15 SP6 Reporter: lili zhao <llzhao>
Component: Containers Assignee: Containers Team <containers-bugowner>
Status: VERIFIED INVALID QA Contact:
Severity: Normal    
Priority: P1 - Urgent CC: acarvajal, apappas, danish.prakash, dcermak, llzhao, priyanka.saggu, prokop.vlasin, rtsvetkov
Version: unspecified   
Target Milestone: ---   
Hardware: x86-64   
OS: Other   
URL: https://openqa.suse.de/tests/13086523/modules/hawk_gui/steps/43
Whiteboard:
Found By: openQA Services Priority:
Business Priority: Blocker: Yes
Marketing QA Status: --- IT Deployment: ---

Description lili zhao 2023-12-20 09:44:36 UTC
## Observation

openQA test in scenario sle-15-SP6-Online-x86_64-ha_hawk_client@64bit fails in
[hawk_gui](https://openqa.suse.de/tests/13086523/modules/hawk_gui/steps/43)

## Test suite description

HA test case "ha_hawk_client", test module "hawk_gui":
The command `docker run --rm --name test --ipc=host -v /tmp/.X11-unix:/tmp/.X11-unix -e DISPLAY=\$DISPLAY -v \$PWD/test:/test registry.opensuse.org/devel/openqa/ci/tooling/containers_15_4/hawk_test:latest -b firefox -H hawk-node01 -S hawk-node02 -s nots3cr3t -r /test/hawk_test.results --virtual-ip 10.0.2.222/24 2>&1` reports the following error:

"docker: Error response from daemon: OCI runtime create failed: container_linux.go:346: starting container
process caused "process_linux.go:297: applying cgroup configuration for process caused
\"open /sys/fs/cgroup/docker/cpuset.cpus.effective: no such file or directory\"": unknown"

The last good run was on build 40.1. The test has failed on every build since 41.1.
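
The missing `cpuset.cpus.effective` path suggests a cgroup layout mismatch between the host and the container runtime. A minimal sketch for checking which cgroup version a host mounts, assuming the usual mount point `/sys/fs/cgroup` and relying on the fact that cgroup v2 exposes a `cgroup.controllers` file in every cgroup directory. The function name `cgroup_version` is hypothetical and not part of the openQA test code:

```python
from pathlib import Path


def cgroup_version(root: str = "/sys/fs/cgroup") -> int:
    """Return 2 if `root` looks like a cgroup v2 (unified) mount, else 1.

    Assumption: the presence of `cgroup.controllers` at the root marks the
    unified hierarchy; legacy v1 and hybrid mounts do not have that file
    at this location.
    """
    return 2 if (Path(root) / "cgroup.controllers").is_file() else 1
```

Running `cgroup_version()` on the system under test should tell whether the failing build boots with the unified hierarchy.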

## Reproducible

Fails since (at least) Build [41.1](https://openqa.suse.de/tests/12949000)


## Expected result

Last good: [40.1](https://openqa.suse.de/tests/12933482) (or more recent)


## Further details

Always latest result in this scenario: [latest](https://openqa.suse.de/tests/latest?arch=x86_64&distri=sle&flavor=Online&machine=64bit&test=ha_hawk_client&version=15-SP6)
Comment 2 Dan Čermák 2023-12-21 08:09:04 UTC
You are installing an ancient version of docker that does not support cgroup v2, which SLES has switched to (see tests/ha/hawk_gui.pm, line 20):
```
sub install_docker {
    my $docker_url = "https://download.docker.com/linux/static/stable/x86_64/docker-19.03.5.tgz";

    assert_script_run "curl -s $docker_url | tar zxf - --strip-components 1 -C /usr/bin", 120;
    # Allow the user to run docker. We can't add him to the docker group without restarting X.
    # The final colon is to avoid a bash syntax error when assert_script_run() appends a semicolon
    assert_script_run "/usr/bin/dockerd -G users --insecure-registry registry.suse.de >/dev/null 2>&1 & :";
}
```

Please use our own docker instead of a long obsolete tarball from upstream.
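
To illustrate why the pinned tarball is a problem, here is a rough sketch that extracts the version from a static-tarball URL and compares it against the first upstream Docker release with cgroup v2 support. It assumes that release is 20.10; the helper names `docker_version_from_url` and `supports_cgroup_v2` are made up for this example and exist neither in Docker nor in the openQA test library:

```python
import re
from typing import Tuple

# Assumption: upstream Docker Engine gained cgroup v2 support in 20.10.
MIN_CGROUP_V2_DOCKER = (20, 10)


def docker_version_from_url(url: str) -> Tuple[int, ...]:
    """Parse the version out of a URL ending in 'docker-<version>.tgz'."""
    match = re.search(r"docker-(\d+(?:\.\d+)*)\.tgz$", url)
    if match is None:
        raise ValueError(f"no docker version found in {url!r}")
    # int() drops leading zeros, so "19.03.5" becomes (19, 3, 5).
    return tuple(int(part) for part in match.group(1).split("."))


def supports_cgroup_v2(version: Tuple[int, ...]) -> bool:
    """Compare only major.minor against the assumed minimum release."""
    return version[:2] >= MIN_CGROUP_V2_DOCKER


url = "https://download.docker.com/linux/static/stable/x86_64/docker-19.03.5.tgz"
print(supports_cgroup_v2(docker_version_from_url(url)))
```

For the URL hard-coded in `install_docker`, the check reports that the tarball predates cgroup v2 support.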
Comment 3 Dan Čermák 2023-12-21 08:09:36 UTC
This is not our docker version but an entirely unsupported docker version from upstream.
Comment 4 Dan Čermák 2023-12-21 08:31:02 UTC
(In reply to Dan Čermák from comment #3)
> This is not our docker version but an entirely unsupported docker version
> from upstream.

You'll have to switch this test away from SLED for this, though: SLED does not support enabling the container module. Base the test either on SLES with Firefox installed, or on SLES + WE (Workstation Extension).
Comment 5 lili zhao 2023-12-22 02:08:33 UTC
Thank you so much for the comments.
I opened a test case JIRA ticket to track this failure:
TEAM-8899 - [15SP6][HA][Build 45.1] openQA test fails in hawk_gui: no "/sys/fs/cgroup/docker/cpuset.cpus.effective"

We will change the bug status to "Verified Invalid" once we have finished verifying.
Comment 6 lili zhao 2024-01-30 01:29:27 UTC
Test case issue.