Bug 1227139

Summary: Redis→valkey migration sets wrong permissions
Product: [openSUSE] openSUSE Tumbleweed Reporter: Matt Williams <matt>
Component: Other Assignee: Antonio Teixeira <antonio.teixeira>
Status: NEW --- QA Contact: E-mail List <qa-bugs>
Severity: Normal    
Priority: P5 - None CC: paulu, toganm
Version: Current   
Target Milestone: ---   
Hardware: Other   
OS: Other   
Whiteboard:
Found By: --- Services Priority:
Business Priority: Blocker: ---
Marketing QA Status: --- IT Deployment: ---

Description Matt Williams 2024-06-27 14:48:45 UTC
Previously I had redis installed and would start it with `sudo systemctl start redis@localhost`. With the move to valkey, I have updated my system and if I do `sudo systemctl start valkey@localhost` I get a failure.

`sudo journalctl -u valkey@localhost.service` gives:
```
Jun 27 15:30:56 HEX systemd[1]: Starting Valkey instance: localhost...
Jun 27 15:30:56 HEX valkey-server[31189]: 31189:C 27 Jun 2024 15:30:56.152 # Fatal error, can't open config file '/etc/valkey/localhost.conf': Permission denied
Jun 27 15:30:56 HEX systemd[1]: valkey@localhost.service: Main process exited, code=exited, status=1/FAILURE
Jun 27 15:30:56 HEX systemd[1]: valkey@localhost.service: Failed with result 'exit-code'.
Jun 27 15:30:56 HEX systemd[1]: Failed to start Valkey instance: localhost.
```

and if I look at the old config file for redis, it's owned by the `redis` group:

```
$ sudo ls -hl /etc/redis
total 108K
-rw-r----- 1 root redis 106K May 31 11:42 localhost.conf.bak
```

but the new config file is owned by root (and not readable by others):

```
$ sudo ls -hl /etc/valkey/localhost.conf
-rw-r----- 1 root root 106K Jun 26 22:52 /etc/valkey/localhost.conf
```

Looking at the migration script `/usr/libexec/migrate_redis_to_valkey.bash` from package `valkey-compat-redis`, I see:

```bash
...
      cp $configfile /etc/valkey/$configfilename
...
```

which I'm assuming runs as root and therefore creates the new file with root ownership.
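
A minimal sketch of how the copy step could keep the restrictive mode while giving the service group read access. This is not the packaged script: the function name `copy_config` is made up for illustration, and it assumes the service group is called `valkey`.

```bash
#!/bin/bash
# Hypothetical replacement for the bare `cp` in the migration script.
# Assumption: the service group is called "valkey" (created by the package).
copy_config() {
    local src=$1 dest_dir=$2 group=${3:-valkey}
    local dest
    dest="$dest_dir/$(basename "$src")"
    # install(1) copies the file and sets mode and group in one step, so
    # the copy stays unreadable to others but readable by the service.
    install -m 0640 -g "$group" "$src" "$dest"
}
```

Run as root, something like `copy_config /etc/redis/localhost.conf /etc/valkey` should then leave the new file as `root:valkey` with mode `0640`, mirroring the old `root:redis` layout.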

If I run `sudo chgrp valkey /etc/valkey/localhost.conf`, it at least gets past that part, though it now fails with:

```
Jun 27 15:44:59 HEX systemd[1]: Starting Valkey instance: localhost...
Jun 27 15:44:59 HEX valkey-server[2090]: *** FATAL CONFIG FILE ERROR (Version 7.2.5) ***
Jun 27 15:44:59 HEX valkey-server[2090]: Can't open the log file: Permission denied
Jun 27 15:44:59 HEX systemd[1]: valkey@localhost.service: Main process exited, code=exited, status=1/FAILURE
Jun 27 15:44:59 HEX systemd[1]: valkey@localhost.service: Failed with result 'exit-code'.
Jun 27 15:44:59 HEX systemd[1]: Failed to start Valkey instance: localhost.
```

which I can't see a reason for yet.
Comment 1 Matt Williams 2024-06-27 14:52:19 UTC
Checking strace, I see that the log error is because it's trying to write its logs to `/var/log/redis/default.log` which is owned by `redis:redis`.
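
To spot leftovers like this without strace, a rough check along these lines might help. It is only a sketch: `find_redis_paths` is a made-up name, and the directive list (`logfile`, `dir`, `pidfile`, `unixsocket`) is my guess at which settings are likely to carry a path.

```bash
#!/bin/bash
# Hypothetical check, not part of the package: print the lines of a
# migrated config whose path-like directives still mention "redis".
find_redis_paths() {
    local conf=$1
    # -n prefixes each hit with its line number; commented-out lines are
    # skipped because the directive must be the first word on the line.
    grep -nE '^[[:space:]]*(logfile|dir|pidfile|unixsocket)[[:space:]].*redis' "$conf"
}
```

Usage would be `find_redis_paths /etc/valkey/localhost.conf`, run as root since the file is not world-readable.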
Comment 2 Paul Uiterlinden 2024-07-06 21:09:55 UTC
This update breaks Nextcloud.

OS: openSUSE Tumbleweed aarch64, Version 20240629
Nextcloud: 29.0.3.4 (installed outside zypper)

Zypper output:

# zypper dup
Loading repository data...
Reading installed packages...
Warning: You are about to do a distribution upgrade with all enabled repositories. Make sure these repositories are compatible before you continue. See 'man zypper' for more information about this command.
Computing distribution upgrade...

The following 2 NEW packages are going to be installed:
  valkey valkey-compat-redis

The following package is going to be REMOVED:
  redis

2 new packages to install, 1 to remove.
Overall download size: 1.5 MiB. Already cached: 0 B. After the operation, 30.6 KiB will be freed.

Backend:  classic_rpmtrans
Continue? [y/n/v/...? shows all options] (y): 
Retrieving: valkey-7.2.5-3.1.aarch64 (openSUSE-Tumbleweed-Oss)                                                                              (1/2),   1.5 MiB    
Retrieving: valkey-7.2.5-3.1.aarch64.rpm .....................................................................................................[done (3.8 MiB/s)]
Retrieving: valkey-compat-redis-7.2.5-3.1.noarch (openSUSE-Tumbleweed-Oss)                                                                  (2/2),   8.6 KiB    
Retrieving: valkey-compat-redis-7.2.5-3.1.noarch.rpm .....................................................................................................[done]

Checking for file conflicts: .............................................................................................................................[done]
/usr/bin/systemd-sysusers -
Creating group 'valkey' with GID 460.
Creating user 'valkey' (User for valkey key-value store) with UID 460 and GID 460.
See /usr/share/doc/packages/valkey/README.SUSE to continue
(1/2) Installing: valkey-7.2.5-3.1.aarch64 ...............................................................................................................[done]
/etc/redis/*.conf has been copied to /etc/valkey.  Manual review of adjusted configs is strongly suggested.
chown: warning: '.' should be ':': 'valkey.'
On-disk redis dumps copied from /var/lib/redis/ to /var/lib/valkey
Removed "/etc/systemd/system/redis.target.wants/redis@default.service".
Removed "/etc/systemd/system/multi-user.target.wants/redis@default.service".
Warning: The unit file, source configuration file or drop-ins of redis.target changed on disk. Run 'systemctl daemon-reload' to reload units.
Failed to stop redis@.service: Unit name redis@.service is missing the instance name.
See system logs and 'systemctl status redis@.service' for details.
Warning: The unit file, source configuration file or drop-ins of redis-sentinel.target changed on disk. Run 'systemctl daemon-reload' to reload units.
Failed to stop redis-sentinel@.service: Unit name redis-sentinel@.service is missing the instance name.
See system logs and 'systemctl status redis-sentinel@.service' for details.
warning: file /var/lib/redis: remove failed: No such file or directory
(2/2) Installing: valkey-compat-redis-7.2.5-3.1.noarch ...................................................................................................[done]
Running post-transaction scripts .........................................................................................................................[done]


valkey is not started, neither after a reboot nor when I try to start it manually via systemctl.

Consequently, Nextcloud fails to start:

Jul 06 21:18:16 ari-pi systemd[1]: Started Nextcloud cron.php job.
Jul 06 21:18:17 ari-pi php[1732]: RedisException: No such file or directory in /srv/www/htdocs/nextcloud/lib/private/RedisFactory.php:117
Jul 06 21:18:17 ari-pi php[1732]: Stack trace:
Jul 06 21:18:17 ari-pi php[1732]: #0 /srv/www/htdocs/nextcloud/lib/private/RedisFactory.php(117): Redis->pconnect()
Jul 06 21:18:17 ari-pi php[1732]: #1 /srv/www/htdocs/nextcloud/lib/private/RedisFactory.php(158): OC\RedisFactory->create()
Jul 06 21:18:17 ari-pi php[1732]: #2 /srv/www/htdocs/nextcloud/lib/private/Memcache/Redis.php(73): OC\RedisFactory->getInstance()
Jul 06 21:18:17 ari-pi php[1732]: #3 /srv/www/htdocs/nextcloud/lib/private/Memcache/Redis.php(79): OC\Memcache\Redis->getCache()
Jul 06 21:18:17 ari-pi php[1732]: #4 /srv/www/htdocs/nextcloud/lib/private/App/InfoParser.php(56): OC\Memcache\Redis->get()
Jul 06 21:18:17 ari-pi php[1732]: #5 /srv/www/htdocs/nextcloud/lib/private/App/AppManager.php(727): OC\App\InfoParser->parse()
Jul 06 21:18:17 ari-pi php[1732]: #6 /srv/www/htdocs/nextcloud/lib/private/AppFramework/App.php(72): OC\App\AppManager->getAppInfo()
Jul 06 21:18:17 ari-pi php[1732]: #7 /srv/www/htdocs/nextcloud/lib/private/legacy/OC_App.php(157): OC\AppFramework\App::buildAppNamespace()
Jul 06 21:18:17 ari-pi php[1732]: #8 /srv/www/htdocs/nextcloud/lib/private/AppFramework/Bootstrap/Coordinator.php(119): OC_App::registerAutoloading()
Jul 06 21:18:17 ari-pi php[1732]: #9 /srv/www/htdocs/nextcloud/lib/private/AppFramework/Bootstrap/Coordinator.php(90): OC\AppFramework\Bootstrap\Coordinator->registerApps()
Jul 06 21:18:17 ari-pi php[1732]: #10 /srv/www/htdocs/nextcloud/lib/base.php(706): OC\AppFramework\Bootstrap\Coordinator->runInitialRegistration()
Jul 06 21:18:17 ari-pi php[1732]: #11 /srv/www/htdocs/nextcloud/lib/base.php(1181): OC::init()
Jul 06 21:18:17 ari-pi php[1732]: #12 /srv/www/htdocs/nextcloud/cron.php(58): require_once('...')
Jul 06 21:18:17 ari-pi php[1732]: #13 {main}
Jul 06 21:18:17 ari-pi systemd[1]: nextcloud-cron.service: Main process exited, code=exited, status=1/FAILURE
Jul 06 21:18:17 ari-pi systemd[1]: nextcloud-cron.service: Failed with result 'exit-code'.
Jul 06 21:18:17 ari-pi systemd[1]: nextcloud-cron.service: Consumed 1.059s CPU time.
Comment 3 Paul Uiterlinden 2024-07-06 21:45:43 UTC
Two additional comments about the script /usr/libexec/migrate_redis_to_valkey.bash:

1)
It searches for service files only in /etc/systemd/system. Shouldn't /usr/lib/systemd/system be included as well? On my system the redis service files are not under /etc/systemd/system. They are located under /usr/lib/systemd/system:
-rw-r--r-- 1 root root 680 May 19 15:57 /usr/lib/systemd/system/redis@.service
-rw-r--r-- 1 root root 744 May 19 15:57 /usr/lib/systemd/system/redis-sentinel@.service

2)
The script contains the line:
chown -R valkey. /var/lib/valkey

The dot should have been a colon, considering this warning:
chown: warning: '.' should be ':': 'valkey.'
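
A rough sketch of what both points could look like in the script. The function names are made up for illustration, this is not the packaged code, and the `valkey` user and group names are taken from the zypper output above.

```bash
#!/bin/bash
# Hypothetical sketch, not the packaged script.

# Print redis unit files found at the top level of the given directories.
# Directories that do not exist are skipped silently.
find_redis_units() {
    local dir
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        find "$dir" -maxdepth 1 -name 'redis*.service'
    done
}

# Set owner and group with the ':' separator that chown expects,
# which avoids the "'.' should be ':'" warning.
fix_data_ownership() {
    local owner=$1 group=$2 dir=$3
    chown -R "$owner:$group" "$dir"
}
```

The calls would then be `find_redis_units /etc/systemd/system /usr/lib/systemd/system` and `fix_data_ownership valkey valkey /var/lib/valkey`.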