Bug 1220905 - salt-master: Error loading known_hosts
Summary: salt-master: Error loading known_hosts
Status: CONFIRMED
Alias: None
Product: openSUSE Tumbleweed
Classification: openSUSE
Component: Salt
Version: Current
Hardware: Other Other
Importance: P2 - High : Normal
Target Milestone: ---
Assignee: E-Mail List
QA Contact: E-mail List
URL: https://github.com/SUSE/spacewalk/iss...
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2024-03-04 21:25 UTC by Enno Gotthold
Modified: 2024-04-30 14:07 UTC
CC List: 2 users

See Also:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Description Enno Gotthold 2024-03-04 21:25:46 UTC
I am seeing this error on openSUSE MicroOS 20240302; the logs show it has been occurring for some time already. The error is known upstream and is caused by a combination of changes in pygit2 and the way Salt starts its daemon.

https://github.com/saltstack/salt/issues/64121#issuecomment-1705658539

Since we are not using the onedir packaging yet, one cannot downgrade to another pygit2 version via salt-pip.

I do not know the codebase well enough to tell whether a systemd drop-in that sets the user running salt-master.service to "salt" would be enough.

The salt-master log is flooded with the following message:

Mar 04 00:01:51 esprimo-1 salt-master[30997]: [ERROR   ] Error occurred fetching git_pillar remote 'main git@gitlab.com:<user>/<repo>.git': error loading known_hosts:
Mar 04 00:01:51 esprimo-1 salt-master[30997]: Traceback (most recent call last):
Mar 04 00:01:51 esprimo-1 salt-master[30997]:   File "/usr/lib/python3.11/site-packages/salt/utils/gitfs.py", line 1996, in _fetch
Mar 04 00:01:51 esprimo-1 salt-master[30997]:     fetch_results = origin.fetch(**fetch_kwargs)
Mar 04 00:01:51 esprimo-1 salt-master[30997]:                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Mar 04 00:01:51 esprimo-1 salt-master[30997]:   File "/usr/lib64/python3.11/site-packages/pygit2/remotes.py", line 155, in fetch
Mar 04 00:01:51 esprimo-1 salt-master[30997]:     payload.check_error(err)
Mar 04 00:01:51 esprimo-1 salt-master[30997]:   File "/usr/lib64/python3.11/site-packages/pygit2/callbacks.py", line 99, in check_error
Mar 04 00:01:51 esprimo-1 salt-master[30997]:     check_error(error_code)
Mar 04 00:01:51 esprimo-1 salt-master[30997]:   File "/usr/lib64/python3.11/site-packages/pygit2/errors.py", line 65, in check_error
Mar 04 00:01:51 esprimo-1 salt-master[30997]:     raise GitError(message)
Mar 04 00:01:51 esprimo-1 salt-master[30997]: _pygit2.GitError: error loading known_hosts:
Comment 1 Enno Gotthold 2024-03-05 09:46:24 UTC
After talking to the Salt maintainers, I tried the following:

1. Check with "ausearch -m AVC,USER_AVC -ts recent" if there are any SELinux denials present. - This is not the case.
2. Set SELinux to permissive mode. - This did not result in any change.
3. Verify that my salt-master RPM is up to date. - It is (3006.0-8.1)

The change that should have fixed this is https://github.com/openSUSE/salt/pull/588. However, it does not seem to be enough to fix the problem on MicroOS.
Comment 2 Pablo Suárez Hernández 2024-03-05 12:13:33 UTC
I was able to reproduce this issue. Apparently, the latest libgit2/pygit2 versions are bringing this error back.

As a temporary workaround, you could add: 

Environment=HOME=/var/lib/salt

to the "[Service]" section of "/usr/lib/systemd/system/salt-master.service".

It seems that the Salt internals that detect and set the environment variables after switching the user to "salt" still do not work correctly in some cases, so the "HOME" variable is not properly set.
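
For illustration only, here is a minimal Python sketch of the general technique involved: after switching to another user, look up that user's home directory in the password database and export it as "HOME". This is not Salt's actual code. The helper name env_for_user is hypothetical, and the sketch only builds the environment; it does not switch users itself.

```python
import os
import pwd


def env_for_user(username, base_env=None):
    """Return a copy of base_env with HOME, USER and LOGNAME set for username.

    Hypothetical helper for illustration; not part of Salt.
    """
    entry = pwd.getpwnam(username)  # raises KeyError if the user does not exist
    env = dict(os.environ if base_env is None else base_env)
    env["HOME"] = entry.pw_dir
    env["USER"] = entry.pw_name
    env["LOGNAME"] = entry.pw_name
    return env


if __name__ == "__main__":
    # "root" is used here only because it exists on virtually every Linux
    # system; on an affected master the user would be "salt".
    print(env_for_user("root", {})["HOME"])
```

A daemon that drops privileges without a step like this keeps the "HOME" value it inherited at startup, which is the situation the Environment= line above works around.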

We would probably need to backport this: https://github.com/saltstack/salt/pull/64510/commits/a180bfe6e0d6daf0997b71751215baef1a5cc646

(I'm setting this bug to CONFIRMED and also changing the component to Salt, so it gets into the right queues)

Hth!
Comment 3 Enno Gotthold 2024-03-05 14:10:03 UTC
The provided workaround works as expected. Thanks a lot for the help!