Bug 1217370

Summary: [Build :31467:nvidia-open-driver-G06-signed] /etc/modprobe.d/50-nvidia-default.conf install source conflict
Product: [openSUSE] PUBLIC SUSE Linux Enterprise Server 15 SP4 Reporter: Jozef Pupava <jpupava>
Component: Other Assignee: Stefan Dirsch <sndirsch>
Status: RESOLVED FIXED QA Contact:
Severity: Normal    
Priority: P2 - High CC: meissner, petr.vorel
Version: SLES15SP4-MaintUpd   
Target Milestone: ---   
Hardware: Other   
OS: Other   
URL: https://openqa.suse.de/tests/12850349/modules/update_install/steps/145
Whiteboard:
Found By: openQA Services Priority:
Business Priority: Blocker: Yes
Marketing QA Status: --- IT Deployment: ---

Description Jozef Pupava 2023-11-21 15:51:14 UTC
## Observation

With update https://build.suse.de/request/show/312973, the following file conflict is reported:

Detected 1 file conflict:

File /etc/modprobe.d/50-nvidia-default.conf
  from install of
     nvidia-open-driver-G06-signed-kmp-default-545.29.02_k5.14.21_150400.24.97-150400.9.30.2.x86_64 (TEST_0)
  conflicts with file from package
     nvidia-open-driver-G06-signed-kmp-default-535.129.03_k5.14.21_150400.24.92-150400.9.27.1.x86_64 (@System)

https://openqa.suse.de/tests/12850349/modules/update_install/steps/145

## Reproducible

Fails since (at least) Build [:31467:nvidia-open-driver-G06-signed](https://openqa.suse.de/tests/12850349) (current job)


## Expected result

Last good: [:31448:libxml2](https://openqa.suse.de/tests/12850347) (or more recent)


## Further details

Always latest result in this scenario: [latest](https://openqa.suse.de/tests/latest?arch=x86_64&distri=sle&flavor=Server-DVD-Incidents-Install&machine=64bit&test=qam-incidentinstall&version=15-SP4)
Comment 1 Marcus Meissner 2023-11-21 15:56:04 UTC
To explain the problem.

The config file is in the kmp package, without a version in its filename.

The kmp is multiversioned, so multiple instances can be installed on the system at the same time.

As long as the config file's content does not change, zypper accepts this.

Now that we have changed the config file, it no longer does.

Ideas:

- Ignore the problem ... but customers would see the above conflict and would need to accept it.

- Separate config file package? As the firmware is also multiversioned, it can't go in the firmware package either.

  Might be cleanest?

- Encode the version in the config file's filename? Due to sorting, the later ones would override the older ones?

Nothing really trivial.
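
For illustration only, the rule described in this comment can be modelled in a few lines of Python. This is a simplified sketch, not zypper's or rpm's actual implementation: the `PackageInstance` class, the `file_conflicts` helper, the short package names and the placeholder file contents are all hypothetical. It only shows the stated behaviour: two installed instances may share a path as long as the content is identical, and a conflict appears once the content differs.

```python
import hashlib
from dataclasses import dataclass, field

CONF = "/etc/modprobe.d/50-nvidia-default.conf"


@dataclass
class PackageInstance:
    """Hypothetical model of one installed instance of a multiversion package."""
    name: str
    files: dict = field(default_factory=dict)  # path -> file content (bytes)


def digest(content: bytes) -> str:
    """Content fingerprint used to compare two copies of the same path."""
    return hashlib.sha256(content).hexdigest()


def file_conflicts(installed, incoming):
    """Return (path, owning package) pairs where `incoming` ships a path
    already owned by an installed instance, but with different content."""
    conflicts = []
    for pkg in installed:
        for path, content in incoming.files.items():
            if path in pkg.files and digest(pkg.files[path]) != digest(content):
                conflicts.append((path, pkg.name))
    return conflicts


# Placeholder contents; the real file contents are not shown in this report.
kmp_535 = PackageInstance("kmp-535", {CONF: b"placeholder A\n"})
kmp_535_rebuild = PackageInstance("kmp-535-rebuild", {CONF: b"placeholder A\n"})
kmp_545 = PackageInstance("kmp-545", {CONF: b"placeholder B\n"})

print(file_conflicts([kmp_535], kmp_535_rebuild))  # same content
print(file_conflicts([kmp_535], kmp_545))          # changed content
```

In this model, giving the new package a different file name (as suggested in the last idea) removes the shared path and therefore the conflict.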
Comment 2 Stefan Dirsch 2023-11-21 16:35:52 UTC
I'm afraid there is no good solution. I guess I should just rename the config file to resolve the conflict, which of course may then break the older driver. Hopefully the older driver is no longer being used by booting an older kernel, but in that case the user space driver version would be wrong anyway.

Note to myself:

I may need to set 

  options nvidia NVreg_OpenRmEnableUnsupportedGpus=0

in the new config file (I currently no longer set it at all) to override the setting from the old config file. I've seen reports where the new driver breaks completely when this is enabled. I need to check this first.
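
A rough Python sketch of the override idea in this note, under explicit assumptions: it assumes that modprobe.d files are read in sorted file name order and that a later assignment of the same option wins, which is what the note relies on but is not verified here. The `effective_options` helper, the new file name `51-nvidia-new.conf` and the old value `=1` are hypothetical and used only for illustration.

```python
def effective_options(conf_files):
    """Merge `options <module> key=value ...` lines from several config files.

    `conf_files` maps file name -> file text. Files are processed in sorted
    name order and later assignments replace earlier ones (an assumption of
    this sketch, not a statement about modprobe internals).
    """
    result = {}
    for name in sorted(conf_files):
        for raw in conf_files[name].splitlines():
            line = raw.split("#", 1)[0].strip()  # drop comments
            parts = line.split()
            if len(parts) < 3 or parts[0] != "options":
                continue
            module = parts[1]
            for token in parts[2:]:
                key, _, value = token.partition("=")
                result.setdefault(module, {})[key] = value
    return result


files = {
    # Old file left behind by the older kmp; the value 1 is an assumption.
    "50-nvidia-default.conf": "options nvidia NVreg_OpenRmEnableUnsupportedGpus=1\n",
    # Hypothetical name for the renamed file shipped by the new kmp.
    "51-nvidia-new.conf": "options nvidia NVreg_OpenRmEnableUnsupportedGpus=0\n",
}

print(effective_options(files))
```

Under these assumptions the new file's `=0` takes effect as long as its name sorts after the old one.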
Comment 3 Jozef Pupava 2023-11-21 16:37:26 UTC
Same issue with the 15-SP5 update https://build.suse.de/request/show/312830
Should I open a separate bug, or is this good enough?

Detected 1 file conflict:

File /etc/modprobe.d/50-nvidia-default.conf
  from install of
     nvidia-open-driver-G06-signed-kmp-default-545.29.02_k5.14.21_150500.55.36-150500.3.16.1.x86_64 (TEST_0)
  conflicts with file from package
     nvidia-open-driver-G06-signed-kmp-default-535.129.03_k5.14.21_150500.55.31-150500.3.13.1.x86_64 (@System)

https://openqa.suse.de/tests/12817721#step/update_install/137
Comment 4 Stefan Dirsch 2023-11-21 16:39:07 UTC
No need to open another report. The fix will be the same, and I will submit it for both SP4 and SP5.
Comment 5 Stefan Dirsch 2023-11-22 13:32:51 UTC
Fixed with these changes:

-------------------------------------------------------------------
Wed Nov 22 13:16:01 UTC 2023 - Stefan Dirsch <sndirsch@suse.com>

- no longer try to overwrite NVreg_OpenRMEnableSupporteGpus driver
  option setting; apparently it's ignored by the driver (boo#1215981,
  comment#26)

-------------------------------------------------------------------
Tue Nov 21 21:05:50 UTC 2023 - Stefan Dirsch <sndirsch@suse.com>

- use different modprobe.d config file to resolve conflict with
  older driver package (boo#1217370); overwrite 
  NVreg_OpenRMEnableSupporteGpus driver option setting (disable it),
  since letting it enabled is supposed to break booting (boo#1215981, 
  comment#23)

Just submitted to factory/TW, sle15-sp5 and sle15-sp4. Closing.
Comment 7 OBSbugzilla Bot 2023-11-22 15:35:05 UTC
This is an autogenerated message for OBS integration:
This bug (1217370) was mentioned in
https://build.opensuse.org/request/show/1128138 Factory / nvidia-open-driver-G06-signed
Comment 8 Maintenance Automation 2023-12-05 12:30:02 UTC
SUSE-RU-2023:4642-1: An update that has two fixes can now be installed.

Category: recommended (moderate)
Bug References: 1215981, 1217370
Sources used:
openSUSE Leap 15.5 (src): nvidia-open-driver-G06-signed-545.29.02-150500.3.18.1
SUSE Linux Enterprise Micro 5.5 (src): nvidia-open-driver-G06-signed-545.29.02-150500.3.18.1
Basesystem Module 15-SP5 (src): nvidia-open-driver-G06-signed-545.29.02-150500.3.18.1
Public Cloud Module 15-SP5 (src): nvidia-open-driver-G06-signed-545.29.02-150500.3.18.1

NOTE: This line indicates an update has been released for the listed product(s). At times this might be only a partial fix. If you have questions please reach out to maintenance coordination.
Comment 9 Maintenance Automation 2023-12-05 12:36:08 UTC
SUSE-RU-2023:4641-1: An update that has two fixes can now be installed.

Category: recommended (moderate)
Bug References: 1215981, 1217370
Sources used:
openSUSE Leap 15.4 (src): nvidia-open-driver-G06-signed-545.29.02-150400.9.32.1
SUSE Linux Enterprise Micro for Rancher 5.3 (src): nvidia-open-driver-G06-signed-545.29.02-150400.9.32.1
SUSE Linux Enterprise Micro 5.3 (src): nvidia-open-driver-G06-signed-545.29.02-150400.9.32.1
SUSE Linux Enterprise Micro for Rancher 5.4 (src): nvidia-open-driver-G06-signed-545.29.02-150400.9.32.1
SUSE Linux Enterprise Micro 5.4 (src): nvidia-open-driver-G06-signed-545.29.02-150400.9.32.1
Basesystem Module 15-SP4 (src): nvidia-open-driver-G06-signed-545.29.02-150400.9.32.1
Public Cloud Module 15-SP4 (src): nvidia-open-driver-G06-signed-545.29.02-150400.9.32.1

NOTE: This line indicates an update has been released for the listed product(s). At times this might be only a partial fix. If you have questions please reach out to maintenance coordination.