Bugzilla – Bug 1188954
black screen because dm tries to start before /dev/dri/card0 has been created during init
Last modified: 2024-03-27 22:41:25 UTC
Created attachment 851460 [details]
Xorg.0.log and dmesg

Initial summary: black screen on initial X open on Kaveri Radeon R7, /dev/dri/card0: No such file or directory with radeon.cik_support=0 amdgpu.cik_support=1

To reproduce:
1-boot with radeon.cik_support=0 amdgpu.cik_support=1 included on kernel cmdline
2-wait for X to fail to start
3-login on a tty
4-systemctl restart xdm

Actual behavior:
1-X fails to start, finding no /dev/dri/card0, leaving login prompt on tty1 on display screen
2-X starts normally using amdgpu DDX driver (after the restart in step 4)

Expected behavior:
1-X starts normally using amdgpu DDX driver

Tested with kernels 5.12.13, 5.11.16, 5.7.11

# inxi -SGIay
System:
  Host: asa88 Kernel: 5.12.13-1-default x86_64 bits: 64 compiler: gcc v: 11.1.1
  parameters: BOOT_IMAGE=/boot/vmlinuz root=LABEL=tvgp07stw noresume ipv6.disable=1 net.ifnames=0 mitigations=auto consoleblank=0 radeon.cik_support=0 amdgpu.cik_support=1 video=1024x768@60 video=1440x900@60 drm.debug=0x1e log_buf_len=1M
  Console: tty pts/0 DM: TDM Distro: openSUSE Tumbleweed 20210730
Graphics:
  Device-1: AMD Kaveri [Radeon R7 Graphics] vendor: ASUSTeK driver: amdgpu v: kernel alternate: radeon bus-ID: 00:01.0 chip-ID: 1002:130f class-ID: 0300
  Display: server: X.org 1.20.12 driver: loaded: vesa unloaded: fbdev,modesetting alternate: ati
  Message: Advanced graphics data unavailable for root.
Info:...inxi: 3.3.06

Comments:
1-without radeon.cik_support=0 amdgpu.cik_support=1 on cmdline, X cannot be coaxed into using amdgpu DDX driver via /etc/X11/xorg.conf.d/*conf
2-without radeon.cik_support=0 amdgpu.cik_support=1, greeter startup completes (normally, on first try) using modesetting DIX driver
Behavior remains the same with 5.13.4 kernel, and with omission of non-essential cmdline options.
Not sure why you want to enable Sea Islands (CIK) support in amdgpu kernel driver. I doubt it gets sufficient testing. I suggest to use radeon kernel module (default). Then possibly with "radeon" DDX, but "modesetting" should be fine as well, then using Mesa driver for acceleration via Glamor. If this fails as well, we can discuss again.
Created attachment 851463 [details]
Xorg.0.log booted without radeon.cik_support=0 amdgpu.cik_support=1

(In reply to Stefan Dirsch from comment #2)
> Not sure why you want to enable Sea Islands (CIK) support in amdgpu kernel
> driver. I doubt it gets sufficient testing.

10 months ago it (A10-7850K) worked at each boot just fine, as it does now once xdm is restarted. I have another Kaveri that still does work just fine, without need to restart xdm first thing after booting:

# cat inxi-tw20210730.txt
# pinxi -SCzy
System:
  Kernel: 5.12.13-1-default x86_64 bits: 64 Desktop: Trinity R14.0.10 Distro: openSUSE Tumbleweed 20210730
Machine:
  Type: Desktop Mobo: ASRock model: FM2A88X Extreme6+ serial: <filter> UEFI: American Megatrends v: P4.20 date: 01/13/2016
CPU:
  Info: Quad Core model: AMD PRO A8-8650B R7 10 Compute Cores 4C+6G bits: 64 type: MCP cache: L2: 2 MiB
  Speed: 1396 MHz min/max: 1400/3200 MHz Core speeds (MHz): 1: 1396 2: 1392 3: 1397 4: 1381
# pinxi -Gazy
Graphics:
  Device-1: AMD Kaveri [Radeon R7 Graphics] vendor: ASRock driver: amdgpu v: kernel alternate: radeon bus-ID: 00:01.0 chip-ID: 1002:1313 class-ID: 0300
  Display: x11 server: X.Org 1.20.12 driver: loaded: amdgpu unloaded: fbdev,modesetting,vesa alternate: ati display-ID: :0 screens: 1
  Screen-1: 0 s-res: 1920x1200 s-dpi: 120 s-size: 406x254mm (16.0x10.0") s-diag: 479mm (18.9")
  Monitor-1: DisplayPort-0 res: 1920x1200 hz: 60 dpi: 94 size: 519x324mm (20.4x12.8") diag: 612mm (24.1")
  OpenGL: renderer: AMD KAVERI (DRM 3.40.0 5.12.13-1-default LLVM 12.0.1) v: 4.6 Mesa 21.1.5 direct render: Yes
# systemd-analyze
Startup finished in 9.750s (firmware) + 23.795s (loader) + 1.952s (kernel) + 2.480s (initrd) + 3.207s (userspace) = 41.185s
multi-user.target reached after 3.145s in userspace
# dmesg | grep amdgpu
[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz root=LABEL=zd8p07stw noresume ipv6.disable=1 net.ifnames=0 mitigations=auto consoleblank=0 radeon.cik_support=0 amdgpu.cik_support=1 video=1024x768@60 video=1440x900@60 5
[ 0.019636] Kernel command line: BOOT_IMAGE=/boot/vmlinuz root=LABEL=zd8p07stw noresume ipv6.disable=1 net.ifnames=0 mitigations=auto consoleblank=0 radeon.cik_support=0 amdgpu.cik_support=1 video=1024x768@60 video=1440x900@60 5
[ 6.559662] [drm] amdgpu kernel modesetting enabled.
[ 6.559836] amdgpu: Topology: Add APU node [0x0:0x0]
[ 6.559894] fb0: switching to amdgpudrmfb from EFI VGA
[ 6.560540] amdgpu 0000:00:01.0: vgaarb: deactivate vga console
[ 6.560775] amdgpu 0000:00:01.0: amdgpu: Trusted Memory Zone (TMZ) feature not supported
[ 6.577972] amdgpu 0000:00:01.0: amdgpu: Fetched VBIOS from ROM BAR
[ 6.577978] amdgpu: ATOM BIOS: 113-SPEC-102
[ 6.578227] amdgpu 0000:00:01.0: amdgpu: VRAM: 1024M 0x000000F400000000 - 0x000000F43FFFFFFF (1024M used)
[ 6.578231] amdgpu 0000:00:01.0: amdgpu: GART: 1024M 0x000000FF00000000 - 0x000000FF3FFFFFFF
[ 6.578343] [drm] amdgpu: 1024M of VRAM memory ready
[ 6.578347] [drm] amdgpu: 3072M of GTT memory ready.
[ 6.586607] [drm] amdgpu: dpm initialized
[ 6.861924] amdgpu 0000:00:01.0: amdgpu: SE 1, SH per SE 1, CU per SH 8, active_cu_number 6
[ 7.037007] fbcon: amdgpudrmfb (fb0) is primary device
[ 7.490778] amdgpu 0000:00:01.0: [drm] fb0: amdgpudrmfb frame buffer device
[ 7.525260] [drm] Initialized amdgpu 3.40.0 20150101 for 0000:00:01.0 on minor 0

> I suggest to use radeon kernel module (default). Then possibly with "radeon"
> DDX, but "modesetting" should be fine as well, then using Mesa driver for
> acceleration via Glamor. If this fails as well, we can discuss again.

In comment #0 my last sentence would seem to have covered this. Anyway:

# inxi -SCMzy
System:
  Kernel: 5.13.4-1-default x86_64 bits: 64 Desktop: Trinity R14.0.10 Distro: openSUSE Tumbleweed 20210730
Machine:
  Type: Desktop Mobo: ASUSTeK model: A88X-PRO v: Rev X.0x serial: <filter> UEFI: American Megatrends v: 2603 date: 03/10/2016
CPU:
  Info: Quad Core model: AMD A10-7850K Radeon R7 12 Compute Cores 4C+8G bits: 64 type: MCP cache: L2: 2 MiB
  Speed: 1689 MHz min/max: 1700/3700 MHz Core speeds (MHz): 1: 1689 2: 1700 3: 1699 4: 1695
# inxi -Gazy
Graphics:
  Device-1: AMD Kaveri [Radeon R7 Graphics] vendor: ASUSTeK driver: radeon v: kernel alternate: amdgpu bus-ID: 00:01.0 chip-ID: 1002:130f class-ID: 0300
  Display: x11 server: X.Org 1.20.12 driver: loaded: modesetting unloaded: fbdev,vesa alternate: ati display-ID: :0 screens: 1
  Screen-1: 0 s-res: 1920x1200 s-dpi: 120 s-size: 406x254mm (16.0x10.0") s-diag: 479mm (18.9")
  Monitor-1: DP-1 res: 1920x1200 hz: 60 dpi: 94 size: 519x324mm (20.4x12.8") diag: 612mm (24.1")
  OpenGL: renderer: AMD KAVERI (DRM 2.50.0 5.13.4-1-default LLVM 12.0.1) v: 4.5 Mesa 21.1.5 direct render: Yes
# systemd-analyze
Startup finished in 5.901s (firmware) + 22.155s (loader) + 1.645s (kernel) + 2.238s (initrd) + 3.324s (userspace) = 35.265s
multi-user.target reached after 3.306s in userspace

I checked BIOS and found IOMMU was disabled.
After enabling, dmesg got very noisy using drm.debug=0x1e log_buf_len=1M, with the following repeated about 20 times:
[ 8.376985] [drm:amdgpu_atombios_encoder_dpms [amdgpu]] encoder dpms 37 to mode 3, devices 00000001, active_devices 00000000
[ 8.387861] [drm:dce_v8_0_program_watermarks [amdgpu]] force priority to high
[ 8.388035] [drm:dce_v8_0_program_watermarks [amdgpu]] force priority to high
[ 8.388330] [drm:dce_v8_0_program_watermarks [amdgpu]] force priority to high
[ 8.388484] [drm:dce_v8_0_program_watermarks [amdgpu]] force priority to high
[ 8.388663] [drm:dce_v8_0_program_watermarks [amdgpu]] force priority to high
[ 8.388817] [drm:dce_v8_0_program_watermarks [amdgpu]] force priority to high
Created attachment 851464 [details] Xorg.0.log & dmesg after enabling IOMMU in BIOS; with radeon.cik_support=0 amdgpu.cik_support=1; without drm.debug=0x1e log_buf_len=1M After enabling IOMMU in BIOS, no improvement was apparent. I contacted ASUS about this via browser chat. It was escalated. I'm supposed to hear back in 24-48 hours via email.
So things are simply working with default "radeon" kernel driver. No idea what's remaining here.
(In reply to Stefan Dirsch from comment #5)
> So things are simply working with default "radeon" kernel driver. No idea
> what's remaining here.

The amdgpu DDX driver behaved totally as expected on TW until relatively recently. I have no recollection of when it stopped behaving normally, but I'm guessing that when it first occurred I was hoping it was temporary or a fluke and would disappear on its own.

I cannot reproduce on the same PC using Debian 10 or 11, Fedora 34 or Leap 15.3. They all work perfectly fine at the outset with amdgpu kernel driver, radeon.cik_support=0 amdgpu.cik_support=1 and the amdgpu DDX. TW works fine too, except on initial X start at boot.

There is one other openSUSE quirk, and that is that 15.3 will not load the amdgpu DDX unless explicitly directed in /etc/X11/xorg.conf.d/ via Driver "amdgpu". All others need no help from /etc/X11/xorg.con*.

There is a Fedora quirk too, which I doubt is connected, because it also happens with kernel-radeon/X-modesetting. When X starts SDDM, X immediately crashes and restarts, with this Xorg.0.log.old tail:
[ 12.301] (II) UnloadModule: "libinput"
[ 12.303] (WW) xf86OpenConsole: VT_ACTIVATE failed: Input/output error
[ 12.303] (EE) Fatal server error:
[ 12.303] (EE) xf86OpenConsole: Switching VT failed
...
[ 12.303] (WW) xf86CloseConsole: KDSETMODE failed: Input/output error
[ 12.303] (WW) xf86CloseConsole: VT_GETMODE failed: Input/output error
[ 12.303] (WW) xf86CloseConsole: VT_ACTIVATE failed: Input/output error
[ 12.303] (EE) Server terminated with error (1). Closing log file.

Maybe it just needs more time to go away magically, or show up somewhere else. :P

I have lots of freespace on the SSD, so when I get the urge or need to do a fresh install of TW I'll see if it reproduces.
Ok. If it fails only during initial X startup, this looks like a timing issue, i.e. the kernel module is not being initialized in time before X gets started. Maybe the amdgpu kernel module is missing from initrd while radeon is included (since it's the default driver), i.e. adding amdgpu to initrd may help (if it's really missing).

I can't say anything about other Linux distros. They may use completely different kernel versions and patches for them. On openSUSE the amdgpu DDX is used when the "amdgpu" kernel module is loaded. There should be no need to configure it when the package xf86-video-amdgpu is installed. I still don't understand why you want to use the "amdgpu" driver with your hardware though.
(In reply to Stefan Dirsch from comment #7)
> Ok. If it fails only during initial X startup, this looks like a timing
> issue, i.e. kernel module is not being initialized in time before X gets
> started. Maybe amdgpu kernel module is missing from initrd, but radeon is
> (since it's the default driver), i.e. adding amdgpu to initrd may help (if
> it's really missing).

asa88:/boot # head -n2 /etc/os-release
NAME="openSUSE Tumbleweed"
# VERSION="20210730"
asa88:/boot # lsinitrd initrd-5.13.4-1-default | grep AMD
-rw-r--r-- 1 root root 7876 Aug 1 02:08 kernel/x86/microcode/AuthenticAMD.bin
asa88:/boot # lsinitrd initrd-5.13.4-1-default | grep -i radeon
asa88:/boot # lsinitrd initrd-5.13.4-1-default | grep -i amdgpu
asa88:/boot # lsinitrd initrd-5.12.13-1-default | grep -i amdgpu
asa88:/boot # lsinitrd initrd-5.11.16-1-default | grep -i amdgpu
asa88:/boot # lsinitrd initrd-5.10.16-1-default | grep -i amdgpu
asa88:/boot # lsinitrd initrd-5.9.14-1-default | grep -i amdgpu
asa88:/boot # lsinitrd initrd-5.8.15-1-default | grep -i amdgpu
asa88:/boot # lsinitrd initrd-5.7.11-1-default | grep -i amdgpu
asa88:/boot #

On Intel Haswell, lsinitrd /boot/initrd | grep i915 returns null too.

> I still don't understand why you want
> to use "amdgpu" driver with your hardware though.

I started to when I learned it was possible. Until this, I had only seen one reason not to (unique set of connector names is a nuisance). I have to wonder how much testing the radeon driver gets from developers following refactoring, given how old the hardware is that isn't supported by the modesetting DIX.
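The per-initrd grep checks above can be condensed into one filter that names every DRM module an initrd carries. This is only a minimal sketch: it assumes the listing format lsinitrd prints in this report (file path in the last column), and list_drm_modules is a hypothetical helper name, not something dracut ships.

```shell
#!/bin/sh
# Read an lsinitrd listing on stdin and print the names of kernel modules
# located under kernel/drivers/gpu/drm (compression suffix stripped).
# list_drm_modules is a hypothetical helper, not part of dracut.
list_drm_modules() {
    awk 'index($0, "/kernel/drivers/gpu/drm/") && /\.ko(\.[a-z0-9]+)?$/ {
        n = split($NF, parts, "/")             # last field is the file path
        name = parts[n]                        # file name without directories
        sub(/\.ko(\.[a-z0-9]+)?$/, "", name)   # drop .ko, .ko.xz, .ko.zst, ...
        print name
    }' | LC_ALL=C sort -u
}

# Example (on a real system):
#   lsinitrd /boot/initrd | list_drm_modules
```

Empty output means no KMS driver is in that initrd, which matches the empty grep results shown above.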
I don't know why none of amdgpu, radeon, or i915 is being added to the initrd on your system. It looks like there is no timing issue with radeon when it is absent from the initrd, in contrast to the amdgpu driver. The "modesetting" X driver is supposed to work with any hardware that has a working DRM/modesetting kernel driver.
Created attachment 851529 [details]
Xorg.0.log.old; journalctl -b; dmesg; Xorg.0.log

I have 15.3 freshly updated on an old GeForce PC doing essentially the same thing. On boot, X fails to start. After login and systemctl restart xdm, normalcy begins. Booting older kernels is no help. Switching from TDM to XDM directly using update-alternatives --configure default-displaymanager didn't help either.

# inxi -CSy
System:
  Host: g5eas Kernel: 5.3.18-59.16-default x86_64 bits: 64 Desktop: Trinity Distro: openSUSE Leap 15.3
CPU:
  Info: Single Core model: Intel Pentium 4 bits: 64 type: MT cache: L2: 2 MiB
  Speed: 3200 MHz min/max: N/A Core speeds (MHz): 1: 3200 2: 3200
# inxi -Gayz
Graphics:
  Device-1: XGI Z7/Z9 vendor: Gigabyte driver: N/A bus-ID: 0a:03.0 chip-ID: 18ca:0020 class-ID: 0300
  Device-2: NVIDIA G98 [GeForce 8400 GS Rev. 2] vendor: PNY driver: nouveau v: kernel bus-ID: 0b:00.0 chip-ID: 10de:06e4 class-ID: 0300
  Display: x11 server: X.Org 1.20.3 driver: loaded: modesetting unloaded: fbdev,vesa alternate: nouveau,nv,nvidia display-ID: :0 screens: 1
  Screen-1: 0 s-res: 1920x1200 s-dpi: 120 s-size: 406x254mm (16.0x10.0") s-diag: 479mm (18.9")
  Monitor-1: DVI-I-1 res: 1920x1200 hz: 60 dpi: 94 size: 519x324mm (20.4x12.8") diag: 612mm (24.1")
  OpenGL: renderer: NV98 v: 3.3 Mesa 20.2.4 direct render: Yes
#

Note that there is no way via PC BIOS to make the XGI GPU on the motherboard disappear.

I found all this out after trying to reproduce this on NVidia with TW20210803 on this PC, but since today's zypper dup, it refuses to mount / RW until I login and execute mount -o remount,rw /.
Ok. So even more confusing details from a completely different system. Thanks!
Created attachment 851587 [details]
Xorg.0.log.old; dmesg; journalctl -b

Happens in Intel Kaby Lake too:
# lspci -nnk | grep VGA
00:02.0 VGA compatible controller [0300]: Intel Corporation HD Graphics 630 [8086:5912] (rev 04)
# inxi -Sy | grep istro
Distro: openSUSE Tumbleweed 20210805

Again, no automatic restart was attempted. I must run systemctl restart xdm to get the greeter open.

30 of these are in dmesg:
[ 5.131236] i915 0000:00:02.0: [drm:intel_hdmi_set_edid [i915]] HDMI GMBUS EDID read failed, retry using GPIO bit-banging
[ 143.509712] i915 0000:00:02.0: [drm:intel_hdmi_set_edid [i915]] HDMI GMBUS EDID read failed, retry using GPIO bit-banging

That made me think maybe hardware issue, so I connected a different display, but no joy.
(In reply to Stefan Dirsch from comment #7)
> Ok. If it fails only during initial X startup, this looks like a timing
> issue, i.e. kernel module is not being initialized in time before X gets
> started.

As previously noted, this has turned out *not* to be limited to AMD. After reproducing this on a second Kaby Lake (gb250) I did some experimenting. This is typical content of my grub.cfg linu lines where this reproduces:

mitigations=auto consoleblank=0 video=1440x900@60 5

Changing it on gb250 to:

mitigations=auto video=1440x900@60 5

*sometimes* avoids "(EE) open /dev/dri/card0: No such file or directory", resulting in expected startup. Other times, the screen is black as noted in comment 0, while other times, focus returns to the login prompt on tty1.

I switched back to the comment #12 Kaby Lake (host ab250), tried the same things, and cannot get X to start on first try no matter what.

There's also this:
# systemd-analyze
Startup finished in 17.795s (firmware) + 5.243s (loader) + 1.458s (kernel) + 1.406s (initrd) + 2.985s (userspace) = 28.889s
graphical.target reached after 2.961s in userspace

It seems consoleblank=0 on kernel command line *can* somehow affect timing of the KMS kernel module loading, whether it be i915, amdgpu or radeon, but the real or main problem must be that each KMS module simply is not getting loaded soon enough. Is there anything that can advance KMS module loading?
Since hosts ab250, asa88 & gb250 where this reproduces are all running TDM, which a forum responder suggested might be at fault, on comment #0 host asa88 I did a fresh minimal/KDE (no-recommends) install of TW20210810, without NetworkManager, without Wicked, with systemd-networkd, without sddm, without lightdm, with xdm. This bug reproduces randomly on it. In 11 successive reboots:
1-success
2-success
3-failure
4-success
5-success
6-success
7-failure
8-failure
9-failure
10-failure
11-failure

Example failure stats (from #3 above):
# systemd-analyze
Startup finished in 5.768s (firmware) + 7.100s (loader) + 1.846s (kernel) + 2.102s (initrd) + 3.486s (userspace) = 20.304s
graphical.target reached after 3.477s in userspace
# systemd-analyze critical-chain
...
graphical.target @3.477s
└─multi-user.target @3.477s
  └─kbdsettings.service @1.007s +2.469s
    └─basic.target @996ms
      └─sockets.target @996ms
        └─telnet.socket @995ms
          └─sysinit.target @987ms
            └─systemd-update-utmp.service @958ms +28ms
              └─systemd-tmpfiles-setup.service @927ms +28ms
                └─local-fs.target @924ms
                  └─usr-local.mount @876ms +48ms
                    └─systemd-fsck@dev-disk-by\x2dlabel-tvgp04usrlcl.service @663ms +204ms
                      └─local-fs-pre.target @650ms
                        └─systemd-tmpfiles-setup-dev.service @630ms +17ms
                          └─kmod-static-nodes.service @556ms +55ms
                            └─systemd-journald.socket
                              └─system.slice
                                └─-.slice

The following is from #11 above:
# systemd-analyze blame | head -n22
2.521s kbdsettings.service
1.902s systemd-random-seed.service
1.541s sshd.service
1.140s dracut-initqueue.service
659ms initrd-switch-root.service
638ms chronyd.service
471ms smartd.service
396ms issue-generator.service
304ms user@0.service
303ms initrd-parse-etc.service
182ms systemd-fsck@dev-disk-by\x2dlabel-tvgp04usrlcl.service
179ms systemd-fsck@dev-disk-by\x2dlabel-tvgp06pub.service
179ms systemd-fsck@dev-disk-by\x2dlabel-tvgp05home.service
178ms systemd-udevd.service
137ms systemd-networkd.service
100ms display-manager.service
100ms systemd-logind.service
92ms systemd-udev-trigger.service
82ms apparmor.service
63ms sound-extra.service
62ms systemd-tmpfiles-clean.service
61ms modprobe@drm.service

Without radeon.cik_support=0 amdgpu.cik_support=1 on cmdline I managed success on 11 straight boots, so it seems amdgpu.ko.xz simply doesn't get loaded as fast/soon as radeon.ko.xz.

I started a forum thread about this in June: https://forums.opensuse.org/showthread.php/555514-boots-too-fast-for-Xorg-to-run
Using systemd-networkd instead of wicked or networkmanager makes the (~6s) difference between KMS module finishing loading soon enough or not.

This is from an xdm startup failing using systemd-networkd:
# journalctl -b -o short-monotonic -u display-manager.service -u systemd-modules-load.service -g St
-- Journal begins at Sat 2021-01-16 18:26:29 EST, ends at Sat 2021-08-14 16:50:47 EDT. --
[ 3.849113] asa88 systemd[1]: Stopped Load Kernel Modules.
[ 6.231843] asa88 systemd[1]: Starting X Display Manager...
[ 7.383982] asa88 display-manager[651]: Starting service tdm
[ 7.384359] asa88 systemd[1]: Started X Display Manager.

This is from an xdm startup succeeding using wicked:
# journalctl -b -o short-monotonic -u display-manager.service -u systemd-modules-load.service -g St
-- Journal begins at Fri 2021-08-13 23:33:06 EDT, ends at Sat 2021-08-14 17:38:24 EDT. --
[ 3.731005] localhost systemd[1]: Stopped Load Kernel Modules.
[ 13.312481] asa88 systemd[1]: Starting X Display Manager...
[ 13.411758] asa88 display-manager[1013]: Starting service xdm
[ 13.412175] asa88 systemd[1]: Started X Display Manager.

All my systems are configured with static IP, without IPV6 enabled, and without resolvconf of any kind enabled.

It looks like the summary should be:

display-manager.service starts too soon when using systemd-networkd
or
using systemd-networkd instead of wicked or networkmanager results in systemd-modules-load.service functionally finishing after display-manager.service starts
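The timing comparison in this comment can be automated. What follows is a rough sketch, assuming the bracketed monotonic timestamps printed by dmesg and by journalctl -o short-monotonic as they appear in this report, and assuming the kernel log contains a "[drm] Initialized" line like the one quoted in an earlier comment. first_stamp and check_race are hypothetical helper names.

```shell
#!/bin/sh
# Print the monotonic timestamp (seconds) of the first stdin line that
# contains the given fixed string. Lines are expected to look like:
#   [    6.231843] asa88 systemd[1]: Starting X Display Manager...
# first_stamp and check_race are hypothetical helpers for illustration.
first_stamp() {
    awk -v pat="$1" '
        index($0, pat) && match($0, /\[ *[0-9]+\.[0-9]+\]/) {
            s = substr($0, RSTART + 1, RLENGTH - 2)  # text between the brackets
            gsub(/ /, "", s)                         # strip the padding
            print s
            exit
        }'
}

# Compare DRM driver initialization with display manager start.
# $1: file holding kernel log lines, $2: file holding journal lines
check_race() {
    drm=$(first_stamp "[drm] Initialized" < "$1")
    dm=$(first_stamp "Starting X Display Manager" < "$2")
    if [ -z "$drm" ] || [ -z "$dm" ]; then
        echo "unknown: a timestamp is missing"
        return 0
    fi
    awk -v drm="$drm" -v dm="$dm" 'BEGIN {
        if (dm + 0 < drm + 0)
            printf "race: DM started %.3f s before DRM init\n", drm - dm
        else
            printf "ok: DRM ready %.3f s before DM start\n", dm - drm
    }'
}

# Example (on a real system, as root):
#   dmesg > /tmp/kernel.log
#   journalctl -b -o short-monotonic -u display-manager.service > /tmp/dm.log
#   check_race /tmp/kernel.log /tmp/dm.log
```

A "race" result would support the reading that the display manager is started before the KMS driver has created /dev/dri/card0.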
(In reply to Felix Miata from comment #15)
> Using systemd-networkd instead of wicked or networkmanager makes the (~6s)
> difference between KMS module finishing loading soon enough or not.
>
> This is from an xdm startup failing using systemd-networkd:
> # journalctl -b -o short-monotonic -u display-manager.service -u
> systemd-modules-load.service -g St
> -- Journal begins at Sat 2021-01-16 18:26:29 EST, ends at Sat 2021-08-14
> 16:50:47 EDT. --
> [ 3.849113] asa88 systemd[1]: Stopped Load Kernel Modules.
> [ 6.231843] asa88 systemd[1]: Starting X Display Manager...
> [ 7.383982] asa88 display-manager[651]: Starting service tdm
> [ 7.384359] asa88 systemd[1]: Started X Display Manager.
>
> This is from an xdm startup succeeding using wicked:
> # journalctl -b -o short-monotonic -u display-manager.service -u
> systemd-modules-load.service -g St
> -- Journal begins at Fri 2021-08-13 23:33:06 EDT, ends at Sat 2021-08-14
> 17:38:24 EDT. --
> [ 3.731005] localhost systemd[1]: Stopped Load Kernel Modules.
> [ 13.312481] asa88 systemd[1]: Starting X Display Manager...
> [ 13.411758] asa88 display-manager[1013]: Starting service xdm
> [ 13.412175] asa88 systemd[1]: Started X Display Manager.
>
> All my systems are configured with static IP, without IPV6 enabled, and
> without resolvconf of any kind enabled.
>
> It looks like the summary should be:
>
> display-manager.service starts too soon when using systemd-networkd
> or
> using systemd-networkd instead of wicked or networkmanager results in
> systemd-modules-load.service functionally finishing after
> display-manager.service starts

That is what using early KMS (module added to initrd) may help with, as Stefan Dirsch already mentioned back in comment #7.
I don't want to force early KMS by including graphics modules in initrd.

One of the Shaman Penguins on the forum thread provided a workaround I can use until the real offender can be isolated and possibly dealt with:

# systemctl cat display-manager.service
...
# /etc/systemd/system/display-manager.service.d/override.conf
[Unit]
After=systemd-udev-settle.service
Requires=systemd-udev-settle.service
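For anyone wanting to try the same workaround, the drop-in can be created as sketched below. This is only an illustration: install_settle_dropin is a hypothetical helper name, and the root-prefix argument is an assumption added so the function can be dry-run in a scratch directory. Note also that the systemd documentation discourages relying on systemd-udev-settle.service, so this is best treated as a stopgap.

```shell
#!/bin/sh
# Write the display-manager drop-in quoted in this comment.
# $1: root prefix ("" for the live system, a scratch directory for a dry run)
# install_settle_dropin is a hypothetical helper for illustration only.
install_settle_dropin() {
    dir="$1/etc/systemd/system/display-manager.service.d"
    mkdir -p "$dir" || return 1
    cat > "$dir/override.conf" <<'EOF'
[Unit]
After=systemd-udev-settle.service
Requires=systemd-udev-settle.service
EOF
}

# On the live system (as root) this would be followed by a reload:
#   install_settle_dropin "" && systemctl daemon-reload
#   systemctl cat display-manager.service   # confirm the drop-in is picked up
```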
(In reply to Felix Miata from comment #17)
> I don't want to force early via including graphics modules in initrd.
>
> One of the Shaman Penguins on the forum thread provided a workaround I can
> use until the real offender can be isolated and possibly dealt with:
>
> # systemctl cat display-manager.service
> ...
> # /etc/systemd/system/display-manager.service.d/override.conf
> [Unit]
> After= systemd-udev-settle.service
> Requires=systemd-udev-settle.service

Your choice of display-manager is TDM. Have you tested other display-managers for this behaviour?
One only. On the fresh TW20210810 installation on host asa88 from which comment #17 resulted, no DM other than XDM is or was installed.

I found another AMD host with TW20210810/TDM that requires this workaround, ara88, CPU/APU: AMD PRO A8-8650B R7.

As deano noted in the forum thread, https://adamsdesk.com/blog/2021/02/15/gdm-no-longer-starts-automatically/ explains that this had apparently been surfacing more than 6 months ago on Arch and Fedora. It points out that the likelihood of this happening was discovered over 9 years ago: https://gitlab.gnome.org/GNOME/gdm/-/issues/103.

In the adamsdesk article one workaround (out of several) was suggested, and repeated in my forum thread, which I initially found not to work on asa88 with TDM: Wants=dev-dri-card0.device & After=dev-dri-card0.device instead of After=systemd-udev-settle.service & Requires=systemd-udev-settle.service. I later discovered that I hadn't coupled the dev-dri-card0.device method with the required /etc/udev/rules.d/99-make-udev-drm-aware.rules containing 'SUBSYSTEM=="drm", TAG+="systemd"'. With the udev rule, dev-dri-card0.device also works for hosts asa88, ab250 and gb250, but not host ara88.

I need to note too that systemd-udev-settle.service as shipped, which ends with "ExecStart=udevadm settle", does not work. In order to work it needs the version shipped in 15.2, which ends with "ExecStart=/usr/bin/udevadm settle".
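The dev-dri-card0.device variant described above needs two files, the udev rule and the unit drop-in. A minimal sketch follows, using only the rule text and directives quoted in this comment; install_card0_dependency is a hypothetical helper name, the root-prefix argument is an assumption for dry runs, and reusing the override.conf file name means it would replace a udev-settle drop-in of the same name.

```shell
#!/bin/sh
# Install the udev rule that tags DRM devices for systemd, plus a drop-in
# that orders the display manager after /dev/dri/card0.
# $1: root prefix ("" for the live system, a scratch directory for a dry run)
# install_card0_dependency is a hypothetical helper for illustration only.
install_card0_dependency() {
    rules="$1/etc/udev/rules.d"
    dropin="$1/etc/systemd/system/display-manager.service.d"
    mkdir -p "$rules" "$dropin" || return 1
    # Without this tag systemd creates no dev-dri-card0.device unit to wait on.
    printf '%s\n' 'SUBSYSTEM=="drm", TAG+="systemd"' \
        > "$rules/99-make-udev-drm-aware.rules"
    cat > "$dropin/override.conf" <<'EOF'
[Unit]
Wants=dev-dri-card0.device
After=dev-dri-card0.device
EOF
}

# On the live system (as root), followed by reloading both daemons:
#   install_card0_dependency "" && udevadm control --reload && systemctl daemon-reload
```

As reported above, this combination worked on hosts asa88, ab250 and gb250 but not on ara88, so it should not be assumed to work everywhere.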
Still happening on the comment #17 TW host running Tiwai's simpledrm kernel, and no workarounds engaged:

# rpm -qa | egrep 'gdm|sddm|lightdm|kdm|tdm|xdm'
xdm-1.1.12-17.2.x86_64
# inxi -Sy
System:
  Host: asa88 Kernel: 5.16.3-4.gc7377e3-default x86_64 bits: 64 Desktop: KDE Plasma 5.23.5 Distro: openSUSE Tumbleweed 20220130
# inxi -Gayz
Graphics:
  Device-1: AMD Kaveri [Radeon R7 Graphics] vendor: ASUSTeK driver: amdgpu v: kernel alternate: radeon bus-ID: 00:01.0 chip-ID: 1002:130f class-ID: 0300
  Display: x11 server: X.Org 1.21.1.3 compositor: kwin_x11 driver: loaded: amdgpu unloaded: modesetting alternate: ati,fbdev,vesa display-ID: :0 screens: 1
  Screen-1: 0 s-res: 1920x1200 s-dpi: 120 s-size: 406x254mm (16.0x10.0") s-diag: 479mm (18.9")
  Monitor-1: HDMI-A-0 res: 1920x1200 hz: 60 dpi: 94 size: 519x324mm (20.4x12.8") diag: 612mm (24.1")
  OpenGL: renderer: AMD KAVERI (DRM 3.44.0 5.16.3-4.gc7377e3-default LLVM 13.0.0) v: 4.6 Mesa 21.3.4 direct render: Yes
#
Continues on host asa88 with TW20220425, and others I haven't kept track of.
Happens on fresh installation of 15.4 beta with KDM3 on Kaby Lake host gb250:

# pinxi -GISaz
System:
  Kernel: 5.14.21-150400.19-default arch: x86_64 bits: 64 compiler: gcc v: 7.5.0
  parameters: BOOT_IMAGE=/boot/vmlinuz root=LABEL=<filter> noresume ipv6.disable=1 net.ifnames=0 mitigations=auto consoleblank=0 video=1440x900@60 5
  Desktop: KDE v: 3.5.10 tk: Qt v: 3.3.8c info: kicker wm: kwin vt: 7 dm: KDM Distro: openSUSE Leap 15.4 Beta
Graphics:
  Device-1: Intel HD Graphics 630 vendor: Gigabyte driver: i915 v: kernel ports: active: HDMI-A-1 empty: DP-1, DP-2, HDMI-A-2, HDMI-A-3 bus-ID: 00:02.0 chip-ID: 8086:5912 class-ID: 0300
  Display: x11 server: X.Org v: 1.20.3 driver: X: loaded: modesetting unloaded: fbdev,vesa alternate: intel gpu: i915 display-ID: :0 screens: 1
  Screen-1: 0 s-res: 1920x1200 s-dpi: 120 s-size: 406x254mm (15.98x10.00") s-diag: 479mm (18.85")
  Monitor-1: HDMI-A-1 mapped: HDMI-1 model: NEC EA243WM serial: <filter> built: 2011 res: 1920x1200 hz: 60 dpi: 94 gamma: 1.2 size: 519x324mm (20.43x12.76") diag: 612mm (24.1") ratio: 16:10 modes: max: 1920x1200 min: 640x480
  OpenGL: renderer: Mesa Intel HD Graphics 630 (KBL GT2) v: 4.6 Mesa 21.2.4 direct render: Yes
Info:...Shell: Bash v: 4.4.23 running-in: konsole pinxi: 3.3.15-3
# ls -1 /sys/class/drm/
card0
card0-DP-1
card0-DP-2
card0-HDMI-A-1
card0-HDMI-A-2
card0-HDMI-A-3
renderD128
version
# systemd-analyze critical-chain
...
graphical.target @2.993s
└─multi-user.target @2.993s
  └─kbdsettings.service @866ms +2.126s
    └─basic.target @853ms
      └─sockets.target @852ms
        └─telnet.socket @852ms
          └─sysinit.target @842ms
            └─systemd-update-utmp.service @829ms +9ms
              └─systemd-tmpfiles-setup.service @716ms +110ms
                └─local-fs.target @709ms
                  └─usr-local.mount @692ms +14ms
                    └─systemd-fsck@dev-disk-by\x2dlabel-pi3p04usrlcl.service @648ms +37ms
                      └─local-fs-pre.target @625ms
                        └─systemd-tmpfiles-setup-dev.service @590ms +33ms
                          └─kmod-static-nodes.service @511ms +51ms
                            └─systemd-journald.socket
                              └─system.slice
                                └─-.slice
Both 15.4 & 15.3 using TDM suffer from this with their latest kernels 24.21 & 59.93 on comment 10 host g5eas with chip-ID: 10de:06e4 & kernel module nouveau. SDDM in current TW does not have this problem. I wonder if this is aggravated on this host because the 18ca:0020 XGI IGP cannot be disabled, and doesn't seem to be disabled by the presence of the ancient PCI GPU. I don't recognize existence of any module for it among kernel/drivers/gpu/drm.

Comment #0 host asa88's motherboard died, so I moved its APU to host ara88 in the process of confirming the death. Checking ara88's status regarding this is still to do.
Created attachment 862373 [details]
systemd-analyze blame

This 475 line attachment is from today's fresh installation. I spent all day yesterday and the first hours of today beating my head against this wall with this PC's previous 15.4, an upgrade from 15.3 many moons ago then ending with kernel-default-5.14.21-150400.22.1 as latest kernel, then three separate attempts to upgrade from 15.3 again, before deciding on a fresh start.

(In reply to Stefan Dirsch from comment #7)
> Ok. If it fails only during initial X startup, this looks like a timing
> issue, i.e. kernel module is not being initialized in time before X gets
> started. Maybe amdgpu kernel module is missing from initrd, but radeon is
> (since it's the default driver), i.e. adding amdgpu to initrd may help (if
> it's really missing).

On 15.4 post-kernel-default-5.14.21-150400.22.1, the Kaveri [Radeon R7 Graphics] chip-ID: 1002:130f PC is stubbornly refusing to load a kernel graphics module before X tries its first start on each boot. XDM won't auto restart, so I get either solid black screen, or a tty1 login prompt, depending on what linu line parameters are used, and/or which display drivers are configured, and/or whether I have validly reconfigured dracut for graphics module loading, and/or whether I have blacklisted radeon in /etc/modulesload.d/.

# systemd-analyze critical-chain
...
graphical.target @4.227s
└─multi-user.target @4.226s
  └─kbdsettings.service @2.171s +2.055s
    └─basic.target @2.154s
      └─sockets.target @2.154s
        └─telnet.socket @2.154s
          └─sysinit.target @2.148s
            └─systemd-backlight@backlight:acpi_video0.service @3.476s +7ms
              └─system-systemd\x2dbacklight.slice @3.271s
                └─system.slice
                  └─-.slice

Questions:
1-How do I guarantee earliest possible loading of whatever kernel graphics module is needed or wanted, whether radeon or amdgpu? Is force_drivers+=" amdgpu " in a file of any name in /etc/dracut.conf.d/ sufficient to ensure loading amdgpu comes first? Solely? Is omit_drivers+=" radeon " needed as well?
2-Does initrd by default include whatever blacklisting is contained in /etc/modprobe.d/? Is "blacklist radeon" in any file in this directory sufficient? Do filenames here need to end in .conf to be utilized?
3-Could delaying the first X start with a custom Wants= or After= on the existence of /dev/dri/card0 in /usr/lib/systemd/system/display-manager.service work? Shouldn't that already be happening?

Googling early graphics loading has been getting me nothing but *NVidia* and/or Youtubes. :(

15.3 & TW are behaving perfectly using only amdgpu, without any heroics, on same PC.
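On question 3, one way to express "wait until /dev/dri/card0 exists" without touching the shipped unit is a small polling helper. This is just a sketch under stated assumptions: wait_for_path is a hypothetical function, the 10 second default is arbitrary, and whether calling it from an ExecStartPre= line in a display-manager.service drop-in is enough on the affected hosts has not been tested in this report.

```shell
#!/bin/sh
# Poll until a path exists or a timeout expires.
# $1: path to wait for, $2: timeout in whole seconds (default 10, arbitrary)
# Returns 0 once the path exists, 1 if the timeout ran out first.
# wait_for_path is a hypothetical helper for illustration only.
wait_for_path() {
    path="$1"
    remaining="${2:-10}"
    while [ ! -e "$path" ]; do
        [ "$remaining" -gt 0 ] || return 1
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}

# Example:
#   wait_for_path /dev/dri/card0 15 || echo "card0 never appeared" >&2
```

Polling is cruder than a device-unit dependency, but it has no reliance on udev tagging rules and it fails visibly when the node never shows up.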
You could check which DRM drivers have been added to initrd by running

lsinitrd /boot/initrd | grep drm

I believe only the needed (default) driver is being added to initrd. So, yes, it would make sense to force dracut to add the "amdgpu" driver and, to save space, not include the "radeon" driver. As you suggested:

force_drivers+=" amdgpu"
omit_drivers+="radeon"

Of course you need to regenerate the initrd afterwards.

Xserver is being started by the displaymanager. So if anyone can/should wait for existence of /dev/dri it would be the DM. But I'm afraid this won't help. As you noticed, the load of the module is being triggered by Xserver; the device is just not available in time.
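The dracut change suggested here could be applied roughly as follows. This is only an outline: write_dracut_kms_conf and the file name 10-amdgpu.conf are hypothetical, the root-prefix argument is an assumption for dry runs, and the values are written with a space on both sides, as in the reporter's question, so that entries appended from several config files stay separated.

```shell
#!/bin/sh
# Write a dracut config that forces amdgpu into the initrd and leaves radeon out.
# $1: root prefix ("" for the live system, a scratch directory for a dry run)
# write_dracut_kms_conf and 10-amdgpu.conf are hypothetical names.
write_dracut_kms_conf() {
    dir="$1/etc/dracut.conf.d"
    mkdir -p "$dir" || return 1
    cat > "$dir/10-amdgpu.conf" <<'EOF'
force_drivers+=" amdgpu "
omit_drivers+=" radeon "
EOF
}

# On the live system (as root), then rebuild and verify the initrd:
#   write_dracut_kms_conf "" && dracut -f
#   lsinitrd /boot/initrd | grep drm
```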
(In reply to Stefan Dirsch from comment #25)
> You could check which DRM drivers have been added to initrd by running
>
> lsinitrd /boot/initrd | grep drm
>
> I believe only the needed (default) driver is being added to initrd. So, yes
> it would make sense to force dracut to add "amdgpu" driver and save space
> and not include "radeon" driver. As you suggested
>
> force_drivers+=" amdgpu"
> omit_drivers+="radeon"
>
> Of course you need to regenerate the initrd afterwards.
>
> Xserver is being started by the displaymanager. So if anyone can/should wait
> for existence of /dev/dri it would be the DM. But I'm afraid this won't
> help. As you noticed the load of the module is being triggered by Xserver,
> the device is just not available in time.

So does this help?
I gave this more thought and reached the conclusion that at the KMS switch point is when a kernel graphics module should load, but does not. When the screen goes black as 1024x768 mode ends, it should then enable either the native mode, or the cmdline video= mode. That's not been happening.

Between comments #24 & #25, I took a trip through BIOS setup and found IOMMU disabled. I switched it to enabled, and found the bad behavior seemed to be gone. However, after a break for sleep, on next considerable number of boots the bad behavior was back. So I decided to forget about it and do other things that needed doing. During that lull, I stumbled onto rd.driver.pre=amdgpu, so on return to try to follow-up here I added it. It didn't seem to make any difference.

Today is a new day. So far, every boot with an initrd built to include:

# lsinitrd /boot/initrd | grep drm | grep -v ^d
-rw-r--r-- 1 root root 3417336 Sep 8 13:12 lib/modules/5.14.21-150400.24.21-default/kernel/drivers/gpu/drm/amd/amdgpu/amdgpu.ko.zst
-rw-r--r-- 1 root root 301834 Sep 8 13:12 lib/modules/5.14.21-150400.24.21-default/kernel/drivers/gpu/drm/drm.ko.zst
-rw-r--r-- 1 root root 182288 Sep 8 13:12 lib/modules/5.14.21-150400.24.21-default/kernel/drivers/gpu/drm/drm_kms_helper.ko.zst
-rw-r--r-- 1 root root 8183 Sep 8 13:12 lib/modules/5.14.21-150400.24.21-default/kernel/drivers/gpu/drm/drm_ttm_helper.ko.zst
-rw-r--r-- 1 root root 22889 Sep 8 13:12 lib/modules/5.14.21-150400.24.21-default/kernel/drivers/gpu/drm/scheduler/gpu-sched.ko.zst
-rw-r--r-- 1 root root 47953 Sep 8 13:12 lib/modules/5.14.21-150400.24.21-default/kernel/drivers/gpu/drm/ttm/ttm.ko.zst

As a result of

  force_drivers+=" amdgpu"
  omit_drivers+="radeon"

But, a boot without such an initrd, and without rd.driver.pre=amdgpu, following a reboot from same initrd but with rd.driver.pre=amdgpu, just behaved as expected.
The following boot, also without rd.driver.pre=amdgpu, black screened with

  (EE) open /dev/dri/card0: No such file or directory

yet again. Another, with rd.driver.pre=amdgpu, was also bad. Later, with an initrd that forced no driver, I got a black screen boot with rd.driver.pre=amdgpu, followed by a normal boot without rd.driver.pre=amdgpu, followed by a black screen boot with rd.driver.pre=amdgpu, so I took rd.driver.pre=amdgpu out of the default boot stanza. The next boot was black, followed by an almost black one, both using an initrd explicitly omitting radeon but not mentioning amdgpu. Then, again without changing anything, the next boot was normal/as expected. I switched back to the initrd with force amdgpu & omit radeon (initrd for .24.21 #7), and got good boots >5X in a row.

There's just no rhyme or reason to whether /dev/dri/card0 appears soon enough without force-amd/omit-radeon, and with it the solution is still less than a 100% guarantee. I think I want to see what the long-awaited next kernel version brings, but ATM it seems to help something like ~97% of the time.

On occasion, the "black" screen turns out to be not 100% black. Occasionally a tty1 screen will have output, but with the brightness turned down to something in the neighborhood of 10% or less.

FWIW, same machine has no direct I/O at all on Fedora 36 except with 5.18 kernel instead of 5.19.17, but is fine on 37:
https://bugzilla.redhat.com/show_bug.cgi?id=2130843#c1
That, this, and threads in various forums in recent weeks make me think something is going wrong upstream in kernel with AMD/ATI graphics.
Created attachment 863603 [details]
first and second and third Xorg.0.logs from a fresh boot (fail=1, fail=2, success=3)

Yet another host suffering this without having i915 included in the initrd. There are three things to distinguish this one from previous comments. First: this one locks up at the black screen, no response to keyboard. Remote login enables xdm to be restarted, after which operation is normal. Second, KDM3 is the DM. Third, when X attempts start automatically for the second time, it uses /dev/dri/card1 instead of /dev/dri/card0. The manual xdm start also uses /dev/dri/card1. This /dev/dri/card1 usage on #2+ starts is not unique. I've seen it with other problem hosts not already mentioned here.

# inxi -SG
System:
  Host: gx280 Kernel: 6.0.12-1-default arch: i686 bits: 32 Desktop: KDE v: 3.5.10 Distro: openSUSE Tumbleweed 20221219
Graphics:
  Device-1: Intel 82915G/GV/910GL Integrated Graphics driver: i915 v: kernel
  Display: x11 server: X.Org v: 21.1.4 driver: X: loaded: intel unloaded: fbdev,modesetting,vesa dri: i915 gpu: i915 resolution: 1680x1050~60Hz
  API: OpenGL v: 2.1 Mesa 22.2.4 renderer: i915 (: 915G)
# dmesg | grep ailed
[ 62.345100] platform regulatory.0: Direct firmware load for regulatory.db failed with error -2
[ 62.345122] cfg80211: failed to load regulatory.db
[ 80.820683] simple-framebuffer simple-framebuffer.0: [drm:drm_atomic_helper_check_planes] [CRTC:34:crtc-0] atomic driver check failed
[ 80.820690] simple-framebuffer simple-framebuffer.0: [drm:drm_atomic_check_only] atomic driver check for 963f11c1 failed: -22
[ 145.537805] i915 0000:00:02.0: [drm:i915_gem_execbuffer2_ioctl [i915]] copy 1 exec entries failed
[ 201.786429] i915 0000:00:02.0: [drm:i915_gem_execbuffer2_ioctl [i915]] copy 1 exec entries failed
[ 2583.712955] i915 0000:00:02.0: [drm:i915_gem_execbuffer2_ioctl [i915]] copy 1 exec entries failed
Created attachment 866300 [details]
Xorg.0.log from TW20230412 w/ SDDM/Plasma

Host fi965 is another victim, currently TW20230412 on an old PCIe Radeon HD 8570 / R5 430 OEM R7 240/340 Radeon 520 OEM 1002:6611. It has two discrete installations. One with KDM3/KDE3, the other with SDDM/Plasma, both suffering. From the SDDM:

# dmesg | grep aile
[ 30.152899] platform regulatory.0: Direct firmware load for regulatory.db failed with error -2
[ 30.152906] cfg80211: failed to load regulatory.db
# journalctl -b | grep aile
Apr 13 21:37:28 fi965 systemd-vconsole-setup[167]: Failed to import credentials, ignoring: No such file or directory
Apr 14 01:38:21 fi965 kernel: platform regulatory.0: Direct firmware load for regulatory.db failed with error -2
Apr 14 01:38:21 fi965 kernel: cfg80211: failed to load regulatory.db
Apr 14 01:38:48 fi965 nscd[790]: 790 stat failed for file `/etc/services'; will try again later: No such file or directory
# grep /dev/dr /var/log/Xorg.0.log
[ 1486.092] (II) xfree86: Adding drm device (/dev/dri/card1)
[ 1486.098] (II) Applying OutputClass "AMDgpu" to /dev/dri/card1
[ 1486.106] (EE) open /dev/dri/card0: No such file or directory
[ 1486.106] (II) Applying OutputClass "AMDgpu" options to /dev/dri/card1
# grep /dev/dr /var/log/Xorg.0.log.old
[ 619.557] (II) xfree86: Adding drm device (/dev/dri/card1)
[ 619.697] (II) Applying OutputClass "AMDgpu" to /dev/dri/card1
[ 619.795] (EE) open /dev/dri/card0: No such file or directory
[ 619.806] (II) Applying OutputClass "AMDgpu" options to /dev/dri/card1
#

Absent vtty or remote login for systemctl restart xdm, post-grub activity ends with login prompt on vtty1.
Hmm. For some reason amdgpu driver takes /dev/dri/card1. I suggest to try again without xf86-video-amdgpu package installed, i.e. let modesetting X driver take over the device. Apart from that this all looks like a timing issue, i.e. driver not being initialized in time before X gets started.
Created attachment 871337 [details]
Xorg.0.log from the initial post-boot X failure on KBL GT2 Slowroll host ab250

Host ab250 here is Intel Kaby Lake with Tumbleweed, Slowroll, 15.4, 15.5 and 15.5, among other distros, installed on it. Like several other openSUSE installations here on various hosts, TW, SR & Leap, X will fail to start, claiming /dev/dri/card0 does not exist, but instead of a black screen, it exhibits multi-user.target behavior. When I login and systemctl restart xdm, the greeter starts, and X is running using /dev/dri/card1, with /dev/dri/card0 nonexistent.

# systemd-analyze
Startup finished in 13.602s (firmware) + 9.196s (loader) + 1.801s (kernel) + 2.357s (initrd) + 4.994s (userspace) = 31.953s
graphical.target reached after 4.993s in userspace
# systemd-analyze critical-chain
...
graphical.target @4.993s
└─multi-user.target @4.993s
  └─kbdsettings.service @2.865s +2.126s
    └─systemd-vconsole-setup.service @2.655s +175ms
      └─systemd-journald.socket
        └─system.slice
          └─-.slice
# dmesg | grep aile
[ 6.200638] platform regulatory.0: Direct firmware load for regulatory.db failed with error -2
[ 6.200641] cfg80211: failed to load regulatory.db
[ 6.538658] i915 0000:00:02.0: [drm] [ENCODER:94:DDI A/PHY A] failed to retrieve link info, disabling eDP
# journalctl -b --no-hostname | grep aile
Dec 13 21:43:06 kernel: platform regulatory.0: Direct firmware load for regulatory.db failed with error -2
Dec 13 21:43:06 kernel: cfg80211: failed to load regulatory.db
Dec 13 21:43:06 kernel: i915 0000:00:02.0: [drm] [ENCODER:94:DDI A/PHY A] failed to retrieve link info, disabling eDP
Dec 13 21:43:06 systemd[1]: kbdsettings.service: Failed with result 'signal'.
#
Hmm. So it seems we are seeing a timing issue here on Intel as well. This is the first time I have heard of this.
Maybe it would make sense to wait for /dev/dri/card0 to appear for a few seconds before starting the display manager. Likely that timeout will need to be reached on systems that do not have 3D acceleration support, only modesetting.
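To make the idea concrete, a bounded wait could look roughly like the sketch below. It is not an existing openSUSE mechanism; the function name, the 5 second default and the choice to accept any card* node (several hosts in this report end up on card1 rather than card0) are all assumptions for illustration.

```shell
#!/bin/sh
# Sketch only: poll for a DRM card node, giving up after a timeout.
# Usage: wait_for_drm_node [timeout_seconds] [directory]
# Prints the first card* node found and returns 0, or returns 1 on timeout.
wait_for_drm_node() {
    timeout="${1:-5}"
    dir="${2:-/dev/dri}"
    elapsed=0
    while [ "$elapsed" -le "$timeout" ]; do
        # Accept any card* node, not only card0. The test is -e rather than -c
        # so the function can also be exercised on an ordinary directory.
        for node in "$dir"/card*; do
            if [ -e "$node" ]; then
                printf '%s\n' "$node"
                return 0
            fi
        done
        # Sleep only if another attempt is still allowed.
        [ "$elapsed" -lt "$timeout" ] && sleep 1
        elapsed=$((elapsed + 1))
    done
    return 1
}

# Example: wait up to 5 seconds, then start the DM either way.
#   wait_for_drm_node 5 || echo "no DRM node yet, starting DM anyway" >&2
```

One way to try it would be from an ExecStartPre= line in a local drop-in for display-manager.service, ignoring a non-zero exit so that machines without a DRM device only pay the timeout and still get a display manager.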
What's apparently been happening in most installations for a while is the DM retries, so eventually launches:

# inxi -CGSz --hostname
System:
  Host: ab250 Kernel: 6.4.0-150600.9-default arch: x86_64 bits: 64 Desktop: TDE (Trinity) v: R14.1.1 Distro: openSUSE Leap 15.6 Beta
CPU:
  Info: quad core model: Intel Core i5-7500T bits: 64 type: MCP cache: L2: 1024 KiB
  Speed (MHz): avg: 800 min/max: 800/3300 cores: 1: 800 2: 800 3: 800 4: 800
Graphics:
  Device-1: Intel HD Graphics 630 driver: i915 v: kernel
  Display: x11 server: X.Org v: 1.21.1.11 driver: X: loaded: modesetting unloaded: fbdev,vesa dri: iris gpu: i915 resolution: 1: 2560x1440~60Hz 2: 1920x1200~60Hz 3: 1680x1050~60Hz
  API: OpenGL v: 4.6 vendor: intel mesa v: 23.3.4 renderer: Mesa Intel HD Graphics 630 (KBL GT2)
# lsinitrd /boot/initrd | grep 915
# systemd-analyze critical-chain
...
graphical.target @4.586s
└─multi-user.target @4.586s
  └─kbdsettings.service @2.422s +2.163s
    └─systemd-vconsole-setup.service @2.102s +299ms
      └─systemd-journald.socket
        └─system.slice
          └─-.slice
# grep dev/dr /var/log/Xorg.0*
/var/log/Xorg.0.log:[ 20.631] (II) xfree86: Adding drm device (/dev/dri/card0)
/var/log/Xorg.0.log:[ 20.666] (II) modeset(0): using drv /dev/dri/card0
/var/log/Xorg.0.log.old:[ 5.606] (EE) open /dev/dri/card0: No such file or directory
/var/log/Xorg.0.log.old:[ 5.606] (EE) open /dev/dri/card0: No such file or directory
#

# inxi -CGSz --hostname
System:
  Host: ab250 Kernel: 6.6.21-1-longterm arch: x86_64 bits: 64 Desktop: KDE v: 3.5.10 Distro: openSUSE Tumbleweed-Slowroll 20240213
CPU:
  Info: quad core model: Intel Core i5-7500T bits: 64 type: MCP cache: L2: 1024 KiB
  Speed (MHz): avg: 800 min/max: 800/3300 cores: 1: 800 2: 800 3: 800 4: 800
Graphics:
  Device-1: Intel HD Graphics 630 driver: i915 v: kernel
  Display: x11 server: X.Org v: 21.1.11 driver: X: loaded: modesetting dri: iris gpu: i915 resolution: 1: 2560x1440~60Hz 2: 1920x1200~60Hz 3: 1680x1050~60Hz
  API: EGL v: 1.5 drivers: iris,swrast platforms: x11,surfaceless,device
  API: OpenGL v: 4.6 compat-v: 4.5 vendor: intel mesa v: 23.3.6 renderer: Mesa Intel HD Graphics 630 (KBL GT2)
# lsinitrd /boot/initrd | grep i915
# systemd-analyze critical-chain
...
graphical.target @5.686s
└─multi-user.target @5.686s
  └─kbdsettings.service @3.561s +2.122s
    └─systemd-vconsole-setup.service @3.351s +152ms
      └─systemd-journald.socket
        └─system.slice
          └─-.slice
# grep dev/dr /var/log/Xorg.0*
/var/log/Xorg.0.log:[ 22.161] (II) xfree86: Adding drm device (/dev/dri/card1)
/var/log/Xorg.0.log:[ 22.207] (II) modeset(0): using drv /dev/dri/card1
/var/log/Xorg.0.log.old:[ 7.135] (EE) open /dev/dri/card0: No such file or directory
/var/log/Xorg.0.log.old:[ 7.135] (EE) open /dev/dri/card0: No such file or directory
#
So this is the result of trying to boot as fast as possible, and the display manager starts before the system is ready. Systemd maintainers, please advise how to delay display manager startup until all drivers are loaded.
(In reply to Michal Suchanek from comment #35)
> So this is the result of trying to boot as fast as possible, and the display
> manager starts before the system is ready.
>
> Systemd maintainers, please advise how to delay display manager startup
> until all drivers are loaded.

I'm not aware of any way to do that (and that would be ugly to do so). When a process needs a device, the traditional way is to rely on udev for waiting for the device before accessing it.
You should include amdgpu driver in initrd. That's done in the early stage, hence such a timing problem can be avoided in most cases. And that's the openSUSE default behavior. It worked with radeon (casually) likely because it can initialize the device much quicker than amdgpu that needs the firmware loading and more complex tasks. There is no general solution if the device is being initialized at a late stage, AFAIK. You can tweak the stuff that matches with your hardware setup, but it can't be applied generically.
We don't know what device is needed, only that if some devices are not initialized the display server fails to start.
Just to confirm the observation: the displaymanager tries 3 times to start Xserver before it fails in a fatal way. It wasn't meant as a workaround for timing issues though. Like Takashi, I see no generic approach to address this. Of course you could add some weird sleep loops that run until a predefined timeout before you start the displaymanager. But even this may fail in the end. And it makes booting slower for everyone ...
And we do not know what driver is needed, it varies depending on hardware. We also do not know which one is needed if more than one graphics card is present or if all are needed.
(In reply to Michal Suchanek from comment #40)
> And we do not know what driver is needed, it varies depending on hardware.

Usually it would be /dev/dri/card0 (of course only on systems with VGA device).

> We also do not know which one is needed if more than one graphics card is
> present or if all are needed.

Yeah. That's true. So on some systems things will still fail miserably and on many systems it will make the startup slower.
(In reply to Takashi Iwai from comment #37)
> You should include amdgpu driver in initrd. That's done in the early stage,
> hence such a timing problem can be avoided in most cases. And that's the
> openSUSE default behavior.

Forcing i915 apparently caused bug 1206316 so I stopped force including it. Following comes from the two comment #34 installations, while same host's TW uses the same configuration and suffers the same delay:

# cat /etc/dracut.conf.d/*conf /disks/sslo/etc/dracut.conf.d/*conf | egrep -v '^\#|^$'
persistent_policy="by-uuid"
persistent_policy="by-label"
compress="xz"
hostonly="yes"
omit_drivers+=" btrfs crypto dmraid encryptfs i18n iscsi lvm lvm2 plymouth raid1 md_mod resume sata_sil uefi-lib usb_storage watchdog "
omit_dracutmodules+=" resume "
persistent_policy="by-uuid"
persistent_policy="by-label"
compress="xz"
hostonly="yes"
omit_drivers+=" btrfs crypto dmraid encryptfs i18n iscsi lvm lvm2 plymouth raid1 md_mod resume sata_sil uefi-lib usb_storage watchdog "
omit_dracutmodules+=" resume "
# ls -gG /etc/dracut.conf.d/
-rw-r--r-- 1 894 Feb 27 12:59 10-persistent_policy.conf
-rw-r--r-- 1 29 Nov 10 2022 13-persistent-local.conf
-rw-r--r-- 1 366 Aug 13 2023 90-local.conf
-rw-r--r-- 1 491 Feb 27 12:59 99-debug.conf
# rpm -qf 99-debug.conf 10-persistent_policy.conf
dracut-059+suse.506.gd33b6bef-150600.1.32.x86_64
dracut-059+suse.506.gd33b6bef-150600.1.32.x86_64
# cat /etc/dracut.conf.d/10-persistent_policy.conf | egrep -v '^\#|^$'
persistent_policy="by-uuid"
Back to the comment #0 PC, no amount of waiting gets X going. Systemctl restart xdm is required:

# inxi -GSz
System:
  Kernel: 6.6.22-1-longterm arch: x86_64 bits: 64 Console: pty pts/0 Distro: openSUSE Tumbleweed-Slowroll 20240213
Graphics:
  Device-1: AMD Kaveri [Radeon R7 Graphics] driver: amdgpu v: kernel
  Display: server: X.org v: 1.21.1.11 driver: X: loaded: vesa unloaded: fbdev,modesetting gpu: amdgpu resolution: 1: 2560x1440 2: 1680x1050 3: 1920x1200 4: 1680x1050
  API: EGL v: 1.5 drivers: radeonsi,swrast platforms: surfaceless,device
  API: OpenGL v: 4.6 compat-v: 4.5 vendor: mesa v: 23.3.6 note: incomplete (EGL sourced) renderer: AMD Radeon R7 Graphics (radeonsi kaveri LLVM 17.0.6 DRM 3.54 6.6.22-1-longterm), llvmpipe (LLVM 17.0.6 256 bits)
  API: Vulkan Message: No Vulkan data available.
# cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz root=LABEL=zd8p19sslo noresume ipv6.disable=1 net.ifnames=0 radeon.cik_support=0 amdgpu.cik_support=1 consoleblank=0 preempt=full mitigations=off
# ls -gGh /dev/dri
total 0
drwxr-xr-x 2 80 Mar 27 18:26 by-path
crw-rw---- 1 226, 1 Mar 27 18:26 card1
crw-rw---- 1 226, 128 Mar 27 18:26 renderD128
# grep /dev/dr /var/log/Xorg.0.log*
/var/log/Xorg.0.log:[ 11.244] (EE) open /dev/dri/card0: No such file or directory
/var/log/Xorg.0.log:[ 11.244] (EE) open /dev/dri/card0: No such file or directory
/var/log/Xorg.0.log.old:[ 869.564] (II) xfree86: Adding drm device (/dev/dri/card1)
/var/log/Xorg.0.log.old:[ 869.570] (II) Applying OutputClass "AMDgpu" to /dev/dri/card1
/var/log/Xorg.0.log.old:[ 869.576] (EE) open /dev/dri/card0: No such file or directory
/var/log/Xorg.0.log.old:[ 869.577] (II) Applying OutputClass "AMDgpu" options to /dev/dri/card1
# systemd-analyze critical-chain
...
graphical.target @10.040s
└─multi-user.target @10.040s
  └─kbdsettings.service @7.945s +2.093s
    └─systemd-vconsole-setup.service @7.315s +599ms
      └─systemd-journald.socket
        └─system.slice
          └─-.slice
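Since the logs here keep showing card1 in use while card0 is missing, it may help to record which kernel driver is bound to each card node at the moment X fails. The sketch below is only an illustration: the function name is made up, and it assumes the usual sysfs layout where /sys/class/drm/cardN/device/driver is a symlink to the bound driver.

```shell
#!/bin/sh
# Sketch only: print each DRM card node together with its kernel driver.
# Usage: list_drm_cards [sysfs_drm_dir]   (default: /sys/class/drm)
list_drm_cards() {
    sys="${1:-/sys/class/drm}"
    for card in "$sys"/card[0-9]*; do
        [ -e "$card" ] || continue
        name=${card##*/}
        # Skip connector entries such as card1-HDMI-A-1; keep only cardN.
        case "$name" in
            *-*) continue ;;
        esac
        if [ -e "$card/device/driver" ]; then
            # The driver entry is a symlink; its target's last component
            # is the driver name (for example amdgpu or i915).
            driver=$(basename "$(readlink -f "$card/device/driver")")
        else
            driver=unknown
        fi
        printf '%s %s\n' "$name" "$driver"
    done
}

# Example, run on a tty right after the failed X start:
#   list_drm_cards
```

Comparing this output between a failed first start and a working restart would show whether card0 was ever claimed by another driver (for instance the simple-framebuffer device seen in the comment with attachment 863603) before the real driver registered as card1.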