Bugzilla – Full Text Bug Listing

| Summary: | [Build 49.1] Coredump after installation/migration of SLES 15 SP6 | ||
|---|---|---|---|
| Product: | [openSUSE] PUBLIC SUSE Linux Enterprise Server 15 SP6 | Reporter: | Chenzi Cao <chcao> |
| Component: | GNOME | Assignee: | E-mail List <gnome-bugs> |
| Status: | VERIFIED FIXED | QA Contact: | |
| Severity: | Major | ||
| Priority: | P3 - Medium | CC: | chcao, ihno, jan.stehlik, jeriveramoya, joan.torres, llzhao, marcela.maslanova, richard.fan, santiago.zarate, sndirsch, xiaoguang.wang, yfjiang, zcjia |
| Version: | unspecified | ||
| Target Milestone: | --- | ||
| Hardware: | S/390-64 | ||
| OS: | Other | ||
| URL: | https://openqa.suse.de/tests/13351074/modules/first_boot/steps/4 | ||
| Whiteboard: | |||
| Found By: | openQA | Services Priority: | |
| Business Priority: | Blocker: | Yes | |
| Marketing QA Status: | --- | IT Deployment: | --- |
| Attachments: | y2log, first_boot-journal.log | ||

Description
Chenzi Cao
2024-01-26 04:45:43 UTC
Created attachment 872212: first_boot-journal.log

These are logs from the fresh installation case.

Attached is the first_boot-journal.log from the openQA case:
https://openqa.suse.de/tests/13348800#step/first_boot/1

And the serial0 log of the installation case:

[FAILED] Failed to listen on Xvnc Server.
2024-01-25T22:29:08.525369-05:00 localhost (sd-listen)[6990]: xvnc.socket: Failed to create listening socket ([::]:5901): Address already in use
2024-01-25T22:29:08.528073-05:00 localhost systemd[1]: xvnc.socket: Failed to receive listening socket ([::]:5901): Input/output error
2024-01-25T22:29:08.528270-05:00 localhost systemd[1]: Failed to listen on Xvnc Server.
[ 121.427871][ T7063] device-mapper: uevent: version 1.0.3
[ 121.427946][ T7063] device-mapper: ioctl: 4.48.0-ioctl (2023-03-01) initialised: dm-devel@redhat.com
2024-01-25T22:29:22.855160-05:00 localhost systemd-coredump[7431]: Process 3256 (Xvnc) of user 0 dumped core.#012#012Stack trace of thread 3256:#012#0 0x000003ff8969fad2 __pthread_kill_implementation (libc.so.6 + 0x9fad2)#012#1 0x000003ff89651200 raise (libc.so.6 + 0x51200)#012#2 0x000003ff896339fc abort (libc.so.6 + 0x339fc)#012#3 0x000002aa2354dcd6 OsAbort (Xvnc + 0x1cdcd6)#012#4 0x000002aa235530a4 AbortServer (Xvnc + 0x1d30a4)#012#5 0x000002aa23553fc2 FatalError (Xvnc + 0x1d3fc2)#012#6 0x000002aa2354ac0c n/a (Xvnc + 0x1cac0c)#012#7 0x000003ffe7cfe490 n/a (linux-vdso64.so.1 + 0x490)#012ELF object binary architecture: IBM S/390
/usr/lib/YaST2/startup/YaST2.call: line 369: 3256 Aborted (core dumped) /usr/bin/Xvnc :0 -noreset -rfbauth /root/.vnc/passwd.yast -desktop "Installation" -geometry "$VNCSize" -dpi 96 -rfbport 5901 -fp /usr/share/fonts/misc/,/usr/share/fonts/uni/,/usr/share/fonts/truetype/ > /var/log/YaST2/vncserver.log 2>&1
removed '/root/.vnc/passwd.yast'

The whole serial0 log:
https://openqa.suse.de/tests/13348800/logfile?filename=serial0.txt

In the journal log, there were these messages:

> Jan 25 22:19:30.865503 susetest /usr/lib/gdm/gdm-x-session[5099]: (II) LoadModule: "fbdev"
> Jan 25 22:19:30.865503 susetest /usr/lib/gdm/gdm-x-session[5099]: (WW) Warning, couldn't open module fbdev
> Jan 25 22:19:30.865503 susetest /usr/lib/gdm/gdm-x-session[5099]: (EE) Failed to load module "fbdev" (module does not exist, 0)
> Jan 25 22:19:30.865503 susetest /usr/lib/gdm/gdm-x-session[5099]: (II) modesetting: Driver for Modesetting Kernel Drivers: kms
> Jan 25 22:19:30.865503 susetest /usr/lib/gdm/gdm-x-session[5099]: (EE)
> Jan 25 22:19:30.865503 susetest /usr/lib/gdm/gdm-x-session[5099]: Fatal server error:
> Jan 25 22:19:30.865503 susetest /usr/lib/gdm/gdm-x-session[5099]: (EE) parse_vt_settings: Cannot open /dev/tty0 (Permission denied)

And there is an X coredump:

> Jan 25 22:19:30.985346 susetest systemd-coredump[5101]: Process 5099 (X) of user 462 dumped core.
>
> Stack trace of thread 5099:
> #0 0x000003ff8359fad2 __pthread_kill_implementation (libc.so.6 + 0x9fad2)
> #1 0x000003ff83551200 raise (libc.so.6 + 0x51200)
> #2 0x000003ff835339fc abort (libc.so.6 + 0x339fc)
> #3 0x000002aa30ffd97e OsAbort (Xorg + 0x1fd97e)
> #4 0x000002aa31003c3c n/a (Xorg + 0x203c3c)
> #5 0x000002aa31004b5a FatalError (Xorg + 0x204b5a)
> #6 0x000002aa30edcd0e n/a (Xorg + 0xdcd0e)
> #7 0x000002aa30edd100 xf86OpenConsole (Xorg + 0xdd100)
> #8 0x000002aa30ebc1a2 InitOutput (Xorg + 0xbc1a2)
> #9 0x000002aa30e7936c n/a (Xorg + 0x7936c)
> #10 0x000003ff83533fba __libc_start_call_main (libc.so.6 + 0x33fba)
> #11 0x000003ff835340a0 __libc_start_main@@GLIBC_2.34 (libc.so.6 + 0x340a0)
> #12 0x000002aa30e601bc _start (Xorg + 0x601bc)
> ELF object binary architecture: IBM S/390

CC X maintainer.

Xserver support for s390x has been enabled since sle15-sp4 due to a feature request by IBM.
https://jira.suse.com/browse/SLE-18632
IBM confirmed it working. I think I never tested it myself. Not sure why it ever worked. I thought there is no /dev/tty0 (virtual terminal/Linux console) on s390x. Or is there one, and X, now started as a regular user, cannot access it?

Remark to myself.

Maybe it's related to Xorg wrapper and gfx emulation. bsc#1175867

(In reply to Stefan Dirsch from comment #4)
> Remark to myself.
>
> Maybe it's related to Xorg wrapper and gfx emulation. bsc#1175867

The virtio-gpu KMS driver needs to be used. I guess this means running qemu with

-vga virtio

But maybe this has never been confirmed to work. It could be that IBM tested this on real hardware and not on qemu.

Dear developers, FYI: this issue blocks all s390x migration and installation cases that install the GNOME desktop. If any more info is needed here, please feel free to contact me, thanks.

I would have hoped you would comment on what I wrote in comment #5, i.e. has this ever been tested in openQA/qemu before? And with which qemu -vga option?

Here's my observation:

First, this problem is not unique to migration; a freshly installed system has the same problem, so this is blocking all functional tests on s390x that require a GUI:
https://openqa.suse.de/tests/13393418#step/first_boot/2

The current setting works for xdm, however:
https://openqa.suse.de/tests/13393483#step/first_boot/1

GNOME works fine on SLE15SP5 GMC on s390x:
https://openqa.suse.de/tests/11181630#step/first_boot/1

I'm not sure whether the s390x openQA worker settings have changed or not.

(In reply to Jia Zhaocong from comment #8)
> Here's my observation:
>
> First, this problem is not unique to migration; a freshly installed system
> has the same problem, so this is blocking all functional tests on s390x that
> require a GUI: https://openqa.suse.de/tests/13393418#step/first_boot/2
>
> The current setting works for xdm, however:
> https://openqa.suse.de/tests/13393483#step/first_boot/1
>
> GNOME works fine on SLE15SP5 GMC on s390x:
> https://openqa.suse.de/tests/11181630#step/first_boot/1
>
> I'm not sure whether the s390x openQA worker settings have changed or not.

Hi Zhaocong, thanks for the comment. Could you or Chenzi answer the questions in comment #5 and comment #7: what was the exact graphical hardware environment in which the openQA test ran?

The hardware environment didn't change recently and it worked fine when testing SLES15SP6 build45.1.
Here is the hardware environment of s390x worker:
<?xml version="1.0"?>
<domain type="kvm">
  <name>openQA-SUT-8</name>
  <memory unit="MiB">1024</memory>
  <vcpu>1</vcpu>
  <os>
    <type>hvm</type>
    <initrd>/var/lib/libvirt/images/openQA-SUT-8.initrd</initrd>
    <kernel>/var/lib/libvirt/images/openQA-SUT-8.kernel</kernel>
  </os>
  <features>
    <acpi/>
    <apic/>
    <pae/>
  </features>
  <on_reboot>destroy</on_reboot>
  <devices>
    <disk type="file" device="disk">
      <driver name="qemu" type="qcow2" cache="unsafe"/>
      <target dev="vda" bus="virtio"/>
      <source file="/var/lib/libvirt/images/openQA-SUT-8a.img"/>
    </disk>
    <console type="pty">
      <target type="sclp" port="0"/>
    </console>
    <console type="pty">
      <target type="virtio" port="1"/>
    </console>
    <console type="pty">
      <target type="virtio" port="2"/>
    </console>
    <interface type="direct">
      <target dev="macvtap12"/>
      <source mode="bridge" dev="vlan2114"/>
      <mac address="52:54:00:82:27:46"/>
    </interface>
  </devices>
</domain>
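
For readers who want to check a worker definition like the one above mechanically, here is a minimal Python sketch. It assumes the XML is available as a string (the sample is a trimmed copy of the quoted domain), and the helper name `summarize_domain` is purely illustrative, not part of libvirt or openQA. It reports the configured memory and whether any `<video>` or `<graphics>` device is defined, which is the information the questions about the graphical environment were after.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the domain XML quoted in this report (child details omitted).
DOMAIN_XML = """<domain type="kvm">
  <name>openQA-SUT-8</name>
  <memory unit="MiB">1024</memory>
  <devices>
    <disk type="file" device="disk"/>
    <console type="pty"><target type="sclp" port="0"/></console>
    <console type="pty"><target type="virtio" port="1"/></console>
    <console type="pty"><target type="virtio" port="2"/></console>
    <interface type="direct"/>
  </devices>
</domain>"""

# Only the units needed for this sketch; libvirt treats a missing unit as KiB.
_UNIT_TO_MIB = {"KiB": 1 / 1024, "MiB": 1, "GiB": 1024}


def summarize_domain(xml_text):
    """Return name, memory in MiB, device element names, and a display flag."""
    root = ET.fromstring(xml_text)
    mem = root.find("memory")
    unit = mem.get("unit", "KiB")
    memory_mib = int(int(mem.text) * _UNIT_TO_MIB[unit])
    device_tags = sorted({child.tag for child in root.find("devices")})
    has_display = any(tag in ("video", "graphics") for tag in device_tags)
    return {
        "name": root.findtext("name"),
        "memory_mib": memory_mib,
        "device_tags": device_tags,
        "has_display_device": has_display,
    }


summary = summarize_domain(DOMAIN_XML)
print(summary)
```

For the domain quoted here, the summary lists only console, disk, and interface devices, so the guest has no emulated graphics adapter and any graphical session has to come through VNC inside the guest.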

(In reply to Chenzi Cao from comment #10)
> The hardware environment didn't change recently and it worked fine when
> testing SLES15SP6 build45.1.

Hmm. So what has changed between 45.1 and 49.1? When were 45.1 and 49.1 built?

(In reply to Stefan Dirsch from comment #11)
> (In reply to Chenzi Cao from comment #10)
> > The hardware environment didn't change recently and it worked fine when
> > testing SLES15SP6 build45.1.
>
> Hmm. So what has changed between 45.1 and 49.1? When were 45.1 and 49.1
> built?

As far as I know, GNOME had a big version update between 45.1 and 49.1. 45.1 was built two months ago, and 49.1 was built 23 days ago.

Here VNC is used to connect to the s390x server; the Xvnc server is running on the s390x server.

I tested it on TW x86: I enabled the VNC server via YaST (Allow Remote Administration With Session Management), connected through vncviewer, and got the same error:
> (EE) parse_vt_settings: Cannot open /dev/tty0 (Permission denied)
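
When comparing several openQA runs, it can help to pull the X server `(EE)` lines out of a journal log automatically. The following Python sketch does that for the lines quoted earlier in this report. The regular expression and the helper names `find_x_errors` and `is_vt_permission_error` are assumptions made for illustration only; they are not part of any openQA or systemd tooling.

```python
import re

# Sample lines copied from the journal excerpt in this report.
JOURNAL_LINES = [
    'Jan 25 22:19:30.865503 susetest /usr/lib/gdm/gdm-x-session[5099]: (II) LoadModule: "fbdev"',
    'Jan 25 22:19:30.865503 susetest /usr/lib/gdm/gdm-x-session[5099]: (EE) Failed to load module "fbdev" (module does not exist, 0)',
    'Jan 25 22:19:30.865503 susetest /usr/lib/gdm/gdm-x-session[5099]: (EE)',
    'Jan 25 22:19:30.865503 susetest /usr/lib/gdm/gdm-x-session[5099]: Fatal server error:',
    'Jan 25 22:19:30.865503 susetest /usr/lib/gdm/gdm-x-session[5099]: (EE) parse_vt_settings: Cannot open /dev/tty0 (Permission denied)',
]

# Assumed line shape: "<process>[<pid>]: (EE) <message>"
EE_PATTERN = re.compile(r"(?P<proc>\S+)\[(?P<pid>\d+)\]:\s+\(EE\)\s*(?P<msg>.*)$")


def find_x_errors(lines):
    """Return (process, pid, message) for every non-empty (EE) line."""
    errors = []
    for line in lines:
        match = EE_PATTERN.search(line)
        if match and match.group("msg"):
            errors.append(
                (match.group("proc"), int(match.group("pid")), match.group("msg"))
            )
    return errors


def is_vt_permission_error(message):
    """True for the /dev/tty0 failure signature seen in this bug."""
    return "parse_vt_settings" in message and "/dev/tty0" in message


errors = find_x_errors(JOURNAL_LINES)
vt_errors = [entry for entry in errors if is_vt_permission_error(entry[2])]

for proc, pid, msg in errors:
    print(f"{proc}[{pid}]: {msg}")
```

The process name in each match shows who started the failing server. Here it is `gdm-x-session`, which fits the later finding that gdm launches Xorg for the VNC session.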

(In reply to xiaoguang wang from comment #13)
> Here VNC is used to connect to the s390x server; the Xvnc server is running
> on the s390x server.
>
> I tested it on TW x86: I enabled the VNC server via YaST (Allow Remote
> Administration With Session Management), connected through vncviewer, and
> got the same error:
>
> > (EE) parse_vt_settings: Cannot open /dev/tty0 (Permission denied)

Ok. So instead of Xvnc now Xorg is being started for remote administration? This sounds totally weird and makes no sense to me.

You can see the updated packages at:
http://xcdchk.suse.de/results/SLE-15-SP6-Full-Test/49.1

Not only the GNOME stack is updated, Xvnc is also updated:
xorg-x11-Xvnc.s390x: 1.12.0-150500.2.6 => 1.13.1-150600.1.2

(In reply to Stefan Dirsch from comment #14)
> (In reply to xiaoguang wang from comment #13)
> > Here VNC is used to connect to the s390x server; the Xvnc server is running
> > on the s390x server.
> >
> > I tested it on TW x86: I enabled the VNC server via YaST (Allow Remote
> > Administration With Session Management), connected through vncviewer, and
> > got the same error:
> >
> > > (EE) parse_vt_settings: Cannot open /dev/tty0 (Permission denied)
>
> Ok. So instead of Xvnc now Xorg is being started for remote administration?
> This sounds totally weird and makes no sense to me.

Ok. I will try to reproduce on x86_64 with SP6 Snapshot 202402-1.

Several issues here.

Issue 1:
xorg-x11-Xvnc of SP6 Snapshot 202402-1 is still not up-to-date. Looks like the xorg-x11-server update to 21.1.11 came too late, let alone tigervnc having been rebuilt against it. I will resubmit tigervnc to make sure this rebuild will happen. Related: bsc#1219311

Issue 2:
For some reason gdm starts Xorg instead of Xvnc for VNC sessions. I have no idea why. GNOME experts need to investigate that. After switching to xdm ("update-alternatives --config default-displaymanager") a VNC xdm login screen gets started when connecting via vncviewer. Xvnc is running!

Issue 3:
GNOME session crashes. This also needs to be investigated by your GNOME experts. I made starting icewm the default by copying ~/.xinitrc-template to ~/.xinitrc and changing it accordingly to start icewm

[...]
#exec $WINDOWMANAGER ${1+"$@"}
exec icewm
[...]

With that I could start an Xsession by connecting via vncviewer.

(In reply to Stefan Dirsch from comment #17)
> Several issues here.
>
> Issue 1:
> xorg-x11-Xvnc of SP6 Snapshot 202402-1 is still not up-to-date. Looks like
> the xorg-x11-server update to 21.1.11 came too late, let alone tigervnc
> having been rebuilt against it. I will resubmit tigervnc to make sure this
> rebuild will happen. Related: bsc#1219311

https://build.suse.de/request/show/322121

@Joan Just for your information!

This is an autogenerated message for OBS integration:
This bug (1219205) was mentioned in
https://build.opensuse.org/request/show/1147574 Factory / tigervnc

(In reply to Stefan Dirsch from comment #17)
> Several issues here.
>
> Issue 3:
> GNOME session crashes. This also needs to be investigated by your GNOME
> experts.

It's this famous

---
Oh no! Something has gone wrong.
A problem has occurred and the system can't recover.
Please log out and try again.
---

message. And then as error log you have:

~> cat ~/.xsession-errors-fec0\:\:e796\:e31\:5ba2\:ede\:1
Environment variable $XAUTHORITY not set, ignoring.

This is an autogenerated message for OBS integration:
This bug (1219205) was mentioned in
https://build.opensuse.org/request/show/1148276 Factory / tigervnc

This is an autogenerated message for OBS integration:
This bug (1219205) was mentioned in
https://build.opensuse.org/request/show/1148477 Factory / tigervnc

When Xvnc started the greeter session, Xorg was started; this is related to commit 6184c8a9 in gdm.
I created an upstream issue:
https://gitlab.gnome.org/GNOME/gdm/-/issues/909

I sent a workaround patch to GNOME:Factory:
https://build.opensuse.org/request/show/1148941

(In reply to Stefan Dirsch from comment #17)
> Several issues here.
> [...]
> Issue 3:
> GNOME session crashes. This also needs to be investigated by your GNOME
> experts. I made starting icewm the default by copying
> ~/.xinitrc-template to ~/.xinitrc and changing it accordingly to start icewm
>
> [...]
> #exec $WINDOWMANAGER ${1+"$@"}
> exec icewm
> [...]
>
> With that I could start an Xsession by connecting via vncviewer.

We should not forget about this issue. I mean, an Xsession crashing immediately after logging in is still an improvement compared to having no login screen, but it's not really perfect yet ...

*** Bug 1219392 has been marked as a duplicate of this bug. ***

> <memory unit="MiB">1024</memory>

Chenzi, can you also try with 2 GB of RAM, no kdump?
https://progress.opensuse.org/issues/153808#note-6

(In reply to Chenzi Cao from comment #10)
> The hardware environment didn't change recently and it worked fine when
> testing SLES15SP6 build45.1.
>
> Here is the hardware environment of s390x worker:
>
> <?xml version="1.0"?>
> <domain type="kvm">
>   <name>openQA-SUT-8</name>
>   <memory unit="MiB">1024</memory>
>   <vcpu>1</vcpu>
>   <os>
>     <type>hvm</type>
>     <initrd>/var/lib/libvirt/images/openQA-SUT-8.initrd</initrd>
>     <kernel>/var/lib/libvirt/images/openQA-SUT-8.kernel</kernel>
>   </os>
>   <features>
>     <acpi/>
>     <apic/>
>     <pae/>
>   </features>
>   <on_reboot>destroy</on_reboot>
>   <devices>
>     <disk type="file" device="disk">
>       <driver name="qemu" type="qcow2" cache="unsafe"/>
>       <target dev="vda" bus="virtio"/>
>       <source file="/var/lib/libvirt/images/openQA-SUT-8a.img"/>
>     </disk>
>     <console type="pty">
>       <target type="sclp" port="0"/>
>     </console>
>     <console type="pty">
>       <target type="virtio" port="1"/>
>     </console>
>     <console type="pty">
>       <target type="virtio" port="2"/>
>     </console>
>     <interface type="direct">
>       <target dev="macvtap12"/>
>       <source mode="bridge" dev="vlan2114"/>
>       <mac address="52:54:00:82:27:46"/>
>     </interface>
>   </devices>
> </domain>

The SR quoting this bug was integrated into Build 59.2; please set it as resolved fixed so QE can verify it if you consider it fixed.

(In reply to Stefan Dirsch from comment #27)
> (In reply to Stefan Dirsch from comment #17)
> > Several issues here.
> > [...]
> > Issue 3:
> > GNOME session crashes. This also needs to be investigated by your GNOME
> > experts. I made starting icewm the default by copying
> > ~/.xinitrc-template to ~/.xinitrc and changing it accordingly to start icewm
> >
> > [...]
> > #exec $WINDOWMANAGER ${1+"$@"}
> > exec icewm
> > [...]
> >
> > With that I could start an Xsession by connecting via vncviewer.
>
> We should not forget about this issue. I mean, an Xsession crashing
> immediately after logging in is still an improvement compared to having no
> login screen, but it's not really perfect yet ...

Looks like this is ok, so let's close this ticket as fixed.

Sorry, I have to reopen it; the issue still exists in the build59.2 results:
https://openqa.suse.de/tests/13636296#step/first_boot/2

I also tried increasing the memory to 2G and 4G (on build59.2); it doesn't resolve the problem:

https://openqa.suse.de/tests/13655158#step/first_boot/1 (2G)
Serial0 log: https://openqa.suse.de/tests/13655158/logfile?filename=serial0.txt

https://openqa.suse.de/tests/13655203#step/first_boot/1 (4G)
Serial0 log: https://openqa.suse.de/tests/13655203/logfile?filename=serial0.txt

(In reply to Chenzi Cao from comment #34)
> Sorry, I have to reopen it; the issue still exists in the build59.2 results:
> Serial0 log:
> https://openqa.suse.de/tests/13655158/logfile?filename=serial0.txt

The serial log shows that the Xorg server is still being started, so the gdm fix isn't included yet. Check for this entry in the gdm package changelog:

Thu Feb 22 01:17:18 UTC 2024 - Xiaoguang Wang <xiaoguang.wang@suse.com>
- Add gdm-xvnc-start-session-failed.patch: None seat0 session runs without running launcher (bsc#1219205 glgo#GNOME/gdm#909).

Not sure where one can find build59.2. Otherwise I could check myself ...

Still fails in build 60.3:
https://openqa.suse.de/tests/13680877#step/first_boot/2

More failed cases on 15-SP6, Build60.3, x86_64:
https://openqa.suse.de/tests/13679655# (x86_64)

(In reply to lili zhao from comment #42)
> More failed cases on 15-SP6, Build60.3, x86_64:
> https://openqa.suse.de/tests/13679655# (x86_64)

This bug is specifically about testing SLE 15 SP6 with native VNC (typically on s390x). For anything else, please open another bug.

(In reply to Yifan Jiang from comment #43)
> (In reply to lili zhao from comment #42)
> > More failed cases on 15-SP6, Build60.3, x86_64:
> > https://openqa.suse.de/tests/13679655# (x86_64)
>
> This bug is specifically about testing SLE 15 SP6 with native VNC (typically
> on s390x). For anything else, please open another bug.

It's an x86 IPMI machine, which also uses VNC.

(In reply to Yifan Jiang from comment #43)
> (In reply to lili zhao from comment #42)
> > More failed cases on 15-SP6, Build60.3, x86_64:
> > https://openqa.suse.de/tests/13679655# (x86_64)
>
> This bug is specifically about testing SLE 15 SP6 with native VNC (typically
> on s390x). For anything else, please open another bug.

The first_boot succeeded on build 62.1, for example:
https://openqa.suse.de/tests/13715358#
Not sure if the fix for this bug also fixed the "STALL" issue on x86_64 ipmi-nvdimm.

The issue is fixed on s390x with build62.1, thanks.
https://openqa.suse.de/tests/13713506#step/first_boot/1

I checked the results on SLES15SP6 build 62.1 and the issue is fixed, so I'm closing this ticket as fixed now, thanks!

And verified here, thank you.