Bug 1213579 - kded5 or kwin_wayland_wrapper crash during start
Summary: kded5 or kwin_wayland_wrapper crash during start
Status: NEW
Alias: None
Product: openSUSE Tumbleweed
Classification: openSUSE
Component: KDE Workspace (Plasma)
Version: Current
Hardware: Other Other
Priority: P5 - None, Severity: Normal
Target Milestone: ---
Assignee: E-Mail List
QA Contact: E-mail List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-07-24 07:32 UTC by Jiri Slaby
Modified: 2023-08-02 06:56 UTC
CC List: 3 users

See Also:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Description Jiri Slaby 2023-07-24 07:32:17 UTC
While a "Plasma on Wayland" session is being started from sddm 0.20 (itself configured for Wayland too), either kded5 or kwin_wayland_wrapper crashes.

kded5 crash:
> (gdb) thread 7
> (gdb) where
> #0  0x00007f63db0df957 in __GI___wait4 (pid=8595, stat_loc=0x0, options=1, usage=0x0) at ../sysdeps/unix/sysv/linux/wait4.c:30
> #1  0x00007f63dcdc1080 in KCrash::startProcess(int, char const**, bool) (argv=argv@entry=0x7fff0420c568, waitAndExit=waitAndExit@entry=true, argc=<optimized out>)
>     at /usr/src/debug/kcrash-5.108.0/src/kcrash.cpp:720
> #2  0x00007f63dcdc1cdd in KCrash::defaultCrashHandler(int) (sig=11) at /usr/src/debug/kcrash-5.108.0/src/kcrash.cpp:616
> #3  0x00007f63db041330 in <signal handler called> () at /lib64/libc.so.6
> #4  wl_list_remove (elm=elm@entry=0x5612e573a110) at ../src/wayland-util.c:57

(gdb) l
52
53      WL_EXPORT void
54      wl_list_remove(struct wl_list *elm)
55      {
56              elm->prev->next = elm->next;
57              elm->next->prev = elm->prev;
58              elm->next = NULL;
59              elm->prev = NULL;
60      }
61
(gdb) disass
Dump of assembler code for function wl_list_remove:
   0x00007f63d9dd9ef0 <+0>:     mov    (%rdi),%rdx
   0x00007f63d9dd9ef3 <+3>:     mov    0x8(%rdi),%rax
   0x00007f63d9dd9ef7 <+7>:     pxor   %xmm0,%xmm0
   0x00007f63d9dd9efb <+11>:    mov    %rax,0x8(%rdx)
=> 0x00007f63d9dd9eff <+15>:    mov    %rdx,(%rax)
   0x00007f63d9dd9f02 <+18>:    movups %xmm0,(%rdi)
   0x00007f63d9dd9f05 <+21>:    ret
End of assembler dump.
(gdb) p/x $rax
$2 = 0x73746e6f662f6572
(gdb) x/x $rax
0x73746e6f662f6572:     Cannot access memory at address 0x73746e6f662f6572
(gdb) p elm
$3 = (struct wl_list *) 0x5612e573a110
(gdb) p elm->next
$4 = (struct wl_list *) 0x73746e6f662f6572

So elm->next is apparently corrupted: it points to unmapped memory, and the store at <+15> (elm->next->prev = elm->prev, line 57) faults.
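
The bad pointer value itself may carry a hint. x86-64 is little-endian, so reading the eight bytes of 0x73746e6f662f6572 from lowest to highest gives their order in memory, and all eight happen to be printable ASCII. A minimal sketch of that decoding follows; decode_le_pointer is a hypothetical helper name, not part of gdb or wayland:

```shell
# Hypothetical helper: print the bytes of a hex value in little-endian
# (in-memory) order, so that an ASCII string hiding in a pointer shows up.
decode_le_pointer() {
    hex=${1#0x}
    # Pad to an even number of hex digits so the value splits into whole bytes.
    if [ $(( ${#hex} % 2 )) -eq 1 ]; then
        hex=0$hex
    fi
    out=
    while [ -n "$hex" ]; do
        byte=${hex#"${hex%??}"}   # last two hex digits = lowest remaining byte
        hex=${hex%??}
        # Octal escapes are portable in printf formats, so convert hex to octal.
        out="$out\\$(printf '%03o' "0x$byte")"
    done
    printf "$out\n"
}

decode_le_pointer 0x73746e6f662f6572
```

For the value above this prints re/fonts, which looks like a fragment of a path string. That would fit the list node's memory having been reused for unrelated string data, though this is only a guess from a single core.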

> #5  0x00007f63d9dda351 in wl_event_queue_release (queue=queue@entry=0x5612e5306260) at ../src/wayland-client.c:320
> #6  0x00007f63d9dda5d3 in wl_display_disconnect (display=0x5612e5306170) at ../src/wayland-client.c:1323
> #7  0x00007f63d7d2a5ac in QtWaylandClient::QWaylandDisplay::~QWaylandDisplay() (this=0x5612e5305fb0, __in_chrg=<optimized out>) at qwaylanddisplay.cpp:383
> #8  0x00007f63d7d2abb9 in QtWaylandClient::QWaylandDisplay::~QWaylandDisplay() (this=0x5612e5305fb0, __in_chrg=<optimized out>) at qwaylanddisplay.cpp:387
> #9  0x00007f63d7d18f89 in QtWaylandClient::QWaylandIntegration::~QWaylandIntegration() (this=0x5612e5306a90, __in_chrg=<optimized out>)
>     at qwaylandintegration.cpp:135
> #10 0x00007f63dbd6f6f7 in QGuiApplicationPrivate::~QGuiApplicationPrivate() (this=0x5612e52f7000, __in_chrg=<optimized out>) at kernel/qguiapplication.cpp:1731
> #11 0x00007f63dc7a3cb9 in QApplicationPrivate::~QApplicationPrivate() () at /lib64/libQt5Widgets.so.5
> #12 0x00005612e3e6357a in main(int, char**) (argc=<optimized out>, argv=<optimized out>) at /usr/src/debug/kded-5.108.0/src/kded.cpp:786

Other threads:
>   Id   Target Id                        Frame
> * 1    Thread 0x7f638affd6c0 (LWP 7016) __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=11, no_tid=no_tid@entry=0) at pthread_kill.c:44
>   2    Thread 0x7f63d7ca46c0 (LWP 6977) warning: Section `.reg-xstate/6977' in core file too small.
> tcache_get (tc_idx=<optimized out>) at malloc.c:3174
>   3    Thread 0x7f63897fa6c0 (LWP 7019) warning: Section `.reg-xstate/7019' in core file too small.
> _g_locale_get_charset_aliases () at ../glib/libcharset/localcharset.c:112
>   4    Thread 0x7f638a7fc6c0 (LWP 7017) warning: Section `.reg-xstate/7017' in core file too small.
> 0x00007f63db10610c in __GI___libc_read (nbytes=8, buf=0x7f638a7fbc80, fd=36) at ../sysdeps/unix/sysv/linux/read.c:26
>   5    Thread 0x7f6389ffb6c0 (LWP 7018) warning: Section `.reg-xstate/7018' in core file too small.
> 0x00007f63db10a48f in __GI___poll (fds=0x7f6358000b90, nfds=1, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:29
>   6    Thread 0x7f6349ffd6c0 (LWP 7487) warning: Section `.reg-xstate/7487' in core file too small.
> syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
>   7    Thread 0x7f63db44e240 (LWP 6975) warning: Section `.reg-xstate/6975' in core file too small.
> 0x00007f63db0df957 in __GI___wait4 (pid=8595, stat_loc=0x0, options=1, usage=0x0) at ../sysdeps/unix/sysv/linux/wait4.c:30

It's exactly the same for two other crashed kded5 cores:
> Fri 2023-07-21 11:40:01 CEST  2937 500 500 SIGSEGV present  /usr/bin/kded5                  3.9M
> Fri 2023-07-21 19:24:32 CEST  2403 500 500 SIGSEGV present  /usr/bin/kded5                  3.9M

Except the elm->next values are different:
(gdb) p elm->next
$1 = (struct wl_list *) 0x20
(gdb) p elm->next
$2 = (struct wl_list *) 0xa1

With kwin_wayland_wrapper, the backtrace is different (so maybe a different bug?):
> (gdb) where
> #0  __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
> #1  0x00007f99a6c92b43 in __pthread_kill_internal (signo=6, threadid=<optimized out>) at pthread_kill.c:78
> #2  0x00007f99a6c41266 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
> #3  0x00007f99a6c29897 in __GI_abort () at abort.c:79
> #4  0x00007f99a72bb4f9 in qt_message_fatal (message=<synthetic pointer>..., context=...) at global/qlogging.cpp:1914
> #5  QMessageLogger::fatal (this=this@entry=0x7ffe94ef42a0, msg=msg@entry=0x55bd7d178208 "Could not create wayland socket") at global/qlogging.cpp:893

So: "Could not create wayland socket"...

> #6  0x000055bd7d175495 in KWinWrapper::KWinWrapper (parent=0x7ffe94ef4280, this=0x7ffe94ef42c0)

From:
58      KWinWrapper::KWinWrapper(QObject *parent)
59          : QObject(parent)
60          , m_kwinProcess(new QProcess(this))
61      {
62          m_socket = wl_socket_create();
63          if (!m_socket) {
64              qFatal("Could not create wayland socket");
65          }

>     at /usr/src/debug/kwin-5.27.6/src/helpers/wayland_wrapper/kwin_wrapper.cpp:64
> #7  main (argc=<optimized out>, argv=<optimized out>) at /usr/src/debug/kwin-5.27.6/src/helpers/wayland_wrapper/kwin_wrapper.cpp:174
Comment 1 Jiri Slaby 2023-07-24 07:35:53 UTC
(In reply to Jiri Slaby from comment #0)
> started from sddm 0.20 (wayland-configured too)

$ cat /etc/sddm.conf.d/wayland 
[General]
DisplayServer=wayland

[Users]
MinimumUid=500

[Autologin]
Session=plasmawayland
User=xslaby
Relogin=true

> 54      wl_list_remove(struct wl_list *elm)
> 55      {
> 56              elm->prev->next = elm->next;
> 57              elm->next->prev = elm->prev;
...
> So elm->next is corrupted, apparently.

Cc'ing Jan as the wayland maintainer.
Comment 2 Fabian Vogt 2023-07-24 14:41:32 UTC
It looks like the crash is during shutdown, but it should not shut down at all.

"Could not create wayland socket" is probably the cause, but that's a weird issue: there's not much that could go wrong there. Possibly XDG_RUNTIME_DIR isn't set, or the directory wasn't created (properly), but pam_systemd should take care of that.
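
The XDG_RUNTIME_DIR theory can be checked from a shell running as the affected user. A rough sketch, assuming GNU stat and using a hypothetical helper name check_runtime_dir; the ownership and 0700 mode checks follow the XDG Base Directory specification and are not taken from the kwin source:

```shell
# Hypothetical helper: report whether a runtime directory looks usable.
# Checks $1 if given, otherwise $XDG_RUNTIME_DIR.
check_runtime_dir() {
    rt_dir=${1-${XDG_RUNTIME_DIR-}}
    if [ -z "$rt_dir" ]; then
        echo "XDG_RUNTIME_DIR is not set"
        return 1
    fi
    if [ ! -d "$rt_dir" ]; then
        echo "$rt_dir does not exist or is not a directory"
        return 1
    fi
    # stat -c is GNU coreutils syntax (assumption: a Linux system).
    if [ "$(stat -c %u "$rt_dir")" != "$(id -u)" ]; then
        echo "$rt_dir is not owned by the current user"
        return 1
    fi
    if [ "$(stat -c %a "$rt_dir")" != 700 ]; then
        echo "$rt_dir has mode $(stat -c %a "$rt_dir"), expected 700"
        return 1
    fi
    echo "$rt_dir looks usable"
}
```

Called without arguments, it inspects the XDG_RUNTIME_DIR of the current environment. A passing check does not rule out a transient problem at login time, since the directory may only have been missing while the session was starting.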
Comment 3 Jan Engelhardt 2023-07-24 20:21:01 UTC
>#3 <signal handler called>
>#4 wl_list_remove
>#5 wl_event_queue_release
>#6 wl_display_disconnect
>#7/8 QtWaylandClient::QWaylandDisplay::~QWaylandDisplay()
>#9 QtWaylandClient::QWaylandIntegration::~QWaylandIntegration() 
>#10 QGuiApplicationPrivate::~QGuiApplicationPrivate()
>#11 QApplicationPrivate::~QApplicationPrivate()
>#12 main(int, char**)

mh, initial hypothesis goes for a double-free/use-after-free.
Comment 4 Jiri Slaby 2023-07-25 05:42:21 UTC
(In reply to Jan Engelhardt from comment #3)
> mh, initial hypothesis goes for a double-free/use-after-free.

How can I run kded5 under valgrind?

Will something like this:
cat > /etc/sddm.conf.d/valgrind <<EOF
[Wayland]
CompositorCommand=valgrind kwin_wayland --no-global-shortcuts --no-lockscreen --locale1
EOF

work?
Comment 5 Fabian Vogt 2023-07-25 06:18:03 UTC
(In reply to Jiri Slaby from comment #4)
> (In reply to Jan Engelhardt from comment #3)
> > mh, initial hypothesis goes for a double-free/use-after-free.
> 
> How can I run kded5 under valgrind?
> 
> Will something like this:
> cat > /etc/sddm.conf.d/valgrind <<EOF
> [Wayland]
> > CompositorCommand=valgrind kwin_wayland --no-global-shortcuts --no-lockscreen --locale1
> EOF
> 
> work?

kded5 is probably started through dbus activation, so you'd have to edit either the dbus service file or the referenced systemd user service.
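
For the systemd route, a drop-in for the user unit can wrap the command in valgrind. The sketch below makes several assumptions: that the unit is called plasma-kded.service, that valgrind and kded5 live in /usr/bin, and that a fixed log path in /tmp is acceptable. The helper name write_valgrind_dropin is hypothetical as well. Check the real unit name with "systemctl --user list-units" before using it.

```shell
# Hypothetical helper: write a drop-in that runs kded5 under valgrind.
# $1: unit name (assumed default: plasma-kded.service)
# $2: systemd user unit directory (default: ~/.config/systemd/user)
write_valgrind_dropin() {
    unit=${1:-plasma-kded.service}
    base=${2:-$HOME/.config/systemd/user}
    dropin_dir="$base/$unit.d"
    mkdir -p "$dropin_dir"
    cat > "$dropin_dir/valgrind.conf" <<'EOF'
[Service]
# An empty ExecStart= clears the command inherited from the original unit.
ExecStart=
ExecStart=/usr/bin/valgrind --log-file=/tmp/kded5-valgrind.log /usr/bin/kded5
# valgrind slows startup considerably; allow more time before systemd gives up.
TimeoutStartSec=300
EOF
    echo "$dropin_dir/valgrind.conf"
}
```

After calling write_valgrind_dropin, run "systemctl --user daemon-reload" and log in again (or restart the unit) so the override takes effect. Remove the drop-in afterwards to return to the normal startup.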
Comment 6 Jiri Slaby 2023-08-02 06:56:20 UTC
(In reply to Jiri Slaby from comment #4)
> (In reply to Jan Engelhardt from comment #3)
> > mh, initial hypothesis goes for a double-free/use-after-free.
> 
> How can I run kded5 under valgrind?

No valgrind reports, but the kded5 crashes are gone too. Maybe I just cannot reproduce them at the moment, or perhaps this was fixed.

(In reply to Fabian Vogt from comment #2)
> It looks like the crash is during shutdown, but it should not shut down at
> all.
> 
> "Could not create wayland socket" is probably the cause,

This one remains and has happened twice now:
Wed 2023-08-02 08:20:42 CEST  6716 500 500 SIGABRT present  /usr/bin/kwin_wayland_wrapper 386.9K
Wed 2023-08-02 08:22:35 CEST  8258 500 500 SIGABRT present  /usr/bin/kwin_wayland_wrapper 386.4K