Bug 1217202

Summary: systemd-journald uses all available memory and crashes, leaving the UI unusable
Product: [openSUSE] openSUSE Tumbleweed Reporter: Ilgaz Öcal <ilgaz>
Component: Basesystem Assignee: systemd maintainers <systemd-maintainers>
Status: RESOLVED UPSTREAM QA Contact: E-mail List <qa-bugs>
Severity: Normal    
Priority: P5 - None CC: fbui
Version: Current   
Target Milestone: ---   
Hardware: x86-64   
OS: openSUSE Tumbleweed   
Whiteboard:
Found By: --- Services Priority:
Business Priority: Blocker: ---
Marketing QA Status: --- IT Deployment: ---

Description Ilgaz Öcal 2023-11-15 17:54:27 UTC
I am having problems with i950, so it took some time to figure this out.

Under normal load in a KDE Plasma session, the system became unresponsive to user input, with constant disk reads/writes caused by the coredump process.

With debug symbols installed, the gdb backtrace ("bt") of the crash is:

(gdb) bt
#0  client_context_read_log_level_max (c=c@entry=0x55dd3b36ac10, s=0x7ffcdca25c50)
    at ../src/journal/journald-context.c:373
#1  0x000055dd39a58bea in client_context_really_refresh (s=0x7ffcdca25c50, c=0x55dd3b36ac10, 
    ucred=<optimized out>, label=<optimized out>, label_size=<optimized out>, unit_id=<optimized out>, 
    timestamp=<optimized out>) at ../src/journal/journald-context.c:534
#2  0x000055dd39a59354 in client_context_maybe_refresh (s=s@entry=0x7ffcdca25c50, c=c@entry=0x55dd3b36ac10, 
    ucred=ucred@entry=0x7ffcdca259e0, label=label@entry=0x0, label_size=label_size@entry=0, 
    unit_id=unit_id@entry=0x0, timestamp=<optimized out>) at ../src/journal/journald-context.c:589
#3  0x000055dd39a5be24 in client_context_get_internal (s=s@entry=0x7ffcdca25c50, pid=2694, 
    ucred=ucred@entry=0x7ffcdca259e0, label=0x0, label_len=label_len@entry=0, unit_id=unit_id@entry=0x0, 
    add_ref=false, ret=0x7ffcdca25680) at ../src/journal/journald-context.c:692
#4  0x000055dd39a5dc87 in client_context_get (unit_id=0x0, ret=0x7ffcdca25680, label_len=0, 
    label=<optimized out>, ucred=0x7ffcdca259e0, pid=<optimized out>, s=0x7ffcdca25c50)
    at ../src/journal/journald-context.c:730
#5  server_process_syslog_message (s=0x7ffcdca25c50, 
    buf=0x55dd3b32b560 "<28>Nov 15 08:11:46 rtkit-daemon[2694]: The canary thread is apparently starving. Taking action.\n", raw_len=97, ucred=0x7ffcdca259e0, tv=0x7ffcdca257e0, label=<optimized out>, label_len=0)
    at ../src/journal/journald-syslog.c:337
#6  0x000055dd39a4c2ed in server_process_datagram (es=<optimized out>, fd=3, revents=<optimized out>, 
    userdata=0x7ffcdca25c50) at ../src/journal/journald-server.c:1504
#7  0x00007f440907af91 in source_dispatch (s=0x55dd3b3234b0) at ../src/libsystemd/sd-event/sd-event.c:4187
#8  0x00007f440907ef6d in sd_event_dispatch (e=<optimized out>, e@entry=0x55dd3b323130)
    at ../src/libsystemd/sd-event/sd-event.c:4808
#9  0x00007f440907f878 in sd_event_run (e=<optimized out>, timeout=timeout@entry=18446744073709551615)
    at ../src/libsystemd/sd-event/sd-event.c:4869
#10 0x000055dd39a4a5d5 in main (argc=<optimized out>, argv=<optimized out>) at ../src/journal/journald.c:114

Additionally, this is only one of several crashes, which is why I am reporting it. I can provide whatever details are required. I don't know whether it is related, but "journalctl --verify" detected a corrupt file:

44ef68: Invalid entry item (22/29) offset: 000000                                       
44ef68: Invalid object contents: Bad message                                            
File corruption detected at /var/log/journal/55f063104b824b1f8708a4a4ad42d840/user-1000@00060a1679f2ed4a-a328cffc5560b5c8.journal~:4517736 (of 8388608 bytes, 53%).
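
For anyone triaging similar output, here is a minimal Python sketch that pulls the affected file, byte offset, and file size out of the summary line. The regular expression is an assumption derived only from the single line quoted above; other systemd versions may word the message differently. The helper name parse_verify_output is hypothetical.

```python
import re

# Assumed format, based solely on the one sample line quoted in this report.
CORRUPTION_RE = re.compile(
    r"^File corruption detected at (?P<path>.+):(?P<offset>\d+) "
    r"\(of (?P<size>\d+) bytes, (?P<percent>\d+)%\)\.$"
)


def parse_verify_output(text):
    """Return one dict per 'File corruption detected' line found in text."""
    results = []
    for line in text.splitlines():
        match = CORRUPTION_RE.match(line.strip())
        if match:
            results.append({
                "path": match.group("path"),
                "offset": int(match.group("offset")),
                "size": int(match.group("size")),
                "percent": int(match.group("percent")),
            })
    return results


# The three lines quoted in the description, used as sample input.
sample = (
    "44ef68: Invalid entry item (22/29) offset: 000000\n"
    "44ef68: Invalid object contents: Bad message\n"
    "File corruption detected at /var/log/journal/"
    "55f063104b824b1f8708a4a4ad42d840/"
    "user-1000@00060a1679f2ed4a-a328cffc5560b5c8.journal~:4517736 "
    "(of 8388608 bytes, 53%).\n"
)

for hit in parse_verify_output(sample):
    # The decimal offset in the summary line is the same position as the
    # hexadecimal prefix on the two preceding lines.
    print(hit["path"], hex(hit["offset"]))
```

The decimal offset 4517736 reported in the summary line equals 0x44ef68, so all three lines refer to the same object in the journal file.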

Comment 1 Franck Bui 2023-11-30 16:17:17 UTC
Please report this problem to systemd upstream [1] instead. The problem you're facing isn't specific to Tumbleweed.

Thanks.

[1] https://github.com/systemd/systemd/issues