Bug 1220085

Summary: Data races reported by KCSAN on s390x
Product: [openSUSE] openSUSE Tumbleweed
Reporter: Miroslav Franc <miroslav.franc>
Component: Kernel
Assignee: Miroslav Franc <miroslav.franc>
Status: NEW ---
QA Contact: E-mail List <qa-bugs>
Severity: Normal
Priority: P5 - None
CC: ada.lovelace, ihno, marcela.maslanova, ro
Version: Current
Target Milestone: ---
Hardware: S/390-64
OS: Linux
Whiteboard:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---
Attachments:
dmesg log from s390x VM z15 factory
dmesg log from s390x VM z15 factory (valgrind regtest)

Description Miroslav Franc 2024-02-19 16:42:26 UTC
Created attachment 872851
dmesg log from s390x VM z15 factory

A KCSAN-enabled kernel* on s390x reports a large number of data races. See the attached dmesg log for details.


* https://download.suse.de/ibs/home:/mfranc:/branches:/SUSE:/Factory:/Head:/kernel-source/S390/
Comment 1 Miroslav Franc 2024-02-19 16:53:09 UTC
BUG: KCSAN: data-race in arch_vcpu_is_preempted+0x4a/0xe0
BUG: KCSAN: data-race in atime_needs_update / inode_update_timestamps
BUG: KCSAN: data-race in begin_new_exec / wait_consider_task
BUG: KCSAN: data-race in blk_mq_delay_run_hw_queue / blk_mq_delay_run_hw_queue
BUG: KCSAN: data-race in btrfs_block_rsv_release [btrfs] / need_preemptive_reclaim [btrfs]
BUG: KCSAN: data-race in btrfs_record_root_in_trans [btrfs] / record_root_in_trans [btrfs]
BUG: KCSAN: data-race in btrfs_update_delayed_refs_rsv [btrfs] / btrfs_use_block_rsv [btrfs]
BUG: KCSAN: data-race in capable / vtime_account_kernel
BUG: KCSAN: data-race in common_perm_cond / kernfs_refresh_inode
BUG: KCSAN: data-race in copy_from_read_buf / n_tty_receive_buf_common
BUG: KCSAN: data-race in d_alloc_parallel / d_alloc_parallel
BUG: KCSAN: data-race in do_epoll_ctl / ep_poll_callback
BUG: KCSAN: data-race in __flush_work.isra.0 / queue_work_on
BUG: KCSAN: data-race in __fput / kernfs_refresh_inode
BUG: KCSAN: data-race in generic_fillattr / kernfs_refresh_inode
BUG: KCSAN: data-race in __hrtimer_run_queues / hrtimer_active
BUG: KCSAN: data-race in inode_permission / kernfs_refresh_inode
BUG: KCSAN: data-race in kernfs_refresh_inode / kernfs_refresh_inode
BUG: KCSAN: data-race in kernfs_refresh_inode / link_path_walk.part.0.constprop.0
BUG: KCSAN: data-race in ktime_get_real_seconds / timekeeping_advance
BUG: KCSAN: data-race in n_tty_check_unthrottle / n_tty_receive_buf_common
BUG: KCSAN: data-race in n_tty_poll / n_tty_receive_buf_common
BUG: KCSAN: data-race in n_tty_read / n_tty_receive_buf_common
BUG: KCSAN: data-race in osq_lock / osq_lock
BUG: KCSAN: data-race in osq_lock / osq_unlock
BUG: KCSAN: data-race in pipe_poll / pipe_write
BUG: KCSAN: data-race in pipe_read / pipe_release
BUG: KCSAN: data-race in pipe_read / pipe_write
BUG: KCSAN: data-race in poll_schedule_timeout.constprop.0 / pollwake
BUG: KCSAN: data-race in queue_work_on / worker_thread
BUG: KCSAN: data-race in rcu_all_qs / rcu_exp_need_qs
BUG: KCSAN: data-race in rcu_all_qs / rcu_implicit_dynticks_qs
BUG: KCSAN: data-race in rcu_all_qs / rcu_report_qs_rdp
BUG: KCSAN: data-race in set_nlink / set_nlink
BUG: KCSAN: data-race in tick_nohz_idle_stop_tick / tick_nohz_next_event
BUG: KCSAN: data-race in tick_sched_do_timer / tick_sched_do_timer
BUG: KCSAN: data-race in wq_worker_tick / wq_worker_tick
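A list like the one above can be condensed from a raw dmesg log by keeping only the KCSAN headline lines and dropping duplicates. The following is a minimal sketch, not the method actually used for this report: the function name `unique_kcsan_reports`, the timestamp prefixes in the sample, and the sample log text itself are assumptions made for illustration. Only the `BUG: KCSAN: data-race in ...` headline format is taken from the list above.

```python
import re

# Matches the KCSAN headline anywhere in a line, so an optional dmesg
# timestamp prefix such as "[  12.345678] " is ignored.
KCSAN_HEADLINE = re.compile(r"BUG: KCSAN: data-race in (.+?)\s*$")


def unique_kcsan_reports(dmesg_text):
    """Return the sorted, de-duplicated KCSAN data-race headlines in dmesg_text."""
    reports = set()
    for line in dmesg_text.splitlines():
        match = KCSAN_HEADLINE.search(line)
        if match:
            reports.add("BUG: KCSAN: data-race in " + match.group(1))
    return sorted(reports)


# Hypothetical log excerpt; timestamps and the unrelated line are made up.
sample_log = """\
[   10.000001] BUG: KCSAN: data-race in osq_lock / osq_unlock
[   10.000002] some unrelated kernel message
[   11.000001] BUG: KCSAN: data-race in osq_lock / osq_lock
[   12.000001] BUG: KCSAN: data-race in osq_lock / osq_unlock
"""

for report in unique_kcsan_reports(sample_log):
    print(report)
```

Note that Python sorts by code point, so entries starting with an underscore may be ordered differently than in a locale-aware sort.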
Comment 2 Miroslav Franc 2024-02-19 18:23:26 UTC
The previous list of reported data races was from booting the machine. The following list is from running the valgrind test suite.

BUG: KCSAN: data-race in __count_memcg_events / mem_cgroup_css_rstat_flush
BUG: KCSAN: data-race in __d_add / __d_add
BUG: KCSAN: data-race in __flush_work.isra.0 / queue_work_on
BUG: KCSAN: data-race in __hrtimer_run_queues / hrtimer_active
BUG: KCSAN: data-race in __mod_lruvec_page_state / filemap_fault
BUG: KCSAN: data-race in __mod_lruvec_page_state / folio_batch_move_lru
BUG: KCSAN: data-race in __mod_lruvec_page_state / lru_add_fn
BUG: KCSAN: data-race in __mod_lruvec_page_state / next_uptodate_folio
BUG: KCSAN: data-race in __mod_memcg_lruvec_state / __mod_memcg_lruvec_state
BUG: KCSAN: data-race in __mod_memcg_lruvec_state / mem_cgroup_css_rstat_flush
BUG: KCSAN: data-race in __mod_memcg_lruvec_state / memcg_account_kmem
BUG: KCSAN: data-race in __mod_zone_page_state / memchr_inv
BUG: KCSAN: data-race in __napi_schedule_irqoff / net_rx_action
BUG: KCSAN: data-race in _find_next_and_bit+0x50/0x130
BUG: KCSAN: data-race in acct_account_cputime / do_brk_flags
BUG: KCSAN: data-race in alloc_empty_file / percpu_counter_add_batch
BUG: KCSAN: data-race in alloc_pid / copy_process
BUG: KCSAN: data-race in arch_vcpu_is_preempted+0x4a/0xe0
BUG: KCSAN: data-race in atime_needs_update / xfs_vn_update_time [xfs]
BUG: KCSAN: data-race in begin_new_exec / vtime_account_kernel
BUG: KCSAN: data-race in bio_chain / bio_endio
BUG: KCSAN: data-race in blk_mq_delay_run_hw_queue / blk_mq_delay_run_hw_queue
BUG: KCSAN: data-race in blk_mq_dispatch_rq_list / blk_mq_hctx_has_pending
BUG: KCSAN: data-race in copy_from_read_buf / n_tty_receive_buf_common
BUG: KCSAN: data-race in copy_process / vtime_account_kernel
BUG: KCSAN: data-race in d_alloc / lockref_put_return
BUG: KCSAN: data-race in dasd_block_tasklet [dasd_mod] / dasd_start_IO [dasd_mod]
BUG: KCSAN: data-race in do_execveat_common.isra.0 / vtime_account_kernel
BUG: KCSAN: data-race in do_exit / zap_other_threads
BUG: KCSAN: data-race in do_flush_stats / tick_do_update_jiffies64
BUG: KCSAN: data-race in do_nanosleep / hrtimer_wakeup
BUG: KCSAN: data-race in dput / lockref_get_not_dead
BUG: KCSAN: data-race in exit_signals / vtime_account_kernel
BUG: KCSAN: data-race in folio_activate_fn / unmap_page_range
BUG: KCSAN: data-race in inode_needs_update_time / inode_update_timestamps
BUG: KCSAN: data-race in inode_update_timestamps / inode_update_timestamps
BUG: KCSAN: data-race in kthread_is_per_cpu / xfs_trans_free [xfs]
BUG: KCSAN: data-race in ktime_get_real_seconds / timekeeping_advance
BUG: KCSAN: data-race in ktime_get_seconds / timekeeping_advance
BUG: KCSAN: data-race in link_path_walk.part.0.constprop.0 / page_get_link
BUG: KCSAN: data-race in mas_wr_slot_store / mtree_range_walk
BUG: KCSAN: data-race in n_tty_check_unthrottle / n_tty_receive_buf_common
BUG: KCSAN: data-race in n_tty_poll / n_tty_receive_buf_common
BUG: KCSAN: data-race in n_tty_read / n_tty_receive_buf_common
BUG: KCSAN: data-race in osq_lock / osq_lock
BUG: KCSAN: data-race in osq_lock / osq_unlock
BUG: KCSAN: data-race in page_get_link / page_get_link
BUG: KCSAN: data-race in pipe_read / pipe_read
BUG: KCSAN: data-race in pipe_read / pipe_release
BUG: KCSAN: data-race in pipe_read / pipe_write
BUG: KCSAN: data-race in poll_schedule_timeout.constprop.0 / pollwake
BUG: KCSAN: data-race in process_one_work / process_one_work
BUG: KCSAN: data-race in process_one_work / queue_work_on
BUG: KCSAN: data-race in queue_work_on / worker_thread
BUG: KCSAN: data-race in raw3215_irq / tty3215_write_room
BUG: KCSAN: data-race in rcu_all_qs / rcu_implicit_dynticks_qs
BUG: KCSAN: data-race in rcu_all_qs / rcu_report_qs_rdp
BUG: KCSAN: data-race in release_task+0x1fe/0xb28
BUG: KCSAN: data-race in release_task+0x20e/0xb28
BUG: KCSAN: data-race in tick_nohz_idle_stop_tick / tick_nohz_next_event
BUG: KCSAN: data-race in tick_sched_do_timer / tick_sched_do_timer
BUG: KCSAN: data-race in vtime_account_kernel / xfs_end_ioend [xfs]
BUG: KCSAN: data-race in vtime_account_kernel / xfs_prepare_ioend [xfs]
BUG: KCSAN: data-race in vtime_account_kernel / xfs_trans_alloc [xfs]
BUG: KCSAN: data-race in vtime_account_kernel / xfs_trans_free [xfs]
BUG: KCSAN: data-race in xa_get_order / xas_store
BUG: KCSAN: data-race in xas_clear_mark / xas_find_marked
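To see which reports are specific to one workload, the boot list and the valgrind list can be compared as sets. This is a small sketch under stated assumptions: the helper name `compare_runs` and the dictionary keys are hypothetical, and the two short input lists are just a few entries picked from the lists in this report, not the full data.

```python
def compare_runs(boot_reports, test_reports):
    """Split KCSAN headlines into those seen in both runs and in only one."""
    boot, test = set(boot_reports), set(test_reports)
    return {
        "both": sorted(boot & test),
        "boot_only": sorted(boot - test),
        "test_only": sorted(test - boot),
    }


# A few entries taken from the two lists in this report (not exhaustive).
boot_sample = [
    "BUG: KCSAN: data-race in osq_lock / osq_lock",
    "BUG: KCSAN: data-race in set_nlink / set_nlink",
]
test_sample = [
    "BUG: KCSAN: data-race in osq_lock / osq_lock",
    "BUG: KCSAN: data-race in bio_chain / bio_endio",
]

result = compare_runs(boot_sample, test_sample)
for group, reports in result.items():
    print(group, reports)
```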
Comment 3 Miroslav Franc 2024-02-19 18:25:03 UTC
Created attachment 872854
dmesg log from s390x VM z15 factory (valgrind regtest)
Comment 4 Sarah Kriesch 2024-06-30 10:45:52 UTC
Should that be forwarded to IBM?
Comment 5 Miroslav Franc 2024-07-01 07:33:56 UTC
(In reply to Sarah Kriesch from comment #4)
> Should that be forwarded to IBM?

As far as I remember, IBM is aware. In practice, KCSAN doesn't work on s390x, i.e. it shows far too many false positives. I hit this while trying to debug certain memory issues on a LinuxONE machine. It's definitely on my radar, but I don't think it's a priority for anybody right now. However, it's a real problem, so I don't want to close the bug without a meaningful resolution. At the same time, I doubt it will be fixed anytime soon.