Bugzilla – Bug 1220073
KASAN enabled kernel dumps with Stack depot reached limit capacity
Last modified: 2024-02-19 14:09:22 UTC
With the z13 finally back online, we tried the KASAN-enabled Factory kernel there as well. None of the other crashes we see on the z15 show up, but KASAN triggers this:

[216976.545411] ------------[ cut here ]------------
[216976.545448] Stack depot reached limit capacity
[216976.545514] WARNING: CPU: 13 PID: 22981 at lib/stackdepot.c:271 __stack_depot_save+0x5a6/0x5b8
[216976.545561] Modules linked in: algif_hash af_alg af_packet rfkill binfmt_misc nbd kvm drm drm_panel_orientation_quirks i2c_core configfs ip_tables x_tables ext4 mbcache jbd2 dm_service_time dm_multipath fuse squashfs loop virtio_blk dm_mod brd bonding tls qeth_l2 bridge stp llc lcs ctcm fsm dasd_fba_mod dasd_eckd_mod dasd_mod evdev crc32_vx_s390 ghash_s390 prng chacha_s390 libchacha zfcp scsi_transport_fc xts sd_mod scsi_dh_emc scsi_dh_rdac scsi_dh_alua aes_s390 t10_pi des_s390 libdes crc64_rocksoft_generic crc64_rocksoft sha512_s390 sg sha256_s390 crc64 sha1_s390 sha_common scsi_mod qeth qdio ccwgroup scsi_common vfio_ccw mdev vfio_iommu_type1 vfio eadm_sch pkey zcrypt rng_core
[216976.545936] CPU: 13 PID: 22981 Comm: mkfs.ext3 Kdump: loaded Not tainted 6.7.4-1.g19af270-default #1 openSUSE Tumbleweed (unreleased) 2e496a1ebf29e36069d2a1de97d0f181d3690830
[216976.545974] Hardware name: IBM 2964 N63 400 (LPAR)
[216976.545989] Krnl PSW : 0404e00180000000 00000003406e7fb2 (__stack_depot_save+0x5aa/0x5b8)
[216976.546021]            R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
[216976.546049] Krnl GPRS: 001c000000000027 001c000100000023 0000000000000022 0000000000000004
[216976.546078]            0000000000000001 000000033ff8b938 000000034189bcfc 0400002000000000
[216976.546100]            0000000000000020 0000000082b6f848 000000006ea33377 0000000000000100
[216976.546121]            00000003418b9bb0 0000000000001fff 00000003406e7fae 0000000082b6f738
[216976.546189] Krnl Code: 00000003406e7fa2: c020006258f8 larl %r2,0000000341333192
                           00000003406e7fa8: c0e5ffb0d6f4 brasl %r14,000000033fd02d90
                          #00000003406e7fae: af000000 mc 0,0
                          >00000003406e7fb2: a7b90000 lghi %r11,0
                           00000003406e7fb6: a7f4fee9 brc 15,00000003406e7d88
                           00000003406e7fba: 0707 bcr 0,%r7
                           00000003406e7fbc: 0707 bcr 0,%r7
                           00000003406e7fbe: 0707 bcr 0,%r7
[216976.546457] Call Trace:
[216976.546477] [<00000003406e7fb2>] __stack_depot_save+0x5aa/0x5b8
[216976.546507] ([<00000003406e7fae>] __stack_depot_save+0x5a6/0x5b8)
[216976.546537] [<00000003401b187a>] kasan_save_stack+0x52/0x60
[216976.546562] [<00000003401b3e2a>] __kasan_record_aux_stack+0xe2/0x100
[216976.546587] [<000000033fd4ede8>] task_work_add+0x98/0x230
[216976.546614] [<000000033fd85264>] scheduler_tick+0x14c/0x478
[216976.546649] [<000000033fe490f4>] update_process_times+0xec/0xf8
[216976.546677] [<000000033fe66c78>] tick_sched_handle+0x88/0xb0
[216976.546718] [<000000033fe674c0>] tick_nohz_highres_handler+0x80/0xf8
[216976.546757] [<000000033fe4a350>] __hrtimer_run_queues+0x310/0x4f8
[216976.546793] [<000000033fe4c352>] hrtimer_interrupt+0x282/0x4e0
[216976.546817] [<000000033fc956c4>] do_IRQ+0x7c/0x90
[216976.546844] [<000000033fc95d66>] do_irq_async+0xc6/0x160
[216976.546880] [<0000000340d3057a>] do_ext_irq+0xda/0x120
[216976.546909] [<0000000340d4d7f8>] ext_int_handler+0xd0/0x100
[216976.546942] [<00000003401b3ab4>] __asan_load8+0xc/0x98
[216976.546965] [<00000003405298dc>] bio_associate_blkg+0x5c/0xb8
[216976.546995] [<00000003404d3e34>] bio_init+0x11c/0x1d0
[216976.547028] [<00000003404d7784>] bio_alloc_bioset+0x2ec/0x548
[216976.547055] [<001bffff8083e29c>] ext4_bio_write_folio+0x884/0xd18 [ext4]
[216976.547753] [<001bffff807fb434>] mpage_submit_folio+0xf4/0x1a0 [ext4]
[216976.548326] [<001bffff807fbc36>] mpage_map_and_submit_buffers+0x36e/0x578 [ext4]
[216976.548930] [<001bffff80808ea6>] ext4_do_writepages+0xdc6/0x15c0 [ext4]
[216976.549488] [<001bffff8080995c>] ext4_writepages+0x194/0x2a0 [ext4]
[216976.550054] [<000000034009c9c6>] do_writepages+0x11e/0x368
[216976.550086] [<00000003400826d2>] filemap_fdatawrite_wbc+0xda/0x118
[216976.550116] [<0000000340089900>] __filemap_fdatawrite_range+0xb8/0xd0
[216976.550145] [<0000000340089ab0>] file_write_and_wait_range+0x80/0xf0
[216976.550175] [<00000003402a2256>] generic_buffers_fsync_noflush+0x66/0x198
[216976.550208] [<001bffff807e934e>] ext4_sync_file+0x286/0x5d8 [ext4]
[216976.550800] [<000000034028f4a4>] __s390x_sys_fsync+0x6c/0xa0
[216976.550827] [<000000033fc88b12>] do_syscall+0x19a/0x1d8
[216976.550849] [<0000000340d30350>] __do_syscall+0xd0/0xf8
[216976.550874] [<0000000340d4d5a0>] system_call+0x70/0x98
[216976.550894] Last Breaking-Event-Address:
[216976.550905] [<00000003406efcd0>] __s390_indirect_jump_r14+0x0/0x10
[216976.550936] Kernel panic - not syncing: kernel: panic_on_warn set ...
[216976.550954] CPU: 13 PID: 22981 Comm: mkfs.ext3 Kdump: loaded Not tainted 6.7.4-1.g19af270-default #1 openSUSE Tumbleweed (unreleased) 2e496a1ebf29e36069d2a1de97d0f181d3690830

Uploading crash dumps to ziu:/nfs/Bug[whatevernumberwegethere]
Thanks for the report. On closer look, though, this is not a bug. The KASAN-enabled kernel simply ran out of stack depot space for storing allocation stacks here. I could rebuild the kernel with a higher limit, but since KASAN hasn't caught anything else so far, I believe it's not worth it.
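For readers unfamiliar with the mechanism, here is a minimal sketch in C of how a fixed-capacity store for stack traces can fill up. It is a toy illustration only, not the code in lib/stackdepot.c: the names (toy_depot_save, toy_hash, TOY_MAX_RECORDS, TOY_MAX_FRAMES), the sizes, the hash function, and the warn-once behaviour are all assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical toy sizes; the real limits in the kernel are different. */
#define TOY_MAX_RECORDS 4   /* distinct traces the toy depot can hold */
#define TOY_MAX_FRAMES  8   /* frames kept per trace */

struct toy_record {
    uint32_t hash;
    unsigned int nr_frames;
    uintptr_t frames[TOY_MAX_FRAMES];
};

static struct toy_record toy_depot[TOY_MAX_RECORDS];
static unsigned int toy_used;   /* number of records stored so far */
static int toy_warned;          /* assumption: warn only the first time */

/* FNV-1a over the frame addresses; a stand-in for whatever hash is used. */
static uint32_t toy_hash(const uintptr_t *frames, unsigned int nr)
{
    uint32_t h = 2166136261u;

    for (unsigned int i = 0; i < nr; i++) {
        uintptr_t v = frames[i];

        for (size_t b = 0; b < sizeof(v); b++) {
            h ^= (uint32_t)(v & 0xff);
            h *= 16777619u;
            v >>= 8;
        }
    }
    return h;
}

/*
 * Store a trace and return a 1-based handle, or 0 if the depot is full.
 * Identical traces share a single record, so only new, distinct traces
 * consume capacity.
 */
static unsigned int toy_depot_save(const uintptr_t *frames, unsigned int nr)
{
    assert(nr > 0 && nr <= TOY_MAX_FRAMES);

    uint32_t h = toy_hash(frames, nr);

    /* Look for an identical trace that is already stored. */
    for (unsigned int i = 0; i < toy_used; i++) {
        struct toy_record *r = &toy_depot[i];

        if (r->hash == h && r->nr_frames == nr &&
            memcmp(r->frames, frames, nr * sizeof(*frames)) == 0)
            return i + 1;
    }

    /* No room for a new distinct trace: warn once and give up. */
    if (toy_used == TOY_MAX_RECORDS) {
        if (!toy_warned) {
            toy_warned = 1;
            fprintf(stderr, "toy depot reached limit capacity\n");
        }
        return 0;
    }

    struct toy_record *r = &toy_depot[toy_used];

    r->hash = h;
    r->nr_frames = nr;
    memcpy(r->frames, frames, nr * sizeof(*frames));
    return ++toy_used;
}
```

In this sketch a full depot only means that new, distinct traces are no longer recorded; traces stored earlier stay retrievable. In the log above the warning became fatal because panic_on_warn was set.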