Bugzilla – Bug 1217821
SIGBUS raised in the child when clone() is called with a static stack parameter
Last modified: 2024-01-25 10:14:53 UTC
## Observation

openQA test in scenario sle-15-SP6-Azure-BYOS-aarch64-publiccloud_ltp_cve@64bit fails in [cve-2021-4197_2](https://openqa.suse.de/tests/12934596/modules/cve-2021-4197_2/steps/1)

The issue is ONLY found on the aarch64 platform. The LTP log shows nothing but the failure; what actually happens is that an LTP subprocess receives SIGBUS (for details, see the gdb trace below).

The issue seems related to the static keyword on the stack parameter: if you remove the static keyword, the issue is gone. However, the gdb output shows that $sp does point into the static area, which is a valid address, and the address does not seem to have an alignment issue. So why does SIGBUS happen?

I also attach simpler code to reproduce this issue.

--- a/testcases/kernel/controllers/cgroup/cgroup_core02.c
+++ b/testcases/kernel/controllers/cgroup/cgroup_core02.c
@@ -51,7 +51,9 @@ static int lesser_ns_open_thread_fn(void *arg)
 static void test_lesser_ns_open(void)
 {
 int i;
-static char stack[65536];
+char stack[65536];

Starting program: /home/azureuser/ltp/testcases/kernel/controllers/cgroup/cgroup_core02
Missing separate debuginfos, use: zypper install glibc-debuginfo-2.31-150300.63.1.aarch64
tst_test.c:1690: TINFO: LTP version: 20230929-184-g776b57984
tst_test.c:1576: TINFO: Timeout per run is 0h 17m 10s
[Attaching after process 29714 fork to child process 29716]
[New inferior 2 (process 29716)]
[Detaching after fork from parent process 29714]
[Inferior 1 (process 29714) detached]
[Attaching after process 29716 fork to child process 29717]
[New inferior 3 (process 29717)]
[Detaching after fork from parent process 29716]
[Inferior 2 (process 29716) detached]
[Attaching after process 29717 fork to child process 29718]
[New inferior 4 (process 29718)]
[Detaching after fork from parent process 29717]
[Inferior 3 (process 29717) detached]

Thread 4.1 "cgroup_core02" received signal SIGBUS, Bus error.
[Switching to process 29718]
lesser_ns_open_thread_fn (arg=0xfffffffff8a8) at cgroup_core02.c:44
44 {
(gdb) bt
#0 lesser_ns_open_thread_fn (arg=0xfffffffff8a8) at cgroup_core02.c:44
#1 0x0000fffff7f0feec in thread_start () from /lib64/libc.so.6
(gdb) disassemble lesser_ns_open_thread_fn
Dump of assembler code for function lesser_ns_open_thread_fn:
=> 0x0000000000404a80 <+0>: stp x29, x30, [sp, #-32]!
   0x0000000000404a84 <+4>: adrp x1, 0x440000 memcpy@got.plt
   0x0000000000404a88 <+8>: mov w4, #0x2 // #2
   0x0000000000404a8c <+12>: adrp x3, 0x420000
   0x0000000000404a90 <+16>: mov x29, sp
   0x0000000000404a94 <+20>: ldr x2, [x1, #3048]
   0x0000000000404a98 <+24>: add x3, x3, #0x7c8
   0x0000000000404a9c <+28>: str x19, [sp, #16]
   0x0000000000404aa0 <+32>: mov x19, x0
   0x0000000000404aa4 <+36>: mov x5, x19
   0x0000000000404aa8 <+40>: mov w1, #0x2f // #47
   0x0000000000404aac <+44>: adrp x0, 0x420000
   0x0000000000404ab0 <+48>: add x0, x0, #0x7b8
   0x0000000000404ab4 <+52>: bl 0x40bca8
   0x0000000000404ab8 <+56>: str w0, [x19, #128]
   0x0000000000404abc <+60>: mov w0, #0x0 // #0
   0x0000000000404ac0 <+64>: ldr x19, [sp, #16]
   0x0000000000404ac4 <+68>: ldp x29, x30, [sp], #32
   0x0000000000404ac8 <+72>: ret
End of assembler dump.
(gdb) x/10x 0xfffffffff8a8
0xfffffffff8a8: 0x00000000 0x00000000 0x00000000 0x00000000
0xfffffffff8b8: 0x00000000 0x00000000 0x00000000 0x00000000
0xfffffffff8c8: 0x00000000 0x00000000
(gdb) i registers
x0 0xfffffffff8a8 281474976708776
x1 0x450bf8 4525048
x2 0x0 0
x3 0x0 0
x4 0x0 0
x5 0x0 0
x6 0x0 0
x7 0x440bf8 4459512
x8 0xdc 220
x9 0x7f7f7f7f7f7f7f7f 9187201950435737471
x10 0x404a80 4213376
x11 0x2000511 33555729
x12 0xfffffffff8a8 281474976708776
x13 0xffffffffffffffff -1
x14 0xfffff7e449c8 281474840676808
x15 0xfffff7e37228 281474840621608
x16 0xfffff7f0fe90 281474841509520
x17 0x4404e8 4457704
x18 0x1 1
x19 0x440000 4456448
x20 0x440be0 4459488
x21 0x4207b8 4327352
x22 0x4207c8 4327368
x23 0x423000 4337664
x24 0xffffffff 4294967295
x25 0x425380 4346752
x26 0x0 0
x27 0x0 0
x28 0x420000 4325376
x29 0x0 0
x30 0xfffff7f0feec 281474841509612
sp 0x450bf8 0x450bf8
pc 0x404a80 0x404a80
cpsr 0x60001000 [ EL=0 BTYPE=0 SSBS C Z ]
fpsr 0x0 [ ]
--Type <RET> for more, q to quit, c to continue without paging--c
fpcr 0x0 RMode=0

x/50x $sp-32
0x450bd8 : 0x00000000 0x00000000 0x00000000 0x00000000
0x450be8 : 0x00000000 0x00000000 0x00000000 0x00000000
0x450bf8 : 0x74736574 0x3739322d 0x00003431 0x00000000
0x450c08 : 0x00000000 0x00000000 0x00000000 0x00000000
0x450c18 : 0x00000000 0x00000000 0x00000000 0x00000000
0x450c28 : 0x00000000 0x00000000 0x00000000 0x00000000
0x450c38 : 0x00000000 0x00000000 0x00000000 0x00000000
0x450c48 : 0x00000000 0x00000000 0x00000000 0x00000000
0x450c58 : 0x00000000 0x00000000 0x00000000 0x00000000
0x450c68 : 0x00000000 0x00000000 0x00000000 0x00000000
0x450c78 : 0x00000000 0x00000000 0x00000000 0x00000000
0x450c88 : 0x00000000 0x00000000 0x00000000 0x00000000
0x450c98 : 0x00000000 0x00000000

cat /etc/os-release
NAME="SLES"
VERSION="15-SP6"
VERSION_ID="15.6"
PRETTY_NAME="SUSE Linux Enterprise Server 15 SP6"
ID="sles"
ID_LIKE="suse"
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:suse:sles:15:sp6"
DOCUMENTATION_URL="https://documentation.suse.com/"
azureuser@cvetestasmorodskyi:~> uname -r
6.4.0-150600.4-default
azureuser@cvetestasmorodskyi:~> ldd --version
ldd (GNU libc) 2.31
Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Written by Roland McGrath and Ulrich Drepper.
azureuser@cvetestasmorodskyi:~> lscpu
Architecture: aarch64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 2
On-line CPU(s) list: 0,1
Vendor ID: ARM
Model name: Neoverse-N1
Model: 1
Thread(s) per core: 1
Core(s) per socket: 2
Socket(s): 1
Stepping: r3p1
BogoMIPS: 50.00
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp
Caches (sum of all):
  L1d: 128 KiB (2 instances)
  L1i: 128 KiB (2 instances)
  L2: 2 MiB (2 instances)
  L3: 32 MiB (1 instance)
NUMA:
  NUMA node(s): 1
  NUMA node0 CPU(s): 0,1
Vulnerabilities:
  Gather data sampling: Not affected
  Itlb multihit: Not affected
  L1tf: Not affected
  Mds: Not affected
  Meltdown: Mitigation; PTI
  Mmio stale data: Not affected
  Retbleed: Not affected
  Spec rstack overflow: Not affected
  Spec store bypass: Not affected
  Spectre v1: Mitigation; __user pointer sanitization
  Spectre v2: Mitigation; CSV2, BHB
  Srbds: Not affected
  Tsx async abort: Not affected
Created attachment 871165
Simple reproducer code
On ARMv8 the stack pointer must be 16-byte aligned whenever it is used as a base address, so the program should ensure that the stack it passes to clone is 16-byte aligned.
https://developer.arm.com/documentation/den0024/a/An-Introduction-to-the-ARMv8-Instruction-Sets/The-ARMv8-instruction-sets/Addressing
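To make the alignment rule concrete, here is a minimal sketch of the check in C. The helper names (`sp_is_aligned`, `sp_align_down`, `AARCH64_SP_ALIGN`) are hypothetical and only serve as illustration; they are not part of LTP or glibc.

```c
#include <stdbool.h>
#include <stdint.h>

/* AArch64 wants SP to be 16-byte aligned when it is used as a base address. */
#define AARCH64_SP_ALIGN 16u

/* True if addr is a multiple of 16 (hypothetical helper). */
static bool sp_is_aligned(uintptr_t addr)
{
    return (addr & (AARCH64_SP_ALIGN - 1)) == 0;
}

/* Round addr down to the previous 16-byte boundary; downwards because
 * the stack grows towards lower addresses (hypothetical helper). */
static uintptr_t sp_align_down(uintptr_t addr)
{
    return addr & ~(uintptr_t)(AARCH64_SP_ALIGN - 1);
}
```

Applied to the sp value from the register dump in the description, `sp_is_aligned(0x450bf8)` is false (the low four bits are 0x8) and `sp_align_down(0x450bf8)` gives 0x450bf0. The faulting instruction, `stp x29, x30, [sp, #-32]!`, is the first one that uses sp as a base address, which is consistent with the SIGBUS being an SP alignment fault.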
(In reply to WEI GAO from comment #1)
> Created attachment 871165 [details]
> simple reproduce code

As a side note not directly related to the issue, please note that the code sample provided is currently allocating space from the BSS segment and not the stack, at least as long as the uncommented line is the one declaring the variable as static.

Thanks,
Andrea
Looks like a compiler problem. I tried with two compiler versions:

gcc (SUSE Linux) 7.5.0 (the default) - I can reproduce the issue
gcc-13 (SUSE Linux) 13.2.1 20230912 [revision b96e66fd4ef3e36983969fb8cdd1956f551a074b] - The issue disappeared

And definitely with gcc 7.5 the 'stack' variable address is not 16-byte aligned, and with gcc 13 it is aligned.
The stack variable is a byte/char array; I don't think there is any rule that requires the compiler to align such an array in any particular way.
(In reply to Ivan Ivanov from comment #5)
> stack variable is byte/char array, I don't think that there
> is any rule that compiler should align the array in any way.

Exactly. Also, the SP from the gdb dump (0x450bf8) seems to be far off the usual memory range (0xfffff....), at least for EL0, so it may be something else that is corrupting things. I'll try to reproduce it asap and see why it's happening.
(In reply to Ivan Ivanov from comment #5)
> stack variable is byte/char array, I don't think that there
> is any rule that compiler should align the array in any way.

Can __attribute__((__aligned__(16))) work?
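As an illustration of this suggestion, here is a minimal sketch of a static buffer with the GCC/Clang alignment attribute. `STACK_SIZE`, `child_stack` and `child_stack_top` are hypothetical names chosen for the example; the size matches the 65536-byte array in the diff from the description. The sketch assumes the caller passes the top of the buffer to clone, which is why the size is also kept a multiple of 16.

```c
#include <stdint.h>

#define STACK_SIZE 65536 /* same size as the array in the cgroup_core02.c diff */

/* A size that is a multiple of 16 keeps the top of the array aligned too. */
_Static_assert(STACK_SIZE % 16 == 0, "STACK_SIZE must be a multiple of 16");

/* The C type char[N] by itself only guarantees 1-byte alignment. */
_Static_assert(_Alignof(char[STACK_SIZE]) == 1, "char arrays are byte aligned");

/* GCC/Clang attribute: force the array to start on a 16-byte boundary. */
static char child_stack[STACK_SIZE] __attribute__((__aligned__(16)));

/* Address to hand to clone(): one past the end, since the stack grows down. */
static void *child_stack_top(void)
{
    return child_stack + STACK_SIZE;
}
```

Without the attribute the placement of the array depends on the compiler and linker, which would fit the different results seen with gcc 7.5 and gcc 13.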
The tests from Stanimir suggest that different glibc versions behave differently when it comes to clone. In fact, newer ones seem to adjust the address passed to the clone syscall:

#ltrace ./waitpid
...
clone(0x401196, 0x504082, 33555729, 0x404060)
...

#strace ./waitpid
...
clone(child_stack=0x504070, flags=CLONE_VM|CLONE_FILES|CLONE_NEWCGROUP|SIGCHLD)
...

You can see that the misaligned address passed from userspace (0x504082) has been sanitized to a lower, 16-byte-aligned one (0x504070); it is moved downwards because the stack grows towards lower addresses (on most architectures). The address range also makes sense, since the array has been allocated from the BSS segment (see static), as follows:

#objdump -x ./waitpid | grep bss
25 .bss 00100048 0000000000404040 0000000000404040 00003040 2**5
...

To play around a bit, you can easily tweak the topmost address of the stack array by adding e.g. one to the STACK_SIZE macro.

The advice from Takashi, i.e. adding __attribute__((__aligned__(16))) to the stack definition, should do the trick. As an alternative, you can use mmap to reserve the memory to be passed as the stack, since it is page aligned, adding the MAP_STACK option for maximum portability on exotic architectures.
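The mmap alternative can be sketched as follows. This is only an illustration under stated assumptions: `child_fn` and `run_child_on_mmap_stack` are hypothetical names, and the clone flags are reduced to SIGCHLD so that the sketch runs without privileges, whereas the test in this bug also passes CLONE_VM, CLONE_FILES and CLONE_NEWCGROUP.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* needed for the clone() declaration in <sched.h> */
#endif
#include <sched.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/wait.h>

#define STACK_SIZE (64 * 1024)

/* Hypothetical child body: exits with the status passed through arg. */
static int child_fn(void *arg)
{
    return *(int *)arg;
}

/* Runs child_fn on an mmap()ed stack.
 * Returns the child's exit status, or -1 on any error. */
static int run_child_on_mmap_stack(int wanted_status)
{
    int status = -1;
    char *stack = mmap(NULL, STACK_SIZE, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
    if (stack == MAP_FAILED)
        return -1;

    /* The mapping is page aligned and STACK_SIZE is a multiple of 16,
     * so the top of the stack is 16-byte aligned as well. */
    pid_t pid = clone(child_fn, stack + STACK_SIZE, SIGCHLD, &wanted_status);
    if (pid > 0 && waitpid(pid, &status, 0) == pid && WIFEXITED(status))
        status = WEXITSTATUS(status);
    else
        status = -1;

    munmap(stack, STACK_SIZE);
    return status;
}
```

Calling `run_child_on_mmap_stack(0)` should return 0 when the child starts and exits normally. Because the alignment comes from the mapping itself, this approach does not depend on the compiler or on the glibc version aligning the address.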
(In reply to Andrea della Porta from comment #8)

I found that the bug fix commit is contained in glibc-2.33, so do we have a plan to upgrade it?

commit 3842ba494963b1d76ad5f68b8d1e5c2279160e31
Author: Szabolcs Nagy <szabolcs.nagy@arm.com>
Date: Tue Jun 1 09:23:40 2021 +0100

aarch64: align stack in clone [BZ #27939]

The AArch64 PCS requires 16 byte aligned stack. Previously if the
caller passed an unaligned stack to clone then the child crashed.

Fixes bug 27939.
(In reply to Takashi Iwai from comment #7)
> (In reply to Ivan Ivanov from comment #5)
> > stack variable is byte/char array, I don't think that there
> > is any rule that compiler should align the array in any way.
>
> Can __attribute__((__aligned__(16))) work?

Yes, this is working, of course. :-)

(In reply to WEI GAO from comment #9)
> I found buf fix commit contain in glibc-2.33, so do we have plan to upgrade
> it?
>
> commit 3842ba494963b1d76ad5f68b8d1e5c2279160e31
> Author: Szabolcs Nagy <szabolcs.nagy@arm.com>
> Date: Tue Jun 1 09:23:40 2021 +0100
>
> aarch64: align stack in clone [BZ #27939]
>
> The AArch64 PCS requires 16 byte aligned stack. Previously if the
> caller passed an unaligned stack to clone then the child crashed.
>
> Fixes bug 27939.

Question to our glibc maintainers, I suppose.
(In reply to WEI GAO from comment #9)

As Ivan already mentioned, this is not related to the kernel, so I guess this ticket has to be moved to the glibc package domain.

Many thanks
Resetting to default assignee.
... moving to base system. If this is not correct, please reassign.
Let's toss to the glibc package maintainer.
Fixed.
https://openqa.suse.de/tests/12975040 shows a pass, so the issue can be closed.