|
Bugzilla – Full Text Bug Listing |
| Summary: | /usr/src/kernel-modules/nvidia-390.157-default/nvidia/nv-procfs.o: warning: objtool: nv_register_procfs() falls through to next function nv_unregister_procfs() | ||
|---|---|---|---|
| Product: | [openSUSE] openSUSE Distribution | Reporter: | Michal Suchanek <msuchanek> |
| Component: | X11 3rd Party Driver | Assignee: | Stefan Dirsch <sndirsch> |
| Status: | RESOLVED FIXED | QA Contact: | Stefan Dirsch <sndirsch> |
| Severity: | Normal | ||
| Priority: | P3 - Medium | CC: | mbenes, mjambor, msuchanek, petr.pavlu, pmladek, sndirsch |
| Version: | Leap 15.5 | ||
| Target Milestone: | --- | ||
| Hardware: | Other | ||
| OS: | Other | ||
| Whiteboard: | |||
| Found By: | --- | Services Priority: | |
| Business Priority: | Blocker: | --- | |
| Marketing QA Status: | --- | IT Deployment: | --- |
| Attachments: |
The nvidia module that includes the symbol
build log source preprocessed source object file nvidia package change |
||
|
Description
Michal Suchanek
2024-05-17 08:34:33 UTC
Created attachment 874943 [details]
The nvidia module that includes the symbol
Strange and I would need to take a look at sources because
000000000000abb0 <nv_register_procfs>:
[...]
adbc: e8 00 00 00 00 call adc1 <nv_register_procfs+0x211>
adbd: R_X86_64_PC32 __stack_chk_fail-0x4
adc1: 4c 8b 05 00 00 00 00 mov 0x0(%rip),%r8 # adc8 <nv_register_procfs+0x218>
adc4: R_X86_64_PC32 .bss+0x1284
adc8: 48 c7 c1 00 00 00 00 mov $0x0,%rcx
adcb: R_X86_64_32S .rodata+0x420
adcf: 48 89 c2 mov %rax,%rdx
add2: be 24 81 00 00 mov $0x8124,%esi
add7: e8 00 00 00 00 call addc <nv_register_procfs+0x22c>
add8: R_X86_64_PC32 proc_create_data-0x4
addc: 0f 1f 40 00 nopl 0x0(%rax)
It really ends with a call to proc_create_data() which, obviously, is not a noreturn function. So it falls through to the next one. If it ended with the call to __stack_chk_fail() a couple of lines above, like most of the other functions, everything would be fine, because __stack_chk_fail() is marked as noreturn and objtool knows about it.
The code between the call to __stack_chk_fail() and the call to proc_create_data() fills arguments. adc1 address is a target of intra-function jump so looks like a valid code.
000000000000ade0 <nv_unregister_procfs>:
ade0: e8 00 00 00 00 call ade5 <nv_unregister_procfs+0x5>
ade1: R_X86_64_PC32 __fentry__-0x4
ade5: 48 8b 3d 00 00 00 00 mov 0x0(%rip),%rdi # adec <nv_unregister_procfs+0xc>
ade8: R_X86_64_PC32 .bss+0x1274
adec: e9 00 00 00 00 jmp adf1 <nv_unregister_procfs+0x11>
aded: R_X86_64_PC32 proc_remove-0x4
adf1: 66 66 2e 0f 1f 84 00 data16 cs nopw 0x0(%rax,%rax,1)
adf8: 00 00 00 00
adfc: 0f 1f 40 00 nopl 0x0(%rax)
I left it here as an example. There is jmp at the end which is correct. It does not return. Sibling jump. All good.
I am only guessing without sources though. Could you also appends logs from the compilation, please?
Created attachment 875026 [details]
build log
Created attachment 875027 [details]
source
Created attachment 875028 [details]
preprocessed source
Thanks. So it seems that... (thanks Petr Pavlu for helping out because it is really a strange thing)
adc1: 4c 8b 05 00 00 00 00 mov 0x0(%rip),%r8 # adc8 <nv_register_procfs+0x218>
adc4: R_X86_64_PC32 .bss+0x1284
adc8: 48 c7 c1 00 00 00 00 mov $0x0,%rcx
adcb: R_X86_64_32S .rodata+0x420
adcf: 48 89 c2 mov %rax,%rdx
add2: be 24 81 00 00 mov $0x8124,%esi
add7: e8 00 00 00 00 call addc <nv_register_procfs+0x22c>
add8: R_X86_64_PC32 proc_create_data-0x4
addc: 0f 1f 40 00 nopl 0x0(%rax)
is just dead code. Originally it comes from
for (i = 0; __nv_patches[i].short_description; i++)
{
nv_procfs_add_text_file(proc_nvidia_patches,
__nv_patches[i].short_description, __nv_patches[i].description);
}
where __nv_patches is a global array initialized as
static struct {
const char *short_description;
const char *description;
} __nv_patches[] = {
{ ((void *)0), ((void *)0) } };
The dead code is nv_procfs_add_text_file() inlined (it contains just the call to proc_create_data()).
The compiler knows that the loop does not exist and it is optimized out. The call site (from the "loop") looks like
aced: 48 8b 3d 00 00 00 00 mov 0x0(%rip),%rdi # acf4 <nv_register_procfs+0x144>
acf0: R_X86_64_PC32 .bss+0x127c
acf4: 48 85 ff test %rdi,%rdi
acf7: 0f 85 c4 00 00 00 jne adc1 <nv_register_procfs+0x211>
Meaning that the jump never happens (the array contains just zeros). However it is all left there. No idea why.
So objtool basically reports a correct thing.
Martin, does it ring a bell? Could you shed some light on this?
Created attachment 875072 [details]
object file
GCC 7 deduces that the loop cannot complete its first iteration because then it would read past the __nv_patches array if it did. So it puts its __builtin_unreachable after the call to proc_create_data because if it did not return, it would be still necessary to call it. To make the compiler do what you want it do, I think you need to make __nv_patches const. Michal, could you try Martin's suggestion, please? I do not think this is something that objtool can do anything about. Created attachment 875152 [details]
nvidia package change
Making the array const is indeed workaround for the problem.
Nonetheless, we have real code here where gcc fails to eliminate some dead branch of the code, and as a result objtool issues a useless warning.
That's problem with the tools, not the code.
I will look into why the variable was not promoted to const automagically in this case because that seems to be somewhat lame (but maybe some real reason escaped me). However, failing that, GCC cannot reliably identify every piece of unreachable code. If, for example, the address of the non-const array escaped the compilation unit, then GCC simply might not do anything better than what it does. (Well, GCC 13 has a new option -funreachable-traps which puts a trap before the "fall-through" but we have just discovered its current documentation is not correct and since ATM I cannot really estimate its costs, I cannot really recommend using it or even evaluating it.) I will not get caught in "broken code, broken tools, broken gcc" discussion but the thing is that objtool cannot know if it is a bug in the code or not. So it reports the warning and stops the analysis. Things like this are fixed in the mainline kernel and const would be appropriate here. Having said that the support for noreturn functions in objtool is not great. There have been talks about improving it but it needs to happen in cooperation with GCC because there is currently no way to know that GCC decided to place unreachable somewhere (decided that a function is noreturn). I do not know if there is a progress on this front behind the scenes. GCC 13 (at least) does not automagically promote the array to const because of the -flive-patching=inline-clone option (which disables the pure-const pass). It is even natural, we need to be ready for some unknown live-patch to possibly leak the address even when such a leak is not present in the code at hand. (In reply to Miroslav Beneš from comment #12) > Having said that the support for noreturn functions in objtool is not great. > There have been talks about improving it but it needs to happen in > cooperation with GCC because there is currently no way to know that GCC > decided to place unreachable somewhere (decided that a function is > noreturn). I do not know if there is a progress on this front behind the > scenes. Once we start building our enterprise kernels with GCC 13+, asking our performance team to evaluate the option -funreachable-traps might be worthwhile. Please consider adding the attached patch to avoid displaying this (false-positive) warning to users. Which warning do you mean exactly? conftest.sh creates code snippets to figure out for specific functions, how many and which kind of parameters are used in the kernel against the driver is being built. This is needed due to all kind of backports done in downstream kernels of distributions, so checking the kernel version doesn't work. So maybe it's just overkill to fix such a warning? I mean if the detection would be wrong because of this, probably the driver build would fail. (In reply to Stefan Dirsch from comment #16) > Which warning do you mean exactly? The one shown in the bug title which is displayed every time the package is installed on the user's system and builds the kernel modules. This part of the conftest.sh ends up in the generated code, confuses gcc which fails to eliminate some dead code which then confuses objtool. Ok. BTW, there is no such warning when building against Tumbleweed, just against Leap 15.5 and 15.6. Check this https://build.suse.de/project/monitor/Proprietary:X11:Drivers nvidia-driver-G06 nvidia-gfxG05 nvidia-gfxG04 (this bug G04 = 390.157) All these are affected by this objtool warning on Leap. mabye one should figure out what's different in objtool for Tumbleweed? (In reply to Martin Jambor from comment #13) > GCC 13 (at least) does not automagically promote the array to const because > of the -flive-patching=inline-clone option (which disables the pure-const > pass). It is even natural, we need to be ready for some unknown live-patch > to possibly leak the address even when such a leak is not present in the > code at hand. Bah, right. I wonder if we should just get rid of the option eventually as upstream did and somehow detect dangerous changes in our tooling. Adding Petr for his awareness. Thanks a lot for the explanation. (In reply to Stefan Dirsch from comment #18) > Ok. BTW, there is no such warning when building against Tumbleweed, just > against Leap 15.5 and 15.6. Check this > > https://build.suse.de/project/monitor/Proprietary:X11:Drivers > > nvidia-driver-G06 > nvidia-gfxG05 > nvidia-gfxG04 (this bug G04 = 390.157) > > All these are affected by this objtool warning on Leap. mabye one should > figure out what's different in objtool for Tumbleweed? Upstream objtool reports the same warning on the object file so there is no difference. TW does not use the GCC option mentioned above so the object code is different. @Miroslav So I apply this patch and we're done? Then please assign this ticket to me. (Then I will also try to push this patch upstream) (In reply to Stefan Dirsch from comment #20) > @Miroslav So I apply this patch and we're done? Then please assign this > ticket to me. > > (Then I will also try to push this patch upstream) Yes, if you apply the patch, GCC will remove the dead code and everything will be all right from objtool perspective. Patch is now applied to G04, G05 and G06 packages. Closing. I'll inform nvidia by email about this patch and ticket. |