Bugzilla – Bug 84600
grub cannot load current STABLE kernel
Last modified: 2007-06-05 12:34:53 UTC
grub cannot load the current STABLE kernel, it reports Executable format not recognized. If I use a 9.3 kernel (with the same grub!), everything works fine.
Using the 9.3 kernel did not really help - now I get invalid compressed format (err=1) System halted
Adrian mentioned that using the 9.3 kernel worked for him - and he only had this problem on one machine, the other worked fine.
yes, I do run a full-i386 grub with the 9.3 kernel rpm atm.
I've done minimal X11 install successfully - even with STABLE kernel.
I had to downgrade to the 9.3 grub - and then it works. grub seems to be broken.
Indeed it is. On my test machine it doesn't even reach stage2 (stuck in stage1.5, Error 16) Up/Downgrading kernels _might_ be, independent of the actual kernel file contents, a game of chance.
Downgrading to an older grub version shows less problems - but still some. Either our gcc4 patches are broken, not complete - or we have a grub miscompilation.
Does this happen on anything else than reiserfs on your test setups?
I didn't test anything else.
I didn't test anything else either.
Same with me, only reiser.
Then this narrows down to the grub's reiserfs code. Building with gcc-3.3.5 makes no difference it seems. The linux kernel bug of disjoint caches keeps complicating things, BTW.
I have the same problem with grub while booting from reiserfs. The grub command line doesn't see /boot at all. Even if the dir is there in the filesystem. I've dongraded to the grub of 9.3.
Correction: working around the kernel bug, I can now clearly identify this as gcc4 miscompilation. I Comment#12 there were leftovers in the /dev/hda buffer cache.
Created attachment 38248 [details] preprocessed fsys_reiserfs, gcc 3 and 4 commands and resulting .o files
So here is the test case. Have a look at the variable "depth" in funtion "next_key()". GCC3 stores it in %ebx, and does increment, array access and test properly, addresses 0x3b9 to 0x3bf. GCC4 happens to store it in %edi, I think, because this is initialized with 1 and incremented. The rest however is totally bogus (addresses 0x6cf to 0x6d7).
Looking deeper into the generated code, it seems gcc4 has heavily shifted around the expressions. But whatever it did, on sufficiently populated ReiserFSes, it doesn't work; sorry I cannot make an attachment with a sample image. Hopefully the asm code is enough.
There are many functions in that .i file. Do we at least know _which_ one is miscompiled? Do we know if some compiler options (like removing -Os, or doing just -O0) works around this? To limit the effect of reordering basic blocks, you can use the option -fno-reorder-blocks, and to even limit the effects of scheduling also -fno-schedule-insns2 . This should make the functions better comparable.
Here are the typescripts that lead me to said conclusion. These were captured with debug enabled, in case that makes a difference. The first debug prints that differ are from next_key; both are on the same set of data (e.g.partition), both are reproducable.
Created attachment 38590 [details] 2 typescripts, from gcc-3 and gcc-4 generated code resp. preferably feed these 2 scripts to the diff command.
Steven or Zdenek, can either of you please look at this bug report and try to identify either the miscompilation or create a testcase which shows the misbehaviour without the need to boot every time (perhaps by writing a wrapper function calling the miscompiled one?)? I severely lack the time due to other more pressing issues.
just FYI: I tried patching grub with fedora rawhide's gcc4 patch (which is 4 times as big) but it didn't make a difference either. still same failure here.
Might be related to http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21985
Thorsten (or someone else), can you please show a sequence of commands you use to test if this works, or not, without rebooting the machine? I can't just debug this with just looking at the assembler. I see you use qemu in the typescript, so there must be a way. I want a HD boot image, the full commands to change the grub in it, and the relevant snippets for running qemu testing that newly built grub. This way we can also check if a fix for the above mentioned PR21985 fixes also this bug. Or at least I can look further at where exactly the miscompile is.
I've now looked at the assembler some more. I looked at all places, where INFO->next_key_nr[] is accessed in next_key() and search_stat(). The typescript indicates that this loop is repeated with GCC 4 (next_key:749): do { if (depth == INFO->tree_depth) { ... ; goto found; } depth++; } while (INFO->next_key_nr[depth] == 0); i.e. during the first iteration INFO->next_key_nr[depth] is zero, where with GCC-3 it's not (depth being 2). The corresponding loop in gcc-4 asm looks like so: 3a4: bb 01 00 00 00 mov $0x1,%ebx 3a9: 0f b7 05 1e e0 06 00 movzwl 0x6e01e,%eax 3b0: ba 00 e0 06 00 mov $0x6e000,%edx 3b5: 39 c3 cmp %eax,%ebx 3b7: 74 41 je 3fa <next_key+0x8c> 3b9: 43 inc %ebx 3ba: 83 7c 9a 44 00 cmpl $0x0,0x44(%edx,%ebx,4) 3bf: 74 e8 je 3a9 <next_key+0x3b> %ebx is depth (starting from 1), and the code looks correct. Address 0x6e000 is INFO, the ->tree_depth member at offset 0x1e, and the ->next_key_nr[] array at offset 0x44. So this loop itself is not miscompiled, ergo I think next_key_nr is really 0 (wrongly so!) at that point. I looked at the places where this array is changed. First some more code in next_key: do { int nr_item = BLOCKHEAD (cache)->blk_nr_item; int key_nr = INFO->next_key_nr[depth]++; if (key_nr == nr_item) INFO->next_key_nr[depth] = 0; ... --depth ... } while (depth > DISK_LEAF_NODE_LEVEL); Here it can be incremented or set to zero depending on other conditions. The assembler looks like so: 409: be 00 e0 06 00 mov $0x6e000,%esi 40e: 0f b7 4f 02 movzwl 0x2(%edi),%ecx 412: 8b 54 9e 44 mov 0x44(%esi,%ebx,4),%edx 416: 8d 42 01 lea 0x1(%edx),%eax 419: 39 ca cmp %ecx,%edx 41b: 89 44 9e 44 mov %eax,0x44(%esi,%ebx,4) 41f: 75 08 jne 429 <next_key+0xbb> 421: c7 44 9e 44 00 00 00 movl $0x0,0x44(%esi,%ebx,4) 428: 00 The above is the code up to the first if, and it's implemented correctly. If the array elements _before_ incrementing is equal to nr_item the array element will be zeroed. 429: c1 e1 04 shl $0x4,%ecx 42c: 4b dec %ebx .... 443: 83 fb 01 cmp $0x1,%ebx 446: 7f c1 jg 409 <next_key+0x9b> Rest of loop also is correct. Now the only other place, where next_key_nr[] is changed is in search_stat(): INFO->next_key_nr[depth] = (i == nr_item) ? 0 : i+1; cache = read_tree_node (DC (cache)[i].dc_block_number, --depth); This corresponds to (%ebx is 'i', %ecx is 'nr_item' and %esi is 'depth'): 2b6: 31 d2 xor %edx,%edx 2b8: 39 cb cmp %ecx,%ebx 2ba: 74 03 je 2bf <search_stat+0x59> 2bc: 8d 53 01 lea 0x1(%ebx),%edx 2bf: c1 e1 04 shl $0x4,%ecx 2c2: b8 00 e0 06 00 mov $0x6e000,%eax 2c7: 89 54 b0 44 mov %edx,0x44(%eax,%esi,4) This correctly implements the ?: operator, the rest is the decrement of depth and the call: 2cb: 8d 04 d9 lea (%ecx,%ebx,8),%eax 2ce: 4e dec %esi 2cf: 89 f2 mov %esi,%edx 2d1: 8b 44 38 18 mov 0x18(%eax,%edi,1),%eax 2d5: e8 0f ff ff ff call 1e9 <read_tree_node> So, I really need a way to test various theories at runtime by changing and instrumenting grub itself. Hence to make progress it's mandatory that we have an easy way to get a testing environment for grub.
Crap! It's cd-dce which eliminates necessary instructions. I've found a way to test this with qemu on willimas, and instrumented the code some more. The instrumented version exhibits a problem in t59.cddce in this code in search_stats: INFO->next_key_nr[depth] = (i == nr_item) ? 0 : i+1; printf (" CCC depth=%d nr=%d nr2=%d\n", depth, INFO->next_key_nr[depth], INFO->next_key_nr[2]); When running this the output will be e.g.: CCC depth=2 nr=148 nr2=0 Note that depth is 2, and hence the last two numbers should be the same (148). That they are not is because the store to next_key_nr[depth] in front of the printf() is deleted. Like shown in comment #25 this is not the immediate case when no instrumentation code is there, but I bet that some other store will be deleted for the same reasons. Anyway for the above code in question we have this in .t58.redphi3: D.3338_120 = 450560B; ivtmp.234_121 = &450560B->next_key_nr[depth_13]; ..... # iftmp.32_6 = PHI <iftmp.32_96(14), 0(13)>; <L11>:; D.3339_122 = (unsigned int *) ivtmp.234_118; *D.3339_122 = iftmp.32_6; D.2280_72 = 450560B->next_key_nr[2]; grub_printf (&" CCC depth=%d nr=%d nr2=%d\n"[0], depth_60, iftmp.32_6, D.2280_72); iftmp.32 is the result of the ?: operator (i.e. 0 or i+1) which is stored at *D.3339, which holds the address of 450560B->next_key_nr[depth]. Fine so far. (For depth==2 this means that 450560B->next_key_nr[2] == D.2280 and *D.3339 alias!). But now cd-dce comes and deletes the store: Deleting : D.3339_122 = (unsigned int *) ivtmp.234_118; Deleting : *D.3339_122 = iftmp.32_6; .... # iftmp.32_6 = PHI <iftmp.32_96(10), 0(9)>; <L11>:; D.2280_72 = 450560B->next_key_nr[2]; grub_printf (&" CCC depth=%d nr=%d nr2=%d\n"[0], depth_60, iftmp.32_6, D.2280_72); So, while iftmp.32 hold the correct temporary value (so it's printed correctly), D.2280 will be wrong, because the store which would have set it is now removed. From then everything goes downhill. I'll also attach my version of fsys_reiserfs.i It is to be compiled with -falign-jumps=1 -falign-loops=1 -falign-functions=1 -m32 -O2 -Os -fno-strict-aliasing -minline-all-stringops -fno-asynchronous-unwind-tables to see the problem.
Created attachment 39001 [details] preprocessed instrumented fsys_reiserfs.i showing the problem
If I should theorize, I would guess GCC thinks the move is unnecessary because of the strange "addresses" (literal numbers casted to pointer type and then dereferenced). This is of course non-conformant C, as the address space is accessed directly. I.e. it violates object and pointer rules of C (e.g. that a pointer has to point into a valid (sub-)object, which obviously can't be the case for a pointer like "(int*)450560", except if the object is "whole memory"). But that's exactly why this source is compiled with -fno-strict-aliasing. So the load and the store _should_ alias (and hence the store be necessary), but my theory is, that this is somehow not the case.
Okay, it's all analyzed and well. The problem was indeed a miscompilation, which in fact was already fixed no mainline, but was missing from the 4.0 branch (it's there now). The bug was http://gcc.gnu.org/PR21171 . Fixed package was submitted, and seems to fix the bug. At least the snapshot2 does not have the problems, and my testing with qemu also showed that grub was working.