|
Bugzilla – Full Text Bug Listing |
| Summary: | gcc miscompiles kpartx with -O2 or bigger. | ||
|---|---|---|---|
| Product: | [openSUSE] SUSE LINUX 10.0 | Reporter: | Carl-Daniel Hailfinger <kernel01> |
| Component: | Basesystem | Assignee: | Hannes Reinecke <hare> |
| Status: | RESOLVED FIXED | QA Contact: | E-mail List <qa-bugs> |
| Severity: | Major | ||
| Priority: | P5 - None | CC: | hare, lmb, matz |
| Version: | Beta 4 | ||
| Target Milestone: | --- | ||
| Hardware: | All | ||
| OS: | All | ||
| Whiteboard: | |||
| Found By: | Development | Services Priority: | |
| Business Priority: | Blocker: | --- | |
| Marketing QA Status: | --- | IT Deployment: | --- |
| Attachments: | hdimage.tar.bz2 (sparse image, needs 1 MB diskspace unpacked) | ||
|
Description
Carl-Daniel Hailfinger
2005-09-02 10:53:46 UTC
Bad news. The problem is not within kpartx, but a miscompilation. If I compile without RPM_OPT_FLAGS, it works. If I strip -O2 from RPM_OPT_FLAGS, it works. Summary: Flags | Result (none) | OK -O | OK -O2 | Miscompilation -O3 | Miscompilation carldani@piii-1000:~> gcc --version gcc (GCC) 4.0.2 20050826 (prerelease) (SUSE Linux) System is a PentiumIII with 1 GHz, fresh install of SUSE 10.0beta4 Michael, Andreas: Can one of you take that bug? Thanks! Could also be an aliasing bug. Does it work when compiled with -fno-strict-aliasing? It still doesn't work when using "-O2 -fno-strict-aliasing". Hm. -fno-unit-at-a-time seems to help. Just to be sure, if anyone expect me or some other compiler guy to have a look at this, then this is missing much information. For instance what package we are talking about. As this seems also to be somewhat hard to use (needing a HD image), this would require someone setting up a buildsystem, with the sources of kpartx, where one can see the fault. But don't expect that this is not a bug in kpartx. Michael, what do you need? A simple way to reproduce this miscompilation? kpartx is in package multipath-tools. Install the multipath-tools and the source rpm of multipath-tools. Download the tar.bz2 of the harddisk image I'll attach in bugzilla. Then do the following as root on a i386 machine running 10.0beta4: # tar xfjS hdimage.tar.bz2 # losetup -r /dev/loop0 harddisk.img # kpartx -l /dev/loop0 llseek error llseek error llseek error llseek error Expected output is: loop0p1 : 0 48132 /dev/loop0 63 loop0p5 : 0 80262 /dev/loop0 48258 loop0p6 : 0 96327 /dev/loop0 128583 loop0p7 : 0 64197 /dev/loop0 224973 loop0p8 : 0 112392 /dev/loop0 289233 # cd /usr/src/packages/ # rpmbuild -bp SPECS/multipath-tools.spec # cd BUILD/multipath-tools-0.4.4/ # make OPTFLAGS="-O2 -march=i586 -mtune=i686 -fmessage-length=0 -Wall -D_FORTIFY_SOURCE=2 -g -fno-unit-at-a-time" BUILD=glibc # kpartx/kpartx -l /dev/loop0 loop0p1 : 0 48132 /dev/loop0 63 loop0p5 : 0 80262 /dev/loop0 48258 loop0p6 : 0 96327 /dev/loop0 128583 loop0p7 : 0 64197 /dev/loop0 224973 loop0p8 : 0 112392 /dev/loop0 289233 # make OPTFLAGS="-O2 -march=i586 -mtune=i686 -fmessage-length=0 -Wall -D_FORTIFY_SOURCE=2 -g" BUILD=glibc # kpartx/kpartx -l /dev/loop0 llseek error llseek error llseek error llseek error Do you need any additional information? Have you tried compiling without _FORTIFY_SOURCE? Created attachment 48811 [details]
hdimage.tar.bz2 (sparse image, needs 1 MB diskspace unpacked)
I have tried without _FORTIFY_SOURCE and every other flag except -O2. Didn't help. # make OPTFLAGS="-O2 -fno-unit-at-a-time" BUILD=glibc will compile correctly. # make OPTFLAGS="-O2" BUILD=glibc will miscompile. I hope that is better for tracking it down. Argh. This code is horrible. Anyway, it was a bug in the source code,
which arguably is a bug in the kernel headers. The problem is that
kparts likes to define its own _llseek syscall via the syscall5 macro of
linux/unistd.h. But the assembler to which it expands doesn't capture all
effects it has, namely:
int _llseek (uint fd,ulong hi,ulong lo,long long * res,uint wh)
{
long __res;
__asm__ volatile ("int $0x80"
: "=a" (__res)
: "0" (140),"b" ((long)(fd)),
"c" ((long)(hi)), "d" ((long)(lo)),
"S" ((long)(res)),"D" ((long)(wh)));
The problem is that *res (where the asm simply sees "res") is written to.
But the asm doesn't describe this sideeffect, so when gcc sees this call:
if (_llseek (fd, in>>32, in & 0xffffffff, &out, 0) != 0
|| out != in)
_and_ inlines _llseek (containing the asm above, created by the syscall5
macro), which it only does with -O2 in unit-at-a-time mode by default (hence
-fno-inline also "fixes" this problem), then GCC doesn't see anymore that
"out" might have changed due to that call. So it regards it as unchanged,
and optimizes the later "out != in" into "1 != in" (as "1" was the initial
value of "out", which according to all GCC knows didn't change.
So, the lesson learned: Don't use the kernel headers in user space for
defining own syscall functions. Don't use the kernel headers at all,
if possible. In this case we have a perfectly fine lseek64 library function
which can be used, when one defines _LARGEFILE64_SOURCE.
Submitted fixed package removing this hack to STABLE. Thanks! However, does it still compile and work with klibc instead of glibc? klibc doesn't have a lseek64 function, resulting in this hack. IIRC every package which can also be linked against klibc has similar hacks in its source to work around the lack of lseek64. I was only concerned about this package, and it seems to build exclusively for glibc. If klibc doesn't provide this syscall, then this should simply be fixed in klibc. |