Bug 1214249

Summary: rsync crashes when built with glibc-2.38
Product: [openSUSE] openSUSE Tumbleweed Reporter: Jiri Slaby <jslaby>
Component: BasesystemAssignee: David Anes <david.anes>
Status: RESOLVED FIXED QA Contact: E-mail List <qa-bugs>
Severity: Normal    
Priority: P5 - None CC: dimstar, jan.hubicka, matz, opensuse, pmonrealgonzalez, rguenther, schwab
Version: Current   
Target Milestone: ---   
Hardware: Other   
OS: Other   
See Also: https://github.com/WayneD/rsync/issues/511
Whiteboard:
Found By: --- Services Priority:
Business Priority: Blocker: ---
Marketing QA Status: --- IT Deployment: ---

Description Jiri Slaby 2023-08-14 11:05:02 UTC
rsync crashes in vim-plugins' build:
https://build.opensuse.org/package/live_build_log/editors/vim-plugins/openSUSE_Tumbleweed/x86_64

I believe it's due to glibc-2.38 update. If I update glibc to 2.38 only, rsync-3.2.7-3.1 does not crash.

As soon as I update to rsync-3.2.7-3.2 (I believe the one rebuilt against this very new glibc), I see:
$ rsync -FFXHav '--filter=merge global-rsync-filter' Align-37-43/ xxx
sending incremental file list
*** buffer overflow detected ***: terminated
rsync: connection unexpectedly closed (0 bytes received so far) [Receiver]
rsync error: error in rsync protocol data stream (code 12) at io.c(231) [Receiver=3.2.7]
Neúspěšně ukončen (SIGABRT) (core dumped [obraz paměti uložen])

gdb says:
> #3  0x00007f2a31226917 in __GI_abort () at abort.c:79
> #4  0x00007f2a312277e3 in __libc_message (fmt=fmt@entry=0x7f2a313b030c "*** %s ***: terminated\n") at ../sysdeps/posix/libc_fatal.c:150
> #5  0x00007f2a31327bdb in __GI___fortify_fail (msg=msg@entry=0x7f2a313b02f3 "buffer overflow detected") at fortify_fail.c:24
> #6  0x00007f2a31327506 in __GI___chk_fail () at chk_fail.c:28
> #7  0x00007f2a31329279 in __strlcpy_chk (s1=<optimized out>, s2=<optimized out>, n=<optimized out>, s1len=<optimized out>) at strlcpy_chk.c:28
> 27        if (__glibc_unlikely (s1len < n))
> 28          __chk_fail ();
> #8  0x0000559d0acf778a in strlcpy (__n=4096, __src=0x7ffece39ae20 "xslaby/pokus/Align-37-43/", __dest=0x559d0ad61886 <dirbuf.lto_priv+6> "")
>     at /usr/include/bits/string_fortified.h:156
> 156         return __strlcpy_chk (__dest, __src, __n, __glibc_objsize (__dest));

How does it come __glibc_objsize(dirbuf.lto_priv+6) is less than 4096?

> #9  setup_merge_file (mergelist_num=mergelist_num@entry=0, ex=ex@entry=0x559d0bf84b40, lp=lp@entry=0x559d0bf84b90) at /usr/src/debug/rsync-3.2.7/exclude.c:737
737                     strlcpy(y, save, MAXPATHLEN);
> #10 0x0000559d0acf7d94 in push_local_filters (dir=dir@entry=0x7ffece39c000 ".", dirlen=dirlen@entry=1) at /usr/src/debug/rsync-3.2.7/exclude.c:806
> #11 0x0000559d0acf8259 in change_local_filter_dir (dname=0x7ffece39c000 ".", dlen=1, dir_depth=0) at /usr/src/debug/rsync-3.2.7/exclude.c:899
> #12 0x0000559d0acef91c in send_file_list (f=4, argc=0, argv=0x559d0bf84898) at /usr/src/debug/rsync-3.2.7/flist.c:2453
> #13 0x0000559d0ad07d4b in client_run (f_in=f_in@entry=5, f_out=f_out@entry=4, pid=pid@entry=6659, argc=argc@entry=1, argv=argv@entry=0x559d0bf84890)
>     at /usr/src/debug/rsync-3.2.7/main.c:1315
> #14 0x0000559d0ace2bdb in start_client (argv=0x559d0bf84890, argc=1) at /usr/src/debug/rsync-3.2.7/main.c:1613
> #15 main (argc=<optimized out>, argv=<optimized out>) at /usr/src/debug/rsync-3.2.7/main.c:1873
Comment 1 Jiri Slaby 2023-08-14 11:08:08 UTC
(In reply to Jiri Slaby from comment #0)
> rsync crashes in vim-plugins' build:
> https://build.opensuse.org/package/live_build_log/editors/vim-plugins/
> openSUSE_Tumbleweed/x86_64
> 
> I believe it's due to glibc-2.38 update. If I update glibc to 2.38 only,
> rsync-3.2.7-3.1 does not crash.
> 
> As soon as I update to rsync-3.2.7-3.2 (I believe the one rebuilt against
> this very new glibc), I see:
> $ rsync -FFXHav '--filter=merge global-rsync-filter' Align-37-43/ xxx
> sending incremental file list
> *** buffer overflow detected ***: terminated
> rsync: connection unexpectedly closed (0 bytes received so far) [Receiver]
> rsync error: error in rsync protocol data stream (code 12) at io.c(231)
> [Receiver=3.2.7]
> Neúspěšně ukončen (SIGABRT) (core dumped [obraz paměti uložen])
> 
> gdb says:
> > #3  0x00007f2a31226917 in __GI_abort () at abort.c:79
> > #4  0x00007f2a312277e3 in __libc_message (fmt=fmt@entry=0x7f2a313b030c "*** %s ***: terminated\n") at ../sysdeps/posix/libc_fatal.c:150
> > #5  0x00007f2a31327bdb in __GI___fortify_fail (msg=msg@entry=0x7f2a313b02f3 "buffer overflow detected") at fortify_fail.c:24
> > #6  0x00007f2a31327506 in __GI___chk_fail () at chk_fail.c:28
> > #7  0x00007f2a31329279 in __strlcpy_chk (s1=<optimized out>, s2=<optimized out>, n=<optimized out>, s1len=<optimized out>) at strlcpy_chk.c:28
> > 27        if (__glibc_unlikely (s1len < n))
> > 28          __chk_fail ();
> > #8  0x0000559d0acf778a in strlcpy (__n=4096, __src=0x7ffece39ae20 "xslaby/pokus/Align-37-43/", __dest=0x559d0ad61886 <dirbuf.lto_priv+6> "")
> >     at /usr/include/bits/string_fortified.h:156
> > 156         return __strlcpy_chk (__dest, __src, __n, __glibc_objsize (__dest));
> 
> How does it come __glibc_objsize(dirbuf.lto_priv+6) is less than 4096?

Provided it comes from parse_merge_name():
602             static char buf[MAXPATHLEN];
...
653             return buf;

Is it an LTO fallout? Or gcc 13.2 fallout?

> > #9  setup_merge_file (mergelist_num=mergelist_num@entry=0, ex=ex@entry=0x559d0bf84b40, lp=lp@entry=0x559d0bf84b90) at /usr/src/debug/rsync-3.2.7/exclude.c:737
> 737                     strlcpy(y, save, MAXPATHLEN);
> > #10 0x0000559d0acf7d94 in push_local_filters (dir=dir@entry=0x7ffece39c000 ".", dirlen=dirlen@entry=1) at /usr/src/debug/rsync-3.2.7/exclude.c:806
> > #11 0x0000559d0acf8259 in change_local_filter_dir (dname=0x7ffece39c000 ".", dlen=1, dir_depth=0) at /usr/src/debug/rsync-3.2.7/exclude.c:899
> > #12 0x0000559d0acef91c in send_file_list (f=4, argc=0, argv=0x559d0bf84898) at /usr/src/debug/rsync-3.2.7/flist.c:2453
> > #13 0x0000559d0ad07d4b in client_run (f_in=f_in@entry=5, f_out=f_out@entry=4, pid=pid@entry=6659, argc=argc@entry=1, argv=argv@entry=0x559d0bf84890)
> >     at /usr/src/debug/rsync-3.2.7/main.c:1315
> > #14 0x0000559d0ace2bdb in start_client (argv=0x559d0bf84890, argc=1) at /usr/src/debug/rsync-3.2.7/main.c:1613
> > #15 main (argc=<optimized out>, argv=<optimized out>) at /usr/src/debug/rsync-3.2.7/main.c:1873
Comment 2 Jiri Slaby 2023-08-16 09:06:45 UTC
(In reply to Jiri Slaby from comment #1)
> > How does it come __glibc_objsize(dirbuf.lto_priv+6) is less than 4096?
> 
> Provided it comes from parse_merge_name():
> 602             static char buf[MAXPATHLEN];
> ...
> 653             return buf;
> 
> Is it an LTO fallout? Or gcc 13.2 fallout?

I cannot reproduce with 2 simple units:
fun.c:1:#include <limits.h>
fun.c:2:
fun.c:3:__attribute__((noinline)) char *fun()
fun.c:4:{
fun.c:5:        static char buf[PATH_MAX];
fun.c:6:
fun.c:7:        return buf;
fun.c:8:}
fun.c:9:
fun.h:1:#pragma once
fun.h:2:
fun.h:3:char *fun();
str.c:1:#include <err.h>
str.c:2:#include <errno.h>
str.c:3:#include <limits.h>
str.c:4:#include <stdio.h>
str.c:5:#include <stdlib.h>
str.c:6:#include <string.h>
str.c:7:
str.c:8:#include "fun.h"
str.c:9:
str.c:10:int main(int, char **argv)
str.c:11:{
str.c:12:       char *buf = fun();
str.c:13:
str.c:14:       strlcpy(buf, argv[0], PATH_MAX);
str.c:15:
str.c:16:       printf("%s\n", buf);
str.c:17:
str.c:18:       return 0;
str.c:19:}

Maybe I would need to put them into 2 separated slices? LTO seems to propagate the constant (PATH_MAX) nicely to main and eliminates strlcpy_chk() completely above.
Comment 3 Dominique Leuenberger 2023-08-17 09:07:15 UTC
(In reply to Jiri Slaby from comment #1)
> 
> Provided it comes from parse_merge_name():
> 602             static char buf[MAXPATHLEN];
> ...
> 653             return buf;
> 

according to valgrind it comes from setup_merge_file (exclude.c:737)

*** buffer overflow detected ***: terminated
==17792== 
==17792== Process terminating with default action of signal 6 (SIGABRT): dumping core
==17792==    at 0x4EF1E4C: __pthread_kill_implementation (pthread_kill.c:44)
==17792==    by 0x4E9F125: raise (raise.c:26)
==17792==    by 0x4E86916: abort (abort.c:79)
==17792==    by 0x4E877E2: __libc_message.cold (libc_fatal.c:150)
==17792==    by 0x4F87BDA: __fortify_fail (fortify_fail.c:24)
==17792==    by 0x4F87505: __chk_fail (chk_fail.c:28)
==17792==    by 0x4F89278: __strlcpy_chk (strlcpy_chk.c:28)
==17792==    by 0x128789: UnknownInlinedFun (string_fortified.h:156)
==17792==    by 0x128789: setup_merge_file (exclude.c:737)
==17792==    by 0x128D93: push_local_filters (exclude.c:806)
==17792==    by 0x129258: change_local_filter_dir (exclude.c:899)
==17792==    by 0x12091B: send_file_list (flist.c:2453)
==17792==    by 0x138D4A: client_run (main.c:1315)
Comment 4 Jiri Slaby 2023-08-17 09:10:49 UTC
(In reply to Dominique Leuenberger from comment #3)
> (In reply to Jiri Slaby from comment #1)
> > 
> > Provided it comes from parse_merge_name():
> > 602             static char buf[MAXPATHLEN];
> > ...
> > 653             return buf;
> > 
> 
> according to valgrind it comes from setup_merge_file (exclude.c:737)

Which obtains the buffer from parse_merge_name() above -- see line 693 -- if I am looking correctly?
Comment 5 Jiri Slaby 2023-08-17 09:16:17 UTC
(In reply to Jiri Slaby from comment #4)
> (In reply to Dominique Leuenberger from comment #3)
> > (In reply to Jiri Slaby from comment #1)
> > > 
> > > Provided it comes from parse_merge_name():
> > > 602             static char buf[MAXPATHLEN];
> > > ...
> > > 653             return buf;
> > > 
> > 
> > according to valgrind it comes from setup_merge_file (exclude.c:737)
> 
> Which obtains the buffer from parse_merge_name() above -- see line 693 -- if
> I am looking correctly?

But you gave me a hint. I didn't study the code in between the crash and parse_merge_name() before. Now I have. Well, the buffer pointer might be incremented and that means the MAXPATHLEN in
  strlcpy(y, save, MAXPATHLEN);
is not be correct in that case. I.e. 'y' might not equal 'buf' from the above and can point to the middle of 'buf', right?
Comment 6 Richard Biener 2023-08-17 09:23:24 UTC
Frame #9 points to

        while (*y) {
                char save[MAXPATHLEN];
                strlcpy(save, y, MAXPATHLEN);
                *y = '\0';
                dirbuf_len = y - dirbuf;
                strlcpy(x, ex->pattern, MAXPATHLEN - (x - buf));
                parse_filter_file(lp, buf, ex, XFLG_ANCHORED2ABS);
                if (ex->rflags & FILTRULE_NO_INHERIT) {
                        /* Free the undesired rules to clean up any per-dir
                         * mergelists they defined.  Otherwise pop_local_filters
                         * may crash trying to restore nonexistent state for
                         * those mergelists. */
                        free_filters(lp->head);
                        lp->head = NULL;
                }
                lp->tail = NULL; 
                strlcpy(y, save, MAXPATHLEN);
^^^

that would be an automatic var 'save'.  We're using _FORTIFY_SOURCE=3 and
thus __builtin_dynamic_object_size (__o, 1).  Yes, 'y' is initially
from parse_merge_name, either based on 'buf' or the incoming argument
'merge_file'.  But 'y' is also incremented in a lot of places and
while 'buf' is of size MAXPATAHLEN I don't see how it can reliably still
have MAXPATHLEN space here.

Did you check what s1len actually is?

But yes, __builtin_dynamic_object size does have some bugs but none
I am aware of for this specifically.

It would be nice to know whether the issue reproduces when rsync is not
built with -flto.

Do you have an actual testcase?  Doing what you quote fails due to missing
global-rsync-filter (and missing source files, possibly not needed).

> ./rsync -FFXHav '--filter=merge global-rsync-filter' Align-37-43/ xxx
rsync: [client] failed to open exclude file global-rsync-filter: No such file or directory (2)
rsync error: error in file IO (code 11) at exclude.c(1481) [client=3.2.7]

I'll note the fortification will fail once the computed length is less
than the advertised, thus when GCC figures out _anything_ and the
code actually incremented 'p'.  The code is definitely fragile, possibly
not expecting fortification ...

For example we have

        if (*x)
                y += strlen(y); /* nope -- skip the scan */

and since strlen is actually computed we now know symbolically the
space left in 'y' (that it's bound by MAXPATHLEN minus strlen(y) at that
point).  That we can symbolically compute that means there's enough
chance to run into this bug of rsync.

IMHO the

                strlcpy(y, save, MAXPATHLEN);

is simply not correct (without proving that GCC computes a correct value
here, but that should be possible to verify debugging the testcase).
Comment 7 Richard Biener 2023-08-17 09:26:02 UTC
I'll note again that __strlcpy_chk does

  if (__glibc_unlikely (s1len < n))
    __chk_fail ();

  return __strlcpy (s1, s2, n);

so it doesn't check whether 's2' might be shorter than s1len and thus
the call is actually safe (it might be safe to call strnlen (s2, n)).
Comment 8 Dominique Leuenberger 2023-08-17 09:30:38 UTC
(In reply to Richard Biener from comment #6)
> 
> It would be nice to know whether the issue reproduces when rsync is not
> built with -flto.

That I tried locally: changed the spec to add _clt_clfags=%nil, rebult and ran 

rsync -Fah . /tmp
=> > rsync -Fah . /tmp
*** buffer overflow detected ***: terminated

so, same issue observed without using to to build
Comment 9 Jiri Slaby 2023-08-17 09:32:53 UTC
(In reply to Richard Biener from comment #6)
> Did you check what s1len actually is?

Unfortunately "optimized out". It's held only in a register which was apparently overwritten in further frames.

> Do you have an actual testcase?  Doing what you quote fails due to missing
> global-rsync-filter (and missing source files, possibly not needed).

Sorry, it's:
https://build.opensuse.org/package/view_file/editors/vim-plugins/global-rsync-filter?expand=1

> IMHO the
> 
>                 strlcpy(y, save, MAXPATHLEN);
> 
> is simply not correct (without proving that GCC computes a correct value
> here, but that should be possible to verify debugging the testcase).

Yeah, agreed, I created:
https://github.com/WayneD/rsync/issues/511
Comment 10 Jiri Slaby 2023-08-18 06:43:23 UTC
What about https://build.opensuse.org/request/show/1104626 ?
Comment 11 Richard Biener 2023-08-18 07:52:42 UTC
(In reply to Jiri Slaby from comment #10)
> What about https://build.opensuse.org/request/show/1104626 ?

I can see how this avoids triggering the fortification check when it will
turn out harmless.  Of course the underlying problem is still present.

Note doing strlcpy (dest, src, strlen (src)) is effectively making the
use of strlcpy pointless, so a more honest patch would have replaced it
with strcpy (dest, src) instead ...
Comment 12 Michael Matz 2023-08-18 13:08:23 UTC
The patch is wrong.  The third argument to strlcpy is supposed to contain the size
of the _destination_ buffer (pointer to that in first argument).  As such it has
no inherent relations to the source string at all.  Clearly the author of that code
wanted to prevent buffer overruns, otherwise he hadn't used strlcpy, so it's better
to fix that for good.

You could for instance use this pattern to do the right thing:

char buf[MAXPATHLEN], *bufend = buf + MAXPATHLEN, *y = buf;
... stuff that manipulates y ...
strlcpy (y, input, bufend - y);
Comment 13 David Anes 2023-09-06 08:13:17 UTC
*** Bug 1214616 has been marked as a duplicate of this bug. ***
Comment 15 David Anes 2023-11-23 14:30:27 UTC
(In reply to Michael Matz from comment #12)
> The patch is wrong. (...)

I was about to close this one as fixed, but then I saw your comment.

I see a patch was already sent some time ago by Dirk (and sent along other fixes later by me) in the following request:
https://build.opensuse.org/request/show/1109241

Is the current version of the patch the one that's wrong? Can you please review and adjust it if needed? Thanks.
Comment 16 Michael Matz 2023-11-23 15:25:38 UTC
The patch from SR#1109241 looks fine at a cursory look, it tracks the bytes left
in the destination buffer for the second strlcpy call.  (FWIW, the initial patch
in SR#1104626 was the one that was wrong, but it's now lost in the mist of time :) )  So, I think it's all fine now, regarding this bug report here.
Comment 17 David Anes 2023-11-23 16:38:26 UTC
(In reply to Michael Matz from comment #16)
> The patch from SR#1109241 looks fine at a cursory look, it tracks the bytes
> left
> in the destination buffer for the second strlcpy call.  (FWIW, the initial
> patch
> in SR#1104626 was the one that was wrong, but it's now lost in the mist of
> time :) )  So, I think it's all fine now, regarding this bug report here.

Awesome, thanks for such a quick reply. Therefore I'm closing this one a fixed.