Bug 1193801 - AUDIT-TRACKER: connman: closer look at DNS protocol handling
AUDIT-TRACKER: connman: closer look at DNS protocol handling
Status: RESOLVED FIXED
Classification: Novell Products
Product: SUSE Security Incidents
Classification: Novell Products
Component: Audits
unspecified
Other Other
: P5 - None : Normal
: ---
Assigned To: Matthias Gerstner
Security Team bot
https://smash.suse.de/issue/317786/
:
Depends on: CVE-2022-23097 CVE-2022-23096 CVE-2022-23098
Blocks:
  Show dependency treegraph
 
Reported: 2021-12-16 09:29 UTC by Matthias Gerstner
Modified: 2022-03-29 12:21 UTC (History)
3 users (show)

See Also:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Matthias Gerstner 2021-12-16 09:29:12 UTC
For my preparation of the buffer overflow training I researched CVE-2021-33833
from bug 1186869 found in connman. From looking at the code it seems to me
there could still linger more issues in there.

When there are spare cycles we should have a broader look.
Comment 1 Matthias Gerstner 2021-12-27 12:20:37 UTC
I will work on this right away, shouldn't be too long to finish.
Comment 2 Matthias Gerstner 2021-12-30 11:07:23 UTC
It looks like my first impression did not betray me. I found a couple of DoS /
undefined data access issues. Nothing as severe as the original CVE but still
noteworthy.

Daniel Wagner at SUSE is the upstream maintainer so I will involve him soon by
mail and provide him the findings.
Comment 3 Matthias Gerstner 2021-12-30 11:41:32 UTC
This is the full report I just shared with Daniel by mail:

I recently researched CVE-2021-33833, a stack buffer overflow and possible
remote code execution in connman's dnsproxy component. This research made me
curious, because the DNS handling code in there still looks complex and shaky,
so I had a closer look at it.

I have been looking into the most recent upstream version tag 1.40. For
`dnsproxy.c` there are only minimal differences to the current master branch
so all should apply to that as well.

I found a couple of invalid memory read accesses that could possibly lead to
remote DoS, remote information leaks or otherwise undefined behaviour.
Furthermore I found a way to trigger a 100 % CPU loop. The following sections
explain the findings in detail. All of these affect the processing of DNS
server replies, i.e. they can be triggered by malicious remote DNS servers,
similar to CVE-2021-33833.

1) Possibly invalid memory reference in `strnlen` call in `forward_dns_reply()`
===============================================================================

In `forward_dns_reply()` in `dnsproxy.c:2004` the following `strnlen`
invocation occurs:

```
host_len = *ptr;
if (host_len > 0)
	domain_len = strnlen(ptr + 1 + host_len,
			reply_len - header_len);
```

This function does not actually check whether there are enough `reply_len`
bytes at all to even retrieve a valid `host_len` from where `ptr` is pointing
to.

The maximum size calculation `reply_len - header_len` is not necessarily
correct. If `reply_len` is smaller than `header_len`, which can be the case
for the TCP case (see issue 2), then `reply_len - header_len` can even become
negative i.e. an underflow wrap occurs.

`host_len` can be up to 255 and is attacker controlled. This means even for
the UDP case, where the calling function does make sure that at least
`header_len` bytes are available, the `ptr + 1 + host_len` expression can
point up to 257 bytes outside of valid packet data.

For the UDP case this means that data present in the stack based buffer in
function `udp_server_event` in `dnsproxy.c:2243` will be accessed that could
contain data from previous DNS replies or stack management data (pointer
addresses, stack canary values).

For the TCP case, where a heap based buffer of the exact receive size is used
(see `dnsproxy.c:2417`) this means that a heap out of bounds read access is
performed that could even crash Connman. In my exploit tests I did not manage
to cause a crash but this depends strongly on the heap allocator and
optimization levels etc.

So the possible effects of this vulnerability are:

- undefined behaviour of the domain name uncompress / recompress handling
  based on undefined data.
- remote denial of service especially in the TCP case
- an information leak, especially in the UDP case where a stack based buffer
  is used. If an attacker controls both the DNS server and the DNS client, or
  the DNS client and can spoof DNS replies on the network, then that attacker
  could receive stack management data on the client side. This is because the
  `forward_dns_reply` function has large degrees of freedom in the dns name
  uncompress / recompress handling and will forward even undefined data to the
  DNS client.

I suggest to diligently check for sufficient input data in the
`forward_dns_reply()` function to avoid any out of bound accesses.

2) TCP Receive Path does not Check for Presence of Sufficient Header Data
=========================================================================

In the UDP server reply code path a literal size check is performed to make
sure that at least a complete DNS header has been received
(`dnsproxy.c:2257`). This check is missing in the TCP server reply code path
as can be seen in:

- dnsproxy.c:2417: here a heap buffer of the exact claimed packet size is
  allocated (this can cause up to a 64 KB buffer to be allocated by the way).
- dnsproxy.c:2444: here the now completely received packet is passed on to
  `forward_dns_reply()` without any further minimum package size checks.

This means that a malicious DNS server can send a minimum reply message
consisting of four bytes (claiming a TCP message of two bytes size, and then
supplying the correct request ID in the next two bytes). `forward_dns_reply()`
will then be called with the `reply_len` parameter being set to 4. There are
no further input size checks in `forward_dns_reply`:

```
int dns_id, sk, err, offset = protocol_offset(protocol);

if (offset < 0)
	return offset;

hdr = (void *)(reply + offset);
```

`offset` will be 2 for the TCP case. Thus `hdr` will point to the two valid
DNS ID bytes at the end of the heap allocated buffer. All further protocol
processing code in this function will operate on undefined out of bounds heap
data.

From here on the characteristics of this vulnerability are similar to issue
1). There can be undefined behaviour and a possiblity for a remote DoS, a heap
based information leak maybe.

To fix this the `tcp_server_event()` function should not call
`forward_dns_reply()` if no complete header has been received. Furthermore an
upper TCP packet size limit (`TCP_MAX_BUF_LEN` already exists) should be
enforced also. The suggestions for issue 1) regarding the
`forward_dns_reply()` function apply here as well.

A sample run with `valgrind` acting against a malicious DNS server showed
output like this:

    Invalid read of size 2
       at 0x487311: forward_dns_reply.isra.0 (dnsproxy.c:2153)
       by 0x487EBD: tcp_server_event (dnsproxy.c:2447)
       by 0x48C0D9E: g_main_context_dispatch (in
/usr/lib64/libglib-2.0.so.0.7000.2)
       by 0x48C1127: ??? (in /usr/lib64/libglib-2.0.so.0.7000.2)
       by 0x48C1412: g_main_loop_run (in /usr/lib64/libglib-2.0.so.0.7000.2)
       by 0x41217B: main (main.c:932)
     Address 0x587c88c is 2 bytes after a block of size 10 alloc'd
       at 0x48437B5: malloc (in
/usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
       by 0x48C719F: g_try_malloc (in /usr/lib64/libglib-2.0.so.0.7000.2)
       by 0x4881A9: tcp_server_event (dnsproxy.c:2420)
       by 0x48C0D9E: g_main_context_dispatch (in
/usr/lib64/libglib-2.0.so.0.7000.2)
       by 0x48C1127: ??? (in /usr/lib64/libglib-2.0.so.0.7000.2)
       by 0x48C1412: g_main_loop_run (in /usr/lib64/libglib-2.0.so.0.7000.2)
       by 0x41217B: main (main.c:932)

3) TCP Receive Path Triggers 100 % CPU loop if DNS server does not Send Back Data
=================================================================================

In the TCP server reply case, if the server simply does not send back any data
at all but keeps the socket connection open, then Connman enters a 100 % CPU
loop. This is probably related to the event watch configuration in
`dnsproxy.c:2523`, where also `G_IO_OUT` is set, meaning that the event loop
will wake up when data can be written to the TCP connection, which is true all
the time.

Allthough there is a 30 second timeout configured `tcp_idle_timeout()`, the
100 % CPU loop does not seem to stop after that time. I did not further
investigate the reasons for this.

To fix this the watch condition could be altered after the logic in
`dnsproxy.c:2318` has run once (i.e. after the server is connected). Removing
the `G_IO_OUT` bit after this should then prevent the 100 % CPU loop.

4) TCP DNS Operation is Broken due to Bad TCP Length Header
===========================================================

This is only a functional issue. It seems the TCP based DNS operation never
really worked, because in `forward_dns_reply()` the message content is
possibly modified due to the domain name uncompress logic. However, the TCP
length header is never adjusted to reflect this. The original DNS header
received from the server is plainly copied in `dnsproxy.c:2125` and never
adjusted after this.

This means that either, if the modified DNS message is shorter than the one
supplied by the server, that the DNS header length will be larger than what is
actually forwarded to the DNS client. The DNS client will wait for more data
that will never be supplied. Or, should the modified DNS message become
larger, then the DNS client will receive only a truncated message that will be
incomplete / corrupted. The attached patch seems to fix this. I tested this
simply using `host -T` as a simple TCP based DNS client.

5) Generally Worrying Code Quality / Suggestion to Refactor dnsproxy
====================================================================

This DNS proxy code does not seem to be very mature at the moment. I found the
code hard to follow given all the bytes bean counting, confusing variable
names (e.g. in the `forward_dns_reply()` and `uncompress()` context: `char
*ptr`, `char *uptr`, `char *uncompressed`, `char **uncompressed_ptr`) and
pretty long functions (e.g. `forward_dns_reply` having about 250 lines of
code,
pretty deeply nested blocks).

I recommend a refactoring of this whole compilation unit. Splitting it up into
better separated parts like the DNS client side communication, the DNS server
side communication and the actual DNS protocol parsing logic. Avoiding code
redundancy (e.g. `udp_listener_event()` and `tcp_listener_event()` have a lot
of identical code).

Protocol parsing in C is not the easiest of tasks but the uncompress() label
handling can certainly still be improved somewhat.

6) Miscellaneous
================

- In the UDP server code path there is no detection of truncated packets
  (`dnsproxy.c:2255`). As long as the protocol parsing is implemented
  correctly this shouldn't cause any trouble. Still it could be prudent to
  discard truncated messages right away.
Comment 6 Gianluca Gabrielli 2022-03-29 11:49:57 UTC
Hi Daniel,

did you backport the patch [0] to the following packages?
  - openSUSE:Backports:SLE-15-SP3/connman
  - openSUSE:Backports:SLE-15-SP4/connman
  - openSUSE:Factory/connman

[0] https://lore.kernel.org/connman/20220125090026.5108-3-wagi@monom.org/T/#m81ef1e357b6b2d3efd53f86d1cdcbfe9a37d8b3f
Comment 7 Daniel Wagner 2022-03-29 12:10:10 UTC
yes
Comment 8 Gianluca Gabrielli 2022-03-29 12:21:28 UTC
Thanks, closing.