Bug 1196957 - AUDIT-1: frr: Misusing strdup leads to stack overflow in isis_nb_notifications.c
AUDIT-1: frr: Misusing strdup leads to stack overflow in isis_nb_notifications.c
Status: RESOLVED FIXED
Classification: Novell Products
Product: SUSE Security Incidents
Classification: Novell Products
Component: Audits
unspecified
Other Other
: P5 - None : Normal
: ---
Assigned To: Carlos López
Security Team bot
https://smash.suse.de/issue/324794/
:
Depends on:
Blocks: CVE-2019-25074 CVE-2022-37032
  Show dependency treegraph
 
Reported: 2022-03-10 11:06 UTC by Johannes Segitz
Modified: 2022-10-13 14:50 UTC (History)
4 users (show)

See Also:
Found By: Security Response Team
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Johannes Segitz 2022-03-10 11:06:56 UTC
Details in bnc#1196506. Would make sense to have another look here, sounds juicy
Comment 1 Carlos López 2022-07-26 11:56:04 UTC
Find below the report I sent to security@lists.frrouting.org:

# 0. Introduction

This report presents two security-relevant issues found in FRR during the SUSE auditing process. At SUSE, the security team regularly audits packages present in the openSUSE Tumbleweed rolling distribution. These audits can be triggered by a number of reasons; in this case, an existing bug report for CVE-2022-26126 and concerns raised by our package maintainer made us want to look deeper into the source code.

FRRouting is an open source multi-protocol routing software suite. It implements the logic to parse and process a multitude of routing protocols, including among others BGP, IS-IS, OSPF, RIP and VRRP.

The audit was done on FRR 8.1, which is the package version present in Tumbleweed. However, the findings reported are still relevant on the latest revision of the master branch at the time of writing (commit 1004137bf3).

# 1. Context

FRR is built using a modular design, where each of these modules represents a different protocol. The modules are largely independent and can be built separately; additionally there is some common core functionality in directories like `lib/` or `zebra/`. During this audit, we focused on fuzzing protocol-specific parsing logic, since it is the most likely to include remote bugs.

# 2. Findings

## 2.1. IS-IS: Memory leak in unpack_tlv_router_cap()

When FRR receives an IS-IS HELLO packet, it parses it and attempts to unpack the TLVs contained in it, resulting in a call trace that looks something like the following:

```
#0 unpack_tlvs() isisd/isis_tlvs.c
#1 isis_unpack_tlvs() in isisd/isis_tlvs.c
#2 process_hello() in isisd/isis_pdu.c
#3 isis_handle_pdu() in isisd/isis_pdu.c
```

Down the call stack, different TLV unpackers are called via ops-like structures holding function pointers, depending on the TLV type. One of these unpackers is `unpack_tlv_router_cap()`. Here, FRR allocates a new structure to hold the received information; however, if the transmitted length is greater than the actual amount of data sent, the function bails out.

```c
/* Functions related to TLV 242 Router Capability as per RFC7981 */
...
static int unpack_tlv_router_cap(...)
{
	...
	/* Allocate router cap structure and initialize SR Algorithms */
	rcap = XCALLOC(MTYPE_ISIS_TLV, sizeof(struct isis_router_cap));
	...
	while (...) {
		...
		if (length > STREAM_READABLE(s) || length > subtlv_len - 2) {
			stream_forward_getp(s, STREAM_READABLE(s));
			return 0;
		}
		...
	}
	...
	tlvs->router_cap = rcap;
	return 0;
}
```

Note that the pointer to the allocated structure (`rcap`) is leaked once the function returns with an error, since it is never freed. Normally, this structure would be freed via `isis_free_tlvs()` after the error bubbles up from `unpack_tlv_router_cap()` back to `process_hello()`:

```c
static int process_hello(...)
{
	...
	if (isis_unpack_tlvs(STREAM_READABLE(circuit->rcv_stream),
			     circuit->rcv_stream, &iih.tlvs, &error_log)) {
		/* error ... */
		goto out;
	}
	...
out:
	isis_free_tlvs(iih.tlvs);

	return retval;
}
```

```c
void isis_free_tlvs(struct isis_tlvs *tlvs) {
	...
	free_tlv_router_cap(tlvs->router_cap);
	...
}
```

As mentioned above, the pointer to the new structure (`rcap`) is never copied to `tlvs->router_cap` nor freed directly when the function bails out. This is the only place where this function returns early, so simply copying the pointer earlier suffices to fix the issue:

```diff
diff --git a/isisd/isis_tlvs.c b/isisd/isis_tlvs.c
index 11be3c3a7..36cb945cd 100644
--- a/isisd/isis_tlvs.c
+++ b/isisd/isis_tlvs.c
@@ -3610,6 +3610,8 @@ static int unpack_tlv_router_cap(enum isis_tlv_context context,
        for (int i = 0; i < SR_ALGORITHM_COUNT; i++)
                rcap->algo[i] = SR_ALGORITHM_UNSET;
 
+       tlvs->router_cap = rcap;
+
        /* Get Router ID and Flags */
        rcap->router_id.s_addr = stream_get_ipv4(s);
        rcap->flags = stream_getc(s);
@@ -3774,7 +3776,6 @@ static int unpack_tlv_router_cap(enum isis_tlv_context context,
                }
                subtlv_len = subtlv_len - length - 2;
        }
-       tlvs->router_cap = rcap;
        return 0;
 }
```

This issue could be exploited to exhaust the routing server's available memory, albeit slowly, eventually leading to denial of service.

## 2.2. BGP: Out of bounds read in bgp_capability_msg_parse()

When FRR receives a BGP capability message, the following call trace occurs when an attempt is made to parse it:

```
#0 bgp_capability_msg_parse() in bgpd/bgp_packet.c
#1 bgp_capability_receive() in bgpd/bgp_packet.c
#2 bgp_process_packet() in bgpd/bgp_packet.c
```

In `bgp_capability_msg_parse()`, an inproper calculation on the bounds of the available data is made.

```c
static int bgp_capability_msg_parse(struct peer *peer, uint8_t *pnt,
					bgp_size_t length)
{
	...
	end = pnt + length;

	while (pnt < end) {
		/* We need at least action, capability code and capability
		 * length. */
		if (pnt + 3 > end) {
			/* error ... */
		}
		...
		hdr = (struct capability_header *)(pnt + 1);

		...

		/* Capability length check. */
		if ((pnt + hdr->length + 3) > end) {
			/* error ... */
		}

		/* Fetch structure to the byte stream. */
		memcpy(&mpc, pnt + 3, sizeof(struct capability_mp_data));
		pnt += hdr->length + 3;

	}

	return BGP_PACKET_NOOP;
}
```

While the first length check is properly done, the second one relies on a value directly read from the packet without verification. `hdr->length` is used to check the remaining amount of data, but the actual amount read is `sizeof(struct capability_mp_data)`. This means that the size check can be bypassed, causing `memcpy()` to read out of bounds.

The correct bounds check would be the following:

```diff
diff --git a/bgpd/bgp_packet.c b/bgpd/bgp_packet.c
index 7613ccc7d..70ad647bf 100644
--- a/bgpd/bgp_packet.c
+++ b/bgpd/bgp_packet.c
@@ -2622,7 +2622,7 @@ static int bgp_capability_msg_parse(struct peer *peer, uint8_t *pnt,
                                peer->host, action, hdr->code, hdr->length);
 
                /* Capability length check. */
-               if ((pnt + hdr->length + 3) > end) {
+               if ((pnt + sizeof(struct capability_mp_data) + 3) > end) {
                        zlog_info("%s Capability length error", peer->host);
                        bgp_notify_send(peer, BGP_NOTIFY_CEASE,
                                        BGP_NOTIFY_SUBCODE_UNSPECIFIC);
@@ -2631,7 +2631,7 @@ static int bgp_capability_msg_parse(struct peer *peer, uint8_t *pnt,
 
                /* Fetch structure to the byte stream. */
                memcpy(&mpc, pnt + 3, sizeof(struct capability_mp_data));
-               pnt += hdr->length + 3;
+               pnt += sizeof(struct capability_mp_data) + 3;
 
                /* We know MP Capability Code. */
                if (hdr->code == CAPABILITY_CODE_MP) {

```

An alternative fix would be to refactor `bgp_capability_msg_parse()` to use the internal stream API (`lib/stream.c`) instead of directly accessing packet data.

This issue could be exploited to trigger a segmentation fault, leading to denial of service, or cause undefined behavior via corrupt fields in `mpc`.

## 2.3 Other minor issues

As mentioned in the introduction, the issue that made us want to look deeper was the string handling done in the the IS-IS submodule (CVE-2022-26126). This issue consisted in a call to `strdup()` via `yang_data_new()` on raw data coming from the network, which might not be null-terminated. The way to fix this was to use instead the newly added `yang_data_new_binary()`, which base64-encodes the data and null-terminates the result (see PR#10566). This new buffer is passed to SNMP hooks for further processing, and there does not seem to be any problematic processing there. However, consider that the also newly added helper function, `yang_dnode_get_binary_buf()`, does not null-terminate the decoded buffer. This leaves room for potential future bugs if a developer forgets this detail - this is especially easy to do since the output parameter for the helper function is of type `char *`.

# 3. Summary

The code quality in the project is quite high, evidenced by the use of abstractions like the internal stream API, which avoids several memory management pitfalls. However, having fuzzing (e.g. libFuzzer) integrated in the development process would greatly improve its safety. We found that, due to the nature of these routing protocols, state and parsing logic are entangled, and thus make it difficult for developers to build fuzzing harnesses around the more dangerous parts of the suite.
Comment 2 Carlos López 2022-08-01 10:17:28 UTC
Upstream acknowledged both issues. I requested CVE IDs for them:
 - CVE-2019-25074: A memory leak in the IS-IS daemon of FRRouting (FRR) before 8.4 may lead to server memory exhaustion. This occurs in unpack_tlv_router_cap in isisd/isis_tlvs.c.
 - CVE-2022-37032: An out-of-bounds read in the BGP daemon of FRRouting FRR before 8.4 may lead to information disclosure or denial of service. This occurs in bgp_capability_msg_parse in bgpd/bgp_packet.c.

Both fixed in the following PR:
https://github.com/FRRouting/frr/pull/11656
Comment 3 Carlos López 2022-10-13 14:50:34 UTC
Everything fixed on our side, closing.