Bugzilla – Bug 1166403
VUL-0: CVE-2020-1759: ceph: Nonce reuse in msgr2 secure mode
Last modified: 2022-09-19 13:05:53 UTC
Hi everyone,

I believe I see a problem with the secure mode of msgr2, which completely breaks both confidentiality and integrity aspects for long-lived sessions.

Listing src/msg/async/crypto_onwire.cc:

    struct nonce_t {
      std::uint32_t random_seq;  <---------
      std::uint64_t random_rest;
    } __attribute__((packed));

Encryption (nonce is tx_nonce):

    void AES128GCM_OnWireTxHandler::reset_tx_handler(
      std::initializer_list<std::uint32_t> update_size_sequence)
    {
      if(1 != EVP_EncryptInit_ex(ectx.get(), nullptr, nullptr, nullptr,
          reinterpret_cast<const unsigned char*>(&nonce))) {
        throw std::runtime_error("EVP_EncryptInit_ex failed");
      }
      ...
      ++nonce.random_seq;
    }

Decryption (nonce is rx_nonce):

    void AES128GCM_OnWireRxHandler::reset_rx_handler()
    {
      if(1 != EVP_DecryptInit_ex(ectx.get(), nullptr, nullptr, nullptr,
          reinterpret_cast<const unsigned char*>(&nonce))) {
        throw std::runtime_error("EVP_DecryptInit_ex failed");
      }
      ++nonce.random_seq;
    }

Initialization:

    if (auth_meta.is_mode_secure()) {
      ceph_assert_always(auth_meta.connection_secret.length() >= \
        sizeof(key_t) + 2 * sizeof(nonce_t));
      const char* secbuf = auth_meta.connection_secret.c_str();
      ...
      nonce_t rx_nonce;
      {
        ::memcpy(&rx_nonce, secbuf, sizeof(rx_nonce));
        secbuf += sizeof(rx_nonce);
      }

      nonce_t tx_nonce;
      {
        ::memcpy(&tx_nonce, secbuf, sizeof(tx_nonce));
        secbuf += sizeof(tx_nonce);
      }

So we have a 96-bit nonce which is initialized with a chunk of the session secret. Then, it is treated to consist of a 32-bit counter followed by a 64-bit fixed salt. It only takes 2**32 frames for the counter to repeat and the same nonce to be used with the same key.

Nonce reuse is catastrophic for GCM. After just a few nonces get used twice (not repeatedly, just twice!) the attacker can forge auth tags and potentially manipulate plaintext. A small random read workload can get into this state in under a day.
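To make the wrap-around concrete, here is a minimal sketch that models only the counter behaviour described above. It is not Ceph code: `ModelNonce`, `advance`, `kFramesUntilReuse` and `wrap_rate` are names made up for illustration, and the model ignores struct packing and the actual EVP calls.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative model of the quoted nonce layout (hypothetical type, not
// the Ceph nonce_t): a 32-bit per-frame counter plus a fixed 64-bit salt.
struct ModelNonce {
  std::uint32_t random_seq;   // incremented once per frame
  std::uint64_t random_rest;  // never changes after initialization
};

inline bool operator==(const ModelNonce& a, const ModelNonce& b) {
  return a.random_seq == b.random_seq && a.random_rest == b.random_rest;
}

// Number of frames after which the very first nonce value comes back.
constexpr std::uint64_t kFramesUntilReuse = std::uint64_t{1} << 32;

// Advance the nonce by `frames` frames, mirroring `++nonce.random_seq`.
// Unsigned arithmetic wraps modulo 2^32, so only the low 32 bits of
// `frames` have any effect on the result.
inline ModelNonce advance(ModelNonce n, std::uint64_t frames) {
  n.random_seq += static_cast<std::uint32_t>(frames);
  return n;
}

// Sustained frame rate (frames per second, rounded down) at which the
// counter wraps within `seconds`.
inline std::uint64_t wrap_rate(std::uint64_t seconds) {
  assert(seconds > 0);
  return kFramesUntilReuse / seconds;
}
```

In this model, `advance(n, kFramesUntilReuse) == n` holds for any starting value, and `wrap_rate(86400)` evaluates to 49710, i.e. a sustained rate of roughly 50k frames per second is enough to wrap the counter within one day.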
This was introduced in commit fe387e02b11d ("msg/async, v2: drop depedency on uint128_t. Clean up onwire crypto."), which effectively replaced a 96-bit counter with a 32-bit one. The comment referenced OpenVPN, which indeed seems to use a 32-bit counter, but GCM is only supported in TLS mode because it can force a TLS renegotiation in time.

A proper fix would probably require a RADOS feature bit. The new msgr2 supported/required_features won't work because they are checked before the connection mode is negotiated and we don't want to break the crc mode which isn't affected. I went with a workaround that just cuts the connection before the nonce repeats.

Additionally, it looks like there is an endianness issue there.

Attached are two patches, only very lightly tested at this point.

Thanks,
Ilya
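The workaround idea (cut the connection before the nonce repeats) could be sketched roughly as follows. This is a hypothetical illustration, not the attached patches: `GuardedNonce`, `next`, `encode`, `kMaxFrames` and the `limit` parameter are invented names, and the little-endian byte layout is an assumption. The explicit byte-wise encoding reflects one plausible reading of the endianness remark: incrementing a host-order integer laid over raw secret bytes produces different nonce bytes on little- and big-endian hosts.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Largest number of frames a 32-bit counter can cover without repeating.
constexpr std::uint64_t kMaxFrames = std::uint64_t{1} << 32;

// Hypothetical guard: hands out 96-bit nonces and refuses to continue
// once the frame budget is used up, so a nonce is never issued twice.
class GuardedNonce {
 public:
  using Bytes = std::array<std::uint8_t, 12>;

  GuardedNonce(std::uint32_t seq, std::uint64_t salt,
               std::uint64_t limit = kMaxFrames)
      : seq_(seq), salt_(salt), remaining_(limit) {
    assert(limit <= kMaxFrames);  // a larger limit would allow a repeat
  }

  // Returns the nonce for the next frame and advances the counter.
  // Throws when the budget is exhausted; the caller is expected to drop
  // the connection and negotiate a fresh session secret.
  Bytes next() {
    if (remaining_ == 0) {
      throw std::runtime_error("nonce space exhausted, reconnect required");
    }
    --remaining_;
    Bytes out = encode(seq_, salt_);
    ++seq_;  // wraps modulo 2^32; the budget prevents reuse after the wrap
    return out;
  }

  // Serialize with an explicit little-endian layout (an assumption) so
  // that both peers derive the same bytes regardless of host byte order.
  static Bytes encode(std::uint32_t seq, std::uint64_t salt) {
    Bytes out{};
    for (int i = 0; i < 4; ++i) {
      out[i] = static_cast<std::uint8_t>(seq >> (8 * i));
    }
    for (int i = 0; i < 8; ++i) {
      out[4 + i] = static_cast<std::uint8_t>(salt >> (8 * i));
    }
    return out;
  }

 private:
  std::uint32_t seq_;
  std::uint64_t salt_;
  std::uint64_t remaining_;
};
```

The `limit` parameter exists only so the exhaustion path can be exercised without sending 2^32 frames; with the default, the guard trips exactly when the counter would return to its starting value.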
SUSE-SU-2020:0930-1: An update that fixes two vulnerabilities is now available.

Category: security (important)
Bug References: 1166403,1166484
CVE References: CVE-2020-1759,CVE-2020-1760
Sources used:
SUSE Linux Enterprise Module for Open Buildservice Development Tools 15-SP1 (src): ceph-14.2.5.389+gb0f23ac248-3.35.2, ceph-test-14.2.5.389+gb0f23ac248-3.35.2
SUSE Linux Enterprise Module for Basesystem 15-SP1 (src): ceph-14.2.5.389+gb0f23ac248-3.35.2
SUSE Enterprise Storage 6 (src): ceph-14.2.5.389+gb0f23ac248-3.35.2

NOTE: This line indicates an update has been released for the listed product(s). At times this might be only a partial fix. If you have questions please reach out to maintenance coordination.
openSUSE-SU-2020:0494-1: An update that fixes two vulnerabilities is now available.

Category: security (important)
Bug References: 1166403,1166484
CVE References: CVE-2020-1759,CVE-2020-1760
Sources used:
openSUSE Leap 15.1 (src): ceph-14.2.5.389+gb0f23ac248-lp151.2.13.1, ceph-test-14.2.5.389+gb0f23ac248-lp151.2.13.1
As msgrv2 was in tech preview, we will not fix this for SLE-12-SP3 or SLE-15.
closing