Commit 41477662 authored by Jakub Kicinski's avatar Jakub Kicinski Committed by David S. Miller
Browse files

net/tls: prevent skb_orphan() from leaking TLS plain text with offload



sk_validate_xmit_skb() and drivers depend on the sk member of
struct sk_buff to identify segments requiring encryption.
Any operation which removes or does not preserve the original TLS
socket such as skb_orphan() or skb_clone() will cause clear text
leaks.

Make the TCP socket underlying an offloaded TLS connection
mark all skbs as decrypted, if TLS TX is in offload mode.
Then in sk_validate_xmit_skb() catch skbs which have no socket
(or a socket with no validation) and decrypted flag set.

Note that CONFIG_SOCK_VALIDATE_XMIT, CONFIG_TLS_DEVICE and
sk->sk_validate_xmit_skb are slightly interchangeable right now,
they all imply TLS offload. The new checks are guarded by
CONFIG_TLS_DEVICE because that's the option guarding the
sk_buff->decrypted member.

Second, smaller issue with orphaning is that it breaks
the guarantee that packets will be delivered to device
queues in-order. All TLS offload drivers depend on that
scheduling property. This means skb_orphan_partial()'s
trick of preserving partial socket references will cause
issues in the drivers. We need a full orphan, and as a
result netem delay/throttling will cause all TLS offload
skbs to be dropped.

Reusing the sk_buff->decrypted flag also protects from
leaking clear text when incoming, decrypted skb is redirected
(e.g. by TC).

See commit 0608c69c ("bpf: sk_msg, sock{map|hash} redirect
through ULP") for justification why the internal flag is safe.
The only location which could leak the flag in is tcp_bpf_sendmsg(),
which is taken care of by clearing the previously unused bit.

v2:
 - remove superfluous decrypted mark copy (Willem);
 - remove the stale doc entry (Boris);
 - rely entirely on EOR marking to prevent coalescing (Boris);
 - use an internal sendpages flag instead of marking the socket
   (Boris).
v3 (Willem):
 - reorganize the can_skb_orphan_partial() condition;
 - fix the flag leak-in through tcp_bpf_sendmsg.

Signed-off-by: default avatarJakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: default avatarWillem de Bruijn <willemb@google.com>
Reviewed-by: default avatarBoris Pismenny <borisp@mellanox.com>
Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
parent 0de94de1
Loading
Loading
Loading
Loading
+0 −18
Original line number Diff line number Diff line
@@ -506,21 +506,3 @@ Drivers should ignore the changes to TLS the device feature flags.
These flags will be acted upon accordingly by the core ``ktls`` code.
TLS device feature flags only control adding of new TLS connection
offloads, old connections will remain active after flags are cleared.

Known bugs
==========

skb_orphan() leaks clear text
-----------------------------

Currently drivers depend on the :c:member:`sk` member of
:c:type:`struct sk_buff <sk_buff>` to identify segments requiring
encryption. Any operation which removes or does not preserve the socket
association such as :c:func:`skb_orphan` or :c:func:`skb_clone`
will cause the driver to miss the packets and lead to clear text leaks.

Redirects leak clear text
-------------------------

In the RX direction, if segment has already been decrypted by the device
and it gets redirected or mirrored - clear text will be transmitted out.
+8 −0
Original line number Diff line number Diff line
@@ -1374,6 +1374,14 @@ static inline void skb_copy_hash(struct sk_buff *to, const struct sk_buff *from)
	to->l4_hash = from->l4_hash;
};

static inline void skb_copy_decrypted(struct sk_buff *to,
				      const struct sk_buff *from)
{
#ifdef CONFIG_TLS_DEVICE
	to->decrypted = from->decrypted;
#endif
}

#ifdef NET_SKBUFF_DATA_USES_OFFSET
static inline unsigned char *skb_end_pointer(const struct sk_buff *skb)
{
+3 −0
Original line number Diff line number Diff line
@@ -292,6 +292,9 @@ struct ucred {
#define MSG_BATCH	0x40000 /* sendmmsg(): more messages coming */
#define MSG_EOF         MSG_FIN
#define MSG_NO_SHARED_FRAGS 0x80000 /* sendpage() internal : page frags are not shared */
#define MSG_SENDPAGE_DECRYPTED	0x100000 /* sendpage() internal : page may carry
					  * plain text and require encryption
					  */

#define MSG_ZEROCOPY	0x4000000	/* Use user data in kernel path */
#define MSG_FASTOPEN	0x20000000	/* Send data in TCP SYN */
+9 −1
Original line number Diff line number Diff line
@@ -2482,6 +2482,7 @@ static inline bool sk_fullsock(const struct sock *sk)

/* Checks if this SKB belongs to an HW offloaded socket
 * and whether any SW fallbacks are required based on dev.
 * Check decrypted mark in case skb_orphan() cleared socket.
 */
static inline struct sk_buff *sk_validate_xmit_skb(struct sk_buff *skb,
						   struct net_device *dev)
@@ -2489,8 +2490,15 @@ static inline struct sk_buff *sk_validate_xmit_skb(struct sk_buff *skb,
#ifdef CONFIG_SOCK_VALIDATE_XMIT
	struct sock *sk = skb->sk;

	if (sk && sk_fullsock(sk) && sk->sk_validate_xmit_skb)
	if (sk && sk_fullsock(sk) && sk->sk_validate_xmit_skb) {
		skb = sk->sk_validate_xmit_skb(sk, dev, skb);
#ifdef CONFIG_TLS_DEVICE
	} else if (unlikely(skb->decrypted)) {
		pr_warn_ratelimited("unencrypted skb with no associated socket - dropping\n");
		kfree_skb(skb);
		skb = NULL;
#endif
	}
#endif

	return skb;
+14 −5
Original line number Diff line number Diff line
@@ -1992,6 +1992,19 @@ void skb_set_owner_w(struct sk_buff *skb, struct sock *sk)
}
EXPORT_SYMBOL(skb_set_owner_w);

static bool can_skb_orphan_partial(const struct sk_buff *skb)
{
#ifdef CONFIG_TLS_DEVICE
	/* Drivers depend on in-order delivery for crypto offload,
	 * partial orphan breaks out-of-order-OK logic.
	 */
	if (skb->decrypted)
		return false;
#endif
	return (skb->destructor == sock_wfree ||
		(IS_ENABLED(CONFIG_INET) && skb->destructor == tcp_wfree));
}

/* This helper is used by netem, as it can hold packets in its
 * delay queue. We want to allow the owner socket to send more
 * packets, as if they were already TX completed by a typical driver.
@@ -2003,11 +2016,7 @@ void skb_orphan_partial(struct sk_buff *skb)
	if (skb_is_tcp_pure_ack(skb))
		return;

	if (skb->destructor == sock_wfree
#ifdef CONFIG_INET
	    || skb->destructor == tcp_wfree
#endif
		) {
	if (can_skb_orphan_partial(skb)) {
		struct sock *sk = skb->sk;

		if (refcount_inc_not_zero(&sk->sk_refcnt)) {
Loading