Commit 10780291 authored by David S. Miller


Saeed Mahameed says:

====================
mlx5-tls-2020-06-26

1) Improve hardware layouts and structure for kTLS support

2) Generalize ICOSQ (Internal Channel Operations Send Queue)
Due to the asynchronous nature of adding new kTLS flows and handling
HW asynchronous kTLS resync requests, the XSK ICOSQ was extended to
support generic async operations, such as kTLS add flow and resync, in
addition to the existing XSK usages.
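
As a rough illustration of what a generic async queue implies, the sketch below tags each posted entry with an operation type and dispatches on that tag at completion time. The enum values, struct names, and counter-only handlers are hypothetical and are not taken from the mlx5 driver.

```c
#include <stddef.h>

/* Hypothetical operation tags; the driver's real WQE types may differ. */
enum async_op {
	ASYNC_OP_XSK,
	ASYNC_OP_KTLS_ADD_FLOW,
	ASYNC_OP_KTLS_RESYNC,
	ASYNC_OP_MAX,
};

/* Per-entry bookkeeping recorded when the entry is posted. */
struct async_wqe_info {
	enum async_op op;
	void *ctx;	/* opaque per-operation context */
};

/* Stand-in for real handlers: only counts completions per type. */
struct op_counts {
	unsigned int handled[ASYNC_OP_MAX];
};

/* Completion path: dispatch on the tag recorded at post time. */
static int async_sq_complete(struct op_counts *c,
			     const struct async_wqe_info *wi)
{
	switch (wi->op) {
	case ASYNC_OP_XSK:
	case ASYNC_OP_KTLS_ADD_FLOW:
	case ASYNC_OP_KTLS_RESYNC:
		c->handled[wi->op]++;
		return 0;
	default:
		return -1;	/* unknown operation type */
	}
}
```

Keeping the type next to each entry lets a single completion handler serve XSK and kTLS users without either knowing about the other.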

3) kTLS hardware flow steering and classification:
The driver already has the means to classify TCP IPv4/IPv6 flows and send
them to the corresponding RSS HW engine. As reflected in patches 3 through 5,
the series adds a steering layer that hooks into the driver's TCP
classifiers and matches on well-known kTLS connections. On a match,
traffic is redirected to the kTLS decryption engine; otherwise,
traffic continues flowing normally to the TCP RSS engine.
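
The match-or-fall-through decision described above can be sketched in software as follows. This is only a model: the real matching is done by hardware flow tables, and the 4-tuple key, the enum, and the function names here are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical connection key; not the driver's representation. */
struct flow_key {
	uint32_t saddr, daddr;
	uint16_t sport, dport;
};

enum rx_dest {
	DEST_TCP_RSS,		/* default: regular TCP RSS engine */
	DEST_KTLS_DECRYPT,	/* known kTLS connection */
};

static bool flow_key_equal(const struct flow_key *a,
			   const struct flow_key *b)
{
	return a->saddr == b->saddr && a->daddr == b->daddr &&
	       a->sport == b->sport && a->dport == b->dport;
}

/* A linear scan stands in for the hardware match table. */
static enum rx_dest classify_tcp_flow(const struct flow_key *rules,
				      size_t n_rules,
				      const struct flow_key *pkt)
{
	for (size_t i = 0; i < n_rules; i++)
		if (flow_key_equal(&rules[i], pkt))
			return DEST_KTLS_DECRYPT;

	return DEST_TCP_RSS;
}
```

With an empty rule set every flow lands in `DEST_TCP_RSS`, which mirrors the behaviour when no kTLS connection has been offloaded.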

3) kTLS add flow RX HW offload support
New offload contexts post their static/progress params WQEs
(Work Queue Elements) to communicate the newly added kTLS contexts
over the per-channel async ICOSQ.

The Channel/RQ is selected according to the socket's rxq index.

A new TLS-RX workqueue is used to allow asynchronous addition of
steering rules, out of the NAPI context.
It will also be used in a downstream patch in the resync procedure.

The feature is off by default. It can be turned on with:
$ ethtool -K <if> tls-hw-rx-offload on

4) Added mlx5 kTLS SW stats. The new counters are documented in
Documentation/networking/tls-offload.rst:
rx_tls_ctx - number of TLS RX HW offload contexts added to device for
decryption.

rx_tls_ooo - number of RX packets which were part of a TLS stream
but did not arrive in the expected order and triggered the resync
procedure.

rx_tls_del - number of TLS RX HW offload contexts deleted from device
(connection has finished).

rx_tls_err - number of RX packets which were part of a TLS stream
 but were not decrypted due to unexpected error in the state machine.
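
One way to read these counters together, as a sketch: rx_tls_ctx counts contexts added and rx_tls_del counts contexts deleted, so their difference approximates the number of contexts currently offloaded. The struct and helper below are hypothetical. In practice the values would be read from `ethtool -S <if>`.

```c
#include <stdint.h>

/* Hypothetical snapshot of the counters listed above. */
struct rx_tls_stats {
	uint64_t rx_tls_ctx;	/* contexts added to the device */
	uint64_t rx_tls_del;	/* contexts deleted from the device */
	uint64_t rx_tls_ooo;	/* out-of-order packets that triggered resync */
	uint64_t rx_tls_err;	/* packets left undecrypted after an error */
};

/* Contexts added minus contexts deleted, clamped at zero. */
static uint64_t rx_tls_live_contexts(const struct rx_tls_stats *s)
{
	if (s->rx_tls_ctx < s->rx_tls_del)
		return 0;	/* inconsistent snapshot; avoid wrap-around */

	return s->rx_tls_ctx - s->rx_tls_del;
}
```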

5) Asynchronous RX resync

a. The NIC driver indicates that it would like to resync on some TLS
record within the received packet (P), but the driver does not yet
know which of the TLS records within the packet it is.
At this stage, the NIC driver queries the device to find the exact
TCP sequence for resync (tcpsn). However, the driver does not wait
for the device to provide the response.

b. Eventually, the device responds, and the driver provides the tcpsn
within the resync packet to kTLS. Now, kTLS can check the tcpsn against
any processed TLS records within packet P, and also against any record
that is processed in the future within packet P.

The asynchronous resync path simplifies the device driver, as it can
save bits on the packet completion (32-bit TCP sequence), and pass this
information on an asynchronous command instead.
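
Steps a and b can be modelled with a small state machine. This is a simplified sketch under stated assumptions: the struct, the fixed-size log, and the function names are hypothetical and do not reproduce the kernel's TLS resync API.

```c
#include <stdbool.h>
#include <stdint.h>

#define RESYNC_LOG_MAX 8	/* arbitrary bound for the sketch */

struct resync_async {
	bool started;		/* step a done, not yet confirmed */
	bool have_tcpsn;	/* device response already delivered */
	uint32_t tcpsn;		/* sequence reported by the device */
	uint32_t log[RESYNC_LOG_MAX];	/* records seen while waiting */
	unsigned int log_len;
};

/* Step a: the driver requests a resync and returns without waiting. */
static void resync_start(struct resync_async *r)
{
	r->started = true;
	r->have_tcpsn = false;
	r->log_len = 0;
}

/* TLS layer, per record header; true if the record confirms the resync. */
static bool resync_on_record(struct resync_async *r, uint32_t rec_seq)
{
	if (!r->started)
		return false;

	if (r->have_tcpsn) {
		if (rec_seq != r->tcpsn)
			return false;
		r->started = false;
		return true;
	}

	/* Response still pending: remember the record for a later check. */
	if (r->log_len < RESYNC_LOG_MAX)
		r->log[r->log_len++] = rec_seq;
	return false;
}

/* Step b: device response; true if an already-seen record matches. */
static bool resync_end(struct resync_async *r, uint32_t tcpsn)
{
	if (!r->started)
		return false;

	for (unsigned int i = 0; i < r->log_len; i++) {
		if (r->log[i] == tcpsn) {
			r->started = false;
			return true;
		}
	}

	/* No match yet: keep tcpsn so future records can be checked. */
	r->tcpsn = tcpsn;
	r->have_tcpsn = true;
	return false;
}
```

The log covers records processed before the device answers, and the stored tcpsn covers records processed afterwards, matching the two cases in step b.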

Performance:
    CPU: Intel(R) Xeon(R) CPU E5-2687W v4 @ 3.00GHz, 24 cores, HT off
    NIC: ConnectX-6 Dx 100GbE dual port

    Goodput (app-layer throughput) comparison:
    +---------------+-------+-------+---------+
    | # connections |   1   |   4   |    8    |
    +---------------+-------+-------+---------+
    | SW (Gbps)     |  7.26 | 24.70 |   50.30 |
    +---------------+-------+-------+---------+
    | HW (Gbps)     | 18.50 | 64.30 |   92.90 |
    +---------------+-------+-------+---------+
    | Speedup       | 2.55x | 2.56x | 1.85x * |
    +---------------+-------+-------+---------+

    * After line rate is reached, the difference shows up in CPU utilization
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
parents 989d957a a2907436
+18 −0
@@ -428,6 +428,24 @@ by the driver:
    which were part of a TLS stream.
  * ``rx_tls_decrypted_bytes`` - number of TLS payload bytes in RX packets
    which were successfully decrypted.
+ * ``rx_tls_ctx`` - number of TLS RX HW offload contexts added to device for
+   decryption.
+ * ``rx_tls_del`` - number of TLS RX HW offload contexts deleted from device
+   (connection has finished).
+ * ``rx_tls_resync_req_pkt`` - number of received TLS packets with a resync
+    request.
+ * ``rx_tls_resync_req_start`` - number of times the TLS async resync request
+    was started.
+ * ``rx_tls_resync_req_end`` - number of times the TLS async resync request
+    properly ended with providing the HW tracked tcp-seq.
+ * ``rx_tls_resync_req_skip`` - number of times the TLS async resync request
+    procedure was started by not properly ended.
+ * ``rx_tls_resync_res_ok`` - number of times the TLS resync response call to
+    the driver was successfully handled.
+ * ``rx_tls_resync_res_skip`` - number of times the TLS resync response call to
+    the driver was terminated unsuccessfully.
+ * ``rx_tls_err`` - number of RX packets which were part of a TLS stream
+   but were not decrypted due to unexpected error in the state machine.
  * ``tx_tls_encrypted_packets`` - number of TX packets passed to the device
    for encryption of their TLS payload.
  * ``tx_tls_encrypted_bytes`` - number of TLS payload bytes in TX packets
+1 −0
@@ -173,6 +173,7 @@ config MLX5_TLS
 config MLX5_EN_TLS
 	bool "TLS cryptography-offload accelaration"
 	depends on MLX5_CORE_EN
+	depends on XPS
 	depends on MLX5_FPGA_TLS || MLX5_TLS
 	default y
 	help
+2 −1
@@ -74,7 +74,8 @@ mlx5_core-$(CONFIG_MLX5_EN_IPSEC) += en_accel/ipsec.o en_accel/ipsec_rxtx.o \
 				     en_accel/ipsec_stats.o

 mlx5_core-$(CONFIG_MLX5_EN_TLS) += en_accel/tls.o en_accel/tls_rxtx.o en_accel/tls_stats.o \
-				   en_accel/ktls.o en_accel/ktls_tx.o
+				   en_accel/fs_tcp.o en_accel/ktls.o en_accel/ktls_txrx.o \
+				   en_accel/ktls_tx.o en_accel/ktls_rx.o

 mlx5_core-$(CONFIG_MLX5_SW_STEERING) += steering/dr_domain.o steering/dr_table.o \
 					steering/dr_matcher.o steering/dr_rule.o \
+18 −1
@@ -43,9 +43,20 @@ int mlx5_ktls_create_key(struct mlx5_core_dev *mdev,
 			 u32 *p_key_id);
 void mlx5_ktls_destroy_key(struct mlx5_core_dev *mdev, u32 key_id);

+static inline bool mlx5_accel_is_ktls_tx(struct mlx5_core_dev *mdev)
+{
+	return MLX5_CAP_GEN(mdev, tls_tx);
+}
+
+static inline bool mlx5_accel_is_ktls_rx(struct mlx5_core_dev *mdev)
+{
+	return MLX5_CAP_GEN(mdev, tls_rx);
+}
+
 static inline bool mlx5_accel_is_ktls_device(struct mlx5_core_dev *mdev)
 {
-	if (!MLX5_CAP_GEN(mdev, tls_tx))
+	if (!mlx5_accel_is_ktls_tx(mdev) &&
+	    !mlx5_accel_is_ktls_rx(mdev))
 		return false;

 	if (!MLX5_CAP_GEN(mdev, log_max_dek))
@@ -67,6 +78,12 @@ static inline bool mlx5e_ktls_type_check(struct mlx5_core_dev *mdev,
 	return false;
 }
 #else
+static inline bool mlx5_accel_is_ktls_tx(struct mlx5_core_dev *mdev)
+{ return false; }
+
+static inline bool mlx5_accel_is_ktls_rx(struct mlx5_core_dev *mdev)
+{ return false; }
+
 static inline int
 mlx5_ktls_create_key(struct mlx5_core_dev *mdev,
 		     struct tls_crypto_info *crypto_info,
+6 −0
@@ -23,6 +23,9 @@ static const char *const mlx5_rsc_sgmt_name[] = {
 	MLX5_SGMT_STR_ASSING(SX_SLICE_ALL),
 	MLX5_SGMT_STR_ASSING(RDB),
 	MLX5_SGMT_STR_ASSING(RX_SLICE_ALL),
+	MLX5_SGMT_STR_ASSING(PRM_QUERY_QP),
+	MLX5_SGMT_STR_ASSING(PRM_QUERY_CQ),
+	MLX5_SGMT_STR_ASSING(PRM_QUERY_MKEY),
 };

 struct mlx5_rsc_dump {
@@ -130,11 +133,13 @@ struct mlx5_rsc_dump_cmd *mlx5_rsc_dump_cmd_create(struct mlx5_core_dev *dev,
 	cmd->mem_size = key->size;
 	return cmd;
 }
+EXPORT_SYMBOL(mlx5_rsc_dump_cmd_create);

 void mlx5_rsc_dump_cmd_destroy(struct mlx5_rsc_dump_cmd *cmd)
 {
 	kfree(cmd);
 }
+EXPORT_SYMBOL(mlx5_rsc_dump_cmd_destroy);

 int mlx5_rsc_dump_next(struct mlx5_core_dev *dev, struct mlx5_rsc_dump_cmd *cmd,
 		       struct page *page, int *size)
@@ -155,6 +160,7 @@ int mlx5_rsc_dump_next(struct mlx5_core_dev *dev, struct mlx5_rsc_dump_cmd *cmd,

 	return more_dump;
 }
+EXPORT_SYMBOL(mlx5_rsc_dump_next);

 #define MLX5_RSC_DUMP_MENU_SEGMENT 0xffff
 static int mlx5_rsc_dump_menu(struct mlx5_core_dev *dev)