Commit bad5b6e2 authored by Mauro Carvalho Chehab, committed by David S. Miller

docs: networking: convert rds.txt to ReST



- add SPDX header;
- add a document title;
- mark code blocks and literals as such;
- mark tables as such;
- mark lists as such;
- adjust indentation, whitespace and blank lines where needed;
- add to networking/index.rst.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
parent 8c6e1720
+1 −0
@@ -97,6 +97,7 @@ Contents:
    proc_net_tcp
    radiotap-headers
    ray_cs
+   rds

 .. only::  subproject and html

+165 −140
.. SPDX-License-Identifier: GPL-2.0

==
RDS
===

Overview
========
@@ -24,6 +29,7 @@ as IB.
 The high-level semantics of RDS from the application's point of view are

  *	Addressing
+
 	RDS uses IPv4 addresses and 16bit port numbers to identify
 	the end point of a connection. All socket operations that involve
 	passing addresses between kernel and user space generally
@@ -38,6 +44,7 @@ The high-level semantics of RDS from the application's point of view are
 	protocol.

  *	Socket interface
+
 	RDS sockets work *mostly* as you would expect from a BSD
 	socket. The next section will cover the details. At any rate,
 	all I/O is performed through the standard BSD socket API.
@@ -53,6 +60,7 @@ The high-level semantics of RDS from the application's point of view are
 	doesn't move to a different transport.

  *	sysctls
+
 	RDS supports a number of sysctls in /proc/sys/net/rds


@@ -147,8 +155,7 @@ Socket Interface
 	operation. In this case, it would use RDS_CANCEL_SENT_TO to
 	nuke any pending messages.

-  setsockopt(fd, SOL_RDS, SO_RDS_TRANSPORT, (int *)&transport ..)
-  getsockopt(fd, SOL_RDS, SO_RDS_TRANSPORT, (int *)&transport ..)
+  ``setsockopt(fd, SOL_RDS, SO_RDS_TRANSPORT, (int *)&transport ..), getsockopt(fd, SOL_RDS, SO_RDS_TRANSPORT, (int *)&transport ..)``
 	Set or read an integer defining  the underlying
 	encapsulating transport to be used for RDS packets on the
 	socket. When setting the option, integer argument may be
@@ -180,7 +187,9 @@ RDS Protocol
   Message header

     The message header is a 'struct rds_header' (see rds.h):
+
     Fields:
+
       h_sequence:
 	  per-packet sequence number
       h_ack:
@@ -192,9 +201,14 @@ RDS Protocol
       h_dport:
 	  destination port
       h_flags:
-          CONG_BITMAP - this is a congestion update bitmap
-          ACK_REQUIRED - receiver must ack this packet
-          RETRANSMITTED - packet has previously been sent
+	  Can be:
+
+	  =============  ==================================
+	  CONG_BITMAP    this is a congestion update bitmap
+	  ACK_REQUIRED   receiver must ack this packet
+	  RETRANSMITTED  packet has previously been sent
+	  =============  ==================================
+
       h_credit:
 	  indicate to other end of connection that
 	  it has more credits available (i.e. there is
@@ -260,7 +274,7 @@ RDS Protocol


 RDS Transport Layer
-==================
+===================

   As mentioned above, RDS is not IB-specific. Its code is divided
   into a general RDS layer and a transport layer.
@@ -281,19 +295,25 @@ RDS Kernel Structures
     be sent and sets header fields as needed, based on the socket API.
     This is then queued for the individual connection and sent by the
     connection's transport.
+
   struct rds_incoming
     a generic struct referring to incoming data that can be handed from
     the transport to the general code and queued by the general code
     while the socket is awoken. It is then passed back to the transport
     code to handle the actual copy-to-user.
+
   struct rds_socket
     per-socket information
+
   struct rds_connection
     per-connection information
+
   struct rds_transport
     pointers to transport-specific functions
+
   struct rds_statistics
     non-transport-specific statistics
+
   struct rds_cong_map
     wraps the raw congestion bitmap, contains rbnode, waitq, etc.

@@ -317,53 +337,58 @@ The send path
 =============

   rds_sendmsg()
-    struct rds_message built from incoming data
-    CMSGs parsed (e.g. RDMA ops)
-    transport connection alloced and connected if not already
-    rds_message placed on send queue
-    send worker awoken
+    - struct rds_message built from incoming data
+    - CMSGs parsed (e.g. RDMA ops)
+    - transport connection alloced and connected if not already
+    - rds_message placed on send queue
+    - send worker awoken
+
   rds_send_worker()
-    calls rds_send_xmit() until queue is empty
+    - calls rds_send_xmit() until queue is empty
+
   rds_send_xmit()
-    transmits congestion map if one is pending
-    may set ACK_REQUIRED
-    calls transport to send either non-RDMA or RDMA message
+    - transmits congestion map if one is pending
+    - may set ACK_REQUIRED
+    - calls transport to send either non-RDMA or RDMA message
       (RDMA ops never retransmitted)
+
   rds_ib_xmit()
-    allocs work requests from send ring
-    adds any new send credits available to peer (h_credits)
-    maps the rds_message's sg list
-    piggybacks ack
-    populates work requests
-    post send to connection's queue pair
+    - allocs work requests from send ring
+    - adds any new send credits available to peer (h_credits)
+    - maps the rds_message's sg list
+    - piggybacks ack
+    - populates work requests
+    - post send to connection's queue pair

 The recv path
 =============

   rds_ib_recv_cq_comp_handler()
-    looks at write completions
-    unmaps recv buffer from device
-    no errors, call rds_ib_process_recv()
-    refill recv ring
+    - looks at write completions
+    - unmaps recv buffer from device
+    - no errors, call rds_ib_process_recv()
+    - refill recv ring
+
   rds_ib_process_recv()
-    validate header checksum
-    copy header to rds_ib_incoming struct if start of a new datagram
-    add to ibinc's fraglist
-    if competed datagram:
-      update cong map if datagram was cong update
-      call rds_recv_incoming() otherwise
-      note if ack is required
+    - validate header checksum
+    - copy header to rds_ib_incoming struct if start of a new datagram
+    - add to ibinc's fraglist
+    - if competed datagram:
+	 - update cong map if datagram was cong update
+	 - call rds_recv_incoming() otherwise
+	 - note if ack is required
+
   rds_recv_incoming()
-    drop duplicate packets
-    respond to pings
-    find the sock associated with this datagram
-    add to sock queue
-    wake up sock
-    do some congestion calculations
+    - drop duplicate packets
+    - respond to pings
+    - find the sock associated with this datagram
+    - add to sock queue
+    - wake up sock
+    - do some congestion calculations
   rds_recvmsg
-    copy data into user iovec
-    handle CMSGs
-    return to application
+    - copy data into user iovec
+    - handle CMSGs
+    - return to application

 Multipath RDS (mprds)
 =====================
+1 −1
@@ -14219,7 +14219,7 @@ L: linux-rdma@vger.kernel.org
 L:	rds-devel@oss.oracle.com (moderated for non-subscribers)
 S:	Supported
 W:	https://oss.oracle.com/projects/rds/
-F:	Documentation/networking/rds.txt
+F:	Documentation/networking/rds.rst
 F:	net/rds/
 RDT - RESOURCE ALLOCATION