File changed.
Preview size limit exceeded, changes collapsed.
Loading
Gitlab 现已全面支持 git over ssh 与 git over https。通过 HTTPS 访问请配置带有 read_repository / write_repository 权限的 Personal access token。通过 SSH 端口访问请使用 22 端口或 13389 端口。如果使用CAS注册了账户但不知道密码,可以自行至设置中更改;如有其他问题,请发邮件至 service@cra.moe 寻求协助。
Saeed Mahameed says:
====================
mlx5-updates-2018-03-30
This series contains updates to mlx5 core and mlx5e netdev drivers.
The main highlight of this series is the RX optimizations for striding RQ path,
introduced by Tariq.
First Four patches are trivial misc cleanups.
- Spelling mistake fix
- Dead code removal
- Warning messages
RX optimizations for striding RQ:
1) RX refactoring, cleanups and micro optimizations
- MTU calculation simplifications, obsoletes some WQEs-to-packets translation
functions and helps delete ~60 LOC.
- Do not busy-wait a pending UMR completion.
- post the new values of UMR WQE inline, instead of using a data pointer.
- use pre-initialized structures to save calculations in datapath.
2) Use linear SKB in Striding RQ "build_skb", (Using linear SKB has many advantages):
- Saves a memcpy of the headers.
- No page-boundary checks in datapath.
- No filler CQEs.
- Significantly smaller CQ.
- SKB data continuously resides in linear part, and not split to
small amount (linear part) and large amount (fragment).
This saves datapath cycles in driver and improves utilization
of SKB fragments in GRO.
- The fragments of a resulting GRO SKB follow the IP forwarding
assumption of equal-size fragments.
implementation details:
HW writes the packets to the beginning of a stride,
i.e. does not keep headroom. To overcome this we make sure we can
extend backwards and use the last bytes of stride i-1.
Extra care is needed for stride 0 as it has no preceding stride.
We make sure headroom bytes are available by shifting the buffer
pointer passed to HW by headroom bytes.
This configuration now becomes default, whenever capable.
Of course, this implies turning LRO off.
Performance testing:
ConnectX-5, single core, single RX ring, default MTU.
UDP packet rate, early drop in TC layer:
--------------------------------------------
| pkt size | before | after | ratio |
--------------------------------------------
| 1500byte | 4.65 Mpps | 5.96 Mpps | 1.28x |
| 500byte | 5.23 Mpps | 5.97 Mpps | 1.14x |
| 64byte | 5.94 Mpps | 5.96 Mpps | 1.00x |
--------------------------------------------
TCP streams: ~20% gain
3) Support XDP over Striding RQ:
Now that linear SKB is supported over Striding RQ,
we can support XDP by setting stride size to PAGE_SIZE
and headroom to XDP_PACKET_HEADROOM.
Striding RQ is capable of a higher packet-rate than
conventional RQ.
Performance testing:
ConnectX-5, 24 rings, default MTU.
CQE compression ON (to reduce completions BW in PCI).
XDP_DROP packet rate:
--------------------------------------------------
| pkt size | XDP rate | 100GbE linerate | pct% |
--------------------------------------------------
| 64byte | 126.2 Mpps | 148.0 Mpps | 85% |
| 128byte | 80.0 Mpps | 84.8 Mpps | 94% |
| 256byte | 42.7 Mpps | 42.7 Mpps | 100% |
| 512byte | 23.4 Mpps | 23.4 Mpps | 100% |
--------------------------------------------------
4) Remove mlx5 page_ref bulking in Striding RQ and use page_ref_inc only when needed.
Without this bulking, we have:
- no atomic ops on WQE allocation or free
- one atomic op per SKB
- In the default MTU configuration (1500, stride size is 2K),
the non-bulking method execute 2 atomic ops as before
- For larger MTUs with stride size of 4K, non-bulking method
executes only a single op.
- For XDP (stride size of 4K, no SKBs), non-bulking have no atomic ops per packet at all.
Performance testing:
ConnectX-5, Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz.
Single core packet rate (64 bytes).
Early drop in TC: no degradation.
XDP_DROP:
before: 14,270,188 pps
after: 20,503,603 pps, 43% improvement.
====================
Signed-off-by:
David S. Miller <davem@davemloft.net>
File changed.
Preview size limit exceeded, changes collapsed.
CRA Git | Maintained and supported by SUSTech CRA and CCSE