Commit 16e7483e authored by Jason Gunthorpe

Merge branch 'dynamic_sg' into rdma.git for-next

From Maor Gottlieb:

====================
This series extends __sg_alloc_table_from_pages to allow chaining of new
pages to an already initialized SG table.

This allows drivers to utilize the optimization of merging contiguous
pages without needing to pre-allocate all the pages and hold them in a
very large temporary buffer prior to SG table initialization.

The last patch changes the Infiniband core to use the new API. It removes
duplicate functionality from the code and benefits from the optimization
of allocating dynamic SG table from pages.

On a system with 2MB huge pages, without this change the SG table would
contain 512x more SG entries (one per 4KB page instead of one per 2MB
huge page).
====================

* branch 'dynamic_sg':
  RDMA/umem: Move to allocate SG table from pages
  lib/scatterlist: Add support in dynamic allocation of SG table from pages
  tools/testing/scatterlist: Show errors in human readable form
  tools/testing/scatterlist: Rejuvenate bit-rotten test
parents bf6a4764 0c16d963
+4 −0
@@ -169,6 +169,10 @@ Juha Yrjola <juha.yrjola@solidboot.com>
 Julien Thierry <julien.thierry.kdev@gmail.com> <julien.thierry@arm.com>
 Kamil Konieczny <k.konieczny@samsung.com> <k.konieczny@partner.samsung.com>
 Kay Sievers <kay.sievers@vrfy.org>
+Kees Cook <keescook@chromium.org> <kees.cook@canonical.com>
+Kees Cook <keescook@chromium.org> <keescook@google.com>
+Kees Cook <keescook@chromium.org> <kees@outflux.net>
+Kees Cook <keescook@chromium.org> <kees@ubuntu.com>
 Kenneth W Chen <kenneth.w.chen@intel.com>
 Konstantin Khlebnikov <koct9i@gmail.com> <khlebnikov@yandex-team.ru>
 Konstantin Khlebnikov <koct9i@gmail.com> <k.khlebnikov@samsung.com>
+18 −7
@@ -1324,15 +1324,26 @@ PAGE_SIZE multiple when read back.
 	  pgmajfault
 		Number of major page faults incurred
 
-	  workingset_refault
-		Number of refaults of previously evicted pages
+	  workingset_refault_anon
+		Number of refaults of previously evicted anonymous pages.
 
-	  workingset_activate
-		Number of refaulted pages that were immediately activated
+	  workingset_refault_file
+		Number of refaults of previously evicted file pages.
 
-	  workingset_restore
-		Number of restored pages which have been detected as an active
-		workingset before they got reclaimed.
+	  workingset_activate_anon
+		Number of refaulted anonymous pages that were immediately
+		activated.
+
+	  workingset_activate_file
+		Number of refaulted file pages that were immediately activated.
+
+	  workingset_restore_anon
+		Number of restored anonymous pages which have been detected as
+		an active workingset before they got reclaimed.
+
+	  workingset_restore_file
+		Number of restored file pages which have been detected as an
+		active workingset before they got reclaimed.
 
 	  workingset_nodereclaim
 		Number of times a shadow node has been reclaimed
+9 −1
@@ -67,7 +67,7 @@ Parameters::
     the value passed in <key_size>.
 
 <key_type>
-    Either 'logon' or 'user' kernel key type.
+    Either 'logon', 'user' or 'encrypted' kernel key type.
 
 <key_description>
     The kernel keyring key description crypt target should look for
@@ -121,6 +121,14 @@ submit_from_crypt_cpus
     thread because it benefits CFQ to have writes submitted using the
     same context.
 
+no_read_workqueue
+    Bypass dm-crypt internal workqueue and process read requests synchronously.
+
+no_write_workqueue
+    Bypass dm-crypt internal workqueue and process write requests synchronously.
+    This option is automatically enabled for host-managed zoned block devices
+    (e.g. host-managed SMR hard-disks).
+
 integrity:<bytes>:<type>
     The device requires additional <bytes> metadata per-sector stored
     in per-bio integrity structure. This metadata must by provided
+1 −1
@@ -690,7 +690,7 @@ which of the two parameters is added to the kernel command line. In the
 instruction of the CPUs (which, as a rule, suspends the execution of the program
 and causes the hardware to attempt to enter the shallowest available idle state)
 for this purpose, and if ``idle=poll`` is used, idle CPUs will execute a
-more or less ``lightweight'' sequence of instructions in a tight loop.  [Note
+more or less "lightweight" sequence of instructions in a tight loop.  [Note
 that using ``idle=poll`` is somewhat drastic in many cases, as preventing idle
 CPUs from saving almost any energy at all may not be the only effect of it.
 For example, on Intel hardware it effectively prevents CPUs from using
+1 −4
@@ -182,9 +182,6 @@ in the order of reservations, but only after all previous records where
 already committed. It is thus possible for slow producers to temporarily hold
 off submitted records, that were reserved later.
 
-Reservation/commit/consumer protocol is verified by litmus tests in
-Documentation/litmus_tests/bpf-rb/_.
-
 One interesting implementation bit, that significantly simplifies (and thus
 speeds up as well) implementation of both producers and consumers is how data
 area is mapped twice contiguously back-to-back in the virtual memory. This
@@ -200,7 +197,7 @@ a self-pacing notifications of new data being availability.
 being available after commit only if consumer has already caught up right up to
 the record being committed. If not, consumer still has to catch up and thus
 will see new data anyways without needing an extra poll notification.
-Benchmarks (see tools/testing/selftests/bpf/benchs/bench_ringbuf.c_) show that
+Benchmarks (see tools/testing/selftests/bpf/benchs/bench_ringbufs.c) show that
 this allows to achieve a very high throughput without having to resort to
 tricks like "notify only every Nth sample", which are necessary with perf
 buffer. For extreme cases, when BPF program wants more manual control of