Commit 76fb4e08 authored by Stan Moore's avatar Stan Moore
Browse files

Merge branch 'master' into kk_changes

parents d3fa8822 a59b7e4d
Loading
Loading
Loading
Loading
+1 −1
Original line number Diff line number Diff line
@@ -3,7 +3,7 @@ GNU GENERAL PUBLIC LICENSE
Version 2, June 1991 

Copyright (C) 1989, 1991 Free Software Foundation, Inc.  
59 Temple Place - Suite 330, Boston, MA  02111-1307, USA
51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA

Everyone is permitted to copy and distribute verbatim copies of this
license document, but changing it is not allowed.
+7 −59
Original line number Diff line number Diff line
These are input scripts used to run versions of several of the
benchmarks in the top-level bench directory using the GPU and
USER-CUDA accelerator packages.  The results of running these scripts
on two different machines (a desktop with 2 Tesla GPUs and the ORNL
Titan supercomputer) are shown on the "GPU (Fermi)" section of the
Benchmark page of the LAMMPS WWW site: lammps.sandia.gov/bench.
benchmarks in the top-level bench directory using the GPU accelerator
package.  The results of running these scripts on two different machines
(a desktop with 2 Tesla GPUs and the ORNL Titan supercomputer) are shown
on the "GPU (Fermi)" section of the Benchmark page of the LAMMPS WWW
site: lammps.sandia.gov/bench.

Examples are shown below of how to run these scripts.  This assumes
you have built 3 executables with both the GPU and USER-CUDA packages
you have built 3 executables with the GPU package
installed, e.g.

lmp_linux_single
lmp_linux_mixed
lmp_linux_double

The precision (single, mixed, double) refers to the GPU and USER-CUDA
package precision.  See the README files in the lib/gpu and lib/cuda
directories for instructions on how to build the packages with
different precisions.  The GPU and USER-CUDA sub-sections of the
doc/Section_accelerate.html file also describes this process.

Make.py -d ~/lammps -j 16 -p #all orig -m linux -o cpu -a exe
Make.py -d ~/lammps -j 16 -p #all opt orig -m linux -o opt -a exe
Make.py -d ~/lammps -j 16 -p #all omp orig -m linux -o omp -a exe
Make.py -d ~/lammps -j 16 -p #all gpu orig -m linux \
        -gpu mode=double arch=20 -o gpu_double -a libs exe
Make.py -d ~/lammps -j 16 -p #all gpu orig -m linux \
        -gpu mode=mixed arch=20 -o gpu_mixed -a libs exe
Make.py -d ~/lammps -j 16 -p #all gpu orig -m linux \
        -gpu mode=single arch=20 -o gpu_single -a libs exe
Make.py -d ~/lammps -j 16 -p #all cuda orig -m linux \
        -cuda mode=double arch=20 -o cuda_double -a libs exe
Make.py -d ~/lammps -j 16 -p #all cuda orig -m linux \
        -cuda mode=mixed arch=20 -o cuda_mixed -a libs exe
Make.py -d ~/lammps -j 16 -p #all cuda orig -m linux \
        -cuda mode=single arch=20 -o cuda_single -a libs exe
Make.py -d ~/lammps -j 16 -p #all intel orig -m linux -o intel_cpu -a exe
Make.py -d ~/lammps -j 16 -p #all kokkos orig -m linux -o kokkos_omp -a exe
Make.py -d ~/lammps -j 16 -p #all kokkos orig -kokkos cuda arch=20 \
        -m cuda -o kokkos_cuda -a exe

Make.py -d ~/lammps -j 16 -p #all opt omp gpu cuda intel kokkos orig \
        -gpu mode=double arch=20 -cuda mode=double arch=20 -m linux \
        -o all -a libs exe

Make.py -d ~/lammps -j 16 -p #all opt omp gpu cuda intel kokkos orig \
        -kokkos cuda arch=20 -gpu mode=double arch=20 \
        -cuda mode=double arch=20 -m cuda -o all_cuda -a libs exe

------------------------------------------------------------------------

To run on just CPUs (without using the GPU or USER-CUDA styles),
To run on just CPUs (without using the GPU styles),
do something like the following:

mpirun -np 1 lmp_linux_double -v x 8 -v y 8 -v z 8 -v t 100 < in.lj
@@ -81,23 +47,5 @@ node via a "-ppn" setting.

------------------------------------------------------------------------

To run with the USER-CUDA package, do something like the following:

mpirun -np 1 lmp_linux_single -c on -sf cuda -v x 16 -v y 16 -v z 16 -v t 100 < in.lj
mpirun -np 2 lmp_linux_double -c on -sf cuda -pk cuda 2 -v x 32 -v y 64 -v z 64 -v t 100 < in.eam

The "xyz" settings determine the problem size.  The "t" setting
determines the number of timesteps.  The "np" setting determines how
many MPI tasks (per node) the problem will run on.  The numeric
argument to the "-pk" setting is the number of GPUs (per node); 1 GPU
is the default.  Note that the number of MPI tasks must equal the
number of GPUs (both per node) with the USER-CUDA package.

These mpirun commands run on a single node.  To run on multiple nodes,
scale up the "-np" setting, and control the number of MPI tasks per
node via a "-ppn" setting.

------------------------------------------------------------------------

If the script has "titan" in its name, it was run on the Titan
supercomputer at ORNL.
+15 −31
Original line number Diff line number Diff line
@@ -71,49 +71,33 @@ integration

----------------------------------------------------------------------

Here is a src/Make.py command which will perform a parallel build of a
LAMMPS executable "lmp_mpi" with all the packages needed by all the
examples.  This assumes you have an MPI installed on your machine so
that "mpicxx" can be used as the wrapper compiler.  It also assumes
you have an Intel compiler to use as the base compiler.  You can leave
off the "-cc mpi wrap=icc" switch if that is not the case.  You can
also leave off the "-fft fftw3" switch if you do not have the FFTW
(v3) installed as an FFT package, in which case the default KISS FFT
library will be used.

cd src
Make.py -j 16 -p none molecule manybody kspace granular rigid orig \
  -cc mpi wrap=icc -fft fftw3 -a file mpi

----------------------------------------------------------------------

Here is how to run each problem, assuming the LAMMPS executable is
named lmp_mpi, and you are using the mpirun command to launch parallel
runs:

Serial (one processor runs):

lmp_mpi < in.lj
lmp_mpi < in.chain
lmp_mpi < in.eam
lmp_mpi < in.chute
lmp_mpi < in.rhodo
lmp_mpi -in in.lj
lmp_mpi -in in.chain
lmp_mpi -in in.eam
lmp_mpi -in in.chute
lmp_mpi -in in.rhodo

Parallel fixed-size runs (on 8 procs in this case):

mpirun -np 8 lmp_mpi < in.lj
mpirun -np 8 lmp_mpi < in.chain
mpirun -np 8 lmp_mpi < in.eam
mpirun -np 8 lmp_mpi < in.chute
mpirun -np 8 lmp_mpi < in.rhodo
mpirun -np 8 lmp_mpi -in in.lj
mpirun -np 8 lmp_mpi -in in.chain
mpirun -np 8 lmp_mpi -in in.eam
mpirun -np 8 lmp_mpi -in in.chute
mpirun -np 8 lmp_mpi -in in.rhodo

Parallel scaled-size runs (on 16 procs in this case):

mpirun -np 16 lmp_mpi -var x 2 -var y 2 -var z 4 < in.lj
mpirun -np 16 lmp_mpi -var x 2 -var y 2 -var z 4 < in.chain.scaled
mpirun -np 16 lmp_mpi -var x 2 -var y 2 -var z 4 < in.eam
mpirun -np 16 lmp_mpi -var x 4 -var y 4 < in.chute.scaled
mpirun -np 16 lmp_mpi -var x 2 -var y 2 -var z 4 < in.rhodo.scaled
mpirun -np 16 lmp_mpi -var x 2 -var y 2 -var z 4 -in in.lj
mpirun -np 16 lmp_mpi -var x 2 -var y 2 -var z 4 -in in.chain.scaled
mpirun -np 16 lmp_mpi -var x 2 -var y 2 -var z 4 -in in.eam
mpirun -np 16 lmp_mpi -var x 4 -var y 4 -in in.chute.scaled
mpirun -np 16 lmp_mpi -var x 2 -var y 2 -var z 4 -in in.rhodo.scaled

For each of the scaled-size runs you must set 3 variables as -var
command line switches.  The variables x,y,z are used in the input
+2 −3
Original line number Diff line number Diff line
<!-- HTML_ONLY -->
<HEAD>
<TITLE>LAMMPS Users Manual</TITLE>
<META NAME="docnumber" CONTENT="6 Jul 2017 version">
<META NAME="docnumber" CONTENT="24 Jul 2017 version">
<META NAME="author" CONTENT="http://lammps.sandia.gov - Sandia National Laboratories">
<META NAME="copyright" CONTENT="Copyright (2003) Sandia Corporation.  This software and manual is distributed under the GNU General Public License.">
</HEAD>
@@ -21,7 +21,7 @@
<H1></H1>

LAMMPS Documentation :c,h3
6 Jul 2017 version :c,h4
24 Jul 2017 version :c,h4

Version info: :h4

@@ -261,7 +261,6 @@ END_RST -->
:link(start_6,Section_start.html#start_6)
:link(start_7,Section_start.html#start_7)
:link(start_8,Section_start.html#start_8)
:link(start_9,Section_start.html#start_9)

:link(cmd_1,Section_commands.html#cmd_1)
:link(cmd_2,Section_commands.html#cmd_2)
+3.49 KiB (557 KiB)

File changed.

No diff preview for this file type.

Loading