Merge branch 'master' into HEAD (b6187173) · Commits · 郑智淋 / lammps

cmake/CMakeLists.txt

+8 −2

Original line number	Diff line number	Diff line
		@@ -37,6 +37,10 @@ enable_language(CXX)
		#####################################################################
		include(CheckCCompilerFlag)

		if (${CMAKE_CXX_COMPILER_ID} STREQUAL "Intel")
		set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -restrict")
		endif()

		########################################################################
		# User input options #
		########################################################################
		@@ -76,7 +80,7 @@ add_definitions(-DLAMMPS_MEMALIGN=${LAMMPS_MEMALIGN})
		option(LAMMPS_EXCEPTIONS "enable the use of C++ exceptions for error messages (useful for library interface)" OFF)
		if(LAMMPS_EXCEPTIONS)
		add_definitions(-DLAMMPS_EXCEPTIONS)
		set(LAMMPS_API_DEFINES "${LAMMPS_API_DEFINES -DLAMMPS_EXCEPTIONS")
		set(LAMMPS_API_DEFINES "${LAMMPS_API_DEFINES} -DLAMMPS_EXCEPTIONS")
		endif()

		set(LAMMPS_MACHINE "" CACHE STRING "Suffix to append to lmp binary and liblammps (WON'T enable any features automatically")
		@@ -665,7 +669,9 @@ include_directories(${LAMMPS_STYLE_HEADERS_DIR})
		############################################
		add_library(lammps ${LIB_SOURCES})
		target_link_libraries(lammps ${LAMMPS_LINK_LIBS})
		if(LAMMPS_DEPS)
		add_dependencies(lammps ${LAMMPS_DEPS})
		endif()
		set_target_properties(lammps PROPERTIES OUTPUT_NAME lammps${LAMMPS_MACHINE})
		if(BUILD_SHARED_LIBS)
		set_target_properties(lammps PROPERTIES SOVERSION ${SOVERSION})

doc/src/JPG/user_intel.png

−963 B (19.1 KiB)


doc/src/Section_packages.txt

+2 −1

		@@ -706,7 +706,7 @@ dynamics can be run with LAMMPS using density-functional tight-binding
		quantum forces calculated by LATTE.

		More information on LATTE can be found at this web site:
		"https://github.com/lanl/LATTE"_#latte_home. A brief technical
		"https://github.com/lanl/LATTE"_latte_home. A brief technical
		description is given with the "fix latte"_fix_latte.html command.

		:link(latte_home,https://github.com/lanl/LATTE)
		@@ -729,6 +729,7 @@ make lib-latte args="-b" # download and build in lib/latte/LATTE-
		make lib-latte args="-p $HOME/latte" # use existing LATTE installation in $HOME/latte
		make lib-latte args="-b -m gfortran" # download and build in lib/latte and
		# copy Makefile.lammps.gfortran to Makefile.lammps
		:pre

		Note that 3 symbolic (soft) links, "includelink" and "liblink" and
		"filelink", are created in lib/latte to point into the LATTE home dir.

doc/src/accelerate_intel.txt

+30 −21

		@@ -25,14 +25,14 @@ LAMMPS to run on the CPU cores and coprocessor cores simultaneously.
		[Currently Available USER-INTEL Styles:]

		Angle Styles: charmm, harmonic :ulb,l
		Bond Styles: fene, harmonic :l
		Bond Styles: fene, fourier, harmonic :l
		Dihedral Styles: charmm, harmonic, opls :l
		Fixes: nve, npt, nvt, nvt/sllod :l
		Fixes: nve, npt, nvt, nvt/sllod, nve/asphere :l
		Improper Styles: cvff, harmonic :l
		Pair Styles: airebo, airebo/morse, buck/coul/cut, buck/coul/long,
		buck, eam, eam/alloy, eam/fs, gayberne, lj/charmm/coul/charmm,
		lj/charmm/coul/long, lj/cut, lj/cut/coul/long, lj/long/coul/long, rebo,
		sw, tersoff :l
		buck, dpd, eam, eam/alloy, eam/fs, gayberne, lj/charmm/coul/charmm,
		lj/charmm/coul/long, lj/cut, lj/cut/coul/long, lj/long/coul/long,
		rebo, sw, tersoff :l
		K-Space Styles: pppm, pppm/disp :l
		:ule

		@@ -54,11 +54,12 @@ warmup run (for use with offload benchmarks).
		:c,image(JPG/user_intel.png)

		Results are speedups obtained on Intel Xeon E5-2697v4 processors
		(code-named Broadwell) and Intel Xeon Phi 7250 processors
		(code-named Knights Landing) with "June 2017" LAMMPS built with
		Intel Parallel Studio 2017 update 2. Results are with 1 MPI task
		per physical core. See {src/USER-INTEL/TEST/README} for the raw
		simulation rates and instructions to reproduce.
		(code-named Broadwell), Intel Xeon Phi 7250 processors (code-named
		Knights Landing), and Intel Xeon Gold 6148 processors (code-named
		Skylake) with "June 2017" LAMMPS built with Intel Parallel Studio
		2017 update 2. Results are with 1 MPI task per physical core. See
		{src/USER-INTEL/TEST/README} for the raw simulation rates and
		instructions to reproduce.

		:line

		@@ -82,6 +83,11 @@ this order :l
		The {newton} setting applies to all atoms, not just atoms shared
		between MPI tasks :l
		Vectorization can change the order for adding pairwise forces :l
		When using the -DLMP_USE_MKL_RNG define (all included Intel-optimized
		makefiles do) at build time, the random number generator for
		dissipative particle dynamics (pair style dpd/intel) uses the Mersenne
		Twister generator included in the Intel MKL library (that should be
		more robust than the default Marsaglia random number generator) :l
		:ule

		The precision mode (described below) used with the USER-INTEL
		@@ -119,8 +125,8 @@ For Intel Xeon Phi CPUs:
		Runs should be performed using MCDRAM. :ulb,l
		:ule

		For simulations using {kspace_style pppm} on Intel CPUs
		supporting AVX-512:
		For simulations using {kspace_style pppm} on Intel CPUs supporting
		AVX-512:

		Add "kspace_modify diff ad" to the input script :ulb,l
		The command-line option should be changed to
		@@ -237,14 +243,17 @@ However, if you do not have coprocessors on your system, building
		without offload support will produce a smaller binary.

		The general requirements for Makefiles with the USER-INTEL package
		are as follows. "-DLAMMPS_MEMALIGN=64" is required for CCFLAGS. When
		using Intel compilers, "-restrict" is required and "-qopenmp" is
		highly recommended for CCFLAGS and LINKFLAGS. LIB should include
		"-ltbbmalloc". For builds supporting offload, "-DLMP_INTEL_OFFLOAD"
		is required for CCFLAGS and "-qoffload" is required for LINKFLAGS.
		Other recommended CCFLAG options for best performance are
		"-O2 -fno-alias -ansi-alias -qoverride-limits fp-model fast=2
		-no-prec-div".
		are as follows. When using Intel compilers, "-restrict" is required
		and "-qopenmp" is highly recommended for CCFLAGS and LINKFLAGS.
		CCFLAGS should include "-DLMP_INTEL_USELRT" (unless POSIX Threads
		are not supported in the build environment) and "-DLMP_USE_MKL_RNG"
		(unless Intel Math Kernel Library (MKL) is not available in the build
		environment). For Intel compilers, LIB should include "-ltbbmalloc"
		or if the library is not available, "-DLMP_INTEL_NO_TBB" can be added
		to CCFLAGS. For builds supporting offload, "-DLMP_INTEL_OFFLOAD" is
		required for CCFLAGS and "-qoffload" is required for LINKFLAGS. Other
		recommended CCFLAGS options for best performance are "-O2 -fno-alias
		-ansi-alias -qoverride-limits -fp-model fast=2 -no-prec-div".

		NOTE: The vectorization and math capabilities can differ depending on
		the CPU. For Intel compilers, the "-x" flag specifies the type of
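
Reviewer note: the Makefile requirements in the rewritten paragraph of this hunk amount to a small set of conditional rules. The following is a minimal Python sketch of those rules, not part of the commit. The function name and its boolean parameters are hypothetical, and placing "-restrict" in CCFLAGS only is an assumption, since the text does not say which variable it belongs to. The flag strings themselves are taken from the doc text.

```python
def user_intel_flags(intel_compiler=True, have_pthreads=True,
                     have_mkl=True, have_tbbmalloc=True, offload=False):
    """Return CCFLAGS, LINKFLAGS and LIB lists following the doc text above.

    Hypothetical helper for illustration only; LAMMPS ships no such function.
    """
    ccflags, linkflags, lib = [], [], []
    if intel_compiler:
        # "-restrict" is required; putting it in CCFLAGS only is an assumption.
        ccflags.append("-restrict")
        # "-qopenmp" is highly recommended for both CCFLAGS and LINKFLAGS.
        ccflags.append("-qopenmp")
        linkflags.append("-qopenmp")
    if have_pthreads:
        # Skipped only when POSIX Threads are not supported.
        ccflags.append("-DLMP_INTEL_USELRT")
    if have_mkl:
        # Skipped only when Intel MKL is not available.
        ccflags.append("-DLMP_USE_MKL_RNG")
    if intel_compiler:
        if have_tbbmalloc:
            lib.append("-ltbbmalloc")
        else:
            # Fallback named in the doc text when the library is missing.
            ccflags.append("-DLMP_INTEL_NO_TBB")
    if offload:
        ccflags.append("-DLMP_INTEL_OFFLOAD")
        linkflags.append("-qoffload")
    return {"CCFLAGS": ccflags, "LINKFLAGS": linkflags, "LIB": lib}


# Example: Intel compiler, no tbbmalloc library, offload enabled.
flags = user_intel_flags(have_tbbmalloc=False, offload=True)
print(" ".join(flags["CCFLAGS"]))
```

Note that the rewritten paragraph no longer lists "-DLAMMPS_MEMALIGN=64" as a requirement, so the sketch omits it as well.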

doc/src/atom_modify.txt

+29 −25

		@@ -16,7 +16,7 @@ atom_modify keyword values ... :pre
		one or more keyword/value pairs may be appended :ulb,l
		keyword = {id} or {map} or {first} or {sort} :l
		{id} value = {yes} or {no}
		{map} value = {array} or {hash}
		{map} value = {yes} or {array} or {hash}
		{first} value = group-ID = group whose atoms will appear first in internal atom lists
		{sort} values = Nfreq binsize
		Nfreq = sort atoms spatially every this many time steps
		@@ -25,8 +25,8 @@ keyword = {id} or {map} or {first} or {sort} :l

		[Examples:]

		atom_modify map hash
		atom_modify map array sort 10000 2.0
		atom_modify map yes
		atom_modify map hash sort 10000 2.0
		atom_modify first colloid :pre

		[Description:]
		@@ -62,29 +62,33 @@ switch. This is described in "Section 2.2"_Section_start.html#start_2
		of the manual. If atom IDs are not used, they must be specified as 0
		for all atoms, e.g. in a data or restart file.

		The {map} keyword determines how atom ID lookup is done for molecular
		atom styles. Lookups are performed by bond (angle, etc) routines in
		LAMMPS to find the local atom index associated with a global atom ID.

		When the {array} value is used, each processor stores a lookup table
		of length N, where N is the largest atom ID in the system. This is a
		The {map} keyword determines how atoms with specific IDs are found
		when required. An example is the bond (angle, etc.) methods, which
		need to find the local index of an atom with a specific global ID
		that is a bond (angle, etc.) partner. LAMMPS performs this operation
		efficiently by creating a "map", which is either an {array} or {hash}
		table, as described below.

		When the {map} keyword is not specified in your input script, LAMMPS
		only creates a map for "atom_styles"_atom_style.html for molecular
		systems which have permanent bonds (angles, etc). No map is created
		for atomic systems, since it is normally not needed. However some
		LAMMPS commands require a map, even for atomic systems, and will
		generate an error if one does not exist. The {map} keyword thus
		allows you to force the creation of a map. The {yes} value will
		create either an {array} or {hash} style map, as explained in the next
		paragraph. The {array} and {hash} values create an array-style or
		hash-style map respectively.

		For an {array}-style map, each processor stores a lookup table of
		length N, where N is the largest atom ID in the system. This is a
		fast, simple method for many simulations, but requires too much memory
		for large simulations. The {hash} value uses a hash table to perform
		the lookups. This can be slightly slower than the {array} method, but
		its memory cost is proportional to the number of atoms owned by a
		processor, i.e. N/P when N is the total number of atoms in the system
		and P is the number of processors.

		When this setting is not specified in your input script, LAMMPS
		creates a map, if one is needed, as an array or hash. See the
		discussion of default values below for how LAMMPS chooses which kind
		of map to build. Note that atomic systems do not normally need to
		create a map. However, even in this case some LAMMPS commands will
		create a map to find atoms (and then destroy it), or require a
		permanent map. An example of the former is the "velocity loop
		all"_velocity.html command, which uses a map when looping over all
		atoms and insuring the same velocity values are assigned to an atom
		ID, no matter which processor owns it.
		for large simulations. For a {hash}-style map, a hash table is
		created on each processor, which finds an atom ID in constant time
		(independent of the global number of atom IDs). It can be slightly
		slower than the {array} map, but its memory cost is proportional to
		the number of atoms owned by a processor, i.e. N/P when N is the total
		number of atoms in the system and P is the number of processors.

		The {first} keyword allows a "group"_group.html to be specified whose
		atoms will be maintained as the first atoms in each processor's list
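
Reviewer note: the array-versus-hash trade-off described in the atom_modify hunk above can be illustrated with a small Python sketch. This is not how LAMMPS implements its map. The function names, the -1 sentinel for atoms not owned by the processor, and the sample atom IDs are all assumptions made for illustration. The only point is the memory scaling stated in the doc text: the array grows with the largest atom ID, while the hash grows with the number of atoms the processor owns.

```python
def build_array_map(owned_ids, max_id):
    """Array-style map: one slot per possible global ID.

    Slot 0 is unused because atom IDs start at 1, so the table has
    max_id + 1 entries. -1 (an assumed sentinel) marks IDs not owned here.
    """
    table = [-1] * (max_id + 1)
    for local_index, global_id in enumerate(owned_ids):
        table[global_id] = local_index
    return table


def build_hash_map(owned_ids):
    """Hash-style map: one entry per atom owned by this processor."""
    return {global_id: local_index
            for local_index, global_id in enumerate(owned_ids)}


def lookup_array(table, global_id):
    """Direct index into the table."""
    return table[global_id]


def lookup_hash(table, global_id):
    """Hash lookup, returning the same -1 sentinel when the ID is absent."""
    return table.get(global_id, -1)


# Hypothetical processor owning 4 atoms in a system whose largest ID is 1,000,000.
owned = [7, 250_000, 999_999, 1_000_000]
array_map = build_array_map(owned, max_id=1_000_000)
hash_map = build_hash_map(owned)

# Both maps answer the same question: global ID -> local index.
print(lookup_array(array_map, 250_000), lookup_hash(hash_map, 250_000))
# The storage cost differs: slots for every ID versus entries for owned atoms.
print(len(array_map), len(hash_map))
```

In this toy case the array holds over a million slots to serve four atoms, which is the memory cost the doc text warns about for large simulations.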
