Commit 92fff53b authored by Linus Torvalds
Pull SCSI updates from James Bottomley:
 "This is mostly update of the usual drivers: arcmsr, qla2xxx, lpfc,
  hisi_sas, target/iscsi and target/core.

  Additionally Christoph refactored gdth as part of the dma changes. The
  major mid-layer change this time is the removal of bidi commands and
  with them the whole of the osd/exofs driver and filesystem. This is a
  major simplification for block and mq in particular"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (240 commits)
  scsi: cxgb4i: validate tcp sequence number only if chip version <= T5
  scsi: cxgb4i: get pf number from lldi->pf
  scsi: core: replace GFP_ATOMIC with GFP_KERNEL in scsi_scan.c
  scsi: mpt3sas: Add missing breaks in switch statements
  scsi: aacraid: Fix missing break in switch statement
  scsi: kill command serial number
  scsi: csiostor: drop serial_number usage
  scsi: mvumi: use request tag instead of serial_number
  scsi: dpt_i2o: remove serial number usage
  scsi: st: osst: Remove negative constant left-shifts
  scsi: ufs-bsg: Allow reading descriptors
  scsi: ufs: Allow reading descriptor via raw upiu
  scsi: ufs-bsg: Change the calling convention for write descriptor
  scsi: ufs: Remove unused device quirks
  Revert "scsi: ufs: disable vccq if it's not needed by UFS device"
  scsi: megaraid_sas: Remove a bunch of set but not used variables
  scsi: clean obsolete return values of eh_timed_out
  scsi: sd: Optimal I/O size should be a multiple of physical block size
  scsi: MAINTAINERS: SCSI initiator and target tweaks
  scsi: fcoe: make use of fip_mode enum complete
  ...
parents a50243b1 26af1a36
+3 −2
@@ -6,9 +6,10 @@ Each UFS Host Controller should have its own node.
Required properties:
- compatible        : compatible list, contains one of the following -
					"hisilicon,hi3660-ufs", "jedec,ufs-1.1" for hisi ufs
					host controller present on Hi36xx chipset.
					host controller present on Hi3660 chipset.
					"hisilicon,hi3670-ufs", "jedec,ufs-2.1" for hisi ufs
					host controller present on Hi3670 chipset.
- reg               : should contain UFS register address space & UFS SYS CTRL register address,
- interrupt-parent  : interrupt device
- interrupts        : interrupt number
- clocks	        : List of phandle and clock specifier pairs
- clock-names       : List of clock input name strings sorted in the same
+8 −5
@@ -4,11 +4,14 @@ UFSHC nodes are defined to describe on-chip UFS host controllers.
Each UFS controller instance should have its own node.

Required properties:
- compatible		: must contain "jedec,ufs-1.1" or "jedec,ufs-2.0", may
			  also list one or more of the following:
					  "qcom,msm8994-ufshc"
					  "qcom,msm8996-ufshc"
					  "qcom,ufshc"
- compatible		: must contain "jedec,ufs-1.1" or "jedec,ufs-2.0"

			  For Qualcomm SoCs must contain, as below, an
			  SoC-specific compatible along with "qcom,ufshc" and
			  the appropriate jedec string:
			    "qcom,msm8994-ufshc", "qcom,ufshc", "jedec,ufs-2.0"
			    "qcom,msm8996-ufshc", "qcom,ufshc", "jedec,ufs-2.0"
			    "qcom,sdm845-ufshc", "qcom,ufshc", "jedec,ufs-2.0"
- interrupts        : <interrupt mapping for UFS host controller IRQ>
- reg               : <registers mapping>

+0 −185
===============================================================================
WHAT IS EXOFS?
===============================================================================

exofs is a file system that uses an OSD and exports the API of a normal Linux
file system. Users access exofs like any other local file system, and exofs
will in turn issue commands to the local OSD initiator.

OSD is a new T10 command set that views storage devices not as a large/flat
array of sectors but as a container of objects, each having a length, quota,
time attributes and more. Each object is addressed by a 64bit ID, and is
contained in a 64bit ID partition. Each object has associated attributes
attached to it, which are an integral part of the object and provide metadata about
the object. The standard defines some common obligatory attributes, but user
attributes can be added as needed.

===============================================================================
ENVIRONMENT
===============================================================================

To use this file system, you need to have an object store to run it on.  You
may download a target from:
http://open-osd.org

See Documentation/scsi/osd.txt for how to set up a working osd environment.

===============================================================================
USAGE
===============================================================================

1. Download and compile exofs and open-osd initiator:
  You need an external Kernel source tree or kernel headers from your
  distribution. (anything based on 2.6.26 or later).

  a. download open-osd including exofs source using:
     [parent-directory]$ git clone git://git.open-osd.org/open-osd.git

  b. Build the library module like this:
     [parent-directory]$ make -C KSRC=$(KER_DIR) open-osd

     This will build both the open-osd initiator as well as the exofs kernel
     module. Use whatever parameters you compiled your Kernel with and
     $(KER_DIR) above pointing to the Kernel you compile against. See the file
     open-osd/top-level-Makefile for an example.

2. Get the OSD initiator and target set up properly, and login to the target.
  See Documentation/scsi/osd.txt for further instructions. Also see ./do-osd
  for an example script that does all these steps.

3. Insmod the exofs.ko module:
   [exofs]$ insmod exofs.ko

4. Make sure the directory where you want to mount exists. If not, create it.
   (For example, mkdir /mnt/exofs)

5. At first run you will need to invoke the mkfs.exofs application

   As an example, this will create the file system on:
   /dev/osd0 partition ID 65536

   mkfs.exofs --pid=65536 --format /dev/osd0

   The --format is optional. If not specified, no OSD_FORMAT will be
   performed and a clean file system will be created in the specified pid,
   in the available space of the target. (Use --format=size_in_meg to limit
   the total LUN space available)

   If pid already exists, it will be deleted and a new one will be created in
   its place. Be careful.

   An exofs lives inside a single OSD partition. You can create multiple exofs
   filesystems on the same device using multiple pids.

   (run mkfs.exofs without any parameters for usage help message)

6. Mount the file system.

   For example, to mount /dev/osd0, partition ID 0x10000 on /mnt/exofs:

	mount -t exofs -o pid=65536 /dev/osd0 /mnt/exofs/

7. For reference (See do-exofs example script):
	do-exofs start - an example of how to perform the above steps.
	do-exofs stop - an example of how to unmount the file system.
	do-exofs format - an example of how to format and mkfs a new exofs.

8. Extra compilation flags (uncomment in fs/exofs/Kbuild):
	CONFIG_EXOFS_DEBUG - for debug messages and extra checks.

===============================================================================
exofs mount options
===============================================================================
Similar to any mount command:
	mount -t exofs -o exofs_options /dev/osdX mount_exofs_directory

Where:
    -t exofs: specifies the exofs file system

    /dev/osdX: X is a decimal number. /dev/osdX was created after a successful
               login into an OSD target.

    mount_exofs_directory: The directory to mount the file system on

    exofs specific options: Options are separated by commas (,)
		pid=<integer> - The partition number to mount/create as
                                container of the filesystem.
                                This option is mandatory. integer can be
                                Hex by pre-pending an 0x to the number.
		osdname=<id>  - Mount by a device's osdname.
                                osdname is usually a 36 character uuid of the
                                form "d2683732-c906-4ee1-9dbd-c10c27bb40df".
                                It is one of the device's uuid specified in the
                                mkfs.exofs format command.
                                If this option is specified then the /dev/osdX
                                above can be empty and is ignored.
                to=<integer>  - Timeout in ticks for a single command.
                                default is (60 * HZ) [for debugging only]
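
As an illustration of the pid option accepting either decimal or 0x-prefixed
hex, here is a minimal userspace sketch. parse_pid() is a hypothetical helper
written for this example, not the parser exofs itself used; it relies on
strtoull() with base 0, which also accepts octal input with a leading 0.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse the value of "pid=" into a 64-bit partition ID.
 * Returns 0 on success, -1 on empty, negative, malformed or out-of-range input.
 */
static int parse_pid(const char *s, uint64_t *pid)
{
	char *end;
	unsigned long long v;

	assert(pid != NULL);
	if (s == NULL || *s == '\0' || *s == '-')
		return -1;

	errno = 0;
	v = strtoull(s, &end, 0);	/* base 0: a "0x" prefix selects hex */
	if (errno != 0 || end == s || *end != '\0')
		return -1;

	*pid = (uint64_t)v;
	return 0;
}
```

With this helper, "65536" and "0x10000" yield the same partition ID, which
matches the mount example above.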

===============================================================================
DESIGN
===============================================================================

* The file system control block (AKA on-disk superblock) resides in an object
  with a special ID (defined in common.h).
  Information included in the file system control block is used to fill the
  in-memory superblock structure at mount time. This object is created before
  the file system is used by mkexofs.c. It contains information such as:
	- The file system's magic number
	- The next inode number to be allocated

* Each file resides in its own object and contains the data (and it will be
  possible to extend the file over multiple objects, though this has not been
  implemented yet).

* A directory is treated as a file, and essentially contains a list of <file
  name, inode #> pairs for files that are found in that directory. The object
  IDs correspond to the files' inode numbers and will be allocated according to
  a bitmap (stored in a separate object). Now they are allocated using a
  counter.

* Each file's control block (AKA on-disk inode) is stored in its object's
  attributes. This applies to both regular files and other types (directories,
  device files, symlinks, etc.).

* Credentials are generated per object (inode and superblock) when they are
  created in memory (read from disk or created). The credential works for all
  operations and is used as long as the object remains in memory.

* Async OSD operations are used whenever possible, but the target may execute
  them out of order. The operations that concern us are create, delete,
  readpage, writepage, update_inode, and truncate. The following pairs of
  operations should execute in the order written, and we need to prevent them
  from executing in reverse order:
	- The following are handled with the OBJ_CREATED and OBJ_2BCREATED
	  flags. OBJ_CREATED is set when we know the object exists on the OSD -
	  in create's callback function, and when we successfully do a
	  read_inode.
	  OBJ_2BCREATED is set in the beginning of the create function, so we
	  know that we should wait.
		- create/delete: delete should wait until the object is created
		  on the OSD.
		- create/readpage: readpage should be able to return a page
		  full of zeroes in this case. If there was a write already
		  en-route (i.e. create, writepage, readpage) then the page
		  would be locked, and so it would really be the same as
		  create/writepage.
		- create/writepage: if writepage is called for a sync write, it
		  should wait until the object is created on the OSD.
		  Otherwise, it should just return.
		- create/truncate: truncate should wait until the object is
		  created on the OSD.
		- create/update_inode: update_inode should wait until the
		  object is created on the OSD.
	- Handled by VFS locks:
		- readpage/delete: shouldn't happen because of page lock.
		- writepage/delete: shouldn't happen because of page lock.
		- readpage/writepage: shouldn't happen because of page lock.
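
The create-ordering rules above can be sketched as a small decision function.
This is only a model of the rules as written: the flag names come from the
text, while the bit values, the enum names and order_after_create() are
assumptions made for the example and are not taken from the exofs sources.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag names follow the text; the bit values are assumptions. */
enum { OBJ_2BCREATED = 1 << 0, OBJ_CREATED = 1 << 1 };

enum op { OP_DELETE, OP_READPAGE, OP_WRITEPAGE, OP_TRUNCATE, OP_UPDATE_INODE };

enum action {
	ACT_PROCEED,		/* object exists on the OSD, just issue the op */
	ACT_WAIT_FOR_CREATE,	/* block until create's callback has run */
	ACT_ZERO_PAGE,		/* readpage: hand back a page full of zeroes */
	ACT_RETURN		/* async writepage: return without writing */
};

/* Decide what an operation does relative to a possibly pending create. */
static enum action order_after_create(unsigned int flags, enum op op,
				      bool sync_write)
{
	if (flags & OBJ_CREATED)
		return ACT_PROCEED;

	/* Not created yet, so a create must at least have been started. */
	assert(flags & OBJ_2BCREATED);

	switch (op) {
	case OP_READPAGE:
		return ACT_ZERO_PAGE;
	case OP_WRITEPAGE:
		return sync_write ? ACT_WAIT_FOR_CREATE : ACT_RETURN;
	case OP_DELETE:
	case OP_TRUNCATE:
	case OP_UPDATE_INODE:
		return ACT_WAIT_FOR_CREATE;
	}
	return ACT_WAIT_FOR_CREATE;
}
```

The pairs handled by VFS locks are deliberately left out, since the page lock
already serializes them.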

===============================================================================
LICENSE/COPYRIGHT
===============================================================================
The exofs file system is based on ext2 v0.5b (distributed with the Linux kernel
version 2.6.10).  All files include the original copyrights, and the license
is GPL version 2 (only version 2, as is true for the Linux kernel).  The
Linux kernel can be downloaded from www.kernel.org.

Documentation/scsi/osd.txt

deleted 100644 → 0
+0 −197
The OSD Standard
================
OSD (Object-Based Storage Device) is a T10 SCSI command set that is designed
to provide efficient operation of input/output logical units that manage the
allocation, placement, and accessing of variable-size data-storage containers,
called objects. Objects are intended to contain operating system and application
constructs. Each object has associated attributes attached to it, which are
an integral part of the object and provide metadata about the object. The standard
defines some common obligatory attributes, but user attributes can be added as
needed.

See: http://www.t10.org/ftp/t10/drafts/osd2/ for the latest draft for OSD 2
or search the web for "OSD SCSI"

OSD in the Linux Kernel
=======================
osd-initiator:
  The main component of OSD in Kernel is the osd-initiator library. Its main
user is intended to be the pNFS-over-objects layout driver, which uses objects
as its back-end data storage. Other clients are the other osd parts listed below.

osd-uld:
  This is a SCSI ULD that registers for OSD type devices and provides a testing
platform, both for the in-kernel initiator as well as connected targets. It
currently has no useful user-mode API, though it could have if need be.

exofs:
  Is an OSD based Linux file system. It uses the osd-initiator and osd-uld,
to export a usable file system for users.
See Documentation/filesystems/exofs.txt for more details

osd target:
  There are no current plans for an OSD target implementation in kernel. For all
needs, a user-mode target that is based on the scsi tgt target framework is
available from Ohio Supercomputer Center (OSC) at:
http://www.open-osd.org/bin/view/Main/OscOsdProject
There are several other target implementations. See http://open-osd.org for more
links.

Files and Folders
=================
This is the complete list of files included in this work:
include/scsi/
	osd_initiator.h   Main API for the initiator library
	osd_types.h	  Common OSD types
	osd_sec.h	  Security Manager API
	osd_protocol.h	  Wire definitions of the OSD standard protocol
	osd_attributes.h  Wire definitions of OSD attributes

drivers/scsi/osd/
	osd_initiator.c   OSD-Initiator library implementation
	osd_uld.c	  The OSD scsi ULD
	osd_ktest.{h,c}	  In-kernel test suite (called by osd_uld)
	osd_debug.h	  Some printk macros
	Makefile	  For both in-tree and out-of-tree compilation
	Kconfig		  Enables inclusion of the different pieces
	osd_test.c	  User-mode application to call the kernel tests

The OSD-Initiator Library
=========================
osd_initiator is a low-level implementation of an osd initiator encoder.
Even so, it should be intuitive and easy to use. Perhaps over time a
higher level will form that automates some of the more common recipes.

init/fini:
- osd_dev_init() associates a scsi_device with an osd_dev structure
  and initializes some global pools. This should be done once per scsi_device
  (OSD LUN). The osd_dev structure is needed for calling osd_start_request().

- osd_dev_fini() cleans up before an osd_dev/scsi_device destruction.

OSD commands encoding, execution, and decoding of results:

struct osd_request is used to iteratively encode an OSD command and carry
its state throughout execution. Each request goes through these stages:

a. osd_start_request() allocates the request.

b. Any of the osd_req_* methods is used to encode a request of the specified
   type.

c. osd_req_add_{get,set}_attr_* may be called to add get/set attributes to the
   CDB. "List" or "Page" mode can be used exclusively. The attribute-list API
   can be called multiple times on the same request. However, only one
   attribute-page can be read, as mandated by the OSD standard.

d. osd_finalize_request() computes offsets into the data-in and data-out buffers
   and signs the request using the provided capability key and integrity-
   check parameters.

e. osd_execute_request() may be called to execute the request via the block
   layer and wait for its completion.  The request can be executed
   asynchronously by calling the block layer API directly.

f. After execution, osd_req_decode_sense() can be called to decode the request's
   sense information.

g. osd_req_decode_get_attr() may be called to retrieve osd_add_get_attr_list()
   values.

h. osd_end_request() must be called to deallocate the request and any resource
   associated with it. Note that osd_end_request cleans up the request at any
   stage and it must always be called after a successful osd_start_request().
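
The stage ordering above can be modelled in plain userspace C. This is a toy
sketch only: every toy_* name is hypothetical, none of it is the
osd_initiator API, and the optional attribute and decode steps (c, f, g) are
omitted. It shows stages advancing in order and the end call running on
every path after a successful start.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stages mirroring steps a, b, d and e of the list above. */
enum toy_stage { TOY_STARTED, TOY_ENCODED, TOY_FINALIZED, TOY_EXECUTED };

struct toy_request {
	enum toy_stage stage;
};

static struct toy_request *toy_start(void)
{
	struct toy_request *r = malloc(sizeof(*r));

	if (r)
		r->stage = TOY_STARTED;
	return r;
}

/* Advance only if the request is at the expected stage. */
static int toy_advance(struct toy_request *r, enum toy_stage from,
		       enum toy_stage to)
{
	assert(r != NULL);
	if (r->stage != from)
		return -1;
	r->stage = to;
	return 0;
}

static int toy_encode(struct toy_request *r)
{
	return toy_advance(r, TOY_STARTED, TOY_ENCODED);
}

static int toy_finalize(struct toy_request *r)
{
	return toy_advance(r, TOY_ENCODED, TOY_FINALIZED);
}

static int toy_execute(struct toy_request *r)
{
	return toy_advance(r, TOY_FINALIZED, TOY_EXECUTED);
}

/* Valid at any stage: releases the request and everything it owns. */
static void toy_end(struct toy_request *r)
{
	free(r);
}

/* Typical flow: one exit path, so toy_end() runs whatever fails. */
static int toy_run_once(void)
{
	struct toy_request *r = toy_start();
	int ret;

	if (!r)
		return -1;
	ret = toy_encode(r);
	if (!ret)
		ret = toy_finalize(r);
	if (!ret)
		ret = toy_execute(r);
	toy_end(r);
	return ret;
}
```

Funnelling every outcome through a single end call is one way to honour the
rule in step h; other structures (for example goto-based cleanup) would work
equally well.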

osd_request's structure:

The OSD standard defines a complex structure of IO segments pointed to by
members in the CDB. Up to 3 segments can be deployed in the IN-Buffer and up to
4 in the OUT-Buffer. The ASCII illustration below depicts a secure-read with
associated get+set of attributes-lists. Other combinations vary on the same
basic theme, from no-segments-used up to all-segments-used.

|________OSD-CDB__________|
|                         |
|read_len (offset=0)     -|---------\
|                         |         |
|get_attrs_list_length    |         |
|get_attrs_list_offset   -|----\    |
|                         |    |    |
|retrieved_attrs_alloc_len|    |    |
|retrieved_attrs_offset  -|----|----|-\
|                         |    |    | |
|set_attrs_list_length    |    |    | |
|set_attrs_list_offset   -|-\  |    | |
|                         | |  |    | |
|in_data_integ_offset    -|-|--|----|-|-\
|out_data_integ_offset   -|-|--|--\ | | |
\_________________________/ |  |  | | | |
                            |  |  | | | |
|_______OUT-BUFFER________| |  |  | | | |
|      Set attr list      |</  |  | | | |
|                         |    |  | | | |
|-------------------------|    |  | | | |
|   Get attr descriptors  |<---/  | | | |
|                         |       | | | |
|-------------------------|       | | | |
|    Out-data integrity   |<------/ | | |
|                         |         | | |
\_________________________/         | | |
                                    | | |
|________IN-BUFFER________|         | | |
|      In-Data read       |<--------/ | |
|                         |           | |
|-------------------------|           | |
|      Get attr list      |<----------/ |
|                         |             |
|-------------------------|             |
|    In-data integrity    |<------------/
|                         |
\_________________________/

A block device request can carry bidirectional payload by means of associating
a bidi_read request with a main write-request. Each in/out request is described
by a chain of BIOs associated with each request.
The CDB is of a SCSI VARLEN CDB format, as described by OSD standard.
The OSD standard also mandates alignment restrictions at start of each segment.

In the code, in struct osd_request, there are two _osd_io_info structures to
describe the IN/OUT buffers above, two BIOs for the data payload and up to five
_osd_req_data_segment structures to hold the different segments allocation and
information.

Important: We have chosen to disregard the assumption that a BIO-chain (and
the resulting sg-list) describes a linear memory buffer. Meaning only first and
last scatter chain can be incomplete and all the middle chains are of PAGE_SIZE.
For us, a scatter-gather-list, as its name implies and as used by the Networking
layer, is to describe a vector of buffers that will be transferred to/from the
wire. It works very well with current iSCSI transport. iSCSI is currently the
only deployed OSD transport. In the future we anticipate SAS and FC attached OSD
devices as well.

The OSD Testing ULD
===================
TODO: More user-mode control on tests.

Authors, Mailing list
=====================
Please communicate with us on any deployment of osd, whether using this code
or not.

Any problems, questions, bug reports, lonely OSD nights, please email:
   OSD Dev List <osd-dev@open-osd.org>

More up-to-date information can be found on:
http://open-osd.org

Boaz Harrosh <ooo@electrozaur.com>

References
==========
Weber, R., "SCSI Object-Based Storage Device Commands",
T10/1355-D ANSI/INCITS 400-2004,
http://www.t10.org/ftp/t10/drafts/osd/osd-r10.pdf

Weber, R., "SCSI Object-Based Storage Device Commands -2 (OSD-2)"
T10/1729-D, Working Draft, rev. 3
http://www.t10.org/ftp/t10/drafts/osd2/osd2r03.pdf
+11 −0
@@ -147,6 +147,17 @@ send SG_IO with the applicable sg_io_v4:
	io_hdr_v4.max_response_len = reply_len;
	io_hdr_v4.request_len = request_len;
	io_hdr_v4.request = (__u64)request_upiu;
	if (dir == SG_DXFER_TO_DEV) {
		io_hdr_v4.dout_xfer_len = (uint32_t)byte_cnt;
		io_hdr_v4.dout_xferp = (uintptr_t)(__u64)buff;
	} else {
		io_hdr_v4.din_xfer_len = (uint32_t)byte_cnt;
		io_hdr_v4.din_xferp = (uintptr_t)(__u64)buff;
	}

If you wish to read or write a descriptor, use the appropriate xferp of
sg_io_v4.
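
A minimal sketch of how the direction decides which xferp is filled, assuming
the Linux uapi headers <linux/bsg.h> and <scsi/sg.h> are available.
ufs_bsg_fill_hdr() and its parameter names are hypothetical, and the guard,
protocol and subprotocol values are the generic BSG ones, assumed here rather
than taken from the lines above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <linux/bsg.h>	/* struct sg_io_v4, BSG_PROTOCOL_SCSI, ... */
#include <scsi/sg.h>	/* SG_DXFER_TO_DEV, SG_DXFER_FROM_DEV */

/*
 * Hypothetical helper: fill an sg_io_v4 header for one UFS BSG request.
 * dir selects whether buff is described by the dout or the din fields.
 */
static void ufs_bsg_fill_hdr(struct sg_io_v4 *hdr, int dir,
			     void *request_upiu, uint32_t request_len,
			     void *reply_upiu, uint32_t reply_len,
			     void *buff, uint32_t byte_cnt)
{
	assert(dir == SG_DXFER_TO_DEV || dir == SG_DXFER_FROM_DEV);

	memset(hdr, 0, sizeof(*hdr));
	hdr->guard = 'Q';				/* marks a v4 header */
	hdr->protocol = BSG_PROTOCOL_SCSI;
	hdr->subprotocol = BSG_SUB_PROTOCOL_SCSI_TRANSPORT;
	hdr->request = (uintptr_t)request_upiu;
	hdr->request_len = request_len;
	hdr->response = (uintptr_t)reply_upiu;
	hdr->max_response_len = reply_len;

	if (dir == SG_DXFER_TO_DEV) {
		/* Write descriptor: payload travels host -> device. */
		hdr->dout_xferp = (uintptr_t)buff;
		hdr->dout_xfer_len = byte_cnt;
	} else {
		/* Read descriptor: payload travels device -> host. */
		hdr->din_xferp = (uintptr_t)buff;
		hdr->din_xfer_len = byte_cnt;
	}
}
```

The filled header would then be passed to ioctl(fd, SG_IO, &hdr) on the
ufs-bsg device node; opening the node and building the UPIU are outside the
scope of this sketch.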


UFS Specifications can be found at,
UFS - http://www.jedec.org/sites/default/files/docs/JESD220.pdf