update documentation to clarify non-UMI and dual-indexed technologies (8ac5f5c2) · Commits · github_fork / Universc

README.Rmd

+22 −1

Original line number	Original line	Diff line number	Diff line
	@@ -212,12 +212,24 @@ configured with the `--chemistry` argument.
	For other technologies, the template switching oligonucleotide		For other technologies, the template switching oligonucleotide
	is automatically converted to match the 10x sequence.		is automatically converted to match the 10x sequence.

			By default, UMIs are used where available, with the following
			exceptions for non-UMI technologies:
			ICELL8 v2, RamDA-Seq, Quartz-Seq, Smart-Seq, Smart-Seq2.
			Other techniques can be forced to replace the UMI with a mock sequence,
			so that only reads are counted, with the `--non-umi` or `--read-only` arguments.
			Forcing non-UMI processing is _not recommended_ unless you are
			integrating non-UMI and UMI-based technologies. It is not necessary
			to specify `--non-umi` for non-UMI techniques, as non-UMI settings are applied
			automatically when applicable. For ICELL8 and Smart-Seq, where both
			non-UMI (icell8-v2, smartseq2) and UMI-based (icell8-v3, smartseq3)
			techniques are available, it is possible to specify which to use.
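
The defaults described above amount to a lookup on the technology name. The following is a minimal sketch in shell, not UniverSC's actual code: the function name `defaults_to_non_umi` is hypothetical, and only `icell8-v2` and `smartseq2` are spellings taken from the text above; the other names are assumed spellings used for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) if the technology is one of the
# non-UMI technologies listed above, and fails (exit 1) otherwise.
defaults_to_non_umi() {
    # Lower-case the name so "SmartSeq2" and "smartseq2" are treated alike.
    tech=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    case "$tech" in
        # Only icell8-v2 and smartseq2 are spelled as in the text;
        # the remaining names are assumptions.
        icell8-v2|ramda-seq|quartz-seq|smartseq|smartseq2) return 0 ;;
        *) return 1 ;;
    esac
}

for t in icell8-v2 icell8-v3 smartseq2 smartseq3; do
    if defaults_to_non_umi "$t"; then
        echo "$t: non-UMI by default"
    else
        echo "$t: UMI-based by default"
    fi
done
```

Under these assumptions, the UMI-based variants (icell8-v3, smartseq3) would only be treated as non-UMI if forced with `--non-umi` or `--read-only`.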

	Single indexes are supported for STRT-Seq, Quartz-Seq, and RamDA-Seq.		Single indexes are supported for STRT-Seq, Quartz-Seq, and RamDA-Seq.
	Dual indexes are supported for inDrops-v3, SCI-RNA-Seq, scifi-seq, and Smart-Seq.		Dual indexes are supported for inDrops-v3, SCI-RNA-Seq, scifi-seq, and Smart-Seq.
	Combinatorial indexing technologies have linkers between barcodes removed		Combinatorial indexing technologies have linkers between barcodes removed
	automatically to match the barcode whitelist.		automatically to match the barcode whitelist.

	#### Dual-indexing		#### Demultiplexing for dual-indexing

	For dual-indexed technologies such as inDrops-v3, Sci-Seq, SmartSeq3 it is advised to use "bcl2fastq"		For dual-indexed technologies such as inDrops-v3, Sci-Seq, SmartSeq3 it is advised to use "bcl2fastq"
	before calling UniverSC:		before calling UniverSC:
	@@ -229,6 +241,15 @@ before calling UniverSC:
	--minimum-trimmed-read-length 0		--minimum-trimmed-read-length 0
	```		```

			Please adjust the lengths for `--use-bases-mask` to match read 1, index 1 (i7), index 2 (i5), and read 2.
			Ensure that `--create-fastq-for-index-reads` is used where possible. If a sequencing facility has demultiplexed
			the samples for you without this, UniverSC will attempt to extract index sequences from FASTQ headers in read 1.
			Using `--no-lane-splitting` is optional as UniverSC can process an arbitrary number of lanes.
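
As an illustration of adjusting `--use-bases-mask`, the mask can be assembled from the four cycle counts. This is a sketch under stated assumptions: the helper `bases_mask` is hypothetical, and the lengths (26, 8, 8, 98) are example values only, to be replaced with the cycle counts of your own run.

```shell
#!/bin/sh
# Hypothetical helper: build a bcl2fastq --use-bases-mask value from the
# number of cycles in read 1, index 1 (i7), index 2 (i5), and read 2.
# "Y" marks cycles kept as sequence reads and "I" marks index cycles.
bases_mask() {
    printf 'Y%s,I%s,I%s,Y%s\n' "$1" "$2" "$3" "$4"
}

# Example lengths only; check the run parameters for the real values.
mask=$(bases_mask 26 8 8 98)
echo "--use-bases-mask $mask"
```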

			There is no need to specify index sequences in the sample sheet for cell barcodes: using "NNNNNNNN" will match all
			samples, and the cell barcodes will be distinguished by the single-cell processing pipeline. Index sequences should
			only be used to demultiplex samples and replicates (not cells).
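
For illustration, a sample sheet along these lines uses all-N index entries so that no reads are filtered by index. This is a hedged sketch: the file name, sample ID, and sample name are placeholders, and the column layout assumes the common bcl2fastq `[Data]` section format, so check it against your facility's template.

```shell
#!/bin/sh
# Write a minimal sample sheet with wildcard ("N") indexes.
# The sample identifiers below are placeholders, not real samples.
cat > SampleSheet.csv <<'EOF'
[Data]
Lane,Sample_ID,Sample_Name,index,index2
1,sample1,sample1,NNNNNNNN,NNNNNNNN
EOF

# Show the index columns that were written.
awk -F, 'NR > 2 { print $4, $5 }' SampleSheet.csv
```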

	#### Custom inputs		#### Custom inputs

	Custom inputs are also supported by giving the name "custom" and length of barcode and UMI separated by a "_" character.		Custom inputs are also supported by giving the name "custom" and length of barcode and UMI separated by a "_" character.

README.html

+4 −1

README.md

+23 −2

Original line number	Original line	Diff line number	Diff line
	@@ -6,7 +6,7 @@ affiliations:
	index: 1		index: 1
	- name: "RIKEN Center for Sustainable Resource Sciences, Suehiro-cho-1-7-22, Tsurumi Ward, Yokohama, Kanagawa 230-0045, Japan"		- name: "RIKEN Center for Sustainable Resource Sciences, Suehiro-cho-1-7-22, Tsurumi Ward, Yokohama, Kanagawa 230-0045, Japan"
	index: 2		index: 2
	date: "Tuesday 27 April 2021"		date: "Wednesday 28 April 2021"
	output:		output:
	prettydoc::html_pretty:		prettydoc::html_pretty:
	theme: cayman		theme: cayman
	@@ -212,12 +212,24 @@ configured with the `--chemistry` argument.
	For other technologies, the template switching oligonucleotide		For other technologies, the template switching oligonucleotide
	is automatically converted to match the 10x sequence.		is automatically converted to match the 10x sequence.

			By default, UMIs are used where available, with the following
			exceptions for non-UMI technologies:
			ICELL8 v2, RamDA-Seq, Quartz-Seq, Smart-Seq, Smart-Seq2.
			Other techniques can be forced to replace the UMI with a mock sequence,
			so that only reads are counted, with the `--non-umi` or `--read-only` arguments.
			Forcing non-UMI processing is _not recommended_ unless you are
			integrating non-UMI and UMI-based technologies. It is not necessary
			to specify `--non-umi` for non-UMI techniques, as non-UMI settings are applied
			automatically when applicable. For ICELL8 and Smart-Seq, where both
			non-UMI (icell8-v2, smartseq2) and UMI-based (icell8-v3, smartseq3)
			techniques are available, it is possible to specify which to use.

	Single indexes are supported for STRT-Seq, Quartz-Seq, and RamDA-Seq.		Single indexes are supported for STRT-Seq, Quartz-Seq, and RamDA-Seq.
	Dual indexes are supported for inDrops-v3, SCI-RNA-Seq, scifi-seq, and Smart-Seq.		Dual indexes are supported for inDrops-v3, SCI-RNA-Seq, scifi-seq, and Smart-Seq.
	Combinatorial indexing technologies have linkers between barcodes removed		Combinatorial indexing technologies have linkers between barcodes removed
	automatically to match the barcode whitelist.		automatically to match the barcode whitelist.

	#### Dual-indexing		#### Demultiplexing for dual-indexing

	For dual-indexed technologies such as inDrops-v3, Sci-Seq, SmartSeq3 it is advised to use "bcl2fastq"		For dual-indexed technologies such as inDrops-v3, Sci-Seq, SmartSeq3 it is advised to use "bcl2fastq"
	before calling UniverSC:		before calling UniverSC:
	@@ -229,6 +241,15 @@ before calling UniverSC:
	--minimum-trimmed-read-length 0		--minimum-trimmed-read-length 0
	```		```

			Please adjust the lengths for `--use-bases-mask` to match read 1, index 1 (i7), index 2 (i5), and read 2.
			Ensure that `--create-fastq-for-index-reads` is used where possible. If a sequencing facility has demultiplexed
			the samples for you without this, UniverSC will attempt to extract index sequences from FASTQ headers in read 1.
			Using `--no-lane-splitting` is optional as UniverSC can process an arbitrary number of lanes.
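
The header fallback mentioned above can be pictured as follows. This is not UniverSC's own code, only a sketch that assumes the usual Illumina header layout, where the last colon-separated field holds the sample index (for dual indexes, `i7+i5`). The function name `header_indexes` and the example read are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the index sequence(s) recorded in each FASTQ
# header read from standard input. Header lines are every 4th line,
# starting from line 1, and are assumed to look like:
#   @M00001:1:FLOWCELL:1:1101:1000:2000 1:N:0:ACGTACGT+TGCATGCA
header_indexes() {
    awk 'NR % 4 == 1 {
        n = split($0, fields, ":")        # last ":"-separated field is the index
        m = split(fields[n], idx, "[+]")  # dual indexes are joined with "+"
        if (m == 2) print idx[1], idx[2]; else print idx[1]
    }'
}

# One made-up read: header, sequence, separator, qualities.
printf '%s\n' \
    '@M00001:1:FLOWCELL:1:1101:1000:2000 1:N:0:ACGTACGT+TGCATGCA' \
    'ACGTACGTACGT' '+' 'IIIIIIIIIIII' | header_indexes
```

If the facility's headers carry no index field, there is nothing to recover this way, which is why producing index FASTQ files with `--create-fastq-for-index-reads` is the safer route.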

			There is no need to specify index sequences in the sample sheet for cell barcodes: using "NNNNNNNN" will match all
			samples, and the cell barcodes will be distinguished by the single-cell processing pipeline. Index sequences should
			only be used to demultiplex samples and replicates (not cells).

	#### Custom inputs		#### Custom inputs

	Custom inputs are also supported by giving the name "custom" and length of barcode and UMI separated by a "_" character.		Custom inputs are also supported by giving the name "custom" and length of barcode and UMI separated by a "_" character.

man/launch_universc.sh

+25 −2

Original line number	Original line	Diff line number	Diff line
	@@ -95,6 +95,11 @@ Provides a conversion script to run multiple technologies and custom libraries w
	/usr/local/bin/bcl2fastq -v --runfolder-dir "/path/to/illumina/bcls" --output-dir "./Data/Intensities/BaseCalls"\		/usr/local/bin/bcl2fastq -v --runfolder-dir "/path/to/illumina/bcls" --output-dir "./Data/Intensities/BaseCalls"\
	--sample-sheet "/path/to/SampleSheet.csv" --create-fastq-for-index-reads		--sample-sheet "/path/to/SampleSheet.csv" --create-fastq-for-index-reads

			An Index 1 file is required for the following technologies, in addition to those requiring Index 2.
			UniverSC will attempt to extract the index from Read 1 headers if the file is not found:

			inDrops-v3, STRT-Seq-C1

	-I2, --index2 FILE		-I2, --index2 FILE
	Index (I2) FASTQ file to pass to Cell Ranger (OPTIONAL). Contains the indexes		Index (I2) FASTQ file to pass to Cell Ranger (OPTIONAL). Contains the indexes
	for each sample. (In the case of Illumina paired-ends these are the i5 indexes).		for each sample. (In the case of Illumina paired-ends these are the i5 indexes).
	@@ -149,6 +154,11 @@ Provides a conversion script to run multiple technologies and custom libraries w
	Note: processing dual-indexed files is not stable. If behaviour is not as you expect,		Note: processing dual-indexed files is not stable. If behaviour is not as you expect,
	we welcome you to contact us on GitHub to help you out.		we welcome you to contact us on GitHub to help you out.

			Index 1 and Index 2 files are required for the following technologies.
			UniverSC will attempt to extract the indexes from Read 1 headers if the files are not found:

			SCI-RNA-Seq, SCI-RNA-Seq3, scifi-seq, Smart-Seq2, Smart-Seq3, STRT-Seq-2i

	-f, --file NAME		-f, --file NAME
	Path and the name of FASTQ files to pass to Cell Ranger (prefix before R1 or R2)		Path and the name of FASTQ files to pass to Cell Ranger (prefix before R1 or R2)

	@@ -227,6 +237,19 @@ Provides a conversion script to run multiple technologies and custom libraries w
	Where no known barcodes are available all possible barcodes of the expected length are		Where no known barcodes are available all possible barcodes of the expected length are
	generated and converted if the permutations have not been computed already.		generated and converted if the permutations have not been computed already.

			Linkers are automatically removed from the following technologies:

			BD Rhapsody, inDrops-v1, Microwell-Seq, SCI-Seq3, Split-Seq, Smart-Seq2, Smart-Seq3, SureCell

			The following technologies default to non-UMI parameters (others can be forced):

			ICELL8-v2, RamDA-Seq, Quartz-Seq, Smart-Seq, Smart-Seq2

			The following technologies require Index 1 or Index 2 sequences (see above):

			inDrops-v3, SCI-RNA-Seq, SCI-RNA-Seq3, scifi-seq, Smart-Seq2, Smart-Seq3, STRT-Seq-2i, STRT-Seq-C1


	-b, --barcodefile FILE		-b, --barcodefile FILE
	Custom barcode list in plain text (with each line containing a barcode). Please provide		Custom barcode list in plain text (with each line containing a barcode). Please provide
	the name of a text file in the working directory or the path to it.		the name of a text file in the working directory or the path to it.
