Commit 8ed543ee authored by Jun Zhao

documentation update

parent ac076f09
+6 −2
@@ -3,12 +3,16 @@
export(STGmarkerFinder)
export(getDAcells)
export(getDAregion)
export(plotCellLabel)
export(plotCellScore)
export(plotDAsite)
export(runSTG)
export(updateDAcells)
import(RANN)
import(ggplot2)
import(Seurat)
import(cowplot)
import(ggplot2)
import(glmnet)
import(reticulate)
import(scales)
import(tclust)
importFrom(caret,createFolds)
+4 −6
@@ -6,8 +6,7 @@
#'
#' @param X matrix, normalized expression matrix of all cells in the dataset, genes are in rows,
#' rownames must be gene names
#' @param cell.idx result "da.cell.idx" from the output of function getDAcells
#' @param da.region.label result "cluster.res" from the output of function getDAregion
#' @param da.regions output from the function getDAregion()
#' @param da.regions.to.run numeric (vector), which DA regions to run the marker finder,
#' default is to run all regions
#' @param lambda numeric, regularization parameter that weights the number of selected genes,
@@ -35,7 +34,7 @@
#' @export

STGmarkerFinder <- function(
  X, cell.idx, da.region.label,
  X, da.regions,
  da.regions.to.run = NULL,
  lambda = 1.2, n.runs = 5, return.model = F,
  python.use = "/usr/bin/python", GPU = ""
@@ -55,15 +54,14 @@ STGmarkerFinder <- function(
  X.py <- r_to_py(as.matrix(X))

  # get DA regions to run
  n.da <- length(unique(da.region.label)) - 1
  n.da <- length(unique(da.regions$da.region.label)) - 1
  if(is.null(da.regions.to.run)){
    # seq_len() yields an empty vector when n.da is 0, whereas 1:n.da would give c(1, 0)
    da.regions.to.run <- seq_len(n.da)
  }

  # create DA label vector
  n.cells <- ncol(X)
  da.label <- rep(0, n.cells)
  da.label[cell.idx] <- da.region.label
  da.label <- da.regions$da.region.label


  # run model for each da region
+1 −1
@@ -3,7 +3,7 @@
## Introduction
DA-seq is a method to detect cell subpopulations with differential abundance (DA) between single-cell RNA-seq (scRNA-seq) datasets from different samples. It is described in the preprint "Detecting regions of differential abundance between scRNA-Seq datasets", available [here](https://www.biorxiv.org/content/10.1101/711929v2). Given a low-dimensional transformation, for example principal component analysis (PCA), of the merged gene expression matrices from different samples (cell states, conditions, etc.), DA-seq first computes a score vector for each cell that represents the DA behavior in its neighborhood and uses it to select cells in the most DA areas; it then groups these cells into distinct DA regions.

This repository contains the DA-seq package.
[This](https://github.com/KlugerLab/DAseq) repository contains the DA-seq package.
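
The scoring step described in the introduction can be illustrated with a minimal base-R sketch. This is not the package's implementation: `multiscale_score` is a hypothetical helper, the normalized-difference score is an assumed formula chosen for illustration, and the toy data are simulated. It only shows the idea of computing, for each cell, one score per neighborhood size k.

```r
# Hypothetical helper, NOT part of the DA-seq API.
# X: N-by-p matrix (cells in rows), e.g. PCA coordinates.
# is.cond2: logical vector of length N, TRUE for cells from condition 2.
# k.vector: integer vector of neighborhood sizes.
# Assumes no duplicated points, so each cell is its own nearest neighbor.
multiscale_score <- function(X, is.cond2, k.vector) {
  d <- as.matrix(dist(X))            # pairwise Euclidean distances
  nn.order <- t(apply(d, 1, order))  # row i: all cells sorted by distance to cell i
  n1 <- sum(!is.cond2)
  n2 <- sum(is.cond2)
  sapply(k.vector, function(k) {
    # columns 2..(k + 1): skip column 1, which is the cell itself
    nn <- nn.order[, 2:(k + 1), drop = FALSE]
    frac2 <- rowSums(matrix(is.cond2[nn], nrow = nrow(nn))) / n2
    frac1 <- rowSums(matrix(!is.cond2[nn], nrow = nrow(nn))) / n1
    # assumed score: normalized difference of relative abundances, in [-1, 1]
    (frac2 - frac1) / (frac2 + frac1)
  })
}

# Toy data: two well-separated groups of 50 cells, one per condition
set.seed(1)
X <- rbind(matrix(rnorm(100, mean = 0), ncol = 2),
           matrix(rnorm(100, mean = 3), ncol = 2))
is.cond2 <- rep(c(FALSE, TRUE), each = 50)
scores <- multiscale_score(X, is.cond2, k.vector = c(5, 10, 20))
```

Each row of `scores` is the score vector of one cell, with one column per value of k. In DA-seq these vectors are the input to the classifier that selects DA cells; that step is not sketched here.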

## R Dependencies
Required packages: RANN, tclust, ggplot2, cowplot, RColorBrewer, scales, reticulate
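
As a convenience, the snippet below checks which of the packages listed above are not yet installed. It is only a sketch: the package names are copied from the list above, and the installation call is left commented out on the assumption that the packages are to be installed from CRAN.

```r
# Packages named in the "Required packages" list above
required <- c("RANN", "tclust", "ggplot2", "cowplot",
              "RColorBrewer", "scales", "reticulate")

# requireNamespace() returns TRUE/FALSE without attaching the package
is.installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
missing.pkgs <- required[!is.installed]

if (length(missing.pkgs) > 0) {
  message("Missing packages: ", paste(missing.pkgs, collapse = ", "))
  # install.packages(missing.pkgs)  # uncomment to install (assumes CRAN)
}
```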
+1 −1
@@ -82,7 +82,7 @@
<h2 class="hasAnchor">
<a href="#introduction" class="anchor"></a>Introduction</h2>
<p>DA-seq is a method to detect cell subpopulations with differential abundance (DA) between single-cell RNA-seq (scRNA-seq) datasets from different samples. It is described in the preprint “Detecting regions of differential abundance between scRNA-Seq datasets”, available <a href="https://www.biorxiv.org/content/10.1101/711929v2">here</a>. Given a low-dimensional transformation, for example principal component analysis (PCA), of the merged gene expression matrices from different samples (cell states, conditions, etc.), DA-seq first computes a score vector for each cell that represents the DA behavior in its neighborhood and uses it to select cells in the most DA areas; it then groups these cells into distinct DA regions.</p>
<p>This repository contains the DA-seq package.</p>
<p><a href="https://github.com/KlugerLab/DAseq">This</a> repository contains the DA-seq package.</p>
</div>
<div id="r-dependencies" class="section level2">
<h2 class="hasAnchor">
+22 −53
@@ -38,11 +38,6 @@

<meta property="og:title" content="DA-seq Step 1 &amp; Step 2: select DA cells — getDAcells" />

<meta property="og:description" content="Step 1: compute a multiscale score measure for each cell of its k-nearest-neighborhood for
multiple values of k.
Step 2: train a logistic regression classifier based on the multiscale score measure and retain cells
that may reside in DA regions." />




@@ -125,36 +120,31 @@ that may reside in DA regions." />

    <div class="ref-description">

    <p>Step 1: compute a multiscale score measure for each cell of its k-nearest-neighborhood for
multiple values of k.
Step 2: train a logistic regression classifier based on the multiscale score measure and retain cells
that may reside in DA regions.</p>
    
    </div>

    <pre class="usage"><span class='fu'>getDAcells</span>(<span class='no'>X</span>, <span class='no'>cell.labels</span>, <span class='no'>labels.1</span>, <span class='no'>labels.2</span>, <span class='no'>k.vector</span>, <span class='kw'>k.folds</span> <span class='kw'>=</span> <span class='fl'>10</span>,
  <span class='kw'>n.runs</span> <span class='kw'>=</span> <span class='fl'>10</span>, <span class='kw'>pred.thres</span> <span class='kw'>=</span> <span class='fu'><a href='https://rdrr.io/r/base/c.html'>c</a></span>(<span class='fl'>0.05</span>, <span class='fl'>0.95</span>), <span class='kw'>do.plot</span> <span class='kw'>=</span> <span class='no'>T</span>,
  <span class='kw'>plot.embedding</span> <span class='kw'>=</span> <span class='kw'>NULL</span>, <span class='kw'>size</span> <span class='kw'>=</span> <span class='fl'>0.5</span>, <span class='kw'>python.use</span> <span class='kw'>=</span> <span class='st'>"/usr/bin/python"</span>,
  <span class='kw'>GPU</span> <span class='kw'>=</span> <span class='st'>""</span>)</pre>
    <pre class="usage"><span class='fu'>getDAcells</span>(<span class='no'>X</span>, <span class='no'>cell.labels</span>, <span class='no'>labels.1</span>, <span class='no'>labels.2</span>, <span class='kw'>k.vector</span> <span class='kw'>=</span> <span class='kw'>NULL</span>,
  <span class='kw'>k.folds</span> <span class='kw'>=</span> <span class='fl'>10</span>, <span class='kw'>n.runs</span> <span class='kw'>=</span> <span class='fl'>10</span>, <span class='kw'>es.patience</span> <span class='kw'>=</span> <span class='fl'>10</span>, <span class='kw'>k.smooth</span> <span class='kw'>=</span> <span class='kw'>NULL</span>,
  <span class='kw'>pred.thres</span> <span class='kw'>=</span> <span class='fu'><a href='https://rdrr.io/r/base/c.html'>c</a></span>(<span class='fl'>0.05</span>, <span class='fl'>0.95</span>), <span class='kw'>do.plot</span> <span class='kw'>=</span> <span class='no'>T</span>, <span class='kw'>plot.embedding</span> <span class='kw'>=</span> <span class='kw'>NULL</span>,
  <span class='kw'>size</span> <span class='kw'>=</span> <span class='fl'>0.5</span>, <span class='kw'>python.use</span> <span class='kw'>=</span> <span class='st'>"/usr/bin/python"</span>, <span class='kw'>GPU</span> <span class='kw'>=</span> <span class='st'>""</span>)</pre>
    
    <h2 class="hasAnchor" id="arguments"><a class="anchor" href="#arguments"></a>Arguments</h2>
    <table class="ref-arguments">
    <colgroup><col class="name" /><col class="desc" /></colgroup>
    <tr>
      <th>X</th>
      <td><p>size N-by-p matrix, input merged dataset of interest after dimension reduction</p></td>
      <td><p>size N-by-p matrix, input merged dataset of interest after dimension reduction.</p></td>
    </tr>
    <tr>
      <th>cell.labels</th>
      <td><p>size N vector, labels for each input cell</p></td>
      <td><p>size N character vector, labels for each input cell</p></td>
    </tr>
    <tr>
      <th>labels.1</th>
      <td><p>vector, label name(s) that represent condition 1</p></td>
      <td><p>character vector, label name(s) that represent condition 1</p></td>
    </tr>
    <tr>
      <th>labels.2</th>
      <td><p>vector, label name(s) that represent condition 2</p></td>
      <td><p>character vector, label name(s) that represent condition 2</p></td>
    </tr>
    <tr>
      <th>k.vector</th>
@@ -169,45 +159,24 @@ that may reside in DA regions.</p>
      <td><p>integer, number of times to run the neural network to get the predictions, default 10</p></td>
    </tr>
    <tr>
      <th>pred.thres</th>
      <td><p>length-2 vector, top and bottom threshold on the predictions from the
logistic classification, default c(0.05,0.95)</p></td>
    </tr>
    <tr>
      <th>do.plot</th>
      <td><p>a logical value to indicate whether to return ggplot objects showing the results,
default True</p></td>
    </tr>
    <tr>
      <th>plot.embedding</th>
      <td><p>size N-by-2 matrix, 2D embedding for the cells</p></td>
    </tr>
    <tr>
      <th>size</th>
      <td><p>cell size to use in the plot, default 0.5</p></td>
    </tr>
    <tr>
      <th>python.use</th>
      <td><p>character string, the Python to use, default "/usr/bin/python"</p></td>
      <th>es.patience</th>
      <td><p>integer, patience parameter used in the EarlyStopping procedure, default 10</p></td>
    </tr>
    <tr>
      <th>GPU</th>
      <td><p>which GPU to use, default '', using CPU</p></td>
      <th>k.smooth</th>
      <td><p>integer, number of nearest neighbors used to smooth the prediction scores,
default 1% of cells</p></td>
    </tr>
    <tr>
      <th>pred.thres</th>
      <td><p>length-2 vector, top and bottom threshold on the predictions from the
logistic classification, default c(0.05,0.95)</p></td>
    </tr>
    <tr>
      <th>do.plot</th>
      <td><p>a logical value to indicate whether to return ggplot objects showing the results,
default True</p></td>
    </tr>
    <tr>
      <th>plot.embedding</th>
      <td><p>size N-by-2 matrix, 2D embedding for the cells</p></td>
    </tr>
    <tr>
      <th>size</th>
      <td><p>cell size to use in the plot, default 0.5</p></td>
    </tr>
    <tr>
      <th>python.use</th>
      <td><p>character string, the Python to use, default "/usr/bin/python"</p></td>
    </tr>
    <tr>
      <th>GPU</th>
      <td><p>which GPU to use, default '', using CPU</p></td>
    </tr>
    </table>
    
    <h2 class="hasAnchor" id="value"><a class="anchor" href="#value"></a>Value</h2>

    <p>a list of results</p><dl>
  <dt>da.ratio</dt><dd><p>score vector for each cell</p></dd>
  <dt>da.pred</dt><dd><p>(mean) prediction from the neural network</p></dd>
  <dt>da.cell.idx</dt><dd><p>index for DA cells</p></dd>
  <dt>pred.plot</dt><dd><p>ggplot object showing the predictions of logistic regression on plot.embedding</p></dd>
  <dt>da.cells.plot</dt><dd><p>ggplot object highlighting cells of da.cell.idx on plot.embedding</p></dd>

</dl>

    

  </div>
  <div class="col-md-3 hidden-xs hidden-sm" id="pkgdown-sidebar">