Commit b75f726e authored by smorabit's avatar smorabit
Browse files

isoseq tutorial

parent 42a4bf7a
Loading
Loading
Loading
Loading
+9 −145
Original line number Diff line number Diff line
@@ -155,18 +155,17 @@

    
    
<p>** WARNING** This tutorial is under construction, and we do not advise running it at this point in time.</p>
<p>This tutorial covers the basics of using hdWGCNA to perform <strong>isoform</strong> co-expression network analysis using <a href="https://www.pacb.com/wp-content/uploads/Application-note-MAS-Seq-for-single-cell-isoform-sequencing.pdf" class="external-link">PacBio MAS-Seq</a> data. Briefly, MAS-Seq is a single-cell RNA-seq protocol that uses long-read sequencing to profile full-length RNA molecules, potentially providing a much greater depth of information about individual RNA splicing isoforms compared to traditional short-read scRNA-seq. This tutorial starts from the typical genes by cells counts matrix, as well as an isoforms by cells counts matrix unique to long-read single-cell approaches like MAS-seq, and we cover how to perform clustering analysis in long-read single-cell data using Seurat. Finally, we demonstrate several different ways to use hdWGCNA for isoform co-expression network analysis. Here we use a MAS-Seq dataset of PBMCs from two donors, sequenced using the PacBio Revio platform. Before working through this tutorial, we recommend becoming familar with the basics of hdWGCNA using our single-cell tutorial.</p>
<div class="section level2">
<h2 id="download-the-tutorial-data">Download the tutorial data<a class="anchor" aria-label="anchor" href="#download-the-tutorial-data"></a>
</h2>
<p>Here we download the genes and isoforms counts matrices from PacBio for the two replicates using wget. These counts matrices were generated using the<br>
PacBio SMRT Link software, more details can be found at the <a href="https://isoseq.how/umi/" class="external-link">isoseq documentation website</a>.</p>
<p>Here we download the genes and isoforms counts matrices from PacBio for the two replicates using wget. These counts matrices were generated using the PacBio SMRT Link software, more details can be found at the <a href="https://isoseq.how/umi/" class="external-link">isoseq documentation website</a>.</p>
<div class="sourceCode" id="cb1"><pre class="sourceCode bash"><code class="sourceCode bash"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a><span class="co"># download replicate 1</span></span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a><span class="fu">wget</span> --recursive --no-parent https://downloads.pacbcloud.com/public/dataset/MAS-Seq/DATA-Revio-PBMC-1/4-SeuratMatrix/</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a><span class="co"># download replicate 2</span></span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a><span class="fu">wget</span> --recursive --no-parent https://downloads.pacbcloud.com/public/dataset/MAS-Seq/DATA-Revio-PBMC-2/4-SeuratMatrix/</span></code></pre></div>
<p>You can also <a href="https://drive.google.com/file/d/1pIINwgMlJX-3db4sLApt5Gi7GkWQElGf/view?usp=share_link" class="external-link">download the unprocessed Seurat object containing both replicates</a> instead of running the “Create a multi-assay Seurat object” section.</p>
</div>
<div class="section level2">
<h2 id="seurat-clustering-analysis">Seurat clustering analysis<a class="anchor" aria-label="anchor" href="#seurat-clustering-analysis"></a>
@@ -175,7 +174,7 @@ PacBio SMRT Link software, more details can be found at the <a href="https://iso
<div class="section level3">
<h3 id="create-a-multi-assay-seurat-object">Create a multi-assay Seurat object<a class="anchor" aria-label="anchor" href="#create-a-multi-assay-seurat-object"></a>
</h3>
<p>The MAS-Seq dataset includes both gene and isoform level quantifications, and in this section we will load these matrices for both replicates to create a multi-assay Seurat object. The number of isoforms detected differs for each replicate, and here we have chosen to only include isoforms that were detected in both replicates.</p>
<p>The MAS-Seq dataset includes both gene and isoform level quantifications, and in this section we will load these matrices for both replicates to create a multi-assay Seurat object. The number of isoforms detected differs for each replicate, and here we have chosen to only include isoforms that were detected in both replicates. If you don’t want to run this section, you can <a href="https://drive.google.com/file/d/1pIINwgMlJX-3db4sLApt5Gi7GkWQElGf/view?usp=share_link" class="external-link">download the unprocessed Seurat object here</a>.</p>
<div class="sourceCode" id="cb2"><pre class="downlit sourceCode r">
<code class="sourceCode R"><span class="kw"><a href="https://rdrr.io/r/base/library.html" class="external-link">library</a></span><span class="op">(</span><span class="va"><a href="https://satijalab.org/seurat" class="external-link">Seurat</a></span><span class="op">)</span>
<span class="kw"><a href="https://rdrr.io/r/base/library.html" class="external-link">library</a></span><span class="op">(</span><span class="va"><a href="https://tidyverse.tidyverse.org" class="external-link">tidyverse</a></span><span class="op">)</span>
@@ -886,7 +885,6 @@ See cell type marker genes
<span class="fu"><a href="https://rdrr.io/r/base/length.html" class="external-link">length</a></span><span class="op">(</span><span class="va">selected_isoforms</span><span class="op">)</span></code></pre></div>
<pre><code>[1] 13217</code></pre>
<p>This approach gives us 13217 isoforms for network analysis. Now we run the main steps of co-expression network analysis with hdWGCNA similarly to what we did in the previous section.</p>
<p><img src="figures/isoseq/major_iso_heatmap.png" width="900" height="600"></p>
<div class="sourceCode" id="cb26"><pre class="downlit sourceCode r">
<code class="sourceCode R"><span class="co"># set up expression matrix</span>
<span class="va">groups</span> <span class="op">&lt;-</span> <span class="fu"><a href="https://rdrr.io/r/base/unique.html" class="external-link">unique</a></span><span class="op">(</span><span class="fu"><a href="../reference/GetMetacellObject.html">GetMetacellObject</a></span><span class="op">(</span><span class="va">seurat_obj</span><span class="op">)</span><span class="op">$</span><span class="va">annotation</span><span class="op">)</span>
@@ -1011,149 +1009,15 @@ See cell type marker genes
<span class="va">seurat_obj</span> <span class="op">&lt;-</span> <span class="fu"><a href="https://rdrr.io/r/base/readRDS.html" class="external-link">readRDS</a></span><span class="op">(</span>file <span class="op">=</span> <span class="st">'tutorial_data/PBMC_masseq_hdWGCNA_major.rds'</span><span class="op">)</span></code></pre></div>
</div>
<div class="section level3">
<h3 id="example-3-major-isoforms-in-one-cell-type">Example 3: Major isoforms in one cell type<a class="anchor" aria-label="anchor" href="#example-3-major-isoforms-in-one-cell-type"></a>
<h3 id="example-3-consensus-wgcna-between-the-two-samples">Example 3: consensus wgcna between the two samples<a class="anchor" aria-label="anchor" href="#example-3-consensus-wgcna-between-the-two-samples"></a>
</h3>
<div class="sourceCode" id="cb29"><pre class="downlit sourceCode r">
<code class="sourceCode R"><span class="fu"><a href="https://rdrr.io/r/base/lapply.html" class="external-link">sapply</a></span><span class="op">(</span><span class="va">major_marker_list</span>, <span class="kw">function</span><span class="op">(</span><span class="va">x</span><span class="op">)</span><span class="op">{</span><span class="fu"><a href="https://rdrr.io/r/base/length.html" class="external-link">length</a></span><span class="op">(</span><span class="va">x</span><span class="op">)</span><span class="op">}</span><span class="op">)</span>

<span class="co"># setup for WGCNA</span>
<span class="va">seurat_obj</span> <span class="op">&lt;-</span> <span class="fu"><a href="../reference/SetupForWGCNA.html">SetupForWGCNA</a></span><span class="op">(</span>
  <span class="va">seurat_obj</span>,
  gene_select <span class="op">=</span> <span class="st">"custom"</span>,
  gene_list <span class="op">=</span> <span class="va">major_marker_list</span><span class="op">[[</span><span class="st">'CD14+ Monocyte'</span><span class="op">]</span><span class="op">]</span>,
  metacell_location <span class="op">=</span> <span class="st">'variable'</span>, <span class="co"># this will use the same Metacell seurat object that we made before</span>
  wgcna_name <span class="op">=</span> <span class="st">"major_mono"</span>
<span class="op">)</span>

<span class="co"># how many isoforms?</span>
<span class="fu"><a href="https://rdrr.io/r/base/length.html" class="external-link">length</a></span><span class="op">(</span><span class="va">selected_isoforms</span><span class="op">)</span></code></pre></div>
<p><img src="figures/isoseq/major_iso_heatmap.png" width="900" height="600"></p>
<div class="sourceCode" id="cb30"><pre class="downlit sourceCode r">
<code class="sourceCode R"><span class="co"># set up expression matrix</span>
<span class="va">seurat_obj</span> <span class="op">&lt;-</span> <span class="fu"><a href="../reference/SetDatExpr.html">SetDatExpr</a></span><span class="op">(</span>
  <span class="va">seurat_obj</span>,
  group.by<span class="op">=</span><span class="st">'annotation'</span>,
  group_name <span class="op">=</span> <span class="st">"CD14+ Monocyte"</span>,
  use_metacells<span class="op">=</span><span class="cn">TRUE</span>,
  slot <span class="op">=</span> <span class="st">'data'</span>,
  assay <span class="op">=</span> <span class="st">'iso'</span>
<span class="op">)</span>

<span class="co"># test soft powers</span>
<span class="va">seurat_obj</span> <span class="op">&lt;-</span> <span class="fu"><a href="../reference/TestSoftPowers.html">TestSoftPowers</a></span><span class="op">(</span><span class="va">seurat_obj</span><span class="op">)</span>
<span class="va">plot_list</span> <span class="op">&lt;-</span> <span class="fu"><a href="../reference/PlotSoftPowers.html">PlotSoftPowers</a></span><span class="op">(</span><span class="va">seurat_obj</span><span class="op">)</span>
<span class="fu"><a href="https://patchwork.data-imaginist.com/reference/wrap_plots.html" class="external-link">wrap_plots</a></span><span class="op">(</span><span class="va">plot_list</span>, ncol<span class="op">=</span><span class="fl">2</span><span class="op">)</span>

<span class="co"># construct wgcna network:</span>
<span class="va">seurat_obj</span> <span class="op">&lt;-</span> <span class="fu"><a href="../reference/ConstructNetwork.html">ConstructNetwork</a></span><span class="op">(</span>
  <span class="va">seurat_obj</span>,
  soft_power <span class="op">=</span> <span class="fl">8</span>,
  detectCutHeight<span class="op">=</span><span class="fl">0.995</span>,
  mergeCutHeight<span class="op">=</span><span class="fl">0.05</span>,
  TOM_name <span class="op">=</span> <span class="st">'major_NK'</span>,
  minModuleSize<span class="op">=</span><span class="fl">30</span>,
  overwrite_tom <span class="op">=</span> <span class="cn">TRUE</span>
<span class="op">)</span>

<span class="co"># plot the dendrogram</span>
<span class="fu"><a href="../reference/PlotDendrogram.html">PlotDendrogram</a></span><span class="op">(</span><span class="va">seurat_obj</span>, main<span class="op">=</span><span class="st">'hdWGCNA Dendrogram'</span><span class="op">)</span>

<span class="co"># have to have this ScaleData line or else Harmony gets angry</span>
<span class="va">seurat_obj</span> <span class="op">&lt;-</span> <span class="fu"><a href="https://satijalab.org/seurat/reference/ScaleData.html" class="external-link">ScaleData</a></span><span class="op">(</span><span class="va">seurat_obj</span>, features<span class="op">=</span><span class="fu"><a href="https://tibble.tidyverse.org/reference/rownames.html" class="external-link">rownames</a></span><span class="op">(</span><span class="va">seurat_obj</span><span class="op">)</span><span class="op">[</span><span class="fl">1</span><span class="op">:</span><span class="fl">100</span><span class="op">]</span><span class="op">)</span>
<span class="va">seurat_obj</span> <span class="op">&lt;-</span> <span class="fu"><a href="../reference/ModuleEigengenes.html">ModuleEigengenes</a></span><span class="op">(</span><span class="va">seurat_obj</span>, group.by.vars <span class="op">=</span> <span class="st">'Sample'</span><span class="op">)</span>

<span class="co"># compute module connectivity:</span>
<span class="va">seurat_obj</span> <span class="op">&lt;-</span> <span class="fu"><a href="../reference/ModuleConnectivity.html">ModuleConnectivity</a></span><span class="op">(</span><span class="va">seurat_obj</span><span class="op">)</span></code></pre></div>
<p><img src="figures/isoseq/major_NK_dendro.png" width="900" height="600"></p>
<p>Next we visualize the MEiso signatures for each module in each cell population using a DotPlot.</p>
<div class="sourceCode" id="cb31"><pre class="downlit sourceCode r">
<code class="sourceCode R"><span class="va">MEs</span> <span class="op">&lt;-</span> <span class="fu"><a href="../reference/GetMEs.html">GetMEs</a></span><span class="op">(</span><span class="va">seurat_obj</span><span class="op">)</span>
<span class="va">modules</span> <span class="op">&lt;-</span> <span class="fu"><a href="../reference/GetModules.html">GetModules</a></span><span class="op">(</span><span class="va">seurat_obj</span><span class="op">)</span>
<span class="va">mods</span> <span class="op">&lt;-</span> <span class="fu"><a href="https://rdrr.io/r/base/levels.html" class="external-link">levels</a></span><span class="op">(</span><span class="va">modules</span><span class="op">$</span><span class="va">module</span><span class="op">)</span>
<span class="va">mods</span> <span class="op">&lt;-</span> <span class="va">mods</span><span class="op">[</span><span class="va">mods</span><span class="op">!=</span><span class="st">'grey'</span><span class="op">]</span>

<span class="va">meta</span> <span class="op">&lt;-</span> <span class="va">seurat_obj</span><span class="op">@</span><span class="va">meta.data</span>
<span class="va">seurat_obj</span><span class="op">@</span><span class="va">meta.data</span> <span class="op">&lt;-</span> <span class="fu"><a href="https://rdrr.io/r/base/cbind.html" class="external-link">cbind</a></span><span class="op">(</span><span class="va">meta</span>, <span class="va">MEs</span><span class="op">)</span>


<span class="co"># make dotplot</span>
<span class="va">p</span> <span class="op">&lt;-</span> <span class="fu"><a href="https://satijalab.org/seurat/reference/DotPlot.html" class="external-link">DotPlot</a></span><span class="op">(</span>
  <span class="va">seurat_obj</span>,
  group.by<span class="op">=</span><span class="st">'annotation'</span>,
  features <span class="op">=</span> <span class="fu"><a href="https://rdrr.io/r/base/rev.html" class="external-link">rev</a></span><span class="op">(</span><span class="va">mods</span><span class="op">)</span>
<span class="op">)</span> <span class="op">+</span> <span class="fu"><a href="https://satijalab.org/seurat/reference/SeuratTheme.html" class="external-link">RotatedAxis</a></span><span class="op">(</span><span class="op">)</span> <span class="op">+</span>
  <span class="fu"><a href="https://ggplot2.tidyverse.org/reference/scale_gradient.html" class="external-link">scale_color_gradient2</a></span><span class="op">(</span>high<span class="op">=</span><span class="st">'red'</span>, mid<span class="op">=</span><span class="st">'grey95'</span>, low<span class="op">=</span><span class="st">'blue'</span><span class="op">)</span> <span class="op">+</span> <span class="fu"><a href="https://ggplot2.tidyverse.org/reference/labs.html" class="external-link">xlab</a></span><span class="op">(</span><span class="st">''</span><span class="op">)</span> <span class="op">+</span> <span class="fu"><a href="https://ggplot2.tidyverse.org/reference/labs.html" class="external-link">ylab</a></span><span class="op">(</span><span class="st">''</span><span class="op">)</span> <span class="op">+</span>
  <span class="fu"><a href="https://ggplot2.tidyverse.org/reference/theme.html" class="external-link">theme</a></span><span class="op">(</span>
    plot.title <span class="op">=</span> <span class="fu"><a href="https://ggplot2.tidyverse.org/reference/element.html" class="external-link">element_text</a></span><span class="op">(</span>hjust <span class="op">=</span> <span class="fl">0.5</span><span class="op">)</span>,
    axis.line.x <span class="op">=</span> <span class="fu"><a href="https://ggplot2.tidyverse.org/reference/element.html" class="external-link">element_blank</a></span><span class="op">(</span><span class="op">)</span>,
    axis.line.y <span class="op">=</span> <span class="fu"><a href="https://ggplot2.tidyverse.org/reference/element.html" class="external-link">element_blank</a></span><span class="op">(</span><span class="op">)</span>,
    panel.border <span class="op">=</span> <span class="fu"><a href="https://ggplot2.tidyverse.org/reference/element.html" class="external-link">element_rect</a></span><span class="op">(</span>colour <span class="op">=</span> <span class="st">"black"</span>, fill<span class="op">=</span><span class="cn">NA</span>, size<span class="op">=</span><span class="fl">1</span><span class="op">)</span>
  <span class="op">)</span>

<span class="va">p</span></code></pre></div>
<p><img src="figures/isoseq/major_iso_dotplot_MEs.png" width="900" height="600"></p>
<div class="sourceCode" id="cb32"><pre class="downlit sourceCode r">
<code class="sourceCode R"><span class="va">seurat_obj</span><span class="op">@</span><span class="va">meta.data</span> <span class="op">&lt;-</span> <span class="va">meta</span>


<span class="va">seurat_obj</span> <span class="op">&lt;-</span> <span class="fu"><a href="../reference/RunModuleUMAP.html">RunModuleUMAP</a></span><span class="op">(</span>
  <span class="va">seurat_obj</span>,
  n_hubs <span class="op">=</span><span class="fl">5</span>,
  n_neighbors<span class="op">=</span><span class="fl">15</span>,
  min_dist<span class="op">=</span><span class="fl">0.2</span>,
  spread<span class="op">=</span><span class="fl">1</span>
  <span class="co">#supervised=TRUE,</span>
  <span class="co">#target_weight=0.5</span>
<span class="op">)</span>


<span class="co"># get the hub gene UMAP table from the seurat object</span>
<span class="va">umap_df</span> <span class="op">&lt;-</span> <span class="fu"><a href="../reference/GetModuleUMAP.html">GetModuleUMAP</a></span><span class="op">(</span>
  <span class="va">seurat_obj</span>
<span class="op">)</span>

<span class="co"># plot with ggplot</span>
<span class="va">p</span> <span class="op">&lt;-</span> <span class="fu"><a href="https://ggplot2.tidyverse.org/reference/ggplot.html" class="external-link">ggplot</a></span><span class="op">(</span><span class="va">umap_df</span>, <span class="fu"><a href="https://ggplot2.tidyverse.org/reference/aes.html" class="external-link">aes</a></span><span class="op">(</span>x<span class="op">=</span><span class="va">UMAP1</span>, y<span class="op">=</span><span class="va">UMAP2</span><span class="op">)</span><span class="op">)</span> <span class="op">+</span>
  <span class="fu"><a href="https://ggplot2.tidyverse.org/reference/geom_point.html" class="external-link">geom_point</a></span><span class="op">(</span>
   color<span class="op">=</span><span class="va">umap_df</span><span class="op">$</span><span class="va">color</span>,
   size<span class="op">=</span><span class="va">umap_df</span><span class="op">$</span><span class="va">kME</span><span class="op">*</span><span class="fl">2</span>
  <span class="op">)</span> <span class="op">+</span>
  <span class="fu"><a href="../reference/umap_theme.html">umap_theme</a></span><span class="op">(</span><span class="op">)</span>

<span class="fu"><a href="https://rdrr.io/r/grDevices/pdf.html" class="external-link">pdf</a></span><span class="op">(</span><span class="fu"><a href="https://rdrr.io/r/base/paste.html" class="external-link">paste0</a></span><span class="op">(</span><span class="va">fig_dir</span>, <span class="st">'major_iso_hubgene_umap_ggplot_uns.pdf'</span><span class="op">)</span>, width<span class="op">=</span><span class="fl">5</span>, height<span class="op">=</span><span class="fl">5</span><span class="op">)</span>
<span class="va">p</span>
<span class="fu"><a href="https://rdrr.io/r/grDevices/dev.html" class="external-link">dev.off</a></span><span class="op">(</span><span class="op">)</span>

<span class="fu"><a href="https://rdrr.io/r/grDevices/png.html" class="external-link">png</a></span><span class="op">(</span><span class="fu"><a href="https://rdrr.io/r/base/paste.html" class="external-link">paste0</a></span><span class="op">(</span><span class="va">fig_dir</span>, <span class="st">'major_coex_umap.png'</span><span class="op">)</span>, width<span class="op">=</span><span class="fl">7</span>, height<span class="op">=</span><span class="fl">7</span>, units<span class="op">=</span><span class="st">'in'</span>, res<span class="op">=</span><span class="fl">500</span><span class="op">)</span>
<span class="fu"><a href="../reference/ModuleUMAPPlot.html">ModuleUMAPPlot</a></span><span class="op">(</span>
  <span class="va">seurat_obj</span>,
  edge.alpha<span class="op">=</span><span class="fl">0.5</span>,
  sample_edges<span class="op">=</span><span class="cn">TRUE</span>,
  keep_grey_edges<span class="op">=</span><span class="cn">FALSE</span>,
  edge_prop<span class="op">=</span><span class="fl">0.075</span>, <span class="co"># taking the top 20% strongest edges in each module</span>
<span class="co">#   label_genes = umap_df$isoform_name[umap_df$gene_name %in% plot_genes],</span>
  <span class="co">#label_genes = c(''),</span>
  label_hubs<span class="op">=</span><span class="fl">2</span> <span class="co"># how many hub genes to plot per module?</span>
<span class="op">)</span>
<span class="fu"><a href="https://rdrr.io/r/grDevices/dev.html" class="external-link">dev.off</a></span><span class="op">(</span><span class="op">)</span>


<span class="co"># individual module networks</span>
<span class="fu"><a href="../reference/ModuleNetworkPlot.html">ModuleNetworkPlot</a></span><span class="op">(</span>
  <span class="va">seurat_obj</span>,
  mods <span class="op">=</span> <span class="st">"all"</span>,
  <span class="co">#label_center=TRUE,</span>
  outdir <span class="op">=</span> <span class="fu"><a href="https://rdrr.io/r/base/paste.html" class="external-link">paste0</a></span><span class="op">(</span><span class="va">fig_dir</span>, <span class="st">'hubNetworks/'</span><span class="op">)</span>
<span class="op">)</span>


<span class="fu"><a href="https://rdrr.io/r/base/readRDS.html" class="external-link">saveRDS</a></span><span class="op">(</span><span class="va">seurat_obj</span>, file <span class="op">=</span> <span class="fu"><a href="https://rdrr.io/r/base/paste.html" class="external-link">paste0</a></span><span class="op">(</span><span class="va">data_dir</span>, <span class="st">'PBMC_masseq_hdWGCNA_wip.rds'</span><span class="op">)</span><span class="op">)</span>
<span class="va">seurat_obj</span> <span class="op">&lt;-</span> <span class="fu"><a href="https://rdrr.io/r/base/readRDS.html" class="external-link">readRDS</a></span><span class="op">(</span>file <span class="op">=</span> <span class="st">'tutorial_data/PBMC_masseq_hdWGCNA_major.rds'</span><span class="op">)</span></code></pre></div>
<p>Coming soon!</p>
</div>
<div class="section level3">
<h3 id="bonus-consensus-wgcna-between-the-two-samples">Bonus: consensus wgcna between the two samples:<a class="anchor" aria-label="anchor" href="#bonus-consensus-wgcna-between-the-two-samples"></a>
</h3>
</div>
<div class="section level2">
<h2 id="isoform-plotting-with-ggtranscript-and-swan">Isoform plotting with ggtranscript and Swan<a class="anchor" aria-label="anchor" href="#isoform-plotting-with-ggtranscript-and-swan"></a>
</h2>
<p>Coming soon!</p>
</div>
  </div>

+25 −16

File changed.

Preview size limit exceeded, changes collapsed.