Commit dc907a7a authored by tekath

New Readme and updated vignette.

parent 3138c4de
+1 −0
@@ -10,5 +10,6 @@ LazyData: true
Depends: R (>= 3.4.0)
Imports: Matrix, tximport, sparseDRIMSeq, stageR, rtracklayer, formattable, DT, Gviz, assertthat, scales, ggplot2, reshape2, BiocParallel, Matrix.utils, pheatmap, methods, GenomicRanges, htmltools, htmlwidgets, knitr, stringi 
Suggests: GenomeInfoDb, Seurat (>= 3.0.0), webshot, webshot2, magick
Remotes: github::TobiTekath/sparseDRIMSeq
RoxygenNote: 7.1.0
Roxygen: list(markdown = TRUE)

NEWS.Rmd

0 → 100644
+12 −0
---
title: "DTUrtle News"
output: github_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

### 0.2.5

- initial first version.
+4 −3
# DTUrtle News
DTUrtle News
================

## 0.5.0
### 0.2.5

* initial first version.
  - initial first version.
+2 −1
@@ -90,13 +90,14 @@ create_dtu_table <- function(dturtle, add_gene_metadata = list("pct_gene_expr"="
#' Plot a DTU table to HTML and image
#'
#' Creates an enhanced HTML representation of a DTU table. The table can be (color) formatted individually by providing `column_formatters`.
#' Also automatically links columns of plot names, to be viewable in the table. Currently you are not allowed to provide a column formatter for plot columns.
#'
#' The table can optionally also be saved as an image ('.png'), by specifying the wanted number of rows to create_table_image.
#'
#' @param dturtle `dturtle` result object of [create_dtu_table()].
#' @param columns Optionally subset the existing `dtu_table` of the dturtle object to the columns specified here.
#' @param column_formatters Named list of column_formatters, specifying a formatter function for every column that shall be formatted.
#' The formatter functions are either from this package like [table_percentage_bar()], [table_pval_tile()] or from \code{\link[formattable]{formattable}}.
#' The formatter functions are either from this package like [table_percentage_bar()], [table_pval_tile()] or from \code{\link[formattable:00Index]{formattable}}.
#' @param order_by One or multiple columns to order the table by. Must be a vector of column names, descending order can be achieved by prepending a `-` (e.g. `c("-my_col_name")`).
#' @param num_digits Number of digits, numerical columns shall be formatted to. Can be a single number to apply to all numerical columns, or a number for each numerical column (in their order).
#' @param num_digits_format Digit format string, as in \code{\link[base:formatC]{formatC}}. These format strings are used in numerical column formatting if `num_digits` is provided.

Readme.Rmd

0 → 100644
+143 −0
---
output: github_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

# DTUrtle <img src="inst/logo/logo.svg" align="right" alt="" width="250"/>

**Perform differential transcript usage (DTU) analysis of bulk or single-cell RNA-seq data.**

## Installation

Install from GitHub:

```{r eval=F}
if(!requireNamespace("devtools", quietly = TRUE)){
    install.packages("devtools")
}
devtools::install_github("TobiTekath/DTUrtle")
```

## Basic workflow

- **See the preprocessing vignettes for example workflows with data**

```{r echo=FALSE,  out.height = '100%', fig.align='center'}
knitr::include_graphics("/data/dturtle_package/DTUrtle_workflow.svg")
```


## DTUrtle minimal workflow

A minimal DTUrtle workflow might look like this:


### Setup environment

```{r eval=F}
library(DTUrtle)

#the BiocParallel framework is used to parallelize the computations.
#choose ONE of the following backends:
    #using 4 cores:
    biocpar <- BiocParallel::MulticoreParam(4)
    #or standard serial computation (only 1 core):
    #biocpar <- BiocParallel::SerialParam()
#multiple other options are available for computational clusters.
```

### Import and format data

```{r eval=F}
#import gtf Annotation to get transcript to gene mapping
tx2gene <- import_gtf(gtf_file = "path_to_your_gtf_file.gtf")

##optional:
    #move transcript and gene identifier columns to front
    tx2gene <- move_columns_to_front(df = tx2gene, 
                                     columns = c("transcript_name", "gene_name"))
    #ensure that a one to one mapping between names and IDs exists
    tx2gene$gene_name <- one_to_one_mapping(name = tx2gene$gene_name, 
                                            id = tx2gene$gene_id)
    tx2gene$transcript_name <- one_to_one_mapping(name = tx2gene$transcript_name, 
                                                  id = tx2gene$transcript_id)

#import transcript-level quantification data, for example from Salmon
files <- Sys.glob("path_to_your_data/*/quant.sf")
names(files) <- gsub(".*/","",gsub("/quant.sf","",files))
cts <- import_counts(files = files, type = "Salmon")

##for single-cell data only:
    #import counts returned a list of matrices -> combine them to one matrix
    cts <- combine_to_matrix(tx_list = cts)

#create a sample data sheet, specifying which sample / cell belongs to which group
pd <- data.frame("id"=colnames(cts), "group"="your_grouping_variable", 
                 stringsAsFactors = F)
```
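
The sample naming and the sample data sheet from the chunk above can be tried out without any quantification files. The following is a minimal sketch in base R only: the four file paths and the `<group>_<replicate>` directory naming scheme are assumptions made for illustration, and the ids are taken from `names(files)` instead of `colnames(cts)`, since no counts are imported here.

```r
# Hypothetical quantification paths (illustration only, nothing is read from disk).
files <- c("path_to_your_data/ctrl_1/quant.sf",
           "path_to_your_data/ctrl_2/quant.sf",
           "path_to_your_data/treat_1/quant.sf",
           "path_to_your_data/treat_2/quant.sf")

# Same two-step gsub as above: drop the trailing file name, then everything
# up to the last remaining slash, leaving only the sample directory name.
names(files) <- gsub(".*/", "", gsub("/quant.sf", "", files))

# Assumed naming scheme "<group>_<replicate>": strip the replicate suffix
# to obtain one group label per sample.
pd <- data.frame("id" = names(files),
                 "group" = sub("_[0-9]+$", "", names(files)),
                 stringsAsFactors = FALSE)

print(pd)
```

With real data the `group` column would more likely come from an existing metadata table than from the directory names; the only requirement visible in the workflow is one row per sample or cell with a matching id.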

### DTU analysis

- The `dturtle` object is an easy-to-access list, containing all necessary analysis information and results.

```{r eval=F}
#use DRIMSeq for fitting a Dirichlet-multinomial model
dturtle <- run_drimseq(counts = cts, tx2gene = tx2gene, pd=pd, id_col = "id",
                    cond_col = "group", filtering_strategy = "bulk", 
                    BPPARAM = biocpar)

#run posthoc filtering and two-staged statistical correction with stageR
dturtle <- posthoc_and_stager(dturtle = dturtle, ofdr = 0.05)

```


### Result aggregation and visualization

```{r, eval=F}
#highly flexible function to create a results data frame
dturtle <- create_dtu_table(dturtle = dturtle)

    ## View results data frame
    View(dturtle$dtu_table)

#plot and save the results
setwd("my_results_folder")    

barprop_plot_list <- plot_proportion_barplot(dturtle = dturtle, 
                                             savepath = "images", 
                                             BPPARAM = biocpar)
pheatprop_plot_list <- plot_proportion_pheatmap(dturtle = dturtle, 
                                                savepath = "images", 
                                                include_expression = T, 
                                                BPPARAM = biocpar)
transcriptview_plot_list <- plot_transcripts_view(dturtle = dturtle, 
                                                  gtf = "path_to_your_gtf_file.gtf", 
                                                  genome = 'hg38', 
                                                  one_to_one = T,
                                                  savepath = "images", 
                                                  BPPARAM = biocpar)

#add relative filepaths of plots to results data frame
dturtle$dtu_table$barplot <- barprop_plot_list[match(rownames(dturtle$dtu_table),
                                                     names(barprop_plot_list))]
dturtle$dtu_table$pheatmap <- pheatprop_plot_list[match(rownames(dturtle$dtu_table),
                                                        names(pheatprop_plot_list))]
dturtle$dtu_table$transcript_view <- transcriptview_plot_list[match(rownames(dturtle$dtu_table),
                                                                    names(transcriptview_plot_list))]


#create interactive HTML-table from results data frame
    #specify colorful column formatters
    column_formatter_list <- list(
      "gene_qval" = table_pval_tile("white", "orange", digits = 3),
      "min_tx_qval" = table_pval_tile("white", "orange", digits = 3),
      "n_tx" = formattable::color_tile('white', "lightblue"),
      "n_sig_tx" = formattable::color_tile('white', "lightblue"),
      "max(Condition1-Condition2)" = table_percentage_bar('lightgreen', "#FF9999", digits=2))

plot_dtu_table(dturtle = dturtle, savepath = "my_results.html", 
               column_formatters = column_formatter_list)
```
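
The `match()` idiom used above to attach plot file paths to the results data frame may be easier to follow on toy data. This sketch uses base R only; the gene names, q-values and file names are made up, and it assumes (as the chunk above suggests) that the plot functions return a vector of file paths named by gene.

```r
# Toy results table with gene names as row names (made-up values).
dtu_table <- data.frame(gene_qval = c(0.001, 0.020, 0.040),
                        row.names = c("geneB", "geneA", "geneC"))

# Toy named vector of plot paths, in a different order than the table
# and without an entry for geneC (hypothetical file names).
plot_list <- c(geneA = "images/geneA_barplot.png",
               geneB = "images/geneB_barplot.png")

# match() returns, for every row name, its position in names(plot_list),
# or NA if the gene has no plot.
idx <- match(rownames(dtu_table), names(plot_list))

# Indexing by these positions reorders the paths to the row order of the table.
dtu_table$barplot <- unname(plot_list[idx])

print(dtu_table)
```

Genes without a plot end up with `NA` in the new column, so the table and the plot list do not need to contain the same genes or share the same order.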