Define where the pipeline should find input data and save output data.

A string identifier used to name result files in the output directory

required
type: string
default: study

A string identifying the technology used to produce the data

required
type: string

Path to CSV/TSV file containing information about the samples in the experiment.

required
type: string
pattern: ^\S+\.(csv|tsv)$

A CSV/TSV/YML/YAML file describing sample contrasts to compare groups.

type: string
pattern: ^\S+\.(csv|tsv|yml|yaml)$

The output directory where the results will be saved. You have to use absolute paths to storage on Cloud infrastructure.

required
type: string

Type of abundance measure used, platform-dependent.

required
type: string

To how many digits should numeric output in different modules be rounded? If -1 or null, will not round.

type: integer
default: 4
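
The rounding rule above can be sketched as follows. This is a minimal Python illustration, not the pipeline's own code: the function name `round_output` is hypothetical, and `None` stands in for a null setting.

```python
from typing import Optional


def round_output(value: float, digits: Optional[int] = 4) -> float:
    """Round `value` to `digits` decimal places.

    A setting of -1 or None (null) disables rounding, as described above.
    The function name and signature are illustrative assumptions.
    """
    if digits is None or digits == -1:
        return value
    return round(value, digits)


print(round_output(3.14159265))      # rounded to the default of 4 digits
print(round_output(3.14159265, -1))  # returned unchanged
```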

Ways of providing your abundance values

TSV/CSV-format abundance matrix

type: string
pattern: ^\S+\.(tsv|csv)$|\S*proteinGroups\.txt$
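
To see which file names the pattern above accepts, it can be tried directly. The sketch below copies the pattern verbatim and assumes unanchored "search" matching, as in JSON Schema's `pattern` keyword; the helper name `is_valid_matrix_path` is hypothetical.

```python
import re

# Pattern copied verbatim from the entry above.
ABUNDANCE_PATTERN = re.compile(r"^\S+\.(tsv|csv)$|\S*proteinGroups\.txt$")


def is_valid_matrix_path(path: str) -> bool:
    """Return True if `path` satisfies the abundance matrix pattern.

    re.search is used on the assumption that the validator does not
    implicitly anchor the pattern.
    """
    return ABUNDANCE_PATTERN.search(path) is not None


for candidate in ["counts.tsv", "results/proteinGroups.txt", "counts.txt", "my counts.tsv"]:
    print(candidate, is_valid_matrix_path(candidate))
```

The second alternative is what lets a MaxQuant `proteinGroups.txt` table through even though `.txt` is not otherwise an accepted extension.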

(RNA-seq only): optional transcript/gene length matrix with samples and transcript_ids/gene_ids as in the abundance matrix.

type: string

Alternative to matrix: a compressed archive of CEL files, such as those often found in GEO

type: string

Use SOFT files from GEO by providing the GSE study identifier

type: string

Column in the sample sheet to be used as the primary sample identifier

required
type: string
default: sample

Type of observation

required
type: string
default: sample

Column in the sample sheet to be used as identifier for observations. If unset, the --observations_id_col is used.

type: string

Options related to features

Feature ID attribute in the abundance table as well as in the GTF file (e.g. the gene_id field)

required
type: string
default: gene_id

Feature name attribute in the abundance table as well as in the GTF file (e.g. the gene symbol field)

required
type: string
default: gene_name

Type of feature. Often ‘gene’

required
type: string
default: gene

When set, use the control features in scaling/normalization (currently only supported for differential_method deseq2)

type: boolean

A text file listing technical features (e.g. spikes)

type: string

Comma-separated string, specifies feature metadata columns to be used for exploratory analysis, platform-specific

type: string
default: gene_id,gene_name,gene_biotype

Supply your own feature annotations. Can be derived from the GTF (rnaseq) or from the Bioconductor annotation package (affy arrays).

type: string
pattern: ^\S+\.(csv|tsv)$

Analysis options related to the use of paramsheet to run multiple combinations of analyses (see usage docs for details).

Name of the paramset to run. In profile mode, set by the analysis profile for output directory naming. In paramsheet mode, selects which paramset(s) to run (comma-separated).

type: string

Path to a paramsheet YAML file. Setting this activates multi-run (paramsheet) mode where paramsheet values take priority over CLI flags.

type: string
pattern: ^\S+\.(yaml|yml)$

Options for processing of affy arrays with justRMA()

Column of the sample sheet containing the Affymetrix CEL file name

type: string
default: file

logical value. If set to true, apply background correction using RMA.

type: boolean
default: true

integer value indicating which RMA background to use

type: integer
default: 2

logical value. If TRUE, then works on the PM matrix in place as much as possible, good for large datasets.

type: boolean

Used to specify the name of an alternative cdf package. If set to NULL, then the usual cdf package based on Affymetrix’ mappings will be used.

type: string

logical value. If TRUE, a matrix of probe annotations will be derived.

type: boolean
default: true

Should the spots marked as ‘MASKS’ be set to NA?

type: boolean

Should the spots marked as ‘OUTLIERS’ be set to NA?

type: boolean

If TRUE, overrides the values of rm.mask and rm.outliers.

type: boolean

Genome annotation file in GTF format

type: string
pattern: ^\S+\.gtf(\.gz)?

If a GTF file is supplied, which feature type to use

type: string
default: transcript

If a GTF file is supplied, which field should go first in the converted output table

type: string
default: gene_id

Options for processing of proteomics MaxQuant tables with the Proteus R package

Prefix of the column names of the MaxQuant proteingroups table in which the intensity values are saved; the prefix has to be followed by the sample names that are also found in the samplesheet. Default: ‘LFQ intensity’; will search for both the prefix as entered and the prefix followed by one whitespace.

type: string
default: LFQ intensity

Normalization function to use on the MaxQuant intensities.

type: string

Which method to use for plotting sample distributions of the MaxQuant intensities; one of ‘violin’, ‘dist’, ‘box’.

type: string

Should a loess line be added to the plot of mean-variance relationship of the conditions? Default: true.

type: boolean
default: true

Valid R palette name

type: string
default: Set1

Options related to filtering upstream of differential analysis

Minimum abundance value. Set to false to disable abundance filtering.

required
type: integer,boolean

Minimum observations that must pass the threshold to retain the row/feature (e.g. gene).

type: number
default: 1

A minimum proportion of observations, given as a number between 0 and 1, that must pass the threshold. Overrides the minimum number of observations.

type: number
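
The interaction between the abundance threshold, the minimum number of observations, and the minimum proportion can be sketched as below. This is an illustration under stated assumptions rather than the pipeline's implementation: the function and argument names are hypothetical, "pass the threshold" is taken to mean greater than or equal to, and a proportion is converted to a count by rounding up.

```python
import math
from typing import Optional, Sequence, Union


def passes_abundance_filter(
    values: Sequence[float],
    min_abundance: Union[float, bool],
    min_samples: float = 1,
    min_proportion: Optional[float] = None,
) -> bool:
    """Decide whether one feature (row) is retained."""
    # Setting the minimum abundance to false disables the filter entirely.
    if min_abundance is False:
        return True

    # Count the observations at or above the threshold (assumed inclusive).
    n_passing = sum(1 for v in values if v >= min_abundance)

    # A proportion, when given, overrides the fixed minimum count.
    if min_proportion is not None:
        required = math.ceil(min_proportion * len(values))  # assumed rounding
    else:
        required = min_samples
    return n_passing >= required


print(passes_abundance_filter([0, 5, 12], min_abundance=10))
print(passes_abundance_filter([0, 5, 12, 15], min_abundance=10, min_proportion=0.75))
```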

An optional grouping variable to be used to calculate a min_samples value

type: string

A minimum proportion of observations, given as a number between 0 and 1, that must have a value (not NA) to retain the row/feature (e.g. gene).

type: number
default: 0.5

Minimum observations that must have a value (not NA) to retain the row/feature (e.g. gene). Overrides filtering_min_proportion_not_na.

type: number
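
For the missing-value filter the precedence runs the other way: the absolute count, when set, overrides the proportion. A minimal sketch, assuming NA is represented as `None` and comparisons are inclusive (the function and argument names are hypothetical):

```python
from typing import Optional, Sequence


def passes_na_filter(
    values: Sequence[Optional[float]],
    min_proportion_not_na: float = 0.5,
    min_samples_not_na: Optional[float] = None,
) -> bool:
    """Decide whether one feature has enough non-missing observations."""
    n_present = sum(1 for v in values if v is not None)

    # An explicit count takes priority over the proportion.
    if min_samples_not_na is not None:
        return n_present >= min_samples_not_na
    return n_present >= min_proportion_not_na * len(values)


print(passes_na_filter([1.0, None, None, 2.0]))
print(passes_na_filter([1.0, None, None, None]))
print(passes_na_filter([1.0, None, None, None], min_samples_not_na=1))
```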

Set to run IMMUNEDECONV

type: boolean

Set method to run with IMMUNEDECONV. Available options are listed at https://omnideconv.org/immunedeconv/articles/immunedeconv.html

type: string
default: quantiseq

Set function to run with IMMUNEDECONV. Available options are listed at https://omnideconv.org/immunedeconv/articles/immunedeconv.html

type: string
default: deconvolute

Options related to data exploration

Clustering method used in dendrogram creation

required
type: string
default: ward.D2

Correlation method used in dendrogram creation

required
type: string
default: spearman

Number of features selected before certain exploratory analyses. If -1, will use all features.

required
type: integer
default: 500

Length of the whiskers in boxplots as multiple of IQR. Defaults to 1.5.

type: number
default: 1.5

Threshold on MAD score for outlier identification

type: integer
default: -5

How should the main grouping variable be selected? ‘auto_pca’, ‘contrasts’, or a valid column name from the observations table.

required
type: string
default: auto_pca

Specifies assay names to be used for matrices, platform-specific.

hidden
type: string
default: raw,normalised,variance_stabilised

Specifies final assay to be used for exploratory analysis, platform-specific

hidden
type: string
default: variance_stabilised

Assays for which to compute log2 values during exploratory analysis. Not necessary for maxquant data as this is controlled by the pipeline.

type: string
default: raw,normalised

Valid R palette name

required
type: string
default: Set1

Options related to differential operations

Differential analysis method

type: string

Advanced option: the suffix associated with tabular differential results tables. By default, the appropriate suffix is chosen according to the study_type.

type: string

The feature identifier column in differential results tables

required
type: string
default: gene_id

The fold change column in differential results tables

required
type: string
default: log2FoldChange

The p value column in differential results tables

type: string
default: pvalue

The q value column in differential results tables (adjusted p values/q values).

required
type: string
default: padj

Minimum fold change used to calculate differential feature numbers. Note that this number will be log2 transformed

required
type: number
default: 2

Maximum p value used to calculate differential feature numbers

required
type: number
default: 1

Maximum q value used to calculate differential feature numbers

required
type: number
default: 0.05
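
Taken together, the three thresholds above decide whether a feature is counted as differential. The sketch below shows the log2 transformation of the fold change threshold with the listed defaults. It assumes fold changes are already on the log2 scale, that the comparison uses the absolute fold change, and that all bounds are inclusive; the function name is hypothetical.

```python
import math


def is_differential(
    log2_fold_change: float,
    p_value: float,
    q_value: float,
    min_fold_change: float = 2.0,
    max_p_value: float = 1.0,
    max_q_value: float = 0.05,
) -> bool:
    """Apply fold change, p value and q value thresholds to one feature."""
    # The fold change threshold is log2 transformed: 2 becomes 1.
    log2_threshold = math.log2(min_fold_change)
    return (
        abs(log2_fold_change) >= log2_threshold
        and p_value <= max_p_value
        and q_value <= max_q_value
    )


print(is_differential(1.5, p_value=0.001, q_value=0.01))
print(is_differential(0.5, p_value=0.001, q_value=0.01))
```

With the default maximum p value of 1, the p value never excludes a feature, so the q value and fold change do the filtering.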

Where a features file (GTF) has been provided, what attribute to use to name features

type: string
default: gene_name

Indicate whether or not fold changes are on the log scale (default is to assume they are)

type: boolean
default: true

Valid R palette name

required
type: string
default: Set1

In differential analysis (DESeq2 or limma), subset to the contrast samples before modelling variance?

type: boolean

test parameter passed to DESeq()

type: string

fitType parameter passed to DESeq()

type: string

sfType parameter passed to DESeq()

type: string

‘minReplicatesForReplace’ parameter passed to DESeq()

type: integer
default: 7

useT parameter passed to DESeq()

type: boolean

independentFiltering parameter passed to results()

type: boolean
default: true

lfcThreshold parameter passed to results()

type: number

altHypothesis parameter passed to results()

type: string
default: greaterAbs

pAdjustMethod parameter passed to results()

type: string
default: BH
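
BH refers to the Benjamini-Hochberg adjustment. As a reminder of what that method computes, here is a small pure-Python sketch of the standard procedure; it is an independent illustration, not code taken from DESeq2 or the pipeline, and the function name is hypothetical.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    n = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity
    # with a running minimum and capping the result at 1.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```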

alpha parameter passed to results()

type: number
default: 0.1

minmu parameter passed to results()

type: number
default: 0.5

variance stabilisation method to use when making a variance stabilised matrix

type: string

Shrink fold changes in results?

type: boolean
default: true

type: integer

blind parameter for rlog() and/or vst()

type: boolean
default: true

nsub parameter passed to vst()

type: integer
default: 1000

passed to lmFit(), positive integer giving the number of times each distinct probe is printed on each array.

type: number

passed to lmFit(), positive integer giving the spacing between duplicate occurrences of the same probe, spacing=1 for consecutive rows.

type: string

Sample sheet column to be used to derive a vector or factor specifying a blocking variable on the arrays for limma::lmFit(); however, for random effects models, DREAM is the recommended approach in this pipeline

type: string

passed to limma::lmFit(), the inter-duplicate or inter-technical replicate correlation; however for random effects models, DREAM is the recommended approach in this pipeline

type: string

passed to lmFit(), the fitting method

type: string

passed to eBayes(), a numeric value between 0 and 1, assumed proportion of genes which are differentially expressed

type: number
default: 0.01

passed to eBayes(), logical, should an intensity-dependent trend be allowed for the prior variance?

type: boolean

passed to eBayes(), logical, should the estimation of df.prior and var.prior be robustified against outlier sample variances?

type: boolean

passed to eBayes(), comma-separated string of two values, assumed lower and upper limits for the standard deviation of log2-fold-changes for differentially expressed genes

type: string
default: 0.1,4

passed to eBayes(), comma-separated string of length 1 or 2, giving left and right tail proportions of x to Winsorize. Used only when robust=TRUE.

type: string
default: 0.05,0.1

passed to topTable(), minimum absolute log2-fold-change required

type: integer

passed to topTable(), logical, should 95% confidence intervals be output for logFC? Alternatively, can take a numeric value between zero and one specifying the confidence level required.

type: boolean

passed to topTable(), method used to adjust the p-values for multiple testing.

type: string

cutoff value for adjusted p-values. Only genes with lower p-values are listed.

type: number
default: 1

Turns on and off usage of voom normalization in the Limma module.

type: boolean

type: integer
default: 1

type: integer

type: boolean

type: number
default: 0.01

type: string
default: 0.1,4

type: boolean

type: boolean

type: string
default: 0.05,0.1

type: string
default: adaptive

type: boolean

type: boolean

type: string
default: BH

Functional analysis method

type: string

Gene sets in GMT or GMX format. For GSEA, multiple comma-separated input files in either format are possible. For gprofiler2, a single file in GMT format is possible; this has lowest priority and will be overridden by --gprofiler2_token and --gprofiler2_organism.

type: string

Permutation type

type: string

Number of permutations

type: integer
default: 1000

Enrichment statistic

type: string

Metric for ranking genes

type: string

Gene list sorting mode

type: string

Gene list ordering mode

type: string

Max size: exclude larger sets

type: integer
default: 500

Min size: exclude smaller sets

type: integer
default: 15

Normalisation mode

type: string

Randomization mode

type: string

Make detailed geneset report?

type: boolean
default: true

Use median for class metrics

type: boolean

Number of markers

type: integer
default: 100

Plot graphs for the top sets of each phenotype

type: integer
default: 20

Seed for permutation

type: string
default: timestamp

Save random ranked lists

type: boolean

Make a zipped file with all reports

type: boolean

Short name of the organism being analyzed, e.g. hsapiens for Homo sapiens.

type: string

Should only significant enrichment results be considered?

type: boolean
default: true

Should underrepresentation be measured instead of overrepresentation?

type: boolean

The method that should be used for multiple testing correction.

type: string

On which source databases to run the gprofiler query

type: string

Whether to include evcodes in the results.

type: boolean

Maximum q value used for significance testing.

type: number
default: 0.05

Token that should be used as a query.

type: string

Path to CSV/TSV/TXT file that should be used as a background list of genes for the query; alternatively, ‘auto’ (default) or ‘false’.

type: string
default: auto
pattern: ^\S+\.(csv|tsv|txt)$|auto|false
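
The background setting therefore takes one of three forms: the keyword auto, the keyword false, or a path to a CSV/TSV/TXT file. A minimal sketch of telling them apart is shown below; the function name and the returned labels are hypothetical, and what the pipeline does in each case is not shown here.

```python
import re

# File alternative of the pattern above.
BACKGROUND_FILE_PATTERN = re.compile(r"^\S+\.(csv|tsv|txt)$")


def classify_background(value: str) -> str:
    """Classify a background setting as 'auto', 'disabled' or 'file'."""
    if value == "auto":
        return "auto"
    if value == "false":
        return "disabled"
    if BACKGROUND_FILE_PATTERN.search(value):
        return "file"
    raise ValueError(f"Unsupported background value: {value!r}")


for setting in ["auto", "false", "genes/background.txt"]:
    print(setting, "->", classify_background(setting))
```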

Which column to use as gene IDs in the background matrix.

type: string

How to calculate the statistical domain size.

type: string

How many genes must be differentially expressed in a pathway for it to be considered enriched? Default 1.

type: integer
default: 1

Valid R palette name

type: string
default: Blues

Path to TSV file containing network file for decoupler

type: string
pattern: ^\S+\.(tsv)$

Removes sources of a net with fewer than min_n targets

type: integer
default: 5

Comma-separated list of methods to use (e.g., ‘ora,ulm’)

type: string
default: ulm

Should a Shiny app be built?

type: boolean
default: true

Should the app be deployed to shinyapps.io?

type: boolean

Your shinyapps.io account name

type: string

The name of the app to push to in your shinyapps.io account

type: string

Qmd report template from which to create the pipeline report

required
type: string
default: ${projectDir}/assets/differentialabundance_report.qmd
pattern: ^\S+\.(Rmd|qmd|ipynb)$

Email address for completion summary.

type: string
pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$

A logo to display in the report instead of the generic pipeline logo.

hidden required
type: string
default: ${projectDir}/docs/images/nf-core-differentialabundance_logo_light.png

CSS to use to style the output, in lieu of the default nf-core styling

hidden required
type: string
default: ${projectDir}/assets/nf-core_style.css

A markdown file containing citations to include in the final report

type: string
default: ${projectDir}/CITATIONS.md

A title for reporting outputs

type: string

An author for reporting outputs

type: string

Semicolon-separated string of contributor info that should be listed in the report.

type: string

A description for reporting outputs

type: string

Whether to generate a scree plot in the report

type: boolean
default: true

Skip generation of reports

type: boolean

Reference genome related files and options required for the workflow.

Name of iGenomes reference.

type: string

Do not load the iGenomes reference config.

hidden
type: boolean

The base path to the igenomes reference files

hidden
type: string
default: s3://ngi-igenomes/igenomes/

Parameters used to describe centralised config profiles. These should not be edited.

Git commit id for Institutional configs.

hidden
type: string
default: master

Base directory for Institutional configs.

hidden
type: string
default: https://raw.githubusercontent.com/nf-core/configs/master

Institutional config name.

hidden
type: string

Institutional config description.

hidden
type: string

Institutional config contact information.

hidden
type: string

Institutional config URL link.

hidden
type: string

Less common options for the pipeline, typically set in a config file.

Display version and exit.

hidden
type: boolean

Method used to save pipeline results to output directory.

type: string

Email address for completion summary, only when pipeline fails.

type: string
pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$

Send plain-text email instead of HTML.

hidden
type: boolean

Do not use coloured log outputs.

hidden
type: boolean

Incoming hook URL for messaging service

hidden
type: string

Boolean whether to validate parameters against the schema at runtime

type: boolean
default: true

Base URL or local path to location of pipeline test dataset files

hidden
type: string
default: https://raw.githubusercontent.com/nf-core/test-datasets/

Suffix to add to the trace report filename. Default is the date and time in the format yyyy-MM-dd_HH-mm-ss.

hidden
type: string

Display the help message.

type: boolean,string

Display the full detailed help message.

type: boolean

Display hidden parameters in the help message (only works when --help or --help_full are provided).

type: boolean