Perform overrepresentation analysis for a set of genes
Usage
ora(
genes,
annotation,
column = NULL,
background,
correction = "BH",
alpha = 0.05,
min_setsize = 5,
max_setsize = 500,
bp_param = BiocParallel::SerialParam()
)
Arguments
- genes
Character vector containing genes for overrepresentation analysis.
- annotation
Annotation data frame with genes in the first column and functional annotation in the other columns. This data frame can be exported from Biomart or similar databases.
- column
Column or columns of annotation to be used for enrichment, given either as column names (character) or as column indices (numeric). To use more than one column, supply a character or numeric vector. Default: NULL, which uses all columns of annotation.
- background
Character vector of genes to be used as background for the overrepresentation analysis.
- correction
Multiple testing correction method. One of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr" or "none". Default is "BH".
- alpha
Numeric indicating the adjusted P-value threshold for significance. Default: 0.05.
- min_setsize
Numeric indicating the minimum gene set size to be considered. Gene sets correspond to the levels of each variable in annotation. Default: 5.
- max_setsize
Numeric indicating the maximum gene set size to be considered. Gene sets correspond to the levels of each variable in annotation. Default: 500.
- bp_param
BiocParallel back-end to be used. Default: BiocParallel::SerialParam()
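To make the shape of annotation and the meaning of min_setsize and max_setsize concrete, here is a minimal base R sketch. It is an illustration under stated assumptions, not the function's internal code: the gene IDs, term names, and thresholds are hypothetical, and whether ora() counts set sizes before or after restricting to the background is not specified here.

```r
# Toy annotation: genes in the first column, one annotation column ("GO").
# All IDs and term names are made up for illustration.
annotation <- data.frame(
    Gene = c("g1", "g2", "g3", "g4", "g5", "g6", "g7"),
    GO = c("termA", "termA", "termA", "termB", "termB", "termC", "termA"),
    stringsAsFactors = FALSE
)

# Each level of the annotation column is one gene set; count its members.
set_sizes <- table(annotation$GO)

# Keep only gene sets whose size falls within the chosen bounds.
# Small thresholds are used here because the toy data are tiny.
min_setsize <- 2
max_setsize <- 3
kept <- names(set_sizes)[set_sizes >= min_setsize & set_sizes <= max_setsize]
kept
```

In this toy data, termA has 4 genes, termB has 2, and termC has 1, so only termB lies within the bounds.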
Value
A data frame of overrepresentation results with the following variables:
- term
Character, functional term ID/name.
- genes
Numeric, number of input genes that belong to a particular functional term (i.e., the size of the intersection between the input genes and the term's genes).
- all
Numeric, number of all genes in a particular functional term.
- pval
Numeric, P-value for the hypergeometric test.
- padj
Numeric, P-value adjusted for multiple comparisons using the method specified in parameter correction.
- category
Character, name of the grouping variable (i.e., column name of annotation).
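The pval and padj columns can be illustrated with base R. The following is a minimal sketch that assumes the standard one-sided hypergeometric test for overrepresentation and adjustment with stats::p.adjust(); it is not taken from the function's source. The gene IDs and the two extra P-values are hypothetical.

```r
# Toy data (hypothetical gene IDs)
background <- paste0("g", 1:100)           # all genes considered
gene_set   <- paste0("g", 1:10)            # genes annotated to one term
genes      <- paste0("g", c(1:5, 50:54))   # input genes of interest

k <- length(intersect(genes, gene_set))       # input genes in the term
K <- length(intersect(gene_set, background))  # term size in the background
n <- length(intersect(genes, background))     # number of input genes
N <- length(background)                       # background size

# Probability of observing k or more term genes among n draws
# from a background with K term genes and N - K other genes.
pval <- phyper(k - 1, K, N - K, n, lower.tail = FALSE)

# Adjust across several terms; the other two P-values are made up.
pvals <- c(term_a = pval, term_b = 0.20, term_c = 0.04)
padj <- p.adjust(pvals, method = "BH")
```

The upper-tail form phyper(k - 1, ..., lower.tail = FALSE) gives P(X >= k), which matches a one-sided Fisher's exact test on the corresponding 2x2 table.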
Examples
data(se_chlamy)
data(go_chlamy)
data(deg_list)
# Perform ORA for up-regulated genes in contrast F1_vs_P1
up_genes <- deg_list$F1_vs_P1
up_genes <- rownames(up_genes[up_genes$log2FoldChange > 0, ])
background <- rownames(se_chlamy)
ora(up_genes, go_chlamy, background = background)
#>               term genes all         pval       padj category
#> 133 dynein complex     8  13 1.924882e-05 0.00802676       GO