
This function identifies and classifies transcription factors (TFs) in each proteome, and returns the TF counts per family as a SummarizedExperiment object.

Usage

get_tf_counts(proteomes, species_metadata = NULL)

Arguments

proteomes

A named list of AAStringSet objects, one element per species.

species_metadata

(Optional) A data frame containing species names in row names (names must match element names in the proteomes list), and species metadata (e.g., taxonomic information, ecological information) in columns. If NULL, the colData of the SummarizedExperiment object will be empty.
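
Because the row names of species_metadata must match the element names of the proteomes list, it can help to check this before calling the function. The following is a minimal sketch in base R. The variable `proteome_names` is a hypothetical stand-in for `names(proteomes)`, and the check itself is an illustration, not part of the package.

```r
# Hypothetical stand-in for names(proteomes); only the names matter for this check
proteome_names <- c("Gsu1", "Gsu2", "Gsu3", "Gsu4")

species_metadata <- data.frame(
    row.names = proteome_names,
    Division = "Rhodophyta",
    Origin = c("US", "Belgium", "China", "Brazil")
)

# Species present in the proteomes list but absent from the metadata row names
missing_species <- setdiff(proteome_names, rownames(species_metadata))

if (length(missing_species) > 0) {
    stop("No metadata for: ", paste(missing_species, collapse = ", "))
}
```

If any species name is missing from the metadata, the sketch stops with a message that lists the offending names.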

Value

A SummarizedExperiment object containing transcription factor frequencies per family in each species, as well as species metadata (if species_metadata is not NULL).
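
To give a feel for how such an object can be inspected, the sketch below builds a mock SummarizedExperiment by hand and reads it back with the standard accessors `assay()` and `colData()`. The family names, the counts, and the assay name `counts` are made up for illustration, and the layout (families in rows, species in columns) is an assumption based on the species metadata being stored in colData. None of this is real output of get_tf_counts().

```r
library(SummarizedExperiment)

# Made-up counts for illustration only (not real output of get_tf_counts())
counts <- matrix(
    c(3L, 2L, 4L, 1L,
      0L, 1L, 1L, 2L),
    nrow = 2, byrow = TRUE,
    dimnames = list(
        c("FamilyA", "FamilyB"),            # hypothetical TF families
        c("Gsu1", "Gsu2", "Gsu3", "Gsu4")   # species
    )
)

species_metadata <- data.frame(
    row.names = colnames(counts),
    Division = "Rhodophyta",
    Origin = c("US", "Belgium", "China", "Brazil")
)

# Assemble a mock object with species metadata as colData
mock_se <- SummarizedExperiment(
    assays = list(counts = counts),
    colData = as(species_metadata, "DataFrame")
)

assay(mock_se)            # family-by-species count matrix
colData(mock_se)          # species metadata
colSums(assay(mock_se))   # total TF count per species
```

The same accessors should apply to the object returned by get_tf_counts(), since it is documented as a SummarizedExperiment.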

Examples

data(gsu)

set.seed(123)
# Pick random subsets of 50 genes to simulate other species
proteomes <- list(
    Gsu1 = gsu[sample(names(gsu), 50, replace = FALSE)],
    Gsu2 = gsu[sample(names(gsu), 50, replace = FALSE)],
    Gsu3 = gsu[sample(names(gsu), 50, replace = FALSE)],
    Gsu4 = gsu[sample(names(gsu), 50, replace = FALSE)]
)

# Create species metadata
species_metadata <- data.frame(
    row.names = names(proteomes),
    Division = "Rhodophyta",
    Origin = c("US", "Belgium", "China", "Brazil")
)

# Get SummarizedExperiment object
if (hmmer_is_installed()) {
    se <- get_tf_counts(proteomes, species_metadata)
}