Compare inferred orthogroups to a reference set
Source: R/homology_detection.R
compare_orthogroups.Rd
Arguments
- ref_orthogroups
Reference orthogroups in a 3-column data frame with columns Orthogroup, Species, and Gene. This data frame can be created from the 'Orthogroups.tsv' file generated by OrthoFinder with the function read_orthogroups().
- test_orthogroups
Test orthogroups that will be compared to ref_orthogroups in the same 3-column data frame format.
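As a minimal illustration of the expected input format, a toy data frame with the three required columns could be built by hand. All orthogroup, species, and gene IDs below are made up for illustration and do not come from the package data.

```r
# Toy orthogroups in the 3-column format (all IDs are hypothetical)
toy_ref <- data.frame(
    Orthogroup = c("OG1", "OG1", "OG2", "OG2"),
    Species = c("SpeciesA", "SpeciesB", "SpeciesA", "SpeciesB"),
    Gene = c("geneA1", "geneB1", "geneA2", "geneB2")
)

# Inspect the structure: one row per gene, with its orthogroup and species
str(toy_ref)
```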
Value
A 2-column data frame with the following variables:
- Orthogroup
Character vector of orthogroup IDs.
- Preserved
A logical vector of preservation status. It is TRUE if the orthogroup in the reference set is fully preserved in the test set, and FALSE otherwise.
Details
This function compares a test set of orthogroups to a reference set and reports which orthogroups in the reference set are fully preserved in the test set (i.e., have an identical gene repertoire) and which are not. Species names (column 2) must match between the reference and test sets. Species that are not shared between the two sets are excluded from the comparison.
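The idea of "fully preserved" can be sketched in base R as follows. This is not the package's implementation: `sketch_preserved()` is a hypothetical helper, and it assumes one plausible reading of the description above, namely that a reference orthogroup counts as preserved when some test orthogroup contains exactly the same genes, after dropping species that are not shared.

```r
# Hypothetical helper illustrating the concept; NOT the package's actual code
sketch_preserved <- function(ref, test) {
    # Keep only species present in both sets
    shared <- intersect(as.character(ref$Species), as.character(test$Species))
    ref <- ref[ref$Species %in% shared, ]
    test <- test[test$Species %in% shared, ]

    # Gene repertoire per orthogroup, sorted so that gene order does not matter
    ref_sets <- lapply(
        split(as.character(ref$Gene), as.character(ref$Orthogroup)), sort
    )
    test_sets <- lapply(
        split(as.character(test$Gene), as.character(test$Orthogroup)), sort
    )

    # Preserved = some test orthogroup has exactly the same genes
    preserved <- vapply(ref_sets, function(genes) {
        any(vapply(test_sets, function(x) identical(x, genes), logical(1)))
    }, logical(1))

    data.frame(Orthogroup = names(ref_sets), Preserved = unname(preserved))
}

# Toy data (all IDs are made up): OG1 is kept intact, OG2 is split in two
toy_ref <- data.frame(
    Orthogroup = c("OG1", "OG1", "OG2", "OG2"),
    Species = c("SpeciesA", "SpeciesB", "SpeciesA", "SpeciesB"),
    Gene = c("geneA1", "geneB1", "geneA2", "geneB2")
)
toy_test <- data.frame(
    Orthogroup = c("X1", "X1", "X2", "X3"),
    Species = c("SpeciesA", "SpeciesB", "SpeciesA", "SpeciesB"),
    Gene = c("geneA1", "geneB1", "geneA2", "geneB2")
)

res <- sketch_preserved(toy_ref, toy_test)
res
```

In this toy case, OG1 is reported as preserved because test orthogroup X1 holds the same two genes, while OG2 is not because its genes are split across X2 and X3.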
Examples
set.seed(123)
data(og)
og <- og[1:5000, ]
ref <- og
# Shuffle genes to simulate a different set
test <- data.frame(
Orthogroup = sample(og$Orthogroup, nrow(og), replace = FALSE),
Species = og$Species,
Gene = og$Gene
)
comparison <- compare_orthogroups(ref, test)
# Calculate the proportion of preserved orthogroups
sum(comparison$Preserved) / length(comparison$Preserved)
#> [1] 0