Classify gene pairs derived from segmental duplications

Usage

get_segmental(anchor_pairs = NULL, pairs = NULL)

Arguments

anchor_pairs: A 2-column data frame with anchor pairs in columns 1 and 2.
pairs: A 2-column data frame with all duplicate pairs. This is equivalent to the first 2 columns of the tabular output of BLAST-like programs.
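
As a rough illustration of what `pairs` can look like, the sketch below parses two made-up lines of 12-column BLAST-style tabular output and keeps the first 2 columns. The gene IDs, the scores, and the object names (`toy_blast`, `toy_hits`, `toy_dup_pairs`) are invented for this example and are not part of the package.

```r
# Two invented hits in 12-column BLAST-like tabular format (tab-separated)
toy_blast <- paste(
    "geneA\tgeneB\t98.5\t200\t3\t0\t1\t200\t1\t200\t1e-100\t380",
    "geneA\tgeneC\t75.0\t180\t45\t2\t5\t184\t10\t189\t1e-40\t150",
    sep = "\n"
)

# Parse the text as a headerless, tab-delimited table
toy_hits <- read.table(
    text = toy_blast, sep = "\t", header = FALSE, stringsAsFactors = FALSE
)

# Keep only the first 2 columns (the IDs of the two genes in each hit)
toy_dup_pairs <- toy_hits[, c(1, 2)]
```

A data frame shaped like `toy_dup_pairs` is the kind of input expected for `pairs`.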

Value

A 3-column data frame with the variables:

dup1: Character, duplicated gene 1
dup2: Character, duplicated gene 2
type: Factor indicating the duplication type, with levels "SD" (segmental duplication) and "DD" (dispersed duplication).
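
To make the shape of the return value concrete, here is a minimal sketch of the classification idea on toy data. It is not the package's implementation: `classify_segmental_sketch()` is a hypothetical helper, and it assumes that a pair counts as "SD" only when the same ordered pair appears in the anchor pairs, with every other pair labeled "DD".

```r
# Hypothetical helper for illustration only; not the code behind get_segmental()
classify_segmental_sketch <- function(anchor_pairs, pairs) {
    out <- data.frame(
        dup1 = as.character(pairs[[1]]),
        dup2 = as.character(pairs[[2]]),
        stringsAsFactors = FALSE
    )

    # One key per pair; assumes gene IDs never contain a tab character
    key_all <- paste(out$dup1, out$dup2, sep = "\t")
    key_anchor <- paste(anchor_pairs[[1]], anchor_pairs[[2]], sep = "\t")

    # Assumption: only exact, same-order matches to an anchor pair are "SD"
    out$type <- factor(
        ifelse(key_all %in% key_anchor, "SD", "DD"),
        levels = c("SD", "DD")
    )
    out
}

# Toy inputs with invented gene IDs
toy_pairs <- data.frame(
    query = c("g1", "g2", "g3"),
    db = c("g4", "g5", "g6")
)
toy_anchors <- data.frame(anchor1 = "g1", anchor2 = "g4")

toy_result <- classify_segmental_sketch(toy_anchors, toy_pairs)
```

In this toy case, `toy_result` has the columns `dup1`, `dup2`, and `type`, and only the pair `g1`/`g4` is labeled "SD".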

Examples

data(diamond_intra)
data(yeast_annot)
data(yeast_seq)
blast_list <- diamond_intra

# Get processed annotation for S. cerevisiae
annotation <- syntenet::process_input(yeast_seq, yeast_annot)$annotation[1]

# Get list of intraspecies anchor pairs
anchor_pairs <- get_anchors_list(blast_list, annotation)
anchor_pairs <- anchor_pairs[[1]][, c(1, 2)]

# Get duplicate pairs from DIAMOND output (first 2 columns)
duplicates <- diamond_intra[[1]][, c(1, 2)]

# Classify each duplicate pair as SD or DD
dups <- get_segmental(anchor_pairs, duplicates)