The main goal of doubletrouble is to identify duplicated genes from whole-genome protein sequences and classify them based on their modes of duplication. Duplicates can be classified using four different classification schemes, which increase in complexity and level of detail in a stepwise manner. The classification schemes and the duplication modes they can classify are:
Scheme | Duplication modes |
---|---|
binary | SD, SSD |
standard | SD, TD, PD, DD |
extended | SD, TD, PD, TRD, DD |
full | SD, TD, PD, rTRD, dTRD, DD |
Legend: SD, segmental duplication. SSD, small-scale duplication. TD, tandem duplication. PD, proximal duplication. TRD, transposon-derived duplication. rTRD, retrotransposon-derived duplication. dTRD, DNA transposon-derived duplication. DD, dispersed duplication.
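The stepwise relationship between schemes can be sketched in base R. The `schemes` list below is taken directly from the table; the `parent` mapping (which mode a finer category falls back to in a coarser scheme) and the `collapse_mode()` helper are assumptions made for illustration and are not part of the doubletrouble API.

```r
# Duplication modes reported by each scheme (from the table above)
schemes <- list(
  binary   = c("SD", "SSD"),
  standard = c("SD", "TD", "PD", "DD"),
  extended = c("SD", "TD", "PD", "TRD", "DD"),
  full     = c("SD", "TD", "PD", "rTRD", "dTRD", "DD")
)

# ASSUMPTION: the coarser category each mode would fall back to
# when a scheme does not report it (illustrative only)
parent <- c(
  rTRD = "TRD", dTRD = "TRD", TRD = "DD",
  TD = "SSD", PD = "SSD", DD = "SSD"
)

# Hypothetical helper: express a mode in the vocabulary of a given scheme
collapse_mode <- function(mode, scheme) {
  allowed <- schemes[[scheme]]
  while (!(mode %in% allowed)) {
    if (!(mode %in% names(parent))) {
      stop("cannot express mode in scheme '", scheme, "'")
    }
    mode <- parent[[mode]]
  }
  mode
}

collapse_mode("rTRD", "standard")  # walks rTRD -> TRD -> DD
collapse_mode("TD", "binary")      # walks TD -> SSD
```

Under this assumed hierarchy, each scheme refines the one before it, so a result from the full scheme could be summarized at any coarser level.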
Besides classifying gene pairs, users can also classify genes, so that each gene is assigned to a unique mode of duplication.
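As a rough illustration of how a gene that appears in several pairs could end up with a single mode, the base R sketch below picks the highest-priority mode observed for each gene. The priority order SD > TD > PD > DD, the toy gene IDs, and the column names are assumptions for this example; consult the package documentation for the rule doubletrouble actually applies.

```r
# Toy duplicate pairs (hypothetical gene IDs and modes)
pairs <- data.frame(
  dup1 = c("g1", "g1", "g3", "g4"),
  dup2 = c("g2", "g3", "g5", "g5"),
  type = c("SD", "DD", "TD", "DD")
)

# ASSUMPTION: priority order used to break ties between modes
priority <- c("SD", "TD", "PD", "DD")

# Long format: one row per (gene, mode) observation
long <- data.frame(
  gene = c(pairs$dup1, pairs$dup2),
  rank = match(rep(pairs$type, 2), priority)
)

# For each gene, keep the best (lowest) rank seen in any of its pairs
best <- vapply(split(long$rank, long$gene), min, integer(1))

gene_modes <- data.frame(
  gene = names(best),
  type = priority[best]
)
gene_modes
```

Here `g1` and `g3` each occur in two pairs with different modes, and each receives only the higher-priority one.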
Users can also calculate substitution rates per substitution site (i.e., Ka and Ks, and their ratio Ka/Ks) from duplicate pairs, find peaks in Ks distributions with Gaussian Mixture Models (GMMs), and classify gene pairs into age groups based on Ks peaks.
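To make the idea of Ks peaks and age groups concrete, here is a minimal base R sketch that fits a two-component Gaussian mixture to simulated Ks values with a hand-written EM loop and then assigns each pair to the more likely component. The simulated data, the `fit_gmm2()` helper, and the fixed number of components are all assumptions for illustration; this is not how doubletrouble implements its peak finding.

```r
set.seed(1)

# Simulated Ks values from two hypothetical duplication events
ks <- c(rnorm(300, mean = 0.5, sd = 0.1), rnorm(300, mean = 1.5, sd = 0.2))
ks <- ks[ks > 0]

# Hypothetical helper: EM for a two-component 1D Gaussian mixture
fit_gmm2 <- function(x, n_iter = 200) {
  mu <- as.numeric(quantile(x, c(0.25, 0.75)))  # initial means
  sigma <- rep(sd(x), 2)                        # initial standard deviations
  lambda <- c(0.5, 0.5)                         # initial mixing proportions
  for (i in seq_len(n_iter)) {
    # E-step: responsibility of each component for each observation
    d1 <- lambda[1] * dnorm(x, mu[1], sigma[1])
    d2 <- lambda[2] * dnorm(x, mu[2], sigma[2])
    r1 <- d1 / (d1 + d2)
    r2 <- 1 - r1
    # M-step: update proportions, means, and standard deviations
    lambda <- c(mean(r1), mean(r2))
    mu <- c(sum(r1 * x) / sum(r1), sum(r2 * x) / sum(r2))
    sigma <- c(
      sqrt(sum(r1 * (x - mu[1])^2) / sum(r1)),
      sqrt(sum(r2 * (x - mu[2])^2) / sum(r2))
    )
  }
  list(mean = mu, sd = sigma, lambda = lambda, resp = cbind(r1, r2))
}

fit <- fit_gmm2(ks)
fit$mean  # estimated peak positions

# Age group: the component with the larger responsibility for each pair
age_group <- ifelse(fit$resp[, 1] >= 0.5, 1L, 2L)
table(age_group)
```

In practice the number of peaks is not known in advance and would have to be chosen, for example by comparing models with different numbers of components.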
Installation instructions
Get the latest stable R release from CRAN. Then install doubletrouble from Bioconductor using the following code:
if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
}
BiocManager::install("doubletrouble")
You can install the development version from GitHub with:
BiocManager::install("almeidasilvaf/doubletrouble")
Citation
Below is the citation output from using citation('doubletrouble') in R. Please run this yourself to check for any updates on how to cite doubletrouble.
print(citation('doubletrouble'), bibtex = TRUE)
#> To cite package 'doubletrouble' in publications use:
#>
#> Almeida-Silva F, Van de Peer Y (2022). _doubletrouble: Identification
#> and classification of duplicated genes_. R package version 1.3.0,
#> <https://github.com/almeidasilvaf/doubletrouble>.
#>
#> A BibTeX entry for LaTeX users is
#>
#> @Manual{,
#> title = {doubletrouble: Identification and classification of duplicated genes},
#> author = {Fabrício Almeida-Silva and Yves {Van de Peer}},
#> year = {2022},
#> note = {R package version 1.3.0},
#> url = {https://github.com/almeidasilvaf/doubletrouble},
#> }
Please note that doubletrouble was only made possible thanks to many other R and bioinformatics software authors, who are cited in the vignettes and/or the paper(s) describing this package.
Code of Conduct
Please note that the doubletrouble project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.
Development tools
- Continuous code testing is possible thanks to GitHub Actions through usethis, remotes, and rcmdcheck, customized to use Bioconductor’s Docker containers and BiocCheck.
- Code coverage assessment is possible thanks to codecov and covr.
- The documentation website is automatically updated thanks to pkgdown.
- The code is styled automatically thanks to styler.
- The documentation is formatted thanks to devtools and roxygen2.
For more details, check the dev directory.
This package was developed using biocthis.