cubar
Overview
cubar is a package for codon usage bias analysis in R. Main features
are as follows:
- Codon level analyses
  - Calculate codon weights based on gene expression, tRNA availability,
    and mRNA stability;
  - Calculate relative synonymous codon usage (RSCU);
  - Machine learning-based inference of optimal codons;
  - Visualize codon-anticodon pairing relationships;
- Gene level analyses
  - Tabulate the codon frequency of each coding sequence;
  - Measure codon usage similarity to highly expressed genes with the Codon
    Adaptation Index (CAI);
  - Quantify the influence of codon usage on mRNA stability with Mean
    Codon Stabilization Coefficients (CSCg);
  - Measure codon usage bias with the nonparametric index Effective
    Number of Codons (ENC);
  - Measure the fraction of pre-determined optimal codons (Fop) in each
    sequence;
  - Calculate overall GC content (GC), or that of 3rd synonymous positions
    (GC3s) or 4-fold degenerate sites (GC4d);
  - Quantify whether codon usage matches tRNA availability using the tRNA
    Adaptation Index (tAI);
  - Measure the deviation from proportionality (Dp) of viral synonymous
    codon usage from host tRNA supply;
- Utilities
  - Sliding window analysis of codon usage within a coding sequence;
  - Optimize codon usage based on optimal codons for heterologous
    expression;
  - Test differential usage of codons between two sets of sequences;
Main advantages of cubar are as follows:
- Process large datasets (>10,0000 sequences) efficiently using the
  Biostrings and data.table backends;
- Support genetic codes cataloged by NCBI as well as custom ones;
- Integrate with other data analysis or bioinformatic packages in the R
  ecosystem;
Dependencies
Depends
Imports
Biostrings (>= 2.60.0), IRanges (>= 2.34.0), data.table (>= 1.14.0), ggplot2 (>= 3.3.5), rlang (>= 0.4.11)
Installation
The latest release of cubar can be installed with:
install.packages("cubar")
The latest development version of cubar can be installed with:
devtools::install_github("mt1022/cubar", dependencies = TRUE)
Usage
Documentation can be found within R (by typing ?function_name).
Tutorials are available from our website.
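As a rough illustration of a typical session, the sketch below runs quality control on a set of coding sequences, tabulates codon frequencies, and computes ENC for each sequence. The function names (`check_cds`, `count_codons`, `get_enc`) and the bundled `yeast_cds` dataset are recalled from the package documentation and are assumptions here; confirm them with `?check_cds`, `?count_codons`, and `?get_enc` before relying on this.

```r
# Minimal sketch, not an authoritative workflow. Function and dataset names
# are assumptions; verify them against the installed version of cubar.
library(cubar)

# Example coding sequences assumed to ship with the package
cds <- check_cds(yeast_cds)   # basic quality control of coding sequences

# Codon frequency table, assumed to have one row per sequence
cf <- count_codons(cds)

# Effective number of codons (ENC) for each sequence
enc <- get_enc(cf)
head(enc)
```

Other gene-level indices listed in the overview are expected to follow the same pattern: compute the codon frequency table once, then pass it to the function for the index of interest.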
Getting help
Please use GitHub issues for bug
reports, questions, and feature requests.
Suggests
- Biostrings for sequence input/output and manipulation;
- Peptides for peptide- or protein-related indices;
Acknowledgements
GitHub Copilot was used to suggest code snippets during the development
of this package. Thanks to the GitHub Education teacher program for
providing free access to GitHub Copilot.