- Software
- Open access
CamlTree: a streamlined software for phylogenetic analysis of viral and mitochondrial genomes
BMC Bioinformatics volume 26, Article number: 53 (2025)
Abstract
Background
Over the past decade, the continuous and rapid advances in bioinformatics have led to an increasingly common use of molecular sequence comparison for phylogenetic analysis. However, the use of multi-software and cross-platform strategies has increased the complexity of phylogenetic tree estimation. Therefore, the development and application of streamlined phylogenetic analysis tools are growing in significance in the field of biology. Particularly for genomes with relatively short sequences, there is a lack of simple and integrative tools for phylogenetic analysis.
Results
In this study, we present CamlTree (Concatenated alignments maximum-likelihood tree), user-friendly desktop software designed to simplify phylogenetic analysis for viral and mitochondrial genomes, ultimately facilitating related research. CamlTree provides a workflow including gene concatenation (or coalescence), sequence alignment, alignment optimization, and the estimation of phylogenetic trees using both maximum-likelihood (ML) and Bayesian inference (BI) methods. CamlTree was written in TypeScript and developed using the Electron framework. It offers a user-friendly interface based on the React framework.
Conclusions
CamlTree has been released for the Windows OS. It integrates several popular analysis tools to optimize and simplify the process of estimating multi-gene phylogenetic trees. The software can help researchers reduce their workload and process data more efficiently, allowing them to expedite their research. The software, along with a detailed user manual, is available at https://github.com/BioCrossCoder/camltree.
Background
Understanding the phylogenetic relationships among organisms is a prerequisite for evolutionary research [1]. In the past two decades, the rapid development of sequencing technologies, advances in computational power and the growing accessibility of extensive multilocus datasets have made phylogenetic methods more diverse [1,2,3]. Phylogenetic analysis is a widely used biological method that compares morphological, behavioral, and molecular features among species to estimate a species tree and study their evolutionary history. Inferring evolutionary history from molecular data is an essential goal of modern biology [4]. Differences in molecular data can reflect evolutionary relationships among species and have significant implications for species classification and origin research [5, 6].
There is a growing wealth of genetic data available for phylogenetic analysis with powerful algorithms. However, accurate phylogenetic analysis is a difficult and complex process that involves multiple steps, mainly data preparation, sequence alignment, alignment trimming, tree estimation and evaluation, interpretation, and application. Moreover, functional gene sequences from some small genomes, such as those of mitochondria and viruses, are typically short, which may result in unavoidable biases in phylogenetic analysis. A good solution is to select functional genes from multiple species, align and trim the gene sequences, concatenate them into a longer sequence, and use the concatenated sequence to estimate the phylogenetic tree [7,8,9]. This is time-consuming if each step of the process requires a different tool on a different platform (Linux, Windows or macOS), especially as the tools usually have different input and output file formats. Employing diverse tools and conducting cross-platform analysis at different stages within a workflow is particularly challenging for novices and researchers with limited computer proficiency.
Consequently, researchers are increasingly seeking and developing multi-functional software packages that encapsulate an entire analysis workflow in a single solution [10,11,12]. In recent years, with the development of sequencing technologies and advances in bioinformatics, phylogenetic analysis tools have gradually been developed and put into use. Although several existing software programs have integrated related steps and simplified the process, there is still a lack of suitable platforms for phylogenetic analysis of non-eukaryotic genomes with relatively short sequences, or of smaller sets of genes from eukaryotic genomes (the nuclear genome or mitochondrial genome). For example, downstream analysis using GENEIOUS is not entirely automated and requires manual data processing, especially for multi-gene datasets [13]. Concatenator may perform unstably because of operating system compatibility issues: as reported in 2022, it was incompatible with the older but widely used Windows 7 system [14]. LMAP_S integrates more than 25 software programs, offering over 65 algorithm choices. However, it only supports Linux/UNIX, which makes it difficult for novices in the field of phylogenetic analysis to use [11].
In light of this, we developed CamlTree, a user-friendly graphical user interface (GUI) program for phylogenetic analysis of small genomes, specifically those of mitochondria and viruses. The platform integrates a series of third-party analytical tools and encapsulates multiple sequence alignment, alignment trimming, and tree-estimation processes. Sequence identification and concatenation are implemented using Python scripts. This platform aims to automate program execution for researchers, reducing the need to understand program operating procedures, convert data between programs, and handle input and output files before performing phylogenetic analysis. As a result, researchers are able to concentrate more on interpreting the analysis results and understanding the data, ultimately improving the efficiency and credibility of their analyses.
In this paper, we introduce the design and architecture of CamlTree, highlighting the relationships between the modules and the infrastructure choices made during development. We present the user interface of CamlTree, along with its functional modules and available workflows, in detail. We also discuss the advantages and limitations of the software and emphasize its unique strengths by comparing CamlTree with other existing software programs. Additionally, we use test data to verify that CamlTree analyzes data effectively and generates reliable results.
Implementation
Design and architecture
CamlTree is designed using a modular approach, which allows for clear delineation between different components (Additional file 1: Figure S1). The architecture of CamlTree mainly consists of the user interface, the program backend and the infrastructure. CamlTree was written in TypeScript and developed using the Electron framework. It integrates various command-line tools and offers a user-friendly interface built on the React framework. To ensure good page transition performance, CamlTree uses front-end routing in hash mode. The encapsulated analysis process is implemented in the background using NodeJS worker scripts, achieving multi-process parallel computing through a combination of asynchronous calls and subprocesses.
Interaction between modules
Communication between the front end and the back end is handled through Electron, which provides inter-process communication. Electron also wraps some system APIs, allowing developers to call system pop-ups and other functions through it. Because JavaScript is a single-threaded language, it is not well suited to multi-threading. We therefore combine the 'child_process' module in the NodeJS standard library with the asynchronous features of JavaScript. This approach lets the workflow run as multiple processes, achieving parallel computation and reducing the overall execution time. Furthermore, CamlTree adopts a "misalignment parallelization" strategy, in which analysis tasks for different entries in the same step are submitted sequentially but can be executed in parallel. This strategy shortens the otherwise sequential tree-estimation workflow.
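The submit-sequentially, run-in-parallel idea can be sketched as follows. CamlTree itself uses NodeJS child processes; the sketch below is a Python analogue built on `asyncio` subprocesses, not the CamlTree source. The function names and the placeholder commands are assumptions made for illustration.

```python
import asyncio
import sys
import time


async def run_tool(name: str, argv: list[str]) -> tuple[str, int]:
    """Run one external tool as a child process and wait for it without blocking."""
    proc = await asyncio.create_subprocess_exec(*argv)
    return name, await proc.wait()


async def run_step(jobs: dict[str, list[str]]) -> dict[str, int]:
    """Submit the jobs of one step one after another, then let them run side by side."""
    tasks = []
    for name, argv in jobs.items():
        # Sequential submission: each job is scheduled in turn...
        tasks.append(asyncio.create_task(run_tool(name, argv)))
    # ...but all child processes execute concurrently while we wait here.
    results = await asyncio.gather(*tasks)
    return dict(results)  # {job name: exit code}


if __name__ == "__main__":
    # Placeholder command standing in for a per-gene alignment job (hypothetical).
    sleep_cmd = [sys.executable, "-c", "import time; time.sleep(0.5)"]
    jobs = {f"gene{i}": sleep_cmd for i in range(1, 4)}
    started = time.perf_counter()
    exit_codes = asyncio.run(run_step(jobs))
    print(exit_codes, f"elapsed: {time.perf_counter() - started:.1f} s")
```

Because the three placeholder jobs overlap, the elapsed time should be close to that of a single job rather than the sum of all three.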
Selection of Infrastructure
Regarding the global development trend in phylogenetic analysis, the significance of streamlined analysis tools is increasing in the field of biology. Integrated workflows are becoming more common, making analyses faster and more straightforward. CamlTree aims to optimize multi-gene phylogenetic analysis by integrating and simplifying all steps involved, from sequence alignment to estimating phylogenetic trees. It encapsulates several commonly used command-line software programs for phylogenetic analysis into a user-friendly interface, and also allows individual programs to be run separately and connects these independent analyses into a workflow, thereby simplifying and accelerating the standard phylogenetic analysis process.
Numerous software options are currently available for phylogenetic analysis. In most cases, opting for mainstream programs is advisable, as they are broadly recognized and accepted within the field, which enhances the credibility of their results. In this work, several programs were selected to make up the "Infrastructure" part (Additional file 1: Figure S1, Table 1).
For sequence alignment, we chose the programs MAFFT [15] and MACSE [16]. MAFFT is a widely used and highly accurate multiple sequence alignment tool that rapidly identifies homologous regions in sequences using a fast Fourier transform [15, 17, 18]. Other published tools, such as TREEasy [19] and Concatenator, also use MAFFT. However, MAFFT is limited in handling misaligned or frameshifted sequences, which can significantly affect the accuracy of downstream analyses [20]. To address this issue, CamlTree provides an additional alignment program, MACSE, which can produce reliable alignments even in the presence of frameshifts [16]. For alignment optimization, we chose trimAl [21], which automatically removes spurious sequences and poorly aligned regions. trimAl can be run with multiple parameter combinations to preserve the most reliable positions in a multiple sequence alignment, which makes it useful for complex alignment trimming in large-scale phylogenetic analysis [21]. In addition, CamlTree integrates ALTER, a format conversion tool that converts the output alignment format to the input format required for subsequent analysis [22].
For sequence concatenation, after the multiple sequence alignment and alignment optimization subtasks in a workflow are completed, our custom script reads their output and stores it in a JavaScript object. Once all subtasks have finished, a FASTA file is created and the concatenated sequences are written to it. Because JavaScript objects are hash tables, their read and write performance makes them a good choice for CamlTree. The current approach to systematic phylogenetic analysis is typically based on serial tree estimation, which ignores the fact that recombination of base sequences may result in incomplete evolutionary histories at different positions on different chromosomes [23]. We therefore also introduced a parallel method that estimates a tree from each gene, an approach that is not affected by inter-gene recombination.
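The concatenation step described above can be illustrated with a minimal sketch: per-gene alignments are collected in a hash table keyed by species name, and the joined sequences are written to one FASTA file. This is not the CamlTree script. The function names are hypothetical, and padding a species that is missing from a gene with gap characters is an assumption; the article does not say how missing genes are handled.

```python
from pathlib import Path


def read_fasta(path: Path) -> dict[str, str]:
    """Parse a FASTA file into a {sequence name: sequence} dictionary."""
    records: dict[str, str] = {}
    name = None
    for line in path.read_text().splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def concatenate(alignments: list[dict[str, str]]) -> dict[str, str]:
    """Join per-gene alignments by species name, in the order the genes are given."""
    species = sorted({name for aln in alignments for name in aln})
    merged = {name: "" for name in species}
    for aln in alignments:
        # All sequences in one alignment share the same length.
        width = len(next(iter(aln.values()), ""))
        for name in species:
            # Assumption: a species absent from this gene is padded with gaps.
            merged[name] += aln.get(name, "-" * width)
    return merged


def write_fasta(records: dict[str, str], path: Path) -> None:
    """Write the concatenated sequences to a single FASTA file."""
    path.write_text("".join(f">{name}\n{seq}\n" for name, seq in records.items()))
```

A typical use would be `write_fasta(concatenate([read_fasta(p) for p in gene_files]), Path("concatenated.fasta"))`, where `gene_files` is a list of trimmed per-gene alignments.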
In phylogenetic analysis, ML and BI methods are commonly applied in phylogenetic tree estimation. We integrated two well-performing and widely used programs, IQ-TREE2 [24] and MrBayes [25], enabling the estimation of ML and BI trees. IQ-TREE2 integrates three critical steps in phylogenetic analysis: rapid model selection using ModelFinder [26], efficient tree search algorithms [27], and fast bootstrap tests [28,29,30]. Compared to similar tools, it excels in both processing speed and result accuracy. MrBayes estimates the posterior distribution of model parameters through Markov chain Monte Carlo (MCMC) methods. In addition, FigTree was chosen as the preferred graphical viewer for phylogenetic trees [31]. CamlTree allows users to open the estimated phylogenetic tree in FigTree with a single button in the pop-up window. This facilitates the subsequent tasks of tree visualization, organization, and polishing, providing a seamless and user-friendly experience for further refining the phylogenetic analysis results.
Results
User interface
CamlTree is desktop software for visualization and phylogenetic analysis based on genetic data. The homepage of the application provides a brief overview of CamlTree along with three buttons (Fig. 1A), each of which opens a different interface. Clicking the "Start" button opens a window with a software selection section in the top-left corner and a dedicated software plugin area in the top-right corner (Fig. 1B). Below these sections, two input fields specify the input file path and the output result path. Clicking the "Customize" button opens the parameter modification interface, where users can choose from two pre-configured parameter sets or customize the runtime parameters of each program according to their specific requirements (Fig. 1C). Users can also save these modifications, ensuring convenient access to their preferred settings. Clicking the "Guidance" button on the homepage opens a window that provides a general guide on how to use the software (Fig. 1D). This window includes links to more detailed operating instructions. When a task is submitted, a window prompts for confirmation of the workflow and parameters (Fig. 1E). Additionally, a task execution status window lets users track the progress of their submitted tasks (Fig. 1F). After a task finishes, four result files are generated in the output directory (Fig. 1G). Each result file corresponds to the outcome of one step in the workflow along with its respective log.
In addition to sequence files and tree files, several additional files are produced, such as the matrix of genetic distances, which is generated by IQ-TREE2 and saved to the output folder “4-IQ-TREE_results” with a name ending in ".mldist". The posterior probabilities generated by the embedded MrBayes method are saved to the folder “4-MrBayes_results”.
Fig. 1 The interface and main functional modules of CamlTree. (A) The homepage with Start, Customize, and Guidance keys. (B) The start window with different workflows. (C) The customize window. The parameters of four software programs can be modified according to the user's requirements. (D) The user guidance window. (E) The check window. Users can review the selected workflow and the parameters applied. (F) The task execution status window. (G) The list of output files
Workflows and functions
CamlTree includes two main workflows for executing the analysis (Fig. 2). One is the concatenation analysis, which merges multiple sequences into a single dataset. The other is the coalescence analysis, which estimates a tree for each dataset or gene separately and then merges these individual trees using the embedded Weighted ASTRAL (wASTRAL) software [23, 32,33,34]. The two streamlined processes provide a more efficient solution for multi-gene phylogenetic analysis than traditional methods that require running multiple steps manually. Furthermore, CamlTree offers a "Separation" option that processes multiple sequence files in parallel, where each file represents a different gene and contains the sequences of various species for that gene, and outputs multiple tree files. This differs from the “Concatenation” and “Coalescence” workflows, which process the sequences as one group and generate only one tree. There is no need to download additional plugins separately, as the built-in functionality can be used directly.
Fig. 2 The implementation/workflow diagram of CamlTree. Concatenated and parallel analyses are optional. Workflow 1 is the concatenation strategy, including steps of sequence alignment, alignment optimization, sequence concatenation, and tree estimation. Workflow 2 is the coalescence strategy, including steps of sequence alignment, alignment optimization, tree estimation, and tree merging
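The difference between the two workflows can be made concrete by expanding each into the list of tasks it implies. The sketch below is only an illustration of the step order given in the figure caption; the data structure and the `plan` function are hypothetical and do not reflect how CamlTree represents workflows internally.

```python
# Step order for each workflow, as listed in the workflow figure.
WORKFLOWS = {
    "concatenation": ["alignment", "optimization", "concatenation", "tree estimation"],
    "coalescence": ["alignment", "optimization", "tree estimation", "tree merging"],
}

# Steps that combine all genes; every step from here on runs once, not per gene.
JOINING_STEPS = {"concatenation", "tree merging"}


def plan(workflow: str, genes: list[str]) -> list[tuple[str, str]]:
    """Expand a workflow into (step, target) tasks for the given gene files."""
    tasks: list[tuple[str, str]] = []
    joined = False
    for step in WORKFLOWS[workflow]:
        if step in JOINING_STEPS:
            joined = True
        if joined:
            tasks.append((step, "all genes"))
        else:
            tasks.extend((step, gene) for gene in genes)
    return tasks


if __name__ == "__main__":
    for name in WORKFLOWS:
        print(name)
        for step, target in plan(name, ["cox1", "cytb"]):
            print(f"  {step}: {target}")
```

With two genes, the concatenation plan contains a single tree-estimation task on the joined alignment, whereas the coalescence plan contains one tree-estimation task per gene followed by one merging task.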
CamlTree supports user-defined runtime parameters, allowing users to customize the analysis based on their data and specific requirements. The settings are automatically saved to the program's state. CamlTree accepts multiple sequence alignment file formats as input, automatically identifies the format of the input sequence files and converts them into FASTA format at the beginning of the workflow. This allows the converted files to pass seamlessly into the rest of the workflow.
Furthermore, CamlTree provides import and export functions for configurations. Users can export a modified configuration to a file in JSON format. The program provides two sets of runtime configurations: one focuses on the accuracy of the analysis results and is loaded automatically every time the program is opened; the other is for quick analysis, mainly so that users can quickly try out the program's functions and run tests. Moreover, users can choose further customization options to meet their specific analysis needs.
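A configuration round trip of this kind can be sketched as follows. The JSON schema below (one parameter string per tool) is hypothetical and is not CamlTree's actual file format; the parameter strings are the ones quoted in the usage example of this article.

```python
import json
from pathlib import Path

# Hypothetical schema: tool name -> command-line parameter string.
EXAMPLE_CONFIG = {
    "mafft": "--auto --inputorder",
    "trimal": "-automated1",
    "iqtree2": "-m MFP -b 1000 -bnni",
}


def export_config(config: dict[str, str], path: Path) -> None:
    """Save a run configuration as human-readable JSON."""
    path.write_text(json.dumps(config, indent=2))


def import_config(path: Path, known_tools: set[str]) -> dict[str, str]:
    """Load a configuration and reject entries for tools that are not integrated."""
    loaded = json.loads(path.read_text())
    unknown = set(loaded) - known_tools
    if unknown:
        raise ValueError(f"unknown tools in configuration: {sorted(unknown)}")
    return loaded
```

Validating on import, rather than at run time, means a hand-edited file with a misspelled tool name is reported before any analysis starts.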
Additionally, CamlTree supports the handling of multiple tasks, allowing users to submit another task (with different strategies) after the previous one has been submitted. This feature enables users to keep track of the running status and manage multiple tasks efficiently. After the execution of the submitted phylogenetic analysis workflow is completed in the background, a pop-up notification will be displayed, and corresponding folders for each step can be checked in the result directory by clicking the specific button on the pop-up window. These folders include output result files for each step in the workflow.
CamlTree includes built-in exception handling. When a task encounters an error during execution, the program immediately halts the background tasks and displays an error message in a pop-up window. The pop-up window provides information about the specific error and the part of the program that caused it. Users can use buttons within the pop-up window to open detailed error logs, which helps with troubleshooting.
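The halt-on-first-error behaviour can be sketched as follows. This is a Python `asyncio` analogue written for illustration, not CamlTree's NodeJS implementation; the function name and the shape of the returned report are assumptions.

```python
import asyncio
from collections.abc import Coroutine


async def run_until_error(jobs: dict[str, Coroutine]) -> dict:
    """Run jobs concurrently; on the first failure, cancel whatever is still running."""
    if not jobs:
        return {"failed": {}, "halted": []}
    tasks = {asyncio.create_task(job): step for step, job in jobs.items()}
    done, pending = await asyncio.wait(
        tasks.keys(), return_when=asyncio.FIRST_EXCEPTION
    )
    for task in pending:
        task.cancel()  # halt the remaining background tasks
    await asyncio.gather(*pending, return_exceptions=True)
    # Report which step failed and why, as an error pop-up might.
    failed = {tasks[t]: repr(t.exception()) for t in done if t.exception() is not None}
    return {"failed": failed, "halted": sorted(tasks[t] for t in pending)}


if __name__ == "__main__":
    async def long_alignment() -> None:
        await asyncio.sleep(5)  # stand-in for a slow external tool

    async def broken_trimming() -> None:
        raise RuntimeError("input alignment not found")

    report = asyncio.run(
        run_until_error({"alignment": long_alignment(), "trimming": broken_trimming()})
    )
    print(report)
```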
Discussion
Advantages and shortcomings of CamlTree
Phylogenetic analysis involves estimating the evolutionary history of species and populations based on genetic data, and has become a powerful tool for understanding the relationships between different organisms. By comparing the genetic sequences of different species, researchers can estimate evolutionary trees that show how the species are related and how they have evolved over time. With the rapid expansion of genome sequencing technologies, there is a growing wealth of genetic data available for phylogenetic analysis [35, 36]. However, accurate phylogenetic analysis is a difficult and complex process that involves multiple steps, and the challenge of performing this process quickly and efficiently remains. We have developed a reliable and efficient desktop application, CamlTree, for phylogenetic analysis based on mitochondrial genome sequences. It is simple to use and has powerful functionality, allowing users to easily execute systematic phylogenetic analysis workflows and call various third-party tools. CamlTree has a clean, visually appealing and responsive graphical user interface, good compatibility and strong flexibility, and it runs on both new and old operating systems.
CamlTree uses a combination of subprocesses and asynchronous functions to perform multiprocessing calculations. Additionally, it employs a "misalignment parallelization" strategy. Batch processing is essential for handling multi-gene datasets. CamlTree directly integrates the third-party tools it needs, and also allows users to run these tools independently. The program supports custom settings for its runtime parameters. During workflow execution, CamlTree displays a progress bar window that lets users monitor the task's progress in real time. Compared to client–server or cloud-based programs, CamlTree is a desktop application that provides better data privacy and security, because user data and analyses are not transmitted over a network, and it minimizes memory consumption. It is especially friendly for users with low computer proficiency, such as those who are not familiar with command-line software.
Some shortcomings currently need to be addressed to ensure the practicality and reliability of CamlTree. For example, when the data volume is large enough to cause the program to run out of memory, sequence loss may occur during program execution. However, this issue does not need to be considered when the data volume is less than 15 GB. CamlTree is a graphical user interface program, and when it processes large amounts of data, its efficiency is closely tied to the performance of the computer. Therefore, it may be necessary to develop a command-line version in the future. For experienced and professional researchers, a command-line version would support running and managing multiple instances on a server, significantly improving analysis efficiency.
Comparison of other available software programs
Although some existing software programs have similar functionalities, CamlTree possesses unique features and performance advantages that make it highly practical (Table 2). With the rapid development of sequencing technologies and exponentially increasing data, batch processing is indispensable for phylogenomic analysis. EPoS [37], Armadillo [38] and MEGA [39] lack this function. Additionally, it is necessary to set the parameters of the tools and customize the phylogenomic analysis depending on the data and requirements. CamlTree allows users to modify parameters before running, and the run configuration is included in the result file. However, Concatenator and MitoPhAST [40] do not support customization. Moreover, CamlTree supports the selection of analysis strategies, mainly concatenation and coalescence analyses, whereas Concatenator, EPoS and Armadillo do not allow the user to select the phylogenomic analysis strategy. ML and BI are two commonly used methods for phylogenetic tree estimation in most phylogenetic analyses. CamlTree can select analytical strategies, align sequences, optimize the alignments, and complete both ML and BI phylogenetic tree estimation. MEGA mainly focuses on single-alignment-based phylogenetic analysis. With the accumulation of large amounts of data, relatively reliable phylogenetic software programs for large multi-gene data sets, such as LMAP_S and PhyloSuite, have been presented. CamlTree is specifically designed for phylogenetic analysis of small genomes, with a particular emphasis on mitochondrial and viral genomes. In general, CamlTree offers improved data privacy and security and minimizes memory usage and time consumption.
To date, PhyloSuite is the software most comparable to CamlTree. We focused on the running performance of these two programs. When processing the test data using the same software (MAFFT, trimAl, IQ-TREE2) and default parameters in the concatenation workflow, CamlTree completed the task in 24 min and 10 s, whereas PhyloSuite took 32 min and 41 s. Moreover, unlike CamlTree, which completes the entire process in one click, PhyloSuite requires the data to be submitted manually to the ASTRAL software [34] after sequence alignment, followed by the task of merging trees. Further comparisons are described on the homepage of CamlTree, https://github.com/BioCrossCoder/camltree.
Usage examples
To facilitate the use of CamlTree and demonstrate the performance of the software, we conducted a comparative analysis by re-analyzing the data published in a previous study [41]. A total of 17 mitochondrial genome datasets were downloaded under the accession numbers OR148894-OR148901, AP002940.1, AP004431.1, KX254549.1, NC_004395.1, NC_063496.1, NC_063501.1, OQ349185.1, OQ349187.1 and OQ349189.1. Thirteen protein-coding genes (PCGs) were obtained from each mitochondrial genome of holocentrid fish and out-group fish. The nucleotide sequences of the same PCG from different species were placed in a single FASTA input file, with sequences named according to their respective source species. This gives a total of thirteen FASTA files, which have been bundled together and can be downloaded from the following link: https://github.com/BioCrossCoder/camltree/releases/download/v2.1.2/test.data.zip.
The workflows for ML and BI tree estimation are shown in Fig. 3A. In the ML analysis, MAFFT was executed with the parameters "--auto --inputorder", trimAl with the parameter "-automated1", and IQ-TREE2 with the parameters "-m MFP -b 1000 -bnni". In the BI analysis, the parameters for MAFFT and trimAl were the same as in the ML analysis. Model selection was executed using ModelFinder embedded in IQ-TREE2. Bayesian inference posterior probabilities were calculated by running four Markov chain Monte Carlo chains simultaneously for 2,000,000 generations, sampling every 1000 generations. The initial 25% of MCMC samples were discarded as burn-in. Additionally, an average standard deviation of split frequencies below 0.01 was used as the convergence criterion. After the successful completion of the task, four result files were generated in the output directory. Each result file corresponds to the outcome of one step in the workflow along with its respective log. The final output was visualized using FigTree, and the 17 species were classified into subfamilies and families (Fig. 3B). The resulting tree matches the original tree from our previous paper, which was estimated using a cross-platform, multi-software combination strategy, and it shows similar posterior probabilities (from the BI method) and bootstrap values (from the ML method) beside the nodes. In addition, we ran PhyloSuite on the same data with the same software (MAFFT, trimAl, IQ-TREE2) and the default configuration; it took about eight minutes longer than CamlTree.
Fig. 3 An example of ML and BI tree estimation by CamlTree using 13 protein-coding genes from 17 mitochondrial genomes. (A) Workflows for tree estimation. (B) Tree file visualization by FigTree. The numbers beside the nodes are posterior probabilities (from the BI method) and bootstraps (from the ML method)
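For readers who want to reproduce the ML settings outside the GUI, the parameters quoted above can be assembled into command lines roughly as follows. This is a sketch under stated assumptions: the executable names (`mafft`, `trimal`, `iqtree2`), the file names, and the MrBayes block are illustrative, and the article does not show how CamlTree itself invokes these tools. Only the flags quoted in the text, plus the standard input and output options of each tool, are used.

```python
def ml_commands(gene_fasta: str, aligned: str, trimmed: str, matrix: str) -> list[list[str]]:
    """Build argument lists for the ML analysis with the parameters quoted in the text."""
    return [
        # MAFFT prints the alignment to stdout; redirect it to `aligned` when running.
        ["mafft", "--auto", "--inputorder", gene_fasta],
        ["trimal", "-in", aligned, "-out", trimmed, "-automated1"],
        # `matrix` is the concatenated alignment built from all trimmed genes.
        ["iqtree2", "-s", matrix, "-m", "MFP", "-b", "1000", "-bnni"],
    ]


def mrbayes_block(ngen: int = 2_000_000, samplefreq: int = 1000,
                  nchains: int = 4, burninfrac: float = 0.25) -> str:
    """Render the MCMC settings from the text as a MrBayes command block (assumed layout)."""
    return (
        "begin mrbayes;\n"
        f"  mcmc ngen={ngen} samplefreq={samplefreq} nchains={nchains};\n"
        f"  sumt relburnin=yes burninfrac={burninfrac};\n"
        "end;\n"
    )


def retained_samples(ngen: int = 2_000_000, samplefreq: int = 1000,
                     burninfrac: float = 0.25) -> int:
    """Samples kept per chain after burn-in, ignoring the starting state."""
    total = ngen // samplefreq
    return total - int(total * burninfrac)
```

Under these settings each chain yields 2,000,000 / 1000 = 2000 samples, of which the first 25% (500) are discarded, leaving 1500 for the posterior summary.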
Conclusion
This study presents a powerful, reliable, and user-friendly platform that integrates a series of phylogenetic analysis tools into a workflow. CamlTree will simplify and accelerate phylogenetic analysis, helping researchers reduce operational burdens and improve research efficiency. Additionally, it provides a better solution for conducting phylogenetic analysis for small genomes or smaller-scale projects (containing dozens or fewer genes).
Availability and requirements
- Project name: CamlTree
- Project home page: https://github.com/BioCrossCoder/camltree
- Operating system(s): Windows
- Programming language: TypeScript
- Other requirements: integrated software from Table 1; Java (if using the integrated version of FigTree); node modules: antd (a component library), react (a UI framework), recoil (a state management library), react-router (a front-end router library), emotion (a CSS-in-JS library)
- License: GNU General Public License, version 3.0 (GPLv3)
- Any restrictions to use by non-academics: none
Availability of data and materials
The datasets used in this paper are available from NCBI under the accession numbers OR148894-OR148901, AP002940.1, AP004431.1, KX254549.1, NC_004395.1, NC_063496.1, NC_063501.1, OQ349185.1, OQ349187.1 and OQ349189.1.
References
Delsuc F, Brinkmann H, Philippe H. Phylogenomics and the reconstruction of the tree of life. Nat Rev Genet. 2005;6(5):361–75.
Liu L, Xi Z, Wu S, Davis CC, Edwards SV. Estimating phylogenetic trees from genome-scale data. Ann N Y Acad Sci. 2015;1360:36–53.
Metzker ML. Sequencing technologies - the next generation. Nat Rev Genet. 2010;11(1):31–46.
Kapli P, Yang Z, Telford MJ. Phylogenetic tree building in the genomic age. Nat Rev Genet. 2020;21(7):428–44.
Dhivya S, Ashutosh S, Gowtham I, Baskar V, Harini AB, Mukunthakumar S, et al. Molecular identification and evolutionary relationships between the subspecies of Musa by DNA barcodes. BMC Genomics. 2020;21(1):659.
Rosell JA, Olson ME, Weeks A, De-Nova JA, Lemos RM, Camacho JP, et al. Diversification in species complexes: tests of species origin and delimitation in the Bursera simaruba clade of tropical trees (Burseraceae). Mol Phylogenet Evol. 2010;57(2):798–811.
Bravo IG, Alonso A. Phylogeny and evolution of papillomaviruses based on the E1 and E2 proteins. Virus Genes. 2007;34(3):249–62.
Robles-Sikisaka R, Rivera R, Nollens HH, St Leger J, Durden WN, Stolen M, et al. Evidence of recombination and positive selection in cetacean papillomaviruses. Virology. 2012;427(2):189–97.
Gottschling M, Bravo IG, Schulz E, Bracho MA, Deaville R, Jepson PD, et al. Modular organizations of novel cetacean papillomaviruses. Mol Phylogenet Evol. 2011;59(1):34–42.
Smith DR. Buying in to bioinformatics: an introduction to commercial sequence analysis software. Brief Bioinform. 2015;16(4):700–9.
Maldonado E, Antunes A. LMAP_S: Lightweight Multigene alignment and phylogeny eStimation. BMC Bioinformatics. 2019;20(1):739.
Zhang D, Gao F, Jakovlic I, Zou H, Zhang J, Li WX, et al. PhyloSuite: an integrated and scalable desktop platform for streamlined molecular sequence data management and evolutionary phylogenetics studies. Mol Ecol Resour. 2020;20(1):348–55.
Kearse M, Moir R, Wilson A, Stones-Havas S, Cheung M, Sturrock S, et al. Geneious Basic: an integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics. 2012;28(12):1647–9.
Vences M, Patmanidis S, Kharchev V, Renner SS. Concatenator, a user-friendly program to concatenate DNA sequences, implementing graphical user interfaces for MAFFT and FastTree. Bioinform Adv. 2022;2(1):vbac050.
Katoh K, Misawa K, Kuma K, Miyata T. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 2002;30(14):3059–66.
Ranwez V, Harispe S, Delsuc F, Douzery EJ. MACSE: multiple alignment of coding sequences accounting for frameshifts and stop codons. PLoS ONE. 2011;6(9): e22594.
Nakamura T, Yamada KD, Tomii K, Katoh K. Parallelization of MAFFT for large-scale multiple sequence alignments. Bioinformatics. 2018;34(14):2490–2.
Zhu X, Li K, Salah A, Shi L, Li K. Parallel Implementation of MAFFT on CUDA-Enabled Graphics Hardware. IEEE/ACM Trans Comput Biol Bioinform. 2015;12(1):205–18.
Mao Y, Hou S, Shi J, Economo EP. TREEasy: an automated workflow to infer gene trees, species trees, and phylogenetic networks from multilocus data. Mol Ecol Resour. 2020;20(3):832.
Ranwez V, Douzery EJP, Cambon C, Chantret N, Delsuc F. MACSE v2: toolkit for the alignment of coding sequences accounting for frameshifts and stop codons. Mol Biol Evol. 2018;35(10):2582–4.
Capella-Gutierrez S, Silla-Martinez JM, Gabaldon T. trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics. 2009;25(15):1972–3.
Glez-Pena D, Gomez-Blanco D, Reboiro-Jato M, Fdez-Riverola F, Posada D. ALTER: program-oriented conversion of DNA and protein alignments. Nucleic Acids Res. 2010;38:W14–8.
Rokas A, Williams BL, King N, Carroll SB. Genome-scale approaches to resolving incongruence in molecular phylogenies. Nature. 2003;425(6960):798–804.
Minh BQ, Schmidt HA, Chernomor O, Schrempf D, Woodhams MD, von Haeseler A, et al. IQ-TREE 2: new models and efficient methods for phylogenetic inference in the genomic era. Mol Biol Evol. 2020;37(5):1530–4.
Ronquist F, Teslenko M, van der Mark P, Ayres DL, Darling A, Hohna S, et al. MrBayes 3.2: efficient Bayesian phylogenetic inference and model choice across a large model space. Syst Biol. 2012;61(3):539–42.
Kalyaanamoorthy S, Minh BQ, Wong TKF, von Haeseler A, Jermiin LS. ModelFinder: fast model selection for accurate phylogenetic estimates. Nat Methods. 2017;14(6):587–9.
Nguyen LT, Schmidt HA, von Haeseler A, Minh BQ. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol Biol Evol. 2015;32(1):268–74.
Minh BQ, Nguyen MA, von Haeseler A. Ultrafast approximation for phylogenetic bootstrap. Mol Biol Evol. 2013;30(5):1188–95.
Hoang DT, Chernomor O, von Haeseler A, Minh BQ, Vinh LS. UFBoot2: improving the ultrafast bootstrap approximation. Mol Biol Evol. 2018;35(2):518–22.
Zhou X, Shen XX, Hittinger CT, Rokas A. Evaluating fast maximum likelihood-based phylogenetic programs using empirical phylogenomic data sets. Mol Biol Evol. 2018;35(2):486–503.
Rambaut A. FigTree v1.4.2: tree figure drawing tool. 2014. Available from: http://tree.bio.ed.ac.uk/software/figtree/.
Zhang C, Rabiee M, Sayyari E, Mirarab S. ASTRAL-III: polynomial time species tree reconstruction from partially resolved gene trees. BMC Bioinformatics. 2018. https://doi.org/10.1186/s12859-018-2129-y.
Zhang C, Mirarab S. Weighting by gene tree uncertainty improves accuracy of quartet-based species trees. Mol Biol Evol. 2022. https://doi.org/10.1093/molbev/msac215.
Mirarab S, Reaz R, Bayzid MS, Zimmermann T, Swenson MS, Warnow T. ASTRAL: genome-scale coalescent-based species tree estimation. Bioinformatics. 2014;30(17):i541–8.
Bravo GA, Antonelli A, Bacon CD, Bartoszek K, Blom MPK, Huynh S, et al. Embracing heterogeneity: coalescing the Tree of Life and the future of phylogenomics. PeerJ. 2019;7:e6399.
Allen JM, Germain-Aubrey CC, Barve N, Neubig KM, Majure LC, Laffan SW, et al. Spatial phylogenetics of Florida vascular plants: the effects of calibration and uncertainty on diversity estimates. iScience. 2019;11:57–70.
Griebel T, Brinkmeyer M, Bocker S. EPoS: a modular software framework for phylogenetic analysis. Bioinformatics. 2008;24(20):2399–400.
Lord E, Leclercq M, Boc A, Diallo AB, Makarenkov V. Armadillo 1.1: an original workflow platform for designing and conducting phylogenetic analysis and simulations. PLoS ONE. 2012;7(1):e29903.
Kumar S, Stecher G, Li M, Knyaz C, Tamura K. MEGA X: molecular evolutionary genetics analysis across computing platforms. Mol Biol Evol. 2018;35(6):1547–9.
Tan MH, Gan HM, Schultz MB, Austin CM. MitoPhAST, a new automated mitogenomic phylogeny tool in the post-genomic era with a case study of 89 decapod mitogenomes including eight new freshwater crayfish mitogenomes. Mol Phylogenet Evol. 2015;85:180–8.
Tang Q, Liu Y, Li CH, Zhao JF, Wang T. Comparative mitogenome analyses uncover mitogenome features and phylogenetic implications of the reef fish family Holocentridae (Holocentriformes). Biology (Basel). 2023;12(10):1273.
Acknowledgements
The computations in this paper were run on the bioinformatics computing platform of the National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University. We appreciate their assistance.
Funding
This work was supported by the Fundamental Research Funds for the Central Universities in China (2662022SCQD002) and the National Key Research and Development Program of the Ministry of Science and Technology (2022YFD2400901).
Author information
Contributions
P.S., Y.Y. and Q.T. conceived and designed the experiments. P.S., Y.Y. and Q.T. conducted the experiments. P.S., Y.Y., M.Y. and Q.T. analysed the results. P.S., Y.Y. and Q.T. wrote and reviewed the manuscript. All authors reviewed the manuscript.
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Sun, P., Yang, Y., Yuan, M. et al. CamlTree: a streamlined software for phylogenetic analysis of viral and mitochondrial genomes. BMC Bioinformatics 26, 53 (2025). https://doi.org/10.1186/s12859-025-06034-2
DOI: https://doi.org/10.1186/s12859-025-06034-2