Fig. 5 | BMC Bioinformatics

Fig. 5

From: Multi-proteins similarity-based sampling to select representative genomes from large databases

Phylogenetic distribution of the MPS-representatives for Bacteria. To make visualization of the Bacteria possible, a reference bacterial phylogeny of 35,159 genomes was inferred (Additional file 30). The MPS-representatives from eight samples were mapped onto this phylogeny. Each circle represents a sample. Squares correspond to the selected MPS-representatives. The eight circles correspond, from the innermost to the outermost, to the eight values Δ ∈ {0.7; 0.6; 0.5; 0.4; 0.3; 0.2; 0.1; 0.05}. They contain 12,787, 8,343, 5,332, 3,442, 2,252, 1,368, 794 and 527 MPS-representative genomes, respectively. Samples generated with Δ > 0.7 cannot be mapped onto this tree because, for these samples, some MPS-representatives are not among the 35,159 genomes used to build the tree (Additional file 30). The scale bar represents the average number of amino acid substitutions per site in the protein sequences used to infer the tree. The 10 most represented phyla are shown: Pseudomonadota (formerly Proteobacteria, 12,816 leaves in orange at the top); Bacillota (formerly Firmicutes, 5,767 leaves in light beige at the bottom); Bacteroidota (formerly Bacteroidetes, 4,705 leaves in light orange on the right); Actinomycetota (formerly Actinobacteria, 4,259 leaves in light green on the left); Chloroflexi (1,001 leaves in light purple on the left); Planctomycetota (formerly Planctomycetes, 691 leaves in red below); Acidobacteria (615 leaves in light blue at the top left); Verrucomicrobia (560 leaves in dark blue below); Spirochaetes (415 leaves in purple at the bottom right); Cyanobacteria (413 leaves in green at the bottom).
