Fig. 4 | BMC Bioinformatics

Fig. 4

From: Multi-proteins similarity-based sampling to select representative genomes from large databases

Sampling of the bacterial dataset (178,203 genomes) with MPS-Sampling as a function of Δ. A: Size of the samples. B: Phylogenetic diversity of the samples, computed as the total length of all branches of the tree inferred from the sample genomes, divided by the number of leaves. C: Detail of the sampling for the 135,315 genomes with a complete taxonomic affiliation and the 42,888 genomes with an incomplete taxonomic affiliation. D: Taxonomic diversity of the samples, shown as the proportion of phyla, classes, orders, families, genera, and species represented in each sample. In each graph, solid lines span continuous intervals over which Δ is calculable, while dotted lines span intervals with no calculable value of Δ.
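
The phylogenetic diversity measure in panel B (total branch length divided by number of leaves) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `branch_length_per_leaf`, the edge-list representation of the tree, and the toy tree are all assumptions made for illustration only.

```python
def branch_length_per_leaf(edges):
    """Return total branch length divided by the number of leaves.

    `edges` is a hypothetical representation of a rooted tree:
    an iterable of (parent, child, branch_length) tuples.
    """
    edges = list(edges)
    if not edges:
        raise ValueError("tree has no branches")

    # Sum the lengths of all branches in the tree.
    total_length = sum(length for _, _, length in edges)

    # A leaf is a node that appears as a child but never as a parent.
    parents = {parent for parent, _, _ in edges}
    children = {child for _, child, _ in edges}
    leaves = children - parents

    return total_length / len(leaves)


# Toy tree (made-up values): ((A:1.0, B:1.0):0.5, C:1.5)
toy_tree = [
    ("root", "n1", 0.5),
    ("root", "C", 1.5),
    ("n1", "A", 1.0),
    ("n1", "B", 1.0),
]

diversity = branch_length_per_leaf(toy_tree)
print(diversity)
```

In the toy tree the total branch length is 4.0 and there are three leaves, so the measure is 4.0 / 3. Normalizing by the number of leaves makes samples of different sizes comparable, which matters here because the sample size itself varies with Δ (panel A).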
