Fig. 5
From: Human limits in machine learning: prediction of potato yield and disease using soil microbiome data

Boxplots show the weighted F1 scores of the most accurate predictions achieved by the random forest (RF) and Bayesian neural network (NN) models for each yield or disease outcome (columns). The RF model is most accurate when using alpha diversity and soil chemistry data (Alpha+Soil), whereas the Bayesian NN models perform best when combining soil chemistry data with the OTUs identified as important by both the machine learning and network comparison strategies (OTU-S3+Soil). Each box summarizes the weighted F1 scores obtained across datasets at different taxonomic levels and with different normalization methods; the dashed line indicates the score obtained by fitting the model to random datasets (see section B in Supplementary file). Detailed results are provided in Fig. 4 (RF) and Fig. S3 (Bayesian NN).
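
For readers unfamiliar with the metric, the weighted F1 score is the per-class F1 averaged with weights proportional to each class's number of true samples. The following is a minimal Python sketch of that definition; the function name `weighted_f1` and the toy labels are assumptions made for illustration and are not taken from the article's code or data.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 scores (illustrative helper)."""
    support = Counter(y_true)  # number of true samples per class
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n / total) * f1  # weight each class by its share of samples
    return score


# Hypothetical outcome labels, not data from the study
y_true = ["low", "low", "high", "high", "high"]
y_pred = ["low", "high", "high", "high", "high"]
print(round(weighted_f1(y_true, y_pred), 3))
```

Because the weights follow class frequency, the score is dominated by the more common outcome class, which is one reason a random-data baseline such as the dashed line is a useful reference.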