Fig. 6

BiomiX MOFA Pipeline. This illustration depicts the BiomiX MOFA pipeline: normalized matrices from the various omics enter on the left and are decomposed into factors. Discriminating MOFA factors are identified, and the features contributing most to each of them are ranked by their weights. BiomiX offers several tools to help interpret the nature of these factors. The first tool (A) integrates available clinical and biological data for factor identification. For numeric clinical data, Pearson correlation is used to test for significant correlation with each factor in the model. For binary clinical labels, the Wilcoxon test determines whether the difference in factor values between the two groups is significant. The second tool (B) uses the top contributing features as input for biological pathway analysis, employing MetPath or EnrichR depending on the type of omics. The third tool (C) retrieves relevant PubMed abstracts that closely match the top contributing features of the factor. The BiomiX PubMed search operates on three levels. First, it retrieves abstracts containing at least one feature from each of the omics; then abstracts matching each omics pair; and finally abstracts matching each single omics. A final table lists the total and unique feature matches within each abstract, along with the DOI, PubMed ID, and keywords. Because keywords can be missing, novel keywords are extracted by the text-mining approach in the Litsearch package. Keywords are filtered against the GSEA Biological Process “BP” and Human Phenotype Ontology “HPO” vocabularies.
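The three-level matching in tool (C) can be illustrated with a minimal Python sketch. This is not BiomiX's implementation. The function names, the whole-word case-insensitive matching rule, the reading of "total" as the number of occurrences and "unique" as the number of distinct features matched, and the toy abstract and feature lists are all assumptions made for illustration.

```python
import re
from itertools import combinations


def match_counts(abstract, features):
    """Return (total, unique) matches of `features` in `abstract`.

    Assumption: a match is a case-insensitive whole-word occurrence.
    total  = number of occurrences summed over all features
    unique = number of distinct features found at least once
    """
    total = 0
    unique = 0
    for feat in features:
        pattern = r"\b" + re.escape(feat) + r"\b"
        hits = len(re.findall(pattern, abstract, flags=re.IGNORECASE))
        total += hits
        unique += 1 if hits else 0
    return total, unique


def search_levels(abstract, top_features):
    """Classify one abstract against the three search levels.

    `top_features` (hypothetical structure) maps an omics name to the
    list of its top contributing features for one factor.
    """
    per_omics = {o: match_counts(abstract, f) for o, f in top_features.items()}
    matched = {o for o, (_, u) in per_omics.items() if u > 0}
    omics = sorted(top_features)
    return {
        # Level 1: at least one feature from every omics
        "all_omics": len(omics) > 0 and matched == set(omics),
        # Level 2: omics pairs where both members are matched
        "pairs": [p for p in combinations(omics, 2) if set(p) <= matched],
        # Level 3: single omics with at least one match
        "single": sorted(matched),
        "total": sum(t for t, _ in per_omics.values()),
        "unique": sum(u for _, u in per_omics.values()),
    }


# Toy input, invented for this example only
abstract = "STAT1 and IFI44 are induced by interferon; STAT1 also regulates kynurenine."
top_features = {
    "transcriptomics": ["STAT1", "IFI44"],
    "metabolomics": ["kynurenine", "tryptophan"],
    "methylomics": ["cg0001"],
}
result = search_levels(abstract, top_features)
```

In this toy case the abstract matches the transcriptomics and metabolomics lists but not the methylomics one, so it fails the all-omics level and is reported at the pair and single-omics levels.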