Fig. 3 | BMC Bioinformatics

Fig. 3

From: A comparison of RNA-Seq data preprocessing pipelines for transcriptomic predictions across independent studies

Deterioration of classifier performance after Limma's batch effect correction against the ICGC/GEO test set. Weighted F1-scores obtained from an SVM classifier given the original dataset (left-most bar) versus the modified datasets produced by combinations of normalization (Unnormalized, QN [Quantile Normalization], QN-Target [Quantile Normalization with Target], FSQN [Feature-Specific Quantile Normalization]), batch effect correction (No batch correction, Batch correction) and data scaling (Unscaled, Scaled). The training and independent test datasets were TCGA and ICGC/GEO, respectively. The batch effect correction algorithm was Limma, and all three types of batch effect (Protocol batch effect, Disease batch effect, and Consortium batch effect) were adjusted. Bars indicate the median of each group, which consisted of five models from the outer folds of cross-validation. Error bars represent the 95% confidence interval.
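
As a rough illustration of how one bar in such a plot could be derived, the sketch below computes a weighted F1-score for each of five models and takes the median. It is not the authors' code: the helper names `weighted_f1` and `bar_height`, the class labels, and the toy predictions are all hypothetical, and the 95% confidence interval is left out because the caption does not state how it was computed.

```python
from collections import Counter
from statistics import median


def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights equal to each class's share of y_true."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, n_true in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        n_pred = sum(1 for p in y_pred if p == label)
        precision = tp / n_pred if n_pred else 0.0
        recall = tp / n_true
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        score += f1 * n_true / total
    return score


def bar_height(y_true, fold_predictions):
    """Median weighted F1 over the models (one prediction list per outer fold)."""
    return median(weighted_f1(y_true, y_pred) for y_pred in fold_predictions)


# Hypothetical test-set labels and predictions from five outer-fold models.
y_true = ["A", "A", "B", "B"]
fold_predictions = [
    ["A", "A", "B", "B"],
    ["A", "A", "A", "B"],
    ["A", "A", "A", "B"],
    ["A", "B", "A", "B"],
    ["B", "B", "A", "A"],
]
print(bar_height(y_true, fold_predictions))
```

Weighting each class's F1 by its support keeps frequent classes from being underrepresented in the summary, and the median makes the bar less sensitive to a single poorly performing fold.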
