
Table 2. Overall performance metrics of the SVM classifier for each data preprocessing combination, evaluated against the GTEx test set (related to Fig. 2)

From: A comparison of RNA-Seq data preprocessing pipelines for transcriptomic predictions across independent studies

| Index | Normalization | Batch effect correction | Scaling | Micro-average of AUROC | Weighted F1-score | p Value |
|---|---|---|---|---|---|---|
| 1 | Unnormalized | No batch correction | Unscaled | 0.94 (0.93–0.95) | 0.71 (0.66–0.72) | Baseline |
| 2 | Quantile normalization | No batch correction | Unscaled | 0.93 (0.92–0.94) | 0.71 (0.68–0.73) | 0.2963 |
| 3 | Quantile normalization with target | No batch correction | Unscaled | 0.93 (0.92–0.94) | 0.70 (0.68–0.72) | 0.3133 |
| 4 | Feature specific quantile normalization | No batch correction | Unscaled | 0.92 (0.91–0.93) | 0.66 (0.63–0.67) | 0.9636 |
| 5 | Unnormalized | Batch correction | Unscaled | 0.98 (0.96–0.98) | 0.76 (0.74–0.77) | 0.0049** |
| 6 | Quantile normalization | Batch correction | Unscaled | 0.97 (0.96–0.97) | 0.75 (0.73–0.76) | 0.0089** |
| 7 | Quantile normalization with target | Batch correction | Unscaled | 0.97 (0.96–0.97) | 0.75 (0.74–0.75) | 0.0073** |
| 8 | Feature specific quantile normalization | Batch correction | Unscaled | 0.96 (0.94–0.97) | 0.73 (0.72–0.73) | 0.0339* |
| 9 | Unnormalized | No batch correction | Scaled | 0.92 (0.90–0.93) | 0.70 (0.67–0.70) | 0.6009 |
| 10 | Quantile normalization | No batch correction | Scaled | 0.90 (0.89–0.91) | 0.68 (0.67–0.69) | 0.7241 |
| 11 | Quantile normalization with target | No batch correction | Scaled | 0.89 (0.87–0.90) | 0.68 (0.67–0.69) | 0.7298 |
| 12 | Feature specific quantile normalization | No batch correction | Scaled | 0.91 (0.89–0.91) | 0.69 (0.64–0.71) | 0.3715 |
| 13 | Unnormalized | Batch correction | Scaled | 0.97 (0.96–0.98) | 0.76 (0.75–0.77) | 0.0026** |
| 14 | Quantile normalization | Batch correction | Scaled | 0.96 (0.96–0.97) | 0.77 (0.76–0.77) | 0.0009*** |
| 15 | Quantile normalization with target | Batch correction | Scaled | 0.96 (0.96–0.97) | 0.76 (0.75–0.77) | 0.0016** |
| 16 | Feature specific quantile normalization | Batch correction | Scaled | 0.96 (0.95–0.97) | 0.73 (0.72–0.74) | 0.0305* |

  1. Values are the median of each metric across the five models evaluated on the outer folds of cross-validation; values in parentheses denote the 95% confidence interval. Statistical significance was determined with Student's t-test. *p < 0.05; **p < 0.01; ***p < 0.001
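
The summary statistics described in the footnote can be illustrated with a minimal sketch. It assumes an unpaired, pooled-variance (Student's) two-sample t-test comparing a candidate pipeline against the baseline. The table does not state whether the test was paired or one-sided, nor how the confidence intervals were derived, so those choices are assumptions here. The per-fold scores and the helper name `student_t` are hypothetical and are not taken from the paper.

```python
from math import sqrt
from statistics import mean, median, variance


def student_t(sample_a, sample_b):
    """Return the pooled-variance two-sample t statistic and its degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # Pooled estimate of the common variance (the equal-variance assumption of Student's test).
    pooled_var = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / df
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / std_err, df


# Hypothetical weighted F1-scores from five outer folds (illustrative values only,
# NOT the per-fold results behind the table).
baseline_f1 = [0.66, 0.70, 0.71, 0.71, 0.72]
candidate_f1 = [0.74, 0.75, 0.76, 0.76, 0.77]

baseline_median = median(baseline_f1)
candidate_median = median(candidate_f1)
t_stat, df = student_t(candidate_f1, baseline_f1)

print(f"median F1: baseline={baseline_median:.2f}, candidate={candidate_median:.2f}")
print(f"t statistic={t_stat:.2f} with {df} degrees of freedom")
```

Converting the t statistic to a p value requires the t distribution's CDF, which the Python standard library does not provide; a library routine such as `scipy.stats.ttest_ind` could be used for that step if SciPy is available.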