Dataset | BBBP | | BACE | | HIV | |
---|---|---|---|---|---|---|
Number of compounds | 2039 | | 1513 | | 41127 | |
Negative:Positive | \(\approx\)1:3 | | \(\approx\)1:1 | | \(\approx\)28:1 | |
Models | F1-Score | AUROC | F1-Score | AUROC | F1-Score | AUROC |
Morgan FP | 0.921 ± 0.003 | 0.896 ± 0.014 | 0.778 ± 0.027 | 0.880 ± 0.020 | 0.373 ± 0.028 | 0.797 ± 0.019 |
BERT | 0.935 ± 0.005 | 0.947 ± 0.007 | 0.744 ± 0.023 | 0.845 ± 0.016 | 0.182 ± 0.032 | 0.780 ± 0.011 |
ChemBERTa | 0.926 ± 0.011 | 0.944 ± 0.012 | 0.767 ± 0.020 | 0.862 ± 0.011 | 0.294 ± 0.033 | 0.767 ± 0.019 |
MolFormer-XL | 0.927 ± 0.006 | 0.934 ± 0.007 | 0.762 ± 0.012 | 0.860 ± 0.010 | 0.317 ± 0.032 | 0.804 ± 0.010 |
GPT | 0.908 ± 0.007 | 0.921 ± 0.015 | 0.648 ± 0.025 | 0.743 ± 0.030 | 0.039 ± 0.010 | 0.746 ± 0.009 |
LLaMA | 0.933 ± 0.006 | 0.953 ± 0.009 | 0.766 ± 0.024 | 0.859 ± 0.017 | 0.391 ± 0.013 | 0.802 ± 0.010 |
LLaMA2 | 0.930 ± 0.006 | 0.945 ± 0.004 | 0.772 ± 0.023 | 0.863 ± 0.018 | 0.378 ± 0.017 | 0.799 ± 0.008 |