Table 2 Comparison of embedding models used in this study

From: Can large language models understand molecules?

| Model | Dim. Size | # Layers | # Parameters | Speed\* (s) |
|---|---|---|---|---|
| Morgan FP (Radius=2) | 1024 | Not applicable | Not applicable | 0.0015 |
| BERT | 768 | 12 | 110 M | 2.9777 |
| ChemBERTa | 384 | 3 | 3 M | 4.8544 |
| MolFormer | 768 | 12 | 44 M | 20.9644 |
| GPT | 1536 | 96 | 175 B | 0.2597 |
| LLaMA | 4095 | 32 | 7 B | 50.8919 |
| LLaMA2 | 4095 | 32 | 7 B | 51.6308 |

\* Speed of generating embeddings. Speed is dependent on the machine.
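
As a point of reference for the speed column, the sketch below shows one way to generate and time the Morgan fingerprint embedding from the first row (radius 2, 1024 bits) using RDKit. This is a minimal illustration, not the study's benchmarking code: the exact timing setup and hardware are not specified here, and the SMILES strings are illustrative examples, so the printed numbers will differ from the table.

```python
# Minimal sketch: time Morgan fingerprint generation (radius 2, 1024 bits),
# matching the configuration in the table's first row. Assumes RDKit is
# installed; the SMILES inputs are illustrative, not from the study.
import time

from rdkit import Chem
from rdkit.Chem import AllChem

smiles_list = ["CCO", "c1ccccc1", "CC(=O)Oc1ccccc1C(=O)O"]  # ethanol, benzene, aspirin

start = time.perf_counter()
fingerprints = []
for smi in smiles_list:
    mol = Chem.MolFromSmiles(smi)  # parse SMILES into an RDKit Mol
    fp = AllChem.GetMorganFingerprintAsBitVect(mol, radius=2, nBits=1024)
    fingerprints.append(fp)
elapsed = time.perf_counter() - start

# Per-molecule embedding time; hardware-dependent, per the footnote above.
print(f"{elapsed / len(smiles_list):.4f} s per molecule")
```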