Table 2 Experimental results based on the MLEE, GE09 and GE11 datasets

From: Effective type label-based synergistic representation learning for biomedical event trigger detection

| Methods | MLEE P (%) | MLEE R (%) | MLEE F1 (%) | GE09 P (%) | GE09 R (%) | GE09 F1 (%) | GE11 P (%) | GE11 R (%) | GE11 F1 (%) |
|---|---|---|---|---|---|---|---|---|---|
| Large language models (LLMs) | | | | | | | | | |
| ChatGPT-3.5 (0-shot) | 33.02 | 30.17 | 31.53 | 17.53 | 26.51 | 21.10 | 14.69 | 28.00 | 19.27 |
| ChatGPT-4 (0-shot) | 35.40 | 34.48 | 34.93 | 17.92 | 27.01 | 21.55 | 15.28 | 29.33 | 20.09 |
| ChatGPT-3.5 (5-shot ICL) | 43.75 | 40.24 | 41.92 | 20.54 | 29.50 | 24.22 | 23.53 | 32.00 | 27.12 |
| ChatGPT-4 (5-shot ICL) | 44.63 | 42.10 | 43.33 | 21.46 | 31.07 | 25.39 | 24.51 | 33.33 | 28.25 |
| Feature-based supervised learning models | | | | | | | | | |
| HASH [31] | – | – | – | **79.83** | 56.02 | 65.84 | – | – | – |
| SVM-CRF [9] | – | – | – | 69.96 | 64.28 | 67.00 | – | – | – |
| Bio-SVM† [10] | 75.56 | 81.29 | 78.32 | – | – | – | – | – | – |
| TSVM† [7] | 80.35 | 79.16 | 79.75 | 75.94 | 68.31 | 71.01 | 68.09 | **76.41** | 72.01 |
| Representation-based supervised learning models | | | | | | | | | |
| BiLSTM-FastText [35] | 77.89 | 78.28 | 78.08 | 68.21 | 58.55 | 63.01 | 68.44 | 65.26 | 66.81 |
| DeepEventMine [11] | 79.37 | 78.86 | 79.12 | – | – | – | 72.05 | 68.89 | 70.43 |
| TEES-CNN [25] | 81.49 | 78.43 | 79.93 | – | – | – | 73.32 | 68.72 | 70.95 |
| RecurCRFs [16] | 81.12 | 79.15 | 80.28 | 76.42 | 70.45 | 73.24 | – | – | – |
| SemPRE [20] | 79.73 | 81.44 | 80.58 | 71.70 | 71.99 | 71.42 | 73.36 | 70.83 | 71.93 |
| ResLSTM [23] | 79.89 | 81.61 | 80.74 | – | – | – | – | – | – |
| Tree-LSTM [8] | **82.24** | 80.20 | 81.21 | – | – | – | – | – | – |
| BioLSL (Ours) | 80.71 | **83.79** | **82.25** | 74.51 | **76.34** | **75.41** | **78.37** | 71.67 | **74.79** |

  1. The best results are highlighted in bold