Table 1 Statistics of datasets

From: PCVR: a pre-trained contextualized visual representation for DNA sequence classification

| Dataset | Superkingdom | Phylum | Genus | Training data | Test data |
|---|---|---|---|---|---|
| Closely related dataset | 4 | 30 | 146 | 2,268,584 | 60,000 |
| Distantly related dataset | 4 | 30 | 146 | 2,245,416 | 53,400 |
| Final dataset | 4 | 44 | 156 | 5,311,920 | 88,000 |