From: PCVR: a pre-trained contextualized visual representation for DNA sequence classification
Dataset | Superkingdom | Phylum | Genus | Training data | Test data |
---|---|---|---|---|---|
Closely related dataset | 4 | 30 | 146 | 2,268,584 | 60,000 |
Distantly related dataset | 4 | 30 | 146 | 2,245,416 | 53,400 |
Final dataset | 4 | 44 | 156 | 5,311,920 | 88,000 |