Table 3 GO enrichment analysis results

From: CompàreGenome: a command-line tool for genomic diversity estimation in prokaryotes and eukaryotes

| Similarity Class | Pairwise Similarity Score | Enriched GO category (Top 10 most significant) | Observed gene count | Expected gene count | Log2 fold change | Fisher's P value |
|---|---|---|---|---|---|---|
| Most Conserved Sequences (n = 0) | 95% to 100% | | | | | |
| Highly Conserved Sequences (n = 21) | 85% to < 95% | Protein-arginine deiminase activity | 7 | 1.04 | 2.753 | 4.64E-04 |
| | | Mycotoxin biosynthetic process | 10 | 2.45 | 2.027 | 6.71E-04 |
| | | Toxin activity | 8 | 1.98 | 2.013 | 2.43E-03 |
| | | Serine-type peptidase activity | 14 | 5.38 | 1.380 | 2.52E-03 |
| | | NADP binding | 9 | 3.11 | 1.531 | 8.26E-03 |
| | | Cholestenol delta-isomerase activity | 3 | 0.28 | 3.405 | 1.04E-02 |
| | | Sterol metabolic process | 3 | 0.28 | 3.405 | 1.04E-02 |
| | | N,N-dimethylaniline monooxygenase activity | 6 | 1.60 | 1.903 | 1.13E-02 |
| | | Proteolysis | 28 | 16.61 | 0.753 | 1.15E-02 |
| | | Nitrogen compound metabolic process | 7 | 2.17 | 1.689 | 1.20E-02 |
| Moderately Conserved Sequences (n = 20) | 70% to < 85% | Protein phosphorylation | 17 | 4.97 | 1.773 | 1.83E-05 |
| | | Protein kinase activity | 16 | 5.03 | 1.668 | 7.34E-05 |
| | | Heme binding | 12 | 3.32 | 1.856 | 2.03E-04 |
| | | Monooxygenase activity | 10 | 2.47 | 2.016 | 3.10E-04 |
| | | Oxidoreductase activity, acting on paired donors, with incorporation or reduction of molecular oxygen | 10 | 2.59 | 1.948 | 4.39E-04 |
| | | Extracellular space | 4 | 0.42 | 3.245 | 1.55E-03 |
| | | Double-stranded RNA binding | 3 | 0.18 | 4.052 | 1.80E-03 |
| | | Metallopeptidase activity | 6 | 1.24 | 2.279 | 2.24E-03 |
| | | Iron ion binding | 10 | 3.47 | 1.528 | 3.35E-03 |
| | | Structural constituent of cytoskeleton | 3 | 0.24 | 3.637 | 3.39E-03 |
| Most Variable Sequences (n = 34) | < 70% | Protein dimerization activity | 20 | 1.76 | 3.510 | 1.07E-13 |
| | | Serine-type endopeptidase activity | 17 | 3.13 | 2.444 | 9.14E-08 |
| | | Proteolysis | 22 | 7.53 | 1.546 | 1.57E-05 |
| | | Nucleoside metabolic process | 9 | 1.28 | 2.809 | 2.00E-05 |

  1. All gene sequences were first grouped into 4 similarity classes according to their sequence similarity within the query genomes. Enrichment was calculated by comparing the expected and observed gene counts for each GO term (P < 0.05, Fisher’s test). The top 10 most enriched categories are shown (the full list is available in the supplementary file).
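
To make the relationship between the columns concrete, the following is a minimal Python sketch of how a log2 fold change and a one-sided Fisher-style P value can be derived from gene counts. It is an illustration under stated assumptions, not the tool's actual implementation: the function names (`log2_fold_change`, `expected_count`, `fisher_right_tail`) and the parameters `class_size`, `term_total` and `background_size` are hypothetical, and the table does not report the class or background sizes needed to reproduce its P values. Only the fold-change column can be checked against the table, and small differences are expected because the expected counts shown are rounded to two decimals.

```python
from math import comb, log2


def log2_fold_change(observed, expected):
    """Log2 of the ratio between observed and expected gene counts."""
    return log2(observed / expected)


def expected_count(class_size, term_total, background_size):
    """Expected genes with a GO term in a class, assuming random draws.

    Assumption: expectation is class size times the background frequency
    of the term (the source does not state the exact formula used).
    """
    return class_size * term_total / background_size


def fisher_right_tail(observed, class_size, term_total, background_size):
    """One-sided Fisher exact P value, P(X >= observed).

    X follows a hypergeometric distribution: class_size genes drawn from
    background_size genes, of which term_total carry the GO term.
    """
    upper = min(class_size, term_total)
    total = comb(background_size, class_size)
    tail = sum(
        comb(term_total, k) * comb(background_size - term_total, class_size - k)
        for k in range(observed, upper + 1)
    )
    return tail / total


# Observed and expected counts copied from three rows of the table.
rows = [
    ("Protein-arginine deiminase activity", 7, 1.04),
    ("Protein phosphorylation", 17, 4.97),
    ("Protein dimerization activity", 20, 1.76),
]
for term, observed, expected in rows:
    print(f"{term}: log2 fold change ~ {log2_fold_change(observed, expected):.2f}")
```

A positive log2 fold change means the GO term occurs more often in the similarity class than expected by chance. The P value then indicates how unlikely such an excess would be under random sampling.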