Constrained Consensus TOPology prediction server

Description of the CCTOP server

Filtering transmembrane proteins

Eight prediction methods were tested for filtering transmembrane proteins, i.e. for determining whether a sequence encodes a transmembrane protein or a non-transmembrane one. These methods are Memsat, Octopus, Philius, Phobius, Pro-TMHMM, Scampi-single, Scampi-msa and TMHMM. They were run on preprocessed sequences, i.e. after transit and/or signal peptides had been removed from the query sequences. As none of these methods was as accurate as desired, a simple consensus approach was used to increase the prediction accuracy.

The results of the methods with Matthews correlation coefficients (MCC) above 0.93 are combined in a Venn diagram, and a table shows the results of all triplet combinations of the four selected methods.

Venn diagram of the filtering accuracies of the selected four algorithms



Filtering accuracies of simple majority decision algorithm

Methods                              TP    TN     FP  FN  Sensitivity  Specificity  MCC
Philius  Phobius  Scampi  TMHMM
                                     465   1,403  19  9   0.98         0.99         0.96
                                     459   1,403  19  15  0.97         0.99         0.95
                                     465   1,403  19  9   0.98         0.99         0.96
                                     467   1,401  21  7   0.99         0.99         0.96


As expected, and as can be seen from these data, combining the various algorithms decreases the false negative (FN) and false positive (FP) ratios. However, the true positive (TP) ratio decreases as well, so there should be an optimal number of combined methods. With a simple majority decision over three of the four selected methods, the highest accuracy was reached when the three methods were TMHMM, Scampi and Phobius.
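A simple majority decision over three methods can be sketched as follows. This is a minimal illustration, not the server's actual code: the function name `majority_vote`, the dictionary of per-method calls and the example values are all hypothetical, and each method is assumed to have already produced a yes/no answer for the sequence.

```python
from typing import Mapping, Sequence


def majority_vote(calls: Mapping[str, bool], methods: Sequence[str]) -> bool:
    """Return True if more than half of the chosen methods call the
    sequence transmembrane.

    `calls` maps a method name to its yes/no prediction (hypothetical input).
    """
    votes = sum(1 for name in methods if calls[name])
    return 2 * votes > len(methods)


# Hypothetical per-method calls for a single query sequence.
calls = {"TMHMM": True, "Scampi": False, "Phobius": True, "Philius": False}

# Two of the three selected methods say "transmembrane", so the consensus is True.
print(majority_vote(calls, ["TMHMM", "Scampi", "Phobius"]))  # prints "True"
```

With an odd number of methods a tie cannot occur, which is one practical reason to vote over triplets rather than over all four selected methods.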