In silico prediction of the peroxisomal proteome in fungi, plants and animals

Emanuelsson O, Elofsson A, von Heijne G, Cristobal S (2003): In silico prediction of the peroxisomal proteome in fungi, plants and animals. J.Mol.Biol. 330:443-456 (PubMed)
This web page contains the extended version of the Supplemental Material. The condensed, official Supplemental Material accompanying our JMB publication is also available.

PeroxiP: predictor of PTS1 peroxisomal proteins, a web-server implementation of the predictors described in our paper.

The proteins collected from SWISS-PROT with a PTS1 motif

The sequences from SWISS-PROT:
152 peroxisomal proteins with PTS1 motif (formats: fasta, swissprot)
308 non-peroxisomal proteins with PTS1-like motif (formats: fasta, swissprot)

Statistics on the 35 motifs found in the set of 152 peroxisomal proteins
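
As a rough illustration of how such motif statistics can be tallied from a FASTA file, the sketch below counts C-terminal tripeptides. It assumes the motif is taken as the last three residues of each sequence; the function names and the toy records are hypothetical and are not taken from the data set or from PeroxiP itself.

```python
from collections import Counter


def read_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def c_terminal_tripeptides(text):
    """Count the last three residues of every sequence of length >= 3."""
    return Counter(
        seq[-3:].upper() for _, seq in read_fasta(text) if len(seq) >= 3
    )


# Made-up toy records, for illustration only (not real proteins).
EXAMPLE = """>toy1
MKTAYIAKQR
SKL
>toy2
MSTNPKPQRAKL
>toy3
MAAGGSKL
"""

counts = c_terminal_tripeptides(EXAMPLE)
for motif, n in counts.most_common():
    print(motif, n)
```

To apply the same tally to one of the downloaded fasta files, read the file contents into a string and pass it to `c_terminal_tripeptides`.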


Assignment of Pfam domains: peroxisomal or not peroxisomal

Assignment of peroxisomal/non-peroxisomal status to Pfam domains (roughly an extension of columns 2 and 3 of Table 5 in the paper).


The sets of predicted proteins without clustering (Method 1 and Method 4)

Results from PeroxiP, Methods 1 and 4, including screening by TargetP (S1-3) and TMHMM (roughly corresponding to Table 4 in the paper).

fasta files:

Method 1:
S cerevisiae (27)
S pombe (10)
A thaliana (61)
O sativa (102)
C elegans (61)
D melanogaster (58)
M musculus (59)
H sapiens (44)

Method 2:
S cerevisiae (64)
S pombe (36)
A thaliana (198)
O sativa (249)
C elegans (164)
D melanogaster (117)
M musculus (198)
H sapiens (240)

Method 3:
S cerevisiae (77)
S pombe (53)
A thaliana (337)
O sativa (574)
C elegans (251)
D melanogaster (156)
M musculus (217)
H sapiens (243)

Method 4:
S cerevisiae (277)
S pombe (210)
A thaliana (1146)
O sativa (1592)
C elegans (755)
D melanogaster (482)
M musculus (947)
H sapiens (1427)



The sets of predicted proteins with clustering (Method 1 and Combined method)

Results from PeroxiP including the clustering procedure (both Pfam and BLAST clusters) and cluster cutoff, for Method 1 and the Combined method, including screening by TargetP (S1-3) and TMHMM (roughly corresponding to Table 6 in the paper).

fasta files:

Method 1:
S cerevisiae (10)
S pombe (2)
A thaliana (17)
O sativa (26)
C elegans (27)
D melanogaster (24)
M musculus (27)
H sapiens (16)
ALL (142pfam+7blast=149 proteins)

Method 2:
S cerevisiae
S pombe
A thaliana
O sativa
C elegans
D melanogaster
M musculus
H sapiens
ALL

Method 3:
S cerevisiae
S pombe
A thaliana
O sativa
C elegans
D melanogaster
M musculus
H sapiens
ALL

Method 4:
S cerevisiae
S pombe
A thaliana
O sativa
C elegans
D melanogaster
M musculus
H sapiens
ALL

Combined method:
S cerevisiae (68)
S pombe (45)
A thaliana (299)
O sativa (295)
C elegans (207)
D melanogaster (135)
M musculus (334)
H sapiens (181)
ALL (1359pfam+202blast=1561 proteins)




The sets of predicted proteins with clustering and then expanding the sets (Method 1 and its homologs from Method 4)

Results from PeroxiP including the clustering procedure (both Pfam and BLAST clusters) and cluster cutoff for Method 1, followed by a search for homologs in the set from Method 4 using the Pfam domain criterion, including screening by TargetP (S1-3) and TMHMM (roughly corresponding to part of Table 7 in the paper).
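
The expansion step can be pictured with a minimal sketch. It assumes that the Pfam domain criterion means "shares at least one Pfam domain with a protein already in the Method 1 set"; the exact criterion is defined in the paper, not here. The function name, protein IDs, and domain labels below are hypothetical.

```python
def expand_with_homologs(seed, candidates):
    """Return IDs of candidates sharing at least one domain with any seed protein.

    seed, candidates: dicts mapping protein ID -> set of domain labels.
    Proteins already present in the seed set are not reported again.
    """
    seed_domains = set().union(*seed.values())
    return {
        pid
        for pid, domains in candidates.items()
        if pid not in seed and domains & seed_domains
    }


# Hypothetical inputs: placeholder IDs and domain labels, for illustration only.
method1 = {"protA": {"dom1"}, "protB": {"dom2", "dom3"}}
method4 = {
    "protA": {"dom1"},   # already in the Method 1 set
    "protC": {"dom3"},   # shares dom3 with protB, so it is added
    "protD": {"dom9"},   # no shared domain
    "protE": set(),      # no domain assignment at all
}

homologs = expand_with_homologs(method1, method4)
expanded = set(method1) | homologs
print(sorted(expanded))
```

Under this assumption, proteins without any domain assignment can never be pulled in by the expansion, since they have nothing to match on.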

fasta files:

Method 1 and including homologs from Method 4:
S cerevisiae (21)
S pombe (14)
A thaliana (110)
O sativa (84)
C elegans (64)
D melanogaster (42)
M musculus (65)
H sapiens (30)
ALL (430 proteins)
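
The per-species counts in this list can be checked against the ALL total with a few lines of Python. The dictionary below simply copies the numbers listed above; the variable names are illustrative.

```python
# Per-species counts copied from the list above
# (Method 1 including homologs from Method 4).
expanded_counts = {
    "S cerevisiae": 21,
    "S pombe": 14,
    "A thaliana": 110,
    "O sativa": 84,
    "C elegans": 64,
    "D melanogaster": 42,
    "M musculus": 65,
    "H sapiens": 30,
}

total = sum(expanded_counts.values())
print(total)
```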



The complete list of all domains compatible with a specific group of proteins. (Roughly an extension of columns 1 and 3 of Table 7 in the paper).

The exact classification of domains for each protein family.


Last modified: Wed Aug 20 12:02:18 CEST 2003