INTRODUCTION:
The past decade has witnessed rapid progress in the functional dissection of protein sumoylation (Geiss-Friedlander and Melchior, 2007). The SUMO (small ubiquitin-related modifier) gene SMT3 was first identified in S. cerevisiae as a suppressor of the centromeric protein Mif2 (Meluh and Koshland, 1995), and SUMO was later shown to be covalently coupled to the Ran GTPase-activating protein RanGAP1 as a reversible modifier (Mahajan et al., 1997; Matunis et al., 1996). Modification by SUMO can alter a protein's subcellular localization, activity or stability (Fernandez-Lloris et al., 2006; Mahajan et al., 1997; Matunis et al., 1996). Protein sumoylation
plays important roles in a variety of biological processes, such as transcriptional regulation, signal transduction, cell cycle progression and differentiation (Deyrieux et al., 2007; Gill, 2004; Montpetit et al., 2006; Seeler and Dejean, 2003). In addition, aberrations of the SUMO system are strongly implicated in numerous diseases and in cancer development (Dorval and Fraser, 2007; Fernandez-Lloris et al., 2006; Li et al., 2005; Seeler et al., 2007).
In this work, we updated SUMOsp 1.0 to version 2.0. The training data set was manually collected from the scientific literature. The non-redundant training data contained 279 sumoylation sites from 166 distinct proteins. An updated version of the GPS algorithm was then applied. Self-consistency, leave-one-out validation and 4-, 6-, 8- and 10-fold cross-validations were performed to evaluate the prediction performance and robustness of SUMOsp 2.0. The prediction performance was also tested on an additional data set not included in the training data, comprising 53 sumoylation sites from 31 proteins. We compared SUMOsp 2.0
with SUMOplot and SUMOsp 1.0 on both the training data and the new data. The specificity (Sp) of SUMOsp 2.0 was significantly improved, while the sensitivity (Sn) was similar to or only slightly lower than that of the previous tools. SUMOsp 2.0 was implemented
in Java 1.4.2 and uses the local CPU for computation. It is fast: SUMOsp 2.0 can predict potential sumoylation sites for ~1,000 proteins (with an average length of ~1,000 aa) within ten minutes. Taken together, we propose that the highly specific SUMOsp 2.0 web server will be more efficient for sumoylation site prediction. SUMOsp 2.0 is freely available at: http://bioinformatics.lcd-ustc.org/sumosp. This website is linked from the ExPASy Proteomics Tools page.
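
The evaluation terms used above can be illustrated with a minimal sketch. It assumes the standard definitions Sn = TP / (TP + FN) and Sp = TN / (TN + FP), and a simple shuffled partition for k-fold cross-validation, where leave-one-out is the special case k = n. The function names and the counts in the usage note are hypothetical and chosen only for illustration; this is not the SUMOsp 2.0 or GPS implementation, and none of the numbers are results from this work.

```python
import random


def sensitivity_specificity(tp, fn, tn, fp):
    """Return (Sn, Sp) from confusion-matrix counts.

    Sn = TP / (TP + FN): fraction of true sites that are predicted.
    Sp = TN / (TN + FP): fraction of non-sites that are rejected.
    """
    sn = tp / (tp + fn)
    sp = tn / (tn + fp)
    return sn, sp


def k_fold_splits(n_items, k, seed=0):
    """Yield (train_indices, test_indices) for k-fold cross-validation.

    Indices are shuffled once with a fixed seed (an assumption made for
    reproducibility), then dealt into k folds. Setting k == n_items
    gives leave-one-out validation.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test
```

For example, `sensitivity_specificity(8, 2, 90, 10)` uses made-up counts of 8 true positives, 2 false negatives, 90 true negatives and 10 false positives, and `list(k_fold_splits(279, 10))` partitions 279 items into ten train/test splits in which every item is tested exactly once.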

SUMOsp 2.0 User Interface