※ INTRODUCTION:
The past decade has
witnessed rapid progress in the functional dissection of
protein sumoylation (Geiss-Friedlander and Melchior, 2007).
The SUMO (small ubiquitin-related modifier) gene SMT3 was first
identified in S. cerevisiae as a suppressor of mutations in the
centromeric protein Mif2 (Meluh and Koshland, 1995), and SUMO was
later shown to be covalently coupled to the Ran GTPase-activating
protein RanGAP1 as a reversible modifier (Mahajan et al., 1997;
Matunis et al., 1996). SUMO modification can alter a protein's
subcellular localization, activity, or stability, among other
properties (Fernandez-Lloris et al., 2006; Mahajan et al., 1997;
Matunis et al., 1996). Protein sumoylation plays important
roles in a variety of biological processes, such as transcriptional
regulation, signal transduction, cell cycle progression, and
differentiation (Deyrieux et al., 2007; Gill, 2004; Montpetit
et al., 2006; Seeler and Dejean, 2003). In addition, aberrations
in the SUMO system have been implicated in numerous diseases,
including cancer (Dorval and Fraser, 2007; Fernandez-Lloris
et al., 2006; Li et al., 2005; Seeler et al., 2007).
In this work, we updated SUMOsp 1.0 to version 2.0.
The training data set was manually collected from the scientific
literature. The non-redundant training data contained 279 sumoylation
sites from 166 distinct proteins. An updated version of the
GPS algorithm was then applied. Self-consistency, leave-one-out
validation, and 4-, 6-, 8-, and 10-fold cross-validations were performed
to evaluate the prediction performance and robustness
of SUMOsp 2.0. The prediction performance was also tested on
an additional data set not included in the training data,
comprising 53 sumoylation sites from 31 proteins. We compared SUMOsp
2.0 with SUMOplot and SUMOsp 1.0 on both the training data
and the new data. The specificity (Sp) of SUMOsp 2.0 improved
significantly, while the sensitivity (Sn) was similar or only
slightly lower than that of the previous tools. SUMOsp 2.0 is
implemented in Java 1.4.2 and uses
the local CPU for computation. SUMOsp 2.0 is fast: it
can predict potential sumoylation sites for ~1,000
proteins (with an average length of ~1,000 aa) within ten minutes.
Taken together, we propose that the highly specific SUMOsp
2.0 web server will be more efficient for sumoylation site
prediction. SUMOsp 2.0 is freely available at: http://bioinformatics.lcd-ustc.org/sumosp.
This website is free
and open to all users, and there is no login requirement.
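
The evaluation terms used above (sensitivity, specificity, leave-one-out and n-fold cross-validation) can be illustrated with a minimal sketch. It is not the SUMOsp 2.0 implementation, which is written in Java and is not shown here. It assumes the standard definitions Sn = TP / (TP + FN) and Sp = TN / (TN + FP), and the function names `sensitivity_specificity` and `k_fold_indices` are hypothetical. For simplicity the folds are assigned without shuffling, whereas a real evaluation would normally randomize the order first.

```python
from typing import List, Sequence, Tuple


def sensitivity_specificity(
    actual: Sequence[bool], predicted: Sequence[bool]
) -> Tuple[float, float]:
    """Return (Sn, Sp) for paired true labels and predictions.

    Assumed standard definitions:
    Sn = TP / (TP + FN), Sp = TN / (TN + FP).
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)
    tn = sum(1 for a, p in pairs if not a and not p)
    fp = sum(1 for a, p in pairs if not a and p)
    fn = sum(1 for a, p in pairs if a and not p)
    # Guard against empty classes so the sketch never divides by zero.
    sn = tp / (tp + fn) if (tp + fn) else 0.0
    sp = tn / (tn + fp) if (tn + fp) else 0.0
    return sn, sp


def k_fold_indices(n_items: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split indices 0..n_items-1 into k (train, test) index pairs.

    Setting k == n_items gives leave-one-out validation.
    """
    if not 1 < k <= n_items:
        raise ValueError("k must satisfy 1 < k <= n_items")
    # Fold i holds every k-th index starting at i (no shuffling).
    folds = [list(range(i, n_items, k)) for i in range(k)]
    splits = []
    for i, test in enumerate(folds):
        train = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        splits.append((train, test))
    return splits
```

Calling `k_fold_indices(279, 10)` yields ten splits whose test folds hold 27 or 28 indices each, and `k_fold_indices(n, n)` reduces to leave-one-out. In each round a model would be trained on the `train` indices and scored on the `test` indices with `sensitivity_specificity`.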

SUMOsp 2.0 User Interface