※ Documentation
1. Content of the Database
Based on the rationale that "seeing is believing", the
MiCroKit database contains proteins that have been
experimentally shown to localize to the kinetochore,
centrosome and/or midbody. All microkit proteins have
been manually curated from the literature in PubMed.
The MiCroKit database holds information on these proteins,
including their names, accession identifiers in public
databases, and host organisms. The localization of each
protein and the supporting literature are also provided.
Functional annotations, e.g. functional domains and Gene
Ontology terms, can be found on the browsing page of each
protein, e.g. MCK-CE-00001. The full list of information
fields and their meanings is provided in the next section
of this document.
Most previous studies have focused on yeast and animals,
and only a few microkit proteins have been discovered in
plants. MiCroKit version 2.0 therefore provides information
on microkit proteins from seven organisms: Saccharomyces
cerevisiae, Schizosaccharomyces pombe, Caenorhabditis
elegans, Drosophila melanogaster, Xenopus laevis, Mus
musculus and Homo sapiens. More organisms will be added
to the database in the future.
2. All the fields of information for each protein in this database
The following fields can be accessed for each protein in the database:
| ID | The MiCroKit database ID of a protein. An ID starts with "MCK" (denoting a curated MiCroKit protein), followed by a two-letter abbreviation of the organism name (SC: Saccharomyces cerevisiae, SP: Schizosaccharomyces pombe, CE: Caenorhabditis elegans, DM: Drosophila melanogaster, XL: Xenopus laevis, MM: Mus musculus, HS: Homo sapiens) and a five-digit accession number of the protein. The three parts are delimited by hyphens ("-"). |
| SP | UniProt accession IDs |
| PI | Theoretical (calculated) pI (Compute pI/Mw from ExPASy) |
| MW | Calculated molecular weight (Compute pI/Mw from ExPASy) |
| GP | GenBank IDs of the protein sequence of each entry |
| GM | GenBank IDs of the nucleotide sequence of each entry |
| PN | Protein name, standard nomenclature |
| PA | Protein synonyms/aliases |
| GN | Gene names of the protein (usually taken from UniProt) |
| GA | Gene synonyms/aliases (usually taken from UniProt) |
| DT | Creation date of the entry |
| UD | Dates of important updates to the entry |
| CL | Subcellular localization of the protein. Only Kinetochore, Centrosome and Midbody are considered in this database. |
| OS | The host organism of the protein |
| OX | NCBI Taxonomy ID of the host organism |
| RF | Primary references reporting that the protein localizes to the kinetochore, centrosome and/or midbody |
| FN | Functional description of the protein (taken from UniProt) |
| UC | User comments. For future updates, users can contact us to add useful comments to an entry, helping to make the database more comprehensive. |
| KW | Keywords of the protein (taken from UniProt) |
| SS | Sequence source (the public database from which the sequence was taken) |
| NL | Length of the nucleotide sequence of a microkit protein |
| NS | Full-length nucleotide (cDNA/mRNA) coding sequence (CDS) of a microkit protein |
| PL | Length of the protein sequence |
| PS | Protein sequence of the entry |
| GO | Gene Ontology annotations of the entry (curated from UniProt) |
| IP | Functional domain annotations of the entry (InterPro, curated from UniProt) |
| PF | Functional domain annotations of the entry (Pfam, curated from UniProt) |
| SM | Functional domain annotations of the entry (SMART, curated from UniProt) |
| PS | Functional domain annotations of the entry (PROSITE, curated from UniProt) |
| PR | Functional domain annotations of the entry (PRINTS, curated from UniProt) |
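The ID layout described for the ID field can be checked programmatically. The following Python snippet is only a minimal sketch, not part of the MiCroKit site: the function name `parse_mck_id` and the regular expression are assumptions derived from the field description ("MCK", a two-letter organism code, five digits, joined by hyphens).

```python
import re

# Organism abbreviations as listed in the ID field description.
ORGANISMS = {
    "SC": "Saccharomyces cerevisiae",
    "SP": "Schizosaccharomyces pombe",
    "CE": "Caenorhabditis elegans",
    "DM": "Drosophila melanogaster",
    "XL": "Xenopus laevis",
    "MM": "Mus musculus",
    "HS": "Homo sapiens",
}

# Assumed pattern: "MCK", two uppercase letters, five digits, hyphen-delimited.
_ID_PATTERN = re.compile(r"MCK-([A-Z]{2})-(\d{5})")


def parse_mck_id(mck_id):
    """Split a MiCroKit ID into (organism name, accession number).

    Hypothetical helper; raises ValueError for IDs that do not follow
    the documented layout or use an unlisted organism code.
    """
    match = _ID_PATTERN.fullmatch(mck_id.strip())
    if match is None:
        raise ValueError(f"not a MiCroKit ID: {mck_id!r}")
    code, number = match.groups()
    if code not in ORGANISMS:
        raise ValueError(f"unknown organism code: {code!r}")
    return ORGANISMS[code], int(number)
```

For instance, `parse_mck_id("MCK-CE-00001")` yields the organism name for the code CE together with the accession number as an integer.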
3. How to search the MiCroKit database
Three search options are provided in MiCroKit: Simple
search, Advanced search and BLAST search.
Simple search returns all entries containing the input
keywords, which are separated by spaces. For example, the
input "bub1 human" finds five records in the MiCroKit
database, although only MCK-HS-00193 is the human Bub1
protein itself.
Advanced search allows you to enter up to three terms to
narrow the search. Query fields can be left empty if fewer
terms are needed. The three terms can be connected by the
following operators:
exclude: if selected, the term following this operator must not be contained in the specified field(s)
and: the term following this operator has to be included in the specified field(s)
or: either the term preceding or the term following this operator should occur in the specified field(s)
For example, to search specifically for the human CENP-E
protein, you can try the query: OS Homo sapiens (Human)
and GN CENPE and CL kinetochore (see the example figure).
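The way the three operators combine terms can be illustrated with a small Python sketch. It is not the MiCroKit implementation: the function names, the case-insensitive substring matching, and the left-to-right evaluation order are all assumptions, and the records are toy data invented for the illustration.

```python
def field_contains(record, field, term):
    """Case-insensitive substring match of a term in one field (assumed behaviour)."""
    return term.lower() in str(record.get(field, "")).lower()


def advanced_search(records, first, rest):
    """Filter records with up to three terms.

    first: (field, term) for the first term.
    rest:  list of (operator, field, term), operator in {"and", "or", "exclude"}.
    Terms are combined strictly left to right (an assumption of this sketch).
    """
    hits = []
    for record in records:
        matched = field_contains(record, *first)
        for operator, field, term in rest:
            found = field_contains(record, field, term)
            if operator == "and":
                matched = matched and found
            elif operator == "or":
                matched = matched or found
            elif operator == "exclude":
                matched = matched and not found
            else:
                raise ValueError(f"unknown operator: {operator!r}")
        if matched:
            hits.append(record)
    return hits


# Toy records for illustration only; they are not actual database entries.
toy_records = [
    {"OS": "Homo sapiens (Human)", "GN": "CENPE", "CL": "Kinetochore"},
    {"OS": "Mus musculus (Mouse)", "GN": "Cenpe", "CL": "Kinetochore"},
    {"OS": "Homo sapiens (Human)", "GN": "BUB1", "CL": "Kinetochore"},
]

# The example query from the text: OS Homo sapiens and GN CENPE and CL kinetochore.
hits = advanced_search(
    toy_records,
    ("OS", "Homo sapiens"),
    [("and", "GN", "CENPE"), ("and", "CL", "kinetochore")],
)
```

With the toy data above, `hits` contains only the first record, since it is the only one that satisfies all three terms.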
BLAST search can be used to find a specific protein and/or
related homologues by sequence alignment. This option helps
you find the query protein quickly and accurately.
4. How to browse the MiCroKit database
You can browse through the MiCroKit database instead of
searching for a specific protein. On the browse page, all
microkit proteins are sorted by one of the following
features: MCK ID, SwissProt ID, organism, molecular weight,
or pI. Each feature title is clickable and re-sorts the
microkit proteins in either ascending (U) or descending (D)
order. Each page shows up to twenty entries.
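The same sort-and-page behaviour can be reproduced on downloaded data. The Python sketch below is illustrative only: `browse_page` is a hypothetical helper, and records are assumed to be dictionaries keyed by the field codes from section 2.

```python
def browse_page(records, sort_field, descending=False, page=1, page_size=20):
    """Return one page of records sorted by a single field.

    page is 1-based; page_size defaults to the twenty entries per page
    described for the browse page.
    """
    if page < 1:
        raise ValueError("page numbers start at 1")
    ordered = sorted(records, key=lambda r: r[sort_field], reverse=descending)
    start = (page - 1) * page_size
    return ordered[start:start + page_size]
```

Calling `browse_page(records, "MW", descending=True, page=2)` would give entries 21 to 40 when the records are ordered by molecular weight, heaviest first.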
5. Download the Database
Users can also download the MiCroKit data for further
analysis. The following files are available for download:
All information in the MiCroKit database as an ASCII file (.zip or .tar.gz)
Protein sequences of all microkit proteins in FASTA format (.zip or .tar.gz)
Nucleotide sequences of all microkit proteins in FASTA format (.zip or .tar.gz)
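Once a FASTA archive has been unpacked, the sequences can be read with a few lines of Python. This is a minimal sketch of a generic FASTA reader; it makes no assumption about how MiCroKit formats its header lines, and the sequences in the example are made up for the demonstration.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Made-up two-record example; real input would come from the unpacked file.
example = io.StringIO(">entry_one\nMKTLLV\nAAGG\n\n>entry_two\nACGT\n")
parsed = list(read_fasta(example))
```

To read a real download, open the unpacked file in text mode and pass the file object to `read_fasta` in place of the `io.StringIO` example.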
| Last update: June 5th, 2006 |
| Copyright © 2005,2006 LCD, USTC, All Rights Reserved |