Q9VSK6 (UniProt 2018_03 genome)
For a target sequence (see Protein sequence), the dcGO predictor predicts ontology terms as follows:
First, obtain the domain architecture of the target, along with its constituent domains and supra-domains, from the SUPERFAMILY database.
Then, use the domain-centric annotations to predict ontology terms for the target:
- If the target contains a domain or supra-domain, all ontology terms associated with that domain or supra-domain are transferred to the target, each with its hypergeometric score (h-score);
- When a target-to-term transfer is supported by one or more of the target's domains/supra-domains, the supporting h-scores are summed to give the predictive score (p-score);
- The p-score is then rescaled to the range 0-1. Within each namespace (e.g., each of the three GO sub-ontologies), p-score = (SUM - MIN) / (MAX - MIN), where SUM is the sum of all h-scores supporting a term transferred to the target, and MIN and MAX are the minimum and maximum of SUM over all terms predicted for the target;
Finally, the rescaled p-score is used to rank the predictions: the higher the p-score, the stronger the evidence for the prediction. In dcGO, each ontology has its own slim version, containing ontology terms at four levels of increasing granularity (highly general, general, specific, and highly specific). The table lists the top 5 predictions for each specificity level and each namespace. In addition to these specificity-restricted predictions, available via Export prediction (slim version), the full list of predictions is available for download via Export prediction (full version).
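The scoring steps above can be illustrated with a minimal Python sketch. This is not the dcGO implementation: the function name `predict_terms`, the input layout (a mapping from domain or supra-domain identifier to `(term, namespace, h_score)` tuples), and the identifiers and scores in the example are all hypothetical. The help text also does not say what happens when MAX equals MIN within a namespace; the sketch assumes a p-score of 1.0 in that case.

```python
from collections import defaultdict


def predict_terms(target_domains, domain_annotations):
    """Rank ontology terms for one target (illustrative sketch only).

    target_domains: domain/supra-domain ids found in the target (hypothetical ids).
    domain_annotations: dict mapping id -> list of (term, namespace, h_score).
    Returns a dict mapping namespace -> list of (term, p_score), best first.
    """
    # Transfer every term of every matching domain and sum the h-scores (SUM).
    sums = defaultdict(lambda: defaultdict(float))
    for domain in target_domains:
        for term, namespace, h_score in domain_annotations.get(domain, []):
            sums[namespace][term] += h_score

    ranked = {}
    for namespace, term_sums in sums.items():
        lowest = min(term_sums.values())   # MIN over the predicted terms
        highest = max(term_sums.values())  # MAX over the predicted terms
        span = highest - lowest
        # Rescale to 0-1: p-score = (SUM - MIN) / (MAX - MIN).
        # Assumption: if all sums are equal (span == 0), use 1.0.
        scored = {
            term: (total - lowest) / span if span else 1.0
            for term, total in term_sums.items()
        }
        ranked[namespace] = sorted(
            scored.items(), key=lambda item: item[1], reverse=True
        )
    return ranked


# Hypothetical annotations: two domains and the supra-domain they form.
annotations = {
    "domA": [("term1", "BP", 2.0), ("term2", "BP", 1.0)],
    "domB": [("term1", "BP", 3.0), ("term3", "MF", 4.0)],
    "domA,domB": [("term4", "BP", 3.0)],
}
ranked = predict_terms(["domA", "domB", "domA,domB"], annotations)
print(ranked["BP"])  # [('term1', 1.0), ('term4', 0.5), ('term2', 0.0)]
print(ranked["MF"])  # [('term3', 1.0)]
```

In the example, term1 is supported by both domA and domB, so its h-scores add up to 5.0, the maximum in its namespace, giving a p-score of 1.0. The rescaling is done separately for each namespace, so scores are comparable only within one namespace.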
Protein sequence
Comment |
(tr|Q9VSK6|Q9VSK6_DROME) Uncharacterized protein, isoform D {ECO:0000313|EMBL:AFH04344.1} KW=Complete proteome; Reference proteome OX=7227 OS=Drosophila melanogaster (Fruit fly). GN=CG43163 OC=Ephydroidea; Drosophilidae; Drosophila; Sophophora. |
Sequence length |
2523 |
Sequence |
MTNYENIINLPLNFNANNHAAGNNENNAKIAIVGEQLLNKMSQRDFSENEPECTPELPAA
NRALFLEKVRQSNAACQSGDFATAVLLYTDALQLDPGNHILYSNRSAALLKQGQFTAALQ
DATQARDLCPQWPKAYFRQGVALQCLGRYGEALAAFASGLAQEPSNKQLMAGLVEASLKS
PLRAALEPTLQQLRTMQLQESPFVVSSVVGQELLQATQYSAAVTVLEAALRIGSCSLKLR
GSVFSALSSAHWALNQLDQAIGYMQQDLAVAKSLGDTAGECRAHGNLGSAYFSQGAHKEA
LTAHRYQLVLAMKCKDTQAAAAALTSLGHVYTASGDYPNALASHKQCVQLFKQLGDRLQE
AREIGNVGAVYLALGECEAALDCHSQHLRLARKLHDQVEEARAYSNLGSAHHQRRQFTQA
AACHEQVLRIAQALGDRSMEAAAYAGLGHAARCAGDASASKRFHERQLAMALAARDKLGE
GRACSNLGIVYQMLGSHDAALKLHQAHLGIARSLGDRTGMGKAYGNMARMAHMAGSYEAA
VKYHKQELAINQAMNDRSAEAATHGNLAVAYQALGAHDAALTHYRAHLATARSLKDTAGE
ACALLNLGNCLSGRQEYEEAVPHYESYLMLAQELGDVAAEGKACHLLGYAHFSLGNYRAA
VRYYDQDLALAKDAQHRPNMGRAYCNLGLAHLALGHTAAALECQQLFLAVAHATNQLPAK
FRALGNIGDILIRTGSHEEAIKLYQRQLALARAAGDRSMEAAACGALGLAHRLMRRWDKA
LGHHTQELTLRQELGDLSGECRAHGHLGAVHMALCSWTNAVKCYQEQLERAQEQRDAAVE
AQAHGNLGIARLNMAHYEAAIGCLEAQLGTLERVSLPSTQADRARALGHLGDCYAALGDY
EEALKCHDRQLQLALGLTSHRDQERAYRGLGQARRALGQLPAALVCLEKRLVVAHELHSP
EIKALAYGDLGHVHAALGNHAQALNCLEHQRELAQGLQDRALESDAMCALGQVQQRMGQH
AQALELHRQDLEICTELSAPALQARALSNLGSVHESLGQQAEALKCYERQLELSTDRLAK
AMACLALGRVHHQLEQHNQAVDYLRQGLASAQTTGKSEEEAKIRHQLGLALRSSGDAEGA
HIQLETAAQLLESVRHEQRSPETRQALYDLQTTCYQLLQVLLVALNRNEDALVAAERCKA
RGGADAVSGEGSKVPLADSESIQETVDRGRTTVLYYSLAGDQLFTWLLEPHVGIVRFHAA
KIDAHSLQLPLSLSEEEEEEEDLEMERELKLNRDQNQDQSEDQGTSNKANRLLERYMSLV
RDNLGVNSQSLLHEGDGSGWRASTEQLLEDLPGAGGSGGGGFLRMVSRNQLLNSSNYSLS
SLFSVGSVGGSVASLQGSTRSVGSRSSRRAPGLPVWRGPSCLHILYNLLLAPFDDLLPAG
EADTNRQGRRELVLVLDSSLYLVPFAILRAAQEDGEYLCERCAILTAPSLQSMRSRPRIR
RDRARPPKALVVGGPRIPSNLAELWGWAGAESPAALQEAAMVADMLQATALAGSNATKES
VLAELPSAECVHFAANISWQLGAVVLSPGDVVTAEQQEQKEPHEPQMTDFTLAAGELRQL
RLSARLVVLSSYHSVEPITGDGVAQLAGGWLLAGAGVVLISLWPVPETAAKILLRAFYSA
LLQGARAARALAEAMQTVQHTKHFAHPANWAGFLLVGSNIRLSNKVAMGHALCELLRTPE
RCRDALRVCLHLVEKSLQRIHRGQKNAMYTTQQSIENKAGPVAGWKDLLMAVGFRVEPAA
NGIPASVFFPQADPEERLAQCSASLQALLALTPATLQALGKLVHVNSAEYAGDIIAVVRN
ILAQFPGSNPATGSSSTKSDAIAETCFIEMPLSVRLWRVAGCHELLASVGFDLTEVGADQ
VILRTGKQANRRQCQFVLQALLALFDTREAPKNLGLDRDSDHSSSCESLAEPEMDVASIP
PAASSTPSVNGGNKAPLPLHPRSAFISYVRRRGEPDGGRTDAIGGVGSGALDSSLANTTD
SEHSLSDGYVTQPGFGLSNPRLGYASMRAPVRVSRPGGGGESDAAFTPSPPVTDPSLSLA
LAHQTRIRSLYSQTSSAAIVPPVATTLNRRPDSSSSASSTTDWEGSGHATVLRRGVSTAI
PQQQPPLPPPRPPTFHNLSVGAGGRSKPKIKLGSAQSSLAARVNKEHALFMDRLSVRTEL
SAPGSGNAMTSGSAARKPLALPEEEDVSNVVFNPSSLYFSQTDSDLLNEPKQEETPPPTK
SSTKSLQDSMMRHMNRELTPSISEMYHDRNVGLGLAPPLSKLLLSPNYEEQEVVSITEAC
PQSGANLKTLVDAVSELELSGSSSGATMPTVVGGGANEVEGTSGTSAEATCPICGEPADV
ICRCKAVTIQKKSILKPWLSNVPDSSLVQASELTTADILERRLEIHTATSSELRSGAPYS
VDLSRRDEGDGRSVADSQCSSNYKRVLASASGGALSLAGIGKGSEAEAQSASSGATCSSA
RLV
|
Domain architecture

1