ENSMGAP00000005623 (Meleagris gallopavo 76_2)
For a target sequence (see Protein sequence), the dcGO predictor predicts ontology terms using the following procedure:
First, obtain the Domain architecture of the target, together with its residential domains and supra-domains, from the SUPERFAMILY database.
Then, use the domain-centric annotations to predict the ontology terms of the target:
- If the target contains a domain/supra-domain, all ontology terms associated with that domain/supra-domain are transferred to the target (each together with its hypergeometric score, h-score);
- When a target-to-term transfer is supported by one or more residential domains/supra-domains, their h-scores are summed to give the predictive score (p-score);
- The p-score is then rescaled to the range 0-1. Within each namespace (e.g., each of the three sub-ontologies of GO), p-score=(SUM-MIN)/(MAX-MIN), where SUM is the sum of all h-scores supporting a term transferred to the target, and MIN and MAX are respectively the minimum and maximum of SUM over the whole list of predicted terms for the target.
Finally, the rescaled p-score is used to rank the predictions. The higher the p-score, the stronger the support for the prediction. In dcGO, each ontology has its own slim version, containing ontology terms at four levels of increasing granularity (highly general, general, specific, and highly specific). The table lists the top 5 predictions for each specificity level and each namespace. In addition to the predictions restricted by term specificity, i.e., Export prediction (slim version), the full list of predictions is also available for download, i.e., Export prediction (full version).
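The scoring steps described above can be illustrated with a minimal Python sketch. It is not the dcGO implementation: the function name `predict_terms`, the input layout, and the identifiers in the usage example are hypothetical. Two behaviours are assumptions of this sketch rather than statements from the description: a domain/supra-domain is counted once per target even if it occurs several times, and when MAX equals MIN (where the formula is undefined) every term receives a p-score of 1.0.

```python
from collections import defaultdict


def predict_terms(target_domains, domain2terms):
    """Transfer terms to a target and rank them by rescaled p-score.

    target_domains: identifiers of the domains/supra-domains in the target.
    domain2terms:   dict mapping identifier -> list of
                    (namespace, term, h_score) annotations (hypothetical layout).
    Returns a dict: namespace -> [(term, p_score), ...], best first.
    """
    # Sum the h-scores of every domain/supra-domain supporting a term.
    # Assumption: each distinct domain/supra-domain counts once per target.
    sums = defaultdict(lambda: defaultdict(float))
    for domain in set(target_domains):
        for namespace, term, h_score in domain2terms.get(domain, []):
            sums[namespace][term] += h_score

    ranked = {}
    for namespace, term_sums in sums.items():
        lo = min(term_sums.values())
        hi = max(term_sums.values())
        span = hi - lo
        # p-score = (SUM - MIN) / (MAX - MIN), computed per namespace.
        # Assumption: if MAX == MIN, assign 1.0 to avoid division by zero.
        scored = [
            (term, (total - lo) / span if span > 0 else 1.0)
            for term, total in term_sums.items()
        ]
        # Rank by descending p-score; break ties by term name.
        ranked[namespace] = sorted(scored, key=lambda item: (-item[1], item[0]))
    return ranked


# Usage with made-up identifiers and h-scores (illustration only).
example_annotations = {
    "domA": [("BP", "term1", 2.0), ("BP", "term2", 1.0)],
    "domA,domB": [("BP", "term1", 3.0), ("BP", "term3", 2.0)],
}
example_result = predict_terms(["domA", "domA,domB"], example_annotations)
```

In this toy input the sums are 5.0, 1.0 and 2.0 for `term1`, `term2` and `term3`, so MIN is 1.0 and MAX is 5.0. Rescaling within a namespace makes the scores comparable only among terms predicted for the same target, which is why the lowest-ranked term always ends at 0 and the top-ranked term at 1.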
Protein sequence
Comment: pep:known_by_projection chromosome:UMD2:10:8501315:8529274:-1 gene:ENSMGAG00000005648 transcript:ENSMGAT00000006363 gene_biotype:protein_coding transcript_biotype:protein_coding
Sequence length: 3293
Sequence:
QRIFVYVTWTPSEEGKVRELVTFVVNGIVKHQAVLLGVAEQPVKKKKSLWDTIKKKNSSE
TSIPRKINKRNSKVKNVNKTFQVSQEADRLRSPLQSCENQNAMQNSTSPVNESPVGSENK
LPISPIAPVLEELRTTACTPLSMRRSTTFSDIAAAVNEGLLPVMDSHNFNKSVDDHSELK
SGSAYSCSGLGKPPISNDEVILNCTLSPVCTPEHSSSVPFLSTRRFLSPDSFVNDNYQVD
GDIVVPSVPILSPDQFVKDMQLYMQSKTPKPEAALISSTTETYVVKECVPSKQKENGDRW
TKTQVHVEHVLNEAFEVRCVDSQVPHHNKSKENHFTCPPIGEFQTRSSQIEQPKKRPVLS
ATVIKNKPNADERKKMEMLQPKSKKCLNSAIMTCENMVPLYTKAEISERLPVIGPVSAGN
KCPSDKVSSSTESTSLCRKRKNEVSVEKNRVVGPVHVEEVERKRALTSSMDNKTQATVRP
SAFKHTTRERIGQNKKAGSSLQKASKTNSRINKPIAGLAQSHLTFVKPLKTVIPRHPMPF
AAQNMFYDERWKEKQQRGFTWWLNFVLTPDDFSVKTNTSQVNAAALILGEENHHKTSVPK
APTKEEASLKAYTARRRLNKLRRDACRLFTSETMVKAIKKLEVEIETRRLLVRRDRHLWK
DVGERQKVLNWLLSYNPLWLRIGLETVYGELITLESNSDFMGLAIFILNRLLWNPDIAAE
YRHPTVPHLYREGHEEALSKFTLKKLLLLVCFLDRAKQSRLIGHDPCLFCKDAEFKTSKD
ILLAISRDFLSGEGDLSRHLGFLGLPVSHIQTPLDEFDFAVTNLAVDLQCGIRLVRAMEL
LTRDWNLSKQLRVPAISRLQKMHNVDIVLSVLKKRGIHLKDDSGASIDCRDIVDRHRERT
LALLWKIVFAFQVDILLNVEQLKEEIQFLKNAHKIKMQLGALKLFSSCCRIQEDNSSSPS
PQSHGENVKLLMAWVNAVCGFYNIKVENFTVCFSDGRVLCHLIHHYHPCYVPLEAVCQRT
TQTVECSRTHKVGLNCSSSSESDTSLNVMEGPFDQTVTASVLYKELLENEKKNFQLINNA
VSDLGGIPAMIHYSDMSNTIPDEKVVITYLSFLCSRLLDLRQETRAARLIQSAWRKFRLK
RELKLSQERDRAARIIQKYAINFLSHRRLVKKHNAAVIIQKHWRRHLARIIFLNLKKTKW
EEARSKSATVIQAYWRRYSARKSYLQLRYYVIFVQARIRMLLAVAAYKRILWATVTIQNR
LRASNLAKEHRKRYEILRSSALTIQSAFRKWRKHKIQEKIRAAVVLQRYFRKWQSSKLTK
RKRAALLVQSWYRMHRDMKRYLRIKQSIIKIQAWYRCQIARRIYQEYRAQIVMIQQHYRA
YKLGKNEREIYLQKRAAVVVLQAAFRGKKARILYRQTKAACVVQSLWRMRQARQRFLLIK
KSVTLLQSHVRKHQQVKRYKEMKNAASVIQAWFRAHVTSKKAALSFQRMRLAAIVLQSAY
RGRKARKEAHILRSVIKIQSSFRAYVIRKRFEDLRNATVKIQACVKMRQARRYYRALREA
TLYVQRRYRSRRYALQLKEDYRKLKGACIRIQAAVRGCLVRKQIKRWRETAVFLQAQYRM
RRTRARYLLMYSAAVVIQKHYRAYHKQLCQRQEFLQAKKAAVCIQAGYRGYKARKKLKLE
HRSAVKIQAAFRAHATRKKYQAMIQASVVIQRWYRTCKTSNRQRLTFLKTRAAVLTLQAA
FRGYQVRKQIRRQCAAATAIQSAFRKFMALKTFRLMNHAVLNIQRRYRAIVISRKQRQEY
VELRNCVVRLQAIWRGKAARKKIQKMHNLATIIQSYYRMHVNQLKFKKLRRSTLVIQRYF
RAYCMKKNQRARYLKTKAAVLVLQSAYRGMTVRKQLRELSKAATTIQAAFKSYLVKKDYV
GLRSAAVVIQRRYRAVIHTKWQRQKYLSLKAAAIKMQAIYRGVKVRRQIHSMHRAAIRIQ
AMFKMHRINIRYQAIRMAAIIIQRQYRAFCLGRVQRKKYLELKKSSIVLQAAFRGMKVRQ
DLKMMHQSAALIQSYYRVHKQQRDFRNLLLATRRIQQWYRACKERNRQVHNYMTVRSATL
CLQAAFRGMKARRLLRTMNHSAELIQRRFRTFLQRKRFISLRTAAVVIQRKYRATKLAKI
QRQQYLSLLNAAVIIQSAYRGFLARQKMRQMHQAATVIQATLRMRKIYISYQVLRLASVI
IQQRYRAYREGKRVRKMYLKTYKSVLVIQAAYRGMKTRCFLKKRHEAALIIQRNYRMYRQ
YHRYRRVQWATQLIQRKFRANSLRNIAVQRYFSFKKAAICIQRAFRDMRLKKQHQEMHRA
ATVVQKNYKAFREHQRYLSLKAAALVFQRRYRALILSRQHTQEYLYLRRATIRLQAVYRG
IRVRKSIEHMHLAARTIQSAYKMHRNRSAYQKMRTAAVVIQNYYRSYVKAKNQQKKYLTI
KKSALLIQASYRGMKERQQLKMMRASAIIIQSSYRMYIQHKYYKQLCWAVRVIQQRFRAK
IAKKADMENYAKIRKAIICLQSSFRAKKARQLRKTNAAALCIQSFLRMRVERKRFLVKKA
AAITIQSAFRCQRARARYKSVQNSIVAIQRWYRACHSARLQKAEYSLQRQAIIIIQSAYR
GMKARKLTRQIRATRKIQSFLQMAVQRRKFIQLKRAAITLQACYLMHKAKSQYASYKKAA
VVLQRWYRSHLIVKHQRMTYLQTQKKIILVQAVVRRFIVKKRFQKIKESAIKIQASYRGF
KARRLANKVRAARVIQAWFRGCKARKEYASMVEAVRIIKSHFRTKRQRTWFLKMKFCALT
IQRRWRATLAARMVRLQFLATKKKHEAACLIQTTYRCFKERRKLDRQKAAAVTIQKHLRA
WQEGRLQFMKYNKIRRAVIKLQAFIRGYLVRKKFLEQKQKKRLLYFIAAAYHHVSAIKIQ
RAYRMHLAYKLAENHISSVLIIQKWFRAKMQRRRFQRDLQRIVQCQRMIRGWLNRRNDAA
TVIQRHVRRFLACRRKRKIAVGIIKFQALWRGYSWRKNNDTAKTKALRLSLEKANRKSKE
ENKLCNRTSIAIEYLLKYKHISYILAALKHLEVVTRLSPLCCENMAQSRAIFTIFVLIRS
CNRSVPCMDVIRYSVQVLLNVSKYERTTQAVYEVDNCIDTLLDLLQMYRGKAGDKISEKG
GSIFTKTCCLLAILSKDSKRALEIRSMPRVVTCIQSLYKLTARKCKMDAERTLAKQKTST
LISGISFIPVTPLRIKTVSRIKPDWVLRKDNMQEVVDPLQAIVMVMDTLGIAC
Domain architecture
