ENSMGAP00000013184 (Meleagris gallopavo 69_2)
For a target sequence (see Protein sequence), the dcGO predictor predicts ontology terms as follows:
First, obtain the target's domain architecture, together with the individual domains and supra-domains it contains, from the SUPERFAMILY database.
Then, use the domain-centric annotations to predict the ontology terms of the target:
- If the target contains a domain/supra-domain, all ontology terms associated with that domain/supra-domain are transferred to the target (together with their hypergeometric score, h-score);
- When a target-to-term transfer is supported by one or more of the target's domains/supra-domains, the h-scores are summed to give the predictive score (p-score);
- The p-score is then rescaled to the range 0-1. For each namespace (e.g., each of the three sub-ontologies of GO), p-score = (SUM - MIN) / (MAX - MIN), where SUM is the sum of all h-scores supporting a term transferred to the target, and MIN and MAX are, respectively, the minimum and maximum of SUM over the whole list of predicted terms for the target;
Finally, the rescaled p-score is used to rank the predictions: the higher the p-score, the stronger the support for the prediction. In dcGO, each ontology has its own slim version, containing ontology terms at four levels of increasing granularity (highly general, general, specific, and highly specific). The table lists the top 5 predictions for each specificity level and each namespace. In addition to this specificity-restricted list, i.e., Export prediction (slim version), the full list of predictions is also available for download, i.e., Export prediction (full version).
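The scoring procedure described above can be illustrated with a minimal Python sketch. It is not the dcGO implementation: the function name `predict_terms`, the input layout (a mapping from each domain/supra-domain to `(term, namespace, h_score)` tuples), and all identifiers in the toy data are hypothetical. Two behaviours are assumptions that the description does not specify: each distinct domain/supra-domain is counted once, and when MAX equals MIN within a namespace the p-score is set to 1.0.

```python
from collections import defaultdict


def predict_terms(target_domains, domain_annotations):
    """Rank ontology terms for one target by rescaled p-score, per namespace.

    target_domains: domains/supra-domains found in the target (hypothetical IDs).
    domain_annotations: dict mapping a domain/supra-domain to a list of
        (term, namespace, h_score) tuples.
    """
    # sums[namespace][term] = SUM of h-scores over all supporting domains.
    sums = defaultdict(lambda: defaultdict(float))
    # Assumption: a domain occurring several times is counted once.
    for domain in set(target_domains):
        for term, namespace, h_score in domain_annotations.get(domain, []):
            sums[namespace][term] += h_score

    ranked = {}
    for namespace, term_sums in sums.items():
        lowest = min(term_sums.values())   # MIN over predicted terms
        highest = max(term_sums.values())  # MAX over predicted terms
        span = highest - lowest
        scored = [
            # Assumption: if all SUMs are equal, report a p-score of 1.0.
            (term, (total - lowest) / span if span > 0 else 1.0)
            for term, total in term_sums.items()
        ]
        # Highest p-score first; ties broken by term name for stable output.
        ranked[namespace] = sorted(scored, key=lambda item: (-item[1], item[0]))
    return ranked


# Toy data: every identifier and h-score below is made up for illustration.
annotations = {
    "dom_A": [("term_1", "ns_X", 2.0), ("term_2", "ns_X", 1.0)],
    "dom_B": [("term_1", "ns_X", 3.0), ("term_3", "ns_X", 2.0)],
    "dom_A,dom_B": [("term_4", "ns_Y", 4.0)],
}
target = ["dom_A", "dom_B", "dom_A,dom_B"]

for namespace, predictions in sorted(predict_terms(target, annotations).items()):
    for term, p_score in predictions[:5]:  # top 5 per namespace
        print(namespace, term, p_score)
```

In the toy data, `term_1` is supported by two domains, so its SUM is 2.0 + 3.0 = 5.0; within `ns_X` the SUMs range from 1.0 to 5.0, so `term_3` (SUM 2.0) is rescaled to (2.0 - 1.0) / (5.0 - 1.0) = 0.25. Splitting the top 5 by specificity level would require the slim-ontology level of each term, which this sketch does not model.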
Protein sequence
Comment: pep:novel chromosome:UMD2:2:47182753:47480846:1 gene:ENSMGAG00000012513 transcript:ENSMGAT00000014086 gene_biotype:protein_coding transcript_biotype:protein_coding
Sequence length: 3216
Sequence:
DEHNDVQKKTFTKWINARFSKSGTPPVKDMFTDLKDGRKLLDLLEGLTGKPLPKERGSTR
VHALNNVNRVLQVLHQNNVELVNIGGTDIVDGNHKLTLGLLWSIILHWQVKDVMKNIMSD
LQQTNSEKILLSWVRQSSRPYSQVNVLNFTTSWADGLAFNAVIHRHKPEFFSWDKVIMMS
PIERLEHAFSIAKNHLGIEKLLDPEDVAVPLPDKKSIIMYLTSLFEVLPKQVTMEDIREV
ETLPRRFKQECEDGEFGLKQAVAPEEEQNSSRAETPSTVTEVDMDLDSYQVALEEVLTWL
LSAEDAFREQEEISGDVEEVKEQFATHEAFMMELTAHQSSVGSVLQAGNQLVAQGNLSPE
EEFEIQEQMLLLNSRWEELRVESMDRQSRLHDMLMELQKKQLQQLSDWLTVTEERIQKME
SQLLADDLETLQKQLEEHKNLQNDLEAEQVKVNSLTHMVVIVDESQGTDYNCCFDDSMFT
AGVSIGVLCDWSETQMNNLEILIVFFLYFFFSEKILKACVTNACFRLEVSRVWTIVDWYT
NFREKKTVIVTTHELLLILKEDMGMKRQTLDQLNELGQDVAQILGNSRASNQIDHDQEEL
TQRWDSLVQKLEDYSNQVTQAVGSVGMPQISQKTSAETMSLKEKVVVRESRQESPPPPPP
KKRQIPGDSEAKKKFDAESAELLDWISKSKAAIQATDIKEYKKMRETSSVKEKLKVIEKE
KVEKSQKLDDLKRNGHILLDQMGKEGLPAVDLKNLMEKILSGWKEVVQLLEELTRKIKYQ
EDINAYFKEMNELEKIIVEKESWLKNDSSASQPSAVLKDLCQRELTYLVSLSPRLEHLKA
TCKVLTSQPVSPNFVQESLSGFLDRFKLTCQKLDERHQQLKKVTETESLPTQDFLGSLQR
LKDVLSEVECKQQAIASGLSDPSKVEKALQQTKTVCETLEAQKPRVDKLAEETKALEKQS
SDRTKVYMQQFRHVQGHWDKLRLKTSKDVKLLEEITSKLRLFETNSKVVQKWMDGVKGFL
LTEKAAQGDTEGLQRQLDQCTIFVNEMEKIEPSLKEMKDLESVLRSQSIAGINTWAKSRL
VDCQTQWEKLSKQIISQKNRLSESQEKAVNLKKDLAEMQEWMTQAEEEYLEKDFEYKSPE
ELENAVEEMKRAKEDVLQKEVRVKILKDNIKMLATKVPSGGQELATELNVVLENYQLLCN
RIRGKCHTLEEVWSCWIELLQYLDLETAWLNTLEERVQMTENLTDKLDVVNDALESLESV
LRHPADNRTQIRELGQTLIDGGILDDIISEKLEAFNARYEELSHLAVSRQITLEQQLQTM
RETDHMLQVLQENLAELDRQLTSYLTDRVDAFQMPQEAQKIQAEIAAHEATLEELKKSPR
SFPPASPECRSPRGGTLLDVLQRKLREVSTKYQLFQKPANFEQRMLDCKRVLESVKAELH
VLDVKDTDPDVIQTHLDKCMKLYKTLSEVKLEVETVIKTGRQIVQKQQTDNPKGMDEQLT
ALKFLYNDLGAQVTEGKQDLERASQLARKMRKEAASLSEWLATTEAELVQKSTSESLLSD
LDSEIAWAKNVLRELERRKADLKSITESCTTLQALVEGSETSLEEKLCVLNAGWSRVRTW
TEDWCDTLLNHQSQLEIFDENVAHISTWLYQAEALLDEIEKKPASKKEETVKRLTSELDD
VSLRVDNVRDQAVILMNGRGVSCRELVEPKLAELNRNFEKVSQHIKSAKMLVGQERLPVM
PESHEVRVAFPDLEKFESDVQNMLKVVEKHQELSEEDEKLDQERAQIEEVLQRGGQLLQQ
PMEDNKREKIRMQLVLLQTKYNNVQAITVQQRLKSHFATGVRTLPFSAEYLVEINKVLLA
MADVELLLNAPELNTGIYEDFSTQEDALKNVKDILDKLGDQIAVIHEKQPDVILEASGPE
AIQIGDALTQLNAEWDRINRMYNDRKCSFDRAIEAWRQFHCDLNDLSQWITEAEGLLADA
RGPDGSLDLQAAGLHQQELEEGVHSHQSSVLTLNRTGEGIIQKLSATDGSFLQEKIKLEN
VDWHVMFTAKVHVDILFSFLRLKGENQQLIDYRKKLDELRCWLENAENVLDMRFTFNHEE
NLRELQVLSEEMELQGDKLEWLNRTEPEVILDKNVDLQEKNKLSDSLQSLNVRWNKVFRE
VPDRVRQLEAFVRQPYLVAQAKPSDVQKVVLVSSSLEIPVQTQHALELSAPADLDKTTTE
LAEWLVLIDQMLKSNIVTVGNAEDINRTIARMKVTKADLDQRHPQLDSVFTLAQNLKNKT
SSSDIRTVITEKLEKVKNQWDNTQHGVEVRQQQLKYMLSDSIQWDEQRKEMEHLMEQCEI
RLRMLQQAPKEVLVKQISDNKMLIHDLHRGDITVAAFNDLSNKLLREYREDDTRRVKETT
DQMNTCWVNLNQRAVDRQNILETELKTVQTSCKDLESFLKWLQEAETTVNVLADAARREN
TAQDSACVRELRKQMQDIQAEIDAHNDIFKSIDGNRQKMVKALGNSEEAALLQHRLDDMN
QRWNDLKVKSANIRAHLEASAEKWNRLLTSLEELIKWLNTKDEELKKQMPIGGDVPTLQQ
QYDHCKQKALRRELKDKEQTILNAVDQARIFLADQPIEGPEEPRRNVHSKTELTPEEKAM
KIAKAMRKQSSEVKEKWENLNASATIWQKQVDKALEKLKDLQCAMDDLDADLKEAENVRN
GWKPVGDLLIDSLPDHIEKTTAFREEIAPINLKVKTVNDLSSQLSPLDLYPSLKMSRQLD
DLNMRWKLLQVSVEDRLKQLQEAHHDFGPASQHFLSTSVQFPWQRSVSHNKVPYYINHET
QTTCWDHPKMTDLFQSLADLNNVRFSAYRTAIKIRRLQKALCYKQFLYSTSGREGHKLHN
IHNINILSSVQGKQYVSTCTFKKKEKRGMENRLCICNIQICINFCNNCNGWLSSFIDFKK
FLSFYLLVNFLKVGTESSSVPLLGYKFHKSFQELQGKHKVCSRISLASILNDKIAILEMQ
KILPFSALSLFIHIYVYFPQNHNKPEITVKQFIDWMHLEPQSMVWLPVLHRVAAAETAKH
QAKCNICKECPIVGFRYRSLKHFNYDVCQSCFFSGRTAKGHKLHYPMVEYCIPTTSGEDV
RDFTKVLKNKFRSKKYFAKHPRLGYLPVQTVLEGDNLETPITLISMWPEQYDPSQSPQLF
HDDTHSRIEQYATRLAQMERTNGSFLTDSSSTTGSV
Domain architecture
