SUPERFAMILY HMM library and genome assignments server

Comparative Genomics Tools

The SUPERFAMILY web site provides a number of comparative genomics tools for analysing superfamily and family domains from across the tree of life.


Unusual domains

It is now possible to compare the domain composition of a given genome with that of other genomes and thereby detect superfamilies that are over- or under-represented. The comparison group can be one of several predefined choices, such as eukaryotes or archaebacteria, or any user-defined set of genomes (or a single genome), e.g. other strains of the same species.
Over-represented superfamilies have typically expanded as the organism specialised for its environmental niche; e.g. in Shewanella oneidensis, a Gram-negative bacterium with diverse respiratory strategies that are of potential use in bioremediation, the five most unusual superfamilies include multiheme cytochromes, porins and transferrins. Proteins in these superfamilies may provide interesting targets for investigation.
Links to the unusual superfamily and family domains are provided at the top of each genome page.
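
As a rough illustration of the comparison described above, the sketch below scores each superfamily by the log2 ratio of its share of domain assignments in one genome to its share in a comparison set. The scoring formula, the pseudocount, the function name and the toy counts are all assumptions made for illustration; this page does not state how SUPERFAMILY itself measures over- or under-representation.

```python
import math


def representation_scores(genome_counts, background_counts, pseudocount=1.0):
    """Return the log2 enrichment of each superfamily in a genome vs. a background.

    Both arguments map superfamily name -> number of assigned domains.
    Positive scores suggest over-representation, negative scores
    under-representation. The pseudocount (an assumption) avoids division
    by zero for superfamilies absent from one side.
    """
    superfamilies = set(genome_counts) | set(background_counts)
    n = len(superfamilies)
    genome_total = sum(genome_counts.values()) + pseudocount * n
    background_total = sum(background_counts.values()) + pseudocount * n
    scores = {}
    for sf in superfamilies:
        genome_share = (genome_counts.get(sf, 0) + pseudocount) / genome_total
        background_share = (background_counts.get(sf, 0) + pseudocount) / background_total
        scores[sf] = math.log2(genome_share / background_share)
    return scores


# Toy counts, invented for illustration only.
genome = {"Multiheme cytochromes": 40, "Porins": 20, "P-loop NTPases": 140}
background = {"Multiheme cytochromes": 10, "Porins": 30, "P-loop NTPases": 960}

scores = representation_scores(genome, background)
# Superfamilies ordered from most over-represented to most under-represented.
ranked = sorted(scores, key=scores.get, reverse=True)
```

A log ratio of shares is only one possible choice; a statistical test against the background distribution would be an equally reasonable way to rank superfamilies.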


Unique domain pairs

We have also added a page for each genome which lists all unique pairs of superfamilies that occur next to each other. Links to these pages can be found at the top of each genome page.


Adjacent domain pair lists and graphs

For each genome, lists of pairs of superfamilies that occur next to each other in its domain architectures can be accessed by clicking on the "domain combinations" link at the top of each genome page. This list can be visualised as a graph by clicking on the Adjacent domain pairs logo (shown above) at the top of the domain pairs list pages. Nodes represent superfamilies labelled according to their SCOP classification, edges indicate superfamilies that occur next to each other in domain architectures, and arrows show the N-to-C terminal order. Node size and edge thickness are proportional to the number of proteins.
Users can remove from the initial list those pairs that are already present in proteins of known structure, or that also occur in other genomes. Lists and graphs for adjacent domain pairs in custom sets of genomes can be obtained by entering two-letter genome codes into the "Custom genome sets" form on the adjacent domain pairs list page.
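
The following is a minimal sketch of how such a list and graph could be derived, assuming each protein is given as a list of superfamilies in N-to-C order. The function names, the input format and the toy labels are hypothetical and are not taken from the SUPERFAMILY code base.

```python
from collections import Counter


def adjacent_pairs(architectures):
    """Count ordered (N-terminal, C-terminal) superfamily pairs.

    `architectures` holds one entry per protein; each entry lists that
    protein's superfamilies in N-to-C order. A pair is counted once per
    protein, so the count can serve as an edge thickness.
    """
    edge_weights = Counter()
    for domains in architectures:
        edge_weights.update(set(zip(domains, domains[1:])))
    return edge_weights


def node_weights(architectures):
    """Count the proteins in which each superfamily appears at least once."""
    weights = Counter()
    for domains in architectures:
        weights.update(set(domains))
    return weights


def novel_pairs(pairs, known_pairs):
    """Drop pairs already seen elsewhere, e.g. in known structures or other genomes."""
    return {pair: n for pair, n in pairs.items() if pair not in known_pairs}


# Toy architectures with placeholder labels, not real assignments.
proteins = [
    ["A", "B"],
    ["A", "B", "C"],
    ["B", "A"],
    ["C"],
]
edges = adjacent_pairs(proteins)   # directed edges, weighted by protein count
nodes = node_weights(proteins)     # node sizes, weighted by protein count
remaining = novel_pairs(edges, known_pairs={("A", "B")})
```

Keeping the pairs ordered means that ("A", "B") and ("B", "A") stay distinct, which matches the arrows showing N-to-C terminal order in the graph.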


Domain combinations in groups of genomes

Starting from a superfamily of interest, users can get a visual representation of all domain architectures in which the superfamily occurs. This representation gives a clear picture of the partner domains occurring with the superfamily of interest. Five groups of genomes are available for comparison: all, eukaryotic, bacterial, eubacterial and archaeal.
To reach these pages, select the superfamily of interest from one of the genome assignment pages, then follow the Domain Combinations icon link at the top of the page. Finally, select one of the five groups of genomes.
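
The idea of collecting the architectures and partner domains for one superfamily can be sketched as follows. This assumes the same hypothetical input as a plain list of per-protein architectures; the function names and toy data are illustrative only.

```python
from collections import Counter


def architectures_with(superfamily, architectures):
    """Return each distinct architecture containing `superfamily`, with its protein count."""
    return Counter(tuple(arch) for arch in architectures if superfamily in arch)


def partner_domains(superfamily, architectures):
    """Count the distinct architectures in which each partner co-occurs with `superfamily`."""
    partners = Counter()
    for arch in architectures_with(superfamily, architectures):
        partners.update(set(arch) - {superfamily})
    return partners


# Toy architectures with placeholder labels, not real assignments.
proteins = [
    ["A", "B"],
    ["A", "B"],
    ["B", "C", "A"],
    ["C"],
]
found = architectures_with("A", proteins)
partners = partner_domains("A", proteins)
```

Restricting `proteins` to one group of genomes before calling these functions would give the per-group view described above.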


Taxonomic visualisation of domain distribution

The TaxViz tool displays the distribution of domain superfamilies, or families, across the major taxonomic kingdoms, or across the genomes within a kingdom. This gives an immediate impression of how far superfamilies and families are restricted to certain kingdoms of life.
To reach the TaxViz pages, select the superfamily, or family, of interest from one of the genome assignment pages, and then follow the TaxViz logo (shown above) at the top of the page.
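
One simple way to summarise such a distribution is the fraction of genomes in each kingdom that contain the superfamily, sketched below. The choice of this statistic, the function name, the genome codes and the data are assumptions for illustration; this page does not specify what TaxViz actually plots.

```python
from collections import defaultdict


def kingdom_distribution(superfamily, genome_kingdom, genome_superfamilies):
    """Return kingdom -> fraction of its genomes that contain `superfamily`.

    `genome_kingdom` maps genome code -> kingdom name.
    `genome_superfamilies` maps genome code -> set of superfamilies found in it.
    """
    present = defaultdict(int)
    total = defaultdict(int)
    for genome, kingdom in genome_kingdom.items():
        total[kingdom] += 1
        if superfamily in genome_superfamilies.get(genome, set()):
            present[kingdom] += 1
    return {kingdom: present[kingdom] / total[kingdom] for kingdom in total}


# Hypothetical genome codes and contents, invented for illustration.
genome_kingdom = {"g1": "Eukaryota", "g2": "Eukaryota", "g3": "Bacteria", "g4": "Archaea"}
genome_superfamilies = {
    "g1": {"A", "B"},
    "g2": {"B"},
    "g3": {"A"},
    "g4": {"C"},
}
distribution = kingdom_distribution("A", genome_kingdom, genome_superfamilies)
```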


Domain occurrence networks

Undirected domain occurrence networks are available for all superfamilies. Nodes in these networks represent genomes. Connections between nodes represent domain architectures containing the superfamily of interest that are present in both genomes. These networks convey the co-occurrence of domain architectures, for each superfamily, across the genomes included in SUPERFAMILY.
As with the other tools, the Domain occurrence network pages can be reached by selecting the superfamily of interest from one of the genome assignment pages, and then clicking the network logo (shown above) at the top of the page.
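
The construction of such a network can be sketched as below, assuming each genome is given as a set of domain architectures. Two genomes are linked when they share at least one architecture containing the superfamily of interest. The function name, the input format, the genome codes and the use of the shared-architecture count as an edge weight are all assumptions made for illustration.

```python
from itertools import combinations


def occurrence_network(superfamily, genome_architectures):
    """Build an undirected genome network for one superfamily.

    `genome_architectures` maps genome code -> set of architectures, each a
    tuple of superfamilies in N-to-C order. Returns a dict mapping a sorted
    genome pair to the number of architectures containing `superfamily`
    that both genomes share.
    """
    relevant = {
        genome: {arch for arch in archs if superfamily in arch}
        for genome, archs in genome_architectures.items()
    }
    edges = {}
    for g1, g2 in combinations(sorted(relevant), 2):
        shared = relevant[g1] & relevant[g2]
        if shared:
            edges[(g1, g2)] = len(shared)
    return edges


# Hypothetical genome codes and architectures, invented for illustration.
genomes = {
    "g1": {("A", "B"), ("A",)},
    "g2": {("A", "B"), ("C",)},
    "g3": {("C",), ("B", "A", "C")},
}
network = occurrence_network("A", genomes)
```

In this toy data, "g2" and "g3" share the architecture ("C",), but it does not contain superfamily "A", so no edge is drawn between them.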
