Quick-start guide to using PHOG
A detailed step-by-step tutorial is also available.
The PHOG web server retrieves pre-calculated super-orthology groups for sequences in the PhyloFacts resource (http://phylogenomics.berkeley.edu/phylofacts/). PhyloFacts is an encyclopedia of thousands of protein families from across the Tree of Life. It includes phylogenetic trees, predicted subfamilies, family and subfamily hidden Markov models, predicted 3D structures, GO annotations and links to external resources. PHOGs are derived by analyzing the phylogenetic trees of PhyloFacts families, and correspond to the super-orthology groups found in those trees.
If you use PHOG, please cite the following paper, which describes it:
- Ruchira S. Datta, Christopher Meacham, Bushra Samad, Christoph Neyer and Kimmen Sjölander, "Berkeley PHOG: PhyloFacts orthology group prediction web server," Nucleic Acids Research 2009; doi: 10.1093/nar/gkp373
PhyloFacts is described in the following papers:
- Krishnamurthy, Brown, Kirshner and Sjölander, "PhyloFacts: An online structural phylogenomic encyclopedia for protein functional and structural classification," Genome Biology 2006
- Glanville, Kirshner, Krishnamurthy and Sjölander, "Berkeley Phylogenomics Group web servers: resources for structural phylogenomic analysis," Nucleic Acids Research 2007
Step 1: Type (or paste) in a protein sequence identifier or accession. See examples and details of accepted inputs.
Step 2: Click “Search”. Results appear further down the same page within a few seconds.
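As a rough illustration of Step 1, the Python sketch below checks whether a pasted string has the format of a UniProt accession, using the accession pattern published by UniProt. It is not part of PHOG: the function name is hypothetical, and PHOG may accept other identifier types (see the details of accepted inputs), so a string that fails this check is not necessarily an invalid query.

```python
import re

# Accession pattern published by UniProt (6- or 10-character accessions).
UNIPROT_ACCESSION = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)


def looks_like_uniprot_accession(text: str) -> bool:
    """Return True if text has the format of a UniProt accession.

    Hypothetical helper: it only checks the format and does not query
    UniProt or PHOG. Surrounding whitespace is trimmed and the text is
    upper-cased first, as an assumption about pasted input.
    """
    candidate = text.strip().upper()
    return UNIPROT_ACCESSION.fullmatch(candidate) is not None
```

A format check of this kind only catches obvious typing or pasting errors before a search; it says nothing about whether the sequence is present in PhyloFacts.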
If your input sequence is contained in a PhyloFacts orthology group, two tables will be displayed. The first, "PhyloFacts Orthology Groups Containing the Query Sequence", gives a list of orthology groups, with summary data and links to each PHOG page. The second table, "Orthologs and In-Species Paralogs from One or More PhyloFacts Books", gives a list of orthologs, retrieved from the orthology groups listed in the first table.
Orthologs and In-Species Paralogs from One or More PhyloFacts Books
- Gene: The UniProt accession. Clicking this link takes you to the corresponding page in the UniProt resource.
- Species: The species of origin.
- UniProt ID: The identifier for the sequence.
- Description: A description of the molecular function of the protein, drawn from the UniProt resource.
- Links to other resources: The next five columns contain icons that link to external resources, where available: SwissProt, Gene Ontology and biological literature links in the UniProt resource, and KEGG and BioCyc pathway data.
- Protein-protein interactions: Link to information about interactions between this and other proteins.
- Alignment: The type of the PhyloFacts family, which depends on the degree of alignment overlap. There are three basic types: Domain (sequences share a common structural domain), Global (sequences align along their entire lengths), and Local (local alignments with no correspondence to a solved structure). Orthologs defined within Global homology books are far more likely to have a common function than those that share only a local region of homology.
- PHOG: The accession for the PhyloFacts orthology group.
- View Tree: Clicking this link takes you to the family tree, which you can view using PhyloScope, the Berkeley Phylogenomics Group JavaScript tree viewer.
- EC: A link to the IntEnz database at the EBI (http://www.ebi.ac.uk/intenz/), which provides information about Enzyme Commission numbers.
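For readers who copy the table into their own scripts, one row can be modeled as a small record. The Python sketch below is illustrative only: the class, field names and placeholder values are assumptions rather than a PHOG export format, and only a subset of the columns is shown. It keeps the rows that come from Global homology books, following the note under Alignment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrthologRow:
    """One row of the orthologs table (hypothetical field names)."""
    gene: str         # "Gene" column: UniProt accession
    species: str      # species of origin
    uniprot_id: str   # "UniProt ID" column
    description: str  # description drawn from UniProt
    alignment: str    # "Domain", "Global" or "Local"
    phog: str         # PHOG accession


def global_orthologs(rows):
    """Return only the rows whose family type is Global."""
    return [row for row in rows if row.alignment == "Global"]


# Placeholder values, not real PhyloFacts data.
rows = [
    OrthologRow("ACC_1", "Species A", "ID_1", "placeholder", "Global", "PHOG_1"),
    OrthologRow("ACC_2", "Species B", "ID_2", "placeholder", "Local", "PHOG_2"),
    OrthologRow("ACC_3", "Species C", "ID_3", "placeholder", "Global", "PHOG_1"),
]

for row in global_orthologs(rows):
    print(row.gene, row.species, row.phog)
```

Filtering on the Alignment column is a conservative choice: it discards Domain and Local rows, which may still be of interest when only a shared region matters.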