Annotation Complete Checklist (page in progress)

Consistency checks to run before a gene (or, ideally, a complex) is considered annotation complete.

What does "annotation complete" mean?

A gene product (or complex) is considered "annotation complete" when ALL papers retrieved by a PubMed search on the gene name have been curated (there may be exceptions if papers are out for community curation).

Gene specific

GO Molecular function

  • Check whether known targets are missing for 'protein modifiers'
  • Link functions to processes (do the processes of the substrate annotations match?)

GO biological process

  • Are the biological processes linked from functions annotated?
  • Do regulation terms specify positive or negative?
  • GO slim
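
The check for regulation terms that do not specify a direction can be partly automated. The following is a minimal sketch, assuming annotations are available as (gene, GO term name) pairs; the function name, the input shape and the sample genes are hypothetical and not part of any PomBase tool. It only flags candidates: a bare "regulation of" term may still be the most specific annotation the paper supports.

```python
def unspecified_regulation(annotations):
    """Return (gene, term_name) pairs annotated to a bare 'regulation of ...' term.

    Terms starting with 'positive regulation of' or 'negative regulation of'
    already specify a direction and are not flagged.
    """
    flagged = []
    for gene, term_name in annotations:
        name = term_name.strip().lower()
        if name.startswith("regulation of "):
            flagged.append((gene, term_name))
    return flagged


# Illustrative data only; gene names are placeholders.
sample = [
    ("geneA", "regulation of mitotic cell cycle"),
    ("geneB", "positive regulation of transcription by RNA polymerase II"),
    ("geneC", "negative regulation of TOR signaling"),
]

flagged = unspecified_regulation(sample)
for gene, term_name in flagged:
    print(f"{gene}: '{term_name}' does not specify positive or negative")
```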

GO localization

  • refine localisation screen annotations based on low-throughput data (expand)


  • Remove or replace all IEA/NAS/TAS/IC where possible
  • refine ISS (expand)
  • refine ISA (expand)
  • Are all GO annotations as specific as possible?


  • refine existing annotations to "normalize" phenotypes (expand)
  • look for identical phenotypes differently described
  • same allele described or named differently
  • conditions important for interpretation omitted
  • Do phenotype and condition combinations make sense? (look for copy/paste errors in conditions/penetrance/expression/extensions)
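
One way to look for the same allele described or named differently is to group allele records by a normalised description and report groups with more than one name. This is a minimal sketch, assuming records are (gene, allele name, allele description) tuples; the record layout, function names and sample values are hypothetical. The normalisation here is deliberately crude (case and whitespace only), so it will miss alleles whose descriptions differ in substance.

```python
from collections import defaultdict


def normalise(description):
    # Lower-case and drop all whitespace so "K45 A" and "k45a" compare equal.
    return "".join(description.lower().split())


def alleles_named_differently(records):
    """Map (gene, normalised description) to the allele names used for it,
    keeping only descriptions that appear under more than one name."""
    names_by_key = defaultdict(set)
    for gene, allele_name, description in records:
        names_by_key[(gene, normalise(description))].add(allele_name)
    return {
        key: sorted(names)
        for key, names in names_by_key.items()
        if len(names) > 1
    }


# Illustrative data only; genes and alleles are placeholders.
sample = [
    ("geneA", "geneA-K45A", "K45A"),
    ("geneA", "geneA-kd", "k45a "),
    ("geneA", "geneA-delta", "deletion"),
    ("geneB", "geneB-1", "unknown"),
]

clashes = alleles_named_differently(sample)
for (gene, description), names in clashes.items():
    print(f"{gene} {description}: named as {', '.join(names)}")
```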

Complexes: run enrichment on the gene list for the complex with an expect value of 1 to see annotation differences (this will work for GO, but won't be possible for phenotypes)

Modifications added by, removed by, etc

Complex specific: enrichment

  • look at enrichment for process/component/function IC to fill in gaps

Run enrichment on kinetochore and spindle pole body (after update): which are attachment/biorientation, which are loading?

Can annotations be more specific? Check for demo genes missing for a particular term

Updating Existing/Legacy Annotations

  • Check all existing ISS, IEA, TAS and NAS annotations to see if any are no longer required, or are incorrect
    • Remove any TAS/NAS/ISS which are now covered by experiment
    • Automated mappings (IEA) will be suppressed by experimental data. Are any IEA annotations not covered by your manual annotation? It should be possible to make a manual annotation to cover all automated mappings (if no experimentally supported annotation can be made, a manually evaluated ISS should be possible)
    • Mappings that can't be replaced by manual ISS or experimental annotations may be incorrect and need to be removed. Report incorrect mappings on the GO annotation issues tracker
      • Swiss-Prot keyword (SPKW, SP_KW, UniProtKB-KW) mappings: request labels "UniProt KW2GO mapping", "GOA"
      • Swiss-Prot Subcellular Location (SP_SL, UniProtKB-SubCell) mappings: request labels "UniProt subcell2GO mapping", "GOA"
      • InterPro mappings: request label "InterPro mapping"
      • For UniProt keyword and subcellular location mappings, you can also go to the UniProt entry from a PomBase gene page and send a message to UniProt. Ivo Pedruzzi will fix it quickly.
    • Some "pombe kw mappings" have the NAS evidence code (you will know these are mappings because they will not be visible in Artemis). These will need to be deleted from the mapping file.
    • If a mapping doesn't seem to be the problem, the ontology may need to be revised; contact GO editors via the ontology tracker
  • Check for consistency with other annotations and other resources
  • Make sure all remaining ISS are made to an experimentally characterised ortholog.
    • If the gene has an S. cerevisiae ortholog, check the annotations to the ortholog in SGD.
      • If the gene in SGD is not annotated to a term, and you think it clearly should be, mail them (sgd-helpdesk at to add it so that the ISS is supported. This is frequently required when annotating gene products which are not published.
    • Can you make any further annotations based on what SGD has? (Note reasons why annotations cannot be transferred; 1:1 is easiest)
  • Ask experts to give feedback on lists
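
The review of legacy ISS, IEA, TAS and NAS annotations described above can be sketched as a simple partition. This is a minimal example, assuming annotations are (gene, GO term, evidence code) tuples; the function name, the tuple layout and the sample data are hypothetical. It only treats an annotation as covered when an experimental annotation exists for exactly the same gene and term. An experimental annotation to a more specific child term would also cover it, but detecting that needs the ontology and is not attempted here.

```python
from collections import defaultdict

# Standard GO experimental evidence codes.
EXPERIMENTAL = {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"}
# Evidence codes the checklist asks curators to review.
REVIEW = {"ISS", "IEA", "TAS", "NAS"}


def review_legacy(annotations):
    """Split legacy annotations into those covered by an experimental
    annotation to the same gene and term, and those that are not."""
    codes_by_pair = defaultdict(set)
    for gene, term, code in annotations:
        codes_by_pair[(gene, term)].add(code)

    covered, uncovered = [], []
    for gene, term, code in annotations:
        if code not in REVIEW:
            continue
        if codes_by_pair[(gene, term)] & EXPERIMENTAL:
            covered.append((gene, term, code))    # candidate for removal
        else:
            uncovered.append((gene, term, code))  # needs a manual annotation or a report
    return covered, uncovered


# Illustrative data only; genes and term identifiers are placeholders.
sample = [
    ("geneA", "term_1", "TAS"),
    ("geneA", "term_1", "IDA"),
    ("geneA", "term_2", "IEA"),
    ("geneB", "term_1", "ISS"),
]

covered, uncovered = review_legacy(sample)
print("covered by experiment:", covered)
print("still to review:", uncovered)
```

Annotations in the second list are the ones to cover with a manual annotation, or to report as incorrect mappings if no manual annotation can be made.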
Last modified on Dec 24, 2016, 4:21:37 PM