PomBase Planning Meeting 27/5/2015

Action item review

Outstanding action items carried forward from previous meeting

Action items from 13 March

New Agenda items

  • Review of posters/talks/workshop for pombe 2015
  • Discuss what people think of making "review" versions of curated sessions accessible from the literature section: useful or not?
    • Curation attribution (mouse-over?)
    • Can flag if a paper is curated or not
    • Raises visibility of what a 'curation session' looks like
    • Can see all annotation from a paper simultaneously

Continuing Agenda items

  • diversity data
    • What is happening on the Ensembl side?
    • 1,419 variants were significantly associated with at least one phenotype. For 89/223 traits, at least one variant passed the significance threshold
      • What is required for gene pages

  • Versioning....
    • (need Mark's help to get this file up to date)
    • Problem: FungiDB, for example, uses a descriptions GFF file from PomBase as its main input, but this records only Schizosaccharomyces_pombe.ASM294v2.25.gff3.gz, which does not change even though the annotation changes between PomBase versions. As time goes on the gene build becomes more static but the annotation changes more. Need a way to capture the annotation version in the GFF versioning.
    • From Midori: the build number in the table has never corresponded to the number embedded in the GFF file names. According to the Jira ticket, "2.1" was the build number for at least chado versions 38-40, and the GFF files nearest those dates include 2.20 and 2.21.
    • How does someone get from Schizosaccharomyces_pombe.ASM294v2.25 to know which gene build?
      • i.e. where on the Ensembl site is the historical data about the gene build that was used in a specific release?
    • Figuring out the numbering of releases before 32 (I think maybe we just used a date stamp for these?)
    • It seems that .25 is <eg_version>
    • The GFF file should be renamed to capture:
      • PomBase sequence assembly version
      • gene build version
      • annotation version
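
To make the renaming idea concrete, here is a minimal Python sketch. It assumes the current file names follow the pattern species.assembly.eg_version.gff3.gz, as in the example above. The function names and the proposed "build"/"annot" naming scheme are hypothetical illustrations, not an agreed convention.

```python
import re

# Pattern for the current file names, e.g.
# Schizosaccharomyces_pombe.ASM294v2.25.gff3.gz
# (assumed layout: species.assembly.eg_version.gff3.gz)
CURRENT_NAME = re.compile(
    r"^(?P<species>[A-Za-z_]+)"
    r"\.(?P<assembly>ASM\d+v\d+)"
    r"\.(?P<eg_version>\d+)"
    r"\.gff3\.gz$"
)


def parse_current_name(filename):
    """Split a current-style GFF file name into its version components."""
    match = CURRENT_NAME.match(filename)
    if match is None:
        raise ValueError(f"unrecognised GFF file name: {filename}")
    return match.groupdict()


def proposed_name(species, assembly, gene_build, annotation_version):
    """Build a file name that records all three versions.

    Hypothetical scheme for discussion only: the 'build' and 'annot'
    labels are placeholders, not an existing PomBase or Ensembl convention.
    """
    return (
        f"{species}.{assembly}"
        f".build{gene_build}.annot{annotation_version}.gff3.gz"
    )


parts = parse_current_name("Schizosaccharomyces_pombe.ASM294v2.25.gff3.gz")
# Gene build "2.1" and annotation (chado) version "40" are illustrative
# values taken from the discussion above.
print(proposed_name(parts["species"], parts["assembly"], "2.1", "40"))
```

The current name only yields the assembly and the eg_version. The gene build and annotation version have to come from somewhere else, which is the gap the renaming is meant to close.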

  • Display of wild strains, see previous minutes
    • Need to discuss what we can feasibly display in the light of the volume and scope of the data
    • should we just attach the phenotype trait data to the SNP?

  • When will usage stats be back?
    • Is it possible to provide a list of the most frequently accessed pages?

Hosting high throughput datasets

  • None pending at high priority
    • Carr lab replication origins
    • Gx data (postponed, waiting for user)
    • Runge lab, pending (transposon insertions?)

Next Priority Jira tickets

  • 1. Finalising any gene page display issues related to summary views etc (phenotype, GO and multi-gene phenotype). Tickets related to this are now collected at Chado future 1 (6 tickets).
  • 2. Query builder related tickets especially to handle GO annotation extensions, phenotype conditions and alleles, multi gene phenotypes and sequence features.
    • Curators need to discuss this among themselves, and then with Kim and Mark (should have done this before the meeting)
  • 3. Refining the Query builder results download for the above new query types (and the others which had previously been identified - downloading physical interactions or GO terms associated with a gene list).
  • 4. Other tickets related to the term suggestion behaviour.
  • Then tickets in Chado future 3 which appear to be stalled


Usage stats

  • Is it possible to get lists of the most frequently accessed gene pages?



Phenotype ontology


  • Kim is finalising multi-gene phenotypes (and their storage in Canto) and many annotation transfer speed-ups.
    • We have a large volume of multi-gene phenotype data ready to input, so this will be the issue to tackle immediately after the compact display. There are related tickets already for how this should look on the gene pages.

Other general issues



Update on community curation

Literature (triage) status

Item                                                  | March 2013 | April 2013 | May 2013 | Aug 2013 | May 7 2014 | July 23 2014 | Oct 2014 | March 2015 | May 2015
All publications                                      | 9755       | 9761       | 9773     | 9989     | 10356      | 10400        | 10522    | 10693      | 10797
Curatable publications                                | 4735       | 4740       | 4780     | 4896     | 5017       | 5070         | 4873     | 4974       | 5053
Publications with Approved sessions                   | 580        | 600        | 647      | 712      | 1083       | 1187         | 1598     | 1983       | 2123
Publications with active sessions                     | 245        | 247        | 265      | 246      | 220        | 246          | 225      | 146        | 148
Publications with session needing approval            | 14         | 4          | 4        | 18       | 6          | 11           | 19       | 4          | 2
community curatable publications                      | -          | -          | -        | 351      | 526        | 613          | 733      | 859        | 1008
community curated publications with approved sessions | -          | -          | -        | 58       | 159        | 178          | 207      | 253        | 270
curatable publications without sessions               | -          | -          | -        | 3730     | 3256       | 3177         | 2518     | 2413       | 2371
  • Numbers of annotatable papers have dropped due to re-triage and reclassification of some papers that are probably of low value for curation.

All annotation types

172764 increased from 170515 in Oct

Next priorities



News and Outreach

Next planning meeting