PomBase Planning Meeting 20 Nov 2013

Cambridge - Meeting Room ?????

Action item review

Postponed Action Items

Outstanding action items carried forward from previous meeting

Action items from Sept

  • AI: Midori to make track names mandatory - DONE
  • AI: Antonia to ask Katrina to fill in track names - DONE
  • AI: Mark to remove tracks from gene pages - DONE
  • AI: Mark to group data tracks by biological type (histone modification, polyadenylation, etc.); Paul says this is possible - DONE
  • AI: Mark to demo HTP datasets from Marc Bühler's DamID data (discuss how the display is pulled through to the PomBase?? page, and which details are displayed) - DONE
  • AI: Mark: We want to group by genes first and then by relationship. In future we also want to group by relationship.
  • AI: Ditto for Juan's poly(A) data - DONE
  • AI: Sort out the database names available as BLAST options (the database option labels cannot currently be interpreted)

New (and continuing) Agenda items

Hosting community survey results

Feedback from community survey


  • 470 respondents (60% of respondents were daily or weekly users)
  • 98% of users are looking up data on specific genes
  • 54% are looking for candidate genes
  • 162 PIs completed the survey

Gene Pages

  • 95% always or frequently use the basic information at the top of the page
  • 72% always or frequently use the sequence download (lots of requests for sequence download to be moved up to the basic info section at the top of the page)
  • 60% always or frequently use the context map
  • 51% always or frequently use the GO section
  • 74% always or frequently use the phenotype section (most used section after basic info)

Ensembl browser

  • Many people found it difficult to navigate in the browser: 92/360 (25%); 93 had not tried, another 5 failed
  • A similar number found it difficult to switch on tracks: 71/281 (25%); 163 had not tried, 3 failed

Later specific questions:

  • 25% of people found it "somewhat difficult or very difficult" to use
  • 30% of people said that it "did not meet their needs"

Most desired new features (ranked options)

  • Multi-gene phenotypes were the most desired new data type/feature (50% voted them into their top 3)
  • Also popular were visualisations:
    • Of modification data 41%
    • Of networks 43%
    • Of signalling pathways 33%
  • Not so highly ranked: specific data types on browser tracks, publication pages, synteny, variation data
  • Caveat: some people could not figure out how to change the ranking...


97.6% of people were satisfied with the response they received from the help desk

Community curation summary (reasons for non-participation)

  • I have no papers to curate: 45.43% (174)
  • I don't know how: 25.33% (97)
  • I have not had time: 24.54% (94)
  • I have not been asked: 18.02% (69)
  • I do not want to participate: 0.26% (1)

Of those who have tried

  • 3/50 people did not find Canto easy to use
  • 9 people could not annotate all their data
  • 91% would be happy to curate more papers

Other organisms

Fission yeast is the primary organism of research for 88% of respondents

  • However, 283/470 also work experimentally on other organisms. Of these:
  • 60% cerevisiae
  • 44% human
  • 19% mouse
  • 16% japonicus
  • 7% candida

Most desired …. (if you could add or improve one feature summary, collated into categories)

  • Existing feature 18 (will inform community, FAQs etc)
  • Future - In progress 32 (identified, on trackers)
  • Future - InterMine 3
  • Future - Blue sky 7
  • Insufficient Information 9
  • Future curation 6 (we don't have the data)
  • Improve Blast 9
  • Improve Browser 15
  • Improve Sequence download 11
  • Add gene summary 6
  • Improve phenotype data display 1
  • Add Network data 8

Blast related

Sequence download from gene page

Grant related

  • Final comments for pre proposal
  • refine approach for data dissemination and sustainability
  • other

Ensembl browser speed

  • See email response (and survey): "When I turn on the tracks for the Woolcock data they take around 30 seconds to load. The default view is configured for a single gene start-stop (more about that later), but most people will want to browse this data along the chromosome. For each scroll the new data takes the same length of time (30 seconds or so) to render every time you scroll left or right. You could not possibly use this to browse through the genome in a region of interest, and this is only a single experiment set. If you do similar browsing in the UCSC browser or GBrowse, with multiple tracks the plots render in a second or two. The speed is really, really prohibitive."

Ensembl browser default views

We talked a while ago about Ensembl browser default views. There are probably better defaults we could use (file too large to upload in planning_meeting). Also, in this view, if you have tracks switched on, the sequence features are buried in the middle; they should be at the top.

Hosting high throughput datasets

Browser update speed

It takes 2 hours to update the browser. Is there any way this can be synchronised with the switch to the updated gene pages? We saw that the gene pages were updated, and an announcement has now been sent out advertising new data which is not yet visible.

EBI fire drill PomBase fail

Steve asked what measures will be taken to avoid future outages; Mark to update us. Also, a PomBase-specific browser outage was reported by Vincent Vanoosthuyse (26/9).

Priority Jira tickets


Usage stats



Phenotype ontology

  • 2826 terms
  • Cell phenotypes now disjoint from cell population phenotypes


  • Paper almost complete
  • Trying to obtain GMOD compliance
  • Install and maintenance instructions documented

and on GitHub?:

  • Ontology loading time (hourly update) reduced from 1 hour to 10-20 seconds; now mainly happens offline

Other general issues



Update on community curation

Sent out ~100 community curation sessions. Getting lots back in various stages of completion.

Literature (triage) status

Item                                                   March meeting  April meeting  May meeting  July 3  Aug 18  Oct ??  Nov 20
All publications                                       9755           9761           9773         9935    9989    10025   10120
Un-triaged publications                                0              4              2            1       0       0       0
Curatable publications                                 4735           4740           4780         4877    4896    4909    4951
Publications with approved sessions                    580            600            647          674     712     791     884
Publications with active sessions                      245            247            265          249     246     209     225
Publications with sessions needing approval            144 416                                            18      35      23
Community curatable publications                       -              -              -            262     351     385     433
Community curated publications with approved sessions  -              -              -            31      58      80      94
Curatable publications without sessions                -              -              -            -       3730    ?       3542

Canto annotations (from 2013-??-??)

name  March count  April count  May count  July count  August  Sept  Oct (v39)  Latest load

All annotation types

cv_name count
DNA_binding_specificity 2
cat_act 17
ex_tools 19
pathway 32
subunit_composition 47
m_f_g 95
complementation 179
genome_org 225
misc 258
disease_associated 418
name_description 680
EC numbers 838
sequence 939
warning 1604
PSI-MOD 1772
PomBase family or domain 1883
PomBase gene characterisation status 5143
PomBase gene products 7018
molecular_function 8717
biological_process 12945
cellular_component 15376
species_dist 25460
gene_ex 26359
fission_yeast_phenotype 26641
Total 136667

(last month's total was 132119)

Next priorities



News and Outreach

Next planning meeting


  • PIs to coordinate on travel and equipment budget
Last modified on Nov 20, 2013, 9:19:47 AM