Version 2 (modified by antonialock, 8 years ago)


February 18 2013 10AM EBI Hinxton

Present: Paul, Steve, Jurg, Val, Midori, Dan, Mark, Antonia
Remote: Kim

Action items carried forward and not addressed yet

  • AI Organise a large general "PomBase" banner
  • Demo of curation tool (postponed)
  • AI Val/curators will start to send out a large number of community curation sessions (postponed until new session management and documentation are in place; we will continue to send out small numbers of papers)
  • AI: Mark to liaise with Giulietta to help with making a Pombe community curation video.
  • AI: Val to ask CRUK if their firewall is stripping cookies from pages and let Mark know

New Action Items

  • AI Curators and Kim to further discuss the monthly build which will be available on the first Monday of each month
  • AI Mark, Dan and Paul to provide an estimate for when they will be able to go live with the build given to them on the first Monday of each month.
  • AI Kim to check if CiteXplore (references?) can be put into Chado in order to speed up loading
  • AI Mark to run checks to find out the exact loading times.
  • AI Dan and Mark to see which files downloadable from PomBase have previously been generated by Ensembl. These should be easy to create automatic updates for.
  • AI Val and Kim to identify which files Kim could write code to generate automatically, so that they are updated regularly.

Speed issues

  • Mark has been working on this for a few weeks. This includes changing from searching to an internal table. He has been working on caching the JSON behind the pages and the ontology requests. One option is to pregenerate the HTML. We agreed that stage is looking much quicker already. He is nearly ready, but more things can be addressed, such as caching CiteXplore (can this be integrated into Chado?). He can go live with some of the improvements done to date.
  • Dan and Paul think that we need some way of checking that page loading speed will be the same across different locations. Paul says that there is an international system for checking speed, but it is expensive to use long term.
  • Mark estimates that, excluding the search, performance is down to 0.6 of a second. Loading adds 2 seconds. In total it is max 5-6 seconds (typical = 3 seconds?). Paul thinks 3 seconds loading would be acceptable. Paul asked Mark to run some tests so we know exactly how long it is taking. Val needs to be informed of progress. Dan also thinks it is important to do speed tests so we know exactly what will happen.
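The speed tests requested above could be scripted along the following lines. This is only a minimal sketch, not Mark's actual test setup: the function names, the number of runs and the use of the median as the "typical" figure are assumptions made for illustration. It times the server response only (not rendering in the browser), and it would need to be run from several sites to address the location question Dan and Paul raised.

```python
import statistics
import time
import urllib.request

TARGET_SECONDS = 3.0  # loading time Paul considered acceptable in this meeting


def time_page(url, runs=5):
    """Fetch a URL several times and return the wall-clock time of each fetch."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        with urllib.request.urlopen(url) as response:
            response.read()  # read the whole body so transfer time is included
        samples.append(time.perf_counter() - start)
    return samples


def summarise(samples, target=TARGET_SECONDS):
    """Reduce raw timings to a typical (median) and a worst-case (max) value."""
    typical = statistics.median(samples)
    return {
        "typical": typical,
        "worst": max(samples),
        "typical_within_target": typical <= target,
    }
```

Calling `summarise(time_page(url))` with the URL of a page on stage would give figures that can be compared directly with the "typical" and "max" estimates quoted above.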

Regularity of updates

  • Current status:
    • v30 is live (from the 12th of November, 253 fully curated publications). v31 is ready to go (from December, 324 fully curated publications) and v32 will be skipped (not a huge improvement and already outdated). v33 is in progress (535 fully curated publications).
    • We will go live with v31 because Mark can implement the speed improvements sorted to date at the same time.
  • Future:
    • We should do a monthly scheduled build. This will flag if things get delayed. Mark estimates that it takes 1-2 full working days, spread over a week-long period, to make a new release. Paul thinks it is acceptable for this time to be spent monthly.
    • Curators and Kim agreed to make a monthly build available on the first Monday of each month.
    • Dan, Mark and Paul will provide an estimate for when the dataset can go live. This is however likely to happen on the second Friday or third Monday of each month.
  • Automated pipeline:
    • We agree that the ideal thing would be to have an automated pipeline for releases. However, Dan and Paul pointed out that even with a pipeline things will not happen by magic, and a release is likely to require work input by Mark.
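The dates implied by the monthly schedule above can be worked out mechanically. The sketch below is illustrative only; the function and key names are hypothetical, and the go-live dates are just the two likely candidates mentioned in the discussion, not agreed deadlines.

```python
import datetime

MONDAY, FRIDAY = 0, 4  # weekday numbers as used by datetime.date.weekday()


def nth_weekday(year, month, weekday, n):
    """Return the date of the n-th occurrence of a weekday in a given month."""
    first = datetime.date(year, month, 1)
    offset = (weekday - first.weekday()) % 7  # days until the first occurrence
    day = first + datetime.timedelta(days=offset + 7 * (n - 1))
    if day.month != month:
        raise ValueError("the month has no such weekday occurrence")
    return day


def release_schedule(year, month):
    """Build hand-over date plus the two likely go-live dates for one month."""
    return {
        "build_available": nth_weekday(year, month, MONDAY, 1),
        "go_live_second_friday": nth_weekday(year, month, FRIDAY, 2),
        "go_live_third_monday": nth_weekday(year, month, MONDAY, 3),
    }
```

For example, `release_schedule(2013, 3)` gives a build on 4 March 2013, with go-live candidates on 8 March and 18 March.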

Update on providing files for download

  • At present the downloadable files are not consistent with the version (they are out of date). There are lots of outstanding helpdesk tickets related to this.
  • Mark reckons that it is mainly a matter of writing the code to pull the information out. After that it is easy. Can Kim help with this? Val and Mark should check what needs doing and what Kim can do.
  • Dan pointed out that some of the files have been generated by Ensembl. Paul reckons that what they have made before should be easily assimilated into a pipeline.

Misc items not thoroughly discussed this meeting

  • Progress implementing the RNA/protein expression data from Sam's paper
    • This is postponed in favor of speed and updates
  • Update on gene name syncing (e.g. why is mre11 still called rad32 when it was changed in early November?)
    • This is fixed on stage

Curation Update

  • We have upped curation output a lot. It should be quicker in the future because of the improved infrastructure, but some of the increase is due to focusing on older papers.
    • 547 approved sessions up from 441
  • A few community curation sessions have been sent out, but still not in bulk. We still need:
    • More frequent updates
    • Session management and help documentation to be finished for Canto (in progress)
  • Phenotype ontology
    • Currently 1932 phenotype terms

Chado loading / curation tool

  • Nothing reported this meeting

Meetings Attended/Upcoming

  • 28 Feb-1 March GO cell cycle ontology content meeting (GO and Jacky Hayles)
  • March 20-22 BYG, 2 posters submitted: Antonia (community curation), Midori (phenotypes)
  • April 7-10 ISB, 4 posters submitted: Val (Annotation QC, general PomBase), Antonia (community curation), Midori (phenotypes). Mark might want to attend, at least the poster sessions?
    • April 10-13 GO meeting is immediately after this meeting; Mark should attend the software sessions

  • Pombe meeting, still need to decide what to do about PomBase/GO/curation demo