Version 50 (modified by gomidori, 10 years ago)


Next Meeting

Feedback from curation practical

  • Once community curation is started, would be good to flag which ones are community curated, and also which ones have been assessed by a curator.
  • Stress that people should annotate only to the most granular term; use short and friendly text boxes to inform people.
  • More options for phenotype evidence codes (what is "other"?)
  • What about batch submissions?
  • New feature: gene product complementation, e.g. K.O. of a gene in another organism where the pombe protein rescues or partly rescues the phenotype. This could come under the orthology box. Useful to all communities.
  • SGD displays expression data nicely e.g

Feedback from retreat

  • Genome visualization in Ensembl. A feature that allows scrolling left and right like in X-map would be useful
  • RNA-seq data (and ChIP-seq). Does Ensembl support squiggly data tracks or only boxes? Comparison between species would be useful, showing conserved regions, like in UCSC.

o Intra-species variation and SNPs – wiggle tracks for gene diversity

  • Multiple data-sets on tracks. Make them searchable. Show one as consensus and the others beneath. Data-sets should be searchable and suppressible.
  • Will our data be loaded into BioGRID? Many people who do not use PomBase may use BioGRID.
  • Could make a Pombe wiki where academics can freetype pombe facts. Different pages for different features/conditions or whatever. Suppress text from people who talk rubbish.
  • eQTL data – show in the genome browser and on gene pages. Human Ensembl may already do this. Might be difficult to do on tracks.
  • Protein modifications – show on the protein sequence. Also map out functional regions, catalytic residues.
  • All data should be downloadable.
  • Support a repository for alignments. Dan had problems with a publication once because there was nowhere he could publicly store his alignment.

o Genome scale alignments

  • Transcript boxes: these need to be referenced and show which data-sets have been used to define UTRs etc.

o Evidence codes could also be shown.
o If multiple alternatives the data could be shown on tracks.
o There should be links to papers.
o They could be searchable and suppressible.
o RNA seq data
o Different transcripts under different conditions could be shown

Locating terms
How do I locate/quantify which GO terms or extensions I have used? (For implementing changes to syntax, going back to make changes, etc.)
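
One possible approach, assuming the annotations can be exported as a GAF 2.0 file (tab-separated, GO ID in column 5, annotation extension in column 16), is to tally them with a short script. This is only a sketch: `tally_gaf`, `make_row` and the sample rows are made up for illustration, and the IDs are placeholders.

```python
from collections import Counter
import re


def tally_gaf(lines):
    """Count GO IDs (column 5) and extension relations (column 16)."""
    terms, relations = Counter(), Counter()
    for line in lines:
        if not line.strip() or line.startswith("!"):
            continue  # skip blank lines and '!' header/comment lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 16:
            continue  # too short to be a GAF 2.0 annotation row
        terms[cols[4]] += 1
        # Extensions are written relation(ID); count each relation name.
        for relation in re.findall(r"([A-Za-z_]+)\(", cols[15]):
            relations[relation] += 1
    return terms, relations


def make_row(go_id, extension=""):
    """Build a 17-column row with only the two columns of interest filled."""
    cols = [""] * 17
    cols[4], cols[15] = go_id, extension
    return "\t".join(cols)


# Made-up rows for illustration only; the IDs are placeholders.
sample = [
    "!gaf-version: 2.0",
    make_row("GO:0000001", "has_substrate(EXAMPLE:1)"),
    make_row("GO:0000001"),
    make_row("GO:0000002", "not_during(EXAMPLE:2)"),
]
terms, relations = tally_gaf(sample)
for go_id, count in terms.most_common():
    print(go_id, count)
for relation, count in relations.most_common():
    print(relation, count)
```

With a real file, `tally_gaf(open(path))` would give the same two counters; the per-term counts show where a syntax change would have the most impact.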

Do it on the weeks that we have the calls? Or something else? If we plan in advance we won't have the same issues with room bookings...
update 2011-12-14: sorted - see email from Val.

NOT annotations
Again: is it OK to put just 'NOT' in col 16, and then tack on any other extensions as they apply?
response: Not quite -- just plain 'NOT' can only go in col 4. There are a few more specific relations in the collection that can be used to negate extensions (not_during, not_happens_during, not_exists_during), and we can request more if we need. (MAH 2011-12-14)
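
The rule in the response could be checked mechanically. The sketch below assumes GAF 2.0 rows (qualifier in column 4, extension in column 16) and flags any row where a bare 'NOT' turns up in column 16. The function names and sample rows are hypothetical, and it assumes column 16 entries are separated by '|' or ','.

```python
import re


def rows_with_bare_not(lines):
    """Return 1-based line numbers whose column 16 contains a bare 'NOT'."""
    flagged = []
    for number, line in enumerate(lines, start=1):
        if line.startswith("!"):
            continue  # header/comment line
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 16:
            continue  # not a full annotation row
        # Assumed separators between column 16 entries: '|' and ','.
        entries = [e.strip() for e in re.split(r"[|,]", cols[15])]
        if any(e.upper() == "NOT" for e in entries):
            flagged.append(number)
    return flagged


def make_row(qualifier="", extension=""):
    """Build a 17-column row with only columns 4 and 16 filled."""
    cols = [""] * 17
    cols[3], cols[15] = qualifier, extension
    return "\t".join(cols)


# Made-up rows; the IDs are placeholders.
sample = [
    make_row(qualifier="NOT"),                           # plain NOT in col 4: fine
    make_row(extension="NOT"),                           # plain NOT in col 16: flagged
    make_row(extension="not_during(EXAMPLE:1)"),         # negating relation: fine
    make_row(extension="NOT,has_substrate(EXAMPLE:2)"),  # flagged
]
print(rows_with_bare_not(sample))
```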

Ste11 paper
PMID: 10867006. Only annotated "HMG box domain binding" - is it possible to capture more information? E.g. the C-term binds outside of the consensus sequence, loop 1/2 binds the consensus.

Protein IDs
PR:xxx and PRO:xxx: when to use which ID?
Also, certain gene products are awkward: repeats that are cleaved into multiple proteins after transcription/translation (P-factor), multiple genes for the same product (M-factor), or cases where you want to refer to a family of proteins (histones). What is best to use? Gene ID (multiple annotations)? Protein ID?
(partial) response: use PR:xxx IDs; they're much more traceable. I think, but we should confirm, that the protein ontology folks would be happy to add entries for the cleaved precursors and final products. (MAH 2011-12-14)

Get artemis running
I'm still mac inept!

Protein sequence features
another example in PMID 21118960 of promoter mutagenesis

cnx1: protein cleavage site (protein feature, SO:0100011) within aa residues 438-455

This is a bit like the NLS situation I described earlier.

Midori has supplied Kim with a mapping of these existing protein feature terms to SO terms.

So I added this in Artemis as:

/controlled_curation="term=protein sequence feature, cleavage site; M438-K455|region; db_xref=PMID:19606215; date=20111027"

(Is this what you meant: the cleavage site is somewhere in the region of 438-455?)

So this one will go through to PomBase. We will do protein sequence features like this for a while, as it will be some time before these can be done fully in the curation tool, and there aren't too many of them.

Note that if we use any SO terms which aren't in the list, we will need to supply a mapping to Kim to make them appear in PomBase.
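
To see what a qualifier like the one above breaks down into, here is a rough sketch that splits the text between the quotes on ';'. It is not how Artemis or the PomBase loader parse the qualifier; `parse_controlled_curation` is a hypothetical helper, and collecting parts without '=' (such as the residue range) in a separate list is an assumption made for illustration.

```python
def parse_controlled_curation(value):
    """Split the qualifier text into keyed fields and unkeyed parts."""
    fields, unkeyed = {}, []
    for part in value.split(";"):
        part = part.strip()
        if not part:
            continue
        if "=" in part:
            # Only the first '=' separates key from value.
            key, _, val = part.partition("=")
            fields[key.strip()] = val.strip()
        else:
            # Assumption: parts with no key are kept as-is.
            unkeyed.append(part)
    return fields, unkeyed


qualifier = ("term=protein sequence feature, cleavage site; M438-K455|region; "
             "db_xref=PMID:19606215; date=20111027")
fields, unkeyed = parse_controlled_curation(qualifier)
print(fields["term"])
print(unkeyed)
print(fields["db_xref"], fields["date"])
```

A check along these lines could list every term used in protein sequence feature qualifiers, which would help spot SO terms that still need a mapping.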

1-2 December, 2011

14-15 November, 2011 (3-4 November, 2011)

done - Retreat
Stuff to talk about and discuss at the retreat

done - has_substrate
use systematic ID for now. Is inferred from context. Can add .1 for transcripts (mRNA substrates).
At the moment I annotate the has_substrate relationship on the underlying assumption that the user should be able to infer whether the substrate is a protein product or RNA (for instance kinase X has the substrate GeneDBSpombe Y). I suspect that this might be quite an inaccurate way of doing it and might come back to haunt us in the future (especially if we want to build automated pathway overviews etc). Should I actually be cross-referencing PRO IDs (and what about mRNA)?

done - TermGenie
How do I know what things to run through TermGenie? Just things like "regulation" etc relating to existing terms?

done - helpdesk
Will go through this next time.
a tour

done - Relations
Keep this as it is for MF. Requires regulation by for process control.
make list of ontologies used for relations and examples
The 'requires_regulator' term is very specific (the annotated gene product is a catalytic subunit of a complex, and requires a regulatory subunit for activity). Should it be applicable only to complexes? E.g. if it is shown that an MF doesn't occur when another gene is not present (which might be upstream...), should this then not be used (and only the other gene product annotated as being involved in the process)?

17-18 October, 2011

Capturing diploid phenotypes
There is a CL term for diploid (CL:0000415 - A cell whose nucleus has two haploid genomes)
This does not distinguish between homozygous and heterozygous. I did ask for a term but am not sure they will make one. If not --> allele=heterozygous/homozygous?

eg PMID 12557273 
pik3 deletion --> sterile
pik3 deletion in diploid --> inviable spores produced
pik3 heterozygote --> all spores viable (even pik3 deletion)
therefore pik3 is required for spore formation but not for germination

What is the standard
Perhaps more of a philosophical question, but what are the standard conditions? E.g. MM or YE, exponential growth or stationary phase, etc.
I think it would be nice to have the standard conditions defined, especially for phenotype annotations. Of course not to the fine experimental set-up details, but capturing general deviations from the standard would, in my opinion, be more rigorous.

Cell morphology
It would be useful to get some (image) examples of shapes. E.g. what is "snowman"? And what morphology do these cells have in D? (8380233)

  • Make a wiki page linking to different examples. Name examples FYPO[term]_img. Record reference.

Phenotype terms
To discuss: DNA binding terms and enzymatic activity.

||sxa2||Normal carboxypeptidase activity||K309A||
This mutant cannot process P-factor. It was an in vitro assay as opposed to monitoring the pheromone response

Phenotype 2
For the term "abolished protein modifications", how do you specify which protein modification is abolished if there are multiple annotated? For the time being I have just used qualifier=MOD...

  • add child terms

What are appropriate annotations to "regulation" terms?
I.e. do you have to have some level of the response initially, which is then further upregulated by a gene product?
Or, if the loss of a gene causes the absence of a response, is it then positively regulating? I'm guessing not - it is just essential for the pathway?

Modifying frames