Gene expression

Gene expression analysis measures the abundance of mRNA molecules and gives insight into the regulation of the genes of interest. Machinery in the cell reads the sequence of the gene in groups of three bases. Each group of three bases (a codon) specifies one of the 20 amino acids used to build a protein, or a stop signal.
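As a rough illustration of reading a sequence in groups of three bases, the sketch below splits a coding sequence into codons. The function name split_codons and the example sequence are assumptions made for illustration only; the sketch assumes the sequence is already in the correct reading frame.

```python
def split_codons(sequence: str) -> list[str]:
    """Split a coding sequence into consecutive three-base codons.

    Trailing bases that do not fill a complete codon are dropped.
    """
    seq = sequence.upper()
    usable = len(seq) - len(seq) % 3  # ignore an incomplete final codon
    return [seq[i:i + 3] for i in range(0, usable, 3)]


# Illustrative sequence only, not taken from a real gene.
print(split_codons("ATGGCCAAGTAA"))
```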

Note that standards for RNA Seq gene expression are still under development.

Gene Expression Platforms:

  • microarray analysis
  • RNA Seq analysis: RNA Seq data contain information about both nucleotide sequence and gene expression.

Recommendations

Summary

  1. We recommend the existing format standards laid out by repositories such as NCBI (GEO) and EBI (ArrayExpress and ENA).
  2. We recommend using ontologies and controlled vocabularies to annotate the required metadata.

Data formats

For microarray analysis:

  • We recommend the existing format standards laid out by repositories such as NCBI (GEO) and EBI (ArrayExpress and ENA).

For RNA Seq analysis:

The NCBI SRA database (http://www.ncbi.nlm.nih.gov/Traces/sra/) is a primary repository for the actual sequence data, submitted in the form of FASTQ and/or BAM files. The EBI ENA is its European counterpart.
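For readers unfamiliar with FASTQ, the sketch below reads the common four-lines-per-record layout (header starting with "@", sequence, separator starting with "+", quality string). It is a minimal reader written for illustration, assuming uncompressed input with no line-wrapped sequences; the names FastqRecord and read_fastq are hypothetical and do not come from any repository tool.

```python
import io
from typing import Iterator, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    name: str
    sequence: str
    quality: str


def read_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield records from a FASTQ stream with four lines per record."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of input
        sequence = handle.readline().rstrip("\n")
        separator = handle.readline().rstrip("\n")
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        yield FastqRecord(header[1:], sequence, quality)


# Illustrative in-memory record; a real file would be opened with open().
example = io.StringIO("@read1\nACGT\n+\nIIII\n")
for record in read_fastq(example):
    print(record.name, record.sequence, record.quality)
```

A check like this can catch truncated files before submission, but it is not a substitute for the validation performed by the repositories themselves.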

For more information on formats:

Metadata

Metadata is important for all gene expression studies, whether they use microarray or RNA Seq data.

For BAM files, the following additional information is needed:

  • mapping software
  • mismatch settings
  • reference sequences used, such as IWGSC survey sequences, MIPS gene models, or a transcriptome assembly
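One way to keep these three items together with a BAM file is a small structured record that can be checked for gaps before submission. The sketch below is only an illustration: the class AlignmentMetadata, its field names, and the aligner name are hypothetical and do not correspond to any repository's required schema.

```python
from dataclasses import dataclass, asdict


@dataclass
class AlignmentMetadata:
    """Hypothetical record of how a BAM file was produced."""
    mapping_software: str      # name and version of the aligner
    mismatch_settings: str     # mismatch-related options passed to the aligner
    reference_sequences: str   # reference the reads were mapped against

    def missing_fields(self) -> list[str]:
        """Return the names of fields that were left empty."""
        return [name for name, value in asdict(self).items() if not value.strip()]


record = AlignmentMetadata(
    mapping_software="example-aligner 1.0",  # placeholder, not a real tool
    mismatch_settings="",                    # left empty on purpose
    reference_sequences="IWGSC survey sequences",
)
print(record.missing_fields())
```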

Please refer to this standards document for more information:

https://www.betacell.org/documents/administered/about/guidelines/ENCODE_BCBC_RNA-Seq_Standards_V01_20110503.pdf

Vocabularies

We recommend using ontologies and controlled vocabularies to annotate the required metadata. Please see the recommendations on the detailed page: Ontologies and Vocabularies

  • Plant Ontology terms to describe plant tissues and developmental stages
  • Plant Environment Ontology terms to describe the experimental conditions
  • Plant Stress Ontology terms (proposed) to describe treatments with pathogens and stress conditions
  • Gene Ontology terms, the standard for functional analysis
  • Microarray ontology (MO) terms mapped to the OBI/OBO foundry ontology terms – MGED ontology (http://bioportal.bioontology.org/ontologies/MO?p=classes)
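To show what annotating metadata with ontology terms can look like in practice, the sketch below checks that metadata values are well-formed Plant Ontology or Gene Ontology identifiers (prefix, colon, seven digits) rather than free text. The function invalid_term_ids and the metadata field names are assumptions made for illustration, and the check covers identifier format only; it does not confirm that a term actually exists in the ontology.

```python
import re

# Identifier patterns for two of the ontologies listed above; extend as needed.
ID_PATTERNS = {
    "PO": re.compile(r"^PO:\d{7}$"),  # Plant Ontology
    "GO": re.compile(r"^GO:\d{7}$"),  # Gene Ontology
}


def invalid_term_ids(annotations: dict[str, str]) -> list[str]:
    """Return metadata fields whose value is not a well-formed PO or GO identifier."""
    bad = []
    for field, term_id in annotations.items():
        prefix = term_id.split(":", 1)[0]
        pattern = ID_PATTERNS.get(prefix)
        if pattern is None or not pattern.match(term_id):
            bad.append(field)
    return bad


sample_annotations = {
    "tissue": "PO:0000000",             # placeholder identifier, not a real term
    "function": "GO:0008150",           # GO root term biological_process
    "developmental_stage": "seedling",  # free text, should be flagged
}
print(invalid_term_ids(sample_annotations))
```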

Written by: WDI working group
Published on:  02 October 2014
Updated on: 27 April 2015

 

1 Comment

  1. Chris Rawlings
    1 February 2016    

    It seems a bit inconsistent to be mentioning SRA as “the official repository” for RNASeq reads when earlier there is almost equal emphasis on ArrayExpress/ENA and GEO/SRA. I think this entry could do with being clearer that there are two international repositories in the USA and Europe and be more specific about the landing page for data submissions…

    Also the reference to https://www.betacell.org/documents/administered/about/guidelines/ENCODE_BCBC_RNA-Seq_Standards_V01_20110503.pdf as the basis for more information on standards does not take you to anything useful, but to a site on beta-cell biology that is no longer active.
