Introduction
Microarray studies now produce large volumes of data, which are often complex and hard to analyse. Many factors (e.g. the treatment of biological materials or the normalisation algorithm) can have a great impact on the result of a study. This is why the European Bioinformatics Institute (EBI)[ebib], through its Microarray group[ebia], and the Microarray Gene Expression Data society (MGED society)[mgec] defined the minimum set of information that must be supplied for data to be evaluated correctly: the Minimum Information About a Microarray Experiment[miab] (MIAME, version 1.0, November 2000). This set covers several technologies (in situ-synthesised oligonucleotides, spotted cDNA clones, and spotted Polymerase Chain Reaction (PCR) products generated from cDNA clones) and several microarray applications: Gene Expression [GE] (the majority of cases), Comparative Genomic Hybridisation [CGH], binding-site identification by chromatin immunoprecipitation [ChIP], and genotyping [Single Nucleotide Polymorphism, SNP].
To complement this set, the MicroArray and Gene Expression group[maga] (MAGE) provides a standard for representing microarray expression data that facilitates the exchange of microarray information. The standard, developed through the Object Management Group[omg] (OMG), consists of a data exchange model, the Microarray Gene Expression Object Model[magg] (MAGE-OM), and a data exchange format, the Microarray Gene Expression Markup Language[magb] (MAGE-ML), for microarray expression experiments. The Array Design File[adfa] (ADF) format was designed to make submitting array design data, a subset of MIAME, simpler than writing a MAGE-ML file. It is a simple, human-readable file format that requires no specific knowledge of MAGE-OM. Both formats are used to submit array design data, usually for public release in association with an article.
This project is part of the work of the Microarray group[ebia] at the EBI, alongside its tools for the submission, storage and analysis of microarray data (MAGE-ML): MIAME-Express, Array-Express and Expression Profiler. The aim of the project is to develop a standalone application that converts ADF to MAGE-ML and MAGE-ML to ADF. This application will help spread the MIAME and MAGE standards for microarray data exchange.
Submitted array design data are used as follows (in decreasing order of usage):
- Per application:
  - Gene Expression (GE)
  - Comparative Genomic Hybridisation (CGH)
  - SNPs
  - Transcription site localisation (ChIP-on-chip or DamID)
  - Affymetrix
- Per technology:
  - Oligonucleotide
  - cDNA clones
  - Polymerase Chain Reaction products
  - CpG methylation
  - Tissue
In addition, an array design can be devised so that sequences from more than one species are deposited on the array.
Consequently, the project will focus mainly on the gene expression use case, which accounts for the bulk of submissions. However, the specification should also provide at least minimal handling of the other applications of microarray technology in order to meet the needs of the community.
PierreMarguerite-EBI,pierre@ebi.ac.uk