Genome Annotation Markup Elements (GAME)

Download

bioxml-game.tar.gz

Here's an HTML version of the GAME DTD.

Please contact Suzanna Lewis for more information.

Motivation

The motivation for GAME is a desire to provide a syntax, together with some simple tools, that will facilitate the exchange of genomic annotations. It will enable genome centres, model organism databases, and individual researchers to clearly specify the conclusions they have drawn from their analyses of primary sequence data and to share these XML descriptions with one another. The development of GAME was necessary to allow the Drosophila Genome Project to coordinate its efforts with Celera, which required a stable and expressive interchange format.

GAME and GFF

GAME complements the existing work that has been done with GFF (Gene-Finding Features). GFF is largely targeted at standardizing the output of gene prediction software. GAME includes these types of sequence descriptions, but extends beyond them to include curated results as well. Tools will be added to convert between these two syntaxes (although some information is likely to be lost when going from GAME to GFF).
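To illustrate why a GAME-to-GFF conversion is likely to lose information, here is a minimal sketch in Python. It is not one of the planned conversion tools. The XML element names (annotation, feature_span, comment, and so on), the sample data, and the function to_gff_lines are all hypothetical and chosen only for illustration; they are not taken from the GAME DTD. The sketch flattens each span into a nine-column, tab-separated GFF-style line, which has no place for the curator's comment.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified annotation document. The element names are
# illustrative assumptions and are NOT taken from the GAME DTD.
SAMPLE_XML = """\
<game>
  <annotation id="gene1">
    <name>example-gene</name>
    <comment>Curated: supported by cDNA evidence</comment>
    <feature_span type="exon">
      <seq>demo_contig</seq>
      <start>100</start>
      <end>250</end>
      <strand>+</strand>
    </feature_span>
    <feature_span type="exon">
      <seq>demo_contig</seq>
      <start>400</start>
      <end>520</end>
      <strand>+</strand>
    </feature_span>
  </annotation>
</game>
"""


def to_gff_lines(xml_text, source="curated"):
    """Flatten each feature_span into one tab-separated GFF-style line.

    Columns: seqname, source, feature, start, end, score, strand,
    frame, group. Missing values are written as ".".
    """
    root = ET.fromstring(xml_text)
    lines = []
    for annotation in root.iter("annotation"):
        # The annotation id is the only link between spans that survives.
        group = annotation.get("id", ".")
        for span in annotation.iter("feature_span"):
            fields = [
                span.findtext("seq", default="."),
                source,
                span.get("type", "."),
                span.findtext("start", default="."),
                span.findtext("end", default="."),
                ".",  # score: not present in the sample document
                span.findtext("strand", default="."),
                ".",  # frame: not present in the sample document
                group,
            ]
            lines.append("\t".join(fields))
    return lines


if __name__ == "__main__":
    for line in to_gff_lines(SAMPLE_XML):
        print(line)
```

In this sketch the two exons keep their coordinates and their shared group, but the name and comment elements have no column to go to and are dropped, which is the kind of loss the paragraph above anticipates.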

GAME and EMBL/GenBank formats

GAME does not aim to be a replacement for the established flat file formats from GenBank or EMBL, or the ASN.1 model of the GenBank database. GAME aims to be an interchange format for annotations which can make the necessary distinctions to allow a full interchange of data between genome centres. The flat file formats are focused on archival storage of the DNA sequence as submitted, and the ASN.1 model provides a rich object model for manipulation of these sequences with the NCBI toolkit. Conversion tools between the formats, covering the information that they share, are in development, but there is no one-to-one mapping between a GAME document and the GenBank/EMBL/DDBJ formats.

GAME and the CORBA LSR standards

Ewan Birney has started a FAQ about GAME. Please read it if you are unfamiliar with the project. The CORBA Life Science Research (LSR) group is considering a standard for biosequence analysis. This has considerable overlap with the GAME document in terms of the information which both standards define, but because CORBA is focused on interfaces, and therefore on method definitions, while XML is focused on data definition, the two standards are largely orthogonal. We believe that GAME is a sensible data-oriented view of a number of interfaces defined in the LSR standard, and hope to define a sensible mapping between the two standards soon.

Plans

Since this effort has just begun there is a consider

Definitions

These basic entities and their relationships are described below.