ComputationTutorial

 


OK, this is where it starts to get interesting. The bx-computation:computation DTD is designed to represent the results of any computation whose output involves intervals on a sequence. This includes BLAST, FASTA, HMMs, Smith-Waterman, etc.

The DTD uses all of the DTDs we've previously encountered. Note that it doesn't import bx-link, because that's already imported by bx-seq.

The idea is that when a computation is done, the details about it are stored in the bx-computation:computation element. This includes the program and version, any parameters used, the database the analysis was run against, the date, etc. The results of that computation are FeatureDtd objects, stored in a bx-computation:feature_set. The bx-computation:feature_set is basically a container for a group of bx-computation:feature_span objects, each of which contains one bx-feature:feature object. Holding each bx-feature:feature in a bx-computation:feature_span lets you extend the feature with properties that are specific to a computational analysis, like a score.

This DTD's URL is http://www.bioxml.org/dtds/current/computation.dtd.

<!ENTITY % feature SYSTEM "feature.dtd">
<!ENTITY % link SYSTEM "link.dtd">

<!ELEMENT bx-computation:computation (
    bx-computation:type?,
    bx-computation:description?,
    bx-computation:author?,
    bx-computation:database?,
    bx-computation:program,
    bx-computation:creation_date,
    bx-computation:version?,
    bx-computation:parameter*,
    bx-link:link*,
    bx-computation:feature_set*)
>
<!ATTLIST bx-computation:computation
    xmlns:bx-computation CDATA #FIXED "http://www.bioxml.org/dtds/computation/v0_1"
    bx-computation:id ID #REQUIRED
    bx-computation:seq IDREF #IMPLIED
>

<!ELEMENT bx-computation:feature_set (
    bx-computation:type?,
    bx-computation:description?,
    bx-link:link*,
    bx-computation:feature_span*)
>
<!ATTLIST bx-computation:feature_set
    bx-computation:id ID #REQUIRED
>

<!ELEMENT bx-computation:feature_span (
    bx-computation:score*,
    bx-link:link*,
    bx-feature:feature
)>
<!ATTLIST bx-computation:feature_span
    bx-computation:id ID #REQUIRED
    bx-computation:parent IDREF #IMPLIED
>

<!ELEMENT bx-computation:type (#PCDATA)>
<!ELEMENT bx-computation:program (#PCDATA)>
<!ELEMENT bx-computation:author (#PCDATA)>
<!ELEMENT bx-computation:database (#PCDATA)>
<!ELEMENT bx-computation:creation_date (#PCDATA)>
<!ELEMENT bx-computation:version (#PCDATA)>
<!ELEMENT bx-computation:parameter (
    bx-computation:key,
    bx-computation:value
)>
<!ELEMENT bx-computation:key (#PCDATA)>
<!ELEMENT bx-computation:value (#PCDATA)>

%feature;
%link;
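
To make the content model a little more concrete, here is a minimal Python sketch that builds a bx-computation:computation element with its children in the order the DTD requires. It uses only the standard library's xml.etree.ElementTree. The helper names (q, build_computation) and the sample values ("c1", "blastn", "2.0", "expect") are made up for illustration and are not part of BioXML.

```python
import xml.etree.ElementTree as ET

# Namespace URI taken from the #FIXED xmlns:bx-computation attribute in the DTD.
COMP_NS = "http://www.bioxml.org/dtds/computation/v0_1"
ET.register_namespace("bx-computation", COMP_NS)


def q(name):
    """Return the namespace-qualified form of a bx-computation name (hypothetical helper)."""
    return "{%s}%s" % (COMP_NS, name)


def build_computation(comp_id, program, creation_date, version=None, parameters=()):
    """Build a computation element, emitting children in the DTD's declared order."""
    comp = ET.Element(q("computation"), {q("id"): comp_id})
    # program and creation_date are the only children the DTD makes mandatory.
    ET.SubElement(comp, q("program")).text = program
    ET.SubElement(comp, q("creation_date")).text = creation_date
    if version is not None:
        ET.SubElement(comp, q("version")).text = version
    # Each parameter is a key/value pair, in that order.
    for key, value in parameters:
        param = ET.SubElement(comp, q("parameter"))
        ET.SubElement(param, q("key")).text = key
        ET.SubElement(param, q("value")).text = value
    return comp


# Illustrative values only; they do not come from a real analysis.
comp = build_computation("c1", "blastn", "08_07_2000", version="2.0",
                         parameters=[("expect", "10")])
print(ET.tostring(comp, encoding="unicode"))
```

Because a DTD content model is order-sensitive, the sketch fixes the child order inside the builder rather than leaving it to the caller.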

And now the example. Admittedly, it is rather lame; I will try to work up some better ones soon using real data. If you bear with me for now, though, this example shows how bx-link, bx-seq, bx-feature and bx-computation all come together to form a useful whole.

You can validate this file with the Java Xerces parser (assuming you're running Linux/Unix) using the command:

  • java sax.SAXCount -Nwv http://www.bioxml.org/samples/comp.xml

The -v flag means validate, -w means warm up the parser before timing, and -N means turn off namespaces so that fully qualified names don't give weird errors. This program doesn't do anything other than validate the document, count its tags, and time the parse.
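
If Xerces isn't handy, a rough structural check can be sketched in Python with the standard library. This is not DTD validation: it only tests a few constraints read off the DTD (the root element, the required bx-computation:id attribute, and exactly one program and one creation_date child). The function name check_computation and the two inline test documents are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace URI taken from the DTD's #FIXED xmlns:bx-computation attribute.
COMP_NS = "http://www.bioxml.org/dtds/computation/v0_1"


def check_computation(xml_text):
    """Return a list of problems found; an empty list means these checks passed."""
    root = ET.fromstring(xml_text)
    problems = []
    if root.tag != "{%s}computation" % COMP_NS:
        problems.append("root element is not bx-computation:computation")
    if root.get("{%s}id" % COMP_NS) is None:
        problems.append("missing bx-computation:id attribute")
    # The DTD lists program and creation_date without '?' or '*', so each must occur once.
    for required in ("program", "creation_date"):
        count = len(root.findall("{%s}%s" % (COMP_NS, required)))
        if count != 1:
            problems.append("expected exactly one %s, found %d" % (required, count))
    return problems


# Two tiny hypothetical documents: one complete, one missing its required children.
GOOD = (
    '<bx-computation:computation xmlns:bx-computation="%s" bx-computation:id="b47">'
    "<bx-computation:program>exonfinder</bx-computation:program>"
    "<bx-computation:creation_date>08_07_2000</bx-computation:creation_date>"
    "</bx-computation:computation>" % COMP_NS
)
BAD = (
    '<bx-computation:computation xmlns:bx-computation="%s" bx-computation:id="b48"/>'
    % COMP_NS
)

print(check_computation(GOOD))
print(check_computation(BAD))
```

A real validator also checks child order, ID uniqueness, and IDREF targets, none of which this sketch attempts.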

<?xml version="1.0"?>
<!DOCTYPE bx-computation:computation SYSTEM "/home/brad/tmp/dtds/computation.dtd">
<bx-computation:computation
    xmlns:bx-computation="http://www.bioxml.org/dtds/computation/v0_1"
    bx-computation:id="b47">
  <bx-computation:program>exonfinder</bx-computation:program>
  <bx-computation:creation_date>08_07_2000</bx-computation:creation_date>
  <bx-link:link xmlns:bx-link="http://www.bioxml.org/dtds/v0.1/bx-link">
    <bx-link:dbxref>
      <bx-link:database bx-link:url='http://www.genbank.com'>Genbank</bx-link:database>
      <bx-link:id bx-link:field='accession'>ae345</bx-link:id>
    </bx-link:dbxref>
  </bx-link:link>

  <bx-computation:feature_set bx-computation:id="b51">
    <bx-computation:feature_span bx-computation:id="b52">
      <bx-feature:feature
          xmlns:bx-feature="http://www.bioxml.org/feature/v0_1"
          bx-feature:id='b46'>
        <bx-feature:type>exon1</bx-feature:type>
        <bx-feature:seq_relationship>
          <bx-feature:span>
            <bx-feature:start>5</bx-feature:start>
            <bx-feature:end>25</bx-feature:end>
          </bx-feature:span>
        </bx-feature:seq_relationship>
      </bx-feature:feature>
    </bx-computation:feature_span>
  </bx-computation:feature_set>
</bx-computation:computation>
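
To show how the nesting in the sample above can be walked by a program, here is a small Python sketch using the standard library's xml.etree.ElementTree. It reads a trimmed copy of the sample (the DOCTYPE and the bx-link:link element are left out for brevity) and pulls out the program name and each feature's type, start and end. The function name extract_spans and the short prefixes comp and feat are arbitrary choices for this sketch, not BioXML conventions.

```python
import xml.etree.ElementTree as ET

# Namespace URIs copied from the sample document; the short prefixes are local to this sketch.
NS = {
    "comp": "http://www.bioxml.org/dtds/computation/v0_1",
    "feat": "http://www.bioxml.org/feature/v0_1",
}

# Trimmed copy of the sample (DOCTYPE and bx-link:link omitted).
SAMPLE = """<?xml version="1.0"?>
<bx-computation:computation
    xmlns:bx-computation="http://www.bioxml.org/dtds/computation/v0_1"
    bx-computation:id="b47">
  <bx-computation:program>exonfinder</bx-computation:program>
  <bx-computation:creation_date>08_07_2000</bx-computation:creation_date>
  <bx-computation:feature_set bx-computation:id="b51">
    <bx-computation:feature_span bx-computation:id="b52">
      <bx-feature:feature
          xmlns:bx-feature="http://www.bioxml.org/feature/v0_1"
          bx-feature:id="b46">
        <bx-feature:type>exon1</bx-feature:type>
        <bx-feature:seq_relationship>
          <bx-feature:span>
            <bx-feature:start>5</bx-feature:start>
            <bx-feature:end>25</bx-feature:end>
          </bx-feature:span>
        </bx-feature:seq_relationship>
      </bx-feature:feature>
    </bx-computation:feature_span>
  </bx-computation:feature_set>
</bx-computation:computation>
"""


def extract_spans(xml_text):
    """Return (program, [(feature type, start, end), ...]) for one computation."""
    root = ET.fromstring(xml_text)
    program = root.findtext("comp:program", namespaces=NS)
    spans = []
    # Walk computation -> feature_set -> feature_span -> feature.
    for feature in root.iterfind(
            "comp:feature_set/comp:feature_span/feat:feature", NS):
        ftype = feature.findtext("feat:type", namespaces=NS)
        for span in feature.iterfind("feat:seq_relationship/feat:span", NS):
            start = int(span.findtext("feat:start", namespaces=NS))
            end = int(span.findtext("feat:end", namespaces=NS))
            spans.append((ftype, start, end))
    return program, spans


print(extract_spans(SAMPLE))
```

For this sample the call returns ('exonfinder', [('exon1', 5, 25)]). The parser here is namespace-aware, so it matches on namespace URIs rather than on the literal bx- prefixes that the DTD relies on.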

The final installment in the tutorial (for now) is the AnnotationTutorial. This looks at the annotation DTD which, of course, uses all of the DTDs we've seen until now.


This page last edited on 13 Sep 2000
Version: 1.3
     
 
 