BioXMLTutorial

 

BioXML Wiki   

As you've probably sussed by now, the idea behind the bioxml strategy is to create several small dtds that can be combined to attack larger problems. This tutorial steps through many of the bioxml dtds to build a relatively complex document. The parser I'll talk about is the Apache Group's Xerces, which is currently available in Java and C++, and soon in Perl.

NOTE: Each of the bioxml dtds has its own NameSpace, with a prefix that starts with bx (short for bioxml). So there is a Bx-Seq NameSpace, the top element of which is bx-seq:seq.

NOTE: All of the dtds and example files in this tutorial are available at https://www.bioxml.org/dtds/ and https://www.bioxml.org/xml-samples/ (or will be very soon).

The first dtd we'll look at is Bx-Link:link. Its NameSpace is https://www.bioxml.org/dtds/bx-link/v0_1 . This dtd is used by most other bioxml dtds. It allows the linking of data:

  • within the same xml document - ref_link
  • across the web to another xml data source - simple_xlink
  • across the web to a non-xml data source - dbxref

This dtd's url is https://www.bioxml.org/dtds/current/link.dtd.

<!ELEMENT bx-link:link (
bx-link:ref_link | bx-link:simple_xlink | bx-link:dbxref
)>
<!ATTLIST bx-link:link
xmlns:bx-link CDATA #FIXED "/dtds/bxlink/v0_1/index.html"
>

All linking to xml datasources is done using IDREFs and/or XPointers. Within a document, you simply add an IDREF attribute which points at another element's ID attribute.

NOTE: The ref_link, simple_xlink and dbxref have optional xmlns attributes in case another dtd imports only that element and not the entire bx-link dtd.

<!ELEMENT bx-link:ref_link EMPTY>

<!ATTLIST bx-link:ref_link
xmlns:bx-link CDATA "/dtds/v01/bxlink/index.html"
bx-link:ref IDREF #REQUIRED
bx-link:element_name CDATA #REQUIRED
>
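To make the ref_link idea concrete, here is a minimal Python sketch (standard library only) that resolves ref_links inside one document. The record, seq and feature elements and the plain "id" attribute are made up for illustration, since which attribute is the ID depends on the dtd in use. The parser here does not validate, so this only shows the lookup, not the IDREF checking a validating parser would do.

```python
import xml.etree.ElementTree as ET

# Namespace URI as it appears in the ref_link ATTLIST above.
BX_LINK = "/dtds/v01/bxlink/index.html"

# Hypothetical document: <record>, <seq>, <feature> and the "id" attribute
# are placeholders, not part of any bioxml dtd shown in this tutorial.
DOC = """<record xmlns:bx-link="/dtds/v01/bxlink/index.html">
  <seq id="seq1">ACGT</seq>
  <feature>
    <bx-link:ref_link bx-link:ref="seq1" bx-link:element_name="seq"/>
  </feature>
</record>"""


def resolve_ref_links(root, id_attr="id"):
    """Return (ref_link, target) pairs; target is None if nothing has that id."""
    # Index every element that carries the assumed ID attribute.
    by_id = {el.get(id_attr): el for el in root.iter() if el.get(id_attr) is not None}
    pairs = []
    for link in root.iter(f"{{{BX_LINK}}}ref_link"):
        ref = link.get(f"{{{BX_LINK}}}ref")
        pairs.append((link, by_id.get(ref)))
    return pairs


if __name__ == "__main__":
    for link, target in resolve_ref_links(ET.fromstring(DOC)):
        wanted = link.get(f"{{{BX_LINK}}}element_name")
        found = target.tag if target is not None else None
        print("ref_link expects", wanted, "and found", found)
```

The element_name attribute is only compared by eye here; a stricter consumer could refuse a link whose target tag does not match it.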

To go across the web to an XML resource, you need to use XLinks and XPointers. An XLink is similar to an html link, but you have much more control. It references an xml datasource. An XPointer extends the url of the XLink and tells the server which section of the referenced document to return. So:

href="https://www.xmldatasource.com/documentOfInterest.xml#xpointer(id('XmlElementId'))"

XLinks and XPointers are still undersupported, unfortunately, but they're now both finished W3C recommendations, I believe, so hopefully we'll see some progress soon. In the meantime, I expect any bioxml dataservers will fudge an XPointer implementation. They only need to support id: they simply return the element (and its children) with that id.
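The kind of fudge I have in mind can be sketched in a few lines of Python. This is only an illustration, not how any real bioxml dataserver works: it handles nothing but the xpointer(id('...')) form, the function names are my own, and it assumes the ID attribute is literally called "id".

```python
import re
import xml.etree.ElementTree as ET
from urllib.parse import unquote, urldefrag

# Only the single-quoted xpointer(id('...')) form is recognised.
_ID_POINTER = re.compile(r"^xpointer\(id\('([^']+)'\)\)$")


def xpointer_id(href):
    """Return the id named in an xpointer(id('...')) fragment, or None."""
    fragment = unquote(urldefrag(href).fragment)
    match = _ID_POINTER.match(fragment)
    return match.group(1) if match else None


def element_by_id(root, wanted, id_attr="id"):
    """Return the first element whose assumed ID attribute equals wanted."""
    for el in root.iter():
        if el.get(id_attr) == wanted:
            return el
    return None


if __name__ == "__main__":
    # Hypothetical document held by the server.
    doc = '<seqs><seq id="seq1"><name>first</name></seq><seq id="seq2"/></seqs>'
    href = "https://www.xmldatasource.com/documentOfInterest.xml#xpointer(id('seq1'))"
    hit = element_by_id(ET.fromstring(doc), xpointer_id(href))
    # Serialising the hit gives the element together with its children.
    print(ET.tostring(hit, encoding="unicode"))
```

Any other XPointer expression simply yields None from xpointer_id, which a server could turn into an error response.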

For a good introduction to xlinks and xpointers, check out: https://www.brics.dk/~amoeller/XML/linking.html

<!ELEMENT bx-link:simple_xlink (#PCDATA)>

<!ATTLIST bx-link:simple_xlink
xmlns:bx-link CDATA "/dtds/bxlink/v0_1/index.html"
xmlns:xlink CDATA #FIXED "https://www.w3.org/1999/xlink"
xlink:type CDATA #FIXED "simple"
xlink:href CDATA #REQUIRED
xlink:role CDATA #IMPLIED
xlink:title CDATA #IMPLIED
xlink:show (embed|replace|new) #IMPLIED
xlink:actuate (auto|user) "user"
>

The other type of link is a more typical database cross-reference (dbxref). This just specifies a database name, url and unique id for the entry you are looking for.

<!ELEMENT bx-link:dbxref (
bx-link:database ,
bx-link:id*
)>
<!ATTLIST bx-link:dbxref
xmlns:bx-link CDATA #FIXED
"/dtds/v01/bxlink/index.html"
>

<!ELEMENT bx-link:database (#PCDATA)>
<!ATTLIST bx-link:database
bx-link:url CDATA #IMPLIED
>

<!ELEMENT bx-link:id (#PCDATA)>
<!ATTLIST bx-link:id
bx-link:field CDATA #IMPLIED
>
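Since the tutorial's sample file only shows a simple_xlink, here is a hedged sketch of what a dbxref instance might look like and how you could read it from Python. The database name, url, field names and id values are placeholders I made up; only the element and attribute names come from the dtd above.

```python
import xml.etree.ElementTree as ET

# Namespace URI as it appears in the dbxref ATTLIST above.
BX_LINK = "/dtds/v01/bxlink/index.html"

# Hypothetical instance: every value in here is a placeholder.
DBXREF = """<bx-link:dbxref xmlns:bx-link="/dtds/v01/bxlink/index.html">
  <bx-link:database bx-link:url="https://db.example.org/">ExampleDB</bx-link:database>
  <bx-link:id bx-link:field="accession">X00001</bx-link:id>
  <bx-link:id bx-link:field="version">1</bx-link:id>
</bx-link:dbxref>"""


def q(local):
    """Build the {namespace}local name ElementTree uses for bx-link names."""
    return f"{{{BX_LINK}}}{local}"


def parse_dbxref(elem):
    """Collect the database name, its optional url, and all (field, id) pairs."""
    database = elem.find(q("database"))
    ids = [(i.get(q("field")), (i.text or "").strip()) for i in elem.findall(q("id"))]
    return {
        "database": (database.text or "").strip(),
        "url": database.get(q("url")),  # None when the optional url is absent
        "ids": ids,
    }


if __name__ == "__main__":
    print(parse_dbxref(ET.fromstring(DBXREF)))
```

Because bx-link:id is declared with *, the list of ids may be empty, which this sketch returns as an empty list.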

Here is a sample xml file which shows a bx-link:simple_xlink. You can validate this file with your java xerces (assuming you're running linux/unix) with the command:

  • java sax.SAXCount -Nwv https://www.bioxml.org/samples/link.xml

The -v flag means validate, -w means warm up the parser before timing, and -N means turn off namespaces so that fully qualified names don't give weird errors. This program doesn't do anything other than validate the document, count its tags, and time the parsing.

<?xml version="1.0"?>
<!DOCTYPE bx-link:link SYSTEM "/home/brad/tmp/dtds/link.dtd">

<bx-link:link
xmlns:bx-link="/dtds/v01/bxlink/index.html">
<bx-link:simple_xlink
xmlns:xlink="https://www.w3.org/1999/xlink"
xlink:type="simple"
xlink:href="https://www.bradmarshall.com#xpointer(id('seq1'))"
xlink:role="association"
xlink:title="Brad's seq1"
xlink:show="embed"
xlink:actuate="user"
bx-link:element_name="seq"
>seq1</bx-link:simple_xlink>

</bx-link:link>
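If you'd rather poke at the sample from a script than from SAXCount, something like the following Python sketch reads the link back out. It uses the standard library parser, which does not validate against the dtd, so I've left the DOCTYPE line out of the embedded copy. The function name is my own.

```python
import xml.etree.ElementTree as ET

# Namespace URIs copied from the sample file above.
BX_LINK = "/dtds/v01/bxlink/index.html"
XLINK = "https://www.w3.org/1999/xlink"

SAMPLE = """<?xml version="1.0"?>
<bx-link:link
xmlns:bx-link="/dtds/v01/bxlink/index.html">
<bx-link:simple_xlink
xmlns:xlink="https://www.w3.org/1999/xlink"
xlink:type="simple"
xlink:href="https://www.bradmarshall.com#xpointer(id('seq1'))"
xlink:role="association"
xlink:title="Brad's seq1"
xlink:show="embed"
xlink:actuate="user"
bx-link:element_name="seq"
>seq1</bx-link:simple_xlink>
</bx-link:link>"""


def describe_simple_xlink(link_root):
    """Return the interesting parts of the simple_xlink child, or None."""
    xl = link_root.find(f"{{{BX_LINK}}}simple_xlink")
    if xl is None:
        return None
    return {
        "label": xl.text,
        "href": xl.get(f"{{{XLINK}}}href"),
        "title": xl.get(f"{{{XLINK}}}title"),
        "show": xl.get(f"{{{XLINK}}}show"),
        # The dtd defaults actuate to "user"; a non-validating parser
        # will not fill that in, so the default is applied by hand here.
        "actuate": xl.get(f"{{{XLINK}}}actuate", "user"),
    }


if __name__ == "__main__":
    print(describe_simple_xlink(ET.fromstring(SAMPLE)))
```

Note that prefixed attributes such as xlink:href come back under their namespace URI rather than their prefix, which is why the lookups are spelled out with the full URIs.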

Got it? If not, you can probably pick it up as we go on. The bx-link:link elements are used several more times.

On to SeqTutorial.


This page last edited on 13 Sep 2000
Version: 1.16