<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-5980104128798509938</id><updated>2011-08-13T08:43:07.394-07:00</updated><title type='text'>The Semantic Web in Life Sciences research</title><subtitle type='html'></subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://cartiksplace.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5980104128798509938/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://cartiksplace.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Cartik</name><uri>http://www.blogger.com/profile/06141451252596268391</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://3.bp.blogspot.com/_IRBsyoOY-xw/S7ETtTtaZTI/AAAAAAAAEYA/u6IFn75U2Q8/S220/6adf.jpg'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>11</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-5980104128798509938.post-8781855676853911661</id><published>2011-08-13T08:40:00.000-07:00</published><updated>2011-08-13T08:43:07.410-07:00</updated><title type='text'>New blog</title><content type='html'>I have just started off a new blog "&lt;a href="http://cartikscorner.blogspot.com"&gt;Ontological definitions and such&lt;/a&gt;" given I have decided to use this "The Semantic Web in Life Sciences research" blog for just um, talking about life 
science research.&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5980104128798509938-8781855676853911661?l=cartiksplace.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://cartiksplace.blogspot.com/feeds/8781855676853911661/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5980104128798509938&amp;postID=8781855676853911661' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5980104128798509938/posts/default/8781855676853911661'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5980104128798509938/posts/default/8781855676853911661'/><link rel='alternate' type='text/html' href='http://cartiksplace.blogspot.com/2011/08/new-blog.html' title='New blog'/><author><name>Cartik</name><uri>http://www.blogger.com/profile/06141451252596268391</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://3.bp.blogspot.com/_IRBsyoOY-xw/S7ETtTtaZTI/AAAAAAAAEYA/u6IFn75U2Q8/S220/6adf.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5980104128798509938.post-6590624328875510958</id><published>2011-07-07T16:56:00.000-07:00</published><updated>2011-07-07T17:07:02.571-07:00</updated><title type='text'>Sensors and algorithms as Web services</title><content type='html'>I'm getting back into blogging after a year of serious personal strife. The period from June 2010 to May 2011 will probably go down as the toughest phase of my life; leave alone career. In my new position at &lt;a href="http://www.iupui.edu"&gt;IUPUI&lt;/a&gt;, I'm now looking at bringing more Semantic Web applications to life; this time with an emphasis on sensors and processing algorithms for visualization and classification. 
&lt;br /&gt;&lt;br /&gt;From the little I have heard from the Semantic Web research community in the last year, I understand everyone is confronted with scalability and tractability issues when it comes to storing and reasoning with large volumes of data. This is a significant part of the challenge of my current project, which involves encapsulating geographically distributed sensors and sparse sensor arrays and their associated algorithms in a Web service framework to enable discovery, invocation, and dynamic configuration. This is not a life sciences project, although applications can be extended into the clinical research domain.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5980104128798509938-6590624328875510958?l=cartiksplace.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://cartiksplace.blogspot.com/feeds/6590624328875510958/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5980104128798509938&amp;postID=6590624328875510958' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5980104128798509938/posts/default/6590624328875510958'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5980104128798509938/posts/default/6590624328875510958'/><link rel='alternate' type='text/html' href='http://cartiksplace.blogspot.com/2011/07/sensors-and-algorithms-as-web-services.html' title='Sensors and algorithms as Web services'/><author><name>Cartik</name><uri>http://www.blogger.com/profile/06141451252596268391</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' 
src='http://3.bp.blogspot.com/_IRBsyoOY-xw/S7ETtTtaZTI/AAAAAAAAEYA/u6IFn75U2Q8/S220/6adf.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5980104128798509938.post-4499995143894110260</id><published>2010-03-26T13:13:00.000-07:00</published><updated>2010-03-26T14:06:34.690-07:00</updated><title type='text'>Modeling the Linnaean taxonomy in OWL: Where do specimens come in?</title><content type='html'>In my earlier post, I had written about one way of modeling a Linnaean taxonomy in OWL, where the names of species, genera, class, and order could be modeled as instances of concepts from the taxonomic rank ontology. For example, &lt;span style="font-style:italic;"&gt;Homo sapiens&lt;/span&gt; is modeled as an instance of the Species concept from the &lt;a href="http://phenoscape.svn.sourceforge.net/viewvc/phenoscape/trunk/vocab/taxonomic_rank.obo"&gt;Taxonomy Rank ontology&lt;/a&gt;. This approach however, did not consider actual biological specimens and their relationships with these taxa. &lt;br /&gt;&lt;br /&gt;Last week, I was at the Phenoscape project all-hands meeting at the &lt;a href="http://www.fieldmuseum.org/"&gt;Field Museum&lt;/a&gt; in Chicago. Matt Yoder, co-PI on the &lt;a href="http://ontology.insectmuseum.org/index.php/Main_Page"&gt;Hymenoptera Anatomy Ontology&lt;/a&gt; project, spoke about the HAO's design of modeling the relationship between specimens and phenotypes, and modeling the names of the taxa as standalone concepts. This approach addresses two issues at once. One, the recurring problems with synonymy, homonymy, and polysemy are addressed directly instead of relegating them to "annotation property" status. Two, the relationships between specimens, names, and taxonomic ranks can be represented without taking recourse to meta-concepts. For the record, regular concepts are instances of meta-concepts. In my earlier post where specimens were ignored, taxonomic ranks (e.g. Species, Genus, Rank etc.) 
would be meta-concepts, specific names of taxa (e.g. &lt;span style="font-style:italic;"&gt;Brassica oleracea capitata&lt;/span&gt;, &lt;span style="font-style:italic;"&gt;Canis lupus&lt;/span&gt; etc.) would be concepts or instances of the meta-concepts, and finally, specimens such as the big bad wolf and the head of cabbage I bought last evening at the grocery would be instances of these concepts. There could be workarounds for this philosophy, the most obvious one being modeling the relationship between specimens and taxon names as something other than type-token relationships. &lt;br /&gt;&lt;br /&gt;Further, Chris Mungall was also at the project meeting, as a consultant. Chris is currently working on creating representations of homologies and he suggested using the "hasPart" relation from OBO relations to model the relationship between specimens and phenotypes. &lt;br /&gt;&lt;br /&gt;Given the definition of a Phenotype concept, the relationship "hasPart" can be extended in an OWL framework to relate a Specimen concept (the domain) to a Phenotype concept (the range). An example RDF triple relating a specimen to a phenotype would be as shown in (1). Note the post-composed representation of the Phenotype instance.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;'Specimen 1'      'has part'    'some(vertebra 1 and hasQuality some sigmoid)'  --(1)&lt;br /&gt;&lt;br /&gt;The relationship between a specimen and its taxon name would be represented as shown in (2). I have used "hasTaxonName" for want of a better label for this relation, which relates a Specimen concept to a Name concept. &lt;br /&gt;&lt;br /&gt;'Specimen 1'      'has taxon name'    'Danio rerio'                             --(2)&lt;br /&gt;&lt;br /&gt;Lastly, given a "hasRank" relation to model the relationship between a name and a taxonomic rank, the RDF triple (3) completes this paradigm. Note that "hasRank" is used as an annotation property in its current avatar. 
&lt;br /&gt;&lt;br /&gt;'Danio rerio'      'has rank'          'Species'                                --(3)&lt;br /&gt;&lt;br /&gt;The following 'type' triples are also necessary.&lt;br /&gt;&lt;br /&gt;'Danio rerio'      'type'         'Name'&lt;br /&gt;'Species'          'type'         'Taxonomic rank'&lt;br /&gt;'Specimen 1'       'type'         'Specimen'&lt;br /&gt; &lt;br /&gt;Alternative names for &lt;span style="font-style:italic;"&gt;Danio rerio&lt;/span&gt; such as &lt;span style="font-style:italic;"&gt;Brachydanio rerio&lt;/span&gt; can be represented using the RDF triple in (4), where synonym can be defined as a symmetric property between Name concepts.  &lt;br /&gt;&lt;br /&gt;'Brachydanio rerio'     'synonym'    'Danio rerio'                             --(4)&lt;br /&gt;&lt;br /&gt;In the interest of sound ontology design principles, each of these concepts can extend concepts from "higher-level" ontologies such as the &lt;a href="http://code.google.com/p/information-artifact-ontology/"&gt;Information Artifact Ontology&lt;/a&gt;. &lt;br /&gt;&lt;br /&gt;In my next post, I shall look at use cases that can leverage these designs, both from the point of view of Phenoscape (the project I currently work on) and from that of other life science data integration and modeling projects. &lt;br /&gt;&lt;br /&gt;As an aside, my days on the Phenoscape project are numbered and I'm currently looking for new positions. 
Wish me luck!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5980104128798509938-4499995143894110260?l=cartiksplace.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://cartiksplace.blogspot.com/feeds/4499995143894110260/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5980104128798509938&amp;postID=4499995143894110260' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5980104128798509938/posts/default/4499995143894110260'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5980104128798509938/posts/default/4499995143894110260'/><link rel='alternate' type='text/html' href='http://cartiksplace.blogspot.com/2010/03/modeling-linnaean-taxonomy-in-owl-where.html' title='Modeling the Linnaean taxonomy in OWL: Where do specimens come in?'/><author><name>Cartik</name><uri>http://www.blogger.com/profile/06141451252596268391</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://3.bp.blogspot.com/_IRBsyoOY-xw/S7ETtTtaZTI/AAAAAAAAEYA/u6IFn75U2Q8/S220/6adf.jpg'/></author><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5980104128798509938.post-3531760929809377302</id><published>2010-02-26T17:17:00.001-08:00</published><updated>2010-03-01T10:50:53.381-08:00</updated><title type='text'>Modeling the Linnaean taxonomy in OWL</title><content type='html'>Following up on the &lt;a href="http://www.phenoscape.org/"&gt;Phenoscape&lt;/a&gt; beta release in July, I've worked primarily on warehousing the phenotype data and refactoring the data services for faster performance on the &lt;a href="http://kb.phenoscape.org/"&gt;Phenoscape web interface&lt;/a&gt;. 
I'm also collaborating with &lt;a href="http://www.fruitfly.org/%7Ecjm/"&gt;Chris Mungall&lt;/a&gt; at the Lawrence Berkeley National Laboratory on a manuscript outlining the principles of OBD and its application to the Phenoscape knowledgebase. I hope to finish writing the first draft in the next couple of weeks.&lt;br /&gt;&lt;br /&gt;I've been pondering ways to create representations of phenotype annotations in RDF triples using OWL concepts, instances, and object properties. A phenotype annotation is a Subject-Predicate-Object triple that relates an evolutionary taxon from a Linnaean taxonomy to an exhibited phenotype. In the Phenoscape project, phenotype annotations relate species (and sometimes higher taxa from the Linnaean taxonomy) of fish to exhibited phenotypes.&lt;br /&gt;&lt;br /&gt;To relate these two entities, we have defined a new binary relation &lt;span style="font-style: italic;"&gt;exhibits&lt;/span&gt;. The &lt;span style="font-style: italic;"&gt;exhibits&lt;/span&gt; relation has been defined in an OBO framework, where only a simple ID and label are required, along with a text description of the intended semantics. I have been thinking about a more formal treatment for this important relation, specifically in a Semantic Web framework. How do I create an object property definition of the &lt;span style="font-style: italic;"&gt;exhibits&lt;/span&gt; relation? What concepts do I define as its domain and range?&lt;br /&gt;&lt;br /&gt;In layman's terms, the &lt;span style="font-style: italic;"&gt;exhibits&lt;/span&gt; relation relates a taxon (node) from a Linnaean taxonomy to a phenotype. The &lt;a href="https://www.nescent.org/phenoscape/Taxonomic_Rank_Ontology"&gt;taxonomy rank ontology&lt;/a&gt; specifies partonomy relationships between the various ranks of a Linnaean taxonomy; each instance of a rank is also an instance of the higher ranks. 
The &lt;span style="font-style: italic;"&gt;taxon&lt;/span&gt; concept in the taxonomy rank ontology has been defined as a subconcept of the &lt;a href="http://www.mygrid.org.uk/OWL/Presentation?url=http%3A%2F%2Fwww.ifomis.org%2Fbfo%2F1.1"&gt;continuant concept of the Basic Formal Ontology&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;Genus&lt;/span&gt;, &lt;span style="font-style: italic;"&gt;species&lt;/span&gt;, &lt;span style="font-style: italic;"&gt;family&lt;/span&gt;, &lt;span style="font-style: italic;"&gt;order&lt;/span&gt;, and &lt;span style="font-style: italic;"&gt;class&lt;/span&gt; are subconcepts of &lt;span style="font-style: italic;"&gt;taxon&lt;/span&gt;. Species such as &lt;span style="font-style: italic;"&gt;Ictalurus furcatus&lt;/span&gt;, &lt;span style="font-style: italic;"&gt;Oryza sativa&lt;/span&gt; and &lt;span style="font-style: italic;"&gt;Esox americanus&lt;/span&gt; are instances of the &lt;span style="font-style: italic;"&gt;species&lt;/span&gt; concept. The corresponding genera &lt;span style="font-style: italic;"&gt;Ictalurus&lt;/span&gt;, &lt;span style="font-style: italic;"&gt;Oryza&lt;/span&gt; and &lt;span style="font-style: italic;"&gt;Esox&lt;/span&gt; are instances of the &lt;span style="font-style: italic;"&gt;genus&lt;/span&gt; concept.&lt;br /&gt;&lt;br /&gt;I have not addressed the relationship between actual living organisms and Linnaean taxa; is my dog, for example, an instance of &lt;span style="font-style: italic;"&gt;Canis familiaris&lt;/span&gt;, or is this a different kind of relationship altogether? How about fossils that are being discovered in the various corners of the Earth even today, such as the fascinating &lt;a href="http://tiktaalik.uchicago.edu/"&gt;&lt;span style="font-style: italic;"&gt;Tiktaalik roseae&lt;/span&gt;&lt;/a&gt;? How about the preserved soft tissue specimens in various life science museums? Are these instances of specific taxa? 
This is the subject of a very old debate in the community of evolutionary biologists and systematists. Very often, evolutionary biologists cannot decide which part of the Tree of Life to assign a newly discovered specimen to. I shall defer a discussion on this relationship to a later post.&lt;br /&gt;&lt;br /&gt;Now let us consider phenotypes. A phenotype is defined as an observable physical or biochemical characteristic of a living organism that is caused by its genetic makeup and also by the influence of its environment. For some time now, model organism databases have used the Entity-Quality formalism for modeling phenotypes; that is, a phenotype is a quality that inheres in an anatomical or a behavioral entity. Phenoscape subscribes to this formalism. A phenotype concept in Phenoscape (and in OBD, from which it is inherited) is "post-composed" from previously defined concepts in an anatomical ontology or a behavioral ontology such as the &lt;a href="http://sig.biostr.washington.edu/projects/fm/AboutFM.html"&gt;Foundational Model of Anatomy (FMA)&lt;/a&gt; or the &lt;a href="http://www.geneontology.org/GO.process.guidelines.shtml"&gt;GO biological process ontology&lt;/a&gt; and from a quality ontology such as the &lt;a href="http://obofoundry.org/wiki/index.php/PATO:Main_Page"&gt;Phenotypes and Traits Ontology (PATO)&lt;/a&gt;. This is a nifty way to create an RDF-style blank node with a Skolemized identifier, which identifies the origins of the node. The post-composed phenotype is related to the quality concept by a subsumption relationship ("a round fin is round after all") and to the corresponding anatomy or behavior concept by the &lt;span style="font-style: italic;"&gt;inheres_in&lt;/span&gt; relation from OBO. Again, the comparison with RDF blank nodes is obvious. It's not the node itself, but its relationships that we care about.&lt;br /&gt;&lt;br /&gt;So here goes: putting it all together. I use Phenoscape as the namespace prefix here. 
I have escaped the angle brackets in the tags so that the markup can be displayed here. This is going into an ontology that will soon be posted on the Phenoscape site.&lt;br /&gt;&lt;br /&gt;&amp;lt;owl:Class rdf:ID="Phenotype"&amp;gt;&lt;br /&gt;  &amp;lt;rdfs:subClassOf&amp;gt;&lt;br /&gt;    &amp;lt;owl:Class&amp;gt;&lt;br /&gt;      &amp;lt;owl:intersectionOf rdf:parseType="Collection"&amp;gt;&lt;br /&gt;        &amp;lt;owl:Class rdf:about="PATO:0000001"/&amp;gt;  &amp;lt;!-- Quality --&amp;gt;&lt;br /&gt;        &amp;lt;owl:Restriction&amp;gt;&lt;br /&gt;          &amp;lt;owl:onProperty rdf:resource="OBO_REL:inheres_in"/&amp;gt;&lt;br /&gt;          &amp;lt;owl:someValuesFrom&amp;gt;&lt;br /&gt;            &amp;lt;owl:Class&amp;gt;&lt;br /&gt;              &amp;lt;owl:unionOf rdf:parseType="Collection"&amp;gt;&lt;br /&gt;                &amp;lt;owl:Class rdf:about="GO:0007610"/&amp;gt;  &amp;lt;!-- Behavior --&amp;gt;&lt;br /&gt;                &amp;lt;owl:Class rdf:about="TAO:0100000"/&amp;gt;  &amp;lt;!-- Anatomical entity from TAO --&amp;gt;&lt;br /&gt;              &amp;lt;/owl:unionOf&amp;gt;&lt;br /&gt;            &amp;lt;/owl:Class&amp;gt;&lt;br /&gt;          &amp;lt;/owl:someValuesFrom&amp;gt;&lt;br /&gt;        &amp;lt;/owl:Restriction&amp;gt;&lt;br /&gt;        &amp;lt;owl:Restriction&amp;gt;&lt;br /&gt;          &amp;lt;owl:onProperty rdf:resource="OBO_REL:towards"/&amp;gt;&lt;br /&gt;          &amp;lt;owl:someValuesFrom&amp;gt;&lt;br /&gt;            &amp;lt;owl:Class&amp;gt;&lt;br /&gt;              &amp;lt;owl:unionOf rdf:parseType="Collection"&amp;gt;&lt;br /&gt;                &amp;lt;owl:Class rdf:about="GO:0007610"/&amp;gt;  &amp;lt;!-- Behavior --&amp;gt;&lt;br /&gt;                &amp;lt;owl:Class rdf:about="TAO:0100000"/&amp;gt;  &amp;lt;!-- Anatomical entity from TAO --&amp;gt;&lt;br /&gt;              &amp;lt;/owl:unionOf&amp;gt;&lt;br /&gt;            &amp;lt;/owl:Class&amp;gt;&lt;br /&gt;          &amp;lt;/owl:someValuesFrom&amp;gt;&lt;br /&gt;        &amp;lt;/owl:Restriction&amp;gt;&lt;br /&gt;      &amp;lt;/owl:intersectionOf&amp;gt;&lt;br /&gt;    &amp;lt;/owl:Class&amp;gt;&lt;br /&gt;  &amp;lt;/rdfs:subClassOf&amp;gt;&lt;br /&gt;&amp;lt;/owl:Class&amp;gt;&lt;br /&gt;&lt;br /&gt;Note how I use the root concept of the &lt;a href="http://www.obofoundry.org/cgi-bin/detail.cgi?id=teleost_anatomy"&gt;Teleost Anatomy Ontology&lt;/a&gt; as one of the concepts in the OWL union in the range of both the &lt;span style="font-style: italic;"&gt;inheres_in&lt;/span&gt; property and the &lt;span style="font-style: italic;"&gt;towards&lt;/span&gt; property. This is for the purposes of the Phenoscape project. 
For other subsets of the Tree of Life, concepts from equivalent anatomy ontologies such as the &lt;a href="http://174.133.140.86/AmphibAnatRDBOM/Query_Ontology_1.aspx?FromAmphibAnat=Yes"&gt;Amphibian Anatomy Ontology&lt;/a&gt;, the Foundational Model of Anatomy (FMA), or even the &lt;a href="http://www.obofoundry.org/cgi-bin/detail.cgi?id=caro"&gt;Common Anatomy Reference Ontology (CARO)&lt;/a&gt; can be used instead of this concept.&lt;br /&gt;&lt;br /&gt;Now for the taxon concept. This is much simpler. I use the &lt;a href="http://www.mygrid.org.uk/OWL/Presentation?url=http%3A%2F%2Fwww.ifomis.org%2Fbfo%2F1.1"&gt;Continuant&lt;/a&gt; concept from BFO as the superconcept of taxon. I use TRO as the prefix for the Taxonomy Rank Ontology.&lt;br /&gt;&lt;br /&gt;&amp;lt;owl:Class rdf:about="TRO:Taxon"&amp;gt;&lt;br /&gt;&amp;lt;rdfs:subClassOf rdf:resource="BFO:Continuant"/&amp;gt;&lt;br /&gt;&amp;lt;/owl:Class&amp;gt;&lt;br /&gt;&lt;br /&gt;Other concepts in the TRO can be defined as below in OWL.&lt;br /&gt;&lt;br /&gt;&amp;lt;owl:Class rdf:about="TRO:Genus"&amp;gt;&lt;br /&gt;&amp;lt;rdfs:subClassOf rdf:resource="TRO:Taxon"/&amp;gt;&lt;br /&gt;&amp;lt;/owl:Class&amp;gt;&lt;br /&gt;&lt;br /&gt;&amp;lt;owl:Class rdf:about="TRO:Species"&amp;gt;&lt;br /&gt;&amp;lt;rdfs:subClassOf rdf:resource="TRO:Taxon"/&amp;gt;&lt;br /&gt;&amp;lt;/owl:Class&amp;gt;&lt;br /&gt;&lt;br /&gt;Lastly, the individual species, genera, and other taxa can be defined as OWL individuals as below. 
These are taken from Peter Midford's &lt;a href="http://www.obofoundry.org/cgi-bin/detail.cgi?id=teleost_taxonomy"&gt;Teleost Taxonomy Ontology&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&amp;lt;TRO:Species rdf:about="TTO:1001979"/&amp;gt;   &amp;lt;!-- Danio rerio --&amp;gt;&lt;br /&gt;&lt;br /&gt;&amp;lt;TRO:Genus rdf:about="TTO:101040"/&amp;gt;      &amp;lt;!-- Danio --&amp;gt;&lt;br /&gt;&lt;br /&gt;Similarly, phenotypes with post-composed identifiers can be defined as instances of the OWL concept &lt;span style="font-style:italic;"&gt;phenotype&lt;/span&gt; defined earlier:&lt;br /&gt;&lt;br /&gt;&amp;lt;phenoscape:Phenotype rdf:about="PATO:0000599^OBO_REL:inheres_in(TAO:0000656)"/&amp;gt;&lt;br /&gt;&lt;br /&gt;Finally, we define the &lt;span style="font-style: italic;"&gt;exhibits&lt;/span&gt; relation in OWL.&lt;br /&gt;&lt;br /&gt;&amp;lt;owl:ObjectProperty rdf:ID="exhibits"&amp;gt;&lt;br /&gt;&amp;lt;rdfs:domain rdf:resource="TRO:Taxon"/&amp;gt;&lt;br /&gt;&amp;lt;rdfs:range rdf:resource="#Phenotype"/&amp;gt;&lt;br /&gt;&amp;lt;/owl:ObjectProperty&amp;gt;&lt;br /&gt;&lt;br /&gt;This definition is now the logical underpinning for RDF triples in N3 syntax that look like:&lt;br /&gt;&lt;br /&gt;&amp;lt;tto:1001979&amp;gt;   &amp;lt;phenoscape:exhibits&amp;gt;   &amp;lt;pato:0000599^obo_rel:inheres_in(tao:0000656)&amp;gt; .&lt;br /&gt;&lt;br /&gt;I may be off on some of the syntax (I'm a bit rusty), but I hope these definitions adequately reflect the points I have made in this post. As always, feedback and critique are welcome. This OWL ontology will soon be up on the Phenoscape site, as I have mentioned earlier. I thank Peter Midford for his input and thoughts. In my next post, I will address the relationship between specimens and evolutionary taxa, a subject to which I have briefly alluded here. 
Until then, happy trails!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5980104128798509938-3531760929809377302?l=cartiksplace.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://cartiksplace.blogspot.com/feeds/3531760929809377302/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5980104128798509938&amp;postID=3531760929809377302' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5980104128798509938/posts/default/3531760929809377302'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5980104128798509938/posts/default/3531760929809377302'/><link rel='alternate' type='text/html' href='http://cartiksplace.blogspot.com/2010/02/modeling-linnaean-taxonomy-in-owl.html' title='Modeling the Linnaean taxonomy in OWL'/><author><name>Cartik</name><uri>http://www.blogger.com/profile/06141451252596268391</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://3.bp.blogspot.com/_IRBsyoOY-xw/S7ETtTtaZTI/AAAAAAAAEYA/u6IFn75U2Q8/S220/6adf.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5980104128798509938.post-1109380895684773068</id><published>2009-11-14T15:22:00.000-08:00</published><updated>2009-11-14T20:17:55.116-08:00</updated><title type='text'>Biological data integration with Phenoscape</title><content type='html'>It has been quite a long time since my last post in June 2008. I have since taken up a position working on an NSF funded project in Durham, NC. 
&lt;a href="http://phenoscape.org/"&gt;Phenoscape&lt;/a&gt; is an initiative to bring together model organism data with rich text descriptions of phenotypes exhibited by evolutionary species, which enables the discovery of new interesting and hitherto unknown relationships between mutants and evolutionary species. In this post, I outline the Phenoscape project. In my next posts, I will describe some of the issues I have come across in the Phenoscape project and some of my thoughts and ideas for resolving these issues.&lt;br /&gt;&lt;br /&gt;I have spent most of my 16 months on the Phenoscape project developing an ontology based back-end repository (the Phenoscape knowledgebase) to store phenotype data from model organism databases and from rich text descriptions in scientific publications. I have developed data loader modules which translate data from a myriad different formats into a single, shareable syntax with ontology based semantics. Lastly, I have developed Web service endpoint interfaces to query this knowledgebase and output the results, which are displayed at a &lt;a href="http://kb.phenoscape.org/"&gt;User Interface&lt;/a&gt; developed by my colleague, Jim Balhoff.&lt;br /&gt;&lt;br /&gt;By ontology based semantics, I mean semantics defined in &lt;a href="http://obofoundry.org/"&gt;OBO&lt;/a&gt; ontologies. OBO predates the Semantic Web initiative, so this is rather a chicken and egg problem here. 
While the Semantic Web pedant in me is dismayed by the absolute lack of a mathematical framework in OBO definitions, which are nothing but rich text descriptions, the extent of knowledge annotation and reuse that biologists have achieved with OBO ontologies such as the &lt;a href="http://geneontology.org/"&gt;Gene Ontology&lt;/a&gt; is very impressive.&lt;br /&gt;&lt;br /&gt;The schema of the Phenoscape knowledgebase is based upon the &lt;a href="http://www.berkeleybop.org/obd/"&gt;Ontology-Based Database&lt;/a&gt; (OBD) schema developed by &lt;a href="http://www.fruitfly.org/%7Ecjm/"&gt;Chris Mungall&lt;/a&gt; at the Lawrence Berkeley National Laboratory for storing phenotype annotations. For the sake of lucidity, I briefly define some of the terminology that will be used in the rest of this post (and in the posts to follow as well).&lt;br /&gt;&lt;br /&gt;Phenotype annotations are statements that relate evolutionary taxa to exhibited phenotypes. Taxa are the nodes that are part of a taxonomy. Species such as &lt;span style="font-style: italic;"&gt;Homo sapiens&lt;/span&gt; (humans) are the leaf nodes (or leaf taxa) of a taxonomy devised by Linnaeus for the classification of living organisms. When talking about higher and lower taxa, I am referring to the relative positions of these taxa in the Linnaean taxonomy. Classes (concepts) are types or templates of real world "entities" (for want of a better word) with property based definitions. Instances are real world occurrences of the types. The classic example: &lt;span style="font-style: italic;"&gt;car &lt;/span&gt;would be a class, &lt;span style="font-style: italic;"&gt;my Porsche 911 GT &lt;/span&gt;would be an instance of the &lt;span style="font-style: italic;"&gt;car &lt;/span&gt;class.&lt;br /&gt;&lt;br /&gt;OBD is an intelligent, relational database that provides for the storage of phenotype annotations in triples format (not RDF). 
It also comes with a reasoner for extracting transitive subsumptions and partonomies as well as relation chains from asserted data. The inference capabilities (deductive closure) of the OBD reasoner exceed those of an RDF reasoner.&lt;br /&gt;&lt;br /&gt;OBD allows for evolutionary species to be defined and treated as concepts. Annotations of exhibited phenotypes are existentially quantified. Since annotations to these leaf nodes are existentially quantified, I developed an extension to the OBD reasoner to associate higher-level taxa in the Linnaean taxonomy with phenotypes exhibited by the lower-level taxa. This makes it possible for biologists to query for all the phenotypes under a higher taxon, and see the phenotypes exhibited by all the lower, subsumed taxa as well as the query taxon itself.&lt;br /&gt;&lt;br /&gt;For example, given an existentially quantified assertion that &lt;span style="font-style: italic;"&gt;Homo sapiens &lt;/span&gt;exhibits a four-chambered heart, my extension to the OBD reasoner infers that the genus &lt;span style="font-style: italic;"&gt;Homo &lt;/span&gt;also exhibits a four-chambered heart. In everyday terms, given an assertion that &lt;span style="font-weight: bold;"&gt;some &lt;/span&gt;instances of &lt;span style="font-style: italic;"&gt;Homo sapiens &lt;/span&gt;exhibit a phenotype, an opposable thumb for example, it is reasonable to infer that &lt;span style="font-weight: bold;"&gt;some &lt;/span&gt;instances of &lt;span style="font-style: italic;"&gt;Homo &lt;/span&gt;exhibit the same phenotype as well. The existential quantification is key to inferring up the hierarchy and adding inferences that do not lead to an inconsistent knowledge base. 
Existential quantifications are also convenient for assertions in the life sciences, which are rife with exceptions to the general rules.&lt;br /&gt;&lt;br /&gt;Phenoscape is a well documented project and more background information can be found on the&lt;br /&gt;&lt;a href="http://phenoscape.org/wiki/Informatics"&gt;Phenoscape informatics wiki&lt;/a&gt; and the &lt;a href="http://blog.phenoscape.org/"&gt;Phenoscape blog&lt;/a&gt;. Phenoscape it must be noted, addresses just one issue in the wide research area of  &lt;a href="http://en.wikipedia.org/wiki/Biodiversity_Informatics"&gt; evolutionary and biodiversity informatics&lt;/a&gt;, whose efforts at data integration and interoperability are confronted by problems very similar to those confronting medical informatics. On a parting note, I find it very satisfying that ontologies are crucial to addressing these issues in both research domains (and several others as well).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5980104128798509938-1109380895684773068?l=cartiksplace.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://cartiksplace.blogspot.com/feeds/1109380895684773068/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5980104128798509938&amp;postID=1109380895684773068' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5980104128798509938/posts/default/1109380895684773068'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5980104128798509938/posts/default/1109380895684773068'/><link rel='alternate' type='text/html' href='http://cartiksplace.blogspot.com/2009/11/biological-data-integration-with.html' title='Biological data integration with 
Phenoscape'/><author><name>Cartik</name><uri>http://www.blogger.com/profile/06141451252596268391</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://3.bp.blogspot.com/_IRBsyoOY-xw/S7ETtTtaZTI/AAAAAAAAEYA/u6IFn75U2Q8/S220/6adf.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5980104128798509938.post-5114823928626867900</id><published>2008-06-23T11:21:00.000-07:00</published><updated>2010-03-01T11:05:46.927-08:00</updated><title type='text'>Introducing the Ontology-Based PubMed Annotator</title><content type='html'>Since my last post, I've pretty much kept my nose to the grindstone and I have something to show for it: the PubMed Annotator on steroids, or ahem... the Ontology-Based PubMed Annotator, OBPA for short. The OBPA, like its predecessor, the PubMed Annotator, asks the user, a biologist, to annotate biomedical experiments in RDF triple format for storage, subsequent querying, summarizing, and comparison. The difference is that the OBPA prompts the user with matching terms from a few preselected ontologies, in auto-complete mode, even as she fills in the fields. The user can choose to use terms from the ontologies for her annotation work, or she can use her own terms.&lt;br /&gt;&lt;br /&gt;OBPA keeps track of the number of terms the user borrows from each ontology as a measure of the ontology's usefulness (cf. my &lt;a href="http://cartiksplace.blogspot.com/2008/05/introducing-pubmed-annotator.html"&gt;previous blog entry&lt;/a&gt;). OBPA is more advanced than the PubMed Annotator, with more features and better security. 
The following OWL ontologies are currently being used in the OBPA:&lt;br /&gt;&lt;br /&gt;a) &lt;a href="http://obi.sourceforge.net/"&gt;The Ontology for Biomedical Investigations (OBI)&lt;/a&gt;&lt;br /&gt;b) &lt;a href="http://mged.sourceforge.net/ontologies/MGEDontology.php"&gt;The MGED Ontology&lt;/a&gt;&lt;br /&gt;c) &lt;a href="http://www.ifomis.org/bfo"&gt;Barry Smith's Basic Formal Ontology&lt;/a&gt;&lt;br /&gt;d) &lt;a href="http://www.onto-med.de/en/theories/gfo/"&gt;Heinrich Herre's General Formal Ontology&lt;/a&gt;&lt;br /&gt;e) &lt;a href="http://obofoundry.org/ro/"&gt;Barry Smith's Relation Ontology&lt;/a&gt;&lt;br /&gt;f) &lt;a href="http://www.dumontierlab.com/index.php?page=ontologies"&gt;Michel Dumontier's Relation Ontology&lt;/a&gt;&lt;br /&gt;g) &lt;a href="http://www.w3.org/2004/OWL/"&gt;OWL 1.0 Ontology&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The OBPA in its current version cannot handle OBO syntax. I believe ontologies such as the &lt;a href="http://sig.biostr.washington.edu/projects/fm/"&gt;Foundational Model of Anatomy&lt;/a&gt;, &lt;a href="http://www.reactome.org/"&gt;Reactome&lt;/a&gt;, and &lt;a href="http://beta.uniprot.org/"&gt;UniProt&lt;/a&gt; will also be relevant to the OBPA. The OBPA, however, faces a significant roadblock that prevents it from incorporating more ontologies. Terms (classes and properties) from ontologies are loaded into OBPA at deployment time. Given the slow performance of current versions of OWL-based APIs such as &lt;a href="http://jena.sourceforge.net/"&gt;Jena&lt;/a&gt; and the &lt;a href="http://owlapi.sourceforge.net/"&gt;OWL API&lt;/a&gt;, server-side deployment is a tortuous process, with the server timing out frequently. Also, the terms are not updated periodically. 
Given the current rate of progress on ontologies, OBPA runs the risk of using obsolete terminology.&lt;br /&gt;&lt;br /&gt;Ben Good's &lt;a href="http://www.entitydescriber.org/"&gt;Entity Describer (E.D.)&lt;/a&gt;, which works with ontologically defined terms, uses the interface provided by &lt;a href="http://www.freebase.com/"&gt;Freebase&lt;/a&gt; to dynamically extract terms from ontologies such as the &lt;a href="http://www.geneontology.org/"&gt;Gene Ontology (GO)&lt;/a&gt;. It prompts the user with a suggestion box complete with a text description of the term, the ontology it is extracted from, and sometimes even a picture! Future revisions of the OBPA may incorporate this methodology to alleviate the problem of obsolete ontological terms. Another solution may be to create a service that periodically crawls a selection of ontologies and presents the extracted terms through an interface accessible to applications such as the OBPA. An application such as the &lt;a href="http://www.ebi.ac.uk/ontology-lookup/"&gt;Ontology Lookup Service (OLS)&lt;/a&gt;, which is also compatible with OWL ontologies, may help as well.&lt;br /&gt;&lt;br /&gt;On a tangent, &lt;a href="http://bioinfo.icapture.ubc.ca/"&gt;Mark Wilkinson&lt;/a&gt; suggested a future area of work where one could browse through the nodes of an ontology and extract publications associated with every node. I'm putting it down here because it may be something for me to work on in the future, and also to ensure that you heard it first, from here!! 
In closing, I would like to thank &lt;a href="http://bioinfo.icapture.ubc.ca/ekawas/index.html"&gt;Ed Kawas&lt;/a&gt; for his hands-on help with the jQuery part of the application, Luke McCarthy for his insightful tips on various aspects, and &lt;a href="http://bioinfo.icapture.ubc.ca/bgood/index.html"&gt;Ben Good&lt;/a&gt; for being the Dry Lab's own “thinker.”&lt;br /&gt;&lt;br /&gt;UPDATE: The Wilkinson lab's server moved from bioinfo.icapture.ubc.ca to a new server some time ago, which is why the link to the PubMed Annotator Web UI is inactive. The WAR I had on my laptop was lost forever when the laptop was stolen from my house in Vancouver. The code for the Ontology-Based PubMed Annotator is available on the Wilkinson lab's code repository, and I will be moving this to a new project on SourceForge very soon.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5980104128798509938-5114823928626867900?l=cartiksplace.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://cartiksplace.blogspot.com/feeds/5114823928626867900/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5980104128798509938&amp;postID=5114823928626867900' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5980104128798509938/posts/default/5114823928626867900'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5980104128798509938/posts/default/5114823928626867900'/><link rel='alternate' type='text/html' href='http://cartiksplace.blogspot.com/2008/06/introducing-ontology-based-pubmed.html' title='Introducing the Ontology-Based PubMed Annotator'/><author><name>Cartik</name><uri>http://www.blogger.com/profile/06141451252596268391</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://3.bp.blogspot.com/_IRBsyoOY-xw/S7ETtTtaZTI/AAAAAAAAEYA/u6IFn75U2Q8/S220/6adf.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5980104128798509938.post-1110196645566057391</id><published>2008-05-23T16:30:00.000-07:00</published><updated>2008-05-28T11:09:14.923-07:00</updated><title type='text'>Introducing the Pubmed Annotator</title><content type='html'>In my last post, I held forth on the divide between biologists and computer scientists. Since then, I have been actively collaborating with biologists, trying to harness their knowledge of biomedical experiments through an annotation interface called the &lt;a href="http://bioinfo.icapture.ubc.ca:8090/PubmedAnnotator"&gt;Pubmed Annotator&lt;/a&gt;. &lt;br /&gt;&lt;br /&gt;The Pubmed Annotator is in an alpha stage of development. At present, I'm addressing some advanced security issues brought up by &lt;a href="http://bioinfo.icapture.ubc.ca/ekawas/"&gt;Ed&lt;/a&gt; and Luke, two of the &lt;a href="http://bioinfo.icapture.ubc.ca"&gt;Wilkinson lab's&lt;/a&gt; own superhacks. In its present avatar, the Pubmed Annotator allows a user to register, log in, query Pubmed with a Pubmed Identifier to retrieve a publication, and then annotate the experiment described in the publication using Subject-Predicate-Object (SPO) triple syntax. I have a recent &lt;a href="http://bioinfo.icapture.ubc.ca:8090/PubmedAnnotator/IKE2008.pdf"&gt;publication&lt;/a&gt; describing the objectives of the Pubmed Annotator project. &lt;br /&gt;&lt;br /&gt;The Pubmed Annotator hopes to elicit unique structured representations of biomedical experiments in SPO triple format. Each experiment can be stored as a collection of SPO triples, or by extension as RDF triples on the Semantic Web. For one, this will enable easy querying of experiment details in a universally shareable syntax (RDF). 
Second, experiments can be compared for similarity. Third, logic-based reasoning mechanisms (one of the primary benefits of the Semantic Web) can be used to summarize experiments for the benefit of overworked biologists and the curators of biological knowledge bases. Lastly, raw annotations from users can be used to synthesize a controlled vocabulary (and an ontology) for the annotation of biomedical experiments. This constitutes a bottom-up approach to ontology synthesis, wherein raw data is used to create a template. Ontology development today is more often a top-down process where domain experts and knowledge engineers argue and, somehow, agree on a set of terms and logical definitions of these terms that are capable of representing the knowledge domain of interest. &lt;br /&gt;&lt;br /&gt;I attended the &lt;a href="https://wiki.cbil.upenn.edu/obiwiki/index.php/January2008Workshop"&gt;Vancouver workshop&lt;/a&gt; of the &lt;a href="https://wiki.cbil.upenn.edu/obiwiki/index.php?title=HomePage"&gt;Ontology for Biomedical Investigations (OBI)&lt;/a&gt; Consortium in February. I was mostly a passive observer as some of the best ontology developers, domain experts, and philosophers went to work arguing over commonplace terms and their definitions, which most common folk would merely take for granted. It was fascinating, to say the least. &lt;br /&gt;&lt;br /&gt;The Wilkinson lab, on the other hand, is fast becoming a hotbed of bottom-up approaches to ontology development. &lt;a href="http://bioinfo.icapture.ubc.ca/bgood/"&gt;Ben Good&lt;/a&gt; took the lead a few years ago with the excellent &lt;a href="http://bioinfo.icapture.ubc.ca/bgood/writing/published/icapturer_psb2006_accepted.pdf"&gt;iCAPTUREr&lt;/a&gt;. 
Very recently, he developed the &lt;a href="http://www.connotea.org/wiki/EntityDescriber"&gt;Entity Describer&lt;/a&gt;, an add-on to &lt;a href="http://www.connotea.org"&gt;Connotea&lt;/a&gt; that helps users annotate publications of their choice with ontologically defined terms. This is an example of &lt;a href="http://blogs.nature.com/wp/nascent/2008/05/social_tagging_for_science_1.html"&gt;Semantic Social Tagging&lt;/a&gt;. &lt;br /&gt;&lt;br /&gt;Along these lines, the Pubmed Annotator is currently being upgraded to use ontologies and ontologically defined terms to annotate publications. I believe the usefulness of ontologies to experiment annotation in general can be evaluated by a simple metric. Given that a user can annotate an experiment with terms of his own choice, ontologically defined terms, or a combination of both, the metric is the ratio of the number of ontology terms used in annotating the experiment to the total number of terms used to annotate it. Let me put this in a simple mathematical formula. Given &lt;i&gt;t&lt;/i&gt; is the total number of terms used to annotate an experiment &lt;i&gt;e&lt;/i&gt; by a user &lt;i&gt;u&lt;/i&gt;, and &lt;i&gt;n&lt;/i&gt; is the number of ontologically defined terms used by &lt;i&gt;u&lt;/i&gt; to annotate &lt;i&gt;e&lt;/i&gt;, the efficiency metric &lt;i&gt;E&lt;/i&gt; (capitalized to keep it apart from the experiment &lt;i&gt;e&lt;/i&gt;) is defined as &lt;br /&gt;&lt;br /&gt;&lt;i&gt;E&lt;/i&gt; = &lt;i&gt;n&lt;/i&gt;/&lt;i&gt;t&lt;/i&gt;. &lt;br /&gt;&lt;br /&gt;This is for one user &lt;i&gt;u&lt;/i&gt;. By extension, the same metric can measure the effectiveness of an ontology across several experiments annotated by several users. All this is mere hypothesis, however. 
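&lt;br /&gt;&lt;br /&gt;As a rough sketch of how the ratio of &lt;i&gt;n&lt;/i&gt; to &lt;i&gt;t&lt;/i&gt; could be computed, here are a few lines of Python. This is not code from the Pubmed Annotator; the function names and the sample terms are hypothetical. I also assume that every occurrence of a term counts, and that the multi-user extension simply pools all annotations (total ontology terms over total terms), which is only one of several possible ways to aggregate.
&lt;pre&gt;

```python
def ontology_usage_ratio(terms_used, ontology_terms):
    """Return n / t for a single annotated experiment.

    t is the number of terms in the annotation, n the number of those
    terms that come from the ontology.
    """
    t = len(terms_used)
    if t == 0:
        raise ValueError("the experiment has no annotation terms")
    n = sum(1 for term in terms_used if term in ontology_terms)
    return n / t


def pooled_ratio(annotations, ontology_terms):
    """Assumed extension to many users and experiments: pool all annotations."""
    total = sum(len(terms) for terms in annotations)
    if total == 0:
        raise ValueError("no annotation terms at all")
    borrowed = sum(
        1 for terms in annotations for term in terms if term in ontology_terms
    )
    return borrowed / total


# Hypothetical data: three of the four terms come from the ontology.
ontology = {"assay", "PCR", "specimen"}
annotation = ["assay", "PCR", "my_gel_thing", "specimen"]
print(ontology_usage_ratio(annotation, ontology))  # prints 0.75
```

&lt;/pre&gt;
A value near 1 would mean the ontology covered almost everything the annotator wanted to say; a value near 0 would mean the annotator mostly fell back on terms of his own.&lt;br /&gt;&lt;br /&gt;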
It is hoped the data that we gather will give us a realistic estimate of the correctness of this hypothesis.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5980104128798509938-1110196645566057391?l=cartiksplace.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://cartiksplace.blogspot.com/feeds/1110196645566057391/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5980104128798509938&amp;postID=1110196645566057391' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5980104128798509938/posts/default/1110196645566057391'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5980104128798509938/posts/default/1110196645566057391'/><link rel='alternate' type='text/html' href='http://cartiksplace.blogspot.com/2008/05/introducing-pubmed-annotator.html' title='Introducing the Pubmed Annotator'/><author><name>Cartik</name><uri>http://www.blogger.com/profile/06141451252596268391</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://3.bp.blogspot.com/_IRBsyoOY-xw/S7ETtTtaZTI/AAAAAAAAEYA/u6IFn75U2Q8/S220/6adf.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5980104128798509938.post-8024624645019780333</id><published>2007-10-02T17:26:00.001-07:00</published><updated>2007-10-02T18:38:27.191-07:00</updated><title type='text'>So you think you know ontologies?</title><content type='html'>The need to share biological data has led to the development of several high profile and oft referenced ontologies in the life sciences domain. 
&lt;a href="http://www.nature.com/nbt/journal/v23/n9/pdf/nbt0905-1095.pdf"&gt;Soldatova and King&lt;/a&gt; have pointed out limitations of many biological ontologies that threaten the long term purpose of their use in the life sciences domain. Based on my experience in ontology development over the last five years, and my interactions with other ontology developers and organizations looking to invest (or involved) in ontology development, the findings of this paper do not come as a surprise.&lt;br /&gt;&lt;br /&gt;Today, the word &lt;a href="http://www-ksl.stanford.edu/kst/what-is-an-ontology.html"&gt;"ontology"&lt;/a&gt; is used in a variety of contexts. Very often, it is used to refer to vocabularies and taxonomies. While an ontology can be both a vocabulary and a taxonomy, the converse is not true. Throwing together a subsumption hierarchy is not ontology development. Nor is the curation of a carefully controlled vocabulary of concepts pertinent to a knowledge domain. Many biologists (and other professionals) dabbling in ontology development are blithely unaware of the mathematical underpinnings of ontologies.&lt;br /&gt;&lt;br /&gt;Concepts and relations that are defined as part of an ontology need to be grounded in mathematical axioms. Ontology development toolkits such as Protege and Altova isolate ontology developers, or biologists in this case, from this reality. For all their usefulness in enabling the adoption of ontologies, ontology development tools that conveniently generate OWL syntax obscure the fact that every construct in OWL (at least, the decidable species of OWL) has its semantic underpinnings in a rigorous and formal logical framework.&lt;br /&gt;&lt;br /&gt;Ontology developers need to understand data: how it is used, accessed, and most importantly, modeled. A familiarity with the philosophy of the Entity-Relationship (ER) model or with the Object Oriented (OO) philosophy is a necessary prerequisite to ontology development. 
A second prerequisite is an understanding of mathematical logic, first order logic at the very least.&lt;br /&gt;&lt;br /&gt;Ontology development by specialists or domain experts amounts to a waste of their skills, if not a serious threat to the quality of the developed ontology. While ontology engineers need not be experts in a specific knowledge domain, their skills are relevant to the distillation of expertise from any domain into a representational framework such as OWL. I would not want an expert virologist to develop an ontology pertinent to viruses, any more than I would want a machinist or hangar technician to design and develop a database of airplane spare parts. The hangar technician would be best employed &lt;span style="font-weight: bold;"&gt;working&lt;/span&gt; with spares, not &lt;span style="font-weight: bold;"&gt;describing&lt;/span&gt; them. &lt;br /&gt;&lt;br /&gt;On a positive note, these deficiencies are symptomatic of any new technology, particularly in the information technology area. In the middle and late 90s, programmers accustomed to the procedural syntax of languages such as C were slow to adopt and master the object oriented philosophy behind languages such as Smalltalk and the newly introduced Java. Ontologies are in the same phase of adoption today: a newfangled technology that promises to change the world as we know it, with its attendant evangelists (such as yours truly) and skeptics. 
Believe!!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5980104128798509938-8024624645019780333?l=cartiksplace.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://cartiksplace.blogspot.com/feeds/8024624645019780333/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5980104128798509938&amp;postID=8024624645019780333' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5980104128798509938/posts/default/8024624645019780333'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5980104128798509938/posts/default/8024624645019780333'/><link rel='alternate' type='text/html' href='http://cartiksplace.blogspot.com/2007/10/so-you-think-you-know-ontologies.html' title='So you think you know ontologies?'/><author><name>Cartik</name><uri>http://www.blogger.com/profile/06141451252596268391</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://3.bp.blogspot.com/_IRBsyoOY-xw/S7ETtTtaZTI/AAAAAAAAEYA/u6IFn75U2Q8/S220/6adf.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5980104128798509938.post-2056364621433625655</id><published>2007-08-01T20:56:00.000-07:00</published><updated>2007-08-02T22:13:42.024-07:00</updated><title type='text'>What will it be? Resource or metadata. How to know the answer without asking (and looking dumb in the process)</title><content type='html'>Several proposals are being discussed in the HCLS community for telling URIs that represent resources apart from URIs that represent metadata about those resources. One of the most touted proposals uses an HTTP 303 (See Other) redirect to resolve a URI request. 
On the basis of the content type the client asks for in the request's Accept header, a URI request can be resolved to either streaming RDF/XML or a simple HTML page. &lt;br /&gt;&lt;br /&gt;Consider a scenario from a recent publication. A request URI cannot distinguish between a web page of a person and the person himself. An example would be the URI &lt;span style="font-style:italic;"&gt;http://www.illuminae.org/&lt;span style="font-weight:bold;"&gt;home&lt;/span&gt;/MarkWilkinson&lt;/span&gt; which may resolve to either a Web page of Mark Wilkinson's or to Mark Wilkinson himself, who may be described by a set of RDF triples. If the Accept header of the HTTP request asks for RDF/XML (application/rdf+xml), the RDF page with the URI &lt;span style="font-style:italic;"&gt;http://www.illuminae.org/&lt;span style="font-weight:bold;"&gt;rdf&lt;/span&gt;/MarkWilkinson&lt;/span&gt; would be retrieved. Otherwise, if it asks for HTML (text/html), the Web page with the URI &lt;span style="font-style:italic;"&gt;http://www.illuminae.org/&lt;span style="font-weight:bold;"&gt;html&lt;/span&gt;/MarkWilkinson&lt;/span&gt; would be retrieved. Note the substitution of the "rdf" and "html" terms in the path of the resolved URI in place of the original "home" term, creating a URI hierarchy of sorts. This proposal again seems to be a temporary hack rather than a long term solution.&lt;br /&gt;&lt;br /&gt;In contrast, the XSLT based approach discussed by &lt;a href="http://chem-bla-ics.blogspot.com/2007/07/rdf-ing-molecular-space.html"&gt;InChi&lt;/a&gt; holds a lot of promise. Entire RDF pages can be translated into HTML pages for display on a Web browser while the triples remain accessible in plain RDF format. This dovetails with the idea of abstracting the details of the Semantic Web away from the end user as discussed in my previous post here. No troublesome redirects here. And this came through the HCLS mailing list only this morning. 
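&lt;br /&gt;&lt;br /&gt;For comparison, the redirect proposal described earlier can be sketched as a small Python function. This is a simplified illustration and not anyone's actual resolver: the function name is hypothetical, the base URL is the one from the example above, and the check on the Accept header ignores quality values and wildcards, which a real server would have to honour.
&lt;pre&gt;

```python
BASE = "http://www.illuminae.org"


def resolve(path, accept):
    """Return (status, location) for a request to a /home/ URI.

    The reply is always a 303 redirect; only the target depends on
    what the client says it accepts.
    """
    prefix = "/home/"
    if not path.startswith(prefix):
        return 404, None
    name = path[len(prefix):]
    # Simplification: a plain substring test instead of full Accept parsing.
    wants_rdf = "application/rdf+xml" in accept.lower()
    target = "rdf" if wants_rdf else "html"
    return 303, "{}/{}/{}".format(BASE, target, name)


print(resolve("/home/MarkWilkinson", "application/rdf+xml"))
# prints (303, 'http://www.illuminae.org/rdf/MarkWilkinson')
```

&lt;/pre&gt;
The same request URI thus leads to two different documents, which is exactly the extra round trip that the XSLT approach avoids.&lt;br /&gt;&lt;br /&gt;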
Moral of the story: Never lose hope!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5980104128798509938-2056364621433625655?l=cartiksplace.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://cartiksplace.blogspot.com/feeds/2056364621433625655/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5980104128798509938&amp;postID=2056364621433625655' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5980104128798509938/posts/default/2056364621433625655'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5980104128798509938/posts/default/2056364621433625655'/><link rel='alternate' type='text/html' href='http://cartiksplace.blogspot.com/2007/08/what-will-it-be-resource-or-metadata.html' title='What will it be? Resource or metadata. How to know the answer without asking (and looking dumb in the process)'/><author><name>Cartik</name><uri>http://www.blogger.com/profile/06141451252596268391</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://3.bp.blogspot.com/_IRBsyoOY-xw/S7ETtTtaZTI/AAAAAAAAEYA/u6IFn75U2Q8/S220/6adf.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5980104128798509938.post-7786837115590644543</id><published>2007-07-05T06:47:00.001-07:00</published><updated>2007-07-05T07:28:48.696-07:00</updated><title type='text'>John Doe and the Semantic Web</title><content type='html'>There has been a recent post on the HCLS Wiki comparing the relative strengths and weaknesses of the various proposed solutions to the unique resource identifier (URI) problem on the Semantic Web. A matrix of the various proposals with their desired features has been created. 
What got my noodle going in this matrix was a "clickability" feature. Now going by the definitions of clickability from the Human Computer Interaction course back in graduate school, I suppose an end user should be able to click on a URI that identifies a node and access a human-friendly definition of the resource.&lt;br /&gt;&lt;br /&gt;This is bothersome. Why would a naive end user want to access a node? At best, this would be analogous to looking through an XML file. Now how much of this would John Doe really "get"? In the early days of XML, customized tags with DTDs were touted as one of its main features. You could use CSS or XSLT for rendering it on a browser. That changed very quickly. Now we use XML transparently in databases and Web services without the need to view concept and message definitions on a user interface such as a browser. If clickability were a desired feature of the message formats and protocols defined in SOAP and WSDL, how much of a WSDL definition would make sense to a naive end user? And why would we have to bother with writing an XSLT program to render WSDL on an interface for an end user? Would it make any sense?&lt;br /&gt;&lt;br /&gt;I believe the Semantic Web will be transparent to the naive end user. John Doe will continue using seemingly unstructured raw text data that is formatted for his easy consumption on a Web user interface. The various concepts and relations in the text he reads will hook into logically grounded definitions in an ontology, after the fashion of hyperlinks. This is what will enable John Doe to run informed searches through the Web and invoke agent programs that will help him plan a vacation to Florida by using sophisticated reasoning and planning algorithms. 
The Semantic Web with its carefully curated ontologies will exist at a "higher level of abstraction" from the Web that is accessed by John Doe, but should not be readily visible to him.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5980104128798509938-7786837115590644543?l=cartiksplace.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://cartiksplace.blogspot.com/feeds/7786837115590644543/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5980104128798509938&amp;postID=7786837115590644543' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5980104128798509938/posts/default/7786837115590644543'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5980104128798509938/posts/default/7786837115590644543'/><link rel='alternate' type='text/html' href='http://cartiksplace.blogspot.com/2007/07/john-doe-and-semantic-web.html' title='John Doe and the Semantic Web'/><author><name>Cartik</name><uri>http://www.blogger.com/profile/06141451252596268391</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://3.bp.blogspot.com/_IRBsyoOY-xw/S7ETtTtaZTI/AAAAAAAAEYA/u6IFn75U2Q8/S220/6adf.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5980104128798509938.post-7877482701346884710</id><published>2007-06-28T08:57:00.000-07:00</published><updated>2007-06-28T11:03:40.942-07:00</updated><title type='text'>The battle for LSIDs and the obsession with browsers</title><content type='html'>I've been actively following the debate over resolvable URLs in the Semantic Web Health Care and Life Sciences (HCLS) community. 
At the workshop in Banff as part of the &lt;a href="http://www2007.org/"&gt;WWW 2007 conference&lt;/a&gt;, some leading thinkers in the community actually questioned the utility of resolvable URLs. Shocking!&lt;br /&gt;&lt;br /&gt;Non-resolving namespaces are the bane of semantic interoperability. DIE, example.org, DIE!! If definitions of concepts cannot be resolved to specific nodes on the Semantic Web, the best antidote is to discard them altogether, IMHO. URIs can be location specific as in URLs or non location specific as in URNs. Again, location is given way too much importance at this juncture. As an ontology developer, I do not really care where a concept is defined, as long as I can access the definition. In other words, I NEED the definition but I couldn't care less whether it came out of a location in Timbuktu or Flin Flon, Manitoba.&lt;br /&gt;&lt;br /&gt;The emphasis on location stems from the desire of life scientists to view definitions of concepts in a browser, a reluctance to let go of the browser. Ironic indeed! The utility of the Web and the Web browser was not immediately apparent to the scientific community in the early days (circa 1990). But once the benefits of Web pages became clear, they were embraced and have become a cornerstone of the scientific research community. As an HCI researcher from Microsoft said in a keynote address at WWW 2007, browsers are clearly on the way out. Tim Berners-Lee, &lt;a href="http://www.sciam.com/article.cfm?articleID=00048144-10D2-1C70-84A9809EC588EF21"&gt;in his seminal paper about the Semantic Web&lt;/a&gt;, describes a network of agents that can be invoked by interfaces (not browsers necessarily) and which can process machine understandable content to make intelligent decisions.&lt;br /&gt;&lt;br /&gt;Dependence on browsers requires users to remember (if not bookmark) URLs. In its heyday, the AOL browser only needed users to type in a keyword to locate a Web page. 
For example, typing in the keyword "NFL" would bring up the homepage of the &lt;a href="http://www.nfl.com/"&gt;National Football League&lt;/a&gt;. On a conventional browser, users were required to remember the protocol (HTTP) as well as the complete URL to access the very same page. The use of URNs may very well follow the same procedure as the AOL keyword. It frees the user from the need to remember or bookmark URLs.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.mindinformatics.org/docs/LSID-Clark-Martin-Liefeld.pdf"&gt;LSIDs (Life Science Identifiers)&lt;/a&gt; are URNs that are location independent and resolvable. To end users, LSIDs are transparent, and they allow access to Web services from registries such as &lt;a href="http://www.biomoby.org/"&gt;BioMoby&lt;/a&gt;. LSIDs are capable of handling versions of concept definitions. Because they uniquely identify concepts within an ontology, they can be used to extract specific concept definitions from ontologies without necessarily downloading the entire ontology. They also allow the capture of metadata about the concept definition. Metadata includes the identity of the authority that has defined the concept, the version of the definition, and a timestamp, among other things. On the other hand, &lt;a href="http://i9606.blogspot.com/"&gt;Ben Good&lt;/a&gt; has pointed out some of the crucial limitations of the LSID idea. These are temporary limitations though.&lt;br /&gt;&lt;br /&gt;Of late, the HCLS community has been discussing the use of LSIDs and LSID resolvers to address the problem of non-standard naming protocols in life science ontologies. The &lt;a href="http://bio2rdf.org/JSPWiki/Wiki.jsp?page=BanffManifesto"&gt;Banff manifesto&lt;/a&gt; is an initiative that hopes to address the same issue. These are very promising developments. 
I look forward to the day when example.org is consigned to the dustbin and lingers on as a joke...Cheers!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5980104128798509938-7877482701346884710?l=cartiksplace.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://cartiksplace.blogspot.com/feeds/7877482701346884710/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5980104128798509938&amp;postID=7877482701346884710' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5980104128798509938/posts/default/7877482701346884710'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5980104128798509938/posts/default/7877482701346884710'/><link rel='alternate' type='text/html' href='http://cartiksplace.blogspot.com/2007/06/battle-for-lsids-and-obsession-with.html' title='The battle for LSIDs and the obsession with browsers'/><author><name>Cartik</name><uri>http://www.blogger.com/profile/06141451252596268391</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://3.bp.blogspot.com/_IRBsyoOY-xw/S7ETtTtaZTI/AAAAAAAAEYA/u6IFn75U2Q8/S220/6adf.jpg'/></author><thr:total>0</thr:total></entry></feed>
