Tuesday, October 2, 2007

So you think you know ontologies?

The need to share biological data has led to the development of several high-profile, oft-referenced ontologies in the life sciences domain. Soldatova and King have pointed out limitations in many biological ontologies that threaten to undermine their long-term usefulness in the life sciences. Based on my experience in ontology development over the last five years, and my interactions with other ontology developers and organizations looking to invest (or involved) in ontology development, the findings of this paper do not come as a surprise.

Today, the word "ontology" is used in a variety of contexts. Very often, it is used to refer to vocabularies and taxonomies. While an ontology can be both a vocabulary and a taxonomy, the converse is not true. Throwing together a subsumption hierarchy is not ontology development. Nor is the curation of a carefully controlled vocabulary of concepts pertinent to a knowledge domain. Many biologists (and other professionals) dabbling in ontology development are blithely unaware of the mathematical underpinnings of ontologies.

Concepts and relations that are defined as part of an ontology need to be grounded in mathematical axioms. Ontology development toolkits such as Protege and Altova insulate ontology developers, or biologists in this case, from this reality. For all their usefulness in enabling the adoption of ontologies, ontology development tools that conveniently generate OWL syntax obscure the fact that every construct in OWL (at least, the decidable species of OWL) has its semantic underpinnings in a rigorous and formal logical framework.
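To make the point about logical underpinnings concrete, here is a rough sketch of how two simple ontology statements read in first order logic. The class and property names (Retrovirus, Virus, Host, infects) are made up purely for illustration and do not come from any particular ontology; the left-hand side uses description logic notation, which is the usual formal reading of the decidable species of OWL.

```latex
\begin{align*}
\mathit{Retrovirus} \sqsubseteq \mathit{Virus}
  &\quad\text{means}\quad
  \forall x\,\bigl(\mathit{Retrovirus}(x) \rightarrow \mathit{Virus}(x)\bigr) \\
\mathit{Virus} \sqsubseteq \exists\,\mathit{infects}.\mathit{Host}
  &\quad\text{means}\quad
  \forall x\,\bigl(\mathit{Virus}(x) \rightarrow \exists y\,(\mathit{infects}(x,y) \land \mathit{Host}(y))\bigr)
\end{align*}
```

The first line is all that a plain subsumption hierarchy can say. The second is the kind of axiom that a reasoner can actually draw inferences from, and it is exactly what an editor hides behind a form field.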

Ontology developers need to understand data: how it is used, accessed, and most importantly, modeled. A familiarity with the philosophy of the Entity-Relationship (ER) model or with the Object Oriented (OO) philosophy is a necessary prerequisite to ontology development. A second prerequisite is an understanding of mathematical logic, first order logic at the very least.

Ontology development by specialists or domain experts is a waste of their skills, if not a serious threat to the quality of the developed ontology. While ontology engineers need not be experts in a specific knowledge domain, their skills are what it takes to distill expertise from any domain into a representational framework such as OWL. I would not want an expert virologist to develop an ontology pertinent to viruses, any more than I would want a machinist or hangar technician to design and develop a database of airplane spare parts. The hangar technician would be best employed working with spares, not describing them.

On a positive note, these deficiencies are symptomatic of any new technology, particularly in the information technology area. In the middle and late 90s, programmers accustomed to the procedural syntax of languages such as C were slow to adopt and master the object oriented philosophy behind languages such as Smalltalk and the newly introduced Java. Ontologies are in the same phase of adoption today: a newfangled technology that promises to change the world as we know it, with its attendant evangelists (such as yours truly) and skeptics. Believe!!


Benjamin Good said...

Ack! A Montague in our midst!

From Carole Goble's famous talk:
"Two households, both alike in dignity,
In fair Genomics, where we lay our scene,
(One, comforted by its logic's rigour,
Claims ontology for the realm of pure,
The other, with blessed scientist's vigour,
Acts hastily on models that endure),
From ancient grudge break to new mutiny,
When being drives a fly-man to blaspheme.
From forth the fatal loins of these two foes,
Researchers to unlock the book of life;
Whole misadventured piteous overthrows,
Can with their work bury their clans' strife.
The fruitful passage of their GO-mark'd love,
And the continuance of their studies sage,
Which, united, yield ontologies undreamed-of,
Is now the hour's traffic of our stage;
The which if you with patient ears attend,
What here shall miss, our toil shall strive to mend."

Without those Capulets you Montagues would
A) have nothing to complain about - and what fun would you have then?
B) have the task of representing all the complexity of biology (which of course you likely don't understand much of) on your shoulders.

As another ontology developer that, in many respects, agrees with the ideas in the paper you cite, I have to disagree whole-heartedly with both your dismissal of ontology editors and of bio-ontologists. Both are absolutely fundamental to the advance of the semantic web and, through it, of biology and medicine.

I'm in a quoting mood so I'll finish with another one:
"Can't we all just get along..", stop bashing each other and get on with it???

Cartik said...

To respond to Ben's critique, it is not my intention to dismiss the contributions of bio-ontologists to the life sciences. Bio-ontologists are a very rare breed indeed, as they effectively straddle two vastly different areas of expertise, viz. biology and knowledge representation formalisms. This is no mean feat. I know of only a handful of people who fit this description, and I'm lucky enough to work with two of them! Ontologies mean different things to different people. I have talked about ontologies from a formal Computer Science perspective (as that is my background). Ontologies that are being used in the life sciences today, despite their mathematical limitations, are very effective at solving biological problems and are extensively used by biologists. Formal ontologies lag far behind in terms of usage and performance. My intention was to highlight the formal rigor of ontology development, which seems to be commonly overlooked, to potentially detrimental effect. The Soldatova and King paper that I have referred to explains these dangers at length. Referring to Carole Goble's "Capulets and Montagues" analogy, both families need each other. My intention was to highlight the possible dangers of Capulets trying to be Montagues, and vice versa (although I have not alluded to the latter specifically in my post). What is obvious is that everyone needs to get along, if we are to get anywhere!