Evaluating ontological decisions with OntoClean
I’ve technically been an information scientist for over seven years now and I still can’t explain what the field is really about. What I do know, though, is that it includes things like ontology engineering: a subject that might feel a bit academic at first, but that really should be required reading for anyone who designs domain models.
Ontology engineering is the study and construction of ontologies: formal representations of concepts, and of the relations between those concepts, in a specific domain.
Ontologies are somewhat akin to models in object-oriented design and domain-driven design, as they give domain experts access to a shared vocabulary. But ontologies are much more than that, because they also make it easier to relate concepts from different domain models.
They can even be used to let AI systems reason about concepts within a domain – provided that they’re constructed correctly.
Ontology construction is a bit of an arcane art, and consequently hard for newcomers to learn and master. The authors of the OntoClean article therefore introduce a methodology that helps ontology engineers make the right modelling decisions by drawing lessons from philosophical ontology.
It’s also a good read for those who design more informal models though!
The article explains a few key concepts that you should know and how you can use these concepts to discover modelling mistakes.
Entities have properties. Some of those properties are essential to an entity, i.e. the property must – by definition – always be true for an entity.
Rigidity is a special form of essence. If a property is rigid, then every entity that can exhibit it must exhibit it. We can distinguish between three levels of rigidity:
- Rigid: Properties that are essential to all instances, e.g. being a person;
- Semi-rigid: Properties that are essential to some instances, but not to others, e.g. being hard;
- Anti-rigid: Properties that are not essential at all, e.g. being a student.
All properties in an ontology must be labelled with their rigidity. This makes it possible to verify the consistency of taxonomic links between entities, as anti-rigid properties cannot subsume rigid properties.
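The rigidity constraint lends itself to a mechanical check. What follows is a minimal sketch in Python, not something taken from the article: the `Rigidity` enum, the `Property` class and the `rigidity_violations` function are hypothetical names, and the sample properties are borrowed from the examples in this post.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Rigidity(Enum):
    RIGID = "rigid"
    SEMI_RIGID = "semi-rigid"
    ANTI_RIGID = "anti-rigid"


@dataclass(frozen=True)
class Property:
    name: str
    rigidity: Rigidity
    parent: Optional["Property"] = None  # the property that subsumes this one


def rigidity_violations(properties):
    """Return (parent, child) names where an anti-rigid property subsumes a rigid one."""
    return [
        (p.parent.name, p.name)
        for p in properties
        if p.parent is not None
        and p.parent.rigidity is Rigidity.ANTI_RIGID
        and p.rigidity is Rigidity.RIGID
    ]


person = Property("Person", Rigidity.RIGID)
student = Property("Student", Rigidity.ANTI_RIGID, parent=person)  # allowed
car_part = Property("CarPart", Rigidity.ANTI_RIGID)
engine = Property("Engine", Rigidity.RIGID, parent=car_part)       # not allowed

print(rigidity_violations([person, student, car_part, engine]))
# [('CarPart', 'Engine')]
```

Only direct links are inspected here. That should be enough as long as every link in the taxonomy is checked, but it is a simplification rather than a full OntoClean implementation.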
Two other important concepts are identity and unity.
Identity is about being able to tell whether two entities in the world are actually the same.
Identity criteria are inherited over subsumption relations. Any subclass must therefore have the same identity criteria as its ancestors.
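One way to picture this rule is as a check that every subclass carries all identity criteria of its parent. The sketch below assumes that identity criteria can be written down as plain sets of labels, which is a strong simplification. The function name `identity_conflicts` and the criteria themselves (`fingerprint`, `student_number`, `position_in_taxonomy`) are hypothetical and chosen only for illustration; the Human and Species pair comes from the list of mistakes later in this post.

```python
def identity_conflicts(criteria, subclass_of):
    """Return (child, parent, missing) where a child lacks identity criteria of its parent."""
    conflicts = []
    for child, parent in subclass_of.items():
        missing = criteria[parent] - criteria[child]
        if missing:
            conflicts.append((child, parent, sorted(missing)))
    return conflicts


# Hypothetical identity criteria per class.
criteria = {
    "Species": {"position_in_taxonomy"},
    "Human": {"fingerprint"},
    "Person": {"fingerprint"},
    "Student": {"fingerprint", "student_number"},
}
# child -> parent subsumption links to verify.
subclass_of = {"Human": "Species", "Student": "Person"}

print(identity_conflicts(criteria, subclass_of))
# [('Human', 'Species', ['position_in_taxonomy'])]
```

Student passes because it keeps the criterion of Person and merely adds one of its own. Human fails because the criterion that identifies a species says nothing about individual humans.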
Unity is about being able to tell whether something is a single “thing”.
Wholes should never be subclasses of non-wholes.
In the case of oceans and water, we cannot say that oceans are a kind of water; that would imply that the parent class (water) is not a whole, but the child class is. This is clearly a contradiction. Instead, we should say that oceans are composed of water.
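The same kind of check works for unity. In this small sketch each class is labelled by hand as a whole or a non-whole; the labels, and the extra `Lake` and `BodyOfWater` classes, are assumptions made for the sake of the example.

```python
# Hypothetical unity labels: True means that instances of the class are wholes.
is_whole = {"Water": False, "Ocean": True, "Lake": True, "BodyOfWater": True}

# (child, parent) subsumption links to verify.
subclass_of = [("Ocean", "Water"), ("Lake", "BodyOfWater")]

# A whole must never be a subclass of a non-whole.
unity_violations = [
    (child, parent)
    for child, parent in subclass_of
    if is_whole[child] and not is_whole[parent]
]

print(unity_violations)
# [('Ocean', 'Water')]
```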
Ontological analysis can be used to identify a backbone taxonomy that consists of all rigid properties in the ontology. Every entity within a domain should instantiate at least one of the properties in the backbone.
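As a rough illustration, the backbone can be derived by filtering on the rigidity labels, after which entities that fall outside of it are easy to spot. The property labels and the two entities below (`alice`, `spare_part_123`) are made up for this sketch.

```python
# Hypothetical rigidity labels for the properties in a small ontology.
rigidity = {
    "Person": "rigid",
    "Engine": "rigid",
    "Student": "anti-rigid",
    "CarPart": "anti-rigid",
}

# The backbone taxonomy consists of the rigid properties only.
backbone = {name for name, label in rigidity.items() if label == "rigid"}

# Hypothetical entities and the properties they instantiate.
entities = {
    "alice": {"Person", "Student"},
    "spare_part_123": {"CarPart"},
}

# Entities that instantiate no backbone property point at a gap in the model.
orphans = sorted(name for name, props in entities.items() if not props & backbone)

print(sorted(backbone))  # ['Engine', 'Person']
print(orphans)           # ['spare_part_123']
```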
It’s also a useful way to explain mistakes in the modelling of subsumption relations. The authors list several examples of common mistakes:
- Instantiation: modelling Human as a subclass of Species. The location of a particular human in the biological taxonomy does not help you identify a specific human. Human is an instance of Species!
- Part/whole: modelling Engine as a subclass of Car. Cars do not have the same essential properties (accommodating people) as engines. An Engine is part of a Car!
- Disjunction/type restriction: modelling Engine as a subclass of Car part, as a workaround. Engines are not necessarily car parts: if you take an engine from a car and put it into a boat, it’s no longer a car part. Note that Car part is anti-rigid (something can cease to be a car part), but Engine is rigid (engines cannot become not-engines).
- Polysemy: using the same word or entity to refer to things that are fundamentally different. Take for example the word “book”: it can refer to a bound volume, which has a location in time and space. But it can also refer to the abstract notion of a book, which is identified not by its location, but by its author, title, and other criteria.
- Constitution: modelling Ocean as a subclass of Water. Oceans aren’t water – they consist of water.
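For readers who mostly work with object-oriented models, several of these fixes translate to replacing inheritance with instantiation or composition. The sketch below is one possible rendering in Python; all class and field names (`serial`, `water`, and so on) are hypothetical and are not taken from the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Species:
    name: str


@dataclass
class Engine:
    serial: str


@dataclass
class Car:
    engine: Engine  # part/whole: a Car has an Engine, it is not a kind of Engine


@dataclass
class Boat:
    engine: Engine


@dataclass(frozen=True)
class Water:
    pass


@dataclass
class Ocean:
    name: str
    water: Water  # constitution: an Ocean consists of Water


# Instantiation: Human is an instance of Species, not a subclass of it.
human = Species("Homo sapiens")

# Moving an engine from a car to a boat: it stops being a car part,
# but it never stops being an Engine.
engine = Engine("E-1")
car = Car(engine)
boat = Boat(car.engine)

print(isinstance(human, Species))  # True
print(issubclass(Engine, Car))     # False
print(boat.engine is engine)       # True
```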
To summarise:
- Rigid properties are essential properties, i.e. they must be true for every entity that has the property
- Anti-rigid properties are those that can (but don’t have to) apply to an entity
- Entities with anti-rigid properties cannot subsume entities with rigid properties
- Identity criteria tell you whether two entities in a world are different or the same thing
- Identity criteria are inherited, and must therefore apply to all instances of an entity and of its descendants
- Unity is about being able to tell whether something is a whole, i.e. whether it can be recognised as an isolated entity
- Non-whole entities cannot subsume whole entities