Evaluating ontological decisions with OntoClean

Published: 24 Nov 2019
Written by: Chun Fei Lung

A look at domain modelling from an academic perspective.

I’ve technically been an information scientist for over seven years now and I still can’t explain what the field is really about. What I do know though is that it includes things like ontology engineering: a subject that might feel a bit academic at first, but really should be required reading for anyone who designs domain models.

About the article

Title	Evaluating ontological decisions with OntoClean
Year	2002
Author(s)	Nicola Guarino (Italian National Research Council), Christopher Welty (Vassar College)
Venue	Communications of the ACM

Why it matters

Ontology engineering is the study and construction of ontologies: formal representations of concepts, and of the relations between those concepts, in a specific domain.

Ontologies are somewhat akin to models in object-oriented design and domain-driven design, as they give domain experts access to a shared vocabulary. But ontologies are much more than that, because they also make it easier to relate concepts from different domain models.

They can even be used to let AI systems reason about concepts within a domain – provided that they’re constructed correctly.

Ontology construction is (side note: Or was, when the paper was originally published) a bit of an arcane art, and consequently hard to learn and master for newcomers. The authors therefore introduce a methodology that helps ontology engineers make the right modelling decisions by drawing lessons from philosophical ontology.

It’s also a good read for those who design more informal models though!

How the study was conducted

Probably in front of a whiteboard, with a lot of discussions (side note: The article doesn’t have (or need) a methodology section.).

What discoveries were made

The article explains a few key concepts that you should know and how you can use these concepts to discover modelling mistakes.

Essence and rigidity

Entities have properties. Some of those properties are essential to an entity, i.e. the property must – by definition – always be true for an entity.

Rigidity is a special form of essence. If a property is rigid, then every entity that can exhibit it must exhibit it. We can distinguish between three levels of rigidity:

Rigid: Properties that are essential to all instances, e.g. being a person;
Semi-rigid: Properties that are essential to some instances, but not to others, e.g. being hard;
Anti-rigid: Properties that are not essential at all, e.g. being a student.

All properties in an ontology must be labelled with their rigidity. This makes it possible to verify the consistency of taxonomic links between entities, as anti-rigid properties cannot subsume (side note: be a superclass of) rigid properties.
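To make the rigidity check a bit more tangible, here is a minimal Python sketch of how such a consistency check might look. The names (Rigidity, rigidity_violations) and the tiny example ontology are my own assumptions for illustration; the paper does not prescribe an implementation.

```python
from enum import Enum


class Rigidity(Enum):
    RIGID = "rigid"
    SEMI_RIGID = "semi-rigid"
    ANTI_RIGID = "anti-rigid"


def rigidity_violations(rigidity, subsumptions):
    """Return every (parent, child) link where an anti-rigid parent subsumes a rigid child."""
    return [
        (parent, child)
        for parent, child in subsumptions
        if rigidity[parent] is Rigidity.ANTI_RIGID
        and rigidity[child] is Rigidity.RIGID
    ]


# Hypothetical labels, following the examples in the list above.
rigidity = {"Person": Rigidity.RIGID, "Student": Rigidity.ANTI_RIGID}

# Each pair reads as (parent, child): the parent subsumes the child.
links = [
    ("Person", "Student"),  # fine: a rigid property may subsume an anti-rigid one
    ("Student", "Person"),  # not fine: an anti-rigid property subsumes a rigid one
]

violations = rigidity_violations(rigidity, links)
print(violations)
```

Only the second link is reported, because it claims that every person is a student.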

Identity and unity

Two other important concepts are identity and unity.

Identity is about being able to tell when two different entities in the world are actually the same (or not).

Example

One might say that a time slot of “1:00–2:00 next Tuesday” is a time duration of “one hour”, at a specific moment in time. Does that mean that Time slot is a kind of (or a subclass of) Time duration?

The following analysis shows that it isn’t:

Two time durations are the same if they have the same length: there’s no difference between any two “one-hour” durations, so all instances of “one-hour” durations are actually the same.
Two time slots occurring at the same time are the same (there’s only one “1:00–2:00 next Tuesday”). But two time slots occurring on different days are not – even if they have the same time duration.

Modelling Time slots as a subclass of Time duration would lead to inconsistencies! Instead, we should say that a Time slot has a Time duration.

Identity criteria are inherited over subsumption relations. Any subclass must therefore have the same identity criteria as its ancestors.
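The time slot example might translate to code roughly as follows. This is only a sketch under my own assumptions: the class names, the use of frozen dataclasses (whose generated equality compares all fields) as a stand-in for identity criteria, and the concrete dates are illustrative and not taken from the paper.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class TimeDuration:
    # Identity criterion: two durations are the same if they have the same length.
    length: timedelta


@dataclass(frozen=True)
class TimeSlot:
    # Identity criterion: same start and same duration.
    start: datetime
    duration: TimeDuration  # a Time slot *has a* Time duration, it is not one


one_hour = timedelta(hours=1)

# Arbitrary example dates: a Tuesday and the Wednesday after it.
tuesday = TimeSlot(datetime(2019, 11, 26, 13, 0), TimeDuration(one_hour))
wednesday = TimeSlot(datetime(2019, 11, 27, 13, 0), TimeDuration(one_hour))

print(tuesday.duration == wednesday.duration)  # the durations are the same
print(tuesday == wednesday)                    # the slots are not
```

Had TimeSlot been a subclass of TimeDuration that inherited "same length" as its identity criterion, the two slots would have been considered the same thing.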

Unity is about being able to tell whether something is a single “thing”.

Wholes should never be subclasses of non-wholes.

In the case of oceans and water, we cannot say that oceans are a kind of water; that would imply that the parent class (water) is not a whole, but the child class is. This is clearly a contradiction. Instead, we should say that oceans are composed of water.
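The unity rule can be checked mechanically once every class is labelled as a whole or a non-whole. The sketch below is a rough illustration with made-up names (unity_violations, is_whole, composed_of), not something the article itself provides.

```python
def unity_violations(is_whole, subsumptions):
    """Return every (parent, child) link where a non-whole parent subsumes a whole child."""
    return [
        (parent, child)
        for parent, child in subsumptions
        if not is_whole[parent] and is_whole[child]
    ]


# Hypothetical labels: an ocean is a single "thing", an amount of water is not.
is_whole = {"Water": False, "Ocean": True}

# Modelling Ocean as a subclass of Water breaks the rule.
wrong_model = [("Water", "Ocean")]
print(unity_violations(is_whole, wrong_model))

# The alternative: no subsumption link, but a separate composition relation.
composed_of = {"Ocean": "Water"}
print(unity_violations(is_whole, []))
```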

Discovering misuse of subsumption

Ontological analysis can be used to identify a backbone taxonomy that consists of all rigid properties in the ontology. Every entity within a domain should instantiate at least one of the properties in the backbone.

It’s also a useful way to explain mistakes in modelling of subsumption relations. The authors list several examples of common mistakes:

Instantiation: modelling Human as a subclass of Species. The location of a particular human in the biological taxonomy does not help you identify a specific human. Human is an instance of Species!
Part/whole: modelling Engine as a subclass of Car. Engines do not have the essential properties of cars (such as accommodating people). An Engine is part of a Car!
Disjunction/type restriction: modelling Engine as a subclass of Car part, as a workaround. Engines are not necessarily car parts: if you take an engine from a car and put it into a boat, it’s no longer a car part. Note that Car part is anti-rigid (something can cease to be a car part), but Engine is rigid (engines cannot become not-engines).
Polysemy: using the same word or entity to refer to things that are fundamentally different. Take for example the word “book”: it can refer to a bound volume, which has a location in time and space. But it can also refer to the abstract notion of a book, which is not identified by its location, but by its author, title, and other criteria.
Constitution: modelling Ocean as a subclass of Water. Oceans aren’t water – they consist of water.
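The backbone taxonomy mentioned at the start of this section can be derived from the rigidity labels alone. Below is a small sketch of that idea; the function names and the example entities are assumptions of mine, chosen to match the Engine and Car part example from the list above.

```python
def backbone(rigidity):
    """Return the set of properties that are labelled as rigid."""
    return {name for name, level in rigidity.items() if level == "rigid"}


def outside_backbone(instances, backbone_properties):
    """Return the entities that instantiate none of the backbone properties."""
    return sorted(
        entity
        for entity, properties in instances.items()
        if not (set(properties) & backbone_properties)
    )


# Hypothetical labels for a handful of properties.
rigidity = {
    "Person": "rigid",
    "Engine": "rigid",
    "Student": "anti-rigid",
    "Car part": "anti-rigid",
}

# Hypothetical entities and the properties they instantiate.
instances = {
    "alice": {"Person", "Student"},
    "engine-42": {"Engine", "Car part"},
    "mystery-item": {"Car part"},
}

print(sorted(backbone(rigidity)))
print(outside_backbone(instances, backbone(rigidity)))
```

The last entity only instantiates an anti-rigid property, so it is flagged: we know the role it currently plays, but not what it fundamentally is.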

Summary

Rigid properties are essential properties, i.e. they must be true for every entity that has the property
Anti-rigid properties are those that can (but don’t have to) apply to an entity
Entities with anti-rigid properties cannot subsume entities with rigid properties
Identity criteria tell you whether two entities in a world are different or the same thing
Identity criteria are inherited and therefore must apply to all instances of an entity and its descendants
Unity is about being able to tell whether something is a whole, i.e. whether it can be recognised as an isolated entity
Non-whole entities cannot subsume whole entities