Evaluating ontological decisions with OntoClean (2002)

I’ve technically been an information scientist for over seven years now and I still can’t explain what the field is really about. What I do know though is that it includes things like ontology engineering: a subject that might feel a bit academic at first, but really should be required reading for anyone who designs domain models.

Why it matters

Ontology engineering is the study and construction of ontologies: formal representations of concepts, and of the relations between those concepts, in a specific domain.

Ontologies are somewhat akin to models in object-oriented design and domain-driven design, as they give domain experts access to a shared vocabulary. But ontologies are much more than that, because they also make it easier to relate concepts from different domain models.

They can even be used to let AI systems reason about concepts within a domain – provided that they’re constructed correctly.

Ontology construction is (or was, when the article was originally published) a bit of an arcane art, and consequently hard to learn and master for newcomers. The authors therefore introduce a methodology that helps ontology engineers make the right modelling decisions by drawing lessons from philosophical ontology.

It’s also a good read for those who design more informal models though!

How the study was conducted

Probably in front of a whiteboard, with a lot of discussions. (The article doesn’t have, or need, a methodology section.)

What discoveries were made

The article explains a few key concepts that you should know and how you can use these concepts to discover modelling mistakes.

Essence and rigidity

Entities have properties. Some of those properties are essential to an entity, i.e. the property must – by definition – always be true for an entity.

Example

  • Being hard is an essential property for hammers. Something can only be a hammer if it’s hard.
  • Being hard is not an essential property for sponges. Sponges can (but don’t have to!) be hard. Of course, a particular sponge may happen to be hard throughout its entire existence. This doesn’t matter though: the point is that it could have been soft at some point in time.

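Rigidity is a special form of essence. If a property is rigid, then every entity that can exhibit it must exhibit it. We can distinguish between three levels of rigidity:

  • Rigid properties are essential to all of their instances.
  • Non-rigid properties are not essential to some of their instances.
  • Anti-rigid properties are not essential to any of their instances.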

All properties in an ontology must be labelled with their rigidity. This makes it possible to verify the consistency of taxonomic links between entities, as anti-rigid properties cannot subsume (be a superclass of) rigid properties.

Example

  • The class Student cannot subsume the class Person, because students may cease being a student, while persons must always be persons. This would imply that persons would only be persons as long as they’re students, which is obviously wrong.
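
The rigidity rule is mechanical enough to check with a few lines of code. The following is a minimal sketch, not something taken from the article: the names `Rigidity`, `Prop` and `rigidity_violations` are made up for illustration, and the third level (`NON_RIGID`) is only included for completeness.

```python
from dataclasses import dataclass
from enum import Enum


class Rigidity(Enum):
    # Hypothetical labels for the three levels of rigidity.
    RIGID = "rigid"
    NON_RIGID = "non-rigid"
    ANTI_RIGID = "anti-rigid"


@dataclass(frozen=True)
class Prop:
    name: str
    rigidity: Rigidity


def rigidity_violations(links):
    """Return (superclass, subclass) name pairs where an anti-rigid
    property subsumes a rigid one."""
    return [
        (parent.name, child.name)
        for parent, child in links
        if parent.rigidity is Rigidity.ANTI_RIGID
        and child.rigidity is Rigidity.RIGID
    ]


person = Prop("Person", Rigidity.RIGID)
student = Prop("Student", Rigidity.ANTI_RIGID)

# Each link is a (superclass, subclass) pair.
good_links = [(person, student)]  # Student is a kind of Person
bad_links = [(student, person)]   # Person is a kind of Student
```

Calling `rigidity_violations(bad_links)` flags the Student/Person link, while `good_links` passes without complaints.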

Identity and unity

Two other important concepts are identity and unity.

Identity is about being able to tell when two different entities in the world are actually the same (or not).

Example

One might say that a time slot of “1:00–2:00 next Tuesday” is a time duration of “one hour”, at a specific moment in time. Does that mean that Time slot is a kind of (or a subclass of) Time duration?

The following analysis shows that it isn’t:

  1. Two time durations are the same if they have the same length: there’s no difference between any two “one-hour” durations, so all instances of “one-hour” durations are actually the same.
  2. Two time slots occurring at the same time are the same (there’s only one “1:00–2:00 next Tuesday”). But two time slots occurring on different days are not – even if they have the same time duration.

Modelling Time slots as a subclass of Time duration would lead to inconsistencies! Instead, we should say that a Time slot has a Time duration.

Identity criteria are inherited over subsumption relations. Any subclass must therefore have the same identity criteria as its ancestors.
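
For those who think in code: identity criteria behave a lot like equality methods. The sketch below is my own illustration rather than something from the article. It assumes that a time slot is identified by its start moment plus its duration, and the dates are arbitrary. Note that `TimeSlot` has a `TimeDuration` instead of inheriting from it, so each class keeps its own notion of equality.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class TimeDuration:
    # Identity criterion: two durations are the same if their lengths are equal.
    length: timedelta


@dataclass(frozen=True)
class TimeSlot:
    # Identity criterion (assumed here): same start moment and same duration.
    # The duration is a component of the slot, not its parent class.
    start: datetime
    duration: TimeDuration


one_hour = TimeDuration(timedelta(hours=1))

# Arbitrary example dates: a Tuesday and the Wednesday after it.
tuesday_slot = TimeSlot(datetime(2024, 1, 2, 13, 0), one_hour)
wednesday_slot = TimeSlot(datetime(2024, 1, 3, 13, 0), one_hour)
```

The two slots compare as different, even though `tuesday_slot.duration == wednesday_slot.duration` holds. Had `TimeSlot` been a subclass that inherited the "same length" criterion, both slots would have been forced to count as the same thing.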

Unity is about being able to tell whether something is a single “thing”.

Example

  • Water cannot be recognised as an isolated entity and is therefore not a whole.
  • Oceans on the other hand do represent whole objects, as you can easily name instances of oceans, e.g. “the Atlantic Ocean”.

Wholes should never be subclasses of non-wholes.

In the case of oceans and water, we cannot say that oceans are a kind of water; that would imply that the parent class (water) is not a whole, but the child class is. This is clearly a contradiction. Instead, we should say that oceans are composed of water.
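
The same difference between "is a kind of" and "is composed of" can be made explicit in a model. What follows is a rough sketch under my own assumptions: `Concept`, its fields and `unity_violations` are hypothetical names, and unity is reduced to a single boolean label.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    name: str
    is_whole: bool  # unity label: can instances be recognised as isolated entities?
    parents: list = field(default_factory=list)      # subsumption ("is a kind of")
    composed_of: list = field(default_factory=list)  # constitution ("is made of")


def unity_violations(concepts):
    """Return (subclass, superclass) name pairs where a whole
    is modelled as a subclass of a non-whole."""
    return [
        (concept.name, parent.name)
        for concept in concepts
        for parent in concept.parents
        if concept.is_whole and not parent.is_whole
    ]


water = Concept("Water", is_whole=False)

# Wrong: an ocean as a kind of water.
wrong_ocean = Concept("Ocean", is_whole=True, parents=[water])

# Better: an ocean that is composed of water.
right_ocean = Concept("Ocean", is_whole=True, composed_of=[water])
```

Only the first variant is reported by `unity_violations`; the second one keeps the relation between oceans and water without claiming that one is a subclass of the other.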

Discovering misuse of subsumption

Ontological analysis can be used to identify a backbone taxonomy that consists of all rigid properties in the ontology. Every entity within a domain should instantiate at least one of the properties in the backbone.
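
As a small illustration of the backbone idea, the sketch below filters out the rigid properties and then looks for entities that are not covered by any of them. Everything in it is an assumption on my part: the function names, the string labels, and the example data (including the choice to label Employee as anti-rigid).

```python
def backbone(rigidity_by_property):
    """Return the set of properties that are labelled rigid."""
    return {
        name
        for name, label in rigidity_by_property.items()
        if label == "rigid"
    }


def uncovered_entities(properties_by_entity, backbone_properties):
    """Return the entities that instantiate no backbone property, sorted by name."""
    return sorted(
        entity
        for entity, props in properties_by_entity.items()
        if not set(props) & backbone_properties
    )


# Hypothetical rigidity labels for a tiny ontology.
labels = {"Person": "rigid", "Student": "anti-rigid", "Employee": "anti-rigid"}

# Hypothetical entities and the properties they instantiate.
entities = {
    "alice": ["Person", "Student"],
    "bob": ["Employee"],  # no rigid property: a gap in the model
}
```

Here `backbone(labels)` only contains Person, and `uncovered_entities(entities, backbone(labels))` points at "bob" as an entity that still needs a rigid property.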

It’s also a useful way to explain mistakes in the modelling of subsumption relations. The authors list several examples of common mistakes.

The important bits

  1. Rigid properties are essential properties, i.e. they must be true for every entity that has the property
  2. Anti-rigid properties are those that can (but don’t have to) apply to an entity
  3. Entities with anti-rigid properties cannot subsume entities with rigid properties
  4. Identity criteria tell you whether two entities in a world are different or the same thing
  5. Identity criteria are inherited and must therefore apply to all instances of a class and its descendants
  6. Unity is about being able to tell whether something is a whole, i.e. whether it can be recognised as an isolated entity
  7. Non-whole entities cannot subsume whole entities