
Ontology

In the context of data labeling, an ontology defines and classifies the concepts, objects, and relationships that are relevant to a given project or domain. The Oxford dictionary defines an ontology as a “set of concepts and categories in a subject area or domain that shows their properties and the relations between them”. This is why, in data labeling, “ontology” is often used as a synonym for “class hierarchy”.

We use ontologies in data labeling to ensure that the labels applied to the data are consistent, accurate, and meaningful.

Ontology example

Take, as an example, a machine learning task to classify images of pets. The ontology might include definitions and relationships for concepts such as “cat”, “dog”, “pet”, and “person”. It could specify that the labels “cat” and “dog” both belong to the “pet” class, and that the “pet” class is linked to “person” through an “isOwner” relationship.
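The pet example can be sketched in a few lines of Python. This is only an illustration: the Ontology class and its methods are hypothetical names, not part of any labeling tool, and the direction of the “isOwner” relationship (from “person” to “pet”) is an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Ontology:
    """Hypothetical, minimal container for a class hierarchy and its relationships."""

    # Maps each class name to its parent class (None for a top-level class).
    parents: Dict[str, Optional[str]] = field(default_factory=dict)
    # (subject class, relationship name, object class) triples.
    relations: List[Tuple[str, str, str]] = field(default_factory=list)

    def add_class(self, name: str, parent: Optional[str] = None) -> None:
        # A parent must be declared before its children.
        if parent is not None and parent not in self.parents:
            raise ValueError(f"unknown parent class: {parent}")
        self.parents[name] = parent

    def add_relation(self, subject: str, name: str, obj: str) -> None:
        # Both ends of a relationship must be known classes.
        for cls in (subject, obj):
            if cls not in self.parents:
                raise ValueError(f"unknown class: {cls}")
        self.relations.append((subject, name, obj))

    def is_a(self, name: str, ancestor: str) -> bool:
        # Walk up the hierarchy from `name` until `ancestor` or the root is reached.
        current: Optional[str] = name
        while current is not None:
            if current == ancestor:
                return True
            current = self.parents.get(current)
        return False


pets = Ontology()
pets.add_class("pet")
pets.add_class("person")
pets.add_class("cat", parent="pet")
pets.add_class("dog", parent="pet")
# Assumed direction: a person is the owner of a pet.
pets.add_relation("person", "isOwner", "pet")
```

With this structure, pets.is_a("dog", "pet") is true while pets.is_a("person", "pet") is false, which mirrors the hierarchy described in the example.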

Difference between the scientific definition and the data labeling definition

In the data labeling context, then, “ontology” is an operational term: it specifies the classes that can be assigned and the labeling options available. The scientific definition of ontology and the one used in this context are therefore slightly different. The data labeling ontology is configured in the labeling tool.
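As a rough sketch of what configuring an ontology in a tool can mean in practice, the snippet below declares the classes from the pet example as plain data and checks annotations against them. The ONTOLOGY_CONFIG layout, the function names, and the file names are all hypothetical; each labeling tool defines its own configuration format.

```python
# Hypothetical ontology configuration; not the format of any specific tool.
ONTOLOGY_CONFIG = {
    "classes": [
        {"name": "cat", "parent": "pet"},
        {"name": "dog", "parent": "pet"},
        {"name": "person", "parent": None},
    ],
}


def allowed_labels(config):
    """Return the set of class names annotators are allowed to assign."""
    return {entry["name"] for entry in config["classes"]}


def find_invalid_labels(annotations, config):
    """Return the annotations whose label is not declared in the ontology."""
    allowed = allowed_labels(config)
    return [a for a in annotations if a["label"] not in allowed]


# Made-up annotations for illustration only.
annotations = [
    {"image": "img_001.jpg", "label": "dog"},
    {"image": "img_002.jpg", "label": "Dog"},    # wrong capitalization
    {"image": "img_003.jpg", "label": "puppy"},  # class not in the ontology
]

invalid = find_invalid_labels(annotations, ONTOLOGY_CONFIG)
```

Here invalid contains the second and third annotations. A check of this kind is one simple way a fixed ontology keeps labels consistent across annotators.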

Some companies use “ontology” to mean the annotation guide, but the two concepts are different. The ontology is part of the annotation guide, and the latter is the more general term: labeling instructions can include a class hierarchy, but above all they explain how to distinguish the classes and attributes during the labeling process.

Who defines ontologies?

Domain experts usually create and maintain ontologies manually. Nonetheless, a good data labeling company will help you adapt or create your ontology to reduce inconsistencies in your class hierarchy, which makes the data labeling project more efficient.

Finally, a well-defined ontology is key to ensuring the quality and reliability of your labeled data.

Synonyms: Class hierarchy

Related terms: Typology; Labeling instructions