Issue 382: where to stop documenting the provenance

Starting Date: 
Working Group: 

 In the 41st joint meeting of the CIDOC CRM SIG and ISO/TC46/SC4/WG9 and the 34th FRBR - CIDOC CRM Harmonization meeting, the discussion of issue 367 raised a new issue about best practice on the epistemology of the knowledge base itself, concerning where to stop documenting the provenance. The SIG decided to begin the discussion by email exchange.

Lyon, May 2018

Current Proposal: 

In the 42nd joint meeting of the CIDOC CRM SIG and ISO/TC46/SC4/WG9 and the 35th FRBR - CIDOC CRM Harmonization meeting, MD and CM were assigned to start the discussion on best practices on the epistemology of the knowledge base, regarding where to stop documenting the provenance. The aim is to arrive at a document which will have the status of a recommendation for using the CRM.

Berlin, November 2018

Posted by Martin 20/10/2019

Dear Carlo, Eleni

Here is my attempt:

Where to Stop the Provenance Chain


A guideline

A formal ontology is about “being”. It describes classes of individual items, properties and logical rules constraining their combinations that approximate at a categorical level how we perceive that certain things and phenomena of reality are and behave, including our descriptions of them. It describes “possible states of affairs”[1]. We require that these concepts are not only conventions between humans, but also sufficiently close to reality so that valid deductions about reality can be drawn from the ontology and instances of it, obtained under theoretical, perfect conditions of observation. The ontology is an idealized, logical approximation of reality; its deviations from reality in precision and coverage (i.e. with respect to exceptions) should be understood and tolerable for the purpose of the respective research. Only things and phenomena of reality that behave close enough to the logical form of a formal ontology can usefully be described by it.

We regard knowledge as justified beliefs in propositions X of a form that makes sense in “I know that X holds”. Besides defining the proposition X as an expression of information, a human stating “I know that X holds” must be able to relate all classes, properties and identifiers (names) in such an expression to situations and individual things of the real world. Therefore only humans have knowledge.

A knowledge base in the sense of the CRM is an information object that instantiates the formal ontology with propositions that the maintainer of the knowledge base believes, i.e., regards as “the best of my knowledge”. There are subtle but substantial differences between registered knowledge and reality, because registered knowledge includes contradictions, alternatives and uncertainties. The maintainer of the knowledge base may be an individual person or a team, trusting each other and sharing the same contextual knowledge of the world (see Doerr, Meghini & Spyratos 20

The maintainer of the knowledge base is its ultimate provenance, providing (or not) trust in the care and honesty with which the propositions are stated. The maintainer should not appear in the knowledge base in propositions of provenance, but should be described in metadata about the knowledge base as a whole, just as we do not repeat the author of a book in each sentence.

The knowledge may be direct or indirect:

Direct knowledge is that which the maintainers themselves believe for good, explicable reasons of observation or inference.

Indirect knowledge is that which the maintainer adopts or refers to from other sources. In that case, the formal knowledge of the maintainer is restricted to the information as a formal expression and its provenance. The maintainer may or may not believe this information. Therefore, the knowledge base should contain adequate propositions about its provenance (believed by the maintainer). The maintainer may express doubts about the correctness of this information, where warranted.
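
The distinction between direct and indirect knowledge could be sketched as a small data structure. This is only an illustration under stated assumptions: the class names, the field names and the example property names (has_material, was_produced_in) are hypothetical and are not CRM identifiers. The maintainer is kept as metadata of the knowledge base as a whole, and only indirect knowledge carries a source and, optionally, a note of doubt.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Proposition:
    """A formal expression, here a simple subject-property-object triple."""
    subject: str
    prop: str
    obj: str


@dataclass
class Entry:
    """One registered proposition plus what the maintainer knows about it."""
    proposition: Proposition
    source: Optional[str] = None  # None: direct knowledge of the maintainer
    doubt: Optional[str] = None   # optional note of doubt on indirect knowledge

    @property
    def is_direct(self) -> bool:
        return self.source is None


@dataclass
class KnowledgeBase:
    # The maintainer is metadata about the whole knowledge base,
    # not a provenance proposition repeated on every entry.
    maintainer: str
    entries: List[Entry] = field(default_factory=list)

    def add_direct(self, proposition: Proposition) -> Entry:
        entry = Entry(proposition)
        self.entries.append(entry)
        return entry

    def add_indirect(self, proposition: Proposition, source: str,
                     doubt: Optional[str] = None) -> Entry:
        # Indirect knowledge must state where it was taken from.
        if not source:
            raise ValueError("indirect knowledge needs a documented source")
        entry = Entry(proposition, source=source, doubt=doubt)
        self.entries.append(entry)
        return entry


# Hypothetical example data, invented for illustration only.
kb = KnowledgeBase(maintainer="Example Museum documentation team")
seen = kb.add_direct(Proposition("vase_17", "has_material", "bronze"))
cited = kb.add_indirect(
    Proposition("vase_17", "was_produced_in", "Corinth"),
    source="Catalogue of 1950, p. 12",
    doubt="attribution rests on style alone",
)
```

In this sketch an entry without a source is read as "the maintainer believes this", while an entry with a source is read as "the maintainer believes that the source states this".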

It should be possible to communicate with the maintainer and discuss justifications and possible corrections of errors.

Ideally, the source of indirect knowledge should contain further provenance statements about the indirect knowledge its author has used. The ideal would be to link all those provenance statements together until they lead us to all the direct knowledge used. This is, of course, impossible; nevertheless, we have the means to document, extend and link our provenance knowledge into larger and larger chains, which will be extremely useful for validating and improving our overall knowledge.

The maintainer of a knowledge base may decide to document provenance of provenance if there is no reliable digital resource to link to for the next statement in the provenance chain, or if a local copy of parts of the provenance chain appears to be useful.
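
Following a provenance chain until it stops could look like the sketch below. It assumes, purely for illustration, that a locally held copy of the chain is a mapping from each source to the source it relied on, with None marking direct knowledge and a missing key marking the point where documentation ends. The function name, the status labels and the source identifiers are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def trace_provenance(links: Dict[str, Optional[str]],
                     start: str) -> Tuple[List[str], str]:
    """Follow provenance links from `start` and report where the chain stops.

    links[s] is the source that s relied on; None means s rests on direct
    knowledge; a key missing from `links` means nothing further is documented.
    """
    chain = [start]
    visited = {start}
    current = start
    while True:
        if current not in links:
            return chain, "undocumented"  # the chain stops here
        nxt = links[current]
        if nxt is None:
            return chain, "direct"        # reached direct knowledge
        if nxt in visited:
            return chain, "cycle"         # sources cite each other
        chain.append(nxt)
        visited.add(nxt)
        current = nxt


# Hypothetical local copy of part of a provenance chain.
links = {
    "kb_entry": "catalogue_1950",
    "catalogue_1950": "excavation_diary",
    "excavation_diary": None,
}
full_chain, status = trace_provenance(links, "kb_entry")
partial_chain, partial_status = trace_provenance({"kb_entry": "letter"}, "kb_entry")
```

Here the first chain ends in direct knowledge, while the second stops at a source for which no further provenance is documented, which is the usual case in practice.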


[1] N.Guarino, …

Reference to Issues:

Meetings discussed: