Issue 241: Wider practical scope note of CRM

Starting Date: 
2014-02-17
Working Group: 
1
Status: 
Open
Background: 

Posted by Martin 17/2/2014 

Should we define the practical scope of the CRM wider than "museums" ? 
I believe yes: 

"....the Practical Scope, 
which is expressed by the overall scope of a reference set of specific identifiable museum documentation 
standards and practices that the CRM aims to encompass..." 

Posted by Athanasios Velios 18/2/2014 

I do not think that the word "museum" stops people from using it in other contexts. We consider it within the context of libraries and archives - but of course to book conservators, libraries are museums... 

All the best, 

Thanasis 

 

Posted by Anja Masur 19/2/2014 

I think we should change this as CIDOC indeed addresses far more than only museums. And I guess that it is confusing for people who are not that familiar with CIDOC to read about museums ... 

 

Posted by Phil Carlisle 19/2/2014 

I'd definitely support the change. The Getty GCI/World Monuments Fund Arches project (http://archesproject.org) has been developing a cultural heritage inventory for the built and buried heritage and this is underpinned with graphs based on the CRM and the CIDOC Core Data Standard for Archaeological and Architectural Sites. 

 

Posted by Richard Light 20/2/2014 

OK, thanks: I think I now understand what is going on. The documents defining the CRM make reference to the web site page [1], noting that it provides more detail. (BTW, the URL in the CRM footnote is out of date [2].) 

It looks as though the web page has [subsequently?] been updated so that it goes beyond what the CRM document says. Your question is about "Practical scope", but in fact the web site Scope page broadens out the definition of "Intended scope" [3]. Clearly it would make sense for the two to be re-aligned, and I would support changing the "Intended scope" definition in the CRM document so that it matches the web page wording. In addition, the bullet point wording on the web page has been updated, compared with the CRM document text. Shouldn't the two be the same? 

[This raises a more general issue: is it wise to include "quotations" from the CRM text as web page content on the CRM site? As soon as you do so, there is the challenge of keeping the two versions in synch with each other, and/or of making clear to the reader which is the authoritative version.] 

I would make a rather different point about "Practical scope". I agree that it should have "museum documentation" replaced by "cultural heritage". However, what interests me is that the wording in the CRM document [4] doesn't state plainly what is actually going on. The wording isn't actually wrong, it just isn't very helpful. I would prefer it if the document said "The Practical Scope^2 of the CRM is the set of /cultural heritage/ standards /which have been mapped to the CRM/. The CRM covers the same domain of discourse as the union of these reference standards; this means that /for/ data correctly encoded according to these /cultural heritage/ standards there can be a CRM-compatible expression that conveys the same meaning." (changes in italic) Then footnote 2 could say "The Practical Scope of the CRM is constantly increasing as mapping projects are completed. See http://www.cidoc-crm.org/scope.html for an up-to-date list of cultural heritage standards for which CRM mappings are available." 

Best wishes, 

Richard 

[1] http://www.cidoc-crm.org/scope.html 
[2] "The Practical Scope of the CIDOC CRM, including a list of the relevant museum documentation standards, is discussed in more detail on the CIDOC CRM website at /http://cidoc.ics.forth.gr/scope.html/" 
[3] "The intended scope of the CIDOC CRM may be defined as all information required for the scientific documentation of /cultural heritage/ collections" 

 

Posted by Richard Light 20/2/2014 

Sorry: missed one point, which is that the section in the CRM text starts with the sentence "The overall scope of the CIDOC CRM can be summarised in simple terms as the curated knowledge of museums." This clearly needs updating as well. 

 

Posted by Martin 20/2/2014 

Thank you! Sounds good. 

In "practice", the real scope the CRM ended up with is human activities and their products in the past and their current evidence. We apply it quite successfully to scientific records ("metadata") of all sciences, and Joao even referred to legislation activities. In European Commission Jargon, they often talk about "cultural and scientific heritage" in the framework of research funds. 

 

Posted by Richard Light on 20/2/2014 

I'd be happy with "cultural and scientific heritage". Presumably there is a certain amount of formality to be observed in updating the text of the CRM document itself, even the explanatory sections, given that it is an ISO Standard? 

BTW, I'm currently keeping my promise to review the web site design: a "site map with comments" will be on its way to the list later today or tomorrow. 

 

Posted by Stephen Stead 21/02/2014 


I think it should be broadened to "Cultural Heritage" in general. 
 

 

Posted by Dominic 22/02/2014 

I already say in documentation, cultural heritage.... and beyond. 

 

Posted by Christian Emil on 24/02/2014 

The scope can be extended to cultural heritage in general with the caveat: the term "cultural heritage in general" is very, very wide and we do not intend to model the entire world. 

Perhaps one could say that the scope is not limited to museums. 
 

 

Posted by Martin 25/02/2014 

I agree. 

We should also make the distinction between Theoretical and Practical Scope. Practical means that we have studied applications in this direction, or aim to do so in the short term. 

So far, we have restricted ourselves, due to the nature of documentation structures in ALMI and archaeology, to a world of material particular relations of the past in human scale. 

We have not touched relations in collective behaviour (such as political movements, trading routes), nor in categorical behaviour (such as the interplay of parts of a machine or of a manufacturing procedure). We have not touched emotional or psychological facts, such as love, fear, racism etc. We only model states of affairs that can be described in terms of distinct entities and relationships. 
I'd expect all these things to be out of scope? 

Would someone volunteer to revise the text?

 

Current Proposal: 

At the 32nd joint meeting of the CIDOC CRM SIG and ISO/TC46/SC4/WG9 and the 25th FRBR - CIDOC CRM Harmonization meeting, the SIG discussed this issue and agreed that it is an epistemological question with a philosophical aspect. The SIG assigned Dominic, with the help of Martin Doerr, George Brusecker and Maria Daskalaki, to write proposals.

Posted by Martin on 20/3/2019

Dear All,

Here are my attempts to reformulate the objectives of the CRM, its scope, and the methods of extension. Please comment! To be discussed next week in the meeting.

Best,

Martin

 
Introduction

This document is the formal definition of the CIDOC Conceptual Reference Model (“CRM”), a formal ontology intended to facilitate the integration, mediation and interchange of heterogeneous cultural heritage information and similar information from other domains. The CRM is the culmination of more than two decades of standards development work by the International Committee for Documentation (CIDOC) of the International Council of Museums (ICOM). Work on the CRM itself began in 1996 under the auspices of the ICOM-CIDOC Documentation Standards Working Group. Since 2000, development of the CRM has been officially delegated by ICOM-CIDOC to the CIDOC CRM Special Interest Group, which soon afterwards began collaborating with the ISO working group ISO/TC46/SC4/WG9 to bring the CRM to the form and status of an International Standard. This collaboration has resulted in ISO21127:2004 and ISO21127:2014, and will continue in order to produce the next update of the standard. This document belongs to the series of evolving versions of the formal definition of the CRM, which serve the ISO working group as the community draft for the standard. Any minor differences in semantics and notation between the ISO standard text and the CIDOC version that the ISO working group requires and implements are harmonized in subsequent versions of the CIDOC version.


Objectives of the CIDOC CRM

The primary role of the CRM is to enable the exchange and integration of information from heterogeneous sources for the reconstruction and interpretation of the past at a human scale, based on all kinds of material evidence, including texts, audiovisual material and even oral tradition. It starts from, but is not limited to, the needs of museum documentation and research based on museum holdings. It aims at providing the semantic definitions and clarifications needed to transform disparate, localised information sources into a coherent global resource, be it within a larger institution, in intranets or on the Internet, and to make it available for scholarly interpretation and scientific evaluation. Its perspective is supra-institutional and abstracted from any specific local context. This goal determines the constructs and level of detail of the CRM.

More specifically, it defines, in terms of a formal ontology, the underlying semantics of database schemata and structured documents used in the documentation of cultural heritage and scientific activities. In particular it defines the semantics related to the study of the past and current state of our world, as is characteristic for museums, but also for other institutions and disciplines. It does not define any of the terminology appearing typically as data in the respective data structures; however, it foresees the characteristic relationships for its use. It does not aim at proposing what cultural institutions should document. Rather it explains the logic of what they actually currently document, and thereby enables semantic interoperability.

It intends to provide a model of the intellectual structure of the respective kinds of documentation in logical terms. As such, it is not optimised for implementation-specific storage and processing aspects. Implementations may lead to solutions where elements and links between relevant elements of our conceptualizations are no longer explicit in a database or other structured storage system. For instance, the birth event that connects elements such as father, mother, birth date, birth place may not appear in the database, in order to save storage space or response time of the system. The CRM allows us to explain how such apparently disparate entities are intellectually interconnected, and how the ability of the database to answer certain intellectual questions is affected by the omission of such elements and links.
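The birth example can be made concrete with a minimal Python sketch. It assumes a hypothetical flat record layout (`flat_person`) in which the birth event is not stored, and expands it into explicit statements. The class and property labels (E67 Birth, P98 brought into life, P96 by mother, P97 from father, P4 has time-span, P7 took place at) are taken from the published CRM definitions rather than from the paragraph above; all identifiers and values are invented for illustration.

```python
# Hypothetical flat database row: the birth event itself is not stored.
flat_person = {
    "id": "person/1",
    "father": "person/2",
    "mother": "person/3",
    "birth_date": "1850-04-12",
    "birth_place": "place/1",
}


def expand_birth(row):
    """Return (subject, property, object) statements that make the
    implicit birth event explicit."""
    birth = row["id"] + "/birth"  # identifier minted for the implicit event
    return [
        (birth, "is a", "E67 Birth"),
        (birth, "P98 brought into life", row["id"]),
        (birth, "P97 from father", row["father"]),
        (birth, "P96 by mother", row["mother"]),
        # Simplification: the date string stands in for a time-span instance.
        (birth, "P4 has time-span", row["birth_date"]),
        (birth, "P7 took place at", row["birth_place"]),
    ]


statements = expand_birth(flat_person)
for statement in statements:
    print(statement)
```

In the flat row, father, mother, date and place look like unrelated attributes of a person; in the expanded form they are all connected through the single birth event.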
 

Scope of the CIDOC CRM

The overall scope of the CIDOC CRM can be summarised in simple terms as the curated, factual knowledge about the past at a human scale.

However, a more detailed and useful definition can be articulated by defining both the Intended Scope, a broad and maximally-inclusive definition of general application principles, and the Practical Scope, which is expressed by the overall scope of a growing reference set of specific, identifiable documentation standards and practices that the CRM aims to encompass, although restricted in its details by the limits of the Intended Scope.

The reasons for this distinction are twofold. Firstly, the CRM is developed in a “bottom-up” manner, starting from well-understood concepts that are actually and widely used by domain experts, which are disambiguated and gradually generalized as more forms of encoding are encountered. This avoids the misadaptations and vagueness often found in introspection-driven attempts to find overarching concepts for such a wide scope, and provides stability to the generalizations found. Secondly, it is a means to identify and keep a focus on the concepts most needed by the communities working in the scope of the CRM and to maintain a well-defined agenda for its evolution.

The Intended Scope of the CRM may be defined as all information required for the exchange and integration of heterogeneous scientific and scholarly documentation about the past at a human scale and its evidence that has come down to us. This definition requires further elaboration:

 

  •     The term “scientific and scholarly documentation” is intended to convey the requirement that the depth and quality of descriptive information that can be handled by the CRM should be sufficient for serious academic research. This does not mean that information intended for presentation to members of the general public is excluded, but rather that the CRM is intended to provide the level of detail and precision expected and required by museum professionals and researchers in the field.
  •     The term “evidence that has come down to us” covers all types of material collected and displayed by museums and related institutions, as defined by ICOM[1], and other collections, in-situ objects, sites, monuments and intangible heritage relating to fields such as social history, ethnography, archaeology, fine and applied arts, natural history, history of sciences and technology.
  •     The documentation includes the detailed description of individual items, in situ or within collections, groups of items and collections as a whole, as well as practices of intangible heritage. It pertains to their current state as well as to information about their past. The CRM is specifically intended to cover contextual information: the historical, geographical and theoretical background that gives cultural heritage collections much of their cultural significance and value.
  •     The exchange of relevant information with libraries and archives, and the harmonisation of the CRM with their models, falls within the Intended Scope of the CRM.
  •     Information required solely for the administration and management of cultural institutions, such as information relating to personnel, accounting, and visitor statistics, falls outside the Intended Scope of the CRM.

The Practical Scope[2] of the CRM is expressed in terms of the set of reference standards and de facto standards for documenting factual knowledge that have been used to guide and validate the CRM’s development and its further evolution. The CRM covers the same domain of discourse as the union of these reference standards; this means that for data correctly encoded according to these documentation formats there can be a CRM-compatible expression that conveys the same meaning.


Coverage and Extensions

The intended scope of the CRM is a subset of the “real” world and is therefore potentially infinite. Further, the strategy to develop the model bottom-up from a practical scope has the consequence that the model will always miss some areas of relevant application or, on the other hand, that some parts, such as E30 Right, may not be developed in sufficient detail for a specialized field of study. Therefore, the CRM has been designed to be extensible by different mechanisms in order to achieve an optimal coverage of the intended scope without losing compatibility with the CRM.

Strict compatibility of extensions with the CRM means that data structured according to an extension must also remain valid as a CRM instance. In practical terms, this implies query containment: any queries based on CRM concepts should retrieve a result set that is correct according to the CRM’s semantics, regardless of whether the knowledge base is structured according to the CRM’s semantics alone, or according to the CRM plus compatible extensions. For example, a query such as “list all events” should recall 100% of the instances deemed to be events by the CRM, regardless of how they are classified by the extension.
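Query containment can be illustrated with a small Python sketch. It assumes a toy knowledge base in which each instance is recorded under a single class, and a subclass table that places a hypothetical extension class, "X1 Excavation", under the CRM class E7 Activity (itself a subclass of E5 Event in the CRM). The extension class and all instance names are invented for illustration.

```python
# Toy class hierarchy: child -> parent.
# "E5 Event" and "E7 Activity" are CRM classes;
# "X1 Excavation" stands for a hypothetical extension class.
SUBCLASS_OF = {
    "E7 Activity": "E5 Event",
    "X1 Excavation": "E7 Activity",
}

# Hypothetical knowledge base: instance -> class it was recorded under.
INSTANCES = {
    "battle_1": "E5 Event",
    "restoration_1": "E7 Activity",
    "dig_2014": "X1 Excavation",
}


def is_subclass(cls, ancestor):
    """True if cls equals ancestor or reaches it by following SUBCLASS_OF."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def instances_of(query_class):
    """Answer a CRM-level query, including instances that were
    classified using the extension."""
    return sorted(i for i, c in INSTANCES.items() if is_subclass(c, query_class))


all_events = instances_of("E5 Event")
print(all_events)
```

The query "list all events" returns the excavation as well, although it was recorded only under the extension class; this is the behaviour the compatibility requirement asks for.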

A sufficient condition for the compatibility of an extension with the CRM is that CRM classes subsume all classes of the extension, and all properties of the extension are either subsumed by CRM properties, or are part of a path for which a CRM property is a shortcut. Obviously, such a condition can only be tested intellectually.

The mechanisms for extensions are:

  1.     Existing classes and properties can be extended dynamically using thesauri and controlled vocabularies with CRM properties having as range E55 Type, as further elaborated in the section “About Types”. This approach is preferable when specializations of classes are independent from specializations of properties, and for local, non-standardized concepts.
  2.     Existing classes and properties can be extended structurally by adding subclasses and subproperties respectively. This approach is particularly recommended to communities of practice needing well-established properties specific to classes that are not present in the CRM.
  3.     Additional information that falls outside the semantics formally defined by the CRM can trivially be recorded as unstructured data using E1 CRM Entity. P3 has note: E62 String to attach such information to the most adequate instance in the respective knowledge base. This approach is preferable when detailed, targeted queries are not expected; in general, only those concepts used for formal querying need to be explicitly modelled.
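The three mechanisms can be sketched side by side as simple statements in Python. This is only an illustration under stated assumptions: the subject identifier, the type "type/amphora", the extension class "X2 Ceramic Vessel" and the note text are all hypothetical, and E19 Physical Object is used merely as an example of a CRM class that an extension might specialize.

```python
# Hypothetical statements about one object, one per extension mechanism.
STATEMENTS = [
    # Mechanism 1: dynamic extension through a CRM property with range E55 Type.
    ("object/42", "P2 has type", "type/amphora"),
    # Mechanism 2: structural extension through a subclass of a CRM class.
    ("X2 Ceramic Vessel", "subclass of", "E19 Physical Object"),
    ("object/42", "instance of", "X2 Ceramic Vessel"),
    # Mechanism 3: unstructured information attached with P3 has note.
    ("object/42", "P3 has note", "Rim restored; earlier repair visible."),
]


def values(subject, prop):
    """Return all objects of statements with the given subject and property."""
    return [o for s, p, o in STATEMENTS if s == subject and p == prop]


print(values("object/42", "P2 has type"))
print(values("object/42", "instance of"))
print(values("object/42", "P3 has note"))
```

Only the first two mechanisms yield values that a formal query can match on; the note is retrievable as text but carries no structure of its own.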


Conservative Extensions of Scope

Extensions may be incorporated in new versions of the CRM, or become semi-independent modules maintained in parallel to the CRM by communities of practice. In mechanisms 1 and 2 above, the CRM concepts subsume and thereby cover the extensions. If specialization were the only method of extension, the CRM would have had to foresee all necessary high-level classes and properties from the beginning. This conflicts with the very successful bottom-up methodology of evolution of the CRM itself and with the development of extensions more peripheral to the current practical scope.

Extensions that are the result of widening the scope, rather than elaborating it in more detail, may quite well find a class “C” not covered by the CRM so far and even a superclass “B” of class C that must be regarded as a superclass of an existing CRM class “A”. From a logical-theoretical point of view, we regard such extensions as compatible precisely if the CRM classes subsume all classes and all properties of the extension as long as instances are restricted to the original, unextended scope of the CRM.

In this case, an existing property p of class A may also hold for the new superclass B. We call the property extended in this way, p’, a conservative extension of p. That is, when restricted to the original class A, the extended property p’ is identical to the original property p. In general, a superproperty is said to be a conservative extension of a subproperty when it is identical to the subproperty when restricted to the domain and range of the subproperty. In first order logic, the conservative extension of a property can be expressed as follows. Assume that A and C are subclasses of B and D respectively, and that P and P’ denote the properties p and p’ between A, C and B, D respectively:

                               A(x)  ⊃ B(x)
                               C(x)  ⊃ D(x)
                               P(x,y) ⊃ A(x)
                               P(x,y) ⊃ C(y)
                               P’(x,y) ⊃ B(x)
                               P’(x,y) ⊃ D(y)

If p’ is a conservative extension of p then

                               A(x) ∧ C(y) ∧ P’(x,y) ≡  P(x,y)
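The condition can be checked on finite examples. The following Python sketch models classes as sets of instances and properties as sets of pairs; the function name `is_conservative_extension` and all instance names are assumptions made for illustration, with `P_ext` standing for P’.

```python
def is_conservative_extension(p, p_ext, domain, range_):
    """Check that p_ext, restricted to the domain and range of p,
    coincides with p."""
    restricted = {(x, y) for (x, y) in p_ext if x in domain and y in range_}
    return restricted == set(p)


A = {"a1", "a2"}   # original domain
B = A | {"b1"}     # new superclass of A
C = {"c1"}         # original range
D = C | {"d1"}     # new superclass of C

P = {("a1", "c1")}

# Every added pair involves an instance outside A, so nothing changes inside A x C.
P_ext = P | {("b1", "d1"), ("b1", "c1")}

# Adds a pair inside A x C that P does not contain.
P_bad = P | {("a2", "c1")}

print(is_conservative_extension(P, P_ext, A, C))
print(is_conservative_extension(P, P_bad, A, C))
```

The first check succeeds because the restriction of `P_ext` to A and C is exactly `P`; the second fails because `P_bad` asserts something new about instances that were already within the original domain and range.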

The conservative extension of a property is similar to what in logic is called a conservative extension of a theory. The construct is necessary for an effective modular management of ontologies, but is not possible with the way RDF/OWL currently treats such extensions. It has very important practical consequences:

Taken on its own, the CRM is not affected by such a conservative extension of scope, since it is not concerned with instances of class B that are not in class A.

If a conservative extension is incorporated into a new version of the CRM, the new version becomes backwards compatible with the previous one (therefore it is conservative).

The bottom-up development of ontologies encourages choosing as domain and range of a property not the most general classes for all time, but the best understood ones, leaving it to conservative extensions to find more general ones in the future.

Extensions of the CRM maintained in separate modules that declare classes and/or properties not covered by superclasses and/or superproperties of the CRM should clearly mark the highest-level ones to be used by a respective query system in order to retrieve all instances described in terms of the CRM and the extension modules.

Extensions of the CRM maintained in separate modules must be harmonized with the CRM: All ontologically justified relationships of subsumption between the CRM and the extension should explicitly be declared and contained in the extension, or, if indicated, be submitted for the CRM to consider their inclusion.

It is hoped that over time the CRM and its compatible extension modules will provide an increasingly complete coverage of the intended scope as a coherent, logically and ontologically adequate theory of the widest practical use. Among other things, this will require collaboration between the communities involved, based on a continuous effort of mutual understanding and respect.


[1] The ICOM Statutes provide a definition of the term “museum” at http://icom.museum/statutes.html#2

[2] The Practical Scope of the CIDOC CRM, including a list of the relevant museum documentation standards, is discussed in more detail on the CIDOC CRM website at http://cidoc.ics.forth.gr/scope.html