Definition of the CIDOC Conceptual Reference Model
Produced by the ICOM/CIDOC
Documentation Standards Group,
Continued by the
CIDOC CRM Special Interest Group
Version 5.0.4
November 2011
Editors: Nick Crofts, Martin Doerr, Tony Gill, Stephen Stead, Matthew
Stiff.
Copyright © 2003 ICOM/CIDOC CRM Special Interest Group
Table of Contents
Utility of CRM compatibility
The Information Integration Environment
CRM Compatibility of Data Structure
CRM Compatibility of Information Systems
Compatibility claim declaration
E75 Conceptual Object Appellation
CIDOC CRM Property Declarations
P1 is identified by (identifies)
P4 has time-span (is time-span of)
P5 consists of (forms part of)
P7 took place at (witnessed)
P8 took place on or within (witnessed)
P9 consists of (forms part of)
P11 had participant (participated in)
P12 occurred in the presence of (was present at)
P13 destroyed (was destroyed by)
P14 carried out by (performed)
P15 was influenced by (influenced)
P16 used specific object (was used for)
P17 was motivated by (motivated)
P19 was intended use of (was made for)
P20 had specific purpose (was purpose of)
P21 had general purpose (was purpose of)
P22 transferred title to (acquired title through)
P23 transferred title from (surrendered title through)
P24 transferred title of (changed ownership through)
P26 moved to (was destination of)
P27 moved from (was origin of)
P28 custody surrendered by (surrendered custody through)
P29 custody received by (received custody through)
P30 transferred custody of (custody transferred through)
P31 has modified (was modified by)
P32 used general technique (was technique of)
P33 used specific technique (was used by)
P34 concerned (was assessed by)
P35 has identified (was identified by)
P37 assigned (was assigned by)
P38 deassigned (was deassigned by)
P39 measured (was measured by)
P40 observed dimension (was observed in)
P41 classified (was classified by)
P42 assigned (was assigned by)
P43 has dimension (is dimension of)
P44 has condition (is condition of)
P45 consists of (is incorporated in)
P46 is composed of (forms part of)
P48 has preferred identifier (is preferred identifier of)
P49 has former or current keeper (is former or current keeper of)
P50 has current keeper (is current keeper of)
P51 has former or current owner (is former or current owner of)
P52 has current owner (is current owner of)
P53 has former or current location (is former or current location of)
P54 has current permanent location (is current permanent location of)
P55 has current location (currently holds)
P56 bears feature (is found on)
P58 has section definition (defines section)
P59 has section (is located on or within)
P65 shows visual item (is shown by)
P67 refers to (is referred to by)
P68 foresees use of (use foreseen by)
P70 documents (is documented in)
P72 has language (is language of)
P73 has translation (is translation of)
P74 has current or former residence (is current or former residence of)
P75 possesses (is possessed by)
P76 has contact point (provides access to)
P78 is identified by (identifies)
P83 had at least duration (was minimum duration of)
P84 had at most duration (was maximum duration of)
P87 is identified by (identifies)
P88 consists of (forms part of)
P92 brought into existence (was brought into existence by)
P93 took out of existence (was taken out of existence by)
P94 has created (was created by)
P95 has formed (was formed by)
P97 from father (was father for)
P98 brought into life (was born)
P99 dissolved (was dissolved by)
P101 had as general use (was use of)
P102 has title (is title of)
P103 was intended for (was intention of)
P104 is subject to (applies to)
P105 right held by (has right on)
P106 is composed of (forms part of)
P107 has current or former member (is current or former member of)
P108 has produced (was produced by)
P109 has current or former curator (is current or former curator of)
P110 augmented (was augmented by)
P112 diminished (was diminished by)
P115 finishes (is finished by)
P118 overlaps in time with (is overlapped in time by)
P119 meets in time with (is met in time by)
P120 occurs before (occurs after)
P123 resulted in (resulted from)
P124 transformed (was transformed by)
P125 used object of type (was type of object used in)
P126 employed (was employed in)
P127 has broader term (has narrower term)
P130 shows features of (features are also found on)
P131 is identified by (identifies)
P134 continued (was continued by)
P135 created type (was created by)
P136 was based on (supported type creation)
P137 exemplifies (is exemplified by)
P138 represents (has representation)
P140 assigned attribute to (was attributed by)
P141 assigned (was assigned by)
P142 used constituent (was used in)
P144 joined with (gained member by)
P146 separated from (lost member by)
P148 has component (is component of)
P149 is identified by (identifies)
P16 used specific object (was used for)
P32 used general technique (was technique of)
P33 used specific technique (was used by)
P35 has identified (identified by)
P37 assigned (was assigned by)
P38 deassigned (was deassigned by)
P47 is identified by (identifies)
P48 has preferred identifier (is preferred identifier of)
P142, P143, P144, P145, P146, P148
P142 used constituent (was used in)
P144 joined with (gained member by)
P146 separated from (lost member by)
P148 is identified by (identifies)
Changes in the scope note of E7 Activity P16
P16 used specific object (was used for)
Changes in the domain, range and superproperty of P137
P137 is exemplified by (exemplifies) (old)
P137 exemplifies (is exemplified by) (NEW)
P39 measured (was measured by)
P39 measured (was measured by)
The range and the scope note of P20 has been changed
P20 had specific purpose (was purpose of)
The scope note of P21 has been changed and an example is added
P21 had general purpose (was purpose of)
P105 has been superproperty of P52
The scope note of P105 has been changed
P105 right held by (has right on)
P68 usually employs (is usually employed by)
P144 joined with (gained member by)
P109 has current or former curator (is current or former curator of)
Compatibility claim declaration
P107 has current or former member (is current or former member of)
P144 joined with (gained member by)
E75 Conceptual Object Appellation
E81 Transformation – issue 165
P4 has time-span (is time-span of)
P5 consists of (forms part of)
P14 carried out by (performed) – issue 170
P44 has condition (is condition of) – issue 144
P65 shows visual item (is shown by) – issue 169
P107 has current or former member (is current or former member of)
P148 has component (is component of)
P33 used specific technique (was used by)
P68 foresees use of (use foreseen by)
P101 had as general use (was use of)
P149 is identified by (identifies)
Change the text in objectives of the CIDOC CRM
Definition of the CIDOC Conceptual Reference Model
This document is the formal definition of the CIDOC Conceptual
Reference Model (“CRM”), a formal ontology intended to facilitate the
integration, mediation and interchange of heterogeneous cultural heritage
information. The CRM is the culmination of more than a decade of standards
development work by the International Committee for Documentation (CIDOC) of
the International Council of Museums (ICOM). Work on the CRM itself began in
1996 under the auspices of the ICOM-CIDOC Documentation Standards Working
Group. Since 2000, development of the CRM has been officially delegated by
ICOM-CIDOC to the CIDOC CRM Special Interest Group, which collaborates with the
ISO working group ISO/TC46/SC4/WG9 to bring the CRM to the form and status of
an International Standard.
The primary role of the CRM is to enable
information exchange and integration between heterogeneous sources of cultural
heritage information. It aims at providing the semantic definitions and
clarifications needed to transform disparate, localised information sources
into a coherent global resource, be it within a larger institution, in
intranets or on the Internet.
Its perspective is supra-institutional and
abstracted from any specific local context. This goal determines the constructs
and level of detail of the CRM.
More specifically, it defines and is restricted to the underlying
semantics of database schemata and document structures used in
cultural heritage and museum documentation in terms of a formal ontology. It
does not define any of the terminology appearing typically as
data in the respective data structures; however it foresees the characteristic
relationships for its use. It does not aim at proposing what cultural
institutions should document. Rather it explains the logic of what they
actually currently document, and thereby enables semantic interoperability.
It intends to provide a model of the intellectual structure of cultural
documentation in logical terms. As such, it is not optimised for
implementation-specific storage and processing aspects. Implementations may
lead to solutions where elements and links between relevant elements of our
conceptualizations are no longer explicit in a database or other structured
storage system. For instance the birth event that connects elements such as father,
mother, birth date, birth place may not appear in the database, in order to
save storage space or response time of the system. The CRM allows us to explain
how such apparently disparate entities are intellectually interconnected, and
how the ability of the database to answer certain intellectual questions is
affected by the omission of such elements and links.
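The birth example can be made concrete with a small sketch. It assumes a hypothetical flat record type (FlatPersonRecord) of the kind a local database might store, and shows how the omitted birth event can be restored as explicit statements. The helper name, the event identifier scheme and the use of plain strings for dates and places are illustrative assumptions; the property labels follow the CRM codes (P96 by mother is quoted from the CRM but is not listed in this section).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlatPersonRecord:
    """Hypothetical storage-optimised record: the birth event itself is implicit."""
    name: str
    father: str
    mother: str
    birth_date: str
    birth_place: str


def expand_birth_event(rec):
    """Return (subject, property, object) statements with the birth event made explicit."""
    event = f"birth_of_{rec.name}"  # illustrative identifier for the implicit event
    return [
        (event, "P98 brought into life", rec.name),
        (event, "P97 from father", rec.father),
        (event, "P96 by mother", rec.mother),
        (event, "P4 has time-span", rec.birth_date),
        (event, "P7 took place at", rec.birth_place),
    ]
```

In the flat record the five values are only linked by sharing a row; in the expanded form they are all connected through one event, which is the connection the CRM makes explicit.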
The CRM aims to support the following specific functionalities:
Users of the CRM should be aware that the definition of data entry
systems requires support of community-specific terminology, guidance to what
should be documented and in which sequence, and application-specific
consistency controls. The CRM does not provide such notions.
By its very structure and formalism, the
CRM is extensible and users are encouraged to create extensions for the needs
of more specialized communities and applications.
The overall scope of the CIDOC CRM can be summarised in simple terms as
the curated knowledge of museums.
However, a more detailed and useful definition can be articulated by
defining both the Intended Scope, a broad and maximally-inclusive definition of
general application principles, and the Practical Scope, which is expressed by
the overall scope of a reference set of specific identifiable museum
documentation standards and practices that the CRM aims to encompass, however
restricted in its details to the limitations of the Intended Scope.
The Intended Scope of the CRM may be defined as all information required
for the exchange and integration of heterogeneous scientific documentation of
museum collections. This definition requires further elaboration:
The Practical Scope[2]
of the CRM is expressed in terms of the current reference standards for museum
documentation that have been used to guide and validate the CRM’s development.
The CRM covers the same domain of discourse as the union of these reference
standards; this means that for data correctly encoded according to these museum
documentation standards there can be a CRM-compatible expression that conveys
the same meaning.
The goal of the CRM is to enable the integration of the largest number
of information resources. Therefore it aims to give systems the greatest
flexibility to become compatible, rather than imposing one particular solution.
Users intending to take advantage of the semantic interoperability
offered by the CRM may want to make parts of their data structures compatible
with the CRM. Compatibility may pertain either to the associations by which
users would like their data to be accessible in an integrated environment, or
to the contents intended for transport to other environments, allowing encoded
meaning to be preserved in a target system.
The CRM does not require complete matching of all user documentation
structures with the CRM, nor that systems should always implement all CRM
concepts and associations; instead it leaves room both for extensions, needed
to capture the full richness of cultural information, and for simplifications,
required for reasons of economy.
Furthermore, the CRM provides a means of interpreting structured
information so that large amounts of data can be transformed or mediated
automatically. It does not require unstructured or semi-structured free text
information to be analysed into a formal logical representation. In other
words, it does not aim to provide more structure than users have previously
provided. The interpretation of information in the form of free text falls
outside the scope of compatibility considerations. The CRM does, however, allow
free text information to be integrated with structured information.
The notion of CRM compatibility is based on interoperability.
Interoperability is best defined on the basis of specific communication practices
between information systems. Following current practice, we distinguish
the following types of information integration environments pertaining to
information systems:
1. Local information systems. These
are either collection management systems or content management
systems that constitute institutional memories and are maintained by an
institution. They are used for primary data entry, i.e. a relevant part of the
information, be it data or metadata, is primary information in digital form that
fulfils institutional needs.
2. Integrated access systems. These
provide a homogeneous access layer to multiple local systems. The
information they manage resides primarily on local systems. We distinguish
between:
a. Materialized access systems, which
physically import data provided by local systems, using a data warehouse
approach. Such systems may employ so-called metadata harvesting techniques or
rely on data submission. Data may be transformed to respect the schema of the
access system before being merged.
b. Mediation systems [Gio Wiederhold], which send out
queries, formulated according to a virtual global schema, to multiple local
systems and then collect and integrate the answers. The queries may be
transformed to a local schema either by the mediation system or by the
receiving local system itself.
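The mediation approach can be illustrated with a minimal sketch. It assumes that each local system is described by a hypothetical mapping from global field names to local field names, and that the mediation system (rather than the local system) performs the query transformation. The function name, the field names and the sample systems are illustrative assumptions only.

```python
# Hypothetical local systems: each maps global schema fields to its own field names.
SYSTEMS = [
    {"name": "museum_a",
     "mapping": {"title": "obj_title", "maker": "artist"},
     "records": [{"obj_title": "Vase", "artist": "maker_1"}]},
    {"name": "museum_b",
     "mapping": {"title": "name"},
     "records": [{"name": "Vase"}]},
]


def mediate(global_field, value, local_systems):
    """Send one query, phrased in the global schema, to every local system."""
    answers = []
    for system in local_systems:
        mapping = system["mapping"]  # global field -> local field
        local_field = mapping.get(global_field)
        if local_field is None:
            continue  # this system has no counterpart for the queried field
        inverse = {local: glob for glob, local in mapping.items()}
        for record in system["records"]:
            if record.get(local_field) == value:
                # Translate the answer back into the global schema before merging.
                answer = {inverse[k]: v for k, v in record.items() if k in inverse}
                answer["source"] = system["name"]
                answers.append(answer)
    return answers
```

A materialized access system would apply the same translation once, at import time, and store the translated records instead of querying the local systems on demand.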
Local systems may also import data from other systems, in order
to complement collections, or to merge information from other systems. An
information system may export information for migration and
preservation.
Compatibility with the CRM pertains to one or more of the following data
communication capabilities or use cases:
1. data falling within the scope of the CRM can be exported from an information
system into an encoded form without loss of meaning with respect to CRM
concepts;
2. data falling within the scope of the CRM can be transformed into
another encoded form without loss of meaning with respect to CRM concepts;
3. data falling within the scope of the CRM can be imported from an
encoded form into an information system without loss of meaning with respect to
CRM concepts;
4. data falling within the scope of the CRM that is contained in an
information system can be queried and retrieved exhaustively in terms of
CRM concepts, subject to the expressive power of a particular query language.
Any declaration of CRM compatibility must specify one or more of the above
use cases. System and data structure providers shall not declare their products
as “CRM compatible” without specifying the appropriate use cases as detailed
below.
In the context of this chapter, the expression “without loss of meaning
with respect to the CRM concepts” means the following: The CRM concepts are
used to classify items of discourse and their relationships. By virtue of this
classification, data can be understood as propositions of a kind declared by
the CRM about real world facts, such as “Object x. forms part of: Object y”. In
case the encoding, i.e. the language used to describe a fact, is changed, only
an expert conversant with both languages can assess if the two propositions do
indeed describe the same fact. If this is the case, then there is no loss of
meaning with respect to CRM concepts. Communities of practice requiring fewer
concepts than the CRM declares may restrict CRM compatibility with respect to
an explicitly declared subset of the CRM.
Users of this standard may communicate CRM compatible data, as detailed
below, with data structures and systems that are either more detailed and
specialized than the CRM or whose scope extends beyond that of the CRM.
In such cases, the standard guarantees only the preservation of meaning with
respect to CRM concepts. However, additional information that can be regarded
as extending CRM concepts may be communicated and preserved in CRM compatible
systems through the appropriate use of controlled terminology. The
specification of the latter techniques does not fall under the scope of this
standard. Communities of practice requiring extensions to the CRM are
encouraged to declare their extensions as CRM-compatible standards.
The CRM is a formal ontology which can be expressed in terms of logic or
a suitable knowledge representation language. Its concepts can be instantiated
as sets of statements that provide a model of reality. We call any encoding of
such CRM instances in a formal language that preserves the relations between
the CRM classes, properties and inheritance rules a
“CRM-compatible form”. Hence data expressed in any CRM-compatible form can be
automatically transformed into any other CRM-compatible form without loss of
meaning. Classes and properties of the CRM are identified by their initial
codes, such as “E55” or “P12”. The names of classes and properties of a
CRM-compatible form may be translated into any local language, but the
identifying codes must be preserved. A CRM-compatible form should not
implement the quantifiers of CRM properties as cardinality constraints for
the encoded instances. Quantifiers may be implemented in an informative way, or
not at all. Statements that violate quantifiers should be treated as alternative
knowledge.
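The rule that names may be translated while identifying codes must be preserved can be sketched as follows. The regular expression, the function names and the translated label are illustrative assumptions; the sketch only shows that two encodings of a statement can be compared by code, independently of the language of the labels.

```python
import re

# Assumed pattern for identifying codes such as "E55" or "P12" at the start of a label.
CODE_RE = re.compile(r"^[EP]\d+")


def identifying_code(label):
    """Extract the identifying code from a class or property label in any language."""
    match = CODE_RE.match(label.strip())
    if match is None:
        raise ValueError(f"no CRM identifying code in {label!r}")
    return match.group(0)


def same_statement(a, b):
    """Compare two (subject, property label, object) statements by property code."""
    return (
        a[0] == b[0]
        and a[2] == b[2]
        and identifying_code(a[1]) == identifying_code(b[1])
    )
```

For example, `same_statement(("x", "P46 is composed of", "y"), ("x", "P46 ist zusammengesetzt aus", "y"))` evaluates to True, where the second label is a hypothetical translation.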
Any encoding of CRM instances in a formal language that preserves the
relations within a consistent subset of CRM classes, properties
and inheritance rules is regarded as a “reduced CRM-compatible form”, if:
· all the conditions applicable to a CRM-compatible form are
respected;
· the subset does not violate the rules of subsumption
and inheritance;
· any instance of the reduced CRM-compatible form is also a valid instance
of a (full) CRM-compatible form;
· the subset contains at least the following concepts:
E1   CRM Entity
E2   - Temporal Entity
E4   - - Period
E5   - - - Event
E7   - - - - Activity
E11  - - - - - Modification
E12  - - - - - - Production
E13  - - - - - Attribute Assignment
E65  - - - - - Creation
E63  - - - - Beginning of Existence
E12  - - - - - Production
E65  - - - - - Creation
E64  - - - - End of Existence
E77  - Persistent Item
E70  - - Thing
E72  - - - Legal Object
E18  - - - - Physical Thing
E24  - - - - - Physical Man-Made Thing
E90  - - - - Symbolic Object
E71  - - - Man-Made Thing
E24  - - - - Physical Man-Made Thing
E28  - - - - Conceptual Object
E89  - - - - - Propositional Object
E30  - - - - - - Right
E73  - - - - - - Information Object
E90  - - - - - Symbolic Object
E41  - - - - - - Appellation
E73  - - - - - - Information Object
E55  - - - - - Type
E39  - - Actor
E74  - - - Group
E52  - Time-Span
E53  - Place
E54  - Dimension
E59  Primitive Value
E61  - Time Primitive
E62  - String
Property id | Property Name | Entity - Domain | Entity - Range
P1 | is identified by (identifies) | E1 CRM Entity | E41 Appellation
P2 | has type (is type of) | E1 CRM Entity | E55 Type
P3 | has note | E1 CRM Entity | E62 String
P4 | has time-span (is time-span of) | E2 Temporal Entity | E52 Time-Span
P7 | took place at (witnessed) | E4 Period | E53 Place
P10 | falls within (contains) | E4 Period | E4 Period
P12 | occurred in the presence of (was present at) | E5 Event | E77 Persistent Item
P11 | - had participant (participated in) | E5 Event | E39 Actor
P14 | - - carried out by (performed) | E7 Activity | E39 Actor
P16 | - used specific object (was used for) | E7 Activity | E70 Thing
P31 | - has modified (was modified by) | E11 Modification | E24 Physical Man-Made Thing
P108 | - - has produced (was produced by) | E12 Production | E24 Physical Man-Made Thing
P92 | - brought into existence (was brought into existence by) | E63 Beginning of Existence | E77 Persistent Item
P108 | - - has produced (was produced by) | E12 Production | E24 Physical Man-Made Thing
P94 | - - has created (was created by) | E65 Creation | E28 Conceptual Object
P93 | - took out of existence (was taken out of existence by) | E64 End of Existence | E77 Persistent Item
P15 | was influenced by (influenced) | E7 Activity | E1 CRM Entity
P16 | - used specific object (was used for) | E7 Activity | E70 Thing
P20 | had specific purpose (was purpose of) | E7 Activity | E5 Event
P43 | has dimension (is dimension of) | E70 Thing | E54 Dimension
P46 | is composed of (forms part of) | E18 Physical Thing | E18 Physical Thing
P59 | has section (is located on or within) | E18 Physical Thing | E53 Place
P67 | refers to (is referred to by) | E89 Propositional Object | E1 CRM Entity
P75 | possesses (is possessed by) | E39 Actor | E30 Right
P81 | ongoing throughout | E52 Time-Span | E61 Time Primitive
P82 | at some time within | E52 Time-Span | E61 Time Primitive
P89 | falls within (contains) | E53 Place | E53 Place
P104 | is subject to (applies to) | E72 Legal Object | E30 Right
P106 | is composed of (forms part of) | E90 Symbolic Object | E90 Symbolic Object
P107 | has current or former member (is current or former member of) | E74 Group | E39 Actor
P127 | has broader term (has narrower term) | E55 Type | E55 Type
P128 | carries (is carried by) | E24 Physical Man-Made Thing | E90 Symbolic Object
P130 | shows features of (features are also found on) | E70 Thing | E70 Thing
P140 | assigned attribute to (was attributed by) | E13 Attribute Assignment | E1 CRM Entity
P141 | assigned (was assigned by) | E13 Attribute Assignment | E1 CRM Entity
P148 | has component (is component of) | E89 Propositional Object | E89 Propositional Object
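A declared subset for a reduced CRM-compatible form can be checked mechanically. The following minimal sketch uses a small excerpt of the classes and properties listed above and adopts one possible reading of "does not violate the rules of subsumption and inheritance": every class in the subset is accompanied by its superclasses, and every property in the subset has its domain and range in the subset. This reading, the dictionaries and the function name are assumptions made for illustration and are not part of this definition.

```python
# Excerpt of the class hierarchy above: class code -> direct superclasses.
SUPERCLASSES = {
    "E1": [], "E2": ["E1"], "E4": ["E2"], "E5": ["E4"], "E7": ["E5"],
    "E77": ["E1"], "E39": ["E77"],
}

# Excerpt of the property table above: property code -> (domain, range).
PROPERTIES = {
    "P11": ("E5", "E39"),
    "P12": ("E5", "E77"),
    "P14": ("E7", "E39"),
}


def subset_problems(classes, properties):
    """List the ways in which a declared subset breaks the assumed closure rules."""
    problems = []
    for cls in sorted(classes):
        for sup in SUPERCLASSES[cls]:
            if sup not in classes:
                problems.append(f"{cls}: superclass {sup} missing")
    for prop in sorted(properties):
        for end in PROPERTIES[prop]:
            if end not in classes:
                problems.append(f"{prop}: class {end} missing")
    return problems
```

An empty result means that the subset passes these two checks; it does not by itself establish that the subset contains the minimum set of concepts listed above.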
A data structure is export-compatible with the CRM if it is possible to transform any data from this data structure into a
CRM-compatible form without loss of meaning. Implicit concepts may be
present in elements of the data structure that are not supported by the CRM. As
long as these concepts can be encoded as instances of E55 Type (i.e. as
terminology) and attached unambiguously to their respective data items with
suitable properties, the data structure is still regarded as export
compatible.
Note that not all CRM concepts may be represented by elements of an
export-compatible data structure. All data from export-compatible data structures
can be transported in a CRM-compatible form. In particular any CRM compatible
form or reduced CRM-compatible form is export-compatible with the CRM.
A data structure is import-compatible with the CRM if it is possible to automatically transform any data from a
CRM-compatible form into this data structure without loss of meaning,
simply on the basis of knowledge about the data structure elements being used.
This implies that a data record transformed into this data structure from a
CRM-compatible form can be transformed back into the CRM-compatible form without
loss of meaning. Note that the back-transformation into a CRM-compatible
form may result in a data record that is semantically equivalent but not
identical with the original.
Any CRM-compatible form is automatically import-compatible with the CRM.
Note that an import-compatible data structure may be semantically richer than
the CRM. It may contain elements that, through the use of a transformation
algorithm, can be made to correspond to CRM concepts or specializations thereof,
or elements with meanings that fall outside the scope of the CRM.
However, it must not contain elements that overlap in meaning with CRM concepts
and which cannot be subsumed via transformation by a CRM concept other than E1
CRM Entity and E77 Persistent Item.
Import-compatible data structures may be used to transport data for
applications that require concepts that lie beyond the scope of the CRM, as
well as data from any export-compatible data structure. Note that, in general,
applications may make use of data from a CRM import-compatible data
structure that has been exported into a CRM compatible form by semantic
reduction to CRM concepts, i.e. by generalizing all subsumed concepts to the
most specific CRM concept applicable, and by discarding elements that fall
outside the scope of the CRM.
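Semantic reduction as described above can be sketched in a few lines. The example assumes hypothetical extension classes (prefixed "X_"), each with a single direct superclass, and a small excerpt of CRM class codes; real extensions may use multiple inheritance, which this sketch does not handle.

```python
# Excerpt of CRM class codes taken from the hierarchy in this section.
CRM_CLASSES = {"E1", "E77", "E70", "E18", "E24", "E39"}

# Hypothetical extension classes, each mapped to its direct superclass.
EXTENSION_PARENT = {
    "X_Amphora": "X_Vessel",
    "X_Vessel": "E24",
}


def reduce_class(cls):
    """Return the most specific CRM class subsuming cls, or None if out of scope."""
    seen = set()
    while cls not in CRM_CLASSES:
        if cls in seen or cls not in EXTENSION_PARENT:
            return None  # no path to a CRM concept: outside the scope of the CRM
        seen.add(cls)
        cls = EXTENSION_PARENT[cls]
    return cls


def semantic_reduction(records):
    """Generalise (item, class) pairs to CRM classes and discard out-of-scope items."""
    reduced = []
    for item, cls in records:
        crm_class = reduce_class(cls)
        if crm_class is not None:
            reduced.append((item, crm_class))
    return reduced
```

Walking upwards from the extension class and stopping at the first CRM class reached is what yields the most specific applicable CRM concept under the single-superclass assumption.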
A data structure is partially import-compatible
with the CRM if the above holds for a reduced CRM-compatible form.
An information system is export-compatible
with the CRM if it is possible to export all user data from this
information system into an import-compatible data structure. This capability is
the recommended kind of CRM-compatibility for local information systems.
An information system is partially export compatible if it
is possible to export all user data from this information system into a
partially import-compatible data structure. This is not the recommended kind of
CRM-compatibility, but it may not be feasible for legacy systems to acquire a
higher level of CRM compatibility without unreasonable effort. This reduced
level of CRM compatibility is nonetheless highly useful.
Note that there is no minimum requirement for the classes and properties
that must be present in the exported user data. Therefore it is possible that
the data may pertain to instances of just a single property, such as E21
Person. P131 is identified by: E82 Actor Appellation.
An information system is import-compatible
with the CRM if it is possible to import data encoded in a CRM-compatible
form and to access the data in a manner equivalent to and homogeneous with all
generic data of this system that fall under the same concepts. This capability
is considered as the normal kind of CRM compatibility for integrated access
systems that physically copy source data in a data warehouse style
(materialized access systems).
An information system is partially import-compatible with the CRM
if it is possible to import data encoded in a reduced CRM-compatible form and
to access the data in a manner equivalent to and homogeneous with all generic
data of this system that fall under the same concepts. Depending on the
functional requirements, it makes sense for integrated access systems to offer
access services of reduced complexity by being only partially import-compatible
with the CRM.
Note that it makes sense for integrated access systems to import data
from extended data structures by semantic reduction to CRM defined concepts.
Note that local information system providers may choose to make their
systems import-compatible with the CRM in order to exchange data, for example
in the case of museum object loans or for system migration purposes.
Communities of practice may choose to agree on import compatibility for
extended data structures.
Some local information systems are likely to focus on specialized
subject areas, such as inscriptions. For these specialized systems, the ability
to import a specific data structure is recommended. This should be
export-compatible with the CRM, and encompass the concepts that are required by
the subject matter (“dedicated import compatibility”).
An information system is access-compatible with the CRM if it is possible to access the user data in the information system by
querying with CRM classes and properties so that the meaning of the answers to
the queries corresponds to the query terms used. It is not regarded as a
reduction of compatibility if access is limited to data deemed to be exchanged.
An information system is partially access-compatible with the CRM
if it is possible to access the user data in the information system by querying
with a consistent subset of CRM classes and properties, corresponding to a
reduced CRM-compatible form, so that the meaning of the answers to the queries
corresponds to the query terms used.
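One way to read the requirement that answers correspond to the query terms is that a query for a class should also return instances of its subclasses. The sketch below illustrates this reading with an excerpt of the class hierarchy of this section; it assumes a single superclass per class (ignoring, for example, that E12 Production is also listed under E63 Beginning of Existence), and the instance names are hypothetical.

```python
# Excerpt of the hierarchy: class code -> one direct superclass (simplified).
SUPERCLASS = {"E2": "E1", "E4": "E2", "E5": "E4", "E7": "E5", "E11": "E7", "E12": "E11"}


def is_a(cls, target):
    """True if cls equals target or is a (transitive) subclass of it."""
    while cls is not None:
        if cls == target:
            return True
        cls = SUPERCLASS.get(cls)
    return False


def query_by_class(instances, target):
    """Return the names of all instances whose class is subsumed by target."""
    return sorted(name for name, cls in instances.items() if is_a(cls, target))
```

With instances `{"casting_1": "E12", "repair_1": "E11", "exhibition_1": "E7"}`, a query for E11 Modification returns both the repair and the casting, since every production is also a modification in this excerpt.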
An access-compatible system may be export-compatible with respect
to the query answers. Note that it may make sense for an access-compatible
content management system to return only content items in response to queries
rather than being export compatible.
Fig. 1: Possible data flow between different kinds of CRM-compatible systems and
data structures
Fig. 1 shows a symbolic representation of some of the data flow patterns
defined above between different kinds of CRM-compatible systems and data
structures. In this figure it is assumed that the Local System B exports data
into a CRM export-compatible data structure, which implies that it can be
exported into a CRM-compatible form or any other CRM import-compatible data
structure. Therefore Local System B is export-compatible with the CRM. For
Local System A, the figure symbolizes the case where the exported data contain
elements that correspond to specializations of the CRM or fall out of its
scope.
A provider of a data structure or information system claiming
compatibility with the CRM has to provide a declaration that describes the kind
of compatibility and, depending on the kind, the following additional
information:
· For export-compatible data structures:
The subset of CRM concepts directly instantiated by any possible data in
this data structure after transformation into a CRM-compatible form.
· For export-compatible systems:
· For partially or dedicated import-compatible systems:
The subset of CRM concepts under which data can be imported into the
system.
· For access-compatible systems:
a. The query language by which the system can be queried.
b. The subset of CRM concepts directly instantiated by any possible query
answers exported from the system after transformation into a CRM-compatible
form.
c. For partially access-compatible systems, the subset of CRM concepts by
which the system can be queried.
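The information required by a compatibility claim can be captured in a simple record. The sketch below is one possible representation, not a prescribed format: the class name, the field names and the three kind labels are assumptions, "dedicated" import compatibility is not modelled separately, and a single concept subset stands in for the two subsets that an access-compatible system may need to declare.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional

KINDS = ("export", "import", "access")  # assumed labels for the kinds of compatibility


@dataclass(frozen=True)
class CompatibilityClaim:
    kind: str
    partial: bool = False
    concepts: FrozenSet[str] = frozenset()  # declared subset of CRM concepts
    query_language: Optional[str] = None    # only relevant for access compatibility

    def missing_information(self) -> List[str]:
        """List the items that the declaration still lacks."""
        missing = []
        if self.kind not in KINDS:
            missing.append("kind of compatibility")
        if self.kind == "access" and not self.query_language:
            missing.append("query language")
        needs_subset = self.partial or self.kind in ("export", "access")
        if needs_subset and not self.concepts:
            missing.append("subset of CRM concepts")
        return missing
```

A claim of full import compatibility needs no further information in this sketch, because the list above only asks for a concept subset in the partial or dedicated case.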
The provider should be able to demonstrate the claim with suitable test
data. The provider should be able to demonstrate its claim according to certain
procedures included in any applicable certificate practice related statement.
The provider should either make evidence of these procedures publicly available
on the Internet on a site nominated by the ISO community of use, so that any
third party is able to verify the claim with suitable test data, or acquire a
certificate by a certification authority (CA).
A trusted third party, recognised and authorised by a competent regulatory
authority to act as a CA in this practice area, should be able to verify the
credentials of the provider applying for such a certificate, and thus its
claim, with suitable test data before issuing the certificate, so that the users
can trust the information in the CA certificates.
The CA will grant the provider of the certified system the right to use the
“CRM compatible” logo.
The CRM is an ontology in the sense used in computer science. It has
been expressed as an object-oriented semantic model, in the hope that this
formulation will be comprehensible to both documentation experts and
information scientists alike, while at the same time being readily converted to
machine-readable formats such as RDF Schema, KIF, DAML+OIL, OWL, STEP, etc. It
can be implemented in any relational or object-oriented schema. CRM instances
can also be encoded in RDF, XML, DAML+OIL, OWL and others.
Although the definition of the CRM provided here is complete, it is an
intentionally compact and concise presentation of the CRM’s 86 classes and 137
unique properties. It does not attempt to articulate the inheritance of
properties by subclasses throughout the class hierarchy (this would require the
declaration of several thousand properties, as opposed to 137). However, this
definition does contain all of the information necessary to infer and
automatically generate a full declaration of all properties, including
inherited properties.
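As an illustration of how such a full declaration could be generated, the
following minimal Python sketch propagates properties down a small class
hierarchy. The class names follow examples used in this document; the property
labels, the two dictionaries and the function names are assumptions made for
illustration only and are not CRM declarations.

```python
from typing import Dict, List, Set

# Illustrative excerpt only: class names follow the examples in this document,
# property labels are placeholders rather than actual CRM property names.
SUPERCLASSES: Dict[str, List[str]] = {
    "Physical Object": [],
    "Actor": [],
    "Biological Object": ["Physical Object"],
    "Person": ["Biological Object", "Actor"],
}
DIRECT_PROPERTIES: Dict[str, List[str]] = {
    "Physical Object": ["can be moved"],
    "Actor": ["is member of"],
    "Biological Object": [],
    "Person": ["may die"],
}


def all_superclasses(cls: str) -> Set[str]:
    """Return every direct and indirect superclass (IsA is transitive)."""
    result: Set[str] = set()
    stack = list(SUPERCLASSES[cls])
    while stack:
        parent = stack.pop()
        if parent not in result:
            result.add(parent)
            stack.extend(SUPERCLASSES[parent])
    return result


def full_declaration(cls: str) -> List[str]:
    """Return the class's own properties plus all inherited ones, sorted."""
    props = set(DIRECT_PROPERTIES[cls])
    for parent in all_superclasses(cls):
        props.update(DIRECT_PROPERTIES[parent])
    return sorted(props)
```

Under these assumptions, full_declaration("Person") collects the properties
declared for Person, Biological Object, Physical Object and Actor, which is
the kind of expansion that the compact definition leaves implicit.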
The following definitions of key terminology used in this document are
provided both as an aid to readers unfamiliar with object-oriented modelling
terminology, and to specify, for the purpose of this document, the precise
usage of terms that are sometimes applied inconsistently across the
object-oriented modelling community. Where applicable, the editors have tried to
consistently use terminology that is compatible with that of the Resource
Description Framework (RDF)[3],
a recommendation of the World Wide Web Consortium. The editors have tried to
find a language which is comprehensible to the non-computer expert and precise
enough for the computer expert so that both understand the intended meaning.
Class |
A class is a category of items that share one or more common traits
serving as criteria to identify the items belonging to the class. These traits
need not be explicitly formulated in logical terms, but may be described in a
text (here called a scope note) that refers to a common
conceptualisation of domain experts. The sum of these traits is called the intension
of the class. A class may be the domain or range of none, one
or more properties formally defined in a model. The formally defined
properties need not be part of the intension of their domains or ranges: such
properties are optional. An item that belongs to a class is called an instance
of this class. A class is associated with an open set of real life instances,
known as the extension of the class. Here “open” is used in the sense
that it is generally beyond our capabilities to know all instances of a class
in the world and indeed that the future may bring new instances about at any
time (Open World). Therefore a class cannot be defined by enumerating
its instances. A class plays a role analogous to a grammatical noun, and can
be completely defined without reference to any other construct (unlike
properties, which must have an unambiguously defined domain and
range). In some contexts, the terms individual class, entity or node are used
synonymously with class.
For example:
Person is a class. To be a Person may actually be determined by DNA
characteristics, but we all know what a Person is. A Person may have the
property of being a member of a Group, but it is not necessary to be a member
of a Group in order to be a Person. We shall never know all Persons of the
past. There will be more Persons in the future. |
subclass |
A subclass is a class that is a specialization of another class
(its superclass). Specialization or the IsA relationship means that:
A subclass can have more than one immediate superclass and
consequently inherits the properties of all of its superclasses (multiple
inheritance). The IsA relationship or specialization between two or more
classes gives rise to a structure known as a class hierarchy. The IsA
relationship is transitive and may not be cyclic. In some contexts (e.g. the
programming language C++) the term derived class is used synonymously with
subclass.
For example:
Every Person IsA Biological Object, or Person is a subclass of Biological
Object. Also, every Person IsA Actor. A Person may die. However, other kinds
of Actors, such as companies, don’t die (cf. 2). Every Biological Object IsA
Physical Object. A Physical Object can be moved. Hence a Person can be moved
also (cf. 3). |
superclass |
A superclass is a class that is a generalization of one or more
other classes (its subclasses), which means that it subsumes all instances
of its subclasses, and that it can also have additional instances that do not
belong to any of its subclasses. The intension of the superclass is
less restrictive than any of its subclasses. This subsumption relationship or
generalization is the inverse of the IsA relationship or specialization. In
some contexts (e.g. the programming language C++) the term parent class is
used synonymously with superclass.
For example:
“Biological Object subsumes Person” is synonymous with “Biological Object is a
superclass of Person”. Fewer traits are needed to identify an item as a
Biological Object than to identify it as a Person. |
intension |
The intension of a class or property is its intended
meaning. It consists of one or more common traits shared by all instances
of the class or property. These traits need not be explicitly formulated in
logical terms, but may just be described in a text (here called a scope
note) that refers to a conceptualisation common to domain experts. In
particular the so-called primitive concepts, which make up most of the
CRM, cannot be further reduced to other concepts by logical terms. |
extension |
The extension of a class is the set of all real life instances
belonging to the class that fulfil the criteria of its intension. This
set is “open” in the sense that it is generally beyond our capabilities to
know all instances of a class in the world and indeed that the future may
bring new instances about at any time (Open World). An information
system may at any point in time refer to some instances of a class, which
form a subset of its extension. |
scope note |
A scope note is a textual description of the intension of a class
or property. Scope notes are not formal modelling constructs, but are provided to
help explain the intended meaning and application of the CRM’s classes and
properties. Basically, they refer to a conceptualisation common to domain
experts and disambiguate between different possible interpretations.
Illustrative example instances of classes and properties are also
regularly provided in the scope notes for explanatory purposes. |
instance |
An instance of a class is a real-world
item that fulfils the criteria of the intension of the class.
Note that the number of instances declared for a class in an
information system is typically smaller than the total in the real world. For
example, you are an instance of Person, but you are not mentioned in all
information systems describing Persons.
For example:
The painting known as “The Mona Lisa” is an instance of the class Man-Made
Object.
An instance of a property is a factual relation between an
instance of the domain and an instance of the range of the
property that matches the criteria of the intension of the property.
For example:
“The Louvre is current owner of The Mona Lisa” is an instance of the property
“is current owner of”. |
property |
A property serves to define a relationship of a specific kind between
two classes. The property is characterized by an intension,
which is conveyed by a scope note. A property plays a role analogous
to a grammatical verb, in that it must be defined with reference to both its domain
and range, which are analogous to the subject and object in grammar
(unlike classes, which can be defined independently). It is arbitrary which
class is selected as the domain, just as the choice between active and
passive voice in grammar is arbitrary. In other words, a property can be
interpreted in both directions, with two distinct, but related
interpretations. Properties may themselves have properties that relate to
other classes (this feature is used in this model only in order to describe
dynamic subtyping of properties). Properties can also be specialized in the
same manner as classes, resulting in IsA relationships between subproperties
and their superproperties. In some contexts, the terms attribute, reference,
link, role or slot are used synonymously with property.
For example:
“Physical Man-Made Thing depicts CRM Entity” is equivalent to “CRM Entity is
depicted by Physical Man-Made Thing”. |
subproperty |
A subproperty is a property that is a specialization of another
property (its superproperty). Specialization or IsA relationship means
that:
A subproperty can have more than one immediate superproperty and
consequently inherits the properties of all of its superproperties (multiple
inheritance). The IsA relationship or specialization between two or more
properties gives rise to the structure we call a property hierarchy. The IsA
relationship is transitive and may not be cyclic. Some object-oriented programming
languages, such as C++, do not contain constructs that allow for the
expression of the specialization of properties as subproperties. |
superproperty |
A superproperty is a property that is a generalization of one
or more other properties (its subproperties), which means that it subsumes
all instances of its subproperties, and that it can also have
additional instances that do not belong to any of its subproperties. The intension
of the superproperty is less restrictive than any of its subproperties. The
subsumption relationship or generalization is the inverse of the IsA
relationship or specialization. |
domain |
The domain is the class for which a property is formally
defined. This means that instances of the property are applicable to instances
of its domain class. A property must have exactly one domain, although the
domain class may always contain instances for which the property is not
instantiated. The domain class is analogous to the grammatical subject of the
phrase for which the property is analogous to the verb. It is arbitrary
which class is selected as the domain and which as the range, just as
the choice between active and passive voice in grammar is arbitrary. Property
names in the CRM are designed to be semantically meaningful and grammatically
correct when read from domain to range. In addition, the inverse
property name, normally given in parentheses, is also designed to be
semantically meaningful and grammatically correct when read from range to
domain. |
range |
The range is the class that comprises all potential values of a
property. That means that instances of the property can link
only to instances of its range class. A property must have exactly one range,
although the range class may always contain instances that are not the value
of the property. The range class is analogous to the grammatical object of a
phrase for which the property is analogous to the verb. It is arbitrary
which class is selected as domain and which as range, just as the
choice between active and passive voice in grammar is arbitrary. Property
names in the CRM are designed to be semantically meaningful and grammatically
correct when read from domain to range. In addition, the inverse property
name, normally given in parentheses, is also designed to be semantically
meaningful and grammatically correct when read from range to domain. |
inheritance |
Inheritance of properties from superclasses to subclasses
means that if an item x is an instance of a class A, then
all optional properties that may hold
for the instances of any of the superclasses of A may also hold for item x. |
strict inheritance |
Strict inheritance means that there are no exceptions to the
inheritance of properties from superclasses to subclasses.
For instance, some systems may declare that elephants are grey, and regard a
white elephant as an exception. Under strict inheritance it would hold that:
if all elephants were grey, then a white elephant could not be an elephant.
Obviously not all elephants are grey. To be grey is not part of the intension
of the concept elephant but an optional property. The CRM applies strict
inheritance as a normalization principle. |
multiple inheritance |
Multiple inheritance means that a
class A may have more than one immediate superclass. The extension
of a class with multiple immediate superclasses is a subset of the
intersection of all extensions of its superclasses. The intension of a
class with multiple immediate superclasses extends the intensions of all its
superclasses, i.e. its traits are more restrictive than any of its
superclasses. If multiple inheritance is used, the resulting “class
hierarchy” is a directed graph and not a tree structure. If it is represented
as an indented list, there are necessarily repetitions of the same class at
different positions in the list. For example, Person is both an Actor
and a Biological Object. |
endurant, perdurant |
“The difference between enduring and
perduring entities (which we shall also call endurants and perdurants)
is related to their behaviour in time. Endurants are wholly present (i.e., all
their proper parts are present) at any time they are present. Perdurants, on
the other hand, just extend in time by accumulating different temporal parts,
so that, at any time they are present, they are only partially present, in
the sense that some of their proper temporal parts (e.g., their previous or
future phases) may be not present. E.g., the piece of paper you are reading
now is wholly present, while some temporal parts of your reading are not
present any more. Philosophers say that endurants are entities that are in
time, while lacking however temporal parts (so to speak, all their parts flow
with them in time). Perdurants, on the other hand, are entities that happen
in time, and can have temporal parts (all their parts are fixed in time).”
(Gangemi et al. 2002, pp. 166-181). |
shortcut |
A shortcut is a formally defined single property that
represents a deduction or join of a data path in the CRM. The scope notes
of all properties characterized as shortcuts describe in words the equivalent
deduction. Shortcuts are introduced for the cases where common documentation
practice refers only to the deduction rather than to the fully developed
path. For example, museums often only record the dimension of an object
without documenting the Measurement that observed it. The CRM declares
shortcuts explicitly as single properties in order to allow the user to
describe cases in which less detailed knowledge is available than the full
data path would require. For each shortcut, the CRM contains in its schema
the properties of the full data path explaining the shortcut. |
monotonic reasoning |
Monotonic reasoning is a term from
knowledge representation. A form of reasoning is monotonic if an addition to
the set of propositions making up the knowledge base never reduces the set of
conclusions that may be derived from the knowledge base via inference rules.
In practical terms, if experts successively enter correct statements into an
information system, the system should not regard any results derived from
earlier statements as invalid when a new one is entered. The CRM is
designed for monotonic reasoning and so enables conflict-free merging of huge
stores of knowledge. |
disjoint |
Classes are disjoint if the intersection of
their extensions is an empty set. In other words, they have no common instances
in any possible world. |
primitive |
The term primitive as used in knowledge representation characterizes a
concept that is declared and whose meaning is agreed upon, but that is not
defined by a logical deduction from other concepts. For example, mother may
be described as a female human with child. Then mother is not a primitive
concept. Event, however, is a primitive concept. Most of the CRM is made up of
primitive concepts. |
Open World |
The “Open World Assumption” is a term from knowledge base systems. It
characterizes knowledge base systems that assume the information stored is
incomplete relative to the universe of discourse they intend to describe.
This incompleteness may be due to the inability of the maintainer to provide
sufficient information or due to more fundamental problems of cognition in
the system’s domain. Such problems are characteristic of cultural information
systems. Our records about the past are necessarily incomplete. In addition,
there may be items that cannot be clearly assigned to a given class. In particular, absence of a certain property for an item
described in the system does not mean that this item does not have this property.
For example, if one item is described as Biological Object and another as
Physical Object, this does not imply that the latter may not be a Biological
Object as well. Therefore complements of a class with respect to a superclass
cannot be concluded in general from an information system using the Open
World Assumption. For example, one cannot list “all Physical Objects known to
the system that are not Biological Objects in the real world”, but one may of
course list “all items known to the system as Physical Objects but that are
not known to the system as Biological Objects”. |
complement |
The complement of a class A with respect to one of its superclasses
B is the set of all instances of B that are not instances of A.
Formally, it is the set-theoretic difference of the extension of B
minus the extension of A. Compatible extensions of the CRM should not declare
any class with the intension of being the complement of
one or more other classes. To do so will normally violate the desire to
describe an Open World. For example, for all possible cases of human
gender, male should not be declared as the complement of female or vice
versa. What if someone is both or even of another kind? |
query containment |
Query containment is a problem from database
theory: A query X contains another query Y if, for each possible population
of a database, the answer set to query X also contains the answer set to query
Y. If queries X and Y were classes, then X would be a superclass of Y. |
interoperability |
Interoperability means the capability of
different information systems to communicate some of their contents. In
particular, it may mean that
Generally, syntactic interoperability is distinguished from semantic
interoperability. Syntactic interoperability means that the
information encoding of the involved systems and the access protocols are
compatible, so that information can be processed as described above without
error. However, this does not mean that each system processes the data in a
manner consistent with the intended meaning. For example, one system may use
a table called “Actor” and another one called “Agent”. With syntactic
interoperability, data from both tables may only be retrieved as distinct,
even though they may have exactly the same meaning. To overcome this
situation, semantic interoperability has to be added. The CRM relies on
existing syntactic interoperability and is concerned only with adding semantic
interoperability. |
semantic interoperability |
Semantic interoperability means the capability of different
information systems to communicate information consistent with the intended
meaning. In more detail, the intended meaning encompasses
Obviously communication about data structure must be resolved first. In
this case consistent communication means that data can be transferred between
data structure elements with the same intended meaning or that data from
elements with the same intended meaning can be merged. In practice, the
different levels of generalization in different systems do not allow the
achievement of this ideal. Therefore semantic interoperability is regarded as
achieved if elements can be found that provide a reasonably close
generalization for the transfer or merge. This problem is being studied
theoretically as the query containment problem. The CRM is only
concerned with semantic interoperability on the level of data structure
elements. |
property quantifiers |
We use the term "property
quantifiers" for the declaration of the allowed number of instances
of a certain property that can refer to a particular instance of the range
class or the domain class of that property. These declarations are
ontological, i.e. they refer to the nature of the real world described and
not to our current knowledge. For example, each person has exactly one
father, but collected knowledge may refer to none, one or many. |
universal |
The fundamental ontological distinction between universals and
particulars can be informally understood by considering their relationship
with instantiation: particulars are entities that have no instances in
any possible world; universals are entities that do have instances. Classes
and properties (corresponding to predicates in a logical language)
are usually considered to be universals. (after Gangemi et al. 2002, pp.
166-181). |
Quantifiers for properties are provided
for the purpose of semantic clarification only, and should not be
treated as implementation recommendations. The CRM has been designed to
accommodate alternative opinions and incomplete information, and therefore all
properties should be implemented as optional and repeatable for their domain
and range (“many to many (0,n:0,n)”). Therefore the term “cardinality
constraints” is avoided here, as it typically pertains to implementations.
The following table lists all possible property quantifiers occurring in
this document by their notation, together with an explanation in plain words.
In order to provide optimal clarity, two widely accepted notations are used
redundantly in this document, a verbal and a numeric one. The verbal notation
uses phrases such as “one to many”, and the numeric one, expressions such as
“(0,n:0,1)”. While the terms “one”, “many” and “necessary” are quite intuitive,
the term “dependent” denotes a situation where a range instance cannot exist
without an instance of the respective property. In other words, the property is
“necessary” for its range.
many to many (0,n:0,n) |
Unconstrained: An individual domain
instance and range instance of this property can have zero, one or more
instances of this property. In other words, this property is optional and
repeatable for its domain and range. |
one to many (0,n:0,1) |
An individual domain instance of this property
can have zero, one or more instances of this property, but an individual
range instance cannot be referenced by more than one instance of this
property. In other words, this property is optional for its domain and range,
but repeatable for its domain only. In some contexts this situation is called
a “fan-out”. |
many to one (0,1:0,n) |
An individual domain instance of this
property can have zero or one instance of this property, but an individual
range instance can be referenced by zero, one or more instances of this
property. In other words, this property is optional for its domain and range,
but repeatable for its range only. In some contexts this situation is called
a “fan-in”. |
many to many, necessary (1,n:0,n) |
An individual domain instance of this
property can have one or more instances of this property, but an individual
range instance can have zero, one or more instances of this property. In
other words, this property is necessary and repeatable for its domain, and
optional and repeatable for its range. |
one to many, necessary (1,n:0,1) |
An individual domain instance of this property can have one or more
instances of this property, but an individual range instance cannot be
referenced by more than one instance of this property. In other words, this
property is necessary and repeatable for its domain, and optional but not
repeatable for its range. In some contexts this situation is called a
“fan-out”. |
many to one, necessary (1,1:0,n) |
An individual domain instance of this property must have exactly one
instance of this property, but an individual range instance can be referenced
by zero, one or more instances of this property. In other words, this
property is necessary and not repeatable for its domain, and optional and
repeatable for its range. In some contexts this situation is called a
“fan-in”. |
one to many, dependent (0,n:1,1) |
An individual domain instance of this property can have zero, one or
more instances of this property, but an individual range instance must be
referenced by exactly one instance of this property. In other words, this
property is optional and repeatable for its domain, but necessary and not
repeatable for its range. In some contexts this situation is called a
“fan-out”. |
one to many, necessary, dependent (1,n:1,1) |
An individual domain instance of this property can have one or more
instances of this property, but an individual range instance must be
referenced by exactly one instance of this property. In other words, this
property is necessary and repeatable for its domain, and necessary but not
repeatable for its range. In some contexts this situation is called a
“fan-out”. |
many to one, necessary, dependent (1,1:1,n) |
An individual domain instance of this property must have exactly one
instance of this property, but an individual range instance can be referenced
by one or more instances of this property. In other words, this property is
necessary and not repeatable for its domain, and necessary and repeatable for
its range. In some contexts this situation is called a “fan-in”. |
one to one (1,1:1,1) |
An individual domain instance and range instance of this property must
have exactly one instance of this property. In other words, this property is necessary
and not repeatable for its domain and for its range. |
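The numeric notation in the table above can be read mechanically: the first
pair gives the minimum and maximum for the domain, the second pair those for
the range. The following Python sketch, offered only as an aid to reading the
notation, parses an expression such as “(1,n:0,1)” and restates it in the
vocabulary of the table. The names Quantifier, parse_quantifier and describe
are assumptions introduced for this illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantifier:
    domain_min: int   # 1 means "necessary" for the domain
    domain_max: str   # "n" means repeatable for the domain
    range_min: int    # 1 means the range is "dependent" on the property
    range_max: str    # "n" means repeatable for the range


_PATTERN = re.compile(r"\(([01]),([1n]):([01]),([1n])\)")


def parse_quantifier(text: str) -> Quantifier:
    """Parse the numeric notation, e.g. "(0,n:0,1)"."""
    match = _PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a property quantifier: {text!r}")
    dmin, dmax, rmin, rmax = match.groups()
    return Quantifier(int(dmin), dmax, int(rmin), rmax)


def describe(q: Quantifier) -> str:
    """Restate a quantifier using the terms of the table above."""
    def side(minimum: int, maximum: str) -> str:
        need = "necessary" if minimum == 1 else "optional"
        repeat = "repeatable" if maximum == "n" else "not repeatable"
        return f"{need} and {repeat}"

    return (f"domain: {side(q.domain_min, q.domain_max)}; "
            f"range: {side(q.range_min, q.range_max)}")
```

For instance, describe(parse_quantifier("(1,n:0,1)")) yields a description
matching the row “one to many, necessary”. As stated above, such quantifiers
serve semantic clarification and are not implementation constraints.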
The CRM defines some dependencies between
properties and the classes that are their domains or ranges. These can be one
or both of the following:
A) the property is necessary for the domain
B) the property is necessary for the range, or, in other words, the
range is dependent on the property.
The possible kinds of dependencies are
defined in the table above. Note that if a dependent property is not specified
for an instance of the respective domain or range, it means that the property
exists, but the value on one side of the property is unknown. In the case of
optional properties, the methodology proposed by the CRM does not distinguish
between a value being unknown or the property not being applicable at all. For
example, one may know that an object has an owner, but the owner is unknown. In
a CRM instance this case cannot be distinguished from the fact that the object
has no owner at all. Of course, such details can always be specified by a
textual note.
The following naming conventions have been
applied throughout the CRM:
· Classes are identified by numbers preceded by the letter “E” (historically
classes were sometimes referred to as “Entities”), and are named using noun
phrases (nominal groups) in title case (initial capitals). For example, E63
Beginning of Existence.
· Properties are identified by numbers preceded by the letter “P”, and are
named in both directions using verbal phrases in lower case. Properties with
the character of states are named in the present tense, such as “has type”,
whereas properties related to events are named in the past tense, such as
“carried out”. For example, P126 employed (was employed in).
· Property names should be read in their non-parenthetical form for the
domain-to-range direction, and in parenthetical form for the range-to-domain
direction.
· Properties with a range that is a subclass of E59 Primitive Value (such as
E1 CRM Entity. P3 has note: E62 String, for example) have no parenthetical
name form, because reading the property name in the range-to-domain direction
is not regarded as meaningful.
· Properties that have identical domain and range are either symmetric or
transitive. Instantiating a symmetric property implies that the same relation
holds for both the domain-to-range and the range-to-domain directions. An
example of this is E53 Place. P122 borders with: E53 Place. The names of
symmetric properties have no parenthetical form, because reading in the
range-to-domain direction is the same as the domain-to-range reading.
Transitive asymmetric properties, such as E4 Period. P9 consists of (forms
part of): E4 Period, have a parenthetical form that relates to the meaning of
the inverse direction.
· The choice of the domain of properties, and hence the order of their names,
is established in accordance with the following priority list:
  · Temporal Entity and its subclasses
  · Thing and its subclasses
  · Actor and its subclasses
  · Other
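The property naming convention described above is regular enough to be split
into its parts automatically. The following Python sketch is one possible way
to do so; the names PropertyName and parse_property_name and the regular
expression are assumptions made for this illustration, not part of the CRM.

```python
import re
from typing import NamedTuple, Optional


class PropertyName(NamedTuple):
    identifier: str          # e.g. "P126"
    forward: str             # name read from domain to range
    inverse: Optional[str]   # parenthetical name, read from range to domain


# "P<number> <forward name>" optionally followed by "(<inverse name>)".
_PROPERTY = re.compile(r"^(P\d+)\s+([^()]+?)(?:\s+\(([^()]+)\))?$")


def parse_property_name(label: str) -> PropertyName:
    """Split a property label into identifier, forward and inverse name."""
    match = _PROPERTY.match(label.strip())
    if match is None:
        raise ValueError(f"not a property label: {label!r}")
    return PropertyName(match.group(1), match.group(2), match.group(3))
```

A label without a parenthetical form, such as “P3 has note” or the symmetric
“P122 borders with”, is returned with inverse set to None.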
The following modelling principles have guided and informed the
development of the CIDOC CRM.
Because the CRM’s primary role is the meaningful integration of information
in an Open World, it aims to be monotonic in the sense of Domain Theory. That
is, the existing CRM constructs and the deductions made from them must always
remain valid and well-formed, even as new constructs are added by extensions to
the CRM.
For example:
One may add a subclass of E7 Activity to describe the practice of an
instance of group to use a certain name for a place over a certain time-span.
By this extension, no existing IsA Relationships or property inheritances are
compromised.
In addition, the CRM aims to enable the
formal preservation of monotonicity when augmenting a particular CRM compatible
system. That is, existing CRM instances, their properties and deductions made
from them, should always remain valid and well-formed, even as new instances,
regarded as consistent by the domain expert, are added to the system.
For example:
If someone describes correctly that an item is an instance of E19
Physical Object, and later it is correctly characterized as an instance of E20
Biological Object, the system should not stop treating it as an instance of E19
Physical Object.
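The behaviour described in this example can be sketched in Python as follows.
This is only a minimal illustration under stated assumptions: the class
KnowledgeBase, its methods and the single-entry SUPERCLASS table are
hypothetical, and a real system would hold the complete class hierarchy.

```python
from typing import Dict, Optional, Set, Tuple

# Excerpt of the hierarchy used in the example above (assumed single parent
# here for brevity; the CRM itself allows multiple inheritance).
SUPERCLASS: Dict[str, str] = {
    "E20 Biological Object": "E19 Physical Object",
}


class KnowledgeBase:
    """Stores class assertions; statements are only ever added."""

    def __init__(self) -> None:
        self._asserted: Set[Tuple[str, str]] = set()

    def assert_instance(self, item: str, cls: str) -> None:
        self._asserted.add((item, cls))

    def classes_of(self, item: str) -> Set[str]:
        """All classes of the item, including those implied by IsA."""
        result: Set[str] = set()
        for asserted_item, cls in self._asserted:
            if asserted_item != item:
                continue
            current: Optional[str] = cls
            while current is not None:
                result.add(current)
                current = SUPERCLASS.get(current)
        return result
```

Because assertions are never removed, the set returned by classes_of for an
item after a further assertion always contains the set returned before it.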
In order to formally preserve monotonicity for the frequent cases of
alternative opinions, all formally defined properties should be implemented as
unconstrained (many: many) so that conflicting instances of properties
are merely accumulated. Thus knowledge integrated following the CRM serves as a
research base, accumulating relevant alternative opinions around well-defined
entities, whereas conclusions about the truth are the task of open-ended
scientific or scholarly hypothesis building.
For example:
El Greco and even King Arthur should
always remain instances of E21 Person and be dealt with as existing within
the sense of our discourse, once they are entered into our knowledge base.
Alternative opinions about properties, such as their birthplaces and their
living places, should be accumulated without validity decisions being made
during data compilation.
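A store that treats every property as many-to-many could, for instance, be
sketched in Python as shown below. The class PropertyStore, the property label
and the place names are hypothetical placeholders chosen for illustration;
they are not CRM declarations or documented facts.

```python
from collections import defaultdict
from typing import DefaultDict, Set, Tuple


class PropertyStore:
    """Accumulates property values without enforcing uniqueness."""

    def __init__(self) -> None:
        self._values: DefaultDict[Tuple[str, str], Set[str]] = defaultdict(set)

    def add(self, subject: str, prop: str, value: str) -> None:
        # Conflicting values are simply accumulated; nothing is overwritten.
        self._values[(subject, prop)].add(value)

    def values(self, subject: str, prop: str) -> Set[str]:
        return set(self._values.get((subject, prop), set()))


store = PropertyStore()
# Two alternative opinions (placeholder values) about the same property.
store.add("El Greco", "was born in (illustrative label)", "Place A")
store.add("El Greco", "was born in (illustrative label)", "Place B")
```

Both values remain retrievable; deciding between them is left to later
scholarly interpretation rather than to the data compilation step.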
Although the scope of the CRM is very
broad, the model itself is constructed as economically as possible.
· A class is not declared unless it is required as the domain or range of a
property not appropriate to its superclass, or it is a key concept in the
practical scope.
· CRM classes and properties that share a superclass are non-exclusive by
default. For example, an object may be both an instance of E20 Biological
Object and E22 Man-made Object.
· CRM classes and properties are either primitive, or they are key concepts in
the practical scope.
· Complements of CRM classes are not declared.
Some properties are declared as shortcuts of longer, more
comprehensively articulated paths that connect the same domain and range
classes as the shortcut property via one or more intermediate classes. For
example, the property E18 Physical Thing. P52 has current owner (is current
owner of): E39 Actor, is a shortcut for a fully articulated path from E18
Physical Thing through E8 Acquisition to E39 Actor. An instance of the
fully-articulated path always implies an instance of the shortcut property.
However, the inverse may not be true; an instance of the fully-articulated path
cannot always be inferred from an instance of the shortcut property.
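The one-directional inference can be illustrated with the dimension example
given in the terminology section, where a recorded dimension stands in for a
documented Measurement. The Python sketch below assumes that P43 has dimension
is the shortcut for the path through P39 measured and P40 observed dimension;
the triple representation, the function name and the instance identifiers are
hypothetical.

```python
from typing import Dict, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, property, object)


def derive_shortcuts(triples: Set[Triple]) -> Set[Triple]:
    """Derive 'P43 has dimension' from each fully articulated path."""
    measured: Dict[str, Set[str]] = {}   # measurement -> measured things
    observed: Dict[str, Set[str]] = {}   # measurement -> observed dimensions
    for subject, prop, obj in triples:
        if prop == "P39 measured":
            measured.setdefault(subject, set()).add(obj)
        elif prop == "P40 observed dimension":
            observed.setdefault(subject, set()).add(obj)
    return {
        (thing, "P43 has dimension", dimension)
        for measurement in measured.keys() & observed.keys()
        for thing in measured[measurement]
        for dimension in observed[measurement]
    }
```

The sketch only works from path to shortcut. Given a shortcut statement alone,
nothing in it identifies the intermediate Measurement, which is why the
inverse inference is not generally possible.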
The class E13 Attribute Assignment allows
for the documentation of how the assignment of any property came about, and
whose opinion it was, even in cases of properties not explicitly characterized
as “shortcuts”.
Classes are disjoint if they share no
common instances in any possible world. There are many examples of disjoint
classes in the CRM.
A comprehensive declaration of all
possible disjoint class combinations afforded by the CRM has not been provided
here; it would be of questionable practical utility, and may easily become
inconsistent with the goal of providing a concise definition. However, there
are two key examples of disjoint class pairs that are fundamental to effective
comprehension of the CRM:
·
E2 Temporal Entity is disjoint from E77 Persistent
Item. Instances of the class E2 Temporal Entity
are perdurants, whereas instances of the class E77 Persistent Item are
endurants. Even though instances of E77 Persistent Item have a limited
existence in time, they are fundamentally different in nature from instances of
E2 Temporal Entity, because they preserve their identity between events.
Declaring endurants and perdurants as disjoint classes is consistent with the
distinctions made in data structures that fall within the CRM’s practical
scope.
·
E18 Physical Thing is disjoint from E28 Conceptual
Object. The distinction is between material and
immaterial items, the latter being exclusively man-made. Instances of E18
Physical Thing and E28 Conceptual Object differ in many fundamental ways; for
example, the production of instances of E18 Physical Thing implies the
incorporation of physical material, whereas the production of instances of E28
Conceptual Object does not. Similarly, instances of E18 Physical Thing cease to
exist when destroyed, whereas an instance of E28 Conceptual Object perishes
when it is forgotten or its last physical carrier is destroyed.
Virtually all structured descriptions of museum objects begin with a
unique object identifier and information about the "type" of the
object, often in a set of fields with names like "Classification",
"Category", "Object Type", "Object Name", etc.
All these fields are used for terms that declare that the object belongs to a
particular category of items. In the CRM the class E55 Type comprises such
terms from thesauri and controlled vocabularies used to characterize and
classify instances of CRM classes. Instances of E55 Type represent
concepts (universals) in contrast to instances of E41 Appellation which are
used to name instances of CRM classes.
E55 Type is the CRM’s interface to domain specific ontologies and
thesauri. These can be represented in the CRM as subclasses of E55 Type,
forming hierarchies of terms, i.e. instances of E55 Type linked via P127 has
broader term (has narrower term). Such hierarchies may be extended with
additional properties.
For this purpose the CRM provides two basic properties that describe
classification with terminology, corresponding to what is the current practice
in the majority of information systems. The class E1 CRM Entity is the domain
of the property P2 has type (is type of), which has the range E55 Type.
Consequently, every class in the CRM, with the exception of E59 Primitive
Value, inherits the property P2 has type (is type of). This provides a
general mechanism for simulating a specialization of the classification of CRM
instances to any level of detail, by linking to external vocabulary sources,
thesauri, classification schema or ontologies.
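As an illustration of how P2 has type and a term hierarchy can work together, the following minimal sketch retrieves instances by a term or any of its narrower terms. The terms, the instance identifiers and the restriction to a single broader term per term are assumptions made for brevity.

```python
# P127 has broader term: narrower term -> broader term (hypothetical terms).
broader = {"oil painting": "painting", "painting": "artwork"}

# P2 has type: instance -> set of E55 Type terms (hypothetical instances).
has_type = {
    "object_1": {"oil painting"},
    "object_2": {"artwork"},
    "object_3": {"tool"},
}

def falls_under(term, target):
    """True if term equals target or reaches it via broader-term links."""
    seen = set()
    while term is not None and term not in seen:
        if term == target:
            return True
        seen.add(term)
        term = broader.get(term)
    return False

def instances_of_type(target):
    """Instances typed with the target term or any narrower term."""
    return sorted(
        inst for inst, terms in has_type.items()
        if any(falls_under(t, target) for t in terms)
    )

paintings = instances_of_type("painting")
artworks = instances_of_type("artwork")
```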
Analogous to the function of the P2 has type (is type of) property, some
properties in the CRM are associated with an additional property. These are
numbered in the CRM documentation with a ‘.1’ extension. The range of these
properties of properties always falls under E55 Type. Their purpose is to simulate
a specialization of their parent property through the use of property subtypes
declared as instances of E55 Type. They do not appear in the property hierarchy
list but are included as part of the property declarations and referred to in
the class declarations. For example, P62.1 mode of depiction: E55 Type is
associated with E24 Physical Man-made Thing. P62 depicts (is depicted by): E1
CRM Entity.
The class E55 Type also serves as the range of properties that relate to
categorical knowledge commonly found in cultural documentation. For example,
the property P125 used object of type (was type of object used in) enables
the CRM to express statements such as “this casting was produced using a
mould”, meaning that there has been an unknown or unmentioned object, a mould,
that was actually used. This enables the specific instance of the casting to be
associated with the entire type of manufacturing devices known as moulds.
Further, the objects of type “mould” would be related via P2 has type (is
type of) to this term. This indirect relationship may actually help in
detecting the unknown object in an integrated environment. On the other hand,
some castings may refer directly to a known mould via P16 used specific
object (was used for). So a statistical question about how many objects
in a certain collection were made with moulds could be answered correctly
by following both paths: through P16 used specific object (was used for) - P2
has type (is type of), and through P125 used object of type (was type of object
used in). This consistent treatment of categorical knowledge enhances the
CRM’s ability to integrate cultural knowledge.
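The two paths to the type "mould" can be sketched as a small query. To stay with the properties named above, the sketch counts the production activities rather than the resulting objects; the triple layout and all identifiers are hypothetical.

```python
# Triples are (subject, property, object). Identifiers are hypothetical.
triples = {
    # Only the type of the object used is known.
    ("production_1", "P125 used object of type", "mould"),
    # A specific, known mould was used.
    ("production_2", "P16 used specific object", "mould_42"),
    ("mould_42", "P2 has type", "mould"),
    # An unrelated activity using a different kind of object.
    ("production_3", "P16 used specific object", "chisel_7"),
    ("chisel_7", "P2 has type", "chisel"),
}

def activities_using_type(triples, type_term):
    """Activities linked to type_term via P125, or via P16 followed by P2."""
    via_type = {s for s, p, o in triples
                if p == "P125 used object of type" and o == type_term}
    typed_objects = {s for s, p, o in triples
                     if p == "P2 has type" and o == type_term}
    via_object = {s for s, p, o in triples
                  if p == "P16 used specific object" and o in typed_objects}
    return via_type | via_object

with_moulds = activities_using_type(triples, "mould")
```

A query that followed only one of the two paths would miss either production_1 or production_2.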
In addition to being an interface to external thesauri and
classification systems E55 Type is an ordinary class in the CRM and a subclass
of E28 Conceptual Object. E55 Type and its subclasses inherit all properties
from this superclass. Thus together with the CRM class E83 Type Creation
the rigorous scholarly or scientific process that ensures a type is
exhaustively described and appropriately named can be modelled inside the CRM.
In some cases, particularly in archaeology and the life sciences, E83 Type
Creation requires the identification of an exemplary specimen and the
publication of the type definition in an appropriate scholarly forum. This is
very central to research in the life sciences, where a type would be referred
to as a “taxon,” the type description as a “protologue,” and the exemplary
specimens as “original element” or “holotype”.
Finally, types, that is, instances of E55 Type and its subclasses, are
used to characterize the instances of a CRM class and hence refine the meaning
of the class. A type ‘artist’ can be used to characterize persons through
P2 has type (is type of). On the other hand, in an art history
application of the CRM it can be adequate to extend the CRM class E21 Person
with a subclass E21.xx Artist. What is the difference between the type
‘artist’ and the class Artist? From an everyday conceptual point of view there
is no difference. Both denote the concept ‘artist’ and identify the same set of
persons. Thus in this setting a type could be seen as a class and the class of
types may be seen as a metaclass. Since current systems do not provide an
adequate control of user defined metaclasses, the CRM prefers to model
instances of E55 Type as if they were particulars, with the relationships
described in the previous paragraphs.
Users may decide to implement a concept either as a subclass extending
the CRM class system or as an instance of E55 Type. A new subclass should only
be created if the concept is sufficiently stable and associated with
additional explicitly modelled properties specific to it. Otherwise, an
instance of E55 Type provides more flexibility of use. Users who want to
describe a discourse using a concept that extends the CRM, and also to
describe the history of that concept itself, may choose to model the same
concept both as a subclass and as an instance of E55 Type with the same name.
Similarly, it should be regarded as good practice to provide, for each term
hierarchy refining a CRM class, a term equivalent to this class as the top term. For
instance, a term hierarchy for instances of E21 Person may begin with “Person”.
Since the intended scope of the CRM is a
subset of the “real” world and is therefore potentially infinite, the model has
been designed to be extensible through the linkage of compatible external type
hierarchies.
Compatibility of extensions with the CRM means that data structured
according to an extension must also remain valid as a CRM instance. In
practical terms, this implies query containment: any queries based on
CRM concepts should retrieve a result set that is correct according to the
CRM’s semantics, regardless of whether the knowledge base is structured
according to the CRM’s semantics alone, or according to the CRM plus compatible
extensions. For example, a query such as “list all events” should recall 100%
of the instances deemed to be events by the CRM, regardless of how they are
classified by the extension.
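Query containment can be illustrated with a minimal sketch in which a query on a CRM class also returns instances classified only by an extension class. The class "X1 Exhibition Opening" is a hypothetical extension, and the dictionaries stand in for whatever schema and instance store an implementation actually uses.

```python
# Direct superclass declarations. "X1 Exhibition Opening" is a hypothetical
# extension class; E5 Event and E7 Activity are CRM classes.
superclass = {
    "E7 Activity": "E5 Event",
    "X1 Exhibition Opening": "E7 Activity",
}

# Each instance is recorded with its most specific class (hypothetical data).
instance_of = {
    "event_1": "E5 Event",
    "event_2": "E7 Activity",
    "event_3": "X1 Exhibition Opening",
    "person_1": "E21 Person",
}

def is_subclass_of(cls, ancestor):
    """True if cls is ancestor or reaches it through superclass links."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = superclass.get(cls)
    return False

def list_all(ancestor):
    """Answer 'list all <ancestor>' including extension-classified instances."""
    return sorted(inst for inst, cls in instance_of.items()
                  if is_subclass_of(cls, ancestor))

events = list_all("E5 Event")
```

Because the extension class is subsumed by a CRM class, the query "list all events" recalls event_3 even though it was never classified directly as an E5 Event.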
A sufficient condition for the compatibility of an extension with the
CRM is that CRM classes subsume all classes of the extension, and all
properties of the extension are either subsumed by CRM properties, or are part
of a path for which a CRM property is a shortcut. Obviously, such a condition
can only be tested intellectually.
Of necessity, some concepts covered by the
CRM are less thoroughly elaborated than others: E39 Actor and E30 Right, for
example. This is a natural consequence of staying within the CRM’s clearly
articulated practical scope in an intrinsically unlimited domain of discourse.
These ‘underdeveloped’ concepts can be considered as hooks for compatible
extensions.
The CRM provides a number of mechanisms to ensure that coverage of the
intended scope is complete:
In mechanisms 1 and 2 the CRM concepts
subsume and thereby cover the extensions.
In mechanism 3, the information is
accessible at the appropriate point in the respective knowledge base. This
approach is preferable when detailed, targeted queries are not expected; in
general, only those concepts used for formal querying need to be
explicitly modelled.
fig. 2 reasoning about spatial information
The diagram above shows a partial view of the
CRM, representing reasoning about spatial information. Five of the main
hierarchy branches are included in this view: E39 Actor, E51 Contact Point, E41
Appellation, E53 Place and E70 Thing. All classes are shown as blue-white
rectangles. Properties are shown as single arrows. In some cases the order of
priority for property names has been reversed in order to facilitate reading
the diagram from left to right. Double arrows indicate IsA relations between
classes and their subclasses or between properties and their subproperties.
'Shortcuts' are indicated with light grey rectangles and their names are
written in italics, such as the P59 has section (is located on or within)
between E53 Place and E18 Physical Thing, which is a shortcut of the path
through E46 Section Definition.
As can be seen, an instance of E53 Place is identified by an
instance of E44 Place Appellation, which may be an instance of E45 Address, E47
Spatial Coordinates, E48 Place Name, or E46 Section Definition such as
‘basement’, ‘prow’, or ‘lower left-hand corner.’ An instance of E53 Place may consist
of or form part of another instance of E53 Place, thereby allowing a
hierarchy of geometric ‘containers’ to be constructed.
An instance of E45 Address can be considered both as an E44 Place
Appellation–a way of referring to an E53 Place–and as an E51 Contact Point for
an E39 Actor. An E39 Actor may have any number of instances of E51 Contact
Point. Instances of E18 Physical Thing are found at locations as a consequence of being
created there or being moved there. Therefore the properties P53 has former
or current location (is former or current location of) and P55 has
current location (currently holds) are regarded as shortcuts of the fully
articulated paths through the respective events. P55 has current location
(currently holds) is a subproperty of P53 has former or current
location (is former or current location of). The latter is a container for
location information in the absence of knowledge about time of validity and
related events.
An interesting aspect of the model is the P58 has section definition
(defines section) property between E46 Section Definition and E18 Physical
Thing (and the corresponding shortcut from E53 Place to E19 Physical Object).
This allows an instance of E53 Place to be defined as a section of an instance
of E19 Physical Object. For example, we may know that Nelson fell at a
particular spot on the deck of H.M.S. Victory, without knowing the exact
position of the vessel in geospatial terms at the time of the fatal shooting of
Nelson. Similarly, a signature or inscription can be located “in the lower
right corner of” a painting, regardless of where the painting is hanging.
fig. 3 reasoning about temporal information
This second example shows how the CRM
handles reasoning about temporal information. Four of the main hierarchy
branches are included in this view: E2 Temporal Entity, E52 Time-Span, E77
Persistent Item and E53 Place.
The E2 Temporal Entity class is an abstract class (i.e. it has no direct
instances) that serves to group together all classes with a temporal component,
such as instances of E4 Period, E5 Event and E3 Condition State.
An instance of E52 Time-Span is simply a temporal interval that does not
make any reference to cultural or geographical contexts (unlike instances of E4
Period, which took place at a particular instance of E53 Place).
Instances of E52 Time-Span are sometimes identified by instances of E49 Time
Appellation, often in the form of E50 Date.
Both E52 Time-Span and E4 Period have transitive properties. E52
Time-Span has the transitive property P86 falls within (contains), denoting
a purely incidental inclusion; whereas E4 Period has the transitive property
P9 consists of (forms part of) that supports the decomposition of instances
of E4 Period into their constituent parts. For example, the E52 Time-Span
during which a building is constructed might fall within the E52
Time-Span of a particular government, although there is no causal or contextual
connection between the two instances of E52 Time-Span; conversely, the E4
Period of the Chinese Song Dynasty consists of the Northern Song Period
and the Southern Song Period.
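Transitivity means that indirect parts can be collected by following the property repeatedly. The following minimal sketch does this for P9 consists of; the two Song periods come from the example above, while "sub_period_x" is a hypothetical further subdivision added only to show an indirect part.

```python
# P9 consists of: whole -> direct parts. "sub_period_x" is hypothetical.
consists_of = {
    "Song Dynasty": {"Northern Song Period", "Southern Song Period"},
    "Northern Song Period": {"sub_period_x"},
}

def all_parts(whole):
    """Collect direct and indirect parts by following P9 transitively."""
    found = set()
    stack = [whole]
    while stack:
        current = stack.pop()
        for part in consists_of.get(current, ()):
            if part not in found:
                found.add(part)
                stack.append(part)
    return found

parts = all_parts("Song Dynasty")
```

The same traversal would apply to P86 falls within (contains) on instances of E52 Time-Span, with the difference in meaning described above.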
Instances of E52 Time-Span are related to their outer bounds (i.e. their
indeterminacy interval) by the property P82 at some time within, and to
their inner bounds via the property P81 ongoing throughout. The range of
these properties is the E61 Time Primitive class, instances of which are treated by the CRM as application or system specific date
intervals that are not further analysed.
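One possible way to hold the two bounds of a time-span is sketched below. The class names, the use of closed date intervals as a stand-in for E61 Time Primitive, and the dates themselves are assumptions; the consistency check only expresses that an interval throughout which something was ongoing must lie inside the interval within which it happened.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class Interval:
    """Stand-in for an E61 Time Primitive: a closed date interval."""
    start: date
    end: date

    def contains(self, other):
        return self.start <= other.start and other.end <= self.end

@dataclass(frozen=True)
class TimeSpan:
    """Stand-in for an E52 Time-Span with outer and inner bounds."""
    at_some_time_within: Interval  # P82: outer bounds (indeterminacy interval)
    ongoing_throughout: Interval   # P81: inner bounds

    def is_consistent(self):
        # The inner bounds must fall inside the outer bounds.
        return self.at_some_time_within.contains(self.ongoing_throughout)

# Hypothetical dates, for illustration only.
span = TimeSpan(
    at_some_time_within=Interval(date(1800, 1, 1), date(1810, 12, 31)),
    ongoing_throughout=Interval(date(1803, 1, 1), date(1805, 12, 31)),
)
consistent = span.is_consistent()
```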
Although they do not provide comprehensive definitions, compact
monohierarchical presentations of the class and property IsA hierarchies have
been found to significantly aid comprehension and navigation of the CRM, and
are therefore provided below.
The class hierarchy presented below has the following format:
The property hierarchy presented below has the following format:
CRM Entity
- Temporal Entity
- - Condition State
- - Period
- - - Event
- - - - Activity
- - - - - Acquisition Event
- - - - - Move
- - - - - Transfer of Custody
- - - - - Modification
- - - - - - Production
- - - - - - Part Addition
- - - - - - Part Removal
- - - - - Attribute Assignment
- - - - - - Condition Assessment
- - - - - - Identifier Assignment
- - - - - - Measurement
- - - - - - Type Assignment
- - - - - Creation
- - - - - - Type Creation