Issue 351: Modelling Principles

ID: 351
Starting Date: 2017-10-09
Working Group: 4
Status: Open
Background: 

At the 39th CIDOC CRM-SIG meeting, Martin gave a presentation on “What do we describe and why” and then presented a text about methodology. The CRM-SIG reviewed the text and accepted it as a draft document. It was decided to put this document on Google Docs for further reading, notes and comments, and also to post the text on the site in issue format. Homework was assigned to Christian Emil, Thanasis Velios, Marta Acierno, Achille, Alex Siedlecki and Steve to further elaborate this document on Google Docs: https://docs.google.com/document/d/1RKJaD71idCcKKaTEdjw_dpPU5m1i13H3tb91...

Crete, October 2017

 

 

Posted by Richard on 19/10/2017

Martin,

I have now had a chance to read this document.  I agree that, once finalised, it will become a useful guide to the modelling approach adopted by the CRM SIG.  Would it be possible for us to have a summary of the conclusions that were reached when it was discussed at the recent SIG meeting?

At 66 pages, the document is stretching the meaning of "short" (despite our commitment to maintaining independence from scale - this is a pretty large dwarf!).

It lacks a 'road map' which tells the reader how to go about getting value from it, assuming they know little or nothing about this subject when they start reading.  (Come to that, do we have a clear idea of the background knowledge and intentions of the typical/target reader?  If so, these should maybe be stated.)

The general introduction, in particular, is very dense and theoretical, and will probably cause most readers to give up before they even get to the meat of the document.  If this introduction is to remain, I suggest that it is included as an appendix.  I would also place the glossary at the end, since it interrupts the flow of the text.  (Where glossary terms appear in the text, they could have a pop-up containing their definition: in that way, they would actually be useful.)  Instead of the General introduction, I would include a short outline of the main structure of the document: process model, engineering principles, and conceptual modelling checklist.  You could briefly explain that the CRM has a particular modelling approach (and hence the need for this document), without going into detail.
I suggest that live cross-reference links to specific sections of the published CRM would help to ground the text, and would give these modelling ideas a concrete context.  This might also be an effective way of giving examples in the modelling principles section: some of the examples provided are currently too cryptic to be helpful.

Current Proposal: 

Posted by George on 1/12/2017

<PROPOSAL> <HW>

 
FYI, we have moved the CRM Principles document up a version, to v.0.1.3. The new text includes the distinction between an ontology and a data structure. Find the new link below.
 

 

Posted by Richard on 14/12/2017

<COMMENT>

George,

A quick look at the updated version suggests that it is substantially the same in its overall organisation.  Please see my comments to the list dated 19 October: I would appreciate a response to these.

Posted by Richard on 4/1/2018

<COMMENT> <QUESTION>

On 14/12/2017 21:48, Martin Doerr wrote:
>
> This statement "some of the examples provided are currently too cryptic to be helpful." is too general to be helpful ...please tell us which ones
As a general point I don't understand why there are two 'Eg' sections for each principle.  Some have a screen icon by the first and a mouse icon by the second; others have '+' by the first and '-' by the second. Is it that you would like there to be two examples for each principle, or do they play different roles?  (In some, e.g. 1.2, there are two points in the first Eg and none in the second, suggesting they serve different purposes.)

Specifics:

    2.1 TADIRAH "Research Object" needs a note explaining (presumably) that the research intention has no impact on an object being a member of the "object" class
    6.4 Getty’s ‘Object ID’, the EAD - what do these demonstrate as regards mandatory/optional properties?
    7.1 both examples here are too cryptic: please explain what they show/what should be done with them
    7.2 is the first part of the second Eg making a different point to the first Eg? If so, it's not clear what it is. Second part of second Eg also cryptic
    7.3 examples are unexplained, though I get the general idea. The re-appearance of "Research Object" (TADIRAH) also raises the question whether 7.3 is different in substance from 2.1
    7.4 "hamlet - village" looks like a completion of the first Eg; it's not clear what we're advising to do about "ship - boat"
    8.1 isn't the second Eg the conclusion of the first one?
    8.2 the purpose of the second set of Egs isn't clear

Posted by Phil Carlisle on 4/1/2018

<QUESTION>

Hi Richard,

I agree the Examples need to be clearer especially those using symbols.

I took the ‘+’ and ‘-’ to mean that one example was a good/positive example whereas the other was an example of what you shouldn’t do (bad/negative).

Are the symbols meant to do the same? If so can we standardize or provide a legend to clarify what is intended? Even include an annotated example of a principle perhaps.

 

 

Posted by Phil Carlisle on 4/1/2018

<COMMENT>

I’ve just cut and pasted the ‘computer’ and ‘mouse’ symbols from example 7.2 into a blank Word document and they come back as a ‘smiley face’ and a ‘sad face’ symbol, so I think they are the same as the ‘+’ and ‘-’ of the other examples.

Posted by Marta Acierno on 6/1/2018

<HW>

I hope this mail finds you well. Please find attached the revision of the 'Modelling Principles' text. Considering my ‘entry level’, I have preferred not to work directly on the shared file, but if you prefer I can transfer my comments to it. Please feel free to disregard any suggestions you consider inappropriate or too naive.

Regarding the next SIG meeting: unfortunately, although we are both very interested, neither Donatella nor I will be able to participate. Donatella is too busy with the beginning of the university course and, as for me, my daughter will undergo minor surgery on the same days. In any case, we will certainly participate in May.

Posted by Christian Emil on 14/1/2018

<COMMENT>

Dear all,

I read the document before reading your comments. I have many (minor) comments on the text. I am not quite sure how the group intends to proceed with the revision of the documents. You will find my comments below.  It is important that the document is read by honest persons with an introductory or medium level of knowledge.

Best,

Christian-Emil

The document needs proofreading; I assume this can be done later. We need a common term for the “original CRM”. In the document one uses CRM, CIDOC CRM, CRM Basic, CRMbasic, basic CRM. In the definition of CRM the term CRMbase is used.

The general introduction is not always easy to understand. I agree with Marta's comments.  It is important that the text is easy to read and understand. For example, split up the long sentences and assume you have to explain the content to your old aunt in a phone call.

Page 11-12 Definition of empirical source material: there are four bullet points. Are the last two subordinate to case (2)?

Phase B

Each of steps 1-9 will benefit from a single, good example that illustrates the step and gives the reader intuition.

Phase C

1: Implement or express? An implementation can be considered to be a model of the ontology expressed in say FOL.

2. All classes and properties correspond to atomic predicates and a textual description is needed for each. The FOL expressions in CRM are actually examples of 1. Second order logic will rarely be needed.

Mapping

Mapping is very different from phases A, B and C.  The phases are about ontology construction in general. The mapping is presented in the introduction to the chapter as mapping to CRM. The text in the mapping section is apparently about mapping between models in general. The section is long and detailed and should be a separate chapter. Mapping between ontologies/models is also found in logic/model theory and in algebra/category theory (functors).  For example, between a model expressed in FOL and an RDF implementation there is a mapping/implementation function. One may consider adding a few sentences about the fact that mapping is also a theoretical topic with a long tradition in mathematics and logic.

Glossary

The acronyms e.g. KR should be added.

Principles:

The form used to describe each principle is fine, but ambitious. In the document the first principles are the best described. The last ones are brief and only partially filled in. For each principle there is a field for examples of good practice or positive comments (+ or smiley) and a field for not-so-good practice. Standardize the use of icons and +/– signs. For some principles the + field describes what one should not do rather than what one should do, e.g. 1.2, 3.4.

Below are the short comments I wrote when reading the text. I hope they may be useful.

1.1

Here the slogan is “Models should be useful”. The negative example is FRBRoo. One may conclude that FRBRoo is not useful. Consider another example or reformulate.

1.4 In the current society ‘sex’ (gender) is not a good example. Is the second example good or bad?  The negative example needs an example, but should perhaps

2.1

Most people don’t know the DARIAH/NEDIMAH typology of DH. Even if one knows about TADIRAH, it is not so clear why it is a bad concept.

2.2

The negative example is about making particulars into classes or concepts. Would person be a better example? Many skilled and well educated persons fall into the trap and it takes some effort to explain to them that a person or a hammer is not a concept.

2.3

A negative example could have been the multiple ‘identified by’ properties we had in CRM.

2.4

Slogan and negative example are missing.

3.1

No slogan.

The positive field: Explain/describe  the difference between the two “mothers”. What is a psychological concept in contrast to other concepts? Would it be possible to have a type ‘murder’ and how should it be linked to the operational description with activity and death.  A profession is usually expressed by the use of type. 

Negative example is missing.

3.2 (typo in the text: 3.1)

The second example needs an explanation; it is negative and could have been placed in the negative field.

3.3

The positive field: What is the grammatical subject in the first sentence?

3.4

The positive field: An example of what we don’t do, not what we do

4 Open world

The introduction on page 42 should open with a definition of the principle of the open world assumption. The core of the principle is that lack of knowledge does not imply falsity. If a marriage is not recorded in the register of marriages in, say, Sweden, one cannot conclude that a visiting German couple is not married.  In everyday life the open world assumption is neglected or considered false. (In constructive mathematics & logic one finds a similar principle: one has to construct a method to find a given construct. Ad absurdum proofs over infinite sets are not accepted.)

The text in 6.1 is clearer. Consider a revision and use the text in 6.1.
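
As an illustration of the marriage-register example above, here is a minimal sketch in Python. The register contents, the couple names and the function names are hypothetical and chosen only for this illustration; they do not come from the Principles document or from the CRM. Under a closed-world reading a missing record means "not married", whereas under an open-world reading it only means "unknown".

```python
from typing import Optional, Set, Tuple

Couple = Tuple[str, str]

# Hypothetical register: it only records marriages registered in Sweden.
SWEDISH_MARRIAGE_REGISTER: Set[Couple] = {("Anna", "Erik")}


def married_closed_world(couple: Couple) -> bool:
    """Closed world: whatever is not recorded is taken to be false."""
    return couple in SWEDISH_MARRIAGE_REGISTER


def married_open_world(couple: Couple) -> Optional[bool]:
    """Open world: a missing record yields 'unknown' (None), not False."""
    return True if couple in SWEDISH_MARRIAGE_REGISTER else None


# A visiting German couple who do not appear in the Swedish register.
visiting_couple: Couple = ("Greta", "Hans")
```

With these assumptions, `married_closed_world(visiting_couple)` wrongly concludes that the couple is not married, while `married_open_world(visiting_couple)` only reports that the register does not say.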

4.1 

The problem description is very good.

In argument: The ‘next superclass’ is unclear. ‘abstract superclasses’ may be needed as placeholders for properties.

The gender/sex example is not a good example today. The negative example: Many (well educated) people outside Europe don’t know what Europeana is, and even among those who know it, few know the EDM.

4.2

is similar to 5.4

4.3 

How does this principle differ from a similar principle for classes and from the open world assumption in general?  What is “a closed world of properties”?

5.1 

Positive examples: first example, use full sentence, second example – The movie Rashomon is not known to everybody and thus gives little intuition.

The negative example needs at least a full sentence and a hint to why it is negative.

5.2

In the last sentence of the argument field, ‘not particular interpretation of how those states of affairs came about’ may contradict the title of the principle, “Do not model conclusions before and without their reasons”.

Positive field: Explain very briefly (by footnote) what Oetzi is.

Negative field, typo: ‘starts are bad’ should be ‘states are bad’

5.3

Positive field: Contains what you should not do (never define a class as a complement).

5.3  typo in the title: ‘domains and range or properties’ should be ‘domains and range of properties’?

In the text one alternates between ‘relation’ and ‘property’. Do the two terms denote the same thing?

Argument: A principle of conservative extension could be added to the document.

The negative: add text and make the example clearer. This is very CRM-specific.

6.1

This is a well-explained principle. Its definition of the open world assumption and parts of the text can be used in the intro of 4.

6.2

Especially in the case of contradictory information, the provenance of the information is obligatory.  Add a sentence about that.

The statement in the square brackets at the end of the argument, “Contradiction is to be supported at the level of the knowledge base, not the model”, should be made more prominent, perhaps be made into a separate text or principle. A KB with contradictions is not a valid model for the theory created by the conceptual model.

6.3

Is just a sketch and needs elaboration.

6.4

The principle is important to avoid the misunderstanding that one needs to “implement” exactly as described in the model.  However, the principle addresses several issues.

1)      An ontology does not describe the implementation level, e.g. the problem of implementing properties of properties in RDF.

2)       

3)      One doesn’t need to provide (fill in) all the information indicated by the classes and properties of CRM. That this is necessary is a frequent misunderstanding. This should be strongly emphasized.

4)       

The positive example needs some more text.

The negative example: Add the misunderstanding mentioned above. Why are Object ID and EAD bad? EAD is not an ontology; it is a data format (not a very good one, though).

7

Objectivity is hard.

7.1  Are transaction, acquisition, transfer more neutral than buying, selling, delivering, receiving? If so, please explain. The second negative example is not very relevant to view neutrality.

7.2 Argument: maybe the last sentence can be deleted. An (objective/subjective) view has to be explained and the reasons for the view have to be clear.

7.3

Positive examples? Reasons should be given for stating the examples as bad practice (without them, the judgement may be considered subjective).

7.4

Size/scale is ok. Could this perhaps be generalized and extended to other characteristics?

8  The difference/separation of words in ordinary language and name of classes is very important, so the proposed 8.3 is important. Perhaps one should write the scope note before finding a name for a class or a property?

A very frequent case is a place name, which may denote an organization, a place, a physical structure or a spatiotemporal entity.

The entire chapter 8 could be extended and get a more prominent place in the beginning of the document.

8.1 is good

8.2  Do we have an example of a binary relation? (All relations can in principle be reduced to binary relations, as said in the beginning of the document.)

In the 44th joint meeting of the CIDOC CRM SIG and ISO/TC46/SC4/WG9; 37th FRBR - CIDOC CRM Harmonization meeting, the sig decided to leave this issue open for the moment. 
HW: CEO is assigned with reviewing the second part of the document Principles for modelling ontologies: a short reference guide.
 

Paris, June 2019

In the 45th joint meeting of the CIDOC CRM SIG and ISO/TC46/SC4/WG9; 38th FRBR – CIDOC CRM Harmonization meeting, a difference of opinion arose at the very last moment regarding the content and structure of the document Principles for Modelling Ontologies: A Short Reference Guide. In response, the members of the sig considered how best to collaborate with one another when they need to produce larger documents. Everyone agreed that uploading a text in its final form does not allow multiple reviewers to contribute to the final output, and it means that someone has to do all the work from scratch with every revision.

PROPOSAL: To put the document in github and link to the crm site. The namespace for that should be (FORTH CIDOC CRM). The proposal was accepted.

HW: MaDu & MS are to help with the initial setup. (see the following post by Matej Durco on 23/10/2019)

HW: DA & NC will work together to push things from the github to the crm-sig list and vv.

HW: the document still needs to be reviewed, so TV volunteered to give it a go and mentioned he’ll ask OE to help with that.

Heraklion, October 2019

Posted by Matej Durco on 23/10/2019

Dear all,

as discussed, here is the initial info for handling the principles via GitLab:

The repository is under: https://gitlab.com/acdh-oeaw/cidoc-modelling-principles the rendered html is available under: https://acdh-oeaw.gitlab.io/cidoc-modelling-principles

You can login to gitlab with your github or google account under: https://gitlab.com/users/sign_in

Once you login, let me know, we will give you access to the repo.

When you have access to the repo, you should clone it locally.

When you make changes, you need to commit them and push them back to the repo.

You have to add your public key to your profile (or set a password) to be able to push changes to the repository. Let us know, if there are any questions regarding the procedure.

We said that we will tackle the workflow to integrate the html to cidoc-crm page in a separate step.

Please forward this message to further persons, who should be able to make changes.

 

In the 50th joint meeting of the CIDOC CRM SIG and ISO/TC46/SC4/WG9; 43rd FRBR – CIDOC CRM Harmonization meeting, TV explained the present state of this issue. The drafting of a didactic text regarding the principles that the CRM was based on was followed by a period of extensive reviewing, up to the point where SIG members thought it was too difficult to approach such a long document in a linear fashion. It was decided that the text be broken down into sections and be reworked in GitLab. Since then, there has been no systematic work on the document –in fact, there are comments in the Google doc that have not been moved to the Git repository.
There are still pending issues to be dealt with. These are summarized in the WD. But what we need to do at this stage is (i) decide whether we work on the document through Git, Google Docs or directly through the website, and (ii) assign someone to review it.

DECISION: Continue editing the Modelling Principles document through Google docs. In the meantime, the document is to be published in its draft version on the CRM site.

HW: MR, EC, OE will review the document

In the 51st CIDOC CRM & 44th FRBRoo SIG meeting, EC shared some insights concerning the document Principles for Modelling Ontologies (HW assigned to her on the 50th sig meeting). Points she raised are:  

  1. The document lacks a clear purpose statement: it does not state who it is intended for, why they should take it into consideration, or how it would help implementers do a better job.
  2. The examples provided in the text do not come with an adequate explanation and seem irrelevant in the context they appear in.
  3. The text is written using many different registers. A very dense, academic prose and normal prose are used interchangeably in the document.
  4. Fonts & styles are not used consistently. The same applies to citations.
  5. Instead of one all-purpose document, maybe a series of smaller documents, each with a well-defined purpose, would serve better.

For an extensive commenting and editing on the document, see here

Decision
Proceed as EC suggested --start by point 1 above. 
HW: MD, EC, OE, SdS to identify intended users of this text

Discussion points

People trained to create standards and people trained to write documentation that can be understood by non-experts are not one and the same group. Need funding to assign someone the task of redrafting this (and other) introductory documents/training material. 

This has to be done in a sustainable fashion. Instead of asking people to redraft all documents, once we’re done editing them, we could organize a series of workshops on how to write documentation. SdS has someone in mind (professional author, knows a thing or two about creative writing). This stands as a concrete funding proposal. 

Addressing pt 1. (stating the purpose) will help resolve the other problems. If there is a need to break the document into multiple ones (each addressing one well-defined issue), or to rewrite the document with multiple idealizations of end-users in mind etc., this will become apparent by fleshing out the purpose the document serves.
Alternative purposes suggested: 

  • how the CRM came to life, what principles it followed –then it’s not a guideline, but a history of CIDOC CRM
  • identifying a set of best practices to be used when one is creating an ontology. It shouldn’t be specifically about the CIDOC CRM. 
     

In the 52nd CIDOC CRM & 45th FRBRoo SIG meeting, MD gave an overview of the issue and proposed that the SIG acknowledge these principles as the norm to apply when modelling in the CRM and family models. 

Discussion points:

  • GB objected to that –he considers the principles a useful guideline outlining how this group has arrived at this particular conceptual model, but not a normative text that one must observe at all times. More of a rule of thumb, not a canon. He does not think that the principles define procedures that allow comparable/similar data to individually give rise to the same modelling constructs; what they do instead is impose some constraint guiding one when creating modelling constructs. That the principles don’t come in a strict hierarchy, allows people engaged in conceptual modelling with the CRM and other compatible models to resolve particular problems applying that subset of the principles that is deemed more fitting in each case.
  • MD: disagrees with GB's statement; he maintains that the principles have a normative effect and that they have served as the basis to develop the CRM (base and family models). He does not see a reason to abandon or even relax the normative nature of the principles, especially since following them has proved successful in creating conceptual models grounded in empirical evidence.
  • TV: It is impossible to arrive at the principles through reading the scope note of a class or property. One must go through a number of classes/properties, and annotate them with the principles that have been used to produce them, as an exercise. This is a way to identify a potential hierarchy of the principles –which ones are used more often etc. 
  • MD: disagrees with quantifying on the importance of some principles over others and claims that they are all equally important. Newcomers to the SIG need to see how these principles apply throughout the CRM (maybe through a tutorial?). But first and foremost, they should be made aware of what these principles are. 
  • FB: He also feels that the principles listed in the document are highly abstract, not self-explanatory or easy to understand. He suggested that instead of each SIG member separately engaging in the exercise of annotating the scope notes of classes and properties of the CRM individually, it would be better that the SIG prepared some examples to be shared with anyone interested. He also pointed that this discussion is highly relevant for issue 504 and suggested that we discuss it in the appropriate context (i.e. of 504).

Alternative proposal: 

  • Close the document as it is now. The document in its current form can be accessed here (under Resources/Technical Documents) 
  • Then take the first part of the document (pp.10-28), the one that showcases the overall procedure followed –i.e., the bottom-up modelling, which relies on empirical evidence and actual data that are subsequently translated into modelling constructs– and make examples that highlight the process better.
  • Discuss the principles separately with practical examples
  • Discuss how to publish this text, make it more visible 
  • Dedicate a whole session if necessary to the principles document in the next sig.

Decision: Proceed as proposed (see above). 

HW: TV, GH, MD, FB to work on this issue for the next SIG.

February, 2022

In the 53rd CIDOC CRM & 46th FRBRoo SIG meeting, the Sig discussed the state of this issue. 

There are currently two different entries on the website for the Principles for Modelling Ontologies under Technical Papers. Principles for Modelling Ontologies: A Short Reference Guide contains two documents: v.0.1.2 (the last known updated version of the document) and v.0.1.2-comments by EC (the same document with comments by EC). CM Principles Word v.0.1.2 [Introduction text] contains the introductory text that needs to be revised and supplemented with examples (Issue 587).

There is yet another version (v.0.1.3) that had not been made publicly available, but was circulated among members of the Sig after the meeting in Lyon (May 2018). It featured comments and updates by CEO and MD that have not been incorporated in the last known updated version of the document –hence EC has not considered them upon reviewing the document.

Discussion:

Insofar as the version by CEO provides examples and use-cases for how to apply the principles to model an ontology that are commonsensical enough and explain the case in point, then the text is ready to be published as a guideline. If not, not so much.

CEO had only gone through the first 50 pages of the document back in 2018, so there is still a substantial part of the document he has not updated. In that sense, his version should not be granted “guideline” status either.

v.0.1.2 has issues identified by EC during the review of the text and they have not been properly addressed yet. 

Decision:

Everyone is to share their version with the CRM Sig list, CEO and the ETs. CEO will run a diff on the various documents and collate them. Issue 587 is to be updated with the final version of the document –especially the part of the introduction that is to be supplemented with examples from SeaLiT and CRMtex.

HW: CEO to share with EC, MD & ETs his last edited version (May 2018).

HW: EC, CEO to collate the two versions.

Once we end up with only one document, it should be managed through the CIDOC repository

May 2022

Post by Erin Canning (23 November 2022)

Dear all,
Please find homework prepared for Issue 351 in the below link:
https://docs.google.com/document/d/1702Eg7Wcvtp29uXR29rbqu7MiEhS5hkiE_P8...  

All the best,
Erin Canning

In the 55th joint meeting of the CIDOC CRM and ISO/TC46/SC4/WG9; 48th FRBR/LRMoo SIG meeting, the SIG approved EC's HW (she collated all "last edited" versions of the Modelling Principles documents into a new document, v0.2, that can be edited through the CIDOC repo and accessed through the CRM website under Resources\Technical Papers\Principles for Modelling Ontologies: A Short Reference Guide). The people who have been involved in the process are to resolve any comments left open.

The SIG will decide at the next meeting how to advertise this text as a best practices document (i.e. through which channels), once all the comments have been resolved.

Upon discussing issue 533, the SIG resolved to incorporate the example from epigraphy that showcases the differentiation of polysemous concepts through distinct ontological classes (HW by AF, FM, MD) in the collated version of the 'Principles for Modelling Ontologies'.

HW: to EC, MD to undertake this.

Belval, December 2022