Issue 624: Add E33_E41_Linguistic_Appellation to the Official Specification

ID: 624
Starting Date: 2022-11-14
Working Group: 3
Status: Open
Background: 

Post by George Bruseker (7 November 2022)

There are two references to the class that is a subclass of E41 and E33 that allows you to talk about the language of a name (which is a super common requirement... actually almost always necessary). I can't give you its official name because I don't know it: it isn't in the spec doc and it doesn't have ONE name in the RDFS.

In one reference it is called E41_E33_Linguistic_Appellation and later it is called E33_E41_Linguistic_Appellation. Try a find in the rdfs doc and you will see what I mean.

https://cidoc-crm.org/rdfs/7.1.1/CIDOC_CRM_v7.1.1.rdfs

Actually I don't care what it is called, but it would be nice if it was really, really clear. 

I think this speaks against the practice of hiding classes we don't like and call implementation classes in the RDFS and should make them full classes in the standard so that they are fully vetted and controlled. It is a fundamental class. It should be in the standard in the first place.

And definitely it should not have two different names in the RDFS. Can we confirm that it is supposed to be E33_E41 and not E41_E33?

Cheers,

George

Post by Elias Tzortzakakis (7 November 2022)

Dear George,

 

The RDFS defines one such class, using just one name: ‘E33_E41_Linguistic_Appellation’.

The second name you are referring to, ‘E41_E33_Linguistic_Appellation’, exists only in the XML comments of the RDFS file.

 

There has been a discussion and decision about the correct order.

Please see issue

https://cidoc-crm.org/Issue/ID-555-rdfs-implementation-and-related-issues and search for post starting with In the 51st CIDOC CRM & 44th FRBRoo SIG meeting

Decision: keeping numbers of the numeric identifier in order. 

 

Thus the rdfs is valid and consistent but the comment lines should also definitely be adapted to this decision.

Thanks for spotting,

 

I will correct this ASAP,

 

Kind regards,

Elias Tzortzakakis
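
The mismatch described above (a name that occurs only in XML comments and not as a declared class) can be checked mechanically. The following is a minimal Python sketch, assuming that classes are declared as rdfs:Class elements carrying an rdf:about attribute. SAMPLE_RDFS is a shortened, hypothetical stand-in for the real file, and all function names are illustrative only.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, heavily shortened stand-in for the RDFS file (not its real content).
SAMPLE_RDFS = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <!-- E41_E33_Linguistic_Appellation joins E33 and E41 -->
  <rdfs:Class rdf:about="E33_E41_Linguistic_Appellation">
    <rdfs:subClassOf rdf:resource="E33_Linguistic_Object"/>
    <rdfs:subClassOf rdf:resource="E41_Appellation"/>
  </rdfs:Class>
  <rdfs:Class rdf:about="E33_Linguistic_Object"/>
  <rdfs:Class rdf:about="E41_Appellation"/>
</rdf:RDF>
"""

RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"
RDFS = "{http://www.w3.org/2000/01/rdf-schema#}"
# Matches identifiers such as E41_Appellation or E33_E41_Linguistic_Appellation.
CLASS_ID = re.compile(r"\bE\d+(?:_E\d+)*_\w+")


def declared_classes(xml_text):
    """Names of all classes actually declared via rdfs:Class."""
    root = ET.fromstring(xml_text)
    return {el.get(RDF + "about") for el in root.iter(RDFS + "Class")}


def classes_in_comments(xml_text):
    """Class-like identifiers that are mentioned inside XML comments."""
    found = set()
    for comment in re.findall(r"<!--(.*?)-->", xml_text, flags=re.DOTALL):
        found.update(CLASS_ID.findall(comment))
    return found


def undeclared_comment_names(xml_text):
    """Identifiers used in comments that no rdfs:Class declares."""
    return classes_in_comments(xml_text) - declared_classes(xml_text)


print(undeclared_comment_names(SAMPLE_RDFS))
```

On the sample, only the comment-only spelling is reported; the declared class name is unaffected, which matches the observation that the RDFS itself is consistent.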

Post by George Bruseker (7 November 2022)

Thanks Elias,

You are definitely right that it is ok in the actual doc but misreferenced in the XML commentary. My point is not that the RDFS is wrong, and it is great that it is produced and solid. I am more interested in how NOT having legitimate classes in the standard, but compromising and just putting them in the RDFS, means that a) we create all sorts of arcana around what should be an open standard and b) because the class is not documented in the specification document, we don't actually have a rule to know what it should be called.

So it's more a process and principles level issue.

Cheers,

George

Post by Stephen Stead (8 November 2022)

Surely the RDFS E33_E41 is just a workaround for a common multiple instantiation that is problematic in RDFS land not a need for a new class.

Post by George Bruseker (8 November 2022)

It's not really though. In the majority of cases when you talk about a name you need to talk about a language too. Especially if CRM wants to be inclusive etc. We have a subclass of appellation, 'title', that does allow this, but it only works for inanimate objects. So it is useless as a general case. The use of E33_E41 should be the default in most modelling cases, with E41 being the exception (mostly names are in a language). The general idea of a name in a language is not an arcane concept, but the majority concept. Rather than needing to use an arcane construct, either E33_E41 or multiple instantiation, for the majority case, having the standard just provide the appropriate class, document it and allow people to build around it would be a superior way to go imho.
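
The two options contrasted in this exchange, multiple instantiation versus a single joined class, can be compared with a small sketch. This is a minimal Python illustration of RDFS-style subclass inference, assuming only the two subclass declarations discussed in this thread; the helper name inferred_types is hypothetical.

```python
# Direct superclasses of the joined class, as discussed in this thread.
SUBCLASS_OF = {
    "E33_E41_Linguistic_Appellation": {"E33_Linguistic_Object", "E41_Appellation"},
}


def inferred_types(asserted, subclass_of):
    """Return the asserted types plus every superclass reachable from them."""
    result, todo = set(), list(asserted)
    while todo:
        cls = todo.pop()
        if cls not in result:
            result.add(cls)
            todo.extend(subclass_of.get(cls, ()))
    return result


# Option 1: multiple instantiation, the name is typed with both parent classes.
multi = inferred_types({"E33_Linguistic_Object", "E41_Appellation"}, SUBCLASS_OF)

# Option 2: the name is typed once, with the joined class.
joined = inferred_types({"E33_E41_Linguistic_Appellation"}, SUBCLASS_OF)

print(sorted(multi))
print(sorted(joined))
```

Under these assumptions both options entail membership in E33 and E41, so P72 and P190 are applicable either way. The joined class only adds one extra type statement, which is why the disagreement in the thread is about documentation and principle rather than about what can be inferred.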

Post by Pavlos Fafalios (8 November 2022)

Dear George,

To my understanding (without having been involved in the relevant discussions about having the E33_E41 class in the RDFS but not in CRM),
and according to the discussion in issue 363,
classes that usually co-occur on things simultaneously, without being associated with properties only applicable to the combination of such classes, are not modelled individually as subclasses of multiple parent classes (a principle used for keeping the ontology compact).

The 'E35 Title' class exists because there is a property 'P102 has title' (of E71 Human-Made Thing) that needs to point to something that is both a linguistic object and an appellation. 
So, for having a CRM class "E? Linguistic Appellation", there should be a property that needs to point to something that is both a linguistic object and an appellation (and with the intended meaning), e.g. a 'has linguistic appellation' property for E39 Actor or E77 Persistent Item. To my understanding, since there is no such property, there is (currently) no need to introduce such a class in CRM. 

Best,
Pavlos

Post by George Bruseker (8 November 2022)

Hi Pavlos,

I understand that principle, I am contesting it. There is another principle, which is to handle the obvious cases in the domain. I hold that that principle has greater importance here. It's as if we invert the 80/20 principle and choose to solve only 20% of the problem and let the other 80% be a workaround. But the 80% is where the actual information / people are at.

Cheers
G

Post by Rob Sanderson (8 November 2022)

I agree with George that this should be added.

There are plenty of cases of classes without additional properties that serve only to join two parent classes. For example E22_Human-Made_Object, E25_Human-Made_Feature, and E34_Inscription. There are also remaining leaf nodes with no properties with only one parent class, such as E27_Site. Further, there are classes that have a property, but which is semantically indistinguishable from its super property. If the requirement is a property, then I propose 

Pxx_is_named_by (names)
Domain: E1
Range: Exx_Name (previously E33_E41)
Sub Property Of: P1_is_identified_by
Super Property Of:  P102 has title

This property describes the naming of any entity by a name in a human language.

And the 
Exx_Name
Super Class: E33, E41
Super Class Of: E35 Title

The discussion last time devolved to "Well we use those so we don't want to get rid of them so we're not going to even though they don't have properties". But here's the thing ... *everything* has a Name (by which I mean an E33_E41_Linguistic_Appellation). And it's easy to demonstrate that E33_E41 is very well used. 

So ... I don't find the argument that we can't do this "because rules" very convincing when those rules are applied so inconsistently. 

Rob
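
The property and class proposed in the post above can be sanity-checked against the usual expectation that a sub property narrows the domain and range of its super property. The following is a minimal Python sketch under stated assumptions: Exx_Name and Pxx_is_named_by are the placeholders from the proposal, the superclass chains up to E1 are abbreviated to a single step, and the helper names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Direct superclasses. Chains up to E1_CRM_Entity are abbreviated to one step.
SUBCLASS_OF = {
    "Exx_Name": {"E33_Linguistic_Object", "E41_Appellation"},
    "E35_Title": {"Exx_Name"},
    "E33_Linguistic_Object": {"E1_CRM_Entity"},
    "E41_Appellation": {"E1_CRM_Entity"},
    "E71_Human-Made_Thing": {"E1_CRM_Entity"},
}


@dataclass(frozen=True)
class Prop:
    name: str
    domain_class: str
    range_class: str
    sub_property_of: Optional[str] = None


PROPS = {p.name: p for p in [
    Prop("P1_is_identified_by", "E1_CRM_Entity", "E41_Appellation"),
    Prop("Pxx_is_named_by", "E1_CRM_Entity", "Exx_Name", "P1_is_identified_by"),
    Prop("P102_has_title", "E71_Human-Made_Thing", "E35_Title", "Pxx_is_named_by"),
]}


def is_subclass(cls, ancestor):
    """True if cls equals ancestor or reaches it via SUBCLASS_OF."""
    todo, seen = [cls], set()
    while todo:
        current = todo.pop()
        if current == ancestor:
            return True
        if current not in seen:
            seen.add(current)
            todo.extend(SUBCLASS_OF.get(current, ()))
    return False


def narrows_super_property(prop):
    """Check that domain and range specialise those of the super property."""
    if prop.sub_property_of is None:
        return True
    sup = PROPS[prop.sub_property_of]
    return (is_subclass(prop.domain_class, sup.domain_class)
            and is_subclass(prop.range_class, sup.range_class))


checks = {name: narrows_super_property(p) for name, p in PROPS.items()}
print(checks)
```

In this simplified hierarchy the proposed property sits consistently between P1 and P102. The sketch says nothing about whether the class satisfies the minimality principle, which is the question the rest of the thread debates.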

Post by Thanasis Velios (8 November 2022)

The section on Minimality outlines when new classes are declared and it includes:

"It serves as a merging point of two CIDOC CRM class branches via multiple IsA (e.g., E25 Human-Made Feature). When the branch superclasses are used for multiple instantiation of an item, this item is in the intersection of the scopes. The class resulting from multiple IsA should be narrower in scope than the intersection of the scopes of the branch superclasses."

If I interpret this correctly, we need to ask:

Is "E33 E41 Linguistic Appellation" narrower in scope than the result of multiple instantiation of "E33 Linguistic Object" and "E41 Appellation"?

And if I understand George's message correctly, it looks like it is not narrower, no?

All the best,

Thanasis
 

Post by Pavlos Fafalios (8 November 2022)

Dear George and Robert,

Your comments are well taken and understood. I do not take a position for or against the addition of this class (I'm not yet sure of either decision), nor do I hold that "rules" must always be respected. I just tried to find a good reason why such a class has not already been introduced (and thus to facilitate the discussion).

Best,
Pavlos

Post by Martin Doerr (8 November 2022)

Dear All,

I just want to remind everyone that we have a principle, stated explicitly in the introduction of the CRM, not to add classes without distinct, sufficiently relevant properties of their own. Following it, we purged a lot of very useful classes from the CRM, because it is "base".

I prefer not to hear again "if we don't like a class". I kindly ask members to delete such terms from our vocabulary.

Any argument in favour of a class in CRMbase which has no more semantics than multiple IsA must be measured by this principle, and not by likes.

If the principle is to be abandoned again, please make an issue. If the principle is unclear, please make an issue.

Any issue for adding more custom classes to the RDFS is to be discussed.

Best,

Martin
 

Post by Martin Doerr (8 November 2022)

Dear All,

Apologies, I missed some messages in this thread... Indeed "Human-Made Feature" is an interesting case. I prefer to review the arguments for Human-Made Feature and Human-Made Object. They are actually more complex, because features and objects differ with respect to Move, "bears feature" etc.

I would also prefer a discussion about adding more custom classes to the RDFS, such as "Active Destruction", rather than increasing CRMbase.

Best,

Martin

Post by Martin Doerr (8 November 2022)

...apologies for writing piecewise!

Here are some arguments for and against a Linguistic Appellation:

A) In order for an Appellation to become language specific, does it need some special traits?
   Can an Identifier become language specific? A place primitive?
   If not, there must be more to such an Appellation, that is in favor of a subclass.

B) Is being language specific unique? Can names belong to 3 languages, but not to others?

C) How does an instance of Linguistic Appellation come into being? If it is acquired later, it would be an accidental property, not an essential one. Then the Linguistic Appellation would have a potential to become part of a language.
The Human Made Feature comes into being in this combination exclusively.

D) Question: Maybe the language of an Appellation is a different property from that of a Linguistic Object; it may relate to it as adoption relates to genetic descent. Is that difference relevant? Or do we distinguish a name being commonly known in some language from being formed in that language (such as "Köln")?

Best

Martin
 

Post by Rob Sanderson (8 November 2022)

Thank you for the clarification, Martin!

I have proposed the justifications for deleting three further classes that do not, I believe, fulfil the criteria of being classes in CRM Base.

And indeed, let us judge these objectively and by the given criteria, rather than by subjective and personal preferences. If we come across a class that we simply cannot delete without irreparable damage to the ontology, at *that point* let us reconsider the criteria as being incomplete.

Thanks again,

Rob

Post by Mark Fichtner (8 November 2022)

Dear all,

while I must agree with Rob that the three classes he proposed for deletion are not a particular best practice in ontology building from a semantic point of view, I don't feel good about the direction the CRM is currently going in. At our museum we are following the CRM because it is the only "really standardized" standard for our domain. It is expressive enough for a full top level ontology while also covering the domain of cultural heritage. We are not interested in yet another standard that maps metadata in a very common way - we have enough of these, and if we wanted to use Dublin Core we would do so. The full potential of the CRM is what binds us to using it.

Concepts like "Title" are really important for our domain - it is one of the most important metadata fields for documentation in our museum. With the abolishment of properties and classes in CRM Base that were used a lot in the past the SIG and the CRM takes a turn away from the museum side of documentation towards being a very general ontology. While I know development may always hurt a little bit, this does not feel right in any way anymore. 

I am asking myself: Is this really what the CIDOC CRM should do? Is it possible for the CIDOC CRM to survive in comparison to standards that are more widely spread while abolishing its own user base? Do we really want a domain ontology extending CRM Base, called "CRM Museum Documentation Ontology", because we throw everything that is museum related out of CRM Base? At least I might have my long loved E84 Information Carrier back there... :D

No offense intended - just my two cents and the perspective of the GNM Nürnberg on the current CRM development...

Best,

Mark

Post by George Bruseker (9 November 2022)

Dear all,

Thanks Pavlos for pointing out the principle under discussion.

As this unexpectedly lively debate over a simple request for clearly documenting in the standard a central and useful class has illustrated, the principle in question appears ill formulated or incomplete since it sets a letter of the law but requires much 'spirit' to fill in how it is actually applied. One could call that 'likes'.

As Rob's reductio ad absurdum demonstrates, there are some rules for some classes and other rules for other classes. Yes to Sites, because ... like? No to Linguistic_Appellation because ... not like? I believe we would not like to and not benefit from deleting the classes he proposes for deletion although according to the letter of the law of the principle we ought to.

The class Linguistic Appellation comes out of an empirical process of ontology design and is a key concept for the community. It is not 'RDFS whatever'. I believe there is another principle (how do we weigh the many countervailing principles?) that says such classes should stay in or join the ontology. We should represent (in an ontologically correct manner) key concepts of the community. This is probably the justification for Site. But if it is the justification for Site, then it is a justification for Linguistic_Appellation. Names with languages are a core concept of the community (I don't imagine anyone cares to disagree?).

There are two more principles to call upon:

1) consistency of reasoning / application of rules
2) good practice as a standard

I think the first is self explanatory but let me elaborate more on the second. CIDOC CRM is meant to be a communication standard and to be the standard it is a ubiquitously self documenting enterprise, that should show, from itself, exactly what it means. The spec is a sort of formal ontology bible to users to adjudicate whether something is or is not, is right or is not right. CRM achieves its job of enabling communication between data and projects that otherwise are not connected, just when it is so transparently clear and self documenting that it is applied unambiguously. In attempting to apply Linguistic Appellation, which is a key concept in the domain, and not finding it in the standard, I ran into confusion because it has been kept out of the standard for what are shown above to be inconsistent and insufficient reasons. It ought not to have been because it is a key concept in the community and we should enable clear communication.

So I propose not to go on a class deletion spree, nor on a class creation spree, but rather to revisit the ambiguous principle and add to our description of the ceteris paribus conditions that we accept one that includes "important to the domain", and then to revisit E33_E41 in that spirit to see if we can give it full recognition, so that we can get on with our core mission, which is enabling communication across data structures via a clear, consistent and self documenting ontology.

Cheers,

George

Post by Athina Kritsotaki (9 November 2022)

Dear all,

I fully agree that we must follow the principles of ontology development and remove classes that do not fulfil the criteria for being classes in CRM Base. But, in my opinion, for specific classes of this kind (those that seem not to fulfil the criteria because they have no properties), such as Inscription, we should raise an issue not to delete them but to discuss the alternatives to removing the class, and perhaps to recall the initial purpose of the class or to check whether there is an open issue about it. For E34, there is issue 533. So my question is: what about the classes that we have introduced in CRM Base or in other compatible models, such as S7 Simulation or S5 in CRMsci, which have no properties at all? As I remember very well, the argument for introducing them (I am speaking for CRMsci) was that they are domain specific and not yet developed, but that we intend to develop them in future. Should we delete them? E34 has not been developed, in my understanding, and it is now replaced by CRMtex. So the issue for this class, in my opinion, should be how we synchronize, not delete.

BRs,

Athina 

Post by Carlo Meghini (9 November 2022)

Dear Mark, all,

why not, then, have a domain-independent CRM core and an extension for museum documentation, perhaps generalized to CRMCH or something like that? Apologies if this proposal is already on the table and I missed it.

Modularity at the core would sort out agendas, keeping discussions about general questions, like space-time volumes, diachronicity, etcetera, well separated from those about titles, information carriers, etcetera. More importantly it would allow independent (but harmonized) evolution.

Carlo

Post by George Bruseker (9 November 2022)       

Hi Carlo,

To my thinking, while intellectually that would be the neatest of solutions, pragmatically it would be a huge problem not only for system developers who rely on the continuity of CRM but also socially and politically in terms of CRM's embeddedness in the museum community and its operation within the aegis of ICOM.

 

CIDOC CRM walks a fine line between being a pure top level ontology and a domain level ontology. This is both a disadvantage, because of the kinds of discussions we are having here and now, and an advantage, especially in that it keeps the model grounded in reality and not in the realm of pure theory, where we no longer know the relation of our modelling to any actual reality.

 

While of course any sensible proposition should be discussed, I think the above would be a non-starter as CRM would really be in danger of losing its identity and its social embeddedness with a community.

 

A project which abstracted the CRM to pure top level classes would certainly be most interesting and exciting, but it would probably be better done in a purely academic environment, from whose blue sky research the world of information management of CH institutions could benefit in the long run.

 

In the meantime, it's my opinion we have to and have accepted that CRM is an ontology that has roots in the museum cum CH community and should continue to maintain its fruitful and stable relation therewith. So we have to implement our principles such that we are able to grow organically with that community and its research interests and needs, respectfully listening and adapting to them, whilst keeping an eye to maintaining the generality of the ontology as a useful medium for discussing events and their participants in space time.

 

Cheers,

George

   

Post by Martin Doerr (9 November 2022)

Dear All,

I would like to focus on the semantic questions wrt E33_E41. Would it be well defined? Please remember that there were implementation arguments against multiple instantiation, not semantic ones. Therefore, we decided to solve the problem on the implementation side. Why the unlucky choice of two different labels would now warrant deeper semantics is not clear to me. We can solve the issue by deciding on a label.

If there are possibly deeper semantics, as I indicated in my last message, could we specify this?

Is the language of an E33_E41 the language it was created within, made for, or used by? Is it a language or the language speakers? What substance makes an Appellation "languageable"?

Can someone take a position? If this stays unclear, I vote for the current solution.

Cheers,

Martin 

Post by Rob Sanderson (9 November 2022)

 

To re-merge the threads, apologies for the duplication...

 

The language of an E33_E41 is the language in which the linguistic content of the entity is expressed, per P72_has_language. 

 

For example, 

 

The language of the name of Douglas Adams (the Person) that has the symbolic content of "Douglas Adams" is English.

The language of the name of Douglas Adams (the Person) that has the symbolic content of "دوغلاس آدمز" is Arabic.

 

These are clearly expressed in a language, and appellations, and symbolic.

 

Or:

 

@prefix crm: <http://www.cidoc-crm.org/cidoc-crm/> .
@prefix eg: <http://example.org/> .  # hypothetical example namespace

eg:Q42 a crm:E21_Person ;
  crm:P1_is_identified_by [
    a crm:E33_E41_Linguistic_Appellation ;
    crm:P190_has_symbolic_content "Douglas Adams" ;
    crm:P72_has_language <uri-for-English> ] ;
  crm:P1_is_identified_by [
    a crm:E33_E41_Linguistic_Appellation ;
    crm:P190_has_symbolic_content "دوغلاس آدمز" ;
    crm:P72_has_language <uri-for-Arabic> ] .

 

E33_E41 is a super-class of E35, which is semantically narrower through its scope note as applying only to "works", and "can be clearly identified as titles due to their form". I don't think anyone would say that "Douglas Adams" is the "title" of the person.

 

Rob

    

Post by George Bruseker (9 November 2022)

Dear both,

I have to agree with Robert, I basically can't even conceive how this is an argument. Obviously names come in languages MOST of the time. This is a basic feature of living in a human society, is it not? Is this not a base experience of being embodied as a human being that we all commonly have access to and is an undeniable reality? We live in language. We name things in language. 

But if we need to show that it was documented in a database, here are some:

Getty AAT wrenches
https://www.getty.edu/vow/AATFullDisplay?find=spanner&logic=AND&note=&en...

Wrench has a name in many languages.

Here is geonames:

https://www.geonames.org/6094817/ottawa.html

Ottawa, has many names in different languages.

Here is VIAF

https://viaf.org/viaf/15873/#Picasso,_Pablo,_1881-1973

Picasso has names in different languages.

This could be done ad infinitum. How many databases should be listed before this is accepted as an empirical part of reality documented in CH and needing a class?

Or, to put it another way: if one only lived in a world of CRMese and knew nothing else about the world in itself, understanding what E33_E41 is would just be a question of understanding what E35_Title is and then taking the conceptual leap that it can be applied to E1. That's it! Names, in a language, can be applied to anything in human societies. And they often are, including in documentation, so the standard should represent this and perspicuously document that representation. Not everything is totally inscrutable.

Cheers,

George

Post by Martin Doerr (9 November 2022)

Dear both,

The question was not if names can belong to language, or if languages create names. It was how this is unambiguously defined.

The example below is what I feared. Does the fact that the Arabic script is mainly used for Arabic make a transcript of an English name "Arabic"? Why not Farsi? I ask here for the Librarians to express their opinion.

Why is Douglas Adams not "German"? I would use it in German exactly in this form.

But "Adams" I  think is a last name exclusive to English, as Dörr to German.

What is the language of "Martin", "Martino",  of 

Martin: Identical in English, Spanish, French, Dutch, German, Norwegian, Danish, Swedish? Martino in Italian, Rumanian?

From Wikipedia: "Joshua".

Josua or Jozua is a male given name and a variation of the Hebrew name Yeshua.[1][2] Notable people with this name include:

Following scripts, only  יְהוֹשֻׁעַ would be Hebrew, but Yeshua English?

Post by George Bruseker (9 November 2022)

Dear Martin,

I don't see an ontological problem here. One name can be used by / in many languages. If it is, that can be documented.

    The question was not if names can belong to language, or if languages create names. It was how this is unambiguously defined.

It isn't our job as ontologists to unambiguously define the instances of things in the world. This is for the domain specialists.
 

    The example below is what I feared. Does the fact that the Arabic script is mainly used for Arabic make a transcript of an English name "Arabic"? Why not Farsi? I ask here for the Librarians to express their opinion.

Who documents the object, documents their knowledge and, hopefully, thereby, the state of affairs in the world. 

I don't understand the Farsi aspect of the above question. Why would transliterating a name into English from Arabic make it Farsi? Librarians?

Here's a person with a name: https://en.wikipedia.org/wiki/Averroes

His name is ابن رشد in Arabic and also أبو الوليد محمد ابن احمد ابن رشد.

With E33_E41 we can say that. Without it, we can't.

His name in English is usually Averroes and also he is known as Ibn Rushd.

With E33_E41 we can say that. Without it, we can't.

He has a transliterated name: Abū l-Walīd Muḥammad Ibn ʾAḥmad Ibn Rušd . Is that his name in Arabic or English or no language? I don't know. Both? Maybe. I'm not a scholar of philosopher's names and it's not my province to judge. This is not the domain of the ontologist but the specialist in onomastics or the appropriate discipline.

    Why is Douglas Adams not "German"? I would use it in German exactly in this form.

Then put in the KB for this name 'has language English' and 'has language German' and the problem is solved.
 

    But "Adams" I  think is a last name exclusive to English, as Dörr to German.

    What is the language of "Martin", "Martino",  of 

    Martin: Identical in English, Spanish, French, Dutch, German, Norwegian, Danish, Swedish? 

If that is what the expert in onomastics thinks, yes. Not an ontological issue. We provide the semantic framework, they do the researching.
 

    Martino in Italian, Rumanian?

    From Wikipedia: "Joshua".

    Josua or Jozua is a male given name and a variation of the Hebrew name Yeshua.[1][2] Notable people with this name include:

        Josua Bühler (1895–1983), Swiss philatelist
        Josua de Grave (1643–1712), Dutch draughtsman and painter
        Josua Harrsch (1669–1719), German missionary
        Josua Hoffalt (born 1984), French ballet dancer
        Josua Järvinen (1871–1948), Finnish politician
        Josua Koroibulu (born 1982), Fijian rugby league footballer
        Josua Heschel Kuttner (c. 1803–1878), Jewish Orthodox scholar and rabbi
        Josua Lindahl (1844–1912), Swedish-American geologist and paleontologist
        Josua Maaler (1529–1599), Swiss pastor and lexicographer
        Josua Mateinaniu (fl. 1835), Fijian missionary
        Josua Mejías (born 1997), Venezuelan footballer
        Johann Josua Mosengel (1663–1731), German pipe organ builder
        Jozua Naudé (disambiguation), several people
        Josua Swanepoel (born 1983), South African cricketer
        Josua Tuisova (born 1994), Fijian rugby union player
        Josua Vakurunabili (born 1992), Fijian rugby union player
        Josua Vici (born 1994), Fijian rugby union player

    Following scripts, only  יְהוֹשֻׁעַ would be Hebrew, but Yeshua English?

This is a question for the knowledge base. The English speaker writing this article thinks that "Josua" applies to these people. It is up to them to instantiate an instance of the class, call it Hebrew and then assign it as a name of those individuals. If someone wants to dispute this, they can use negative properties. I don't know if the above wikipedia article is true or not, but I would like to be able to represent that data in the KB so that I could try to find out.

So, not sure why that's a blocker.

Best,

George

Post by Rob Sanderson (9 November 2022)

Unsurprisingly, I agree with George.

The quantification of P72 has language is many to many, necessary.

Meaning that a Linguistic Object can have many languages, and each Language can be the language of many Linguistic Objects.

So, if you wanted to say that my name is "Robert Sanderson", and that Name P72_has_language English and the same entity P72_has_language French ... no problem. That they are pronounced differently in those two languages (Roh-bit vs Roe-bear) is interesting, but not a symbolic (nor propositional) concern.

Combined with the open world assumption, saying that my name is "Robert Sanderson" in English and French doesn't preclude it from also being the symbolic representation of my name in German or Dutch.

So per George's response, I think there's no philosophical issue. Per mine, there's no technical issue. And per previously, there's no scoping / inclusion logistics issue.

I assume the next step is to propose a scope note and formal definition?

Rob
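
The many-to-many reading of P72 and the open world assumption described in the post above can be illustrated with a few triples. This is a minimal Python sketch, not a CRM implementation: the eg: identifiers and the three helper functions are hypothetical, and a plain set of tuples stands in for a real triple store.

```python
# Hypothetical knowledge-base fragment: one name node with two P72 statements.
TRIPLES = {
    ("eg:name1", "rdf:type", "crm:E33_E41_Linguistic_Appellation"),
    ("eg:name1", "crm:P190_has_symbolic_content", "Robert Sanderson"),
    ("eg:name1", "crm:P72_has_language", "eg:English"),
    ("eg:name1", "crm:P72_has_language", "eg:French"),
}


def languages_of(name, triples):
    """All languages stated for a name (a name may have many)."""
    return {o for s, p, o in triples
            if s == name and p == "crm:P72_has_language"}


def names_in(language, triples):
    """All names stated to be in a language (a language may have many)."""
    return {s for s, p, o in triples
            if p == "crm:P72_has_language" and o == language}


def language_status(name, language, triples):
    """Open-world reading: a missing statement is 'unknown', not 'false'."""
    return "stated" if language in languages_of(name, triples) else "unknown"


print(sorted(languages_of("eg:name1", TRIPLES)))
print(language_status("eg:name1", "eg:German", TRIPLES))
```

Adding a German or Dutch statement later would simply extend the set; nothing already recorded has to be retracted.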

Post by Gordon Dunsire (10 November 2022)

 All
 
A librarian expresses an initial opinion:
 
What about gender of a name? E.g. "Gordon" is male; "Gordana" is female. The Library of Congress has only recently stopped assigning gender to the referent of a name, which has resulted in howlers like "Robert Galbraith" (pseudonym of J.K. Rowling) is a male because the name is 'male'.
 
RDA: resource description and access is an implementation of the IFLA Library Reference Model (entity-relationship version). Names and titles are given equal treatment; the only difference between a 'name' and a 'title' is that 'title' is the traditional word for the 'name' of an information resource. Since LRM/RDA has four 'resource entities', we have 'title of work', 'title of expression', 'title of manifestation', and 'title of item'; all other entities have 'name': 'name of place', 'name of time-span', 'name of agent', etc.
 
This discussion exposes a further difference, but it is not absolute. A 'title' is usually composed of words, etc. taken from a natural language: "Ceci n'est pas une pipe" uses French words; "The treachery of images" uses English words; "La trahison des images" is back to French; "The wind and the song" is back to English ... On the other hand, a 'name' is usually composed of words, etc. that have no other use in natural language. But there are many counter-examples, and the distinction may not exist in a specific language group (e.g. Chinese?).
 
Although RDA has a property for 'has language of nomen' ('nomen' being the generic term for 'name/title', 'access point', and 'identifier'), the expectation is that it only has utility for 'title', but not 'name'.
 
The sibling property 'has script of nomen' has utility for names and titles. It is important for transliterations.
 

Post by Martin Doerr (10 November 2022)

Dear Gordon,

"The Library of Congress has only recently stopped assigning gender to the referent of a name",

That is interesting!

I'd kindly ask for your expert opinion, about the "language" of a name.

We had introduced the language property of a title because of the frequent cases of words of a natural language and their translations.

Here, my question is:

A) In library practice, do you associate a name with a language, and what would the rules be?

George wrote: "He has a transliterated name: Abū l-Walīd Muḥammad Ibn ʾAḥmad Ibn Rušd . Is that his name in Arabic or English or no language? I don't know. Both? Maybe. I'm not a scholar of philosopher's names and it's not my province to judge. This is not the domain of the ontologist but the specialist in onomastics or the appropriate discipline. "

I absolutely disagree with that. Can transliteration to another script change and produce a language-specificity? That is definitely an ontological question. Otherwise, we have no concept at all for this property.

My example of Joshua had another purpose: The spelling and pronunciation "Josua" is the one used in German, but not exclusively. "Joshua" in English (and?), may be Yeshua in Hebrew written in Latin script? If this is the case, they are variants shaped and used in different language groups. That would justify a language-specificity.

B) If the meaning of the language property we are seeking is not the language of the name, but the suitable use of the name, within a language group, for the named instance, then it is a subproperty of P1 and not of P72, such as "is typically identified in English by..." etc. That is an ontological question.

All the best,

Martin

Post by Rob Sanderson (10 November 2022)

Hi Martin,

No one is proposing anything other than P72. Please stop creating issues where none exist :)

"The Big Apple" is a name for the Place which is also known as "New York City".
Does anyone disagree that "The Big Apple" is in English with the precise semantics of P72, or that it is not a Name for that Place?

Rob

Post by George Bruseker (10 November 2022)

I agree. This is p72:

This property associates an instance(s) of E33 Linguistic Object with an instance of E56 Language in which it is, at least partially, expressed.
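Applied to a name, this definition requires the name to be an instance of E33 Linguistic Object as well as of E41 Appellation. A minimal sketch of that multiple instantiation follows, with plain Python tuples standing in for RDF triples. All "ex:" identifiers are hypothetical, and the "crm:" names follow the naming pattern of the RDFS file but are written here from memory, so they should be checked against the published file.

```python
# Minimal sketch: triples as (subject, predicate, object) tuples.
# All "ex:" identifiers are hypothetical and used only for illustration.
triples = {
    ("ex:nyc", "rdf:type", "crm:E53_Place"),
    ("ex:nyc", "crm:P1_is_identified_by", "ex:name_1"),
    # Multiple instantiation: the same node is typed with both classes.
    ("ex:name_1", "rdf:type", "crm:E41_Appellation"),
    ("ex:name_1", "rdf:type", "crm:E33_Linguistic_Object"),
    ("ex:name_1", "crm:P190_has_symbolic_content", "The Big Apple"),
    ("ex:name_1", "crm:P72_has_language", "ex:english"),
    ("ex:english", "rdf:type", "crm:E56_Language"),
}


def types_of(node):
    """Return the set of classes a node is declared to be an instance of."""
    return {o for s, p, o in triples if s == node and p == "rdf:type"}


def p72_domain_ok(node):
    """P72 has domain E33, so its subject must be typed as E33."""
    return "crm:E33_Linguistic_Object" in types_of(node)
```

Here `p72_domain_ok("ex:name_1")` holds only because the name node carries both types; a node typed as E41 alone would fail the check.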

A mountain is surely made of a molehill here. 

Can a name be expressed in a language? Yes. Can someone recognize this and document it? Yes. Does this happen all the time? Yes. Should the standard express it? Yes. We already agreed on this. That is why it is in the rdfs. But in practice this makes it hard for people to apply it. So the proposal is to please make it part of the standard so people can exchange information.

Post by Martin Doerr (10 November 2022)

Hi Robert,

No one questions the existence of names with a language. I do not remember doing that.

I simply try, as often, to teach the group principles.

The scope note for E41 explicitly says:

"Different languages may use different appellations for the same thing, such as the names of major cities. Some appellations may be formulated using a valid noun phrase of a particular language. In these cases, the respective instances of E41 Appellation should also be declared as instances of E33 Linguistic Object. Then the language using the appellation can be declared with the property P72 has language: E56 Language."

Maybe it is not clear what I am discussing.

As long as, on the application side, we declare E41_E33, it is still up to the user to decide which sense of linguistic object we apply.

If the proposal is to introduce a new Multiple IsA class into CRMbase, all good practice requires to write a scope note and to clarify in which sense it is a linguistic object and the property P72 is applied. The class instance itself is then associated with the property, and not its incidental use.

This is a general ontological principle we apply, sine qua non. If we ignore that, we would further create a conflict with P139:

"This property should not be confused with additional variants of names used characteristically for a single, particular item, such as individual nicknames. It is a directed relationship, where the range expresses the derivative or variant and the domain the source of derivation or original form of variation, if such a direction can be established. "

That was my comment below.

What is your opinion: does anything prevent any name from being used in any language?

Therefore, I follow the usual discourse to help identifying ontological distinctions.
For instance, https://www.behindthename.com/name/joshua provides languages to name variants. This use should definitely be included if we decide a new class for CRMbase.

If we decide to also include in P72 a "made for" or "characteristically used by" sense for E41-E33, we have to describe that. If we decide against, we have to decide that. If we are not able to decide, we leave the application in the RDFS as is, because it is not mature for ontological standardization at an international level, but resolved application-wise.

I asked for Farsi, because to my best knowledge, Iran uses the Arabic script. Therefore a transcription of Douglas Adams to Arabic script equally applies to Farsi and Arabic. As such, it would not be "Arabic".

I hope that makes my reasoning clearer.

All the best,

Martin

Post by Gordon Dunsire (11 November 2022)

All
 
Back to the language characteristics of names and titles ...
 
First, to recap more detail about the LRMer and RDA treatment of 'names'.
 
The LRMer uses the class Nomen, defined as "An association between an entity and a designation that refers to it". All instances of the class are reifications of the statement "This instance of an entity has an appellation that is this designation". The designation is known as a 'nomen string', and it is the datatype object of the property LRM-R13 "has appellation". The property declares a range of Nomen, so an instance of nomen is an object type object of the property. The reification is used to distinguish instances of the same designator being assigned to multiple instances of one entity or multiple entities.
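The reification described above can be sketched as a small data structure. This is only an illustration of the pattern, not the LRMer or RDA data model itself; the class layout and the "ex:" identifiers are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Nomen:
    """Reified statement: 'this entity has an appellation that is this string'."""
    entity: str        # identifier of the named entity (hypothetical IDs)
    nomen_string: str  # the designation itself, a plain string


# The same designation assigned to two different entities yields two
# distinct Nomen instances, even though the strings are equal.
n1 = Nomen(entity="ex:person_1", nomen_string="Joshua")
n2 = Nomen(entity="ex:person_2", nomen_string="Joshua")


def nomens_with_string(nomens, s):
    """Return every Nomen whose designation equals the given string."""
    return [n for n in nomens if n.nomen_string == s]
```

Because each instance carries its entity, any further attribute (such as a language) attaches to one particular naming and not to the bare string.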
 
RDA implements Nomen and "has appellation" by distinguishing three categories of designation/appellation: name/title; access point, identifier. RDA does not treat an IRI as a nomen; that is, all appellations are strings, and a nomen IRI is a "thing". RDA adds sub-properties of LRM-R13 that are specific to these categories, but does not sub-class Nomen. The ontological distinction of the categories lies in the kind of agent/actor who assigns the designation/appellation, the syntax of the nomen string, and the context of its application in bibliographic metadata. A name/title is assigned by a creator of an information resource in the syntax of common discourse and the context of promoting the resource to its user; an access point is assigned by a creator of metadata for the resource and instances of associated entities in the syntax of a 'string encoding scheme', 'authority file', etc. and the context of collocating the instances in an ordered list; an identifier is assigned by a creator of metadata for instances of any entity in the syntax of an identifier assignment scheme that is typically mediated through an algorithm and the context of direct access to the resource when the identifier is known.
 
RDA does not expect an identifier to be based on language or translatable; the relationship between two identifiers for the same entity is a 'mapping', not a translation. [RDA does not forbid this; two identifiers that are based on a name or title that is translated might be considered to be translated themselves. This often occurs in official publications of multi-lingual governments.]
 
RDA does not expect an access point to be translated. The component strings of two access points for the same instance may be different language versions, but they are assembled into the structured strings of each access point by applying the same string encoding scheme; the result is two distinct access points, not a translation of one access point into another. This is illustrated by VIAF (Virtual International Authority File) which labels individual access points by assigning agency, not language. [Again, this is not forbidden in RDA.]
 
RDA expects titles of manifestations to be translated. Some manifestations bear statements of title, responsible agents, and other associated entities such as place of publication in multiple languages and scripts ("parallel statements"), but it is rare for names of persons to be 'translated'. Some manifestations are subsequent translations of an existing manifestation, and it is less rare for names of persons and other entities to be 'translated' with local versions.
 
RDA does not generally expect names of persons to be translated. An exception is a name that includes an epithet, such as "Thomas the Rhymer" (although VIAF has no 'translated' version).
A quick Google search does not reveal a translation of "The Big Apple", so I guess that translation of names (of agents, places, etc.) that include epithets is unusual. There is an interesting article on translating geographic names, aptly entitled "Navigating through treacherous waters" (https://translationjournal.net/journal/28names.htm).
 
RDA expects all designation/appellations to be transliterated.
 
To answer Martin's questions:
 
A) In library practice, do you associate a name with a language, and what would be the rules?
 
GD: Yes. The LRM and RDA provide a property for language of a nomen. The LRM defines this in the context of the scheme to which the nomen belongs; that is, the language covered by the assigning agent. The RDA property does not provide semantics that add to its title "has language of nomen" or specific options for its use, because the 'waters are treacherous'.
 
Can transliteration to another script change and produce a language-specificity?
 
GD: I don't think so.
 
B) If the meaning of the language property we are seeking for is not the language of the name, but the suitable use in a language group of the name for the named instance, then, it is a subproperty of P1 and not P72. Such as "is typically identified in English by ...", etc. That *is* an ontological question.
 
GD: I would say that, in the library context, it is 'suitable use in a language group'.
 
 

Post by Francesco Beretta (12 November 2022)

Dear Martin, all

Sorry to intervene so late in this interesting exchange, I was away for some days and I'm going through my emails now.

I encountered the same questions while working a few years ago in a history project interested in the evolution of the use of names and surnames.

The approach of the project was similar to the one presented by Martin below and amounted to saying that it is difficult to state to which language a first name, or surname, belongs in itself, except for some cases or if we consider the region of origin, but what is relevant is that this specific string of characters is used at a given time (and attested in the sources) in a language or in another (i.e. in a society speaking this language) to identify a person or an object.

To capture the information envisaged in the project in the sense of this approach I decided to stick to the substance of crm:E41 Appellation class:

"This class comprises signs, either meaningful or not, or arrangements of signs following a specific syntax, that are used or can be used to refer to and identify a specific instance of some class or category within a certain context. Instances of E41 Appellation do not identify things by their meaning, even if they happen to have one, but instead by convention, tradition, or agreement." (CRM 6.2).

and to add in what has become the SDHSS CRM unofficial extension the sdh:C11 Appellation in a Language class.

This class has, as you'll see, a clear social, i.e. intentional, flavor, and captures the information that some appellation is considered a valid appellation of a thing in a language (i.e. a society speaking this language) during an attested time-span.

This was also an attempt to cope with the frbroo:F52 Name Use Activity issue:

413 Pursuit and Name Use Activity to CRMsoc
573 CRMsoc & F51 Pursuit & F52 Name Use Activity

which is somewhat slowed down by the ongoing exchanges around the nature and substance of the social world as foundation of the CRMsoc extension.

But one could easily provide another substance to an Appellation in a Language class making it a Name Use Activity (in a Language) class (and subclass of crm:E13 Attribute Assignment or crm:E7 Activity).

This would be in my opinion a good way of coping with the wish expressed by George at the beginning of this exchange to "make [this kind of classes] full classes in the standard so that they are fully vetted and controlled. It is a fundamental class. It should be in the standard in the first place", wish that I definitely share. And also to stick, as far as I can understand, to the modelling principles reminded by Martin.

And it would also finally solve the issues still open, to my knowledge, concerning the original FRBR-oo class.

Best

Francesco

Post by George Bruseker (13 November 2022)

Here is a fun example of a linguistic object which I guess challenges P72 but is still actually diaskedastic and perithoric to our enterprise, brought to you by the great Zolotas

https://youtu.be/2XAcuxFqk9k

In what language is it? In what language is this email? 

And is it in our capacity as ontologists that we would decide?

Post by Martin Doerr (13 November 2022)

Dear Francesco, dear Gordon,

Thank you very much!

It also appears to me that a large part of the semantics we are discussing is a property of the relation between a particular and a name, and not of the name. The LRM Nomen has consequently been modelled in LRMoo as

 "F12 Nomen
Subclass of: E89 Propositional Object
Scope note: This class comprises associations between an instance of any class, and signs or arrangements of
signs that are used to refer to and identify that instance."

You both confirm this in different ways. I had been talking about a property of P1 is identified by, Francesco about the Name Use, all variants of the same pattern. I think we should follow this thought.

By the way, I have been working in ontology engineering methodology for 30 years, and obviously would never propose to decide instances. I don't know how I could be interpreted that way.

The ontologist must provide a definition of the "intension" of a class or property. This definition, be it appealing to common sense or in terms of logical rules or anything in between, must enable domain experts with a reasonable precision to decide on a justifiable base, if an instance belongs to the concept yes or no. This should approximate at least some professional practice. A "fuzzy zone" or area of undecidable instances always exists. In CRM-SIG, we always explore this by examples. If, e.g. for an arbitrary name there are no limits to any expert to which language it may belong, it demonstrates a weakness of the concept. In this case, we seek to understand all possible similar senses, to find out if the intuitive concept actually hides a quite substantial concept of different logic.

So, of course, as ontologists we must understand examples, or find a large enough group of experts that are able to understand them, but we define intensions. That should have been obvious.

And yes, why shouldn't Zolotas speak a Greek-English mix? Obviously, it is neither German nor Chinese nor Arabic... that would not challenge P72, since Quantification is not unique, and many European languages are not so alien to each other.

All the best,

Martin
 

Current Proposal: 

Post by George Bruseker (13 November 2022)

Dear all,

Given the open discussion of the utility of and need for this class, which illustrates a clear precedent and use for it, as well as the overarching principle that elements of the standard need to be clearly documented, I propose that the class be added not just in the RDFS (where it is already official) but also be given a scope note and added to the basic documentation, in order to support inter-dataset communication with clarity and consistency.

Best,

George

Post by George Bruseker (5 December 2022)

Dear all,

Issue 624 can be found here: https://cidoc-crm.org/Issue/ID-624-add-e33e41linguisticappellation-to-th...

The discussion revolves around adding a class to the specification and not just the rdfs which represents the phenomenon of names being in languages.

The homework for the issue can be found in this google doc: 

https://docs.google.com/document/d/1-l6OrEy8I3doP5cCm5dTzLav6SwE2prxBtpP...

Best,

George

Post by Francesco Beretta (6 December 2022)

Dear all,

Reconsidering the whole exchanges in this issue, and the examples, notably those by Martin on November 9th, it appears that the information we want to model is:

this instance of E41 Appellation (i.e. a name as identifier of an entity) is used in this language (E56) — formerly or now, this is another topic.

So, the simplest solution (as a shortcut of longer ones but making sense in the context of the examples brought by George) is to add a property:

E41 Appellation --> is (was ?) used in --> E56 Language.

This solution avoids adding persistent item classes, which is somewhat cumbersome; it copes with the problem and brings the information to the conceptual model in a concise and stringent way, without engaging in the ontological discussion about the language in which an appellation is (was it created in this language? is it used as such? etc.).

The substance of the property, given all the examples you brought, seems to be quite clear: we can observe (through text and speech acts) that an appellation is used in a language as a valid identifier of an entity.

Best

Francesco   

Post by Martin Doerr (6 December 2022)

Dear All,

I think this is a good proposal as far as the Appellation itself is concerned. It solves the names of universals and the personal name provenance.
It does not, however, reflect the appellations for specific instances in a given language, such as the tens of thousands in the TGN.

The subtle point is that rdf label is private to the domain instance, as is LRM Nomen. Therefore the RDF language tags on labels are not on the Appellation, but on the link instance, and hence cannot be transferred to such a model.

Using a range property instead of a link property is a logical error, because it creates non-sensical associations.
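One reading of this concern can be sketched as follows. The data is entirely hypothetical: a single shared appellation node is assumed to name two different entities, each in a different language context. The sketch compares attaching the language to the appellation (the range of P1) with attaching it to the individual naming link.

```python
# Hypothetical data: one shared appellation node names two entities.
identifies = [
    # (entity, appellation, language context of this particular naming)
    ("ex:person_a", "ex:appellation_josua", "ex:german"),
    ("ex:person_b", "ex:appellation_josua", "ex:swedish"),
]

# Variant 1: language attached to the appellation (the range of P1).
# The link-level context is lost; all languages pool on the shared node.
lang_of_appellation = {}
for _entity, app, lang in identifies:
    lang_of_appellation.setdefault(app, set()).add(lang)


def names_range_variant(entity, lang):
    """Names of an entity whose appellation node carries the language."""
    return {app for e, app, _ in identifies
            if e == entity and lang in lang_of_appellation[app]}


# Variant 2: language attached to the individual naming link.
def names_link_variant(entity, lang):
    """Names of an entity where this specific naming carries the language."""
    return {app for e, app, l in identifies if e == entity and l == lang}
```

Under variant 1, asking for the names of `ex:person_a` in `ex:swedish` returns the shared appellation although that association was never stated; under variant 2 the same query returns nothing.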

Which of all these do we want, and how to model the latter?

Best,

Martin

In the 55th joint meeting of the CIDOC CRM and ISO/TC46/SC4/WG9; 48th FRBR/LRMoo SIG meeting, the SIG reviewed the proposal by GB & RS to introduce class Exxx Name into the CIDOC CRM specification, as the equivalent of the E33_E41_Linguistic_Appellation that has been minted for the rdfs implementation.

The proposal spurred a lively debate concerning the utility of such a class.

Way to move forward
HW assigned to MD to formally make a counter-argument, cast in terms of a strict example showcasing logical problems that ensue from admitting class Exxx Name into CIDOC CRM. This way, the SIG will make an informed decision, either dispensing with it or altering the proposal seeing that it does not create the type of logical problems that were alluded to.

For details of the proposal and overall discussion, see attached document

Belval, December 2022

 

Post by Martin Doerr (14 December 2022)

Dear Francesco, dear George,

After the discussion in the last CRM SIG meeting, I propose to follow Francesco's "sdh:C11 Appellation in a Language class." as a longpath for P1.

I propose to generalize the context. It could be a language, it could be a country, a Group. I propose to analyze, if this can be mapped or identified with LRM Nomen and its properties. It can further be made compatible with the RDF labels with a language tag, which are domain instance specific and not range specific, and of course can represent the TGN language attributes. For VIAF, we would need a "national" context, i.e., the national library.

Best,

Martin

Post by Rob Sanderson (15 December 2022)

This doesn't meet the requirements, unfortunately.

sdh:C11 is a temporal entity -- the state of being named something -- and not a name itself. While interesting, as previously States have been widely decreed as an anti-pattern to be avoided, it does not meet the requirements set forth for E33_E41, which is that an Appellation itself can have a Language.

So I believe that this does not solve the problem as stated - that E33_E41_Linguistic_Appellation does not have a description outside of the RDFS document.

Rob

Post by Martin Doerr (15 December 2022)

Dear Robert,

On 12/15/2022 4:57 PM, Robert Sanderson wrote:
>
> This doesn't meet the requirements, unfortunately.

To my best understanding, and that of others on this list, you have not so far made sufficiently clear which semantics the linguistic Appellation should comprise.  Following our methodology, requirements must be backed up by representative examples that allow for narrowing down the senses to be comprised. They do not come from authority.

Most examples provided so far did not demonstrate the independence of the language specificity of the Appellation from the individual identified by it, but exactly the opposite. The difference is a matter of fundamental logic of semantic networks, and cannot be ignored.

Examples must be sufficiently representative for a large set of data. TGN, for instance, is huge, and domain-instance specific. VIAF refers to national libraries, not to languages. "The Big Apple" is a rather rare case of a complete English noun phrase used as a place name, which exactly fits the scope note of E41. It could be documented as Title. Transliteration, you mentioned, does not create a language specificity, but a script specificity.

Please respect that it belongs to our method to discuss, if the sense of an original submission actually represents the best semantics fit for purpose, and to modify it if needed. I simply act here, as any CRM-SIG member should, as a knowledge engineer based on the examples you and others provided and try to propose the most adequate solution, and not to defend any position. I do not have any other project of my own. Please stay in your answers on the level of arguments based on representative examples and their interpretation.

>
> sdh:C11 is a temporal entity -- the state of being named something -- and not a name itself. While interesting, as previously States have been widely decreed as an anti-pattern to be avoided, it does not meet the requirements set forth for E33_E41, which is that an Appellation itself can have a Language.

Indeed I may not describe C11 as a State in the sense we discussed it. It is as timeless as all our properties of persistent items. States are better avoided if temporal inner bounds are to be given, because they require complete observation, a sort of Closed World. This is not the case here. But this distracts from the question to what the language here pertains.

To repeat, if E33_41 is to enter unmodified CRMbase as you propose, it needs a scope note and examples that disambiguate scope and senses.  Then, it must be differentiated from domain-instance specific use, and the relevance of the remaining scope must be argued. All examples must be discussed and voted for.

Rather than an anonymous "requirement set forth", I definitely would like to see your examples of use of E33_41 in your applications. Is that possible? Are you sure they fit the independence from the domain instance? Are you sure there will be no abuse in the sense I, Francesco and LRM propose?

Best,

Martin

Post by George Bruseker (15 December 2022)

Dear both,

I'm too covidy still to follow this in detail but I think the issue was left, for the notes to show, for Martin to provide an example to show the problem he sees that we cannot see.

Cheers,

George

Post by Martin Doerr (16 December 2022)

I am working on that!

Best,

Martin

Post by Francesco Beretta (16 December 2022)

Dear Martin, Rob,

If we consider the intended phenomenon in reality, we can observe (through everyday experience or documentation) that humans use names for identifying things. Insofar as humans live in cultural contexts, and these are realized through languages, these names are in some way related to, or valid in different languages.

If we stick to the ontological substance of E41 Appellation, we can observe that people can use the "The Big Apple" appellation to identify New York City even in sentences expressed in other languages than English, and possibly without even understanding the meaning of this expression.

This phenomenon, which occurs on Earth in billions of instances at every moment, can be expressed, or has been expressed in the context of CIDOC CRM in three ways:

  • in using frbroo:F52 Name Use Activity which, as a subclass of crm:E7 Activity, captures the information about the dynamic of human groups in space and time and thus in a linguistic context. One could interpret sdh:C11 Appellation in a Language in this sense and add a property situating the activity in a linguistic context
  • in using LRM Nomen as Martin proposes. The concerned propositional object captures the intentional content (as social philosophers would say) of the belief that this appellation is validly usable, i. e. understandable in this language in order to identify a thing
  • in using sdh:C11 Appellation in a Language as it was originally modelled in a social perspective, i.e. as a subclass of Intentional State or State of Mind, situating in a temporal region (as temporal phenomenon) the fact that a thing is considered as being validly named with this appellation in a linguistic and social context. This is the perspective of CRMsoc with a domain that complements CRMbase from the 'inside' perspective (in the sense of intention carried in the individual minds) and (indirectly) through observable phenomena and documentation. Therefore not a State as alternative to Event (in the same CRMbase domain) but something else, a sort of quality of the minds of the believers —a state of mind— of the LRM Nomen instance.

This said, one can consider the property crm:P1 is identified by (identifies) as a shortcut and abstraction of this phenomenon, regardless of the ways of expressing it summarized above, relating the intended entity with an appellation of it.

In the same perspective of abstraction and simplification, and in my opinion as a robust way, without adding subclasses of Persistent Item which risks to be cumbersome and their substance not well defined and rigid/disjoint, I'd be in favor, as already expressed, of adding an additional property:

E41 Appellation --> P... is used in --> E56 Language.

as a shortcut of another aspect of  sdh:C11 Appellation in a Language (without engaging for the moment in defining what this class is)

This solution seems to cope with the problem and brings the information to the conceptual model in a concise and stringent way, without engaging in the ontological discussion about the language in which an appellation is (was it created in this language? is it used as such? etc. etc.).

The substance of the property, given all the examples you brought, seems to be quite clear: the property expresses the observable fact (in documentation and every day life) that an appellation is used in a language (by an intentional community or society — not necessarily a group with potential of acting together) as a valid identifier of an entity.

Best
Francesco

Post by Rob Sanderson (16 December 2022)

While this is interesting, the issue is not to re-engineer names and languages from first principles. There is an existing class, E33_E41_Linguistic_Appellation, which is in use in many projects and products around the world. The issue, as raised by George, is that it is thought that it would be better for this class to have documentation outside of the RDFS document that defines it technically. 

There are two possible outcomes of this issue:
1. It is agreed that there should be human-intended documentation for the class, and then that documentation gets written for E33_E41_Linguistic_Appellation.
2. It is not agreed that there should be human-intended documentation for the class, and documentation gets written outside of CIDOC-CRM.

Rob

Post by Thanasis Velios (19 December 2022)

Dear all,

Just to remind you that there is some documentation for E33_E41_Linguistic_Appellation in the RDFS implementation guidelines document:

https://cidoc-crm.org/sites/default/files/issue%20443%20-%20Implementing...

in the section "Language of an Appellation". Perhaps extending that section would be useful with any details that George and Rob think are missing. I have just noticed as well that the property 'P72 has language' is mentioned in a confusing way in that section as it does not apply directly to E41 and there is also no mention of multiple instantiation.

Following Francesco's message, isn't LRM Nomen a good choice to address links to language?

Thanasis

Meetings discussed: