Issue 407: Ordinal Property for E55 Type

Starting Date: 
2019-01-03
Working Group: 
3
Status: 
Open
Background: 

Posted by Stephen Stead & Robert Sanderson on 3/1/2019

During the discussions at the CRM-SIG meeting in Berlin in November 2018, the problem of dealing with instances of E55 Type that have ordinal relationships with other instances of E55 Type came up. A number of use cases were explored, including:

  • Condition report status values like Excellent, Good, Average, Poor, Critical where being able to query for all items that were below “Average” or “Good” and above would be useful.
  • Map scales expressed as types
  • Fire Hazard Ratings

This led Robert and me to suggest that a new property be created that allows this kind of ordinal relationship to be expressed. The quantification allows for parallel hierarchies, e.g. if someone has a type that is “slightly better than average but not quite good”, then they could align that with an existing hierarchy of Good > Average by saying that it is greater than “Average” and that “Good” is greater than both it and Average.

 

Pxx is conceptually greater than (is conceptually less than)

Domain: E55 Type

Range: E55 Type

Quantification: many to many (0,n:0,n)

 

This property allows instances of E55 Type to be declared as having an order relative to other instances of E55 Type, without necessarily having a specific value associated with either instance.  This allows, for example, for an E55 Type instance representing the concept of "good" to be greater than the E55 Type instance representing the concept of "average". This property is transitive, and thus if "average" is greater than "poor", then "good" is also greater than "poor". In the domain of statistics, types that participate in this kind of relationship are called "Ordinal Variables"; as opposed to those without order which are called "Nominal Variables". This property allows for queries that select based on the relative position of participating E55 Types.

 

Examples:

  * Good (E55) is conceptually greater than Average (E55)

  * Map Scale 1:10000 (E55) is conceptually greater than Map Scale 1:20000 (E55)

  * Fire Hazard Rating 4 (E55) is conceptually greater than Fire Hazard Rating 3 (E55)
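As a rough illustration of the transitivity described in the scope note, the sketch below stores a handful of "is conceptually greater than" statements and derives everything that lies below a given type. The statement list and the function name `all_lesser` are assumptions made for the illustration; they are not part of the proposal.

```python
from collections import defaultdict

# Hypothetical asserted statements, each read as
# (greater type, lesser type), following the condition-report use case.
ASSERTED = [
    ("Excellent", "Good"),
    ("Good", "Average"),
    ("Average", "Poor"),
    ("Poor", "Critical"),
]


def all_lesser(term, asserted):
    """Return every type that `term` is transitively greater than."""
    direct = defaultdict(set)
    for greater, lesser in asserted:
        direct[greater].add(lesser)

    found = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for nxt in direct[current]:
            if nxt not in found:
                found.add(nxt)
                stack.append(nxt)
    return found


# "All items that were below Average" would match these types.
below_average = all_lesser("Average", ASSERTED)
```

Only the four direct statements are asserted; the relation between "Good" and "Poor", for example, is derived rather than stored.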

 

Comments Welcome

Current Proposal: 

Posted by Franco Niccolucci on 3/1/2019

This proposal makes sense to me, and I would strongly support it.

Only, the name “is conceptually greater” is not completely appropriate, in my opinion. For example, “Good” is not ‘greater’ than “Poor”: it is ‘better’; “Old” is not ‘greater’ than “Young”; actually, except for wines, it is worse.

Maybe “conceptually precedes”, and “conceptually follows” for the reverse? This would reflect the ordinal character of the concerned types in a neutral way. Being a little cryptic would convey the generic value of a pre-defined order to the reader.

I am aware that such names are only labels, and in principle can be anything. But since we are christening the new property, a little effort to choose a more significant one could be done.

Furthermore, this introduction of ordinality leads me to ask “who said that?”: if some orders may be considered factual, e.g. “heavy” is greater than “light”, others are possibly not, being the consequence of a subjective appreciation: is “handmade” greater than “industrial”? But this is another story.

Posted by Daria Hookk  on 3/1/2019

Average->Good: next step of the characteristic in the direction of increase

 

 

Posted by Stephen Stead on 3/1/2019

Hi Franco, All
Happy New Year
If we change to Franco's suggested labels (which I think there is a strong case for) I am confused by the order he suggests as this seems to me to be the reverse of the original property. So to get the same ordering I would expect:-
Pxx conceptually follows (conceptually precedes)
Now if that means, to some, the opposite of the original property then we may have hit a snag with the suggested new property labels.

Posted by Franco Niccolucci on 3/1/2019

Dear Steve

You are right.

X < Y is equivalent to "X precedes Y" or "Y follows X” in an ordered set

So: “Good (condition) follows Poor (Condition)” “Small precedes Large” etc.

No snag, I just mixed up the concepts. In my defence, that is not uncommon when the E55 Type of one’s age (decidedly) follows E55 Type = Young.

Posted by Stephen Stead on 3/1/2019

Excellent then the revised property, scope note and examples would be:-

Pxx conceptually follows (conceptually precedes)

Domain: E55 Type

Range: E55 Type

Quantification: many to many (0,n:0,n)

This property allows instances of E55 Type to be declared as having an order relative to other instances of E55 Type, without necessarily having a specific value associated with either instance.  This allows, for example, for an E55 Type instance representing the concept of "good" to follow the E55 Type instance representing the concept of "average". This property is transitive, and thus if "average" follows "poor", then "good" also follows "poor". In the domain of statistics, types that participate in this kind of relationship are called "Ordinal Variables"; as opposed to those without order which are called "Nominal Variables". This property allows for queries that select based on the relative position of participating E55 Types.

Examples:

  * Good (E55) conceptually follows Average (E55)

  * Map Scale 1:10000 (E55) conceptually follows Map Scale 1:20000 (E55)

  * Fire Hazard Rating 4 (E55) conceptually follows Fire Hazard Rating 3 (E55)

 

How does that seem?

Posted by Martin on 3/1/2019

Dear All,

Very nice all that, but the critical question for a concept to enter CRM base is:

What is the scientific question in an information integration environment, that needs this property to make the relevant connection/ inference,

and further:

Why is that proposed for CRM base and not for SKOS?

and finally:

What is the coverage of problems that benefit from this property?

These concerns are part of the methodology we follow, and most substantial. We must make sure they appear in the "principles".

Posted by George Bruseker on 4/1/2019

Overall, I would say it sounds like a good proposition. I post my replies to Martin’s queries below.

With regards to label, I agree with Franco that greater/lesser is not right, but I’m also not sure that follows/precedes is a good terminology either. I don’t think it is very heuristically useful phrasing to say that ‘good' precedes ‘poor’.

In this case, because it is quite a technical matter, could we not make the label somewhat technical? Something like, is_ordinal_superior_to / is_ordinal_inferior_of? By qualifying the superior, we make clear we mean in the sense of some abstract scale rather than general greatness.

> Dear All,
>
> Very nice all that, but the critical question for a concept to enter CRM base is:
>
> What is the scientific question in an information integration environment, that needs this property to make the relevant connection/ inference,
>

To me it meets a known need in documentation where we have various qualitative relations of greater/lesser than some other concept. This happens also in conservation (conservation state = good, poor), in risk analysis (‘high’, ‘low’) and so on. I think the function it would serve is to be able to not lose these differentiations and to be able to query on them.

Which of these objects are in poor state and high risk of damage?

> and further:
>
> Why is that proposed for CRM base and not for SKOS?
>

We control CRMbase?

> and finally:
>
> What is the coverage of problems that benefit from this property?
>

Questions of qualitative judgments expressed relative to one another, as encountered in CH documentation.

> These concerns are part of the methodology we follow, and most substantial. We must make sure they appear in the "principles".
>

True.

Posted by Thomas Francart  on 4/1/2019

Hello

On Thu, 3 Jan 2019 at 19:05, Martin Doerr <martin@ics.forth.gr> wrote:

    Dear All,

    Very nice all that, but the critical question for a concept to enter CRM base is:

    What is the scientific question in an information integration environment, that needs this property to make the relevant connection/ inference,

    and further:

    Why is that proposed for CRM base and not for SKOS?


SKOS already deals with this use-case using skos:OrderedCollection.

Posted by Athanasios Velios on 5/1/2019

A recent example of the usefulness of such a property is from the integration of the 3 condition survey databases of the Ashmolean Museum
where we were trying to identify the percentage of the collection which requires conservation work across metal objects, scrolls and textiles.
Each condition survey had different options for "general condition". 
E.g. what was "good - average - bad" in one, was "excellent - good - poor" in another. One query could be: how many objects have condition
marked with an option which does not have an ordinal follower (i.e. the last in each list).
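The query described above can be sketched as follows, assuming each survey's options are linked by "ordinally after" statements. The term identifiers, object identifiers and function name are hypothetical; only the two option lists come from the example above.

```python
# Hypothetical statements, each read as (later term, earlier term):
# `later` is ordinally after `earlier` within its own survey vocabulary.
ORDINALLY_AFTER = [
    ("survey1:average", "survey1:good"),
    ("survey1:bad", "survey1:average"),
    ("survey2:good", "survey2:excellent"),
    ("survey2:poor", "survey2:good"),
]

# Hypothetical object records: object id -> recorded general condition.
CONDITION_OF = {
    "metal-001": "survey1:bad",
    "metal-002": "survey1:good",
    "scroll-001": "survey2:poor",
    "scroll-002": "survey2:good",
    "textile-001": "survey2:excellent",
}


def terms_without_follower(statements):
    """Return the terms that no other term is ordinally after."""
    terms = {term for pair in statements for term in pair}
    has_follower = {earlier for _later, earlier in statements}
    return terms - has_follower


# The last option of each list, found without merging the vocabularies.
last_terms = terms_without_follower(ORDINALLY_AFTER)

# Objects whose condition is the last option of their own survey.
worst_objects = sorted(
    obj for obj, condition in CONDITION_OF.items() if condition in last_terms
)
```

The two vocabularies stay separate here; the query relies only on the ordering statements within each one.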

I agree with George about not assigning value in the label. Why would "large" be after "small"? In fact, in conservation "good" is almost
always before "poor". I think the same applies to inferior/superior. I would suggest:

ordinally after (ordinally before)

skos:OrderedCollection looks good, but isn't it too closely linked to RDF? I am reading that it depends on RDF collections, and any reasoning might be too generic.

Posted by Stephen Stead on 7/1/2019

Hi all

Happy New Year

The property name: Perhaps we should borrow from the nomenclature of ordinal statistics and use

ranked higher than (ranked lower than)

Hi Martin

Excellent questions!

 

1] Research questions that are enabled:-

I envisaged questions of the form that Athanasios has suggested as well as the opposite; “Where are examples of “x” object type that have a condition of “y” or better that I can have access to for comparative observations”

In the map world I also thought of the integration question “During the planning of this expedition was there a map at “x” scale or larger published and available within “y” distance of the expedition headquarters”. This was the type of question envisaged in the Arctic Cloud project.

 

2] Reasons for CRM rather than SKOS:-

As George says we control CRMbase and not SKOS. More substantially the solution of skos:OrderedCollection does not allow the integration of different terms from different sources into the same term ordered collection without physically merging them. While that could be overcome (it scales like a bag of bolts) the more substantial problem is it does not allow branching paths through the collection; for example Excellent > Good > Poor and Excellent > Average > Poor is not possible. Another concern is that all Collections are automatically ordered by their position in the implemented list: that is all collections are ordered even if there is no such ordering in the real world.
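The branching case mentioned above (Excellent > Good > Poor alongside Excellent > Average > Poor) amounts to a partial order, in which some pairs of terms are simply not comparable. A minimal sketch, with a hypothetical adjacency map and hypothetical function names:

```python
# Hypothetical "ranked higher than" statements with two branches:
# Excellent > Good > Poor and Excellent > Average > Poor.
RANKED_HIGHER = {
    "Excellent": {"Good", "Average"},
    "Good": {"Poor"},
    "Average": {"Poor"},
    "Poor": set(),
}


def ranks_higher(a, b, graph):
    """True if `a` is ranked higher than `b`, directly or transitively."""
    stack = [a]
    seen = set()
    while stack:
        current = stack.pop()
        for nxt in graph.get(current, ()):
            if nxt == b:
                return True
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False


def comparable(a, b, graph):
    """True if the two terms are ordered relative to each other at all."""
    return ranks_higher(a, b, graph) or ranks_higher(b, a, graph)


# "Good" and "Average" sit on different branches, so neither outranks
# the other, while both still lie between "Excellent" and "Poor".
good_vs_average = comparable("Good", "Average", RANKED_HIGHER)
```

A single ordered list cannot represent this situation, because it forces "Good" and "Average" into one sequence.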

3] Coverage of problems:-

Collection management: questions of collection morbidity, storage effectiveness and process validation

Museology: Do different collection management regimes materially affect the short, medium and long term collection conservation

Material Science: which materials have survived best

Cultural Heritage Geo-informatics: What map scales were available, when, for what and for/by whom.

Risk Management: What is the current state across institutions. What is the history of risk classification across the domain/region/institution type

Audience Research: Many institutions are starting to collect Likert scale data as part of the feedback on exhibitions. This could then be linked to exhibition content to gain insight into the affective museum experience. This is what Erin Canning is working on.

 

Posted by Martin on 7/1/2019

Dear All,

On 1/7/2019 8:02 AM, Stephen Stead wrote:
>
> Hi all
>
> Happy New Year
>
> The property name: Perhaps we should borrow from the nomenclature of ordinal statistics and use
>
> ranked higher than (ranked lower than)

>

> Hi Martin
>
> Excellent questions!
>
> 1] Research questions that are enabled:-
>
> I envisaged questions of the form that Athanasios has suggested as well as the opposite; “Where are examples of “x” object type that have a condition of “y” or better that I can have access to for comparative observations”
>
> In the map world I also thought of the integration question “During the planning of this expedition was there a map at “x” scale or larger published and available within “y” distance of the expedition headquarters”. This was the type of question envisaged in the Arctic Cloud project.

I have the impression that these are indeed the only research questions at a factual level (about particulars), that are supported by such a property. The scope of the CRM is deliberately restricted to this level, in order to maintain a clear modularity against, in particular, terminological systems. With "broader/narrower" we maintain a minimal interface to such systems.

The above examples are about inclusion of categories, yet another much more specialized case of getting something of type x and narrower. In case of a few qualities, the retrieval problem can easily be solved by enumeration. The underlying IT system will anyway do nothing else than expanding the "y" or better. The example also shows that the sense of the ordering is quite diverse: "better", or "higher resolution" etc., are not implied by one general property. Each ordered collection will have different senses.

Any ordered collection can be expanded by a set of ((n-1)**2)/2 "pyramid" of generalizations, which effectively represent the order. This solution is effective for smaller sorted sets. Map scales may be a different case, the only one I am currently aware of.
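One possible reading of the "pyramid" is sketched below: for each term in a ranked list, a broader term "X or better" is created that covers X and everything ranked above it. This construction, the labels and the function name are assumptions made for illustration; other constructions give different counts, so the sketch does not claim to reproduce the figure quoted above.

```python
def or_better_terms(ordered):
    """Build one broader term per threshold.

    `ordered` lists the terms from lowest to highest. The result maps
    each generalization label to the set of terms it covers.
    """
    return {
        f"{term} or better": set(ordered[i:])
        for i, term in enumerate(ordered[:-1])
    }


# Hypothetical ranked list, lowest first.
SCALE = ["Poor", "Average", "Good", "Excellent"]

broader = or_better_terms(SCALE)

# Number of broader/narrower links needed to attach the original
# terms to their generalizations.
links = sum(len(members) for members in broader.values())
```

With this construction a query for "Good or better" becomes an ordinary narrower-term expansion, at the cost of a number of links that grows roughly quadratically with the length of the list.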

> 2] Reasons for CRM rather than SKOS:-
>
> As George says we control CRMbase and not SKOS . More substantially the solution of skos:OrderedCollection does not allow the integration of different terms from different sources into the same term ordered collection without physically merging them. While that could be overcome (it scales like a bag of bolts) the more substantial problem is it does not allow branching paths through the collection; for example Excellent > Good > Poor and Excellent > Average > Poor is not possible. Another concern is that all Collections are automatically ordered by their position in the implemented list: that is all collections are ordered even if there is no such ordering in the real world.

The question of integrating different ordered collections of terms is definitely out of scope of the CRM, and a question of terminology mapping, and definitely not solved in any way by such a property.

We cannot solve all the problems of the world. We explicitly recommend SKOS as complementary, in order to maintain some order between standardization efforts. We have discussed with the NKOS group for many years the need to standardize specializations of "related term", but never could mobilize any larger community to do so. There are some dozen candidates, and theoretical issues. Picking up now one of the most specialized poses a serious methodological question: whether we are aware of the scope, relative relevance and further related issues of such a modelling.

We already have too many open fronts in CRM-SIG. The danger we face is not that we do not control SKOS, but that we lose control of the CRM itself. Anybody can make a local extension to SKOS, and recommend it, without the SKOS team, exactly as anybody can make a local extension to the CRM. There may be other models already dealing with the problem.
>
> 3] Coverage of problems:-
>
> Collection management: questions of collection morbidity, storage effectiveness and process validation
>
> Museology: Do different collection management regimes materially affect the short, medium and long term collection conservation
>
> Material Science: which materials have survived best
>
> Cultural Heritage Geo-informatics: What map scales were available, when, for what and for/by whom.
>
> Risk Management: What is the current state across institutions. What is the history of risk classification across the domain/region/institution type
>
> Audience Research: Many institutions are starting to collect Likert scale data as part of the feedback on exhibitions. This could then be linked to exhibition content to gain insight into the affective museum experience. This is what Erin Canning is working on.

We should not confuse the question of standardizing ordered value sets with providing a link between the terms. The link does not solve that at all.

I would argue we are out of scope of CRMbase.

Posted by Phil Carlisle on 7/1/2019

Dear all,

I have to agree with Martin. I’d go even further and advocate the removal of P127 has broader term. This property only accounts for the BTG relationship and only raises the question as to why the other BT/NT relationships aren’t represented or the associative relationship or the equivalent for that matter.

Surely it is better to continue to point users to the recognized international ‘de facto’ standard (SKOS) or the actual standard (ISO 25964) and, if there is a real need for such properties, to argue for them to be included in the relevant standards.

 

 

Posted by martin on 7/1/2019

I'd like to add, that the Ordinal Property actually needs the concept of an Ordered Collection in the first place, within which it operates.

Posted by Thanasis Velios on 7/1/2019

Why was "broader/narrower" included in the CRM? Similar arguments could be made about that property, no?

T.

P.S. The example "Good->Average->Bad" and "Very good->Good->Bad" is indeed a terminology matching exercise, but we still need to reason on
the fact that average condition objects rank before bad condition objects.

Posted by Robert Sanderson on 7/1/2019

I agree with Thanasis. If ranked_higher_than / ranked_lower_than is out of scope when there are clear use cases across the range of domains covered by the CRM, then it seems that narrower and broader should be deprecated in favor of SKOS and simply leave E55 Type as the merge point.

Another need that we have for this is description of material qualities for conservation reference collections, where some materials are more/less  flammable, acidic, dangerous or similar than others without a clear scale.  My original proposed modeling solution was to use a dimension but that was not deemed appropriate due to the lack of reproducible measurement, however in order to find “more flammable than” we need to either use a dimension or to have this ranking.

I disagree with Martin’s assertion about what information systems will do. If the property is declared as transitive, then with inferencing in a graph store, all you need to do is search for ranked_higher_than to find all of the higher ranked resources. Compare that with using an rdf:List, which is notoriously hard to use in queries.

 

Posted by Martin on 7/1/2019

Dear Robert, all,

On 1/7/2019 7:00 PM, Robert Sanderson wrote:
>
> I agree with Thanasis. If ranked_higher_than / ranked_lower_than is out of scope when there are clear use cases across the range of domains covered by the CRM, then it seems that narrower and broader should be deprecated in favor of SKOS and simply leave E55 Type as the merge point.

Well, I think that needs a more detailed discussion. broader/narrower is the fundamental semantics of thesauri, and can hardly be compared in relevance to the ranking property proposed, or others such as BTP etc. But indeed, we explicitly declare in the CRM that detailed modelling of terms, i.e., meta-properties, is not intended so far in the CRM.
>
> Another need that we have for this is description of material qualities for conservation reference collections, where some materials are more/less  flammable, acidic, dangerous or similar than others without a clear scale.  My original proposed modeling solution was to use a dimension but that was not deemed appropriate due to the lack of reproducible measurement, however in order to find “more flammable than” we need to either use a dimension or to have this ranking.

Sure, I have NOT questioned at all the need for this property. I have argued for keeping CRMbase in a reasonable frame. I am further not convinced that this property is actually the cure for all fuzzy and discrete ranking problems. I simply find it not yet well studied.

One absolutely vital question of methodology is how to restrict  by functional criteria a module of an ontology. Otherwise, we end up in a great mess many other teams have encountered, and we started with in the first days of the CRM.
>
> I disagree with Martin’s assertion about what information systems will do. If the property is declared as transitive, then with inferencing in a graph store, all you need to do is search for ranked_higher_than to find all of the higher ranked resources. Compared to using an rdf:List, which is notorious for being hard to use in queries.

My argument was different: a) we can use broader/narrower if we add a "pyramid" of broader terms on ranked lists.

b) If the property is transitive, the triple or graph store will collect all respective terms internally, and then query them. That was my argument. About difficulties using a list in queries, well, it is a question of UI.

By the way, the scale of maps is defined in LRM as a property. So we should look at this first. Why do we need a scale of a map? Because on paper, it is an indication of the resolution, but not the actual resolution. This becomes more and more obsolete with digital maps, and is equally inadequate for very old maps.

Posted by Martin on 8/1/2019

Dear All,

I propose to put the Ordinal Property into CRMSci:

Except for map scale, which is actually numerical, it appears to me that it has to do with evaluative tasks, summarizing complex criteria into some overall measure for the sake of decision support, which are in the scope of CRMSci. I suggest relating it to Ordered Collection, explicitly or implicitly in the scope note. I cannot imagine the property going across different Ordered Collections, and it should comprise all values of an Ordered Collection.

I assume it should be related to observation/evaluation, and be described as related to the dimension concept. Maybe it should be related in the scope note to quality (including risk) assessment.

-------

About cartographic scale: LRM and tentative mapping to LRMoo:

How do the geographers / OGC deal with map scale?

Opinions?

Posted by Christian Emil on 9/1/2019

Dear all,

Sorry for joining the discussion so late.


The “Pxx conceptually follows” property can be used to create a (partial) ordering of concepts (classes).

It is, at first view, a handy property.

However, it may be problematic to use types as qualification of other types in this way. For example, according to the modeling principles document one should avoid concepts like ‘large’ and ‘small’ (e.g. a large mouse and a small elephant) as well as “good” and “bad”.

The last two examples are examples where the relations are defined according to a predefined linear scale/dimension.

The first example is not generally true. OED:

“1. Estimated by average; i.e. by equally distributing the aggregate inequalities of a series among all the individuals of which the series is composed.

2. a. Equal to what would be the result of taking an average; medium, ordinary; of the usual or prevalent standard.”

I question if such a property is a good idea inside the type hierarchy.