Issue 325: digitization of a physical object which incorporates an information object

Starting Date: 
2016-12-09
Working Group: 
4
Status: 
Done
Closing Date: 
2017-04-06
Background: 

In the 37th joint meeting of the CIDOC CRM SIG and ISO/TC46/SC4/WG9 and the 30th FRBR - CIDOC CRM Harmonization meeting, the crm-sig, discussing the remarks on the outcome of issues 203-205 about the revision of P138, made the following comments:

  • With regard to digitization and the question of whether the 3D object or other digitization ‘incorporates’ the information that was carried by the painting, the scope note of P138 is complete.
  • On the one hand, a digitization might be done in a way that only captures some attributes, which could not be said to carry the information that the object was meant to convey.
  • On the other hand, if we take an accurate image of the object, then in a sense we could say that the resultant digital product has indeed incorporated the information carried by the original.
  • From the above, it seems that there is also the question of what tolerance one holds towards loss of information in the transfer from analogue to digital.
  • This requires a deeper investigation, since if the information content of a physical thing is of a symbolic nature, then there is a resolution at which we could certainly say that the digitization will carry the symbolic content of the original (e.g. properly digitized texts, where the symbols are properly captured).

Finally, the crm-sig assigned MD and Oyvind (possibly with the help of Max Planck’s people) to write a guideline on the conditions under which it can be said that a digitization of a physical object incorporates an information object carried by the physical object that was digitized.

Berlin, December 2016

Current Proposal: 

Posted by Martin on 14/3/2017

Dear All,

Here is my and Oeyvind's proposal:

"

An important class of E90 Symbolic Objects consists of arrangements of instances of a finite set of symbols, such as the character types of an alphabet, following rules of arrangement that make the relative order of the symbols unambiguous, such as writing symbols regularly spaced, in regularly spaced lines, aggregated into paragraphs etc. Most texts, but also diagrams such as metro maps fall under this category, because they represent stations and their connections in a way only loosely related to their actual geometry. The rules for the arrangements of symbols quantify relative positions so that they can unambiguously be interpreted as logical structures or grammars. For instance, inter-word spaces are sufficiently larger than inter-character spaces not to confuse them. We denote these objects as “formal symbol structures”. Common to them is the property that there is a minimal resolution under which they are still “readable”, i.e. under which the used kinds of symbols can be distinguished and the logical role of their relative positions unambiguously be interpreted. This even holds for melodies expressed in some musical key. If the digitization of a physical carrier of such a formal symbol structure is carried out in the necessary resolution and coverage, the complete symbolic content of the physical carrier will also be readable in a representation of the reproducing digital object, as well known from operating paper scanners. Then, the digital object will incorporate (P165 incorporates) the symbolic object on the physical carrier.

Ambiguity about this property may, for instance, arise if some characters are badly readable on the original. For practical reasons, we recommend not to regard such minor shortcomings of the original as a reason to question the P165 incorporates relation of the digital representation, as long as the overall sense (or score) is recognizable, in particular, if the intended meaning can be guessed equally well from the original as from the digital representation. The same holds for minor flaws in the digital representation itself.

In contrast, for symbolic objects in a non-discrete form, such as paintings, there is no clear minimal resolution and the actual color reflection behavior cannot be reproduced digitally with current means. As long as this is the case, the digitized image cannot be said to incorporate the original, it only “P138 represents” it. (For audio recordings, there is no equivalent to  P138 represents, and the more general P67 refers to should be employed). Actually, the symbolic object a physical object carries is not uniquely defined by the physical carrier but depends on the type of the symbolic object defined in a model, which in turn serves a purpose of representation. If, for instance, the original is a manuscript, a digitized image may incorporate its text, which is instantiated and defined as a symbolic object of type “character sequence” in an information system. However other research-relevant optical features, which belong to richer symbolic properties of the manuscript surface, such as colors, may not be resolved by the image. This richer definition of a symbolic object carried by the physical object would not be incorporated in the digitized image. In other words, the type of the symbolic object that is described to be carried by the physical object determines the features under consideration and hence allows for deciding if the digitized image has sufficient qualities to incorporate it. There is however still no good typology of symbolic objects with respect to the relevant representational feature types.

Outcome: 

In the 38th joint meeting of the CIDOC CRM SIG and ISO/TC46/SC4/WG9 and the 31st FRBR - CIDOC CRM Harmonization meeting, the crm-sig reviewed MD's and Oyvind's proposal for the guideline on the conditions under which it can be said that a digitization of a physical object incorporates an information object carried by the physical object that was digitized. The approved text is the following:

An important class of E90 Symbolic Objects consists of arrangements of instances of a finite set of symbols, such as the character types of an alphabet, following rules of arrangement that make the relative order of the symbols unambiguous, such as writing symbols regularly spaced, in regularly spaced lines, aggregated into paragraphs etc. Most texts, but also diagrams such as metro maps fall under this category, because they represent stations and their connections in a way only loosely related to their actual geometry. The rules for the arrangements of symbols quantify relative positions so that they can unambiguously be interpreted as logical structures or grammars. For instance, inter-word spaces are sufficiently larger than inter-character spaces not to confuse them. We denote these objects as “formal symbol structures”. Common to them is the property that there is a minimal resolution under which they are still “readable”, i.e. under which the used kinds of symbols can be distinguished and the logical role of their relative positions unambiguously be interpreted. This even holds for melodies expressed in some musical key. If the digitization of a physical carrier of such a formal symbol structure is carried out in the necessary resolution and coverage, the complete symbolic content of the physical carrier will also be readable in a representation of the reproducing digital object, as well known from operating paper scanners. Then, the digital object will incorporate (P165 incorporates) the symbolic object on the physical carrier.
Ambiguity about this property may, for instance, arise if some characters are badly readable on the original. For practical reasons, we recommend not to regard such minor shortcomings of the original as a reason to question the P165 incorporates relation of the digital representation, as long as the overall sense (or score) is recognizable, in particular, if the intended meaning can be guessed equally well from the original as from the digital representation. The same holds for minor flaws in the digital representation itself.
In contrast, for symbolic objects in a non-discrete form, such as paintings, there is no clear minimal resolution and the actual color reflection behavior cannot be reproduced digitally with current means. As long as this is the case, the digitized image cannot be said to incorporate the original, it only “P138 represents” it. (For audio recordings, there is no equivalent to  P138 represents, and the more general P67 refers to should be employed). Actually, the symbolic object a physical object carries is not uniquely defined by the physical carrier but depends on the type of the symbolic object defined in a model, which in turn serves a purpose of representation. If, for instance, the original is a manuscript, a digitized image may incorporate its text, which is instantiated and defined as a symbolic object of type “character sequence” in an information system. However other research-relevant optical features, which belong to richer symbolic properties of the manuscript surface, such as colors, may not be resolved by the image. This richer definition of a symbolic object carried by the physical object would not be incorporated in the digitized image. In other words, the type of the symbolic object that is described to be carried by the physical object determines the features under consideration and hence allows for deciding if the digitized image has sufficient qualities to incorporate it. At the time of writing there is still no good typology of symbolic objects with respect to the relevant representational feature types.
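
The decision rule in the approved text can be illustrated with a small Python sketch. It is not part of the approved guideline. The names Digitization, Medium and crm_property are hypothetical, and the numeric resolutions are placeholders. The fallback for a formal symbol structure captured below its minimal resolution is an assumption, since the text does not state it explicitly.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Medium(Enum):
    IMAGE = "image"
    AUDIO = "audio"


@dataclass(frozen=True)
class Digitization:
    """Hypothetical description of a digitization relative to ONE symbolic object.

    minimal_resolution is None when the symbolic object is non-discrete
    (e.g. a painting), because no clear minimal resolution exists for it.
    """
    medium: Medium
    resolution: float                    # capture resolution (placeholder unit)
    minimal_resolution: Optional[float]  # lowest resolution at which symbols stay readable
    full_coverage: bool                  # the whole symbolic content was captured


def crm_property(d: Digitization) -> str:
    """Suggest the CRM property linking the digital object to the symbolic object."""
    readable = (
        d.minimal_resolution is not None
        and d.resolution >= d.minimal_resolution
        and d.full_coverage
    )
    if readable:
        return "P165 incorporates"
    # Assumption: any image that does not qualify for P165 still depicts the original.
    if d.medium is Medium.IMAGE:
        return "P138 represents"
    # Audio has no equivalent to P138, so the more general property is used.
    return "P67 refers to"


# The same manuscript scan, judged against two differently typed symbolic objects.
scan_vs_text = Digitization(Medium.IMAGE, 300.0, 150.0, True)     # "character sequence"
scan_vs_surface = Digitization(Medium.IMAGE, 300.0, None, True)   # colours, surface features
audio_recording = Digitization(Medium.AUDIO, 44100.0, None, True)

print(crm_property(scan_vs_text))
print(crm_property(scan_vs_surface))
print(crm_property(audio_recording))
```

The two scan entries mirror the manuscript example: the outcome depends on the type of symbolic object that the physical object is described as carrying, not on the image alone.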

The crm-sig also decided that an FAQ entry should be made on the official site, indexed under P165 and digitization, with MD and OE as authors.

The issue is closed

Heraklion, April 2017

Meetings discussed: