Archetypical sounds

ID

274

Starting Date

2015-02-10

Working Group

3 - Changes in the CIDOC CRM model

Status

Open

Background

Posted by Steve Stead on 10/2/2015

We have no way of documenting sounds that behave like instances of E37 Mark.

I would like to propose a new class for this.

In the 34th joint meeting of the CIDOC CRM SIG and ISO/TC46/SC4/WG9 and the 27th FRBR - CIDOC CRM Harmonization meeting, it was decided that Thanasis Velios will work on this.

Heraklion, October 2015

In the 35th joint meeting of the CIDOC CRM SIG and 28th FRBR - CIDOC CRM Harmonization meeting, the CRM SIG discussed the proposal made by Thanasis Velios (see below) for an Audio/Sonic Item class, but decided to postpone incorporating this class into the CRM until there is evidence of use, and only if that evidence conforms to the proposed definition. The SIG should also examine whether such a concept covers traditional melodies and whether there is any community of use.

EXX Audio Item/Sonic Item
Subclass of: E73 Information Object
Superclass of: EXX Audio Mark/Sonic Mark

Scope Note: This class comprises the intellectual or conceptual aspects of recognisable sounds and compositions.

This class does not intend to describe the idiosyncratic characteristics of an individual performance or playback of sound, but the underlying prototype. For example, a sound such as Walter Werzowa's Intel sonic logo is generally considered to be the same logo when played in any number of adverts or media. The tone may change, but the logo remains uniquely identifiable. The same is true of music which is performed many times. This means that audio items are independent of their performance.

The class EXX Audio Item provides a means of identifying and linking together instances of E7 Activity that deliver a performance of the same sounds, compositions or soundtracks etc., as follows: E7 Activity, PXX delivers audio item (is delivered by), EXX Audio Item. [P138 represents (has representation) to E1 CRM Entity – not sure how representing anything else apart from a conceptual object is possible with sound, but I suspect blind people would find it a relatively simple task]

Examples:
- Walter Werzowa's Intel sonic logo (EXX)
- Francisco Tárrega's Nokia tune (Grande Valse) (EXX)
- Beethoven’s “Ode an die Freude” (Ode to Joy) (E73)

In First Order Logic:
EXX(x) ⊃ E73(x)

Properties:
[P138 represents (has representation): E1 CRM Entity
(P138.1 mode of representation: E55 Type) – again not sure about these.]

Prato, February 2016

Current Proposal

Posted by George Bruseker on 18/8/2019

Dear all,

In the course of recent modelling exercises, I have often encountered the need for a way of modelling audio information or sounds. While this initially seems an easy request, on further thought it seems complicated. Picking up from previous work at the SIG by Steven and Thanasis, I have been trying to formulate something reasonable and consistent. Below I present the current state of my thinking and a proposal for a potential scope note and list of properties. This continues the old issue: http://www.cidoc-crm.org/Issue/ID-274-archetypical-sounds building on it and expanding it somewhat in scope.

Motivation:

Sound is an object of collection, curation and preservation. It is something that falls within the wider scope of cultural heritage and the more specific scope of museum information. Sound is related to the sense of hearing and forms a basic meso-scopic aspect of human experience of interest in the study of the past.

Background Problems:

The closest existing model we have to the question of modelling an object of the senses in CIDOC CRM is the ‘Visual Item’. In a way, the obvious solution is to think if an ‘Audio Item’ forms a useful parallel, as has been explored in the past.

The potential parallel has the advantages of symmetry and being well known, but comes with ontological issues.

‘Images’ as they are modelled in CRM have the features of being a) the result of intentional action by a human being and b) being representational or potentially representational.

So the visual item scope note begins, “This class comprises the intellectual or conceptual aspects of recognisable marks and images.”

The seeming problems in trying to create a modelling parallel between ‘image’ and ‘sound’ include that a) the basic case of sound is not to represent anything (but of course there is onomatopoeia and then language, which as a small subset of all sounds is representational in the basic sense) and b) not all sound is created intentionally as part of an intellectual project (indeed only a vanishingly small subset of all sounds would seem to be so).

The problem, to my mind, arises because the image has a particular nature (and is a particular focus of Western thought and creative production): although all physical objects give off ‘images’ in the sense of appearance, if we wish to fix an image, a human being must intervene and, through the power of their creativity/mind and under certain cultural codes, translate the perceived, imagined or speculated image onto a surface. Images are also often used for representing the world and as a means of communication and, with photography and other methods, documentation. Thus a visual item is a sensible ontological category for Western thinking and is definitely a product of the human mind.

Like the image, many things give off sounds. Unlike the image, there is - arguably - no need to translate sound via the human mind and add intellectual content in order for sound to… sound. So an inanimate object like a pane of glass that accidentally falls from a table and smashes makes a sound, no volition involved (not even willed), and a frog seeking its mate or warning off predators makes its croaking sound and need never consult a human being as to whether it is croaking its song correctly. Human beings develop at least two major systems of sound: music and speech. These two subsets of sounds are intentional. The former is, usually, non-representational (except for Peter and the Wolf, or perhaps the representation of inner emotions?), while the latter is fundamentally representational.

Within the domain of cultural heritage, the question is what is documented:

   collection of folk songs
   collection of speech samples
   collection of interviews
   collection of bird songs
   collection of frog calls
   collection of machine sounds
   collection of musical performances

Argument for Modelling something around sound:

There is an immaterial, repeatable pattern of sound recognizable by human beings that is not linked to a particular episode of performing a sound sequence, nor to a particular recording or recording medium. It is of interest for researchers to be able to track the different instances where such patterns are present. Sound has different means of propagation and repetition. An image is marked on a surface and it is tracked across different media which are able to act as a surface or simulate a surface. Sound is a ‘time based’ entity which until recently could only be performed and reperformed. Now that we have recordings (the last 200 or so years), it is still the case that the carriers of sound and their means of carrying are different from those of an image (we need a player for a recording and not for an image unless it is digital).

Thus there is both a research need and there seem to be different properties that are required to describe the relation of sound to other entities.

Potential solutions:

Given the above, we need an approach to modelling how ‘sound’ is present in CH documentation systems and what questions are asked of it. The following solutions seem possible:

1. Limit the sounds that can be modelled to those that are planned (concerts, speeches, etc.): this solves the intentionality problem if not the representationality problem. Since the representation problem is likely also an issue in Visual Item (since much art is non-representational, for example), whatever reasoning permits the property ‘represents’ on visual item could also be applied to the audio item. This would mean, however, that it is not obvious how we model the croak of a frog. Or rather, we are able to model the croak of a frog, since the recording will have the intellectual input of the sound engineer, but the frog itself, not being on the level of we humans, won’t be able to perform its own croak. [also how do we model the appearance of Jesus on toast as occurs often enough around the world? Can we say that it is an image? Certainly man did not make it.]
2. Decide that sounds indicate a new branch of immaterial item that is NOT human made. Is there something above E28 Conceptual Object, namely ‘pattern’, under which E28 resides alongside ‘sound’? The notion of this ‘pattern’ would be recognizable immaterial items which are objects of discussion but are not human made. For the moment, the only example I would put under this would be sound sequence patterns, like the croak of a frog, which are not human made and yet are certainly sound sequences. This could be used as a class together with symbol in order to indicate some sort of ‘audio item’ which encodes the croaking of a frog and the like. We could then also have an audio item which follows the ‘visual item’ pattern separately, which has the notion of human creation as well.
3. Decide the above is heterodox and that all immaterial items are the product of the human mind. Then we can follow the simple pattern of having an ‘audio item’ parallel to ‘visual item’, and the argument goes that all sounds are actually humanly perceived/able and their identifiability is based on a process of their being made an object of discourse and then recoded/encoded in some way such that they can be recognized again. So when the frog croaks, it would croak an instance of the human made object ‘audio item’, unbeknownst to it.

There are potentially better ways to look at it than the above, but these are the possibilities that come to mind to me. It seems to me that the 3rd solution is the closest to existing CRM approaches and on that basis then, I have crafted a first attempt at a scope note, building on what was already done in this issue.

Proposed Scope Note:

Features: instances can be recordings or acts of musical performances or speech but also recordings or acts of inanimate and biological objects.

Causes a relation to E5, which is the possibility of an event to ‘sound’ the audio item (not necessary to know what you are doing… can be a frog or a robot or an AI, whatever). Arguably there should also be a new property which is a sub property of ‘carries’ which deals with the fact that the object ‘bears’ the audio item but not in a way that is immediate (I need a playback device). If this were accepted then we would need a subclass like information carrier to come back (I am not at all sure this is a good idea, just putting out a thought). Finally, there could potentially be a property for an audio item providing a typical case for a class of sounds.

EXX Audio Item
Subclass of: Information Object

Scope Note: This class comprises the intellectual or conceptual aspects of recognisable sounds and compositions.

The substance of an audio item is a recognizable pattern of vibration in a medium as perceivable by an auditory system. Sounds in and of themselves are not human constructs, instances of audio item, however, are. Specifically they are the identifiable and recognizable vibratory patterns which have become objects of discourse within given cultures and societies and act as symbolic markers and can be the basis for contemplation, discourse and reasoning inter alia.

This class does not intend to describe the idiosyncratic characteristics of an individual occurrence of a particular sound, performance or playback of sound, but rather the underlying prototype. For example, a sound such as Walter Werzowa's Intel sonic logo is generally considered to be the same logo when played in any number of adverts or media. The tone may change, but the logo remains uniquely identifiable. The same is true of music or speeches which are performed many times. While individual characteristics of the performance or speech may incidentally change, a basic, identical form can be recognized across performance instances. This means that an instance of audio item is independent of performance.

Aside from sounds following a particular composition, sounds captured from the environment (natural or human) and recognizable within a certain society or culture can be instances of audio item. Examples would include the sound of a tuned Porsche Carrera engine revving at 3000 rpm, the warble of the common Loon, or the David Frost Interviews with Nixon.

The class EXX Audio Item provides a means of identifying and linking together instances of E5 Event in which the same sounds, compositions or utterances etc. can be identified to have occurred, using PXX sounded (was sounded by), EXX Audio Item. Further, an instance of EXX Audio Item may be recorded, which can then be indicated as PXX is recorded on (bears recording of) E24 Physical Man-Made Thing.
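
The linking role described in this paragraph can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the event labels are invented, and the plain tuples merely stand in for statements using the proposed PXX sounded (was sounded by) property; it is not an official encoding of the CRM.

```python
from collections import defaultdict

# Hypothetical statements of the form (E5 Event, property, EXX Audio Item).
# The event labels are invented for illustration only.
statements = [
    ("TV advert broadcast A", "PXX sounded", "Intel sonic logo"),
    ("Radio advert broadcast B", "PXX sounded", "Intel sonic logo"),
    ("Phone ringing in a cafe", "PXX sounded", "Nokia tune (Grande Valse)"),
]


def events_by_audio_item(stmts):
    """Index events by the audio item they sounded (the 'was sounded by' direction)."""
    index = defaultdict(list)
    for event, prop, audio_item in stmts:
        if prop == "PXX sounded":
            index[audio_item].append(event)
    return dict(index)


linked = events_by_audio_item(statements)
print(linked["Intel sonic logo"])
```

Reading the index in the inverse direction is what lets separate events be recognised as occurrences of the same underlying audio item.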

Examples:
- Walter Werzowa's Intel sonic logo (EXX)
- Francisco Tárrega's Nokia tune (Grande Valse) (EXX)
- Beethoven’s “Ode an die Freude” (Ode to Joy) (E73)
- a recording of the Greater Horned Toad
- the sound of the Porsche 911 engine revved at 3000 rpm

Pxx sounded (was sounded by) D: E5 R: Exx Audio Item
Pxx bears recording of (is recorded by) D: E24 R: Exx Audio Item
Pxx sounds typical for D: Exx Audio Item R: E55 Type

I am sure there is much to be discussed/improved. Look forward to your thoughts.

In the 47th joint meeting of the CIDOC CRM SIG and ISO/TC46/SC4/WG9; 40th FRBR - CIDOC CRM Harmonization meeting; GB presented a classification of sound recordings instantiating a set of prototypical sounds and commented on the similarity of these sounds to visual images (instances of E36) from a conceptual point of view, in that they create identifiable patterns and have intellectual/conceptual aspects.
However, given the scope note of E36 starts by defining visual items as *intellectual or conceptual aspects of recognizable marks and images*, it could not possibly be expanded to refer to the sounds animals produce. The proposed solution to that: only consider such sounds in as much as they represent the outcome of an activity performed by a human agent (collection), which is what grants them an intellectual/conceptual aspect.

Following the discussion, the SIG decided to reconsider the HW and continue working on it. TV will ask sound art colleagues to point him in the right direction with regard to sound integration. MD will rework the scope notes and examples for E90 Symbolic Object and E73 Information Object. GB and OE to contribute to that.

June 2020

Post by Thanasis Velios (12 June 2021)

Dear all,

My homework for this issue was to check for cases of integration for sound in sound arts. I spoke with a professor of sound art at UAL and, from the examples we discussed, it appears that there are very few collections systematically cataloguing sound art and they hardly document anything apart from title and artist. We could not find any projects around integration of such collections. This does not mean that there is no desire for integration from the academic perspective, but I could not identify any domain experts working in that direction.

All the best,

Thanasis

Post by Daria Hookk (13 June 2021)

Dear colleagues,

Examples of cataloguing audiovisual information do exist, and an expert gave me this one:
http://www.dapx.org/showservices.asp?ID=1003&fbclid=IwAR1MoBC2u07qVOGw5…;

Details: the Chinese standard includes 25 metadata fields (code of archive, its category, inventory level, unique identifier, file number, name of the scan creator, date of digital copying, date of the next conversion/migration, permission, notes, address of real location, original carrier, mode of digital copying, copying device, software and OS, name of file, size of file, format, video parameters, audio parameters etc.)

Something similar was proposed for the Russian archives.

With kind regards,
Daria Hookk

Post by George Bruseker (15 June 2021)

I imagine I'm running fast at windmills here but I already prepared a homework for this several sigs ago in which I list dozens of collections of sounds. There is documentation and research on sound in CH.

What exactly are we searching for in this issue?

Post by Thanasis Velios (15 June 2021)

Maybe we have failed to document the issue in its development but my understanding was that we were looking for use cases from sound arts where a sound piece with distinct identity (for example) has been used as part of another sound piece. The integration would have been required to identify these as related separate entities thus providing an additional argument for the new class. This is different to the typical preservation metadata documented for audio recordings or performances.

All the best,

Thanasis

Post by Martin Doerr (15 June 2021)

This is also my understanding. Basically, we would need properties substantially different from other information objects, and different from audiovisual recordings for a new class, that would express relations not covered by others in the CRM, and essential in these applications. To my understanding, these have not been identified so far.

The question of bird songs that came up is more complex, because it is a type-type relation: species A typically sings sound type B.

Another aspect is sound as intangible heritage, very different again.

All the best,

Martin

In the 57th CIDOC CRM & 50th FRBR/LRMoo SIG Meeting, the SIG assigned GB & SdS to review the empirical evidence gathered until now, review the class and property definitions by GB dating back to 2019, and bring them up for a vote in the next meeting, in Paris.

HW: GB, SdS to summarize the proposal and bring to a decidable form in time for Paris 2024

Marseille, October 2023

In the 60th joint meeting of the CIDOC CRM and ISO/TC46/SC4/WG9 & 53rd FRBR/LRMoo SIG, the SIG decided not to close the issue but instead to take a vote on introducing Exx Audio Item in CRMbase V7.3.1, with the id E100. The idea is to have this class at the same level as E33 Linguistic Object and E36 Visual Item.
The objection that recorded sounds would fall in the scope of LRMoo was considered unfounded, in the sense that LRMoo only records performances, and not any random identifiable sound, which, in its turn, is much broader than an expression (F2) created (R17i) by the (F28) recording (R81) of a performance (F31). The details can be found in the attached document.

HW: GB, SdS, to formulate the scope notes for the properties of the class.
HW: PR to see whether there are implications for LRMoo.

The properties proposed are:

Pxx sounded (was sounded by) D: E5 Event, R: E100 Audio Item (to use power of event modeling)
Pxx bears recording of (is recorded by) D: E24 Physical Human-Made Thing R: E100 Audio Item (prob sub property of ‘carries’)
Pxx sounds typical for D: E100 Audio Item R: E55 Type (analogy to ‘payment’)
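
As a rough illustration of what the domain (D) and range (R) declarations above imply, the following Python sketch checks whether a statement respects them. It is a minimal sketch under assumptions: the helper names (`is_a`, `is_valid`) are hypothetical, and the superclass table contains only the single link needed for the example (E7 Activity as a subclass of E5 Event), not the full CRM hierarchy.

```python
# Proposed properties as (domain, range) pairs, copied from the list above.
PROPERTIES = {
    "Pxx sounded": ("E5 Event", "E100 Audio Item"),
    "Pxx bears recording of": ("E24 Physical Human-Made Thing", "E100 Audio Item"),
    "Pxx sounds typical for": ("E100 Audio Item", "E55 Type"),
}

# Deliberately tiny superclass table (assumption: only this link is needed here).
SUPERCLASS = {"E7 Activity": "E5 Event"}


def is_a(cls, target):
    """Return True if cls equals target or reaches it via the superclass table."""
    while cls is not None:
        if cls == target:
            return True
        cls = SUPERCLASS.get(cls)
    return False


def is_valid(subject_class, prop, object_class):
    """Check a statement against the declared domain and range of prop."""
    domain, range_ = PROPERTIES[prop]
    return is_a(subject_class, domain) and is_a(object_class, range_)


# An activity (a kind of event) may sound an audio item...
print(is_valid("E7 Activity", "Pxx sounded", "E100 Audio Item"))
# ...but the statement cannot be written the other way round.
print(is_valid("E100 Audio Item", "Pxx sounded", "E5 Event"))
```

Declaring the domain at E5 Event rather than E7 Activity is what allows non-intentional happenings, such as an animal call, to be connected to an audio item.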

Bern, April 2025

Post by George Bruseker (12 March 2026)

Dear all,

Here is the homework to clear up the hanging chads on issue 274 and baptize our 100th class! Thanks to Rob and Clarisse for useful input and feedback.

The general context of this issue is the longstanding need to have a class to manage audio items in their own right. Two SIGs ago this class was voted as a new class. To complete the issue its properties needed scope notes and examples. In the document below you will find homework for that issue.

https://docs.google.com/document/d/1shHt_cxb0a-pzFYgba0FG1Rtj8f7l_84/edit

The scope notes have been composed, some minor changes have been made to the main scope note (editorial so that we use the class name and number) and some examples added to the class and properties to illustrate.

This is proposed for decision at the next SIG.

Best,

George

Post by Martin Doerr (15 March 2026)

Dear George, All,

I wonder if it would be wiser to have this class in CRMsci:

a) CRMbase has no concept to observe or measure an event, but CRMsci has. Putting it in CRMbase leaves the class for a long time without a creation concept, whereas CRMsci can easily have one. There was a lot of thought already expressed in FRBRoo about recording. Do we have a concept for this further evolution?

b) "sounded" as a flat property of an Event as a whole is very unspecific. I'd prefer a "was recorded from".

c) Video and audiovisual media in general are equally relevant. What would the extensions to the latter look like?

d) How do you describe the relation between the prototypicality of the Audio Item and a particular recording?

e) How about the numerous audio recordings from interviews that correspond to texts?

Best,

Martin

In the 62nd joint meeting of the CIDOC CRM and ISO/TC46/SC4/WG9 & 55th FRBR/LRMoo SIG, GB presented the HW that he, RS, ClB, and OE had prepared, concerning the properties of E100 Audio Item. The details of the HW and a summary of the discussion that followed can be found in the attached document.

Decisions:

Update the scope note of E100 Audio Item as proposed in CIDOC CRM version 7.4
Introduce properties P201 sounded (was sounded by), P202 bears recording of (is recorded on) in CIDOC CRM version 7.4. It is a draft version, so it doesn’t matter that they’re not polished.
Do not introduce property Pxx sounds typical for (is typical sound of) at the moment. Explore its connection to P137 exemplifies, to avoid duplicating the property.
HW: PR, SdS, NC, GB to review the scope notes for E100 Audio Item, P201 sounded (was sounded by), P202 bears recording of (is recorded on) & Pxx sounds typical for (is typical sound of), and provide additional comments.

Oxford, March 2026
