Issue 530: Bias in data structure
Posted by Athanasios Velios on 4/03/2021
In version 7.1 a short but important sentence has been added at the end
of the scope section:
"Discussions on the types of bias present in the CIDOC CRM are in
progress within the CIDOC CRM community."
Issue 530 is used to track the discussions here:
It is important to engage in this discussion so that we first understand
the issues around bias and privileged positions and then how these may
or may not impact the development of the model.
We will then be more confident in making a more complete statement in
future versions. Issue 530 is scheduled to be discussed at the community
session of the forthcoming meeting.
Looking forward to it.
Posted by Anais Guillem on 8/03/2021
Dear Thanasis, all,
Some digital humanists work and publish on this question of bias in digital humanities: here is an example of a very apropos publication:
I have gathered a bibliography myself on decolonizing knowledge and methodology, especially in digital projects. I could join the discussion of your working group if you want.
Posted by Thanasis on 08/03/2021
Fantastic! Thank you for sharing, and you are first on the list.
For the rest of the list, in case you did not attend today's sessions: following the discussion of issue 530, a working group is being formed to discuss bias in the CRM. Please let me know if you wish to contribute to the discussion.
Posted by Erin Canning on 08/03/2021
I would also like to be involved in this discussion, please! I too have a reading list on the subject that I would be happy to share; I have been meaning to pull everything into a Zotero library and this is a good excuse to do so. For a single article to start things off, I would recommend Miriam Posner's "What's Next: The Radical, Unrealized Potential of the Digital Humanities" as an interesting read.
Posted by Robert Sanderson on 08/03/2021
Happy to join as well. I'm co-chair for the Bias Awareness and Responsibility Committee for Cultural Heritage at Yale University, and happy to share our experiences in that work. This is especially relevant to our work as we move to adopt CIDOC-CRM (via Linked Art) as our baseline ontology.
Some readings that we found useful:
https://doi.org/10.1080/0270319X.2019.1696069 -- "Aliens" vs Catalogers: Bias in the Library of Congress Subject Headings
https://journals.litwinbooks.com/index.php/jclis/article/view/120 -- Cultural Humility as a Framework for Anti-Oppressive Archival Description
https://doi.org/10.1111/cura.12191 -- Coming Together to Address Systemic Racism in Museums
https://www.youtube.com/watch?v=MbrC0yvBCNo&ab_channel=CollectionsTrust -- Decolonizing the Database by Dr Errol Francis
And, in print media: Algorithms of Oppression: How Search Engines Reinforce Racism by Safiya Noble of UCLA
A colleague and I presented about our work at EuroMed2020: Libraries, Archives and Museums are not Neutral: Working Toward Eliminating Systemic Bias and Racism in Cultural Heritage Information Systems
Youtube capture of the zoom: https://youtu.be/V9-IHQQv-LY?t=26661
From a CIDOC-CRM perspective, I think there are several issues to grapple with, including those that were brought up today.
Some differentiation I would try to draw, and without presumption that the answer for any of them is positive or negative:
* Ontology Features
-- does the data structure described by the ontology introduce, require or reinforce biases (especially harmful ones)?
-- does the ontology preclude use or engagement with different communities - is it accessible or are there barriers to entry that limit usage to certain communities, thereby introducing bias through exclusion?
* Documentation of the Ontology
-- does the documentation about the ontology introduce, require or reinforce biases?
-- is the documentation accessible to broad and diverse communities?
-- is the documentation transparent about issues that are known or presumed to exist?
* Methodology of determining the Ontology
-- does the way we produce the ontology, from ideation to standardization, introduce, require or reinforce biases?
-- is the methodology accessible to broad and diverse communities for participation?
-- is the methodology transparent as to how it works, and accountable when it doesn't?
* Implementations and Instances of the Ontology
-- I think these are useful as second-order evidence, but that we should not be too involved or prescriptive.
And some micro-topics and thoughts, which are more opinionated:
* P48 Has Preferred Identifier -- this breaks the very beneficial "neutral standpoint" design decision. We should deprecate it for this reason, quite apart from the issue on the docket that it should be deprecated as an outmoded design pattern.
* E31 Document, E32 Authority Document vs E73 Information Object -- The need to distinguish "propositions about reality" and "terminology or conceptual systems" from other information seems to introduce subjectivity and the potential therein for harmful biases as to what constitutes "truth" or "reality", and what is a "terminology" versus what is just a word document.
Posted by Thanasis on 08/03/2021
Thank you Erin. We are using Zotero already for the CRM so this is a good idea. I can check if a new folder can be created for issue 530.
Posted by Nicola Carboni on 9/3/2021
Dear Thanasis, all,
I would be happy to join the discussion. Another useful reading, in addition to those already cited, is "Cataloguing Culture: Legacies of Colonialism in Museum Documentation" by Hannah Turner.
Regarding the name and the scope of the issue: should we focus on data structure (I see the title of the issue is "bias in data structure") or specifically on ontologies and CRM?
While I do very much believe that data structure is an enormously important topic to discuss, it is an extremely large subject, and entails a larger series of problems which derive from the informational foundation, the concept of structure itself, the recorded information, as well as disciplinary inheritance in the chosen subject matter.
I second Rob's proposal to focus on the problem of the ontology and the process of documentation/development. I would add that we should include some points about CRM as a system of thought as well as the problem of formalisation.
>>* Implementations and Instances of the Ontology
>>-- I think these are useful as second-order evidence, but that we should not be too involved or prescriptive.
I would include the topic, so as to make clear the diversity in implementation (use of terminological systems as well as the use of the concept of controlled terminology itself), while indeed avoiding a prescriptive stance.
Posted by Thanasis on 9/03/2021
Indeed, the intention is to focus on the ontological level for the CRM and not to expand to data structures, schemas etc. The issue label does not represent the issue exactly, but it can act as a reminder. I will add the reference to the library.
Posted by George on 10/03/2021
I, too, fully support this important initiative and hope to learn much from colleagues in the discussion. The opening discussion was already very fruitful in starting us off looking at fundamental issues to take into account in the method of creating ontologies.
The shared Zotero library suggestion is a great one.
In the 49th joint meeting of the CIDOC CRM SIG and ISO/TC46/SC4/WG9; 42nd FRBR – CIDOC CRM Harmonization meeting, TV gave an outline of the issue, the question of bias in data structures (link to presentation), and introduced Prof. P.Goodwin, who brought the SIG up to date on the Worlding Public Cultures Project.
The discussion concluded in a concise proposal to move forward with the issue of bias by forming a working group in which to discuss bias-related concerns (forms of bias in data structures which interfere with cultural points of view; empirical or theoretical means we have to detect them; whether documenting concepts of one’s culture as an empirical fact can be regarded as bias) with the aim of:
- finding a common denominator and maximising diversity;
- producing a statement on bias for the CRM specification document;
- establishing criteria for examining classes and properties;
- creating new issues for improving the model.
The SIG agreed to the above and decided to take action as indicated: (a) inform the SIG list of the decision to start a WG on the discourse around bias (and ask for participation), (b) assign TV to lead the initiative/discussion. Prof. P.Goodwin and Dr M.Hidalgo Urbaneja will support the initiative.
Details of the discussion can be found here.
In the 50th joint meeting of the CIDOC CRM SIG and ISO/TC46/SC4/WG9; 43rd FRBR – CIDOC CRM Harmonization meeting, EC reported on the progress of the WG dealing with Bias. Progress report here.
MD: understanding bias in data structures is a misrepresentation of the actual problem. And the scope of the CRM is clearly stated and does not encourage bias.
What they should be aiming for is to identify the bias that may be introduced by the intended use of one particular data construct. It is not the data construct as such that gets identified as a source of bias.
GB: besides investigating whether constructs in an ontology further entrench bias, they could also review the process of ontology building and dialogue, and see whether bias manifests there too.
EC: we all come from particular perspectives and that translates into our understanding of the world. However, bias comes into play when it comes to existing power structures. In which case, you cannot just undo bias, because it’s a symptom of some sort of inequity. Identifying sources of bias serves to raise the issue, and for their part they are interested in identifying sources of bias in ontology/data structures.
In the 52nd joint meeting of the CIDOC CRM SIG and ISO/TC46/SC4/WG9; 45th FRBR - CIDOC CRM Harmonization meeting; EC gave a progress report of the work undertaken by the bias group. The focus has been on the following sub-topics:
- Identify areas of concern in the CRM (throughout the model, at different levels: from scope notes to working practices)
- Produce a statement on bias for the CRM specification document (link to DRAFT document HERE)
- Establish criteria for examining classes and properties (link to DRAFT document HERE)
- Create new issues for improving the model (link to DRAFT document HERE)
- Also: by reviewed area - E39, E21, E74: HERE
The group will meet again on March 14 to carry on reviewing the CRM according to Functional Units.