Issue 528: Guidelines and Protocols for Translating CIDOC CRM
Posted by George on 25/02/2021
With the advent of CIDOC CRM 7.1, a new stable community version of the CIDOC CRM (intended for ISO approval) is established. This is the occasion for the broader community wishing to implement the standard on a stable basis to invest in and engage with a mature ontological specification and text.
A key aspect of this work at the community implementation level is to render the standard in various languages so that it can be studied, appropriated and applied without linguistic barriers by different linguistic and cultural communities around the world.
Towards this end, the task of translation is key: it is an important intellectual process and product of the CIDOC CRM community in its own right.
The formulation of open, transparent and regular protocols and processes for creating a translation would thus be a crucial groundwork to lay out in order to give the appropriate support and weight to the translation efforts of the CIDOC CRM semantic data community.
At present, a search of the website (using the website search tools) returns only one article regarding translation. It is an issue from 2002 (http://www.cidoc-crm.org/Issue/ID-58-how-to-organize-the-translation-of-the-model) on how to organize the translation of the CIDOC CRM.
It would seem then that there is a need to pick up this issue again and address its various aspects (especially given the phenomenal growth of the CIDOC CRM uptake and the spread of its use to different linguistic communities around the world).
It seems prudent therefore to communally formulate guidelines for translation best practice and, separately, open and explicit protocols for submission and acceptance of CIDOC CRM translations, to be developed and put into action by the community.
The spirit of the guidelines and protocols should be to make a transparent space for engaging in this important work and understanding its relation to the overall CIDOC CRM community effort. It should aim to support existing translation efforts and provide an obvious, open and transparent path for additional translation efforts.
Of consideration for inclusion in these guidelines and protocols are the following topics:
Protocol for Starting an Official Translation
Who can start an official translation, are there any preconditions?
Protocol for Accepting an Official Translation
What are the criteria for accepting a translation as official?
When do the translated classes and properties pass into the serializations?
Is there recognition of the translating group in the serialization (for the respective translation element)?
Recommended Tools for Supporting Translation
Are there any tools recommended for supporting translation? Any recommended methods?
Networks of Support (Community of Translation Projects)
The translation of the CIDOC CRM is the translation of an ontological description of CH data that aims to be neutral. The translation of the standard requires a creative effort to understand and elucidate the conceptual objects specified in the ontology. Given the complexity of this effort, which involves philosophical, computer science and cultural heritage specific knowledge, the process can be quite challenging. Sharing experiences across language translations may help elucidate problems in understanding the standard or in finding useful philosophical correlate expressions in different languages.
Do/can we facilitate a place of exchange on these topics?
Means of Approaching (Ontological Translation Methodology)
Are there better or worse methods for approaching the translation task as such?
E.g.: should one translate classes and properties from E1 to En, P1 to Pn or should one follow the ontological hierarchy?
What are key terms that might best be approached first in order to support the general translation? (E.g.: Space Time Volume?)
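As a rough illustration of the "follow the ontological hierarchy" option, the sketch below orders classes so that each one comes after all of its superclasses. The `SUPERCLASSES` table is a small, simplified subset of the class hierarchy chosen for illustration (not the full model), and `hierarchy_order` is a hypothetical helper name.

```python
from collections import deque

# Illustrative, simplified subset of the class hierarchy (class -> superclasses).
# A real run would load the full hierarchy from the published definition.
SUPERCLASSES = {
    "E1 CRM Entity": [],
    "E2 Temporal Entity": ["E1 CRM Entity"],
    "E4 Period": ["E2 Temporal Entity"],
    "E5 Event": ["E4 Period"],
    "E7 Activity": ["E5 Event"],
    "E77 Persistent Item": ["E1 CRM Entity"],
    "E53 Place": ["E1 CRM Entity"],
}


def hierarchy_order(superclasses):
    """Return classes so that every class appears after all of its superclasses."""
    # Number of superclasses of each class that have not been emitted yet.
    remaining = {cls: len(parents) for cls, parents in superclasses.items()}
    # Reverse index: superclass -> direct subclasses.
    children = {cls: [] for cls in superclasses}
    for cls, parents in superclasses.items():
        for parent in parents:
            children[parent].append(cls)
    # Start from the root classes (those with no superclass).
    queue = deque(sorted(cls for cls, n in remaining.items() if n == 0))
    ordered = []
    while queue:
        cls = queue.popleft()
        ordered.append(cls)
        for child in sorted(children[cls]):
            remaining[child] -= 1
            if remaining[child] == 0:
                queue.append(child)
    return ordered


translation_order = hierarchy_order(SUPERCLASSES)
```

Translating in this order means the scope note of a superclass is already settled when its subclasses are reached. The alternative, E1 to En, needs no such ordering step.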
Change Management - Version Compare
What is the best way to manage iteration between versions and efficient translation? (ideally without retranslating everything)
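One possible way to avoid retranslating everything is to compare the source text of two versions, identifier by identifier, and only flag what changed. The sketch below assumes each version is available as a mapping from identifier (e.g. "E1") to its English text. The function name `plan_retranslation` and the sample texts are hypothetical placeholders, not actual scope notes.

```python
def plan_retranslation(old_source, new_source, translations):
    """Classify identifiers by the work a new source version requires."""
    plan = {"keep": [], "retranslate": [], "new": [], "removed": []}
    for ident, text in new_source.items():
        if ident not in old_source:
            # Introduced in the new version: needs a first translation.
            plan["new"].append(ident)
        elif old_source[ident] != text or ident not in translations:
            # Source text changed, or it was never translated.
            plan["retranslate"].append(ident)
        else:
            # Unchanged source with an existing translation: reuse it.
            plan["keep"].append(ident)
    # Present in the old version only: the translation can be retired.
    plan["removed"] = [i for i in old_source if i not in new_source]
    return plan


# Placeholder texts standing in for the English source of two versions.
old = {"E1": "text A", "E2": "text B", "E3": "text C"}
new = {"E1": "text A", "E2": "text B, revised", "E4": "text D"}
done = {"E1": "traduction A", "E2": "traduction B", "E3": "traduction C"}

plan = plan_retranslation(old, new, done)
```

Exact string comparison is deliberately strict: even a corrected typo in the English text flags the entry for review, which a translation group may or may not want.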
Place of Publication of Translation and Level of Recognition
Where are official translations published? Are they sufficiently visible? What is their relation to serializations?
Under what copyright should translations be made?
Infrastructure to Support Publication / Promotion of Translations
Is there any? Should there be any?
Template for Translators’ Introduction
The translation work in itself is another intellectual work which requires many important choices and requires the introduction of an interpretation of meaning and sense. A translator’s introduction then would be important in order to convey important decisions and methodological choices. Should this be standardized?
The above represents a first set of ideas. I propose we have a general discussion of this question and see if there is interest and capacity in the membership to create such guidelines and protocols.
Posted by Anais Guillem on 25/02/2021
I would like to follow up on George's email about the translation. In October 2019, a group of French archaeologists and CH specialists expressed an interest in translating the latest version and the future version 7 in order to disseminate CIDOC CRM more easily. The translation project is now an international (France, Belgium and Canada), collaborative effort. It is mostly inspired by Wiki contributions and everything is done in Gitlab with version control. The group meets (via Zoom) once a month to establish priorities and discuss the different issues.
The project is open to anyone interested in contributing to the translation in French: you just need a Huma-Num account.
The translation files could be used for translations in other languages. The diagrams are also in the process of translation. The translation issues are discussed in the Gitlab issues. The how-to is explained in the Wiki section of the gitlab project.
It would be very interesting to know if there are currently other translation projects in other languages, in order to compare process and methodology. The git repository could be cloned if another group wants to translate the ontology into another language.
Posted by Philippe Michon on 27/02/2021
As this issue arises from a discussion between George and us at the Canadian Heritage Information Network (CHIN), I just wanted to confirm that we are greatly interested in this issue.
The main reason is that we must have a French version in order to be able to use CIDOC CRM within our organization. Indeed, we have rules on bilingualism that oblige us to provide, within strict time limits, a quality French equivalent of the standards to which we refer (one that meets the quality and maintenance standards of governmental agencies).
We are contributing to the French translation initiative presented by Anaïs. In addition, for administrative reasons, we are in the process of setting up a specific translation process for the Canadian team.
Of course, we will share with you as soon as possible the documents that we will make publicly available to our editors and partners. Here is a list of what we plan to share in the coming year:
Google Docs translation templates
Protocol to convert Google Docs templates to Markdown (our goal is to publish on GitHub Pages)
Index of CIDOC CRM entities (translated)
Update protocol (e.g. 7.0 to 7.1)
Spreadsheet for keeping track of the typos in the English version
List of the translation challenges
Best practices for translation
We hope that our work will serve as a foundation for the development of general recommendations and protocols in order to further democratize CIDOC CRM.
We look forward to participating in discussions concerning this issue.
Posted by Franco on 27/02/2021
The appearance of this issue is a sign of the vitality, importance and diffusion of the CRM.
Undertaking a translation poses a number of issues that need to be addressed before moving to practicalities.
The “Canadian case” shows the need to comply with legal constraints. For example, if a country formally decides that the national standard for cultural heritage documentation is the CRM, the related decree will need to have an appendix with the CRM version approved, and I think that it would not be acceptable to include it in English; it should be in that country’s national official language(s). Thus it is better to have an ‘approved’ translation in advance, to guarantee that the ‘official’ text is a faithful one. This may also resolve contractual issues, for example with companies contracted to prepare heritage documentation compliant with CRM.
On the other hand, using different translated versions of the CRM may - at least in principle - undermine its universality. Even if machine actionability would eventually be preserved, attention must be paid to the human side of the job, to guarantee that scope notes - for example - give the same meaning to labels across translations.
What should be translated? Of course, the discursive part, such as the introduction - the pages numbered with Roman numerals in the CRM description. But these pages contain examples and references to Classes and Properties, to which the specific rules should apply. For example, the statement on page xi "In CIDOC CRM such statements of responsibility are expressed though knowledge creation events such as E13 Attribute Assignment and its relevant subclasses.” includes such a reference that must follow the translation rules for Class names.
Another example is the “IsA” relationship. If translated, it contains the indefinite article “A”, which in some languages must follow the grammatical gender of the term it refers to, and thus gets two/three equivalents. So my choice would be to consider it as a symbol and keep it in English also in the translations. There may be other issues of this kind, so a general directive should be 1) established 2) accepted according to local constraints. I believe that the decision could be easy in this particular case; but it must be decided for all the similar occurrences.
The above leads me to think that before undertaking any translation, the official English version should be examined to evaluate what is English - and may be translated - and what is symbolic and just seems English - not to be translated. IsA is an example, there may be others. The translation may be funny from a literary point of view (“Martin Doerr IsA un homme”), so an explanation could be given - maybe in a footnote - to help understandability.
Naming conventions (pages xiv - xv) should of course be preserved. Here examples are given in Italic e.g. "E53 Place. P122 borders with: E53 Place”. I am not entirely clear about the need for a full stop after Place (could be a typo from copy-paste), and the use of Italic is also introduced surreptitiously. By the way, it is maybe high time to establish a recommendation to standardize how to quote class and property names e.g. in articles, in order to distinguish them from plain discourse also typographically.
Coming to scope notes, I think that only the symbolic parts should remain in English, i.e. the alphanumeric label e.g. “E1”.
The above are just examples of what a preventive survey of the official English text will define as “not translatable”. In my opinion it wouldn’t take much time to do it.
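A survey of this kind could be partly mechanized. The sketch below is only an illustration under stated assumptions: it treats alphanumeric labels such as E1 or P122 (optionally with a trailing "i") and the IsA marker as the symbolic parts, swaps them for placeholders before translation, and restores them afterwards. The pattern, the placeholder format and the function names are all hypothetical, and the real list of symbols would come from the survey itself.

```python
import re

# Assumed symbolic tokens: class/property codes (E1, P122, P122i) and IsA.
SYMBOL = re.compile(r"\b(?:[EP]\d+i?|IsA)\b")
PLACEHOLDER = re.compile(r"\[\[(\d+)\]\]")


def protect(text):
    """Replace symbolic tokens by numbered placeholders; return text and tokens."""
    symbols = []

    def stash(match):
        symbols.append(match.group(0))
        return f"[[{len(symbols) - 1}]]"

    return SYMBOL.sub(stash, text), symbols


def restore(text, symbols):
    """Put the original symbolic tokens back in place of the placeholders."""
    return PLACEHOLDER.sub(lambda m: symbols[int(m.group(1))], text)


masked, kept = protect("E53 Place. P122 borders with: E53 Place")
round_trip = restore(masked, kept)
```

Only the codes are protected here. The class and property names themselves ("Place", "borders with") remain in the text to be translated, in line with the suggestion that just the alphanumeric label stays in English.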
The next step is what George calls “translation rules”. I am looking forward to fierce debates about the translation of “Human-made”, whether it should follow the style of the Musée de l’Homme (“fait par l’homme”) or choose a gender-neutral “anthropogenic” or whatever else.
I agree with George on the necessity of general guidelines and protocols for translation. But since these depend on the culture behind the language into which the CRM is going to be translated, accepting them is not automatic: how can a native English (or Greek, or German) speaker decide what is better for Italian or French? So such protocols should be stated in a general form, and then implemented language by language, which brings us back to George’s topic about "What are the criteria for accepting a translation as official?” and who is in charge of it. There may be different levels of “acceptance”, e.g. a working text, a published translation for comments, a technically approved one and a linguistically approved one. I would feel confident enough to address the first three levels, but for the highest level I would need the support of linguists - better if official ones.
To draw on what is already being undertaken: who decides whether the French Canadian version is OK? Is there any potential conflict between what the SIG (or any judge established by it) decides and the decision by an officially established Canadian referee for effective bilingualism?
Finally, copyright. The copyright statement in the title page of CRM documentation "Copyright © 2003 ICOM/CRM Special Interest Group” in my opinion sounds a bit old-fashioned and unpleasant; there are nowadays more appropriate licensing schemes that allow public open use, give appropriate recognition to authors, and protect the moral rights of those involved in the work, people and organizations, while avoiding any unauthorized commercial exploitation. In the era of Open Science it sounds a bit conservative. The same should apply to translations.
As you may have understood from this long email, I am interested in the adventure, both in preparing the general framework and in supporting a translation into Italian. If useful, we can advertise the initiative through various networks, to inform those potentially interested in the job.
Posted by George on 03/03/2021
Thanks already for your valuable feedback and uptake on this proposal. I am pleased to say that this issue has been added to the official CRM SIG issue list:
It is also scheduled to be discussed in the afternoon session of the upcoming SIG meeting on Monday, March 8th. I do hope everyone responding here and all others interested in this topic will be available to share their knowledge and help us move this subject forward.
Posted by Massoomeh on 8/3/2021
Thank you George for proposing this issue. I totally agree with this proposal. Drawing on our experience with translating the Model into Persian, Omid Hodjati and I have answered your questions. Please follow this link to see the slides of our answers.
I am looking forward to the discussion tonight!