Issue 528: Guidelines and Protocols for Translating CIDOC CRM

ID: 528
Starting Date: 2021-02-25
Working Group: 4
Status: Open
Background: 

Posted by George on 25/02/2021

Dear all,

With the advent of CIDOC CRM 7.1, a new stable community version (aimed for ISO approval) of the CIDOC CRM is established. This is the occasion for the broader community wishing to implement the standard on a stable basis to invest and engage with a mature ontological specification and text. 

A key aspect of this work at the community implementation level is to render the standard in various languages so that it can be studied, appropriated and applied without linguistic barriers by different linguistic and cultural communities around the world.

Towards this end, the task of translation is key and an important intellectual process and product of the CIDOC CRM community in its own right. 

The formulation of open, transparent and regular protocols and processes for creating a translation would thus be a crucial groundwork to lay out in order to give the appropriate support and weight to the translation efforts of the CIDOC CRM semantic data community.

At present, a search of the website (using the website search tools) returns only one article regarding translation. It is an issue from 2002 (http://www.cidoc-crm.org/Issue/ID-58-how-to-organize-the-translation-of-the-model) on how to organize the translation of the CIDOC CRM. 

It would seem then that there is a need to pick up this issue again and address its various aspects (especially given the phenomenal growth of the CIDOC CRM uptake and the spread of its use to different linguistic communities around the world).

It seems prudent therefore to communally create a formulation of guidelines for translation best practice and, separately, open and explicit protocols for submission and acceptance of CIDOC CRM translations, to be developed and put into action by the community.

The spirit of the guidelines and protocols should be to make a transparent space for engaging in this important work and understanding its relation to the overall CIDOC CRM community effort. It should aim to support existing translation efforts and provide an obvious, open and transparent path for additional translation efforts.

Of consideration for inclusion in these guidelines and protocols are the following topics:

Protocol for Starting an Official Translation

Who can start an official translation, are there any preconditions?

Protocol for Accepting an Official Translation 

What are the criteria for accepting a translation as official? 

When do the translated classes and properties pass into the serializations?

Is there recognition of the translating group in the serialization (for the respective translation element)?

Recommended Tools for Supporting Translation

Are there any tools recommended for supporting translation? Any recommended methods?

Networks of Support (Community of Translation Projects)

The translation of the CIDOC CRM is the translation of an aimed-for neutral ontological description of CH data. The translation of the standard requires a creative effort to understand and elucidate the conceptual objects specified in the ontology. Given the complexity of this effort involving philosophical, computer science and cultural heritage specific knowledge, the process can be quite challenging. Sharing experiences across language translations may help elucidate problems in understanding the standard or finding useful philosophic correlate expressions in different languages.

Do/can we facilitate a place of exchange on these topics?

Means of Approaching (Ontological Translation Methodology)

Are there better or worse methods for approaching the translation task as such? 

E.g.: should one translate classes and properties from E1 to En, P1 to Pn or should one follow the ontological hierarchy?

What are key terms that might best be approached first in order to support the general translation? (E.g.: Space Time Volume?)

Change Management - Version Compare

What is the best way to manage iteration between versions and efficient translation? (don’t want to retranslate all if possible)

Place of Publication of Translation and Level of Recognition

Where are official translations published? Are they sufficiently visible? What is their relation to serializations?

Copyright Issues

Under what copyright should translations be made?

Infrastructure to Support Publication / Promotion of Translations

Is there any? Should there be any?

Template for Translators’ Introduction

The translation work in itself is another intellectual work which requires many important choices and requires the introduction of an interpretation of meaning and sense. A translator’s introduction then would be important in order to convey important decisions and methodological choices. Should this be standardized?

The above represents a first set of ideas. I propose we have a general discussion of this question and see if there is interest and capacity in the membership to create such guidelines and protocols.

Current Proposal: 

Posted by Anais Guillem on 25/02/2021

Hi CRM-lovers,

I would like to follow up on George's email about the translation. In October 2019, a group of French archaeologists and CH specialists expressed an interest in translating the latest version and the future version 7 in order to disseminate CIDOC CRM more easily. Now, the project of translation is international (France, Belgium and Canada) and a collaborative effort. It is mostly inspired by Wiki contributions and everything is done in Gitlab with version control. The group meets (via Zoom) once a month to establish some priorities and discuss the different issues.

The project is open to anyone interested in contributing to the translation in French: you just need a Huma-Num account.

https://gitlab.huma-num.fr/bdavid/doc-fr-cidoc-crm

The translation files could be used for translations in other languages. The diagrams are also in the process of translation. The translation issues are discussed in the Gitlab issues. The how-to is explained in the Wiki section of the gitlab project. 

It would be very interesting to know if there are currently other translation projects in other languages to compare the process and methodology. The git repository could be cloned if another group wants to translate the ontology in another language.

Posted by Philippe Michon on 27/02/2021

Dear all,

As this issue arises from a discussion between George and us at the Canadian Heritage Information Network (CHIN), I just wanted to confirm that we are greatly interested in this issue. 

The main reason is that we must have a French version in order to be able to use CIDOC CRM within our organization. Indeed, we have rules on bilingualism that oblige us to have, within strict time limits, a quality French equivalent (one that meets the quality and maintenance standards of governmental agencies) of the standards to which we refer.

We are contributing to the French translation initiative presented by Anaïs. In addition, for administrative reasons, we are in the process of setting up a specific translation process for the Canadian team.

Of course, we will share with you as soon as possible the documents that we will make publicly available to our editors and partners. Here is a list of what we plan to share in the coming year:

  1. Google Docs translation templates

  2. Protocol to convert Google Doc Templates in Markdown (our goal is to publish on Github Pages)

  3. Stylesheet

  4. Index of CIDOC CRM entities (translated)

  5. Update protocol (e.g. 7.0 to 7.1)

  6. Spreadsheet for keeping track of the typos in the English version

  7. List of the translation challenges

  8. Best practices for translation

We hope that our work will serve as a foundation for the development of general recommendations and protocols in order to further democratize CIDOC CRM.

We look forward to participating in discussions concerning this issue.

Posted by Franco on 27/02/2021

Dear all,

the appearance of this issue is the sign of the vitality, importance and diffusion of the CRM.

Undertaking a translation poses a number of issues that need to be addressed before moving to practicalities.

The “Canadian case” shows the need of complying with legal constraints. For example, if a country formally decides that the national standard for cultural heritage documentation is the CRM, the related decree will need to have an appendix with the CRM version approved, and I think that it would not be acceptable to include it in English, but it should be in that country’s national official language(s). Thus it is better to have an ‘approved' translation in advance, to guarantee that the ‘official’ text is a faithful one. This may also resolve contractual issues, for example with companies contracted to prepare heritage documentation compliant with CRM.

On the other hand, using different translated versions of the CRM may - at least in principle - undermine its universality. Even if machine actionability would eventually be preserved, attention must be paid to the human side of the job, to guarantee that scope notes - for example - give the same meaning to labels across translations.

What should be translated? Of course, the discursive part, as the introduction - the pages numbered with Roman numerals in the CRM description. But, they contain examples and references to Classes and Properties, for which the specific rules should apply. For example, the statement on page xi "In CIDOC CRM such statements of responsibility are expressed though knowledge creation events such as E13 Attribute Assignment and its relevant subclasses.” includes such a reference that must follow the translation rules for Class names. 
Another example is the “IsA” relationship. If translated, it contains the indefinite article “A” which in some languages must follow the grammatical gender of the term it refers to, and thus gets two/three equivalents. So my choice would be to consider it as a symbol and keep it in English also in the translations. There may be other issues of this kind, so a general directive should be 1) established 2) accepted according to local constraints. I believe that the decision could be easy in this particular case; but it must be decided for all the similar occurrences.

The above leads me to think that before undertaking any translation, the official English version should be examined to evaluate what is English - and may be translated - and what is symbolic and just seems English - not to be translated. IsA is an example, there may be others. The translation may be funny from a literary point of view (“Martin Doerr IsA un homme”), so an explanation could be given - maybe in a footnote - to help understandability.

Naming conventions (pages xiv - xv) should of course be preserved. Here examples are given in Italic e.g. "E53 Place. P122 borders with: E53 Place”. I am not completely clear with the need of a full stop after Place (could be a typo from copy-paste), but also the use of Italic is introduced surreptitiously. By the way, it is maybe high time to establish a recommendation to standardize how to quote class and property names e.g. in articles, in order to distinguish them from plain discourse also typographically.

Coming to scope notes, I think that only the symbolic parts should remain in English, i.e. the alphanumeric label e.g. “E1”. 

The above are just examples of what a preventive survey of the official English text will define as “not translatable”. In my opinion it wouldn’t take much time to do it.

The next step is what George calls “translation rules”. I am looking forward to fierce debates about the translation of “Human-made”, if it should follow the style of the Musée de l’Homme (“fait par l’homme”) or choose a gender-neutral “anthropogenic” or whatever else.

I agree with George on the necessity of general guidelines and protocols for translation. But since these depend on the culture behind the language into which the CRM is going to be translated, accepting them is not automatic: how can a native English (or Greek, or German) speaker decide what is better for Italian or French? So such protocols should be stated in a general form, and then implemented language by language, which brings us back to George’s topic about “What are the criteria for accepting a translation as official?” and who is in charge of it. There may be different levels of “acceptance”, e.g. a working text, a published translation for comments, a technically approved one and a linguistically approved one. I would feel confident enough to address the first three levels, but for the highest level I would need the support of linguists - better if official ones.

To profit from what is already being undertaken, who decides if the French Canadian version is OK? Is there any potential conflict between what the SIG (or any judge established by it) decides and the decision by an officially established Canadian referee for effective bilingualism?

Finally, copyright. The copyright statement in the title page of CRM documentation, “Copyright © 2003 ICOM/CRM Special Interest Group”, in my opinion sounds a bit old-fashioned and unpleasant. There are nowadays more appropriate licensing schemes that allow public open use, give appropriate recognition to authors, and protect the moral rights of those involved in the work, people and organizations, while avoiding any unauthorized commercial exploitation. In the era of Open Science it sounds a bit conservative. The same should apply to translations.

As you may have understood from this long email, I am interested in the adventure, both in preparing the general framework and in supporting a translation into Italian. If useful, we can advertise the initiative through various networks, to inform those potentially interested in the job.
 

Posted by George on 03/03/2021

Dear all,

Thanks already for your valuable feedback and uptake on this proposal. I am pleased to say that this issue has been added to the official CRM SIG issue list:

http://www.cidoc-crm.org/Issue/ID-528-guidelines-and-protocols-for-translating-cidoc-crm

It is also scheduled to be discussed in the afternoon session of the upcoming SIG on Monday March 8th. I do hope everyone responding here and all others interested in this topic will be available to share their knowledge and help us move this subject forward.

Posted by Massoomeh on 8/3/2021

Dear All,
Thank you George for proposing this issue. I totally agree with this proposal. Drawing on our experience with translating the Model into Persian, Omid Hodjati and I have answered your questions. Please follow this link to see the slides of our answers.

I am looking forward to the discussion tonight!

At the 49th joint meeting of the CIDOC CRM SIG and ISO/TC46/SC4/WG9 and the 42nd FRBR – CIDOC CRM Harmonization meeting, Philippe Michon brought the SIG up to date with the CHIN initiative to translate the CRM into French. The SIG decided to put together a WG to discuss translation-related issues (methodology, protocols, tools and software to apply when translating the CRM).

The initiative will be led by Philippe Michon (HW: to inform the SIG and set up the WG).

The translation WG should take into consideration the following aspects: 

  1. content guidelines
  2. interoperability standard + versioning tools
    • too many tools 
    • structure units and mark them up etc.
  3. communication and validation protocols 
  4. all teams engaging in CRM translation projects should be mentioned at a visible place on the website (translations page).  
    HW: everyone leading an official translation project to share details 
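
As an illustration of point 2 (versioning tools, structuring units and marking them up), here is a minimal sketch of how a translation team might detect which units changed between two versions, so that only those are retranslated. It is not something the SIG decided on. It assumes each version has already been split into units keyed by a stable identifier; the unit ids (`U1` to `U4`), the placeholder texts and the function names `fingerprint` and `units_to_review` are all hypothetical.

```python
import hashlib


def fingerprint(text: str) -> str:
    """Hash whitespace-normalised text so that reflowed lines do not count as changes."""
    normalised = " ".join(text.split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def units_to_review(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Classify unit ids by what a translator must do when moving from old to new."""
    added = sorted(k for k in new if k not in old)
    removed = sorted(k for k in old if k not in new)
    changed = sorted(
        k for k in new if k in old and fingerprint(old[k]) != fingerprint(new[k])
    )
    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical placeholder units, not actual CIDOC CRM text.
v_old = {
    "U1": "placeholder text A",
    "U2": "placeholder text B",
    "U3": "placeholder text C",
}
v_new = {
    "U1": "placeholder   text A",          # whitespace only: not a real change
    "U2": "placeholder text B, revised",   # wording changed: needs retranslation
    "U4": "placeholder text D",            # new unit: needs translation
}

report = units_to_review(v_old, v_new)
print(report)
```

Units that appear in none of the three lists keep their existing translation. A real workflow would also need to decide how to treat renamed or renumbered units, which this sketch does not attempt.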

March 2021

Post by Philippe Michon (18 June 2021)

Dear all,

In anticipation of the SIG meeting, I wanted to inform you of the progress of the work of the CIDOC CRM Translation Guidelines Working Group.

First of all, I would like to remind you that our current mandate is to discuss issues related to translation, in particular questions relating to methodology, protocols, and tools.

The working group is made up of 13 members who represent at least 8 different languages. I would like to take this moment to thank those who have contributed to the reflection during these last two months.

The working group met twice. The first meeting made it possible to present our respective projects in addition to initiating a reflection on the needs that the group should address. The second meeting made it possible to identify more clearly the needs and the documentation that we will have to develop. We have also highlighted certain aspects that go beyond our mandate.

Without going into details in this email, we are considering the creation of 5 potential documents:

  1. "Guide of CIDOC CRM Best Translating Practices" which will define the different levels of translation, the expertise required, the workflows and recommendations on how to properly develop a style guide.
  2. "Governance Guidelines" which will define the licensing options, a translation policy and rules to ensure quality translations.
  3. "Comparison and Update Protocol" which will make it possible to easily compare versions, in particular by explaining how changes will be tracked. This document will also include the mechanisms to ensure the improvement of the original version, in particular by the presentation of a clear communication protocol between the SIG and the translation initiatives.
  4. "Introduction for translators who are new to CIDOC CRM" which will serve as a practical guide for translators who are less familiar with CIDOC CRM. It is presently contemplated to reuse documents which already exist.
  5. "Tools and Interchange Protocols" which will define the technological aspects which will facilitate the exchange of information between the SIG and the translation initiatives. We are thinking in particular of questions of formats, templates, styles, compatibility, updates, bibliography management and tools.

 

As mentioned above, some aspects are outside the scope of our working group and for this reason, we would like to solicit the participation of the SIG with regard to the following aspects:

  1. We believe that it is important to give visibility to translation initiatives on the CIDOC CRM website, in particular to be able to quickly identify current initiatives, but also to easily access the documentation that we are going to produce.
  2. We would also need your insight into governance, particularly in terms of licensing, the ecosystem the initiatives will be part of, and publication formats.
  3. We believe that a comprehensive glossary to cover certain ambiguous terms would be very useful to allow a quality translation.
  4. Finally, in order to facilitate the creation of references, direct access to the SIG bibliography on Zotero would be appreciated.

 

In conclusion, the next few months will be devoted to writing this documentation and we invite those wishing to participate in the initiative to contact us.

Everything will be presented to you in a more detailed fashion during the SIG meeting; here is the visual support that will accompany the presentation if you ever want to consult it in advance.

All the best,
Philippe

Meetings discussed: