Issue 417: Are begin_of_the_begin / end_of_the_end excluded from the time range?
Posted by Robert Sanderson on 8/5/2019
I admit I made the rookie mistake of assuming that the P81a/b and P82a/b properties followed the typical temporal pattern of an inclusive beginning and an exclusive end.
Or using interval notation: [begin_of_the_begin, end_of_the_end)
Thus if you know that an event happened sometime in 1586, the begin of the begin would be 1586-01-01T00:00:00 and the end of the end would be 1587-01-01T00:00:00.
However, http://www.cidoc-crm.org/guidelines-for-using-p82a-p82b-p81a-p81b seems to clarify that both are exclusive.
> "P82a_begin_of_the_begin" should be instantiated as the latest point in time the user is sure that the respective temporal phenomenon is indeed *not yet* happening.
> "P82b_end_of_the_end" should be instantiated as the earliest point in time the user is sure that the respective temporal phenomenon is indeed *no longer* ongoing.
And thus (begin_of_the_begin, end_of_the_end)
Meaning that the begin of the begin would need to be 1585-12-31T23:59:59 such that midnight on January first is included in the range, and the end of the end would be midnight of January first, 1587.
However, in the following paragraph it says:
> … e.g. 1971 = Jan 1 1971 0:00:00. Respectively, for “P82b_end_of_the_end” the implementation should “round it up”, e.g. 1971 = Dec 31 1971 23:59:59.
Which would mean that both ends were *included* in the range.
And thus [begin_of_the_begin, end_of_the_end]
Enquiring minds that need to implement this consistently would like to know which is correct ☺
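To make the three readings concrete, here is a minimal sketch that checks whether the first instant of 1586 falls inside each candidate interval. The helper `contains` and the variable names are hypothetical, made up for illustration; they are not part of any CIDOC CRM tooling.

```python
from datetime import datetime

def contains(t, begin, end, begin_inclusive, end_inclusive):
    """Return True if instant t lies in the interval under the given bound semantics."""
    after_begin = t >= begin if begin_inclusive else t > begin
    before_end = t <= end if end_inclusive else t < end
    return after_begin and before_end

first_instant = datetime(1586, 1, 1, 0, 0, 0)

# [begin, end): inclusive beginning, exclusive end
half_open = contains(first_instant, datetime(1586, 1, 1), datetime(1587, 1, 1), True, False)

# (begin, end): both exclusive, with the same boundary values as above
open_same = contains(first_instant, datetime(1586, 1, 1), datetime(1587, 1, 1), False, False)

# (begin, end): both exclusive, with the begin moved back by one second
open_shifted = contains(first_instant, datetime(1585, 12, 31, 23, 59, 59), datetime(1587, 1, 1), False, False)

# [begin, end]: both inclusive, rounded as in the guidelines' 1971 example
closed = contains(first_instant, datetime(1586, 1, 1), datetime(1586, 12, 31, 23, 59, 59), True, True)

print(half_open, open_same, open_shifted, closed)
```

Only the fully exclusive reading with unshifted boundary values leaves midnight on January first out of the range.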
Posted by Florian Kräutli on 9/5/2019
Not having read the guidelines as attentively as you, I usually implement P82a/b on the assumption that the begin and end dates are both included in the range.
For example, here's the date related to a book published in 1586:
I think this is readable as a confidence interval for the book having been published sometime in 1586, for lack of better ways to express the level of accuracy in date datatypes.
Posted by Robert Sanderson on 9/5/2019
Thanks Florian, Nicola!
Should the example be updated (meaning we must all update our implementations), or should the specification be changed to match the example, which is what everyone seems to do in practice?
My proposal would be to do the latter, in the face of the current ambiguity.
What has everyone else done in this situation? Three data points are interesting, but still anecdotal.
(And I’m not going to mention leap seconds that would make the end of some years 23:59:60 instead of 23:59:59, which would be solved by an exclusive end date)
Posted by Florian Kräutli on 10/5/2019
I actually think that the text makes the right assumption. If something is said to have happened in 1586, we can be reasonably certain that it happened before 1 January 1587. We can’t be certain that it did not happen a millisecond after 31 December 1586 at 23:59:59.
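A small sketch of that sub-second gap, assuming timestamps with better than one-second precision. The variable names are illustrative only.

```python
from datetime import datetime, timedelta

# An instant one millisecond after 23:59:59 on 31 December 1586
late = datetime(1586, 12, 31, 23, 59, 59) + timedelta(milliseconds=1)

closed_end = datetime(1586, 12, 31, 23, 59, 59)  # inclusive upper bound, as in the example
open_end = datetime(1587, 1, 1)                  # exclusive upper bound, as in the text

in_closed = late <= closed_end    # the instant falls outside the rounded-up bound
in_half_open = late < open_end    # the instant is still before 1 January 1587

print(in_closed, in_half_open)
```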
I think we should provide two examples. One that matches the text and the current one, mentioning that this can be done for ease of implementation.
Which version one implements is after all not the decision of the CRM, but depends on the available knowledge and interpretation of the source data.
Posted by Martin on 10/5/2019
Rob is right.
If we talk about seconds, it is somehow hunting flies. But we really need to test how databases interpret intervals given in dates.
The conversion to the begin of the year/day or the end of the year/day should be done by the data entry templates, knowing that we instantiate an ..a or ..b property, and NOT manually. We have written such modules in the past for RDBMS implementations. This could be a standard software module. Would someone volunteer to provide one?
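As a rough idea of what such a module might look like, here is a minimal sketch. It assumes input strings of the form YYYY, YYYY-MM or YYYY-MM-DD and follows the rounding in the guidelines' 1971 example (round down to 0:00:00, round up to 23:59:59), which is only one of the conventions under discussion. The function name `expand_partial_date` and the `round_up` flag are hypothetical; the caller would pass `round_up=False` for P82a and `round_up=True` for P82b.

```python
import calendar
from datetime import datetime

def expand_partial_date(value, round_up):
    """Expand 'YYYY', 'YYYY-MM' or 'YYYY-MM-DD' to a full timestamp.

    round_up=False gives the first second of the period,
    round_up=True gives the last whole second of the period.
    """
    parts = [int(p) for p in value.split("-")]
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"unsupported date format: {value!r}")
    year = parts[0]
    if not round_up:
        month = parts[1] if len(parts) > 1 else 1
        day = parts[2] if len(parts) > 2 else 1
        return datetime(year, month, day, 0, 0, 0)
    month = parts[1] if len(parts) > 1 else 12
    # monthrange returns (weekday of first day, number of days in the month)
    day = parts[2] if len(parts) > 2 else calendar.monthrange(year, month)[1]
    return datetime(year, month, day, 23, 59, 59)

print(expand_partial_date("1971", round_up=False))
print(expand_partial_date("1971", round_up=True))
```

Switching to an exclusive end would only require changing the round-up branch to return the first instant of the following period.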
Posted by Martin on 10/5/2019
In other words:
If date <= 1951 internally converts to Dec 31, 23:59... 1951, Florian's solution works for querying things that possibly happened in this range. If it converts to Jan 1, 0:00, it is wrong. It remains to be checked how all 9 date queries work.
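The difference between the two conversions can be seen with a single comparison. This is only a sketch in plain Python with a made-up event date; how a given database actually expands a bare year is exactly what would need testing.

```python
from datetime import datetime

event = datetime(1951, 6, 15)  # hypothetical event date in mid-1951

as_year_start = datetime(1951, 1, 1, 0, 0, 0)     # "1951" expanded to Jan 1, 0:00
as_year_end = datetime(1951, 12, 31, 23, 59, 59)  # "1951" expanded to Dec 31, 23:59:59

matches_if_start = event <= as_year_start  # the event is missed
matches_if_end = event <= as_year_end      # the event is found

print(matches_if_start, matches_if_end)
```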
Posted by Martin on 11/5/2019
Sorry for answering in pieces. The "ultima ratio" for all we do are the queries, and not the entities. There are 9 possible questions about a time-span: Give me all events that (1) must have happened before event X started, that may have happened before event X started, that must have happened before event X ended, that may have... etc. Whether the last second of a day is included or not is completely irrelevant for our purposes. If the end of the end of 1895 is interpreted as Jan 1, 0:00, 1896, the question is how implementations will answer the above queries with respect to 1896, and not whether the last second is in or out. I'll try to sort that out in the next weeks. I hope different RDF databases will at least be consistent!
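The first two of these questions could be sketched as follows. This is one possible reading, not a definitive one: it assumes only the outer bounds (P82a/P82b) are known, takes "happened before X started" to mean that the event ended before X started, and the choice between strict and non-strict comparison is precisely the point under discussion. The `TimeSpan` class and both function names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class TimeSpan:
    begin_of_the_begin: datetime  # P82a
    end_of_the_end: datetime      # P82b

def must_be_before_start(a, x):
    """a certainly ended before x started: a's latest end precedes x's earliest start."""
    return a.end_of_the_end <= x.begin_of_the_begin

def may_be_before_start(a, x):
    """a possibly ended before x started: a's earliest possible end
    (its earliest start, if it was very short) precedes x's latest possible start."""
    return a.begin_of_the_begin < x.end_of_the_end

def year_span(year):
    # Closed convention from the guidelines' example; an assumption for this sketch
    return TimeSpan(datetime(year, 1, 1), datetime(year, 12, 31, 23, 59, 59))

x = year_span(1586)
print(must_be_before_start(year_span(1585), x), may_be_before_start(year_span(1585), x))
print(must_be_before_start(year_span(1586), x), may_be_before_start(year_span(1586), x))
print(must_be_before_start(year_span(1587), x), may_be_before_start(year_span(1587), x))
```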
Best wishes and thank you for pointing to this issue!