<time> safe for historians
The HTML 5 spec introduces the
<time> element to mark up a date or time. Although I support the inclusion of these semantics in HTML, I believe that the current specification of the
<time> element is vague because it avoids the question whether the element is safe for historians. Right now it hurts historical research more than it helps. In this entry I’ll explain why.
Although I will concentrate on the HTML5 syntax here, what I have to say also applies to the microformats datetime design pattern. The Microformats site adds one important detail to the discussion that the HTML5 spec overlooks: the point of having a
<time> element (or a datetime design pattern) at all:
Use the datetime-design-pattern to make datetimes that are human readable also formally machine readable.
Who needs machine readable dates? As far as I can see there are two target audiences for this operation. The first is obviously social applications that have to work with dates, and where it can be useful to compare dates of two different events. An app must be able to see if two events fall on the same day and warn you if they do.
However, as a target audience social applications are immediately followed by historians (or historical, chronological applications). After all, historians are (dare I say it?) historically the most prolific users of dates, until they were upstaged by social applications.
This raises the question whether the
<time> element should be tailored for historical use at all. When I started writing this entry I was convinced that it should.
In keeping with this definition of its purpose, I see the
<time> element as a tool for an Internet-wide chronological search-and-compare system. Such a system will be a boon to historians, who would be allowed to quickly and easily look up events that happened around the same time as the event they’re writing about.
In history, just as in other academic disciplines, serendipitous discoveries are the meat of exciting new theories. A history-compliant use of the
<time> element that allows automatic search and compare would broaden the horizons of historians.
However, now that I’ve reviewed some of the more common problems that have to be solved in order to decrease potential harm, I’m starting to doubt whether the
<time> element can easily be made to fit history.
Right now, though, the specification is a vague compromise that doesn’t make the
<time> element useful for historical research, but still allows it to be used historically.
I feel this ambiguity should be removed. I feel that the specification should clearly state whether the
<time> element is meant for historical use or not. The current vague, implied “No” should be changed to a clear answer. I prefer Yes, but I can live with No.
If the <time> element should be made safe for historians, there’s quite a bit of work to be done, some of which is discussed in this article. If it should not be made history-safe, we have to add a cut-off date to the specification. Dates before this cut-off date would be ignored.
The basic problem (that we’ll discuss in great detail below) is that the current specification requires the use of the so-called proleptic Gregorian calendar. Although that makes perfect sense in the modern age, it becomes progressively more pointless as we travel back in time, and somewhere in the late 16th century we reach the point that proleptic Gregorian dates become actively harmful to historical research.
The basic problem is that historians of the Middle Ages and earlier periods use Julian dates because that’s what the documents of that era use. If we’d map them to proleptic Gregorian dates, as the specification demands, they would be worse than useless in any kind of automatic search-and-compare system.
Hundreds of years’ worth of historical literature uses Julian dates if the people from the era it discusses did so, and therefore a system that uses proleptic Gregorian dates just doesn’t find any matches.
The current specification acknowledges this problem — somewhat. It says:
For dates before the introduction of the Gregorian calendar, authors are encouraged to not use the time element, or else to be very careful about converting dates and times from the period to the Gregorian calendar
A literal interpretation would have odd consequences. If I wrote about the secret negotiations between Louis XIV and Charles II to destroy the Dutch Republic in the early 1670s, I would be allowed to mark up the dates of Louis XIV’s letters, but not those of Charles II’s. France used the Gregorian calendar back then, but England stuck to the Julian. Such a rule is useless for historians. Besides, it’s just plain weird.
As to marking up Charles II’s letters with Gregorian dates, that’s possible, but it could lead to the same problems we discussed above: the generally accepted date for a letter might be Julian, in which case an automated search for the Gregorian date misfires dramatically.
So I believe this remark is incorrect and should be changed. The specification should clearly and unambiguously state whether or not the
<time> element is fit for historical use instead of trying to find a vague formula that avoids this basic question. (I don’t even understand why this question should be avoided. It’s a simple one, though the consequences of a Yes are pretty complicated.)
If the answer is No the specification should define a cut-off date that is the earliest date the
<time> element (or automatic search systems based on it) accepts as valid. Earlier dates are simply ignored by a compliant implementation. That neatly avoids the bulk of the problems mentioned in this article, and makes sure that any historical use that falls within the constraints of the specification is actually useful.
Therefore, if historical use of the
<time> element is to be disallowed, we MUST (in the sense of RFC 2119) define a cut-off date.
The most obvious candidate for a cut-off date is 1 January 1970, the start of Unix Epoch time. There’s one problem, though: if we cut off the
<time> element there, many people alive today wouldn’t be able to mark up their birth dates.
Therefore I’d like to propose 1 January 1870 instead. Its relation to the start of the Unix Epoch is clear and it allows everybody alive today to mark up their dates of birth.
Besides, there’s some vaguish historical justification for this date. Around 1870 the final phase of European colonial imperialism started, which caused almost the entire world to be divided among the colonial powers. Not coincidentally, this also caused the Gregorian calendar to spread to even the most obscure corners of the world, and it became a true world standard.
The only problem with that cut-off date would be that Russia still used the Julian calendar in 1870 and continued to do so until 1918. Moving the cut-off date to 1918 is possible, but it would mean a few of the very oldest people in the world would not be able to mark up their birth dates.
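To make the cut-off idea concrete, here is a minimal sketch of how a consumer of datetime values might apply it. The function name `accept_datetime` and the constant `CUTOFF` are hypothetical, and the 1870 date is only the proposal made in this entry, not anything the HTML5 specification defines.

```python
from datetime import date
from typing import Optional

# Hypothetical cut-off: 1 January 1870 is the date proposed in this entry.
CUTOFF = date(1870, 1, 1)


def accept_datetime(value: str, cutoff: date = CUTOFF) -> Optional[date]:
    """Return the parsed date, or None if it is malformed or before the cut-off."""
    try:
        parsed = date.fromisoformat(value)  # expects YYYY-MM-DD
    except ValueError:
        return None
    return parsed if parsed >= cutoff else None


print(accept_datetime("1969-07-20"))  # accepted
print(accept_datetime("1099-07-15"))  # ignored: before the cut-off
```

A system built this way never has to decide which calendar a pre-1870 date uses, because it never looks at one.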
If you’re convinced that the current specification of the
<time> element should not be changed to accommodate historians, you can stop reading here. The historical overview that follows is not important to you.
You should just:
<time> element is not meant for historical use.
Thank you for your attention.
If you’re still with me, you’re obviously interested in chronological problems. You’ll get what you want — in spades.
If the HTML5
<time> element is to be made safe for historical use, the specification MUST (again in the sense of RFC 2119) allow
datetime format; most importantly Easter.
Of these six rules, I believe that the first five are universal. Although I will defend them by studying European history exclusively, I think that most other chronologies will be served by them, too. The sixth rule is specific to European history; other civilisations will have other cut-off dates for arithmetic operations.
In order to understand all this we have to review the history of dates. There are two separate problems we have to discuss: the calendar (i.e. the days and months of the year), and the names of the years. The specification treats the first point vaguely, and ignores the second.
The solar year is about 365.2422 days long, which means it cannot be expressed in an integer number of days. As history progressed, calendars became better at dealing with this problem. For the purpose of our discussion, the Julian and Gregorian reforms are the most important.
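The arithmetic behind the two reforms can be sketched in a few lines. This is only an illustration: the leap-year rules are the standard Julian and Gregorian ones, the 365.2422 figure is the one quoted above, and the function names are mine.

```python
from fractions import Fraction

# Length of the solar year as quoted in the text (about 365.2422 days).
TROPICAL_YEAR = Fraction(3652422, 10000)


def is_julian_leap(year: int) -> bool:
    """Julian rule: every fourth year is a leap year."""
    return year % 4 == 0


def is_gregorian_leap(year: int) -> bool:
    """Gregorian rule: century years are leap years only if divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def mean_year_length(is_leap, cycle: int) -> Fraction:
    """Average calendar-year length in days over one full leap cycle."""
    leap_years = sum(1 for y in range(1, cycle + 1) if is_leap(y))
    return Fraction(365 * cycle + leap_years, cycle)


julian_mean = mean_year_length(is_julian_leap, 4)          # 365.25 days
gregorian_mean = mean_year_length(is_gregorian_leap, 400)  # 365.2425 days

# Number of years before each calendar drifts one full day from the solar year.
julian_drift = float(1 / (julian_mean - TROPICAL_YEAR))
gregorian_drift = float(1 / (gregorian_mean - TROPICAL_YEAR))
print(round(julian_drift), round(gregorian_drift))
```

Under these assumptions the Julian calendar slips a day roughly every 128 years, the Gregorian one only every few thousand years.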
Unfortunately, in 1582 the wars of religion raged in Europe, and the Protestants were not really eager to follow the directions of the Antichrist in Rome, especially if he happened to be right. The Orthodox also had pressing (and much older) reasons to demonstratively ignore the papacy.
Therefore all Catholic countries switched to the new calendar within years, but the Protestant and Orthodox countries refused to follow.
Most Protestant countries went over in 1700, when the wars of religion had become a vague memory and the actual difference between the two calendars acute. After all, 1700 was the first year that was a leap year in the Julian calendar but not in the Gregorian one.
Nonetheless, England and Russia continued to use the Julian calendar until 1752 and 1918, respectively. (In fact, one Scottish island stubbornly refused to implement the Gregorian calendar until ten years ago or so. The sheep might get confused.)
To this day, most Orthodox use the Julian calendar (even though the Orthodox states use the Gregorian one, and some Orthodox churches use the so-called Revised Julian calendar). That’s why the Orthodox celebrate Christmas (and sometimes Easter) on different days than the Catholics and Protestants.
Currently the specification decrees the use of so-called proleptic Gregorian dates; that is, the date a day would have had if the Gregorian calendar had already been in use back then.
Although this makes sense in the recent past, we’ll see that this decree becomes more harmful the more we go back into history. Although (proleptic) Gregorian makes perfect sense as a default, it MUST be possible to define another calendar.
Besides, there’s the matter of onus. Who, exactly, is responsible for mapping a Julian date to a proleptic Gregorian one? The HTML author or some kind of universal date-calculating system?
In other words, if I, as an historian, talk about 18th century Julian dates, do I have to map them to the Gregorian calendar myself (possibly by means of software I have to buy and install), or can I just trust an Internet-wide system to do so? Obviously I prefer the second solution because it’s less work for me and will probably introduce fewer errors.
Let’s take a look at a practical example.
On 24 January 1918 Lenin signed a decree that moved the brand-new Soviet Union from the Julian to the Gregorian calendar. In order to bridge the 13-day gap between Julian and Gregorian, 1-13 February 1918 were omitted, so that 31 January was directly followed by 14 February.
Now how are we going to mark up this paragraph? Let’s try the proleptic Gregorian calendar:
<p>On <time datetime="1918-02-06">24 January 1918</time> Lenin signed a decree that moved the brand-new Soviet Union from the Julian to the Gregorian calendar. In order to bridge the 13-day gap between Julian and Gregorian, 1-13 February 1918 were omitted, so that <time datetime="1918-02-13">31 January</time> was directly followed by <time datetime="1918-02-14">14 February</time>.</p>
This remapping might be confusing for human and machine, but using Gregorian dates still makes sense, especially since the text is about Russia introducing the Gregorian calendar.
The 1-13 February 1918 bit is a problem. They’re dates, but they have never existed. I think it’s best not to mark them up at all.
Slightly more confusing is the following:
After the October revolution (25 October 1917) Russia became a communist state.
This date is Julian; we’ll have to map it to Gregorian, but the consequence is that the October revolution takes place in November. Fortunately we’re used to this fact; most history books mention this oddity.
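For readers who want to check such remappings themselves, here is a small sketch that converts a Julian calendar date to its (proleptic) Gregorian equivalent by way of the Julian Day Number. It uses the well-known integer formulas for day numbers; the function names are my own, and a real system would of course need far more validation.

```python
from typing import Tuple


def julian_to_jdn(year: int, month: int, day: int) -> int:
    """Julian Day Number of a date given in the Julian calendar."""
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    return day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083


def jdn_to_gregorian(jdn: int) -> Tuple[int, int, int]:
    """(year, month, day) in the proleptic Gregorian calendar for a day number."""
    a = jdn + 32044
    b = (4 * a + 3) // 146097
    c = a - (146097 * b) // 4
    d = (4 * c + 3) // 1461
    e = c - (1461 * d) // 4
    m = (5 * e + 2) // 153
    day = e - (153 * m + 2) // 5 + 1
    month = m + 3 - 12 * (m // 10)
    year = 100 * b + d - 4800 + m // 10
    return year, month, day


def julian_to_gregorian(year: int, month: int, day: int) -> Tuple[int, int, int]:
    """Map a Julian calendar date to the proleptic Gregorian calendar."""
    return jdn_to_gregorian(julian_to_jdn(year, month, day))


print(julian_to_gregorian(1918, 1, 24))   # Lenin's decree
print(julian_to_gregorian(1917, 10, 25))  # the October revolution
```

The first call yields 6 February 1918 and the second 7 November 1917, which is why the October revolution ends up in November.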
The use of Julian
datetime values becomes mandatory as we enter the Middle Ages. So let’s jump eight hundred years more into the past.
Jerusalem was conquered by the crusaders on 15 July 1099, and a great slaughter was perpetrated among its inhabitants of all races and creeds.
According to the specification we’d either have to use proleptic Gregorian dates or not use a
datetime attribute at all. Since I feel the second option invalidates the entire
<time> element, I’m forced to choose the first one:
<p>Jerusalem was conquered by the crusaders on <time datetime="1099-07-21">15 July 1099</time>, and a great slaughter was perpetrated among its inhabitants of all races and creeds.</p>
The problem is that the proleptic Gregorian 21 July is worthless. Every history book uses 15 July as the date of Jerusalem’s conquest, so an online search by a program that parses
datetime values would misfire dramatically.
More in general, medieval historians use whichever date system the people from that age actually used, and therefore all dates in all books about medieval history are Julian, and not proleptic Gregorian.
Because medieval historians use Julian dates, mapping medieval dates to proleptic Gregorian is going to cause widespread confusion. The machine-readable dates will not match those used in history books and source collections. Thus, the misuse of the proleptic Gregorian calendar will actively hamper historical research instead of aiding it.
In the case of medieval (and earlier) history we MUST use Julian
datetime values. We do have to specify that fact, of course, which means we need an extra attribute, which I’ve dubbed
calendar for the moment:
<p>Jerusalem was conquered by the crusaders on <time datetime="1099-07-15" calendar="Julian">15 July 1099</time>, and a great slaughter was perpetrated among its inhabitants of all races and creeds.</p>
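A search system would have to read that extra attribute along with the datetime value. The sketch below shows one way to do so with Python's standard `html.parser` module. Keep in mind that `calendar` is only the attribute name proposed in this entry, that defaulting to "Gregorian" when it is absent is my assumption, and that `TimeCollector` and `collect_times` are made-up names.

```python
from html.parser import HTMLParser
from typing import Dict, List, Optional


class TimeCollector(HTMLParser):
    """Collects datetime, calendar and visible text of every <time> element."""

    def __init__(self) -> None:
        super().__init__()
        self.found: List[Dict[str, Optional[str]]] = []
        self._current: Optional[Dict[str, Optional[str]]] = None

    def handle_starttag(self, tag, attrs):
        if tag == "time":
            attributes = dict(attrs)
            self._current = {
                "datetime": attributes.get("datetime"),
                # Assumption: no calendar attribute means Gregorian.
                "calendar": attributes.get("calendar", "Gregorian"),
                "text": "",
            }

    def handle_data(self, data):
        if self._current is not None:
            self._current["text"] += data

    def handle_endtag(self, tag):
        if tag == "time" and self._current is not None:
            self.found.append(self._current)
            self._current = None


def collect_times(html: str) -> List[Dict[str, Optional[str]]]:
    parser = TimeCollector()
    parser.feed(html)
    parser.close()
    return parser.found


sample = ('<p>Jerusalem was conquered by the crusaders on '
          '<time datetime="1099-07-15" calendar="Julian">15 July 1099</time>.</p>')
print(collect_times(sample))
```

Because the calendar travels with the date, an index keyed on both values can match the 15 July that every history book uses.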
Easter is an important date; in fact during most of church history it was the most important holiday of the year. It’s not a fixed feast; it is celebrated on the first Sunday after the first full moon that falls on or after the March equinox. (The actual calculation is somewhat more complicated, but this definition will do for now.)
Thus, the definition of Easter depends on the definition of the March equinox. In the Gregorian calendar it’s 21 March, and in the Julian calendar it’s also 21 March, but of course the two 21 Marches are several days apart, and if a full moon occurs in the gap between them, the Julian and Gregorian Easter dates will not match even after the Julian one is mapped to Gregorian. (This rule still holds for the Orthodox church.)
Because it was so hard to calculate this most important date of the year, considerable ingenuity was applied to the job throughout late antiquity and the middle ages. In fact, the very survival of chronological knowledge in the dark ages can be ascribed to the need to calculate Easter.
Every church had its paschal tables, which showed the dates for Easter (as well as chronologically related feasts such as Good Friday and Pentecost). When individual priests or monks started adding extra notes about important events to these tables, chronicles were born.
Concern over the slowly shifting date of Easter was what prompted Gregory XIII to institute his calendar reforms. He wanted to make sure the modern Church celebrated Easter on the dates prescribed by the Council of Nicaea in 325.
Let’s jump four more centuries back and take a look at a practical example.
On Easter 675, a land dispute between Praejectus, bishop of Clermont, and Hector, count of Marseille, was heard before the royal court.
In order to properly mark up
Easter 675 we have to first calculate Julian Easter 675 and then map this date to the proleptic Gregorian one. A full moon may have occurred between Julian and Gregorian 21 March 675, after all.
This calculation is not impossible, but the question is on whom the onus should rest. The author, or some kind of centralised date system? (I, in any case, have not attempted to calculate the precise date.)
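To give an idea of what a centralised system would have to do, here is a sketch of the first half of the job: finding Easter in the Julian calendar. It uses the arithmetic usually attributed to Jean Meeus, which reproduces the traditional Dionysian tables. Whether the Frankish court of 675 followed exactly those tables is an assumption on my part, and `julian_easter` is a made-up name.

```python
from typing import Tuple


def julian_easter(year: int) -> Tuple[int, int]:
    """(month, day) of Easter Sunday in the Julian calendar for a given AD year."""
    a = year % 4
    b = year % 7
    c = year % 19                        # position in the 19-year lunar cycle
    d = (19 * c + 15) % 30               # days from 21 March to the paschal full moon
    e = (2 * a + 4 * b - d + 34) % 7     # days from that full moon to the next Sunday
    month = (d + e + 114) // 31
    day = (d + e + 114) % 31 + 1
    return month, day


print(julian_easter(675))
```

The result is a Julian date; mapping it to the proleptic Gregorian calendar would still be a separate step.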
I feel the onus should be removed from the historian who wants to write about poor Praejectus and his murder and is not interested in HTML5 chronology.
Besides, “Easter 675” is an exact date: both modern historians and people who actually lived in 675 will reach the same result when they calculate it.
The problem is that it’s so very hard to calculate, especially when you insist on the proleptic Gregorian date. And once you’ve found the correct result, it turns out nobody is interested. So let’s save ourselves a tough job and just do this:
<p>On <time datetime="0675-Easter" calendar="Julian">Easter 675</time>, a land dispute between Praejectus, bishop of Clermont, and Hector, count of Marseille, was heard before the royal court.</p>
This is how early medieval chronological reconstructions work:
The murder of bishop Praejectus probably took place on 26 January 676. We know for a fact he was still alive on Easter 675, and his successor as bishop of Clermont is said to have ruled “for fifteen years and a bit,” and to have died in the reign of king Theodoric III.
Since Theodoric III died in April 691, Praejectus’ successor became bishop in early 676 at the latest. Besides, St. Praejectus’ feast is celebrated on 26 January; and it is not unreasonable to assume it took place on the anniversary of his murder.
Early medieval historians are quite happy when they can pin such an exact date on an event; and never mind that the date is Julian.
Now how are we going to mark up all this? There are several problems here:
The first date should be marked up fully. After all it refers to an exact, specific date. The third date would probably have to be marked up by a
<time> element without a datetime attribute.
As to the second date, we MUST use
<time datetime="0691-04" calendar="Julian">, and never mind that that date is incomplete.
The fact that we know the month of Theodoric III’s death makes this date more precise than most dates from that era. Any machine-generated historical timeline tool MUST mention “April 691” as the date of Theodoric’s death, because the fact might be important to chronological research such as determining when Praejectus was martyred.
Let’s go another eight hundred years back and land just in time to see Hannibal victorious against the Romans at Cannae. This historical battle, sources assure us, took place on 2 August 216 BC. We don’t have a prayer of re-mapping this date to a proleptic Gregorian or a Julian one.
The ancient Roman year had 355 days, and in theory every second year ought to have a so-called intercalary month of 22 or 23 days. The problem was that these months were inserted irregularly, and no chronologist ancient or modern has ever taken the trouble to track down the exact use of the intercalary month. (Besides, the sources are just not there.)
This means that we will never know exactly on which proleptic Gregorian date the battle of Cannae took place. The best we can say is that it took place in high summer; probably in July or August.
However, if a source says that a certain event happened on 5 August 216 BC, we can be certain that it took place three days after the battle of Cannae. The Romans saw the use of a reliable chronology and were generally accurate within the constraints of their weird calendar.
Thus, the date of the battle of Cannae should be marked up as:
<time datetime="-216-08-02" calendar="Ancient Roman">2 August 216 BC</time>
With this final example we’ve discussed the need for a
calendar attribute sufficiently.
Apart from the
calendar attribute, this code example contains something else an HTML5 validator would get extremely upset about: the negative year.
The restriction that BC years may not be used is of course totally absurd in a historical context.
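The datetime values used in the examples so far (a negative year, a year and month without a day, and a named feast such as Easter) are easy enough to recognise mechanically. The sketch below is one possible parser for them. None of these forms is valid HTML5; they are the formats proposed in this entry, and `HistoricalDate` and `parse_historical` are hypothetical names.

```python
import re
from typing import NamedTuple, Optional


class HistoricalDate(NamedTuple):
    year: int                      # negative means BC, as in the examples above
    month: Optional[int] = None    # None when the month is unknown or not numeric
    day: Optional[int] = None      # None when the day is unknown
    feast: Optional[str] = None    # e.g. "Easter" for a movable feast


_PATTERN = re.compile(
    r"^(?P<year>-?\d{1,4})"
    r"(?:-(?P<month>\d{2})(?:-(?P<day>\d{2}))?|-(?P<feast>[A-Za-z]+))?$"
)


def parse_historical(value: str) -> HistoricalDate:
    """Parse a year, year-month, year-month-day or year-feast value."""
    match = _PATTERN.match(value)
    if match is None:
        raise ValueError(f"unrecognised datetime value: {value!r}")
    month = int(match["month"]) if match["month"] else None
    day = int(match["day"]) if match["day"] else None
    return HistoricalDate(int(match["year"]), month, day, match["feast"])


for value in ("-216-08-02", "0691-04", "0675-Easter"):
    print(parse_historical(value))
```

Keeping the missing parts as `None` rather than guessing a day preserves the precision the source actually has.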
That brings us to the second problem: the names of the years, and especially the use of different naming systems.
If you’re into ancient chronology, it’s best to see years as having names, not numbers. Essentially our modern numbering of years from the Incarnation provides a common naming system; not a numbering system.
The current year bears the name "2009" and not the number 2009.
Until the waning of the middle ages, all monarchies used naming systems based on the regnal year of the king or emperor. Although we talk about 12 August 1274, a contemporary document would not use the name “1274,” but would instead say “the second year of king Edward” for the English, or “the fourth year of king Philip” for the French.
(Of course historians first have to figure out these documents refer to Edward I and Philip III, and not, for instance, Edward III and Philip VI. Medieval chronology is such fun.)
In addition to this traditional naming system, that has been in use since the dawn of history, the Middle Ages used the naming system of years since the Incarnation that the devoutly Christian historians of the age considered the defining moment of human history.
The Book of Revelation clearly states that the Antichrist would be locked up for a thousand years before being allowed to briefly rule the Earth. The use of the Christian era naming system allowed everybody to get duly upset around the year 1000, and widespread confusion was sown.
The Christian naming system was invented by Dionysius Exiguus, a monk living in Rome. Exactly why he thought that the year that he published his system was the 525th since Christ’s birth is not known, but his counting has been used ever since.
Before Dionysius introduced his reform, people used the old Roman system, in which every year was named after its two consuls.
After the Romans had discarded their monarchy in 509 BC they were forced to stop using regnal years. They needed a new naming system, and they decided to allow their two chief magistrates, the consuls, to give their names to the year.
Thus, “in the consulate of Cn. Pompeius Magnus and M. Licinius Crassus Dives” is a historically valid alternative to “70 BC.” In fact, BC or AD years may be considered a convenient shorthand for the “semantically” more correct consular years.
Although the consuls lost all political power after Augustus founded the Empire in 27 BC, the title was still given out to aristocrats who’d deserved a plum, as well as to the Emperor himself, until the office was abolished in 541 AD. The consuls continued to give their names to the year. (In return they were graciously allowed to squander their fortunes on organising circus games.)
Modern historians have mapped consular years to Christian ones, and have established lookup tables. The ancient historian Dionysius of Halicarnassus has carefully mapped Greek history to Roman consular years. (He may have made mistakes, but if he did we’re not in a position to find out. We must accept his chronology.)
Thus the Roman consular years give us a common naming system (a namespace, if you wish) for 1049 years of Greek, Roman, and early Medieval history. This naming system can be combined with the Christian one to give us a more-or-less reliable chronology going back about 2,500 years.
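The lookup tables mentioned above could be as simple as a dictionary keyed on the naming system and the year's name. The sketch below uses only names that occur in this entry; the structure, the system labels and the function names are my own invention, and negative numbers stand for BC years.

```python
from typing import Dict, Tuple

# (naming system, name of the year) -> Christian year; negative means BC.
YearKey = Tuple[str, str]

LOOKUP: Dict[YearKey, int] = {
    ("consular", "Cn. Pompeius Magnus and M. Licinius Crassus Dives"): -70,
    # Regnal years straddle two Christian years, so a real table would map
    # them to a date range; these entries only cover the 12 August 1274 example.
    ("regnal-england", "the second year of king Edward"): 1274,
    ("regnal-france", "the fourth year of king Philip"): 1274,
}


def register(system: str, name: str, christian_year: int) -> None:
    """Add a year name from any naming system, including a historian's own."""
    LOOKUP[(system, name)] = christian_year


def to_christian_year(system: str, name: str) -> int:
    """Look up the Christian-era year for a named year; KeyError if unknown."""
    return LOOKUP[(system, name)]


print(to_christian_year(
    "consular", "Cn. Pompeius Magnus and M. Licinius Crassus Dives"))
```

The point of keying on the system as well as the name is that the same words can mean different years in different systems.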
The last eight hundred or so of the Roman consular years are universally accepted as historically reliable. In the more uncertain first two hundred years, even the most radical reviser proposes a shift of eight years at most, so across the centuries this naming system remained pretty reliable.
300 BC is the earliest year that we can map with complete accuracy; i.e. we can say with certainty that the consulate of M. Valerius Maximus Corvus for the fifth time and Q. Appuleius Pansa occurred exactly 2308 years before the present time.
For this reason among others, breaking off chronology at 1 AD, as HTML5 proposes, is pure nonsense. We’d miss out on another three hundred years of perfectly good, historically and arithmetically valid chronology.
The year 0 does not exist. The consulate of Cossus Cornelius Lentulus and L. Calpurnius Piso, which we call “1 BC,” was directly followed by the consulate of C. Caesar and L. Aemilius Paullus, which we call “1 AD.”
Emperor Augustus ruled from 27 BC to 14 AD, and that’s a reign of 40 years, not 41.
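The missing year 0 is easy to get wrong in code, so here is a minimal sketch of the subtraction. It assumes years are written as signed integers in which negative means BC and 0 is rejected; `years_elapsed` is a made-up name.

```python
def years_elapsed(start: int, end: int) -> int:
    """Years between two year names; negative means BC, and there is no year 0."""
    if start == 0 or end == 0:
        raise ValueError("the year 0 does not exist")
    span = end - start
    if start < 0 < end:
        span -= 1  # 1 BC is directly followed by 1 AD
    return span


print(years_elapsed(-27, 14))     # Augustus: 40 years, not 41
print(years_elapsed(-300, 2009))  # 2308 years
```

A naive `end - start` would report 41 and 2309, one year too many in both cases.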
While we’re on the subject of Augustus, he was also responsible for finally setting down a regular pattern of one leap year per four years. The first leap year in the new system was the consulate of S. Aelius Catus and C. Sentius Saturninus, and because we happen to call that year “4 AD” we’ve grown used to thinking that a leap year must necessarily be divisible by four.
The Roman consular lists, the fasti consulares, have been preserved in several versions. The standard version is the one created during Augustus’ rule by the historian M. Terentius Varro and set up on the Forum Romanum to serve as a public calendar. It was dutifully updated every year.
Because these fasti were moved to the Capitol after being excavated in the 16th century, they’re known as the Capitoline Fasti.
Historians are pretty certain that some errors have crept into the Capitoline Fasti. The first problematic year is 301 BC, when the Capitoline Fasti say a dictator was appointed instead of consuls. Although this was allowed under the Roman constitution, dictatorships are quite rare, and this particular one is not mentioned in any other source. Therefore, modern historians have concluded, this dictatorship never actually took place.
Thus the consulate of M. Livius Denter and M. Aemilius Paullus, which we call “302 BC,” was directly followed by the consulate of M. Valerius Maximus Corvus for the fifth time and Q. Appuleius Pansa, which we call “300 BC.”
There are several such problematic years in the Capitoline Fasti before 300 BC. Unfortunately historians disagree on some of these cases, and therefore they’ve decided to follow Varro’s system anyway, warts and all. Years from 509 to 301 BC are called “Varronian years,” and about three to eight of them have never existed.
Therefore Christian years before 300 BC are names, and not numbers, and MUST NOT be used for arithmetic operations.
Varronian years are still being used in history books. If we say that “the Greeks defeated the Persian navy in 480 BC at Salamis, and the Persian army in 479 BC at Plataea” we’re using Varronian years.
Historians are pretty certain that these dates are in fact three to eight years off, and that we cannot say that the battle of Salamis took place exactly 2488 years before the present year. It’s more in the order of somewhere between 2480 and 2485, probably.
Nonetheless, it does not make sense to say the battle of Salamis occurred anywhere from 477 to 472 BC. All history books say “480 BC,” and people (as well as chronological search systems) would get confused if we did anything else. We MUST continue to use the Varronian year.
So the example has to be marked up as:
The Greeks defeated the Persian navy in <time datetime="-480" yearNames="Varronian">480 BC</time>, and the Persian army in <time datetime="-479" yearNames="Varronian">479 BC</time>
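A consumer of such markup could use the year-naming system to decide how far it may trust arithmetic. The sketch below returns a range instead of a single number for Varronian years. It simply applies the three-to-eight-year uncertainty mentioned above to every Varronian year before 300 BC, which is a simplification, and all names in it are hypothetical.

```python
from typing import Tuple

VARRONIAN_ERROR = (3, 8)   # years that may never have existed, per the text
FIRST_EXACT_YEAR = -300    # 300 BC: earliest year that maps with full accuracy


def elapsed_range(year: int, year_names: str, present: int) -> Tuple[int, int]:
    """(minimum, maximum) number of years between a named year and the present."""
    nominal = present - year
    if year < 0 < present:
        nominal -= 1  # no year 0
    if year_names == "Varronian" and year < FIRST_EXACT_YEAR:
        low_error, high_error = VARRONIAN_ERROR
        return nominal - high_error, nominal - low_error
    return nominal, nominal


print(elapsed_range(-480, "Varronian", 2009))  # Salamis
print(elapsed_range(-300, "Christian", 2009))
```

The year's name stays "480 BC" throughout; only the arithmetic result is widened into a range.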
More-or-less reliable chronology starts with the consulate of L. Iunius Brutus and L. Tarquinius Collatinus, Varronian year “509 BC.” All dates before 509 BC are educated guesswork at best. As we go further back in time the guesswork increases at the expense of the education.
Sure, you’ll often encounter earlier dates, but these have been painstakingly reconstructed by both ancient and modern historians, and there’s simply no way we can tell whether they’re right or wrong.
That’s the historians’ problem. Nonetheless, a history-compatible implementation of
<time> MUST allow an arbitrary year-naming system to be specified. (The actual mapping of such a system to the consular/Christian system is a problem for historians; not for spec writers. If an historian would use his own system, he’d be responsible for creating lookup tables.)
An example will show why this is necessary.
It is my personal belief that the so-called “First Dark Ages,” traditionally dated from 1200 to 800 BC, have never existed; i.e. 1200 BC was the same year as 800 BC (roughly speaking, of course).
I also feel that earlier chronology is a mess wrongly based on the so-called Thirty Dynasties of Egypt scheme of the historian Manetho, whose work is almost completely lost, and who wrote in a time when Alexander the Great’s rapacious successors were trying to outdo each other in bragging about the venerable antiquity of the people they were exploiting. Egyptian and Mesopotamian chronology thus became a tool in a propaganda war, and it has never recovered.
As a result, I think that the Egyptian and Mesopotamian monarchies developed a few centuries later than is generally assumed.
I also think that Egyptian chronology has serious defects and should be re-thought from the ground up. (Especially the fact that the XXII dynasty was a priestly one concurrent with the last native dynasty, the Persians, and the Ptolemies ought to be recognised.)
Since all other ancient chronologies are based on the Egyptian one, this would have far-reaching consequences.
This opinion is not popular in historical circles; in fact most professional historians of the age will hotly defend the First Dark Ages and their painstakingly created chronologies. That’s fine; since I’m the challenger I have to prove my challenge by doing some research.
The point is that in order to mark up my research in HTML I’d have to create my own year-naming system, while also using the year-naming systems that are currently in use among historians. To make matters more complex, most Ancient Near East chronologies have a high, a middle, and a low variant (all of which are wrong, in my opinion).
In other words, I have to be very careful to specify which year-naming system a particular
<time> element belongs to. I also have to be able to denote years belonging to my own chronological theory. Thus, I MUST be able to invent a value for the attribute I’ve called yearNames.
(Incidentally, this research would greatly benefit from a centralised chronological system I could plug into to automatically convert dates from other systems to my proposed system. I’d have to create lookup tables (or maybe even arithmetic operations), but once I’d done that, I’d be able to move the onus of recalculating centuries of history to an automatic system. Now THAT would be a benefit of a history-safe <time> element.)
This short treatment of ancient chronology highlights only a few of the most important problems, and it doesn’t even try to cover non-European civilisations. More study is clearly necessary.
In conclusion, making
<time> safe for historians is not an easy job, and, as I said at the start, the question is whether it should be attempted at all. I hope to have given you some useful information that will allow you to take a position on this question.