1 June 2017 saw the publication of Unicode v.10.0. This now-annual update of the Unicode standard is publicly best known for the encoding of new emojis, allowing them to be reproduced correctly on different platforms and on different computers, and making them useable in email, in databases, and across the internet. Of course, the encoding is a form of standardisation, but the end-user still needs to have an appropriate font to display them, meaning that there is always a lag between encoding and actual usage.
However, Unicode is, and always has been, about a lot more than just emojis. The overwhelming majority of the characters encoded in every new version are not emojis, but belong to other real scripts. V.10.0, among other things, encodes Japanese hentaigana for the first time (across two code charts). Hentaigana is regarded as a dead script, but it has never quite died.
Modern Japanese writing uses Chinese characters and the phonetic ‘hiragana’ syllabary. Hiragana typically uses only one spelling per syllable, meaning that in hiragana the syllable no is only written の, ba is only written ば, ma is only written ま, etc.
This was not always so. Until the end of the nineteenth century, hiragana was much more complicated. Throughout Japan’s ancient and medieval history, every syllable could be written with any of a selection of forms. For example, ma could be written with ま or with any of several alternative forms. This larger historical repertoire of symbols is known nowadays as hentaigana.
From 1900 a reduced, simpler set of hiragana became official and standard, and these are used today. From then on, ma could only be written ま. Movable type printing and the development of the Japanese typewriter – a complicated machine that nevertheless required a finite number of symbols – further pushed hentaigana towards extinction. For example, modern printed editions of the medieval classics of Japanese literature, such as the Tale of Genji, only use modern hiragana, even though the original manuscripts used hentaigana.
So, hentaigana is an obsolete writing system. Surely, then, the motivation for encoding it in Unicode is purely historical? To some extent, yes. However, hentaigana has never entirely vanished from the linguistic landscape of Japan. Although it is now seen mostly in hand-written calligraphy that is largely unintelligible to modern Japanese readers, and is used more for aesthetic than practical purposes, there are some exceptions.
Wandering around a Japanese town, you’ll occasionally notice food establishments, particularly those that serve simple traditional food such as soba or udon noodles, with signs that include hentaigana, e.g. the ba of soba, or the do of udon. Exceptionally, they will even daringly write both syllables of soba in hentaigana, but usually just the one to ensure legibility. So, for example, in the black script on the white sign in the photograph below, the do of udon is written in hentaigana, but a Japanese reader should still see that the word begins with u (う), ends in n (ん), and has one syllable in between: easy enough to work out that it says udon, even if the reader does not know the hentaigana in the middle.
This use of hentaigana in signs that are intended still to be legible in Japan’s signscape is limited to a few words like soba/udon, revealing how peripheral hentaigana has become.
However, movable-type printing presses and typewriters are now things of the past, and the restrictions that both imposed upon the number of written symbols available in Japanese writing have vanished. The digital world now allows anything to be typed, so long as it has been encoded and is included in a font. The encoding of hentaigana now opens the door to its future use on the web or even in texts. Already everything that has been encoded can be used – and is used – in SMS, as in the phenomenon known as gyarumoji (‘girl writing’), in which words are input with unusual characters, using Roman, Cyrillic or mathematical symbols simply because they visually vaguely resemble the correct Japanese forms, in part to hide the message from parents or teachers: a more sophisticated version of the ‘l8r’ or ‘2day’ of English. As soon as fonts catch up with Unicode version 10.0, will people start using hentaigana in text messages in the same way? It’s an interesting thought.
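For readers who want to see what "encoded" means in practice, here is a minimal sketch in Python. It assumes a Python runtime whose bundled Unicode database is at version 10.0 or later, and it scans the two blocks that hold the hentaigana (Kana Supplement and Kana Extended-A) for characters whose official name begins with HENTAIGANA LETTER. The function names `find_hentaigana` and `variants_for` are my own, chosen purely for illustration. Note that this only shows the characters are known to the software; whether they display still depends on having a suitable font.

```python
import unicodedata

# Kana Supplement (U+1B000-U+1B0FF) and Kana Extended-A (U+1B100-U+1B12F).
SCAN_START = 0x1B000
SCAN_END = 0x1B12F


def find_hentaigana(start=SCAN_START, end=SCAN_END):
    """Return (code point, name) pairs for characters named as hentaigana.

    Illustrative helper: code points that the runtime's Unicode database
    does not know are skipped, because their name lookup returns "".
    """
    found = []
    for cp in range(start, end + 1):
        name = unicodedata.name(chr(cp), "")
        if name.startswith("HENTAIGANA LETTER"):
            found.append((cp, name))
    return found


def variants_for(syllable, table=None):
    """Return the encoded variants whose name is '<SYLLABLE>-<number>'.

    Illustrative helper: the last word of each name looks like 'MA-1',
    so everything before the final hyphen is treated as the syllable label.
    """
    if table is None:
        table = find_hentaigana()
    wanted = syllable.upper()
    return [
        (cp, name)
        for cp, name in table
        if name.split()[-1].rsplit("-", 1)[0] == wanted
    ]


if __name__ == "__main__":
    print("Unicode database version:", unicodedata.unidata_version)
    print("Hentaigana characters found:", len(find_hentaigana()))
    for cp, name in variants_for("ma"):
        # Print code points and names only; showing the glyphs needs a font.
        print(f"U+{cp:05X}  {name}")
```

Run as a script, this lists the encoded alternatives for ma alongside the standard ま. If the runtime predates Unicode 10.0, the list simply comes back empty.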
I said above that hentaigana was essentially obsolete from 1900. This certainly applies to printed Japanese, but handwriting took a while to catch up. Hentaigana still appeared in hand-written Japanese before World War 2, largely in the writing of the older generations. But through the twentieth century it retained an odd status. People born in the nineteenth century still had personal names that could be written in hiragana, which meant that there were people with names officially registered before 1900 using hentaigana. So these nineteenth-century names remained officially recognised with hentaigana spelling into the twentieth century. In fact, because of this and despite the 1900 reform, the Ministry of Justice still allowed the registration of babies’ names in hentaigana right through to 1948, when hentaigana was disallowed in future registrations. Now, the Ministry of Justice wants to digitise all its name registration records. In order to do so, hentaigana needs to be encoded and included in fonts, and it is this that has been a major driver of the encoding of hentaigana in Unicode.
Only time will tell how this will affect the hitherto ethereal existence of hentaigana, but one can’t help but wonder whether this is the start of a renaissance for an officially obsolete script that never entirely vanished.
Forthcoming: ‘Diversity in the Japanese Writing System(s): a schematic diachronic analysis’, in Mark Irwin & Matthew Zisk (eds.) Japanese Sociohistorical Linguistics.
2008: ‘Nonconventional script choice in Japan’, International Journal of the Sociology of Language 192: 133-151.