In the past couple of months, we have been discussing the benefits of localizing your content into other languages. The logical reaction would be to start localizing your website and software into all the languages immediately.
Depending on the languages you are localizing into, you will run into various obstacles, but there is a very specific issue which you will have to handle if you want to localize your content into, for example, Arabic. This issue is the script used by these languages.
Today, we will focus on discussing challenges coming from localizing your content into right-to-left languages.
You might think: Russian also uses a different script than my English website, and I managed that just fine. We’d agree and disagree at the same time. Why? Scripts differ not only in how they look, but also in the direction in which they flow. Most scripts flow from left to right (so-called LTR scripts), but some flow in other directions, such as right to left (RTL) or top to bottom.
Since the East Asian scripts that are traditionally written top-to-bottom can also be written left-to-right (Chinese hengpai and Japanese yokogaki), we will focus only on the right-to-left direction issue.
First, let’s start with a list of RTL scripts. The most commonly listed RTL scripts are:
This is where things get interesting. First of all, in practice, the most widely used RTL scripts, such as Arabic and Hebrew, are bi-directional. In other words, text in these scripts can run both right-to-left and left-to-right within the same sentence.
To display this correctly, bi-directional (bidi) support is necessary, as provided by Unicode. Unicode prescribes a bidirectional algorithm that assigns every character to one of four types: strong, weak, neutral, and explicit formatting.
Two of these types are not of interest for this discussion: neutral characters are mostly punctuation, spaces, and symbols, while explicit formatting characters are invisible control characters. The strong and weak characters are necessary to discuss though, to understand the bi-directional approach used in these scripts.
Strong characters have a definite direction of their own; these are mostly the letters of a script, e.g. alphabetic characters. Weak characters include European digits, arithmetic symbols, and currency symbols.
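To make the four types more tangible, here is a minimal Python sketch that looks up a character's bidi class with the standard library's unicodedata.bidirectional() and sorts it into one of the four types. The grouping of class codes follows the Unicode Bidirectional Algorithm; the function name bidi_type and the set names are our own labels for illustration, not part of any library.

```python
import unicodedata

# Bidi class codes as returned by unicodedata.bidirectional(), grouped into
# the four broad types used by the Unicode Bidirectional Algorithm.
# The set names are labels chosen for this sketch only.
STRONG = {"L", "R", "AL"}
WEAK = {"EN", "ES", "ET", "AN", "CS", "NSM", "BN"}
NEUTRAL = {"B", "S", "WS", "ON"}
EXPLICIT = {"LRE", "LRO", "RLE", "RLO", "PDF", "LRI", "RLI", "FSI", "PDI"}


def bidi_type(ch: str) -> str:
    """Return 'strong', 'weak', 'neutral', 'explicit', or 'unknown' for one character."""
    code = unicodedata.bidirectional(ch)
    if code in STRONG:
        return "strong"
    if code in WEAK:
        return "weak"
    if code in NEUTRAL:
        return "neutral"
    if code in EXPLICIT:
        return "explicit"
    # Unassigned code points yield an empty string and end up here.
    return "unknown"
```

For example, bidi_type("a") and bidi_type("\u0639") (the Arabic letter ain) both report a strong character, while bidi_type("7") and bidi_type("$") report weak ones.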
What would a sentence in Arabic look like, taking into account the bi-directional approach?
.setisbew 1000 era erehT
In case you cannot read it, the above sentence reads “There are 1000 websites.” As you can see, the text would run RTL, while the number would still be presented LTR.
As if this were not enough, Azeri (Azerbaijani) is written in the Latin, Cyrillic, or Arabic script. When written in the Latin or Cyrillic script, Azeri is written left-to-right (LTR). When written in the Arabic script, it is written right-to-left.
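One way to handle a case like Azeri in software is to derive the direction from the script rather than from the language. The sketch below assumes your locales are given as tags such as az-Latn or az-Arab, with an optional four-letter ISO 15924 script code. The function name text_direction and the small DEFAULT_SCRIPT table are hypothetical and deliberately incomplete; a real project would take this data from its localization library.

```python
# ISO 15924 codes of some right-to-left scripts (not an exhaustive list).
RTL_SCRIPTS = {"Arab", "Hebr", "Syrc", "Thaa", "Nkoo"}

# Hypothetical fallback table, used when a tag carries no script code.
DEFAULT_SCRIPT = {
    "ar": "Arab",
    "he": "Hebr",
    "az": "Latn",
    "en": "Latn",
    "ru": "Cyrl",
}


def text_direction(locale_tag: str) -> str:
    """Return 'rtl' or 'ltr' for a tag such as 'az-Arab', 'az_Latn_AZ', or 'ar'."""
    parts = locale_tag.replace("_", "-").split("-")
    language = parts[0].lower()
    # A script subtag is exactly four letters, e.g. 'Latn', 'Cyrl', 'Arab'.
    script = next(
        (p.title() for p in parts[1:] if len(p) == 4 and p.isalpha()),
        None,
    )
    if script is None:
        # Assumption of this sketch: unknown languages default to Latin script.
        script = DEFAULT_SCRIPT.get(language, "Latn")
    return "rtl" if script in RTL_SCRIPTS else "ltr"
```

With this approach, text_direction("az-Latn") and text_direction("az-Cyrl") give "ltr", while text_direction("az-Arab") gives "rtl", even though the language is the same in all three cases.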
What is the final implication of script direction for your content? All non-textual content (like images) also needs to be adapted to match the RTL direction, if you are localizing into a language using such a script.
The easiest way to describe this would be using before/after images for a product. On a website using an LTR script, the before image would be on the left, and the after image would be on the right. However, on a website using an RTL script, the images would have to be reversed, with the before image being on the right. At this point, you might be scratching your head.
What does this all mean?
1. Languages don’t have a direction, scripts do. Keep Azeri in mind when trying to distinguish this. Depending on the script used, it can be written LTR or RTL. If a language can use more than one script and/or script direction, you need to decide which one you will use, depending on your target audience.
2. Scripts which are traditionally written top-to-bottom can mostly be displayed left-to-right. This simplifies localization into languages such as Chinese and Japanese.
3. Right-to-left scripts are often actually bi-directional. With bidi support, you will be able to display numbers left-to-right, keeping the rest of the content right-to-left.
4. The whole layout needs to be adapted to the RTL script (like images and website menus).
5. If you are localizing your website, don’t force the text direction. If you force left-to-right display, the website will be displayed incorrectly in, for example, Arabic.
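The last point can be sketched in a few lines of Python. Instead of hard-coding left-to-right in a page template, the direction is passed in as a parameter and written into the standard HTML lang and dir attributes. The function name html_open_tag is hypothetical, and how you determine the direction for each target locale is up to your own setup.

```python
from html import escape


def html_open_tag(lang: str, direction: str) -> str:
    """Build the opening <html> tag with the direction set per locale."""
    # Only the two explicit directions are handled in this sketch.
    if direction not in ("ltr", "rtl"):
        raise ValueError(f"unsupported direction: {direction!r}")
    # escape() guards against stray quotes in the language tag.
    return f'<html lang="{escape(lang, quote=True)}" dir="{direction}">'
```

Calling html_open_tag("ar", "rtl") produces a tag with dir="rtl", so the browser can lay out the Arabic version of the page from the right, while the English version keeps dir="ltr".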
Are right-to-left languages easy to localize into?
In all honesty, this all sounds more complicated than it actually is. By using a system which supports RTL and bi-directional display, such as our Translation Management System, you can strike off a couple of worries from the very start.
You can also make use of our native speakers, who will point out any irregularities, so the whole process will be much easier than you think.
Want to test it yourself? You can do it here, for free!
Read more about RTL languages and non-Latin typographies here.