The Challenging Issues of Arabic Software Localization

Localisation is the creation of software applications that adapt to many locales worldwide. A locale is a collection of language-related user preferences specified by pairing a language and a geographical region. For example, English (United States) and English (United Kingdom) are two different locales, as are French (Canada) and French (France). When software products are developed for distribution in several locales, the technical issues affecting software development and the subsequent localisation process must be considered. Localisation is not only a matter of transferring technology to business clients in the target market. It is also about the cultural conventions that computer programmes must handle so that every user receives the same service, independent of the language they speak.
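The pairing of a language and a region can be sketched in a few lines of Python. This is a minimal illustration, not a full language-tag parser: the names Locale and parse_locale are hypothetical, and script subtags or other extensions are simply skipped.

```python
from typing import NamedTuple, Optional


class Locale(NamedTuple):
    language: str          # e.g. "en", "fr", "ar"
    region: Optional[str]  # e.g. "US", "GB", "CA"; None when absent


def parse_locale(tag: str) -> Locale:
    """Split a tag such as "en-US" or "fr_CA" into language and region."""
    parts = tag.replace("_", "-").split("-")
    language = parts[0].lower()
    region = None
    for part in parts[1:]:
        # A region subtag is two letters or three digits; anything else
        # (for example a four-letter script subtag) is skipped.
        if (len(part) == 2 and part.isalpha()) or (len(part) == 3 and part.isdigit()):
            region = part.upper()
            break
    return Locale(language, region)


# Same language, different region: two distinct locales.
print(parse_locale("en-US") != parse_locale("en-GB"))  # prints True
```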

In exploring localisation from a quality assurance perspective, this study sought to analyse and identify the persistent problems affecting the localisation of xCP Documentum for Arabic speakers, in terms of accessibility, readability, and translatability. The goal of this article is to identify some of those issues and propose solutions, in the hope that those in charge of localising applications into Arabic can benefit from these practices and apply them to other projects.

Internationalization
Mirroring

One of the biggest concerns with Arabic localisation is text direction. Arabic is written in a right-to-left (RTL) script and is considered “bi-directional” because numbers and embedded Latin text still run left to right. This has implications for the page layout: table columns, spreadsheets, graphs, cascading menus and user interface elements are normally a mirror image of the content produced in English. Mirroring gives the user interface a proper RTL look and feel.

Fundamental to internationalization is ensuring the product is built on the universal character set, Unicode. This applies not only to the HTML pages that you serve to users, but also to all the backend databases, content management systems, scripts, and so forth. It is also important that users can switch between the product languages easily in the user interface, without any display errors.
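A small Python check illustrates why a Unicode encoding matters at every layer. The helper survives_round_trip is a hypothetical name, and the sample label is invented for the illustration; the point is only that UTF-8 can carry mixed-script text, while a legacy single-script code page cannot.

```python
def survives_round_trip(text: str, encoding: str) -> bool:
    """Return True if text can be stored in `encoding` and read back unchanged."""
    try:
        return text.encode(encoding).decode(encoding) == text
    except UnicodeEncodeError:
        # The encoding has no representation for at least one character.
        return False


# A label mixing Arabic, Latin and Chinese characters.
label = "مرحبا Documentum 你好"

for encoding in ("utf-8", "cp1256", "latin-1"):
    print(encoding, survives_round_trip(label, encoding))
```

Only UTF-8 preserves the whole label. The Arabic code page cp1256 handles the Arabic and Latin parts but not the Chinese characters, and latin-1 fails on the first Arabic letter.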

If working with monetization, you’ll need to consider how to handle users who work with a range of currencies. In addition to deciding how to format and represent monetary data when displayed to the user, you also should consider how to put in place mechanisms to manage diverse currency systems. How will you develop pricing models for different countries, which may have large variations in standard of living? How will you convert subscriptions and payments from one currency to another?

The mirroring process is handled by software engineers and is done at the back-end of the application. Engineers add a dir attribute to the html tag to set the default base direction of the page, alongside a lang attribute that identifies the language. Language tags pair a language code with an optional region code, for example en-GB for British English and fr-FR for French as used in France. Our case study labelled its Arabic locale AR-ar, which does not follow that pattern; the standard language code for Arabic is ar, optionally followed by a region, as in ar-SA or ar-EG. The attribute dir="rtl" sets the default base direction for the whole document, and all block elements in the document inherit this setting unless the direction is explicitly overridden.

In the case study, AR-ar designates text that reads from right to left and that differs from the other locales in script, design, grammar and morphology. It also designates text that must be mirrored to read correctly. This simple addition to the html element takes effect throughout the rendered page. When the engineering part of internationalisation is completed, we follow it by adapting the product content to the language and culture of the market.
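As a rough sketch of how a page template might set these attributes, the Python below derives the base direction from the language tag and writes it onto the html element. The function names and the short list of RTL languages are assumptions made for the illustration; they are not taken from xCP or any particular framework.

```python
from html import escape

# Primary language subtags usually written right to left (not exhaustive).
RTL_LANGUAGES = {"ar", "fa", "he", "ur"}


def base_direction(lang_tag: str) -> str:
    """Return "rtl" or "ltr" from the primary language subtag."""
    primary = lang_tag.replace("_", "-").split("-")[0].lower()
    return "rtl" if primary in RTL_LANGUAGES else "ltr"


def render_page(lang_tag: str, title: str, body_html: str) -> str:
    """Build a page skeleton whose html element carries lang and dir."""
    direction = base_direction(lang_tag)
    return (
        "<!DOCTYPE html>\n"
        f'<html lang="{escape(lang_tag)}" dir="{direction}">\n'
        f'<head><meta charset="utf-8"><title>{escape(title)}</title></head>\n'
        f"<body>{body_html}</body>\n"
        "</html>"
    )


# The second paragraph overrides the inherited direction explicitly.
print(render_page("ar", "الوثائق", '<p>مرحبا</p><p dir="ltr">Documentum xCP</p>'))
```

Every block element inside the body inherits dir="rtl" from the html element, and only the paragraph that sets dir="ltr" itself is laid out left to right.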

User interface

Interfaces with international user populations, such as Microsoft Word, have to be carefully designed to make them easy to adapt to other languages and cultures. The user interface is not limited to the graphical user interface but also includes error messages, logs, and console input and output. Graphics and icons are also affected by text direction. In an Arabic graphical user interface, even the layout of items such as tables and charts is typically mirrored horizontally. Does the feature incorporate components that are not translated for your target markets? How will you handle unsupported markets? What are the ramifications and the fallback mechanisms? If your feature has UI elements that combine to form a sentence, can the UI be reordered? For example, the recurrence dialog in the calendar is problematic and adds complexity to localization because the ordering of the sentence does not make sense in languages other than English.
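One common way to keep sentence order flexible is to store whole sentences with named placeholders, so that each translation can place the parts wherever its grammar requires. The sketch below is illustrative only: the message key, the catalogue layout and the function format_message are hypothetical, and plural agreement in Arabic is deliberately ignored.

```python
# One full sentence per language, with named placeholders.
MESSAGES = {
    "en": {"files_deleted": "{user} deleted {count} files"},
    # The Arabic sentence puts the verb first, then the user, then the count.
    "ar": {"files_deleted": "حذف {user} {count} ملفات"},
}


def format_message(lang: str, key: str, **values) -> str:
    """Fill a message template, falling back to English when needed."""
    catalog = MESSAGES.get(lang, MESSAGES["en"])
    template = catalog.get(key, MESSAGES["en"][key])
    return template.format(**values)


print(format_message("en", "files_deleted", user="Ahmed", count=3))
print(format_message("ar", "files_deleted", user="أحمد", count=3))
```

Because the code never concatenates fragments such as the user name and the verb, the translator controls the word order rather than the developer.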

Sensitive Graphics

Sensitive graphics present another challenge with regard to mirroring. Some sensitive graphics can take on a different meaning when mirrored. For example, within an LTR layout in a browser, an arrow that points to the left represents the concept of going back to the previous page; an arrow that points to the right signifies going forward to the next page. When these arrows are mirrored for an RTL layout, the meaning will be just the opposite. The direction of writing affects the way information should be presented and placed.
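One possible way to manage this is to name icons by what they mean rather than by where they point, and to flip only those whose meaning depends on the reading direction. The sketch below assumes the icon assets are drawn for an LTR layout; the icon names and the choice of which icons count as directional are hypothetical and would need to be reviewed for each product.

```python
# Icons named by meaning. Only these depend on the reading direction.
DIRECTIONAL_ICONS = {"back", "forward", "previous", "next"}


def icon_css_transform(icon: str, direction: str) -> str:
    """Return the CSS transform to apply to an icon drawn for LTR."""
    if direction == "rtl" and icon in DIRECTIONAL_ICONS:
        return "scaleX(-1)"  # flip horizontally so "back" still means back
    return "none"            # logos, clocks and similar graphics stay as drawn


for icon in ("back", "forward", "logo"):
    print(icon, icon_css_transform(icon, "rtl"))
```

With this approach the "back" asset, drawn pointing left, is flipped to point right in an Arabic layout, while graphics that carry no directional meaning are left untouched.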

Cultural Orientation

One of the advantages of graphics is that they can communicate more universally than text, provided they are not culture-specific. Graphical UI components of the software product might need to be revised for the international audience. Use images that are not geared toward a particular culture or locale, and avoid including text within graphics. These practices will help minimize the localization of graphics, which is very expensive.

Bidirectional

Arabic mixes right-to-left and left-to-right text on the same line, and it is important to be able to control the direction of the surrounding context for that to work properly. It’s also important to handle data strings in a way that preserves information about their base direction, so that when they are used on the user interface, they don’t look mangled.
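A minimal Python sketch of both ideas follows: it estimates the base direction of a data string from its first strongly directional character, then wraps the string in Unicode isolate characters before inserting it into a sentence. The function names are hypothetical, and the first-strong rule is a heuristic; strings whose direction is already known from metadata should use that instead.

```python
import unicodedata

LRI = "\u2066"  # LEFT-TO-RIGHT ISOLATE
RLI = "\u2067"  # RIGHT-TO-LEFT ISOLATE
FSI = "\u2068"  # FIRST STRONG ISOLATE
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE


def first_strong_direction(text: str) -> str:
    """Return "ltr", "rtl" or "neutral" from the first strong character."""
    for char in text:
        bidi_class = unicodedata.bidirectional(char)
        if bidi_class == "L":
            return "ltr"
        if bidi_class in ("R", "AL"):  # Hebrew-type and Arabic-type letters
            return "rtl"
    return "neutral"  # digits and punctuation only


def isolate(text: str) -> str:
    """Wrap text so its direction does not leak into the surrounding line."""
    opener = {"ltr": LRI, "rtl": RLI, "neutral": FSI}[first_strong_direction(text)]
    return f"{opener}{text}{PDI}"


# An English file name inserted into an Arabic sentence.
file_name = "report_2023.pdf"
message = "تم رفع الملف " + isolate(file_name) + " بنجاح"
print(first_strong_direction(file_name), first_strong_direction(message))
```

The isolate characters are invisible, but they tell the rendering engine to lay out the file name as a left-to-right unit inside the right-to-left sentence.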

The Market

From my long experience in the language industry, I have come across three types of software localisation product. I would like to show some examples in the graphs below:

Language scope

The third task included in the localisation of a product is cultural adaptation. Countries have different cultural ties, which is why the product advertised must respect those conventions. A concrete example of this can be found in the Arab world, where Arabic-speaking countries use Modern Standard Arabic (MSA) as the language of the media and spoken forms of Arabic for daily conversation. There are many varieties of Arabic (https://en.wikipedia.org/wiki/Varieties_of_Arabic), and a term can mean different things according to the user’s location. In Egypt, for example, speakers say /kazouza/ for a soft drink. In Morocco, the term used is /lmonada/, which was borrowed from the French word limonade and adapted to Moroccan pronunciation.
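Regional vocabulary of this kind can be handled with a simple fallback chain: look for the most specific locale first, then fall back to the generic Arabic entry. The sketch below is illustrative; the term table, the function names and the MSA default entry are assumptions, and the regional entries reuse the transliterations given above.

```python
# Term table keyed by locale. The "ar" entry is an illustrative MSA default.
TERMS = {
    "soft_drink": {
        "ar": "مشروب غازي",
        "ar-EG": "kazouza",   # Egyptian usage, transliterated
        "ar-MA": "lmonada",   # Moroccan usage, transliterated
    }
}


def fallback_chain(locale: str) -> list:
    """Return the locale followed by its less specific parents."""
    parts = locale.split("-")
    return ["-".join(parts[:i]) for i in range(len(parts), 0, -1)]


def lookup(term: str, locale: str) -> str:
    """Find the most specific variant of a term available for a locale."""
    variants = TERMS[term]
    for candidate in fallback_chain(locale):
        if candidate in variants:
            return variants[candidate]
    raise KeyError(f"no entry for {term!r} in locale {locale!r}")


print(fallback_chain("ar-EG"))        # prints ['ar-EG', 'ar']
print(lookup("soft_drink", "ar-MA"))  # prints lmonada
```

A locale without its own entry, such as ar-SA in this table, receives the generic Arabic term.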

Content Customisation

The translation was also inconsistent, and I had to go through the application to check it and come up with better translations according to the context. Based on the elements mentioned above, I have suggested the following solutions:

Text translation is the most important aspect of localizing the software UI. The translation should communicate in language, phrasing, and vocabulary that is natural as well as accurate. Moreover, the language should be clear, familiar to the user, and appropriate for the culture of the target country. Additionally, it should follow conventions used in the target country such as customary symbols, punctuation, formatting, and typography. Title bars, menus, dialog boxes, buttons, messages, and tool tips are some common UI controls that have text. Help files and user manuals typically require heavy translation efforts.

There are two ways to carry out this process. If the application content is not massive, the company can hire a team of linguists to help with the content translation. The linguists can translate the content directly inside the HTML or XML files without moving the text outside the application. Many tools support this; OxygenXML, for example, lets you edit the text inside the editor. This process is not easy in itself because it requires the linguists to have a good knowledge of the software industry. They should know how to move between the code editor and the visual editor, how to run regression test scenarios and how to write an SQL script. They should also know where the application segments sit in the system.

The second option is to extract the content from the scripts, add it to an Excel spreadsheet with separate text columns, one for the source text and another for the target text, and then ask a language vendor to take charge of the translation. There are some drawbacks to this second option. The language vendor is usually a consultancy, a business. It relies on independent linguists who, when they work on an isolated segment of text, tend to neglect its context. Translation errors and a loss of consistency are pitfalls that some language vendors fall into.
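The extraction step can be sketched in Python with the standard csv module, which produces a file that Excel can open. The resource keys, the sample strings and the function export_for_translation are hypothetical. The extra context column is one possible way to reduce the loss of context described above; it is a suggestion rather than part of the original workflow.

```python
import csv
import io

# Hypothetical UI strings: key -> (source text, note for the translator).
UI_STRINGS = {
    "login.title": ("Sign in", "Title of the login dialog"),
    "login.submit": ("Sign in", "Button that submits the login form"),
    "records.delete.confirm": ("Delete this record?", "Confirmation message before deletion"),
}


def export_for_translation(strings: dict) -> str:
    """Return CSV text with source, empty target and context columns."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["key", "source_text", "target_text", "context"])
    for key, (source, context) in sorted(strings.items()):
        # The target column is left empty for the language vendor to fill in.
        writer.writerow([key, source, "", context])
    return buffer.getvalue()


print(export_for_translation(UI_STRINGS))
```

Keeping the key in the first column lets the translated text be imported back into the right place, and the context note tells the linguist whether "Sign in" is a title or a button.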

The localisation of a software application is a mandatory task for any business seeking to expand into other markets. In some countries it is also a legal obligation. For example, the Toubon Law in France and Bill 96 of the National Assembly of Québec in Canada require user interfaces to be translated. In Europe, in order to respect the legal conventions, any company seeking to commercialise its product outside its national borders is obliged to translate and localise the content of its products, including but not limited to the product description, the product labels, the packaging and the website displaying the product. This is usually done for the product reference.

Indeed, localization does entail a vast amount of translation, including all of a product’s text, menus, dialog boxes, buttons, wizards, online Help, printed documentation, packaging, and CD labels. Localization also needs to adapt the product to the particular locale in which it will be used.

Arabic requires right-to-left (RTL) layout of not only the text, but of the whole user interface (UI), including buttons, menus, and dialog boxes. (For more information on RTL layout, see Mirroring.) Customers are more likely to buy services and products if the information on commercial websites is offered in their language.

All those tasks must be completed before the product is commercialised and advertised. In Europe, for example, consumers cannot purchase a product that is not presented in the local language. This is usually done for the safety of the consumer and for the legal protection of the business. If consumers buy a product whose content is written in another language, they may mistakenly buy a product that is not suitable for them, and any consequences resulting from the wrong usage of the product will fall on the business that manufactures, sells and advertises the product. Businesses prefer to avoid legal disputes because they damage their reputation in the market. There is a larger lesson here: translation without sufficient context can lead to errors.

Arabic Layout

Limitations

This work focuses on the language handling that can be seen in the graphical user interface, primarily for the Chinese and Arabic scripts. It does not cover translation techniques or how translation should be handled. Neither does this thesis cover fonts, even though the availability of fonts is one of the most critical aspects of displaying or printing text in a complex script. (Without adequate font support, a fully functional complex-script-capable application is completely worthless.) But font issues are such a large and time-consuming topic that they cannot be handled within the time frame of this work, and are therefore considered out of scope. For the same reasons, this thesis is limited to a “script-technical focus”, not a “cultural focus” (even though “culture” is mentioned in the thesis). Nor does it discuss the distinction between script and typography, since typography is very culture-dependent.

Alignment and Formatting

Written content should be aligned so that text displays and wraps properly at the end of each line. If not, the order of the words and the meaning of the sentence will be incorrect. Fortunately, most applications have tools that can change the writing direction of the text. For example, markup languages like HTML provide a dir attribute that you can add to your markup to specify the base direction of text (LTR or RTL). Based on the elements mentioned above and the design of the EMC Europe application (see example below), I have come to the following conclusions. The application should be redesigned completely because, from a user perspective, it looks faulty. Moreover, because the layout does not display correctly, a user of the EMC application cannot tell which attribute is a “Parent” attribute and which is a “Child” attribute. I am attaching an example to clarify the case.

Confirmation Messages /Error Messages

During my work on both the xCP and Record Manager applications, I found that error and confirmation messages were displayed corrupted and were therefore incorrect. I suggested a solution for them that worked efficiently.

Although the software was converted to Arabic, both technical and culture-oriented problems persisted, including a misleading user interface. Different studies have been conducted on the topic of localization quality. Lobanov [27] believes that localization quality means that the localized product should reflect the original in terms of language, idea, cultural nuances, and accessibility.

Conclusion and Future Research

We would argue that although great efforts have been exerted in localizing the content, a wide array of problems persists. Firstly, many sections of the website are only partially translated, as some English words and phrases are kept within the translated Arabic text. Secondly, the graphics in the English version are kept in the Arabic version to assist the verbal component represented by the items’ descriptions. Sometimes, the Arabic description is found to be vague and even misleading. Finally, diverse inconsistencies with respect to grammar, style, and technicality are observed. Such findings support Lobanov’s conclusion that localization quality should reflect the original in terms of language, idea, cultural nuances, and accessibility.

Arabic Localization Challenges

REFERENCES

https://www.w3.org/International/i18n-drafts/nav/about

https://learn.microsoft.com/en-us/windows/win32/intl/understanding-internationalization

chakir.mahjoubi https://lexsense.net

Knowledge engineer with expertise in natural language processing. Chakir's work experience spans language corpus creation, software localisation, data lineage, patent translation, glossary creation and statistical analysis of experimentally obtained results.