Community-powered software localization

Published on January 16th 2011.

It has been almost 2 years since the community-based localization functions first appeared in RealWorld Paint.COM. Other RW applications followed later. Was it a success? What went right and what went wrong? Let's find out right now.

I want this blog entry to be a source of information for other software developers who are considering adding community-driven translations to their software. Before I dive into the technical details, I want to say:

A big thank you

to all who have participated in the localization effort so far.

Why bother with localization?

English is the native language of about 1/3 of internet citizens. A good portion of the remaining 2/3 are able to use English applications. Still, the percentage of people preferring to use their native language is increasing as more people get access to the internet. It is a very good feeling to be able to switch an application to your own language, especially if it is not a major language.

Supporting multiple languages sends a positive message. A message that you care.

Why have I chosen community-driven localization?

My applications were available in English and Czech (my native language) for a couple of years. People were asking for other languages and some of them even offered help with the translation. Sadly, I had to turn them down, because the process would have been too difficult.

Localization is easy when I do all the work myself, but delegating the translation to someone else requires a completely different approach.

Community-powered localization was the best way, and probably the only way, for me as an author of freeware and specialized shareware tools with a limited budget (and the budget is always limited, because of the number of languages).

How does it work?

User perspective: click on a language link.

Translator perspective: use a window inside the application to translate, then click a button to save and share the work. A video guide is available.

Developer perspective: keep the web server running and consistently use the localization library in the application.

Difficulties of localization

The task of localization appeared simple at first sight, but it proved quite difficult. The following tasks need to be performed:

  1. Assembling a list of reference English expressions.
  2. Delivering the list to translators.
  3. Delivering the list of translated expressions to end users.

Let's look at the individual steps in more detail.

List of reference expressions

Before choosing a solution, I had to consider the following:

  • There are various localizable elements - menu items, tool tips, window elements, default names.
  • There are user-created elements that should be localizable, because they can be shared.
  • The application consists of multiple dynamic libraries. Some expressions are duplicated in multiple libraries.
  • I have multiple applications sharing some of the expressions (File/Open/Save/...).
  • Applications evolve over time. There will be newer versions of the same app.
  • Plug-ins can be made by me or by third parties - they must be localizable.

The list of reference English expressions changes every time one of the above points changes.

There was one additional condition: I wanted to spend zero time on maintenance of the list of reference expressions.

rsrc/language-options.png image

I had to reject the concept of a static (per version, per application) list. Instead, I chose the opposite - a list that is continuously generated by the application itself. Whenever the localization subsystem is asked to translate an expression, the expression is added to a local table of expressions.
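The idea of a self-assembling list can be sketched in a few lines. This is only an illustration, not the actual RealWorld code: the `Localizer` class and its method names are hypothetical.

```python
class Localizer:
    """Hypothetical lookup that records every expression it is asked for."""

    def __init__(self, translations=None):
        # English expression -> translation for the active language.
        self.translations = dict(translations or {})
        # Every English expression seen so far, in order of first appearance.
        self.seen = {}

    def translate(self, english):
        # Recording is a side effect of the lookup, so the reference list
        # needs no manual maintenance.
        self.seen.setdefault(english, None)
        # Fall back to the English text when no translation exists yet.
        return self.translations.get(english, english)

    def reference_expressions(self):
        return list(self.seen)


loc = Localizer({"Open": "Otevřít"})
loc.translate("Open")
loc.translate("Save")
loc.translate("Open")
print(loc.reference_expressions())
```

Because every caller (menus, dialogs, plug-ins) goes through the same function, expressions from any library or plug-in end up in the same table without extra work.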

Delivering reference expressions to translators

As already said above, the list of expressions to translate is assembled dynamically as the translating person uses the application.

When people opt in to participate in the translation effort, the application starts to build the list and also monitors which expressions are used most often. The list of expressions is sorted by frequency of use, so the translator starts with the most frequently used expressions.
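A minimal sketch of the frequency ordering, assuming a simple in-memory counter (the names `record_use` and `translation_queue` are made up for this example):

```python
from collections import Counter

usage = Counter()


def record_use(english):
    # Called each time an expression is looked up while the app is running.
    usage[english] += 1


def translation_queue(translated):
    """Untranslated expressions, most frequently used first."""
    pending = [(expr, n) for expr, n in usage.items() if expr not in translated]
    # Descending by count; ties are broken alphabetically for a stable order.
    pending.sort(key=lambda item: (-item[1], item[0]))
    return [expr for expr, _ in pending]


for expr in ["File", "Open", "File", "Save", "File", "Open", "Gradient"]:
    record_use(expr)

queue = translation_queue(translated={"Save"})
print(queue)
```

With this ordering, a translator who only finishes the top of the queue still covers the parts of the interface that users see most.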

I also decided to allow translations to be specified directly inside the application, though this is just an implementation detail.

There were additional challenges. What if multiple translators worked on the same language concurrently? There was a clear need to synchronize the process.

I decided to add a "Synchronize" function. The user hits a button and the application sends all the translated expressions and also all the encountered but not yet translated expressions to a web server. The server then merges the new data with the old expressions and responds with the latest table of both translated and untranslated expressions.

In case of conflict, access rights are used to decide whose expression takes precedence.
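The merge step on the server might look roughly like the sketch below. The post does not describe the actual rights model, so the numeric `rank` (higher wins, ties go to the newer upload) is purely an assumption, as are the `Entry` and `merge` names.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    text: str  # translated text, or "" when not yet translated
    rank: int  # access level of the author; higher wins (assumed scheme)


def merge(server, upload):
    """Merge a client's upload into the server table and return the result."""
    merged = dict(server)
    for english, incoming in upload.items():
        current = merged.get(english)
        if current is None or not current.text:
            # New expression, or the server has no translation for it yet.
            merged[english] = incoming
        elif incoming.text and incoming.rank >= current.rank:
            # Conflict: the contributor with equal or higher rights wins.
            merged[english] = incoming
    return merged


server = {"Open": Entry("Öffnen", 2), "Save": Entry("", 0)}
upload = {
    "Open": Entry("Aufmachen", 1),  # conflicts with a higher-ranked entry
    "Save": Entry("Speichern", 1),  # fills a gap
    "Undo": Entry("", 0),           # encountered but not yet translated
}
merged = merge(server, upload)
print({english: entry.text for english, entry in merged.items()})
```

The merged table is what the server would send back, so the client ends up with both the other translators' work and the full list of still-untranslated expressions.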

Multiple users may work on the same language if they click on the "Synchronize" button often enough.

Delivering translated expressions to end users

This was a critical part. The language switch must be in a visible place - I put it right on the initial screen of the application.

rsrc/languages-panel.png image

When the user clicks on a language, the application not only switches to that language, but also downloads the latest language pack.

What went right?

After nearly 2 years, the application is at least partially available in all major languages and in some minor ones. Although not all expressions have been translated, the application is quite usable once roughly 30% of the strings are translated for a language. This is thanks to the frequency-based ordering of expressions and to the fact that many expressions are only ever visible to expert users.

Community-powered localization has its advantages:

  • Translators have domain knowledge. They are, after all, users of the application.
  • Some translators are very fast - they love the software and they love translating.
  • Translators work as volunteers.
  • Multiple translators may work on the same language.
  • Translations are available even before completion.

But there are also risks:

  • Overzealous users. One person used Google Translate to translate into multiple languages. He did not respond to emails, so I had to disable his account, block his IP address and delete all his work.
  • No control over the translation quality or speed.

What went wrong?

I took the list of languages from Windows. This was probably a wrong move. The Windows list contains sub-languages, such as French (France) and French (Canada). They are very similar and it would be more beneficial if the French-speaking translators in France and Canada were able to cooperate. Now there are two very similar translations.
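One possible way to let sub-languages share work, which is not what the application currently does, is a fallback chain: look in the sub-language table first, then in a shared base-language table. The language codes and the `lookup` helper below are assumptions made for the sake of the example.

```python
def lookup(english, language, tables):
    """Try the sub-language first, then its base language, then English."""
    base = language.split("-")[0]
    for code in (language, base):
        text = tables.get(code, {}).get(english)
        if text:
            return text
    return english


tables = {
    "fr": {"Open": "Ouvrir", "E-mail": "E-mail"},
    "fr-CA": {"E-mail": "Courriel"},  # only the regional differences
}

print(lookup("Open", "fr-CA", tables))
print(lookup("E-mail", "fr-CA", tables))
```

With such a scheme, translators in France and Canada would both contribute to the shared table and only override the few expressions that really differ.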

I left printf formatting sequences (%i, %s, ...) and Windows accelerator markup (&File) in the expressions. Although I gave instructions on how to deal with them, they confused some users.

The application can crash if the printf character sequences are wrong. I should probably have checked whether the English expression and the translation match (from the printf point of view) and refused to accept the translation in case of inconsistency.
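Such a check could be as simple as comparing the sequence of conversions in both strings. The sketch below uses a simplified pattern, not a complete printf grammar, and the function names are hypothetical.

```python
import re

# Matches "%%" or a conversion such as %i, %s, %5.2f or %ls.
# Simplified on purpose: no "*" widths and no positional arguments.
_SPEC = re.compile(
    r"%(?:%|[-+ #0]*\d*(?:\.\d+)?(?:hh|h|ll|l|L)?([diouxXeEfgGcsp]))"
)


def conversions(text):
    """Return the conversion characters in order, ignoring literal %%."""
    return [m.group(1) for m in _SPEC.finditer(text) if m.group(1)]


def is_safe_translation(english, translated):
    # The arguments are passed by position, so type and order must match.
    if conversions(english) != conversions(translated):
        return False
    # Reject any stray "%" that is not part of a recognized sequence.
    return "%" not in _SPEC.sub("", translated)


print(is_safe_translation("Copied %i of %i files to %s",
                          "%i von %i Dateien nach %s kopiert"))
print(is_safe_translation("Saved %s", "Uloženo %d"))
```

A limitation of this approach is that a translation cannot reorder the arguments, even when the grammar of the target language would call for it.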

It would probably be better to strip & characters from the source expressions and auto-generate the keyboard accelerators in the GUI. The result would not be optimal, but probably better than accelerators created by users without the required knowledge.
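As a rough sketch of that idea: strip the markup before showing the expression to translators, then pick the first unused letter in each menu when the GUI is built. Both helper names are hypothetical, and the first-free-letter strategy is just the simplest one that works.

```python
import re


def strip_accelerator(label):
    """Remove accelerator markup: "&File" -> "File", "&&" -> "&"."""
    return re.sub(r"&(.)", r"\1", label)


def assign_accelerators(labels):
    """Give each plain label in one menu a unique accelerator if possible."""
    used = set()
    result = []
    for label in labels:
        chosen = None
        for index, ch in enumerate(label):
            if ch.isalnum() and ch.lower() not in used:
                used.add(ch.lower())
                chosen = index
                break
        if chosen is None:
            # No free letter left: show the label without an accelerator.
            result.append(label.replace("&", "&&"))
        else:
            head, tail = label[:chosen], label[chosen:]
            # Literal ampersands must be doubled again for display.
            result.append(
                head.replace("&", "&&") + "&" + tail.replace("&", "&&")
            )
    return result


plain = [strip_accelerator(s) for s in ["&File", "F&ormat", "Fi&nd"]]
print(plain)
print(assign_accelerators(plain))
```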

Implementation details worth mentioning

The application

I had to create my own localization subsystem. It is basically a function that takes an English expression and returns the expression in the requested language. Localization in Windows has traditionally been about directly localizing dialog templates or forms, menus and even graphics. This is not feasible with community-driven translations.

Translating menu items or tool tips in the application is trivial. I just call a function that converts English to X. For dialog templates, I have written one function that modifies the dialog template prior to displaying it. Since I already had to modify dialog templates at run time due to Vista fonts, I just modified an existing class.

All the features related to localization are concentrated inside a single dynamic library. This is an optional library and users may choose not to install it.

Synchronization service

I also had to create a database on this web server. It contains:

  • list of users (already available),
  • list of languages,
  • list of applications (including versions),
  • list of known expressions,
  • mapping between expressions and applications,
  • list of translations for each expression and language.
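To make the relationships between these lists concrete, here is one way they could be laid out, using SQLite only because it is easy to run. The real server's database engine, table names and columns are not described in this post, so everything below is an assumption, including the choice to allow one translation per contributor.

```python
import sqlite3

SCHEMA = """
CREATE TABLE users        (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
CREATE TABLE languages    (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
CREATE TABLE applications (id INTEGER PRIMARY KEY, name TEXT NOT NULL,
                           version TEXT NOT NULL, UNIQUE (name, version));
CREATE TABLE expressions  (id INTEGER PRIMARY KEY,
                           english TEXT UNIQUE NOT NULL);
-- Many-to-many: one expression can be shared by several applications.
CREATE TABLE app_expressions (
    app_id  INTEGER NOT NULL REFERENCES applications(id),
    expr_id INTEGER NOT NULL REFERENCES expressions(id),
    PRIMARY KEY (app_id, expr_id));
-- One row per expression, language and contributor (an assumption).
CREATE TABLE translations (
    expr_id     INTEGER NOT NULL REFERENCES expressions(id),
    language_id INTEGER NOT NULL REFERENCES languages(id),
    user_id     INTEGER NOT NULL REFERENCES users(id),
    text        TEXT NOT NULL,
    PRIMARY KEY (expr_id, language_id, user_id));
"""

db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)

# Two hypothetical applications sharing the expression "Open".
db.executemany("INSERT INTO applications (id, name, version) VALUES (?, ?, ?)",
               [(1, "App A", "1.0"), (2, "App B", "1.0")])
db.execute("INSERT INTO expressions (id, english) VALUES (1, 'Open')")
db.executemany("INSERT INTO app_expressions (app_id, expr_id) VALUES (?, ?)",
               [(1, 1), (2, 1)])

shared = db.execute(
    "SELECT COUNT(*) FROM app_expressions WHERE expr_id = 1").fetchone()[0]
print(shared)
```

Storing each English expression once and mapping it to applications is what lets a translation of File/Open/Save be reused across all applications and versions.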

Communication between the server and the application is relatively simple. There is a timestamp for each language to avoid unnecessary traffic: the client says "I want language X and I have revision Y", and the server either sends a newer revision or nothing if Y is the latest one.
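The exchange can be sketched as a single server-side function. An integer revision stands in for the timestamp here, and the `PACKS` table and `fetch_language` name are hypothetical.

```python
from typing import Dict, Optional, Tuple

# Hypothetical server-side state: language -> (revision, translation table).
PACKS: Dict[str, Tuple[int, Dict[str, str]]] = {
    "cs": (7, {"Open": "Otevřít"}),
}


def fetch_language(
    language: str, client_revision: int
) -> Optional[Tuple[int, Dict[str, str]]]:
    """Return (revision, table) if the client is out of date, else None."""
    latest = PACKS.get(language)
    if latest is None:
        return None  # unknown language: nothing to send
    revision, table = latest
    if client_revision >= revision:
        return None  # client already has the latest pack
    return revision, dict(table)


print(fetch_language("cs", 7))
print(fetch_language("cs", 3))
```

A client that has never downloaded the language would simply send revision 0 and receive the whole pack.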

Web-based interface

I have partially finished a web-based interface for translators. I hope to improve it this year
and make it publicly available.

Summary

From my point of view, the project is a success. The applications are available in multiple languages
and the system is nearly maintenance-free. I only have to create flags for new languages when the
number of translated expressions reaches a given threshold.

I had to reject the way Microsoft recommends for localization - I mean using satellite DLLs - and I had to adopt and extend the Linux way. The Windows way may be suitable for a corporate environment, but normal users would not be able to work with DLLs directly. Effective distribution and localizable user data would be a problem too.

Recent comments

JDDellGuy (contributing user) on January 17th 2011

Very interesting. How was the "overzealous" person a problem? Were their translations incorrect, or was it simply too much information at once, or something else?

Vlasta (site administrator) on January 17th 2011

He was using Google Translate to translate into about a dozen languages he did not really understand. The results were far from optimal.

sixλxis (forum moderator) on January 17th 2011

This was a good read, very informative.

Erik (registered user) on January 22nd 2011

Thanks for the Big thank you

I am translating Real World Icon Editor into German right now and the other applications will follow.

1 Hour later...

I brought the German translation from 46 to 50 % (ca. 100 words)
