Thursday, June 27, 2013

Machine Translation and the Future of Translation

There is a saying that has been going around the translation industry for the past few decades: “Machine translation is between 5 and 50 years away from perfection”. Although this is clearly a bit sarcastic and cynical, it also makes very good sense.

Compared with the machine translations of 10, 20, or 30 years ago, contemporary automated translation engines such as Google Translate have really come a long way. However, they are still extremely far from perfect. Try, for example, entering an Italian or Spanish phrase into Google Translate and translating it into English. Chances are the translation is pretty close, if not perfect (depending on the length, grammar, complexity, etc. of the phrase). Now try translating, say, a Japanese phrase into Estonian. The translation you get is probably not even close.

Obviously, languages like English, Spanish and Italian, which share many words and even grammatical structures, are going to translate much more smoothly and naturally than completely unrelated languages such as Japanese/Estonian or Danish/Chinese. But sometimes even Google Translate, used with a pair of similar or even root-sharing languages, can surprise you. For example, with the many different ways to conjugate verbs in Spanish, even a sentence with a single, properly placed pronoun (I/you/we/he/she/it/they/etc.) can produce a laughably incorrect automatic translation because of tricky verb conjugations, dialect, or even complicated punctuation.

Even so, automatic translation these days is much more advanced than it was even five or ten years ago. Does anyone else remember playing the “Babelfish game”? You’d put a simple phrase into Babelfish, translate it, translate it BACK, and repeat this process using two or more languages, eventually ending up with a bizarre, sometimes even psychedelic or surreal chunk of language that was almost invariably completely nonsensical and hilarious. Of course, you can still do the same thing with Babelfish or Google Translate these days, but I must admit that it takes a good number more clicks now than it did previously.

If you’ve ever seen Star Trek, with its “Universal Translator”, or listened to, read, watched, or played any of the various incarnations of the Hitchhiker’s Guide to the Galaxy series, with its “Babelfish” (from which the Babelfish translation engine got its name), then you’re probably familiar with the direction in which a lot of people imagine machine translation to be heading: a person says something, anything, in virtually any foreign language, and as the words leave their mouth they are automatically translated by a machine (or an actual tiny fish, in “Hitchhiker’s” case) and heard by the listener in their native tongue. While this is obviously fantastical for a number of reasons which I won’t bother going into here, one can’t help but wonder: just how close to that ideal can machine translation actually come? And, even more importantly, how will that affect us as translators?

Luckily for us, I really doubt that machine translation will ever get THAT far. However, it is true that the more precise and easy-to-use machine translation becomes, the harder it will be for translators to get work, and the less money will generally be available for these hardworking humans. I don’t presume to have the answers to the big “What do we do about the human/automatic translation battle?!” question, but I can offer two opinions I have on the matter:

1) As I stated earlier, I don’t believe it is possible that machine translation will ever reach the levels of science fiction-based examples of the “future of translation”; and 2) let’s all hope that, if machine translation IS ever perfected, it’s 50 years away, and not 5.

Wednesday, June 19, 2013

Localization: a Challenge for Programmers

Mon Gonzales, programmer at Active Gaming Media, shares his views on Game Localization

Q - What does the localization process look like from your point of view?

A - As a developer, my role is to make the localization process as simple and efficient as possible. This is usually how we proceed here:
During the conception of the game or application (the stage where nothing has been coded yet), we determine which texts we will need and write a first, simplified version of them in a single language (usually English). While I'm coding the application, I use these texts as a reference. Once we have a first version of the application, I gather all the texts and visual text elements, which are carefully reviewed and improved by a professional writer: dialogues are made livelier, in-game texts more pleasant to read, and so on. The final text is sent to our translators, and our proofreaders then check the translated texts. When I get the texts back, it is a critical moment: I have to implement all of them without missing any, while not spending too much time on it.
Last but not least comes text debugging. Our debug team checks all the texts directly inside the application and makes sure there are no problems such as missing texts, overflows or mistranslations.

Q - How do you gather and implement texts and graphics?

A - I will start with graphic content, as the way of gathering texts depends on the platform you're working on. I always have two big folders for images: "Graphics only" and "Localizable Graphics", the latter having as many subfolders as the languages we translate into. During development, I keep not only the localizable images, but also the template graphic files I used to generate them.
When localization starts, I send all the "Localizable Graphics" to our translators so that they can not only translate the text, but also check whether their new text will fit the image.
Then I can generate localized images for the application. If you have done things methodically, you, as the programmer, should be able to retrieve these images quite easily.
As I said earlier, the way you gather texts depends on the platform you're working on. I will take the example of the iPhone, as the iPhone SDK contains very useful tools for text gathering. Concretely, when you are developing an iPhone application, rather than hard-coding the texts directly, you use a function to indicate that a text is localizable. The iPhone will then, all by itself, pick the localized text (in the user's language) from a file which consists of a series of lines like this one: "Press start key" = "Please press start"; Or more simply: "Text to be translated" = "Translated text";
These files can be generated automatically by a program in a couple of seconds: the programmer doesn't need to spend hours hunting for every line of text inside the code. The big advantage of this format is that it's easy for translators to understand and convenient for programmers: when you get the text back, you're done with just one copy/paste. Sometimes things are not made that easy, and in that case the programmer has to create a similar system. Otherwise, there's a good chance you will lose time copying the texts out one by one, possibly forgetting some, and lose even more time when you need to copy them back one by one. Organization is the key to a fast and complete localization, as a bad programming style can ruin what was otherwise a good localization.
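To make the file format above concrete, here is a minimal Python sketch of how such "source" = "translation"; lines could be read into a lookup table. It is only an illustration and not part of the iPhone SDK: the names ENTRY, parse_strings and localized are hypothetical, and the sketch assumes one well-formed entry per line.

```python
import re

# Matches one entry of the form: "key" = "value";
# Escaped quotes (\") inside either string are allowed.
ENTRY = re.compile(r'"((?:[^"\\]|\\.)*)"\s*=\s*"((?:[^"\\]|\\.)*)"\s*;')


def parse_strings(text):
    """Return a dict mapping each source text to its translation."""
    return {key: value for key, value in ENTRY.findall(text)}


def localized(table, key):
    """Look up a translation, falling back to the source text if it is missing."""
    return table.get(key, key)


sample = '''
"Press start key" = "Please press start";
"Text to be translated"="Translated text";
'''

table = parse_strings(sample)
print(localized(table, "Press start key"))  # prints: Please press start
print(localized(table, "Unknown text"))     # prints: Unknown text
```

Falling back to the source text means a missing entry shows up as untranslated text rather than as a blank label, which makes it easier to spot during text debugging.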

Q - Are there ever problems with the texts translators return to you?

A - Well, it depends. I am lucky enough to develop for a company specialized in game localization: our translators are well aware of how they have to work and can avoid the common mistakes. So we don't really have problems. There can always be a few things like overflows, but never anything too serious. The biggest risk is usually related to formatted texts. If the translators change or delete the formatting parts, it can have a dramatic impact on the application. To avoid such problems, I always provide our translators with complete documentation and guidelines, and I make sure I am always available if they have questions for me. Gamers tend to think that localization is only the problem of translators or translation companies, but the role of the programmer is crucial. Bad programming can cause a lot of trouble: untranslated texts or graphic elements, display problems, etc. In the end, the success or failure of a localization depends half on the translators, half on the programmer.
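The risk with formatted texts can be checked mechanically. The following Python sketch compares the printf-style placeholders (such as %d or %@) in a source string and its translation. It is a hypothetical helper, not a tool mentioned in the interview; PLACEHOLDER, placeholders and same_placeholders are assumed names, and the pattern covers only common specifiers.

```python
import re

# Common printf-style specifiers such as %d, %s, %@, %1$@ or %.2f.
PLACEHOLDER = re.compile(
    r'%(?:\d+\$)?[-+0#]*\d*(?:\.\d+)?(?:hh|h|ll|l|q|z|t|j|L)?'
    r'[@dDiuUxXoOfeEgGcCsSpaAF]'
)


def placeholders(text):
    """Return the sorted placeholders found in text, ignoring literal %%."""
    return sorted(PLACEHOLDER.findall(text.replace("%%", "")))


def same_placeholders(source, translation):
    """True if the translation kept exactly the placeholders of the source."""
    return placeholders(source) == placeholders(translation)


print(same_placeholders("You have %d lives", "Il vous reste %d vies"))   # prints: True
print(same_placeholders("You have %d lives", "Il vous reste % d vies"))  # prints: False
```

The comparison is order-insensitive so that translations using positional specifiers (%1$@, %2$d) can reorder them. As a consequence it will not notice two plain specifiers being swapped, which still needs a human check.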

Q - Finally, what tips would you give to a junior programmer, regarding localization?

A - If you are going to localize your game or application, you have to think about it before coding a single line. First of all, check whether your SDK includes tools that can make localization work easier. It would be a waste of time not to use them. If you don't have such tools, then before creating your own, ask a more experienced programmer for advice, as some tools and functions may already have been created inside your company. In any case, it is a big mistake to start coding if you're not 100% sure how you will handle texts. As I said earlier, organization is everything!

Thursday, June 6, 2013

Localizing text strings on iPhone: The simple way

Localization of text strings on iPhone is certainly one of the easier things to handle when working on a game or application. Everything you need is already provided with the iPhone SDK.

However, as you have to use specific functions for localizable texts, you would be well advised to learn about iPhone localization techniques before you even start coding your application.

The process can be broken down into a few simple steps:

1. Use NSLocalizedString while coding

NSLocalizedString is a macro, used like a function, that returns an NSString containing, if possible, the translation of the text you pass as the first argument, in the user’s default language or in the first language it can find a translation for according to the user’s preferences. If it can’t find any text at all, it will just return the text you provided. Example:
    NSString *text = NSLocalizedString(@"Text to be translated", @"Contextual comments");

Put the string that you would like to see translated in the first NSString argument. You can use the second argument to (later) provide translators who will work on your files with comments and useful contextual information. If you are using things such as ‘%d’, it’s probably a good idea to use the comments part to explain to translators what it is they are looking at, even if it appears particularly simple to you. Translators often tend to be confused when they see something that is not plain text.
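
The lookup order described above can be modelled in a few lines of Python. This is a simplified sketch of the behaviour as this article describes it, not Apple's actual implementation; localized_string, tables and the sample French text are all hypothetical.

```python
def localized_string(key, tables, preferred_languages):
    """Return the first translation found for key, else the key itself.

    tables maps a language code to a {source text: translation} dict;
    preferred_languages lists the user's languages, most preferred first.
    """
    for language in preferred_languages:
        translation = tables.get(language, {}).get(key)
        if translation is not None:
            return translation
    # No translation in any preferred language: fall back to the given text.
    return key


tables = {
    "fr": {"Text to be translated": "Texte à traduire"},
    "ja": {},
}

print(localized_string("Text to be translated", tables, ["ja", "fr"]))  # prints: Texte à traduire
print(localized_string("Text to be translated", tables, ["de"]))        # prints: Text to be translated
```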

2. Gather your strings 

Now that you’ve finished coding your application, you will probably want to gather the strings that need translation and send them out to your translators. Fortunately, this can be done automatically, so you won’t need to collect everything by hand. First, for each language you want to support, create a corresponding folder named languagecode.lproj: for example en.lproj, fr.lproj, ja.lproj, etc.

Then, open a Terminal window on your Mac and go to your project folder (if for some reason you’re completely unfamiliar with command lines, a cd Documents/projectname will do it in most cases). Now, let’s assume you used English for the NSLocalizedString arguments. All you have to do is type the following command: genstrings -o en.lproj Classes/*.m
 
The name of the command is pretty self-explanatory: genstrings generates strings. It checks all of the .m source files in your Classes folder, looks for NSLocalizedString arguments, and puts them together in a file inside the en.lproj directory. That file is called Localizable.strings, a simple text file containing a succession of entries like this:
    /* Comments you put in the second argument */
    "String you want to see translated" = "Translated string";
Normally, the left and right strings should be identical for the language you originally developed the game in. You may want to use ID codes for your strings instead, but I would advise you to be careful, as the IDs themselves will be displayed if a translation is missing. So this file is ready to go.
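
To illustrate what this step produces, here is a rough Python imitation of the extraction. It is not the real genstrings tool and handles far fewer cases: it only recognises calls where both arguments are simple @"..." literals, and CALL and generate_strings are hypothetical names.

```python
import re

# Finds NSLocalizedString(@"key", @"comment") calls with literal arguments.
CALL = re.compile(
    r'NSLocalizedString\(\s*@"((?:[^"\\]|\\.)*)"\s*,\s*@"((?:[^"\\]|\\.)*)"\s*\)'
)


def generate_strings(source_code):
    """Build Localizable.strings-style text from Objective-C source code."""
    entries = {}
    for key, comment in CALL.findall(source_code):
        entries.setdefault(key, comment)  # keep the first comment seen per key
    lines = []
    for key in sorted(entries):
        lines.append(f"/* {entries[key]} */")
        lines.append(f'"{key}" = "{key}";')  # left and right sides start identical
        lines.append("")
    return "\n".join(lines)


source = 'NSString *text = NSLocalizedString(@"Text to be translated", @"Contextual comments");'
print(generate_strings(source))
```

For the sample line, this prints the comment followed by an entry whose left and right strings are both "Text to be translated", matching the layout shown above.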

3. Get your files translated! 

Now you can send that file to your translators, and ask them to translate ONLY THE RIGHT PART. I’m very serious about this. It’s a huge, unnecessary pain to get files back with both columns translated: the file is unusable as it is, and you lose time fixing it. Make sure that the files you get back are still named Localizable.strings and put them in their respective .lproj folders (fr.lproj for French, etc.). There are a couple of tricks to make the process a little smoother for translators, which helps you, the programmer, in the end. I will offer some tips in a later article.
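
A quick way to catch files where the left part was translated too is to compare the keys of the returned file with those of the original. The Python sketch below is a hypothetical helper (ENTRY, keys_of and compare_keys are assumed names) and expects entries in the "key" = "value"; form; the French strings are made-up samples.

```python
import re

# Matches one "key" = "value"; entry of a Localizable.strings file.
ENTRY = re.compile(r'"((?:[^"\\]|\\.)*)"\s*=\s*"((?:[^"\\]|\\.)*)"\s*;')


def keys_of(strings_text):
    """Return the set of left-hand strings in a .strings file."""
    return {key for key, _ in ENTRY.findall(strings_text)}


def compare_keys(original_text, translated_text):
    """Return (missing, unexpected) keys of the translated file."""
    original = keys_of(original_text)
    translated = keys_of(translated_text)
    return original - translated, translated - original


english = '"Press start key" = "Press start key";'
good_french = '"Press start key" = "Appuyez sur Start";'
bad_french = '"Appuyez sur Start" = "Appuyez sur Start";'

print(compare_keys(english, good_french))  # prints: (set(), set())
print(compare_keys(english, bad_french))
```

Two empty sets mean the left column is intact. Anything else points to entries that were deleted or had their left part edited.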

4. Test, and test a lot 

A common mistake that developers make is to just copy the translated files, check a screen or two to make sure the correct language is displayed, and then just assume that the game is ready. It probably isn’t! Before releasing your game, ask a native speaker of each language to check your app thoroughly, and make sure that there are no problems such as:
- Mistranslations or missing translations
- Texts that don’t make sense in their context or that are misleading
- Text overflows (you would be surprised to see how long some short English words can become in, say, German)
- Crashes and other major display problems: it’s very easy to delete a " or alter a %s, which can cause major problems in your app.
The last three problems in particular can cause your game to be rejected during submission, so when you test your game, take the process seriously for ALL languages, and make sure to use native speakers.
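
Part of the overflow check can be pre-screened before handing the build to native speakers. The Python sketch below flags translations that exceed a per-string character budget. The budgets, the find_overflow_risks name and the sample data are assumptions for illustration, and character counts only approximate rendered width, so this does not replace testing on the device.

```python
def find_overflow_risks(translations, max_chars):
    """Return the keys whose translation is longer than its character budget.

    translations maps source text to translated text; max_chars maps source
    text to the longest text the UI element is assumed to fit. Keys without
    a budget are never flagged.
    """
    return sorted(
        key
        for key, text in translations.items()
        if len(text) > max_chars.get(key, float("inf"))
    )


german = {"Settings": "Einstellungen", "OK": "OK"}
budgets = {"Settings": 10, "OK": 4}

print(find_overflow_risks(german, budgets))  # prints: ['Settings']
```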

Overall, the iPhone SDK comes with all you need to do simple text localization. If you are unfamiliar with localization techniques and are looking for an easy and approachable process, this option is about as good as you could hope for.