Friday, November 18, 2011

Active Gaming Media: Our experience in the process of Game Localization and Debugging

Although some stages of game development are well known to gamers (coding, designing, or testing, for example), game localization is often overlooked and only noticed when there are problems with it, or when games are not localized at all. It's true that great graphics or perfect gameplay are easier to "feel" and enjoy in a game, but delivering a good localization is still an important and interesting part of game development and deserves as much care as the rest.

With this article, we are hoping to give an idea of how localization works, and what makes it such a long and difficult process.

I - What is localization and why is it important?

First of all, I would like to draw the distinction between translation and localization, as not only gamers but sometimes also developers confuse these terms. While translation is the simple transposition of a text from one language to another, localization goes further: it is the adaptation of a product to a new market.

Localizing a game is a long process that involves not only translators but also programmers, designers, sound engineers and testers, and it starts as soon as the game's concept is being put on paper, before any line of code is written. Besides texts, some graphical elements (including graphical texts) will likely need some rework to suit new cultures, voices will need to be recorded in as many languages as the game offers, and all of these elements require careful implementation by the programmers. Altogether, the whole process can take up to several months. For now, let's take a quick look at the history of game localization.

II - History of game localization

The history of video game localization can be summed up in four big eras:

The beginning (8-bit and partly 16-bit machines): The history of game localization begins when consoles started having a wide international audience, which brings us back to the NES/Sega Master System days. At that time, developers had little, if any, knowledge about localization, and development kits didn't have any functions to simplify it. Usually developers simply inserted their translated texts directly inside the code, hoping not to forget any parts. There was also little control over the quality of translations, usually performed by "friends" of developers, which resulted in some famous mistranslations you have surely heard about. There were no real game localization agencies at that time and localization was a pretty amateurish business.

Growth of the localization market: During the SNES/Genesis era and after, the industry started to become more professional, and developers started outsourcing their localization work to some of the few specialized companies existing at that time. Some developers already had small internal localization teams, but this was a luxury only a couple of wealthy developers could afford. This was when we started having games in languages other than just English and Japanese. Finally, European gamers (outside the UK) could enjoy games in their own languages, and the general quality of translations improved significantly.

A mature industry: In the late 90s/early 00s, the industry finished its growth phase as games became more complex and started having huge volumes of texts/graphics/sounds to localize. This meant there was more and more money to be made out of localization, which encouraged small companies to try and enter the industry. Competition increased, resulting in better translations at lower rates. At the same time, programmers got used to working with texts in different languages and were able to code efficient localization functions. The use of external files for translated texts definitely became the norm.

A changing industry: Over the last few years, the game industry has changed significantly. The growth of mobile gaming as well as the comeback of independent developers (PSN, WiiWare, XBLA...) meant that a lot of new people entered the market, with the same problem as their predecessors: little knowledge about localization.

The spectacular growth of the Internet also allowed some companies to set themselves up and start doing business at a low price. Quite a few Internet-based "low-cost" companies appeared, offering very low prices, but they are to be considered very carefully. We recently had to "proofread" a lot of small games that had actually been "translated" with automatic translation tools. For example, in one of our recently proofread games (a basketball game for cellphones), the French version was using words such as "copulate" because the automatic translator didn't pick the right translation for certain words. And as there is no manufacturer validation for mobile games (except on iPhone), bad game translations made a spectacular comeback. Of course, low prices don't necessarily mean low quality, but I would recommend that newcomers be very careful about who they deal with and ask for solid references before signing any contract.

III - The current localization process

After this little history lesson, let's have a look at how games are currently localized, or rather how they should be. 

The localization process starts as soon as the game is being designed. The following points have to be decided before the programming starts:
How will translated texts be managed? And what system will be used to fetch translations in the game's code?

This is something to be discussed before coding starts. An old practice, which some developers unfortunately still use, is to write the texts directly inside the code. Concretely, it means having as many lines of code as there are languages for the same text. It is probably the worst practice, as texts become hard to gather and edit: it is very easy to forget a couple of texts when you have thousands of lines of code. We have seen a couple of young developers doing that and they were obviously struggling to get things done right.

My recommendation is to keep all the texts in one or several separate files. For consoles, although some developers stick to simpler text formats, I would recommend the use of XML. It is an easy-to-read format and can be edited simply. Moreover, all development kits' libraries now include simple functions to open and parse XML files, making it easy to fetch texts and switch between them. The best thing to do is to create a small function that opens an XML file, finds the original text, and picks the right translation depending on the game's settings. Once this is done, texts can be obtained in a single line of code, which will look something like 'text = getTranslatedText("Original Text");'.

As for the XML files themselves, a good practice is to have one file per language, where texts can be identified by an ID. Most of the time, developers use IDs such as "MSG001", "MENU006", etc. However, if you have the chance to create something more "explicit", you should go for it. For example, you can keep the English texts in a file called localized_texts_en.xml and the French ones in a file named localized_texts_fr.xml, each text carrying the same ID in both files. It is possible to use other file types, though. For example, iPhone includes functions that can automatically fetch translated texts in plain-text files. I will explain this point in detail later.
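To make the "one file per language, explicit IDs" idea concrete, here is a minimal sketch in Python. The XML layout (a "texts" root with "text" elements carrying an "id" attribute), the IDs and the function names are all assumptions made for illustration; they are not taken from any particular development kit, and the file contents are inlined as strings only to keep the example self-contained.

```python
import xml.etree.ElementTree as ET

# Hypothetical contents of localized_texts_en.xml and localized_texts_fr.xml.
# The <texts>/<text id="..."> layout is an assumption made for this sketch.
LOCALIZED_FILES = {
    "en": """<texts>
  <text id="MENU_START_GAME">Start game</text>
  <text id="MENU_OPTIONS">Options</text>
</texts>""",
    "fr": """<texts>
  <text id="MENU_START_GAME">Commencer la partie</text>
</texts>""",
}


def load_texts(xml_source):
    """Parse one language file into an {id: text} dictionary."""
    root = ET.fromstring(xml_source)
    return {node.get("id"): node.text or "" for node in root.iter("text")}


# Parse every language once, at start-up.
TABLES = {lang: load_texts(source) for lang, source in LOCALIZED_FILES.items()}


def get_translated_text(text_id, language, fallback="en"):
    """Return the text for text_id, falling back to English, then to the ID itself."""
    for lang in (language, fallback):
        table = TABLES.get(lang, {})
        if text_id in table:
            return table[text_id]
    return text_id  # a visible ID on screen makes missing texts easy to spot


print(get_translated_text("MENU_START_GAME", "fr"))
```

An explicit ID such as MENU_START_GAME tells the translator where the text appears, which "MSG001" does not. The English fallback is a design choice of this sketch: some teams prefer to fail loudly on a missing text instead.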

The other point that will require special attention is how you will structure your files/directories system. You will often have to localize voices and some graphic elements inside a game, and it is important to organize your work so that sounds and voices are kept separate, as are visual texts and purely graphic elements. My recommendation is to have a clear directory system for such resources. If you keep your graphic files in a /graphics folder, then you should use something like graphics/localizable/language_code for your localizable resources. This separation will help you make sure you don't forget any files, or send files that don't need to be translated to the localization team. Once again, writing a small function that retrieves these resources is a good practice, rather than manually entering the full path name every time. Also, self-explanatory file names will make the code easier to read later: a file called "car.png" tells a lot more than one called "42a.png". The programmer and designer/sound engineers should agree on a good system before they get to work.

After the game is coded and ready for localization, if the programmer did a good job, the extraction of localizable contents should be easy. For example, for the iPhone, which I've been personally working on, you can extract all the localizable texts contained in a source code directory with a single command, which goes through all the code files and copies the strings that need to be translated into one file. To save time, programmers should code such functions for their projects on other systems. Usually, programmers put all these texts in one or several Excel sheets before they send them to the localization team.
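The small resource-retrieval function recommended above, for a graphics/localizable/language_code layout, could be sketched as follows in Python. The function name, its parameters and the fallback to a default language are assumptions made for illustration, not part of any SDK.

```python
from pathlib import Path


def localized_resource(root, kind, filename, language, default_language="en"):
    """Return the path of a localizable resource for the given language.

    Looks in <root>/<kind>/localizable/<language>/ first, then in the
    default language's folder (an assumption of this sketch).
    Raises FileNotFoundError if the file exists in neither folder.
    """
    base = Path(root) / kind / "localizable"
    for lang in (language, default_language):
        candidate = base / lang / filename
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(
        f"{filename} not found for {language!r} or {default_language!r}"
    )
```

With such a helper, game code asks for ("graphics", "title.png", "fr") instead of spelling out the full path every time, and a missing localized file is reported immediately rather than showing up as a blank image.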

From my experience, Excel is the most common and convenient format, and surely the easiest to use, as it offers efficient Import/Export options. Translators feel comfortable with it as well, and it allows them to freely add any comments or remarks they may have. Some developers send XML files directly, which translators are less used to but still usually understand (they are trained for it!). Although it didn't happen to our company, some localization firms told us they sometimes received source code directly to translate. If such an idea ever came to your mind, please forget it right now, as it will be painfully difficult for translators to work on and will also potentially let anybody get a look at your code, which took you months to write. From the moment localizable files are sent, the localization team can start working.

Familiarization: Often overlooked, this stage is nevertheless extremely important. It helps translators understand how the game works, feel the general atmosphere and tone of the game, understand instructions perfectly, and thus recreate the game's feeling in their mother language. A lack of familiarization often results in problems with consistency, accuracy of instructions or the way characters express themselves. If translators feel some graphical elements should be changed (for example, so as not to shock other cultures), this is the only moment they can express the need. At the end of that stage, the localization team (translators, localization manager...) meets the development team and they decide on a few points such as the texts' style, visual elements that need to be changed for new territories, etc. Once everything is agreed, translators work on texts, designers on graphics, and sound engineers on voices/sound effects.

Now, the translation can start. If the game is to be released on different systems, translators will have to deliver different files respecting the terminology of each machine. After this, proofreading is done with special care for terminology and overall consistency. Texts that need particular treatment (audio or visual) are sent to the sound and design teams. When everything is done, localized files go back to the development team. The developer then has to implement the new texts as well as the new graphics and sounds, and make basic checks (Are there any missing files? Do they have proper names? etc.). If localization functions were written correctly during development, most of the implementation work shouldn't be much harder than copying and pasting files and creating a language selection menu (for consoles).

At that moment, a first version of the game is produced. Quality assurance is then performed by a multilingual debug team, whose role is to play the game thoroughly and report all the bugs they find. There are three categories of bugs: A, B and C.

‘A’ bugs are critical: crashes, black screens, serious layout issues, misleading instructions or terminology problems. Any of these can be sufficient to have the game rejected by the manufacturer during the validation stage, so these bugs must be fixed with absolute priority.

‘B’ bugs are minor issues such as visual/layout problems (usually text overflows), partly missing texts, unclear instructions, etc. Although they are not major problems, they can still alter the gamer's experience and should be fixed whenever possible.

‘C’ bugs are usually small improvements that can be made to the overall experience: improved texts, better presentation of text elements... These "bugs" are usually partly subjective and don't necessarily require rework, although it's better to treat them the same way.

The programmer and translators are involved in this stage to fix all the problems and regularly generate new versions of the game to be tested again by the debug team. Bugs can be reported in various ways, and methods change from one developer to another. Some use general bug tracking software such as Mantis or Bugzilla, while others have their own systems: those can consist of advanced reporting systems implemented on special servers or, more simply, template Excel files which the quality assurance team is expected to deliver daily.

Validation by the manufacturer: The most stressful stage for developers. The game is sent out to the console manufacturers, who play the game in depth for a couple of days at least, sometimes weeks, and validate it if there are no ‘A’ bugs or elements that would need censoring. If the game is rejected, developers will need to go back to the QA stage, or even the translation stage if the texts are really bad.

Nintendo, Sony and Microsoft are equally demanding at this stage and are usually very strict on the quality of games, no matter who develops them. Recently, a second-party developer approached us in a panic because their game, exclusive to the machine it was going to be released on, had been rejected by the manufacturer and required a couple of extra weeks of QA. Although a popular belief is that manufacturers nowadays let developers release just about anything on their machines, the truth is that their validation processes are far stricter than they used to be.

IV - Problems usually encountered during localization

Besides the usual translation issues, we sometimes face problems specific to localization. Usually, these problems are found during the quality assurance checks, and fixing them requires the help of not only translators, but also programmers and sometimes designers, voice actors, etc.

The most sensitive point, and maybe the one that translators understand least, is probably variables included in the files to be localized: things like "You earned %i%d". Of course, they are meant to be replaced by numbers or names. Sometimes, translators don't really understand what these are there for and just leave them out. It also happens that translators know what they mean but accidentally alter or delete them. Usually, this results in minor display glitches, but it can have more serious consequences such as crashes.
How to avoid this: Although translators are trained to handle such variables, it is always safer to write clear instructions in the files to translate. Writing things such as "%d is the number of points earned, don't modify it and put it at the appropriate place" may sound like a waste of time, but it can save you a lot of effort when texts are returned.
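On the programmer's side, altered or deleted variables can also be caught automatically when the texts come back. The sketch below, in Python, compares the printf-style variables of a source text with those of its translation. The pattern only covers a handful of simple specifiers (%d, %i, %s, %f, %u, %@ and positional forms such as %1$d), which is an assumption of this example; a real project would need to extend it to whatever its texts actually use.

```python
import re
from collections import Counter

# Simple printf-style specifiers only (an assumption of this sketch).
PLACEHOLDER = re.compile(r"%(?:\d+\$)?[disfu@]")


def placeholders(text):
    """Count the variables in a text, ignoring the literal '%%'."""
    return Counter(PLACEHOLDER.findall(text.replace("%%", "")))


def placeholder_problems(source, translation):
    """Return (missing, unexpected) variables in the translation."""
    expected, found = placeholders(source), placeholders(translation)
    missing = sorted((expected - found).elements())
    unexpected = sorted((found - expected).elements())
    return missing, unexpected


# The translator accidentally inserted a space inside the variable.
print(placeholder_problems("You earned %d points", "Vous avez gagné % d points"))
```

Because only the counts are compared, a translation is free to move a variable to wherever the target language needs it, which is exactly what the instructions to translators ask for.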

Terminology is often a big source of trouble. It regularly happens that developers have their master versions rejected by manufacturers because of terminology mistakes, sometimes resulting in the game being postponed. It can be the fault of translators forgetting to follow the terminology instructions carefully, but sometimes mistakes are made by the developers themselves.
How to avoid issues: Before they send out any texts to the localization team, developers should make sure the localization staff always works with the latest version of the terminology for each console. Also, terminology can vary from one territory to another, even for the same language.

One of the most common problems is overflow. It frequently happens that developers fit their texts to a certain area, forgetting the differences between languages. In new languages, the texts may just turn out to be too long, with no proper, shorter alternatives. It regularly happens from Japanese to English, but not only. Indeed, there are big size differences even between European languages. In general, English texts will look significantly shorter than German ones.

These problems can be easily solved for text boxes by splitting the texts into two successive boxes, for example, but it can quickly get tricky with graphical elements. In some cases, the only solution will be to ask the designer to completely change the original design in order to have something small enough.
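Both ideas, spotting texts that are too long and splitting a text over successive boxes, can be sketched in a few lines of Python. Everything here is an assumption for illustration: the limits are expressed in characters, which is only a rough stand-in for the rendered width with a proportional font, and the function names are made up for this example.

```python
import textwrap


def find_overflows(translations, limits):
    """Return (text_id, length, limit) for each translation exceeding its limit."""
    overflows = []
    for text_id, text in translations.items():
        limit = limits.get(text_id)
        if limit is not None and len(text) > limit:
            overflows.append((text_id, len(text), limit))
    return overflows


def split_into_boxes(text, width, lines_per_box):
    """Wrap text to the box width, then group the lines into successive boxes."""
    lines = textwrap.wrap(text, width)
    return [lines[i:i + lines_per_box] for i in range(0, len(lines), lines_per_box)]


# "Options" fits a 10-character button, its German counterpart does not.
print(find_overflows({"MENU_OPTIONS": "Einstellungen"}, {"MENU_OPTIONS": 10}))
```

Running such a check on every delivery gives the debug team a list of suspects before anyone has to play through the game, although only a test on the real screen can confirm an overflow.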

How to avoid issues: Let the designers know that other languages may require much longer texts and that they should design elements with this in mind.

Regarding the translation itself, one of the points to be most careful with is consistency. Weapons, places, names, etc. should be named exactly the same throughout the game. If the names keep changing, or even change just a few times, the player will be confused, and this is a good reason to reject a master. This applies to all manufacturers, including Apple.

How to avoid issues: One good way to avoid these problems is to have the names translated only once, in a general form, and to replace a code with them automatically later. This brings us directly to the next point: variables inside the code.
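One way to read "translated only once and replaced automatically" is a small glossary of names, referenced from the texts through a code. The Python sketch below assumes a {CODE} syntax and a glossary keyed by language; both are hypothetical conventions chosen for this example, as are the names themselves.

```python
import re

# Each name is translated exactly once (hypothetical entries).
GLOSSARY = {
    "WEAPON_FLAME_SWORD": {"en": "Flame Sword", "fr": "Épée de feu"},
}

# Codes look like {WEAPON_FLAME_SWORD}; this syntax is an assumption.
TOKEN = re.compile(r"\{([A-Z0-9_]+)\}")


def expand_names(text, language):
    """Replace every {CODE} with its single, agreed translation."""
    def replace(match):
        # An unknown code raises KeyError, so typos are caught early.
        return GLOSSARY[match.group(1)][language]
    return TOKEN.sub(replace, text)


print(expand_names("Vous avez trouvé : {WEAPON_FLAME_SWORD} !", "fr"))
```

Plain substitution works best where the name stands on its own, as in item lists or menus. In languages where articles or endings change with the sentence, translators may need more than one form per name.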

Some small things can have really bad consequences. Something as small as changing a file's name can give developers a hard time. Of course, when there are only a couple of files to localize, it's not such a big deal to fix file names, but it can quickly become annoying when there are literally hundreds of files that have been improperly renamed. For graphics and sound files, the number of files tends to add up quickly. Although it's an easy problem to solve, developers are usually not too happy to lose time renaming all these files.
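Renamed files are easy to detect before anything is implemented, simply by comparing the list of names that was sent out with the list that came back. A minimal Python sketch, with a hypothetical function name and hypothetical file names:

```python
def check_file_names(expected, received):
    """Compare the file names sent out with the file names delivered."""
    expected_set, received_set = set(expected), set(received)
    return {
        "missing": sorted(expected_set - received_set),
        "unexpected": sorted(received_set - expected_set),
    }


# A space slipped into one of the names during localization.
print(check_file_names(["title.png", "car.png"], ["title.png", "car .png"]))
```

A "missing" name paired with a very similar "unexpected" one is usually a rename, which can then be fixed in one go rather than discovered image by image during testing.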

How to solve: Give very clear instructions to the staff and ask the project managers to regularly check that everything is done right.

V - A concrete example: the localization of iPhone Apps

I like the iPhone's example, as it is an easy platform to develop for, with lots of "small" developers, sometimes alone, working on it. Its SDK consists of a series of tools, libraries and software that already existed for Mac. This was our first project for Apple's mobile phone. I will start with a short explanation of how localization works for iPhone. I gave the example of text extraction earlier, which is really simple for iPhone. Localizable files are actually plain text and consist of a series of lines which look like this:
"Touch the screen to start" = "Touchez l'écran pour commencer";
Or more simply:
"Text to be translated" = "Translated text";
These files can be generated automatically by a program in a couple of seconds: the programmer doesn't need to spend hours seeking every line of text inside the code. The big advantage of this format is that it's easy for translators to understand and convenient for programmers: when you get the text back, you're done with just one copy/paste. The code for using localized strings is also very simple. It consists of a single line, which looks like this: NSLocalizedString(@"Touch the screen", @""); where "Touch the screen" can be replaced by any of the elements on the left side of the localized strings files.

As for the graphic elements, we adopted a simple directory system with consistent file names and coded a small function that retrieves the appropriate files depending on the user's language settings. This project was also a good example of how a localization team always needs to adapt and communicate well. When the localizable files were sent out for translation, it was stated more than clearly that only the right-hand parts had to be translated. It was also mentioned during the meetings. By "stated more than clearly", I mean this part was literally written in red, bold, underlined and in a bigger font than the rest! Still, when I got all the files back, one had ALL the left-hand texts translated and another had some of them partially translated. In the end, I had to manually copy and paste the original left side for the whole text. As it was plain text, there was no easy way to have it done automatically.
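Repairing such a file may well need manual work, but detecting the problem does not. The Python sketch below parses the one-entry-per-line "left" = "right"; format shown earlier and lists the left-hand texts of the original file that no longer appear in the returned one. The parser is deliberately simplified (one entry per line, comments and anything else ignored) and the function names are assumptions of this example.

```python
import re

# One entry per line: "source text" = "translated text";
ENTRY = re.compile(r'^\s*"((?:[^"\\]|\\.)*)"\s*=\s*"((?:[^"\\]|\\.)*)"\s*;\s*$')


def parse_strings(content):
    """Return the (left, right) pairs found in the file content, in order."""
    pairs = []
    for line in content.splitlines():
        match = ENTRY.match(line)
        if match:
            pairs.append((match.group(1), match.group(2)))
    return pairs


def altered_keys(original, returned):
    """List the original left-hand texts that are missing from the returned file."""
    returned_keys = {left for left, _ in parse_strings(returned)}
    return [left for left, _ in parse_strings(original) if left not in returned_keys]


original = '"Touch the screen to start" = "Touch the screen to start";\n'
returned = '"Touchez l\'écran pour commencer" = "Touchez l\'écran pour commencer";\n'
print(altered_keys(original, returned))
```

An empty list means every left-hand text survived translation untouched. Anything else can be sent back to the translator before it ever reaches the game.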

Fortunately, this was a small translation! But since then, even though Apple's format is really simple, I always send out files after converting them to Excel sheets, where columns can be replaced easily. With the Import/Export options, this takes no more than a few minutes. In the end, translators feel more comfortable when the files they have to work on always look the same. The designer, for his part, made a couple of mistakes in the file names: he added some spaces by accident, or sometimes edited some characters that didn't display properly for him. The consequence is that some images were just not showing up and we lost time fixing this.
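The round trip through a spreadsheet can be scripted as well. The sketch below uses CSV, which Excel can open and save, as a stand-in for a real Excel sheet; that substitution, the column headers and the function names are all assumptions of this example. It also does not escape quotation marks inside the texts, which a real tool would have to handle.

```python
import csv
import io


def pairs_to_csv(pairs):
    """Write (source, translation) pairs as two columns a spreadsheet can open."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["source (do not edit)", "translation"])
    writer.writerows(pairs)
    return buffer.getvalue()


def csv_to_strings(csv_text):
    """Rebuild "source" = "translation"; lines from the returned CSV."""
    rows = list(csv.reader(io.StringIO(csv_text)))[1:]  # skip the header row
    return "".join(
        f'"{row[0]}" = "{row[1]}";\n' for row in rows if len(row) == 2
    )


print(pairs_to_csv([("Touch the screen to start", "")]))
```

Keeping the source texts in their own column means that, even if a translator edits them by mistake, the original column can simply be pasted back over the returned one.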

VI - Why are some games never localized?

As we have seen, localization is a complex process that involves much of the game development team. While programmers are busy working on localization, they have little time to spend on other projects. It is thus unsurprising that some developers do not release their games in all territories when the sales potential is low, and would rather let their programmers focus on new projects and keep their money for that. The costs of localization can add up quite quickly. In the end, you will need to pay not only translators but also the QA team, the programmer who will be busy during that time, sound engineers and designers if their help is required, and the game's rating (ESRB, PEGI...). In recent years in particular, games have tended to get bigger and more complex, thus requiring a lot of work whenever a new version is released. It also happens that some games are simply "unlocalizable", when they are too tightly connected to a culture or territory. For example, would it make sense to localize a Japanese game whose gameplay is based on elements that exist only there? Certainly not.