19 Jan 2013

PHP Localization with GETTEXT on Windows


Have you, like me, built multilingual sites with multiple language files holding strings in multidimensional arrays, only to find that keeping additions, edits and deletions in sync across files is almost impossible? Well, the answer may be to use GETTEXT. Although it basically does the same thing (it creates separate files for all your localizations), it does have a number of advantages:
  • you can write pretty much plain text in your html, as opposed to providing a tricky array key, e.g. echo _('this is plain text') vs. echo $lang['plain_text'].
  • localizations are kept in .po and .mo files, which can be modified and updated via editors like PoEdit, making the process relatively painless.
  • these editors scan your .php files for _('...') calls and update the .po files for you.
  • if translation strings are not found in the target language, the default language string is used instead - although not ideal, at least you don't have to worry too much if a localization is only 99% complete - it will still display some flavour of text. For this reason, the original texts in the php files are usually written in the main language of your target audience, which may or may not be English.
  • if your localizations are disseminated amongst a number of translators or you have a lot of localizations, you can create .pot files, which you or your users can then use as a template for creating the .po files.
  • .po files can be uploaded to Pootle, a Python-based translation platform
  • It's fast, and I mean really fast. Unlike PHP array files, which are parsed in full on every request, translations are looked up in a compiled binary .mo file.
So, this approach should help you keep a better handle on your localizations and their updating.
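To make the fallback behaviour from the list above concrete, here is a minimal sketch in plain PHP. It does not use the gettext extension at all: translate_or_fallback() and the $italian array are hypothetical stand-ins that I made up purely to show the idea that a missing translation yields the source string.

```php
<?php
// Hypothetical stand-in for gettext's lookup, for illustration only.
// $catalog maps source strings to translations for one language.
function translate_or_fallback(string $text, array $catalog): string
{
    // A missing or empty entry falls back to the source string.
    if (isset($catalog[$text]) && $catalog[$text] !== '') {
        return $catalog[$text];
    }
    return $text;
}

$italian = [
    'This is the page title' => 'Questo è il titolo della pagina',
    // 'Enter your intro text here' has not been translated yet
];

// Translated entry: prints the Italian title.
echo translate_or_fallback('This is the page title', $italian), "\n";
// Untranslated entry: prints the original English text.
echo translate_or_fallback('Enter your intro text here', $italian), "\n";
```

With real gettext you never write this function yourself; _() behaves this way out of the box.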

How do I Create po/mo Localizations?

There are two main ways to create .po files: either directly from the source files, or from a .pot file. Creating from source is probably easier for a simple bilingual site maintained entirely by one person, for example. The .pot file, as mentioned, is the better choice when translation is outsourced or there are many languages.
Let's look at setting up a simple page with GETTEXT text.
<?php include('includes/config.php'); //contains our gettext setup - see later?>
<meta charset="utf-8">
<title><?php echo _('This is the page title'); ?></title>
<h2><?php echo _('This is the main page title'); ?></h2>
<p><?php echo _('Enter your intro text here'); ?></p>
OK, from the above we can see 3 translation strings. If you've downloaded and installed PoEdit, follow these steps to create a translation from source.
  1. Create a directory structure for your localizations:
       localization/
         cy_GB/
           LC_MESSAGES/
         it_IT/
           LC_MESSAGES/
    That will create localization areas for Welsh (cy_GB) and Italian (it_IT). You could name the localization directory anything you want, e.g. langs, locales, l10n, etc. It's not important, as long as the path you bind in your setup code matches.
  2. Next open up PoEdit and create a new catalog and add some catalog properties:
    The example screenshot used an imaginary language, Xoravian; enter the name of the language for your own localisation instead. The charset will usually be UTF-8, and the plural forms expression can be found by following the link next to the textbox.
  3. On the second tab, set the base path to '.', then list the paths to scan for strings in the paths textbox:
  4. To finish the catalog properties, just check the keywords:
    The usual suspects that you may want to use are:
    • _
    • gettext
    If you use WordPress, you may wish to add __ and _e.
  5. Once you've entered all the properties, save the .po file to the appropriate LC_MESSAGES directory (let's save it as lang.po into the Italian directory):
  6. Next, PoEdit will run an update to grab all the gettext strings in the paths you've entered. If successful, a list of strings will appear.
    In addition, your directory should now contain two new files: lang.po and lang.mo.

That's it, we're done creating the localization file. Now all that we need to do is enter the translation strings and save.
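If you'd rather script the directory layout from step 1 than click it together, something along these lines should do. Treat it as a sketch: make_locale_dirs() is a name I made up, and the demo writes under the system temp directory so it is safe to run anywhere. In a real project you would pass your site root plus whatever you called the localization directory.

```php
<?php
// Create <base>/<locale>/LC_MESSAGES for each locale and return the paths.
// make_locale_dirs() is a hypothetical helper, not part of PHP or PoEdit.
function make_locale_dirs(string $base, array $locales): array
{
    $created = [];
    foreach ($locales as $locale) {
        $dir = $base . '/' . $locale . '/LC_MESSAGES';
        // The third argument (true) also creates missing parent directories.
        if (!is_dir($dir) && !mkdir($dir, 0775, true)) {
            throw new RuntimeException("Could not create $dir");
        }
        $created[] = $dir;
    }
    return $created;
}

// Demo location only; replace with e.g. __DIR__ . '/localization'.
$base = sys_get_temp_dir() . '/gettext-demo/localization';
$dirs = make_locale_dirs($base, ['cy_GB', 'it_IT']);
print_r($dirs);
```

Running it twice is harmless, because existing directories are simply skipped.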

Need the create-from-.pot approach? I'll cover that in a new post on PoEdit soon, but in short, just create a .po file as above and name it lang.pot. The two file formats are the same.
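To give a feel for what ends up inside such a template, here is a rough sketch that pulls the strings out of the example page and prints them as entries with empty translations. Everything here is illustrative: extract_strings() and pot_entries() are names I invented, the regex only copes with simple single-quoted _('...') calls, and a real .pot header normally carries more metadata than the single Content-Type line shown.

```php
<?php
// Illustration only: PoEdit uses a proper parser. This regex handles just
// simple single-quoted calls such as _('text') with no escaped quotes.
function extract_strings(string $source): array
{
    preg_match_all("/\\b_\\(\\s*'([^']*)'\\s*\\)/", $source, $matches);
    return array_values(array_unique($matches[1]));
}

// Build template entries: each msgid gets an empty msgstr to be filled in.
function pot_entries(array $strings): string
{
    // Header entry: an empty msgid whose msgstr carries the metadata.
    $out = "msgid \"\"\nmsgstr \"\"\n"
         . "\"Content-Type: text/plain; charset=UTF-8\\n\"\n\n";
    foreach ($strings as $s) {
        // Escape backslashes and double quotes inside the quoted string.
        $id = addcslashes($s, "\\\"");
        $out .= "msgid \"$id\"\nmsgstr \"\"\n\n";
    }
    return $out;
}

$page = <<<'PAGE'
<title><?php echo _('This is the page title'); ?></title>
<h2><?php echo _('This is the main page title'); ?></h2>
<p><?php echo _('Enter your intro text here'); ?></p>
PAGE;

echo pot_entries(extract_strings($page));
```

A translator's .po file then looks the same, except that each msgstr holds the translated text.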

Getting GETTEXT to work in Windows

OK, that last bit looked a bit complicated, and getting GETTEXT to play nicely with Windows is a bit of a nightmare too, with very patchy documentation. So, I hope the following is useful and works for you.

In the first code snippet, we came across:

<?php include('includes/config.php'); //contains our gettext setup - see later?>

So here now is the all-important code in the includes/config.php file to get everything to work:

    <?php
    // check url for lang parameter and limit the allowed languages,
    // otherwise fall back to default text
    if (isset($_GET['lang']) && in_array($_GET['lang'], array('it_IT', 'cy_GB'), true)) {
        $lang = $_GET['lang'];
        setlocale(LC_ALL, $lang);
        $fn = 'lang'; // the name of the .po file (the text domain)
        bindtextdomain($fn, "./localization");
        bind_textdomain_codeset($fn, 'UTF-8');
        textdomain($fn); // make this the default domain used by _()
    }
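
The same setup can also be wrapped in a couple of small functions, which makes the whitelist check easy to test on its own. This is only a sketch under a few assumptions: pick_lang() and init_gettext() are names I made up, and the putenv() line is a commonly reported workaround for Windows, where setlocale() often does not accept names like it_IT. It is not taken from the snippet above, so check it against your own setup.

```php
<?php
// Return the requested language if it is whitelisted, otherwise null
// (null means: leave the source-language text untouched).
function pick_lang(array $query, array $allowed): ?string
{
    $lang = $query['lang'] ?? null;
    if (is_string($lang) && in_array($lang, $allowed, true)) {
        return $lang;
    }
    return null;
}

// Hypothetical helper wrapping the gettext calls from config.php.
function init_gettext(string $lang, string $domain, string $dir): bool
{
    if (!extension_loaded('gettext')) {
        return false; // the gettext extension is not enabled in php.ini
    }
    // Assumption: export the locale through the environment as well,
    // since setlocale() alone may fail on Windows.
    putenv("LC_ALL=$lang");
    setlocale(LC_ALL, $lang);
    bindtextdomain($domain, $dir);
    bind_textdomain_codeset($domain, 'UTF-8');
    textdomain($domain); // make $domain the default for _()
    return true;
}

$lang = pick_lang($_GET, ['it_IT', 'cy_GB']);
if ($lang !== null) {
    init_gettext($lang, 'lang', __DIR__ . '/localization');
}
```

Requesting page.php?lang=it_IT would then switch the page to Italian, while any value outside the whitelist leaves the default text in place.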

1 comment:

  1. Hi, Alan!
    If you’re interested in localizing software that uses .po language files, I warmly recommend http://poeditor.com/
    It's a web-based localization platform with a very intuitive work interface, easy to use even for technically inexperienced translators. It is perfect for crowdsourcing and it has a lot of management-oriented features.
    It will most likely make your work a lot easier, so feel free to try it out and, if you like it, to recommend it to developers and everyone who might find it useful.