3 The Format of PO Files

The GNU gettext toolset helps programmers and translators at producing, updating and using translation files, mainly those PO files which are textual, editable files. This chapter explains the format of PO files.

A PO file is made up of many entries, each entry holding the relation between an original untranslated string and its corresponding translation. All entries in a given PO file usually pertain to a single project, and all translations are expressed in a single target language. One PO file entry has the following schematic structure:

white-space
#  translator-comments
#. extracted-comments
#: reference...
#, flag...
#| msgid previous-untranslated-string
msgid untranslated-string
msgstr translated-string

The general structure of a PO file should be well understood by the translator. When using PO mode, very little has to be known about the format details, as PO mode takes care of them for her.

A simple entry can look like this:

#: lib/error.c:116
msgid "Unknown system error"
msgstr "Error desconegut del sistema"

Entries begin with some optional white space. Usually, when generated through GNU gettext tools, there is exactly one blank line between entries. Then comments follow, on lines all starting with the character #. There are two kinds of comments: those which have some white space immediately following the # - the translator comments -, which comments are created and maintained exclusively by the translator, and those which have some non-white character just after the # - the automatic comments -, which comments are created and maintained automatically by GNU gettext tools. Comment lines starting with #. contain comments given by the programmer, directed at the translator; these comments are called extracted comments because the xgettext program extracts them from the program’s source code. Comment lines starting with #: contain references to the program’s source code. Comment lines starting with #, contain flags; more about these below. Comment lines starting with #| contain the previous untranslated string for which the translator gave a translation.

All comments, of either kind, are optional.

References to the program’s source code, in lines that start with #:, are of the form file_name:line_number or just file_name. If the file_name contains spaces. it is enclosed within Unicode characters U+2068 and U+2069.