A library with CLI tools for adding i18n and l10n to Python apps and libs with ease.
verboselib is a thin abstraction layer on top of GNU gettext toolset and Python gettext module.
In contrast to the raw gettext API, verboselib provides a unified and thread-safe way to access translation catalogs and an easier way to set active language dynamically.
Additionally, it allows definitions of lazy translations, which can be useful for module-level variables or class-level attributes.
Besides, verboselib comes with a bunch of CLI tools for extracting translatable messages from sources and compiling them. Those tools do not impose code dependencies and can be used as stand-alone utilities.
The primary domain of use is applications & services. However, it can also be used in libraries.
In such a case, users of the target library will have to be aware of using verboselib as well. Hence, verboselib can be viewed as an i18n & l10n framework.
Generally, reliance on frameworks is a thing to refrain from in stand-alone public libraries, but it can also be a totally valid design decision for auxiliary libraries at a product-level scale.
Available as a PyPI package:
pip install verboselib
The following examples provide a quick overview of what usage of verboselib looks like.
The briefest usage example shows how to get immediate translations:
from verboselib import Translations # (1)
from verboselib import set_language # (2)
translations = Translations( # (3)
domain="the_app",
locale_dir_path="locale",
)
_ = translations.gettext # (4)
set_language("en") # (5)
print(_("Hi there!")) # (6) 'Hi there!'
set_language("sv") # (7)
print(_("Hi there!")) # (8) 'Hej där!'
And here is the explanation for the noted lines:
- (1) Import the Translations class, which is a translations registry.
- (2) Import the set_language() function, which allows switching between languages.
- (3) Create an instance of the Translations class, specifying the messages domain and the location of the translations catalogs directory.
- (4) Define _ as a shortcut for the translations.gettext() function.
- (5) Set the current language to English.
- (6) Print the Hi there! message passed as an argument to the _ function. This gives Hi there! as the output.
- (7) Set the current language to Swedish.
- (8) Print the same message again and get Hej där! as the output.
The example is naïve, but calls to gettext() via the _ shortcut are very common inside functions, where messages are translated when functions are called. For example:
def print_message():
    print(_("Hi there!"))

set_language("en")
print_message() # 'Hi there!'
set_language("sv")
print_message() # 'Hej där!'
Oftentimes there's a need to have a placeholder or just a message, the definition of which must be separated from its evaluation. This is achieved via lazy translations:
from verboselib import Translations
from verboselib import set_language
translations = Translations(
domain="the_app",
locale_dir_path="locale",
)
L_ = translations.gettext_lazy # (1)
class Greeter:
    greeting_fmt = L_("Hi there, {name}!") # (2)

    @classmethod
    def make_greeting(cls, name):
        return cls.greeting_fmt.format( # (3)
            name=name,
        )

set_language("en")
print(Greeter.make_greeting("user")) # (4) 'Hi there, user!'
set_language("sv")
print(Greeter.make_greeting("user")) # 'Hej där, user!'
Comments for the noted lines:
- (1) gettext_lazy is used instead of gettext and the L_ shortcut is used instead of _.
- (2) A translatable string is defined as a class-level attribute using the L_ shortcut.
- (3) The translatable string is accessed as a normal string.
- (4) The method is called and a parameterized translated string is returned.
This example is also naïve, but here the value of Greeter.greeting_fmt is not translated into a solid string during construction of the Greeter class. This is important, as the class is constructed only once. The actual type of greeting_fmt is not a string, but lazy_string.LazyString, which is a string's proxy:
>>> type(Greeter.greeting_fmt)
<class 'lazy_string.LazyString'>
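Why the postponed evaluation matters can be illustrated without verboselib at all. The following is a minimal sketch using only plain Python; `LazyText`, `CATALOG`, `state`, and `translate` are hypothetical names made up for this illustration and are not part of verboselib or lazy-string, whose real proxy is more complete.

```python
# Hypothetical, simplified stand-ins; not verboselib's implementation.
CATALOG = {"sv": {"Hi there, {name}!": "Hej där, {name}!"}}
state = {"language": "en"}


def translate(message):
    """Look the message up for the language that is active right now."""
    return CATALOG.get(state["language"], {}).get(message, message)


class LazyText:
    """Stores a message and translates it only when it is actually used."""

    def __init__(self, message):
        self._message = message

    def __str__(self):
        return translate(self._message)

    def format(self, *args, **kwargs):
        return str(self).format(*args, **kwargs)


eager = translate("Hi there, {name}!")  # evaluated once, while "en" is active
lazy = LazyText("Hi there, {name}!")    # evaluation is postponed

state["language"] = "sv"
print(eager.format(name="user"))  # Hi there, user!
print(lazy.format(name="user"))   # Hej där, user!
```

The eager value is frozen at definition time, which is exactly the problem a class-level attribute would have, while the lazy one follows the language that is active at the moment of use.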
There are several aspects to consider when using verboselib:
- Active language.
- Translations catalogs registry: an instance of the verboselib.Translations class.
- Translations catalogs directory: a directory where .po and .mo files are located.
- Translatable messages themselves.
- Tools for extracting messages and compiling translations.
The sections below describe those aspects separately.
Active language is the language which will be used for getting final values of translatable strings.
Its current value is a string defined by a user, e.g. "en". The value can be missing, i.e. not set.
In addition to the current value, it is possible to define a default value. So, if the current value is not set, it will fall back to the default value.
By default, both "current value" and "default value" are not set, i.e. they are None.
Finally, it's possible to turn translations off, so that translations will be equal to original messages.
The current language in verboselib is controlled and queried via the following functions:

set_language(language)
Sets the current language for the current thread.
from verboselib import set_language
set_language("en")

get_language()
Queries the name of the current language in the current thread as a string.
from verboselib import get_language
from verboselib import set_language
get_language() # None
set_language("en")
get_language() # 'en'

drop_language()
Removes the value of the current language for the current thread. The value will fall back to the default value.
from verboselib import drop_language
from verboselib import get_language
from verboselib import set_language
set_language("en")
get_language() # 'en'
drop_language()
get_language() # None

The default language is controlled by functions which are similar to functions used to control the current language:
set_default_language(language)
Sets the default language for the current thread.
from verboselib import set_default_language
set_default_language("en")

get_default_language()
Queries the value of the default language for the current thread as a string.
from verboselib import get_default_language
from verboselib import set_default_language
get_default_language() # None
set_default_language("en")
get_default_language() # 'en'

drop_default_language()
Removes the value of the default language for the current thread.
from verboselib import drop_default_language
from verboselib import get_default_language
from verboselib import set_default_language
set_default_language("en")
get_default_language() # 'en'
drop_default_language()
get_default_language() # None

Usually, only set_default_language(...) is used. This can be helpful if None is a possible value for the current language. In such a case at least a default language will be used:
def greet_user(user):
    set_language(user.language) # can be None
    print(_("Hi there, {name}!").format(name=user.name))
    drop_language()

set_default_language("en")
...
user = get_user()
greet_user(user)
At certain times it can be useful to disable translations, for example, during debugging.
This can be done via the set_language_bypass() function. It disables the current language and prevents it from falling back to the default language.
from verboselib import drop_language
from verboselib import get_language
from verboselib import set_language
from verboselib import set_default_language
from verboselib import set_language_bypass
set_default_language("en")
set_language("fr")
get_language() # 'fr'
set_language_bypass()
get_language() # None
drop_language()
get_language() # 'en'
Note that the 2nd call to get_language() returned None.
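The rules for the current value, the default value, and the bypass can be summarized as a small model. This is a minimal sketch of the behaviour described in this section, not verboselib's actual implementation; the `LanguageState` class and its attribute names are hypothetical.

```python
class LanguageState:
    """Hypothetical model of the current / default / bypass rules."""

    _BYPASS = object()  # marker meaning "translations are turned off"

    def __init__(self):
        self.current = None
        self.default = None

    def set_bypass(self):
        self.current = self._BYPASS

    def resolve(self):
        if self.current is self._BYPASS:
            return None          # bypass: no fallback to the default
        if self.current is not None:
            return self.current  # an explicitly set language wins
        return self.default      # otherwise fall back (may be None too)


state = LanguageState()
state.default = "en"
state.current = "fr"
first = state.resolve()
state.set_bypass()
second = state.resolve()
state.current = None  # same effect as dropping the current language
third = state.resolve()
print(first, second, third)  # fr None en
```

The three printed values mirror the three get_language() results in the example above.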
verboselib comes with a couple of helper functions for converting languages to locales:
from verboselib import to_locale
to_locale("en-us") # 'en_US'
and vice versa, for converting locales to languages:
from verboselib import to_language
to_language("en_US") # 'en-us'
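The naming convention behind these two helpers can be sketched in a few lines. The functions below are hypothetical re-implementations written only to reproduce the two examples above; the real helpers may treat other inputs (for instance, script subtags) differently.

```python
def to_locale_sketch(language):
    """Turn a language name like 'en-us' into a locale name like 'en_US'."""
    lang, sep, region = language.partition("-")
    if not sep:
        return lang.lower()  # no region part, e.g. 'en'
    return f"{lang.lower()}_{region.upper()}"


def to_language_sketch(locale):
    """Turn a locale name like 'en_US' into a language name like 'en-us'."""
    return locale.replace("_", "-").lower()


print(to_locale_sketch("en-us"))    # en_US
print(to_language_sketch("en_US"))  # en-us
```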
Translations catalogs registry (verboselib.Translations) is a facade in front of gettext.GNUTranslations.
The verboselib.Translations class requires the following arguments to be provided:
- domain: a name (string) of the domain of translations. Usually, it's the name of the application, of the library, or it can be just "messages".
- locale_dir_path: a path (string or pathlib.Path) to the translations catalogs directory, which is a place where actual translations are stored. Usually, such a directory is called locale and is located inside the top-level directory of the application or library. The path is strongly recommended to be absolute.
Example:
from pathlib import Path
from verboselib import Translations
__here__ = Path(__file__).absolute().parent
translations = Translations(
domain="messages",
locale_dir_path=(__here__ / "locale"),
)
Although instances of Translations are just objects which can be passed to functions, it is recommended to create a single instance of Translations as a global variable in a separate module, say translations.py. Those instances are thread-safe.
Additionally, it can be handy to make module-level aliases for the methods of a Translations instance:
# foo_package/translations.py
from pathlib import Path
from verboselib import Translations
translations = Translations(
domain="foo_package",
locale_dir_path=(Path(__file__).absolute().parent / "locale"),
)
gettext = translations.gettext
gettext_lazy = translations.gettext_lazy
This can look a bit ugly, but in such a case it's convenient to access those methods as functions from other modules, e.g.:
# foo_package/logic.py
from .translations import gettext as _
from .translations import gettext_lazy as L_
print(_("Hello"))
greeting_fmt = L_("Hello, {name}")
The API of verboselib.Translations is compatible with GNUTranslations and includes the following methods:
gettext(message)
ngettext(singular, plural, n)
pgettext(context, message)
npgettext(context, singular, plural, n)
Additionally, verboselib.Translations provides their lazy versions:
gettext_lazy(message)
ngettext_lazy(singular, plural, n)
pgettext_lazy(context, message)
npgettext_lazy(context, singular, plural, n)
Those lazy methods return an instance of lazy_string.LazyString which is a string's proxy.
As for ngettext and npgettext methods and their lazy counterparts, not only an int can be passed as the n argument, but also a callable accepting no arguments and returning an int. For example, both the following calls are valid and conceptually identical:
translations.ngettext("window", "windows", 1)
translations.ngettext("window", "windows", lambda: 1)
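How an "int or callable" argument can be handled is sketched below with the standard library only. `resolve_count` is a hypothetical helper, and gettext.NullTranslations is used merely as a catalog-free stand-in that returns the singular form for n == 1 and the plural form otherwise; this is not how verboselib is necessarily implemented.

```python
import gettext


def resolve_count(n):
    """Return n itself, or the result of calling it when it is callable."""
    return n() if callable(n) else n


# NullTranslations has no catalog: singular for n == 1, plural otherwise.
fallback = gettext.NullTranslations()

as_int = fallback.ngettext("window", "windows", resolve_count(1))
as_callable = fallback.ngettext("window", "windows", resolve_count(lambda: 1))
many = fallback.ngettext("window", "windows", resolve_count(lambda: 3))
print(as_int, as_callable, many)  # window window windows
```

Accepting a callable is mostly useful together with lazy methods, where the count should be read at the moment the string is rendered rather than when it is defined.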
All translations are stored in a catalogs directory, where each language has its own subdirectory.
This section describes how to build such a catalog.
Firstly, translatable messages are extracted from source files into .po files. Those files contain IDs of messages and file locations where those messages are observed, e.g.:
#: foo.py:105 foo.py:203
msgid "Hi there, {name}!"
msgstr ""
Translators then fill in translations for the IDs inside .po files:
#: foo.py:105 foo.py:203
msgid "Hi there, {name}!"
msgstr "Hej där, {name}!"
Finally, those .po files are compiled into .mo files.
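For orientation, Python's standard gettext module looks up compiled catalogs at `<localedir>/<language>/LC_MESSAGES/<domain>.mo`. The sketch below computes such paths; it assumes that verboselib keeps the `.po` source next to the `.mo` file in that same layout, which is an assumption and not something stated here. `catalog_paths` is a hypothetical helper.

```python
from pathlib import PurePosixPath


def catalog_paths(locale_dir, locale, domain):
    """Return the assumed (.po, .mo) paths for one locale and domain."""
    base = PurePosixPath(locale_dir) / locale / "LC_MESSAGES"
    return base / f"{domain}.po", base / f"{domain}.mo"


po_path, mo_path = catalog_paths("locale", "sv", "the_app")
print(po_path)  # locale/sv/LC_MESSAGES/the_app.po
print(mo_path)  # locale/sv/LC_MESSAGES/the_app.mo
```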
In order to be discovered and extracted, messages in source files have to be marked in a certain way.
This is achieved by wrapping a message in a pair of parentheses () prefixed by a keyword. This makes it look like a function call, which it really is:
gettext("a message")
Here, gettext is used as a keyword. It's also possible to use its shortcut, which is _:
_("a message")
Both of those variants are equivalent, but the latter is more concise.
By default, verboselib recognizes the following keywords:
Keyword | Shortcut | Example |
---|---|---|
gettext | _ | _("message") |
gettext_lazy | L_ | L_("message") |
ngettext | N_ | N_("single", "plural", 123) |
ngettext_lazy | LN_ | LN_("single", "plural", 123) |
pgettext | P_ | P_("message context", "message") |
pgettext_lazy | LP_ | LP_("message context", "message") |
npgettext | NP_ | NP_("message context", "single", "plural", 123) |
npgettext_lazy | LNP_ | LNP_("message context", "single", "plural", 123) |
Technically, any literal can be used as a keyword. But if a non-default keyword is used, it must be specified during extraction, which is described later.
Every .po file includes a header at the beginning.
It consists of key-value metadata separated from the file's body by a blank line. Example:
msgid ""
msgstr ""
"Project-Id-Version: foo 1.0.0\n"
"PO-Revision-Date: 2020-10-09 21:24+0300\n"
"Report-Msgid-Bugs-To: support@foo.com\n"
"Last-Translator: Mr Translator <translation-team@foo.com>\n"
"Language: de\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"

msgid "Log in"
msgstr ""
Refer to gettext docs for more details on .po files and on .po headers.
Functions such as pgettext(), npgettext(), and their lazy counterparts accept a message context.
This is just a string which will appear in .po files to give translators a hint about the meaning of the message.
For example, the following call to pgettext:
P_("abbrev. month", "Jan")
will add a msgctxt attribute to .po files:
#: foo.py:90
msgctxt "abbrev. month"
msgid "Jan"
msgstr ""
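A context also lets the same message ID carry different translations. The sketch below models a catalog as a dictionary keyed by (context, message); the `catalog` contents and the `pgettext_sketch` function are made up for illustration and do not come from verboselib or from any real translation file.

```python
# Hypothetical in-memory catalog; the entries are illustrative only.
catalog = {
    ("abbrev. month", "Jan"): "jan",
    ("person name", "Jan"): "Jan",
}


def pgettext_sketch(context, message):
    """Return the translation for (context, message) or the message itself."""
    return catalog.get((context, message), message)


print(pgettext_sketch("abbrev. month", "Jan"))  # jan
print(pgettext_sketch("person name", "Jan"))    # Jan
print(pgettext_sketch("unknown context", "Jan"))  # Jan (falls back to the ID)
```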
Functions such as ngettext(), npgettext(), and their lazy counterparts return different translations depending on the integer n provided to them, e.g.:
N_("window", "windows", 1)
In this trivial example n is 1. However, it can be a variable or a parameterless callable returning an int:
def get_users_online() -> int:
    n = ...
    return n

N_("user online", "users online", get_users_online)
This looks pretty simple, but that is not the end of the story.
Different languages can have different numbers of plural forms, and each form has its own selection rule.
So, in order to make plural forms actually work, each .po file must include Plural-Forms metadata in its header.
For example, languages of the Germanic family, like English, have 2 plural forms defined as:
"Plural-Forms: nplurals=2; plural=n != 1\n"
Examples of rules for other languages can be found at Plural-Forms documentation page.
Finally, every message having plural forms must have as many translations as there are plural forms specified by Plural-Forms. For example, for a language with 3 plural forms:
#: foo.py:74
msgid "window"
msgid_plural "windows"
msgstr[0] "вікно"
msgstr[1] "вікна"
msgstr[2] "вікон"
Refer to gettext docs for more info on translating plural forms.
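To make the role of the plural rule concrete, here is a sketch that selects one of the three msgstr entries above by hand. It assumes the rule commonly listed for Ukrainian in the gettext documentation, `nplurals=3; plural=(n%10==1 && n%100!=11 ? 0 : n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2)`, rewritten in Python; `ukrainian_plural_index` is a hypothetical name, and at runtime gettext evaluates the rule from the header itself.

```python
def ukrainian_plural_index(n):
    """Python rendering of the assumed Ukrainian Plural-Forms expression."""
    if n % 10 == 1 and n % 100 != 11:
        return 0
    if 2 <= n % 10 <= 4 and not 10 <= n % 100 < 20:
        return 1
    return 2


forms = ["вікно", "вікна", "вікон"]  # msgstr[0], msgstr[1], msgstr[2]

for n in (1, 2, 5, 11, 21, 22):
    print(n, forms[ukrainian_plural_index(n)])
# 1 вікно
# 2 вікна
# 5 вікон
# 11 вікон
# 21 вікно
# 22 вікна
```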
verboselib comes with a couple of stand-alone CLI utilities for extracting and compiling translatable messages.
These utilities are implemented as subcommands of the main command named verboselib.
Run verboselib with the -h flag to get generic help:
verboselib -h
usage: verboselib [-h] [-V] {extract,x,compile,c} ...

run a verboselib command

optional arguments:
  -h, --help            show this help message and exit
  -V, --version         show version of verboselib and exit

subcommands:
  {extract,x,compile,c}
    extract (x)         extract translatable strings from sources into '.po' files
    compile (c)         compile '.po' text files into '.mo' binaries
The extract (x) subcommand is used to extract translatable messages from sources. It creates or updates the directory with translations catalogs. Run it with the -h flag for help:
verboselib x -h
usage: extract [-h] [-d DOMAIN] [-l LOCALE] [-a] [-o OUTPUT_DIR] [-k KEYWORD]
               [--no-default-keywords] [-e EXTENSIONS] [-s] [-i PATTERN]
               [--no-default-ignore] [--no-wrap] [--no-location]
               [--no-obsolete] [--keep-pot]
               [--xgettext-extra-args XGETTEXT_EXTRA_ARGS]
               [--msguniq-extra-args MSGUNIQ_EXTRA_ARGS]
               [--msgmerge-extra-args MSGMERGE_EXTRA_ARGS]
               [--msgattrib-extra-args MSGATTRIB_EXTRA_ARGS] [-v]

extract translatable strings from sources into '.po' files

optional arguments:
  -h, --help
      show this help message and exit
  -d DOMAIN, --domain DOMAIN
      domain of message files (default: messages)
  -l LOCALE, --locale LOCALE
      create or update '.po' message files for the given locale(s), ex: 'en_US'; can be specified multiple times (default: None)
  -a, --all
      update all '.po' message files for all existing locales (default: False)
  -o OUTPUT_DIR, --output-dir OUTPUT_DIR
      path to the directory where locales will be stored, a.k.a. 'locale dir' (default: locale)
  -k KEYWORD, --keyword KEYWORD
      extra keyword to look for, ex: 'L_'; can be specified multiple times (default: None)
  --no-default-keywords
      do not use default keywords as {'_', 'gettext', 'L_', 'gettext_lazy', 'N_:1,2', 'ngettext:1,2', 'LN_:1,2', 'ngettext_lazy:1,2', 'P_:1c,2', 'pgettext:1c,2', 'LP_:1c,2', 'pgettext_lazy:1c,2', 'NP_:1c,2,3', 'npgettext:1c,2,3', 'LNP_:1c,2,3', 'npgettext_lazy:1c,2,3'} (default: False)
  -e EXTENSIONS, --extension EXTENSIONS
      extra file extension(s) to scan in addition to '.py'; separate multiple values with commas or specify the parameter multiple times (default: None)
  -s, --links
      follow links to files and directories when scanning sources for translation strings (default: False)
  -i PATTERN, --ignore PATTERN
      extra glob-style patterns for ignoring files or directories; can be specified multiple times (default: None)
  --no-default-ignore
      do not ignore the common glob-style patterns as {'.*', '*~', 'CVS', '__pycache__', '*.pyc'} (default: False)
  --no-wrap
      do not break long message lines into several lines (default: False)
  --no-location
      do not write location lines, ex: '#: filename:lineno' (default: False)
  --no-obsolete
      remove obsolete message strings (default: False)
  --keep-pot
      keep '.pot' file after creating '.po' files (useful for debugging) (default: False)
  --xgettext-extra-args XGETTEXT_EXTRA_ARGS
      extra arguments for 'xgettext' utility; can be comma-separated or specified multiple times (default: None)
  --msguniq-extra-args MSGUNIQ_EXTRA_ARGS
      extra arguments for 'msguniq' utility; can be comma-separated or specified multiple times (default: None)
  --msgmerge-extra-args MSGMERGE_EXTRA_ARGS
      extra arguments for 'msgmerge' utility; can be comma-separated or specified multiple times (default: None)
  --msgattrib-extra-args MSGATTRIB_EXTRA_ARGS
      extra arguments for 'msgattrib' utility; can be comma-separated or specified multiple times (default: None)
  -v, --verbose
      use verbose output (default: False)
The basic usage example:
verboselib x -l 'uk' -l 'en' -l 'it'
Use the -a flag to update all existing .po files:
verboselib x -a
Use the --keyword (-k) argument to specify additional keywords to look for, e.g.:
verboselib x -a -k 'FOO_' -k 'BAR_'
The compile (c) subcommand compiles all .po files into .mo files. Basic usage has no arguments:
verboselib c
Use the -h flag for help:
verboselib c -h
usage: compile [-h] [-d LOCALES_DIR] [-l LOCALE] [-e EXCLUDE] [-f]
               [--msgfmt-extra-args MSGFMT_EXTRA_ARGS] [-v]

compile '.po' text files into '.mo' binaries

optional arguments:
  -h, --help
      show this help message and exit
  -d LOCALES_DIR, --locale-dir LOCALES_DIR
      path to the directory where locales are stored (default: locale)
  -l LOCALE, --locale LOCALE
      locale(s) to process, ex: 'en_US'; can be specified multiple times; all locales are processed if not specified (default: None)
  -e EXCLUDE, --exclude EXCLUDE
      locale(s) to exclude, ex: 'en_US'; can be specified multiple times (default: None)
  -f, --use-fuzzy
      use fuzzy translations (default: False)
  --msgfmt-extra-args MSGFMT_EXTRA_ARGS
      extra arguments for 'msgfmt' utility; can be comma-separated or specified multiple times (default: None)
  -v, --verbose
      use verbose output (default: False)
The current and the default languages are thread-local. Hence, the functions for manipulating and querying them, like set_language(), are thread-safe. However, the values have to be set in each thread separately.
As for the translations catalog registry, verboselib.Translations, it is also thread-safe, as it relies on RLocks. It is the recommended choice for libraries. However, if the target is an application that is guaranteed to be single-threaded, it's possible to use a non-thread-safe version:
from verboselib import NotThreadSafeTranslations
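What "thread-local" means in practice can be shown with the standard library alone. The sketch below uses threading.local as a hypothetical stand-in for the per-thread language storage; the `languages` object and the `worker` function are made up for illustration and are not verboselib APIs.

```python
import threading

languages = threading.local()  # hypothetical per-thread storage
languages.current = "en"       # set in the main thread only

seen_in_worker = []


def worker():
    # A new thread starts without a value of its own.
    seen_in_worker.append(getattr(languages, "current", None))
    languages.current = "sv"   # affects this thread only
    seen_in_worker.append(languages.current)


thread = threading.Thread(target=worker)
thread.start()
thread.join()

print(seen_in_worker)     # [None, 'sv']
print(languages.current)  # en
```

The worker neither inherits the main thread's value nor overwrites it, which is why the language has to be set in each thread separately.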
1.1.0 (Nov 18, 2020)
API changes:
- verboselib.lazy is extracted into a standalone lib lazy-string.
Misc:
- Annotations using classes from typing are tidied up and deprecations starting from Python 3.9 are handled.
1.0.1 (Oct 30, 2020)
- Fix verboselib.utils.export() helper which adds objects to __all__ variable of their own modules.
1.0.0 (Oct 11, 2020)
API changes:
- verboselib.factory.TranslationsFactory is now verboselib.translations.Translations.
- locale_dir argument of Translations is locale_dir_path now and instances of pathlib.Path can be used in addition to strings.
- verboselib.translations.Translations is now thread-safe.
- verboselib.translations.NotThreadSafeTranslations is added.
- Methods ugettext() and ugettext_lazy() are removed from Translations.
- Methods ngettext(), ngettext_lazy(), pgettext(), pgettext_lazy(), npgettext(), and npgettext_lazy() are added to Translations.
- Function get_default_language() is added.
- Function verboselib.helpers.to_locale() no longer has to_lower argument.
- verboselib-manage CLI utility is now just verboselib.
- compile and extract subcommands of verboselib have c and x aliases respectively.
- Flags --no-default-keywords, --xgettext-extra-args, --msguniq-extra-args, --msgmerge-extra-args, and --msgattrib-extra-args are added to the extract command.
- Flags --exclude, --use-fuzzy, and --msgfmt-extra-args are added to the compile command.
Python support:
- Support of all Python versions below 3.7 is dropped.
Other:
- All external dependencies are removed.
- The license is switched from LGPLv3 to MIT.
0.2.1 (Jul 16, 2017)
- Fix version command.
- Rename verboselib-manage.py executable to simply verboselib-manage.
0.2.0 (Dec 31, 2014)
- Add get_default_language() method.
- Use default translation classes from gettext module.
0.1.0 (Jul 17, 2014)
Initial version.