pybabel - Man Page
Name
babel — Babel Documentation
Babel is an integrated collection of utilities that assist in internationalizing and localizing Python applications, with an emphasis on web-based applications.
User Documentation
The user documentation explains some core concepts of the library and gives some information about how it can be used.
Introduction
The functionality Babel provides for internationalization (I18n) and localization (L10N) can be separated into two different aspects:
- tools to build and work with gettext message catalogs, and
- a Python interface to the CLDR (Common Locale Data Repository), providing access to various locale display names, localized number and date formatting, etc.
Message Catalogs
While the Python standard library includes a gettext module that enables applications to use message catalogs, it requires developers to build these catalogs using GNU tools such as xgettext, msgmerge, and msgfmt. And while xgettext does have support for extracting messages from Python files, it does not know how to deal with other kinds of files commonly found in Python web-applications, such as templates, nor does it provide an easy extensibility mechanism to add such support.
Babel addresses this by providing a framework where various extraction methods can be plugged in to a larger message extraction framework, and also removes the dependency on the GNU gettext tools for common tasks, as these aren’t necessarily available on all platforms. See Working with Message Catalogs for details on this aspect of Babel.
Locale Data
Furthermore, while the Python standard library does include support for basic localization with respect to the formatting of numbers and dates (the locale module, among others), this support is based on the assumption that there will be only one specific locale used per process (at least simultaneously.) Also, it doesn’t provide access to other kinds of locale data, such as the localized names of countries, languages, or time-zones, which are frequently needed in web-based applications.
For these requirements, Babel includes data extracted from the Common Locale Data Repository (CLDR), and provides a number of convenient methods for accessing and using this data. See Locale Data, Date and Time, and Number Formatting for more information on this aspect of Babel.
Installation
Babel is distributed as a standard Python package fully set up with all the dependencies it needs. On Python versions where the standard library zoneinfo module is not available, pytz needs to be installed for timezone support. If pytz is installed, it is preferred over the standard library zoneinfo module where possible.
virtualenv
Virtualenv is probably what you want to use during development, and if you have shell access to your production machines, you’ll probably want to use it there, too. Use pip to install it:
$ sudo pip install virtualenv
If you’re on Windows, run it in a command-prompt window with administrator privileges, and leave out sudo.
Once you have virtualenv installed, just fire up a shell and create your own environment. I usually create a project folder and a venv folder within:
$ mkdir myproject
$ cd myproject
$ virtualenv venv
New python executable in venv/bin/python
Installing distribute............done.
Now, whenever you want to work on a project, you only have to activate the corresponding environment. On OS X and Linux, do the following:
$ . venv/bin/activate
If you are a Windows user, the following command is for you:
$ venv\scripts\activate
Either way, you should now be using your virtualenv (notice how the prompt of your shell has changed to show the active environment).
Now you can just enter the following command to get Babel installed in your virtualenv:
$ pip install Babel
A few seconds later and you are good to go.
System-Wide Installation
This is possible as well, though I do not recommend it. Just run pip with root privileges:
$ sudo pip install Babel
(On Windows systems, run it in a command-prompt window with administrator privileges, and leave out sudo.)
Living on the Edge
If you want to work with the latest version of Babel, you will need to use a git checkout.
Get the git checkout in a new virtualenv and run in development mode:
$ git clone https://github.com/python-babel/babel
Initialized empty Git repository in ~/dev/babel/.git/
$ cd babel
$ virtualenv venv
New python executable in venv/bin/python
Installing distribute............done.
$ . venv/bin/activate
$ python setup.py import_cldr
$ pip install --editable .
...
Finished processing dependencies for Babel
Make sure to not forget about the import_cldr step because otherwise you will be missing the locale data. The custom setup command will download the most appropriate CLDR release from the official website and convert it for Babel.
This will also pull in the dependencies and activate the git head as the current version inside the virtualenv. Then all you have to do is run git pull origin to update to the latest version. If the CLDR data changes, you will have to re-run python setup.py import_cldr.
Locale Data
While message catalogs allow you to localize any messages in your application, there are a number of strings that are used in many applications for which translations are readily available.
Imagine for example you have a list of countries that users can choose from, and you’d like to display the names of those countries in the language the user prefers. Instead of translating all those country names yourself in your application, you can make use of the translations provided by the locale data included with Babel, which is based on the Common Locale Data Repository (CLDR) developed and maintained by the Unicode Consortium.
The Locale Class
You normally access such locale data through the Locale class provided by Babel:
>>> from babel import Locale
>>> locale = Locale('en', 'US')
>>> locale.territories['US']
u'United States'
>>> locale = Locale('es', 'MX')
>>> locale.territories['US']
u'Estados Unidos'
In addition to country/territory names, the locale data also provides access to names of languages, scripts, variants, time zones, and more. Some of the data is closely related to number and date formatting.
Most of the corresponding Locale properties return dictionaries, where the key is a code such as the ISO country and language codes. Consult the API documentation for references to the relevant specifications.
Locale Display Names
A locale can be used to describe itself or other locales. This mainly means that, given a locale object, you can ask it for its canonical display name, the name of its language, and other things. Since the locales cross-reference each other, you can ask for locale names in any language supported by the CLDR:
>>> l = Locale.parse('de_DE')
>>> l.get_display_name('en_US')
u'German (Germany)'
>>> l.get_display_name('fr_FR')
u'allemand (Allemagne)'
Display names include all the information to uniquely identify a locale (language, territory, script and variant) which is often not what you want. You can also ask for the information in parts:
>>> l.get_language_name('de_DE')
u'Deutsch'
>>> l.get_language_name('it_IT')
u'tedesco'
>>> l.get_territory_name('it_IT')
u'Germania'
>>> l.get_territory_name('pt_PT')
u'Alemanha'
Calendar Display Names
The Locale class provides access to many locale display names related to calendar display, such as the names of weekdays or months.
These display names are of course used for date formatting, but can also be used, for example, to show a list of months to the user in their preferred language:
>>> locale = Locale('es')
>>> month_names = locale.months['format']['wide'].items()
>>> for idx, name in sorted(month_names):
...     print(name)
enero
febrero
marzo
abril
mayo
junio
julio
agosto
septiembre
octubre
noviembre
diciembre
Date and Time
When working with date and time information in Python, you commonly use the classes date, datetime and/or time from the datetime package. Babel provides functions for locale-specific formatting of those objects in its dates module:
>>> from datetime import date, datetime, time
>>> from babel.dates import format_date, format_datetime, format_time
>>> d = date(2007, 4, 1)
>>> format_date(d, locale='en')
u'Apr 1, 2007'
>>> format_date(d, locale='de_DE')
u'01.04.2007'
As this example demonstrates, Babel will automatically choose a date format that is appropriate for the requested locale.
The format_*() functions also accept an optional format argument, which allows you to choose between one of four format variations:
- short,
- medium (the default),
- long, and
- full.
For example:
>>> format_date(d, format='short', locale='en')
u'4/1/07'
>>> format_date(d, format='long', locale='en')
u'April 1, 2007'
>>> format_date(d, format='full', locale='en')
u'Sunday, April 1, 2007'
Core Time Concepts
Working with dates and time can be a complicated thing. Babel attempts to simplify working with them by making some decisions for you. Python’s datetime module has different ways to deal with times and dates: naive and timezone-aware datetime objects.
Babel generally recommends that you store all your times in naive datetime objects and treat them as UTC at all times. This simplifies dealing with time a lot, because otherwise you can get into the hairy situation where you are dealing with datetime objects in different timezones. That is tricky because there are situations where time can be ambiguous. This is usually the case when dealing with dates around timezone transitions. The most common timezone transition is the change between daylight saving time and standard time.
As such, we recommend always using UTC internally and only converting to local time when returning dates to users. At that point the timezone the user has selected can usually be established, and Babel can automatically rebase the time for you.
To get the current time use the now() method of the datetime object, passing utc to it as the timezone.
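The advice above can be sketched with the standard library alone. The helper names utc_now and as_naive_utc are hypothetical and chosen for illustration; timezone.utc stands in for whichever UTC tzinfo object you pass:

```python
from datetime import datetime, timezone


def utc_now():
    """Return the current time as a timezone-aware datetime in UTC."""
    return datetime.now(timezone.utc)


def as_naive_utc(dt):
    """Convert an aware datetime to UTC and drop tzinfo for storage."""
    return dt.astimezone(timezone.utc).replace(tzinfo=None)


now = utc_now()
# Naive value that is treated as UTC by convention.
stored = as_naive_utc(now)
```

Whether you keep the aware value or the naive one is a design choice; the important part is that everything stored internally refers to UTC.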
For more information about timezones see Time-zone Support.
Pattern Syntax
While Babel makes it simple to use the appropriate date/time format for a given locale, you can also force it to use custom patterns. Note that Babel uses different patterns for specifying number and date formats compared to the Python equivalents (such as time.strftime()), which have mostly been inherited from C and POSIX. The patterns used in Babel are based on the Locale Data Markup Language specification (LDML), which defines them as follows:
A date/time pattern is a string of characters, where specific strings of characters are replaced with date and time data from a calendar when formatting or used to generate data for a calendar when parsing. […]
Characters may be used multiple times. For example, if y is used for the year, yy might produce “99”, whereas yyyy produces “1999”. For most numerical fields, the number of characters specifies the field width. For example, if h is the hour, h might produce “5”, but hh produces “05”. For some characters, the count specifies whether an abbreviated or full form should be used […]
Two single quotes represent a literal single quote, either inside or outside single quotes. Text within single quotes is not interpreted in any way (except for two adjacent single quotes).
For example:
>>> d = date(2007, 4, 1)
>>> format_date(d, "EEE, MMM d, ''yy", locale='en')
u"Sun, Apr 1, '07"
>>> format_date(d, "EEEE, d.M.yyyy", locale='de')
u'Sonntag, 1.4.2007'
>>> t = time(15, 30)
>>> format_time(t, "hh 'o''clock' a", locale='en')
u"03 o'clock PM"
>>> format_time(t, 'H:mm a', locale='de')
u'15:30 nachm.'
>>> dt = datetime(2007, 4, 1, 15, 30)
>>> format_datetime(dt, "yyyyy.MMMM.dd GGG hh:mm a", locale='en')
u'02007.April.01 AD 03:30 PM'
The syntax for custom datetime format patterns is described in detail in the Locale Data Markup Language specification. The following table is just a relatively brief overview.
Date Fields
Field | Symbol | Description |
Era | G | Replaced with the era string for the current date. One to three letters for the abbreviated form, four letters for the long form, five for the narrow form. |
Year | y | Replaced by the year. Normally the length specifies the padding, but for two letters it also specifies the maximum length. |
Year | Y | Same as y but uses the ISO year-week calendar. ISO year-week increments after completing the last week of the year. Therefore it may change a few days before or after y. Recommended for use with the w symbol. |
Year | u | ?? |
Quarter | Q | Use one or two for the numerical quarter, three for the abbreviation, or four for the full name. |
Quarter | q | Use one or two for the numerical quarter, three for the abbreviation, or four for the full name. |
Month | M | Use one or two for the numerical month, three for the abbreviation, four for the full name, or five for the narrow name. |
Month | L | Use one or two for the numerical month, three for the abbreviation, four for the full name, or five for the narrow name. |
Week | w | Week of year according to the ISO year-week calendar. This may have 52 or 53 weeks depending on the year. Recommended for use with the Y symbol. |
Week | W | Week of month. |
Day | d | Day of month. |
Day | D | Day of year. |
Day | F | Day of week in month. |
Day | g | ?? |
Week day | E | Day of week. Use one through three letters for the short day, four for the full name, or five for the narrow name. |
Week day | e | Local day of week. Same as E except adds a numeric value that will depend on the local starting day of the week, using one or two letters. |
Week day | c | ?? |
Time Fields
Field | Symbol | Description |
Period | a | AM or PM |
Hour | h | Hour [1-12]. |
Hour | H | Hour [0-23]. |
Hour | K | Hour [0-11]. |
Hour | k | Hour [1-24]. |
Minute | m | Use one or two for zero places padding. |
Second | s | Use one or two for zero places padding. |
Second | S | Fractional second, rounds to the count of letters. |
Second | A | Milliseconds in day. |
Timezone | z | Use one to three letters for the short timezone or four for the full name. |
Timezone | Z | Use one to three letters for RFC 822, four letters for GMT format. |
Timezone | v | Use one letter for short wall (generic) time, four for long wall time. |
Timezone | V | Same as z, except that timezone abbreviations should be used regardless of whether they are in common use by the locale. |
Time Delta Formatting
In addition to providing functions for formatting localized dates and times, the babel.dates module also provides a function to format the difference between two times, called a “time delta”. These are usually represented as datetime.timedelta objects in Python, and it’s also what you get when you subtract one datetime object from another.
The format_timedelta function takes a timedelta object and returns a human-readable representation. This happens at the cost of precision, as it chooses only the most significant unit (such as year, week, or hour) of the difference, and displays that:
>>> from datetime import timedelta
>>> from babel.dates import format_timedelta
>>> delta = timedelta(days=6)
>>> format_timedelta(delta, locale='en_US')
u'1 week'
The resulting strings are based on the CLDR data, and are properly pluralized depending on the plural rules of the locale and the calculated number of units.
The function provides parameters for you to influence how this most significant unit is chosen: with threshold you set the value after which the presentation switches to the next larger unit, and with granularity you can limit the smallest unit to display:
>>> delta = timedelta(days=6)
>>> format_timedelta(delta, threshold=1.2, locale='en_US')
u'6 days'
>>> format_timedelta(delta, granularity='month', locale='en_US')
u'1 month'
Time-zone Support
Many of the verbose time formats include the time-zone, but time-zone information is not by default available for the Python datetime and time objects. The standard library includes only the abstract tzinfo class, which you need appropriate implementations for to actually use in your application. Babel includes a tzinfo implementation for UTC (Universal Time).
Babel uses either zoneinfo or pytz for timezone support. If pytz is installed, it is preferred over the standard library’s zoneinfo. You can directly interface with either of these modules from within Babel:
>>> from datetime import datetime
>>> from babel.dates import format_datetime, get_timezone, UTC
>>> dt = datetime(2007, 4, 1, 15, 30, tzinfo=UTC)
>>> eastern = get_timezone('US/Eastern')
>>> format_datetime(dt, 'H:mm Z', tzinfo=eastern, locale='en_US')
u'11:30 -0400'
The recommended approach to dealing with different time-zones in a Python application is to always use UTC internally, and only convert from/to the user’s time-zone when accepting user input and displaying date/time data, respectively. You can use Babel together with zoneinfo or pytz to apply a time-zone to any datetime or time object for display, leaving the original information unchanged:
>>> british = get_timezone('Europe/London')
>>> format_datetime(dt, 'H:mm zzzz', tzinfo=british, locale='en_US')
u'16:30 British Summer Time'
Here, the given UTC time is adjusted to the “Europe/London” time-zone, and daylight savings time is taken into account. Daylight savings time is also applied to format_time, but because the actual date is unknown in that case, the current day is assumed to determine whether DST or standard time should be used.
Babel also provides support for working with the local timezone of your operating system. It’s provided through the LOCALTZ constant:
>>> from babel.dates import LOCALTZ, get_timezone_name
>>> LOCALTZ
<DstTzInfo 'Europe/Vienna' CET+1:00:00 STD>
>>> get_timezone_name(LOCALTZ)
u'Central European Time'
Localized Time-zone Names
While the Locale class provides access to various locale display names related to time-zones, the process of building a localized name of a time-zone is actually quite complicated. Babel implements it in separately usable functions in the babel.dates module, most importantly the get_timezone_name function:
>>> from babel import Locale
>>> from babel.dates import get_timezone_name, get_timezone
>>> tz = get_timezone('Europe/Berlin')
>>> get_timezone_name(tz, locale=Locale.parse('pt_PT'))
u'Hora da Europa Central'
You can pass the function either a datetime.tzinfo object, or a datetime.date or datetime.datetime object. If you pass an actual date, the function will be able to take daylight savings time into account. If you pass just the time-zone, Babel does not know whether daylight savings time is in effect, so it uses a generic representation, which is useful for example to display a list of time-zones to the user.
>>> from datetime import datetime
>>> from babel.dates import _localize
>>> dt = _localize(tz, datetime(2007, 8, 15))
>>> get_timezone_name(dt, locale=Locale.parse('de_DE'))
u'Mitteleurop\xe4ische Sommerzeit'
>>> get_timezone_name(tz, locale=Locale.parse('de_DE'))
u'Mitteleurop\xe4ische Zeit'
Number Formatting
Support for locale-specific formatting and parsing of numbers is provided by the babel.numbers module:
>>> from babel.numbers import format_number, format_decimal, format_compact_decimal, format_percent
Examples:
# Numbers with decimal places
>>> format_decimal(1.2345, locale='en_US')
u'1.234'
>>> format_decimal(1.2345, locale='sv_SE')
u'1,234'
# Integers with thousand grouping
>>> format_decimal(12345, locale='de_DE')
u'12.345'
>>> format_decimal(12345678, locale='de_DE')
u'12.345.678'
Pattern Syntax
While Babel makes it simple to use the appropriate number format for a given locale, you can also force it to use custom patterns. As with date/time formatting patterns, the patterns Babel supports for number formatting are based on the Locale Data Markup Language specification (LDML).
Examples:
>>> format_decimal(-1.2345, format='#,##0.##;-#', locale='en')
u'-1.23'
>>> format_decimal(-1.2345, format='#,##0.##;(#)', locale='en')
u'(1.23)'
The syntax for custom number format patterns is described in detail in the specification. The following table is just a relatively brief overview.
Symbol | Description |
0 | Digit |
1-9 | ‘1’ through ‘9’ indicate rounding. |
@ | Significant digit |
# | Digit, zero shows as absent |
. | Decimal separator or monetary decimal separator |
- | Minus sign |
, | Grouping separator |
E | Separates mantissa and exponent in scientific notation |
+ | Prefix positive exponents with localized plus sign |
; | Separates positive and negative subpatterns |
% | Multiply by 100 and show as percentage |
‰ | Multiply by 1000 and show as per mille |
¤ | Currency sign, replaced by currency symbol. If doubled, replaced by international currency symbol. If tripled, uses the long form of the currency name. |
' | Used to quote special characters in a prefix or suffix |
* | Pad escape, precedes pad character |
Rounding Modes
Since Babel makes full use of Python’s Decimal type to perform number rounding before formatting, users have the chance to control the rounding mode and other configurable parameters through the active Context instance.
By default, Python rounding mode is ROUND_HALF_EVEN which complies with UTS #35 section 3.3. Yet, the caller has the opportunity to tweak the current context before formatting a number or currency:
>>> from babel.numbers import decimal, format_decimal
>>> with decimal.localcontext(decimal.Context(rounding=decimal.ROUND_DOWN)):
...     txt = format_decimal(123.99, format='#', locale='en_US')
>>> txt
u'123'
It is also possible to use decimal.setcontext or to directly modify the instance returned by decimal.getcontext. However, using a context manager is usually more convenient because it restores the previous context automatically and can be nested.
Whatever mechanism is chosen, always make use of the decimal module imported from babel.numbers. For efficiency reasons, Babel uses the fastest decimal implementation available, such as cdecimal. These various implementations offer an identical API, but their types and instances do not interoperate with each other.
For example, the previous example can be slightly modified to generate unexpected results on Python 2.7, with the cdecimal module installed:
>>> from decimal import localcontext, Context, ROUND_DOWN
>>> from babel.numbers import format_decimal
>>> with localcontext(Context(rounding=ROUND_DOWN)):
...     txt = format_decimal(123.99, format='#', locale='en_US')
>>> txt
u'124'
Changing other parameters such as the precision may also alter the results of the number formatting functions. Remember to test your code to make sure it behaves as desired.
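To make the effect of the active context concrete, here is a minimal sketch that uses only the standard library decimal module and does not call Babel at all. The helper round_to_integer is hypothetical; it relies on Decimal.quantize picking up the rounding mode of the current context, and it assumes the default context has not been changed elsewhere in the process. When formatting through Babel, keep using the decimal module imported from babel.numbers as described above.

```python
from decimal import Decimal, ROUND_DOWN, localcontext


def round_to_integer(value):
    """Round using whatever rounding mode the active context specifies."""
    return Decimal(value).quantize(Decimal("1"))


# Default context: ROUND_HALF_EVEN
default_result = round_to_integer("123.99")

# Temporarily switch the rounding mode; the change ends with the block.
with localcontext() as ctx:
    ctx.rounding = ROUND_DOWN
    truncated = round_to_integer("123.99")

# Outside the block the previous rounding mode is active again.
restored = round_to_integer("123.99")

# Half-even rounding sends ties to the nearest even integer.
half_even = [round_to_integer(v) for v in ("0.5", "1.5", "2.5")]
```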
Parsing Numbers
Babel can also parse numeric data in a locale-sensitive manner:
>>> from babel.numbers import parse_decimal, parse_number
Examples:
>>> parse_decimal('1,099.98', locale='en_US')
1099.98
>>> parse_decimal('1.099,98', locale='de')
1099.98
>>> parse_decimal('2,109,998', locale='de')
Traceback (most recent call last):
...
NumberFormatError: '2,109,998' is not a valid decimal number
Note: as of version 2.8.0, the parse_number function has limited functionality. It can remove group symbols of certain locales from numeric strings, but may behave unexpectedly until its logic handles more encoding issues and other special cases.
Examples:
>>> parse_number('1,099', locale='en_US')
1099
>>> parse_number('1.099.024', locale='de')
1099024
>>> parse_number('123' + u'\xa0' + '4567', locale='ru')
1234567
>>> parse_number('123 4567', locale='ru')
...
NumberFormatError: '123 4567' is not a valid number
Working with Message Catalogs
Introduction
The gettext translation system enables you to mark any strings used in your application as subject to localization, by wrapping them in functions such as gettext(str) and ngettext(singular, plural, num). For brevity, the gettext function is often aliased to _(str), so you can write:
print(_("Hello"))
instead of just:
print("Hello")
to make the string “Hello” localizable.
Message catalogs are collections of translations for such localizable messages used in an application. They are commonly stored in PO (Portable Object) and MO (Machine Object) files, the formats of which are defined by the GNU gettext tools and the GNU translation project.
The general procedure for building message catalogs looks something like this:
- use a tool (such as xgettext) to extract localizable strings from the code base and write them to a POT (PO Template) file.
- make a copy of the POT file for a specific locale (for example, “en_US”) and start translating the messages
- use a tool such as msgfmt to compile the locale PO file into a binary MO file
- later, when code changes make it necessary to update the translations, you regenerate the POT file and merge the changes into the various locale-specific PO files, for example using msgmerge
Python provides the gettext module as part of the standard library, which enables applications to work with appropriately generated MO files.
As gettext provides a solid and well supported foundation for translating application messages, Babel does not reinvent the wheel, but rather reuses this infrastructure, and makes it easier to build message catalogs for Python applications.
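As a rough illustration of how the two fit together, the following sketch builds a tiny catalog with Babel, writes it in PO and MO form to in-memory buffers, and loads the MO data with the standard library gettext module. The message and its translation are made up for the example, and real projects would normally use the front-ends described below instead of calling these functions by hand:

```python
import gettext
from io import BytesIO

from babel.messages.catalog import Catalog
from babel.messages.mofile import write_mo
from babel.messages.pofile import write_po

# Build a catalog for German with a single translated message.
catalog = Catalog(locale="de")
catalog.add("Hello", "Hallo")

# Human-editable PO representation.
po_buffer = BytesIO()
write_po(po_buffer, catalog)

# Compiled MO representation, as consumed by gettext at runtime.
mo_buffer = BytesIO()
write_mo(mo_buffer, catalog)
mo_buffer.seek(0)

# Load the compiled catalog with the standard library.
translations = gettext.GNUTranslations(mo_buffer)
_ = translations.gettext
translated = _("Hello")
```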
Message Extraction
Babel provides functionality similar to that of the xgettext program, except that only extraction from Python source files is built-in, while support for other file formats can be added using a simple extension mechanism.
Unlike xgettext, which is usually invoked once for every file, the routines for message extraction in Babel operate on directories. While the per-file approach of xgettext works nicely with projects using a Makefile, Python projects rarely use make, and thus a different mechanism is needed for extracting messages from the heterogeneous collection of source files that many Python projects are composed of.
When message extraction is based on directories instead of individual files, there needs to be a way to configure which files should be treated in which manner. For example, while many projects may contain .html files, some of those files may be static HTML files that don’t contain localizable messages, while others may be Jinja2 templates, and still others may contain Genshi markup templates. Some projects may even mix HTML files for different template languages (for whatever reason). Therefore the way in which messages are extracted from source files cannot depend only on the file extension, but needs to be controllable in a precise manner.
Babel accepts a configuration file to specify this mapping of files to extraction methods, which is described below.
Front-Ends
Babel provides two different front-ends to access its functionality for working with message catalogs:
- A Command-Line Interface, and
- Distutils/Setuptools Integration
Which one you choose depends on the nature of your project. For most modern Python projects, the distutils/setuptools integration is probably more convenient.
Extraction Method Mapping and Configuration
The mapping of extraction methods to files in Babel is done via a configuration file. This file maps extended glob patterns to the names of the extraction methods, and can also set various options for each pattern (which options are available depends on the specific extraction method).
For example, the following configuration adds extraction of messages from both Genshi markup templates and text templates:
# Extraction from Python source files
[python: **.py]

# Extraction from Genshi HTML and text templates
[genshi: **/templates/**.html]
ignore_tags = script,style
include_attrs = alt title summary

[genshi: **/templates/**.txt]
template_class = genshi.template:TextTemplate
encoding = ISO-8859-15

# Extraction from JavaScript files
[javascript: **.js]
extract_messages = $._, jQuery._
The configuration file syntax is based on the format commonly found in .INI files on Windows systems, and as supported by the ConfigParser module in the Python standard library. Section names (the strings enclosed in square brackets) specify both the name of the extraction method, and the extended glob pattern to specify the files that this extraction method should be used for, separated by a colon. The options in the sections are passed to the extraction method. Which options are available is specific to the extraction method used.
The extended glob patterns used in this configuration are similar to the glob patterns provided by most shells. A single asterisk (*) is a wildcard for any number of characters (except for the pathname component separator “/”), while a question mark (?) only matches a single character. In addition, two subsequent asterisk characters (**) can be used to make the wildcard match any directory level, so the pattern **.txt matches any file with the extension .txt in any directory.
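The matching rules described above can be sketched as a translation to regular expressions. This is an illustration only and not Babel's own implementation; the function names extended_glob_to_regex and matches are hypothetical, and the sketch assumes that a question mark does not match the "/" separator:

```python
import re


def extended_glob_to_regex(pattern):
    """Translate an extended glob pattern into a compiled regex."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            parts.append(".*")      # any characters, across directories
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")   # any characters except "/"
            i += 1
        elif pattern[i] == "?":
            parts.append("[^/]")    # exactly one character (assumed: not "/")
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts) + r"\Z")


def matches(pattern, path):
    """Return True if the whole path matches the extended glob pattern."""
    return extended_glob_to_regex(pattern).match(path) is not None


deep_txt = matches("**.txt", "docs/a/readme.txt")
shallow_txt = matches("*.txt", "docs/readme.txt")
template = matches("**/templates/**.html", "app/templates/index.html")
```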
Lines that start with a # or ; character are ignored and can be used for comments. Empty lines are ignored, too.
NOTE:
If you’re performing message extraction using the command Babel provides for integration into setup.py scripts, you can also provide this configuration in a different way, namely as a keyword argument to the setup() function. See Distutils/Setuptools Integration for more information.
Default Extraction Methods
Babel comes with a few builtin extractors: python (which extracts messages from Python source files), javascript, and ignore (which extracts nothing).
The python extractor is by default mapped to the glob pattern **.py, meaning it’ll be applied to all files with the .py extension in any directory. If you specify your own mapping configuration, this default mapping is discarded, so you need to explicitly add it to your mapping (as shown in the example above.)
Referencing Extraction Methods
To be able to use short extraction method names such as “genshi”, you need to have pkg_resources installed, and the package implementing that extraction method needs to have been installed with its meta data (the egg-info).
If this is not possible for some reason, you need to map the short names to fully qualified function names in an extract section in the mapping configuration. For example:
# Some custom extraction method
[extractors]
custom = mypackage.module:extract_custom

[custom: **.ctm]
some_option = foo
Note that the builtin extraction methods python and ignore are available by default, even if pkg_resources is not installed. You should never need to explicitly define them in the [extractors] section.
Writing Extraction Methods
Adding new methods for extracting localizable messages is easy. First, you’ll need to implement a function that complies with the following interface:
def extract_xxx(fileobj, keywords, comment_tags, options):
    """Extract messages from XXX files.

    :param fileobj: the file-like object the messages should be extracted
                    from
    :param keywords: a list of keywords (i.e. function names) that should
                     be recognized as translation functions
    :param comment_tags: a list of translator tags to search for and
                         include in the results
    :param options: a dictionary of additional options (optional)
    :return: an iterator over ``(lineno, funcname, message, comments)``
             tuples
    :rtype: ``iterator``
    """
NOTE:
Any strings in the tuples produced by this function must be either unicode objects, or str objects using plain ASCII characters. That means that if sources contain strings using other encodings, it is the job of the extractor implementation to do the decoding to unicode objects.
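A minimal sketch of a function with this interface follows. The file format is invented for the example (lines starting with "msg:" carry a message, lines starting with "#" are comments), the name extract_ctm is hypothetical, and yielding None as the function name is an assumption of this sketch, made because the toy format has no translation function calls:

```python
from io import BytesIO


def extract_ctm(fileobj, keywords, comment_tags, options):
    """Yield (lineno, funcname, message, comments) for each 'msg:' line."""
    options = options or {}
    encoding = options.get("encoding", "utf-8")
    comments = []
    for lineno, raw in enumerate(fileobj, start=1):
        # Decode here so that only text strings leave the extractor.
        line = raw.decode(encoding).strip()
        if line.startswith("#"):
            text = line[1:].strip()
            # Keep only comments that begin with a translator tag.
            if any(text.startswith(tag) for tag in comment_tags):
                comments.append(text)
            continue
        if line.startswith("msg:"):
            yield lineno, None, line[len("msg:"):].strip(), comments
        # Comments apply only to the line directly after them.
        comments = []


source = BytesIO(
    "# NOTE: shown on the login page\n"
    "msg: Willkommen\n"
    "\n"
    "other\n"
    "msg: Tschüss\n".encode("utf-8")
)
results = list(extract_ctm(source, [], ["NOTE:"], {}))
```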
Next, you should register that function as an entry point. This requires your setup.py script to use setuptools, and your package to be installed with the necessary metadata. If that’s taken care of, add something like the following to your setup.py script:
setup(...

    entry_points = """
    [babel.extractors]
    xxx = your.package:extract_xxx
    """,
That is, add your extraction method to the entry point group babel.extractors, where the name of the entry point is the name that people will use to reference the extraction method, and the value being the module and the name of the function (separated by a colon) implementing the actual extraction.
NOTE:
As shown in Referencing Extraction Methods, declaring an entry point is not strictly required, as users can still reference the extraction function directly. But whenever possible, the entry point should be declared to make configuration more convenient.
Translator Comments
First of all, what are comment tags? Comment tags are excerpts of text to search for in comments, and only in comments, that appear right before the Python gettext calls, as shown in the following example:
# NOTE: This is a comment about `Foo Bar`
_('Foo Bar')
The comment tag for the above example would be NOTE:, and the translator comment for that tag would be This is a comment about `Foo Bar`.
The resulting output in the catalog template would be something like:
#. This is a comment about `Foo Bar`
#: main.py:2
msgid "Foo Bar"
msgstr ""
Now, you might ask, why would I need that?
Consider this simple case: you have a menu item called “manual”. You know what it means, but a translator who sees it will wonder whether you meant:
- a document or help manual, or
- a manual process?
This is the simplest case where a translation comment such as “The installation manual” helps to clarify the situation and makes a translator more productive.
NOTE:
Whether translator comments can be extracted depends on the extraction method in use. The Python extractor provided by Babel does implement this feature, but others may not.
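To make the matching rule concrete, the following sketch shows one way an extractor could associate tagged comments with the line that follows them. It is a simplified, line-based illustration rather than the logic of Babel's Python extractor; the function name `collect_translator_comments` and its `strip_tags` parameter are made up for this example.

```python
def collect_translator_comments(lines, comment_tags, strip_tags=False):
    """Map the number of each line that follows a tagged comment block to that block."""
    found = {}
    pending = []
    for lineno, line in enumerate(lines, start=1):
        text = line.strip()
        if text.startswith("#"):
            comment = text[1:].strip()
            tag = next((t for t in comment_tags if comment.startswith(t)), None)
            if tag is not None:
                # A tagged comment starts a new block.
                pending = [comment[len(tag):].strip() if strip_tags else comment]
            elif pending:
                # Untagged comment lines continue a block that is already open.
                pending.append(comment)
        else:
            if pending and text:
                # The comment block sits right before this line, so attach it.
                found[lineno] = pending
            pending = []
    return found


source_lines = [
    "# NOTE: This is a comment about `Foo Bar`",
    "_('Foo Bar')",
    "# an ordinary comment without a tag",
    "_('Baz')",
]
comments = collect_translator_comments(source_lines, ["NOTE:"], strip_tags=True)
print(comments)
```

Only the comment that starts with the tag is kept; the untagged comment before the second call is ignored, which mirrors the "only comments right before the call" rule described above.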
Command-Line Interface
Babel includes a command-line interface for working with message catalogs, similar to the various GNU gettext tools commonly available on Linux/Unix systems.
When properly installed, Babel provides a script called pybabel:
$ pybabel --help
Usage: pybabel command [options] [args]

Options:
  --version       show program's version number and exit
  -h, --help      show this help message and exit
  --list-locales  print all known locales and exit
  -v, --verbose   print as much as possible
  -q, --quiet     print as little as possible

commands:
  compile  compile message catalogs to MO files
  extract  extract messages from source files and generate a POT file
  init     create new message catalogs from a POT file
  update   update existing message catalogs from a POT file
The pybabel script provides a number of sub-commands that do the actual work. Those sub-commands are described below.
compile
The compile sub-command can be used to compile translation catalogs into binary MO files:
$ pybabel compile --help
Usage: pybabel compile [options]

compile message catalogs to MO files

Options:
  -h, --help            show this help message and exit
  -D DOMAIN, --domain=DOMAIN
                        domains of PO files (space separated list, default
                        'messages')
  -d DIRECTORY, --directory=DIRECTORY
                        path to base directory containing the catalogs
  -i INPUT_FILE, --input-file=INPUT_FILE
                        name of the input file
  -o OUTPUT_FILE, --output-file=OUTPUT_FILE
                        name of the output file (default
                        '<output_dir>/<locale>/LC_MESSAGES/<domain>.mo')
  -l LOCALE, --locale=LOCALE
                        locale of the catalog to compile
  -f, --use-fuzzy       also include fuzzy translations
  --statistics          print statistics about translations
If directory is specified, but output-file is not, the default filename of the output file will be:
<directory>/<locale>/LC_MESSAGES/<domain>.mo
If neither the input_file nor the locale option is set, this command looks for all catalog files in the base directory that match the given domain, and compiles each of them to MO files in the same directory.
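The default output location described above follows a fixed directory layout. The short sketch below only spells out that layout; `default_catalog_path` is a hypothetical helper written for this illustration, not a Babel function, and it uses POSIX-style separators for a predictable result.

```python
import posixpath


def default_catalog_path(directory, locale, domain="messages", extension="mo"):
    """Build <directory>/<locale>/LC_MESSAGES/<domain>.<extension>."""
    return posixpath.join(directory, locale, "LC_MESSAGES", f"{domain}.{extension}")


path = default_catalog_path("foobar/locale", "pt_BR")
print(path)
```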
extract
The extract sub-command can be used to extract localizable messages from a collection of source files:
$ pybabel extract --help
Usage: pybabel extract [options] <input-paths>

extract messages from source files and generate a POT file

Options:
  -h, --help            show this help message and exit
  --charset=CHARSET     charset to use in the output file (default "utf-8")
  -k KEYWORDS, --keywords=KEYWORDS, --keyword=KEYWORDS
                        space-separated list of keywords to look for in
                        addition to the defaults (may be repeated multiple
                        times)
  --no-default-keywords
                        do not include the default keywords
  -F MAPPING_FILE, --mapping-file=MAPPING_FILE, --mapping=MAPPING_FILE
                        path to the mapping configuration file
  --no-location         do not include location comments with filename and
                        line number
  --add-location=ADD_LOCATION
                        location lines format. If it is not given or "full",
                        it generates the lines with both file name and line
                        number. If it is "file", the line number part is
                        omitted. If it is "never", it completely suppresses
                        the lines (same as --no-location).
  --omit-header         do not include msgid "" entry in header
  -o OUTPUT_FILE, --output-file=OUTPUT_FILE, --output=OUTPUT_FILE
                        name of the output file
  -w WIDTH, --width=WIDTH
                        set output line width (default 76)
  --no-wrap             do not break long message lines, longer than the
                        output line width, into several lines
  --sort-output         generate sorted output (default False)
  --sort-by-file        sort output by file location (default False)
  --msgid-bugs-address=MSGID_BUGS_ADDRESS
                        set report address for msgid
  --copyright-holder=COPYRIGHT_HOLDER
                        set copyright holder in output
  --project=PROJECT     set project name in output
  --version=VERSION     set project version in output
  -c ADD_COMMENTS, --add-comments=ADD_COMMENTS
                        place comment block with TAG (or those preceding
                        keyword lines) in output file. Separate multiple TAGs
                        with commas(,)
  -s, --strip-comments, --strip-comment-tags
                        strip the comment TAGs from the comments.
  --input-dirs=INPUT_DIRS
                        alias for input-paths (does allow files as well as
                        directories).
  --ignore-dirs=IGNORE_DIRS
                        Patterns for directories to ignore when scanning for
                        messages. Separate multiple patterns with spaces
                        (default ".* ._")
  --header-comment=HEADER_COMMENT
                        header comment for the catalog
The meaning of --keyword values is as follows:
- Pass a simple identifier like _ to extract the first (and only the first) argument of all function calls to _,
- To extract other arguments than the first, add a colon and the argument indices separated by commas. For example, the dngettext function typically expects translatable strings as second and third arguments, so you could pass dngettext:2,3.
- Some arguments should not be interpreted as translatable strings, but context strings. For that, append “c” to the argument index. For example: pgettext:1c,2.
- In C++ and Python, you may have functions that behave differently depending on how many arguments they take. For this use case, you can add an integer followed by “t” after the colon. In this case, the keyword will only match a function invocation if it has the specified total number of arguments. For example, if you have a function foo that behaves as gettext (argument is a message) or pgettext (arguments are a context and a message) depending on whether it takes one or two arguments, you can pass --keyword=foo:1,1t --keyword=foo:1c,2,2t.
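The keyword syntax in the list above can be read mechanically. The sketch below parses a single keyword value into its parts; `parse_keyword_spec` and its return shape are assumptions made for this illustration and are not meant to reproduce Babel's internal parser.

```python
def parse_keyword_spec(spec):
    """Split a value such as 'pgettext:1c,2' into (name, messages, context, total)."""
    name, _, raw_indices = spec.partition(":")
    message_args = []   # 1-based positions of translatable arguments
    context_arg = None  # position of the context argument, if any
    total_args = None   # required total number of arguments, if any
    for item in filter(None, raw_indices.split(",")):
        if item.endswith("c"):
            context_arg = int(item[:-1])
        elif item.endswith("t"):
            total_args = int(item[:-1])
        else:
            message_args.append(int(item))
    if not message_args and context_arg is None:
        # A bare identifier means "the first argument, and only the first".
        message_args = [1]
    return name, message_args, context_arg, total_args


for spec in ["_", "dngettext:2,3", "pgettext:1c,2", "foo:1,1t", "foo:1c,2,2t"]:
    print(spec, "->", parse_keyword_spec(spec))
```

Reading the two `foo` entries side by side shows how the "t" suffix lets one function name carry two different argument layouts, selected by the number of arguments in the call.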
The default keywords are equivalent to passing
--keyword=_ --keyword=gettext --keyword=ngettext:1,2 --keyword=ugettext --keyword=ungettext:1,2 --keyword=dgettext:2 --keyword=dngettext:2,3 --keyword=N_ --keyword=pgettext:1c,2 --keyword=npgettext:1c,2,3
init
The init sub-command creates a new translations catalog based on a PO template file:
$ pybabel init --help
Usage: pybabel init [options]

create new message catalogs from a POT file

Options:
  -h, --help            show this help message and exit
  -D DOMAIN, --domain=DOMAIN
                        domain of PO file (default 'messages')
  -i INPUT_FILE, --input-file=INPUT_FILE
                        name of the input file
  -d OUTPUT_DIR, --output-dir=OUTPUT_DIR
                        path to output directory
  -o OUTPUT_FILE, --output-file=OUTPUT_FILE
                        name of the output file (default
                        '<output_dir>/<locale>/LC_MESSAGES/<domain>.po')
  -l LOCALE, --locale=LOCALE
                        locale for the new localized catalog
  -w WIDTH, --width=WIDTH
                        set output line width (default 76)
  --no-wrap             do not break long message lines, longer than the
                        output line width, into several lines
update
The update sub-command updates an existing translations catalog based on a PO template file:
$ pybabel update --help
Usage: pybabel update [options]

update existing message catalogs from a POT file

Options:
  -h, --help            show this help message and exit
  -D DOMAIN, --domain=DOMAIN
                        domain of PO file (default 'messages')
  -i INPUT_FILE, --input-file=INPUT_FILE
                        name of the input file
  -d OUTPUT_DIR, --output-dir=OUTPUT_DIR
                        path to base directory containing the catalogs
  -o OUTPUT_FILE, --output-file=OUTPUT_FILE
                        name of the output file (default
                        '<output_dir>/<locale>/LC_MESSAGES/<domain>.po')
  --omit-header         do not include msgid entry in header
  -l LOCALE, --locale=LOCALE
                        locale of the catalog to compile
  -w WIDTH, --width=WIDTH
                        set output line width (default 76)
  --no-wrap             do not break long message lines, longer than the
                        output line width, into several lines
  --ignore-obsolete     whether to omit obsolete messages from the output
  --init-missing        if any output files are missing, initialize them first
  -N, --no-fuzzy-matching
                        do not use fuzzy matching
  --update-header-comment
                        update target header comment
  --previous            keep previous msgids of translated messages
If output_dir is specified, but output-file is not, the default filename of the output file will be:
<output_dir>/<locale>/LC_MESSAGES/<domain>.po
If neither the output_file nor the locale option is set, this command looks for all catalog files in the base directory that match the given domain, and updates each of them.
Distutils/Setuptools Integration
Babel provides commands for integration into setup.py scripts, based on either the distutils package that is part of the Python standard library, or the third-party setuptools package.
These commands are available by default when Babel has been properly installed, and setup.py is using setuptools. For projects that use plain old distutils, the commands need to be registered explicitly, for example:
from distutils.core import setup
from babel.messages import frontend as babel

setup(
    ...
    cmdclass = {'compile_catalog': babel.compile_catalog,
                'extract_messages': babel.extract_messages,
                'init_catalog': babel.init_catalog,
                'update_catalog': babel.update_catalog}
)
compile_catalog
The compile_catalog command is similar to the GNU msgfmt tool, in that it takes a message catalog from a PO file and compiles it to a binary MO file.
If the command has been correctly installed or registered, a project’s setup.py script should allow you to use the command:
$ ./setup.py compile_catalog --help
Global options:
  --verbose (-v)  run verbosely (default)
  --quiet (-q)    run quietly (turns verbosity off)
  --dry-run (-n)  don't actually do anything
  --help (-h)     show detailed help message

Options for 'compile_catalog' command:
  ...
Running the command will produce a binary MO file:
$ ./setup.py compile_catalog --directory foobar/locale --locale pt_BR
running compile_catalog
compiling catalog to foobar/locale/pt_BR/LC_MESSAGES/messages.mo
Options
The compile_catalog command accepts the following options:
Option | Description |
--domain | domain of the PO file (defaults to lower-cased project name) |
--directory (-d) | name of the base directory |
--input-file (-i) | name of the input file |
--output-file (-o) | name of the output file |
--locale (-l) | locale of the catalog to compile |
--use-fuzzy (-f) | also include “fuzzy” translations |
--statistics | print statistics about translations |
If directory is specified, but output-file is not, the default filename of the output file will be:
<directory>/<locale>/LC_MESSAGES/<domain>.mo
If neither the input_file nor the locale option is set, this command looks for all catalog files in the base directory that match the given domain, and compiles each of them to MO files in the same directory.
These options can either be specified on the command-line, or in the setup.cfg file.
extract_messages
The extract_messages command is comparable to the GNU xgettext program: it can extract localizable messages from a variety of different source files, and generate a PO (portable object) template file from the collected messages.
If the command has been correctly installed or registered, a project’s setup.py script should allow you to use the command:
$ ./setup.py extract_messages --help
Global options:
  --verbose (-v)  run verbosely (default)
  --quiet (-q)    run quietly (turns verbosity off)
  --dry-run (-n)  don't actually do anything
  --help (-h)     show detailed help message

Options for 'extract_messages' command:
  ...
Running the command will produce a PO template file:
$ ./setup.py extract_messages --output-file foobar/locale/messages.pot
running extract_messages
extracting messages from foobar/__init__.py
extracting messages from foobar/core.py
...
writing PO template file to foobar/locale/messages.pot
Method Mapping
The mapping of file patterns to extraction methods (and options) can be specified using a configuration file that is pointed to using the --mapping-file option shown above. Alternatively, you can configure the mapping directly in setup.py using a keyword argument to the setup() function:
setup(...

    message_extractors = {
        'foobar': [
            ('**.py',                'python', None),
            ('**/templates/**.html', 'genshi', None),
            ('**/templates/**.txt',  'genshi', {
                'template_class': 'genshi.template:TextTemplate'
            })
        ],
    },

    ...
)
Options
The extract_messages command accepts the following options:
Option | Description |
--charset | charset to use in the output file |
--keywords (-k) | space-separated list of keywords to look for in addition to the defaults |
--no-default-keywords | do not include the default keywords |
--mapping-file (-F) | path to the mapping configuration file |
--no-location | do not include location comments with filename and line number |
--omit-header | do not include msgid “” entry in header |
--output-file (-o) | name of the output file |
--width (-w) | set output line width (default 76) |
--no-wrap | do not break long message lines, longer than the output line width, into several lines |
--input-dirs | directories that should be scanned for messages |
--sort-output | generate sorted output (default False) |
--sort-by-file | sort output by file location (default False) |
--msgid-bugs-address | set email address for message bug reports |
--copyright-holder | set copyright holder in output |
--add-comments (-c) | place comment block with TAG (or those preceding keyword lines) in output file. Separate multiple TAGs with commas(,) |
These options can either be specified on the command-line, or in the setup.cfg file. In the latter case, the options above become entries of the section [extract_messages], and the option names are changed to use underscore characters instead of dashes, for example:
[extract_messages]
keywords = _ gettext ngettext
mapping_file = mapping.cfg
width = 80
This would be equivalent to invoking the command from the command-line as follows:
$ setup.py extract_messages -k _ -k gettext -k ngettext -F mapping.cfg -w 80
Any path names are interpreted relative to the location of the setup.py file. For boolean options, use “true” or “false” values.
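The correspondence between setup.cfg entries and command-line options can be sketched with the standard library's configparser. The helper below is hypothetical (it is not how setuptools or Babel read the file) and it emits long option names instead of the short -k, -F and -w forms; it only demonstrates the underscore-to-dash renaming and the one-flag-per-keyword expansion.

```python
import configparser

SETUP_CFG = """
[extract_messages]
keywords = _ gettext ngettext
mapping_file = mapping.cfg
width = 80
"""


def options_to_argv(cfg_text, section):
    """Turn one setup.cfg section into an equivalent argument list."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    argv = [section]
    for name, value in parser[section].items():
        flag = "--" + name.replace("_", "-")
        if name == "keywords":
            # The keyword list is space-separated: emit one flag per keyword.
            for keyword in value.split():
                argv += [flag, keyword]
        else:
            argv += [flag, value]
    return argv


argv = options_to_argv(SETUP_CFG, "extract_messages")
print(" ".join(argv))
```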
init_catalog
The init_catalog command is basically equivalent to the GNU msginit program: it creates a new translation catalog based on a PO template file (POT).
If the command has been correctly installed or registered, a project’s setup.py script should allow you to use the command:
$ ./setup.py init_catalog --help
Global options:
  --verbose (-v)  run verbosely (default)
  --quiet (-q)    run quietly (turns verbosity off)
  --dry-run (-n)  don't actually do anything
  --help (-h)     show detailed help message

Options for 'init_catalog' command:
  ...
Running the command will produce a PO file:
$ ./setup.py init_catalog -l fr -i foobar/locales/messages.pot \
                         -o foobar/locales/fr/messages.po
running init_catalog
creating catalog 'foobar/locales/fr/messages.po' based on 'foobar/locales/messages.pot'
Options
The init_catalog command accepts the following options:
Option | Description |
--domain | domain of the PO file (defaults to lower-cased project name) |
--input-file (-i) | name of the input file |
--output-dir (-d) | name of the output directory |
--output-file (-o) | name of the output file |
--locale | locale for the new localized string |
If output-dir is specified, but output-file is not, the default filename of the output file will be:
<output_dir>/<locale>/LC_MESSAGES/<domain>.po
These options can either be specified on the command-line, or in the setup.cfg file.
update_catalog
The update_catalog command is basically equivalent to the GNU msgmerge program: it updates an existing translations catalog based on a PO template file (POT).
If the command has been correctly installed or registered, a project’s setup.py script should allow you to use the command:
$ ./setup.py update_catalog --help
Global options:
  --verbose (-v)  run verbosely (default)
  --quiet (-q)    run quietly (turns verbosity off)
  --dry-run (-n)  don't actually do anything
  --help (-h)     show detailed help message

Options for 'update_catalog' command:
  ...
Running the command will update a PO file:
$ ./setup.py update_catalog -l fr -i foobar/locales/messages.pot \
                           -o foobar/locales/fr/messages.po
running update_catalog
updating catalog 'foobar/locales/fr/messages.po' based on 'foobar/locales/messages.pot'
Options
The update_catalog command accepts the following options:
Option | Description |
--domain | domain of the PO file (defaults to lower-cased project name) |
--input-file (-i) | name of the input file |
--output-dir (-d) | name of the output directory |
--output-file (-o) | name of the output file |
--locale | locale for the new localized string |
--ignore-obsolete | do not include obsolete messages in the output |
--no-fuzzy-matching (-N) | do not use fuzzy matching |
--previous | keep previous msgids of translated messages |
If output-dir is specified, but output-file is not, the default filename of the output file will be:
<output_dir>/<locale>/LC_MESSAGES/<domain>.po
If neither the output_file nor the locale option is set, this command looks for all catalog files in the base directory that match the given domain, and updates each of them.
These options can either be specified on the command-line, or in the setup.cfg file.
Support Classes and Functions
The babel.support module contains a number of classes and functions that can help with integrating Babel, and internationalization in general, into your application or framework. The code in this module is not used by Babel itself, but instead is provided to address common requirements of applications that should handle internationalization.
Lazy Evaluation
One such requirement is lazy evaluation of translations. Many web-based applications define some localizable messages at the module level, or in general at some level where the locale of the remote user is not yet known. For such cases, web frameworks generally provide a “lazy” variant of the gettext functions, which translates the message not when the gettext function is invoked, but when the string is accessed in some manner.
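The idea of deferring translation until the string is used can be shown with a small stand-alone sketch. Everything here is hypothetical: `LazyString`, `lazy_gettext`, the in-memory `CATALOGS` dictionary and the `current_locale` holder are invented for the illustration and do not represent Babel's own support classes.

```python
class LazyString:
    """Call the wrapped function only when the value is converted to text."""

    def __init__(self, func, *args):
        self._func = func
        self._args = args

    def __str__(self):
        return self._func(*self._args)


# Toy catalogs and a mutable "current locale" stand in for real request state.
CATALOGS = {"de": {"Hello": "Hallo"}, "fr": {"Hello": "Bonjour"}}
current_locale = {"value": "en"}


def gettext(message):
    """Look the message up for whatever locale is active right now."""
    return CATALOGS.get(current_locale["value"], {}).get(message, message)


def lazy_gettext(message):
    return LazyString(gettext, message)


# Defined once at module level, before any user locale is known.
GREETING = lazy_gettext("Hello")

current_locale["value"] = "de"
german = str(GREETING)
current_locale["value"] = "fr"
french = str(GREETING)
print(german, french)
```

Because the lookup runs inside `__str__`, the same module-level object renders differently for each locale that is active at access time.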
Extended Translations Class
Many web-based applications are composed of a variety of different components (possibly using some kind of plugin system), and some of those components may provide their own message catalogs that need to be integrated into the larger system.
To support this usage pattern, Babel provides a Translations class that is derived from the GNUTranslations class in the gettext module. This class adds a merge() method that takes another Translations instance, and merges the content of the latter into the main catalog:
translations = Translations.load('main')
translations.merge(Translations.load('plugin1'))
API Reference
The API reference lists the full public API that Babel provides.
API Reference
This part of the documentation contains the full API reference of the public API of Babel.
Core Functionality
The core API provides the basic core functionality. Primarily it provides the Locale object and ways to create it. This object encapsulates a locale and exposes all the data it contains.
All the core functionality is also directly importable from the babel module for convenience.
Basic Interface
- class babel.core.Locale(language: str, territory: str | None = None, script: str | None = None, variant: str | None = None, modifier: str | None = None)
Representation of a specific locale.
>>> locale = Locale('en', 'US')
>>> repr(locale)
"Locale('en', territory='US')"
>>> locale.display_name
u'English (United States)'
A Locale object can also be instantiated from a raw locale string:
>>> locale = Locale.parse('en-US', sep='-')
>>> repr(locale)
"Locale('en', territory='US')"
Locale objects provide access to a collection of locale data, such as territory and language names, number and date format patterns, and more:
>>> locale.number_symbols['latn']['decimal']
u'.'
If a locale is requested for which no locale data is available, an UnknownLocaleError is raised:
>>> Locale.parse('en_XX')
Traceback (most recent call last):
    ...
UnknownLocaleError: unknown locale 'en_XX'
For more information see RFC 3066.
- property character_order: str
The text direction for the language.
>>> Locale('de', 'DE').character_order
'left-to-right'
>>> Locale('ar', 'SA').character_order
'right-to-left'
- property compact_currency_formats: LocaleDataDict
Locale patterns for compact currency number formatting.
NOTE:
The format of the value returned may change between Babel versions.
>>> Locale('en', 'US').compact_currency_formats["short"]["one"]["1000"]
<NumberPattern u'¤0K'>
- property compact_decimal_formats: LocaleDataDict
Locale patterns for compact decimal number formatting.
NOTE:
The format of the value returned may change between Babel versions.
>>> Locale('en', 'US').compact_decimal_formats["short"]["one"]["1000"]
<NumberPattern u'0K'>
- property currencies: LocaleDataDict
Mapping of currency codes to translated currency names. This only returns the generic form of the currency name, not the count specific one. If an actual number is requested use the babel.numbers.get_currency_name() function.
>>> Locale('en').currencies['COP']
u'Colombian Peso'
>>> Locale('de', 'DE').currencies['COP']
u'Kolumbianischer Peso'
- property currency_formats: LocaleDataDict
Locale patterns for currency number formatting.
NOTE:
The format of the value returned may change between Babel versions.
>>> Locale('en', 'US').currency_formats['standard']
<NumberPattern u'\xa4#,##0.00'>
>>> Locale('en', 'US').currency_formats['accounting']
<NumberPattern u'\xa4#,##0.00;(\xa4#,##0.00)'>
- property currency_symbols: LocaleDataDict
Mapping of currency codes to symbols.
>>> Locale('en', 'US').currency_symbols['USD']
u'$'
>>> Locale('es', 'CO').currency_symbols['USD']
u'US$'
- property date_formats: LocaleDataDict
Locale patterns for date formatting.
NOTE:
The format of the value returned may change between Babel versions.
>>> Locale('en', 'US').date_formats['short']
<DateTimePattern u'M/d/yy'>
>>> Locale('fr', 'FR').date_formats['long']
<DateTimePattern u'd MMMM y'>
- property datetime_formats: LocaleDataDict
Locale patterns for datetime formatting.
NOTE:
The format of the value returned may change between Babel versions.
>>> Locale('en').datetime_formats['full']
u'{1}, {0}'
>>> Locale('th').datetime_formats['medium']
u'{1} {0}'
- property datetime_skeletons: LocaleDataDict
Locale patterns for formatting parts of a datetime.
>>> Locale('en').datetime_skeletons['MEd']
<DateTimePattern u'E, M/d'>
>>> Locale('fr').datetime_skeletons['MEd']
<DateTimePattern u'E dd/MM'>
>>> Locale('fr').datetime_skeletons['H']
<DateTimePattern u"HH 'h'">
- property day_period_rules: LocaleDataDict
Day period rules for the locale. Used by get_period_id.
- property day_periods: LocaleDataDict
Locale display names for various day periods (not necessarily only AM/PM).
These are not meant to be used without the relevant day_period_rules.
- property days: LocaleDataDict
Locale display names for weekdays.
>>> Locale('de', 'DE').days['format']['wide'][3]
u'Donnerstag'
- property decimal_formats: LocaleDataDict
Locale patterns for decimal number formatting.
NOTE:
The format of the value returned may change between Babel versions.
>>> Locale('en', 'US').decimal_formats[None]
<NumberPattern u'#,##0.###'>
- classmethod default(category: str | None = None, aliases: Mapping[str, str] = {'ar': 'ar_SY', 'bg': 'bg_BG', 'bs': 'bs_BA', 'ca': 'ca_ES', 'cs': 'cs_CZ', 'da': 'da_DK', 'de': 'de_DE', 'el': 'el_GR', 'en': 'en_US', 'es': 'es_ES', 'et': 'et_EE', 'fa': 'fa_IR', 'fi': 'fi_FI', 'fr': 'fr_FR', 'gl': 'gl_ES', 'he': 'he_IL', 'hu': 'hu_HU', 'id': 'id_ID', 'is': 'is_IS', 'it': 'it_IT', 'ja': 'ja_JP', 'km': 'km_KH', 'ko': 'ko_KR', 'lt': 'lt_LT', 'lv': 'lv_LV', 'mk': 'mk_MK', 'nl': 'nl_NL', 'nn': 'nn_NO', 'no': 'nb_NO', 'pl': 'pl_PL', 'pt': 'pt_PT', 'ro': 'ro_RO', 'ru': 'ru_RU', 'sk': 'sk_SK', 'sl': 'sl_SI', 'sv': 'sv_SE', 'th': 'th_TH', 'tr': 'tr_TR', 'uk': 'uk_UA'}) -> Locale
Return the system default locale for the specified category.
>>> for name in ['LANGUAGE', 'LC_ALL', 'LC_CTYPE', 'LC_MESSAGES']:
...     os.environ[name] = ''
>>> os.environ['LANG'] = 'fr_FR.UTF-8'
>>> Locale.default('LC_MESSAGES')
Locale('fr', territory='FR')
The following fallbacks to the variable are always considered:
- LANGUAGE
- LC_ALL
- LC_CTYPE
- LANG
- Parameters
- category – one of the LC_XXX environment variable names
- aliases – a dictionary of aliases for locale identifiers
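The fallback order listed above can be pictured as a simple scan over environment variables. The following sketch is an assumption-laden simplification: `default_locale_identifier` is a made-up name, it returns a plain identifier string instead of a Locale, it ignores aliases, and it takes the environment as a parameter so it can be exercised without touching os.environ.

```python
import os


def default_locale_identifier(category=None, environ=None):
    """Return the first non-empty locale setting, without its encoding suffix."""
    environ = os.environ if environ is None else environ
    # The requested category is consulted first, then the fixed fallbacks.
    for name in filter(None, (category, "LANGUAGE", "LC_ALL", "LC_CTYPE", "LANG")):
        value = environ.get(name)
        if value:
            # Drop an encoding suffix such as ".UTF-8".
            return value.split(".")[0]
    return None


fake_env = {"LANGUAGE": "", "LC_ALL": "", "LC_CTYPE": "", "LC_MESSAGES": "",
            "LANG": "fr_FR.UTF-8"}
identifier = default_locale_identifier("LC_MESSAGES", fake_env)
print(identifier)
```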
- property default_numbering_system: str
The default numbering system used by the locale.
>>> Locale('el', 'GR').default_numbering_system
u'latn'
- property display_name: str | None
The localized display name of the locale.
>>> Locale('en').display_name
u'English'
>>> Locale('en', 'US').display_name
u'English (United States)'
>>> Locale('sv').display_name
u'svenska'
- Type
unicode
- property english_name: str | None
The English display name of the locale.
>>> Locale('de').english_name
u'German'
>>> Locale('de', 'DE').english_name
u'German (Germany)'
- Type
unicode
- property eras: LocaleDataDict
Locale display names for eras.
NOTE:
The format of the value returned may change between Babel versions.
>>> Locale('en', 'US').eras['wide'][1]
u'Anno Domini'
>>> Locale('en', 'US').eras['abbreviated'][0]
u'BC'
- property first_week_day: int
The first day of a week, with 0 being Monday.
>>> Locale('de', 'DE').first_week_day
0
>>> Locale('en', 'US').first_week_day
6
- get_display_name(locale: Locale | str | None = None) -> str | None
Return the display name of the locale using the given locale.
The display name will include the language, territory, script, and variant, if those are specified.
>>> Locale('zh', 'CN', script='Hans').get_display_name('en')
u'Chinese (Simplified, China)'
Modifiers are currently passed through verbatim:
>>> Locale('it', 'IT', modifier='euro').get_display_name('en')
u'Italian (Italy, euro)'
- Parameters
locale – the locale to use
- get_language_name(locale: Locale | str | None = None) -> str | None
Return the language of this locale in the given locale.
>>> Locale('zh', 'CN', script='Hans').get_language_name('de')
u'Chinesisch'
Added in version 1.0.
- Parameters
locale – the locale to use
- get_script_name(locale: Locale | str | None = None) -> str | None
Return the script name in the given locale.
- get_territory_name(locale: Locale | str | None = None) -> str | None
Return the territory name in the given locale.
- property interval_formats: LocaleDataDict
Locale patterns for interval formatting.
NOTE:
The format of the value returned may change between Babel versions.
How to format date intervals in Finnish when the day is the smallest changing component:
>>> Locale('fi_FI').interval_formats['MEd']['d']
[u'E d. – ', u'E d.M.']
- SEE ALSO:
The primary API to use this data is babel.dates.format_interval().
- Return type
dict[str, dict[str, list[str]]]
- language
the language code
- property language_name: str | None
The localized language name of the locale.
>>> Locale('en', 'US').language_name
u'English'
- property languages: LocaleDataDict
Mapping of language codes to translated language names.
>>> Locale('de', 'DE').languages['ja']
u'Japanisch'
See ISO 639 for more information.
- property list_patterns: LocaleDataDict
Patterns for generating lists
NOTE:
The format of the value returned may change between Babel versions.
>>> Locale('en').list_patterns['standard']['start']
u'{0}, {1}'
>>> Locale('en').list_patterns['standard']['end']
u'{0}, and {1}'
>>> Locale('en_GB').list_patterns['standard']['end']
u'{0} and {1}'
- property measurement_systems: LocaleDataDict
Localized names for various measurement systems.
>>> Locale('fr', 'FR').measurement_systems['US']
u'am\xe9ricain'
>>> Locale('en', 'US').measurement_systems['US']
u'US'
- property meta_zones: LocaleDataDict
Locale display names for meta time zones.
Meta time zones are basically groups of different Olson time zones that have the same GMT offset and daylight savings time.
NOTE:
The format of the value returned may change between Babel versions.
>>> Locale('en', 'US').meta_zones['Europe_Central']['long']['daylight']
u'Central European Summer Time'
Added in version 0.9.
- property min_week_days: int
The minimum number of days in a week so that the week is counted as the first week of a year or month.
>>> Locale('de', 'DE').min_week_days
4
- modifier
the modifier
- property months: LocaleDataDict
Locale display names for months.
>>> Locale('de', 'DE').months['format']['wide'][10]
u'Oktober'
- classmethod negotiate(preferred: Iterable[str], available: Iterable[str], sep: str = '_', aliases: Mapping[str, str] = {'ar': 'ar_SY', 'bg': 'bg_BG', 'bs': 'bs_BA', 'ca': 'ca_ES', 'cs': 'cs_CZ', 'da': 'da_DK', 'de': 'de_DE', 'el': 'el_GR', 'en': 'en_US', 'es': 'es_ES', 'et': 'et_EE', 'fa': 'fa_IR', 'fi': 'fi_FI', 'fr': 'fr_FR', 'gl': 'gl_ES', 'he': 'he_IL', 'hu': 'hu_HU', 'id': 'id_ID', 'is': 'is_IS', 'it': 'it_IT', 'ja': 'ja_JP', 'km': 'km_KH', 'ko': 'ko_KR', 'lt': 'lt_LT', 'lv': 'lv_LV', 'mk': 'mk_MK', 'nl': 'nl_NL', 'nn': 'nn_NO', 'no': 'nb_NO', 'pl': 'pl_PL', 'pt': 'pt_PT', 'ro': 'ro_RO', 'ru': 'ru_RU', 'sk': 'sk_SK', 'sl': 'sl_SI', 'sv': 'sv_SE', 'th': 'th_TH', 'tr': 'tr_TR', 'uk': 'uk_UA'}) -> Locale | None
Find the best match between available and requested locale strings.
>>> Locale.negotiate(['de_DE', 'en_US'], ['de_DE', 'de_AT'])
Locale('de', territory='DE')
>>> Locale.negotiate(['de_DE', 'en_US'], ['en', 'de'])
Locale('de')
>>> Locale.negotiate(['de_DE', 'de'], ['en_US'])
You can specify the character used in the locale identifiers to separate the different components. This separator is applied to both lists. Also, case is ignored in the comparison:
>>> Locale.negotiate(['de-DE', 'de'], ['en-us', 'de-de'], sep='-')
Locale('de', territory='DE')
- Parameters
- preferred – the list of locale identifiers preferred by the user
- available – the list of locale identifiers available
- aliases – a dictionary of aliases for locale identifiers
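The matching behaviour shown in the examples above (exact match first, then the bare language, with case ignored) can be approximated in a few lines. This is a simplified sketch, not Babel's implementation: `negotiate_identifier` is a hypothetical name, it works on plain strings, and it leaves out alias handling entirely.

```python
def negotiate_identifier(preferred, available, sep="_"):
    """Return the first preferred identifier (or its language) that is available."""
    available_lower = {identifier.lower() for identifier in available if identifier}
    for identifier in preferred:
        # An exact, case-insensitive match wins outright.
        if identifier.lower() in available_lower:
            return identifier
        # Otherwise fall back to the language part alone, e.g. "de" for "de_DE".
        language = identifier.split(sep)[0]
        if language != identifier and language.lower() in available_lower:
            return language
    return None


print(negotiate_identifier(["de_DE", "en_US"], ["de_DE", "de_AT"]))
print(negotiate_identifier(["de_DE", "en_US"], ["en", "de"]))
print(negotiate_identifier(["de_DE", "de"], ["en_US"]))
print(negotiate_identifier(["de-DE", "de"], ["en-us", "de-de"], sep="-"))
```

Preference order takes priority over match quality in this sketch: the first preferred entry that matches in any form is returned, even if a later entry would match exactly.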
- property number_symbols: LocaleDataDict
Symbols used in number formatting by number system.
NOTE:
The format of the value returned may change between Babel versions.
>>> Locale('fr', 'FR').number_symbols["latn"]['decimal']
u','
>>> Locale('fa', 'IR').number_symbols["arabext"]['decimal']
u'٫'
>>> Locale('fa', 'IR').number_symbols["latn"]['decimal']
u'.'
- property ordinal_form: PluralRule
Plural rules for ordinal numbers in the locale.
>>> Locale('en').ordinal_form(1)
'one'
>>> Locale('en').ordinal_form(2)
'two'
>>> Locale('en').ordinal_form(3)
'few'
>>> Locale('fr').ordinal_form(2)
'other'
>>> Locale('ru').ordinal_form(100)
'other'
- property other_numbering_systems: LocaleDataDict
Mapping of other numbering systems available for the locale. See: https://www.unicode.org/reports/tr35/tr35-numbers.html#otherNumberingSystems
>>> Locale('el', 'GR').other_numbering_systems['traditional']
u'grek'
- NOTE:
The format of the value returned may change between Babel versions.
- classmethod parse(identifier: str | Locale | None, sep: str = '_', resolve_likely_subtags: bool = True) -> Locale
Create a Locale instance for the given locale identifier.
>>> l = Locale.parse('de-DE', sep='-')
>>> l.display_name
u'Deutsch (Deutschland)'
If the identifier parameter is not a string, but actually a Locale object, that object is returned:
>>> Locale.parse(l) Locale('de', territory='DE')
If the identifier parameter is neither of these, such as None e.g. because a default locale identifier could not be determined, a TypeError is raised:
>>> Locale.parse(None) Traceback (most recent call last): ... TypeError: ...
By default, this method also resolves likely subtags. This is useful, for instance, to figure out the most likely locale for a territory: use 'und' as the language tag:
>>> Locale.parse('und_AT') Locale('de', territory='AT')
Modifiers are optional, and always at the end, separated by “@”:
>>> Locale.parse('de_AT@euro') Locale('de', territory='AT', modifier='euro')
- Parameters
- identifier – the locale identifier string
- sep – optional component separator
- resolve_likely_subtags – if true (the default), a locale that does not otherwise exist has its likely subtags resolved. For instance, zh_TW by itself is not a locale that exists, but Babel can automatically expand it to the full form of zh_hant_TW. Note that this expansion only takes place if no locale exists otherwise. For instance, there is a locale en that can exist by itself.
- Raises
- ValueError – if the string does not appear to be a valid locale identifier
- UnknownLocaleError – if no locale data is available for the requested locale
- TypeError – if the identifier is not a string or a Locale
- property percent_formats: LocaleDataDict
Locale patterns for percent number formatting.
NOTE:
The format of the value returned may change between Babel versions.
>>> Locale('en', 'US').percent_formats[None] <NumberPattern u'#,##0%'>
- property periods: LocaleDataDict
Locale display names for day periods (AM/PM).
>>> Locale('en', 'US').periods['am'] u'AM'
- property plural_form: PluralRule
Plural rules for the locale.
>>> Locale('en').plural_form(1) 'one' >>> Locale('en').plural_form(0) 'other' >>> Locale('fr').plural_form(0) 'one' >>> Locale('ru').plural_form(100) 'many'
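To make the returned category names more concrete, here is a minimal sketch of what such rule callables decide for whole numbers. The functions plural_en, plural_fr and plural_ru are hypothetical, hand-written simplifications for illustration only (integers only, and the French rule ignores the CLDR "many" category); they are not how Babel derives its rules from the CLDR data.

```python
def plural_en(n: int) -> str:
    # English: exactly 1 is "one"; every other integer is "other".
    return "one" if n == 1 else "other"


def plural_fr(n: int) -> str:
    # French (simplified): 0 and 1 are both "one".
    return "one" if n in (0, 1) else "other"


def plural_ru(n: int) -> str:
    # Russian, integers only: the category depends on the last digits.
    last_digit, last_two = n % 10, n % 100
    if last_digit == 1 and last_two != 11:
        return "one"
    if 2 <= last_digit <= 4 and not 12 <= last_two <= 14:
        return "few"
    return "many"
```

The category name is typically used as a key to pick the matching message variant for a count.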
- property quarters: LocaleDataDict
Locale display names for quarters.
>>> Locale('de', 'DE').quarters['format']['wide'][1] u'1. Quartal'
- property scientific_formats: LocaleDataDict
Locale patterns for scientific number formatting.
NOTE:
The format of the value returned may change between Babel versions.
>>> Locale('en', 'US').scientific_formats[None] <NumberPattern u'#E0'>
- script
the script code
- property script_name: str | None
The localized script name of the locale if available.
>>> Locale('sr', 'ME', script='Latn').script_name u'latinica'
- property scripts: LocaleDataDict
Mapping of script codes to translated script names.
>>> Locale('en', 'US').scripts['Hira'] u'Hiragana'
See ISO 15924 for more information.
- property territories: LocaleDataDict
Mapping of territory codes to translated territory names.
>>> Locale('es', 'CO').territories['DE'] u'Alemania'
See ISO 3166 for more information.
- territory
the territory (country or region) code
- property territory_name: str | None
The localized territory name of the locale if available.
>>> Locale('de', 'DE').territory_name u'Deutschland'
- property text_direction: str
The text direction for the language in CSS short-hand form.
>>> Locale('de', 'DE').text_direction 'ltr' >>> Locale('ar', 'SA').text_direction 'rtl'
- property time_formats: LocaleDataDict
Locale patterns for time formatting.
NOTE:
The format of the value returned may change between Babel versions.
>>> Locale('en', 'US').time_formats['short'] <DateTimePattern u'h:mm a'> >>> Locale('fr', 'FR').time_formats['long'] <DateTimePattern u'HH:mm:ss z'>
- property time_zones: LocaleDataDict
Locale display names for time zones.
NOTE:
The format of the value returned may change between Babel versions.
>>> Locale('en', 'US').time_zones['Europe/London']['long']['daylight'] u'British Summer Time' >>> Locale('en', 'US').time_zones['America/St_Johns']['city'] u'St. John’s'
- property unit_display_names: LocaleDataDict
Display names for units of measurement.
- SEE ALSO:
You may want to use babel.units.get_unit_name() instead.
- NOTE:
The format of the value returned may change between Babel versions.
- variant
the variant code
- property variants: LocaleDataDict
Mapping of variant codes to translated variant names.
>>> Locale('de', 'DE').variants['1901'] u'Alte deutsche Rechtschreibung'
- property weekend_end: int
The day the weekend ends, with 0 being Monday.
>>> Locale('de', 'DE').weekend_end 6
- property weekend_start: int
The day the weekend starts, with 0 being Monday.
>>> Locale('de', 'DE').weekend_start 5
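Because both weekend properties count days with 0 being Monday, they line up with the numbering of datetime.date.weekday() in the standard library. The following sketch shows one way to use them; is_weekend is a hypothetical helper, and its default arguments simply mirror the de_DE values shown above.

```python
from datetime import date


def is_weekend(day: date, weekend_start: int = 5, weekend_end: int = 6) -> bool:
    """Return True if `day` falls within the weekend (0 = Monday)."""
    weekday = day.weekday()  # also 0 = Monday ... 6 = Sunday
    if weekend_start <= weekend_end:
        return weekend_start <= weekday <= weekend_end
    # The weekend wraps around the end of the week (for example 6 -> 0).
    return weekday >= weekend_start or weekday <= weekend_end
```

For a locale whose weekend is Friday and Saturday, the values 4 and 5 would be passed instead of the defaults.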
- property zone_formats: LocaleDataDict
Patterns related to the formatting of time zones.
NOTE:
The format of the value returned may change between Babel versions.
>>> Locale('en', 'US').zone_formats['fallback'] u'%(1)s (%(0)s)' >>> Locale('pt', 'BR').zone_formats['region'] u'Hor\xe1rio %s'
Added in version 0.9.
- babel.core.default_locale(category: str | None = None, aliases: Mapping[str, str] = {'ar': 'ar_SY', 'bg': 'bg_BG', 'bs': 'bs_BA', 'ca': 'ca_ES', 'cs': 'cs_CZ', 'da': 'da_DK', 'de': 'de_DE', 'el': 'el_GR', 'en': 'en_US', 'es': 'es_ES', 'et': 'et_EE', 'fa': 'fa_IR', 'fi': 'fi_FI', 'fr': 'fr_FR', 'gl': 'gl_ES', 'he': 'he_IL', 'hu': 'hu_HU', 'id': 'id_ID', 'is': 'is_IS', 'it': 'it_IT', 'ja': 'ja_JP', 'km': 'km_KH', 'ko': 'ko_KR', 'lt': 'lt_LT', 'lv': 'lv_LV', 'mk': 'mk_MK', 'nl': 'nl_NL', 'nn': 'nn_NO', 'no': 'nb_NO', 'pl': 'pl_PL', 'pt': 'pt_PT', 'ro': 'ro_RO', 'ru': 'ru_RU', 'sk': 'sk_SK', 'sl': 'sl_SI', 'sv': 'sv_SE', 'th': 'th_TH', 'tr': 'tr_TR', 'uk': 'uk_UA'}) -> str | None
Returns the system default locale for a given category, based on environment variables.
>>> for name in ['LANGUAGE', 'LC_ALL', 'LC_CTYPE']:
...     os.environ[name] = ''
>>> os.environ['LANG'] = 'fr_FR.UTF-8'
>>> default_locale('LC_MESSAGES')
'fr_FR'
The “C” or “POSIX” pseudo-locales are treated as aliases for the “en_US_POSIX” locale:
>>> os.environ['LC_MESSAGES'] = 'POSIX' >>> default_locale('LC_MESSAGES') 'en_US_POSIX'
The following environment variables are always considered as fallbacks to the given category variable:
- LANGUAGE
- LC_ALL
- LC_CTYPE
- LANG
- Parameters
- category – one of the LC_XXX environment variable names
- aliases – a dictionary of aliases for locale identifiers
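The lookup described above can be modelled with a few lines of standard-library code. This is a simplified sketch, not Babel's implementation: pick_default_locale is a hypothetical name, the environment is passed in as a plain mapping, the variables are assumed to be checked in the order listed, and alias handling and "@modifier" suffixes are left out.

```python
from typing import Mapping, Optional


def pick_default_locale(env: Mapping[str, str],
                        category: Optional[str] = None) -> Optional[str]:
    """Return the first usable locale identifier found in `env`."""
    for name in (category, "LANGUAGE", "LC_ALL", "LC_CTYPE", "LANG"):
        if not name:
            continue
        value = env.get(name, "")
        if not value:
            continue  # unset or empty: fall through to the next variable
        if name == "LANGUAGE":
            # Assumed GNU convention: LANGUAGE may hold a colon-separated list.
            value = value.split(":")[0]
        value = value.split(".")[0]  # drop an encoding such as ".UTF-8"
        if value in ("C", "POSIX"):
            return "en_US_POSIX"
        return value
    return None
```

Passing the environment explicitly keeps the sketch free of side effects; os.environ can be supplied as the env argument.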
- babel.core.negotiate_locale(preferred: Iterable[str], available: Iterable[str], sep: str = '_', aliases: Mapping[str, str] = {'ar': 'ar_SY', 'bg': 'bg_BG', 'bs': 'bs_BA', 'ca': 'ca_ES', 'cs': 'cs_CZ', 'da': 'da_DK', 'de': 'de_DE', 'el': 'el_GR', 'en': 'en_US', 'es': 'es_ES', 'et': 'et_EE', 'fa': 'fa_IR', 'fi': 'fi_FI', 'fr': 'fr_FR', 'gl': 'gl_ES', 'he': 'he_IL', 'hu': 'hu_HU', 'id': 'id_ID', 'is': 'is_IS', 'it': 'it_IT', 'ja': 'ja_JP', 'km': 'km_KH', 'ko': 'ko_KR', 'lt': 'lt_LT', 'lv': 'lv_LV', 'mk': 'mk_MK', 'nl': 'nl_NL', 'nn': 'nn_NO', 'no': 'nb_NO', 'pl': 'pl_PL', 'pt': 'pt_PT', 'ro': 'ro_RO', 'ru': 'ru_RU', 'sk': 'sk_SK', 'sl': 'sl_SI', 'sv': 'sv_SE', 'th': 'th_TH', 'tr': 'tr_TR', 'uk': 'uk_UA'}) -> str | None
Find the best match between available and requested locale strings.
>>> negotiate_locale(['de_DE', 'en_US'], ['de_DE', 'de_AT']) 'de_DE' >>> negotiate_locale(['de_DE', 'en_US'], ['en', 'de']) 'de'
Case is ignored by the algorithm; the result uses the case of the preferred locale identifier:
>>> negotiate_locale(['de_DE', 'en_US'], ['de_de', 'de_at']) 'de_DE'
By default, some web browsers unfortunately do not include the territory in the locale identifier for many locales, and some don’t even allow the user to easily add the territory. So while you may prefer using qualified locale identifiers in your web-application, they would not normally match the language-only locale sent by such browsers. To work around that, this function uses a default mapping of commonly used language-only locale identifiers to identifiers including the territory:
>>> negotiate_locale(['ja', 'en_US'], ['ja_JP', 'en_US']) 'ja_JP'
Some browsers even use an incorrect or outdated language code, such as “no” for Norwegian, where the correct locale identifier would actually be “nb_NO” (Bokmål) or “nn_NO” (Nynorsk). The aliases are intended to take care of such cases, too:
>>> negotiate_locale(['no', 'sv'], ['nb_NO', 'sv_SE']) 'nb_NO'
You can override this default mapping by passing a different aliases dictionary to this function, or you can bypass the behavior altogether by setting the aliases parameter to None.
- Parameters
- preferred – the list of locale strings preferred by the user
- available – the list of locale strings available
- sep – character that separates the different parts of the locale strings
- aliases – a dictionary of aliases for locale identifiers
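The matching rules described above (case-insensitive comparison, alias lookup for language-only identifiers, and fallback to the bare language) can be sketched as follows. This is an illustrative model, not Babel's source: negotiate and SAMPLE_ALIASES are hypothetical names, and the alias table is a tiny subset of the default mapping shown in the signature.

```python
from typing import Iterable, Mapping, Optional

# Small illustrative subset of the default alias mapping.
SAMPLE_ALIASES = {"ja": "ja_JP", "no": "nb_NO", "de": "de_DE"}


def negotiate(preferred: Iterable[str], available: Iterable[str],
              sep: str = "_",
              aliases: Optional[Mapping[str, str]] = SAMPLE_ALIASES
              ) -> Optional[str]:
    available_lower = {a.lower() for a in available if a}
    for locale in preferred:
        lowered = locale.lower()
        # 1. Direct, case-insensitive match; keep the preferred spelling.
        if lowered in available_lower:
            return locale
        # 2. Map a language-only identifier through the alias table.
        if aliases:
            alias = aliases.get(lowered)
            if alias:
                alias = alias.replace("_", sep)
                if alias.lower() in available_lower:
                    return alias
        # 3. Fall back from "de_DE" to plain "de" if that is available.
        parts = locale.split(sep)
        if len(parts) > 1 and parts[0].lower() in available_lower:
            return parts[0]
    return None
```

Preferred identifiers are tried strictly in order, so an earlier preference wins even when it only matches through an alias or the language-only fallback.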
Exceptions
- exception babel.core.UnknownLocaleError(identifier: str)
Exception thrown when a locale is requested for which no locale data is available.
- identifier
The identifier of the locale that could not be found.
Utility Functions
- babel.core.get_global(key: _GLOBAL_KEY) -> Mapping[str, Any]
Return the dictionary for the given key in the global data.
The global data is stored in the babel/global.dat file and contains information independent of individual locales.
>>> get_global('zone_aliases')['UTC'] u'Etc/UTC' >>> get_global('zone_territories')['Europe/Berlin'] u'DE'
The keys available are:
- all_currencies
- currency_fractions
- language_aliases
- likely_subtags
- parent_exceptions
- script_aliases
- territory_aliases
- territory_currencies
- territory_languages
- territory_zones
- variant_aliases
- windows_zone_mapping
- zone_aliases
- zone_territories
- NOTE:
The internal structure of the data may change between versions.
Added in version 0.9.
- Parameters
key – the data key
- babel.core.parse_locale(identifier: str, sep: str = '_') -> tuple[str, str | None, str | None, str | None] | tuple[str, str | None, str | None, str | None, str | None]
Parse a locale identifier into a tuple of the form (language, territory, script, variant, modifier). The modifier element is only included when the identifier contains one.
>>> parse_locale('zh_CN')
('zh', 'CN', None, None)
>>> parse_locale('zh_Hans_CN')
('zh', 'CN', 'Hans', None)
>>> parse_locale('ca_es_valencia')
('ca', 'ES', None, 'VALENCIA')
>>> parse_locale('en_150')
('en', '150', None, None)
>>> parse_locale('en_us_posix')
('en', 'US', None, 'POSIX')
>>> parse_locale('it_IT@euro')
('it', 'IT', None, None, 'euro')
>>> parse_locale('it_IT@custom')
('it', 'IT', None, None, 'custom')
>>> parse_locale('it_IT@')
('it', 'IT', None, None)
The default component separator is “_”, but a different separator can be specified using the sep parameter.
The optional modifier is always separated with “@” and at the end:
>>> parse_locale('zh-CN', sep='-') ('zh', 'CN', None, None) >>> parse_locale('zh-CN@custom', sep='-') ('zh', 'CN', None, None, 'custom')
If the identifier cannot be parsed into a locale, a ValueError exception is raised:
>>> parse_locale('not_a_LOCALE_String')
Traceback (most recent call last):
  ...
ValueError: 'not_a_LOCALE_String' is not a valid locale identifier
Encoding information is removed from the identifier, while modifiers are kept:
>>> parse_locale('en_US.UTF-8') ('en', 'US', None, None) >>> parse_locale('de_DE.iso885915@euro') ('de', 'DE', None, None, 'euro')
See RFC 4646 for more information.
- Parameters
- identifier – the locale identifier string
- sep – character that separates the different components of the locale identifier
- Raises
ValueError – if the string does not appear to be a valid locale identifier
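The parsing steps can be approximated with plain string operations. The sketch below is a simplified model under stated assumptions, not Babel's implementation: split_locale is a hypothetical name, and the length-based rules for recognising script, territory and variant components are assumptions chosen to reproduce the examples above.

```python
from typing import Optional, Tuple


def split_locale(identifier: str, sep: str = "_") -> Tuple[Optional[str], ...]:
    # The modifier is always at the end, after "@".
    identifier, _, modifier = identifier.partition("@")
    identifier = identifier.split(".", 1)[0]  # drop encoding, e.g. ".UTF-8"
    parts = identifier.split(sep)
    lang = parts.pop(0).lower()
    if not lang.isalpha():
        raise ValueError(f"expected only letters, got {lang!r}")
    script = territory = variant = None
    # Script: four letters, title-cased ("Hans").
    if parts and len(parts[0]) == 4 and parts[0].isalpha():
        script = parts.pop(0).title()
    # Territory: two letters ("CN") or three digits ("150").
    if parts and ((len(parts[0]) == 2 and parts[0].isalpha())
                  or (len(parts[0]) == 3 and parts[0].isdigit())):
        territory = parts.pop(0).upper()
    # Variant: four characters starting with a digit, or five or more
    # characters starting with a letter ("1901", "POSIX").
    if parts and ((len(parts[0]) == 4 and parts[0][0].isdigit())
                  or (len(parts[0]) >= 5 and parts[0][0].isalpha())):
        variant = parts.pop(0).upper()
    if parts:
        raise ValueError(f"{identifier!r} is not a valid locale identifier")
    base = (lang, territory, script, variant)
    return base + (modifier,) if modifier else base
```

Anything left over after the known components have been consumed is treated as an error, which is why 'not_a_LOCALE_String' is rejected.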
- babel.core.get_locale_identifier(tup: tuple[str] | tuple[str, str | None] | tuple[str, str | None, str | None] | tuple[str, str | None, str | None, str | None] | tuple[str, str | None, str | None, str | None, str | None], sep: str = '_') -> str
The reverse of parse_locale(). It creates a locale identifier out of a (language, territory, script, variant, modifier) tuple. Items can be set to None and trailing Nones can also be left out of the tuple.
>>> get_locale_identifier(('de', 'DE', None, '1999', 'custom')) 'de_DE_1999@custom' >>> get_locale_identifier(('fi', None, None, None, 'custom')) 'fi@custom'
Added in version 1.0.
- Parameters
- tup – the tuple as returned by parse_locale().
- sep – the separator for the identifier.
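As a rough illustration of the reverse direction, the sketch below joins a component tuple back into an identifier. join_locale is a hypothetical name and the code is an assumption-based model rather than Babel's source; note that the script component sits between language and territory in the resulting string even though it comes third in the tuple.

```python
from typing import Optional, Sequence


def join_locale(parts: Sequence[Optional[str]], sep: str = "_") -> str:
    # Pad with None so trailing components may be omitted by the caller.
    head = tuple(parts[:5])
    lang, territory, script, variant, modifier = head + (None,) * (5 - len(head))
    # Script precedes territory in the identifier ("zh_Hans_CN").
    identifier = sep.join(p for p in (lang, script, territory, variant) if p)
    return f"{identifier}@{modifier}" if modifier else identifier
```

Components that are None are skipped, so no doubled separators appear in the result.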
Date and Time
The date and time functionality provided by Babel lets you format standard Python datetime, date and time objects and work with timezones.
Date and Time Formatting
- babel.dates.format_datetime(datetime=None, format='medium', tzinfo=None, locale=default_locale('LC_TIME'))
Return a date formatted according to the given pattern.
>>> from datetime import datetime >>> dt = datetime(2007, 4, 1, 15, 30) >>> format_datetime(dt, locale='en_US') u'Apr 1, 2007, 3:30:00\u202fPM'
For any pattern requiring the display of the timezone:
>>> format_datetime(dt, 'full', tzinfo=get_timezone('Europe/Paris'),
...                 locale='fr_FR')
'dimanche 1 avril 2007, 17:30:00 heure d’été d’Europe centrale'
>>> format_datetime(dt, "yyyy.MM.dd G 'at' HH:mm:ss zzz",
...                 tzinfo=get_timezone('US/Eastern'), locale='en')
u'2007.04.01 AD at 11:30:00 EDT'
- Parameters
- datetime -- the datetime object; if None, the current date and time is used
- format -- one of "full", "long", "medium", or "short", or a custom date/time pattern
- tzinfo -- the timezone to apply to the time for display
- locale -- a Locale object or a locale identifier
- babel.dates.format_date(date=None, format='medium', locale=default_locale('LC_TIME'))
Return a date formatted according to the given pattern.
>>> from datetime import date >>> d = date(2007, 4, 1) >>> format_date(d, locale='en_US') u'Apr 1, 2007' >>> format_date(d, format='full', locale='de_DE') u'Sonntag, 1. April 2007'
If you don't want to use the locale default formats, you can specify a custom date pattern:
>>> format_date(d, "EEE, MMM d, ''yy", locale='en') u"Sun, Apr 1, '07"
- Parameters
- date -- the date or datetime object; if None, the current date is used
- format -- one of "full", "long", "medium", or "short", or a custom date/time pattern
- locale -- a Locale object or a locale identifier
- babel.dates.format_time(time=None, format='medium', tzinfo=None, locale=default_locale('LC_TIME'))
Return a time formatted according to the given pattern.
>>> from datetime import datetime, time >>> t = time(15, 30) >>> format_time(t, locale='en_US') u'3:30:00\u202fPM' >>> format_time(t, format='short', locale='de_DE') u'15:30'
If you don't want to use the locale default formats, you can specify a custom time pattern:
>>> format_time(t, "hh 'o''clock' a", locale='en') u"03 o'clock PM"
For any pattern requiring the display of the time-zone a timezone has to be specified explicitly:
>>> t = datetime(2007, 4, 1, 15, 30)
>>> tzinfo = get_timezone('Europe/Paris')
>>> t = _localize(tzinfo, t)
>>> format_time(t, format='full', tzinfo=tzinfo, locale='fr_FR')
'15:30:00 heure d’été d’Europe centrale'
>>> format_time(t, "hh 'o''clock' a, zzzz", tzinfo=get_timezone('US/Eastern'),
...             locale='en')
u"09 o'clock AM, Eastern Daylight Time"
As that example shows, when this function gets passed a datetime.datetime value, the actual time in the formatted string is adjusted to the timezone specified by the tzinfo parameter. If the datetime is "naive" (i.e. it has no associated timezone information), it is assumed to be in UTC.
These timezone calculations are not performed if the value is of type datetime.time: without date information there is no way to know whether daylight saving time is in effect, and therefore no way to determine what a given time would translate to in a different timezone. This means that time values are left as-is, and the value of the tzinfo parameter is only used to display the timezone name if needed:
>>> t = time(15, 30) >>> format_time(t, format='full', tzinfo=get_timezone('Europe/Paris'), ... locale='fr_FR') u'15:30:00 heure normale d\u2019Europe centrale' >>> format_time(t, format='full', tzinfo=get_timezone('US/Eastern'), ... locale='en_US') u'3:30:00\u202fPM Eastern Standard Time'
- Parameters
- time -- the time or datetime object; if None, the current time in UTC is used
- format -- one of "full", "long", "medium", or "short", or a custom date/time pattern
- tzinfo -- the time-zone to apply to the time for display
- locale -- a Locale object or a locale identifier
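The difference between datetime and time values described above can be reproduced with the standard library alone. The sketch below is a model of the adjustment, not Babel's code: to_display_time is a hypothetical helper, and a fixed UTC+2 offset stands in for Europe/Paris during summer time.

```python
from datetime import datetime, time, timedelta, timezone, tzinfo
from typing import Union


def to_display_time(value: Union[datetime, time], tz: tzinfo) -> time:
    """Return the wall-clock time that would be displayed for `value`."""
    if isinstance(value, datetime):
        if value.tzinfo is None:
            # A naive datetime is assumed to be in UTC.
            value = value.replace(tzinfo=timezone.utc)
        return value.astimezone(tz).time()
    # Plain time objects carry no date, so no conversion is attempted.
    return value


# Fixed offset used here as a stand-in for Europe/Paris in April (assumption).
cest = timezone(timedelta(hours=2), "CEST")
```

With these definitions, a naive datetime(2007, 4, 1, 15, 30) is shifted to 17:30, while time(15, 30) stays at 15:30.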
- babel.dates.format_timedelta(delta, granularity='second', threshold=.85, add_direction=False, format='long', locale=default_locale('LC_TIME'))
Return a time delta according to the rules of the given locale.
>>> from datetime import timedelta >>> format_timedelta(timedelta(weeks=12), locale='en_US') u'3 months' >>> format_timedelta(timedelta(seconds=1), locale='es') u'1 segundo'
The granularity parameter can be provided to alter the lowest unit presented, which defaults to a second.
>>> format_timedelta(timedelta(hours=3), granularity='day', locale='en_US') u'1 day'
The threshold parameter can be used to determine at which value the presentation switches to the next higher unit. A higher threshold factor means the presentation will switch later. For example:
>>> format_timedelta(timedelta(hours=23), threshold=0.9, locale='en_US') u'1 day' >>> format_timedelta(timedelta(hours=23), threshold=1.1, locale='en_US') u'23 hours'
In addition directional information can be provided that informs the user if the date is in the past or in the future:
>>> format_timedelta(timedelta(hours=1), add_direction=True, locale='en') u'in 1 hour' >>> format_timedelta(timedelta(hours=-1), add_direction=True, locale='en') u'1 hour ago'
The format parameter controls how compact or wide the presentation is:
>>> format_timedelta(timedelta(hours=3), format='short', locale='en') u'3 hr' >>> format_timedelta(timedelta(hours=3), format='narrow', locale='en') u'3h'
- Parameters
- delta -- a timedelta object representing the time difference to format, or the delta in seconds as an int value
- granularity -- determines the smallest unit that should be displayed, the value can be one of "year", "month", "week", "day", "hour", "minute" or "second"
- threshold -- factor that determines at which point the presentation switches to the next higher unit
- add_direction -- if this flag is set to True the return value will include directional information. For instance a positive timedelta will include the information about it being in the future, a negative will be information about the value being in the past.
- format -- the format, can be "narrow", "short" or "long". ( "medium" is deprecated, currently converted to "long" to maintain compatibility)
- locale -- a Locale object or a locale identifier
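How granularity and threshold interact is easier to see in code. The following sketch models only the unit selection, not the localized output, and is not Babel's implementation: pick_unit and UNITS are hypothetical names, and the unit lengths (30-day months, 365-day years) are assumptions.

```python
from datetime import timedelta
from typing import Tuple

# Assumed unit lengths in seconds, from largest to smallest.
UNITS = (
    ("year", 3600 * 24 * 365),
    ("month", 3600 * 24 * 30),
    ("week", 3600 * 24 * 7),
    ("day", 3600 * 24),
    ("hour", 3600),
    ("minute", 60),
    ("second", 1),
)


def pick_unit(delta: timedelta, granularity: str = "second",
              threshold: float = 0.85) -> Tuple[int, str]:
    seconds = abs(int(delta.total_seconds()))
    for unit, unit_seconds in UNITS:
        value = seconds / unit_seconds
        # Stop at the first unit that is "full enough", or at the
        # smallest unit the caller allows.
        if value >= threshold or unit == granularity:
            if unit == granularity and value > 0:
                value = max(1, value)  # never round a non-zero delta to 0
            return int(round(value)), unit
    return 0, granularity
```

Under these assumptions, 23 hours is 0.958 of a day: that exceeds a threshold of 0.9, giving one day, but not a threshold of 1.1, giving 23 hours.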
- babel.dates.format_skeleton(skeleton, datetime=None, tzinfo=None, fuzzy=True, locale=default_locale('LC_TIME'))
Return a time and/or date formatted according to the given pattern.
The skeletons are defined in the CLDR data and provide more flexibility than the simple short/long/medium formats, but are a bit harder to use. They are defined using the date/time symbols without order or punctuation and map to a suitable format for the given locale.
>>> from datetime import datetime
>>> t = datetime(2007, 4, 1, 15, 30)
>>> format_skeleton('MMMEd', t, locale='fr')
u'dim. 1 avr.'
>>> format_skeleton('MMMEd', t, locale='en')
u'Sun, Apr 1'
>>> format_skeleton('yMMd', t, locale='fi')  # yMMd is not in the Finnish locale; yMd gets used
u'1.4.2007'
>>> format_skeleton('yMMd', t, fuzzy=False, locale='fi')  # yMMd is not in the Finnish locale, an error is thrown
Traceback (most recent call last):
  ...
KeyError: yMMd
After the skeleton is resolved to a pattern, format_datetime is called, so all timezone processing is the same as for that function.
- Parameters
- skeleton -- A date time skeleton as defined in the cldr data.
- datetime -- the time or datetime object; if None, the current time in UTC is used
- tzinfo -- the time-zone to apply to the time for display
- fuzzy -- If the skeleton is not found, allow choosing a skeleton that's close enough to it.
- locale -- a Locale object or a locale identifier
- babel.dates.format_interval(start, end, skeleton=None, tzinfo=None, fuzzy=True, locale=default_locale('LC_TIME'))
Format an interval between two instants according to the locale's rules.
>>> from datetime import date, time >>> format_interval(date(2016, 1, 15), date(2016, 1, 17), "yMd", locale="fi") u'15.–17.1.2016'
>>> format_interval(time(12, 12), time(16, 16), "Hm", locale="en_GB") '12:12–16:16'
>>> format_interval(time(5, 12), time(16, 16), "hm", locale="en_US") '5:12 AM – 4:16 PM'
>>> format_interval(time(16, 18), time(16, 24), "Hm", locale="it") '16:18–16:24'
If the start instant equals the end instant, the interval is formatted like the instant.
>>> format_interval(time(16, 18), time(16, 18), "Hm", locale="it") '16:18'
Unknown skeletons fall back to "default" formatting.
>>> format_interval(date(2015, 1, 1), date(2017, 1, 1), "wzq", locale="ja") '2015/01/01~2017/01/01'
>>> format_interval(time(16, 18), time(16, 24), "xxx", locale="ja") '16:18:00~16:24:00'
>>> format_interval(date(2016, 1, 15), date(2016, 1, 17), "xxx", locale="de") '15.01.2016 – 17.01.2016'
- Parameters
- start -- First instant (datetime/date/time)
- end -- Second instant (datetime/date/time)
- skeleton -- The "skeleton format" to use for formatting.
- tzinfo -- tzinfo to use (if none is already attached)
- fuzzy -- If the skeleton is not found, allow choosing a skeleton that's close enough to it.
- locale -- A locale object or identifier.
- Returns
Formatted interval
Timezone Functionality
- babel.dates.get_timezone(zone: str | tzinfo | None = None) -> tzinfo
Looks up a timezone by name and returns it. The timezone object returned comes from pytz or zoneinfo, whichever is available. It corresponds to the tzinfo interface and can be used with all of the functions of Babel that operate with dates.
If a timezone is not known a LookupError is raised. If zone is None a local zone object is returned.
- Parameters
zone -- the name of the timezone to look up. If a timezone object itself is passed in, it's returned unchanged.
- babel.dates.get_timezone_gmt(datetime: _Instant = None, width: Literal['long', 'short', 'iso8601', 'iso8601_short'] = 'long', locale: Locale | str | None = 'en_US_POSIX', return_z: bool = False) -> str
Return the timezone associated with the given datetime object formatted as string indicating the offset from GMT.
>>> from datetime import datetime
>>> dt = datetime(2007, 4, 1, 15, 30)
>>> get_timezone_gmt(dt, locale='en')
u'GMT+00:00'
>>> get_timezone_gmt(dt, locale='en', return_z=True)
'Z'
>>> get_timezone_gmt(dt, locale='en', width='iso8601_short')
u'+00'
>>> tz = get_timezone('America/Los_Angeles')
>>> dt = _localize(tz, datetime(2007, 4, 1, 15, 30))
>>> get_timezone_gmt(dt, locale='en')
u'GMT-07:00'
>>> get_timezone_gmt(dt, 'short', locale='en')
u'-0700'
>>> get_timezone_gmt(dt, locale='en', width='iso8601_short')
u'-07'
The long format depends on the locale; for example, in France the acronym UTC is used instead of GMT:
>>> get_timezone_gmt(dt, 'long', locale='fr_FR') u'UTC-07:00'
Added in version 0.9.
- Parameters
- datetime -- the datetime object; if None, the current date and time in UTC is used
- width -- either "long" or "short" or "iso8601" or "iso8601_short"
- locale -- the Locale object, or a locale string
- return_z -- True or False; Function returns indicator "Z" when local time offset is 0
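The four width values differ only in how the UTC offset is rendered. The sketch below reproduces the shapes shown in the examples above from a plain timedelta. It is an illustrative model, not Babel's code: format_offset is a hypothetical name, and the gmt_pattern parameter is an assumed stand-in for the locale-specific wrapper such as "GMT%s" or "UTC%s".

```python
from datetime import timedelta


def format_offset(offset: timedelta, width: str = "long",
                  gmt_pattern: str = "GMT%s", return_z: bool = False) -> str:
    total_minutes = int(offset.total_seconds()) // 60
    if return_z and total_minutes == 0:
        return "Z"
    sign = "-" if total_minutes < 0 else "+"
    hours, minutes = divmod(abs(total_minutes), 60)
    if width == "iso8601_short" and minutes == 0:
        return f"{sign}{hours:02d}"  # whole hours only, e.g. "-07"
    if width in ("short", "iso8601_short"):
        return f"{sign}{hours:02d}{minutes:02d}"  # e.g. "-0700"
    if width == "iso8601":
        return f"{sign}{hours:02d}:{minutes:02d}"  # e.g. "-07:00"
    # "long": wrap the offset in the locale-specific pattern.
    return gmt_pattern % f"{sign}{hours:02d}:{minutes:02d}"
```

In real use the offset would come from the utcoffset() of an aware datetime.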
- babel.dates.get_timezone_location(dt_or_tzinfo: _DtOrTzinfo = None, locale: Locale | str | None = 'en_US_POSIX', return_city: bool = False) -> str
Return a representation of the given timezone using "location format".
The result depends on both the local display name of the country and the city associated with the time zone:
>>> tz = get_timezone('America/St_Johns')
>>> print(get_timezone_location(tz, locale='de_DE'))
Kanada (St. John’s) (Ortszeit)
>>> print(get_timezone_location(tz, locale='en'))
Canada (St. John’s) Time
>>> print(get_timezone_location(tz, locale='en', return_city=True))
St. John’s
>>> tz = get_timezone('America/Mexico_City')
>>> get_timezone_location(tz, locale='de_DE')
u'Mexiko (Mexiko-Stadt) (Ortszeit)'
If the timezone is associated with a country that uses only a single timezone, just the localized country name is returned:
>>> tz = get_timezone('Europe/Berlin') >>> get_timezone_name(tz, locale='de_DE') u'Mitteleurop\xe4ische Zeit'
Added in version 0.9.
- Parameters
- dt_or_tzinfo -- the datetime or tzinfo object that determines the timezone; if None, the current date and time in UTC is assumed
- locale -- the Locale object, or a locale string
- return_city -- True or False, if True then return exemplar city (location) for the time zone
- Returns
the localized timezone name using location format
- babel.dates.get_timezone_name(dt_or_tzinfo: _DtOrTzinfo = None, width: Literal['long', 'short'] = 'long', uncommon: bool = False, locale: Locale | str | None = 'en_US_POSIX', zone_variant: Literal['generic', 'daylight', 'standard'] | None = None, return_zone: bool = False) -> str
Return the localized display name for the given timezone. The timezone may be specified using a datetime or tzinfo object.
>>> from datetime import time >>> dt = time(15, 30, tzinfo=get_timezone('America/Los_Angeles')) >>> get_timezone_name(dt, locale='en_US') u'Pacific Standard Time' >>> get_timezone_name(dt, locale='en_US', return_zone=True) 'America/Los_Angeles' >>> get_timezone_name(dt, width='short', locale='en_US') u'PST'
If this function gets passed only a tzinfo object and no concrete datetime, the returned display name is independent of daylight savings time. This can be used for example for selecting timezones, or to set the time of events that recur across DST changes:
>>> tz = get_timezone('America/Los_Angeles') >>> get_timezone_name(tz, locale='en_US') u'Pacific Time' >>> get_timezone_name(tz, 'short', locale='en_US') u'PT'
If no localized display name for the timezone is available, and the timezone is associated with a country that uses only a single timezone, the name of that country is returned, formatted according to the locale:
>>> tz = get_timezone('Europe/Berlin') >>> get_timezone_name(tz, locale='de_DE') u'Mitteleurop\xe4ische Zeit' >>> get_timezone_name(tz, locale='pt_BR') u'Hor\xe1rio da Europa Central'
On the other hand, if the country uses multiple timezones, the city is also included in the representation:
>>> tz = get_timezone('America/St_Johns') >>> get_timezone_name(tz, locale='de_DE') u'Neufundland-Zeit'
Note that short format is currently not supported for all timezones and all locales. This is partially because not every timezone has a short code in every locale. In that case it currently falls back to the long format.
For more information see LDML Appendix J: Time Zone Display Names
Added in version 0.9.
Changed in version 1.0: Added zone_variant support.
- Parameters
- dt_or_tzinfo -- the datetime or tzinfo object that determines the timezone; if a tzinfo object is used, the resulting display name will be generic, i.e. independent of daylight savings time; if None, the current date in UTC is assumed
- width -- either "long" or "short"
- uncommon -- deprecated and ignored
- zone_variant -- defines the zone variation to return. By default the variation is defined from the datetime object passed in. If no datetime object is passed in, the 'generic' variation is assumed. The following values are valid: 'generic', 'daylight' and 'standard'.
- locale -- the Locale object, or a locale string
- return_zone -- True or False. If true then function returns long time zone ID
- babel.dates.UTC
A timezone object for UTC.
- babel.dates.LOCALTZ
A timezone object for the computer's local timezone.
Data Access
- babel.dates.get_period_names(width: Literal['abbreviated', 'narrow', 'wide'] = 'wide', context: _Context = 'stand-alone', locale: Locale | str | None = 'en_US_POSIX') -> LocaleDataDict
Return the names for day periods (AM/PM) used by the locale.
>>> get_period_names(locale='en_US')['am'] u'AM'
- Parameters
- width -- the width to use, one of "abbreviated", "narrow", or "wide"
- context -- the context, either "format" or "stand-alone"
- locale -- the Locale object, or a locale string
- babel.dates.get_day_names(width: Literal['abbreviated', 'narrow', 'short', 'wide'] = 'wide', context: _Context = 'format', locale: Locale | str | None = 'en_US_POSIX') -> LocaleDataDict
Return the day names used by the locale for the specified format.
>>> get_day_names('wide', locale='en_US')[1] u'Tuesday' >>> get_day_names('short', locale='en_US')[1] u'Tu' >>> get_day_names('abbreviated', locale='es')[1] u'mar' >>> get_day_names('narrow', context='stand-alone', locale='de_DE')[1] u'D'
- Parameters
- width -- the width to use, one of "wide", "abbreviated", "short" or "narrow"
- context -- the context, either "format" or "stand-alone"
- locale -- the Locale object, or a locale string
- babel.dates.get_month_names(width: Literal['abbreviated', 'narrow', 'wide'] = 'wide', context: _Context = 'format', locale: Locale | str | None = 'en_US_POSIX') -> LocaleDataDict
Return the month names used by the locale for the specified format.
>>> get_month_names('wide', locale='en_US')[1] u'January' >>> get_month_names('abbreviated', locale='es')[1] u'ene' >>> get_month_names('narrow', context='stand-alone', locale='de_DE')[1] u'J'
- Parameters
- width -- the width to use, one of "wide", "abbreviated", or "narrow"
- context -- the context, either "format" or "stand-alone"
- locale -- the Locale object, or a locale string
- babel.dates.get_quarter_names(width: Literal['abbreviated', 'narrow', 'wide'] = 'wide', context: _Context = 'format', locale: Locale | str | None = 'en_US_POSIX') -> LocaleDataDict
Return the quarter names used by the locale for the specified format.
>>> get_quarter_names('wide', locale='en_US')[1] u'1st quarter' >>> get_quarter_names('abbreviated', locale='de_DE')[1] u'Q1' >>> get_quarter_names('narrow', locale='de_DE')[1] u'1'
- Parameters
- width -- the width to use, one of "wide", "abbreviated", or "narrow"
- context -- the context, either "format" or "stand-alone"
- locale -- the Locale object, or a locale string
- babel.dates.get_era_names(width: Literal['abbreviated', 'narrow', 'wide'] = 'wide', locale: Locale | str | None = 'en_US_POSIX') -> LocaleDataDict
Return the era names used by the locale for the specified format.
>>> get_era_names('wide', locale='en_US')[1] u'Anno Domini' >>> get_era_names('abbreviated', locale='de_DE')[1] u'n. Chr.'
- Parameters
- width -- the width to use, either "wide", "abbreviated", or "narrow"
- locale -- the Locale object, or a locale string
- babel.dates.get_date_format(format: _PredefinedTimeFormat = 'medium', locale: Locale | str | None = 'en_US_POSIX') -> DateTimePattern
Return the date formatting patterns used by the locale for the specified format.
>>> get_date_format(locale='en_US') <DateTimePattern u'MMM d, y'> >>> get_date_format('full', locale='de_DE') <DateTimePattern u'EEEE, d. MMMM y'>
- Parameters
- format -- the format to use, one of "full", "long", "medium", or "short"
- locale -- the Locale object, or a locale string
- babel.dates.get_datetime_format(format: _PredefinedTimeFormat = 'medium', locale: Locale | str | None = 'en_US_POSIX') -> DateTimePattern
Return the datetime formatting patterns used by the locale for the specified format.
>>> get_datetime_format(locale='en_US') u'{1}, {0}'
- Parameters
- format -- the format to use, one of "full", "long", "medium", or "short"
- locale -- the Locale object, or a locale string
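The returned pattern is a template that glues an already formatted date and time together. A minimal sketch of that step follows, assuming the CLDR convention that {1} stands for the date and {0} for the time; combine is a hypothetical helper, not part of Babel.

```python
def combine(datetime_pattern: str, date_text: str, time_text: str) -> str:
    # Assumption (CLDR convention): {1} is the date and {0} is the time.
    return datetime_pattern.replace("{0}", time_text).replace("{1}", date_text)
```

For the en_US pattern shown above, combine('{1}, {0}', 'Apr 1, 2007', '3:30:00 PM') yields 'Apr 1, 2007, 3:30:00 PM'.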
- babel.dates.get_time_format(format: _PredefinedTimeFormat = 'medium', locale: Locale | str | None = 'en_US_POSIX') -> DateTimePattern
Return the time formatting patterns used by the locale for the specified format.
>>> get_time_format(locale='en_US') <DateTimePattern u'h:mm:ss a'> >>> get_time_format('full', locale='de_DE') <DateTimePattern u'HH:mm:ss zzzz'>
- Parameters
- format -- the format to use, one of "full", "long", "medium", or "short"
- locale -- the Locale object, or a locale string
Basic Parsing
- babel.dates.parse_date(string: str, locale: Locale | str | None = 'en_US_POSIX', format: _PredefinedTimeFormat = 'medium') -> datetime.date
Parse a date from a string.
This function first tries to interpret the string as ISO-8601 date format, then uses the date format for the locale as a hint to determine the order in which the date fields appear in the string.
>>> parse_date('4/1/04', locale='en_US')
datetime.date(2004, 4, 1)
>>> parse_date('01.04.2004', locale='de_DE')
datetime.date(2004, 4, 1)
>>> parse_date('2004-04-01', locale='en_US')
datetime.date(2004, 4, 1)
>>> parse_date('2004-04-01', locale='de_DE')
datetime.date(2004, 4, 1)
- Parameters
- string -- the string containing the date
- locale -- a Locale object or a locale identifier
- format -- the format to use (see get_date_format)
- babel.dates.parse_time(string: str, locale: Locale | str | None = 'en_US_POSIX', format: _PredefinedTimeFormat = 'medium') -> datetime.time
Parse a time from a string.
This function uses the time format for the locale as a hint to determine the order in which the time fields appear in the string.
>>> parse_time('15:30:00', locale='en_US')
datetime.time(15, 30)
- Parameters
- string -- the string containing the time
- locale -- a Locale object or a locale identifier
- format -- the format to use (see get_time_format)
- Returns
the parsed time
- Return type
time
- babel.dates.parse_pattern(pattern: str | DateTimePattern) -> DateTimePattern
Parse date, time, and datetime format patterns.
>>> parse_pattern("MMMMd").format
u'%(MMMM)s%(d)s'
>>> parse_pattern("MMM d, yyyy").format
u'%(MMM)s %(d)s, %(yyyy)s'
Pattern can contain literal strings in single quotes:
>>> parse_pattern("H:mm' Uhr 'z").format
u'%(H)s:%(mm)s Uhr %(z)s'
An actual single quote can be used by using two adjacent single quote characters:
>>> parse_pattern("hh' o''clock'").format
u"%(hh)s o'clock"
- Parameters
pattern -- the formatting pattern to parse
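A brief sketch of how the parsed result might be inspected. It relies only on the format attribute shown in the doctests above and on the signature, which also accepts an already parsed DateTimePattern.

```python
from babel.dates import parse_pattern

plain = parse_pattern("MMM d, yyyy")
quoted = parse_pattern("H:mm' Uhr 'z")

# The format attribute is a %-style template keyed by the pattern fields.
print(plain.format)
print(quoted.format)

# An already parsed pattern is accepted as input as well.
same = parse_pattern(plain)
print(same.format == plain.format)
```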
Languages
The languages module provides functionality to access data about languages that is not bound to a given locale.
Official Languages
- babel.languages.get_official_languages(territory: str, regional: bool = False, de_facto: bool = False) -> tuple[str, ...]
Get the official language(s) for the given territory.
The language codes, if any are known, are returned in order of descending popularity.
If the regional flag is set, then languages which are regionally official are also returned.
If the de_facto flag is set, then languages which are “de facto” official are also returned.
- WARNING:
Note that the data is as up to date as the current version of the CLDR used by Babel. If you need scientifically accurate information, use another source!
- Parameters
- territory (str) – Territory code
- regional (bool) – Whether to return regionally official languages too
- de_facto (bool) – Whether to return de-facto official languages too
- Returns
Tuple of language codes
- Return type
tuple[str]
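The effect of the two flags can be compared directly. This is a sketch only; the territory code "CH" is chosen for illustration, and the exact tuples depend on the CLDR version bundled with Babel, so no output is shown.

```python
from babel.languages import get_official_languages

strict = get_official_languages("CH")
with_regional = get_official_languages("CH", regional=True)
with_de_facto = get_official_languages("CH", de_facto=True)

# Each flag can only add language codes to the strict result.
print(strict)
print(with_regional)
print(with_de_facto)
```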
- babel.languages.get_territory_language_info(territory: str) -> dict[str, dict[str, float | str | None]]
Get a dictionary of language information for a territory.
The dictionary is keyed by language code; the values are dicts with more information.
The following keys are currently known for the values:
- population_percent: The percentage of the territory’s population speaking the language.
- official_status: An optional string describing the officiality status of the language. Known values are “official”, “official_regional” and “de_facto_official”.
- WARNING:
Note that the data is as up to date as the current version of the CLDR used by Babel. If you need scientifically accurate information, use another source!
- NOTE:
Note that the format of the dict returned may change between Babel versions.
See https://www.unicode.org/cldr/charts/latest/supplemental/territory_language_information.html
- Parameters
territory (str) – Territory code
- Returns
Language information dictionary
- Return type
dict[str, dict]
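A minimal sketch of reading the returned dictionary, assuming only the two keys listed above. Because the format may change between Babel versions, the values are read with dict.get rather than by direct indexing.

```python
from babel.languages import get_territory_language_info

info = get_territory_language_info("FI")

# Rank languages by the share of the population speaking them.
ranked = sorted(
    info.items(),
    key=lambda item: item[1].get("population_percent") or 0,
    reverse=True,
)

for code, details in ranked[:5]:
    print(code, details.get("population_percent"), details.get("official_status"))
```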
List Formatting
This module lets you format lists of items in a locale-dependent manner.
- babel.lists.format_list(lst: Sequence[str], style: Literal['standard', 'standard-short', 'or', 'or-short', 'unit', 'unit-short', 'unit-narrow'] = 'standard', locale: Locale | str | None = 'en_US_POSIX') -> str
Format the items in lst as a list.
>>> format_list(['apples', 'oranges', 'pears'], locale='en')
u'apples, oranges, and pears'
>>> format_list(['apples', 'oranges', 'pears'], locale='zh')
u'apples、oranges和pears'
>>> format_list(['omena', 'peruna', 'aplari'], style='or', locale='fi')
u'omena, peruna tai aplari'
These styles are defined, but not all are necessarily available in all locales. The following text is verbatim from the Unicode TR35-49 spec [1].
- standard: A typical ‘and’ list for arbitrary placeholders. eg. “January, February, and March”
- standard-short: A short version of an ‘and’ list, suitable for use with short or abbreviated placeholder values. eg. “Jan., Feb., and Mar.”
- or: A typical ‘or’ list for arbitrary placeholders. eg. “January, February, or March”
- or-short: A short version of an ‘or’ list. eg. “Jan., Feb., or Mar.”
- unit: A list suitable for wide units. eg. “3 feet, 7 inches”
- unit-short: A list suitable for short units eg. “3 ft, 7 in”
- unit-narrow: A list suitable for narrow units, where space on the screen is very limited. eg. “3′ 7″”
[1]: https://www.unicode.org/reports/tr35/tr35-49/tr35-general.html#ListPatterns
- Parameters
- lst – a sequence of items to format into a list
- style – the style to format the list with. See above for description.
- locale – the locale
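As a small sketch, the same items can be formatted with two of the styles listed above. Whether a given style is available depends on the locale data.

```python
from babel.lists import format_list

fruits = ["apples", "oranges", "pears"]

and_list = format_list(fruits, locale="en")            # default 'standard' style
or_list = format_list(fruits, style="or", locale="en")

print(and_list)
print(or_list)
```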
Messages and Catalogs
Babel provides functionality to work with message catalogs. This part of the API documentation shows those parts.
Messages and Catalogs
This module provides a basic interface for holding catalog and message information. It is generally used to modify a gettext catalog, not to actually use the translations.
Catalogs
- class babel.messages.catalog.Catalog(locale: str | Locale | None = None, domain: str | None = None, header_comment: str | None = '# Translations template for PROJECT.\n# Copyright (C) YEAR ORGANIZATION\n# This file is distributed under the same license as the PROJECT project.\n# FIRST AUTHOR <EMAIL@ADDRESS>, YEAR.\n#', project: str | None = None, version: str | None = None, copyright_holder: str | None = None, msgid_bugs_address: str | None = None, creation_date: datetime | str | None = None, revision_date: datetime | time | float | str | None = None, last_translator: str | None = None, language_team: str | None = None, charset: str | None = None, fuzzy: bool = True)
Representation of a message catalog.
- __iter__() -> Iterator[Message]
Iterates through all the entries in the catalog, in the order they were added, yielding a Message object for every entry.
- Return type
iterator
- add(id: _MessageID, string: _MessageID | None = None, locations: Iterable[tuple[str, int]] = (), flags: Iterable[str] = (), auto_comments: Iterable[str] = (), user_comments: Iterable[str] = (), previous_id: _MessageID = (), lineno: int | None = None, context: str | None = None) -> Message
Add or update the message with the specified ID.
>>> catalog = Catalog()
>>> catalog.add(u'foo')
<Message ...>
>>> catalog[u'foo']
<Message u'foo' (flags: [])>
This method simply constructs a Message object with the given arguments and invokes __setitem__ with that object.
- Parameters
- id – the message ID, or a (singular, plural) tuple for pluralizable messages
- string – the translated message string, or a (singular, plural) tuple for pluralizable messages
- locations – a sequence of (filename, lineno) tuples
- flags – a set or sequence of flags
- auto_comments – a sequence of automatic comments
- user_comments – a sequence of user comments
- previous_id – the previous message ID, or a (singular, plural) tuple for pluralizable messages
- lineno – the line number on which the msgid line was found in the PO file, if any
- context – the message context
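The following sketch shows add() with a pluralizable message and with a message context. The message texts are made up for illustration.

```python
from babel.messages.catalog import Catalog

catalog = Catalog(locale="de_DE")
catalog.add("Open", "Öffnen", context="menu")
catalog.add(("salad", "salads"), ("Salat", "Salate"), locations=[("util.py", 42)])

# Pluralizable messages are looked up by their singular ID.
plural = catalog["salad"]
# A message added with a context must be fetched with the same context.
menu_open = catalog.get("Open", context="menu")
no_context = catalog.get("Open")

print(plural.string, menu_open.string, no_context)
```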
- check() -> Iterable[tuple[Message, list[TranslationError]]]
Run various validation checks on the translations in the catalog.
For every message that fails validation, this method yields a (message, errors) tuple, where message is the Message object and errors is a sequence of TranslationError objects.
- Return type
generator of (message, errors)
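A sketch of how check() might be used, assuming the built-in checker for Python format strings is active. The second translation deliberately uses a placeholder name that does not occur in the message ID.

```python
from babel.messages.catalog import Catalog

catalog = Catalog(locale="de_DE")
catalog.add("Hello %(name)s", "Hallo %(name)s")
catalog.add("Bye %(name)s", "Tschüss %(nom)s")  # placeholder name differs

# check() is a generator, so collect its results before inspecting them.
problems = list(catalog.check())
for message, errors in problems:
    print(message.id, [str(error) for error in errors])
```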
- delete(id: _MessageID, context: str | None = None) -> None
Delete the message with the specified ID and context.
- Parameters
- id – the message ID
- context – the message context, or None for no context
- get(id: _MessageID, context: str | None = None) -> Message | None
Return the message with the specified ID and context.
- Parameters
- id – the message ID
- context – the message context, or None for no context
- property header_comment: str
The header comment for the catalog.
>>> catalog = Catalog(project='Foobar', version='1.0',
...                   copyright_holder='Foo Company')
>>> print(catalog.header_comment)
# Translations template for Foobar.
# Copyright (C) ... Foo Company
# This file is distributed under the same license as the Foobar project.
# FIRST AUTHOR <EMAIL@ADDRESS>, ....
#
The header can also be set from a string. Any known upper-case variables will be replaced when the header is retrieved again:
>>> catalog = Catalog(project='Foobar', version='1.0',
...                   copyright_holder='Foo Company')
>>> catalog.header_comment = '''\
... # The POT for my really cool PROJECT project.
... # Copyright (C) 1990-2003 ORGANIZATION
... # This file is distributed under the same license as the PROJECT
... # project.
... #'''
>>> print(catalog.header_comment)
# The POT for my really cool Foobar project.
# Copyright (C) 1990-2003 Foo Company
# This file is distributed under the same license as the Foobar
# project.
#
- Type
unicode
- is_identical(other: Catalog) -> bool
Checks if catalogs are identical, taking into account messages and headers.
- language_team
Name and email address of the language team.
- last_translator
Name and email address of the last translator.
- property mime_headers: list[tuple[str, str]]
The MIME headers of the catalog, used for the special msgid "" entry.
The behavior of this property changes slightly depending on whether a locale is set or not, the latter indicating that the catalog is actually a template for actual translations.
Here’s an example of the output for such a catalog template:
>>> from babel.dates import UTC
>>> from datetime import datetime
>>> created = datetime(1990, 4, 1, 15, 30, tzinfo=UTC)
>>> catalog = Catalog(project='Foobar', version='1.0',
...                   creation_date=created)
>>> for name, value in catalog.mime_headers:
...     print('%s: %s' % (name, value))
Project-Id-Version: Foobar 1.0
Report-Msgid-Bugs-To: EMAIL@ADDRESS
POT-Creation-Date: 1990-04-01 15:30+0000
PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE
Last-Translator: FULL NAME <EMAIL@ADDRESS>
Language-Team: LANGUAGE <LL@li.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Generated-By: Babel ...
And here’s an example of the output when the locale is set:
>>> revised = datetime(1990, 8, 3, 12, 0, tzinfo=UTC)
>>> catalog = Catalog(locale='de_DE', project='Foobar', version='1.0',
...                   creation_date=created, revision_date=revised,
...                   last_translator='John Doe <jd@example.com>',
...                   language_team='de_DE <de@example.com>')
>>> for name, value in catalog.mime_headers:
...     print('%s: %s' % (name, value))
Project-Id-Version: Foobar 1.0
Report-Msgid-Bugs-To: EMAIL@ADDRESS
POT-Creation-Date: 1990-04-01 15:30+0000
PO-Revision-Date: 1990-08-03 12:00+0000
Last-Translator: John Doe <jd@example.com>
Language: de_DE
Language-Team: de_DE <de@example.com>
Plural-Forms: nplurals=2; plural=(n != 1);
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Generated-By: Babel ...
- Type
list
- property num_plurals: int
The number of plurals used by the catalog or locale.
>>> Catalog(locale='en').num_plurals
2
>>> Catalog(locale='ga').num_plurals
5
- Type
int
- property plural_expr: str
The plural expression used by the catalog or locale.
>>> Catalog(locale='en').plural_expr
'(n != 1)'
>>> Catalog(locale='ga').plural_expr
'(n==1 ? 0 : n==2 ? 1 : n>=3 && n<=6 ? 2 : n>=7 && n<=10 ? 3 : 4)'
>>> Catalog(locale='ding').plural_expr  # unknown locale
'(n != 1)'
- Type
str
- property plural_forms: str
Return the plural forms declaration for the locale.
>>> Catalog(locale='en').plural_forms
'nplurals=2; plural=(n != 1);'
>>> Catalog(locale='pt_BR').plural_forms
'nplurals=2; plural=(n > 1);'
- Type
str
- update(template: Catalog, no_fuzzy_matching: bool = False, update_header_comment: bool = False, keep_user_comments: bool = True, update_creation_date: bool = True) -> None
Update the catalog based on the given template catalog.
>>> from babel.messages import Catalog
>>> template = Catalog()
>>> template.add('green', locations=[('main.py', 99)])
<Message ...>
>>> template.add('blue', locations=[('main.py', 100)])
<Message ...>
>>> template.add(('salad', 'salads'), locations=[('util.py', 42)])
<Message ...>
>>> catalog = Catalog(locale='de_DE')
>>> catalog.add('blue', u'blau', locations=[('main.py', 98)])
<Message ...>
>>> catalog.add('head', u'Kopf', locations=[('util.py', 33)])
<Message ...>
>>> catalog.add(('salad', 'salads'), (u'Salat', u'Salate'),
...             locations=[('util.py', 38)])
<Message ...>

>>> catalog.update(template)
>>> len(catalog)
3

>>> msg1 = catalog['green']
>>> msg1.string
>>> msg1.locations
[('main.py', 99)]

>>> msg2 = catalog['blue']
>>> msg2.string
u'blau'
>>> msg2.locations
[('main.py', 100)]

>>> msg3 = catalog['salad']
>>> msg3.string
(u'Salat', u'Salate')
>>> msg3.locations
[('util.py', 42)]
Messages that are in the catalog but not in the template are removed from the main collection, but can still be accessed via the obsolete member:
>>> 'head' in catalog
False
>>> list(catalog.obsolete.values())
[<Message 'head' (flags: [])>]
- Parameters
- template – the reference catalog, usually read from a POT file
- no_fuzzy_matching – whether to disable fuzzy matching of message IDs
Messages
- class babel.messages.catalog.Message(id: _MessageID, string: _MessageID | None = '', locations: Iterable[tuple[str, int]] = (), flags: Iterable[str] = (), auto_comments: Iterable[str] = (), user_comments: Iterable[str] = (), previous_id: _MessageID = (), lineno: int | None = None, context: str | None = None)
Representation of a single message in a catalog.
- check(catalog: Catalog | None = None) -> list[TranslationError]
Run various validation checks on the message. Some validations are only performed if the catalog is provided. This method returns a sequence of TranslationError objects.
- Return type
list
- Parameters
catalog – A catalog instance that is passed to the checkers
- See
Catalog.check for a way to perform checks for all messages in a catalog.
- property fuzzy: bool
Whether the translation is fuzzy.
>>> Message('foo').fuzzy
False
>>> msg = Message('foo', 'foo', flags=['fuzzy'])
>>> msg.fuzzy
True
>>> msg
<Message 'foo' (flags: ['fuzzy'])>
- Type
bool
- is_identical(other: Message) -> bool
Checks whether messages are identical, taking into account all properties.
- property pluralizable: bool
Whether the message is pluralizable.
>>> Message('foo').pluralizable
False
>>> Message(('foo', 'bar')).pluralizable
True
- Type
bool
- property python_format: bool
Whether the message contains Python-style parameters.
>>> Message('foo %(name)s bar').python_format
True
>>> Message(('foo %(name)s', 'foo %(name)s')).python_format
True
- Type
bool
Exceptions
- exception babel.messages.catalog.TranslationError
Exception thrown by translation checkers when invalid message translations are encountered.
Low-Level Extraction Interface
The low level extraction interface can be used to extract from directories or files directly. Normally this is not needed as the command line tools can do that for you.
Extraction Functions
The extraction functions are what the command line tools use internally to extract strings.
- babel.messages.extract.extract_from_dir(dirname: str | os.PathLike[str] | None = None, method_map: Iterable[tuple[str, str]] = [('**.py', 'python')], options_map: SupportsItems[str, dict[str, Any]] | None = None, keywords: Mapping[str, _Keyword] = {'N_': None, '_': None, 'dgettext': (2,), 'dngettext': (2, 3), 'gettext': None, 'ngettext': (1, 2), 'npgettext': ((1, 'c'), 2, 3), 'pgettext': ((1, 'c'), 2), 'ugettext': None, 'ungettext': (1, 2)}, comment_tags: Collection[str] = (), callback: Callable[[str, str, dict[str, Any]], object] | None = None, strip_comment_tags: bool = False, directory_filter: Callable[[str], bool] | None = None) -> Generator[_FileExtractionResult, None, None]
Extract messages from any source files found in the given directory.
This function generates tuples of the form (filename, lineno, message, comments, context).
Which extraction method is used per file is determined by the method_map parameter, which maps extended glob patterns to extraction method names. For example, the following is the default mapping:
>>> method_map = [
...     ('**.py', 'python')
... ]
This basically says that files with the filename extension “.py” at any level inside the directory should be processed by the “python” extraction method. Files that don’t match any of the mapping patterns are ignored. See the documentation of the pathmatch function for details on the pattern syntax.
The following extended mapping would also use the “genshi” extraction method on any file in “templates” subdirectory:
>>> method_map = [
...     ('**/templates/**.*', 'genshi'),
...     ('**.py', 'python')
... ]
The dictionary provided by the optional options_map parameter augments these mappings. It uses extended glob patterns as keys, and the values are dictionaries mapping option names to option values (both strings).
The glob patterns of the options_map do not necessarily need to be the same as those used in the method mapping. For example, while all files in the templates folders in an application may be Genshi templates, the options for those files may differ based on extension:
>>> options_map = {
...     '**/templates/**.txt': {
...         'template_class': 'genshi.template:TextTemplate',
...         'encoding': 'latin-1'
...     },
...     '**/templates/**.html': {
...         'include_attrs': ''
...     }
... }
- Parameters
- dirname – the path to the directory to extract messages from. If not given the current working directory is used.
- method_map – a list of (pattern, method) tuples that map extended glob patterns to extraction method names
- options_map – a dictionary of additional options (optional)
- keywords – a dictionary mapping keywords (i.e. names of functions that should be recognized as translation functions) to tuples that specify which of their arguments contain localizable strings
- comment_tags – a list of tags of translator comments to search for and include in the results
- callback – a function that is called for every file that messages are extracted from, just before the extraction itself is performed; the function is passed the filename, the name of the extraction method and the options dictionary as positional arguments, in that order
- strip_comment_tags – a flag that if set to True causes all comment tags to be removed from the collected comments.
- directory_filter – a callback to determine whether a directory should be recursed into. Receives the full directory path; should return True if the directory is valid.
- See
pathmatch
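A minimal sketch of extract_from_dir with the default method map, run against a throwaway directory. The file names and contents are invented for illustration.

```python
import tempfile
from pathlib import Path

from babel.messages.extract import extract_from_dir

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "app.py").write_text("print(_('Hello, world!'))\n", encoding="utf-8")
    # Not matched by the default '**.py' pattern, so this file is ignored.
    (root / "notes.txt").write_text("_('ignored')\n", encoding="utf-8")
    # Consume the generator while the directory still exists.
    results = list(extract_from_dir(tmp))

for filename, lineno, message, comments, context in results:
    print(filename, lineno, message, comments, context)
```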
- babel.messages.extract.extract_from_file(method: _ExtractionMethod, filename: str | os.PathLike[str], keywords: Mapping[str, _Keyword] = {'N_': None, '_': None, 'dgettext': (2,), 'dngettext': (2, 3), 'gettext': None, 'ngettext': (1, 2), 'npgettext': ((1, 'c'), 2, 3), 'pgettext': ((1, 'c'), 2), 'ugettext': None, 'ungettext': (1, 2)}, comment_tags: Collection[str] = (), options: Mapping[str, Any] | None = None, strip_comment_tags: bool = False) -> list[_ExtractionResult]
Extract messages from a specific file.
This function returns a list of tuples of the form (lineno, message, comments, context).
- Parameters
- filename – the path to the file to extract messages from
- method – a string specifying the extraction method (e.g. “python”)
- keywords – a dictionary mapping keywords (i.e. names of functions that should be recognized as translation functions) to tuples that specify which of their arguments contain localizable strings
- comment_tags – a list of translator tags to search for and include in the results
- strip_comment_tags – a flag that if set to True causes all comment tags to be removed from the collected comments.
- options – a dictionary of additional options (optional)
- Returns
list of tuples of the form (lineno, message, comments, context)
- Return type
list[tuple[int, str|tuple[str], list[str], str|None]]
- babel.messages.extract.extract(method: _ExtractionMethod, fileobj: _FileObj, keywords: Mapping[str, _Keyword] = {'N_': None, '_': None, 'dgettext': (2,), 'dngettext': (2, 3), 'gettext': None, 'ngettext': (1, 2), 'npgettext': ((1, 'c'), 2, 3), 'pgettext': ((1, 'c'), 2), 'ugettext': None, 'ungettext': (1, 2)}, comment_tags: Collection[str] = (), options: Mapping[str, Any] | None = None, strip_comment_tags: bool = False) -> Generator[_ExtractionResult, None, None]
Extract messages from the given file-like object using the specified extraction method.
This function returns tuples of the form (lineno, message, comments, context).
The implementation dispatches the actual extraction to plugins, based on the value of the method parameter.
>>> source = b'''# foo module
... def run(argv):
...    print(_('Hello, world!'))
... '''

>>> from io import BytesIO
>>> for message in extract('python', BytesIO(source)):
...     print(message)
(3, u'Hello, world!', [], None)
- Parameters
- method – an extraction method (a callable), or a string specifying the extraction method (e.g. “python”); if this is a simple name, the extraction function will be looked up by entry point; if it is an explicit reference to a function (of the form package.module:funcname or package.module.funcname), the corresponding function will be imported and used
- fileobj – the file-like object the messages should be extracted from
- keywords – a dictionary mapping keywords (i.e. names of functions that should be recognized as translation functions) to tuples that specify which of their arguments contain localizable strings
- comment_tags – a list of translator tags to search for and include in the results
- options – a dictionary of additional options (optional)
- strip_comment_tags – a flag that if set to True causes all comment tags to be removed from the collected comments.
- Raises
ValueError – if the extraction method is not registered
- Returns
iterable of tuples of the form (lineno, message, comments, context)
- Return type
Iterable[tuple[int, str|tuple[str], list[str], str|None]]
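The sketch below exercises two parameters that the doctest above leaves at their defaults: a pluralizable keyword (ngettext, from the default keywords mapping) and comment_tags. The tag "NOTE:" and the source snippet are made up for illustration.

```python
from io import BytesIO

from babel.messages.extract import extract

source = b"""\
# NOTE: shown in the basket summary
label = ngettext('%(num)d apple', '%(num)d apples', count)
"""

# Comments starting with a listed tag on the line before the call are collected.
messages = list(extract("python", BytesIO(source), comment_tags=("NOTE:",)))

for lineno, message, comments, context in messages:
    print(lineno, message, comments, context)
```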
Language Parsing
The language parsing functions extract strings from source files. The extraction functions call them automatically, but these low-level functions can also be invoked directly, which is useful when registering wrapper functions.
New functions can be registered through the setuptools entrypoint system.
- babel.messages.extract.extract_python(fileobj: IO[bytes], keywords: Mapping[str, _Keyword], comment_tags: Collection[str], options: _PyOptions) -> Generator[_ExtractionResult, None, None]
Extract messages from Python source code.
It returns an iterator yielding tuples in the following form (lineno, funcname, message, comments).
- Parameters
- fileobj – the seekable, file-like object the messages should be extracted from
- keywords – a list of keywords (i.e. function names) that should be recognized as translation functions
- comment_tags – a list of translator tags to search for and include in the results
- options – a dictionary of additional options (optional)
- Return type
iterator
- babel.messages.extract.extract_javascript(fileobj: _FileObj, keywords: Mapping[str, _Keyword], comment_tags: Collection[str], options: _JSOptions, lineno: int = 1) -> Generator[_ExtractionResult, None, None]
Extract messages from JavaScript source code.
- Parameters
- fileobj – the seekable, file-like object the messages should be extracted from
- keywords – a list of keywords (i.e. function names) that should be recognized as translation functions
- comment_tags – a list of translator tags to search for and include in the results
- options – a dictionary of additional options (optional). Supported options are:
  * jsx – set to false to disable JSX/E4X support.
  * template_string – if True, supports gettext(key)
  * parse_template_string – if True will parse the contents of javascript template strings.
- lineno – line number offset (for parsing embedded fragments)
- babel.messages.extract.extract_nothing(fileobj: _FileObj, keywords: Mapping[str, _Keyword], comment_tags: Collection[str], options: Mapping[str, Any]) -> list[_ExtractionResult]
Pseudo extractor that does not actually extract anything, but simply returns an empty list.
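Since extract() also accepts a callable as its method, a custom extractor can be tried out without registering an entry point. The sketch below is an assumption-laden illustration: extract_msg_lines and its "msg:" line format are hypothetical, and it assumes that a funcname of None is accepted for formats without translation function names.

```python
from io import BytesIO

from babel.messages.extract import extract


def extract_msg_lines(fileobj, keywords, comment_tags, options):
    """Hypothetical extractor: every line starting with 'msg:' is a message."""
    encoding = options.get("encoding", "utf-8")
    for lineno, raw in enumerate(fileobj, start=1):
        line = raw.decode(encoding).strip()
        if line.startswith("msg:"):
            # Yield (lineno, funcname, message, comments); funcname is None
            # because this made-up format has no translation function names.
            yield lineno, None, line[len("msg:"):].strip(), []


data = BytesIO(b"title\nmsg: Hello\nmsg: Goodbye\n")
found = list(extract(extract_msg_lines, data))
print(found)
```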
MO File Support
The MO file support can read and write MO files. It reads them into Catalog objects and also writes catalogs out.
- babel.messages.mofile.read_mo(fileobj: SupportsRead[bytes]) -> Catalog
Read a binary MO file from the given file-like object and return a corresponding Catalog object.
- Parameters
fileobj – the file-like object to read the MO file from
- Note
The implementation of this function is heavily based on the GNUTranslations._parse method of the gettext module in the standard library.
- babel.messages.mofile.write_mo(fileobj: SupportsWrite[bytes], catalog: Catalog, use_fuzzy: bool = False) -> None
Write a catalog to the specified file-like object using the GNU MO file format.
>>> import sys
>>> from babel.messages import Catalog
>>> from gettext import GNUTranslations
>>> from io import BytesIO

>>> catalog = Catalog(locale='en_US')
>>> catalog.add('foo', 'Voh')
<Message ...>
>>> catalog.add((u'bar', u'baz'), (u'Bahr', u'Batz'))
<Message ...>
>>> catalog.add('fuz', 'Futz', flags=['fuzzy'])
<Message ...>
>>> catalog.add('Fizz', '')
<Message ...>
>>> catalog.add(('Fuzz', 'Fuzzes'), ('', ''))
<Message ...>
>>> buf = BytesIO()

>>> write_mo(buf, catalog)
>>> x = buf.seek(0)
>>> translations = GNUTranslations(fp=buf)
>>> if sys.version_info[0] >= 3:
...     translations.ugettext = translations.gettext
...     translations.ungettext = translations.ngettext
>>> translations.ugettext('foo')
u'Voh'
>>> translations.ungettext('bar', 'baz', 1)
u'Bahr'
>>> translations.ungettext('bar', 'baz', 2)
u'Batz'
>>> translations.ugettext('fuz')
u'fuz'
>>> translations.ugettext('Fizz')
u'Fizz'
>>> translations.ugettext('Fuzz')
u'Fuzz'
>>> translations.ugettext('Fuzzes')
u'Fuzzes'
- Parameters
- fileobj – the file-like object to write to
- catalog – the Catalog instance
- use_fuzzy – whether translations marked as “fuzzy” should be included in the output
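As a complement to the gettext-based doctest above, a catalog can also be round-tripped through read_mo. This is a sketch with invented message texts; it relies on the default use_fuzzy=False, which leaves fuzzy translations out of the written file.

```python
from io import BytesIO

from babel.messages.catalog import Catalog
from babel.messages.mofile import read_mo, write_mo

catalog = Catalog(locale="de_DE")
catalog.add("foo", "Voh")
catalog.add("fuz", "Futz", flags=["fuzzy"])

buf = BytesIO()
write_mo(buf, catalog)  # fuzzy entries are skipped by default
buf.seek(0)
loaded = read_mo(buf)

print(loaded["foo"].string)
print(loaded.get("fuz"))
```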
PO File Support
The PO file support can read and write PO and POT files. It reads them into Catalog objects and also writes catalogs out.
- babel.messages.pofile.read_po(fileobj: IO[AnyStr] | Iterable[AnyStr], locale: str | Locale | None = None, domain: str | None = None, ignore_obsolete: bool = False, charset: str | None = None, abort_invalid: bool = False) -> Catalog
Read messages from a gettext PO (portable object) file from the given file-like object (or an iterable of lines) and return a Catalog.
>>> from datetime import datetime
>>> from io import StringIO
>>> buf = StringIO('''
... #: main.py:1
... #, fuzzy, python-format
... msgid "foo %(name)s"
... msgstr "quux %(name)s"
...
... # A user comment
... #. An auto comment
... #: main.py:3
... msgid "bar"
... msgid_plural "baz"
... msgstr[0] "bar"
... msgstr[1] "baaz"
... ''')
>>> catalog = read_po(buf)
>>> catalog.revision_date = datetime(2007, 4, 1)

>>> for message in catalog:
...     if message.id:
...         print((message.id, message.string))
...         print(' ', (message.locations, sorted(list(message.flags))))
...         print(' ', (message.user_comments, message.auto_comments))
(u'foo %(name)s', u'quux %(name)s')
  ([(u'main.py', 1)], [u'fuzzy', u'python-format'])
  ([], [])
((u'bar', u'baz'), (u'bar', u'baaz'))
  ([(u'main.py', 3)], [])
  ([u'A user comment'], [u'An auto comment'])
Added in version 1.0: Added support for explicit charset argument.
- Parameters
- fileobj – the file-like object (or iterable of lines) to read the PO file from
- locale – the locale identifier or Locale object, or None if the catalog is not bound to a locale (which basically means it’s a template)
- domain – the message domain
- ignore_obsolete – whether to ignore obsolete messages in the input
- charset – the character set of the catalog.
- abort_invalid – abort read if po file is invalid
- babel.messages.pofile.write_po(fileobj: SupportsWrite[bytes], catalog: Catalog, width: int = 76, no_location: bool = False, omit_header: bool = False, sort_output: bool = False, sort_by_file: bool = False, ignore_obsolete: bool = False, include_previous: bool = False, include_lineno: bool = True) -> None
Write a gettext PO (portable object) template file for a given message catalog to the provided file-like object.
>>> catalog = Catalog()
>>> catalog.add(u'foo %(name)s', locations=[('main.py', 1)],
...             flags=('fuzzy',))
<Message...>
>>> catalog.add((u'bar', u'baz'), locations=[('main.py', 3)])
<Message...>
>>> from io import BytesIO
>>> buf = BytesIO()
>>> write_po(buf, catalog, omit_header=True)
>>> print(buf.getvalue().decode("utf8"))
#: main.py:1
#, fuzzy, python-format
msgid "foo %(name)s"
msgstr ""

#: main.py:3
msgid "bar"
msgid_plural "baz"
msgstr[0] ""
msgstr[1] ""
- Parameters
- fileobj – the file-like object to write to
- catalog – the Catalog instance
- width – the maximum line width for the generated output; use None, 0, or a negative number to completely disable line wrapping
- no_location – do not emit a location comment for every message
- omit_header – do not include the msgid "" entry at the top of the output
- sort_output – whether to sort the messages in the output by msgid
- sort_by_file – whether to sort the messages in the output by their locations
- ignore_obsolete – whether to ignore obsolete messages and not include them in the output; by default they are included as comments
- include_previous – include the old msgid as a comment when updating the catalog
- include_lineno – include line number in the location comment
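A short round-trip sketch that writes a catalog with write_po and reads it back with read_po. The message texts are invented; the buffer is binary because write_po emits bytes.

```python
from io import BytesIO

from babel.messages.catalog import Catalog
from babel.messages.pofile import read_po, write_po

catalog = Catalog(locale="de_DE")
catalog.add("foo", "Voh", locations=[("main.py", 1)])
catalog.add(("bar", "baz"), ("Bahr", "Batz"))

buf = BytesIO()
write_po(buf, catalog)
buf.seek(0)
loaded = read_po(buf, locale="de_DE")

print(loaded["foo"].string, loaded["foo"].locations)
print(loaded["bar"].string)
```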
Numbers and Currencies
The number module provides functionality to format numbers for different locales. This includes arbitrary numbers as well as currency.
Number Formatting
- babel.numbers.format_number(number: float | Decimal | str, locale: Locale | str | None = 'en_US_POSIX') -> str
Return the given number formatted for a specific locale.
>>> format_number(1099, locale='en_US')
u'1,099'
>>> format_number(1099, locale='de_DE')
u'1.099'
Deprecated since version 2.6.0: Use babel.numbers.format_decimal() instead.
- Parameters
- number – the number to format
- locale – the Locale object or locale identifier
- babel.numbers.format_decimal(number: float | decimal.Decimal | str, format: str | NumberPattern | None = None, locale: Locale | str | None = 'en_US_POSIX', decimal_quantization: bool = True, group_separator: bool = True, *, numbering_system: Literal['default'] | str = 'latn') -> str
Return the given decimal number formatted for a specific locale.
>>> format_decimal(1.2345, locale='en_US')
u'1.234'
>>> format_decimal(1.2346, locale='en_US')
u'1.235'
>>> format_decimal(-1.2346, locale='en_US')
u'-1.235'
>>> format_decimal(1.2345, locale='sv_SE')
u'1,234'
>>> format_decimal(1.2345, locale='de')
u'1,234'
>>> format_decimal(1.2345, locale='ar_EG', numbering_system='default')
u'1٫234'
>>> format_decimal(1.2345, locale='ar_EG', numbering_system='latn')
u'1.234'
The appropriate thousands grouping and the decimal separator are used for each locale:
>>> format_decimal(12345.5, locale='en_US')
u'12,345.5'
By default the locale is allowed to truncate and round a high-precision number by forcing its format pattern onto the decimal part. You can bypass this behavior with the decimal_quantization parameter:
>>> format_decimal(1.2346, locale='en_US')
u'1.235'
>>> format_decimal(1.2346, locale='en_US', decimal_quantization=False)
u'1.2346'
>>> format_decimal(12345.67, locale='fr_CA', group_separator=False)
u'12345,67'
>>> format_decimal(12345.67, locale='en_US', group_separator=True)
u'12,345.67'
- Parameters
- number – the number to format
- format – an optional explicit number format pattern
- locale – the Locale object or locale identifier
- decimal_quantization – Truncate and round high-precision numbers to the format pattern. Defaults to True.
- group_separator – Boolean to switch group separator on/off in a locale’s number format.
- numbering_system – The numbering system used for formatting number symbols. Defaults to “latn”. The special value “default” will use the default numbering system of the locale.
- Raises
UnsupportedNumberingSystemError – If the numbering system is not supported by the locale.
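The effect of decimal_quantization can be sketched with a small wrapper. The helper name format_amount is hypothetical and not part of Babel. Passing decimal.Decimal values instead of floats is a choice made for this sketch, not a requirement of the API.

```python
from decimal import Decimal

from babel.numbers import format_decimal


def format_amount(value, locale, exact=False):
    """Hypothetical helper (not part of Babel).

    With exact=True the locale pattern is not allowed to round the
    fractional part; grouping and separators still follow the locale.
    """
    return format_decimal(value, locale=locale, decimal_quantization=not exact)


# The default pattern rounds the fraction to three places.
rounded = format_amount(Decimal("1.2346"), "en_US")
# Quantization switched off: all fractional digits are kept.
unrounded = format_amount(Decimal("1.2346"), "en_US", exact=True)
# Grouping is applied according to the locale.
grouped = format_amount(Decimal("12345.67"), "en_US")

print(rounded, unrounded, grouped)
```

The three results correspond to the en_US doctest outputs listed above ('1.235', '1.2346' and '12,345.67').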
- babel.numbers.format_compact_decimal(number: float | decimal.Decimal | str, *, format_type: Literal['short', 'long'] = 'short', locale: Locale | str | None = 'en_US_POSIX', fraction_digits: int = 0, numbering_system: Literal['default'] | str = 'latn') -> str
Return the given decimal number formatted for a specific locale in compact form.
>>> format_compact_decimal(12345, format_type="short", locale='en_US')
u'12K'
>>> format_compact_decimal(12345, format_type="long", locale='en_US')
u'12 thousand'
>>> format_compact_decimal(12345, format_type="short", locale='en_US', fraction_digits=2)
u'12.34K'
>>> format_compact_decimal(1234567, format_type="short", locale="ja_JP")
u'123万'
>>> format_compact_decimal(2345678, format_type="long", locale="mk")
u'2 милиони'
>>> format_compact_decimal(21000000, format_type="long", locale="mk")
u'21 милион'
>>> format_compact_decimal(12345, format_type="short", locale='ar_EG', fraction_digits=2, numbering_system='default')
u'12٫34 ألف'
- Parameters
- number – the number to format
- format_type – Compact format to use (“short” or “long”)
- locale – the Locale object or locale identifier
- fraction_digits – Number of digits after the decimal point to use. Defaults to 0.
- numbering_system – The numbering system used for formatting number symbols. Defaults to “latn”. The special value “default” will use the default numbering system of the locale.
- Raises
UnsupportedNumberingSystemError – If the numbering system is not supported by the locale.
- babel.numbers.format_currency(number: float | decimal.Decimal | str, currency: str, format: str | NumberPattern | None = None, locale: Locale | str | None = 'en_US_POSIX', currency_digits: bool = True, format_type: Literal['name', 'standard', 'accounting'] = 'standard', decimal_quantization: bool = True, group_separator: bool = True, *, numbering_system: Literal['default'] | str = 'latn') -> str
Return formatted currency value.
>>> format_currency(1099.98, 'USD', locale='en_US')
'$1,099.98'
>>> format_currency(1099.98, 'USD', locale='es_CO')
u'US$1.099,98'
>>> format_currency(1099.98, 'EUR', locale='de_DE')
u'1.099,98\xa0\u20ac'
>>> format_currency(1099.98, 'EGP', locale='ar_EG', numbering_system='default')
u'1٬099٫98 ج.م.'
The format can also be specified explicitly. The currency is placed with the ‘¤’ sign. As the sign gets repeated the format expands (¤ being the symbol, ¤¤ is the currency abbreviation and ¤¤¤ is the full name of the currency):
>>> format_currency(1099.98, 'EUR', u'¤¤ #,##0.00', locale='en_US')
u'EUR 1,099.98'
>>> format_currency(1099.98, 'EUR', u'#,##0.00 ¤¤¤', locale='en_US')
u'1,099.98 euros'
Currencies usually have a specific number of decimal digits. This function favours that information over the given format:
>>> format_currency(1099.98, 'JPY', locale='en_US')
u'\xa51,100'
>>> format_currency(1099.98, 'COP', u'#,##0.00', locale='es_ES')
u'1.099,98'
However, the number of decimal digits taken from the currency information can be overridden by setting the currency_digits parameter to False:
>>> format_currency(1099.98, 'JPY', locale='en_US', currency_digits=False)
u'\xa51,099.98'
>>> format_currency(1099.98, 'COP', u'#,##0.00', locale='es_ES', currency_digits=False)
u'1.099,98'
If a format is not specified the type of currency format to use from the locale can be specified:
>>> format_currency(1099.98, 'EUR', locale='en_US', format_type='standard') u'\u20ac1,099.98'
When the given currency format type is not available, an exception is raised:
>>> format_currency('1099.98', 'EUR', locale='root', format_type='unknown')
Traceback (most recent call last):
    ...
UnknownCurrencyFormatError: "'unknown' is not a known currency format type"
>>> format_currency(101299.98, 'USD', locale='en_US', group_separator=False) u'$101299.98'
>>> format_currency(101299.98, 'USD', locale='en_US', group_separator=True) u'$101,299.98'
You can also pass format_type='name' to use long display names. The order of the number and currency name, along with the correct localized plural form of the currency name, is chosen according to locale:
>>> format_currency(1, 'USD', locale='en_US', format_type='name')
u'1.00 US dollar'
>>> format_currency(1099.98, 'USD', locale='en_US', format_type='name')
u'1,099.98 US dollars'
>>> format_currency(1099.98, 'USD', locale='ee', format_type='name')
u'us ga dollar 1,099.98'
By default the locale is allowed to truncate and round a high-precision number by forcing its format pattern onto the decimal part. You can bypass this behavior with the decimal_quantization parameter:
>>> format_currency(1099.9876, 'USD', locale='en_US')
u'$1,099.99'
>>> format_currency(1099.9876, 'USD', locale='en_US', decimal_quantization=False)
u'$1,099.9876'
- Parameters
- number – the number to format
- currency – the currency code
- format – the format string to use
- locale – the Locale object or locale identifier
- currency_digits – use the currency’s natural number of decimal digits
- format_type – the currency format type to use
- decimal_quantization – Truncate and round high-precision numbers to the format pattern. Defaults to True.
- group_separator – Boolean to switch group separator on/off in a locale’s number format.
- numbering_system – The numbering system used for formatting number symbols. Defaults to “latn”. The special value “default” will use the default numbering system of the locale.
- Raises
UnsupportedNumberingSystemError – If the numbering system is not supported by the locale.
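Since an unavailable format_type raises UnknownCurrencyFormatError, callers that accept the format type from configuration may want a fallback. The following is a minimal sketch under that assumption. format_price is a hypothetical helper, and silently falling back to 'standard' is a design choice of this example rather than Babel behaviour.

```python
from babel.numbers import UnknownCurrencyFormatError, format_currency


def format_price(amount, currency, locale, format_type="standard"):
    """Hypothetical helper (not part of Babel).

    Falls back to the 'standard' currency format when the requested
    format type is not defined for the locale.
    """
    try:
        return format_currency(amount, currency, locale=locale,
                               format_type=format_type)
    except UnknownCurrencyFormatError:
        return format_currency(amount, currency, locale=locale,
                               format_type="standard")


usd = format_price(1099.98, "USD", "en_US")
# 'unknown' is not a currency format type, so the fallback is used.
fallback = format_price(1099.98, "USD", "en_US", format_type="unknown")
# JPY has no decimal digits, so the amount is rounded to whole yen.
jpy = format_price(1099.98, "JPY", "en_US")

print(usd, fallback, jpy)
```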
- babel.numbers.format_compact_currency(number: float | decimal.Decimal | str, currency: str, *, format_type: Literal['short'] = 'short', locale: Locale | str | None = 'en_US_POSIX', fraction_digits: int = 0, numbering_system: Literal['default'] | str = 'latn') -> str
Format a number as a currency value in compact form.
>>> format_compact_currency(12345, 'USD', locale='en_US')
u'$12K'
>>> format_compact_currency(123456789, 'USD', locale='en_US', fraction_digits=2)
u'$123.46M'
>>> format_compact_currency(123456789, 'EUR', locale='de_DE', fraction_digits=1)
'123,5 Mio. €'
- Parameters
- number – the number to format
- currency – the currency code
- format_type – the compact format type to use. Defaults to “short”.
- locale – the Locale object or locale identifier
- fraction_digits – Number of digits after the decimal point to use. Defaults to 0.
- numbering_system – The numbering system used for formatting number symbols. Defaults to “latn”. The special value “default” will use the default numbering system of the locale.
- Raises
UnsupportedNumberingSystemError – If the numbering system is not supported by the locale.
- babel.numbers.format_percent(number: float | decimal.Decimal | str, format: str | NumberPattern | None = None, locale: Locale | str | None = 'en_US_POSIX', decimal_quantization: bool = True, group_separator: bool = True, *, numbering_system: Literal['default'] | str = 'latn') -> str
Return formatted percent value for a specific locale.
>>> format_percent(0.34, locale='en_US')
u'34%'
>>> format_percent(25.1234, locale='en_US')
u'2,512%'
>>> format_percent(25.1234, locale='sv_SE')
u'2\xa0512\xa0%'
>>> format_percent(25.1234, locale='ar_EG', numbering_system='default')
u'2٬512%'
The format pattern can also be specified explicitly:
>>> format_percent(25.1234, u'#,##0‰', locale='en_US') u'25,123‰'
By default the locale is allowed to truncate and round a high-precision number by forcing its format pattern onto the decimal part. You can bypass this behavior with the decimal_quantization parameter:
>>> format_percent(23.9876, locale='en_US')
u'2,399%'
>>> format_percent(23.9876, locale='en_US', decimal_quantization=False)
u'2,398.76%'
>>> format_percent(229291.1234, locale='pt_BR', group_separator=False) u'22929112%'
>>> format_percent(229291.1234, locale='pt_BR', group_separator=True) u'22.929.112%'
- Parameters
- number – the percent number to format
- format – an optional explicit percent format pattern
- locale – the Locale object or locale identifier
- decimal_quantization – Truncate and round high-precision numbers to the format pattern. Defaults to True.
- group_separator – Boolean to switch group separator on/off in a locale’s number format.
- numbering_system – The numbering system used for formatting number symbols. Defaults to “latn”. The special value “default” will use the default numbering system of the locale.
- Raises
UnsupportedNumberingSystemError – If the numbering system is not supported by the locale.
- babel.numbers.format_scientific(number: float | decimal.Decimal | str, format: str | NumberPattern | None = None, locale: Locale | str | None = 'en_US_POSIX', decimal_quantization: bool = True, *, numbering_system: Literal['default'] | str = 'latn') -> str
Return value formatted in scientific notation for a specific locale.
>>> format_scientific(10000, locale='en_US')
u'1E4'
>>> format_scientific(10000, locale='ar_EG', numbering_system='default')
u'1أس4'
The format pattern can also be specified explicitly:
>>> format_scientific(1234567, u'##0.##E00', locale='en_US') u'1.23E06'
By default the locale is allowed to truncate and round a high-precision number by forcing its format pattern onto the decimal part. You can bypass this behavior with the decimal_quantization parameter:
>>> format_scientific(1234.9876, u'#.##E0', locale='en_US')
u'1.23E3'
>>> format_scientific(1234.9876, u'#.##E0', locale='en_US', decimal_quantization=False)
u'1.2349876E3'
- Parameters
- number – the number to format
- format – an optional explicit scientific format pattern
- locale – the Locale object or locale identifier
- decimal_quantization – Truncate and round high-precision numbers to the format pattern. Defaults to True.
- numbering_system – The numbering system used for formatting number symbols. Defaults to “latn”. The special value “default” will use the default numbering system of the locale.
- Raises
UnsupportedNumberingSystemError – If the numbering system is not supported by the locale.
Number Parsing
- babel.numbers.parse_number(string: str, locale: Locale | str | None = 'en_US_POSIX', *, numbering_system: Literal['default'] | str = 'latn') -> int
Parse localized number string into an integer.
>>> parse_number('1,099', locale='en_US')
1099
>>> parse_number('1.099', locale='de_DE')
1099
When the given string cannot be parsed, an exception is raised:
>>> parse_number('1.099,98', locale='de')
Traceback (most recent call last):
    ...
NumberFormatError: '1.099,98' is not a valid number
- Parameters
- string – the string to parse
- locale – the Locale object or locale identifier
- numbering_system – The numbering system used for formatting number symbols. Defaults to “latn”. The special value “default” will use the default numbering system of the locale.
- Returns
the parsed number
- Raises
- NumberFormatError – if the string can not be converted to a number
- UnsupportedNumberingSystemError – if the numbering system is not supported by the locale.
- babel.numbers.parse_decimal(string: str, locale: Locale | str | None = 'en_US_POSIX', strict: bool = False, *, numbering_system: Literal['default'] | str = 'latn') -> decimal.Decimal
Parse localized decimal string into a decimal.
>>> parse_decimal('1,099.98', locale='en_US')
Decimal('1099.98')
>>> parse_decimal('1.099,98', locale='de')
Decimal('1099.98')
>>> parse_decimal('12 345,123', locale='ru')
Decimal('12345.123')
>>> parse_decimal('1٬099٫98', locale='ar_EG', numbering_system='default')
Decimal('1099.98')
When the given string cannot be parsed, an exception is raised:
>>> parse_decimal('2,109,998', locale='de')
Traceback (most recent call last):
    ...
NumberFormatError: '2,109,998' is not a valid decimal number
If strict is set to True and the given string contains a number formatted in an irregular way, an exception is raised:
>>> parse_decimal('30.00', locale='de', strict=True)
Traceback (most recent call last):
    ...
NumberFormatError: '30.00' is not a properly formatted decimal number. Did you mean '3.000'? Or maybe '30,00'?
>>> parse_decimal('0.00', locale='de', strict=True)
Traceback (most recent call last):
    ...
NumberFormatError: '0.00' is not a properly formatted decimal number. Did you mean '0'?
- Parameters
- string – the string to parse
- locale – the Locale object or locale identifier
- strict – controls whether numbers formatted in an irregular way are accepted (False) or rejected (True)
- numbering_system – The numbering system used for formatting number symbols. Defaults to “latn”. The special value “default” will use the default numbering system of the locale.
- Raises
- NumberFormatError – if the string can not be converted to a decimal number
- UnsupportedNumberingSystemError – if the numbering system is not supported by the locale.
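Formatting and strict parsing can be combined into a round trip. The sketch below assumes that a value formatted with decimal_quantization=False parses back to the same Decimal under strict=True for the same locale. round_trip is a hypothetical helper used only for illustration.

```python
from decimal import Decimal

from babel.numbers import format_decimal, parse_decimal


def round_trip(value, locale):
    """Hypothetical helper: format a Decimal for a locale, then parse it back."""
    # Quantization is disabled so no fractional digits are lost on the way out.
    text = format_decimal(value, locale=locale, decimal_quantization=False)
    # strict=True rejects input that is not formatted the way the locale expects.
    return text, parse_decimal(text, locale=locale, strict=True)


text_de, value_de = round_trip(Decimal("1099.98"), "de")
text_en, value_en = round_trip(Decimal("1099.98"), "en_US")

print(text_de, value_de)
print(text_en, value_en)
```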
Exceptions
- exception babel.numbers.NumberFormatError(message: str, suggestions: list[str] | None = None)
Exception raised when a string cannot be parsed into a number.
- suggestions
a list of properly formatted numbers derived from the invalid input
Data Access
- babel.numbers.get_currency_name(currency: str, count: float | Decimal | None = None, locale: Locale | str | None = 'en_US_POSIX') -> str
Return the name used by the locale for the specified currency.
>>> get_currency_name('USD', locale='en_US') u'US Dollar'
Added in version 0.9.4.
- Parameters
- currency – the currency code.
- count – the optional count. If provided the currency name will be pluralized to that number if possible.
- locale – the Locale object or locale identifier.
- babel.numbers.get_currency_symbol(currency: str, locale: Locale | str | None = 'en_US_POSIX') -> str
Return the symbol used by the locale for the specified currency.
>>> get_currency_symbol('USD', locale='en_US') u'$'
- Parameters
- currency – the currency code.
- locale – the Locale object or locale identifier.
- babel.numbers.get_currency_unit_pattern(currency: str, count: float | Decimal | None = None, locale: Locale | str | None = 'en_US_POSIX') -> str
Return the unit pattern used for long display of a currency value for a given locale. This is a string containing {0} where the numeric part should be substituted and {1} where the currency long display name should be substituted.
>>> get_currency_unit_pattern('USD', locale='en_US', count=10) u'{0} {1}'
Added in version 2.7.0.
- Parameters
- currency – the currency code.
- count – the optional count. If provided the unit pattern for that number will be returned.
- locale – the Locale object or locale identifier.
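The unit pattern and the pluralized currency name can be combined by hand to build a long display string. This is a simplified sketch, not the way format_currency is implemented: the hypothetical helper long_currency_display formats the number with format_decimal and therefore ignores the currency's natural number of decimal digits.

```python
from babel.numbers import (
    format_decimal,
    get_currency_name,
    get_currency_unit_pattern,
)


def long_currency_display(amount, currency, locale):
    """Hypothetical helper: '<number> <pluralized currency name>'."""
    # Pattern with {0} for the number and {1} for the currency name.
    pattern = get_currency_unit_pattern(currency, count=amount, locale=locale)
    # Name pluralized for the given count where the locale provides one.
    name = get_currency_name(currency, count=amount, locale=locale)
    number = format_decimal(amount, locale=locale)
    return pattern.format(number, name)


many = long_currency_display(10, "USD", "en_US")
single = long_currency_display(1, "USD", "en_US")

print(many)
print(single)
```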
- babel.numbers.get_decimal_symbol(locale: Locale | str | None = 'en_US_POSIX', *, numbering_system: Literal['default'] | str = 'latn') -> str
Return the symbol used by the locale to separate decimal fractions.
>>> get_decimal_symbol('en_US')
u'.'
>>> get_decimal_symbol('ar_EG', numbering_system='default')
u'٫'
>>> get_decimal_symbol('ar_EG', numbering_system='latn')
u'.'
- Parameters
- locale – the Locale object or locale identifier
- numbering_system – The numbering system used for fetching the symbol. Defaults to “latn”. The special value “default” will use the default numbering system of the locale.
- Raises
UnsupportedNumberingSystemError – If the numbering system is not supported by the locale.
- babel.numbers.get_plus_sign_symbol(locale: Locale | str | None = 'en_US_POSIX', *, numbering_system: Literal['default'] | str = 'latn') -> str
Return the plus sign symbol used by the current locale.
>>> get_plus_sign_symbol('en_US')
u'+'
>>> get_plus_sign_symbol('ar_EG', numbering_system='default')
u'+'
>>> get_plus_sign_symbol('ar_EG', numbering_system='latn')
u'+'
- Parameters
- locale – the Locale object or locale identifier
- numbering_system – The numbering system used for fetching the symbol. Defaults to “latn”. The special value “default” will use the default numbering system of the locale.
- Raises
UnsupportedNumberingSystemError – if the numbering system is not supported by the locale.
- babel.numbers.get_minus_sign_symbol(locale: Locale | str | None = 'en_US_POSIX', *, numbering_system: Literal['default'] | str = 'latn') -> str
Return the minus sign symbol used by the current locale.
>>> get_minus_sign_symbol('en_US')
u'-'
>>> get_minus_sign_symbol('ar_EG', numbering_system='default')
u'-'
>>> get_minus_sign_symbol('ar_EG', numbering_system='latn')
u'-'
- Parameters
- locale – the Locale object or locale identifier
- numbering_system – The numbering system used for fetching the symbol. Defaults to “latn”. The special value “default” will use the default numbering system of the locale.
- Raises
UnsupportedNumberingSystemError – if the numbering system is not supported by the locale.
- babel.numbers.get_territory_currencies(territory: str, start_date: datetime.date | None = None, end_date: datetime.date | None = None, tender: bool = True, non_tender: bool = False, include_details: Literal[False] = False) -> list[str]
- babel.numbers.get_territory_currencies(territory: str, start_date: datetime.date | None = None, end_date: datetime.date | None = None, tender: bool = True, non_tender: bool = False, include_details: Literal[True] = False) -> list[dict[str, Any]]
Returns the list of currencies for the given territory that are valid for the given date range. In addition to that the currency database distinguishes between tender and non-tender currencies. By default only tender currencies are returned.
The return value is a list of all currencies, roughly ordered by when each currency became active: the longer a currency has been in use, the further to the left of the list it appears.
The start date defaults to today. If no end date is given it will be the same as the start date. Otherwise a range can be defined. For instance this can be used to find the currencies in use in Austria between 1995 and 2011:
>>> from datetime import date
>>> get_territory_currencies('AT', date(1995, 1, 1), date(2011, 1, 1))
['ATS', 'EUR']
Likewise it’s also possible to find all the currencies in use on a single date:
>>> get_territory_currencies('AT', date(1995, 1, 1))
['ATS']
>>> get_territory_currencies('AT', date(2011, 1, 1))
['EUR']
By default the return value only includes tender currencies. This however can be changed:
>>> get_territory_currencies('US')
['USD']
>>> get_territory_currencies('US', tender=False, non_tender=True,
...                          start_date=date(2014, 1, 1))
['USN', 'USS']
Added in version 2.0.
- Parameters
- territory – the name of the territory to find the currency for.
- start_date – the start date. If not given today is assumed.
- end_date – the end date. If not given the start date is assumed.
- tender – controls whether tender currencies should be included.
- non_tender – controls whether non-tender currencies should be included.
- include_details – if set to True, instead of returning currency codes the return value will be dictionaries with detail information. In that case each dictionary will have the keys 'currency', 'from', 'to', and 'tender'.
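The include_details flag can be sketched as follows, reusing the Austrian date range from the example above. The values stored under 'from' and 'to' are printed without assuming their exact contents.

```python
from datetime import date

from babel.numbers import get_territory_currencies

start, end = date(1995, 1, 1), date(2011, 1, 1)

# Plain currency codes, oldest currency first.
codes = get_territory_currencies("AT", start, end)

# One dictionary per currency, with the keys listed in the parameter
# description: 'currency', 'from', 'to' and 'tender'.
details = get_territory_currencies("AT", start, end, include_details=True)

for entry in details:
    print(entry["currency"], entry["from"], entry["to"], entry["tender"])
```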
Pluralization Support
The pluralization support provides functionality around the CLDR pluralization rules. It can parse and evaluate pluralization rules, as well as convert them to other formats such as gettext.
Basic Interface
- class babel.plural.PluralRule(rules: Mapping[str, str] | Iterable[tuple[str, str]])
Represents a set of language pluralization rules. The constructor accepts a list of (tag, expr) tuples or a dict of CLDR rules. The resulting object is callable: it accepts a single positive or negative number (integer or float) and returns the tag of the plural form that applies to that number:
>>> rule = PluralRule({'one': 'n is 1'})
>>> rule(1)
'one'
>>> rule(2)
'other'
Currently the CLDR defines these tags: zero, one, two, few, many and other, where other is an implicit default. Rules should be mutually exclusive; for a given numeric value, only one rule should apply (i.e. the condition should only be true for one of the plural rule elements).
- classmethod parse(rules: Mapping[str, str] | Iterable[tuple[str, str]] | PluralRule) -> PluralRule
Create a PluralRule instance for the given rules. If the rules are a PluralRule object, that object is returned.
- Parameters
rules – the rules as list or dict, or a PluralRule object
- Raises
RuleError – if the expression is malformed
- property rules: Mapping[str, str]
The PluralRule as a dict of unicode plural rules.
>>> rule = PluralRule({'one': 'n is 1'})
>>> rule.rules
{'one': 'n is 1'}
- property tags: frozenset[str]
A set of explicitly defined tags in this rule. The implicit default 'other' rule is not part of this set unless there is an explicit rule for it.
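A short sketch of the interface described above, built from a list of (tag, expr) tuples instead of a dict. The rule expressions are made up for illustration and do not correspond to any particular language.

```python
from babel.plural import PluralRule

# Illustrative rules only; real CLDR rules vary by language.
rule = PluralRule([("one", "n is 1"), ("few", "n in 2..4")])

# Calling the rule returns the tag that applies to the number;
# anything not matched falls through to the implicit 'other'.
tags_for_numbers = [rule(n) for n in (1, 3, 10)]

# Only explicitly defined tags are reported.
explicit_tags = rule.tags

# parse() returns an existing PluralRule object unchanged.
same_object = PluralRule.parse(rule) is rule

print(tags_for_numbers, sorted(explicit_tags), same_object)
```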
Conversion Functionality
- babel.plural.to_javascript(rule: Mapping[str, str] | Iterable[tuple[str, str]] | PluralRule) -> str
Convert a list/dict of rules or a PluralRule object into a JavaScript function. This function depends on no external library:
>>> to_javascript({'one': 'n is 1'}) "(function(n) { return (n == 1) ? 'one' : 'other'; })"
Implementation detail: The generated function will probably evaluate expressions involved in range operations multiple times. This has the advantage that no external helper functions are required, and it is not a big performance hit for these simple calculations.
- Parameters
rule – the rules as list or dict, or a PluralRule object
- Raises
RuleError – if the expression is malformed
- babel.plural.to_python(rule: Mapping[str, str] | Iterable[tuple[str, str]] | PluralRule) -> Callable[[float | Decimal], str]
Convert a list/dict of rules or a PluralRule object into a regular Python function. This is useful in situations where you need a real function and don’t care about the actual rule object:
>>> func = to_python({'one': 'n is 1', 'few': 'n in 2..4'})
>>> func(1)
'one'
>>> func(3)
'few'
>>> func = to_python({'one': 'n in 1,11', 'few': 'n in 3..10,13..19'})
>>> func(11)
'one'
>>> func(15)
'few'
- Parameters
rule – the rules as list or dict, or a PluralRule object
- Raises
RuleError – if the expression is malformed
- babel.plural.to_gettext(rule: Mapping[str, str] | Iterable[tuple[str, str]] | PluralRule) -> str
The plural rule as gettext expression. The gettext expression is technically limited to integers and returns indices rather than tags.
>>> to_gettext({'one': 'n is 1', 'two': 'n is 2'}) 'nplurals=3; plural=((n == 1) ? 0 : (n == 2) ? 1 : 2);'
- Parameters
rule – the rules as list or dict, or a PluralRule object
- Raises
RuleError – if the expression is malformed
General Support Functionality
Babel ships a few general helpers that are not being used by Babel itself but are useful in combination with functionality provided by it.
Convenience Helpers
- class babel.support.Format(locale: Locale | str, tzinfo: datetime.tzinfo | None = None, *, numbering_system: Literal['default'] | str = 'latn')
Wrapper class providing the various date and number formatting functions bound to a specific locale and time-zone.
>>> from babel.util import UTC
>>> from datetime import date
>>> fmt = Format('en_US', UTC)
>>> fmt.date(date(2007, 4, 1))
u'Apr 1, 2007'
>>> fmt.decimal(1.2345)
u'1.234'
- compact_currency(number: float | decimal.Decimal | str, currency: str, format_type: Literal['short'] = 'short', fraction_digits: int = 0) -> str
Return a number in the given currency formatted for the locale using the compact number format.
>>> Format('en_US').compact_currency(1234567, "USD", format_type='short', fraction_digits=2) '$1.23M'
- compact_decimal(number: float | decimal.Decimal | str, format_type: Literal['short', 'long'] = 'short', fraction_digits: int = 0) -> str
Return a number formatted in compact form for the locale.
>>> fmt = Format('en_US')
>>> fmt.compact_decimal(123456789)
u'123M'
>>> fmt.compact_decimal(1234567, format_type='long', fraction_digits=2)
'1.23 million'
- currency(number: float | Decimal | str, currency: str) -> str
Return a number in the given currency formatted for the locale.
- date(date: datetime.date | None = None, format: _PredefinedTimeFormat | str = 'medium') -> str
Return a date formatted according to the given pattern.
>>> from datetime import date
>>> fmt = Format('en_US')
>>> fmt.date(date(2007, 4, 1))
u'Apr 1, 2007'
- datetime(datetime: datetime.date | None = None, format: _PredefinedTimeFormat | str = 'medium') -> str
Return a date and time formatted according to the given pattern.
>>> from datetime import datetime
>>> from babel.dates import get_timezone
>>> fmt = Format('en_US', tzinfo=get_timezone('US/Eastern'))
>>> fmt.datetime(datetime(2007, 4, 1, 15, 30))
u'Apr 1, 2007, 11:30:00 AM'
- decimal(number: float | Decimal | str, format: str | None = None) -> str
Return a decimal number formatted for the locale.
>>> fmt = Format('en_US')
>>> fmt.decimal(1.2345)
u'1.234'
- number(number: float | Decimal | str) -> str
Return an integer number formatted for the locale.
>>> fmt = Format('en_US')
>>> fmt.number(1099)
u'1,099'
- percent(number: float | Decimal | str, format: str | None = None) -> str
Return a number formatted as percentage for the locale.
>>> fmt = Format('en_US')
>>> fmt.percent(0.34)
u'34%'
- scientific(number: float | Decimal | str) -> str
Return a number formatted using scientific notation for the locale.
- time(time: datetime.time | datetime.datetime | None = None, format: _PredefinedTimeFormat | str = 'medium') -> str
Return a time formatted according to the given pattern.
>>> from datetime import datetime
>>> from babel.dates import get_timezone
>>> fmt = Format('en_US', tzinfo=get_timezone('US/Eastern'))
>>> fmt.time(datetime(2007, 4, 1, 15, 30))
u'11:30:00 AM'
- timedelta(delta: datetime.timedelta | int, granularity: Literal['year', 'month', 'week', 'day', 'hour', 'minute', 'second'] = 'second', threshold: float = 0.85, format: Literal['narrow', 'short', 'medium', 'long'] = 'long', add_direction: bool = False) -> str
Return a time delta according to the rules of the given locale.
>>> from datetime import timedelta
>>> fmt = Format('en_US')
>>> fmt.timedelta(timedelta(weeks=11))
u'3 months'
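The currency and scientific methods are listed above without examples. As a minimal sketch, assuming they forward to format_currency and format_scientific with the bound locale, a single Format object can serve all number formatting for one locale:

```python
from babel.support import Format

# One Format object per locale (for example, per request in a web app).
fmt = Format("en_US")

price = fmt.currency(1099.98, "USD")
exponent = fmt.scientific(10000)
count = fmt.number(1099)
share = fmt.percent(0.34)

print(price, exponent, count, share)
```

Binding the locale once avoids passing locale= to every call, which is the main convenience the wrapper class offers.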
- class babel.support.LazyProxy(func: Callable[[...], Any], *args: Any, enable_cache: bool = True, **kwargs: Any)
Class for proxy objects that delegate to a specified function to evaluate the actual object.
>>> def greeting(name='world'):
...     return 'Hello, %s!' % name
>>> lazy_greeting = LazyProxy(greeting, name='Joe')
>>> print(lazy_greeting)
Hello, Joe!
>>> u' ' + lazy_greeting
u' Hello, Joe!'
>>> u'(%s)' % lazy_greeting
u'(Hello, Joe!)'
This can be used, for example, to implement lazy translation functions that delay the actual translation until the string is actually used. The rationale for such behavior is that the locale of the user may not always be available. In web applications, you only know the locale when processing a request.
The proxy implementation attempts to be as complete as possible, so that the lazy objects should mostly work as expected, for example for sorting:
>>> greetings = [
...     LazyProxy(greeting, 'world'),
...     LazyProxy(greeting, 'Joe'),
...     LazyProxy(greeting, 'universe'),
... ]
>>> greetings.sort()
>>> for greeting in greetings:
...     print(greeting)
Hello, Joe!
Hello, universe!
Hello, world!
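The lazy translation use case mentioned above can be sketched with a stand-in for a real catalog. CATALOG, request_state and translate_greeting are hypothetical names invented for this example. Passing enable_cache=False makes the proxy call the function again on every use, so the result can follow a locale that changes between requests.

```python
from babel.support import LazyProxy

# Hypothetical in-memory "catalog" and request state, for illustration only.
CATALOG = {"en": "Hello", "de": "Hallo"}
request_state = {"locale": "en"}


def translate_greeting():
    """Look up the greeting for whichever locale is active right now."""
    return CATALOG[request_state["locale"]]


# Created once at import time, before any locale is known.
lazy_greeting = LazyProxy(translate_greeting, enable_cache=False)

# The lookup only happens when the proxy is actually used.
request_state["locale"] = "de"
german = str(lazy_greeting)

request_state["locale"] = "en"
english = str(lazy_greeting)

print(german, english)
```

With the default enable_cache=True the first evaluated value would be kept, which is cheaper but unsuitable when the same proxy must serve several locales.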
Gettext Support
- class babel.support.Translations(fp: gettext._TranslationsReader | None = None, domain: str | None = None)
An extended translation catalog class.
- add(translations: Translations, merge: bool = True)
Add the given translations to the catalog.
If the domain of the translations is different than that of the current catalog, they are added as a catalog that is only accessible by the various d*gettext functions.
- Parameters
- translations – the Translations instance with the messages to add
- merge – whether translations for message domains that have already been added should be merged with the existing translations
- classmethod load(dirname: str | PathLike[str] | None = None, locales: Iterable[str | Locale] | str | Locale | None = None, domain: str | None = None) -> NullTranslations
Load translations from the given directory.
- Parameters
- dirname – the directory containing the MO files
- locales – the list of locales in order of preference (items in this list can be either Locale objects or locale strings)
- domain – the message domain (default: ‘messages’)
- merge(translations: Translations)
Merge the given translations into the catalog.
Message translations in the specified catalog override any messages with the same identifier in the existing catalog.
- Parameters
translations – the Translations instance with the messages to merge
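The load signature above declares NullTranslations as its return type, which suggests that a missing catalog yields a pass-through object rather than an error. The sketch below checks that against an empty temporary directory. The directory layout and the "Hello" message are placeholders.

```python
import tempfile

from babel.support import NullTranslations, Translations

# An empty directory stands in for a locale directory without MO files.
with tempfile.TemporaryDirectory() as locale_dir:
    translations = Translations.load(
        locale_dir, locales=["de_DE", "de"], domain="messages"
    )

# No catalog was found, so a plain NullTranslations object is returned.
is_fallback = isinstance(translations, NullTranslations) and not isinstance(
    translations, Translations
)

# Without a catalog, messages pass through untranslated.
message = translations.gettext("Hello")

print(is_fallback, message)
```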
Units
The unit module provides functionality to format measurement units for different locales.
- babel.units.format_unit(value: str | float | decimal.Decimal, measurement_unit: str, length: Literal['short', 'long', 'narrow'] = 'long', format: str | None = None, locale: Locale | str | None = 'en_US_POSIX', *, numbering_system: Literal['default'] | str = 'latn') -> str
Format a value of a given unit.
Values are formatted according to the locale’s usual pluralization rules and number formats.
>>> format_unit(12, 'length-meter', locale='ro_RO')
u'12 metri'
>>> format_unit(15.5, 'length-mile', locale='fi_FI')
u'15,5 mailia'
>>> format_unit(1200, 'pressure-millimeter-ofhg', locale='nb')
u'1\xa0200 millimeter kvikks\xf8lv'
>>> format_unit(270, 'ton', locale='en')
u'270 tons'
>>> format_unit(1234.5, 'kilogram', locale='ar_EG', numbering_system='default')
u'1٬234٫5 كيلوغرام'
Number formats may be overridden with the format parameter.
>>> import decimal
>>> format_unit(decimal.Decimal("-42.774"), 'temperature-celsius', 'short', format='#.0', locale='fr')
u'-42,8\u202f\xb0C'
The locale’s usual pluralization rules are respected.
>>> format_unit(1, 'length-meter', locale='ro_RO')
u'1 metru'
>>> format_unit(0, 'length-mile', locale='cy')
u'0 mi'
>>> format_unit(1, 'length-mile', locale='cy')
u'1 filltir'
>>> format_unit(3, 'length-mile', locale='cy')
u'3 milltir'
>>> format_unit(15, 'length-horse', locale='fi')
Traceback (most recent call last):
    ...
UnknownUnitError: length-horse is not a known unit in fi
Added in version 2.2.0.
- Parameters
- value – the value to format. If this is a string, no number formatting will be attempted.
- measurement_unit – the code of a measurement unit. Known units can be found in the CLDR Unit Validity XML file: https://unicode.org/repos/cldr/tags/latest/common/validity/unit.xml
- length – “short”, “long” or “narrow”
- format – An optional format, as accepted by format_decimal.
- locale – the Locale object or locale identifier
- numbering_system – The numbering system used for formatting number symbols. Defaults to “latn”. The special value “default” will use the default numbering system of the locale.
- Raises
UnsupportedNumberingSystemError – If the numbering system is not supported by the locale.
- babel.units.format_compound_unit(numerator_value: str | float | decimal.Decimal, numerator_unit: str | None = None, denominator_value: str | float | decimal.Decimal = 1, denominator_unit: str | None = None, length: Literal['short', 'long', 'narrow'] = 'long', format: str | None = None, locale: Locale | str | None = 'en_US_POSIX', *, numbering_system: Literal['default'] | str = 'latn') -> str | None
Format a compound number value, e.g. “kilometers per hour” or similar.
Both unit specifiers are optional to allow for formatting of arbitrary values still according to the locale’s general “per” formatting specifier.
>>> format_compound_unit(7, denominator_value=11, length="short", locale="pt")
'7/11'
>>> format_compound_unit(150, "kilometer", denominator_unit="hour", locale="sv")
'150 kilometer per timme'
>>> format_compound_unit(150, "kilowatt", denominator_unit="year", locale="fi")
'150 kilowattia / vuosi'
>>> format_compound_unit(32.5, "ton", 15, denominator_unit="hour", locale="en")
'32.5 tons per 15 hours'
>>> format_compound_unit(1234.5, "ton", 15, denominator_unit="hour", locale="ar_EG", numbering_system="arab")
'1٬234٫5 طن لكل 15 ساعة'
>>> format_compound_unit(160, denominator_unit="square-meter", locale="fr")
'160 par m\xe8tre carr\xe9'
>>> format_compound_unit(4, "meter", "ratakisko", length="short", locale="fi")
'4 m/ratakisko'
>>> format_compound_unit(35, "minute", denominator_unit="nautical-mile", locale="sv")
'35 minuter per nautisk mil'
>>> from babel.numbers import format_currency
>>> format_compound_unit(format_currency(35, "JPY", locale="de"), denominator_unit="liter", locale="de")
'35\xa0\xa5 pro Liter'
See https://www.unicode.org/reports/tr35/tr35-general.html#perUnitPatterns
- Parameters
- numerator_value – The numerator value. This may be a string, in which case it is considered preformatted and the unit is ignored.
- numerator_unit – The numerator unit. See format_unit.
- denominator_value – The denominator value. This may be a string, in which case it is considered preformatted and the unit is ignored.
- denominator_unit – The denominator unit. See format_unit.
- length – The formatting length. “short”, “long” or “narrow”
- format – An optional format, as accepted by format_decimal.
- locale – the Locale object or locale identifier
- numbering_system – The numbering system used for formatting number symbols. Defaults to “latn”. The special value “default” will use the default numbering system of the locale.
- Returns
A formatted compound value.
- Raises
UnsupportedNumberingSystemError – If the numbering system is not supported by the locale.
- babel.units.get_unit_name(measurement_unit: str, length: Literal['short', 'long', 'narrow'] = 'long', locale: Locale | str | None = 'en_US_POSIX') -> str | None
Get the display name for a measurement unit in the given locale.
>>> get_unit_name("radian", locale="en")
'radians'
Unknown units will raise exceptions:
>>> get_unit_name("battery", locale="fi")
Traceback (most recent call last):
    ...
UnknownUnitError: battery/long is not a known unit/length in fi
- Parameters
- measurement_unit – the code of a measurement unit. Known units can be found in the CLDR Unit Validity XML file: https://unicode.org/repos/cldr/tags/latest/common/validity/unit.xml
- length – “short”, “long” or “narrow”
- locale – the Locale object or locale identifier
- Returns
The unit display name, or None.
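Since get_unit_name can either raise UnknownUnitError or return None, callers that only need a label may want a single fallback path. The following is a minimal sketch under that assumption; the helper name unit_label and its fallback to the raw unit code are hypothetical and not part of Babel.

```python
from babel.units import UnknownUnitError, get_unit_name


def unit_label(unit, locale="en", length="long"):
    """Return a localized unit name, or the raw unit code as a fallback.

    Hypothetical helper: covers both failure modes described above
    (an UnknownUnitError for unknown units, or a None return value).
    """
    try:
        name = get_unit_name(unit, length=length, locale=locale)
    except UnknownUnitError:
        return unit
    return name if name is not None else unit


known = unit_label("radian", locale="en")
unknown = unit_label("battery", locale="fi")
```

Whether falling back to the raw code is acceptable depends on the application; a stricter caller may prefer to let the exception propagate.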
Additional Notes
Babel Development
Babel as a library has a long history that goes back to the Trac project. Since then it has evolved into an independently developed project that implements data access for the Unicode CLDR project (https://cldr.unicode.org).
This document explains the general rules of the project for anyone who wants to help with development.
Tracking the CLDR
Generally the goal of the project is to work as closely as possible with the CLDR data (https://cldr.unicode.org/index/charts). In the past this has caused some frustrating problems because the data is entirely out of our hands. To minimize the frustration we generally deal with CLDR updates the following way:
- Bump the CLDR data only with a major release of Babel.
- Never perform custom bugfixes on the CLDR data.
- Never work around CLDR bugs within Babel. If you find a problem in the data, report it upstream.
- Adjust the parsing of the data as soon as possible, otherwise this will spiral out of control later. This is especially the case for bigger updates that change pluralization and more.
- Try not to test against specific CLDR data that is likely to change.
Python Versions
At the moment the following Python versions should be supported:
- Python 3.8 and up
- PyPy 3.8 and up
Unicode
Unicode is a big deal in Babel. Here is how the rules are set up:
- Internally, everything that makes sense to have as unicode is unicode.
- Encode / decode at boundaries explicitly. Never assume an encoding in a way it cannot be overridden. utf-8 should be generally considered the default encoding.
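The boundary rule above can be illustrated with a small sketch. The function names read_text and write_text are hypothetical; the point is only that the encoding is an explicit, overridable parameter that defaults to UTF-8, and that everything between the two calls works on str.

```python
def read_text(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode at the input boundary; the caller may override the encoding."""
    return raw.decode(encoding)


def write_text(text: str, encoding: str = "utf-8") -> bytes:
    """Encode at the output boundary; the caller may override the encoding."""
    return text.encode(encoding)


# Internally only str is handled; bytes exist only at the edges.
message = read_text("Grüße".encode("latin-1"), encoding="latin-1")
payload = write_text(message)  # UTF-8 by default
```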
Dates and Timezones
Babel’s timezone support relies on either pytz or zoneinfo; if pytz is installed, it is preferred over zoneinfo. Babel should assume that any timezone objects can be from either of these modules.
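One practical consequence is that code attaching a timezone to a naive datetime has to cope with both kinds of objects. The sketch below is an illustration, not Babel's own code; attach_timezone is a hypothetical helper. It relies on pytz timezones exposing a localize() method, while zoneinfo and datetime.timezone objects are attached with replace().

```python
from datetime import datetime, timezone


def attach_timezone(naive, tz):
    """Attach tz to a naive datetime, for pytz and zoneinfo-style objects alike."""
    if naive.tzinfo is not None:
        raise ValueError("expected a naive datetime")
    localize = getattr(tz, "localize", None)
    if callable(localize):
        # pytz timezones must be attached via localize() to pick the right offset.
        return localize(naive)
    # zoneinfo.ZoneInfo and datetime.timezone work with plain replace().
    return naive.replace(tzinfo=tz)


aware = attach_timezone(datetime(2024, 6, 1, 9, 30), timezone.utc)
```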
Assumptions to make:
- Use UTC where possible.
- Be very careful with local time. Do not use local time without knowing the exact timezone.
- A time without a date is of very limited use. Do not try to support timezones for it. If you do, assume the current local date, not the current UTC date.
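The assumptions above might be applied as in the following sketch: values are stored as aware UTC datetimes and converted only for display, and naive datetimes are rejected rather than guessed at. The helper to_display_time and the fixed +02:00 offset are assumptions made for illustration; a real application would pass a pytz or zoneinfo timezone.

```python
from datetime import datetime, timedelta, timezone


def to_display_time(moment, display_tz):
    """Convert an aware datetime to display_tz; refuse naive input."""
    if moment.tzinfo is None:
        raise ValueError("refusing to guess the timezone of a naive datetime")
    return moment.astimezone(display_tz)


# Store and compute in UTC.
stored = datetime(2024, 1, 15, 12, 0, tzinfo=timezone.utc)

# Hypothetical fixed offset standing in for a real timezone object.
plus_two = timezone(timedelta(hours=2))
shown = to_display_time(stored, plus_two)
```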
Babel Changelog
Version 2.16.0
Features
Bugfixes
- CLDR: Do not allow substituting alternates or drafts in derived locales by @akx in #1113
- Core: Allow falling back to modifier-less locale data by @akx in #1104
- Core: Allow use of importlib.metadata for finding entrypoints by @akx in #1102
- Dates: Avoid crashing on importing localtime when TZ is malformed by @akx in #1100
- Messages: Allow parsing .po files that have an extant but empty Language header by @akx in #1101
- Messages: Fix --ignore-dirs being incorrectly read (#1094) by @john-psina and @Edwin18 in #1052 and #1095
- Messages: Make pgettext search plurals when translation is not found by @tomasr8 in #1085
Infrastructure
Documentation
Version 2.15.0
Python version support
- Babel 2.15.0 will require Python 3.8 or newer. (#1048)
Features
Infrastructure
Version 2.14.0
Upcoming deprecation
- This version, Babel 2.14, is the last version of Babel to support Python 3.7. Babel 2.15 will require Python 3.8 or newer.
- We had previously announced Babel 2.13 to have been the last version to support Python 3.7, but being able to use CLDR 43 with Python 3.7 was deemed important enough to keep supporting the EOL Python version for one more release.
Possibly backwards incompatible changes
- Locale.number_symbols will now have first-level keys for each numbering system. Since the implicit default numbering system still is "latn", what had previously been e.g. Locale.number_symbols['decimal'] is now Locale.number_symbols['latn']['decimal'].
- Babel no longer directly depends on either distutils or setuptools; if you had been using the Babel setuptools command extensions, you would need to explicitly depend on setuptools – though given you’re running setup.py you probably already do.
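Code that has to run against Babel versions on both sides of the Locale.number_symbols change could use a small accessor like the one sketched here. The function get_number_symbol is hypothetical, and the plain dictionaries only imitate the two layouts described above.

```python
from collections.abc import Mapping


def get_number_symbol(number_symbols, name, numbering_system="latn"):
    """Look up a number symbol in either the old flat or the new nested layout."""
    nested = number_symbols.get(numbering_system)
    if isinstance(nested, Mapping):
        # Newer layout: first-level keys are numbering systems.
        return nested[name]
    # Older layout: symbols are stored directly at the first level.
    return number_symbols[name]


old_layout = {"decimal": ",", "group": "."}
new_layout = {"latn": {"decimal": ",", "group": "."}}
```

With a real locale the first argument would be Locale.number_symbols.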
Features
- CLDR/Numbers: Add support of local numbering systems for number symbols by @kajte in #1036
- CLDR: Upgrade to CLDR 43 by @rix0rrr in #1043
- Frontend: Allow last_translator to be passed as an option to extract_message by @AivGitHub in #1044
- Frontend: Decouple pybabel CLI frontend from distutils/setuptools by @akx in #1041
- Numbers: Improve parsing of malformed decimals by @Olunusib and @akx in #1042
Infrastructure
Version 2.13.1
This is a patch release to fix a few bugs.
Fixes
Version 2.13.0
Upcoming deprecation (reverted)
- It was previously announced that this version, Babel 2.13, would be the last version of Babel to support Python 3.7. Babel 2.14 will still support Python 3.7.
Features
Fixes
- Various typing-related fixes by @akx in #979, #978, #981, #983
- babel.messages.catalog: deduplicate _to_fuzzy_match_key logic by @akx in #980
- Freeze format_time() tests to a specific date to fix test failures by @mgorny in #998
- Spelling and grammar fixes by @scop in #1008
- Renovate lint tools by @akx in #1017, #1028
- Use SPDX license identifier by @vargenau in #994
- Use aware UTC datetimes internally by @scop in #1009
New Contributors
Version 2.12.1
Fixes
- Version 2.12.0 was missing the py.typed marker file. Thanks to Alex Waygood for the fix! #975
- The copyright year in all files was bumped to 2023.
Version 2.12.0
Deprecations & breaking changes
- Python 3.6 is no longer supported (#919) - Aarni Koskela
- The get_next_timezone_transition function is no more (#958) - Aarni Koskela
- Locale.parse() will no longer return None; it will always return a Locale or raise an exception. Passing in None, though technically allowed by the typing, will raise. (#966)
New features
- CLDR: Babel now uses CLDR 42 (#951) - Aarni Koskela
- Dates: pytz is now optional; Babel will prefer it but will use zoneinfo when available. (#940) - @ds-cbo
- General: Babel now ships type annotations, thanks to Jonah Lawrence’s work in multiple PRs.
- Locales: @modifiers are now retained when parsing locales (#947) - martin f. krafft
- Messages: JavaScript template string expression extraction is now smarter. (#939) - Johannes Wilm
- Numbers: NaN and Infinity are now better supported (#955) - Jonah Lawrence
- Numbers: Short compact currency formats are now supported (#926) - Jonah Lawrence
- Numbers: There’s now a Format.compact_decimal utility function. (#921) - Jonah Lawrence
Bugfixes
Improvements & cleanup
- Dates: babel.dates.UTC is now an alias for datetime.timezone.utc (#957) - Aarni Koskela
- Dates: babel.localtime was slightly cleaned up. (#952) - Aarni Koskela
- Documentation: Documentation was improved by Maciej Olko, Jonah Lawrence, lilinjie, and Aarni Koskela.
- Infrastructure: Babel is now being linted with pre-commit and ruff. - Aarni Koskela
Version 2.11.0
Upcoming deprecation
- This version, Babel 2.11, is the last version of Babel to support Python 3.6. Babel 2.12 will require Python 3.7 or newer.
Improvements
- Support for hex escapes in JavaScript string literals #877 - Przemyslaw Wegrzyn
- Add support for formatting decimals in compact form #909 - Jonah Lawrence
- Adapt parse_date to handle ISO dates in ASCII format #842 - Eric L.
- Use ast instead of eval for Python string extraction #915 - Aarni Koskela
- This also enables extraction from static f-strings. F-strings with expressions are silently ignored (but won’t raise an error as they used to).
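The idea behind parsing string literals with ast rather than eval can be sketched as follows. This is an illustration of the approach, not Babel's extractor code; static_string_value is a hypothetical helper. Nothing is executed, and f-strings are accepted only when they contain no expressions.

```python
import ast


def static_string_value(source):
    """Return the value of a static string literal, or None otherwise."""
    try:
        node = ast.parse(source, mode="eval").body
    except SyntaxError:
        return None
    if isinstance(node, ast.Constant) and isinstance(node.value, str):
        return node.value
    if isinstance(node, ast.JoinedStr):
        parts = []
        for value in node.values:
            # Any interpolated expression makes the f-string non-static.
            if not (isinstance(value, ast.Constant) and isinstance(value.value, str)):
                return None
            parts.append(value.value)
        return "".join(parts)
    return None
```

An input such as `__import__('os')` is simply rejected, because it is a call node and is never evaluated.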
Infrastructure
- Tests: Use regular asserts and pytest.raises() #875 – Aarni Koskela
- Wheels are now built in GitHub Actions #888 – Aarni Koskela
- Small improvements to the CLDR downloader script #894 – Aarni Koskela
- Remove antiquated __nonzero__ methods #896 - Nikita Sobolev
- Remove superfluous __unicode__ declarations #905 - Lukas Juhrich
- Mark package compatible with Python 3.11 #913 - Aarni Koskela
- Quiesce pytest warnings #916 - Aarni Koskela
Bugfixes
Documentation
Version 2.10.3
This is a bugfix release for Babel 2.10.2, which was mistakenly packaged with outdated locale data.
Thanks to Michał Górny for pointing this out and Jun Omae for verifying.
This and future Babel PyPI packages will be built by a more automated process, which should make problems like this less likely to occur.
Version 2.10.2
This is a bugfix release for Babel 2.10.1.
Version 2.10.1
This is a bugfix release for Babel 2.10.0.
Version 2.10.0
Upcoming deprecation
Improvements
- CLDR: Upgrade to CLDR 41.0. (#853) - Aarni Koskela
- The c and e plural form operands introduced in CLDR 40 are parsed, but otherwise unsupported. (#826)
- Non-nominative forms of units are currently ignored.
- Messages: Implement --init-missing option for pybabel update (#785) - ruro
- Messages: For extract, you can now replace the built-in .* / _* ignored directory patterns with ones of your own. (#832) - Aarni Koskela, Kinshuk Dua
- Messages: Add --check to verify if catalogs are up-to-date (#831) - Krzysztof Jagiełło
- Messages: Add --header-comment to override default header comment (#720) - Mohamed Hafez Morsy, Aarni Koskela
- Dates: parse_time now supports 12-hour clock, and is better at parsing partial times. (#834) - Aarni Koskela, David Bauer, Arthur Jovart
- Dates: parse_date and parse_time now raise ParseError, a subclass of ValueError, in certain cases. (#834) - Aarni Koskela
- Dates: parse_date and parse_time now accept the format parameter. (#834) - Juliette Monsel, Aarni Koskela
Infrastructure
- The internal babel/_compat.py module is no more (#808) - Hugo van Kemenade
- Python 3.10 is officially supported (#809) - Hugo van Kemenade
- There’s now a friendly GitHub issue template. (#800) – Álvaro Mondéjar Rubio
- Don’t use the deprecated format_number function internally or in tests - Aarni Koskela
- Add GitHub URL for PyPi (#846) - Andrii Oriekhov
- Python 3.12 compatibility: Prefer setuptools imports to distutils imports (#843) - Aarni Koskela
- Python 3.11 compatibility: Add deprecations to l*gettext variants (#835) - Aarni Koskela
- CI: Babel is now tested with PyPy 3.7. (#851) - Aarni Koskela
Bugfixes
- Date formatting: Allow using other as fallback form (#827) - Aarni Koskela
- Locales: Locale.parse() normalizes variant tags to upper case (#829) - Aarni Koskela
- A typo in the plural format for Maltese is fixed. (#796) - Lukas Winkler
- Messages: Catalog date parsing is now timezone independent. (#701) - rachele-collin
- Messages: Fix duplicate locations when writing without lineno (#837) - Sigurd Ljødal
- Messages: Fix missing trailing semicolon in plural form headers (#848) - farhan5900
- CLI: Fix output of --list-locales to not be a bytes repr (#845) - Morgan Wahl
Documentation
- Documentation is now correctly built again, and up to date (#830) - Aarni Koskela
Version 2.9.1
Bugfixes
- The internal locale-data loading functions now validate the name of the locale file to be loaded and only allow files within Babel’s data directory. Thank you to Chris Lyne of Tenable, Inc. for discovering the issue!
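The kind of check described here might look like the sketch below. It is not Babel's actual implementation; resolve_locale_file, the example directory and the way the .dat suffix is appended are assumptions for illustration. The resolved path is accepted only if it stays inside the data directory.

```python
import os


def resolve_locale_file(data_dir, name):
    """Return the path of name's data file, rejecting paths outside data_dir."""
    base = os.path.realpath(data_dir)
    candidate = os.path.realpath(os.path.join(base, f"{name}.dat"))
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError(f"locale file for {name!r} is outside the data directory")
    return candidate


inside = resolve_locale_file("/tmp/locale-data", "en_US")
```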
Version 2.9.0
Upcoming version support changes
- This version, Babel 2.9, is the last version of Babel to support Python 2.7, Python 3.4, and Python 3.5.
Improvements
Bugfixes
- Dates: Correct default Format().timedelta format to ‘long’ to mute deprecation warnings – Aarni Koskela
- Import: Simplify iteration code in “import_cldr.py” – Felix Schwarz
- Import: Stop using deprecated ElementTree methods “getchildren()” and “getiterator()” – Felix Schwarz
- Messages: Fix unicode printing error on Python 2 without TTY. – Niklas Hambüchen
- Messages: Introduce invariant that _invalid_pofile() takes unicode line. – Niklas Hambüchen
- Tests: fix tests when using Python 3.9 – Felix Schwarz
- Tests: Remove deprecated ‘sudo: false’ from Travis configuration – Jon Dufresne
- Tests: Support Py.test 6.x – Aarni Koskela
- Utilities: LazyProxy: Handle AttributeError in specified func – Nikiforov Konstantin (#724)
- Utilities: Replace usage of parser.suite with ast.parse – Miro Hrončok
Documentation
- Update parse_number comments – Brad Martin (#708)
- Add __iter__ to Catalog documentation – @CyanNani123
Version 2.8.1
This is solely a patch release to make running tests on Py.test 6+ possible.
Bugfixes
Version 2.8.0
Improvements
Bugfixes
Docs
- Add years to changelog - Romuald Brunet
- Note that installation requires pytz - Steve (Gadget) Barnes
Version 2.7.0
Possibly incompatible changes
These may be backward incompatible in some cases, as some more-or-less internal APIs have changed. Please feel free to file issues if you bump into anything strange and we’ll try to help!
- General: Internal uses of babel.util.odict have been replaced with collections.OrderedDict from the Python standard library.
Improvements
- CLDR: Upgrade to CLDR 35.1 - Alberto Mardegan, Aarni Koskela (#626, #643)
- General: allow anchoring path patterns to the start of a string - Brian Cappello (#600)
- General: Bumped version requirement on pytz - @chrisbrake (#592)
- Messages: pybabel compile: exit with code 1 if errors were encountered - Aarni Koskela (#647)
- Messages: Add omit-header to update_catalog - Cédric Krier (#633)
- Messages: Catalog update: keep user comments from destination by default - Aarni Koskela (#648)
- Messages: Skip empty message when writing mo file - Cédric Krier (#564)
- Messages: Small fixes to avoid crashes on badly formatted .po files - Bryn Truscott (#597)
- Numbers: parse_decimal() strict argument and suggestions - Charly C (#590)
- Numbers: don’t repeat suggestions in parse_decimal strict - Serban Constantin (#599)
- Numbers: implement currency formatting with long display names - Luke Plant (#585)
- Numbers: parse_decimal(): assume spaces are equivalent to non-breaking spaces when not in strict mode - Aarni Koskela (#649)
- Performance: Cache locale_identifiers() - Aarni Koskela (#644)
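The strict mode of parse_decimal() mentioned in the Numbers entries above can be used to reject ambiguous user input and show suggestions instead. A minimal sketch; parse_user_amount is a hypothetical helper, and the German sample inputs are assumptions chosen for illustration.

```python
from babel.numbers import NumberFormatError, parse_decimal


def parse_user_amount(text, locale):
    """Return (value, suggestions); value is None when strict parsing fails."""
    try:
        return parse_decimal(text, locale=locale, strict=True), []
    except NumberFormatError as exc:
        # suggestions may be unset, so normalise it to a list.
        return None, list(getattr(exc, "suggestions", None) or [])


value, _ = parse_user_amount("1.099,98", "de")
failed, hints = parse_user_amount("30.00", "de")
```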
Bugfixes
- CLDR: Skip alt=… for week data (minDays, firstDay, weekendStart, weekendEnd) - Aarni Koskela (#634)
- Dates: Fix wrong weeknumber for 31.12.2018 - BT-sschmid (#621)
- Locale: Avoid KeyError trying to get data on WindowsXP - mondeja (#604)
- Locale: get_display_name(): Don’t attempt to concatenate variant information to None - Aarni Koskela (#645)
- Messages: pofile: Add comparison operators to _NormalizedString - Aarni Koskela (#646)
- Messages: pofile: don’t crash when message.locations can’t be sorted - Aarni Koskela (#646)
Tooling & docs
- Docs: Remove all references to deprecated easy_install - Jon Dufresne (#610)
- Docs: Switch print statement in docs to print function - NotAFile
- Docs: Update all pypi.python.org URLs to pypi.org - Jon Dufresne (#587)
- Docs: Use https URLs throughout project where available - Jon Dufresne (#588)
- Support: Add testing and document support for Python 3.7 - Jon Dufresne (#611)
- Support: Test on Python 3.8-dev - Aarni Koskela (#642)
- Support: Using ABCs from collections instead of collections.abc is deprecated. - Julien Palard (#609)
- Tests: Fix conftest.py compatibility with pytest 4.3 - Miro Hrončok (#635)
- Tests: Update pytest and pytest-cov - Miro Hrončok (#635)
Version 2.6.0
Possibly incompatible changes
These may be backward incompatible in some cases, as some more-or-less internal APIs have changed. Please feel free to file issues if you bump into anything strange and we’ll try to help!
Other changes
Bugfixes
Tooling & docs
- Add explicit signatures to some date autofunctions (@xmo-odoo) (PR #554)
- Include license file in the generated wheel package (@jdufresne) (PR #539)
- Python 3.6 invalid escape sequence deprecation fixes (@scop) (PR #528)
- Test and document all supported Python versions (@jdufresne) (PR #540)
- Update copyright header years and authors file (@akx) (PR #559)
Version 2.5.3
This is a maintenance release that reverts undesired API-breaking changes that slipped into 2.5.2 (see #550).
It is based on v2.5.1 (f29eccd) with commits 7cedb84, 29da2d2 and edfb518 cherry-picked on top.
Version 2.5.2
Bugfixes
- Revert the unnecessary PyInstaller fixes from 2.5.0 and 2.5.1 (#533) (@yagebu)
Version 2.5.1
Minor Improvements and bugfixes
- Use a fixed datetime to avoid test failures (#520) (@narendravardi)
- Parse multi-line __future__ imports better (#519) (@akx)
- Fix validate_currency docstring (#522)
- Allow normalize_locale and exists to handle various unexpected inputs (#523) (@suhojm)
- Make PyInstaller support more robust (#525, #526) (@thijstriemstra, @akx)
Version 2.5.0
New Features
Minor Improvements and bugfixes
- Dates: Add __str__ to DateTimePattern (#515) (@sfermigier)
- Dates: Fix an invalid string to bytes comparison when parsing TZ files on Py3 (#498) (@rowillia)
- Dates: Formatting zero-padded components of dates is faster (#517) (@akx)
- Documentation: Fix “Good Commits” link in CONTRIBUTING.md (#511) (@naryanacharya6)
- Documentation: Fix link to Python gettext module (#512) (@Linkid)
- Messages: Allow both dash and underscore separated locale identifiers in pofiles (#489, #490) (@akx)
- Messages: Extract Python messages in nested gettext calls (#488) (@sublee)
- Messages: Fix in-place editing of dir list while iterating (#476, #492) (@MarcDufresne)
- Messages: Stabilize sort order (#482) (@xavfernandez)
- Time zones: Honor the no-inherit marker for metazone names (#405) (@akx)
Version 2.4.0
New Features
Some of these changes might break your current code and/or tests.
Minor Improvements and bugfixes
- Documentation: Improve Date Fields descriptions (#450) (@ldwoolley)
- Documentation: Typo fixes and documentation improvements (#406, #412, #403, #440, #449, #463) (@zyegfryed, @adamchainz, @jwilk, @akx, @roramirez, @abhishekcs10)
- Messages: Default to UTF-8 source encoding instead of ISO-8859-1 (#399) (@asottile)
- Messages: Ensure messages are extracted in the order they were passed in (#424) (@ngrilly)
- Messages: Message extraction for JSX files is improved (#392, #396, #425) (@karloskar, @georgschoelly)
- Messages: PO file reading supports multi-line obsolete units (#429) (@mbirtwell)
- Messages: Python message extractor respects unicode_literals in __future__ (#427) (@sublee)
- Messages: Roundtrip Language headers (#420) (@kruton)
- Messages: units before obsolete units are no longer erroneously marked obsolete (#452) (@mbirtwell)
- Numbers: parse_pattern now preserves the full original pattern (#414) (@jtwang)
- Numbers: Fix float conversion in extract_operands (#435) (@akx)
- Plurals: Fix plural forms for Czech and Slovak locales (#373) (@ykshatroff)
- Plurals: More plural form fixes based on Mozilla and CLDR references (#431) (@mshenfield)
Internal improvements
Version 2.3.4
(Bugfix release, released on April 22nd 2016)
Bugfixes
- CLDR: The lxml library is no longer used for CLDR importing, so it should not cause strange failures either. Thanks to @aronbierbaum for the bug report and @jtwang for the fix. (#393)
- CLI: Every last single CLI usage regression should now be gone, and both distutils and stand-alone CLIs should work as they have in the past. Thanks to @paxswill and @ajaeger for bug reports. (#389)
Version 2.3.3
(Bugfix release, released on April 12th 2016)
Bugfixes
- CLI: Usage regressions that had snuck in between 2.2 and 2.3 should be no more. (#386) Thanks to @ajaeger, @sebdiem and @jcristovao for bug reports and patches.
Version 2.3.2
(Bugfix release, released on April 9th 2016)
Bugfixes
Version 2.3.1
(Bugfix release because of deployment problems, released on April 8th 2016)
Version 2.3
(Feature release, released on April 8th 2016)
Internal improvements
Features
- CLDR: Add an API for territory language data (#315)
- Core: Character order and measurement system data is imported and exposed (#368)
- Dates: Add an API for time interval formatting (#316)
- Dates: More pattern formats and lengths are supported (#347)
- Dates: Period IDs are imported and exposed (#349)
- Dates: Support for date-time skeleton formats has been added (#265)
- Dates: Timezone formatting has been improved (#338)
- Messages: JavaScript extraction now supports dotted names, ES6 template strings and JSX tags (#332)
- Messages: npgettext is recognized by default (#341)
- Messages: The CLI learned to accept multiple domains (#335)
- Messages: The extraction commands now accept filenames in addition to directories (#324)
- Units: A new API for unit formatting is implemented (#369)
Bugfixes
- Core: Mixed-case locale IDs work more reliably (#361)
- Dates: S…S formats work correctly now (#360)
- Messages: All messages are now sorted correctly if sorting has been specified (#300)
- Messages: Fix the unexpected behavior caused by catalog header updating (e0e7ef1) (#320)
- Messages: Gettext operands are now generated correctly (#295)
- Messages: Message extraction has been taught to detect encodings better (#274)
Version 2.2
(Feature release, released on January 2nd 2016)
Bugfixes
- General: Add __hash__ to Locale. (#303) (2aa8074)
- General: Allow files with BOM if they’re UTF-8 (#189) (da87edd)
- General: localedata directory is now locale-data (#109) (2d1882e)
- General: odict: Fix pop method (0a9e97e)
- General: Removed uses of datetime.date class from .dat files (#174) (94f6830)
- Messages: Fix plural selection for Chinese (531f666)
- Messages: Fix typo and add semicolon in plural_forms (5784501)
- Messages: Flatten NullTranslations.files into a list (ad11101)
- Times: FixedOffsetTimezone: fix display of negative offsets (d816803)
Features
- CLDR: Update to CLDR 28 (#292) (9f7f4d0)
- General: Add __copy__ and __deepcopy__ to LazyProxy. (a1cc3f1)
- General: Add official support for Python 3.4 and 3.5
- General: Improve odict performance by making key search O(1) (6822b7f)
- Locale: Add an ordinal_form property to Locale (#270) (b3f3430)
- Locale: Add support for list formatting (37ce4fa, be6e23d)
- Locale: Check inheritance exceptions first (3ef0d6d)
- Messages: Allow file locations without line numbers (#279) (79bc781)
- Messages: Allow passing a callable to extract() (#289) (3f58516)
- Messages: Support ‘Language’ header field of PO files (#76) (3ce842b)
- Messages: Update catalog headers from templates (e0e7ef1)
- Numbers: Properly load and expose currency format types (#201) (df676ab)
- Numbers: Use cdecimal by default when available (b6169be)
- Numbers: Use the CLDR’s suggested number of decimals for format_currency (#139) (201ed50)
- Times: Add format_timedelta(format='narrow') support (edc5eb5)
Version 2.1
(Bugfix/minor feature release, released on September 25th 2015)
- Parse and honour the locale inheritance exceptions (#97)
- Fix Locale.parse using global.dat incompatible types (#174)
- Fix display of negative offsets in FixedOffsetTimezone (#214)
- Improved odict performance which is used during localization file build, should improve compilation time for large projects
- Add support for “narrow” format for format_timedelta
- Add universal wheel support
- Support ‘Language’ header field in .PO files (fixes #76)
- Test suite enhancements (coverage, broken tests fixed, etc)
- Documentation updated
Version 2.0
(Released on July 27th 2015, codename Second Coming)
- Added support for looking up currencies that belong to a territory through the babel.numbers.get_territory_currencies() function.
- Improved Python 3 support.
- Fixed some broken tests for timezone behavior.
- Improved various smaller things for dealing with dates.
Version 1.4
(bugfix release, release date to be decided)
- Fixed a bug that caused deprecated territory codes to not be converted properly by the subtag resolving. This showed up, for instance, when trying to use und_UK as a language code, which now properly resolves to en_GB.
- Fixed a bug that made it impossible to import the CLDR data from scratch on windows systems.
Version 1.3
(bugfix release, released on July 29th 2013)
- Fixed a bug in likely-subtag resolving for some common locales. This primarily makes zh_CN work again which was broken due to how it was defined in the likely subtags combined with our broken resolving. This fixes #37.
- Fixed a bug that caused pybabel to break when writing to stdout on Python 3.
- Removed a stray print that was causing issues when writing to stdout for message catalogs.
Version 1.2
(bugfix release, released on July 27th 2013)
- Included all tests in the tarball. Previously the include skipped past recursive folders.
- Changed how tests are invoked and added separate standalone test command. This simplifies testing of the package for linux distributors.
Version 1.1
(bugfix release, released on July 27th 2013)
- added dummy version requirements for pytz so that it installs on pip 1.4.
- Included tests in the tarball.
Version 1.0
(Released on July 26th 2013, codename Revival)
- support python 2.6, 2.7, 3.3+ and pypy - drop all other versions
- use tox for testing on different pythons
- Added support for the locale plural rules defined by the CLDR.
- Added format_timedelta function to support localized formatting of relative times with strings such as “2 days” or “1 month” (ticket #126).
- Fixed negative offset handling of Catalog._set_mime_headers (ticket #165).
- Fixed the case where messages containing square brackets would break with an unpack error.
- updated to CLDR 23
- Make the CLDR import script work with Python 2.7.
- Fix various typos.
- Sort output of list-locales.
- Make the POT-Creation-Date of the catalog being updated equal to POT-Creation-Date of the template used to update (ticket #148).
- Use a more explicit error message if no option or argument (command) is passed to pybabel (ticket #81).
- Keep the PO-Revision-Date if it is not the default value (ticket #148).
- Make --no-wrap work by reworking --width’s default and mimic xgettext’s behaviour of always wrapping comments (ticket #145).
- Add --project and --version options for commandline (ticket #173).
- Add a __ne__() method to the Locale class.
- Explicitly sort instead of using sorted() and don’t assume ordering (Jython compatibility).
- Removed ValueError raising for string formatting message checkers if the string does not contain any string formatting (ticket #150).
- Fix Serbian plural forms (ticket #213).
- Small speed improvement in format_date() (ticket #216).
- Fix so frontend.CommandLineInterface.run does not accumulate logging handlers (ticket #227, reported with initial patch by dfraser)
- Fix exception if environment contains an invalid locale setting (ticket #200)
- use cPickle instead of pickle for better performance (ticket #225)
- Only use the banker’s rounding algorithm as a tie breaker if there are two nearest numbers, round as usual if there is only one nearest number (ticket #267, patch by Martin)
- Allow disabling cache behaviour in LazyProxy (ticket #208, initial patch from Pedro Algarvio)
- Support for context-aware methods during message extraction (ticket #229, patch from David Rios)
- “init” and “update” commands support “--no-wrap” option (ticket #289)
- fix formatting of fraction in format_decimal() if the input value is a float with more than 7 significant digits (ticket #183)
- fix format_date() with datetime parameter (ticket #282, patch from Xavier Morel)
- fix format_decimal() with small Decimal values (ticket #214, patch from George Lund)
- fix handling of messages containing ‘\n’ (ticket #198)
- handle irregular multi-line msgstr (no “” as first line) gracefully (ticket #171)
- parse_decimal() now returns Decimals not floats, API change (ticket #178)
- no warnings when running setup.py without installed setuptools (ticket #262)
- modified Locale.__eq__ method so Locales are only equal if all of their attributes (language, territory, script, variant) are equal
- resort to hard-coded message extractors/checkers if pkg_resources is installed but no egg-info was found (ticket #230)
- format_time() and format_datetime() now also accept floats (ticket #242)
- add babel.support.NullTranslations class similar to gettext.NullTranslations but with all of Babel’s new gettext methods (ticket #277)
- “init” and “update” commands support “--width” option (ticket #284)
- fix ‘input_dirs’ option for setuptools integration (ticket #232, initial patch by Étienne Bersac)
- ensure .mo file header contains the same information as the source .po file (ticket #199)
- added support for get_language_name() on the locale objects.
- added support for get_territory_name() on the locale objects.
- added support for get_script_name() on the locale objects.
- added pluralization support for currency names and added a ‘¤¤¤’ pattern for currencies that includes the full name.
- depend on pytz now and wrap it nicer. This gives us improved support for things like timezone transitions and an overall nicer API.
- Added support for explicit charset to PO file reading.
- Added experimental Python 3 support.
- Added better support for returning timezone names.
- Don’t throw away a Catalog’s obsolete messages when updating it.
- Added basic likelySubtag resolving when doing locale parsing and no match can be found.
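The rounding rule from the entry for ticket #267 above (round half to even only on an exact tie, round to the nearest value otherwise) matches the behaviour of ROUND_HALF_EVEN in Python's decimal module. The sketch below illustrates the rule and is not Babel's code; round_half_even is a hypothetical helper.

```python
from decimal import Decimal, ROUND_HALF_EVEN


def round_half_even(value, places):
    """Round to places decimals; exact ties go to the even neighbour."""
    quantum = Decimal(1).scaleb(-places)  # e.g. places=2 -> Decimal('0.01')
    return Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_EVEN)


tie_down = round_half_even("2.5", 0)   # exact tie, even neighbour is 2
tie_up = round_half_even("3.5", 0)     # exact tie, even neighbour is 4
no_tie = round_half_even("2.51", 0)    # not a tie, nearest value is 3
```

Values are passed as strings here so that binary float representation does not turn an apparent tie into a non-tie.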
Version 0.9.6
(released on March 17th 2011)
- Backport r493-494: documentation typo fixes.
- Make the CLDR import script work with Python 2.7.
- Fix various typos.
- Fixed Python 2.3 compatibility (ticket #146, ticket #233).
- Sort output of list-locales.
- Make the POT-Creation-Date of the catalog being updated equal to POT-Creation-Date of the template used to update (ticket #148).
- Use a more explicit error message if no option or argument (command) is passed to pybabel (ticket #81).
- Keep the PO-Revision-Date if it is not the default value (ticket #148).
- Make --no-wrap work by reworking --width’s default and mimic xgettext’s behaviour of always wrapping comments (ticket #145).
- Fixed negative offset handling of Catalog._set_mime_headers (ticket #165).
- Add --project and --version options for the command line (ticket #173).
- Add a __ne__() method to the Locale class.
- Explicitly sort instead of using sorted() and don’t assume ordering (Python 2.3 and Jython compatibility).
- Removed ValueError raising for string formatting message checkers if the string does not contain any string formatting (ticket #150).
- Fix Serbian plural forms (ticket #213).
- Small speed improvement in format_date() (ticket #216).
- Fix number formatting for locales where CLDR specifies alt or draft items (ticket #217)
- Fix bad check in format_time (ticket #257, reported with patch and tests by jomae)
- Fix so frontend.CommandLineInterface.run does not accumulate logging handlers (ticket #227, reported with initial patch by dfraser)
- Fix exception if environment contains an invalid locale setting (ticket #200)
Version 0.9.5
(released on April 6th 2010)
- Fixed the case where messages containing square brackets would break with an unpack error.
- Backport of r467: Fuzzy matching regarding plurals should NOT be checked against len(message.id), because this is always 2; instead, it should be checked against catalog.num_plurals (ticket #212).
Version 0.9.4
(released on August 25th 2008)
- Currency symbol definitions that are defined with choice patterns in the CLDR data are no longer imported, so the symbol code will be used instead.
- Fixed quarter support in date formatting.
- Fixed a serious memory leak that was introduced by the support for CLDR aliases in 0.9.3 (ticket #128).
- Locale modifiers such as “@euro” are now stripped from locale identifiers when parsing (ticket #136).
- The system locales “C” and “POSIX” are now treated as aliases for “en_US_POSIX”, for which the CLDR provides the appropriate data. Thanks to Manlio Perillo for the suggestion.
- Fixed JavaScript extraction for regular expression literals (ticket #138) and concatenated strings.
- The Translations class in babel.support can now manage catalogs with different message domains, and exposes the family of d*gettext functions (ticket #137).
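The two locale-identifier changes in this release (stripping modifiers such as “@euro” and treating “C” and “POSIX” as aliases for “en_US_POSIX”) can be illustrated with a small standard-library sketch. The helper name normalize_locale_identifier and the POSIX_ALIASES table are hypothetical and are not part of Babel; this is not Babel's parsing code, only a mirror of the behaviour described above.

```python
# Hypothetical helper, not part of Babel: it only mirrors the behaviour
# described in the 0.9.4 notes using plain string handling.
POSIX_ALIASES = {"C": "en_US_POSIX", "POSIX": "en_US_POSIX"}


def normalize_locale_identifier(identifier):
    """Strip an "@modifier" suffix, then map "C"/"POSIX" to "en_US_POSIX"."""
    # Everything after the first "@" is a modifier such as "@euro".
    base = identifier.split("@", 1)[0]
    # Identifiers without an alias are returned unchanged.
    return POSIX_ALIASES.get(base, base)


print(normalize_locale_identifier("de_DE@euro"))  # de_DE
print(normalize_locale_identifier("POSIX"))       # en_US_POSIX
```

The modifier is removed before the alias lookup so that an identifier carrying both would still resolve to the alias target.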
Version 0.9.3
(released on July 9th 2008)
- Fixed invalid message extraction methods causing an UnboundLocalError.
- Extraction method specification can now use a dot instead of the colon to separate module and function name (ticket #105).
- Fixed message catalog compilation for locales with more than two plural forms (ticket #95).
- Fixed compilation of message catalogs for locales with more than two plural forms where the translations were empty (ticket #97).
- The stripping of the comment tags in comments is optional now and is done for each line in a comment.
- Added a JavaScript message extractor.
- Updated to CLDR 1.6.
- Fixed timezone calculations when formatting datetime and time values.
- Added a get_plural function into the plurals module that returns the correct plural forms for a locale as tuple.
- Added support for alias definitions in the CLDR data files, meaning that the chance for items missing in certain locales should be greatly reduced (ticket #68).
Version 0.9.2
(released on February 4th 2008)
- Fixed catalogs’ charset values not being recognized (ticket #66).
- Numerous improvements to the default plural forms.
- Fixed fuzzy matching when updating message catalogs (ticket #82).
- Fixed bug in catalog updating, that in some cases pulled in translations from different catalogs based on the same template.
- Location lines in PO files are no longer wrapped at hyphens in file names (ticket #79).
- Fixed division by zero error in catalog compilation on empty catalogs (ticket #60).
Version 0.9.1
(released on September 7th 2007)
- Fixed catalog updating when a message is merged that was previously simple but now has a plural form, for example by moving from gettext to ngettext, or vice versa.
- Fixed time formatting for 12 am and 12 pm.
- Fixed output encoding of the pybabel --list-locales command.
- MO files are now written in binary mode on Windows (ticket #61).
Version 0.9
(released on August 20th 2007)
- The new_catalog distutils command has been renamed to init_catalog for consistency with the command-line frontend.
- Added compilation of message catalogs to MO files (ticket #21).
- Added updating of message catalogs from POT files (ticket #22).
- Support for significant digits in number formatting.
- Apply proper “banker’s rounding” in number formatting in a cross-platform manner.
- The number formatting functions now also work with numbers represented by Python Decimal objects (ticket #53).
- Added extensible infrastructure for validating translation catalogs.
- Fixed the extractor not filtering out messages that didn’t validate against the keyword’s specification (ticket #39).
- Fixed the extractor raising an exception when encountering an empty string msgid. It now emits a warning to stderr.
- Numerous Python message extractor fixes: it now handles nested function calls within a gettext function call correctly, uses the correct line number for multi-line function calls, and other small fixes (ticket #38 and ticket #39).
- Improved support for detecting Python string formatting fields in message strings (ticket #57).
- CLDR upgraded to the 1.5 release.
- Improved timezone formatting.
- Implemented scientific number formatting.
- Added a mechanism to look up locales by alias, for cases where browsers insist on including only the language code in the Accept-Language header, and sometimes even the incorrect language code.
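“Banker's rounding” in the notes above means that ties are rounded to the nearest even digit rather than always upward. A minimal sketch of the idea, assuming Python's standard decimal module rather than Babel's own number-formatting code (the helper round_half_even is hypothetical):

```python
from decimal import Decimal, ROUND_HALF_EVEN


def round_half_even(value, digits=0):
    """Round to `digits` decimal places, sending ties to the even neighbour."""
    # Going through str() avoids carrying binary float noise into the Decimal.
    number = Decimal(str(value))
    # The quantum fixes the target exponent, e.g. digits=2 -> Decimal("0.01").
    quantum = Decimal(10) ** -digits
    return number.quantize(quantum, rounding=ROUND_HALF_EVEN)


print(round_half_even(0.5))               # 0
print(round_half_even(1.5))               # 2
print(round_half_even(2.5))               # 2
print(round_half_even(Decimal("2.675"), 2))  # 2.68
```

Working in Decimal keeps the result independent of the platform's floating-point rounding, which is one way to obtain the same output everywhere.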
Version 0.8.1
(released on July 2nd 2007)
- default_locale() would fail when the value of the LANGUAGE environment variable contained multiple language codes separated by colons, as is explicitly allowed by the GNU gettext tools. As the default_locale() function is called at the module level in some modules, this bug would completely break importing these modules on systems where LANGUAGE is set that way.
- The character set specified in PO template files is now respected when creating new catalog files based on that template. This allows the use of characters outside the ASCII range in POT files (ticket #17).
- The default ordering of messages in generated POT files, which is based on the order those messages are found when walking the source tree, is no longer subject to differences between platforms; directory and file names are now always sorted alphabetically.
- The Python message extractor now respects the special encoding comment to be able to handle files containing non-ASCII characters (ticket #23).
- Added N_ (gettext noop) to the extractor’s default keywords.
- Made locale string parsing more robust, and also take the script part into account (ticket #27).
- Added a function to list all locales for which locale data is available.
- Added a command-line option to the pybabel command which prints out all available locales (ticket #24).
- The name of the command-line script has been changed from just babel to pybabel to avoid a conflict with the OpenBabel project (ticket #34).
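The LANGUAGE fix in this release concerns values such as “de_DE:en_US:en”, where several language codes are joined by colons. The notes do not say how such a value is resolved; the sketch below assumes the simplest policy of taking the first non-empty entry. The helper name first_language is hypothetical and is not a Babel function.

```python
def first_language(language_value):
    """Return the first non-empty entry of a colon-separated LANGUAGE value."""
    for code in language_value.split(":"):
        code = code.strip()
        if code:
            return code
    # An empty or colon-only value yields no usable language code.
    return None


print(first_language("de_DE:en_US:en"))  # de_DE
print(first_language("fr"))              # fr
```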
Version 0.8
(released on June 20th 2007)
- First public release
License
Babel is licensed under a three-clause BSD License. It basically means: do whatever you want with it as long as the copyright in Babel sticks around, the conditions are not modified and the disclaimer is present. Furthermore, you must not use the names of the authors to promote derivatives of the software without written consent.
The full license text can be found below (Babel License).
Authors
Babel is written and maintained by the Babel team and various contributors:
- Aarni Koskela
- Christopher Lenz
- Armin Ronacher
- Alex Morega
- Lasse Schuirmann
- Felix Schwarz
- Pedro Algarvio
- Jeroen Ruigrok van der Werven
- Philip Jenvey
- benselme
- Isaac Jurado
- Tobias Bieniek
- Erick Wilder
- Jonah Lawrence
- Michael Birtwell
- Jonas Borgström
- Kevin Deldycke
- Ville Skyttä
- Jon Dufresne
- Jun Omae
- Hugo
- Heungsub Lee
- Tomas R
- Jakob Schnitzer
- Sachin Paliwal
- Alex Willmer
- Daniel Neuhäuser
- Hugo van Kemenade
- Miro Hrončok
- Cédric Krier
- Luke Plant
- Jennifer Wang
- Lukas Balaga
- sudheesh001
- Jean Abou Samra
- Niklas Hambüchen
- Changaco
- Xavier Fernandez
- KO. Mattsson
- Sébastien Diemer
- alexbodn@gmail.com
- saurabhiiit
- srisankethu
- Erik Romijn
- Lukas B
- Ryan J Ollos
- Arturas Moskvinas
- Leonardo Pistone
- Hyunjun Kim
- buhtz
- Bohdan Malomuzh
- Leonid
- Ronan Amicel
- Christian Clauss
- Best Olunusi
- Teo
- Ivan Koldakov
- Rico Hermans
- Daniel
- Oleh Prypin
- Petr Viktorin
- Jean Abou-Samra
- Joe Portela
- Marc-Etienne Vargenau
- Michał Górny
- Alex Waygood
- Maciej Olko
- martin f. krafft
- DS/Charlie
- lilinjie
- Johannes Wilm
- Eric L
- Przemyslaw Wegrzyn
- Lukas Kahwe Smith
- Lukas Juhrich
- Nikita Sobolev
- Raphael Nestler
- Frank Harrison
- Nehal J Wani
- Mohamed Morsy
- Krzysztof Jagiełło
- Morgan Wahl
- farhan5900
- Sigurd Ljødal
- Andrii Oriekhov
- rachele-collin
- Lukas Winkler
- Juliette Monsel
- Álvaro Mondéjar Rubio
- ruro
- Alessio Bogon
- Nikiforov Konstantin
- Abdullah Javed Nesar
- Brad Martin
- Tyler Kennedy
- CyanNani123
- sebleblanc
- He Chen
- Steve (Gadget) Barnes
- Romuald Brunet
- Mario Frasca
- BT-sschmid
- Alberto Mardegan
- mondeja
- NotAFile
- Julien Palard
- Brian Cappello
- Serban Constantin
- Bryn Truscott
- Chris
- Charly C
- PTrottier
- xmo-odoo
- StevenJ
- Jungmo Ku
- Simeon Visser
- Narendra Vardi
- Stefane Fermigier
- Narayan Acharya
- François Magimel
- Wolfgang Doll
- Roy Williams
- Marc-André Dufresne
- Abhishek Tiwari
- David Baumgold
- Alex Kuzmenko
- Georg Schölly
- ldwoolley
- Rodrigo Ramírez Norambuena
- Jakub Wilk
- Roman Rader
- Max Shenfield
- Nicolas Grilly
- Kenny Root
- Adam Chainz
- Sébastien Fievet
- Anthony Sottile
- Yuriy Shatrov
- iamshubh22
- Sven Anderson
- Eoin Nugent
- Roman Imankulov
- David Stanek
- Roy Wellington Ⅳ
- Florian Schulze
- Todd M. Guerra
- Joseph Breihan
- Craig Loftus
- The Gitter Badger
- Régis Behmo
- Julen Ruiz Aizpuru
- astaric
- Felix Yan
- Philip_Tzou
- Jesús Espino
- Jeremy Weinstein
- James Page
- masklinn
- Sjoerd Langkemper
- Matt Iversen
- Alexander A. Dyshev
- Dirkjan Ochtman
- Nick Retallack
- Thomas Waldmann
- xen
Babel was previously developed under the Copyright of Edgewall Software. The following copyright notice holds true for releases before 2013: “Copyright (c) 2007 - 2011 by Edgewall Software”
In addition to the regular contributions, Babel includes a fork of Lennart Regebro’s tzlocal that originally was licensed under the CC0 license. The original copyright of that project is “Copyright 2013 by Lennart Regebro”.
General License Definitions
The following section contains the full license texts for Babel and the documentation.
- “Authors” hereby refers to all the authors listed in the Authors section.
- The “Babel License” applies to all the source code shipped as part of Babel (Babel itself as well as the examples and the unit tests) as well as documentation.
- The “Unicode License” applies to the transformed Unicode Common Locale Data Repository (CLDR) data files shipped with Babel, in the directory babel/locale-data.
Babel License
Copyright (c) 2013-2024 by the Babel Team, see Authors for more information.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
- Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
- Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
- Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS “AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Unicode License
UNICODE LICENSE V3
COPYRIGHT AND PERMISSION NOTICE
Copyright © 2004-2024 Unicode, Inc.
NOTICE TO USER: Carefully read the following legal agreement. BY DOWNLOADING, INSTALLING, COPYING OR OTHERWISE USING DATA FILES, AND/OR SOFTWARE, YOU UNEQUIVOCALLY ACCEPT, AND AGREE TO BE BOUND BY, ALL OF THE TERMS AND CONDITIONS OF THIS AGREEMENT. IF YOU DO NOT AGREE, DO NOT DOWNLOAD, INSTALL, COPY, DISTRIBUTE OR USE THE DATA FILES OR SOFTWARE.
Permission is hereby granted, free of charge, to any person obtaining a copy of data files and any associated documentation (the “Data Files”) or software and any associated documentation (the “Software”) to deal in the Data Files or Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, and/or sell copies of the Data Files or Software, and to permit persons to whom the Data Files or Software are furnished to do so, provided that either (a) this copyright and permission notice appear with all copies of the Data Files or Software, or (b) this copyright and permission notice appear in associated Documentation.
THE DATA FILES AND SOFTWARE ARE PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT OF THIRD PARTY RIGHTS.
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR HOLDERS INCLUDED IN THIS NOTICE BE LIABLE FOR ANY CLAIM, OR ANY SPECIAL INDIRECT OR CONSEQUENTIAL DAMAGES, OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THE DATA FILES OR SOFTWARE.
Except as contained in this notice, the name of a copyright holder shall not be used in advertising or otherwise to promote the sale, use or other dealings in these Data Files or Software without prior written authorization of the copyright holder.
SPDX-License-Identifier: Unicode-3.0
Author
The Babel Team
Copyright
2024, The Babel Team