localedef(4)


localedef -- format for locale specification source file

Description

This manual page describes the format of a source file used as input to the localedef command. This file (called a localedef( ) file in this manual page) defines information about a locale. The localedef command uses the localization specifications in the source file to create data files used by the localization system to map system input and output into a local format.

A locale defines your execution environment in terms of language and cultural conventions. It consists of one or more categories corresponding to these environment variable names:


LC_COLLATE
Collation order used by commands such as sort and uniq.

LC_CTYPE
Character classification and case conversion.

LC_MESSAGES
Formats of informative and diagnostic messages and interactive responses.

LC_MONETARY
Monetary formatting.

LC_NUMERIC
Numeric, non-monetary formatting.

LC_TIME
Date and time formats.

When you run the localedef command, only those categories defined in the localedef( ) file used as input to the command are defined for the locale created.
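As a rough illustration of how these categories appear to a program, the following sketch uses Python's standard locale module. Python is an assumption made for illustration only; nothing in this manual page is specific to it. The helper names available_categories and numeric_conventions are hypothetical.

```python
import locale

# Category names from this manual page. Python exposes most of them as
# constants of the same name (LC_MESSAGES is missing on some platforms).
CATEGORIES = ["LC_COLLATE", "LC_CTYPE", "LC_MESSAGES",
              "LC_MONETARY", "LC_NUMERIC", "LC_TIME"]


def available_categories():
    """Return the category names that this Python build exposes."""
    return [name for name in CATEGORIES if hasattr(locale, name)]


def numeric_conventions(locale_name="C"):
    """Select locale_name for LC_NUMERIC only and return its settings."""
    locale.setlocale(locale.LC_NUMERIC, locale_name)
    conv = locale.localeconv()
    return {"decimal_point": conv["decimal_point"],
            "thousands_sep": conv["thousands_sep"],
            "grouping": conv["grouping"]}


if __name__ == "__main__":
    print(available_categories())
    print(numeric_conventions("C"))
```

Because only LC_NUMERIC is selected here, the other categories keep whatever locale was already in effect.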

Category source definition

The localedef( ) file consists of one or more category source definitions. A category source definition consists of a category header (one of the strings listed above), the category body (consisting of one or more lines of text), and the string END.

A category body contains lines of text. Each line contains an identifier and (optionally) operands. Identifiers are keywords (identifying a locale element) or collating elements. Operands may be characters, strings of characters, or collating elements. (Strings are enclosed in double quotes; literal double quotes in strings should be escaped by a preceding backslash.)
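The structure described above can be sketched as a small reader that splits a source text into categories and identifier/operand pairs. This is a simplified illustration, not the parser used by localedef: the function name split_categories and the sample text are hypothetical, comments and continuation lines are not handled, and a body is taken to end at any line whose first word is END (whether or not a category name follows it).

```python
CATEGORY_NAMES = {"LC_COLLATE", "LC_CTYPE", "LC_MESSAGES",
                  "LC_MONETARY", "LC_NUMERIC", "LC_TIME"}


def split_categories(text):
    """Return {category: [(identifier, operands), ...]} for a source text."""
    categories = {}
    current = None  # name of the category whose body is being read
    for raw in text.splitlines():
        parts = raw.split(None, 1)
        if not parts:
            continue  # blank line
        first = parts[0]
        rest = parts[1].strip() if len(parts) > 1 else ""
        if current is None:
            if first in CATEGORY_NAMES:
                current = first
                categories[current] = []
        elif first == "END":
            current = None
        else:
            categories[current].append((first, rest))
    return categories


# Hypothetical sample input for the sketch.
SAMPLE = '''LC_NUMERIC
decimal_point "."
thousands_sep ","
grouping 3
END LC_NUMERIC
'''

if __name__ == "__main__":
    print(split_categories(SAMPLE))
```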

Comments are allowed in the localedef( ) file. By default, the comment character is a pound sign (#); blank lines and lines containing a # in the first position are ignored. The comment character can be changed if the first category header in the file is preceded by a line modifying the comment character. The line should have the following format:

   comment_char #
where # is the alternative comment character to use.

Lines can be continued by placing a backslash at the end of the line; the backslash and subsequent newline character are ignored. Comment lines are not treated in this way (the backslash is ignored, because it is read as part of the comment).
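The comment and continuation rules above can be sketched as a preprocessing step. The function name logical_lines and the sample text are hypothetical, and the sketch makes two simplifying assumptions: a comment_char line is accepted anywhere rather than only before the first category header, and a line that follows a continued line is always joined to it.

```python
def logical_lines(text):
    """Yield logical lines after comment removal and continuation joining."""
    comment_char = "#"
    parts = []  # pieces of a line that is being continued
    for raw in text.splitlines():
        if not parts:
            # Blank lines and lines starting with the comment character
            # are ignored, including any trailing backslash they carry.
            if not raw.strip() or raw.startswith(comment_char):
                continue
            words = raw.split()
            if len(words) == 2 and words[0] == "comment_char":
                comment_char = words[1]
                continue
        if raw.endswith("\\"):
            # Drop the backslash and the newline, then keep reading.
            parts.append(raw[:-1])
            continue
        parts.append(raw)
        yield "".join(parts)
        parts = []
    if parts:
        yield "".join(parts)


# Hypothetical sample: '%' replaces '#' as the comment character.
SAMPLE = "\n".join([
    "comment_char %",
    "% a comment ending in a backslash \\",
    "LC_TIME",
    'am_pm "AM";\\',
    '"PM"',
    "END LC_TIME",
])

if __name__ == "__main__":
    for line in logical_lines(SAMPLE):
        print(line)
```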

Individual characters, characters in strings, and collating elements are represented by symbolic names. Alternatively, characters can be represented by themselves, or as octal, hexadecimal, or decimal values. Octal numbers are two or three digits, preceded by a backslash; hexadecimal numbers are two digits, preceded by \x; and decimal numbers are two or more digits, preceded by \d.
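The three numeric forms can be sketched as a small decoder. The function name decode_escape is hypothetical, and the sketch handles one escape at a time rather than a whole operand.

```python
import re

# One alternative per numeric form: \xhh, \d followed by two or more
# decimal digits, or a backslash followed by two or three octal digits.
_ESCAPE = re.compile(r"\\(?:x([0-9A-Fa-f]{2})|d([0-9]{2,})|([0-7]{2,3}))")


def decode_escape(token):
    """Return the integer value of one octal, hex or decimal escape."""
    match = _ESCAPE.fullmatch(token)
    if match is None:
        raise ValueError("not a numeric character escape: %r" % token)
    hex_digits, dec_digits, oct_digits = match.groups()
    if hex_digits is not None:
        return int(hex_digits, 16)
    if dec_digits is not None:
        return int(dec_digits, 10)
    return int(oct_digits, 8)


if __name__ == "__main__":
    # All three spellings below denote the same value.
    for token in (r"\101", r"\x41", r"\d65"):
        print(token, decode_escape(token))
```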

Symbolic values are enclosed in angle brackets (< and >). The symbolic name (including angle brackets) must match a symbolic name defined in the charmap file specified by localedef -f.

LC_CTYPE

This category section defines character classification, case conversion, and other character attributes. In addition, a series of characters within a single character set can be represented by three adjacent periods (...), representing an ellipsis.

The following keywords are recognized:


upper
Defines characters classified as uppercase.

lower
Defines characters classified as lowercase.

alpha
Defines characters classified as letters (upper- or lowercase).

digit
Defines characters classified as digits.

space
Defines characters classified as whitespace.

cntrl
Defines characters classified as control characters.

punct
Defines characters classified as punctuation.

graph
Defines characters classified as printable characters, not including the space character.

print
Defines characters classified as printable characters, including the space character.

xdigit
Defines characters classified as hexadecimal digits. (Note that both the upper- and lowercase versions of the letters "A" to "F" are included in the POSIX locale; this practice is recommended.)

blank
Defines characters classified as blank characters.

toupper
Defines the mapping of lowercase letters to uppercase letters. (The operand consists of character pairs separated by semicolons; the characters in each pair are separated by a comma and enclosed in parentheses. The first character is lowercase, the second is its uppercase equivalent.)

tolower
Defines the mapping of uppercase letters to lowercase letters. (Same format as in toupper, except that the first character in each pair is uppercase and the second is lowercase.)

copy
Specifies an existing locale to be used in the definition of this category. If this keyword is specified, no other keyword is allowed.

Note that some of the above character classes are mutually exclusive or automatically inclusive. For example, a character cannot be both alpha and digit. See the X/Open Interface Definitions, Version 4 Issue 2 for further details.
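The operand format described for toupper and tolower can be sketched as follows. The function name parse_case_pairs is hypothetical; the sketch assumes that no character in a pair is itself a comma, semicolon, or parenthesis, and it does not resolve symbolic names against a charmap file.

```python
import re

# One parenthesized pair: two items separated by a comma.
_PAIR = re.compile(r"\(\s*([^,()\s]+)\s*,\s*([^,()\s]+)\s*\)")


def parse_case_pairs(operand):
    """Parse "(a,A);(b,B)" into a dict mapping first item to second."""
    mapping = {}
    for item in operand.split(";"):
        match = _PAIR.fullmatch(item.strip())
        if match is None:
            raise ValueError("malformed pair: %r" % item)
        mapping[match.group(1)] = match.group(2)
    return mapping


if __name__ == "__main__":
    # A toupper operand: each pair is (lowercase, uppercase).
    print(parse_case_pairs("(<a>,<A>);(<b>,<B>)"))
```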

LC_COLLATE

This category section provides a collation sequence for utilities that rely on sorting (such as sort, uniq, and awk), regular expression matching, and some system calls.

A collation sequence defines the order between collating elements (characters and multibyte collating sequences) in the locale. The following keywords are recognized:


collating-element
Defines a collating-element symbol representing a multicharacter collating element.

collating-symbol
Defines a collating symbol for use in collation order statements.

order_start
Defines collation rules. This statement is followed by one or more collation order statements that assign character collation values and weights to collating elements. The syntax of the order_start command is:
   order_start <sort_rules>;<sort_rules> ... ;

The following sort_rules directives are allowed, one set for each weight level:


backward
Specifies that comparison operations for the weight level proceed from the end of the string towards the beginning of the string.

forward
Specifies that comparison operations for the weight level proceed from the start of the string towards the end of the string.

position
Specifies that comparison operations for the weight level will consider the relative position of elements in the strings not subject to IGNORE.

order_end
Defines the end of collation order statements.

copy
Specifies an existing locale to copy the collating sequence from. (If used, no other keyword is permitted.)

See the X/Open Interface Definitions, Version 4 Issue 2 for further details.
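The effect of the forward and backward rules can be sketched with a two-level comparison. The weight table below is hypothetical (primary weight by base letter, secondary weight by accent) and is not taken from any real locale; the position rule and IGNORE are not modelled.

```python
O_CIRC = "\u00f4"   # o with circumflex
E_ACUTE = "\u00e9"  # e with acute

# Hypothetical weights: (primary, secondary) for each character.
WEIGHTS = {
    "c": (1, 0), "e": (2, 0), E_ACUTE: (2, 1),
    "o": (3, 0), O_CIRC: (3, 1), "t": (4, 0),
}


def sort_key(word, rules):
    """Build a comparison key holding one weight list per level."""
    key = []
    for level, rule in enumerate(rules):
        weights = [WEIGHTS[ch][level] for ch in word]
        if rule == "backward":
            # Compare this level from the end of the string.
            weights.reverse()
        key.append(weights)
    return key


WORDS = ["c" + O_CIRC + "t" + E_ACUTE, "cot" + E_ACUTE,
         "c" + O_CIRC + "te", "cote"]

# Second level compared from the start of the string.
forward_order = sorted(
    WORDS, key=lambda w: sort_key(w, ["forward", "forward"]))
# Second level compared from the end of the string.
backward_order = sorted(
    WORDS, key=lambda w: sort_key(w, ["forward", "backward"]))

if __name__ == "__main__":
    print(forward_order)
    print(backward_order)
```

All four words have equal primary weights, so the order is decided at the second level. With forward, the accent nearest the start of the word decides first; with backward, the accent nearest the end decides first.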

LC_MONETARY

This category section defines the rules and symbols used to format monetary numeric information. The following items are defined in this section:

int_curr_symbol
Specifies the international currency symbol. This is a four-character string; the first three characters contain the alphabetic international currency symbol (in accordance with ISO 4217:1987), while the fourth character is the separator used between the currency symbol and the monetary quantity.

currency_symbol
Specifies the string used as the local currency symbol.

mon_decimal_point
Specifies the symbol used as the decimal delimiter in monetary formatted quantities.

mon_thousands_sep
Specifies the symbol used as a separator for groups of digits to the left of a decimal delimiter in monetary formatted quantities.

mon_grouping
Specifies the size of each group of digits in formatted monetary quantities. Requires as an operand a sequence of integers separated by semicolons. Each integer specifies the number of digits in each group, with the initial integer defining the size of the group immediately preceding the decimal delimiter, and the following integers defining the groups before that. If the last integer is not -1, then the size of the last group is used repeatedly for the remainder of the digits. If the last integer is -1, then no further grouping is performed.

positive_sign
Specifies the sign used to indicate a non-negative formatted monetary quantity.

negative_sign
Specifies the sign used to indicate a negative formatted monetary quantity.

int_frac_digits
Specifies an integer representing the number of fractional digits to be written in a formatted monetary quantity using int_curr_symbol.

frac_digits
Specifies an integer representing the number of fractional digits to be written in a formatted monetary quantity using currency_symbol.

p_cs_precedes
Specifies that the currency_symbol or int_curr_symbol precedes the monetary quantity (if value is 1) or follows the monetary quantity (if value is 0). This only applies to non-negative monetary quantities.

p_sep_by_space
Specifies that no space separates the currency symbol from the monetary quantity (if value is 0), or that a space separates the currency symbol from the quantity (if value is 1), or that a space separates the symbol and the sign string (if value is 2). This only applies to non-negative monetary quantities.

n_cs_precedes
Specifies that the currency_symbol or int_curr_symbol precedes the monetary quantity (if value is 1) or follows the monetary quantity (if value is 0). This only applies to negative monetary quantities.

n_sep_by_space
Specifies that no space separates the currency symbol from the monetary quantity (if value is 0), or that a space separates the currency symbol from the quantity (if value is 1), or that a space separates the symbol and the sign string (if value is 2). This only applies to negative monetary quantities.

p_sign_posn
Specifies the positioning of the positive_sign for a non-negative monetary quantity. The following values are possible:

0
Parentheses enclose the quantity and the currency symbol.

1
The sign string precedes the quantity and the currency symbol.

2
The sign string follows the quantity and the currency symbol.

3
The sign string precedes the currency symbol.

4
The sign string follows the currency symbol.

n_sign_posn
Specifies the positioning of the negative_sign for a negative monetary quantity. (Accepts the same values as p_sign_posn.)

copy
Specifies the name of an existing locale from which to copy the definition of this section. If specified, no other keywords are allowed.

See the X/Open Interface Definitions, Version 4 Issue 2 for further details.
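The interaction of the cs_precedes, sep_by_space, and sign_posn values can be sketched as follows. The function name place_symbol_and_sign is hypothetical; the caller is assumed to pass the p_ values for a non-negative quantity and the n_ values for a negative one, along with an already formatted quantity. The sketch handles sep_by_space values 0 and 1 only.

```python
def place_symbol_and_sign(quantity, symbol, sign,
                          cs_precedes, sep_by_space, sign_posn):
    """Arrange a formatted quantity, a currency symbol and a sign string."""
    if sign_posn not in (0, 1, 2, 3, 4):
        raise ValueError("sign_posn must be in the range 0 to 4")
    if sep_by_space not in (0, 1):
        raise ValueError("this sketch handles sep_by_space 0 and 1 only")
    gap = " " if sep_by_space == 1 else ""
    # Values 3 and 4 attach the sign directly to the currency symbol.
    if sign_posn == 3:
        symbol = sign + symbol
    elif sign_posn == 4:
        symbol = symbol + sign
    if cs_precedes == 1:
        body = symbol + gap + quantity
    else:
        body = quantity + gap + symbol
    if sign_posn == 0:
        return "(" + body + ")"
    if sign_posn == 1:
        return sign + body
    if sign_posn == 2:
        return body + sign
    return body  # sign already attached to the symbol (3 or 4)


if __name__ == "__main__":
    print(place_symbol_and_sign("1,234.56", "$", "-", 1, 0, 1))
    print(place_symbol_and_sign("1,234.56", "$", "-", 1, 0, 0))
    print(place_symbol_and_sign("1.234,56", "EUR", "-", 0, 1, 1))
```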

LC_NUMERIC

This category defines the rules and symbols used to format non-monetary numeric information. The following keywords are recognized:

copy
Specifies the name of an existing locale to use as the definition for this category. (If specified, no other keywords are allowed.)

decimal_point
Specifies a string containing the symbol used as the decimal delimiter. This keyword cannot be omitted and cannot be set to the empty string.

grouping
Specifies the size of each group of digits to the left of the decimal point (or other separator). The operand is a sequence of integers separated by semicolons. Each integer specifies the number of digits in a group, starting with the group immediately to the left of the decimal point and moving leftwards. If the last integer is not -1, the size of the last group is repeatedly applied to the remaining digits; if the last integer is -1, then no further grouping is applied.

thousands_sep
Specifies the symbol used as a separator for groups of digits to the left of the decimal point.

See the X/Open Interface Definitions, Version 4 Issue 2 for further details.
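The grouping rule can be sketched as a function that inserts a separator into a string of digits. The function name group_digits is hypothetical; the grouping argument is the list of integers from the grouping operand, and the separator stands in for thousands_sep.

```python
def group_digits(digits, grouping, separator):
    """Insert separator into a string of integer digits per grouping."""
    if not grouping:
        return digits
    if any(size != -1 and size < 1 for size in grouping):
        raise ValueError("group sizes must be positive integers or -1")
    groups = []  # collected from right to left
    rest = digits
    index = 0
    while True:
        # Once the list is exhausted, its last entry keeps being used.
        size = grouping[min(index, len(grouping) - 1)]
        if size == -1 or len(rest) <= size:
            break  # no further grouping, or nothing left to split
        groups.append(rest[-size:])
        rest = rest[:-size]
        index += 1
    groups.append(rest)
    return separator.join(reversed(groups))


if __name__ == "__main__":
    print(group_digits("1234567", [3], ","))      # last size repeats
    print(group_digits("1234567", [3, -1], ","))  # one group only
    print(group_digits("1234567", [3, 2], ","))   # 3, then 2 repeated
```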

LC_TIME

This category defines the formatting of time and date values. The definitions imply a Gregorian style calendar: formatting time strings for other types of calendar is outside the scope of the X/Open specifications.

The following mandatory keywords are recognized:


abday
Specify the abbreviated weekday names, corresponding to the %a field descriptor. The operand consists of seven semicolon-separated strings, each surrounded by double quotes. The first string corresponds to Sunday, the second to Monday, and so on.

day
Specify the full weekday names, corresponding to the %A field descriptor. The operand consists of seven semicolon-separated strings, each surrounded by double quotes. The first string corresponds to Sunday, the second to Monday, and so on.

abmon
Specify the abbreviated month names, corresponding to the %b field descriptor. The operand consists of twelve semicolon-separated strings, each surrounded by double quotes. The first string corresponds to the first month (January), the second to the second month, and so on.

mon
Specify the full month names, corresponding to the %B field descriptor. The operand consists of twelve semicolon-separated strings, each surrounded by double quotes. The first string corresponds to the first month (January), the second to the second month, and so on.

d_t_fmt
Specify the date and time representation, corresponding to the %c field descriptor. The operand consists of a string and can contain any combination of characters and field descriptors. In addition, the string can contain escape sequences (\\, \a, \b, \f, \n, \r, \t, \v).

d_fmt
Specify the date representation, corresponding to the %x field descriptor. (Operand as for d_t_fmt.)

t_fmt
Specify the time representation, corresponding to the %X field descriptor. (Operand as for d_t_fmt.)

am_pm
Specify the AM/PM representation, corresponding to the %p field descriptor. The operand consists of two double-quoted strings, separated by a semicolon.

t_fmt_ampm
Specify the time representation in the 12-hour format, corresponding to the %r field descriptor. (Operand as for d_t_fmt.)

copy
Specifies the name of an existing locale from which to copy the definition of this section. If specified, no other keywords are allowed.

ABDAY_x
The abbreviated weekday names, where x is a number from 1 to 7.

DAY_x
The full weekday names, where x is a number from 1 to 7.

ABMON_x
The abbreviated month names, where x is a number from 1 to 12.

MON_x
The full month names, where x is a number from 1 to 12.

D_T_FMT
The appropriate date and time representation.

D_FMT
The appropriate date representation.

T_FMT
The appropriate time representation.

AM_STR
The appropriate AM affix.

PM_STR
The appropriate PM affix.

T_FMT_AMPM
The appropriate time representation in the 12-hour clock format with AM_STR and PM_STR.

ERA
The Era description segments (specifying how years are counted and displayed for each era in a locale).

ERA_D_FMT
The era date format.

ALT_DIGITS
Specifies the alternative symbols for digits, corresponding to the %O conversion specification modifier. The operands consist of semicolon-separated symbols. The first is the alternative symbol for zero, the second is the alternative symbol for one, and so on. (A maximum of 100 alternative symbols are supported.)

See the X/Open Interface Definitions, Version 4 Issue 2 for further details.
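The operand format of abday, and the rule that its first string corresponds to Sunday, can be sketched as follows. The function names parse_string_list and abbreviated_weekday and the sample operand are hypothetical; the sketch extracts the quoted strings without checking the separators between them.

```python
import datetime
import re

# A double-quoted string; a backslash escapes the character after it.
_QUOTED = re.compile(r'"((?:[^"\\]|\\.)*)"')


def parse_string_list(operand, expected):
    """Return the double-quoted strings in operand, checking the count."""
    names = [match.group(1) for match in _QUOTED.finditer(operand)]
    if len(names) != expected:
        raise ValueError("expected %d strings, found %d"
                         % (expected, len(names)))
    return names


def abbreviated_weekday(abday, date):
    """Return the abday entry for date; the first entry is Sunday."""
    # date.weekday() counts Monday as 0, so shift by one to start at Sunday.
    return abday[(date.weekday() + 1) % 7]


# Hypothetical abday operand.
ABDAY = parse_string_list(
    '"Sun";"Mon";"Tue";"Wed";"Thu";"Fri";"Sat"', 7)

if __name__ == "__main__":
    print(abbreviated_weekday(ABDAY, datetime.date(2004, 4, 25)))
```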

LC_MESSAGES

This category defines the format and values for affirmative or negative responses. The following keywords are recognized:

yesexpr
Specifies an operand that describes the acceptable affirmative response to a question expecting a yes/no response. Operand is an extended regular expression.

noexpr
Specifies an operand that describes the acceptable negative response to a question expecting a yes/no response. Operand is an extended regular expression.

copy
Specifies the name of an existing locale from which to copy the definition of this section. If specified, no other keywords are allowed.

See the X/Open System Interface Definitions, Version 4 Issue 2 for further details.
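A program might apply yesexpr and noexpr to a response as sketched below. The function name is_affirmative and the two example expressions are assumptions for illustration. Python's re module is not a POSIX extended regular expression engine, but simple bracket expressions such as these behave the same way in both.

```python
import re

# Example operand values, assumed for this sketch.
YESEXPR = "^[yY]"
NOEXPR = "^[nN]"


def is_affirmative(response, yesexpr=YESEXPR, noexpr=NOEXPR):
    """Return True for yes, False for no, None if neither matches."""
    if re.search(yesexpr, response):
        return True
    if re.search(noexpr, response):
        return False
    return None


if __name__ == "__main__":
    for answer in ("yes", "No", "maybe"):
        print(answer, is_affirmative(answer))
```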

References

awk(1), localedef(1), locale(1), locale(4), sort(1), uniq(1).

Standards conformance

localedef( ) is conformant with X/Open Interface Definitions, Version 4 Issue 2.
© 2004 The SCO Group, Inc. All rights reserved.
UnixWare 7 Release 7.1.4 - 25 April 2004