\section{\module{stringprep} ---
         Internet String Preparation}

\declaremodule{standard}{stringprep}
\modulesynopsis{String preparation, as per RFC 3454}
\moduleauthor{Martin v. L\"owis}{[email protected]}
\sectionauthor{Martin v. L\"owis}{[email protected]}

\versionadded{2.3}

When identifying things (such as host names) on the Internet, it is
often necessary to compare such identifications for
``equality''. Exactly how this comparison is performed may depend on
the application domain, e.g. whether it should be case-insensitive or
not. It may also be necessary to restrict the possible
identifications, for example to allow only identifications consisting
of ``printable'' characters.

\rfc{3454} defines a procedure for ``preparing'' Unicode strings in
internet protocols. Before strings are passed onto the wire, they are
processed with the preparation procedure, after which they have a
certain normalized form. The RFC defines a set of tables, which can be
combined into profiles. Each profile must define which tables it uses,
and which other optional parts of the \code{stringprep} procedure are
part of the profile. One example of a \code{stringprep} profile is
\code{nameprep}, which is used for internationalized domain names.

The module \module{stringprep} only exposes the tables from
\rfc{3454}. As these tables would be very large if represented as
dictionaries or lists, the module uses the Unicode character database
internally. The module source code itself was generated using the
\code{mkstringprep.py} utility.

As a result, these tables are exposed as functions, not as data
structures. There are two kinds of tables in the RFC: sets and
mappings. For a set, \module{stringprep} provides the ``characteristic
function'', i.e. a function that returns true if the parameter is part
of the set. For mappings, it provides the mapping function: given the
key, it returns the associated value. Below is a list of all functions
available in the module.

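As a rough illustration of the two kinds of functions, the following
sketch calls one characteristic function and one mapping function. It
assumes a Python 3 interpreter, where ordinary strings are Unicode; the
sample characters were picked only for illustration.

```python
import stringprep

# Set tables: the function answers "is this character in the table?"
soft_hyphen = "\u00ad"  # SOFT HYPHEN, listed in table B.1
print(stringprep.in_table_b1(soft_hyphen))  # True
print(stringprep.in_table_b1("a"))          # False

# Mapping tables: the function returns the value associated with the key.
print(stringprep.map_table_b2("A"))         # a
print(stringprep.map_table_b2("\u00df"))    # ss (LATIN SMALL LETTER SHARP S)
```

Each function takes a single character, so a whole string has to be
processed one character at a time.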
\begin{funcdesc}{in_table_a1}{code}
Determine whether \var{code} is in table~A.1 (Unassigned code points
in Unicode 3.2).
\end{funcdesc}

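One possible use of \function{in_table_a1} is to scan a string for
code points that were unassigned in Unicode 3.2. The helper name
\code{find_unassigned} below is hypothetical and not part of the
module; the sketch assumes Python 3 strings.

```python
import stringprep

def find_unassigned(text):
    """Return the characters of text that fall in table A.1."""
    return [ch for ch in text if stringprep.in_table_a1(ch)]

# Plain ASCII letters are assigned, so nothing is reported.
print(find_unassigned("abc"))

# U+0221 was unassigned in Unicode 3.2, so it is reported.
print(find_unassigned("a\u0221b"))
```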
\begin{funcdesc}{in_table_b1}{code}
Determine whether \var{code} is in table~B.1 (Commonly mapped to
nothing).
\end{funcdesc}

\begin{funcdesc}{map_table_b2}{code}
Return the mapped value for \var{code} according to table~B.2
(Mapping for case-folding used with NFKC).
\end{funcdesc}

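Tables B.1 and B.2 are often used together in the mapping step of a
profile. The following is a minimal sketch of such a step, loosely in
the spirit of a \code{nameprep}-style profile, and not a complete or
authoritative implementation. The helper name \code{map_step} is
hypothetical, and the final NFKC normalization through the standard
\module{unicodedata} module (which follows the interpreter's current
Unicode version, not Unicode 3.2) is an assumption of this example.

```python
import stringprep
import unicodedata

def map_step(label):
    """Drop B.1 characters, case-fold via B.2, then apply NFKC."""
    mapped = []
    for ch in label:
        if stringprep.in_table_b1(ch):
            # Table B.1: commonly mapped to nothing, so skip it.
            continue
        # Table B.2: the result may be longer than one character.
        mapped.append(stringprep.map_table_b2(ch))
    return unicodedata.normalize("NFKC", "".join(mapped))

print(map_step("Stra\u00dfe"))    # strasse
print(map_step("ex\u00adample"))  # example
```

A full profile would additionally check for prohibited and unassigned
characters after this step; those checks are left out here.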