The stc program reads a string table source from a plain text file.
This text file contains strings for several languages.
A directory structure for language/region/encoding is created below
the output directory. Binary string tables - one per language/region/encoding
triple - are
placed in the subdirectories.
The dkstt module can read these string tables. The file finding
mechanism in the dkapp module chooses the string table matching
the user's preferred language.
The input file is read line by line. A comment starts with the
hash character (#) and extends to the end of the line.
The input consists of a list of string mappings.
Each string mapping consists of a key string and a list
of scope/value pairs. The scope is started by a language identifier;
region and encoding may follow. Scope and value are separated by
"=". The value is the string for the scope, enclosed in quotes.
Additionally the file may contain optional special information introduced by the keywords "$version", "$author" and "$comment".
$version 1.0
$author Dirk Krause
$comment A demo string table to show the structure

"/msg/h"
en = "Hello!"
de = "Hallo!"
fr = "Bonjour!"

"/msg/g"
en = "Goodbye!"
de = "Mach's gut!"
fr = "Au revoir!"
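The format described above can be read with a small line-oriented parser. The following Python sketch is not the stc implementation. It assumes one item per line (a quoted key on its own line, followed by scope = "value" lines), treats only whole lines starting with # as comments, and does not handle escape sequences inside quotes. The function name parse_string_table is hypothetical.

```python
import re

# Hypothetical helper, not part of dklibs. Assumes one item per line
# and no escape sequences inside the quoted strings.
KEY_RE = re.compile(r'^"([^"]*)"$')
PAIR_RE = re.compile(r'^([^\s=]+)\s*=\s*"(.*)"$')


def parse_string_table(text):
    meta = {}       # "$version", "$author", "$comment" -> text
    table = {}      # key -> {scope: value}
    current = None  # scope/value dict of the key being read
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment line
        if line.startswith("$"):
            name, _, value = line.partition(" ")
            meta[name] = value.strip()
            continue
        m = KEY_RE.match(line)
        if m:
            current = table.setdefault(m.group(1), {})
            continue
        m = PAIR_RE.match(line)
        if m and current is not None:
            current[m.group(1)] = m.group(2)
            continue
        raise ValueError("unrecognized line: " + raw)
    return meta, table


SOURCE = '''\
$version 1.0
$author Dirk Krause
# a comment line
"/msg/h"
en = "Hello!"
de = "Hallo!"
"/msg/g"
en = "Goodbye!"
de = "Mach's gut!"
'''

meta, table = parse_string_table(SOURCE)
print(meta["$version"])       # 1.0
print(table["/msg/h"]["de"])  # Hallo!
```

The result is a mapping from each key to its scope/value pairs, which is the shape the text above describes.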
The conversion is done by
stc <input-file> <target-directory>
The target directory is typically a directory below ${prefix}/share. For example, to install string tables for an application named foo, the command line may look like
stc foo.str /usr/local/share/foo
Today's computer systems allow users to choose their favorite language,
region and encoding settings. Some languages require UTF-8 encoding,
other languages leave it up to the user whether or not to use UTF-8
encoding. Programs must be able to interact with the user in the selected
language and using the selected encoding. So it is necessary to provide
texts for some languages both UTF-8 encoded and 8-bit ASCII encoded.
The stc program from dklibs 1.8.0 (and above) has an automatic
encoding completion feature for some languages (languages which mainly use
characters in the Unicode range 0x00000000...0x000000FF). If a text is
specified only in UTF-8 encoding, the missing 8-bit ASCII encoded text is
created automatically, and vice versa.
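The completion step can be illustrated with a short Python sketch. It is not the code from stc.c. It assumes that a scope without an encoding suffix holds the 8-bit text (modelled here as ISO-8859-1, whose byte values equal the Unicode code points 0x00...0xFF), that the suffix ".utf-8" marks UTF-8 text, and that all characters fit into that range. The function name complete_encodings is hypothetical.

```python
def complete_encodings(entries, languages):
    """Return a copy of entries with the missing encoding variant added.

    entries maps scope -> bytes. Hypothetical helper, not the stc code.
    Raises UnicodeEncodeError for characters above 0xFF, which this
    sketch does not handle.
    """
    result = dict(entries)
    for lang in languages:
        plain, utf8 = lang, lang + ".utf-8"
        if utf8 in entries and plain not in entries:
            # UTF-8 -> 8-bit: decode, then store one byte per character.
            result[plain] = entries[utf8].decode("utf-8").encode("latin-1")
        elif plain in entries and utf8 not in entries:
            # 8-bit -> UTF-8: every byte is a code point 0x00...0xFF.
            result[utf8] = entries[plain].decode("latin-1").encode("utf-8")
    return result


entries = {
    "de.utf-8": "sch\u00f6n".encode("utf-8"),
    "fr": "Tr\u00e8s bien".encode("latin-1"),
    "xx.utf-8": b"not completed",
}
completed = complete_encodings(entries, ["de", "fr"])
print(completed["de"])        # b'sch\xf6n'
print(completed["fr.utf-8"])  # b'Tr\xc3\xa8s bien'
print("xx" in completed)      # False
```

Languages that are not in the list passed to the function are left untouched, which mirrors the idea of a configurable language list.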
At this time automatic encoding completion is supported for de, en, fr, nl, be, sp, pt, pl, cs, hu, sv, no and da. This list is probably incomplete; see below for how to add languages.
The auto_complete_languages[] string in the stc.c module contains the list of languages for which automatic encoding completion is enabled. You may modify this list before building and installing the dklibs package. If there are languages I should add to the distribution please use the feature-request/bug-tracking mechanisms on SourceForge.
An administrator can add further languages (e.g. "aa", "bb" and "cc") by placing a section
[*/stc]
/languages/mostly-ascii7=de en fr nl be sp pt pl cs hu sv no da aa bb cc
in the system-wide preferences file ${prefix}/etc/appdefaults.
A user can add languages by placing a section
[stc]
/languages/mostly-ascii7=de en fr nl be sp pt pl cs hu sv no da aa bb cc
in the user preferences file ${HOME}/.defaults/all.
When running stc, use the command line option
--/languages/mostly-ascii7="de en fr nl be sp pt pl cs hu sv no da aa bb cc"
to add languages.
When editing string table source files you should enter either only
8-bit ASCII encoded strings or only UTF-8 encoded strings.
Do not mix encodings!
For new projects use only UTF-8 encoded strings.
Before editing string table files, check whether your text editor
saves files UTF-8 encoded or non-UTF-8 encoded.
Do not simply rely on the language/region/encoding settings of the system.
Some "intelligent" editors switch back to non-UTF-8 encoded texts if
they think it might be useful. For example, on my Fedora Core 3 system
set up for "de_DE.UTF-8", vim converts files containing 8-bit ASCII encoded
text to UTF-8 when opening and converts back to 8-bit ASCII when saving.
So the files are saved 8-bit ASCII encoded.
For new files or files containing only 7-bit ASCII, neither conversion
is done. These files are saved UTF-8 encoded.
On the same system nedit prints a message "...switching from
de_DE.UTF-8 to de_DE" when run from a terminal window. Users running
nedit from the menu will never see this message unless they inspect
system log files. Files are saved non-UTF-8 encoded.
To test your editor, create a small
text file and enter a short text, e.g. "xöx" (instead of the German
umlaut ö you can use any other character in the Unicode range
0x00000080...0x000000FF). Save the file and open it in a hex viewer or hex
editor. If only one byte is used for the special character between the two
"x" characters, your editor saves files non-UTF-8 encoded. If multiple bytes
are used for the special character, it saves UTF-8 encoded files.
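If no hex viewer is at hand, the same check can be scripted. The Python sketch below assumes the test file contains exactly one special character between two "x" characters, optionally followed by a line break. The function name classify_test_bytes and the file name in the final comment are hypothetical.

```python
def classify_test_bytes(data):
    """Classify the bytes of a test file holding 'x', one character, 'x'."""
    data = data.rstrip(b"\r\n")
    if len(data) < 3 or not (data.startswith(b"x") and data.endswith(b"x")):
        return "unexpected content"
    middle = data[1:-1]  # bytes used for the special character
    if len(middle) == 1:
        if middle[0] < 0x80:
            return "inconclusive (7-bit character)"
        return "non-UTF-8 (single byte)"
    try:
        decoded = middle.decode("utf-8")
    except UnicodeDecodeError:
        return "unexpected content"
    if len(decoded) != 1:
        return "unexpected content"
    return "UTF-8 (multiple bytes)"


print(classify_test_bytes("x\u00f6x\n".encode("utf-8")))    # UTF-8 (multiple bytes)
print(classify_test_bytes("x\u00f6x\n".encode("latin-1")))  # non-UTF-8 (single byte)
# For a real file: classify_test_bytes(open("test.txt", "rb").read())
```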
An example file containing UTF-8 encoded texts might look like this:
$version 1.1
$author Dirk Krause
$comment A new demo string table to show the structure

"/msg/h"
en.utf-8 = "Hello, nice weather today!"
de.utf-8 = "Hallo, schönes Wetter heute!"
fr.utf-8 = "Bonjour!"

"/msg/g"
en.utf-8 = "Goodbye!"
de.utf-8 = "Mach's gut!"
fr.utf-8 = "Au revoir!"
When stc is run the missing texts for "en", "de" and "fr" are created
on the fly.
If the text contains characters outside the Unicode range
0x00000000...0x000000FF, these characters cannot be converted
to 8-bit ASCII encoding. A dot is inserted into the text in place of
each such character.
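The replacement rule can be sketched in a few lines of Python. This is an illustration of the behaviour described above, not the code used by stc. The function name to_8bit is hypothetical, and the sketch assumes that exactly one dot is written per unconvertible character.

```python
def to_8bit(text):
    """Encode text as one byte per character.

    Code points up to 0xFF are stored unchanged; any code point above
    0xFF is replaced by a dot. Hypothetical helper, not the stc code.
    """
    return bytes(ord(c) if ord(c) <= 0xFF else ord(".") for c in text)


# U+0159 (r with caron) lies outside the range, U+00E1 (a with acute) inside.
print(to_8bit("Dvo\u0159\u00e1k"))  # b'Dvo.\xe1k'
```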