Encoding Menu
The Encoding menu is only found in the font view:
-
Reencode
-
Has an attached sub-menu of standard and user defined encodings. The font's
current encoding will be indicated with a check mark. You may change the
encoding by selecting a different entry.
There are two slightly different formats an encoding can take. It can be
defined by Unicode code points, or it can be defined by glyph names. When
reencoding to an encoding defined by code points, the glyph with the matching
Unicode value is placed in the encoding slot. When reencoding to an encoding
defined by glyph names, we first search for a glyph with the matching name
and use it. If none is found, we search for a glyph with the Unicode code
point that corresponds to the glyph name (if any), and if one is found we
place it in the slot and change its name to that specified by the encoding.
Example: Suppose we have a font containing a glyph named "uni0041" (with
Unicode value U+0041), and an encoding which maps U+0041 -> slot 65. Then
the glyph will be moved into slot 65. If we have another encoding which maps
"A" -> slot 65, then (since "A" has Unicode value U+0041) our glyph will
still be mapped to slot 65, but in addition its name will be changed to "A".
-
Compact
-
Removes any holes from the encoding so that all the glyphs are packed
together. If the font is already compact, selecting this again will restore
the original encoding.
-
Force Encoding
-
Has the same sub-menu as above. Here we assume that the glyphs of the font
are currently encoded in the right order, but they have the wrong names (This
may seem odd, but it happens a lot). This command will change the names of
all the glyphs to match what they should be if the indicated encoding were
in force.
-
Add Encoding Slots...
-
Adds some extra slots at the end of the font into which you can put unencoded
glyphs (variant glyphs, etc.).
-
Remove Unused Slots
-
Removes any unused slots from the end of the font. It does not remove unused
slots inside the font, since that would break the encoding.
-
Detach Glyphs
-
Detaches any selected encoding slots from their currently associated glyphs.
These slots will now be marked as unused. The glyphs will remain in the font,
just not encoded (If you reencode the font those glyphs will become visible
again).
-
Detach & Remove Glyphs...
-
Similar to the above except that any glyphs detached (which are not used
elsewhere in the encoding) will be removed from the font.
-
Add Encoding Name...
-
Requests an encoding name from the user and searches for it in the iconv()
database. It then adds that encoding to the menu.
-
Load Encoding...
-
Asks the user for a filename and attempts to load a user defined encoding
from that file. (You can only load small encodings -- one byte encodings)
-
Make from Font...
-
Allows you to name the font's current encoding (if it isn't already named),
and add it to the encoding menu.
-
Remove Encoding...
-
Removes one of the user defined encodings from the menu.
-
Display by Group...
-
Allows you to restrict the glyphs displayed in the font view to those in a
user defined group (specified in the next command).
-
Define Groups...
-
Allows you to define groups of glyphs which
(presumably) have some meaningful connection to each other.
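The two lookup rules described under Reencode can be sketched in plain Python. This is only an illustration of the rules as stated above, not FontForge's own code: the Glyph class, the tiny NAME_TO_UNICODE table, and both function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-in for a glyph-name-to-Unicode table; a real one
# would hold thousands of names.
NAME_TO_UNICODE = {"A": 0x0041, "B": 0x0042, "space": 0x0020}

@dataclass
class Glyph:
    name: str
    unicode: Optional[int] = None

def reencode_by_codepoint(glyphs, encoding):
    """encoding maps slot -> code point; returns slot -> glyph."""
    by_unicode = {g.unicode: g for g in glyphs if g.unicode is not None}
    return {slot: by_unicode[cp]
            for slot, cp in encoding.items() if cp in by_unicode}

def reencode_by_name(glyphs, encoding):
    """encoding maps slot -> glyph name; returns slot -> glyph."""
    by_name = {g.name: g for g in glyphs}
    by_unicode = {g.unicode: g for g in glyphs if g.unicode is not None}
    slots = {}
    for slot, name in encoding.items():
        glyph = by_name.get(name)
        if glyph is None:
            # No glyph has this name: fall back to the code point the
            # name stands for, and rename the glyph that carries it.
            cp = NAME_TO_UNICODE.get(name)
            glyph = by_unicode.get(cp) if cp is not None else None
            if glyph is not None:
                glyph.name = name
                by_name[name] = glyph
        if glyph is not None:
            slots[slot] = glyph
    return slots

font = [Glyph("uni0041", 0x0041)]
print(reencode_by_codepoint(font, {65: 0x0041})[65].name)  # uni0041
print(reencode_by_name(font, {65: "A"})[65].name)          # A
```

Slots whose code point or name matches no glyph are simply left out of the result, which corresponds to an unused slot.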
General notes on encodings
Not all font formats support all encodings. SVG fonts will always be output
in a Unicode encoding; TrueType fonts in either Unicode or one of the CJK
encodings; Type1 fonts only support single byte encodings; and so on.
In a CID keyed font you are not allowed to change the encoding (in essence
because there is none), but there is an entry, CID->Change Supplement, which
will display the Registry/Ordering information and allow you to change the
supplement.
FontForge knows about the following encodings by default:
-
ISO-8859-1 (Latin1) -- traditional encoding for western european characters.
Default encoding for http. Does not include the Euro sign.
-
ISO-8859-15 (Latin0) -- Replacement for Latin1. Does include the Euro.
-
ISO-8859-2 (Latin2) -- Central & Eastern European (Czech, Hungarian,
Polish, Romanian, Croatian, Slovak, Slovenian)
-
ISO-8859-3 (Latin3) -- Southern European (Esperanto, Maltese)
-
ISO-8859-4 (Latin4) -- Northern European (Estonian, Latvian, Lithuanian,
Greenlandic, Lappish)
-
ISO-8859-9 (Latin5) -- Turkish
-
ISO-8859-10 (Latin6) -- Nordic (reworking of Latin4&Latin1)
-
ISO-8859-13 (Latin7) -- Another Baltic character set
-
ISO-8859-14 (Latin8) -- Celtic (Gaelic & Welsh)
-
ISO-8859-5 (Cyrillic)
-
ISO-8859-6 (Arabic)
-
ISO-8859-7 (Greek)
-
ISO-8859-8 (Hebrew) -- (and Yiddish)
-
ISO-8859-11 (Thai) -- Also known as TIS 620
<there is no ISO-8859-12>
-
KOI8-R -- Cyrillic
-
Macintosh Roman
-
Windows "ANSI" (CodePage1252)
-
Adobe Standard
-
Symbol
-
TeX Base
-
ISO-10646-1 (Unicode, BMP)
-
ISO-10646-1 (Unicode, Full)
-
ISO-10646-? (Unicode, by plane)
(You can select a specific plane of Unicode as an encoding, e.g. BMP, SMP,
SIP, ...)
-
SJIS
-
JIS 208 -- Japanese Kanji (first 8000 characters)
-
JIS 212 -- Japanese Kanji (next 8000 characters)
-
Wansung
-
KSC 5601 -- Korean (this is the 94x94 version of KSC 5601)
-
Johab
-
GB 2312 -- Simplified Chinese
-
Packed GB 2312 -- (I don't know what the proper name for this is, ASCII for
bytes<0x80, and GB 2312 EUC offset by 0x8080)
-
Big5 -- Traditional Chinese
-
Custom -- An unknown encoding
-
Original -- the glyph ordering used in the original font file.
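The "Packed GB 2312" entry above can be illustrated with a little arithmetic. The sketch below assumes one reading of that description: ASCII bytes pass through unchanged, and a two-byte 94x94 GB 2312 code has 0x8080 added to it. The function names are hypothetical, and Python's "gb2312" codec (which produces the EUC form) is used only as a cross-check.

```python
def pack_gb2312(code: int) -> int:
    """Map an ASCII byte or a 94x94 GB 2312 code to its packed value."""
    if code < 0x80:
        return code           # ASCII is passed through unchanged
    return code + 0x8080      # sets the high bit of both bytes

def unpack_gb2312(packed: int) -> int:
    """Inverse of pack_gb2312."""
    return packed if packed < 0x80 else packed - 0x8080

# Cross-check with U+4E2D, whose EUC bytes are D6 D0.
euc = int.from_bytes("\u4e2d".encode("gb2312"), "big")
print(hex(euc), hex(unpack_gb2312(euc)))  # 0xd6d0 0x5650
```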
Encoding sources:
An index to images of all the glyphs
in unicode.
You can also add new encodings to the set that FontForge knows about. There
are three menu items that manipulate a set of user defined encodings. As
always these specify both a character set and an encoding. The encoding has
a maximum of 256 entries, but the character set may be larger (up to 1024).
This means that you can define a font with extra characters. Since postscript
fonts can be reencoded at runtime this can be useful.
The Load Encoding command allows you to load one or more encodings from a
file. Currently the file must either be in the format used by the Unicode
Consortium for mapping ISO 8859 encodings to Unicode, or it must be a
postscript encoding array.
The first format looks like this:
0x20 0x0020 # SPACE
0x21 0x0021 # EXCLAMATION MARK
...
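A file in this first format is easy to read by hand or by script. The following is a minimal sketch of such a reader in Python, assuming each line holds a slot and a code point in hex, optionally followed by a "#" comment. The function name is hypothetical and this is not the parser FontForge itself uses.

```python
def parse_unicode_mapping(text: str) -> dict[int, int]:
    """Return {encoding slot: Unicode code point} from a mapping table."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop trailing comments
        if not line:
            continue                           # blank or comment-only line
        fields = line.split()
        if len(fields) < 2:
            continue                           # slot with no code point
        slot, codepoint = fields[:2]
        mapping[int(slot, 16)] = int(codepoint, 16)
    return mapping

sample = """\
0x20 0x0020 # SPACE
0x21 0x0021 # EXCLAMATION MARK
"""
print(parse_unicode_mapping(sample))  # {32: 32, 33: 33}
```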
A postscript file looks like:
/TeXBase1Encoding [
% 00
/.notdef /dotaccent /fi /fl
/fraction /hungarianumlaut /Lslash /lslash
...
] def
There may be more than one encoding in a postscript file. The encoding parser
is not smart: it will only read arrays specified like this, so don't try any
of the innumerable other ways of specifying an array in postscript.
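To make the restricted array form concrete, here is a rough Python sketch of a reader that accepts only "/Name [ /glyph ... ] def" with "%" comments, as in the sample above. It is an assumption about what such a parser might look like, not FontForge's implementation, and the function name is hypothetical.

```python
import re

def parse_ps_encodings(text: str) -> dict[str, list[str]]:
    """Return {encoding name: [glyph names in slot order]}."""
    # Strip "%" comments up to the end of each line.
    text = re.sub(r"%[^\n]*", "", text)
    encodings = {}
    # Only the literal form "/Name [ ... ] def" is recognised.
    pattern = r"/([^\s/\[]+)\s*\[(.*?)\]\s*def"
    for match in re.finditer(pattern, text, re.S):
        name, body = match.groups()
        encodings[name] = re.findall(r"/([^\s/]+)", body)
    return encodings

sample = """\
/TeXBase1Encoding [
% 00
 /.notdef /dotaccent /fi /fl
 /fraction /hungarianumlaut /Lslash /lslash
] def
"""
table = parse_ps_encodings(sample)
print(table["TeXBase1Encoding"][:3])  # ['.notdef', 'dotaccent', 'fi']
```

Because the pattern is applied repeatedly, a file holding several such arrays yields one dictionary entry per encoding.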
If the font has a custom encoding then the Make from Font menu item is
enabled. This allows you to name the encoding you have defined for the
current font.
The Remove Encoding menu item brings up a list showing all the custom
encodings and allows you to delete them.
Here's an example of a postscript encoding file.
It contains:
-
TeXMathItalicEncoding
-
TeXMathSymbolEncoding
-
TeXMathExtensionEncoding
-
IsoLatin -- (which specifies all the characters used in any of the ISO-Latin-*
fonts)
-
AdobeExpert -- (Which contains things like lower case numbers, small caps,
fractions, sub/superscript numbers, etc.)
-
CodePage1250 -- Microsoft's encoding for Central European characters
-
CodePage1251 -- Microsoft's Cyrillic encoding
-
CodePage1252 -- Microsoft's Western European encoding (a superset of Latin1.
Sometimes called "ANSI" though I can find no ANSI standard that it follows)
-
CodePage1253 -- Microsoft's Greek encoding
-
CodePage1254 -- Microsoft's Turkish encoding
-
CodePage1255 -- Microsoft's Hebrew encoding (an extension of ISO-8859-8)
-
CodePage1256 -- Microsoft's Arabic encoding
-
CodePage1257 -- Microsoft's Baltic encoding
-
CodePage1258 -- Microsoft's Vietnamese encoding
-
CodePage874 -- Microsoft's Thai encoding
-
MacCentralEuropean
-
MacCyrillic
-
MacGreek
-
MacHebrew
-
US-ASCII -- Not really useful by itself any more, but provides the first
128 characters of almost every other encoding.