The Linux Console Tools
Yann Dirson, dirson@debian.org
18 May 1999
____________________________________________________________
Table of Contents
1. Status of this document
1.1 Other documents
2. What the Linux Console Tools are
3. Understanding the big picture of the console
4. What is Unicode
5. Understanding and setting up the keyboard driver
5.1 How it works
5.2 See also
6. Understanding and setting up the screen driver
6.1 Unicode is everywhere
6.1.1 Screen Font Maps
6.1.2 SFM Fallback tables
6.2 The unicode screen-mode
6.3 The byte screen-mode
6.3.1 Charset slots
6.4 Special UCS2 codes
6.5 About the old 8-bit ``screen maps''
6.6 See also
7. Font files
7.1 The formats
7.2 Tools
7.2.1 Font-files manipulation tools
7.2.2 Font editors
8. The libraries
8.1 libconsole
8.2 libcfont
8.3 libctutils
9. The future of the console driver and of the Linux Console Tools
______________________________________________________________________
1. Status of this document
This is an introduction to the Linux Console Tools package. You
should refer to the manpages for more details.
1.1. Other documents
The Linux Console Tools WWW site may contain additional
information, the latest news, and so on.
Files in the doc/contrib/ directory are unsupported, and may be
obsolete, but are provided just in case someone needs them.
README.{acm,sfm,keytables} give some info on the respective included
data files.
kbd.FAQ.* is the Console and Keyboard HOWTO by Andries Brouwer, as
included in kbd 0.97. It would need some corrections, though.
2. What the Linux Console Tools are
The Linux Console Tools are a set of programs allowing the user to
set up and customize the console (restricted meaning: text mode screen +
keyboard only). The package is derived from version 0.94 of the kbd
package, and has benefited from most features introduced in kbd until
version 0.97.
The Linux Console Tools are still under development, but using them just
as a replacement for kbd should be quite safe, as they fix many bugs
kbd has.
3. Understanding the big picture of the console
The console driver is currently made of 2 sub-drivers: the keyboard
driver, and the screen driver. Basically, the keyboard driver sends
characters to your application, then the application does its own job,
and sends to the screen driver the characters to be displayed.
4. What is Unicode
Traditionally, character encodings use 8 bits, and thus are limited to
256 characters. This causes problems because:
1. it's not enough for some languages;
2. people speaking languages using different encodings have to choose
which one they use, and have to switch the system's state when
changing the language, which makes it difficult to mix several
languages in the same file;
3. etc...
Thus the UCS (Universal Character Set), also known as Unicode, was
created to handle and mix all of our world's scripts. This is a
32-bit (4 bytes) encoding, otherwise known as UCS4 because of the size
of its characters, which is standardized by ISO as the 10646-1 standard.
The most widely used characters from UCS are contained in the UCS2
16-bit subset of UCS; this is the subset used by the Linux console.
For convenience, the UTF8 encoding was designed as a variable-length
encoding (with a maximum length of 6 bytes) with ASCII compatibility;
all chars that have a UCS4 encoding can be expressed as a UTF8
sequence, and vice-versa.
The Unicode consortium defines additional
properties for UCS2 characters.
See: unicode(7), utf-8(7).
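As an illustration of the variable-length scheme, here is a minimal
sketch of a UTF8 encoder for a single UCS code point. It assumes the
original UTF8 definition, in which a sequence is 1 to 6 bytes long and
covers 31-bit values; the function name utf8_encode is made up for
this example and is not part of the Linux Console Tools.

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode one UCS code point with the original (1 to 6 byte) UTF8 scheme."""
    if code_point < 0:
        raise ValueError("code point must not be negative")
    if code_point < 0x80:
        # ASCII compatibility: 7-bit values are encoded as themselves.
        return bytes([code_point])
    # (number of payload bits, sequence length, marker bits of the lead byte)
    layouts = ((11, 2, 0xC0), (16, 3, 0xE0), (21, 4, 0xF0),
               (26, 5, 0xF8), (31, 6, 0xFC))
    for payload_bits, length, lead_marker in layouts:
        if code_point < (1 << payload_bits):
            tail = []
            value = code_point
            # Each continuation byte carries 6 bits, lowest bits last.
            for _ in range(length - 1):
                tail.append(0x80 | (value & 0x3F))
                value >>= 6
            return bytes([lead_marker | value] + tail[::-1])
    raise ValueError("code point does not fit in 31 bits")


if __name__ == "__main__":
    for cp in (0x41, 0xE9, 0x20AC):
        print("U+%04X -> %s" % (cp, utf8_encode(cp).hex()))
```

For values that Python itself accepts, the result can be compared
with chr(cp).encode("utf-8").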
5. Understanding and setting up the keyboard driver
5.1. How it works
The keyboard driver is made up of several levels:
o the keyboard hardware, which turns the user's finger moves into
so-called scancodes (Disclaimer: this is not really part of the
software driver itself; no support is provided for bugs in this
domain ;-). An event (key pressed or released) generates from 1 to
6 scancodes.
o a mechanism turning scancodes into keycodes using a translation
table which you can access with the getkeycodes(8) and
setkeycodes(8) utilities. You will only need to look at that if
you have some sort of non-standard (or programmable ?) keys on your
keyboard. AFAIK, these keycodes are the same among a set of
keyboards sharing the same hardware, but differing in the symbols
drawn on the keys.
o a mechanism turning keycodes into characters using a keymap. You
can access this keymap using the loadkeys(1) and dumpkeys(1)
utilities.
The keyboard driver can be in one of 4 modes (which you can access
using kbd_mode(1)), which will influence what type of data
applications will get as keyboard input:
o the scancode (K_RAW) mode, in which the application gets scancodes
for input. It is used by applications that implement their own
keyboard driver. For example, X11 does that.
o the keycode (K_MEDIUMRAW) mode, in which the application gets
information on which keys (identified by their keycodes) get
pressed and released. AFAIK, no real-life application uses this
mode.
o the ASCII (K_XLATE) mode, in which the application effectively gets
the characters as defined by the keymap, using an 8-bit encoding.
In this mode, the Ascii_0 to Ascii_9 keymap symbols let you
compose characters by giving their decimal 8-bit code, and Hex_0 to
Hex_F do the same with (2-digit) hexadecimal codes.
o the Unicode (K_UNICODE) mode, which at this time only differs from
the ASCII mode by allowing the user to compose UTF8 unicode
characters by their decimal value, using Ascii_0 to Ascii_9 (who
needs that ?), or their hexadecimal (4-digit) value, using Hex_0 to
Hex_F. A keymap can be set up to produce UTF8 sequences (with a
U+XXXX pseudo-symbol, where each X is a hexadecimal digit), but be
warned that these UTF8 sequences will also be produced even in
ASCII mode. I think this is a bug in the kernel.
BE WARNED that putting the keyboard in RAW or MEDIUMRAW mode will make
it unusable for most applications. Use showkey(1) to get a demo of
these special modes, or to find out what scancodes/keycodes are
produced by a specific key.
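To make the levels and modes more concrete, here is a toy model of
what an application would read in each mode. It is only a sketch
under strong assumptions: it handles just single-byte scancodes,
takes the keycode to be the low 7 bits of the scancode with the high
bit marking a key release, and uses a tiny made-up KEYMAP. The real
driver handles multi-byte scancodes, modifiers and much more.

```python
# Hypothetical, heavily simplified keymap: keycode -> character.
KEYMAP = {30: "a", 48: "b", 46: "c"}


def scancode_to_key_event(scancode):
    """Split a single-byte scancode into (keycode, released)."""
    return scancode & 0x7F, bool(scancode & 0x80)


def read_keyboard(scancodes, mode):
    """Return what an application would see for the given keyboard mode."""
    seen = []
    for scancode in scancodes:
        keycode, released = scancode_to_key_event(scancode)
        if mode == "K_RAW":
            # The application gets the scancodes untouched.
            seen.append(scancode)
        elif mode == "K_MEDIUMRAW":
            # The application gets key press/release events by keycode.
            seen.append((keycode, released))
        elif mode in ("K_XLATE", "K_UNICODE"):
            # The keymap is applied; only key presses produce characters.
            if not released and keycode in KEYMAP:
                seen.append(KEYMAP[keycode])
        else:
            raise ValueError("unknown keyboard mode: %r" % (mode,))
    return seen


if __name__ == "__main__":
    events = [0x1E, 0x9E]  # press, then release, of one key
    for mode in ("K_RAW", "K_MEDIUMRAW", "K_XLATE"):
        print(mode, read_keyboard(events, mode))
```

The same two input bytes show up as two scancodes, as a press and a
release event, or as a single character, depending on the mode.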
5.2. See also
keytables(5), setleds(1), setmetamode(1).
6. Understanding and setting up the screen driver
6.1. Unicode is everywhere
6.1.1. Screen Font Maps
In recent (as of 1998/08/11) kernels, the screen driver is based on
16-bit unicode (UCS2) encoding, which means that every console-font
loaded should be defined using a unicode Screen Font Map (SFM for
short), which tells, for each character in the font, the list of UCS2
characters it will render. (SFM's were formerly called ``Unicode
Map'', or ``unimap'' for short, but this term should be dropped: now
that what used to be called ``screen maps'' uses Unicode as well, it
probably confuses many people.)
6.1.2. SFM Fallback tables
Starting with release 1997.11.13 of the Linux Console Tools,
consolechars(8) understands SFM fallback tables. Before that, an
SFM had to contain both the Unicode values of the characters the font
was primarily meant to render and any approximations the user
wanted. Fallback tables let you put only the primary
mappings in the SFM provided with the font-file, and separately
keep a list telling ``if no glyph for that character is available in
the current font, then try to display it with the glyph for this one,
or else the one for that one, or ...''. This keeps all possible
fallbacks in a single place, and lets everyone choose
which fallback tables (s)he wants. Have a look at
data/consoletrans/*.fallback for examples.
A fallback-table file is made of fallback entries, each entry being on
its own line. Empty lines, and lines beginning with the # comment
character are ignored.
A fallback entry is a series of 2 or more UCS2 codes. The first one is
the character for which we want a glyph; the following ones are those
whose glyph we want to use when no glyph designed specially for our
character is available. The order of the codes defines a priority
order (own glyph if available, then second char's, then the third's,
etc.)
If an SFM is to be loaded, fallback mappings are added to this map
before it is loaded. If there is none (i.e. a font without SFM was
loaded, and no --sfm option was given to consolechars, or the
--force-no-sfm option was given), then the current SFM is requested from the
kernel, the fallback mappings are added, and the resulting SFM is
loaded back into the kernel.
Note that each fallback entry is checked against the original SFM, not
against the SFM we get by adding former fallback entries to the
original SFM (the one read from a file, or given by the kernel); this
applies even to entries in different files, and thus the order of -k
options has no effect. If you want some entries to be influenced by
previous ones, you will have to use different fallback files, and to
load them with several consecutive invocations of consolechars -k.
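The rules above can be sketched as follows. This is not the code used
by consolechars: the token syntax (codes written as U+XXXX and
separated by whitespace) is an assumption made for the example, the
SFM is modelled as a plain dictionary from UCS2 code to font
position, and the names parse_fallback_table and apply_fallbacks are
made up.

```python
def parse_fallback_table(text):
    """Parse fallback entries, one per line, ignoring blanks and # comments."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        codes = []
        for token in line.split():
            # Assumed syntax: each UCS2 code is written as U+XXXX.
            if not token.upper().startswith("U+"):
                raise ValueError("unexpected token: %r" % (token,))
            codes.append(int(token[2:], 16))
        if len(codes) < 2:
            raise ValueError("an entry needs at least 2 codes: %r" % (line,))
        entries.append(codes)
    return entries


def apply_fallbacks(sfm, entries):
    """Return a copy of sfm (UCS2 code -> font position) with fallbacks added.

    Every entry is checked against the ORIGINAL sfm only, so earlier
    entries never influence later ones.
    """
    result = dict(sfm)
    for wanted, *candidates in entries:
        if wanted in sfm:
            continue  # the character has its own glyph
        for candidate in candidates:
            if candidate in sfm:
                result[wanted] = sfm[candidate]
                break
    return result


if __name__ == "__main__":
    table = "# sample\nU+00C0 U+0041\nU+00C1 U+00C0\n"
    print(apply_fallbacks({0x41: 65}, parse_fallback_table(table)))
```

In the sample, U+00C1 gets no mapping, because its only candidate
U+00C0 is absent from the original SFM even though an earlier entry
gave it a fallback.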
6.2. The unicode screen-mode
There are basically 2 screen-modes (byte mode and UTF mode). The
simpler to explain is the UTF mode, in which the bytes received from
the application (i.e. written to the console screen) are interpreted as
UTF8 sequences, which are converted into the equivalent UCS2 codes,
and then looked up in the SFM to determine the glyphs used to display
each character.
Switching to and from UTF mode is done by sending to the screen the
escape sequences ESC % G and ESC % @ respectively (that is, the escape
character followed by the two printable characters shown). You may use
the unicode_start(1) and unicode_stop(1) scripts instead, as they also
change the keyboard mode, and let you optionally change the
screen-font.
Use vt-is-UTF8(1) to find out whether the active VT is in UTF mode.
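A program can emit these sequences itself. The sketch below assumes
that each sequence is the escape character (0x1B) followed by %G or
%@; the function name set_utf8_screen_mode is made up, and unlike
unicode_start(1) and unicode_stop(1) it does not touch the keyboard
mode or the font.

```python
import sys

ENTER_UTF8_MODE = b"\x1b%G"  # ESC % G
LEAVE_UTF8_MODE = b"\x1b%@"  # ESC % @


def set_utf8_screen_mode(enabled, stream=None):
    """Write the screen-mode switching sequence to a binary stream.

    The stream defaults to standard output, which is only meaningful
    when it is connected to a Linux virtual console.
    """
    if stream is None:
        stream = sys.stdout.buffer
    stream.write(ENTER_UTF8_MODE if enabled else LEAVE_UTF8_MODE)
    stream.flush()
```

Calling set_utf8_screen_mode(True) on a VT should have the same
effect on the screen as the first sequence above; elsewhere the bytes
are simply passed to whatever terminal is attached.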
6.3. The byte screen-mode
The byte mode is a bit more complicated, as it uses an additional map
to transform the byte-characters sent by the application into UCS2
characters, which are then handled as described above. I call this map
the Application Charset Map (ACM), because it defines the encoding the
application uses, but it used to be called a ``screen map'', or
``console map'' (this comes from the time when the screen driver
didn't use Unicode, and there was only one map down there).
Although there is only one ACM active at a given time, there are 4 of
them at any time in the kernel; 3 of them are built-in and never
change, and they define the IBM codepage 437 (the i386's default, and
thus the kernel's default even on other archs), the DEC VT100 charset,
and the ISO latin1 charset; the 4th is user-definable, and defaults on
boot to the ``straight to font'' mapping, described below under
``Special UCS2 codes''.
The consolechars(1) command can be used to change the ACM, as well as
the font and its associated SFM.
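The role of the ACM can be sketched as a 256-entry table from
application bytes to UCS2 codes. The tables below are approximations
built from Python's own codecs, not the kernel's tables (which may
differ, notably for control characters); the names build_acm and
bytes_to_ucs2 are made up for this example.

```python
def build_acm(codec_name):
    """Build a 256-entry byte -> UCS2 table from one of Python's 8-bit codecs."""
    return [ord(bytes([byte]).decode(codec_name)) for byte in range(256)]


def bytes_to_ucs2(data, acm):
    """Translate application bytes into UCS2 codes through an ACM."""
    return [acm[byte] for byte in data]


# Approximations of two of the built-in maps.
CP437_ACM = build_acm("cp437")
LATIN1_ACM = build_acm("latin-1")

# Boot-time default of the user-definable map: every byte is sent to the
# ``straight to font'' zone, i.e. to font position `byte`.
STRAIGHT_TO_FONT_ACM = [0xF000 + byte for byte in range(256)]


if __name__ == "__main__":
    # The same character (e with acute accent, U+00E9) is byte 0x82 for an
    # application using cp437 and byte 0xE9 for one using latin1.
    print(hex(bytes_to_ucs2(b"\x82", CP437_ACM)[0]))
    print(hex(bytes_to_ucs2(b"\xe9", LATIN1_ACM)[0]))
```

Once the bytes have been turned into UCS2 codes, the SFM lookup is the
same as in UTF mode.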
6.3.1. Charset slots
The Linux Console Driver has 2 slots for charsets, labeled G0 and G1.
Each of these slots contains a reference to one of the 4 kernel ACMs,
3 of which are predefined to provide the cp437, iso01, and vt100
graphics charsets. The 4th one is user-definable; this is the one you
can set with consolechars --acm and get with consolechars --old-acm.
Versions of the Linux Console Tools prior to 1998.08.11, as well as
all versions of kbd at least until 0.96a, were always assuming you
wanted to use the G0 slot, pointing to the user-defined ACM. You can
now use the charset utility to tune your charset slots.
You will note that, although each VT has its own slot settings, there
is only one user-defined ACM for use by all the VTs. That is, whereas
you can have tty1 using G0=cp437 and G1=vt100, at the same time as
tty2 using G0=iso01 and G1=iso02 (user-defined), you cannot have at
the same time tty1 using iso02 and tty2 using iso03. This is a
limitation of the Linux kernel.
Note that you can emulate such a setting using the filterm utility,
with your console in UTF8-mode, by telling filterm to translate screen
output on-the-fly to UTF8.
You'll find filterm in the konwert package, by Marcin Kowalczyk, which
is available from his WWW site.
6.4. Special UCS2 codes
There are special UCS2 values you should care about, but the present
list is probably not exhaustive:
o codes C from U+F000 to U+F1FF are not looked up in the SFM, and
directly access the character in font-position C & 0x01FF (yes, a
font can be 512-chars on many hardware platforms, like VGA). This
is referred to as the straight to font zone.
o code U+FFFD is the replacement character, usually at font-position
0 in a font. It is displayed by the kernel each time the
application requests a unicode character that is not present in
the SFM. This not only keeps the driver safe in Unicode
mode, but also prevents displaying invalid characters when the ACM
on a particular VT contains characters not in the current font!
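Put together, the two rules above give a lookup along these lines.
This is a sketch, not the kernel's code: the SFM is modelled as a
dictionary from UCS2 code to font position, the name glyph_for is made
up, and falling back to position 0 when the SFM has no entry for
U+FFFD is an assumption based on the ``usually at font-position 0''
remark.

```python
REPLACEMENT_CHARACTER = 0xFFFD
STRAIGHT_TO_FONT_FIRST = 0xF000
STRAIGHT_TO_FONT_LAST = 0xF1FF


def glyph_for(ucs2, sfm):
    """Return the font position used to display one UCS2 code."""
    if STRAIGHT_TO_FONT_FIRST <= ucs2 <= STRAIGHT_TO_FONT_LAST:
        # Straight to font zone: the SFM is bypassed entirely.
        return ucs2 & 0x01FF
    if ucs2 in sfm:
        return sfm[ucs2]
    # Unknown character: display the replacement character instead.
    return sfm.get(REPLACEMENT_CHARACTER, 0)


if __name__ == "__main__":
    sfm = {0x41: 65, REPLACEMENT_CHARACTER: 0}
    for code in (0x41, 0xF141, 0x263A):
        print("U+%04X -> font position %d" % (code, glyph_for(code, sfm)))
```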
6.5. About the old 8-bit ``screen maps''
There was a time when the kernel didn't know anything about Unicode.
In those days, Application Charset Maps were called ``screen
maps'', and just mapped the application's characters into font
positions. The file format used for these 8-bit ACM's is still
supported for backward compatibility, but should not be used any more.
The old way of using custom ACM's didn't know about unicode, so the
ACM had to depend on the font. Now, as each VT chooses its own ACM
(from the 4 ones in the kernel at a given time), and as the console-
font is common to all VT's, we can use a charset even if the font
can't display all of its characters; it will then display the
replacement character (U+FFFD).
6.6. See also
psfaddtable(1), psfgettable(1), psfstriptable(1), showfont(1).
7. Font files
7.1. The formats
The primary font file format for the Linux Console Tools, as of
version 0.2.x, is the PSF format, which is also used by kbd. Version
0.3.x will introduce the XPSF format, which will be able to replace all
existing file formats.
Raw fonts can be converted into PSF files with the font2psf(1) tool
(written by Martin Lohner, SuSE GmbH).
Versions 0.2.x do not support the CP format; this support will
come back in the 0.3.x development branch.
7.2. Tools
7.2.1. Font-files manipulation tools
The psfaddtable(1), psfgettable(1), and psfstriptable(1) tools are
provided by the Linux Console Tools for manipulation of the SFM
embedded in PSF files. These are the only font-manipulation tools
provided by the Linux Console Tools as of version 0.2.x. The
font2psf(1) tool is available in the contrib directory to convert old
raw fonts into PSF fonts.
There are plans for a more generic font-conversion tool based on
libcfont. It will be mostly trivial to write once work on libcfont
is advanced enough.
The only way provided by the Linux Console Tools to display a font's
contents is to load it, and then to display it using showfont(1).
7.2.2. Font editors
I do not currently know of a good font-editor suitable for editing
console fonts. I tried fonter, but this one has a bad design flaw:
you can only properly edit cp437 fonts (or maybe ASCII-based fonts if
you like unreadable screens) because it works on the console and loads
the font you are editing. I was told about cse, which I have not tried
yet. Marcin Kowalczyk is working on the fonty tool (which I have not
checked yet either), which will help font designers, but is not AFAIK a
real editor. Robert de Bath works on his own tools, which handle a
variety of file formats and table formats.
8. The libraries
There are several shared libraries installed by the Linux Console
Tools. They were at first meant just to share code between the
various utilities (kbd has lots of duplicated code), but they could be
used as a base to build new tools.
However, they are not yet ready for production use (hence the version
number 0.0.0), and are neither complete nor coherent at this
time.
Here is a summary of what they are meant to become:
8.1. libconsole
libconsole is meant to be a collection of:
o wrappers around the kernel-level functionalities, which should be
as kernel-version-independent as reasonable;
o higher-level interfaces to these functionalities.
Maybe this goal overlaps with some part of libggi (see ``The
future''), but I have not investigated that for now.
8.2. libcfont
libcfont is meant to provide a high-level interface to console-font
file-handling. It also exports the lower-level functions used to
construct higher-level ones.
For now it only supports some low- to medium-level functions that ease
writing programs, but I hope to make it a lot more than that,
especially with the coming of the XPSF file-format (see
doc/font-formats/xpsf.draft for details).
As of release 1998.08.11, implementation of the higher-level interface
has just started.
8.3. libctutils
libctutils is a collection of misc utility functions for use by the 2
other libs and by the tools. I hope most of this stuff will one day make
its way to an existing general purpose utility-library. Any offers
welcomed.
9. The future of the console driver and of the Linux Console Tools
The Linux Console Tools were derived from kbd. It is not a good thing
to have two distinct distributions for these tools, so we once hoped
we'd manage to finally merge the two packages back, together with
Andries Brouwer, who still maintains kbd. However, due to the lack of
technical cooperation from kbd's maintainer, and to the growing gap
with kbd, this project is now on hold.
The driver in 2.2.x kernels has been reworked a lot, and it seems it
will continue to evolve in 2.3.x. There are already some new
features, such as fonts with width != 8, which will be supported in
the future.
There is an ongoing project, known as GGI (for General Graphical
Interface), which is in the process of, among other things,
revolutionizing the way the console is handled. Have a look at their
WWW site for details.
As far as possible, I will try to keep the Linux Console Tools in sync
with what is developed for the kernel, and with what gets added to new
releases of kbd, but I have to look better at the current state of the
GGI project before I give any more info.