MIME-Version: 1.0
Content-Type: multipart/alternative;
	boundary="----_=_NextPart_001_01C2B68D.278D2280"
Content-class: urn:content-classes:message
Subject: Re: latex/3480: Support for UTF-8 missing in inputenc.sty
Date: Tue, 7 Jan 2003 21:41:01 +0100
Message-ID: <200301072041.h07Kf1Lg015317@sun.dante.de>
Thread-Topic: latex/3480: Support for UTF-8 missing in inputenc.sty
Thread-Index: AcK2jSgt4FLxO/G5QcuxDpKXkBuPLA==
From: "Frank Mittelbach" <frank.mittelbach@latex-project.org>
To: <latex-gnats@latex-project.org>
Cc: <gnats-admin@latex-project.org>
Reply-To: "Frank Mittelbach" <frank.mittelbach@latex-project.org>
Status: R

This is a multi-part message in MIME format.

------_=_NextPart_001_01C2B68D.278D2280
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

The following reply was made to PR latex/3480; it has been noted by GNATS.

From: Frank Mittelbach <frank.mittelbach@latex-project.org>
To: Dominique Unruh <dominique@unruh.de>,
   Markus Kuhn <Markus.Kuhn@cl.cam.ac.uk>
Cc: latex-bugs@latex-project.org, LATEX-L@listserv.uni-heidelberg.de
Subject: Re: latex/3480: Support for UTF-8 missing in inputenc.sty
Date: Tue, 7 Jan 2003 21:34:51 +0100

 following up on the discussion concerning utf-8 support for LaTeX, below is a
 package written to provide that support within the inputenc framework.

 it is not complete, nor are its tables in their final form; we would need
 some volunteers to help us here.

 but first i would like to hear comments/suggestions on the approach

 cheers
 frank

 ps the ins file generates a small test document that produces one error
 (deliberately)

 -------------- utf8ienc.ins
 \input docstrip

 \preamble

 This is a generated file.

 Copyright 2002
 The LaTeX3 Project and any individual authors listed elsewhere
 in this file.

 It may be distributed and/or modified under the
 conditions of the LaTeX Project Public License, either version 1.2
 of this license or (at your option) any later version.
 The latest version of this license is in
    http://www.latex-project.org/lppl.txt
 and version 1.2 or later is part of all distributions of LaTeX
 version 1999/12/01 or later.

 \endpreamble

 \keepsilent
 \askonceonly

 \usedir{tex/latex/base}
=20
 \generate{\file{utf8.def}{\from{utf8ienc.dtx}{utf8}}
           \file{t1enc.dfu}{\from{utf8ienc.dtx}{t1}}
           \file{ot1enc.dfu}{\from{utf8ienc.dtx}{ot1}}
           \file{omsenc.dfu}{\from{utf8ienc.dtx}{oms}}
           \file{utf8-test.tex}{\from{utf8ienc.dtx}{test}}
           }

 \ifToplevel{
 \Msg{***********************************************************}
 \Msg{*}
 \Msg{* To finish the installation you have to move the following}
 \Msg{* files into a directory searched by TeX:}
 \Msg{*}
 \Msg{* \space\space utf8.def}
 \Msg{* \space\space *.dfu}
 \Msg{*}
 \Msg{* To produce the documentation run the files ending with}
 \Msg{* `.dtx' through LaTeX.}
 \Msg{*}
 \Msg{* Happy TeXing}
 \Msg{***********************************************************}
 }

 \endbatchfile
 -------------- utf8ienc.ins

 -------------- utf8ienc.dtx
 % \iffalse
 %<*driver>
 \documentclass{ltxdoc}
 \usepackage[utf8]{inputenc}
 \GetFileInfo{utf8.def}
 \title{Providing some UTF-8 support via \texttt{inputenc}}
 \date{\fileversion\space\filedate{} printed \today}
  \author{%
   Frank Mittelbach \and Chris Rowley\thanks{Borrowing heavily from
       code by David Carlisle and tables by Sebastian Rahtz}}
 \begin{document}
  \maketitle
  \tableofcontents
  \DocInput{utf8ienc.dtx}
 \end{document}
 %</driver>
 % \fi
 %
 % \CheckSum{604}
 %
 % \section{Introduction}
 %
 % [The whole section is rather unfinished \ldots\ just like the code.]
 %
 % \subsection{Background and general stuff}
 %
 % For many reasons what this package provides is a long way from any
 % type of `Unicode compliance'.
 %
 % In stark contrast to 8-bit character sets, with 16 or more bits it can
 % easily be very inefficient to support the full range.\footnote{In
 %   fact, \LaTeX's current 8-bit support does not go so far as to make
 %   all 8-bit characters into valid input.}  Moreover, useful support of
 % character input by a typesetting system overwhelmingly means finding
 % an acceptable visual representation of a sequence of characters and
 % this, for \LaTeX{}, means having available a suitably encoded 8-bit
 % font.
 %
 % Unfortunately it is not possible to predict exactly what valid UTF-8
 % octet sequences will appear in a particular file so it is best to
 % make all the unsupported but valid sequences produce a reasonably
 % clear and noticeable error message.
 %
 % There are two directions from which to approach the question of what
 % to load.  One is to specify the ranges of Unicode characters that will
 % result in some sensible typesetting; this requires the provider to
 % ensure that suitable fonts are loaded and that these input characters
 % generate the correct typesetting via the encodings of those fonts.  The
 % other is to inspect the font encodings to be used and use these to
 % define which input Unicode characters should be supported.
 %
 % For Western European languages, at least, going in either direction
 % leads to many straightforward decisions and a few that are more
 % subjective.  In both cases some of the specifications are \TeX{}
 % specific whilst most are independent of the particular typesetting
 % software in use.
 %
 % As we have argued elsewhere, \LaTeX{} needs to refer to characters via
 % `seven-bit-text' names and, so far, these have been chosen by
 % reference to historical sources such as Plain \TeX{} or Adobe encoding
 % descriptions.  It is unclear whether this ad hoc naming structure should
 % simply be extended or whether it would be useful to
 % supplement it with standardised internal Unicode character names such as
 % one or more of the following:\footnote{Burkhard and Holger Mittelbach
 %   are playing with me!  They have written something here.}
 %
 % \begin{verbatim}
 %   \ltxutwochar <4 hex digits>
 %
 %   \ltxuchar {<hex digits>}
 %     B H U R R R
 %
 %   \ltxueightchartwo   <2 utf8 octets as 8-bit char tokens>
 %   \ltxueightcharthree <3 utf8 octets ...>
 %   \ltxueightcharfour  <4 utf8 octets ...>
 % \end{verbatim}
 %
 %
 % \subsection{More specific stuff}
 %
 % In addition to setting up the mechanism for reading UTF-8 characters
 % and specifying the \LaTeX-level support available, this package
 % contains support for some default historically expected \TeX-related
 % characters and some example `Unicode definition files' for standard
 % font encodings.
 %
 %
 % \subsection{Notes}
 %
 % Unicode combining characters are not supported.
 %
 % No attempt is made to be useful beyond Latin, and maybe Cyrillic, for
 % European languages (as of now).
 %
 %
 % \subsection{Basic operation of the code}
 %
 % The \texttt{inputenc} package makes the upper 8-bit characters active and
 %    assigns to all of them an error message. It then waits for the
 %    input encoding files to change this set-up.  Similarly, whenever
 %    |\inputencoding| is encountered in a document, first the upper
 %    8-bit characters are set back to produce an error and then the definitions
 %    for the new input encoding are loaded, changing some of the
 %    previous settings.
 %
 %    The 8-bit input encodings currently supported by \texttt{inputenc}
 %    all use |\DeclareInputText| and the like to map an
 %    8-bit number to some \LaTeX{} internal form, e.g.~to |\"a|.
 %
 %    The situation when supporting UTF-8 as the input encoding is
 %    different, however. Here we only have to set up the actions of
 %    those 8-bit numbers that can be the first octet in a UTF-8
 %    representation of a Unicode character.  But we cannot simply set
 %    this to some internal \LaTeX{} form since the Unicode character
 %    consists of more than one octet; instead we have to define this
 %    starting octet to parse the right number of further octets that
 %    together form the UTF-8 representation of some Unicode character.
 %
 %    Therefore when switching to \texttt{utf8} within the
 %    \texttt{inputenc} framework the characters with numbers (hex)
 %    from \texttt{"C2} to \texttt{"DF} are defined to parse for a
 %    second octet following, the characters from \texttt{"E0} to
 %    \texttt{"EF} are defined to parse for two more octets and finally
 %    the characters from \texttt{"F0} to \texttt{"F3} are defined to
 %    parse for three additional octets.
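 %
 %    As a sketch of this classification (using the ranges set up by
 %    the loop code later in this file and the standard UTF-8 bit
 %    layout; the table itself is our own illustration):
 % \begin{verbatim}
 %   starting octet   octets read   bit layout (x = payload bit)
 %   "C2 ... "DF      1 more        110xxxxx 10xxxxxx
 %   "E0 ... "EF      2 more        1110xxxx 10xxxxxx 10xxxxxx
 %   "F0 ... "F3      3 more        11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
 % \end{verbatim}
 %    For instance U+00E4 is the octet pair \texttt{"C3} \texttt{"A4},
 %    so \texttt{"C3} has to pick up exactly one further octet.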
 %
 %    Thus when such a character is encountered in the document (so
 %    long as expansion is not prohibited) a defined number of
 %    additional octets (8-bit characters) are read and from them a
 %    unique control sequence name is immediately constructed.
 %
 %    This control sequence is either defined (good) or undefined
 %    (likely); in the latter case the user gets an error message
 %    saying that this UTF-8 sequence (or, better, Unicode character)
 %    is not supported.
 %
 %    If it is defined then the definition will expand to a \LaTeX{}
 %    internal form: e.g.~for <fill in example> we get |\"a| as the
 %    internal form which then, depending on the font encoding,
 %    eventually resolves to the single glyph `latin-a-umlaut' or to
 %    the composite glyph `latin-a with an umlaut accent'.
 %
 %    These mappings from (UTF-8 encoded) Unicode characters to \LaTeX{}
 %    internal forms are made indirectly.  The code below provides a
 %    declaration |\DeclareUnicodeCharacter| which maps Unicode numbers
 %    (as hexadecimal) to \LaTeX{} internal forms.
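 %
 %    A minimal sketch of such declarations, taken from the |T1| table
 %    later in this file (first argument: Unicode number in hex, second
 %    argument: \LaTeX{} internal form):
 % \begin{verbatim}
 %   \DeclareUnicodeCharacter{00E4}{\"a}          % a-umlaut
 %   \DeclareUnicodeCharacter{00DF}{\ss }         % sharp s
 %   \DeclareUnicodeCharacter{2013}{\textendash } % en dash
 % \end{verbatim}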
 %
 %    This mapping needs to be set up only once so here it is done at
 %    |\begin{document}| by looking at the list of font encodings that
 %    are loaded by the document and providing mappings related to
 %    those font encodings whenever these are available. Thus at most
 %    those Unicode characters that can be represented by the glyphs
 %    available in these encodings will be defined.
 %
 %    Technically this is done by loading one file per encoding,
 %    if available, that is supposed to provide the necessary mapping
 %    information.
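 %
 %    Seen from the user's side this might look as follows; this is
 %    only a sketch and assumes that \texttt{utf8.def} and the
 %    \texttt{.dfu} files generated from this source are installed:
 % \begin{verbatim}
 %   \documentclass{article}
 %   \usepackage[T1]{fontenc}    % T1 is then listed in \cdp@list
 %   \usepackage[utf8]{inputenc} % reads utf8.def
 %   \begin{document}            % t1enc.dfu etc. are looked up here
 %   ... text typed as UTF-8 octets ...
 %   \end{document}
 % \end{verbatim}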
 %
 %
 % \StopEventually{}
 %
 %
 %
 %
 % \section{Coding}
 %
 % \subsection{Housekeeping}
 %
 %    The usual introductory bits and pieces:
 %
 %    \begin{macrocode}
 %<utf8>\ProvidesFile{utf8.def}
 %<t1>\ProvidesFile{t1enc.dfu}
 %<ot1>\ProvidesFile{ot1enc.dfu}
 %<oms>\ProvidesFile{omsenc.dfu}
 %<test>\ProvidesFile{utf8-test.tex}
    [2003/01/07 v1.0a UTF-8 support for inputenc]
 %    \end{macrocode}
 %
 %    \begin{macrocode}
 %<*utf8>
 \makeatletter
 %    \end{macrocode}
 %
 %
 %
 % \subsection{Parsing UTF-8 input}
 %
 % \begin{macro}{\UTFviii@two@octets}
 % \begin{macro}{\UTFviii@three@octets}
 % \begin{macro}{\UTFviii@four@octets}
 %    A UTF-8 char (that is not actually a 7-bit char, i.e.~a
 %    single octet) is parsed as
 %    follows: each starting octet is an active \TeX{} character token;
 %    each of these is defined below to be a macro with one to
 %    three arguments nominally (depending on the starting octet). It
 %    calls one of |\UTFviii@two@octets|, |\UTFviii@three@octets|, or
 %    |\UTFviii@four@octets| which
 %    then actually picks up the argument(s).
 %
 %    From the arguments a control sequence with a name of the form
 %    \verb=u8:#1#2..= is constructed where the |#i| ($i>1$) are the
 %    arguments and |#1| is the starting octet (as a \TeX{} character
 %    token).  Since some or even all of these characters are active
 %    (when inputenc is loaded) we need to use |\string| when building
 %    the csname.
 %
 %    The csname thus constructed can of course be undefined but to
 %    avoid producing an unhelpful low-level undefined command error we
 %    pass it to |\UTFviii@defined| which is responsible for producing
 %    a more sensible error message (not yet done!!).  If, however, it is
 %    defined we simply execute the thing (which should then expand to
 %    an encoding specific internal \LaTeX{} form).
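 %
 %    As an illustration (our own trace, writing \texttt{<C3>} for the
 %    character with number \texttt{"C3}), the input octets
 %    \texttt{<C3><A4>} are processed roughly as follows:
 % \begin{verbatim}
 %   <C3><A4>
 %   -> \UTFviii@two@octets <C3><A4>      (<C3> is active and expands)
 %   -> \expandafter\UTFviii@defined
 %        \csname u8:<C3>\string<A4>\endcsname
 %   -> \UTFviii@defined \u8:<C3><A4>     (a single control sequence)
 % \end{verbatim}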
 %    \begin{macrocode}
 \def\UTFviii@two@octets#1#2{\expandafter
     \UTFviii@defined\csname u8:#1\string#2\endcsname}
 %    \end{macrocode}
 % \end{macro}
 %
 %    \begin{macrocode}
 \def\UTFviii@three@octets#1#2#3{\expandafter
     \UTFviii@defined\csname u8:#1\string#2\string#3\endcsname}
 %    \end{macrocode}
 % \end{macro}
 %
 %    \begin{macrocode}
 \def\UTFviii@four@octets#1#2#3#4{\expandafter
     \UTFviii@defined\csname u8:#1\string#2\string#3\string#4\endcsname}
 %    \end{macrocode}
 % \end{macro}
 %
 % \begin{macro}{\UTFviii@defined}
 %    This tests whether its argument is different from |\relax|: it
 %    either calls for a sensible error message (not done), or it gets
 %    the |\fi| out of the way (in case the command has arguments) and
 %    executes it.
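 %
 %    One possible shape for the missing error message is sketched
 %    here.  The macro name |\UTFviii@undefined@err| is hypothetical
 %    and not part of this file; only |\PackageError| and |\@eha| are
 %    existing \LaTeX{} kernel commands:
 % \begin{verbatim}
 %   % hypothetical stand-in for \ERRORnotDEFINED
 %   \def\UTFviii@undefined@err#1{%
 %     \PackageError{inputenc}%
 %       {Unicode char \string#1\space not set up for use with LaTeX}%
 %       {\@eha}}
 % \end{verbatim}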
 %    \begin{macrocode}
 \def\UTFviii@defined#1{%
   \ifx#1\relax
     \ERRORnotDEFINED#1%
   \else\expandafter
     #1%
   \fi
 }
 %    \end{macrocode}
 % \end{macro}
 %
 % \begin{macro}{\UTFviii@loop}
 %    This wonderful bit of code from Dr Carlisle defines the starting
 %    octets to call |\UTFviii@two@octets| etc as appropriate. The starting
 %    octet itself is passed directly as the first argument, the others
 %    are picked up later en route.
 %
 %    The |\UTFviii@loop| loops through the numbers starting at
 %    |\count@| and ending at |\@tempcnta|${}-1$, each time executing
 %    the code in |\UTFviii@tmp|.
 %
 %    All this is done in a group so that temporary catcode changes
 %    etc.~vanish after everything is set up.
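 %
 %    To illustrate the trick (our own reading of the code that
 %    follows): on the first pass of the two-octet set-up |\count@|
 %    is \texttt{"C2}, so the loop body behaves like
 % \begin{verbatim}
 %   \uccode`\~="C2
 %   \uppercase{\xdef~{\noexpand\UTFviii@two@octets\string~}}
 %   % which, writing <C2> for the active character "C2, amounts to
 %   %   \xdef<C2>{\noexpand\UTFviii@two@octets\string<C2>}
 %   % i.e. <C2> now expands to \UTFviii@two@octets followed by
 %   % the (non-active) character <C2>.
 % \end{verbatim}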
 %    \begin{macrocode}
 \begingroup
 \catcode`\~13
 \catcode`\"12
 %    \end{macrocode}
 %
 %    \begin{macrocode}
 \def\UTFviii@loop{%
   \uccode`\~\count@
   \uppercase\expandafter{\UTFviii@tmp}%
   \advance\count@\@ne
   \ifnum\count@<\@tempcnta
   \expandafter\UTFviii@loop
   \fi}
 %    \end{macrocode}
 %
 %    Setting up 2-byte UTF-8:
 %    \begin{macrocode}
     \count@"C2
     \@tempcnta"E0
     \def\UTFviii@tmp{\xdef~{\noexpand\UTFviii@two@octets\string~}}
 \UTFviii@loop
 %    \end{macrocode}
 %    Setting up 3-byte UTF-8:
 %    \begin{macrocode}
     \count@"E0
     \@tempcnta"F0
     \def\UTFviii@tmp{\xdef~{\noexpand\UTFviii@three@octets\string~}}
 \UTFviii@loop
 %    \end{macrocode}
 %
 %    Setting up 4-byte UTF-8:
 %    \begin{macrocode}
     \count@"F0
     \@tempcnta"F4
     \def\UTFviii@tmp{\xdef~{\noexpand\UTFviii@four@octets\string~}}
 \UTFviii@loop
 \endgroup
 %    \end{macrocode}
 % \end{macro}
 %
 %    For this case we must disable the warning generated by
 %    \texttt{inputenc} if it doesn't see any new |\DeclareInputText|
 %    commands.
 %    \begin{macrocode}
 \@inpenc@test
 %    \end{macrocode}
 %
 %
 %    If this file (\texttt{utf8.def}) is not being read while setting
 %    up \texttt{inputenc}, i.e.~in the preamble, but when
 %    |\inputencoding| is called somewhere within the document, we do not
 %    need to input the specific Unicode mappings again. We therefore
 %    stop reading the file at this point.
 %    \begin{macrocode}
 \ifx\@begindocumenthook\@undefined
   \makeatother
 %    \end{macrocode}
 %    The |\fi| must be on the same line as |\endinput| or else it will
 %    never be seen!
 %    \begin{macrocode}
   \endinput \fi
 %    \end{macrocode}
 %
 %
 % \subsection{Mapping Unicode codes to \LaTeX{} internal forms}
 %
 %
 % \begin{macro}{\DeclareUnicodeCharacter}
 %    The |\DeclareUnicodeCharacter| declaration defines a mapping from
 %    a Unicode character code point to a \LaTeX{} internal form. The first
 %    argument is the Unicode number as hexadecimal digits and the second is
 %    the actual \LaTeX{} internal form.
 %
 %    We start by making sure that some characters have the right
 %    |\catcode| when they are used in the definitions below.
 %    \begin{macrocode}
 \begingroup
 \catcode`\"=12
 \catcode`\<=12
 \catcode`\.=12
 \catcode`\,=12
 \catcode`\;=12
 \catcode`\!=12
 \catcode`\~=13
 %    \end{macrocode}
 %
 %    \begin{macrocode}
 \gdef\DeclareUnicodeCharacter#1#2{%
    \count@"#1\relax
    \typeout{ \space\space defining Unicode char #1 (decimal \the\count@)}%
    \begingroup
 %    \end{macrocode}
 %    Next we do the parsing of the number stored in |\count@| and assign the
 %    result to |\UTFviii@tmp|. Actually all this could be done in-line,
 %    the macro |\parse@XML@charref| is only there to extend this code
 %    to parsing Unicode numbers in other contexts one day (perhaps).
 %    \begin{macrocode}
     \parse@XML@charref
 %    \end{macrocode}
 %    Here is an example of what is happening, for 00A3 (decimal 163, the
 %    Unicode number for \textsterling{}). After
 %    |\parse@XML@charref| we have, stored in |\UTFviii@tmp|, the sequence:
 %    \begin{quote}
 %      |\UTFviii@two@octets| \texttt{\^A\textsterling}
 %    \end{quote}
 %    what we actually need to produce is a definition of the form
 %    \begin{quote}
 %      |\def\u8:|\texttt{\^A\textsterling} |{|\textit{\LaTeX{} internal form}|}|\,.
 %    \end{quote}
 %    So first we use the prefix commands |\UTFviii@two@octets|, etc.~to
 %    generate the csname that we wish to define \ldots
 %    \begin{macrocode}
     \def\UTFviii@two@octets##1##2{\csname u8:##1\string##2\endcsname}%
     \def\UTFviii@three@octets##1##2##3{\csname u8:##1%
                                      \string##2\string##3\endcsname}%
     \def\UTFviii@four@octets##1##2##3##4{\csname u8:##1%
                            \string##2\string##3\string##4\endcsname}%
 %    \end{macrocode}
 %    \ldots and then we need to use the right number of |\expandafter|s to
 %    finally make the definition: expanding |\UTFviii@tmp| once to get
 %    its contents, a second time to replace the prefix command by its
 %    |\csname| expansion, and a third time to turn the expansion into
 %    a csname after which the |\gdef| finally gets applied.
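 %
 %    Spelled out for the pound sterling example, i.e.~U+00A3 with the
 %    two octets \texttt{<C2><A3>} (an illustrative trace of our own):
 % \begin{verbatim}
 %   \UTFviii@tmp
 %   -> \UTFviii@two@octets <C2><A3>          (first expansion)
 %   -> \csname u8:<C2>\string<A3>\endcsname  (second expansion)
 %   -> \u8:<C2><A3>                          (third expansion)
 %   so that finally  \gdef\u8:<C2><A3>{#2}  is carried out;
 %   three expansions ahead of \gdef need 2^3-1 = 7 \expandafter's.
 % \end{verbatim}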
 %    \begin{macrocode}
     \expandafter\expandafter\expandafter
     \expandafter\expandafter\expandafter
     \expandafter
      \gdef\UTFviii@tmp{#2}%
    \endgroup
 }
 %    \end{macrocode}
 % \end{macro}
 %
 %
 % \begin{macro}{\parse@XML@charref}
 %    This macro parses a Unicode number (decimal) and returns its
 %    UTF-8 representation as a sequence of \TeX{} char tokens. In the
 %    original code it had two arguments delimited by \texttt{;}; here,
 %    however, we supply the Unicode number implicitly.
 %    \begin{macrocode}
 \gdef\parse@XML@charref{%
 %    \end{macrocode}
 %    We need to keep a few things local, mainly the |\uccode|'s that
 %    are set up below. However, the group originally used here is
 %    actually unnecessary since we call this macro only within another
 %    group; but it will be important to restore the group if this
 %    macro gets used for other purposes.
 %    \begin{macrocode}
 %  \begingroup
 %    \end{macrocode}
 %    The original code from David supported the convention that a
 %    Unicode slot number could be given either as a decimal or as a
 %    hexadecimal (by starting with \texttt{x}).  We do not do this so
 %    this code is also removed.  This could be reactivated if one
 %    wants to support document commands that accept Unicode numbers
 %    (but then the first case needs to be changed from an error
 %    message back to something more useful again).
 %    \begin{macrocode}
 %  \uppercase{\count@\if x\noexpand#1"\else#1\fi#2}\relax
 %    \end{macrocode}
 %    As |\count@| already contains the right value we make
 %    |\parse@XML@charref| work without arguments.
 %    \begin{macrocode}
   \ifnum\count@<"A0\relax
     \ERROR-WE-DONT-DEAL-WITH-THAT
 %    \end{macrocode}
 %    Do not ask us to provide an explanation for the code below; it is
 %    borrowed straight from \texttt{xmltex} by David and we trust him
 %    totally (and we are too lazy to reread the Unicode book to see if
 %    this is the correct algorithm).\footnote{We were hoping to also
 %    find in his work the \TeX{} code for going the other way: from
 %    UTF-8 octets to Unicode slot number, but no luck!}
 %    \begin{macrocode}
   \else\ifnum\count@<"800\relax
      \parse@UTFviii@a,%
      \parse@UTFviii@b C\UTFviii@two@octets.,%
   \else\ifnum\count@<"10000\relax
      \parse@UTFviii@a;%
      \parse@UTFviii@a,%
      \parse@UTFviii@b E\UTFviii@three@octets.{,;}%
    \else
      \parse@UTFviii@a;%
      \parse@UTFviii@a,%
      \parse@UTFviii@a!%
      \parse@UTFviii@b F\UTFviii@four@octets.{!,;}%
     \fi
     \fi
   \fi
 %  \endgroup
 }
 %    \end{macrocode}
 % \end{macro}
 %
 % \begin{macro}{\parse@UTFviii@a}
 %    \ldots so somebody else can document this part :-) \ldots~David?:-))))!
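 %
 %    In the meantime here is our reading of it, offered as a worked
 %    example rather than as authoritative documentation.  Each call
 %    splits off the lowest six bits of |\count@|, adds \texttt{"80}
 %    to form a continuation octet, stores that as the |\uccode| of
 %    the argument character, and leaves the remaining high bits in
 %    |\count@|; |\parse@UTFviii@b| then adds the prefix of the
 %    starting octet.  For U+20AC (decimal 8364):
 % \begin{verbatim}
 %   \parse@UTFviii@a;   8364 = 130*64 + 44   44 + "80 = "AC -> \uccode`;
 %   \parse@UTFviii@a,    130 =   2*64 +  2    2 + "80 = "82 -> \uccode`,
 %   \parse@UTFviii@b E     2 + "E0          = "E2          -> \uccode`.
 %   result: the three octets "E2 "82 "AC
 % \end{verbatim}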
 %    \begin{macrocode}
 \gdef\parse@UTFviii@a#1{%
      \@tempcnta\count@
      \divide\count@64
      \@tempcntb\count@
      \multiply\count@64
      \advance\@tempcnta-\count@
      \advance\@tempcnta"80
      \uccode`#1\@tempcnta
      \count@\@tempcntb}
 %    \end{macrocode}
 % \end{macro}
 %
 % \begin{macro}{\parse@UTFviii@b}
 %    \ldots same here
 %    \begin{macrocode}
 \gdef\parse@UTFviii@b#1#2#3#4{%
      \advance\count@"#10\relax
      \uccode`#3\count@
      \uppercase{\gdef\UTFviii@tmp{#2#3#4}}}
 %    \end{macrocode}
 %
 %    \begin{macrocode}
 \endgroup
 %    \end{macrocode}
 % \end{macro}
 %
 %    \begin{macrocode}
 \@onlypreamble\DeclareUnicodeCharacter
 %    \end{macrocode}
 %    These are preamble only as long as we don't support Unicode
 %    charrefs in documents.
 %    \begin{macrocode}
 \@onlypreamble\parse@XML@charref
 \@onlypreamble\parse@UTFviii@a
 \@onlypreamble\parse@UTFviii@b
 %    \end{macrocode}
 %
 %
 % \subsection{Loading Unicode mappings at begin document}
 %
 %    At the beginning of the document we loop through all defined encodings
 %    (stored in |\cdp@list|) and for each load a file
 %    \textit{name}\texttt{enc.dfu} if it exists. That file is then
 %    supposed to contain |\DeclareUnicodeCharacter| declarations.
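 %
 %    With the files generated by \texttt{utf8ienc.ins} the lookup
 %    would go roughly like this (a sketch; which encodings are in
 %    |\cdp@list| depends on the document):
 % \begin{verbatim}
 %   font encoding    file tried     generated by utf8ienc.ins?
 %   OT1              ot1enc.dfu     yes
 %   T1               t1enc.dfu      yes
 %   OMS              omsenc.dfu     yes
 %   OML              omlenc.dfu     no (reported as missing)
 % \end{verbatim}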
 %    \begin{macrocode}
 \AtBeginDocument{%
  \begingroup
  \def\cdp@elt#1#2#3#4{%
   \typeout{Now handling font encoding #1 ...}%
   \lowercase{%
     \InputIfFileExists{#1enc.dfu}}
           {\typeout{... processing Unicode mapping file for font encoding #1}}%
           {\typeout{... no Unicode mapping file for font encoding #1}}%
  }\cdp@list
  \endgroup}
 \makeatother
 %</utf8>
 %    \end{macrocode}
 %
 %
 %
 % \section{Mapping characters that exist in font encodings}
 %
 % This section is a first attempt to provide Unicode definitions for
 %    characters whose glyphs are currently provided by the standard \LaTeX{}
 %    font-encodings |T1|, |OT1|, etc. They are by no means complete
 %    and need checking.
 %
 % For example, one should check the already existing input encodings
 %    for glyphs that may in fact be available and required,
 %    e.g.~\texttt{latin4} has a number of glyphs with the |\=|
 %    accent. Since the |T1| encoding does not provide such glyphs,
 %    these characters are not listed below (yet).
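 %
 %    Should such glyphs become available, entries along the following
 %    lines could be added to the mapping file of a font encoding that
 %    can render them (a sketch only; these lines are not part of the
 %    tables below):
 % \begin{verbatim}
 %   \DeclareUnicodeCharacter{0100}{\=A}  % A with macron
 %   \DeclareUnicodeCharacter{0101}{\=a}  % a with macron
 %   \DeclareUnicodeCharacter{0112}{\=E}  % E with macron
 %   \DeclareUnicodeCharacter{0113}{\=e}  % e with macron
 % \end{verbatim}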
 %
 % The list below was generated by looking at the current \LaTeX{} font
 %    encoding files, e.g., \texttt{t1enc.def} and using the work by
 %    Sebastian Rahtz in (\texttt{ucharacters.sty}), with a few
 %    modifications.
 %
 % \subsection{Mappings for T1 glyphs}
 %
 %    \begin{macrocode}
 %<*t1>
 \DeclareUnicodeCharacter{00A1}{\textexclamdown }
 \DeclareUnicodeCharacter{00A3}{\textsterling}
 \DeclareUnicodeCharacter{00AB}{\guillemotleft}
 \DeclareUnicodeCharacter{00BB}{\guillemotright}
 \DeclareUnicodeCharacter{00BF}{\textquestiondown }
 \DeclareUnicodeCharacter{00C0}{\@tabacckludge`A}
 \DeclareUnicodeCharacter{00C1}{\@tabacckludge'A}
 \DeclareUnicodeCharacter{00C2}{\^A}
 \DeclareUnicodeCharacter{00C3}{\~A}
 \DeclareUnicodeCharacter{00C4}{\"A}
 \DeclareUnicodeCharacter{00C5}{\r A}
 \DeclareUnicodeCharacter{00C6}{\AE }
 \DeclareUnicodeCharacter{00C7}{\c C}
 \DeclareUnicodeCharacter{00C8}{\@tabacckludge`E}
 \DeclareUnicodeCharacter{00C9}{\@tabacckludge'E}
 \DeclareUnicodeCharacter{00CA}{\^E}
 \DeclareUnicodeCharacter{00CB}{\"E}
 \DeclareUnicodeCharacter{00CC}{\@tabacckludge`I}
 \DeclareUnicodeCharacter{00CD}{\@tabacckludge'I}
 \DeclareUnicodeCharacter{00CE}{\^I}
 \DeclareUnicodeCharacter{00CF}{\"I}
 \DeclareUnicodeCharacter{00D0}{\DH }
 \DeclareUnicodeCharacter{00D1}{\~N}
 \DeclareUnicodeCharacter{00D2}{\@tabacckludge`O}
 \DeclareUnicodeCharacter{00D3}{\@tabacckludge'O}
 \DeclareUnicodeCharacter{00D4}{\^O}
 \DeclareUnicodeCharacter{00D5}{\~O}
 \DeclareUnicodeCharacter{00D6}{\"O}
 \DeclareUnicodeCharacter{00D8}{\O }
 \DeclareUnicodeCharacter{00D9}{\@tabacckludge`U}
 \DeclareUnicodeCharacter{00DA}{\@tabacckludge'U}
 \DeclareUnicodeCharacter{00DB}{\^U}
 \DeclareUnicodeCharacter{00DC}{\"U}
 \DeclareUnicodeCharacter{00DD}{\@tabacckludge'Y}
 \DeclareUnicodeCharacter{00DE}{\TH }
 \DeclareUnicodeCharacter{00DF}{\ss }
 \DeclareUnicodeCharacter{00E0}{\@tabacckludge`a}
 \DeclareUnicodeCharacter{00E1}{\@tabacckludge'a}
 \DeclareUnicodeCharacter{00E2}{\^a}
 \DeclareUnicodeCharacter{00E3}{\~a}
 \DeclareUnicodeCharacter{00E4}{\"a}
 \DeclareUnicodeCharacter{00E5}{\r a}
 \DeclareUnicodeCharacter{00E6}{\ae }
 \DeclareUnicodeCharacter{00E7}{\c c}
 \DeclareUnicodeCharacter{00E8}{\@tabacckludge`e}
 \DeclareUnicodeCharacter{00E9}{\@tabacckludge'e}
 \DeclareUnicodeCharacter{00EA}{\^e}
 \DeclareUnicodeCharacter{00EB}{\"e}
 \DeclareUnicodeCharacter{00EC}{\@tabacckludge`\i}
 \DeclareUnicodeCharacter{00EC}{\@tabacckludge`i}
 \DeclareUnicodeCharacter{00ED}{\@tabacckludge'\i}
 \DeclareUnicodeCharacter{00ED}{\@tabacckludge'i}
 \DeclareUnicodeCharacter{00EE}{\^\i}
 \DeclareUnicodeCharacter{00EE}{\^i}
 \DeclareUnicodeCharacter{00EF}{\"\i}
 \DeclareUnicodeCharacter{00EF}{\"i}
 \DeclareUnicodeCharacter{00F0}{\dh }
 \DeclareUnicodeCharacter{00F1}{\~n}
 \DeclareUnicodeCharacter{00F2}{\@tabacckludge`o}
 \DeclareUnicodeCharacter{00F3}{\@tabacckludge'o}
 \DeclareUnicodeCharacter{00F4}{\^o}
 \DeclareUnicodeCharacter{00F5}{\~o}
 \DeclareUnicodeCharacter{00F6}{\"o}
 \DeclareUnicodeCharacter{00F8}{\o }
 \DeclareUnicodeCharacter{00F9}{\@tabacckludge`u}
 \DeclareUnicodeCharacter{00FA}{\@tabacckludge'u}
 \DeclareUnicodeCharacter{00FB}{\^u}
 \DeclareUnicodeCharacter{00FC}{\"u}
 \DeclareUnicodeCharacter{00FD}{\@tabacckludge'y}
 \DeclareUnicodeCharacter{00FE}{\th }
 \DeclareUnicodeCharacter{00FF}{\"y}
 \DeclareUnicodeCharacter{0102}{\u A}
 \DeclareUnicodeCharacter{0103}{\u a}
 \DeclareUnicodeCharacter{0104}{\k A}
 \DeclareUnicodeCharacter{0105}{\k a}
 \DeclareUnicodeCharacter{0106}{\@tabacckludge'C}
 \DeclareUnicodeCharacter{0107}{\@tabacckludge'c}
 \DeclareUnicodeCharacter{010C}{\v C}
 \DeclareUnicodeCharacter{010D}{\v c}
 \DeclareUnicodeCharacter{010E}{\v D}
 \DeclareUnicodeCharacter{010F}{\v d}
 \DeclareUnicodeCharacter{0110}{\DJ }
 \DeclareUnicodeCharacter{0111}{\dj }
 \DeclareUnicodeCharacter{0118}{\k E}
 \DeclareUnicodeCharacter{0119}{\k e}
 \DeclareUnicodeCharacter{011A}{\v E}
 \DeclareUnicodeCharacter{011B}{\v e}
 \DeclareUnicodeCharacter{011E}{\u G}
 \DeclareUnicodeCharacter{011F}{\u g}
 \DeclareUnicodeCharacter{0130}{\.I}
 \DeclareUnicodeCharacter{0131}{\i}
 \DeclareUnicodeCharacter{0139}{\@tabacckludge'L}
 \DeclareUnicodeCharacter{013A}{\@tabacckludge'l}
 \DeclareUnicodeCharacter{013D}{\v L}
 \DeclareUnicodeCharacter{013E}{\v l}
 \DeclareUnicodeCharacter{0141}{\L }
 \DeclareUnicodeCharacter{0142}{\l }
 \DeclareUnicodeCharacter{0143}{\@tabacckludge'N}
 \DeclareUnicodeCharacter{0144}{\@tabacckludge'n}
 \DeclareUnicodeCharacter{0147}{\v N}
 \DeclareUnicodeCharacter{0148}{\v n}
 \DeclareUnicodeCharacter{014A}{\NG }
 \DeclareUnicodeCharacter{014B}{\ng }
 \DeclareUnicodeCharacter{0150}{\H O}
 \DeclareUnicodeCharacter{0151}{\H o}
 \DeclareUnicodeCharacter{0152}{\OE }
 \DeclareUnicodeCharacter{0153}{\oe }
 \DeclareUnicodeCharacter{0154}{\@tabacckludge'R}
 \DeclareUnicodeCharacter{0155}{\@tabacckludge'r}
 \DeclareUnicodeCharacter{0158}{\v R}
 \DeclareUnicodeCharacter{0159}{\v r}
 \DeclareUnicodeCharacter{015A}{\@tabacckludge'S}
 \DeclareUnicodeCharacter{015B}{\@tabacckludge's}
 \DeclareUnicodeCharacter{015E}{\c S}
 \DeclareUnicodeCharacter{015F}{\c s}
 \DeclareUnicodeCharacter{0160}{\v S}
 \DeclareUnicodeCharacter{0161}{\v s}
 \DeclareUnicodeCharacter{0162}{\c T}
 \DeclareUnicodeCharacter{0163}{\c t}
 \DeclareUnicodeCharacter{0164}{\v T}
 \DeclareUnicodeCharacter{0165}{\v t}
 \DeclareUnicodeCharacter{016E}{\r U}
 \DeclareUnicodeCharacter{016F}{\r u}
 \DeclareUnicodeCharacter{0170}{\H U}
 \DeclareUnicodeCharacter{0171}{\H u}
 \DeclareUnicodeCharacter{0178}{\"Y}
 \DeclareUnicodeCharacter{0179}{\@tabacckludge'Z}
 \DeclareUnicodeCharacter{017A}{\@tabacckludge'z}
 \DeclareUnicodeCharacter{017B}{\.Z}
 \DeclareUnicodeCharacter{017C}{\.z}
 \DeclareUnicodeCharacter{017D}{\v Z}
 \DeclareUnicodeCharacter{017E}{\v z}
 \DeclareUnicodeCharacter{2013}{\textendash }
 \DeclareUnicodeCharacter{2014}{\textemdash}
 \DeclareUnicodeCharacter{2018}{\textquoteleft}
 \DeclareUnicodeCharacter{2019}{\textquoteright}
 \DeclareUnicodeCharacter{201C}{\textquotedblleft }
 \DeclareUnicodeCharacter{201D}{\textquotedblright }
 \DeclareUnicodeCharacter{2030}{\textperthousand }
 \DeclareUnicodeCharacter{2031}{\textpertenthousand }
 \DeclareUnicodeCharacter{2039}{\guilsinglleft }
 \DeclareUnicodeCharacter{203A}{\guilsinglright }
 \DeclareUnicodeCharacter{2423}{\textvisiblespace }
 \DeclareUnicodeCharacter{201A}{\quotesinglbase}
 \DeclareUnicodeCharacter{201E}{\quotedblbase}
 %</t1>
 %    \end{macrocode}
 %    The following definitions are in the encoding file but have no
 %    direct equivalent in Unicode or simply do not make sense in that
 %    context (or I couldn't find anything or \ldots :-).
 %\begin{verbatim}
 %\DeclareTextSymbol{\j}{OT1}{17}
 %\DeclareTextSymbol{\SS}{T1}{223}
 %\DeclareTextSymbol{\textcompwordmark}{T1}{23}
 %
 %\DeclareTextAccent{\"}{OT1}{127}
 %\DeclareTextAccent{\'}{OT1}{19}
 %\DeclareTextAccent{\.}{OT1}{95}
 %\DeclareTextAccent{\=}{OT1}{22}
 %\DeclareTextAccent{\H}{OT1}{125}
 %\DeclareTextAccent{\^}{OT1}{94}
 %\DeclareTextAccent{\`}{OT1}{18}
 %\DeclareTextAccent{\r}{OT1}{23}
 %\DeclareTextAccent{\u}{OT1}{21}
 %\DeclareTextAccent{\v}{OT1}{20}
 %\DeclareTextAccent{\~}{OT1}{126}
 %\DeclareTextCommand{\b}{OT1}[1]
 %\DeclareTextCommand{\c}{OT1}[1]
 %\DeclareTextCommand{\d}{OT1}[1]
 %\DeclareTextCommand{\k}{T1}[1]
 %\end{verbatim}
 %
 %
 %
 % \subsection{Mappings for OT1 glyphs}
 %
 %    This is even more incomplete as again it covers only the single
 %    glyphs from |OT1| plus some that have been explicitly defined for
 %    this encoding. Everything that is provided in |T1|, and that
 %    could be provided as composite glyphs via |OT1|, could and
 %    probably should be set up as well.  Which leaves the many things
 %    that are not provided in |T1| but can be provided in |OT1| (and
 %    in |T1|) by composite glyphs.
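 %
 %    For instance, since |OT1| declares the |\"| and |\^| accents,
 %    entries such as the following (copied from the |T1| table above
 %    and not yet part of the |OT1| table below) would be rendered
 %    via composite glyphs:
 % \begin{verbatim}
 %   \DeclareUnicodeCharacter{00C4}{\"A}
 %   \DeclareUnicodeCharacter{00E4}{\"a}
 %   \DeclareUnicodeCharacter{00EA}{\^e}
 % \end{verbatim}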
 %
 %    \begin{macrocode}
 %<*ot1>
 \DeclareUnicodeCharacter{00A1}{\textexclamdown }
 \DeclareUnicodeCharacter{00A3}{\textsterling}
 \DeclareUnicodeCharacter{00BF}{\textquestiondown }
 \DeclareUnicodeCharacter{00C5}{\r A}
 \DeclareUnicodeCharacter{00C6}{\AE }
 \DeclareUnicodeCharacter{00D8}{\O }
 \DeclareUnicodeCharacter{00DF}{\ss }
 \DeclareUnicodeCharacter{00E6}{\ae }
 \DeclareUnicodeCharacter{00EC}{\@tabacckludge`i}
 \DeclareUnicodeCharacter{00ED}{\@tabacckludge'i}
 \DeclareUnicodeCharacter{00EE}{\^i}
 \DeclareUnicodeCharacter{00EF}{\"i}
 \DeclareUnicodeCharacter{00F8}{\o }
 \DeclareUnicodeCharacter{0131}{\i}
 \DeclareUnicodeCharacter{0141}{\L }
 \DeclareUnicodeCharacter{0142}{\l }
 \DeclareUnicodeCharacter{0152}{\OE }
 \DeclareUnicodeCharacter{0153}{\oe }
 \DeclareUnicodeCharacter{2013}{\textendash }
 \DeclareUnicodeCharacter{2014}{\textemdash}
 \DeclareUnicodeCharacter{2018}{\textquoteleft}
 \DeclareUnicodeCharacter{2019}{\textquoteright}
 \DeclareUnicodeCharacter{201C}{\textquotedblleft }
 \DeclareUnicodeCharacter{201D}{\textquotedblright }
 %</ot1>
 %    \end{macrocode}
 % Stuff not mapped (note that |\j| ($\jmath$) is not a Unicode =
character):
 %\begin{verbatim}
 %\DeclareTextSymbol{\j}{OT1}{17}
 %\DeclareTextAccent{\"}{OT1}{127}
 %\DeclareTextAccent{\'}{OT1}{19}
 %\DeclareTextAccent{\.}{OT1}{95}
 %\DeclareTextAccent{\=3D}{OT1}{22}
 %\DeclareTextAccent{\^}{OT1}{94}
 %\DeclareTextAccent{\`}{OT1}{18}
 %\DeclareTextAccent{\~}{OT1}{126}
 %\DeclareTextAccent{\H}{OT1}{125}
 %\DeclareTextAccent{\u}{OT1}{21}
 %\DeclareTextAccent{\v}{OT1}{20}
 %\DeclareTextAccent{\r}{OT1}{23}
 %\DeclareTextCommand{\b}{OT1}[1]
 %\DeclareTextCommand{\c}{OT1}[1]
 %\DeclareTextCommand{\d}{OT1}[1]
 %\end{verbatim}
 %   =20
 %
 %
 % \subsection{Mappings for OMS glyphs}
 %
 %    Only a few glyphs to set up here.
 %    \begin{macrocode}
 %<*oms>
 \DeclareUnicodeCharacter{00A7}{\textsection}=20
 \DeclareUnicodeCharacter{00B6}{\textparagraph}
 \DeclareUnicodeCharacter{00B7}{\textperiodcentered}
 \DeclareUnicodeCharacter{2020}{\textdagger}=20
 \DeclareUnicodeCharacter{2021}{\textdaggerdbl}=20
 \DeclareUnicodeCharacter{2022}{\textbullet}=20
 %</oms>
 %    \end{macrocode}
 %
 % Characters like |\textbackslash| are not mapped because they lie
 %    (primarily) in the 7-bit range, and the code here only sets up
 %    mappings for UTF-8 characters that are at least 2 octets long.
 %\begin{verbatim}
 %\DeclareTextSymbol{\textbackslash}{OMS}{110}        % "6E
 %\DeclareTextSymbol{\textbar}{OMS}{106}              % "6A
 %\DeclareTextSymbol{\textbraceleft}{OMS}{102}        % "66
 %\DeclareTextSymbol{\textbraceright}{OMS}{103}       % "67
 %\end{verbatim}
 %
 % But the following (and some others) might actually lurk in Unicode
 %    somewhere\ldots
 %\begin{verbatim}
 %\DeclareTextSymbol{\textasteriskcentered}{OMS}{3}   % "03
 %\DeclareTextCommand{\textcircled}{OMS}
 %\end{verbatim}
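 %
 %    One candidate, offered here as an assumption rather than a
 %    settled choice, is U+2217 (ASTERISK OPERATOR) for
 %    |\textasteriskcentered|; a sketch of the corresponding entry:
 %\begin{verbatim}
 %\DeclareUnicodeCharacter{2217}{\textasteriskcentered}
 %\end{verbatim}
 %    |\textcircled| takes an argument, and its nearest Unicode relative
 %    is a combining character (U+20DD), which this package does not
 %    attempt to support.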
 %   =20
 %   =20
 %
 %
 % \subsection{Mappings for TS1 glyphs}
 %
 % Exercise for somebody else.
 %
 %
 % \subsection{Mappings for \texttt{latex.ltx} glyphs}
 %
 % There is also a collection of characters already set up in the =
kernel,
 % one way or the other. Since these do not clearly relate to any
 %    particular font encoding they are mapped when the
 % \texttt{utf8} support is first set up.=20
 %
 % Also there are a number of |\providecommand|s in the various input
 % encoding files which may or may not go into this part.
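 %
 %    As an illustration only (the selection is an assumption; nothing
 %    below is generated into \texttt{utf8.def} by this version),
 %    entries for symbols that the kernel already provides might look
 %    like this:
 %\begin{verbatim}
 %\DeclareUnicodeCharacter{00A9}{\textcopyright}
 %\DeclareUnicodeCharacter{00AE}{\textregistered}
 %\DeclareUnicodeCharacter{2026}{\textellipsis}
 %\DeclareUnicodeCharacter{2122}{\texttrademark}
 %\end{verbatim}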
 %    \begin{macrocode}
 %<*utf8>
 % This space is intentionally empty ...
 %</utf8>
 %    \end{macrocode}
 %
 %
 % \section{A test document}=20
 %
 %    Here is a very small test document which may or may not survive
 %    if the current document is transferred from one place to
 %    another.
 %    \begin{macrocode}
 %<*test>
 \documentclass{article}
=20
 \usepackage[latin1,utf8]{inputenc}
 \usepackage[T1]{fontenc}
 \usepackage{trace}
=20
 \begin{document}
=20
  German umlauts in UTF-8: =C3=A4=C3=B6=C3=BC
=20
 \inputencoding{latin1}  % switch to latin1
=20
  German umlauts in UTF-8 but read by latin1 (and will produce one
  error since \verb=3D\textcurrency=3D is not provided): =
=C3=A4=C3=B6=C3=BC
=20
 \inputencoding{utf8}    % switch back to utf8
=20
  German umlauts in UTF-8: =C3=A4=C3=B6=C3=BC
=20
 \showoutput
 \tracingstats=3D2
 \stop
 %</test>
 %    \end{macrocode}
 %   =20
 % \Finale
 %
 \endinput
 -------------- utf8ienc.dtx
=20


------_=_NextPart_001_01C2B68D.278D2280
Content-Type: text/html;
	charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV=3D"Content-Type" CONTENT=3D"text/html; =
charset=3Diso-8859-1">
<META NAME=3D"Generator" CONTENT=3D"MS Exchange Server version =
6.5.7654.12">
<TITLE>Re: latex/3480: Support for UTF-8 missing in inputenc.sty</TITLE>
</HEAD>
<BODY>
<!-- Converted from text/plain format -->

<P><FONT SIZE=3D2>The following reply was made to PR latex/3480; it has =
been noted by GNATS.</FONT>
</P>

<P><FONT SIZE=3D2>From: Frank Mittelbach =
&lt;frank.mittelbach@latex-project.org&gt;</FONT>

<BR><FONT SIZE=3D2>To: Dominique Unruh =
&lt;dominique@unruh.de&gt;,</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp; Markus Kuhn =
&lt;Markus.Kuhn@cl.cam.ac.uk&gt;</FONT>

<BR><FONT SIZE=3D2>Cc: latex-bugs@latex-project.org, =
LATEX-L@listserv.uni-heidelberg.de</FONT>

<BR><FONT SIZE=3D2>Subject: Re: latex/3480: Support for UTF-8 missing in =
inputenc.sty</FONT>

<BR><FONT SIZE=3D2>Date: Tue, 7 Jan 2003 21:34:51 +0100</FONT>
</P>

<P><FONT SIZE=3D2>&nbsp;following up on the discussion concerning utf-8 =
support for LaTeX, below is a</FONT>

<BR><FONT SIZE=3D2>&nbsp;package written to provide that support within =
the inputenc framework.</FONT>

<BR><FONT SIZE=3D2>&nbsp;</FONT>

<BR><FONT SIZE=3D2>&nbsp;it is not complete, nor are its tables set up =
finally, we would need some</FONT>

<BR><FONT SIZE=3D2>&nbsp;volunteers to help us here.</FONT>

<BR><FONT SIZE=3D2>&nbsp;</FONT>

<BR><FONT SIZE=3D2>&nbsp;but first i would like to hear =
comments/suggestions on the approach</FONT>

<BR><FONT SIZE=3D2>&nbsp;</FONT>

<BR><FONT SIZE=3D2>&nbsp;cheers</FONT>

<BR><FONT SIZE=3D2>&nbsp;frank</FONT>

<BR><FONT SIZE=3D2>&nbsp;</FONT>

<BR><FONT SIZE=3D2>&nbsp;ps the ins file generates a small test =
document, that generates one error</FONT>

<BR><FONT SIZE=3D2>&nbsp;(deliberately)</FONT>

<BR><FONT SIZE=3D2>&nbsp;</FONT>

<BR><FONT SIZE=3D2>&nbsp;-------------- utf8ienc.ins</FONT>

<BR><FONT SIZE=3D2>&nbsp;\input docstrip</FONT>

<BR><FONT SIZE=3D2>&nbsp;</FONT>

<BR><FONT SIZE=3D2>&nbsp;\preamble</FONT>

<BR><FONT SIZE=3D2>&nbsp;</FONT>

<BR><FONT SIZE=3D2>&nbsp;This is a generated file.</FONT>

<BR><FONT SIZE=3D2>&nbsp;</FONT>

<BR><FONT SIZE=3D2>&nbsp;Copyright 2002</FONT>

<BR><FONT SIZE=3D2>&nbsp;The LaTeX3 Project and any individual authors =
listed elsewhere</FONT>

<BR><FONT SIZE=3D2>&nbsp;in this file. </FONT>

<BR><FONT SIZE=3D2>&nbsp;</FONT>

<BR><FONT SIZE=3D2>&nbsp;It may be distributed and/or modified under =
the</FONT>

<BR><FONT SIZE=3D2>&nbsp;conditions of the LaTeX Project Public License, =
either version 1.2</FONT>

<BR><FONT SIZE=3D2>&nbsp;of this license or (at your option) any later =
version.</FONT>

<BR><FONT SIZE=3D2>&nbsp;The latest version of this license is in</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp; <A =
HREF=3D"http://www.latex-project.org/lppl.txt">http://www.latex-project.o=
rg/lppl.txt</A></FONT>

<BR><FONT SIZE=3D2>&nbsp;and version 1.2 or later is part of all =
distributions of LaTeX </FONT>

<BR><FONT SIZE=3D2>&nbsp;version 1999/12/01 or later.</FONT>

<BR><FONT SIZE=3D2>&nbsp;</FONT>

<BR><FONT SIZE=3D2>&nbsp;\endpreamble</FONT>

<BR><FONT SIZE=3D2>&nbsp;</FONT>

<BR><FONT SIZE=3D2>&nbsp;\keepsilent</FONT>

<BR><FONT SIZE=3D2>&nbsp;\askonceonly</FONT>

<BR><FONT SIZE=3D2>&nbsp;</FONT>

<BR><FONT SIZE=3D2>&nbsp;\usedir{tex/latex/base}</FONT>

<BR><FONT SIZE=3D2>&nbsp;</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\generate{\file{utf8.def}{\from{utf8ienc.dtx}{utf8}}</FONT=
>

<BR><FONT =
SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; =
\file{t1enc.dfu}{\from{utf8ienc.dtx}{t1}}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; =
\file{ot1enc.dfu}{\from{utf8ienc.dtx}{ot1}}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; =
\file{omsenc.dfu}{\from{utf8ienc.dtx}{oms}}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; =
\file{utf8-test.tex}{\from{utf8ienc.dtx}{test}}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; =
}</FONT>

<BR><FONT SIZE=3D2>&nbsp;</FONT>

<BR><FONT SIZE=3D2>&nbsp;\ifToplevel{</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\Msg{*****************************************************=
******}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\Msg{*}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\Msg{* To finish the installation you have to =
move the following}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\Msg{* files into a directory searched by =
TeX:}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\Msg{*}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\Msg{* \space\space utf8.def}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\Msg{* \space\space *.dfu}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\Msg{*}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\Msg{* To produce the documentation run the =
files ending with}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\Msg{* `.dtx' through LaTeX.}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\Msg{*}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\Msg{* Happy TeXing}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\Msg{*****************************************************=
******}</FONT>

<BR><FONT SIZE=3D2>&nbsp;}</FONT>

<BR><FONT SIZE=3D2>&nbsp;</FONT>

<BR><FONT SIZE=3D2>&nbsp;\endbatchfile</FONT>

<BR><FONT SIZE=3D2>&nbsp;-------------- utf8ienc.ins</FONT>

<BR><FONT SIZE=3D2>&nbsp;</FONT>

<BR><FONT SIZE=3D2>&nbsp;-------------- utf8ienc.dtx</FONT>

<BR><FONT SIZE=3D2>&nbsp;% \iffalse</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&lt;*driver&gt;</FONT>

<BR><FONT SIZE=3D2>&nbsp;\documentclass{ltxdoc}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\usepackage[utf8]{inputenc}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\GetFileInfo{utf8.def}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\title{Providing some UTF-8 support via =
\texttt{inputenc}}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\date{\fileversion\space\filedate{} printed =
\today}</FONT>

<BR><FONT SIZE=3D2>&nbsp; \author{%</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp; Frank Mittelbach \and Chris =
Rowley\thanks{Borrowing heavily from</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; code by David =
Carlisle and tables by Sebastian Rahtz}}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\begin{document}</FONT>

<BR><FONT SIZE=3D2>&nbsp; \maketitle</FONT>

<BR><FONT SIZE=3D2>&nbsp; \tableofcontents</FONT>

<BR><FONT SIZE=3D2>&nbsp; \DocInput{utf8ienc.dtx}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\end{document}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&lt;/driver&gt;</FONT>

<BR><FONT SIZE=3D2>&nbsp;% \fi</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;% \CheckSum{604}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;% \section{Introduction}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;% [The whole section is rather unfinished =
\ldots\ just like the code.]</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;% \subsection{Background and general =
stuff}</FONT>

<BR><FONT SIZE=3D2>&nbsp;% </FONT>

<BR><FONT SIZE=3D2>&nbsp;% For many reasons what this package provides =
is a long way from any</FONT>

<BR><FONT SIZE=3D2>&nbsp;% type of `Unicode compliance'.</FONT>

<BR><FONT SIZE=3D2>&nbsp;% </FONT>

<BR><FONT SIZE=3D2>&nbsp;% In stark contrast to 8-bit character sets, =
with 16 or more bits it can</FONT>

<BR><FONT SIZE=3D2>&nbsp;% easily be very inefficient to support the =
full range.\footnote{In</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp; fact, \LaTeX's current 8-bit =
support does not go so far as to make</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp; all 8-bit characters into valid =
input.}&nbsp; Moreover, useful support of</FONT>

<BR><FONT SIZE=3D2>&nbsp;% character input by a typesetting system =
overwhelmingly means finding</FONT>

<BR><FONT SIZE=3D2>&nbsp;% an acceptable visual representation of a =
sequence of characters and</FONT>

<BR><FONT SIZE=3D2>&nbsp;% this, for \LaTeX{}, means having available a =
suitably encoded 8-bit</FONT>

<BR><FONT SIZE=3D2>&nbsp;% font.</FONT>

<BR><FONT SIZE=3D2>&nbsp;% </FONT>

<BR><FONT SIZE=3D2>&nbsp;% Unfortunately it is not possible to predict =
exactly what valid UTF-8</FONT>

<BR><FONT SIZE=3D2>&nbsp;% octet sequences will appear in a particular =
file so it is best to</FONT>

<BR><FONT SIZE=3D2>&nbsp;% make all the unsupported but valid sequences =
produce a reasonably</FONT>

<BR><FONT SIZE=3D2>&nbsp;% clear and noticeable error message.</FONT>

<BR><FONT SIZE=3D2>&nbsp;% </FONT>

<BR><FONT SIZE=3D2>&nbsp;% There are two directions from which to =
approach the question of what</FONT>

<BR><FONT SIZE=3D2>&nbsp;% to load.&nbsp; One is to specify the ranges =
of Unicode characters that will</FONT>

<BR><FONT SIZE=3D2>&nbsp;% result in some sensible typesetting; this =
requires the provider to</FONT>

<BR><FONT SIZE=3D2>&nbsp;% ensure that suitable fonts are loaded and =
that these input characters</FONT>

<BR><FONT SIZE=3D2>&nbsp;% generate the correct typesetting via the =
encodings of those fonts.&nbsp; The</FONT>

<BR><FONT SIZE=3D2>&nbsp;% other is to inspect the font encodings to be =
used and use these to</FONT>

<BR><FONT SIZE=3D2>&nbsp;% define which input Unicode characters should =
be supported.&nbsp; </FONT>

<BR><FONT SIZE=3D2>&nbsp;% </FONT>

<BR><FONT SIZE=3D2>&nbsp;% For Western European languages, at least, =
going in either direction</FONT>

<BR><FONT SIZE=3D2>&nbsp;% leads to many straightforward decisions and a =
few that are more</FONT>

<BR><FONT SIZE=3D2>&nbsp;% subjective.&nbsp; In both cases some of the =
specifications are \TeX{}</FONT>

<BR><FONT SIZE=3D2>&nbsp;% specific whilst most are independent of the =
particular typesetting</FONT>

<BR><FONT SIZE=3D2>&nbsp;% software in use.</FONT>

<BR><FONT SIZE=3D2>&nbsp;% </FONT>

<BR><FONT SIZE=3D2>&nbsp;% As we have argued elsewhere, \LaTeX{} needs =
to refer to characters via</FONT>

<BR><FONT SIZE=3D2>&nbsp;% `seven-bit-text' names and, so far, these =
have been chosen by</FONT>

<BR><FONT SIZE=3D2>&nbsp;% reference to historical sources such as Plain =
\TeX{} or Adobe encoding</FONT>

<BR><FONT SIZE=3D2>&nbsp;% descriptions.&nbsp; It is unclear whether =
this ad hoc naming structure should</FONT>

<BR><FONT SIZE=3D2>&nbsp;% simply be extended or whether it would be =
useful to</FONT>

<BR><FONT SIZE=3D2>&nbsp;% supplement it with standardised internal =
Unicode character names such as </FONT>

<BR><FONT SIZE=3D2>&nbsp;% one or more of the =
following:\footnote{Burkhard and Holger Mittelbach </FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp; are playing with me!&nbsp; They have =
written something here.}</FONT>

<BR><FONT SIZE=3D2>&nbsp;% </FONT>

<BR><FONT SIZE=3D2>&nbsp;% \begin{verbatim}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp; \ltxutwochar &lt;4 hex =
digits&gt;&nbsp;&nbsp;&nbsp; </FONT>

<BR><FONT SIZE=3D2>&nbsp;% </FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp; \ltxuchar {&lt;hex =
digits&gt;}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp;&nbsp; B H U R R R</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp; \ltxueightchartwo&nbsp;&nbsp; =
&lt;2 utf8 octets as 8-bit char tokens&gt;&nbsp;&nbsp;&nbsp; </FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp; \ltxueightcharthree &lt;3 utf8 =
octets ...&gt;&nbsp;&nbsp;&nbsp; </FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp; \ltxueightcharfour&nbsp; &lt;4 =
utf8 octets ...&gt;&nbsp;&nbsp;&nbsp; </FONT>

<BR><FONT SIZE=3D2>&nbsp;% \end{verbatim}</FONT>

<BR><FONT SIZE=3D2>&nbsp;% </FONT>

<BR><FONT SIZE=3D2>&nbsp;% </FONT>

<BR><FONT SIZE=3D2>&nbsp;% \subsection{More specific stuff}</FONT>

<BR><FONT SIZE=3D2>&nbsp;% </FONT>

<BR><FONT SIZE=3D2>&nbsp;% In addition to setting up the mechanism for =
reading UTF-8 characters</FONT>

<BR><FONT SIZE=3D2>&nbsp;% and specifying the \LaTeX-level support =
available, this package</FONT>

<BR><FONT SIZE=3D2>&nbsp;% contains support for some default =
historically expected \TeX-related</FONT>

<BR><FONT SIZE=3D2>&nbsp;% characters and some example `Unicode =
definition files' for standard</FONT>

<BR><FONT SIZE=3D2>&nbsp;% font encodings.</FONT>

<BR><FONT SIZE=3D2>&nbsp;% </FONT>

<BR><FONT SIZE=3D2>&nbsp;% </FONT>

<BR><FONT SIZE=3D2>&nbsp;% \subsection{Notes}</FONT>

<BR><FONT SIZE=3D2>&nbsp;% </FONT>

<BR><FONT SIZE=3D2>&nbsp;% No Unicode combining characters.</FONT>

<BR><FONT SIZE=3D2>&nbsp;% </FONT>

<BR><FONT SIZE=3D2>&nbsp;% No attempt to be useful beyond Latin and =
maybe Cyrillic for European</FONT>

<BR><FONT SIZE=3D2>&nbsp;% languages (as of now). </FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;% \subsection{Basic operation of the =
code}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;% The \texttt{inputenc} package makes the upper =
8-bit characters active and</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; assigns to all of them an =
error message. It then waits for the</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; input encoding files to =
change this set-up.&nbsp; Similarly, whenever</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; |\inputencoding| is =
encountered in a document, first the upper</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; 8-bit characters are set =
back to produce an error and then the definitions</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; for the new input encoding =
are loaded, changing some of the </FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; previous settings.</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; The 8-bit input encodings =
currently supported by \texttt{inputenc}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; all use |\DeclareInputText| =
and the like to map an</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; 8-bit number to some =
\LaTeX{} internal form, e.g.~to |\&quot;a|.</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; The situation when =
supporting UTF-8 as the input encoding is</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; different, however. Here we =
only have to set up the actions of</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; those 8-bit numbers that =
can be the first octet in a UTF-8</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; representation of a Unicode =
character.&nbsp; But we cannot simply set</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; this to some internal =
\LaTeX{} form since the Unicode character</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; consists of more than one =
octet; instead we have to define this</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; starting octet to parse the =
right number of further octets that</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; together form the UTF-8 =
representation of some Unicode character.</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; Therefore when switching to =
\texttt{utf8} within the</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \texttt{inputenc} framework =
the characters with numbers (hex)</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; from \texttt{&quot;C2} to =
\texttt{&quot;DF} are defined to parse for a</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; second octet following, the =
characters from \texttt{&quot;E0} to</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \texttt{&quot;EF} are =
defined to parse for two more octets and finally</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; the characters from =
\texttt{&quot;F0} to \texttt{&quot;F3} are defined to</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; parse for three additional =
octets (\texttt{&quot;F4}, also a valid starting octet, is not yet =
covered).</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>
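
<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; A worked example, given as a =
sketch (the character is chosen purely</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; for illustration): U+00E4 =
needs two octets and is encoded as</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \texttt{&quot;C3} =
\texttt{&quot;A4}.</FONT>

<BR><FONT SIZE=3D2>&nbsp;% \begin{verbatim}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp; U+00E4 as 11 bits: 00011 =
100100</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp; 2-octet pattern: 110xxxxx =
10yyyyyy</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp; result: 11000011 10100100, =
i.e. &quot;C3 &quot;A4</FONT>

<BR><FONT SIZE=3D2>&nbsp;% \end{verbatim}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; The active character =
\texttt{&quot;C3} thus has to pick up one further octet.</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>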

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; Thus when such a character =
is encountered in the document (so</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; long as expansion is not =
prohibited) a defined number of</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; additional octets (8-bit =
characters) are read and from them a</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; unique control sequence =
name is immediately constructed.</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; This control sequence is =
either defined (good) or undefined</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; (likely); in the latter =
case the user gets an error message</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; saying that this UTF-8 =
sequence (or, better, Unicode character)</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; is not supported.</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; If it is defined then the =
definition will expand to a \LaTeX{}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; internal form: e.g.~for =
the octets \texttt{&quot;C3} \texttt{&quot;A4} (U+00E4) we get |\&quot;a| =
as the</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; internal form which then, =
depending on the font encoding,</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; eventually resolves to the =
single glyph `latin-a-umlaut' or to</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; the composite glyph =
`latin-a with an umlaut accent'.</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; These mappings from (UTF-8 =
encoded) Unicode characters to \LaTeX{}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; internal forms are made =
indirectly.&nbsp; The code below provides a</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; declaration =
|\DeclareUnicodeCharacter| which maps Unicode numbers</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; (as hexadecimal) to =
\LaTeX{} internal forms.</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; This mapping needs to be =
set up only once so here it is done at</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; |\begin{document}| by =
looking at the list of font encodings that</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; are loaded by the document =
and providing mappings related to</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; those font encodings =
whenever these are available. Thus at most</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; those Unicode =
characters that can be represented by the glyphs</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; available in these =
encodings will be defined.</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; Technically this is done by =
loading one file per encoding,</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; if available, that is =
supposed to provide the necessary mapping</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; information.</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;% \StopEventually{}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;% \section{Coding}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;% \subsection{Housekeeping}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; The usual introductory bits =
and pieces:</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \begin{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&lt;utf8&gt;\ProvidesFile{utf8.def}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&lt;t1&gt;\ProvidesFile{t1enc.dfu}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&lt;ot1&gt;\ProvidesFile{ot1enc.dfu}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&lt;oms&gt;\ProvidesFile{omsenc.dfu}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;%&lt;test&gt;\ProvidesFile{utf8-test.tex}</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp; [2003/01/07 v1.0a UTF-8 support =
for inputenc]</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \end{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; </FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \begin{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&lt;*utf8&gt;</FONT>

<BR><FONT SIZE=3D2>&nbsp;\makeatletter</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \end{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; </FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;% \subsection{Parsing UTF-8 input}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;% \begin{macro}{\UTFviii@two@octets}</FONT>

<BR><FONT SIZE=3D2>&nbsp;% \begin{macro}{\UTFviii@three@octets}</FONT>

<BR><FONT SIZE=3D2>&nbsp;% \begin{macro}{\UTFviii@four@octets}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; A UTF-8 char (that is not =
actually a 7-bit char, i.e.~a</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; single octet) is parsed =
as</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; follows: each starting =
octet is an active \TeX{} character token; </FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; each of these is defined =
below to be a macro with one to</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; three arguments nominally =
(depending on the starting octet). It</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; calls one of =
|\UTFviii@two@octets|, |\UTFviii@three@octets|, or =
|\UTFviii@four@octets| which</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; then actually picks up the =
argument(s).</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; From the arguments a =
control sequence with a name of the form</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \verb=3Du8:#1#2..=3D is =
constructed where the |#i| ($i&gt;1$) are the</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; arguments and |#1| is the =
starting octet (as a \TeX{} character</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; token).&nbsp; Since some or =
even all of these characters are active</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; (when inputenc is loaded) =
we need to use |\string| when building</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; the csname.</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; The csname thus constructed =
can of course be undefined but to</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; avoid producing an =
unhelpful low-level undefined command error we</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; pass it to =
|\UTFviii@defined| which is responsible for producing</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; a more sensible error =
message (not yet done!!).&nbsp; If, however, it is</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; defined we simply execute =
the thing (which should then expand to</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; an encoding specific =
internal \LaTeX{} form).</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \begin{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\def\UTFviii@two@octets#1#2{\expandafter</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp; \UTFviii@defined\csname =
u8:#1\string#2\endcsname}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \end{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;% \end{macro}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; </FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \begin{macrocode}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\def\UTFviii@three@octets#1#2#3{\expandafter</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp; \UTFviii@defined\csname =
u8:#1\string#2\string#3\endcsname}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \end{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;% \end{macro}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; </FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \begin{macrocode}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\def\UTFviii@four@octets#1#2#3#4{\expandafter</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp; \UTFviii@defined\csname =
u8:#1\string#2\string#3\string#4\endcsname}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \end{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;% \end{macro}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; </FONT>

<BR><FONT SIZE=3D2>&nbsp;% \begin{macro}{\UTFviii@defined}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; This tests whether its =
argument is different from |\relax|: it</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; either calls for a sensible =
error message (not done), or it gets</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; the |\fi| out of the way =
(in case the command has arguments) and</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; executes it.</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \begin{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\def\UTFviii@defined#1{%</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp; \ifx#1\relax</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp; \ERRORnotDEFINED#1%</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp; \else\expandafter</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp; #1%</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp; \fi</FONT>

<BR><FONT SIZE=3D2>&nbsp;}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \end{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;% \end{macro}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; </FONT>

<BR><FONT SIZE=3D2>&nbsp;% \begin{macro}{\UTFviii@loop}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; This wonderful bit of code =
from Dr Carlisle defines the starting</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; octets to call =
|\UTFviii@two@octets| etc as appropriate. The starting</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; octet itself is passed =
directly as the first argument, the others</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; are picked up later en =
route.</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; The |\UTFviii@loop| loops =
through the numbers starting at</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; |\count@| and ending =
at |\@tempcnta|${}-1$, each time executing</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; the code in =
|\UTFviii@tmp|.</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; All this is done in a group =
so that temporary catcode changes </FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; etc.~vanish after =
everything is set up.</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \begin{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\begingroup</FONT>

<BR><FONT SIZE=3D2>&nbsp;\catcode`\~13</FONT>

<BR><FONT SIZE=3D2>&nbsp;\catcode`\&quot;12</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \end{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; </FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \begin{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\def\UTFviii@loop{%</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp; \uccode`\~\count@</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp; =
\uppercase\expandafter{\UTFviii@tmp}%</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp; \advance\count@\@ne</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp; \ifnum\count@&lt;\@tempcnta</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp; \expandafter\UTFviii@loop</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp; \fi}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \end{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; Setting up 2-byte =
UTF-8:</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \begin{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp; \count@&quot;C2</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp; \@tempcnta&quot;E0</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp; =
\def\UTFviii@tmp{\xdef~{\noexpand\UTFviii@two@octets\string~}}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\UTFviii@loop</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \end{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; Setting up 3-byte =
UTF-8:</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \begin{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp; \count@&quot;E0</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp; \@tempcnta&quot;F0</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp; =
\def\UTFviii@tmp{\xdef~{\noexpand\UTFviii@three@octets\string~}}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\UTFviii@loop</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \end{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; </FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; Setting up 4-byte =
UTF-8:</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \begin{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp; \count@&quot;F0</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp; \@tempcnta&quot;F4</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp; =
\def\UTFviii@tmp{\xdef~{\noexpand\UTFviii@four@octets\string~}}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\UTFviii@loop</FONT>

<BR><FONT SIZE=3D2>&nbsp;\endgroup</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \end{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;% \end{macro}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; For this case we must =
disable the warning generated by</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \texttt{inputenc} if it =
doesn't see any new |\DeclareInputText|</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; commands.</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \begin{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\@inpenc@test</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \end{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; </FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; If this file =
(\texttt{utf8.def}) is not being read while setting</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; up \texttt{inputenc}, =
i.e.~in the preamble, but when</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; |\inputencoding| is called =
somewhere within the document, we do not</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; need to input the specific =
Unicode mappings again. We therefore</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; stop reading the file at =
this point.</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \begin{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\ifx\@begindocumenthook\@undefined </FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp; \makeatother </FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \end{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; The |\fi| must be on the =
same line as |\endinput| or else it will</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; never be seen!</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \begin{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp; \endinput \fi</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \end{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;% \subsection{Mapping Unicode codes to \LaTeX{} =
internal forms}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;% </FONT>

<BR><FONT SIZE=3D2>&nbsp;% =
\begin{macro}{\DeclareUnicodeCharacter}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; The =
|\DeclareUnicodeCharacter| declaration defines a mapping from</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; a Unicode character code =
point to a \LaTeX{} internal form. The first</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; argument is the Unicode =
number as hexadecimal digits and the second is</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; the actual \LaTeX{} =
internal form. </FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>
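
<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; As a usage sketch (not one =
of the mappings set up in this file): in</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; the preamble, after this =
file has been loaded, the euro sign</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; (Unicode 20AC) could be =
mapped as follows, assuming that some</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; package such as =
\texttt{textcomp} provides |\texteuro|:</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\begin{verbatim}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\DeclareUnicodeCharacter{20AC}{\texteuro}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\end{verbatim}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>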

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; We start by making sure =
that some characters have the right</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; |\catcode| when they are =
used in the definitions below.</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \begin{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\begingroup</FONT>

<BR><FONT SIZE=3D2>&nbsp;\catcode`\&quot;=3D12</FONT>

<BR><FONT SIZE=3D2>&nbsp;\catcode`\&lt;=3D12</FONT>

<BR><FONT SIZE=3D2>&nbsp;\catcode`\.=3D12</FONT>

<BR><FONT SIZE=3D2>&nbsp;\catcode`\,=3D12</FONT>

<BR><FONT SIZE=3D2>&nbsp;\catcode`\;=3D12</FONT>

<BR><FONT SIZE=3D2>&nbsp;\catcode`\!=3D12</FONT>

<BR><FONT SIZE=3D2>&nbsp;\catcode`\~=3D13</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \end{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; </FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \begin{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\gdef\DeclareUnicodeCharacter#1#2{%</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp; \count@&quot;#1\relax</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp; \typeout{ \space\space defining =
Unicode char #1 (decimal \the\count@)}%</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp; \begingroup</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \end{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; Next we do the parsing of =
the number stored in |\count@| and assign the</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; result to |\UTFviii@tmp|. =
Actually all this could be done in-line,</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; the macro =
|\parse@XML@charref| is only there to extend this code</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; to parsing Unicode numbers =
in other contexts one day (perhaps).</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \begin{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp; \parse@XML@charref</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \end{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; Here is an example of what =
is happening, for 00A3 (which is the</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; hexadecimal Unicode number =
for \textsterling{}, decimal 163). After</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; |\parse@XML@charref| we =
have, stored in |\UTFviii@tmp|, the sequence:</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \begin{quote}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; =
|\UTFviii@two@octets| \texttt{\^A\textsterling}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \end{quote}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; what we actually need to =
produce is a definition of the form</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \begin{quote}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; =
|\def\u8:|\texttt{\^A\textsterling} |{|\textit{\LaTeX{} internal =
form}|}|\,.</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \end{quote}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; So first we use the prefix =
commands |\UTFviii@two@octets|, etc.~to</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; generate the csname that we =
wish to define \ldots</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \begin{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp; =
\def\UTFviii@two@octets##1##2{\csname =
u8:##1\string##2\endcsname}%</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp; =
\def\UTFviii@three@octets##1##2##3{\csname u8:##1%</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbs=
p;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp=
;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;=
&nbsp;&nbsp; \string##2\string##3\endcsname}%</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp; =
\def\UTFviii@four@octets##1##2##3##4{\csname u8:##1%</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbs=
p;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp=
;&nbsp;&nbsp;&nbsp;&nbsp; =
\string##2\string##3\string##4\endcsname}%</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \end{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \ldots and then we need to =
use the right number of |\expandafter|s to</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; finally make the =
definition: expanding |\UTFviii@tmp| once to get</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; its contents, a second time =
to replace the prefix command by its</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; |\csname| expansion, and a =
third time to turn the expansion into</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; a csname after which the =
|\gdef| finally gets applied.</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \begin{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp; =
\expandafter\expandafter\expandafter</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp; =
\expandafter\expandafter\expandafter</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp; \expandafter</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; =
\gdef\UTFviii@tmp{#2}%</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp; \endgroup</FONT>

<BR><FONT SIZE=3D2>&nbsp;}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \end{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;% \end{macro}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;% \begin{macro}{\parse@XML@charref}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; This macro parses a Unicode =
number (decimal) and returns its</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; UTF-8 representation as a =
sequence of \TeX{} character tokens. In the</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; original code it had two =
arguments delimited by \texttt{;}; here,</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; however, we supply the =
Unicode number implicitly.</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \begin{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\gdef\parse@XML@charref{%</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \end{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; We need to keep a few =
things local, mainly the |\uccode|'s that</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; are set up below. However, =
the group originally used here is</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; actually unnecessary since =
we call this macro only within another</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; group; but it will be =
important to restore the group if this</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; macro gets used for other =
purposes.</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \begin{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp; \begingroup</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \end{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; The original code from =
David supported the convention that a</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; Unicode slot number could =
be given either as a decimal or as a</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; hexadecimal (by starting =
with \texttt{x}).&nbsp; We do not do this so</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; this code is also =
removed.&nbsp; This could be reactivated if one</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; wants to support document =
commands that accept Unicode numbers</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; (but then the first case =
needs to be changed from an error</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; message back to something =
more useful again).</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \begin{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp; \uppercase{\count@\if =
x\noexpand#1&quot;\else#1\fi#2}\relax</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \end{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; As |\count@| already =
contains the right value we make</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; |\parse@XML@charref| work =
without arguments. </FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \begin{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp; \ifnum\count@&lt;&quot;A0\relax</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp; =
\ERROR-WE-DONT-DEAL-WITH-THAT</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \end{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; Do not ask us to provide an =
explanation for the code below; it is</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; borrowed straight from =
\texttt{xmltex} by David and we trust him</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; totally (and we are too =
lazy to reread the Unicode book to see if</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; this is the correct =
algorithm).\footnote{We were hoping to also</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; find in his work the \TeX{} =
code for going the other way: from</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; UTF-8 octets to Unicode =
slot number, but no luck!}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \begin{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp; =
\else\ifnum\count@&lt;&quot;800\relax</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; =
\parse@UTFviii@a,%</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; \parse@UTFviii@b =
C\UTFviii@two@octets.,%</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp; =
\else\ifnum\count@&lt;&quot;10000\relax</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; =
\parse@UTFviii@a;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; =
\parse@UTFviii@a,%</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; \parse@UTFviii@b =
E\UTFviii@three@octets.{,;}%</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp; \else</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; =
\parse@UTFviii@a;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; =
\parse@UTFviii@a,%</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; =
\parse@UTFviii@a!%</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; \parse@UTFviii@b =
F\UTFviii@four@octets.{!,;}%</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp; \fi</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp; \fi</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp; \fi</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp; \endgroup</FONT>

<BR><FONT SIZE=3D2>&nbsp;}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \end{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;% \end{macro}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;% \begin{macro}{\parse@UTFviii@a}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \ldots so somebody else can =
document this part :-) \ldots~David?:-))))!</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \begin{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\gdef\parse@UTFviii@a#1{%</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; =
\@tempcnta\count@</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; =
\divide\count@64</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; =
\@tempcntb\count@</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; =
\multiply\count@64</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; =
\advance\@tempcnta-\count@</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; =
\advance\@tempcnta&quot;80</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; =
\uccode`#1\@tempcnta</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; =
\count@\@tempcntb}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \end{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;% \end{macro}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;% \begin{macro}{\parse@UTFviii@b}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \ldots same here</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \begin{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\gdef\parse@UTFviii@b#1#2#3#4{%</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; =
\advance\count@&quot;#10\relax</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; =
\uccode`#3\count@</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; =
\uppercase{\gdef\UTFviii@tmp{#2#3#4}}}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \end{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; </FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \begin{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\endgroup</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \end{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;% \end{macro}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>
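
<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; As a worked example of the =
two macros above (our reading of the</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; code, not an authoritative =
description), take the euro sign,</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; Unicode 20AC or decimal =
8364, which takes the three-octet branch:</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\begin{verbatim}</FONT>

<BR><FONT SIZE=3D2>&nbsp;% 8364 div 64 gives 130, remainder 44; =
&quot;80 + 44 is &quot;AC (uccode of ;)</FONT>

<BR><FONT SIZE=3D2>&nbsp;% 130 div 64 gives 2, remainder 2; =
&quot;80 + 2 is &quot;82 (uccode of ,)</FONT>

<BR><FONT SIZE=3D2>&nbsp;% finally &quot;E0 + 2 is &quot;E2 =
(uccode of .)</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\end{verbatim}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; so |\uppercase| leaves =
|\UTFviii@three@octets| followed by the</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; octets |&quot;E2|, =
|&quot;82| and |&quot;AC| in |\UTFviii@tmp|.</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>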

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \begin{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\@onlypreamble\DeclareUnicodeCharacter</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \end{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; These are preamble only as =
long as we don't support Unicode</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; charrefs in =
documents.</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \begin{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\@onlypreamble\parse@XML@charref</FONT>

<BR><FONT SIZE=3D2>&nbsp;\@onlypreamble\parse@UTFviii@a</FONT>

<BR><FONT SIZE=3D2>&nbsp;\@onlypreamble\parse@UTFviii@b</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \end{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; </FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;% \subsection{Loading Unicode mappings at begin =
document}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; At the beginning of the =
document we loop through all defined encodings</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; (stored in |\cdp@list|) and =
for each load a file</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; =
\textit{name}\texttt{enc.dfu} if it exists. That file is then</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; supposed to contain =
|\DeclareUnicodeCharacter| declarations.</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \begin{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\AtBeginDocument{%</FONT>

<BR><FONT SIZE=3D2>&nbsp; \begingroup</FONT>

<BR><FONT SIZE=3D2>&nbsp; \def\cdp@elt#1#2#3#4{%</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp; \typeout{Now handling font encoding #1 =
...}%</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp; \lowercase{%</FONT>

<BR><FONT SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp; =
\InputIfFileExists{#1enc.dfu}}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; =
{\typeout{... processing Unicode mapping file for font encoding =
#1}}%</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; =
{\typeout{... no Unicode mapping file for font encoding #1}}%</FONT>

<BR><FONT SIZE=3D2>&nbsp; }\cdp@list</FONT>

<BR><FONT SIZE=3D2>&nbsp; \endgroup}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\makeatother</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&lt;/utf8&gt;</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \end{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; </FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;% \section{Mapping characters that exist in =
font encodings}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;% This section is a first attempt to provide =
Unicode definitions for</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; characters whose glyphs are =
currently provided by the standard \LaTeX{}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; font encodings |T1|, |OT1|, =
etc. They are by no means complete</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; and need checking.</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;% For example, one should check the already =
existing input encodings</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; for glyphs that may in fact =
be available and required,</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; e.g.~\texttt{latin4} has a =
number of glyphs with the |\=3D|</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; accent. Since the |T1| =
encoding does not provide such glyphs,</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; these characters are not =
listed below (yet).</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;% The list below was generated by looking at =
the current \LaTeX{} font</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; encoding files, e.g., =
\texttt{t1enc.def} and using the work by</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; Sebastian Rahtz (in =
\texttt{ucharacters.sty}), with a few</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; modifications.</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;% \subsection{Mappings for T1 glyphs}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \begin{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&lt;*t1&gt;</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00A1}{\textexclamdown =
}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00A3}{\textsterling} =
</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00AB}{\guillemotleft}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00BB}{\guillemotright}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00BF}{\textquestiondown }</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00C0}{\@tabacckludge`A}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00C1}{\@tabacckludge'A}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00C2}{\^A}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00C3}{\~A}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00C4}{\&quot;A}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00C5}{\r A}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00C6}{\AE }</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00C7}{\c C}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00C8}{\@tabacckludge`E}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00C9}{\@tabacckludge'E}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00CA}{\^E}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00CB}{\&quot;E}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00CC}{\@tabacckludge`I}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00CD}{\@tabacckludge'I}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00CE}{\^I}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00CF}{\&quot;I}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00D0}{\DH }</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00D1}{\~N}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00D2}{\@tabacckludge`O}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00D3}{\@tabacckludge'O}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00D4}{\^O}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00D5}{\~O}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00D6}{\&quot;O}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00D8}{\O }</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00D9}{\@tabacckludge`U}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00DA}{\@tabacckludge'U}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00DB}{\^U}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00DC}{\&quot;U}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00DD}{\@tabacckludge'Y}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00DE}{\TH }</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00DF}{\ss }</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00E0}{\@tabacckludge`a}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00E1}{\@tabacckludge'a}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00E2}{\^a}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00E3}{\~a}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00E4}{\&quot;a}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00E5}{\r a}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00E6}{\ae }</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00E7}{\c c}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00E8}{\@tabacckludge`e}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00E9}{\@tabacckludge'e}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00EA}{\^e}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00EB}{\&quot;e}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00EC}{\@tabacckludge`\i}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00EC}{\@tabacckludge`i}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00ED}{\@tabacckludge'\i}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00ED}{\@tabacckludge'i}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00EE}{\^\i}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00EE}{\^i}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00EF}{\&quot;\i}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00EF}{\&quot;i}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00F0}{\dh }</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00F1}{\~n}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00F2}{\@tabacckludge`o}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00F3}{\@tabacckludge'o}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00F4}{\^o}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00F5}{\~o}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00F6}{\&quot;o}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00F8}{\o }</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00F9}{\@tabacckludge`u}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00FA}{\@tabacckludge'u}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00FB}{\^u}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00FC}{\&quot;u}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00FD}{\@tabacckludge'y}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00FE}{\th }</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00FF}{\&quot;y}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{0102}{\u A} </FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{0103}{\u a}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{0104}{\k A}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{0105}{\k a}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{0106}{\@tabacckludge'C}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{0107}{\@tabacckludge'c} </FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{010C}{\v C}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{010D}{\v c}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{010E}{\v D}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{010F}{\v d}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{0110}{\DJ }</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{0111}{\dj }</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{0118}{\k E}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{0119}{\k e}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{011A}{\v E}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{011B}{\v e}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{011E}{\u G}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{011F}{\u g}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{0130}{\.I}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{0131}{\i}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{0139}{\@tabacckludge'L}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{013A}{\@tabacckludge'l}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{013D}{\v L}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{013E}{\v l}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{0141}{\L }</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{0142}{\l }</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{0143}{\@tabacckludge'N}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{0144}{\@tabacckludge'n}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{0147}{\v N}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{0148}{\v n}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{014A}{\NG }</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{014B}{\ng }</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{0150}{\H O}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{0151}{\H o}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{0152}{\OE }</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{0153}{\oe }</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{0154}{\@tabacckludge'R}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{0155}{\@tabacckludge'r}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{0158}{\v R}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{0159}{\v r}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{015A}{\@tabacckludge'S}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{015B}{\@tabacckludge's}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{015E}{\c S}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{015F}{\c s}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{0160}{\v S}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{0161}{\v s}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{0162}{\c T}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{0163}{\c t}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{0164}{\v T}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{0165}{\v t}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{016E}{\r U}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{016F}{\r u}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{0170}{\H U}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{0171}{\H u}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{0178}{\&quot;Y}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{0179}{\@tabacckludge'Z}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{017A}{\@tabacckludge'z}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{017B}{\.Z}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{017C}{\.z}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{017D}{\v Z}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{017E}{\v z}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{2013}{\textendash =
}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{2014}{\textemdash} =
</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{2018}{\textquoteleft}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{2019}{\textquoteright}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{201C}{\textquotedblleft }</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{201D}{\textquotedblright =
}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{2030}{\textperthousand =
}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{2031}{\textpertenthousand =
}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{2039}{\guilsinglleft =
}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{203A}{\guilsinglright =
}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{2423}{\textvisiblespace }</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{201A}{\quotesinglbase}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{201E}{\quotedblbase}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&lt;/t1&gt;</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \end{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; The following definitions =
are in the encoding file but have no</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; direct equivalent in =
Unicode or simply do not make sense in that</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; context (or I couldn't find =
anything or \ldots :-).</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\begin{verbatim}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\DeclareTextSymbol{\j}{OT1}{17}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\DeclareTextSymbol{\SS}{T1}{223}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;%\DeclareTextSymbol{\textcompwordmark}{T1}{23}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\DeclareTextAccent{\&quot;}{OT1}{127}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\DeclareTextAccent{\'}{OT1}{19}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\DeclareTextAccent{\.}{OT1}{95}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\DeclareTextAccent{\=3D}{OT1}{22}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\DeclareTextAccent{\H}{OT1}{125}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\DeclareTextAccent{\^}{OT1}{94}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\DeclareTextAccent{\`}{OT1}{18}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\DeclareTextAccent{\r}{OT1}{23}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\DeclareTextAccent{\u}{OT1}{21}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\DeclareTextAccent{\v}{OT1}{20}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\DeclareTextAccent{\~}{OT1}{126}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\DeclareTextCommand{\b}{OT1}[1]</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\DeclareTextCommand{\c}{OT1}[1]</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\DeclareTextCommand{\d}{OT1}[1]</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\DeclareTextCommand{\k}{T1}[1]</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\end{verbatim} </FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; </FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;% \subsection{Mappings for OT1 glyphs}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; This is even more =
incomplete as again it covers only the single</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; glyphs from |OT1| plus some =
that have been explicitly defined for</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; this encoding. Everything =
that is provided in |T1|, and that</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; could be provided as =
composite glyphs via |OT1|, could and</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; probably should be set up =
as well.&nbsp; That still leaves the many things</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; that are not provided in =
|T1| but can be provided in |OT1| (and</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; in |T1|) by composite =
glyphs.</FONT>
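
<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; As a sketch only: composites =
that |OT1| builds with its accent</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; commands could be declared =
in the style of the |T1| entries.</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; The three lines below are =
illustrative assumptions; they are</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; untested and not part of =
the tables in this file.</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\begin{verbatim}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;%\DeclareUnicodeCharacter{00E9}{\@tabacckludge'e}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\DeclareUnicodeCharacter{00EA}{\^e}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\DeclareUnicodeCharacter{00F1}{\~n}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\end{verbatim}</FONT>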

<BR><FONT SIZE=3D2>&nbsp;</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \begin{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&lt;*ot1&gt;</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00A1}{\textexclamdown =
}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00A3}{\textsterling} =
</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00BF}{\textquestiondown }</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00C5}{\r A}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00C6}{\AE }</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00D8}{\O }</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00DF}{\ss }</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00E6}{\ae }</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00EC}{\@tabacckludge`i}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00ED}{\@tabacckludge'i}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00EE}{\^i}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00EF}{\&quot;i}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00F8}{\o }</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{0131}{\i}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{0141}{\L }</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{0142}{\l }</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{0152}{\OE }</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{0153}{\oe }</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{2013}{\textendash =
}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{2014}{\textemdash} =
</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{2018}{\textquoteleft}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{2019}{\textquoteright}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{201C}{\textquotedblleft }</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{201D}{\textquotedblright =
}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&lt;/ot1&gt;</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \end{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;% Stuff not mapped (note that |\j| ($\jmath$) =
is not a Unicode character):</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\begin{verbatim}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\DeclareTextSymbol{\j}{OT1}{17}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\DeclareTextAccent{\&quot;}{OT1}{127}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\DeclareTextAccent{\'}{OT1}{19}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\DeclareTextAccent{\.}{OT1}{95}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\DeclareTextAccent{\=3D}{OT1}{22}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\DeclareTextAccent{\^}{OT1}{94}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\DeclareTextAccent{\`}{OT1}{18}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\DeclareTextAccent{\~}{OT1}{126}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\DeclareTextAccent{\H}{OT1}{125}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\DeclareTextAccent{\u}{OT1}{21}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\DeclareTextAccent{\v}{OT1}{20}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\DeclareTextAccent{\r}{OT1}{23}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\DeclareTextCommand{\b}{OT1}[1]</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\DeclareTextCommand{\c}{OT1}[1]</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\DeclareTextCommand{\d}{OT1}[1]</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\end{verbatim}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; </FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;% \subsection{Mappings for OMS glyphs}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; Only a few glyphs to set up =
here.</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \begin{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&lt;*oms&gt;</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00A7}{\textsection} =
</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00B6}{\textparagraph}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{00B7}{\textperiodcentered} =
</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{2020}{\textdagger} =
</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{2021}{\textdaggerdbl} =
</FONT>

<BR><FONT SIZE=3D2>&nbsp;\DeclareUnicodeCharacter{2022}{\textbullet} =
</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&lt;/oms&gt;</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \end{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;% Characters like |\textbackslash| are not =
mapped as they are</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; (primarily) only in the =
7-bit range (codes 0 to 127) and the code here only sets up</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; mappings for UTF-8 =
characters that are at least 2 octets long.</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\begin{verbatim}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;%\DeclareTextSymbol{\textbackslash}{OMS}{110}&nbsp;&nbsp;&=
nbsp;&nbsp;&nbsp;&nbsp;&nbsp; % &quot;6E</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;%\DeclareTextSymbol{\textbar}{OMS}{106}&nbsp;&nbsp;&nbsp;&=
nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; % =
&quot;6A</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;%\DeclareTextSymbol{\textbraceleft}{OMS}{102}&nbsp;&nbsp;&=
nbsp;&nbsp;&nbsp;&nbsp;&nbsp; % &quot;66</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;%\DeclareTextSymbol{\textbraceright}{OMS}{103}&nbsp;&nbsp;=
&nbsp;&nbsp;&nbsp;&nbsp; % &quot;67</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\end{verbatim}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;% But the following (and some others) might =
actually lurk in Unicode</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; somewhere\ldots</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\begin{verbatim}</FONT>

<BR><FONT =
SIZE=3D2>&nbsp;%\DeclareTextSymbol{\textasteriskcentered}{OMS}{3}&nbsp;&n=
bsp; % &quot;03</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\DeclareTextCommand{\textcircled}{OMS}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\end{verbatim}</FONT>
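
<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;% To illustrate the octet-length remark above, =
here is how three</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; code points come out in =
UTF-8 (hex octets, worked by hand from</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; the UTF-8 bit layout and =
shown only as an illustration):</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\begin{verbatim}</FONT>

<BR><FONT SIZE=3D2>&nbsp;% U+005C: 5C (1 octet, so not mapped here)</FONT>

<BR><FONT SIZE=3D2>&nbsp;% U+00A7: C2 A7 (2 octets)</FONT>

<BR><FONT SIZE=3D2>&nbsp;% U+2020: E2 80 A0 (3 octets)</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\end{verbatim}</FONT>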

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; </FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; </FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;% \subsection{Mappings for TS1 glyphs}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;% Exercise for somebody else.</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;% \subsection{Mappings for \texttt{latex.ltx} =
glyphs}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;% There is also a collection of characters =
already set up in the kernel,</FONT>

<BR><FONT SIZE=3D2>&nbsp;% one way or the other. Since these do not =
clearly relate to any</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; particular font encoding =
they are mapped when the</FONT>

<BR><FONT SIZE=3D2>&nbsp;% \texttt{utf8} support is first set up. =
</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;% Also there are a number of |\providecommand|s =
in the various input</FONT>

<BR><FONT SIZE=3D2>&nbsp;% encoding files which may or may not go into =
this part.</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \begin{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&lt;*utf8&gt;</FONT>

<BR><FONT SIZE=3D2>&nbsp;% This space is intentionally empty ...</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&lt;/utf8&gt;</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \end{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;% \section{A test document} </FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; Here is a very small test =
document which may or may not survive</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; if the current document is =
transferred from one place to the</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; other.</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \begin{macrocode}</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&lt;*test&gt;</FONT>

<BR><FONT SIZE=3D2>&nbsp;\documentclass{article}</FONT>

<BR><FONT SIZE=3D2>&nbsp;</FONT>

<BR><FONT SIZE=3D2>&nbsp;\usepackage[latin1,utf8]{inputenc}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\usepackage[T1]{fontenc}</FONT>

<BR><FONT SIZE=3D2>&nbsp;\usepackage{trace}</FONT>

<BR><FONT SIZE=3D2>&nbsp;</FONT>

<BR><FONT SIZE=3D2>&nbsp;\begin{document}</FONT>

<BR><FONT SIZE=3D2>&nbsp;</FONT>

<BR><FONT SIZE=3D2>&nbsp; German umlauts in UTF-8: =
=C3=A4=C3=B6=C3=BC</FONT>

<BR><FONT SIZE=3D2>&nbsp;</FONT>

<BR><FONT SIZE=3D2>&nbsp;\inputencoding{latin1}&nbsp; % switch to =
latin1</FONT>

<BR><FONT SIZE=3D2>&nbsp;</FONT>

<BR><FONT SIZE=3D2>&nbsp; German umlauts in UTF-8 but read by latin1 =
(and will produce one</FONT>

<BR><FONT SIZE=3D2>&nbsp; error since \verb=3D\textcurrency=3D is not =
provided): =C3=A4=C3=B6=C3=BC</FONT>

<BR><FONT SIZE=3D2>&nbsp;</FONT>

<BR><FONT SIZE=3D2>&nbsp;\inputencoding{utf8}&nbsp;&nbsp;&nbsp; % switch =
back to utf8</FONT>

<BR><FONT SIZE=3D2>&nbsp;</FONT>

<BR><FONT SIZE=3D2>&nbsp; German umlauts in UTF-8: =
=C3=A4=C3=B6=C3=BC</FONT>

<BR><FONT SIZE=3D2>&nbsp;</FONT>

<BR><FONT SIZE=3D2>&nbsp;\showoutput</FONT>

<BR><FONT SIZE=3D2>&nbsp;\tracingstats=3D2</FONT>

<BR><FONT SIZE=3D2>&nbsp;\stop</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&lt;/test&gt;</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; \end{macrocode}</FONT>
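
<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; A note on the deliberate =
error, as a reading aid: under</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; |latin1| each octet is a =
character of its own, so the two</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; octets of the first umlaut =
are read as follows (names are the</FONT>

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; ISO 8859-1 ones):</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\begin{verbatim}</FONT>

<BR><FONT SIZE=3D2>&nbsp;% C3: LATIN CAPITAL LETTER A WITH TILDE</FONT>

<BR><FONT SIZE=3D2>&nbsp;% A4: CURRENCY SIGN (hence \textcurrency)</FONT>

<BR><FONT SIZE=3D2>&nbsp;%\end{verbatim}</FONT>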

<BR><FONT SIZE=3D2>&nbsp;%&nbsp;&nbsp;&nbsp; </FONT>

<BR><FONT SIZE=3D2>&nbsp;% \Finale</FONT>

<BR><FONT SIZE=3D2>&nbsp;%</FONT>

<BR><FONT SIZE=3D2>&nbsp;\endinput</FONT>

<BR><FONT SIZE=3D2>&nbsp;-------------- utf8ienc.dtx</FONT>

<BR><FONT SIZE=3D2>&nbsp;</FONT>
</P>

</BODY>
</HTML>
------_=_NextPart_001_01C2B68D.278D2280--