MIME-Version: 1.0
Content-Type: multipart/alternative;
	boundary="----_=_NextPart_001_01C2CB18.5E354C00"
In-Reply-To:  <Pine.LNX.4.44.0302010348050.29347-100000@gilas>
References: <15931.3562.730605.294877@istrati.mittelbach-online.de>            <Pine.LNX.4.44.0302010348050.29347-100000@gilas>
Content-class: urn:content-classes:message
Subject:      Re: latex/3480: Support for UTF-8 missing in inputenc.sty
Date: Mon, 3 Feb 2003 00:54:59 +0100
Message-ID: <15933.45011.419355.58318@istrati.mittelbach-online.de>
Thread-Topic:      Re: latex/3480: Support for UTF-8 missing in inputenc.sty
Thread-Index: AcLLGF6BoBhHfK0uTtuxEorNZ4GnoA==
From: "Frank Mittelbach" <frank.mittelbach@LATEX-PROJECT.ORG>
To: <LATEX-L@listserv.uni-heidelberg.de>
Reply-To: "Mailing list for the LaTeX3 project" <LATEX-L@listserv.uni-heidelberg.de>
Status: R

This is a multi-part message in MIME format.

------_=_NextPart_001_01C2CB18.5E354C00
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

Roozbeh,

 > > that's not the way it works in TeX, is it? at the time input encoding is
 > > translated to LICR we are before the decision for "text" or "math".  the
 > > naming conventions for the LICR objects are a bit dubious here as they often
 > > say "\text..." but that is the major goal for them, ie make the LICR objects
 > > work in text and with different font encodings. [...]
 >
 > Sorry Frank, I don't understand the LaTeX internals completely, so
 > although I'm trying my best to understand your description, I can't
 > suggest any good solution.

all i was trying to say is the following (and i don't really think there _is_
a problem to solve)

 - the LICR in current LaTeX consists of three sub parts:

   a) the majority of objects that only work in text (not math) and that have
   complex inner definitions, eg they change depending on the current text
   font encoding etc

   b) a noticeable number of objects that only work in math  --- and as an aside
   are of a fairly simple and static definition each (type \mathchardef ...)

   c) a fairly limited set of objects that work both in text and math,
   primarily the visible ascii characters plus a few odd symbols that have been
   set up this way.
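
to make the three classes concrete, here is a minimal sketch, assuming a
standard article class with the default OT1 encoding. the names \myexclamdown
and \myalpha are made up for illustration and are not part of the kernel; only
\$ is an existing kernel command.

```latex
\documentclass{article}
% class a): text-only, and the slot used depends on the current font encoding
% (\myexclamdown is a hypothetical name, made up for this sketch)
\DeclareTextSymbol{\myexclamdown}{OT1}{60}
\DeclareTextSymbol{\myexclamdown}{T1}{189}
% class b): math-only, one fixed definition (\myalpha is hypothetical as well)
\DeclareMathSymbol{\myalpha}{\mathord}{letters}{"0B}
\begin{document}
\myexclamdown\ in text, $\myalpha$ in math,
% class c): \$ is set up by the kernel to work in both modes
and \$ in text as well as $\$$ in math.
\end{document}
```

switching the document to T1 (via fontenc) would make \myexclamdown pick slot
189 instead of 60, while \myalpha stays exactly as it is.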

The main part that was really tackled when the system was built was a),
simply because b) was, is, and essentially will remain static (it is a feature
of mathematics that symbols essentially do not change according to surrounding
conditions, eg making a formula uppercase just because it is in a running
heading is not really a good idea, etc ...)

with the various limitations of TeX, size of memory etc, extending c) was also
out of the question in the early nineties.

nowadays this has changed and we can in fact provide a method (at least when
using e-TeX features) that allows us to make all LICR objects belong to type
c) in some sense

but since b) is nevertheless a static class of its own, my approach is that all
externally incoming characters are mapped first to class a) and then have an
internal mapping that translates such objects to something in class b) if used
in math (ie via inpmath)
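
the second step of that mapping could look roughly like the sketch below. this
is not the inpmath interface: it swaps in a plain \ifmmode test, and both
\DeclareTextMathPair and \mygreater are hypothetical names introduced only for
this illustration.

```latex
\documentclass{article}
% hypothetical helper (not the inpmath interface):
%   #1 is the object name, #2 its text form (class a), #3 its math form (class b)
\newcommand{\DeclareTextMathPair}[3]{%
  \DeclareRobustCommand{#1}{\ifmmode #3\else #2\fi}}
% \mygreater is a made-up name; \textgreater is the existing text LICR object
\DeclareTextMathPair{\mygreater}{\textgreater}{>}
\begin{document}
a \mygreater\ b in text and $a \mygreater b$ in math.
\end{document}
```

the input side then only ever has to know about the text name, and the choice
of the math symbol is made at the point of use.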

that would mean that, as far as unicode is concerned, we map everything to a),
ie the wonderful names like \textgreater or \"a ... (what i said is that those
names are a bit of a pity, but this is something that is not really possible
to change, though on the other hand not the end of the world either)
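
as a rough sketch of such a mapping table, assuming pdfLaTeX and the
\DeclareUnicodeCharacter declaration that the utf8 option of inputenc offers
in later LaTeX releases (it is not something available at the time of this
mail), each code point is simply tied to a class a) name:

```latex
\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
% each Unicode code point is mapped to a text LICR object (class a)
\DeclareUnicodeCharacter{00E4}{\"a}
\DeclareUnicodeCharacter{2020}{\textdagger}
\begin{document}
% the LICR names the declarations above resolve to, typed directly
\"a \textdagger
\end{document}
```

nothing in such a table needs to know about math; that decision is left to the
internal mapping described above.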

David's unicode file should then become much more straightforward.

has this explanation helped a bit?

good night
frank

------_=_NextPart_001_01C2CB18.5E354C00--