Subject: Re: latex/3480: Support for UTF-8 missing in inputenc.sty
Date: Mon, 3 Feb 2003 00:54:59 +0100
From: Frank Mittelbach
Reply-To: Mailing list for the LaTeX3 project

Roozbeh,

 > > that's not the way it works in TeX, is it? at the time input encoding is
 > > translated to LICR we are before the decision for "text" or "math".  the
 > > naming conventions for the LICR objects are a bit dubious here as they often
 > > say "\text..." but that is the major goal for them, ie make the LICR objects
 > > work in text and with different font encodings. [...]
 >
 > Sorry Frank, I don't understand the LaTeX internals completely, so
 > although I'm trying my best to understand your description, I can't
 > suggest any good solution.

all i was trying to say is the following (and i don't really think there _is_
a problem to solve)

 - the LICR in current latex has three sub parts:

   a) the majority of objects, which only work in text (not math) and which
   have complex inner definitions, eg they change depending on the current
   text font encoding etc

   b) a noticeable number of objects that only work in math --- and as an
   aside each has a fairly simple and static definition (type \mathchardef ...)

   c) a fairly limited set of objects that work both in text and math,
   primarily the visible ascii characters plus a few odd symbols that have been
   set up this way.
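
As a rough illustration of the three classes, here is a minimal sketch for pdfLaTeX. The commented declarations are quoted from memory of the kernel sources and are only meant to show the flavour of each class, not to be exact copies.

```latex
\documentclass{article}
\usepackage[T1]{fontenc}
\begin{document}

% class a): text-only, meaning depends on the current font encoding.
% The kernel declares such objects once per encoding, roughly:
%   \DeclareTextSymbol{\ss}{OT1}{25}
%   \DeclareTextSymbol{\ss}{T1}{255}
Stra\ss e costs \textsterling 5.

% class b): math-only and static, roughly:
%   \DeclareMathSymbol{\alpha}{\mathord}{letters}{"0B}
$\alpha \le \beta$

% class c): usable in both modes via an explicit mode switch, roughly:
%   \DeclareRobustCommand{\$}{\ifmmode\mathdollar\else\textdollar\fi}
\$ in text and $\$$ in math.

\end{document}
```

Objects of class a) typically produce a warning when used inside math, and objects of class b) fail outside math, which is the split the list above describes.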

The main part that was really tackled when the system was built was a), simply
because b) was, is, and essentially will be static (it is a feature of
mathematics that symbols essentially do not change according to surrounding
conditions, eg making a formula uppercase just because it is in a running
heading is not really a good idea, etc ...)

with the various limitations of TeX, memory size, etc, extending c) was also
out of the question in the early nineties.

nowadays this has changed and we can in fact provide a method (at least when
using e-TeX features) that allows us to make all LICR objects belong to type
c) in some sense

but since b) is nevertheless a static class of its own, my approach is that all
externally incoming characters are first mapped to class a), with an internal
mapping that then translates such objects to something in class b) if they are
used in math (ie via inpmath)
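
The two-step idea (map to a class a) name first, translate to a class b) object only when in math) could be sketched as follows. The names \DeclareMathFallback and \UseLICR are hypothetical helpers made up for this illustration; they are not the inpmath interface, and no e-TeX features are used here.

```latex
\documentclass{article}
\usepackage[T1]{fontenc}
\makeatletter
% hypothetical table: text LICR object -> math replacement
\newcommand\DeclareMathFallback[2]{%
  \expandafter\def\csname mathfallback@\string#1\endcsname{#2}}
% hypothetical dispatcher: in math, use the fallback if one was declared;
% otherwise (or in text) use the text LICR object unchanged
\newcommand\UseLICR[1]{%
  \ifmmode
    \@ifundefined{mathfallback@\string#1}%
      {#1}%
      {\csname mathfallback@\string#1\endcsname}%
  \else
    #1%
  \fi}
\makeatother

\DeclareMathFallback{\textgreater}{>}

\begin{document}
a \UseLICR{\textgreater} b and $a \UseLICR{\textgreater} b$
\end{document}
```

The point of the table is that the input side only ever needs to know the class a) name; whether a math equivalent exists is decided later, at the point of use.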

that would mean that, as far as unicode is concerned, we map everything to a),
ie to the wonderful names like \textgreater or \"a ... (what i said is that
those names are a bit of a pity, but this is something that is not really
possible to change, and on the other hand it is not the end of the world either)

David's unicode file should then become much more straightforward.
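
To make "map everything to a)" concrete, a unicode mapping of this kind might look like the sketch below. It assumes pdfLaTeX with the utf8 option of inputenc, which provides \DeclareUnicodeCharacter; the two code points are picked purely as examples and may already be declared by the standard mapping files.

```latex
\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}

% each incoming code point is mapped to a class a) LICR object only;
% nothing here needs to know about math
\DeclareUnicodeCharacter{00E4}{\"a}            % U+00E4 -> \"a
\DeclareUnicodeCharacter{00A3}{\textsterling}  % U+00A3 -> \textsterling

\begin{document}
Bär, £5
\end{document}
```

With this arrangement the mapping file is a flat list of code point to LICR name pairs, and any math handling lives in a separate layer.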

has this explanation helped a bit?

good night
frank
