From: Lars Hellström
Sender: "Mailing list for the LaTeX3 project"
To: "Multiple recipients of list LATEX-L"
Reply-To: "Mailing list for the LaTeX3 project"
Subject: Re: LaTeX's internal char prepresentation (UTF8 or Unicode?)
Date: Fri, 23 Feb 2001 17:08:37 +0100

I would like to point out that the debate on the LICR and related matters
has mainly dealt only with what one might call the LaTeX Text Character
Model (LTCM), but there is another character model in current LaTeX which
should also be given some thought: the LaTeX Math Character Model (LMCM).
Possibly one could also distinguish a LaTeX Verbatim Character Model (LVCM)
(sorry about all these acronyms), but I'm less certain about that one.

Luckily matters may be easier in these models because there we don't have
to deal with that multilingual complex of problems which no one completely
understands because no one knows all the languages.

Concerning the LMCM, I believe the expressed opinion was that Greek and
Cyrillic letters (as input characters) should be allowed in math, but that
symbols outside ASCII should not (except when necessary for compatibility
reasons). I suspect user demands may make the latter problematic if the
input encoding becomes Unicode (in some form), especially if the math
characters there get well sorted out, but that is a distant problem. In the
world of 8-bit encodings a restriction of input symbols in math to ASCII is
probably the right thing to do.

Allowing Greek letters does, however, raise some interesting problems. Many
of the Greek letters have var-forms in the current math fonts, so which
form should the input letter select? E.g. \epsilon and \varepsilon are
hardly distinct enough to count as different letters/symbols; they are
merely different glyphs, so which one should it be? I for one much prefer
\varepsilon, so I would like to have some interface which lets the user
select this.
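
To make the idea concrete, here is a minimal sketch of what such an
interface could look like. The names \UseEpsilonGlyph, \glyphepsilonlunate
and \glyphepsilonvar are hypothetical and exist nowhere in LaTeX; the
sketch only assumes the standard CM definitions of \epsilon and
\varepsilon. An input encoding would then map the Greek input letter to
\epsilon, and the user's choice decides which glyph appears.

```latex
\documentclass{article}
% Keep both glyphs reachable under glyph-specific names (hypothetical).
\let\glyphepsilonlunate\epsilon
\let\glyphepsilonvar\varepsilon
% \UseEpsilonGlyph{lunate|var}: hypothetical user-level switch that
% decides which glyph the character command \epsilon selects.
\newcommand*{\UseEpsilonGlyph}[1]{%
  \expandafter\let\expandafter\epsilon\csname glyphepsilon#1\endcsname}
\UseEpsilonGlyph{var}
\begin{document}
$\epsilon > 0$ % typeset with the \varepsilon glyph
\end{document}
```

The point of the sketch is only that the document source names the
character, while the glyph choice is made once, in the preamble.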

In a more general view, one should perhaps try to clear up the LMCM so that
the user commands select characters (or character plus math class) rather
than glyphs. This could make it easier to provide new math fonts in that
one wouldn't have to concentrate on providing precisely the same set of
glyphs as the CM math fonts do, but could provide more (very tricky these
days, as new glyph forms require new commands that make documents which use
them incompatible with other math fonts) or fewer (possible by duplicating
the glyphs) forms of the characters as it suits the design.
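
As an illustration of the "fewer forms" case, the following sketch shows
how a math font with only one epsilon shape could still support both
commands by pointing them at the same slot. It uses the existing
\DeclareMathSymbol interface, which already ties a command to a math
class, a symbol font and a slot; the choice of slot "0F in the `letters'
symbol font (the lunate epsilon in CM) merely stands in for whatever a
real font would offer.

```latex
\documentclass{article}
% Sketch: a design with a single epsilon shape. Both character commands
% are declared as ordinary symbols at the same slot, so documents using
% either command still compile and print the one available glyph.
\DeclareMathSymbol{\epsilon}{\mathord}{letters}{"0F}
\DeclareMathSymbol{\varepsilon}{\mathord}{letters}{"0F} % duplicated glyph
\begin{document}
$\epsilon = \varepsilon$
\end{document}
```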

Lars Hellström
