From: "Hans Aberg"
Sender: "Mailing list for the LaTeX3 project"
To: "Multiple recipients of list LATEX-L"
Reply-To: "Mailing list for the LaTeX3 project"
Subject: Re: LaTeX's internal char prepresentation (UTF8 or Unicode?)
Date: Sat, 17 Feb 2001 20:27:17 +0100

At 12:54 -0500 2001/02/17, Barbara Beeton wrote:
>while this would obviously work for text in natural languages,
>unicode will never contain all the possible "embellished" letters
>and symbols used in math.  (and this may include instances with two
>or even more diacritics on a single letter or symbol.)  this set,
>while not infinite, is much too large to want to address even using
>the unicode private area.  but for latex (or any successor) to be
>useful for the particular content for which tex was first developed,
>this has to be taken into account.

I was not thinking of math in particular, but of the other combining
symbols:

In some cases Unicode has a single symbol for a combined math character
(the negation of <=> may have its own symbol, for example), but in other
cases it might not, so that one still has to write \not\myrelation. (I
do not know whether Unicode has changed lately and now has a lot of math
combining characters.)
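
As an illustration of the point above, here is a minimal Python sketch
that asks the Unicode data shipped with the interpreter (which is newer
than the Unicode version discussed in this thread) whether a relation
followed by the combining long solidus overlay U+0338 composes into a
single code point. The helper names negate and has_precomposed_negation
are made up for this example.

```python
import unicodedata

COMBINING_LONG_SOLIDUS_OVERLAY = "\u0338"


def negate(relation: str) -> str:
    """Append U+0338 and let NFC compose it if a single symbol exists."""
    return unicodedata.normalize(
        "NFC", relation + COMBINING_LONG_SOLIDUS_OVERLAY
    )


def has_precomposed_negation(relation: str) -> bool:
    """True if the negated relation collapses to one code point."""
    return len(negate(relation)) == 1


if __name__ == "__main__":
    # "=", the double arrow <=> (U+21D4), and the slanted
    # less-than-or-equal sign (U+2A7D).
    for symbol in ("=", "\u21d4", "\u2a7d"):
        print(f"U+{ord(symbol):04X}", has_precomposed_negation(symbol))
```

Where the function returns False, the negated form stays a base
character plus a combining mark, which is the case where one would
still write something like \not\myrelation.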

Actually, even though one can spend some interesting thought on what to
do with Unicode combining characters when they occur in math, I do not
think that the final solution will make much difference, because the
mathematicians will find out how to handle it.

(Or you will have to explain better what you have in mind.)

-- I can add that a simple method to allow different input encodings
when reading from a file <filename> could be to treat it by default as,
say, Unicode, unless there is an ASCII file named, say, <filename>.e
with information about the encoding. (One could also allow changing the
default encoding for different files by means of startup arguments.)
This file <filename>.e could contain very simple information, or
something as complex as the preprocessor you care to write, if, say, you
want mixed encodings or want to be able to switch between encodings
within the very same file. -- In effect, one is creating a mini-language
for reading encodings in such a way that TeX does not have to bother
about it.
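
A minimal Python sketch of the simplest form of this idea, under some
assumptions that are not part of the proposal itself: "Unicode" is
taken to mean UTF-8, the <filename>.e file holds just one encoding name
(with optional # comments), and the function names are invented for
illustration. The default parameter stands in for a default set by
startup arguments.

```python
from pathlib import Path

# Assumption: UTF-8 stands in for "Unicode" as the default encoding.
DEFAULT_ENCODING = "utf-8"


def sidecar_path(source: Path) -> Path:
    """Return the path of the hypothetical <filename>.e companion file."""
    return source.with_name(source.name + ".e")


def detect_encoding(source: Path, default: str = DEFAULT_ENCODING) -> str:
    """Return the encoding named in <filename>.e, else the default."""
    sidecar = sidecar_path(source)
    if not sidecar.exists():
        return default
    # The companion file is plain ASCII; "#" starts a comment.
    for line in sidecar.read_text(encoding="ascii").splitlines():
        name = line.split("#", 1)[0].strip()
        if name:
            return name
    return default


def read_source(source: Path, default: str = DEFAULT_ENCODING) -> str:
    """Decode the file before the text is handed on to the typesetter."""
    return source.read_bytes().decode(detect_encoding(source, default))
```

Mixed encodings, or switching encodings within one file, would need a
richer format for the companion file than the single name assumed here.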

  Hans Aberg
