From: "Hans Aberg"
To: "Multiple recipients of list LATEX-L"
Subject: Re: Multilingual Encodings Summary 2.2
Date: Sun, 13 May 2001 20:32:35 +0100

At 15:18 +0200 2001/05/13, Lars Hellström wrote:
>>> This is why current LaTeX converts everything to
>>>LICR before it is written to the .aux file: the elements of the input
>>>encoding (as Frank called them above) do not have a single welldefined
>>>meaning. What has been discussed is that one might used some form of
>>>Unicode (most likely UTF-8) in these files instead.
>>
>>Forget everything about variable sized characters as far as the extension
>>of TeX goes, and hook onto translators outside that recognize other
>>formats. Variable sized characters just complicates programming.
>
>Well, the \InputTranslation and \OutputTranslation primitives of Omega
>already provide that functionality, so there is no need to deal with
>variable-sized characters in the TeX programming. The problem is that one
>might want to employ additional sets of translations (which would then act
>on streams of equally-sized characters) between those extremes of the
>program, but Omega doesn't provide for this.

I am not sure what you mean here: UTF-8 is variable sized.

I suggested that every file not using a 32-bit character type be
accompanied by an additional file (in ASCII), identified by some kind of
file name ending, with information about the encoding. (For example, if
the file "<name>" is not 32-bit, there is also an ASCII file named
"<name>.encoding".)
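
To make the naming convention concrete, here is a minimal sketch in Python. The function name declared_encoding, the demo file name "chapter.tex", and the choice to store a bare codec name in the sidecar file are all assumptions for illustration; the proposal above only fixes the idea of an ASCII companion file.

```python
import os
import tempfile
from typing import Optional


def declared_encoding(path: str) -> Optional[str]:
    """Return the encoding named in the sidecar file '<path>.encoding'.

    Returns None when no sidecar exists, which under the convention
    sketched here means the file is already in the 32-bit format.
    """
    sidecar = path + ".encoding"
    if not os.path.exists(sidecar):
        return None
    # The sidecar itself is plain ASCII, so reading it needs no guessing.
    with open(sidecar, "r", encoding="ascii") as f:
        return f.read().strip() or None


def demo() -> tuple:
    """Query a hypothetical file before and after its sidecar exists."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "chapter.tex")
        with open(path, "wb") as f:
            f.write("Hellström".encode("utf-8"))
        before = declared_encoding(path)
        with open(path + ".encoding", "w", encoding="ascii") as f:
            f.write("utf-8\n")
        after = declared_encoding(path)
    return before, after
```

Calling demo() returns the lookup result without the sidecar and then with it, so a wrapper around the typesetter could use the same check to decide whether a converter has to run at all.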

This way, one can provide as many IO code converters as one bothers to
write, without the extended TeX ever knowing anything about it. (If Omega
uses C++ for IO, one can use something called a codecvt, the standard
library's locale facet for character code conversion. Or use pipes,
where available.)
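
As a rough illustration of such an external converter, the following Python sketch turns bytes in any declared encoding into a list of code points, each of which fits in a fixed 32-bit unit, and back again. The function names are hypothetical, and Python's built-in codecs stand in for whatever codecvt or pipe-based translator one would actually write.

```python
from typing import List


def decode_to_code_points(raw: bytes, encoding: str) -> List[int]:
    """Translate bytes in the given encoding to fixed-size code points.

    This is the only place that has to understand variable-sized input;
    everything downstream sees one integer per character.
    """
    return [ord(ch) for ch in raw.decode(encoding)]


def encode_from_code_points(points: List[int], encoding: str) -> bytes:
    """Translate fixed-size code points back to bytes in the given encoding."""
    return "".join(chr(p) for p in points).encode(encoding)


# The same name stored in two different external encodings.
utf8_bytes = "Hellström".encode("utf-8")   # 'ö' occupies two bytes here
mac_bytes = b"Hellstr\x9am"                # 'ö' is the single byte 0x9A here

from_utf8 = decode_to_code_points(utf8_bytes, "utf-8")
from_mac = decode_to_code_points(mac_bytes, "mac_roman")
```

Both inputs end up as the same sequence of nine code points, which is the point of keeping the translators outside: the program in the middle never needs to know which encoding the file used.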

  Hans Aberg
