From: Hans Aberg
Sender: Mailing list for the LaTeX3 project
To: Multiple recipients of list LATEX-L
Subject: Re: LaTeX's internal char prepresentation (UTF8 or Unicode?)
Date: Sun, 18 Feb 2001 17:02:19 +0100
In-Reply-To: <200102181055.f1IAt4i20466@smtp.wanadoo.es>

At 11:51 +0100 2001/02/18, Javier Bezos wrote:
>> Since we have been asked to provide input encoding changes for LaTeX within
>> paragraphs, eg for individual words, something like this would happen if such
>> a change appears, say, inside the argument of \section.
>
>A system to coordinate preprocess and "internal" process is necessary.

The way I thought of it, the preprocessor should be able to handle mixed
encodings. -- Thus, the extended TeX (and LaTeX) only sees Unicode
characters, and nothing else.

Also, I think that the use of multiple encodings in a single file is a
pretty transitory thing: MacOS X, due for release in a regular version next
month, supports Unicode fully -- so editors able to handle Unicode will
become accessible pretty soon (within a few years at most), as the
availability of Unicode on personal computers will push developments a
great deal. And the reason for using multiple encodings is probably the
lack of editors that can handle Unicode.

So, I do not think it matters if one uses a seemingly complicated system,
with additional files specifying the encoding, for now: People will probably
soon want to be able to translate their multiple-encoding files into single
Unicode-encoded files instead. (If you formerly wrote files with a mixture
of, say, Russian and Latin encodings, where it was only possible to see the
correct renderings by changing the settings of the editor, then when you
get hold of a Unicode editor, the first thing you will want is to be rid of
the bother of changing the editor settings all the time.
Thus, you would want a convenient way of converting your old files to
Unicode so that your new editor can read them. Therefore it is best if
these old mixed-encoding files already have a markup that admits an easy
conversion to Unicode.)
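
As a rough illustration of what such a conversion could look like, here is a
minimal sketch in Python. Everything specific in it is an assumption made for
the example: the marker syntax \ENC{name} is hypothetical (it is not an
existing LaTeX command), the names are Python codec names, and all encodings
are taken to be 8-bit and ASCII-compatible, so that the marker itself can be
found in the raw bytes before decoding.

```python
import re

# Hypothetical marker: \ENC{codec-name} switches the encoding of the bytes
# that follow it, until the next marker. This syntax is an assumption made
# for illustration only; it is not an existing LaTeX command.
MARKER = re.compile(rb"\\ENC\{([A-Za-z0-9_-]+)\}")


def to_unicode(raw: bytes, default: str = "latin-1") -> str:
    """Decode a mixed-encoding byte string into one Unicode string.

    Assumes every encoding used is 8-bit and ASCII-compatible, so the
    ASCII marker can be located in the undecoded bytes.
    """
    pieces = []
    codec = default
    pos = 0
    for match in MARKER.finditer(raw):
        # Decode the span before the marker with the encoding in force.
        pieces.append(raw[pos:match.start()].decode(codec))
        # The marker itself is dropped; it only selects the next encoding.
        codec = match.group(1).decode("ascii")
        pos = match.end()
    # Decode whatever follows the last marker.
    pieces.append(raw[pos:].decode(codec))
    return "".join(pieces)


# A small sample mixing Latin-1 and KOI8-R (Russian) spans.
sample = (
    "na\u00efve ".encode("latin-1")
    + b"\\ENC{koi8-r}"
    + "\u043c\u0438\u0440".encode("koi8-r")
    + b"\\ENC{latin-1} caf\xe9"
)
print(to_unicode(sample))
```

The same routine could serve either as the preprocessor step, so that TeX only
ever sees Unicode, or as a one-off converter that writes the result out as a
single UTF-8 file.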

It is always difficult to judge the future, but, well, this is my guess.

  Hans Aberg
