From: "Roozbeh Pournader"
Sender: "Mailing list for the LaTeX3 project"
To: "Multiple recipients of list LATEX-L"
Subject: Re: LaTeX's internal char prepresentation (UTF8 or Unicode?)
Date: Mon, 12 Feb 2001 12:55:31 +0100
In-Reply-To: <200102121113.LAA03110@nag.co.uk>

On Mon, 12 Feb 2001, David Carlisle wrote:

> 1) make every character active and look ahead to see if it is being
>    followed by a combining char.
>    This is possible and fun to code in TeX but I don't really think it
>    is a long term stable solution.
>
> 2) use perl (or anything else) to detect all combining characters
>    and replace them by some command placed before the base.
>    This is quick and easy to arrange, but if you are having a perl
>    pre-pass before TeX, it may as well go further and decode the
>    entire character stream into "latex internal form" ie 7bit ascii tex
>    markup. In which case we may as well stay with that markup as latexs
>    internal form.

These two are not clean enough. That's the reason Omega OTPs exist. Yannis
and John disliked these two approaches. They now avoid active characters
as much as possible (e.g. '~' is not active anymore), and also avoid
pre-passes. Both solutions were needed when I was working on FarsiTeX: I
needed a pre-pass to do contextual shaping, and I needed active Tatweels
inserted between letters to stretch them to fit the line of text. I don't
need either of them now.
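
For anyone who has not seen the pre-pass idea (option 2 in the quoted text),
here is a minimal sketch of what such a filter might look like. It is written
in Python rather than perl, and it is only an illustration: the macro name
\combine and its two-argument form are hypothetical, not something LaTeX or
Omega defines, and marks are emitted as hex code points simply to keep the
output 7-bit where the base is ASCII.

```python
import unicodedata


def move_combining_before_base(text, command=r"\combine"):
    """Rewrite each base+combining-marks run as \\combine{marks}{base}.

    The macro name is a hypothetical placeholder; marks are written as
    comma-separated hex code points.
    """
    out = []
    i = 0
    n = len(text)
    while i < n:
        base = text[i]
        # Collect every combining character that directly follows text[i].
        j = i + 1
        while j < n and unicodedata.combining(text[j]) != 0:
            j += 1
        marks = text[i + 1:j]
        if marks and unicodedata.combining(base) == 0:
            codes = ",".join(f"{ord(m):04X}" for m in marks)
            out.append(f"{command}{{{codes}}}{{{base}}}")
        else:
            # No marks, or a stray mark with no base: copy through unchanged.
            out.append(text[i:j])
        i = j
    return "".join(out)


if __name__ == "__main__":
    # 'e' followed by U+0301 COMBINING ACUTE ACCENT
    print(move_combining_before_base("caf" + "e\u0301"))
```

The sketch relies on the canonical combining class being non-zero, so it
catches accents and Arabic vowel marks but not things like Tatweel, which is
an ordinary (non-combining) character.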

BTW, it seems that I have turned into a fan of Omega without any big
experience with it; I've only played with it.

--roozbeh
