From: "Hans Aberg"
To: "Multiple recipients of list LATEX-L"
Subject: Re: Multilingual Encodings Summary
Date: Tue, 13 Feb 2001 15:10:54 +0100

At 14:28 +0100 2001/02/13, Marcel Oliver wrote:
>2.1. Problems with Current TeX:
>
>It has been remarked that TeX does not really have an "internal
>representation".  Rather, TeX keeps text as a string of ASCII
>characters that are re-parsed through the one-and-only TeX parser
>whenever something is to be done with it.  (TeX gurus: is this
>simplistic statement essentially correct???)

I am not a TeX guru, but I get the impression that TeX's input looks like
this:
  <string of TeX tokens> <not yet gulped up ASCII (or 8-bit)>
The <string of TeX tokens> buffer is normally empty, but sometimes a macro
may insert a string of tokens (perhaps a macro expansion can be viewed as
though the body is first inserted in this buffer before being evaluated).
The <not yet gulped up ASCII (or 8-bit)> buffer is read and converted into
tokens as needed. TeX does not back-track.
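
To make that picture concrete, here is a toy sketch in Python. It is only
my impression turned into code, not how TeX is actually implemented: the
class ToyInput, its method names, and the macro table are all made up for
illustration, and it ignores category codes, macro arguments, and the
skipping of spaces after control words.

```python
from collections import deque


class ToyInput:
    """Toy model: pending tokens sit in front of the unread characters."""

    def __init__(self, text, macros=None):
        self.pending = deque()      # <string of TeX tokens>, normally empty
        self.chars = deque(text)    # <not yet gulped up ASCII (or 8-bit)>
        self.macros = macros or {}  # hypothetical: macro name -> body tokens

    def _read_token(self):
        """Turn unread characters into one token, only when one is needed."""
        c = self.chars.popleft()
        if c != "\\":
            return c
        name = ""
        while self.chars and self.chars[0].isalpha():
            name += self.chars.popleft()
        return "\\" + name

    def next_token(self):
        """Return the next unexpandable token, or None at end of input."""
        while True:
            if self.pending:
                tok = self.pending.popleft()
            elif self.chars:
                tok = self._read_token()
            else:
                return None
            if tok in self.macros:
                # Expansion: the body goes to the front of the token buffer.
                self.pending.extendleft(reversed(self.macros[tok]))
                continue
            return tok


def all_tokens(text, macros=None):
    """Drain a ToyInput into a list of tokens."""
    inp = ToyInput(text, macros)
    out = []
    while True:
        tok = inp.next_token()
        if tok is None:
            return out
        out.append(tok)
```

Calling all_tokens(r"a\foo b", {r"\foo": ["x", "y"]}) consumes every
character exactly once, left to right; nothing is ever pushed back into
the character buffer, which is the sense in which there is no
back-tracking.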

>- Hyphenation patterns are specified in terms of the output encoding.
>  This means that every character appearing in the hyphenation rules
>  must have a physical slot in the selected font.  However, logically
>  hyphenation should not depend on output encoding, and one should be
>  able to mix fonts with different output encodings without losing
>  correct hyphenation.

I get the impression that this is the result of TeX's inability to
create suitable objects: If TeX were able to first create objects of type
"word", to which other operations such as hyphenation are applied, then
this kind of problem would go away.
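
A rough sketch of what I mean, again in Python and again purely
illustrative: the Word class, the BREAKS table, and the two encodings
ENC_A and ENC_B are made up for this example and do not correspond to
anything in TeX. The break positions are looked up on the logical
characters of the word, and only afterwards are the characters mapped to
font slots.

```python
from dataclasses import dataclass

# Hypothetical break table keyed on logical characters, not on font slots.
# The entry stands for "häu-fig": a break is allowed after 3 characters.
BREAKS = {"häufig": [3]}

# Two made-up output encodings that put the letter ä in different slots.
ENC_A = {**{c: ord(c) for c in "hufig"}, "ä": 0xE4}
ENC_B = {**{c: ord(c) for c in "hufig"}, "ä": 0x8A}


@dataclass
class Word:
    chars: str  # logical characters, independent of any font

    def hyphenate(self):
        """Break positions, found without looking at any output encoding."""
        return BREAKS.get(self.chars, [])

    def typeset(self, encoding):
        """Map characters to font slots only after hyphenation is known."""
        slots = [encoding[c] for c in self.chars]
        return slots, self.hyphenate()
```

With this arrangement Word("häufig").typeset(ENC_A) and
Word("häufig").typeset(ENC_B) give different slot lists but the same
break positions, so mixing fonts with different encodings would not
disturb hyphenation.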

Let's hear from some TeX gurus how TeX really works.

>- Unicode is currently receiving a lot of attention and publicity.  So
>  it may be advantageous to ride that wave, in particular as it seems
>  technically sound.

The new MacOS X (based on Mach 3 and 4.4BSD), which currently exists as a
beta and is due for regular release at the end of this coming March,
evidently supports Unicode fully. The main point is that if personal
computers now finally support Unicode, Unicode will soon become ubiquitous.

  Hans Aberg
