From: "jbezos"
Sender: "Mailing list for the LaTeX3 project"
To: "Multiple recipients of list LATEX-L"
Reply-To: "Mailing list for the LaTeX3 project"
Subject: Re: Multilingual Encodings Summary 2.0
Date: Mon, 19 Feb 2001 18:18:06 +0100

> Question raised by this: Can OCPs output control sequences, or do they just
> produce characters?

They can output any token (IIRC there is a bug
when \input is used, but I'm not sure). In fact,
tokens are necessary when translating Unicode to,
say, OT1.
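
To make the OT1 point concrete, here is a rough Python sketch (not OCP syntax, and not Omega's actual translation tables). OT1 has no precomposed accented letters, so an accented Unicode character has to come out as an accent command plus a base letter, that is, as tokens rather than a single character. The function name and the small accent table are made up for the illustration.

```python
import unicodedata

# Hypothetical table: Unicode combining mark -> LaTeX accent command.
ACCENT_COMMANDS = {
    "\u0300": "\\`",   # grave
    "\u0301": "\\'",   # acute
    "\u0302": "\\^",   # circumflex
    "\u0303": "\\~",   # tilde
    "\u0308": '\\"',   # diaeresis
}

def unicode_to_ot1_source(text):
    """Rewrite accented letters as accent command + base letter."""
    pieces = []
    for ch in text:
        # NFD splits a precomposed letter into base letter + combining mark.
        decomposed = unicodedata.normalize("NFD", ch)
        if len(decomposed) == 2 and decomposed[1] in ACCENT_COMMANDS:
            base, mark = decomposed
            pieces.append(ACCENT_COMMANDS[mark] + "{" + base + "}")
        else:
            pieces.append(ch)  # passed through unchanged
    return "".join(pieces)

print(unicode_to_ot1_source("caf\u00e9"))  # prints caf\'{e}
```

A real table would need more care (for instance, an accented i wants the dotless \i as its base), which this sketch ignores.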

>Can one specify what catcode the characters should have?

Unfortunately not. The catcodes used are the
catcodes in effect when the replacement is done.
That means that "private" names containing @ cannot
be used (in general, or if \csname is used).
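
A toy Python model of why the current catcodes matter for such names (a simplification of TeX's tokenizer, not Omega code; the function name and the `letters` parameter are invented for the example). A control word is a backslash followed by the longest run of characters that currently have the letter catcode, so whether @ counts as a letter decides where the name ends.

```python
import string

def tex_tokens(text, letters):
    """Split text into TeX-like tokens.

    `letters` is the set of characters that currently have the
    letter catcode (an assumption standing in for the catcode table).
    """
    tokens = []
    i = 0
    while i < len(text):
        if text[i] == "\\":
            j = i + 1
            if j < len(text) and text[j] in letters:
                # Control word: take the maximal run of letters.
                while j < len(text) and text[j] in letters:
                    j += 1
            elif j < len(text):
                # Control symbol: exactly one non-letter character.
                j += 1
            tokens.append(text[i:j])
            i = j
        else:
            tokens.append(text[i])
            i += 1
    return tokens

normal = set(string.ascii_letters)
print(tex_tokens(r"\my@name", normal))          # ['\\my', '@', 'n', 'a', 'm', 'e']
print(tex_tokens(r"\my@name", normal | {"@"}))  # ['\\my@name']
```

With the normal catcodes the intended private name falls apart into \my followed by loose characters; only when @ is a letter at that moment does it survive as one control sequence.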

> Another question regarding OCPs: Is the OCP model general enough to support
> a reasonable size OCP that translates UTF-8 to 32-bit Unicode, or is it
> necessary to implement the entire translation as a gigantic (2^31 entries)
> table?

It is general enough: no table is needed, because
OCPs can do calculations. The utf8 OCP takes about
ten lines of code. The uppercase OCP is also very
short because most uppercase variants follow a simple
rule in the Unicode Standard (the code point is
offset by either -1 or -32).
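
For illustration only, the kind of arithmetic involved, written in Python rather than as an OCP (this is not the source of the utf8 or uppercase OCPs). The function names are invented, the decoder assumes the modern one-to-four-byte form of UTF-8 and does not reject overlong encodings, and the case rule covers just three ranges picked as examples.

```python
def decode_utf8(data):
    """Decode UTF-8 bytes to a list of code points by bit arithmetic."""
    code_points = []
    i = 0
    while i < len(data):
        lead = data[i]
        # The lead byte gives the sequence length and the top payload bits.
        if lead < 0x80:
            extra, cp = 0, lead
        elif lead >> 5 == 0b110:
            extra, cp = 1, lead & 0x1F
        elif lead >> 4 == 0b1110:
            extra, cp = 2, lead & 0x0F
        elif lead >> 3 == 0b11110:
            extra, cp = 3, lead & 0x07
        else:
            raise ValueError("invalid lead byte at offset %d" % i)
        trail = data[i + 1:i + 1 + extra]
        if len(trail) != extra or any(b >> 6 != 0b10 for b in trail):
            raise ValueError("bad continuation bytes after offset %d" % i)
        for b in trail:
            # Each continuation byte contributes its low six bits.
            cp = (cp << 6) | (b & 0x3F)
        code_points.append(cp)
        i += 1 + extra
    return code_points

def upper_by_rule(cp):
    """Uppercase a code point using only the -32 and -1 offsets."""
    if 0x61 <= cp <= 0x7A:                      # ASCII a-z
        return cp - 32
    if 0xE0 <= cp <= 0xFE and cp != 0xF7:       # Latin-1 letters, skipping the division sign
        return cp - 32
    if 0x0100 <= cp <= 0x012F and cp % 2 == 1:  # Latin Extended-A pairs
        return cp - 1
    return cp                                   # everything else left alone here

print(decode_utf8("\u20ac".encode("utf-8")))    # [8364]
print(hex(upper_by_rule(0x0101)))               # 0x100
```

Characters outside those ranges (and exceptions such as U+00FF) would still need explicit entries, but the bulk is covered by the two offsets.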

Javier
