Message-ID: <4B7AB267.6030909@elzevir.fr>
Date: Tue, 16 Feb 2010 15:57:43 +0100
From: Manuel Pégourié-Gonnard
Reply-To: Mailing list for the LaTeX3 project
Subject: Re: LaTeX3 8-bit only?
To: LATEX-L@listserv.uni-heidelberg.de
In-Reply-To: <3A6B880C-91AA-4597-8E53-92F72E3FA70A@yahoo.de>

Hi,

Philipp Stephani wrote:
> Current implementation strategies for strings in development environments
> define one Unicode encoding scheme (UTF-16 in nearly all cases like Windows,
> Java, Python, Qt, .NET, COM, Cocoa, Carbon; a few technologies like Gnome and
> Emacs choose UTF-8 instead) that is used exclusively for internal processing,
> and define "strings" as sequences of UTF-16 or UTF-8 code units. LaTeX could
> do the same, depending on the engine: UTF-8 for pdfTeX, UTF-16 for XeTeX.
> Other possibilities (e.g. LICR or UTF-32) are probably either too complicated
> or not flexible enough.

For the record, LuaTeX uses what you might call UTF-32 internally (a
"character" is a Unicode code point, no more, no less).

My humble opinion is that LaTeX3 should define a character as being whatever
the underlying engine thinks is a character. That is, a "character" should be
a "character token" (with the catcode ignored or, equivalently, normalised):

- for pdfTeX, an 8-bit number
- for XeTeX, a 16-bit number
- for LuaTeX, a number in the range 0 -- 0x10ffff

This way, the format does not need to hack extensively (as LaTeX2e does)
around the engine's limitations; it can let the engine do its job and
concentrate on its own job as a macro package. (Sort of Unix philosophy: do
one thing, do it well.)

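To make the three notions of "character" above concrete, here is a minimal
sketch in Python (not TeX). It assumes UTF-8 input for the 8-bit view and
UTF-16 for the 16-bit view; the function name `code_units` and the sample
string are purely illustrative and do not come from any engine or package.

```python
def code_units(text: str) -> dict:
    """Return the numbers each view would treat as individual 'characters'.

    Illustrative only: the three views mirror the list above
    (8-bit numbers, 16-bit numbers, Unicode code points).
    """
    # 8-bit view: one number per UTF-8 byte (assumed input encoding).
    utf8_units = list(text.encode("utf-8"))

    # 16-bit view: one number per UTF-16 code unit (surrogates count separately).
    raw16 = text.encode("utf-16-be")
    utf16_units = [
        int.from_bytes(raw16[i:i + 2], "big") for i in range(0, len(raw16), 2)
    ]

    # Code-point view: one number in the range 0 -- 0x10ffff per character.
    code_points = [ord(ch) for ch in text]

    return {"8-bit": utf8_units, "16-bit": utf16_units, "code points": code_points}


if __name__ == "__main__":
    # U+00E9 (inside the BMP) followed by U+1D49C (outside the BMP).
    sample = "\u00e9\U0001D49C"
    for view, units in code_units(sample).items():
        print(f"{view}: {len(units)} unit(s): {[hex(u) for u in units]}")
```

The same two-character string yields a different number of "characters" in
each view, which is exactly the difference a macro package would either have
to paper over or simply accept.
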
I mean, LaTeX2e *had to* hack around the encoding limitations of pdfTeX
because there was no alternative, but now there are alternatives.

Just my 2 cents.

Manuel.