From: "Frank Mittelbach"
Reply-To: "Mailing list for the LaTeX3 project"
Subject: Re: latex/3480: Support for UTF-8 missing in inputenc.sty
Date: Wed, 8 Jan 2003 16:15:13 +0100

David Carlisle writes:

 > More serious problems (which make me wonder if it's worth the effort of
 > supporting utf8 in a standard TeX) are combining characters.

we are aware of this, and you are of course right that this is something
one is not going to resolve in TeX (despite its Turing completeness)

 > In xmltex you can make these work by making every possible base
 > character active and look ahead for a following combiner, but that is
 > turned off by default as it's not exactly fast or robust.
 > In LaTeX you can't do much other than make a combining accent generate an
 > error as you can't really make the base ascii characters active if you
 > are using the \abc style markup.

right. well, all of this really comes from the problem that, say, Red Hat
turned UTF-8 on as its default, with the result that people writing "ordinary"
documents suddenly find that LaTeX will not process them.

it might be that in time this will be as insufficient as not having anything, but
once the whole world has gone Unicode :-) we will definitely need to use some TeX
successor that can handle Unicode data natively anyway
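
To make the combining-character problem concrete, here is a minimal Python sketch (not TeX, and not how xmltex or inputenc is implemented). It shows that the same visible letter can arrive either as one precomposed code point or as a base letter followed by a combining accent, and it imitates the look-ahead idea quoted above by attaching each combiner to the base character before it. The function name `group_with_combiners` is hypothetical, chosen only for this illustration.

```python
import unicodedata

def group_with_combiners(text):
    """Attach each combining mark to the base character that precedes it."""
    clusters = []
    for ch in text:
        # unicodedata.combining() returns a non-zero class for combining marks
        if unicodedata.combining(ch) and clusters:
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters

decomposed = "e\u0301"   # 'e' followed by COMBINING ACUTE ACCENT
precomposed = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE

# The accent comes *after* its base letter, so the base has to be held back
# until the next character has been inspected.
clusters = group_with_combiners("caf" + decomposed)

# The two spellings differ as byte sequences but are equivalent after
# NFC normalization.
same_after_nfc = unicodedata.normalize("NFC", decomposed) == precomposed
```

In TeX terms the awkward part is the hold-back: the base letter is an ordinary ASCII character, so something would have to intercept every such letter just in case a combiner follows, which is the cost David describes.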

 > The second thing that I have never really fixed in xmltex in this area
 > is that the style of mapping the input character to an internal csname
 > which you then map to a typesetting instruction is fine for supporting
 > small European based character sets, but it soon gets to be pain if
 > you are supporting large Asian character sets.

yes, and it is not there to provide that (even if in theory it would work)
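
The two-stage mapping discussed here can be sketched in Python as follows. This is an illustration under stated assumptions, not the actual inputenc or xmltex code: the internal names (`eacute`, `udieresis`) and the function `translate` are hypothetical, and the "typesetting instruction" is represented simply as a string of LaTeX accent markup.

```python
# Stage 1: input code point -> internal name (hypothetical names)
CODEPOINT_TO_NAME = {
    0x00E9: "eacute",
    0x00FC: "udieresis",
}

# Stage 2: internal name -> typesetting instruction (LaTeX markup as a string)
NAME_TO_INSTRUCTION = {
    "eacute": r"\'e",
    "udieresis": r'\"u',
}

def translate(utf8_bytes):
    """Map UTF-8 input to markup, passing ASCII through unchanged."""
    out = []
    for ch in utf8_bytes.decode("utf-8"):
        cp = ord(ch)
        if cp < 0x80:
            out.append(ch)  # plain ASCII needs no mapping
            continue
        name = CODEPOINT_TO_NAME.get(cp)
        if name is None:
            # No table entry: all this scheme can do is report an error
            raise KeyError(f"no mapping for U+{cp:04X}")
        out.append(NAME_TO_INSTRUCTION[name])
    return "".join(out)

result = translate("café".encode("utf-8"))
```

Every supported character costs one entry in each table. That is manageable for a few hundred European letters, but a large Asian character set would need thousands of entries, which is the scaling problem raised in the quoted paragraph.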

 > right font/character from the utf8 sequences. I never got this working
 > in xmltex though (as modifying anything in xmltex is a pain. It's not
 > the most documented piece of code ever produced)

isn't that the case? :-)

frank
