From: "Torsten Bronger"
Reply-To: "Mailing list for the LaTeX3 project"
Organization: Aachen University of Technology (RWTH)
Subject: Re: XML, UTF-8 and TeX engines
Date: Fri, 18 Jul 2003 19:52:09 +0100
In-Reply-To: William F. Hammond's message of "Fri, 18 Jul 2003 14:01:31 -0400"

Hello!

William F Hammond <hammond@CSC.ALBANY.EDU> writes:

> Torsten Bronger <bronger@PHYSIK.RWTH-AACHEN.DE> writes:
>
> [...]
>
>>> As LaTeX is evolving it will be possible for gellmu's "alpha"
>>> (an empty element marked up in Gellmu source as \alpha) to be
>>> formatted in LaTeX as (math) \alpha when recursively inside a
>>> math element and not inside either of gellmu's "mbox" or "text",
>>> while outside of math "alpha" could easily be morphed to a
>>> suitable unicode point.
>>
>> So you distinguish between both cases within your Gellmu tools?
>> Okay, we have to, I do so, too; but actually I think that this is
>> something that the typesetter should provide.  So, an \alpha in
>> math mode should be cmmi, and in text mode it must be part of a
>> Greek word.
>
> One way or another there should be a distinction.
>
> But I want gellmu article to be able to reach xhtml+mathml and for
> this I want to have a source markup way of identifying math
> symbols.

Granted, but eventually it's MathML, and then a following processor
must cope with a Unicode alpha.  And either it's something like my
Unicode --> LaTeX filter program, or it's the typesetter itself.  I
strongly prefer the latter, because all other variants I've seen so
far looked more or less like kludges.
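
To make the filter variant concrete, here is a minimal sketch of what a
Unicode --> LaTeX filter has to do. It is not the actual filter program
mentioned above; the function name, the two tiny mapping tables, and the
use of \textgreek for text mode (which assumes something like babel's
Greek support is loaded) are all assumptions for illustration. Math mode
is detected only by counting unescaped-looking "$" signs, which is
exactly the kind of guesswork that makes this approach feel like a
kludge.

```python
# Hypothetical, deliberately tiny mapping tables; a real filter needs many more.
# The trailing space after the math macros keeps "\alpha" from fusing with a
# following letter; TeX ignores spaces in math mode.
MATH_MAP = {"\u03b1": "\\alpha ", "\u03b2": "\\beta "}
# Assumes a text-mode Greek command such as \textgreek is available.
TEXT_MAP = {"\u03b1": "\\textgreek{a}", "\u03b2": "\\textgreek{b}"}


def unicode_to_latex(source: str) -> str:
    """Replace Greek letters according to whether they sit inside $...$."""
    converted = []
    # Splitting on "$" alternates text chunks (even index) and math chunks (odd).
    for index, chunk in enumerate(source.split("$")):
        table = MATH_MAP if index % 2 else TEXT_MAP
        converted.append("".join(table.get(ch, ch) for ch in chunk))
    return "$".join(converted)


print(unicode_to_latex("\u03b1 and $\u03b1+\u03b2$"))
```

The sketch ignores escaped dollar signs, display math, and environments,
so it only shows the principle: the same code point needs two different
outputs, and the filter has to reconstruct the context on its own.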

> For that purpose it is convenient for me to hold on to <alpha/>
> (the xml form of \alpha) until the end of any pipeline.  Beyond
> that I think it an inefficient use of xml structure to look
> individually at items of cdata.

I think so, too; however, db2latex and the MathML-->XSLT-->LaTeX
project (sorry, I don't know its SourceForge name at the moment)
apparently do something like that.

> So my formatter is willing to think about how to handle <alpha/>
> but not about how to handle α (which will be understood only as
> the unicode object that it is and which, therefore, should not be
> found loose inside math).

But then your formatter either stops once it has reached XML, or it
starts from a format that has limitations similar to LaTeX's.
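
For comparison, the element-based position quoted above could look
roughly like the following sketch. It is not gellmu's formatter; the
function names are made up, and only the element names "math" and
"alpha" are taken from the discussion. The formatter knows how to handle
the empty element, and it refuses a loose U+03B1 in the character data
of a math element.

```python
import xml.etree.ElementTree as ET

ALPHA = "\u03b1"  # the Unicode character that must not appear loose in math


def check_text(text):
    """Return character data unchanged, rejecting a loose Unicode alpha."""
    text = text or ""
    if ALPHA in text:
        raise ValueError("loose U+03B1 inside math; use <alpha/> instead")
    return text


def format_math(math):
    """Format a <math> element as inline LaTeX math (hypothetical formatter)."""
    parts = [check_text(math.text)]
    for child in math:
        if child.tag != "alpha":
            raise ValueError(f"unsupported element <{child.tag}>")
        parts.append("\\alpha ")
        # Character data following the empty element lives in its tail.
        parts.append(check_text(child.tail))
    return "$" + "".join(parts).strip() + "$"


print(format_math(ET.fromstring("<math>x + <alpha/></math>")))
```

This works as long as the input keeps the element form; once the
pipeline has produced MathML with a plain Unicode alpha, such a
formatter has nothing left to dispatch on.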

> (The last sentence is supposed to have a single U+03B1 that is
> UTF-8 encoded; I don't know what will happen in the mail.)

It arrived in one piece (but not as UTF-8).
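
A small sketch of what "in one piece, but not as UTF-8" can mean at the
byte level, assuming the message was re-encoded to ISO-8859-7 and sent
as quoted-printable somewhere along the way (the variable names are
only illustrative). In that charset the alpha is the single byte 0xE1,
written "=E1" in quoted-printable, whereas UTF-8 needs two bytes.

```python
import quopri

ALPHA = "\u03b1"  # GREEK SMALL LETTER ALPHA

utf8_bytes = ALPHA.encode("utf-8")        # what the sender intended
greek_bytes = ALPHA.encode("iso-8859-7")  # what a Greek 8-bit charset uses

# Decode the quoted-printable token back to the raw byte on the wire.
received = quopri.decodestring(b"=E1")

print(utf8_bytes.hex(), greek_bytes.hex(), received.decode("iso-8859-7") == ALPHA)
```

The character survives because the receiving side decodes the byte with
the same charset; decoded as UTF-8, the lone byte 0xE1 would be invalid.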

Bye,
Torsten.

--
Torsten Bronger, aquisgrana, europa vetus
