MIME-Version: 1.0
Content-Type: multipart/alternative;
	boundary="----_=_NextPart_001_01C34D03.F0FF2700"
In-Reply-To: <Pine.GSO.4.53.0307171639540.24023@sun06.ams.org> (Barbara Beeton's message of "Thu, 17 Jul 2003 16:43:27 -0400")
Organization: Aachen University of Technology (RWTH)
References: <20030710081528.A12401@diabolo.informatik.rwth-aachen.de> <16150.26432.179873.408825@pussy.npc.de> <m3smp5laza.fsf_-_@wilson.rwth-aachen.de> <200307171952.38152.tim@birdsnest.maths.tcd.ie> <m3vfu0aoic.fsf@wilson.rwth-aachen.de> <Pine.GSO.4.53.0307171639540.24023@sun06.ams.org>
User-Agent: Gnus/5.1002 (Gnus v5.10.2) Emacs/21.2 (gnu/linux)
Content-class: urn:content-classes:message
Subject: Re: XML, UTF-8 and TeX engines
Date: Fri, 18 Jul 2003 08:45:07 +0100
Message-ID: <m3he5k5ldo.fsf@wilson.rwth-aachen.de>
Thread-Topic: Re: XML, UTF-8 and TeX engines
Thread-Index: AcNNA/MkHgr2yB8QQLy3k1rUSRS0aQ==
From: "Torsten Bronger" <bronger@PHYSIK.RWTH-AACHEN.DE>
To: <LATEX-L@listserv.uni-heidelberg.de>
Reply-To: "Mailing list for the LaTeX3 project" <LATEX-L@listserv.uni-heidelberg.de>
Status: R

This is a multi-part message in MIME format.

------_=_NextPart_001_01C34D03.F0FF2700
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

Halloechen!

Barbara Beeton <bnb@AMS.ORG> writes:

> timothy murphy asked,
>
>     > And isn't it quite sensible to distinguish between text and maths?
>
> and torsten bronger responded,
>
>     XML doesn't do it and I find this very convenient.  In (La)TeX, for
>     many characters you need different commands for text and math mode.
>     I'd love to have a typesetting system to which I could pass a say
>     'small Greek letter alpha', and it would just work in every context.
>     No font families, no encodings, no active/special characters, no
>     babel settings, and no modes to worry about.  Wonderful ...
>
> not quite.  i'm with tim here.
>
> for math publication, it's traditional to have variables in
> italic.  it's also traditional to have theorems in italic.
> unless it's marked as math, how can you tell that "a" in a
> theorem is a variable or the english indefinite article?
> knuth tried to avoid this by
>  - using a slanted font instead of italic for theorems
>  - making a math italic that is ever so slightly wider than
>    text italic
> but it still definitely requires marking a math "a" as math.

You're right: the program -- or the XML file format -- must provide a
way to mark math areas, and it must apply the appropriate rules to
typeset them accordingly.

But what I said was "many characters you need different commands for
text and math mode".  In other words, those rules alone are not
enough.  I wrote (yet another ;) a set of Unicode --> LaTeX
replacements, and it's full of "\ifmmode ... \else ... \fi"
constructs.  So I must be aware of the current mode for *most*
characters.  One line reads, for example:

0x107   cacute                   "\ifmmode \acute{c}\else \'{c}\fi{}"

My dream is to just insert the UTF-8 sequence for 0x107 and have it
work.  Of course, a "cacute" doesn't make sense in math mode, which is
why LaTeX doesn't support such things; however, I cannot tell XML
authors which characters they are allowed to type.  Even the standard
latin1 inputenc option isn't math-proof.
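
To illustrate the table-driven replacement described above, here is a
minimal Python sketch.  It assumes that every table line has the three
whitespace-separated fields of the single example line (code point,
name, quoted replacement); the names parse_line and to_latex are
hypothetical and not taken from any existing tool.

```python
# Sketch only: the table format is inferred from the single example line,
# and parse_line / to_latex are hypothetical helper names.

TABLE_LINE = r'''0x107   cacute   "\ifmmode \acute{c}\else \'{c}\fi{}"'''


def parse_line(line):
    """Split one table line into (code point, name, LaTeX replacement)."""
    code, name, replacement = line.split(None, 2)
    # The replacement is wrapped in double quotes in the table.
    return int(code, 16), name, replacement.strip().strip('"')


def to_latex(text, table):
    """Replace every character that has a table entry; keep the rest."""
    return "".join(table.get(ord(ch), ch) for ch in text)


code_point, name, replacement = parse_line(TABLE_LINE)
table = {code_point: replacement}

# U+0107 occupies two bytes in UTF-8; this is what an XML author types.
utf8_bytes = "\u0107".encode("utf-8")

print(utf8_bytes)
print(to_latex("Babi\u0107", table))
```

The converter itself stays ignorant of the mode: it can only emit the
"\ifmmode" test and leave the decision to TeX at the point of use,
which is exactly the mode awareness complained about above.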

Tschoe,
Torsten.

--
Torsten Bronger, aquisgrana, europa vetus

------_=_NextPart_001_01C34D03.F0FF2700--