MIME-Version: 1.0
Content-Type: multipart/alternative;
	boundary="----_=_NextPart_001_01C0A02C.6FBC7200"
In-Reply-To:  <200102261652.QAA28993@penguin.nag.co.uk>
References: <v03110701b6c02763991a@[195.100.226.137]> (message from Hans Aberg            on Mon, 26 Feb 2001 16:37:33 +0100)             <v03110700b6bc6c238279@[195.100.226.147]> (message from Hans Aberg            on Fri, 23 Feb 2001 21:04:40 +0100)             <Pine.GSO.4.33.0102231112130.27803-100000@sun06.ams.org> (message            from Barbara Beeton on Fri, 23 Feb 2001 11:16:42 -0500)             <Pine.GSO.4.33.0102231112130.27803-100000@sun06.ams.org>            <v03110700b6bc6c238279@[195.100.226.147]>            <v03110701b6c02763991a@[195.100.226.137]>
Content-class: urn:content-classes:message
Subject:      Re: LaTeX's internal char prepresentation (UTF8 or Unicode?)
Date: Mon, 26 Feb 2001 20:34:45 +0100
Message-ID:  <v03110702b6c05adcb14a@[195.100.226.136]>
From: "Hans Aberg" <haberg@MATEMATIK.SU.SE>
Sender: "Mailing list for the LaTeX3 project" <LATEX-L@URZ.UNI-HEIDELBERG.DE>
To: "Multiple recipients of list LATEX-L" <LATEX-L@URZ.UNI-HEIDELBERG.DE>
Reply-To: "Mailing list for the LaTeX3 project" <LATEX-L@URZ.UNI-HEIDELBERG.DE>
Status: R

This is a multi-part message in MIME format.

------_=_NextPart_001_01C0A02C.6FBC7200
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

At 16:52 +0000 2001/02/26, David Carlisle wrote:
>> What characters are included in a set?
>you could look at the tables:-)

Yes, I did (although I could find no convenient archive from which to
download them, which made the process excruciatingly slow on my computer).

>>  If it is a-zA-Z0-9 plus undotted ij,
>> one set has 64 characters, giving room for 1024/64 =3D 16 sets.
>
>It varies from set to set. the basic collections are
>
>letters (a-z A-Z)
>digits
>greek (including variant greek forms)
>
>some of the alphabets don't have greek, some don't have greek or letters.

Looking into The TeXbook, I find 40 Greek characters not identical to
Latin ones. It might be tempting to add a full set of Greek letters, but
in _math_ it seems pointless: letters will mostly appear singly, with no
other context identifying them as Greek. (By contrast, in Greek text, one
knows from the context that they are semantically Greek letters, and they
may further be drawn from special Greek fonts, giving them a slightly
different look from the Latin letters, which may be drawn from a
different Latin font.)

If the Greek letters appear in shapes
  upright
  slanted
  bold
  bold slanted
that gives 4 x 40, or 160, characters.

This leaves room for at most (1024 - 160)/64, rounded down, or 13 Latin
sets. I think these should be
  Bold
  Italic
  Bold Italic
  Double-struck
  Calligraphic
  Bold Calligraphic
  Script
  Bold Script
  Fraktur
  Sans-serif
  Bold Sans-serif
  Sans-serif Italic
  Sans-serif Bold Italic
with no "Bold Fraktur" and no "Monospace".
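
As a rough check of the arithmetic above, here is a minimal Python sketch.
The constant names are only illustrative labels, and the figure of 64
characters per Latin set is the assumption quoted earlier (a-zA-Z0-9 plus
undotted ij), not something taken from the Unicode tables.

```python
# Sketch of the slot budget discussed above; all names are illustrative.
TOTAL_SLOTS = 1024                  # block size assumed in this thread
GREEK_LETTERS = 40                  # Greek characters not identical to Latin
GREEK_SHAPES = ["upright", "slanted", "bold", "bold slanted"]
LATIN_SET_SIZE = 26 + 26 + 10 + 2   # a-z, A-Z, 0-9, undotted i and j

# Slots taken by Greek: one copy of each letter per shape.
greek_slots = GREEK_LETTERS * len(GREEK_SHAPES)

# Whole Latin sets that fit in the remainder, plus the leftover slots.
latin_sets, spare_slots = divmod(TOTAL_SLOTS - greek_slots, LATIN_SET_SIZE)

print(greek_slots, latin_sets, spare_slots)
```

Under these assumptions it prints 160 13 32, so 32 slots would remain
unused after the 13 Latin sets.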

Monospace is not really a _math_ font: there is no _semantic_ difference
between using a monospace font and any other font, not even when writing
computer language code. So, strictly speaking, it is a form of rendering.

And "Bold Fraktur" seems unnecessary, unless somebody can demonstrate
that it is in actual use.

By contrast, I can think of a (hypothetical) example where Calligraphic
and Script are in use in the same formula: I think the "O" of order O(n)
(as in complexity of algorithms, for example) should be in the RSFS-like
Script. But it would be perfectly OK to have Calligraphic letters denoting
some other quantity (say categorical objects, even though some prefer
Script for that too). Anyway, one could without too much effort produce
sensible formulas where the two appear side by side, indicating different
semantic meanings.

But perhaps Unicode has already made up its mind, so there is nothing to
do about it...

  Hans Aberg

------_=_NextPart_001_01C0A02C.6FBC7200--