MIME-Version: 1.0
Content-Type: multipart/alternative;
	boundary="----_=_NextPart_001_01C08AFB.B2F1F680"
In-Reply-To:  <200101292234.RAA14964@pluto.math.albany.edu>
References: <200101292234.RAA14964@pluto.math.albany.edu>
Content-class: urn:content-classes:message
Subject:      Re: default font encoding
Date: Tue, 30 Jan 2001 21:22:21 +0100
Message-ID:  <14967.8829.903878.620595@istrati.zdv.uni-mainz.de>
From: "Frank Mittelbach" <frank.mittelbach@LATEX-PROJECT.ORG>
Sender: "Mailing list for the LaTeX3 project" <LATEX-L@URZ.UNI-HEIDELBERG.DE>
To: "Multiple recipients of list LATEX-L" <LATEX-L@URZ.UNI-HEIDELBERG.DE>
Reply-To: "Mailing list for the LaTeX3 project" <LATEX-L@URZ.UNI-HEIDELBERG.DE>
Status: R

This is a multi-part message in MIME format.

------_=_NextPart_001_01C08AFB.B2F1F680
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

Bill,

 > I hope then that by default there are supported names like
 > \textgreater \textasciitilde, etc. for all 33 non-alphanumeric
 > printable ascii characters including 0x20 that work properly in
 > various LaTeX contexts.

Lars already answered that: even with OT1 all those chars are already
there (most of them at least). the main reason for a switch from OT1 to
something else is not extra chars but hyphenatable typesetting of
languages other than English (for those who don't use MLTeX).


 > My point of view is that of one writing a formatter from an XML
 > document type to LaTeX source.  Of course, David Carlisle's xmltex,

you should perhaps not translate that to an 8bit input code page but to
the latex internal character representation, which is 7bit.

you might want to have a look at the old talk i gave in Brno in 1995
about the relationship between input/internal/output encodings in latex
and what role inputenc and fontenc play therein. you can find it at
www.latex-project.org in the papers section.

 > In the general context of formatting from XML to LaTeX source, though
 > not so much in my specific context, nor in the context of authors
 > coming from a LaTeX or TeX background, I am concerned about what
 > happens with 8 bit characters in the range 0xA0 - 0xFF from the
 > various ISO 8 bit character sets.

well if you translate XML to latex you have control over that range and
you can map it to LaTeX's internal form depending on the source input
encoding of your XML file. alternatively you could let latex do the
mapping if the XML source input encoding is one that is recognised by
inputenc (or, if not, by providing an inputenc mapping for that
codepage).
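
as a rough sketch of the first route (the sample text is made up for
illustration, and textcomp is assumed to be available for \textonehalf),
a formatter that emits the 7bit internal form itself could write a file
like this, with no inputenc involved at all:

```latex
\documentclass{article}
\usepackage[T1]{fontenc} % font (output) encoding only
\usepackage{textcomp}    % assumed here for \textonehalf
\begin{document}
% the formatter has already replaced each 8bit input character by its
% 7bit LaTeX internal form:
%   0xE4 -> \"a    0xDF -> \ss    0xBD -> \textonehalf
M\"a\ss ig, \textonehalf\ cup
\end{document}
```

the file then no longer depends on the encoding of the original XML
source.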

 > By default with T1, I believe, the input encoding for these
 > characters matches the "cork" encoding.  But when inputenc is set to
 > something

T1 is a font encoding, not an input encoding. there is no inputenc
method in LaTeX that supports raw 8bit to be passed straight from input
to output (well, there is, in the sense that you can use vanilla LaTeX
without any inputenc, but this is really only there for compatibility
and not officially supported).
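
to illustrate the split (a sketch only; the sample line is invented and
textcomp is assumed for \textonehalf): inputenc handles the input side
and fontenc the output side, and the two are chosen independently. the
byte 0xBD is written in TeX's ^^bd notation so that the example itself
stays 7bit:

```latex
\documentclass{article}
\usepackage[latin1]{inputenc} % input: bytes in the file -> internal form
\usepackage[T1]{fontenc}      % output: internal form -> font slots
\usepackage{textcomp}         % assumed here for \textonehalf
\begin{document}
% ^^bd is read as character 189 (0xBD), which latin1.def declares
% via \DeclareInputText{189}{\textonehalf}
^^bd cup
\end{document}
```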

 > with a standard public name -- for example an 8 bit name that would
 > be recognized by one of James Clark's XML parsers "xp" or "SP" I
 > think it highly desirable that the typeset appearance of the
 > characters match what *should* be the screen appearance in a web
 > browser when the character set is properly specified.
 >
 > In particular under such an encoding absent an explicit author
 > indication for math there should be no math.  For example, the
 > miniature "1/2" at data point 0xBD in ISO-8859-1 (Latin 1) should
 > *not* be regarded as math unless an author should choose for some
 > reason I do not anticipate to place it inside math.

i agree and in some sense i'm quite happy that inputenc still says beta
because i have for years been against having the inputenc files map to
anything other than text objects. in other words, i would want to have
the 10 or so odd mappings in the various inputenc defs that do map to
math be replaced by \DeclareInputText.

i'm currently trying to document the internal representation of LaTeX
including inputenc and the like and the current status is impossible to
describe.

 > (Probably, however, the present inputenc name "latin1" needs to
 > remain as it is for backward compatibility.)

well, probably, but then how many people would have known (and used the
fact) that current inputenc latin1 actually has

\DeclareInputText{189}{\textonehalf} % so that gives an error if placed
                                     % in math

but

\DeclareInputMath{185}{\mathonesuperior}


would that also cause an uproar on ctt, i.e., changing the inputencs to
map to text objects by default?
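
for concreteness, such a change could be tried locally without touching
latin1.def; the following is only a sketch, assuming that
\DeclareInputText is used in the preamble after inputenc has been loaded
and that textcomp supplies \textonesuperior:

```latex
\documentclass{article}
\usepackage[latin1]{inputenc}
\usepackage{textcomp} % assumed here for \textonesuperior
% replace the math mapping of 185 (0xB9) by a text object
\DeclareInputText{185}{\textonesuperior}
\begin{document}
% ^^b9 stands for the raw byte 0xB9, keeping this example 7bit
a raised one^^b9 in running text
\end{document}
```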

comments anybody? (Mr. from the grave?)

cheers
frank

------_=_NextPart_001_01C08AFB.B2F1F680--