Subject: Re: default font encoding
Date: Tue, 30 Jan 2001 21:22:21 +0100
From: "Frank Mittelbach"
Sender: "Mailing list for the LaTeX3 project"
To: "Multiple recipients of list LATEX-L"

Bill,

> I hope then that by default there are supported names like
> \textgreater \textasciitilde, etc. for all 33 non-alphanumeric
> printable ascii characters including 0x20 that work properly in
> various LaTeX contexts.

Lars already answered that: even with OT1 all those chars are already there
(most of them at least). the main reason for a switch from OT1 to something
else is not extra chars but properly hyphenated typesetting of languages other
than English (for those who don't use MLTeX).
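
A minimal sketch of that switch, assuming a German word as the test case (the word and the narrow \parbox are illustrative choices, not from the thread). With OT1 the umlaut is assembled with the \accent primitive, which interferes with TeX's hyphenation of the word; with T1 the accented letter is a single glyph in the font. For sensible break points the matching hyphenation patterns (e.g. via babel) would also have to be loaded.

```latex
\documentclass{article}
\usepackage[T1]{fontenc} % switch the font encoding from OT1 to T1
\begin{document}
% The 1pt-wide box forces TeX to break the word wherever it is allowed to;
% \hspace{0pt} makes the first word of the paragraph eligible for hyphenation.
\parbox{1pt}{\hspace{0pt}Geb\"uhrenordnung}
\end{document}
```

Commenting out the fontenc line and comparing the two outputs shows the difference in the permitted break points.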


> My point of view is that of one writing a formatter from an XML
> document type to LaTeX source.  Of course, David's Carlisle's xmltex,

you should perhaps not translate that to an 8-bit input code page but to the
LaTeX internal character representation, which is 7-bit.
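
As an illustration, a generated file written directly in the 7-bit internal form might look like the sketch below. The sample words are made up, and loading textcomp for \textonehalf is an assumption that is only needed on older installations.

```latex
\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage{textcomp} % assumed here to supply \textonehalf
\begin{document}
% Pure ASCII source: no inputenc is needed, because every non-ASCII
% character is written as a LaTeX command rather than as an 8-bit byte.
Stra\ss e, na\"{\i}ve, caf\'e, \textonehalf
\end{document}
```

A file like this does not depend on the code page of whoever compiles it.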

you might want to have a look at the talk i gave in Brno in 1995 about the
relationship between input/internal/output encodings in LaTeX and the role
inputenc and fontenc play therein. you can find it at www.latex-project.org in
the papers section.

> In the general context of formatting from XML to LaTeX source, though
> not so much in my specific context, nor in the context of authors coming
> from a LaTeX or TeX background, I am concerned about what happens with
> 8 bit characters in the range 0xA0 - 0xFF from the various ISO 8 bit
> character sets.

well, if you translate XML to LaTeX you have control over that range and you
can map it to LaTeX's internal form depending on the source input encoding of
your XML file. alternatively you could let LaTeX do the mapping if the XML
source input encoding is one that is recognised by inputenc (or, if not, by
providing an inputenc mapping for that codepage).
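
A sketch of the second route, assuming the generated source is in ISO-8859-1. The byte is written in TeX's `^^` hex notation only so that the example itself stays ASCII; a real generated file would contain the raw byte 0xBD. Loading textcomp for \textonehalf is again an assumption for older installations.

```latex
\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage{textcomp}         % assumed here to supply \textonehalf
\usepackage[latin1]{inputenc} % declare the source encoding as ISO-8859-1
\begin{document}
% ^^bd stands for byte 0xBD (decimal 189); inputenc makes that byte active
% and maps it to the internal form \textonehalf before typesetting.
Add ^^bd cup of sugar.
\end{document}
```

For a code page that inputenc does not know, the same mechanism would need an encoding definition file of one's own built from \DeclareInputText lines.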

> By default with T1, I believe, the input encoding for these characters
> matches the "cork" encoding.  But when inputenc is set to something

T1 is a font encoding, not an input encoding. there is no inputenc method in
LaTeX that supports raw 8-bit to be passed straight from input to output (well,
there is in the sense that you can use vanilla LaTeX without any inputenc --
but this is really only there for compatibility and is not officially supported).

> with a standard public name -- for example an 8 bit name that would be
> recognized by one of James Clark's XML parsers "xp" or "SP" I think it
> highly desirable that the typeset appearance of the characters match
> what *should* be the screen appearance in a web browser when the
> character set is properly specified.
>
> In particular under such an encoding absent an explicit author
> indication for math there should be no math.  For example, the
> miniature "1/2" at data point 0xBD in ISO-8859-1 (Latin 1) should
> *not* be regarded as math unless an author should choose for some
> reason I do not anticipate to place it inside math.

i agree, and in some sense i'm quite happy that inputenc still says beta,
because i have for years been against having the inputenc files map to anything
other than text objects. in other words, i would want the 10 or so odd
mappings in the various inputenc defs that do map to math to be replaced by
\DeclareInputText.

i'm currently trying to document the internal representation of LaTeX,
including inputenc and the like, and the current status is impossible to
describe.

> (Probably, however, the present inputenc name "latin1" needs to remain
> as it is for backward compatibility.)

well, probably, but then how many people would have known (and used the fact)
that current inputenc latin1 actually has

\DeclareInputText{189}{\textonehalf} % so that gives an error if placed in math

but

\DeclareInputMath{185}{\mathonesuperior}



would that also cause an uproar on ctt, i.e., changing the inputencs to map to
text objects by default?
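
For concreteness, a sketch of what such a change could look like for position 185, done here as a preamble override rather than as an edit to latin1.def. The replacement command \textonesuperior (from textcomp) is an assumption; the thread does not say which text command would be used.

```latex
\documentclass{article}
\usepackage{textcomp}         % assumed here to supply \textonesuperior
\usepackage[latin1]{inputenc}
% The entry quoted above is math-only:
%   \DeclareInputMath{185}{\mathonesuperior}
% Hypothetical text-object replacement, overriding it for this document:
\DeclareInputText{185}{\textonesuperior}
\begin{document}
% ^^b9 is the ASCII notation for byte 0xB9 (decimal 185).
footnote mark^^b9 in running text
\end{document}
```

With the override in place the byte can be used in ordinary text; inside math it would then behave like any other text-only command.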

comments anybody? (Mr. from the grave?)

cheers
frank
