Subject: Re: LaTeX 2.09 beta-test
Date: Mon, 28 Oct 1991 19:18:46 +0100
Sender: "LaTeX-L Mailing list"
To: "Rainer M. Schoepf"
Reply-To: "LaTeX-L Mailing list"

>>> Now as regards modifying TeX to handle various different
>>> encodings, let me just say NO, DON'T DO IT!  The sort of change
>>> which occurs in emTeX makes it non-TeX.

I don't think I agree with this remark.  If implemented as by
Eberhard Mattes, through a command-line extension to TeX, then
I see no conflict whatsoever.  The xchr/xord pair are specifically
intended to allow mappings to/from the characters that a user has
available; I quote from C&T B/23:

        ``People with extended character sets can assign codes
          arbitrarily, giving an xchr equivalent to whatever
          characters the users of TeX are allowed to have in
          their input files''.
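The xchr/xord mechanism being quoted can be sketched in a few lines. The sketch below is mine, not emTeX's actual table: it remaps one external code (0x82, e-acute in codepage 850) onto the Cork/T1 slot for e-acute (0xE9), purely to illustrate how the two arrays cooperate.

```python
# Sketch of TeX's xord/xchr translation arrays (as described in
# TeX: The Program).  xord maps external (input-file) codes to TeX's
# internal codes; xchr is its inverse, used when TeX writes text out.
# The single remapped entry is illustrative, not emTeX's real table.

xord = list(range(256))                  # start from the identity mapping
xord[0x82], xord[0xE9] = 0xE9, 0x82      # swap keeps the table a bijection:
                                         # CP850 e-acute -> Cork/T1 e-acute

xchr = [0] * 256
for external, internal in enumerate(xord):
    xchr[internal] = external            # invert: internal -> external

def read_byte(b: int) -> int:
    """What TeX sees internally for an external input byte."""
    return xord[b]
```

The swap (rather than a one-way overwrite) keeps xord/xchr mutually inverse, which is what TeX's output routines rely on.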

>>> Modifying xchar/xord so that say, the PC
>>> e-acute maps internally to the Cork e-acute causes the difficulty
>>> that TeX files created under this assumption are non-portable.

Non-portable?  I think that depends how you define `portable'.
If I use my PS/2 in codepage-850 mode, and send a file on a disc
to another PS/2 or PC also using codepage 850, then the user on
that machine can process my file in a manner identical to that in
which I can process it.  If he or she prefers to use codepage 437,
and sends me a codepage-437 file, I can quickly re-configure my
PS/2 to use codepage 437 and process his or her file.  I cannot
send either to an EBCDIC site and have any hope whatsoever that
they can process the file, but then I wouldn't expect to be able to;
after all, I can't even send them an ASCII file and hope that they can
read it ...
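The portability point can be made concrete: the very same byte names different characters under the two codepages being discussed. A quick check, using the modern Python codec names for those codepages (a convenience that obviously postdates this message):

```python
# One byte, two codepages: 0xB5 is a letter in codepage 850 but a
# box-drawing character in codepage 437.
b = bytes([0xB5])
print(b.decode("cp850"))   # Á  (Latin capital A with acute)
print(b.decode("cp437"))   # ╡  (box-drawing character)
```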

>>> The purpose of xchar/xord was not for this sort of remapping, but
>>> rather to handle the differences between ASCII and EBCDIC.

What nonsense; what DEK actually says (C&T A, p.43) is ``TeX always uses
the internal character code of Appendix C for the standard ASCII characters,
regardless of what external coding scheme actually appears in the files being
read.  Thus b is 98 inside of TeX even when your computer normally deals with
EBCDIC or some other non-ASCII system.''
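DEK's point can be checked mechanically: the external EBCDIC code for `b' differs from the internal code, and it is precisely the xord translation that hides the difference. (cp037 is one common EBCDIC variant; that choice is mine, not the book's.)

```python
# 'b' is byte 0x82 (130) in EBCDIC (cp037 variant) but byte 98 in
# ASCII; TeX's xord translation guarantees the internal code is 98
# either way, per C&T Appendix C.
ebcdic = "b".encode("cp037")
print(ebcdic[0])    # 130 -- the external EBCDIC code
print(ord("b"))     # 98  -- TeX's internal code
```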

>>> The
>>> best pure TeX way to handle code pages is to make chars over 127
>>> active, but as was mentioned this disallows their use in cs
>>> names.

Best?  I think that depends on how you define `best'.  If I were a native
speaker of any language which required diacritical marks, then I would regard
such a solution as `worst', not `best'.  If there exists a character `foo' in
my native language, and if that character can be input to TeX in a meaningful
way, then I want that character treated within TeX as having the same semantics
as the character has in my native language, unless I choose to define it
otherwise.  Thus I want letters (regardless of the presence or absence of
diacritical mark(s)) treated as letters (i.e. catcode 11), and punctuation,
digits, etc, as `others' (catcode 12); if I have two or three types of space, I
want them all treated as `space' (catcode 10).  Why should the characters which
occur in my language but not yours be singled out as needing to be `active'
(catcode 13), with all the restrictions that that implies?
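The assignment argued for here (letters keep catcode 11 whether or not they carry diacritics, space-like characters get 10, everything else 12) amounts to a simple classification rule. The sketch below illustrates that rule's semantics only; it is not TeX code, and real TeX assigns category codes with \catcode.

```python
# A sketch of the catcode assignment argued for above: letters (with
# or without diacritics) -> 11, space-like -> 10, everything else
# (digits, punctuation, ...) -> 12.  Illustrative only.
def catcode(ch: str) -> int:
    if ch.isalpha():
        return 11      # letter: 'e' and 'é' alike
    if ch.isspace():
        return 10      # space
    return 12          # other
```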

                                        ** Phil.
