Subject: Re: Multilingual Encodings Summary 2.2
Date: Sat, 19 May 2001 19:22:36 +0100
From: "Hans Aberg"
Sender: "Mailing list for the LaTeX3 project"
To: "Multiple recipients of list LATEX-L"
References: <200105161742.MAA02503@riemann.math.twsu.edu>

At 16:43 +0200 2001/05/19, Lars Hellström wrote:
>>The reason one is getting stuck with it is for backwards compatibility, and
>
>Indeed. \epsilon and \varepsilon could probably not be identified earlier
>than in LaTeX3.

I am not sure what you mean here: The two types of epsilon date back a
long time. I am not sure exactly how far, but perhaps back to the thirties
of the last century.

For a long time, mathematicians refused to use LaTeX because it was not capable
of producing the output required in math.

I think (but Frank or somebody will know this better) that one reason for
creating the LaTeX3 project was to ensure that mathematicians could use
LaTeX to produce the output they want.

>>further there is no guarantee that mathematicians will use the symbols the
>>way you dictate.
>
>You mean saying \in for the set membership relation rather than \epsilon?
>\epsilon is just plain wrong (and has always been so) since it generates an
>Ord math atom, not a Rel math atom as a relation command should.

The main point is that to some mathematicians, using one of the epsilon
variations has been right, at least in the past.

As for TeX, if it is the binary relation setting you have in mind,
that can be fixed, I recall.

And if things cannot be fixed in TeX by some general mechanism, one can
always use kerning in the particular formulas in order to fix up the look.
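
The Ord versus Rel point can be illustrated with a minimal sketch. It assumes a standard LaTeX set-up with the default math fonts; the command name \epsin is hypothetical and introduced only for this example. Wrapping a symbol in \mathrel is one general way to get relation spacing, though it is not necessarily the fix that was meant above.

```latex
\documentclass{article}
% Hypothetical helper (not a standard command): an epsilon that
% TeX treats as a relation, so it gets Rel spacing like \in.
\newcommand{\epsin}{\mathrel{\epsilon}}
\begin{document}
% Plain \epsilon is an Ord atom: no extra space around it.
$x \epsilon A$

% \mathrel turns the same glyph into a Rel atom.
$x \mathrel{\epsilon} A$

% Side by side with the built-in relation \in.
$x \epsin A \quad\mbox{versus}\quad x \in A$
\end{document}
```

With this, the choice of glyph and the choice of atom class are made separately, so no manual kerning is needed for the spacing alone.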

>>Later, one would expect LaTeX, or whatever scientific typesetting system,
>>to be capable of supporting them all without restrictions. Plus admitting
>>future additions.
>
>Yes, but not necessarily supporting them by default. There is an important
>difference between the default set-up making \epsilon and \varepsilon
>different, and providing a mechanism that makes it easy to (on a per
>document basis) add such a distinction. What is provided by the default
>set-up becomes the minimal core which _all_ set-ups must provide.

The problem is that you want to impose a default restriction that cannot be
motivated by any knowledge of actual usage: \epsilon and \varepsilon
look sufficiently different that they could be used side by side in the
same formula, and they may already have been.

> The
>larger you make this core, the bigger the effort needed to support it will
>be, and the alternatives to the default will be correspondingly fewer. It's
>easy to request that all fonts provide everything that is in Unicode if you
>anyway would never help with providing anything.

In this case I think it is clear that every font to be used with
Unicode that supplies one of the epsilon types will supply the other,
because I recall they were fit into the same group of 1024 math character
symbols.

So there is no gain in trying to restrict what already is present in
Unicode and TeX.

>>I have seen examples of both types of epsilon being used to denote set
>>membership,
>
>No doubt due to "limitations in past typesetting".

Whatever; the main thing is that they now are present as different
characters and may have already been used as such because it is perfectly
legal. And you cannot know for sure that they were never used side by side
in the same manuscript in the past, before the advent of TeX.

>>and I have seen examples of both types of epsilon being used as
>>a small number > 0. You could probably add a whole range of characters
>>moving from \varepsilon to \epsilon to \in for set membership.
>
>That's where I suspect you get it all wrong.

Please do not be as rude in your formulations as the Cambridge wannabe
geniuses. :-)

> You're talking about a whole
>range of _glyphs_, in appearance similar to anything between the
>\varepsilon and the \in of Computer Modern, but they're all the same
>semantic atom (i.e., character) and thus shouldn't have distinct internal
>representations in LaTeX.

All those variations derive originally, I surmise, from the same
glyph in the Greek language, but they have since migrated. It is the guy
who writes the math paper in question who decides what the correct
semantic interpretation is, and not you, and there is nothing you can do about
that.

The \in is also originally an epsilon and nothing else.

> That at least part of that range of glyphs may
>also be used to represent another character (the greek letter small
>epsilon) which should have its own internal representation is another
>matter.

Right. It is very difficult to tell how those characters evolve and to
impose restrictions onto that evolution.

If, when all this is done, somebody comes up with evidence of a
new variation that must be added in order to get the math papers right,
then that variation should be added as well.

>>Knuth, being wise, realized how disparate the use of the symbols is in
>>math, and introduced a macro symbols system so that anyone can define them
>>as they please:
>
>The point is that the macro system Knuth created has no internal
>representation for characters, neither in text nor math---instead it is
>based on the user specifying what glyph (or combination of glyphs) is
>desired. LaTeX, by contrast, has an internal representation for characters
>as of version 2e, but still uses the Knuthian glyph selection commands in
>math. What I argue is that by version 3 of LaTeX there should be an
>internal math character representation as well.

I think that over the past years, there have been several ideas for providing
a better math representation on different levels of abstraction, but the
difficulty is always how mathematicians use them according to their own
objectives. What is a must in some areas is totally unacceptable in others.

For example, a few years ago there was this discussion about an
engineering standard for how tensors should be typeset, which would
be totally unacceptable in a paper in differential geometry.

Therefore, I do not think that there has been a viable proposal along such lines.

The best one can hope for, I think, is to provide optional packages that
people may decide to use if they so want on top of the regular LaTeX model.
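
As a rough sketch of what such an opt-in could look like on a per-document basis: the lines below assume a standard LaTeX format in which \epsilon and \varepsilon are defined as distinct math symbols, and they merge the two only in the document that asks for it. This is an illustration of the idea, not an existing package.

```latex
\documentclass{article}
% Opt-in, per-document identification (an assumption made for
% illustration): make \varepsilon an alias for \epsilon.
\let\varepsilon\epsilon
\begin{document}
% Both commands now select the same glyph in this document only.
$\epsilon > 0 \quad\mbox{and}\quad \varepsilon > 0$
\end{document}
```

A document that leaves out the \let line keeps the two symbols distinct, so the default set-up is not restricted by it.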

>>Further, if you want to make it impossible to use \varepsilon and \epsilon
>>side by side in the same document, you will have to make sure that in all
>>of the world literature in the past up till now it has never been used that
>>way, because that is how the requirements of Unicode were set up.
>
>I'm not saying that it should be completely impossible to use them side by
>side (even though I would question any attempts to do so), but they
>shouldn't be provided as distinct characters in the default set-up.

I think it would be unwise to impose any kind of restrictions onto the math
characters in the default settings: If they appear as distinct entities,
one is free to use them as such.

And since mathematicians seem to always invent new notation, they will probably
be used in new unexpected ways.

>>As for the math characters, I do not see there is any point in trying to
>>impose equivalences because of the way they may be used in math, and it is just
>>an unnecessary additional work in implementation.
>
>It is very little additional work in the implementation of LaTeX (adding an
>OCP which normalizes the input somewhat further than what Unicode prescribes
>will do), but it saves much (largely unnecessary) work in the
>implementation of fonts for LaTeX, and thereby it facilitates the creation
>of new fonts.

You will have to check with the font experts how they think that the future
fonts will be developed.

But I think that one possibility is that font developers merely take a
Unicode chunk and develop the characters in it. That would mean that the
two epsilon variations will always be developed together, because they
both appear in the 0x1D700 - 0x1D7FF group.

Then, if LaTeX is based on a TeX that is based on 32-bit padded characters
with Unicode in the bottom, it will have to follow that.

(The Omega draft did not explicitly say if it uses 16-bit or 32-bit Unicode
characters, but I figure that perhaps it is only using 16-bit Unicode
characters. Then the two epsilon variations fall outside this range. If
that is what is causing the complication, I figure it would be best to
first make an Omega that is based on 32-bit characters or whatever.)
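
The range argument can be checked with a little arithmetic. The sketch below assumes that a "16-bit character" means a single value from 0 to 65535 with no surrogate mechanism, and it takes the first code point of the group named above:

```latex
\[
  \mathtt{0x1D700}
  = 1\cdot 16^{4} + 13\cdot 16^{3} + 7\cdot 16^{2}
  = 65536 + 53248 + 1792
  = 120576
  > 65535 = 2^{16} - 1 .
\]
```

So every code point in the 0x1D700 - 0x1D7FF group lies above the largest value a single 16-bit unit can hold. In the Unicode charts as I read them, the two math italic epsilons sit at U+1D700 and U+1D716, which would put both in that group.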

  Hans Aberg
