From: "Hans Aberg"
Sender: "Mailing list for the LaTeX3 project"
To: "Multiple recipients of list LATEX-L"
Reply-To: "Mailing list for the LaTeX3 project"
Subject: Re: GELLMU progress
Date: Tue, 9 Jan 2001 19:43:50 +0100
In-Reply-To: <200101061950.OAA03845@pluto.math.albany.edu>

At 14:50 -0500 1-01-06, William F. Hammond wrote:
>2.  The default "article" document type for _regular_ GELLMU provides
>three character names for each of the 33 non-alphanumeric but
>printable ASCII characters.  Each of those is at risk for some
>conceivable translation target.

I will describe the mangling technique I used for my own OOPL -> C++
translation, which avoids all such problems:

C++ names (identifiers) are allowed to be alphanumeric with underscores
_, but may not start with a digit; in addition, names starting with an
underscore, or containing two adjacent underscores, are reserved for the
implementation of the compiler.
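
A minimal sketch of the check just described, in Python (chosen only for
illustration). The function name is hypothetical, and the check covers only
the rules stated above; it does not reject C++ keywords.

```python
import re

# Letters, digits and underscores, not starting with a digit.
_CPP_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_unreserved_cpp_name(name: str) -> bool:
    """True if `name` is a C++ identifier that avoids the reserved forms
    mentioned above (leading underscore, two adjacent underscores).
    Assumption: keywords such as `class` are not checked here."""
    return (
        _CPP_IDENT.fullmatch(name) is not None
        and not name.startswith("_")
        and "__" not in name
    )
```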

So, in order to make things simple, I started off in my OOPL with names
only containing letters and underscores, with the restriction that names
cannot start or end with an underscore or have two adjacent underscores.
(Names in math normally do not contain digits.) For example, foo_bar is OK,
but not foo__bar, _foo, bar_, or fo0.
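
The naming rule above can be sketched as a single regular expression: runs
of letters separated by single underscores. The helper name is hypothetical,
and restricting "letters" to ASCII A-Z and a-z is an assumption.

```python
import re

# One or more letters, then zero or more groups of "_" plus letters.
# Assumption: "letters" means ASCII A-Z and a-z.
_SIMPLE_NAME = re.compile(r"[A-Za-z]+(?:_[A-Za-z]+)*")


def is_simple_name(name: str) -> bool:
    """True if `name` uses only letters and single, interior underscores."""
    return _SIMPLE_NAME.fullmatch(name) is not None
```

With this, foo_bar is accepted, while foo__bar, _foo, bar_ and fo0 are all
rejected.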

My idea is really that the _ ought to be a shorthand for a space, so these
limitations seem reasonable. Then I wanted to extend this so that _any_
binary strings are allowed as names. I did this by allowing names within
quotes ` and ', and using standard C-string conventions with backslash for
special characters, and octal and hexadecimal character representations.
The idea is also that, say,
  foo_bar = `foo bar'
so that when foo_bar is parsed, it is given the same binary translation as
`foo bar', which is the same as the C-string "foo bar" (minus the C
terminating '\0').
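
A sketch of how a name token could be turned into its binary string under
the conventions above. The function names are hypothetical, the set of
single-character escapes is an assumption, and \x is limited to two hex
digits here for simplicity (C itself allows more).

```python
import string

# Assumed set of single-character escapes, following common C usage.
_SIMPLE_ESCAPES = {
    "n": 10, "t": 9, "r": 13, "a": 7, "b": 8, "f": 12, "v": 11,
    "\\": 92, "'": 39, '"': 34, "`": 96,
}


def _decode_c_escapes(body: str) -> bytes:
    """Decode backslash escapes in the text between ` and '."""
    out = bytearray()
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\":
            out += ch.encode("latin-1")  # assumption: one byte per character
            i += 1
            continue
        i += 1
        if i >= len(body):
            raise ValueError("dangling backslash")
        esc = body[i]
        if esc in "01234567":  # octal: up to three digits
            j = i
            while j < len(body) and j < i + 3 and body[j] in "01234567":
                j += 1
            value = int(body[i:j], 8)
            if value > 255:
                raise ValueError("octal escape out of byte range")
            out.append(value)
            i = j
        elif esc == "x":  # hexadecimal: one or two digits (assumption)
            j = i + 1
            while j < len(body) and j < i + 3 and body[j] in string.hexdigits:
                j += 1
            if j == i + 1:
                raise ValueError("\\x needs at least one hex digit")
            out.append(int(body[i + 1:j], 16))
            i = j
        elif esc in _SIMPLE_ESCAPES:
            out.append(_SIMPLE_ESCAPES[esc])
            i += 1
        else:
            raise ValueError("unknown escape: \\" + esc)
    return bytes(out)


def name_to_bytes(token: str) -> bytes:
    """Map a name token to its binary string: a quoted `...' name is
    decoded, a plain name has each underscore replaced by a space."""
    if len(token) >= 2 and token.startswith("`") and token.endswith("'"):
        return _decode_c_escapes(token[1:-1])
    return token.replace("_", " ").encode("ascii")
```

Here name_to_bytes("foo_bar") and name_to_bytes("`foo bar'") both give the
bytes of "foo bar", without a terminating zero byte.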

Now, I want to create a C++ label for every such binary string: It is now
irrelevant how I obtained this from the parsing in my OOPL. It is also
irrelevant how I mangle the names, as long as I stick to the same mangling
convention if different translation units should work together. If I change
mangling conventions in the future, the old sources must be recompiled, but
that is all.

It is not so difficult to invent a mangling convention. In my case I
decided that an isolated space, as in `foo bar', should be translated to an
underscore, so that in fact
  foo_bar -> binary `foo bar' -> C++ foo_bar.
If there are more spaces, or any other binary character, I merely write
each one out as a two-character code, hexadecimal-style: the first
character a digit 0-7, and the second a digit 0-9 or a letter A-V (8 x 32 =
256 combinations, one for each byte value). One also needs to prepend names
with something, in order to avoid the result starting with a digit (which I
need to do anyhow, in order to put different categories of labels into
different namespaces).
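
A sketch of one mangling convention along these lines, not necessarily the
exact one used in the translator. Assumptions: the prefix "oopl_" is
hypothetical; the two-character code is taken to be the byte value split as
(value // 32, value % 32); digits and literal underscores are always written
as codes so that the result can be decoded unambiguously; and a space at the
very start or end of a name does not count as isolated.

```python
_DIGITS32 = "0123456789ABCDEFGHIJKLMNOPQRSTUV"
PREFIX = "oopl_"  # hypothetical prefix; keeps labels from starting with a digit


def _is_letter(b: int) -> bool:
    return 65 <= b <= 90 or 97 <= b <= 122


def mangle(name: bytes, prefix: str = PREFIX) -> str:
    """Turn an arbitrary byte string into a C++-style label."""
    out = [prefix]
    last = len(name) - 1
    for i, b in enumerate(name):
        if _is_letter(b):
            out.append(chr(b))
        elif (b == 0x20 and 0 < i < last
              and name[i - 1] != 0x20 and name[i + 1] != 0x20):
            out.append("_")  # isolated interior space
        else:
            # Two-character code: first in 0-7, second in 0-9 or A-V.
            out.append(_DIGITS32[b >> 5] + _DIGITS32[b & 31])
    return "".join(out)


def demangle(label: str, prefix: str = PREFIX) -> bytes:
    """Inverse of mangle(); shows that the convention loses no information."""
    if not label.startswith(prefix):
        raise ValueError("label does not carry the expected prefix")
    body = label[len(prefix):]
    out = bytearray()
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "_":
            out.append(0x20)
            i += 1
        elif ch in "01234567":  # a digit always starts a two-character code
            out.append(_DIGITS32.index(ch) * 32 + _DIGITS32.index(body[i + 1]))
            i += 2
        else:
            out.append(ord(ch))
            i += 1
    return bytes(out)
```

Under these assumptions mangle(b"foo bar") gives "oopl_foo_bar", while
mangle(b"foo  bar") (two spaces) gives "oopl_foo1010bar". The part after the
prefix never starts with an underscore and never contains two adjacent
underscores, so the labels stay clear of the reserved C++ forms.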

This idea can then be used in many ways: as long as the output language
admits an infinitude of names, one can allow whatever names one wants in
the input and mangle them into the output language.

  Hans Aberg
