{1} Re uppercase problems (was urgent)

From: "Frank Mittelbach"
Sender: "LaTeX-L Mailing list"
To: "Rainer M. Schoepf"
Reply-To: "LaTeX-L Mailing list"
Date: Fri, 21 Feb 1992 16:59:20 +0100

Yannis said:

> I have the following (serious) problem:
> If I use
> \section{Fran^^8dais et \CyrMacCharacters ^^f0^^f3^^f1^^f1^^ea^^e8^^e9}
> I'll get a tremendously long line in the AUX file, and furthermore the
> whole thing won't work because \uppercase first converts everything and
> then expands.

The problems he has are actually two separate ones.

The first is that the \uppercase primitive is a TeX tool that is not
suitable for multilingual text. In other words, styles that use
\uppercase as a tool to convert whole token strings can only be used
automatically with fonts that have one well-defined codepage. Any
codepage switching that also involves uccode table changes will
produce chaos because the uppercasing comes too early: the whole
token list is converted before any command inside it is executed.
This is also true if math finds its way into such a token list,
because variables will get uppercased, changing the meaning of
formulas (something I ran into a long time ago when I did my master's
thesis).
Primitive solutions to this problem are:

- avoid using \uppercase, i.e. use a style that doesn't uppercase
running headings

- explicitly use \lowercase around portions that have to stay
lowercase. This works when the uccode and lccode tables match, which
is usually the case.

I think that is about the only thing one can do in the current LaTeX.
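
A minimal sketch of the problem and of the \lowercase workaround,
written as a small LaTeX file. The macro \runninghead is a
hypothetical stand-in for whatever a style uses to uppercase its
headings, and \titlepart is likewise made up for the illustration;
neither is taken from an actual style file.

```latex
\documentclass{article}
% Hypothetical stand-in for a style's heading macro (an assumption).
\newcommand\runninghead[1]{\uppercase{#1}}
% Hypothetical macro whose replacement text is hidden from \uppercase.
\newcommand\titlepart{hidden text}
\begin{document}

% Problem: the math variables are uppercased together with the words.
\runninghead{Solving $x+y=z$}

% Workaround: \uppercase leaves the control sequence \lowercase alone
% but turns x+y=z into X+Y=Z; \lowercase then turns the letters back.
\runninghead{Solving \lowercase{$x+y=z$}}

% \uppercase does not expand macros in its argument, so the letters
% inside \titlepart are never seen by it and stay lowercase.
\runninghead{Some \titlepart}

\end{document}
```

Note that the workaround also lowercases any capitals that were
originally present in the protected portion, which is one reason it
only counts as a primitive solution.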

The second problem is that the current aux file concept uses
expanded writes, so that all commands in a section heading etc. are
expanded on their way to the output file. This usually results in
errors, which means that in the current implementation you have to
use \protect, or define the commands to expand to \protect
<something>, to prevent this. This kind of problem will certainly
vanish in ltx3 because we will use unexpanded writes for the whole
auxiliary file handling.
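
The difference between an expanded write and a write in which
expansion is suppressed can be sketched as follows. The stream name
\demoout, the file extension .demo and the command \longcmd are
assumptions made up for this illustration; this is not LaTeX's actual
aux file code.

```latex
\documentclass{article}
\newwrite\demoout                                 % hypothetical output stream
\newcommand\longcmd{a very long replacement text} % hypothetical command
\begin{document}
\immediate\openout\demoout=\jobname.demo

% \write expands its argument fully, so the file receives the
% replacement text of \longcmd rather than the command itself.
\immediate\write\demoout{expanded: \longcmd}

% \noexpand suppresses the expansion of the token that follows it,
% so here the command name itself is written to the file.
\immediate\write\demoout{kept: \noexpand\longcmd}

\immediate\closeout\demoout
\end{document}
```

In a moving argument, \protect has a comparable effect on the command
that follows it, which is why it has to be supplied by hand as long as
the writes are expanded.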

So let's try to rephrase the problem for ltx3:

0) Is automatic uppercasing of given text a design tool that should
be provided in a general context?

1) If so: given a <token string>, how can we upper- (or lower-)case
it while allowing uccode table entries to change in the middle? Or,
as a weaker constraint, can we leave out uppercasing for parts of the
<token string>? Michael Downes has done some work in this direction in
the AMS styles; perhaps he can comment here.

2) Can we achieve a partial solution by using suitable code pages in
fonts?
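
The difficulty behind question 1 can be seen in a small experiment,
sketched here under the assumption that the default uccode table for
the ASCII letters is in force:

```latex
\documentclass{article}
\begin{document}

% Intention: keep the letter a lowercase in the second half by
% making it its own uppercase inside the group.
\uppercase{abc {\uccode`a=`a abc}}

% What happens: \uppercase converts every character token first,
% using the uccode table in force at that moment, so what TeX
% executes afterwards is
%   ABC {\uccode`A=`A ABC}
% The assignment now refers to A instead of a, and in any case it
% is carried out only after the case conversion is finished.

\end{document}
```

Any solution therefore has to process the token string piece by
piece, so that a table change can take effect before the following
piece is converted.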

Some side remarks:

> I'm using the Macintosh set of 8-bit characters for the input of accented
> letters as well for the input of Greek and Russian (with the corresponding
> Apple systems). This works in the following way:
>   ^^88 for example is an a with grave accent.
>   I define: \catcode"88=\active
>             \def\`#1{...\ifx#1a\char"E0{}\else... etc }
>             \def ^^88{\`a}
>   where \char"E0 is the a with grave of the DC font.

I'm not at all keen on the idea of making all kinds of characters
active to achieve some results, and I suppose that we will try a
different approach. But much more important is that we will not follow
the ``Apple system'' or the ``PC code page such-and-such'' and
define a lot of internal character mappings for various conventions.
Such mapping should, in my opinion, be done outside the TeX system to
keep sources portable between different platforms. I don't want to
start a holy war on this, especially not now, because this is a type
of problem which can be addressed at a much later stage in the
project. Nevertheless, feel free to comment now (agree or disagree);
just note that I probably will not respond to this type of discussion
right away.

> Subj: {1} URGENT

Finally, some comments on subject lines :-) Naturally, on such a list
many questions will get discussed, and often someone (including me)
starts a new topic which may or may not be picked up directly by
others. So please try to make the subject lines a bit more
informative; this will help everybody. In fact, I don't think that
this problem is more urgent than the other thousand problems that we
have to deal with, and certainly we need time and a bit of discipline
to solve them.

cheers Frank
