Date: Tue, 15 Feb 2011 03:22:06 -0500
From: Bruno Le Floch
Sender: Mailing list for the LaTeX3 project
Reply-To: Mailing list for the LaTeX3 project
Subject: Re: Expandable versions of \uppercase, \MakeUppercase, \lowercase, \MakeLowercase
To: LATEX-L@listserv.uni-heidelberg.de

> My personal opinion on uppercasing/lowercasing is that it should be a
> property of the font;

Both Will and Frank agree on this, but currently, in many fonts that's not possible.

Also, it is in fact possible to have an algorithm that expandably produces the result of {replacing some tokens by a corresponding macro} in a given tl. Yes, macro: it can even take arguments. For instance, with my current code (using a specific "case table"),

    \def\foo#1{arg=#1.}
    \expandsome{A\foo BC{\expandthis\foo{\B\expandthis\foo{A}} \D\E} !}

will expand in two steps to

    A\foo BC{arg={\B arg=A.}. \D\E} !

Also, we now have \expandafter:nw, which expands the token after its argument before carrying on with the argument. It works by "\expandafter-casing" the first argument, namely, replacing every token by "\expandafter" followed by that token (including braces and spaces).

> In terms of the algorithms to perform these operations, I prefer the way
> Joseph's code executes (e.g., keeping the number of csnames low) but I
> prefer the extensibility of Bruno's (although I suspect Bruno's is faster --
> but a better question to ask is whether Joseph's is too slow).

After some work, I realized that there are two points:

(1) whether to use many macros, or look at a bunch of cases for each character.
(2) whether to be careful with braces and spaces or not.

The second point allows us to do what I mentioned above. The first point is not necessary for this extensibility; it only plays a role in speed. We are talking about defining (26 + #accents) macros for uppercase, and the same number for lowercase (although I guess that with UTF8, this can become much bigger).

For a typical input (sentences, braced stuff) of 5000 tokens, with \tracingall, a wordcount (lines, words, bytes) of the log files gives:

    2128102  7187359  67511843  Joseph-ULcase.log
     230901  1161005   8589159  ULcase.log

where ULcase.log is my current version with brace and space checking, and Joseph-ULcase.log has no brace checking. My version could be optimized significantly (2-3x) by using the fact that the replacement that we want for each token takes no argument, but as I said, I want to stay general, because it becomes much more powerful.

> something like "\prg_case_str:nVn {#1} \g_uc_replacements_tl { }".

I think that it would work. And in fact Joseph's way combined with some ideas I have had will allow us to have a

    \tl_expand_some:nn {abca} { {a} {A} {b} {\use_ii_i:nn} } => AAc

(that is, each a becomes A, and b becomes \use_ii_i:nn, which then swaps the c and the A that follow it).

And in fact, we _should_ be able to replace #text at definition time as well, allowing 9 _named_ arguments (I'm not taking this very seriously ;-) ). Namely, replace #first by #1 and #second by #2 in the following

    \keyworddef\foo#first#second{arg1 is #first, arg2 is #second}

I don't know where I should put the code, so it is at

    http://users.aims.ac.za/~bruno/LaTeX/ULcase/ULcase.sty

Note that it is really just a plain TeX file with no \bye, compilable with pdftex, pdflatex, etc.

--
Regards,
Bruno
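
A rough illustration of the "\expandafter-casing" idea described above, written out by hand and not taken from ULcase.sty. The names \nexttoken and \result are made up for the example. Every token of the prefix "\def\result{" gets an \expandafter in front of it, so the token that follows the prefix is expanded once before the definition is carried out:

```tex
% Plain TeX sketch; \nexttoken and \result are hypothetical names.
\def\nexttoken{XYZ}

% The prefix "\def \result {" with \expandafter in front of each token
% (the opening brace included). The chain expands \nexttoken once, and
% only then does TeX execute \def\result{XYZ}.
\expandafter\def\expandafter\result\expandafter{\nexttoken}

% Writes the meaning of \result to the terminal and the log file.
\message{\meaning\result}
\bye
```

Run with plain TeX, the message should read macro:->XYZ, showing that \nexttoken was expanded before \def saw it. An automatic version has to build such a chain from an arbitrary argument, which is where the care with braces and spaces comes in.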