From: Joseph Wright
Date: Wed, 10 Feb 2010 08:51:04 +0000
To: LATEX-L@listserv.uni-heidelberg.de
Reply-To: Mailing list for the LaTeX3 project
Subject: String module

Hello all,

One of the questions raised recently on comp.text.tex about the currently available LaTeX3 modules was the lack of "strings" functionality. Looking on CTAN, I can find a number of packages providing some string-like functionality:
 - substr
 - coolstr
 - stringstrings
 - xstring
(plus some functions in other packages). I'm sure there are others as well.

Taking a look through them, I can find some similarities but also a number of differences. Before trying to create some kind of "l3str" package, I thought it might be useful to see what the feeling is about
 (a) whether we need this at all,
 (b) what constitutes a string,
 (c) what functions are needed, and
 (d) anything else!

On (a), my feeling is that some kind of tool set is needed, given the clear desire to have one (see my list above of current packages). However, perhaps others disagree, as l3tl does provide a number of useful features already.

The first "big" question is what exactly a string is in a TeX context. If you look at the existing packages, they take differing approaches to handling items inside what they call strings. For example, some would consider "ab{cde}f" to be a string of four items: "a", "b", "cde" and "f", whereas other approaches would remove the "{" and "}" tokens. An obvious suggestion is that a string is something which has been \detokenize'd, but then you have to handle things like:

  \tl_new:N \l_my_tl
  \tl_set:Nn \l_my_tl { abc }
  \str_new:N \l_my_str
  \str_set:Nn \l_my_str { \l_my_tl def }

Here, do we allow an "x" variant so that \l_my_str ends up as "abcdef"? (I think so, but what do others consider sensible?) (I am of course imagining the functions used here!)

You also have to worry about what happens with special characters (for example, how do you get % into a string?). If you escape things at the input stage [say \% => % (catcode 12)] then a simple \detokenize will not work.

On features, things that seem to be popular:
 - Substring functions such as "x characters from one end", "first x characters", etc.
 - Search functions such as "where is string x in string y".
 - Case-changing functions.

What is sensible does not necessarily mean everything that is currently available (as I say, some things are handled nicely in l3tl). What are the priorities for others? Examples of the contexts in which things might be used would be helpful, as they may well guide the overall discussion.
-- 
Joseph Wright
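
To make the "x" variant question above concrete, here is a minimal sketch using only the e-TeX primitive \detokenize, not any proposed l3str interface. The macro names \mytl, \tmp, \strA and \strB are hypothetical stand-ins for the imagined \l_my_tl and \l_my_str; the point is only to show the difference between detokenizing the input as given and detokenizing it after full expansion.

```latex
\documentclass{article}
\begin{document}
% Hypothetical stand-in for \l_my_tl
\def\mytl{abc}

% "n"-type behaviour: no expansion before detokenizing.
% \strA holds the characters of "\mytl def", not "abcdef".
\edef\strA{\detokenize{\mytl def}}

% "x"-type behaviour: expand fully first, then detokenize.
\edef\tmp{\mytl def}
\edef\strB{\detokenize\expandafter{\tmp}}

\texttt{\meaning\strA}\par % macro:->\mytl def
\texttt{\meaning\strB}     % macro:->abcdef
\end{document}
```

Note that the unexpanded version keeps the control sequence name as characters (with the space that \detokenize adds after a control word), so whether a string setter expands its argument changes the result visibly.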