Date: Mon, 10 Oct 2011 19:24:46 +0200
From: Lars Hellström
Subject: Re: Strings, and regular expressions
To: LATEX-L@listserv.uni-heidelberg.de

Bruno Le Floch wrote 2011-10-10 17.07:
> Hello all,
>
> We just added on CTAN two related modules: l3str (string manipulation)
> and l3regex (regular expression matching and replacement).

Without having looked at it, I'll still say: Wow!

[snip]

> Speed requirements forbid a back-tracking approach,

Does that mean you compile to a finite automaton? Then double wow!

> hence back-references cannot be supported. Only "truly regular" features are
> implemented.

I, for one, have no love for backreferences (a.k.a. The Feature from the Black Lagoon).

> - I had the idea of providing # as a shorthand for .*? (arbitrary
> sequence of characters, lazy), mimicking what TeX does when finding a
> macro parameter. Is it useful?

For something called # in analogy with macros, one would probably also expect some interaction with capturing parentheses. Feels like overkill to me.

> - Same question for caseless matching, and for look-ahead/look-behind
> assertions.
>
> - A facility for matching a balanced group (e.g., as xparse does for
> optional arguments)? That is non-regular, and is difficult to
> implement, so I will only look at it if it is really needed.

For parsing text where balancing matters, I would suggest using Parsing Expression Grammars (instead of mimicking Perlish extensions to regexps): most of the expressive power of BNFs (and then some), none of the ambiguity, and capable of doing it in linear time!

Lars Hellström
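
Editorial illustration of the PEG suggestion above, not code from the thread or from l3regex: a minimal Python sketch of a recognizer for the two-rule grammar Group <- '{' Item* '}' and Item <- Group / [^{}]. The function name match_group and the choice of braces as delimiters are assumptions made for this example. Each character is consumed at most once and nothing is ever re-scanned, so the running time is linear in the input length for this grammar.

```python
from typing import Optional


def match_group(text: str, pos: int = 0) -> Optional[int]:
    """Recognize Group <- '{' Item* '}' starting at text[pos].

    Item <- Group / [^{}]  (ordered choice: try a nested group first).
    Returns the index just past the closing brace, or None on failure.
    Hypothetical helper for illustration; TeX escapes such as \\{ are ignored.
    """
    if pos >= len(text) or text[pos] != "{":
        return None  # a group must start with an opening brace
    pos += 1
    while pos < len(text):
        ch = text[pos]
        if ch == "}":
            return pos + 1  # closing brace of this group
        if ch == "{":
            end = match_group(text, pos)  # nested group
            if end is None:
                return None  # nested group never closed
            pos = end
        else:
            pos += 1  # any other character is a plain item
    return None  # input ended before the closing brace
```

For instance, match_group("{a{b}c}d") stops just past the outer closing brace and leaves the trailing "d" unconsumed, while an unbalanced input such as "{a{b}" yields None. The recursion depth equals the nesting depth of the input, so very deeply nested text would need an explicit stack instead.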