From: Bruno Le Floch
Date: Fri, 3 Aug 2012 16:52:50 +0200
Subject: Re: Peek ahead for next token not in token-list
To: LATEX-L@listserv.uni-heidelberg.de
Reply-To: Mailing list for the LaTeX3 project

Hello Joel,

I promised to get back to you earlier but didn't, sorry about that.
I'm replying to two emails in one, and the result is somewhat long,
hopefully helpful.

> I've been developing my xpeek package [...]
> see .

I see that you use the "NPC" prefix in xpeek, probably because of some
code I had written (back when you were asking for a \NewPeekCommand
command). It may be better to use xpeek as a prefix: since there can
be no two packages on CTAN with the same name, using that name as a
prefix for internal commands should avoid clashes. Furthermore, it
would be best if you use the convention \__xpeek_... for internal
commands, and \l__xpeek_... for internal variables. You probably
don't have any public code-level functions \xpeek_... or variables
\l_xpeek_..., but this would be the conventional beginning.

To make the internal convention more convenient and shorter to type,
we recently introduced l3docstrip. Replace docstrip by l3docstrip,
and replace "xpeek" (or "NPC") by "@@" in all names. Then add

    %    \begin{macrocode}
    %<@@=xpeek>
    %    \end{macrocode}

near the start of the implementation section (see e.g., some l3kernel
modules for a model).
This change will make it very easy to change the module name if
needed, will make the code shorter, and will make the command names
less accessible from outside.

> \textit{foof}\xspace.
> \textit{foof}\xspace!
>
> Thinking about the problem, it seems I need the ability to scan ahead
> in the input stream, ignoring tokens from one list while looking for
> tokens from another. In Expl3 terms, I’m hoping to define something
> like `\peek_inlist_ignore_auxlist:nnTF`.

It should be \xpeek, or \@@ (transformed to \__xpeek), not \peek in
any case :). I think it is very important not to use the kernel
namespace even when the command name would make more sense with such
a name. For instance, in randomwalk.sty I have
\@@_int_set_to_random:Nnn, not \int_set_to_random:Nnn.

> \peek_ignore_list:N \ignorelist
> `\l_peek_token'

This syntax is impossible to achieve since \peek_ignore_list:N has no
way to know where the `\l_peek_token' "argument" is supposed to end.

> The direction I’m considering is to read ahead, consuming tokens. Each
> token read is added to a save-list and compared to the ignore-list. If
> it’s on the ignore-list, continue; otherwise put the save-list back on
> the input stream and stop.
>
> Does this sound reasonable so far?

Somewhat reasonable, yes. I'm not sure what the best approach is.
You need to collect the tokens in your ignore list, and you then need
to perform an action depending on the next token.

It is possible to define \xpeek_collect_do:nn, whose first argument
is a list of tokens to ignore and whose second argument is some
operation to perform, which will receive the collected tokens as an
argument:

    \xpeek_collect_do:nn { abc } { \foo \bar } caada
    => \foo \bar { caa } da

Assuming that we have this function (see below for an
implementation), and that the following token (the first which is not
collected) has its meaning copied to \l_peek_token (like any \peek
function), then we can build a \nextnonpunct as

    \DeclareDocumentCommand { \nextnonpunct } { }
      {
        \xpeek_collect_do:nn { .,!? }
          { ` \l_peek_token ' \use:n }
      }

where the \use:n unbraces whatever punctuation \xpeek_collect_do:nn
has collected.

How is \xpeek_collect_do:nn implemented? I'm introducing a quark
just to have a macro different from anything you may see when peeking
ahead: then \peek_meaning:NF always takes the F branch. Not happy
about that hack.

    \quark_new:N \q_@@
    \bool_new:N \l_@@_ignore_bool
    \cs_new_protected:Npn \xpeek_collect_do:nn #1#2
      { \@@_collect_do:nnnn { #1 } { #2 } { } { } }
    % #1: ignore list, #2: action, #3: tokens collected so far,
    % #4: token grabbed from the input stream (empty on the first call)
    \cs_new_protected:Npn \@@_collect_do:nnnn #1#2#3#4
      {
        % \q_@@ never matches, so this only sets \l_peek_token
        \peek_meaning:NF \q_@@
          {
            \bool_set_false:N \l_@@_ignore_bool
            \tl_map_inline:nn {#1}
              {
                \token_if_eq_charcode:NNT \l_peek_token ##1
                  {
                    \bool_set_true:N \l_@@_ignore_bool
                    \tl_map_break:
                  }
              }
            \bool_if:NTF \l_@@_ignore_bool
              { \@@_collect_do:nnnn {#1} {#2} { #3#4 } }
              { #2 { #3#4 } }
          }
      }

> To consume tokens one-by-one, I built this function:
>
> \cs_new_protected:Npn \peek_meaning_really_remove:NTF #1 #2 #3
>   {
>     \peek_meaning_remove:NTF #1
>       { #2 }
>       {
>         \peek_meaning_remove:NT \l_peek_token
>           { #3 }
>       }
>   }

Well, that would remove tokens, not collect them.

> (This should be created via \prg_new_conditional, but I haven’t yet
> figured that out.)

It is (pretty much?) impossible to define peek-like functions as
conditionals.

> Is the direction I'm taking appropriate for what I'm trying to do?

Yes.

> Is there some existing functionality that would help that I'm overlooking?

Not really. I think we should add \peek_after:nw to cover my use of
\peek_meaning:NF \q_@@ in the code above. That would make the code
reasonably clean. I've added this function to
l3trial/l3kernel-extras, not on CTAN, only on the SVN repository.

One correct long-term approach would be to provide a parser for some
class of grammar, but that is extremely hard in TeX (the regular
expression parser l3regex took me about 4 months of hard work). So
don't expect this any time soon.

At least for now, I think the \xpeek_collect_do:nn code I give above
is (up to a few improvements) a reasonable approach to practical
situations where someone wants to look ahead in the input stream. So
I'd say, provide \xpeek_collect_do:nn or a similar functionality as a
public code-level function in your xpeek package.

On 7/30/12, Joel C. Salomon wrote:
> After some experimentation, it seems that the \peek_* family of
> functions don't work well inside l3prg conditionals; source3.pdf seems
> to bear this out in the justification for \__peek_def:nnnn.

Indeed: consider

    \prg_new_conditional:Npnn \foo:n #1 { TF } { \prg_return_true: }

This is (currently) equivalent to

    \cs_new:Npn \foo:nTF #1 { \prg_return_true: \c_zero }

and the \prg_return_true: \c_zero combination is equivalent to
\use_i:nn (see definition of \prg_return_true:), which selects the
true branch and discards the false branch. Note how the \foo:nTF
macro only takes one argument: the other two "arguments" are left in
the input stream until the last moment, when \prg_return_true/false:
selects one of the two. The problem with peek functions is that they
need to see past those conditional branches in the input stream.

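
To make the point about the branches concrete, here is a small
sketch. The conditional \demo_if_one:n is made up for illustration
only (it is not part of any package), and the comments describe the
state of the input stream rather than any particular kernel
internals.

```latex
\ExplSyntaxOn
% Hypothetical conditional, defined only for this illustration.
\prg_new_conditional:Npnn \demo_if_one:n #1 { TF }
  {
    \int_compare:nNnTF {#1} = { 1 }
      { \prg_return_true: }
      { \prg_return_false: }
  }
% In
%
%   \demo_if_one:nTF { 1 } { yes } { no } !
%
% the macro \demo_if_one:nTF absorbs only the argument { 1 }.  While
% its code runs, the input stream still reads
%
%   { yes } { no } !
%
% so a peek performed at that point would see the opening brace of
% { yes }, not the "!" that follows the two branches.
\ExplSyntaxOff
```
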
Thus, \peek_meaning:NTF is roughly

    \cs_new_protected:Npn \peek_meaning:NTF #1#2#3
      {
        \cs_set_eq:NN \l__peek_search_token #1
        \cs_set_nopar:Npx \__peek_true:w { \exp_not:n {#2} }
        \cs_set_nopar:Npx \__peek_false:w { \exp_not:n {#3} }
        \peek_after:Nw \__peek_meaning:
      }
    \cs_new_protected_nopar:Npn \__peek_meaning:
      {
        \token_if_eq_meaning:NNTF \l__peek_search_token \l_peek_token
          { \__peek_true:w }
          { \__peek_false:w }
      }

The T and F arguments must be taken out of the input stream, stored
into dedicated functions \__peek_true:w and \__peek_false:w, and put
back after the test.

> On TeX.SE, Clemens Niederberger posted an answer to the specific
> question I'd posed; see .
> It works well, but it's built on recursive expansion of macros with :w
> specifiers that I'm really not understanding. I'm thinking, therefore,
> that I'm better off getting help implementing the functionality I want
> in parts.

I suspect his solution is needlessly complicated (he seems to test if
the token is in the ignore list in a roundabout way).

> What sorts of restrictions are there on the use of \l_peek_token
> inside the true-code & false-code branches of the \peek_* functions?

None, as far as I know.

> Is it reasonable to use \__peek_def:nnnn to generate something like
> \peek_unconditional:TF? (The false-code branch should never execute, I
> expect.)

Definitely not. \__peek_def:nnnn is internal, and may change at a
whim. We have been careful to mark internal functions as such, and
make no guarantee whatsoever that they will remain. The function you
want is \peek_after:nw (see l3kernel-extras), and for now, you can
use your own copy

    \tl_new:N \l__xpeek_code_tl
    \cs_new_protected:Npn \xpeek_after:nw #1
      {
        \tl_set:Nn \l__xpeek_code_tl {#1}
        \peek_after:Nw \l__xpeek_code_tl
      }

> Actually, it's \peek_unconditional_remove:T I think I need.

I don't think you need that one since the token should be kept
somewhere.
The copy \l_peek_token is not appropriate, since that control
sequence will later be changed to the next token in the input stream.
Think of \l_peek_token as a pointer (that's almost not a lie), which
TeX can unfortunately not dereference.

> \tl_new:N \g_jcs_matchlist_tl
> \tl_new:N \g_jcs_ignorelist_tl
> \tl_new:N \l_jcs_ignored_tokens_tl
>
> \cs_new:Npn \jcs_peek_in_matchlist_ignore_ignorelist:TF #1#2
>   {
>     \tl_clear:N \l_jcs_ignored_tokens_tl
>     \__jcs_peek_in_matchlist_ignore_ignorelist_aux:TF {#1}{#2}
>   }

Braces missing.

> \cs_new:Npn \__jcs_peek_in_matchlist_ignore_ignorelist_aux:TF #1#2
>   {
>     \peek_unconditional_remove:T
>       {
>         \tl_if_in:N?TF \g_jcs_ignorelist_tl { something involving
> \l_peek_token }

Not possible, unfortunately. You have to map through
\g_jcs_ignorelist_tl, comparing \l_peek_token to each token in the
ignorelist (see code for \xpeek_collect_do:nn above).

>           {
>             \tl_put_right:N? \l_jcs_ignored_tokens_tl { something
> involving \l_peek_token }
>             keep looking, probably by recursing

Yes, that's roughly what I'm doing. I'm storing the tokens as macro
arguments #3 and #4 of \@@_collect_do:nnnn, but that's not very
sensible; storing them in a token list is better.

>             \tl_use:N \l_jcs_ignored_tokens_tl
>             \tl_if_in:N?TF \g_jcs_matchlistlist_tl { something
> involving \l_peek_token }
>               {#1} {#2}

Again, \tl_if_in is not usable here. You should probably define an
auxiliary test

    \bool_new:N \l_@@_bool
    \prg_new_protected_conditional:Npnn \@@_if_in:NN #1#2 { TF }
      {
        \bool_set_false:N \l_@@_bool
        \tl_map_inline:Nn #1
          {
            \token_if_eq_charcode:NNT #2 ##1
              {
                \bool_set_true:N \l_@@_bool
                \tl_map_break:
              }
          }
        \bool_if:NTF \l_@@_bool
          { \prg_return_true: }
          { \prg_return_false: }
      }

Used as \@@_if_in:NNTF \g_jcs_ignorelist_tl \l_peek_token { } { }.

> Does this sound like the correct path to head down?

Yes.

Best regards,
Bruno
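
For reference, a rough sketch of the token-list variant mentioned
above, where the collected tokens are stored in a token list rather
than in #3 and #4. Treat it as a sketch under assumptions, not a
finished implementation: the names \l__xpeek_ignore_tl,
\l__xpeek_action_tl, \l__xpeek_collected_tl, \l__xpeek_bool,
\__xpeek_collect_loop:, \__xpeek_collect_one:N and
\__xpeek_collect_end: are invented here, names are written out in
full rather than with @@, every item of the ignore list is assumed to
be a single character token, and \use:e requires an l3kernel much
more recent than this message.

```latex
\ExplSyntaxOn
\tl_new:N   \l__xpeek_code_tl
\tl_new:N   \l__xpeek_ignore_tl      % hypothetical: tokens to skip over
\tl_new:N   \l__xpeek_action_tl      % hypothetical: code run at the end
\tl_new:N   \l__xpeek_collected_tl   % hypothetical: tokens collected so far
\bool_new:N \l__xpeek_bool

% Own copy of \peek_after:nw, as given earlier in the message.
\cs_new_protected:Npn \xpeek_after:nw #1
  {
    \tl_set:Nn \l__xpeek_code_tl {#1}
    \peek_after:Nw \l__xpeek_code_tl
  }

% Membership test by character code, as in the auxiliary test above.
\prg_new_protected_conditional:Npnn \__xpeek_if_in:NN #1#2 { TF }
  {
    \bool_set_false:N \l__xpeek_bool
    \tl_map_inline:Nn #1
      {
        \token_if_eq_charcode:NNT #2 ##1
          { \bool_set_true:N \l__xpeek_bool \tl_map_break: }
      }
    \bool_if:NTF \l__xpeek_bool
      { \prg_return_true: }
      { \prg_return_false: }
  }

\cs_new_protected:Npn \xpeek_collect_do:nn #1#2
  {
    \tl_set:Nn \l__xpeek_ignore_tl {#1}
    \tl_set:Nn \l__xpeek_action_tl {#2}
    \tl_clear:N \l__xpeek_collected_tl
    \__xpeek_collect_loop:
  }

% Peek at the next token; either grab it or finish.
\cs_new_protected:Npn \__xpeek_collect_loop:
  {
    \xpeek_after:nw
      {
        \__xpeek_if_in:NNTF \l__xpeek_ignore_tl \l_peek_token
          { \__xpeek_collect_one:N }
          { \__xpeek_collect_end: }
      }
  }

% Remove one token from the input stream, store it, and look again.
\cs_new_protected:Npn \__xpeek_collect_one:N #1
  {
    \tl_put_right:Nn \l__xpeek_collected_tl {#1}
    \__xpeek_collect_loop:
  }

% Leave <action> { <collected tokens> } in front of the remaining input.
\cs_new_protected:Npn \__xpeek_collect_end:
  {
    \use:e
      {
        \exp_not:V \l__xpeek_action_tl
        { \exp_not:V \l__xpeek_collected_tl }
      }
  }
\ExplSyntaxOff
```

With these definitions, \xpeek_collect_do:nn { abc } { \foo \bar } caada
should again leave \foo \bar { caa } da in the input stream, with
\l_peek_token holding the meaning of the "d".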