Message-ID:  <4A8252A3.7050805@morningstar2.co.uk>
Date:         Wed, 12 Aug 2009 06:26:59 +0100
Reply-To: Mailing list for the LaTeX3 project
              <LATEX-L@LISTSERV.UNI-HEIDELBERG.DE>
Sender: Mailing list for the LaTeX3 project <LATEX-L@LISTSERV.UNI-HEIDELBERG.DE>
From: Joseph Wright <joseph.wright@MORNINGSTAR2.CO.UK>
Subject: Re: xparse
To: LATEX-L@LISTSERV.UNI-HEIDELBERG.DE
In-Reply-To:  <13962923-07A3-4C66-B144-E728DBC10183@gmail.com>

Will Robertson wrote:
> So how would \my_sanitise:n above be written? If it's not expandable,
> then the results of modifying #1 must be written to a standard scratch
> variable (right?). At which stage I'm not convinced we've saved any
> confusion over the original way to write this sort of thing.
> 
> For example:
> 
> \cs_set:Nn \my_sanitise:n {
>   \tl_set:Nn \l_scratch_tl {#1}
>   \tl_replace_in:Nnn \l_scratch_tl {&} {&amp;}
>   % but what to do now with \l_scratch_tl ??
> }
> 
> If it's easier to hook post-processing steps into the argument stage but
> *harder* to write the functions that do so, have we gained much? Enough
> to justify the initial increase in complexity?
> 
> Maybe to re-write my concern; how could we simplify this example from
> xdoc2l3: (use as much hypothetical syntactic sugar as you like)
> 
> \gdef\XD@gp@intrange#1#2#3\XD@endgrab#4{%
>    \count@=#4\relax
>    \ifnum #1>\count@
>       \PackageError{xdoc2l3}%
>         {Too small! Changed to minimum value #1}\@ehc
>       \count@=#1\relax
>    \else\ifnum #2<\count@
>       \PackageError{xdoc2l3}%
>         {Too large! Changed to maximum value #2}\@ehc
>       \count@=#2\relax
>    \fi\fi
>    \expandafter\XD@gh@put \expandafter{\expandafter{\the\count@}}{#3}%
> }
> \gdef\XD@pgp@intrange#1#2{%
>    \XD@macro@prepend\@tempa{\XD@gp@intrange{#1}{#2}}%
>    \XD@parse@processor
> }
> 
> In my opinion, based on this example, it will be easier for users of
> xparse to write their own argument-parsing functions as nested
> definitions inside \DeclareDocumentCommand than to write something like
> the above. Notice the number of necessary but implicit macros that the
> author must know about:
> 
>   \XD@gp@...
>   \XD@pgp@...
>   \XD@endgrab
>   \XD@gh@put
>   \XD@macro@prepend\@tempa
>   \XD@parse@processor
> 
> To be clear, I'm not yet opposed to this style of argument processing
> and I *do* think there are some simplifications that can be made to the
> code above to make it more palatable for users to write.

Perhaps best would be to design the system so that

\ArgProcessorFunction

would expand to

\arg_processors:nN

which is then given the argument as #1 and the return variable as #2
(a toks, so that nothing can go wrong with # tokens). As long as the
processor sets #2 to whatever #1 turns into, it does not need to know
anything about the internal xparse structure (beyond the fact that it
returns its result in a toks). All we then have to do is put the value
of that toks into the correct part of the output toks we build.
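
A minimal sketch of what such a processor could look like, written with
plain LaTeX2e token registers rather than any real xparse interface. All
the names here (\my@return@toks, \my@intrange@processor, mypkg) are
hypothetical, and the two extra range parameters are an assumption
borrowed from the \XD@gp@intrange example quoted above. A warning is used
instead of an error so that the example runs without stopping.

```latex
\documentclass{article}
\makeatletter
% Hypothetical return register: stands in for the "return variable".
\newtoks\my@return@toks
% Hypothetical processor.
%   #1 = minimum, #2 = maximum  (fixed when the processor is declared)
%   #3 = the grabbed argument,  #4 = token register to return the result in
\newcommand*\my@intrange@processor[4]{%
  \count@=#3\relax
  \ifnum\count@<#1\relax
    \PackageWarning{mypkg}{Too small! Changed to minimum value #1}%
    \count@=#1\relax
  \else\ifnum\count@>#2\relax
    \PackageWarning{mypkg}{Too large! Changed to maximum value #2}%
    \count@=#2\relax
  \fi\fi
  % Return the result: only the register that was handed in is touched.
  #4=\expandafter{\the\count@}%
}
\makeatother
\begin{document}
\makeatletter
\my@intrange@processor{1}{10}{42}\my@return@toks
\the\my@return@toks % typesets 10, since 42 is clamped to the maximum
\makeatother
\end{document}
```

The processor never sees how the argument was grabbed or where the result
ends up; the caller decides what to do with the contents of the register.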

> But before we agree to include such complex ideas into what is now a
> simple package (for end-users, I mean), I think it would be good to
> verify that a distillation of xdoc2l3 will "fit" into the philosophy of
> xparse.

I've already suggested that xparse should provide some basic processors,
if we are going to do this at all! It just depends how many.

> After I say that, I suppose it would be remiss of me to go away without
> actually mentioning how I think the ideas in xdoc2l3 might work in
> xparse(-alt).
> 
> First of all, I think we should drop the idea of using more letters for
> shorthands. As we've discussed, each package will use different letters
> and clashes will occur very quickly.
> 
> Secondly, for simplicity let's drop the idea of toks-grabbing as a proxy
> for argument grabbing.
> 
> Thirdly, use actual function names rather than character strings to
> represent argument processors.
> 
> From xdoc2l3, there seem to be two broad cases for argument processing:
>     \argumentprocess{#1}
> and
>     \argumentprocess#1<delimiter>
> 
> These can be referred to in the argument definitions as something like
>     >\CS
> and
>     @\CS{DELIMITER}
> where \CS is the processing function and DELIMITER is what's inserted
> after the argument, in the second case.
> 
> Finally, allow these to be chained together if necessary, as in
> something like this: (\foo takes one argument)
> 
> \DeclareDocumentCommand \foo {
>   @\foo_split_colons:n{::\q_nil} >\foo_range_check:nnn m
> }{...}
> 
> Open questions:
> 
> 1.  Does this cover the functionality that you miss from xparse in xdoc2l3?
> 2.  Is it simple enough for xparse?
> 3.  Assuming that we must deal with both expandable and unexpandable
> argument processing, is there a straightforward way we can write the
> argument processors? (E.g., see how boolean expressions are written in
> expl3; a consistent interface is used for both expandable and
> unexpandable tests.)
> 4.  Does that end up being any easier than writing things by hand?
> 
> I've got some sort of niggling concern that this approach mixes "how the
> arguments are parsed" with "what the structure of the arguments is",
> which xparse is more careful not to do.

As I've indicated, I'm still more in favour of "grab arguments, then
process them in a separate step". I also feel that "daisy chaining" is
not that easy to implement, although you could imagine:

\DeclareDocumentCommand \foo { >{ \ProcessorOne \ProcessorTwo }m }

where each processor is applied in turn, and the processed argument it
returns is then used as the input to the next processor.
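
A rough sketch of how that chaining might be driven, again in plain
LaTeX2e and not tied to any real xparse internals. It assumes each
processor takes its input as #1 and a token register to return into as
#2; the names \my@chain@toks, \my@proc@brackets, \my@proc@parens and
\my@apply@chain are all hypothetical. The loop uses the LaTeX2e kernel
macro \@tfor to step through the list of processors.

```latex
\documentclass{article}
\makeatletter
\newtoks\my@chain@toks  % hypothetical register holding the running result
% Two hypothetical processors: #1 = input, #2 = token register to return into.
\newcommand\my@proc@brackets[2]{#2={[#1]}}
\newcommand\my@proc@parens[2]{#2={(#1)}}
% Hypothetical driver: #1 = list of processors, #2 = grabbed argument.
\newcommand\my@apply@chain[2]{%
  \my@chain@toks={#2}%
  \@tfor\my@proc:=#1\do{%
    % Expand the register first, so the processor receives its current
    % contents as input and is free to overwrite the same register.
    \expandafter\my@proc\expandafter{\the\my@chain@toks}\my@chain@toks
  }%
}
\makeatother
\begin{document}
\makeatletter
\my@apply@chain{\my@proc@brackets\my@proc@parens}{x}%
\the\my@chain@toks % typesets ([x])
\makeatother
\end{document}
```

Because every processor reads from and writes to the same register, the
driver needs no knowledge of what any individual processor does.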
-- 
Joseph Wright