MIME-Version: 1.0
References: <CACKZBPWfcegW+5axif=KqKc_cAzpRU5dZq2U4Ti9q8A+zaq+FA@mail.gmail.com>
            <CANQYN6x3Hp5L0NoxnfEXF8Uk1GGFTOuZajeeuGCJpRSHp8HREQ@mail.gmail.com>
            <CACKZBPUuynMgXVtaOwTGQO-0PVt=4i4fG9-LBcPTnxsz2bMMUA@mail.gmail.com>
Content-Type: text/plain; charset=windows-1252
Message-ID:  <CANQYN6yN7TP9CsQkWAdM8TYMDEwqBKtKFGpy5N3Zm97GhvfcTw@mail.gmail.com>
Date:         Tue, 16 Jul 2013 20:45:33 -0400
Reply-To: Mailing list for the LaTeX3 project
              <LATEX-L@LISTSERV.UNI-HEIDELBERG.DE>
Sender: Mailing list for the LaTeX3 project <LATEX-L@LISTSERV.UNI-HEIDELBERG.DE>
From: Bruno Le Floch <blflatex@GMAIL.COM>
Subject: Re: Request for argument specifiers which generate unique csnames
To: LATEX-L@LISTSERV.UNI-HEIDELBERG.DE
In-Reply-To:  <CACKZBPUuynMgXVtaOwTGQO-0PVt=4i4fG9-LBcPTnxsz2bMMUA@mail.gmail.com>
Precedence: list
Content-Transfer-Encoding: quoted-printable
Envelope-To: <rainer.schoepf@GMX.NET>
Status: R

Hello Michiel,

Long email ahead: I got sidetracked into explaining a bit what I am
starting to envision for objects.

>> sufficiently rare that it does not warrant adding a new variant letter:
>
> I can understand that, if there are reasonable alternatives.

I agree that there is no alternative currently, and I am trying to
find what is the best approach.  My main claim so far is that a :U
specifier would not work/fit with the rest of expl3.  I agree that
some of my arguments were somewhat weak, so let me try to clarify
those.

One problem lies in the memory leak.  To avoid wasting csnames, the
user will have to be very careful to free those explicitly.  Using :U
to allocate, and a function to deallocate seems quite ugly, and
unaware users would very easily forget to deallocate resources.  This
is in my view a strong reason to prefer a function to create the
unique name, rather than a specifier, to match with the function
destroying the name.

Another reason, as mentioned above, is that new argument specifiers
are added only when they are both significantly more convenient than
simply providing a function and useful in sufficiently many
situations.  Actually, I think more can be said on that point (see
below).


A: Argument specifiers and variants
===========================

You mention the diversity of argument specifiers as a reason not to be
surprised by :U.  I disagree, because :U is meant as a variant, not
only an argument specifier.  There are two kinds of argument
specifiers:

- non-variants [NnTFpwD]
- variants of :n [ofVvx, and in the past e and d] and of :N [c].

Any new argument specifier, say :Z, should fit in one of those sets
(:U is a variant).

- If :Z is a non-variant, then it is used only for arguments of new
functions.  But such an argument is either a single token [N], a brace
group [n], or a weird thing [w].  Unless it suddenly came to happen
that some arguments of very many functions shared a similar
characteristic, which we would want to emphasize, there is no good
reason to add other argument specifiers.  What about T, F, p, and D?
Well, there are very many conditionals in expl3, and experience has
shown that the specifiers T and F are useful for visibility (they
would be even if we only provided the TF versions of conditionals and
not both T and F too).  What about p?  I'd say it is partly
historical, as \cs_new:Npn could as well have been \cs_new:Nwn.  What
about D?  Well, it would be more natural to label all of those
functions as :w and be done with it, but somehow they seem very
special, and they are meant to stand out.  I'm of two minds on this.
But basically, no new arg spec in this group.

- To understand the constraints if :Z is a variant, let me explain
some things.  The idea behind variants is that they only concern
expansion control, and nothing which happens there changes TeX's
memory.  In cases where TeX's memory is altered by variants, weird
things can happen, such as a result being changed non-trivially when
adding expansion on an unrelated parameter.

    \cs_set:Npn \test:NNn #1#2#3 { \iow_term:n {#2} }
    \cs_generate_variant:Nn \test:NNn { cc , ccx }
    { \test:ccn { abc } { \if_cs_exist:N \abc yes \else: no \fi: } { } } % \no
    { \test:ccx { abc } { \if_cs_exist:N \abc yes \else: no \fi: } { } } % \yes

This is currently a very rare problem, which only affects the
conditionals \if_cs_exist:N, \if_cs_exist:w and some esoteric and
not-yet public floating point trap business.  A new variant type :Z
should (1) alter "expl3 user"-accessible parts of TeX's memory as
little as possible, hence behave mostly as a pipe that takes some
tokens as input and returns other tokens, (2) be such that most
(all?) N-type arguments or most n-type arguments can be turned into
Z-type, and (3) perform some kind of expansion, as all variants do so
far (a weaker requirement).
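
The side effect behind that discrepancy can be isolated.  Here is a
minimal sketch, assuming a document with expl3 loaded and a name
\demo_fresh_name (made up for this illustration) that is not defined
anywhere else.  Building a name through a c-type argument leaves it
equal to \scan_stop:, which the primitive test \if_cs_exist:N notices
while \cs_if_exist:NTF does not.

    \ExplSyntaxOn
    % The name is still undefined at this point.
    \if_cs_exist:N \demo_fresh_name
      \iow_term:n { yes } \else: \iow_term:n { no } \fi:    % writes "no"
    % The c-type argument goes through \csname, which makes the name
    % equal to \scan_stop: as a side effect; \use_none:n then drops it.
    \exp_args:Nc \use_none:n { demo_fresh_name }
    \if_cs_exist:N \demo_fresh_name
      \iow_term:n { yes } \else: \iow_term:n { no } \fi:    % writes "yes"
    % The higher-level test treats \scan_stop: as "does not exist".
    \cs_if_exist:NTF \demo_fresh_name
      { \iow_term:n { yes } } { \iow_term:n { no } }        % writes "no"
    \ExplSyntaxOff

This is why only the primitive conditionals are affected: code that
sticks to \cs_if_exist:NTF cannot see whether a c-type expansion
happened earlier.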

The proposed :U and :u (besides having memory issues) break the
requirement (1) and the weak requirement (3), but are probably fine
with (2).  The proposed :r is borderline on (1), fine with (2), and
borderline on (3) since it performs a 'contraction', which is related
to expansion.

If there were no alternative to providing an argument specifier :r,
then I might have been in favor of adding it.  However, any
non-expandable specifier which is a variant of :n can be very well
approximated using a function that sets a temporary token list, then
using :o (or :V) expansion.  Namely, if :Z (here, :r) is a variant of
:n,

    \foo:NnZx \a { AB } <Z-arg> { ghi }
    % is equivalent to
    \tl_set:NZ \l_same_for_all_expansions_tl <Z-arg>
    \foo:Nnox \a { AB } \l_same_for_all_expansions_tl { ghi }

By the way, this once again assumes that expansion of arguments can be
done in any order.
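
As a concrete instance with functions that exist today, here is a
minimal sketch in which the non-expandable \prop_get:NnN plays the
role of the hypothetical \tl_set:NZ.  The names \demo_show:nn,
\l_demo_prop and \l_demo_tmp_tl are made up for the illustration, and
a document with expl3 loaded is assumed.

    \ExplSyntaxOn
    \prop_new:N \l_demo_prop
    \tl_new:N \l_demo_tmp_tl
    % Base function with two n-type arguments, plus a V variant for
    % the second one.
    \cs_new_protected:Npn \demo_show:nn #1#2
      { \iow_term:n { #1 ~ -> ~ #2 } }
    \cs_generate_variant:Nn \demo_show:nn { nV }
    \prop_put:Nnn \l_demo_prop { key } { value }
    % Step 1: the non-expandable operation stores its result.
    \prop_get:NnN \l_demo_prop { key } \l_demo_tmp_tl
    % Step 2: ordinary V expansion passes that result along.
    \demo_show:nV { key } \l_demo_tmp_tl    % writes "key -> value"
    \ExplSyntaxOff

The cost compared with a dedicated specifier is one extra line and one
scratch variable, which can be shared by all such calls.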


B: Baby-steps towards objects
=======================

>> In fact, perhaps the right approach is that I revive some ideas I had
>> about objects.  Mapping a function then amounts to repeatedly popping
>> the first item and acting with the function.  I roughly see how to
>> write a wrapper for that, which would mean that the package writer
>> would not need to worry about defining the mapping function: rather he
>> would define a \something_get:N or \something_pop:NN function, and the
>> framework would do the work of defining map_function and map_inline.
>> This is a longer term idea.
>
> This would be a very useful contribution. Anything to auto-generate
> all three mapping functions (including _map_tokens) from a simple
> description.

Here is a short version of what I have in mind.  The syntax is not
fixed at all, and let's say that not everybody agrees with me that
objects are cool.  Also, my use of words is probably not canonical,
sorry.

A "type" is defined by setting up a property list (two actually)
matching a word (new:N, put_right:Nn, tail:n, ...) with a function
which implements it for that type (\seq_new:N, \seq_put_right:Nn,
\ERROR, ...).  An "object" of that type is simply a token list
starting with a specific marker \s__<type> and ending with \s_stop.
This means that one can manipulate such objects (props, seqs, etc.) as
n-type arguments, rather than the current constraint that they be of
N-type always.

Functions to manipulate the object can still be called directly as
they are right now, with no overhead.  They can also be called through
an \obj:Nn function, which takes care of finding the type of the first
argument.  E.g., perhaps,

    \prop_new:N \l_my_prop
    \prop_put:Nnn \l_my_prop { key } { value }

    \iow_term:x { \obj:Nn \l_my_prop { get:Nn = { key } , head:n } }

    \obj:Nn \l_my_prop
      {
        pop:NnN = { key } ,
        tail:N ,
        map_inline:Nn = { \iow_term:n {#1} }
      }

could do (almost) the equivalent of

    \prop_new:N \l_my_prop
    \prop_put:Nnn \l_my_prop { key } { value }

    \iow_term:x { \tl_head:f { \prop_get:Nn \l_my_prop { key } } }

    \prop_pop:NnN \l_my_prop { key } \l__obj_internal_tl
    \tl_set:Nx \l__obj_internal_tl { \tl_tail:N \l__obj_internal_tl }
    \tl_map_inline:Nn \l__obj_internal_tl { \iow_term:n {#1} }

Now, in principle many of the operations performed on a given type can
be built from very few building blocks.  One important block, for
objects over which we can map, extracts the first item and the
remainder of the object (as an object of the same type), and leaves
the two parts as brace groups.  Specifically, we want a
"\something_split:nNF" function which takes as arguments an object #1
of type <something>, a two-argument function #2, and some tokens #3,
and outputs #2 followed by the two groups obtained from #1 if
extracting the head was successful (the <something> was non-empty) or
#3 if the <something> was empty.  From such a function, one can derive
mapping functions, a get_left function, a get_right function, and all
sorts of jolly things.

In practice, this would be done by letting a type defer to another for
unknown methods.  There would be a built-in <iterable> type, from
which the <something> type could derive.  The <iterable> type would
define several operations in terms of the split:nNF function of the
object.  For instance, here are get_left and map_function.  Getting an
approach to mapping that is linear in the number of items in the
object (say, for a <seq> object) is a bit more tricky, and requires
more than just split:nNF.  Also, there is no support right now for
breaking out of the mapping, but it can in principle be added just by
updating the <iterable> type, without altering <something>.

    %<@@=iterable>

    \cs_new_protected_nopar:Npn \iterable_get_left:NN
      { \exp_args:NV \iterable_get_left:nN }
    \cs_new_protected:Npn \iterable_get_left:nN #1
      { \obj:nn {#1} { split:nNF = \@@_get_left:nnN { \@@_get_left:N } } }
    \cs_new_protected:Npn \@@_get_left:nnN #1#2#3 { \obj_set:Nn #3 {#1} }
    \cs_new_protected:Npn \@@_get_left:N #1 { \obj_set:Nn #1 { \q_no_value } }

    \cs_new_protected_nopar:Npn \iterable_map_function:NN
      { \exp_args:NV \iterable_map_function:nN }
    \cs_new_protected:Npn \iterable_map_function:nN #1
      % \use_none:n removes the mapped function once the object is empty
      { \obj:nn {#1} { split:nNF = \@@_map_function:nnN { \use_none:n } } }
    \cs_new_protected:Npn \@@_map_function:nnN #1#2#3
      { #3 {#1} \iterable_map_function:nN {#2} #3 }

Defining a \something_split:nNF function (and adding some boiler-plate
code to declare the type as a subtype of "iterable") would then
suffice to get get_left, map_function, and many others (not as
explicit functions \something_..., but as methods accessible as
\obj:Nn \l_the_best_something { ... }).
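
To make the split idea concrete without the (still hypothetical) \obj
framework, here is a minimal sketch in which the "object" is just a
plain token list, not the \s__<type> ... \s_stop format described
above.  All \demo_... names are made up, and the sketch assumes an
expl3 recent enough to provide e-type expansion.

    \ExplSyntaxOn
    % Helper: call #1 with two brace groups.
    \cs_new:Npn \__demo_split:Nnn #1#2#3 { #1 {#2} {#3} }
    \cs_generate_variant:Nn \__demo_split:Nnn { Nee }
    % #1: token list, #2: two-argument function, #3: code used when
    % #1 has no item.
    \cs_new:Npn \demo_split:nNF #1#2#3
      {
        \tl_if_blank:nTF {#1}
          {#3}
          { \__demo_split:Nee #2 { \tl_head:n {#1} } { \tl_tail:n {#1} } }
      }
    % Mapping built from split alone.  The mapped function is picked
    % up from the input stream; \use_none:n drops it at the end.
    \cs_new:Npn \demo_map_function:nN #1
      { \demo_split:nNF {#1} \__demo_map:nnN { \use_none:n } }
    \cs_new:Npn \__demo_map:nnN #1#2#3
      { #3 {#1} \demo_map_function:nN {#2} #3 }
    % Usage: writes "a", then "bc", then "d" to the terminal.
    \cs_new_protected:Npn \demo_show:n #1 { \iow_term:n {#1} }
    \demo_map_function:nN { a {bc} d } \demo_show:n
    \ExplSyntaxOff

Each step rescans the remainder of the list, so this mapping is
quadratic in the number of items, which is the cost mentioned above of
relying on split alone.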


C: Cutting things short, no functions today
===============================

> Generally, any situation where an unbounded number of separate
> internal macros are needed without the API user providing names for
> them. I use it for `\with`. I use it for my ":r pointers".

I need to think.  But I believe that writing a separate package for
manipulating pointers, or providing unique names, is the best option.
One big advantage would be that it allows things to be refined more
easily by seeing what is useful in practice (I am happy to look at
code and applications to real-world or somewhat-imaginary documents),
rather than codifying something and realizing later that it was a
mistake.

> \with:Vn \l_variable_tl {
>     \bla:n { \do_not_expand_me bla bla bla #1 bla bla }
> }

This situation does not need unique identifiers.  Simply

    \cs_new_protected:Npn \with:nn #1#2
      { \cs_set_protected:Npn \__with_tmp:n ##1 {#2} \__with_tmp:n {#1} }

no?
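
For completeness, a usage sketch of that definition together with the
V variant from your example.  The variable \l_demo_name_tl is made up
for the illustration, and a document with expl3 loaded is assumed.

    \ExplSyntaxOn
    \cs_new_protected:Npn \with:nn #1#2
      { \cs_set_protected:Npn \__with_tmp:n ##1 {#2} \__with_tmp:n {#1} }
    \cs_generate_variant:Nn \with:nn { Vn }
    \tl_new:N \l_demo_name_tl
    \tl_set:Nn \l_demo_name_tl { world }
    % #1 in the second argument stands for the value of the variable.
    \with:Vn \l_demo_name_tl
      { \iow_term:n { Hello,~#1! } }    % writes "Hello, world!"
    \ExplSyntaxOff

A single temporary name is enough because \__with_tmp:n is expanded
immediately after being defined, so a nested \with:nn in the body can
safely redefine it.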

> Here's how I currently implement my `\something_map_inline`s. I use
> the non-expl3 version of `\with`, because expl3 doesn't have :U.
>
> \cs_new_protected:Nn \something_map_inline:Nn {
>     \with {Unn} [map_inline] [#2] {
>         \cs_set:cpn {##1} ####1####2 {##2}
>         \something_map_function:NN #1 {##1}
>     }
> }

I'm not fond of the fact that ##1 and ##2 in the body of \with are not
at all on the same footing, since ##1 is the unique name and ##2 is
the original #2.  Also, passing arbitrary parameters (here #2)
surrounded only by square brackets is very fragile, as it tends to
break if #2 contains some ].  One rather silly solution would be to
pass both #1 and #2 as arguments to \with.  This doubles hashes once
more:

    \cs_new_protected:Nn \something_map_inline:Nn {
      \with ... {#1} {#2} {
        \cs_set:cpn {##1} ########1 ########2 {####2}
        \something_map_function:Nc ####1 {##1} } }

> Finally, here's how I create a new pointer with an initialized value:
>
> \cs_new_protected:Nn \ptr_new:Nn {
>     \with{u} [ptr] [#2] {
>         \tl_set:Nn #1 {##1}
>         \tl_new:c     {##1}
>         \tl_set:cn    {##1} {##2}
>     }
> }

I see that, but I think the ptr package should actually be responsible
for building its own unique names.  This needs more thought, but some
of it should be spent first on finding a good object model that would
fit nicely with the rest of TeX.
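
To show what "building its own unique names" could look like, here is
a minimal sketch based on a global counter.  It is only one possible
design and makes no attempt to release or reuse names, so it does not
address the memory issue discussed at the start of this email.  The
names \g__ptr_count_int, \ptr_use:N and \l_demo_ptr_tl are made up,
and a document with expl3 loaded is assumed.

    \ExplSyntaxOn
    \int_new:N \g__ptr_count_int
    % #1: local tl variable that receives the pointer (a csname, as
    % text), #2: initial value stored under that name.
    \cs_new_protected:Npn \ptr_new:Nn #1#2
      {
        \int_gincr:N \g__ptr_count_int
        \tl_set:Nx #1 { g__ptr_ \int_use:N \g__ptr_count_int _tl }
        \tl_new:c {#1}
        \tl_gset:cn {#1} {#2}
      }
    % Expandable access to the value the pointer refers to.
    \cs_new:Npn \ptr_use:N #1 { \tl_use:c {#1} }
    \tl_new:N \l_demo_ptr_tl
    \ptr_new:Nn \l_demo_ptr_tl { some~value }
    \iow_term:x { \ptr_use:N \l_demo_ptr_tl }    % writes "some value"
    \ExplSyntaxOff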

Regards,
Bruno