Re: [RFC] Pipe Operator (again)

Date: Fri, 07 Feb 2025 22:54:47 +0000
Subject: Re: [RFC] Pipe Operator (again)
Groups: php.internals


On Fri, Feb 7, 2025, at 22:04, Larry Garfield wrote:
> Merging a few replies together here, since they overlap.  Also reordering a few of Tim's
> comments...
> 
> On Fri, Feb 7, 2025, at 7:32 AM, Tim Düsterhus wrote:
> > Hi
> >
> > Am 2025-02-07 05:57, schrieb Larry Garfield:
> >> It is now back with a better implementation (many thanks to Ilija for 
> >> his help and guidance in that), and it's nowhere close to freeze, so 
> >> here we go again:
> >> 
> >> https://wiki.php.net/rfc/pipe-operator-v3
> >
> > There's some editorial issues:
> >
> > 1. Status: Draft needs to be updated.
> > 2. The RFC needs to be added to the overview page.
> > 3. List formatting issues in “Future Scope” and “Patches and Tests”.
> >
> > Would also help having a closed voting widget in the “Proposed Voting 
> > Choices” section to be crystal clear on what is being voted on (see 
> > below the next quote).
> 
> I split pipes off from the Composition RFC late last night right before posting; I guess I
> missed a few things while doing so. :-/  Most notably, the Compose section is now removed from
> pipes, as it is not in scope for this RFC.  (As noted, it's going to be more work so has its
> own RFC.)  Sorry for the confusion.  I think it should all be handled now.
> 
> > 5. The “References” (as in reference variables) section would do well 
> > with an example of what doesn't work.
> 
> Example block added.
> 
> > 9. In the “Why in the engine?” section: The RFC makes a claim about 
> > performance.
> >
> > Do you have any numbers?
> 
> Not currently.  The statements here are based on simply counting the number of function calls
> necessary, and PHP function calls are sadly non-cheap.  In previous benchmarks of my own libraries
> using my Crell/fp library, I did find that the number of function calls involved in some tight pipe
> operations was both a performance and debugging concern, but I don't have any hard numbers
> lying about at present to share.
> 
> If you think that's critical, please advise on how to best get meaningful numbers here.
> 
> Regarding the equivalency of pipes:
> 
> Tim Düsterhus wrote:
> > 4. “That is, the following two code fragments are also exactly 
> > equivalent:”.
> >
> > I do not believe this is true (specifically referring to the “exactly” 
> > word in there), since the second code fragment does not have the short 
> > closures, which likely results in an observable behavioral difference 
> > when throwing Exceptions (in the stack trace) and also for debuggers. Or 
> > is the implementation able to elide the extra closure? (Of course 
> > there's also the difference between the temporary variable existing, 
> > which would be observable for get_defined_vars() and possibly 
> > destructors / object lifetimes).
> 
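
To make the stack-trace point concrete, here is a rough sketch that runs without the new operator. failing_step() and trace_depth() are hypothetical names made up for this illustration; the only claim is that wrapping a call in a short closure adds one frame to the exception trace compared with calling the function directly.

```php
<?php
// Hypothetical step function that always throws, so the trace can be inspected.
function failing_step(string $s): never {
    throw new RuntimeException('boom');
}

// Runs $run and reports how many frames the caught exception's trace has.
function trace_depth(callable $run): int {
    try {
        $run();
    } catch (RuntimeException $e) {
        return count($e->getTrace());
    }
    return -1; // nothing was thrown
}

// Direct call, as in the temp-variable fragment.
$direct = trace_depth(function () {
    $temp = failing_step('x');
});

// The same call wrapped in a short closure, as in the pipe fragment.
$wrapped = trace_depth(function () {
    $temp = (fn($x) => failing_step($x))('x');
});

// $wrapped is expected to be exactly one frame deeper than $direct.
```

The absolute depths depend on where the snippet is run from; only the difference between the two is meaningful.
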
> Thomas Hruska wrote:
> > The repeated assignment to $temp in your second example is _not_ 
> > actually equal to the earlier example as you claim.  The second example 
> > with all of the $temp variables should, IMO, just be:
> >
> > $temp = "Hello World";
> > $result = array_filter(array_map('strtoupper', 
> > str_split(htmlentities($temp))), fn($v) => $v != 'O');
> 
> Juris Evertovskis wrote:
> > 3. Does the implementation actually turn 1 |> f(...) |> g(...) into 
> > $π = f(1); g($π)? Is g(f(1)) not more performant? Or is the engine 
> > clever enough with the var reuse anyways?
> 
> There's some subtlety here on these points.  The v2 RFC used the lexer to mutate $a |>
> $b |> $c into the same AST as $c($b($a)), which would then compile as though that had been
> written in the first place.  However, that made addressing references much harder, and there's
> an important caveat around order of operations. (See below.)  The v3 RFC instead uses a compile
> function to take the AST of $a |> $b |> $c and produce opcodes that are effectively equivalent
> to $t = $b($a); $t = $c($t);  I have not compared to see if they are the precise same opcodes, but
> the net effect is the same.  So "effectively equivalent" may be a more accurate
> statement.
> 
> In particular, Tim is correct that, technically, the short lambdas would be used as-is, so
> you'd end up with the equivalent of:
> 
> $temp = (fn($x) => array_map(strtoupper(...), $x))($temp);
> 
> I'm not sure if there's a good way to automatically unwrap the closure there.  (If
> someone knows of one, please share; I'm fine with including it.)  However, the intent is that
> it would be largely unnecessary in the future with a revised PFA implementation, which would obviate
> the need for the explicit wrapping closure.  You would instead write
> 
> $a |> array_map(strtoupper(...), ?);
> 
> Alternatively, one can use higher order user-space functions already.  In trivial cases:
> 
> function amap(Closure $fn): Closure {
>   return fn(array $x) => array_map($fn, $x);
> }
> 
> $a |> amap(strtoupper(...));
> 
> Which I am already using in Crell/fp and several libraries that leverage it, and it's
> quite ergonomic.
> 
> There's a whole bunch of such simple higher order functions here:
> https://github.com/Crell/fp/blob/master/src/array.php
> https://github.com/Crell/fp/blob/master/src/string.php
> 
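
For anyone who wants to try the higher-order style today, a minimal sketch follows. amap() is copied from the message above; pipe_sketch() is a hypothetical stand-in for |> that I made up for this example (it is not the RFC's implementation and not necessarily how Crell/fp does it). It simply threads a value through each callable in turn.

```php
<?php
// From the quoted message: turn a per-element function into an array step.
function amap(Closure $fn): Closure {
    return fn(array $x): array => array_map($fn, $x);
}

// Hypothetical user-space stand-in for the |> operator.
function pipe_sketch(mixed $value, callable ...$steps): mixed {
    foreach ($steps as $step) {
        $value = $step($value); // same shape as $t = $step($t)
    }
    return $value;
}

$result = pipe_sketch(
    ['hello', 'world'],
    amap(strtoupper(...)),
    amap(strrev(...)),
);
// $result is ['OLLEH', 'DLROW']
```

With the operator, the same pipeline would presumably read ['hello', 'world'] |> amap(strtoupper(...)) |> amap(strrev(...)).
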
> Which leads to the subtle difference between that and the v2 implementation, and why
> Thomas' statement is incorrect.  If the expression on the right side that produces a Closure
> has side effects (output, DB interaction, etc.), then the order in which those side effects happen
> may change with the different restructuring.  With all pure functions, that won't make a
> practical difference, and normally one should be using pure functions, but that's not something
> PHP can enforce.
> 
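
The ordering difference can be seen without the new operator. In this sketch, $makeB and $makeC are hypothetical closure factories with a logging side effect; the first form mirrors the nested call the v2 approach would compile to, and the second mirrors the temp-variable form described for v3.

```php
<?php
// Hypothetical factories: each logs when it runs and returns a closure
// that logs when it is called.
$log = [];

$makeB = function () use (&$log): Closure {
    $log[] = 'make B';
    return function (int $x) use (&$log): int {
        $log[] = 'call B';
        return $x + 1;
    };
};

$makeC = function () use (&$log): Closure {
    $log[] = 'make C';
    return function (int $x) use (&$log): int {
        $log[] = 'call C';
        return $x * 2;
    };
};

// v2-style restructuring: the nested call $c($b($a)).
$nested = $makeC()($makeB()(1));
$nestedLog = $log;

// v3-style restructuring: one temporary variable, one step at a time.
$log = [];
$t = $makeB()(1);
$t = $makeC()($t);
$tempLog = $log;
```

Both forms produce 4, but the nested form evaluates the outer callee expression first, so its log reads make C, make B, call B, call C, while the temp-variable form reads make B, call B, make C, call C.
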
> I don't think there would be an appreciable performance difference between the two
> compiled versions, either way, but using the temp-var approach makes dealing with references easier,
> so it's what we're doing.
> 
> Juris Evertovskis wrote:
> > 1. Do you think it would be hard to add some shorthand for `|> 
> > $condition ? $callable : fn($😐) => $😐`?
> 
> I'm not sure I follow here.  Assuming you're talking about "branch in the next
> step", the standard way of doing that is with a higher order user-space function.  Something
> like:
> 
> function cond(bool $cond, Closure $t, Closure $f): Closure {
>   return $cond ? $t : $f;
> }
> 
> $a |> cond($config > 10, bigval(...), smallval(...)) |> otherstuff(...);
> 
> I think it's premature to try and bake that logic into the language, especially when I
> don't know of any other function-composition-having language that does so at the language level
> rather than the standard library level.  (There are a number of fun operations people build into
> pipelines, but they are all generally done in user space.)
> 
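
If the question was specifically about "apply this step only when a condition holds, otherwise pass the value through", that too can be a small user-space helper. when_sketch() below is a hypothetical name invented for this example; it is not part of the RFC, and I have not checked whether Crell/fp ships an equivalent.

```php
<?php
// Hypothetical helper: return $fn when $cond is true, otherwise an
// identity closure that hands the value back unchanged.
function when_sketch(bool $cond, Closure $fn): Closure {
    return $cond ? $fn : fn(mixed $x): mixed => $x;
}

$shout = when_sketch(true, strtoupper(...));
$applied = $shout('hello');     // condition true: strtoupper runs

$skip = when_sketch(false, strtoupper(...));
$untouched = $skip('hello');    // condition false: value passes through
```

In a pipeline this would be used as $a |> when_sketch($condition, $callable) |> otherstuff(...), in the same position as cond() above.
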
> --Larry Garfield
> 

Put another way, what are the precedence and associativity of this new operator relative to the existing ones?

For example, what is the output of

$x ? $y |> strlen(...) : $z

$x + $y |> sqrt(...) . PHP_EOL

Etc.

I noticed this seems to be missing from the RFC. For a new operator, I think it is important
to specify that.
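
To show why it matters, here are two possible readings of the arithmetic example, written with ordinary calls so they run without the operator. The values are arbitrary and the string concatenation is left out to keep the focus on |> versus +. This is only an illustration of the ambiguity, not a statement of what the RFC's implementation does.

```php
<?php
// Illustrative values only; plain calls stand in for the piped forms.
$x = 9;
$y = 16;

// Reading 1: |> binds looser than +, i.e. ($x + $y) |> sqrt(...)
$looser = sqrt($x + $y);

// Reading 2: |> binds tighter than +, i.e. $x + ($y |> sqrt(...))
$tighter = $x + sqrt($y);
```

The first reading gives 5.0 and the second gives 13.0, so the same source line means different things depending on where |> sits in the precedence table.
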

— Rob

