[PHP-DEV] [RFC] Pipe Operator (again)

Crell · February 7, 2025, 4:57am

Hi folks. A few years ago I posted an RFC for a pipe operator, as seen in many other languages. At the time it didn't pass, in no small part because the implementation was a bit shaky and it was right before freeze. Nonetheless, there are now even more (bad) user-space implementations in the wild, as it gets brought up frequently in "what do you want in PHP?" threads (though nowhere near generics or better async, of course), so it seems clear there is demand in the market for it.

It is now back with a better implementation (many thanks to Ilija for his help and guidance in that), and it's nowhere close to freeze, so here we go again:

https://wiki.php.net/rfc/pipe-operator-v3

Of particular note, since the last RFC I have concluded that a compose operator is a necessary complement to a pipe operator. However, it's also going to be notably more work, and the two operators don't actually interact at all at the code level, so since people keep saying "Small RFCs!", here's a small RFC.

--
Larry Garfield
larry@garfieldtech.com

Oladoyinbo_Vincent · February 7, 2025, 7:15am

PHP codebases in general are quite unreadable due to the robust way of doing things. A pipe operator might make things even more complicated…

But after reading the RFC, something came to my mind, a way to simplify this stuff

What if we implement it this way:


$pipe = " hello world "
|> strtoupper(self)
|> trim(self, ' ')
|> htmlentities(self)
|> fn (self): string => ....

Maybe ‘self’ or ‘$this’ can be used as the keyword param.

It’s a suggestion anyways.


Eugene_Sidelnyk · February 7, 2025, 7:36am

Hi, Larry, That’s super! I hope it will pass!

Oladoyinbo, IMO the way it is described right now (e.g. explicit closures) is much more elegant than a new way of doing things that’s not so obvious and will be necessary to keep in mind and support anyway.

If it’s necessary to simplify things, such as passing a particular value from the pipe input into a function at a particular position, I think that would be possible with partial function application, which I hope to see in the future (e.g. binding the callback for array_map, producing a new function for the pipe that accepts only one parameter, the input array).

Thank you


Rob_Landers · February 7, 2025, 8:14am

On Fri, Feb 7, 2025, at 05:57, Larry Garfield wrote:

Hi folks. A few years ago I posted an RFC for a pipe operator, as seen in many other languages. At the time it didn’t pass, in no small part because the implementation was a bit shaky and it was right before freeze. Nonetheless, there are now even more (bad) user-space implementations in the wild, as it gets brought up frequently in “what do you want in PHP?” threads (though nowhere near generics or better async, of course), so it seems clear there is demand in the market for it.

It is now back with a better implementation (many thanks to Ilija for his help and guidance in that), and it’s nowhere close to freeze, so here we go again:

https://wiki.php.net/rfc/pipe-operator-v3

Of particular note, since the last RFC I have concluded that a compose operator is a necessary complement to a pipe operator. However, it’s also going to be notably more work, and the two operators don’t actually interact at all at the code level, so since people keep saying “Small RFCs!”, here’s a small RFC.

--
Larry Garfield
larry@garfieldtech.com

Hey Larry,

Maybe I missed it, but what happens here?

[1,2] |> add(...)

Is the array deconstructed or passed as-is? Further, if it is passed as-is (my gut is telling me it will be), then what is the error? Is it the normal “missing second parameter when calling add()” error or a new error specific to pipes?

If it is passed as-is, would the following be legal?

...[1,2] |> add(...)

— Rob

Crell · February 7, 2025, 8:16am

On Fri, Feb 7, 2025, at 1:36 AM, Eugene Sidelnyk wrote:

Hi, Larry, That's super! I hope it will pass!

Oladoyinbo, IMO the way it is described right now (e.g. explicit
closures) is much more elegant than a new way of doing things that's
not so obvious and will be necessary to keep in mind and support
anyway.

If it'd be necessary to simplify the stuff, like passing particular
parameter from the input pipe into the function at the particular
position, - I think it would be possible to do it with partial function
application I hope to see in the future. (e.g. bind callback for
array_map function, making a new function for the pipe that will accept
the only parameter - input array)

Thank you

Both of you, please don't top post.

That said, Eugene is correct. Hack (Facebook's PHP fork) had a pipe operator that took an expression with a magic placeholder on the right, rather than a callable. Every other language splits it into two parts, a pipe that takes a function on the right and some way to do easy partial application. I am firmly of the belief that Hack is wrong on this one, and that two separate features that dovetail together are the superior design over a single pipe syntax that is less flexible. Especially with FCC (first-class callable syntax) now, any purpose-built unary function will be trivial to use, and a higher-order function that returns a unary function is also trivial to write.

As noted in Future Scope, I do want to revisit the PFA RFC at some point, but I need a collaborator who can help with the implementation as that is definitely over my head. (I have ideas for how to simplify the implementation, in concept, but my engine skill is too low to do it myself.)

--Larry Garfield

Tim_Dusterhus · February 7, 2025, 1:32pm

Hi

Am 2025-02-07 05:57, schrieb Larry Garfield:

It is now back with a better implementation (many thanks to Ilija for his help and guidance in that), and it's nowhere close to freeze, so here we go again:

https://wiki.php.net/rfc/pipe-operator-v3

There's some editorial issues:

1. Status: Draft needs to be updated.
2. The RFC needs to be added to the overview page.
3. List formatting issues in “Future Scope” and “Patches and Tests”.

Would also help having a closed voting widget in the “Proposed Voting Choices” section to be crystal clear on what is being voted on (see below the next quote).

---------

Regarding the contents:

4. “That is, the following two code fragments are also exactly equivalent:”.

I do not believe this is true (specifically referring to the “exactly” word in there), since the second code fragment does not have the short closures, which likely results in an observable behavioral difference when throwing Exceptions (in the stack trace) and also for debuggers. Or is the implementation able to elide the extra closure? (Of course there's also the difference between the temporary variable existing, which would be observable for `get_defined_vars()` and possibly destructors / object lifetimes).

5. The “References” (as in reference variables) section would do well with an example of what doesn't work.

6. In the “Compose” section: The section always uses the word “callables”, but doesn't explain how it resolves the ambiguity of `[Foo::class, 'bar'] + [Bar::class, 'foo']`.

Should it read “Closures” instead of “callables”?

7. In the “Compose” section: It would be useful to explicitly spell out in which order the individual callables are called.

Will `(strrev(...) + ucfirst(...))("foo")` result in `ooF` or will it result in `Oof`?

8. In the “Compose” section: The RFC says that “ComposedClosure” is not quite equivalent, but it doesn't go into detail what is not quite equivalent.

Specifically: Is the result actually limited to a single argument? Using the `ooF` evaluation order, `(strlen(...) + str_replace(...))("o", "", "foo")` could reasonably result in `1`.

9. In the “Why in the engine?” section: The RFC makes a claim about performance.

Do you have any numbers?

Of particular note, since the last RFC I have concluded that a compose operator is a necessary complement to a pipe operator.

The RFC lists “Compose” as part of the “Proposal” section, but also the “Future Scope”. Should the part in “Proposal” be removed?

However, it's also going to be notably more work, and the two operators don't actually interact at all at the code level, so since people keep saying "Small RFCs!", here's a small RFC.

I like this.

Best regards
Tim Düsterhus

Juris_Evertovskis · February 7, 2025, 3:15pm

On 2025-02-07 06:57, Larry Garfield wrote:

Hi folks. A few years ago I posted an RFC for a pipe operator, as seen in
many other languages. At the time it didn't pass, in no small part
because the implementation was a bit shaky and it was right before freeze.
Nonetheless, there are now even more (bad) user-space implementations in
the wild, as it gets brought up frequently in "what do you want in PHP?"
threads (though nowhere near generics or better async, of course), so it
seems clear there is demand in the market for it.

It is now back with a better implementation (many thanks to Ilija for his
help and guidance in that), and it's nowhere close to freeze, so here we
go again:

https://wiki.php.net/rfc/pipe-operator-v3

Of particular note, since the last RFC I have concluded that a compose
operator is a necessary complement to a pipe operator. However, it's also
going to be notably more work, and the two operators don't actually
interact at all at the code level, so since people keep saying "Small
RFCs!", here's a small RFC.

Great feature! Three questions and a comment from me.

1. Do you think it would be hard to add some shorthand for `|> $condition ? $callable : fn($😐) => $😐`?
2. Is compose in the scope or not? You mention it in both the main RFC body and the future scope. Or are those different composes?
3. Does the implementation actually turn `1 |> f(...) |> g(...)` into `$π = f(1); g($π)`? Is `g(f(1))` not performanter? Or is the engine clever enough with the var reuse anyways?

I don't think Laravel's pipeline is relevant here. In it each callback is responsible for invoking the rest of the chain. Thus it allows early returns and interacting with the return value of the following chain (`return 5 + $next($v)`). More like a middleware chaining tool, not a pipe in the same meaning as in this RFC.

BR,
Juris

Christoph_M_Becker · February 7, 2025, 4:51pm

On 07.02.2025 at 05:57, Larry Garfield wrote:

Hi folks. A few years ago I posted an RFC for a pipe operator, as seen in many other languages. At the time it didn't pass, in no small part because the implementation was a bit shaky and it was right before freeze. Nonetheless, there are now even more (bad) user-space implementations in the wild, as it gets brought up frequently in "what do you want in PHP?" threads (though nowhere near generics or better async, of course), so it seems clear there is demand in the market for it.

It is now back with a better implementation (many thanks to Ilija for his help and guidance in that), and it's nowhere close to freeze, so here we go again:

https://wiki.php.net/rfc/pipe-operator-v3

Thank you! I very much appreciate the simplicity (and efficiency) of
the implementation.

Of particular note, since the last RFC I have concluded that a compose operator is a necessary complement to a pipe operator. However, it's also going to be notably more work, and the two operators don't actually interact at all at the code level, so since people keep saying "Small RFCs!", here's a small RFC.

Fair enough. And with the pipe operator, one might live without a
compose operator, e.g.

  $f1 = fn($x) => 2 * $x;
  $f2 = fn($x) => $x + 3;
  // $f3 = $f2 ∘ $f1
  $f3 = fn($x) => $x |> $f1 |> $f2;

Christoph

Thomas_Hruska · February 7, 2025, 5:59pm

On 2/6/2025 9:57 PM, Larry Garfield wrote:

Hi folks. A few years ago I posted an RFC for a pipe operator, as seen in many other languages. At the time it didn't pass, in no small part because the implementation was a bit shaky and it was right before freeze. Nonetheless, there are now even more (bad) user-space implementations in the wild, as it gets brought up frequently in "what do you want in PHP?" threads (though nowhere near generics or better async, of course), so it seems clear there is demand in the market for it.

It is now back with a better implementation (many thanks to Ilija for his help and guidance in that), and it's nowhere close to freeze, so here we go again:

https://wiki.php.net/rfc/pipe-operator-v3

Of particular note, since the last RFC I have concluded that a compose operator is a necessary complement to a pipe operator. However, it's also going to be notably more work, and the two operators don't actually interact at all at the code level, so since people keep saying "Small RFCs!", here's a small RFC.

There's a song in here somewhere that goes:

♪♫♬ PHP continues turning into...symbol SOUUUUUUUP! [Oh no.] ♪♫♬

The main example provided in the RFC makes its own excellent argument against the proposed feature:

$result = "Hello World"
     |> 'htmlentities'
     |> str_split(...)
     |> fn($x) => array_map(strtoupper(...), $x)
     |> fn($x) => array_filter($x, fn($v) => $v != 'O');

Symbols make languages harder to grok. I don't want a language like COBOL where things that should sensibly be symbols are words but I also don't want a code golfing language like APL that is just all the symbols all day long. Language features should be able to be easily found via search and every new symbol (or combination of symbols) is inherently unsearchable on most/all search engines. That includes the search engine on php.net. Go try searching for '...' or '=>' or '!=' operators on php.net and you get...nothing! "Texture is the conductor of flavor." -- French Chef Jean-Pierre. Balancing out symbols (liquids like water which have no flavor) and words (meat and veggies packed with flavor) is a language author's core responsibility in the language design soup kitchen.

While I'm not against adding symbols that serve a valuable purpose, there is nothing to be gained by encouraging bad coding habits at the outset. When a limitation is established up front such as "Functions with more than one required parameter are not allowed" then users will find ways to bypass the limitation such that it will kill performance in favor of their perceived and flawed idea of "convenience." This proposal will _minimally_ result in creating an anonymous function to call any basic function with more than one required parameter but also encourage abuse of the splat operator which should be used *exceedingly sparingly*. What I mean by that is: Users will construct arrays (expensive) to pass to anonymous functions with one parameter (expensive) and then use the splat operator inside the anonymous functions to unpack the input array to call the actual function (VERY expensive). Whatever performance gains made by moving bad application design into PHP core will be far outweighed by the abuse that naturally follows to circumvent limitations. In fact, your own contrived example usage includes two anonymous functions that call functions with more than one required parameter! You are _already_ working around the known limitations of *your own proposed feature* !! Why in the world would you ever advertise that?! If that's not enough to kill an RFC before it even goes to a vote, I don't know what is.

The repeated assignment to $temp in your second example is _not_ actually equal to the earlier example as you claim. The second example with all of the $temp variables should, IMO, just be:

$temp = "Hello World";
$result = array_filter(array_map('strtoupper', str_split(htmlentities($temp))), fn($v) => $v != 'O');

By storing the result into $temp for each modification just so that you can have multiline code, you are actually making the engine work harder whereas a single statement saves the engine some unnecessary refcounting/allocation/free work but accomplishes the same objective. I'm nitpicking the clearly contrived second code example that didn't at all improve my impression of the first example and where your own example usage ended up exposing the fundamental flaws in the RFC. I also consider the above compact code to be plenty readable and not particularly necessary to span multiple lines, but that's obviously subjective.

Just because someone _can_ do something doesn't mean that they should. More than likely, users trying to do pipe-like operations in PHP shouldn't be doing them in the first place.

--
Thomas Hruska
CubicleSoft President

CubicleSoft has over 80 original open source projects and counting.
Plus a couple of commercial/retail products.

What software are you looking to build?

Crell · February 7, 2025, 7:48pm

On Fri, Feb 7, 2025, at 10:51 AM, Christoph M. Becker wrote:

Of particular note, since the last RFC I have concluded that a compose operator is a necessary complement to a pipe operator. However, it's also going to be notably more work, and the two operators don't actually interact at all at the code level, so since people keep saying "Small RFCs!", here's a small RFC.

Fair enough. And with the pipe operator, one might live without a
compose operator, e.g.

  $f1 = fn($x) => 2 * $x;
  $f2 = fn($x) => $x + 3;
  // $f3 = $f2 ∘ $f1
  $f3 = fn($x) => $x |> $f1 |> $f2;

Christoph

The v2 RFC took that position, that compose was easy enough to emulate via pipe. Indeed, pipe and compose can both be implemented in terms of each other. However, since the previous RFC I've concluded[1] that both are sufficiently useful that we really ought to include both of them. Pipes are just way easier to implement in practice.
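
To illustrate the "implemented in terms of each other" point, here is a minimal user-space sketch that runs on current PHP without the new operator. The `pipe()` and `compose()` helpers are hypothetical names made up for this example; they are not part of the RFC or of any library mentioned in this thread.

```php
<?php

// Hypothetical helper: thread a value through unary callables, left to right.
function pipe(mixed $value, callable ...$fns): mixed
{
    foreach ($fns as $fn) {
        $value = $fn($value);
    }
    return $value;
}

// Hypothetical helper: compose built on top of pipe. The reverse direction
// also works, since pipe($x, ...$fns) is just compose(...$fns)($x).
function compose(callable ...$fns): Closure
{
    return static fn(mixed $x): mixed => pipe($x, ...$fns);
}

$f1 = fn($x) => 2 * $x;
$f2 = fn($x) => $x + 3;

// Same shape as Christoph's $f3: apply $f1 first, then $f2.
$f3 = compose($f1, $f2);

var_dump($f3(5));            // int(13)
var_dump(pipe(5, $f1, $f2)); // int(13)
```

Note that this `compose()` applies its arguments left to right, matching the pipe reading order rather than the mathematical `$f2 ∘ $f1` notation.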

--Larry Garfield

[1] Advent of Functional PHP: Review | PeakD

Faizan_Akram_Dar · February 7, 2025, 8:45pm

On Fri, 7 Feb 2025, 21:27 Thomas Hruska, <thruska@cubiclesoft.com> wrote:

The repeated assignment to $temp in your second example is not
actually equal to the earlier example as you claim. The second example
with all of the $temp variables should, IMO, just be:

$temp = "Hello World";
$result = array_filter(array_map('strtoupper',
str_split(htmlentities($temp))), fn($v) => $v != 'O');

Tbh, this is unreadable. Larry’s example with an intermediate variable is an order of magnitude more readable. This is exactly why we need a pipe operator.

I also consider the above compact code to be plenty readable and not
particularly necessary to span multiple lines, but that’s obviously
subjective.

It is not; the functions are applied from the inside out (or right to left), which becomes harder to read with each added function. The pipe operator makes it natural, as they are applied from left to right, which is how you read code: literally 0 cognitive load.

Just because someone can do something doesn’t mean that they should.
More than likely, users trying to do pipe-like operations in PHP
shouldn’t be doing them in the first place.

Why not? It clearly makes code more readable, and in the future PFA will allow composing non-unary functions.
PHP is and always has been a multi-paradigm language; there is no reason not to add things that make the functional paradigm easier to use.

Kind regards,
Faizan

Crell · February 7, 2025, 9:04pm

Merging a few replies together here, since they overlap. Also reordering a few of Tim's comments...

On Fri, Feb 7, 2025, at 7:32 AM, Tim Düsterhus wrote:

Hi

Am 2025-02-07 05:57, schrieb Larry Garfield:

It is now back with a better implementation (many thanks to Ilija for
his help and guidance in that), and it's nowhere close to freeze, so
here we go again:

https://wiki.php.net/rfc/pipe-operator-v3

There's some editorial issues:

1. Status: Draft needs to be updated.
2. The RFC needs to be added to the overview page.
3. List formatting issues in “Future Scope” and “Patches and Tests”.

Would also help having a closed voting widget in the “Proposed Voting
Choices” section to be crystal clear on what is being voted on (see
below the next quote).

I split pipes off from the Composition RFC late last night right before posting; I guess I missed a few things while doing so. :-/ Most notably, the Compose section is now removed from pipes, as it is not in scope for this RFC. (As noted, it's going to be more work so has its own RFC.) Sorry for the confusion. I think it should all be handled now.

5. The “References” (as in reference variables) section would do well
with an example of what doesn't work.

Example block added.

9. In the “Why in the engine?” section: The RFC makes a claim about
performance.

Do you have any numbers?

Not currently. The statements here are based on simply counting the number of function calls necessary, and PHP function calls are sadly non-cheap. In previous benchmarks of my own libraries using my Crell/fp library, I did find that the number of function calls involved in some tight pipe operations was both a performance and debugging concern, but I don't have any hard numbers laying about at present to share.

If you think that's critical, please advise on how to best get meaningful numbers here.

Regarding the equivalency of pipes:

Tim Düsterhus wrote:

4. “That is, the following two code fragments are also exactly
equivalent:”.

I do not believe this is true (specifically referring to the “exactly”
word in there), since the second code fragment does not have the short
closures, which likely results in an observable behavioral difference
when throwing Exceptions (in the stack trace) and also for debuggers. Or
is the implementation able to elide the extra closure? (Of course
there's also the difference between the temporary variable existing,
which would be observable for `get_defined_vars()` and possibly
destructors / object lifetimes).

Thomas Hruska wrote:

The repeated assignment to $temp in your second example is _not_
actually equal to the earlier example as you claim. The second example
with all of the $temp variables should, IMO, just be:

$temp = "Hello World";
$result = array_filter(array_map('strtoupper',
str_split(htmlentities($temp))), fn($v) => $v != 'O');

Juris Evertovskis wrote:

3. Does the implementation actually turn `1 |> f(...) |> g(...)` into
`$π = f(1); g($π)`? Is `g(f(1))` not performanter? Or is the engine
clever enough with the var reuse anyways?

There's some subtlety here on these points. The v2 RFC used the lexer to mutate $a |> $b |> $c into the same AST as $c($b($a)), which would then compile as though that had been written in the first place. However, that made addressing references much harder, and there's an important caveat around order of operations. (See below.) The v3 RFC instead uses a compile function to take the AST of $a |> $b |> $c and produce opcodes that are effectively equivalent to $t = $b($a); $t = $c($t); I have not compared to see if they are the precise same opcodes, but the net effect is the same. So "effectively equivalent" may be a more accurate statement.

In particular, Tim is correct that, technically, the short lambdas would be used as-is, so you'd end up with the equivalent of:

$temp = (fn($x) => array_map(strtoupper(...), $x))($temp);

I'm not sure if there's a good way to automatically unwrap the closure there. (If someone knows of one, please share; I'm fine with including it.) However, the intent is that it would be largely unnecessary in the future with a revised PFA implementation, which would obviate the need for the explicit wrapping closure. You would instead write

$a |> array_map(strtoupper(...), ?);

Alternatively, one can use higher order user-space functions already. In trivial cases:

function amap(Closure $fn): Closure {
    return fn(array $x) => array_map($fn, $x);
}

$a |> amap(strtoupper(...));

Which I am already using in Crell/fp and several libraries that leverage it, and it's quite ergonomic.
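
For anyone who wants to try that pattern today, here is a minimal sketch that runs on current PHP (8.1+ for the `strtoupper(...)` syntax) without the pipe operator. `afilter()` is an assumed companion helper written just for this example, and the loop over a temp variable stands in for what the pipe would do.

```php
<?php

function amap(Closure $fn): Closure
{
    return fn(array $x): array => array_map($fn, $x);
}

// Assumed companion helper for this sketch, same idea as amap().
function afilter(Closure $fn): Closure
{
    return fn(array $x): array => array_filter($x, $fn);
}

// Every step is a unary callable, so the steps can be applied in order.
$steps = [
    htmlentities(...),
    str_split(...),
    amap(strtoupper(...)),
    afilter(fn($v) => $v != 'O'),
];

$result = "Hello World";
foreach ($steps as $step) {
    $result = $step($result);
}

echo implode('', $result), "\n"; // HELL WRLD
```

The array keys of the filtered characters are preserved by `array_filter()`, which is why the result is joined with `implode()` rather than compared against a list literal.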

There's a whole bunch of such simple higher order functions here:

https://github.com/Crell/fp/blob/master/src/array.php

https://github.com/Crell/fp/blob/master/src/string.php

Which leads to the subtle difference between that and the v2 implementation, and why Thomas' statement is incorrect. If the expression on the right side that produces a Closure has side effects (output, DB interaction, etc.), then the order in which those side effects happen may change with the different restructuring. With all pure functions, that won't make a practical difference, and normally one should be using pure functions, but that's not something PHP can enforce.
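
A small sketch of that ordering difference, runnable on current PHP without the pipe operator. The `step()` factory is hypothetical and exists only to log when each closure is produced and when it runs; the two statements mirror the nested `$c($b($a))` shape and the temp-variable shape described above.

```php
<?php

$log = [];

// Hypothetical factory: logs when the closure is created and when it runs.
function step(string $name, array &$log): Closure
{
    $log[] = "make $name";
    return function (int $x) use ($name, &$log): int {
        $log[] = "run $name";
        return $x + 1;
    };
}

// Nested shape: the outer callee expression is evaluated before its argument.
$nested = step('C', $log)(step('B', $log)(1));
$nestedOrder = $log;

$log = [];

// Temp-variable shape: each step is produced and run before the next one.
$t = step('B', $log)(1);
$t = step('C', $log)($t);
$sequentialOrder = $log;

echo implode(', ', $nestedOrder), "\n";     // make C, make B, run B, run C
echo implode(', ', $sequentialOrder), "\n"; // make B, run B, make C, run C
```

Both shapes compute the same value (3 here); only the order of the "make" side effects differs, which is the caveat being described.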

I don't think there would be an appreciable performance difference between the two compiled versions, either way, but using the temp-var approach makes dealing with references easier, so it's what we're doing.

Juris Evertovskis wrote:

1. Do you think it would be hard to add some shorthand for `|>
$condition ? $callable : fn($😐) => $😐`?

I'm not sure I follow here. Assuming you're talking about "branch in the next step", the standard way of doing that is with a higher order user-space function. Something like:

function cond(bool $cond, Closure $t, Closure $f): Closure {
    return $cond ? $t : $f;
}

$a |> cond($config > 10, bigval(...), smallval(...)) |> otherstuff(...);

I think it's premature to try and bake that logic into the language, especially when I don't know of any other function-composition-having language that does so at the language level rather than the standard library level. (There are a number of fun operations people build into pipelines, but they are all generally done in user space.)
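
If the case in question is specifically "apply this step or pass the value through unchanged", a user-space sketch could look like the following. The `when()` name is hypothetical, chosen for this example only, and the code runs on current PHP (8.1+) without the pipe operator.

```php
<?php

// Hypothetical helper: return $fn when the condition holds,
// otherwise an identity function that passes the value through.
function when(bool $condition, Closure $fn): Closure
{
    return $condition ? $fn : static fn(mixed $x): mixed => $x;
}

$trimmed   = when(true, trim(...))('  hi  ');
$untouched = when(false, trim(...))('  hi  ');

var_dump($trimmed);   // string(2) "hi"
var_dump($untouched); // string(6) "  hi  "
```

Since `when()` returns a unary Closure either way, it would slot into a pipe chain the same way `cond()` does above.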

--Larry Garfield

Rob_Landers · February 7, 2025, 10:54pm

On Fri, Feb 7, 2025, at 22:04, Larry Garfield wrote:

Merging a few replies together here, since they overlap. Also reordering a few of Tim’s comments…

On Fri, Feb 7, 2025, at 7:32 AM, Tim Düsterhus wrote:

Hi

Am 2025-02-07 05:57, schrieb Larry Garfield:

It is now back with a better implementation (many thanks to Ilija for
his help and guidance in that), and it’s nowhere close to freeze, so
here we go again:

https://wiki.php.net/rfc/pipe-operator-v3

There’s some editorial issues:

1. Status: Draft needs to be updated.
2. The RFC needs to be added to the overview page.
3. List formatting issues in “Future Scope” and “Patches and Tests”.

Would also help having a closed voting widget in the “Proposed Voting
Choices” section to be crystal clear on what is being voted on (see
below the next quote).

I split pipes off from the Composition RFC late last night right before posting; I guess I missed a few things while doing so. :-/ Most notably, the Compose section is now removed from pipes, as it is not in scope for this RFC. (As noted, it’s going to be more work so has its own RFC.) Sorry for the confusion. I think it should all be handled now.

5. The “References” (as in reference variables) section would do well
with an example of what doesn’t work.

Example block added.

9. In the “Why in the engine?” section: The RFC makes a claim about
performance.

Do you have any numbers?

Not currently. The statements here are based on simply counting the number of function calls necessary, and PHP function calls are sadly non-cheap. In previous benchmarks of my own libraries using my Crell/fp library, I did find that the number of function calls involved in some tight pipe operations was both a performance and debugging concern, but I don’t have any hard numbers laying about at present to share.

If you think that’s critical, please advise on how to best get meaningful numbers here.

Regarding the equivalency of pipes:

Tim Düsterhus wrote:

4. “That is, the following two code fragments are also exactly
equivalent:”.

I do not believe this is true (specifically referring to the “exactly”
word in there), since the second code fragment does not have the short
closures, which likely results in an observable behavioral difference
when throwing Exceptions (in the stack trace) and also for debuggers. Or
is the implementation able to elide the extra closure? (Of course
there’s also the difference between the temporary variable existing,
which would be observable for `get_defined_vars()` and possibly
destructors / object lifetimes).

Thomas Hruska wrote:

The repeated assignment to $temp in your second example is not
actually equal to the earlier example as you claim. The second example
with all of the $temp variables should, IMO, just be:

$temp = "Hello World";
$result = array_filter(array_map('strtoupper',
str_split(htmlentities($temp))), fn($v) => $v != 'O');

Juris Evertovskis wrote:

Does the implementation actually turn 1 |> f(...) |> g(...) into $π = f(1); g($π)? Is g(f(1)) not performanter? Or is the engine clever enough with the var reuse anyways?

There’s some subtlety here on these points. The v2 RFC used the lexer to mutate $a |> $b |> $c into the same AST as $c($b($a)), which would then compile as though that had been written in the first place. However, that made addressing references much harder, and there’s an important caveat around order of operations. (See below.) The v3 RFC instead uses a compile function to take the AST of $a |> $b |> $c and produce opcodes that are effectively equivalent to $t = $b($a); $t = $c($t); I have not compared to see if they are the precise same opcodes, but the net effect is the same. So “effectively equivalent” may be a more accurate statement.

In particular, Tim is correct that, technically, the short lambdas would be used as-is, so you’d end up with the equivalent of:

$temp = (fn($x) => array_map(strtoupper(...), $x))($temp);

I’m not sure if there’s a good way to automatically unwrap the closure there. (If someone knows of one, please share; I’m fine with including it.) However, the intent is that it would be largely unnecessary in the future with a revised PFA implementation, which would obviate the need for the explicit wrapping closure. You would instead write

$a |> array_map(strtoupper(...), ?);

Alternatively, one can use higher order user-space functions already. In trivial cases:

function amap(Closure $fn): Closure {
    return fn(array $x) => array_map($fn, $x);
}

$a |> amap(strtoupper(...));

Which I am already using in Crell/fp and several libraries that leverage it, and it’s quite ergonomic.
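As a rough illustration of that style on a PHP version without the operator, here is a minimal sketch. The pipe() and afilter() helpers are hypothetical stand-ins written for this example; they are not taken from the RFC or from Crell/fp, and pipe() only approximates what the proposed |> would do.

```php
<?php
declare(strict_types=1);

// Hypothetical user-space stand-in for the proposed |> operator:
// feeds $value through each step in order.
function pipe(mixed $value, callable ...$steps): mixed
{
    foreach ($steps as $step) {
        $value = $step($value);
    }
    return $value;
}

// Same shape as the amap() helper quoted above.
function amap(Closure $fn): Closure
{
    return fn(array $x): array => array_map($fn, $x);
}

// Hypothetical sibling helper for filtering; reindexes the result.
function afilter(Closure $fn): Closure
{
    return fn(array $x): array => array_values(array_filter($x, $fn));
}

$result = pipe(
    'Hello World',
    htmlentities(...),
    str_split(...),
    amap(strtoupper(...)),
    afilter(fn(string $c): bool => $c !== 'O'),
    fn(array $chars): string => implode('', $chars),
);

var_dump($result); // string(9) "HELL WRLD"
```

Each higher-order helper returns a single-argument Closure, which is the only shape a pipe step needs.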

There’s a whole bunch of such simple higher order functions here:

https://github.com/Crell/fp/blob/master/src/array.php

https://github.com/Crell/fp/blob/master/src/string.php

Which leads to the subtle difference between that and the v2 implementation, and why Thomas’ statement is incorrect. If the expression on the right side that produces a Closure has side effects (output, DB interaction, etc.), then the order in which those side effects happen may change with the different restructuring. With all pure functions, that won’t make a practical difference, and normally one should be using pure functions, but that’s not something PHP can enforce.
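The ordering difference can be observed in plain PHP without the operator. The sketch below is only an illustration: make() and the Trace class are hypothetical names, and the two forms are hand-written approximations of the v2-style nesting and the v3-style temp-var sequence, not the actual compiled output.

```php
<?php
declare(strict_types=1);

// Hypothetical event log used to observe evaluation order.
final class Trace
{
    /** @var list<string> */
    public static array $events = [];
}

// Hypothetical factory: producing the closure is itself a side effect.
function make(string $name): Closure
{
    Trace::$events[] = "make $name";
    return function (string $x) use ($name): string {
        Trace::$events[] = "run $name";
        return "$name($x)";
    };
}

// Nested form, roughly what the v2 rewrite would have compiled.
Trace::$events = [];
$nested = make('c')(make('b')('a'));
$nestedOrder = Trace::$events;

// Temp-var form, approximating the v3 description above.
Trace::$events = [];
$t = make('b')('a');
$t = make('c')($t);
$tempOrder = Trace::$events;

print_r($nestedOrder); // make c, make b, run b, run c
print_r($tempOrder);   // make b, run b, make c, run c
```

Both forms produce the same value; only the point at which each right-hand expression is evaluated differs, which matters solely when producing the callable has side effects.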

I don’t think there would be an appreciable performance difference between the two compiled versions, either way, but using the temp-var approach makes dealing with references easier, so it’s what we’re doing.

Juris Evertovskis wrote:

Do you think it would be hard to add some shorthand for `|> $condition ? $callable : fn($😐) => $😐`?

I’m not sure I follow here. Assuming you’re talking about “branch in the next step”, the standard way of doing that is with a higher order user-space function. Something like:

function cond(bool $cond, Closure $t, Closure $f): Closure {
    return $cond ? $t : $f;
}

$a |> cond($config > 10, bigval(...), smallval(...)) |> otherstuff(...);

I think it’s premature to try and bake that logic into the language, especially when I don’t know of any other function-composition-having language that does so at the language level rather than the standard library level. (There are a number of fun operations people build into pipelines, but they are all generally done in user space.)
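For the specific "apply this step or pass the value through" shape asked about, a user-space helper is enough. A minimal sketch follows; when() is a hypothetical name chosen for this example and is not part of the RFC or any library mentioned here.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: return $fn when $cond holds,
// otherwise an identity closure that passes the value through.
function when(bool $cond, Closure $fn): Closure
{
    return $cond ? $fn : fn(mixed $x): mixed => $x;
}

$shout = when(true, strtoupper(...));
$quiet = when(false, strtoupper(...));

var_dump($shout('hello')); // string(5) "HELLO"
var_dump($quiet('hello')); // string(5) "hello"
```

Under the proposal this would presumably read as $a |> when($condition, $callable), with no new syntax needed.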

–Larry Garfield

Put another way, what is the order of operations for this new operator?

For example, what is the output of

$x ? $y |> strlen(...) : $z

$x + $y |> sqrt(...) . EOL

Etc.

I noticed this seems to be missing from the RFC. As a new operator, I think it should be important to specify that.

— Rob

Crell · February 7, 2025, 11:19pm

On Fri, Feb 7, 2025, at 4:54 PM, Rob Landers wrote:

Put another way, what is the order of operations for this new operator?

For example, what is the output of

$x ? $y |> strlen(...) : $z

$x + $y |> sqrt(...) . EOL

Etc.

I noticed this seems to be missing from the RFC. As a new operator, I
think it should be important to specify that.

— Rob

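Pipe deliberately binds fairly low, so most other operators will happen first, including +, ?? and ? :, for which there are tests:

https://github.com/php/php-src/pull/17118/files#diff-81789df7e324801626ef4ef8f629cc95dceed4c09073a2b58b70c811bf776904
https://github.com/php/php-src/pull/17118/files#diff-56cbcf85bd7f68fa7a1f837eb15dcc536576986f366976f9642ad20867c471fd
https://github.com/php/php-src/pull/17118/files#diff-775c14f54cd1a27719d30bfab62024aeb1625bc3f3621fa0e7c16fb1c7957fdd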

So in the examples above, the second would add $x and $y first, then square-root the result. The first, I think would probably need parens to avoid being invalid but I'd have to try it to be sure.

--Larry Garfield

Rob_Landers · February 7, 2025, 11:35pm

On Sat, Feb 8, 2025, at 00:19, Larry Garfield wrote:

On Fri, Feb 7, 2025, at 4:54 PM, Rob Landers wrote:

Put another way, what is the order of operations for this new operator?

For example, what is the output of

$x ? $y |> strlen(...) : $z

$x + $y |> sqrt(...) . EOL

Etc.

I noticed this seems to be missing from the RFC. As a new operator, I

think it should be important to specify that.

— Rob

Pipe deliberately binds fairly low, so most other operators will happen first, including +, ?? and ? :, for which there are tests:

https://github.com/php/php-src/pull/17118/files#diff-81789df7e324801626ef4ef8f629cc95dceed4c09073a2b58b70c811bf776904

https://github.com/php/php-src/pull/17118/files#diff-56cbcf85bd7f68fa7a1f837eb15dcc536576986f366976f9642ad20867c471fd

https://github.com/php/php-src/pull/17118/files#diff-775c14f54cd1a27719d30bfab62024aeb1625bc3f3621fa0e7c16fb1c7957fdd

So in the examples above, the second would add $x and $y first, then square-root the result. The first, I think would probably need parens to avoid being invalid but I’d have to try it to be sure.

–Larry Garfield

It might be good to specify it in the RFC so that if there is any strange behavior decades from now, there will be a stated intent to help figure out whether it is a feature or a bug.

As to the ternary, it is the difference between that example being valid and this $x |> $x > 3 ? foo(...) : bar(...) |> baz(...) making sense or not. Personally, I wouldn’t write this code and would use parens to disambiguate, but it’d be handy to know when doing code reviews of authors who don’t.

— Rob

Christoph_M_Becker · February 7, 2025, 11:47pm

On 07.02.2025 at 23:54, Rob Landers wrote:

Put another way, what is the order of operations for this new operator?

For example, what is the output of

$x ? $y |> strlen(...) : $z

$x + $y |> sqrt(...) . EOL

Etc.

According to the reference implementation[1], that would be equivalent to

$x ? ($y |> strlen(...)) : $z

($x + $y) |> (sqrt(...) . EOL)
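To make the consequence of that grouping concrete, here is a hedged sketch in plain PHP that does not use the operator. It assumes the grouping reported above for the reference implementation and evaluates the grouped sub-expressions by hand; the variable names and values are made up for illustration.

```php
<?php
declare(strict_types=1);

$x = 9;
$y = 'abcd';
$z = 'fallback';

// First example, grouped as $x ? ($y |> strlen(...)) : $z.
// Written by hand, the piped branch is just strlen($y).
$first = $x ? strlen($y) : $z;

// Second example, grouped as ($x + $y) |> (sqrt(...) . EOL).
// The right-hand side concatenates a Closure with a string
// before anything is called, which PHP rejects at runtime.
$error = null;
try {
    $rhs = sqrt(...) . PHP_EOL;
} catch (Error $e) {
    $error = $e;
}

var_dump($first);                 // int(4)
var_dump($error instanceof Error); // bool(true)
```

So under this grouping the second example would need parentheses around the pipe, as in ($x + $y |> sqrt(...)) . EOL, to do what it appears to intend.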

I noticed this seems to be missing from the RFC. As a new operator, I think it should be important to specify that.

Indeed, precedence and associativity need to be mentioned in the RFC.

[1] <https://github.com/php/php-src/pull/17118>

Christoph

Crell · February 8, 2025, 4:05am

On Fri, Feb 7, 2025, at 5:47 PM, Christoph M. Becker wrote:

On 07.02.2025 at 23:54, Rob Landers wrote:

Put another way, what is the order of operations for this new operator?

For example, what is the output of

$x ? $y |> strlen(...) : $z

$x + $y |> sqrt(...) . EOL

Etc.

According to the reference implementation[1], that would be equivalent to

$x ? ($y |> strlen(...)) : $z

($x + $y) |> (sqrt(...) . EOL)

I noticed this seems to be missing from the RFC. As a new operator, I think it should be important to specify that.

Indeed, precedence and associativity need to be mentioned in the RFC.

[1] <https://github.com/php/php-src/pull/17118>

Christoph

I've added a precedence section, using examples from the tests and this thread.

--Larry Garfield

Tim_Dusterhus · February 8, 2025, 11:36am

Hi

On 2/7/25 22:04, Larry Garfield wrote:

I split pipes off from the Composition RFC late last night right before posting; I guess I missed a few things while doing so. :-/ Most notably, the Compose section is now removed from pipes, as it is not in scope for this RFC. (As noted, it's going to be more work so has its own RFC.) Sorry for the confusion. I think it should all be handled now.

The “Introduction” section still talks about function composition rather than the pipe operator, I believe.

5. The “References” (as in reference variables) section would do well
with an example of what doesn't work.

Example block added.

I don't understand that example. If I write this as regular function calls, it works fine. Did you mean to compare against:

inc_print(['a' => 'A', 'b' => 'B']);

i.e.

['a' => 'A', 'b' => 'B'] |> inc_print(...);

? If not, then you will need to expand on “breaks” which is a non-technical term.

9. In the “Why in the engine?” section: The RFC makes a claim about
performance.

Do you have any numbers?

Not currently. The statements here are based on simply counting the number of function calls necessary, and PHP function calls are sadly non-cheap. In previous benchmarks of my own libraries using my Crell/fp library, I did find that the number of function calls involved in some tight pipe operations was both a performance and debugging concern, but I don't have any hard numbers lying about at present to share.

If you think that's critical, please advise on how to best get meaningful numbers here.

Not sure if I missed the dedicated performance section on my first read through the RFC or if it is actually new. It also claims:

> The result is that pipe has virtually no runtime overhead.

Which, given your claim that “function calls are non-cheap” and combined with the intermediate closure for calls taking more than one parameter, is contradictory.

Generally speaking, if your RFC makes a claim (about performance), then it needs to back this up by evidence and not with feelings.

Regarding the “How”:

A hyperfine (How we use hyperfine to measure PHP Engine performance – Tideways) comparison for a release build comparing:

1. An implementation based on regular function calls without intermediate variables.
2. An implementation based on regular function calls with an intermediate temporary variable.
3. A performance-optimized userland pipe operator implementation.
4. The pipe operator RFC.

would certainly be appropriate to gain a first insight.

Having an OPcode dump to compare (1) against (4) would help gain more insights as to where the performance differences come from.
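As a starting point for variants 1 to 3, the sketch below times them in-process with hrtime(). This is a simplification of the hyperfine approach described above (which would run separate scripts), not a replacement for it; pipe() and bench() are hypothetical helpers, the workload is arbitrary, and variant 4 is omitted because it needs a build with the RFC patch applied.

```php
<?php
declare(strict_types=1);

// Hypothetical userland pipe, standing in for variant 3.
function pipe(mixed $value, callable ...$steps): mixed
{
    foreach ($steps as $step) {
        $value = $step($value);
    }
    return $value;
}

// Call $body $iterations times and return elapsed milliseconds.
function bench(Closure $body, int $iterations = 100_000): float
{
    $start = hrtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $body();
    }
    return (hrtime(true) - $start) / 1e6;
}

$input = ' Hello World ';

$variants = [
    // 1. Regular nested calls, no intermediate variable.
    'nested' => fn() => strtoupper(trim($input)),
    // 2. Regular calls with a temporary variable.
    'temp var' => function () use ($input) {
        $t = trim($input);
        $t = strtoupper($t);
        return $t;
    },
    // 3. Userland pipe helper.
    'userland' => fn() => pipe($input, trim(...), strtoupper(...)),
];

foreach ($variants as $label => $body) {
    printf("%-9s %8.2f ms\n", $label, bench($body));
}
```

Every variant pays the same wrapper-closure call per iteration, so only the differences between the reported times are meaningful, and any real comparison should be repeated on a release build.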

If the expression on the right side that produces a Closure has side effects (output, DB interaction, etc.), then the order in which those side effects happen may change with the different restructuring.

That is a good point. I see you added a precedence section, but this does not fully explain the order of operations in the face of side effects and more generally with regard to “short-circuiting” behavior. An OPcode dump would explain that.

Specifically for:

     foo()
         |> (bar() ? baz(...) : quux(...))
         |> var_dump(...);

What will the output be?

but using the temp-var approach makes dealing with references easier

I thought the RFC said that references were disallowed?

Best regards
Tim Düsterhus

Tim_Dusterhus · February 8, 2025, 11:41am

Hi

On 2/8/25 05:05, Larry Garfield wrote:

Indeed, precedence and associativity need to be mentioned in the RFC.

I've added a precedence section, using examples from the tests and this thread.

Associativity is not explicitly spelled out (though only left associativity makes sense).

And for the ternary conditional, the phrasing is pretty non-technical:

it will likely need to be enclosed in () or else it will be misinterpreted.

What does “misinterpreted” mean in concrete terms? In the stated example there is only one possible way to interpret it as a legal PHP program. Does this mean it will syntax error without the parentheses? Explicitly state the error message then.

Best regards
Tim Düsterhus

Gina_P_Banyard · February 8, 2025, 3:43pm

On Friday, 7 February 2025 at 04:57, Larry Garfield <larry@garfieldtech.com> wrote:

Hi folks. A few years ago I posted an RFC for a pipe operator, as seen in many other languages. At the time it didn't pass, in no small part because the implementation was a bit shaky and it was right before freeze. Nonetheless, there are now even more (bad) user-space implementations in the wild, as it gets brought up frequently in "what do you want in PHP?" threads (though nowhere near generics or better async, of course), so it seems clear there is demand in the market for it.

It is now back with a better implementation (many thanks to Ilija for his help and guidance in that), and it's nowhere close to freeze, so here we go again:

https://wiki.php.net/rfc/pipe-operator-v3

Of particular note, since the last RFC I have concluded that a compose operator is a necessary complement to a pipe operator. However, it's also going to be notably more work, and the two operators don't actually interact at all at the code level, so since people keep saying "Small RFCs!", here's a small RFC.

I'm very much in favour of this RFC, it will make writing functional and data pipeline code less cumbersome.
I was curious *how* the blocking of by-ref parameters is done, and was pleasantly surprised that it is done at run-time, so "prefer-by-ref" parameters work without issues.
This is good motivation for me to go back and push the by-value sort() RFC [1] as it uses that mechanism.
I've also submitted a PR [2] to add such a test case.
Probably a good idea to specify this in the RFC.

Best regards,

Gina P. Banyard

[1] PHP: rfc:array-sort-return-array
[2] Add test to check behaviour of prefer-by-ref parameters by Girgias · Pull Request #1 · Crell/php-src · GitHub