[PHP-DEV] PHP True Async RFC - Stage 2

EdmondDantes · March 18, 2025, 1:52pm

Oops, I made a mistake in the logic of Scope and coroutines.

According to the RFC, the following code behaves differently:

currentScope()->spawn ... // This coroutine belongs to the Scope
spawn ... // This one is a child coroutine

I was sure that I had checked all the major edge cases. Sorry.
This will be fixed soon.

P.S. + 1 example:

declare(strict_types=1);

use Async\Scope;
use function Async\currentScope;

function fetchUrl(string $url): string {
    $ctx = stream_context_create(['http' => ['timeout' => 5]]);
    return file_get_contents($url, false, $ctx);
}

function fetchAllUrls(array $urls): array
{
    $futures = [];
    
    foreach ($urls as $url) {
        $futures[$url] = spawn fetchUrl($url);
    }
    
    await children;
    
    $results = [];
    
    foreach ($futures as $url => $future) {
        $results[$url] = $future->getResult();
    }
    
    return $results;
}

$urls = [
    'https://example.com',
    'https://php.net',
    'https://openai.com'
];

$results = await spawn fetchAllUrls($urls);
print_r($results);

Ed.

Crell · March 18, 2025, 2:58pm

On Tue, Mar 18, 2025, at 8:45 AM, Talysson Lima wrote:

If I say it's bright, you call it dark.
If I choose the east, you push for the south.
You’re not seeking a path, just a fight...
Debating with you? Not worth the time!

Please do not top post.

--Larry Garfield

EdmondDantes · March 18, 2025, 9:36pm

Continuing the discussion from [PHP-DEV] PHP True Async RFC - Stage 2:

Yes, I understand what this is about.
Here’s a more specific example: launching two coroutines and waiting for both.

$scope = new Scope();
$scope->spawn(fn() => ...);
$scope->spawn(fn() => ...);

await $scope;

The downside of this code is that the programmer might forget to write await $scope.
Additionally, they constantly need to write $scope->.

This code can be replaced with syntactic sugar:

async {
    spawn ...
    spawn ...
};

Am I understanding this correctly? Does it look nice? I think yes.
And at the same time, if the closing bracket is missing, the compiler will throw an error, meaning you can’t forget to await.

That is the only advantage of this approach.

Now, let’s talk about the downsides.

function task(): void {
    spawn function() {
        echo "What?";
    };

    async {
        spawn ...
        spawn ...
    };
}

Let me explain.
You can write the spawn operator outside the async block. Why?

Because nothing can prevent you from doing so. It’s simply impossible.
After all, the function might already be executing inside another async block outside its scope.

That’s exactly what happens: an async block is essentially always present as soon as index.php starts running.

It is necessary to determine whether this syntax truly provides enough benefits compared to the direct implementation.

I will think about it.

Rowan_Tommins_IMSoP · March 18, 2025, 9:53pm

On 18 March 2025 08:22:18 GMT, Edmond Dantes <edmond.ht@gmail.com> wrote:

spawning in a scope a "second-class citizen" if `spawn foo($bar);`

Reminds me of *"post-purchase rationalization"* or the *"IKEA effect".*

when effort has already been invested into something, and then suddenly,
there's a more convenient way. And that convenient way seems to devalue the
first one.

That's not really what I meant. I meant that if you're learning the language, and see feature A has native and intuitive syntax, but related feature B is a method call with an awkward signature, you will think that feature A is more "normal" or "default", and feature B is "specialist" or "advanced".

If we want to encourage people to think that "spawn in current scope" is the default, to be used 99% of the time, I guess that's fine. But if we want using scopes to feel like a "natural" part of the language, they should be included in the "normal" syntax.

And if we have short forms for both on day one, I don't see any need to also have the long forms, if they end up needing slightly different semantics around closures and parameters.

I'm confused - $scope->onExit() is already in the RFC, and I wasn't suggesting any change other than the syntax.

(Although I'm not sure if it should defer to coroutine exit rather than scope exit by default?)

Yes, that's correct. The onExit method can be used for both a coroutine and
a Scope.
As for the method name onExit, it seems like it would be better to replace
it with something clearer.

I'm still confused why you started talking about how to implement "defer". Are you saying that "onExit" and "defer" are different things? Or that giving it dedicated syntax changes the implementation in some fundamental way?

spawn call bar(...);

It doesn't make sense because the arguments must be defined.

As with any other callable, it would be called with no arguments. It was just an illustration, because the function_name(...) syntax is one of many ways of creating a callable in PHP.

Is there a reason to redefine all of this and make fresh decisions about what to allow?

If we strictly follow syntax consistency, then of course, we need to cover
all possible use cases.

But when I see code like this:
spawn ($this->getSome()->getFunction())($parameter1, ...);
I start seeing circles before my eyes.

Sure, any flexible syntax lets you write subjectively horrible things.
But like with allowing object keys and adding Async\Key objects, this isn't the place to redesign existing language features based on personal taste.

If the feature is "you can put a function call here", let's not waste time writing a new definition of "function call" - the scope of this project is big enough already!

A great example of exactly this is the First-Class Callables RFC: PHP: rfc:first_class_callable_syntax

> The syntax CallableExpr(...) is used to create a Closure object that refers to CallableExpr with the same semantics as Closure::fromCallable().
> CallableExpr can be any expression that can be directly called in the PHP grammar.

That's pretty much the whole definition, because the implementation directly uses the existing function_call grammar rule. The only caveat is that calls using the nullsafe ?-> operator are forbidden - for reasons directly related to the feature, not because Nikita thought they were ugly.
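
As a concrete reference point, here is a minimal sketch of that rule in current PHP (8.1 or later), with none of the proposed async syntax. The function name `test` and the `$pick` helper are placeholders for illustration only.

```php
<?php
declare(strict_types=1);

function test(string $s): string
{
    return strtoupper($s);
}

// CallableExpr(...) creates a Closure, like Closure::fromCallable().
$a = test(...);
$b = Closure::fromCallable('test');

var_dump($a instanceof Closure); // bool(true)
var_dump($a('x') === $b('x'));   // bool(true)

// Any directly callable expression qualifies, including a parenthesized one.
$pick = fn(bool $upper): string => $upper ? 'strtoupper' : 'strtolower';
$c = ($pick(false))(...);
var_dump($c('ABC'));             // string(3) "abc"
```

The point is that the feature reuses the existing notion of "something that can be called" rather than defining a new one.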

spawn <exp> [with (<args>)];
...
spawn test with ("string");
spawn test(...) with ("string");

Only the second of these meets the proposed rule - `test(...)` is an expression which evaluates to a Closure object; `test` as an expression refers to a constant, which I'm pretty sure is not what you intended.
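
The distinction can be checked in current PHP. This is a small sketch, assuming a placeholder function named `test`; it uses no proposed syntax.

```php
<?php
declare(strict_types=1);

function test(string $s = 'default'): string
{
    return "called with $s";
}

// `test(...)` is an expression that evaluates to a Closure object.
$closure = test(...);
var_dump($closure instanceof Closure); // bool(true)

// The string "test" is a callable, but it is not a Closure.
$name = 'test';
var_dump(is_callable($name));          // bool(true)
var_dump($name instanceof Closure);    // bool(false)

// A bare `test` would be a constant lookup: functions and constants
// live in separate symbol tables, and no such constant exists here.
var_dump(function_exists('test'));     // bool(true)
var_dump(defined('test'));             // bool(false)
```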

If you allowed any callables, not just Closures, this would technically be valid:

spawn "test" with ("string");

But most likely you'd still want a separate syntax for a direct function call, however you want to spell it:

spawn_this_closure_ive_created_somehow function() {
    do_something();
    do_something_else();
};
spawn_this_closure_ive_created_somehow test(...) with ("string");
spawn_this_function_call_without_creating_a_closure test("string");

Or forget callables, and anything that looks like it's trying to be one, because creating a Closure isn't actually the user's aim:

spawn_this_function_call_without_creating_a_closure test("string");
spawn_these_statements_use_a_closure_if_you_like_i_dont_care {
    do_something();
    do_something_else();
}

--
Rowan Tommins
[IMSoP]

Rob_Landers · March 19, 2025, 12:28am

On Sun, Mar 16, 2025, at 10:24, Edmond Dantes wrote:

Good day, everyone. I hope you’re doing well.

https://wiki.php.net/rfc/true_async

Here is a new version of the RFC dedicated to asynchrony.

Key differences from the previous version:

The RFC is not based on Fiber; it introduces a separate class representation for the asynchronous context.

All low-level elements, including the Scheduler and Reactor, have been removed from the RFC.

The RFC does not include Future, Channel, or any other primitives, except those directly related to the implementation of structured concurrency.

The new RFC proposes more significant changes than the previous one; however, all of them are feasible for implementation.

I have also added PHP code examples to illustrate how it could look within the API of this RFC.

I would like to make a few comments right away. In the end, the Kotlin model lost, and the RFC includes an analysis of why this happened. The model that won is based on the Actor approach, although, in reality, there are no Actors, nor is there an assumption of implementing encapsulated processes.

On an emotional level, the chosen model prevailed because it forces developers to constantly think about how long coroutines will run and what they should be synchronized with. This somewhat reminded me of Rust’s approach to lifetime management.

Another advantage I liked is that there is no need for complex syntax like in Kotlin, nor do we have to create separate entities like Supervisors and so on. Everything is achieved through a simple API that is quite intuitive.

Of course, there are also downsides — how could there not be? But considering that PHP is a language for web server applications, these trade-offs are acceptable.

I would like to once again thank everyone who participated in the previous discussion. It was great!

Hey Edmond,

Here are my notes:

The Scheduler and Reactor components should be described in a separate RFC, which should focus on the low-level implementation in C and define API contracts for PHP extensions.

Generally, RFCs are for changes in the language itself, not for API contracts in C. That can generally be handled in PRs, if I understand correctly.

The suspend function has no parameters and does not return any values, unlike the yield operator.

If it can throw, then it does return values? I can foresee people abusing this for flow control and passing out (serialized) values of suspended coroutines. Especially if it is broadcast to all other coroutines awaiting it. It is probably simpler to simply allow passing a value out via suspend.

The suspend function can be used in any function and in any place including from the main execution flow:

Does this mean it is an expression? So you can basically do:

return suspend();

$x = [suspend(), suspend(), suspend()];

foreach ($x as $_) {}

or other weird shenanigans? I think it would be better as a statement.

The await function/operator is used to wait for the completion of another coroutine:

What happens if it throws? Why does it return NULL; why not void or the result of the awaited spawn?

The register_shutdown_function handler operates in synchronous mode, after asynchronous handlers have already been destroyed. Therefore, the register_shutdown_function code should not use the concurrency API. The suspend() function will have no effect, and the spawn operation will not be executed at all.

Wouldn’t it be better to throw an exception instead of silently failing?

From this section, I really don’t like the dual-syntax of spawn, it is function-like, except not. In other words, this won’t behave like you would expect it to:

spawn ($is_callable ? $callable : $default_callable)($value);

I’m not sure what will actually happen here.

When comparing the three different models, it would be ideal to keep to the same example for all three and describe how their execution differs between the example. Having to parse through the examples of each description is a pain.

Child coroutines inherit the parent’s Scope:

Hmm. Do you mean this literally? So if I call a random function via spawn, it will have access to my current scope?

function foo() {
    $x = 'bar';
}

$x = 'baz';

$scope->spawn(foo(...));

echo $x; // baz or bar??

That seems like a massive footgun. I think you mean to say that it would behave like normal. If you spawn a function, it behaves like a function, if you spawn a closure, it closes over variables just like normal. Though I think it is worth defining “when” it closes over the variables – when it executes the closure, or when it hits the spawn.
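
For comparison, this is how variable scope and closure capture work in current PHP, which is presumably the "behave like normal" baseline. It is only a sketch of existing language behavior and says nothing about what the RFC's Scope object does.

```php
<?php
declare(strict_types=1);

function foo(): void
{
    $x = 'bar'; // local to foo(); the caller's $x is untouched
}

$x = 'baz';
foo();
echo $x, PHP_EOL; // baz

// By-value capture happens when the closure is created, not when it runs.
$byValue = function () use ($x): string { return $x; };
$arrow   = fn(): string => $x;
// By-reference capture sees later changes.
$byRef   = function () use (&$x): string { return $x; };

$x = 'changed';

echo $byValue(), PHP_EOL; // baz
echo $arrow(), PHP_EOL;   // baz
echo $byRef(), PHP_EOL;   // changed
```

So with an ordinary closure the "when" is already fixed: values are copied at the point the closure expression is evaluated, unless they are captured by reference.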

Does “spawn” not provide a \Scope?

I still don’t understand the need for a special context thing. One of the most subtle footguns with go contexts is to propagate the context when it shouldn’t be propagated. For example, if you are sending a value to a queue, you probably don’t want to send the request context. If you did and the request was cancelled (or even just completed!), it would also cancel putting the value on the queue – which is almost certainly what you do not want. Since the context is handled for you, you also have to worry about a context disappearing while you are using it, from the looks of things.

This can be easily built in userland, so I’m not sure why we are defining it as part of the language.

To ensure data encapsulation between different components, Coroutine Scope Slots provide the ability to associate data using key objects. An object instance is unique across the entire application, so code that does not have access to the object cannot read the data associated with it.

heh, reminds me of records.
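
The key-object idea in the quoted passage can be approximated in userland with a WeakMap. This is only a sketch: `SlotStorage` and `$privateKey` are hypothetical names, not the RFC's API.

```php
<?php
declare(strict_types=1);

// Hypothetical userland storage keyed by object identity (not the RFC's API).
final class SlotStorage
{
    private WeakMap $slots;

    public function __construct()
    {
        $this->slots = new WeakMap();
    }

    public function set(object $key, mixed $value): void
    {
        $this->slots[$key] = $value;
    }

    public function get(object $key): mixed
    {
        // isset() avoids the Error that WeakMap throws for unknown keys.
        return isset($this->slots[$key]) ? $this->slots[$key] : null;
    }
}

$storage    = new SlotStorage();
$privateKey = new stdClass(); // only code holding this object can read the slot

$storage->set($privateKey, 'secret');

var_dump($storage->get($privateKey));     // string(6) "secret"
var_dump($storage->get(new stdClass()));  // NULL: a different instance is a different key
```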

I know I have been critical in this email, but I actually like it; for the most part. I think there are still some rough edges to sand down and polish, but it is on the right track!

— Rob

EdmondDantes · March 19, 2025, 7:07am

Continuing the discussion from [PHP-DEV] PHP True Async RFC - Stage 2:

I meant that the defer operator is needed in the language not only in the context of coroutines but in functions in general. In essence, defer is a shortened version of try-finally, which generates more readable code.
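
To illustrate the comparison, here is the try-finally form that a `defer` statement would shorten. The sketch uses only current PHP; `readFirstLine` is a made-up helper, and `defer` itself remains hypothetical.

```php
<?php
declare(strict_types=1);

function readFirstLine(string $path): string
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }

    try {
        $line = fgets($handle);
        return $line === false ? '' : rtrim($line, "\r\n");
    } finally {
        // Runs on return and on exception alike; a hypothetical
        // `defer fclose($handle);` would express the same cleanup.
        fclose($handle);
    }
}

// Usage with a temporary file.
$tmp = tempnam(sys_get_temp_dir(), 'defer');
file_put_contents($tmp, "hello\nworld\n");
$firstLine = readFirstLine($tmp);
unlink($tmp);

echo $firstLine, PHP_EOL; // hello
```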

Since that’s the case, I shouldn’t describe this operator in this RFC. However, onExit() and defer are essentially almost the same.

I think I understand your point.
spawn <callable>, where callable is literally any expression that is considered callable.
OK. Let’s transform that to rules:

The general syntax of the spawn operator:

spawn [with <scope>] <callable>[(<parameters>)];

where:

callable: a valid expression whose result can be invoked.

Examples:

spawn "function_name";

// With variable
$callable = fn() => sleep(5);
spawn $callable;

// With array expression
spawn ([$this, "method"]);
// With condition
spawn ($condition ? $fun1 : $fun2);
// As function result
spawn (getClosure());

// Closures:
spawn (function() use(): string { return "string"; });
spawn (fn() => sleep(5));
// with closure expression
spawn (sleep(...))(5);

// Simplified forms
spawn sleep(5);
spawn function() use(): string { return "string"; };

parameters: a list of parameters that will be passed to the callable expression.
scope: an expression that should resolve to an object of class Async\Scope.

Examples:

spawn with $scope file_get_contents("http://localhost");
spawn with $this->scope file_get_contents("http://localhost");
spawn with $this->getScope() file_get_contents("http://localhost");
spawn with getScope() file_get_contents("http://localhost");
spawn with ($flag ? $scope1 : $scope2) file_get_contents("http://localhost");

As you may have noticed, there is still a special form to avoid using "()". However, overall, this syntax looks quite cohesive. Did I get the idea right?

Yes, that’s exactly what we’d like to avoid.

Exactly! It turns out that the expression spawn something();
can be interpreted as if something is a PHP constant rather than a function.

This creates an ambiguous situation in the language:

const MY_CONST = "somefunction";
spawn MY_CONST(); // What happens here??? :)
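
For reference, current PHP already resolves this case for ordinary calls, without `spawn`. The sketch below reuses the names from the example above and adds a function called `MY_CONST` purely for illustration.

```php
<?php
declare(strict_types=1);

function somefunction(): string
{
    return 'somefunction() called';
}

function MY_CONST(): string
{
    return 'MY_CONST() called';
}

const MY_CONST = 'somefunction';

// A bare name followed by () is always resolved as a function name.
echo MY_CONST(), PHP_EOL;   // MY_CONST() called

// Parentheses force the constant to be evaluated first, then called.
echo (MY_CONST)(), PHP_EOL; // somefunction() called
```

If `spawn` simply prefixes the existing function-call grammar, the same resolution would presumably apply.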

With your help, we seem to have identified all the pitfalls. Now we can put them together and come up with the best solution.

Daniil_Gentili · March 19, 2025, 8:26am

P.S. + 1 example:
<?php
declare(strict_types=1);

use Async\Scope;
use function Async\currentScope;

function fetchUrl(string $url): string {
    $ctx = stream_context_create(['http' => ['timeout' => 5]]);
    return file_get_contents($url, false, $ctx);
}

function fetchAllUrls(array $urls): array
{
    $futures = [];

    foreach ($urls as $url) {
        $futures[$url] = (spawn fetchUrl($url))->getFuture();
    }

    await currentScope();

    $results = [];

    foreach ($futures as $url => $future) {
        $results[$url] = $future->getResult();
    }

    return $results;
}

$urls = [
    'https://example.com',
    'https://php.net',
    'https://openai.com'
];

$results = await spawn fetchAllUrls($urls);
print_r($results);

I still strongly believe the RFC should not include the footgun that is the await on the current scope, and this example you sent shows exactly why: you gather an array of futures, and instead of awaiting the array, you await the scope (potentially causing a deadlock if client libraries do not use a self-managed scope manually, using whatever extra syntax is required to do that), and then manually extract the values for some reason.

This is still using the kotlin approach equivalent to its runBlocking function, with all the footguns that come with it.

I would like to invite you to google “runBlocking deadlock” and see the vast number of results, blog posts and questions regarding its dangers: Kotlin Runblocking Deadlock

A few examples:

Coroutines and deadlocks - #2 by wdoker - Support - Kotlin Discussions - “You launched a runBlocking inside the default dispatcher for each core… You are abusing the coroutine machinery! Do not use runBlocking from an asynchronous context, it is meant to enter an asynchronous context!” (a newbie abused an async {}/await currentScope())
How I Fell in Kotlin’s RunBlocking Deadlock Trap, and How You Can Avoid It | by Sam Cooper | Better Programming (async {}/await currentScope() blocks on internal Kotlin runtime fibers, causing a deadlock in some conditions)

Even the official Kotlin documentation (runBlocking) says “Calling runBlocking from a suspend function is redundant. For example, the following code is incorrect:”

suspend fun loadConfiguration() {
    // DO NOT DO THIS:
    val data = runBlocking { // ← redundant and blocks the thread, do not do that
        fetchConfigurationData() // suspending function
    }
}

Which is fully equivalent to the following PHP (note that I use the “async {}” nursery still used by some people in the discussion, but fully equivalent logic could be written using await currentScope):

function loadConfiguration() {
    // DO NOT DO THIS:
    async { // ← redundant and blocks the thread, do not do that
        $data = fetchConfigurationData(); // suspending function
    }
}

With current rfc syntax:

function loadConfiguration() {
    // DO NOT DO THIS:
    $data = fetchConfigurationData(); // suspending function
    await currentScope(); // ← redundant and blocks the thread, do not do that
}

When even the official language documentation is telling you in ALL CAPS to not use something, you automatically know it’s a major footgun which has already been abused by newbies.

As I reiterated before, making a scope awaitable is a footgun waiting to happen. While there is now at least an escape hatch in the form of custom scopes, forcing libraries to use them is a very bad idea IMO. As other people said in this thread, if there’s an “easy” way of spawning fibers (using the global/current context), you discourage people from using the “less easy” way of spawning fibers through custom contexts, which will inevitably lead to deadlocks.

I strongly believe that golang’s scopeless approach (which is the current approach already used by async php) is the best approach, and there should be no ways for users to mess with the internals of libraries that accidentally spawn a fiber in the current scope instead of a custom one.

Regards,
Daniil Gentili.

Rowan_Tommins_IMSoP · March 19, 2025, 8:44am

On 19 March 2025 07:07:36 GMT, Edmond Dantes <edmond.ht@gmail.com> wrote:

Continuing the discussion from [[PHP-DEV] PHP True Async RFC - Stage 2](
[PHP-DEV] PHP True Async RFC - Stage 2 - #24 by Rowan_Tommins_IMSoP
):

[quote="Rowan_Tommins_IMSoP, post:24, topic:1573"]

Just a quick reminder that although various mirrors exist, this is primarily a mailing list, and email clients won't parse whatever unholy mix of markdown and BBCode that is.

A bit of punctuation for things like emphasis etc is fine, but how it looks on php.internals mailing list is going to be how it looks for a lot of contributors.

I meant that the `defer` operator is needed in the language not only in the
context of coroutines but in functions in general. In essence, `defer` is a
shortened version of `try-finally`, which generates more readable code.

Since that's the case, I shouldn't describe this operator in this RFC.
However, `onExit()` and `defer` are essentially almost the same.

Ah, I get it now, thanks.

spawn [with <scope>] <callable>[(<parameters>)];

spawn (getClosure());

spawn sleep(5);

You're cheating again - you've put an extra pair of brackets around one expression and not the other, and assumed they'll work differently, but that's not the grammar you proposed.

It's possible we could resolve the ambiguity that way - if it's in brackets, it's evaluated as an expression - but then all the other examples need to be changed to match, including this one:

spawn function() use(): string { return "string"; };

Instead you'd have to write this:

spawn (function() use(): string { return "string"; });

In the end, it still comes back to "there are two grammar rules here, how do we name them?" Only this time it's "spawn" vs "spawn()" rather than "spawn" vs "spawn call", or all the other examples I've used.

[quote="Rowan_Tommins_IMSoP, post:24, topic:1573"]
The only caveat is that calls using the nullsafe ?-> operator are forbidden
- for reasons directly related to the feature, not because Nikita thought
they were ugly.
[/quote]

Yes, that's exactly what we'd like to avoid.

Sorry, what is exactly what we'd like to avoid?

[quote="Rowan_Tommins_IMSoP, post:24, topic:1573"]
But most likely you'd still want a separate syntax for a direct function
call, however you want to spell it:
[/quote]

**Exactly!** It turns out that the expression `spawn something();`
can be interpreted as if `something` is a PHP constant rather than a
function.

It's more fundamental than that: function_call and expr are overlapping grammars, so having a rule that spawn can be followed by either of them, with different meanings, leads to ambiguities. You can carefully tune the grammar to avoid those, but then the user has to learn those rules; or you can just use two keywords, which I don't remember you actually responding to as a suggestion.

Rowan Tommins
[IMSoP]

EdmondDantes · March 19, 2025, 11:01am

When even the official language documentation is telling you in ALL CAPS to not use something, you automatically know it’s a major footgun which has already been abused by newbies.

This is a compelling example of why you should not use the await currentScope() construct. Thank you.

EdmondDantes · March 19, 2025, 11:27am

You’re cheating again - you’ve put an extra pair of brackets around one
expression and not the other, and assumed they’ll work differently, but that’s
not the grammar you proposed.

Why am I cheating?

spawn (getClosure());
This is an honest statement, provided that the second parentheses are optional. The full notation would be:
spawn (getClosure())();

Instead you’d have to write this:
spawn (function() use(): string { return "string"; });

Exactly right, that’s how it should be according to PHP syntax.
Therefore, if we want to get rid of double parentheses, we need a separate rule for closures.

I would name these two cases as follows:

spawn callable – the general usage case
spawn closure – the special case for closures

I don’t think these two rules make the language inconsistent because the function keyword allows separating the first expression from the second one.

spawn function means that a Closure definition follows.
Accordingly, there should be no additional parentheses () at the end.

The reverse meaning is that if spawn is not followed by the function keyword, then it must be a valid expression that can be enclosed in parentheses.
There are some doubts about whether all situations are correct, but so far, this is how it works for me:

call a standard PHP function

spawn file_get_contents('file1.txt');

call a user-defined function

function example(string $name): void {
    echo "Hello, $name!";
}

spawn example('World');

call a static method

spawn Mailer::send($message);

call a method of an object

$object = new Mailer();
spawn $object->send($message);

self, static or parent keyword:

spawn self::send($message);
spawn static::send($message);
spawn parent::send($message);

call $class method

$className = 'Mailer';
spawn $className::send($message);

expression

// Use result of foo()
spawn (foo())();
// Use foo as a closure
spawn (foo(...))();
// Use ternary operator
spawn ($option ? foo() : bar())();

call array dereference

$array = [fn() => sleep(1)];
spawn $array[0]();

new dereference

class Test {
    public function wait(): void {
        sleep(1);
    }
}

spawn new Test->wait();

call dereferenceable scalar:

spawn "sleep"(5);

call short closure

spawn (fn() => sleep(1))();
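
For comparison, most of the call forms listed above can be run in current PHP once the proposed `spawn` prefix is dropped. This is an adapted sketch rather than the RFC's behavior: the functions return strings instead of sleeping, and `deliver` is a made-up instance method, since a class cannot declare `send` as both static and non-static.

```php
<?php
declare(strict_types=1);

final class Mailer
{
    public static function send(string $message): string
    {
        return "sent: $message";
    }

    public function deliver(string $message): string
    {
        return "delivered: $message";
    }
}

function example(string $name): string
{
    return "Hello, $name!";
}

$object    = new Mailer();
$className = 'Mailer';
$array     = [fn(string $s): string => strrev($s)];

$results = [
    strtoupper('plain function'),         // standard PHP function
    example('World'),                     // user-defined function
    Mailer::send('static'),               // static method
    $object->deliver('instance'),         // method of an object
    $className::send('variable class'),   // class name held in a variable
    (example(...))('closure'),            // parenthesized expression
    $array[0]('abc'),                     // array dereference
    (new Mailer())->deliver('new'),       // call on a freshly created object
    'strtoupper'('scalar'),               // dereferenceable scalar
    (fn(): string => 'short closure')(),  // short closure
];

print_r($results);
```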

Sorry, what is exactly what we’d like to avoid?

Syntax ambiguities.
But it seems that avoiding them completely is impossible anyway because
I’ve already found an old case where a property and a method are used in the same expression.

EdmondDantes · March 19, 2025, 11:51am

Generally, RFCs are for changes in the language itself, not for API contracts in C. That can generally be handled in PRs, if I understand correctly.

I thought this was handled by PHP INTERNAL.
So I have no idea how it actually works.

or other weird shenanigans? I think it would be better as a statement.

Your example was absolutely convincing.
I have nothing to argue with.
So, suspend is 100% an operator.

What happens if it throws? Why does it return NULL; why not void or the result of the awaited spawn?

Exceptions are thrown if they exist. This also applies to suspend, by the way.

Why not void?
Because the expression $result = await ... must resolve to something.

If I remember the core code correctly, the PHP engine returns NULL even if the function is declared as void.
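
This is easy to confirm in current PHP. The function name `returnsNothing` is just a placeholder.

```php
<?php
declare(strict_types=1);

function returnsNothing(): void
{
    // no return value
}

// Using the result of a void function is allowed; it evaluates to NULL.
$result = returnsNothing();
var_dump($result); // NULL
```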

Wouldn’t it be better to throw an exception instead of silently failing?

Possibly, yes.
For me, this is a difficult situation because the code inside the final handler is already executing in a context where any exception can interrupt it.

Hmm. Do you mean this literally? So if I call a random function via spawn, it will have access to my current scope?

Exactly.
And yes, accessing a Scope defined in a different context is like shooting yourself in the foot.
That’s why this possibility should be removed.

I know I have been critical in this email, but I actually like it; for the most part. I think there are still some rough edges to sand down and polish, but it is on the right track!

In this RFC, I made two very big mistakes because my attention slipped.
This once again proves that the RFC should be moved forward very slowly, as it only seems simple.

The first major mistake was that I tried to make functions an element of structured concurrency.
But to do so, functions would need to have an async attribute, which contradicts the RFC.

The second mistake is the currentScope() function, which doesn’t just shoot you in the foot—it shoots you straight in the head (just like globalScope()).
Of course, a programmer can intentionally pass the $scope object to another coroutine, but making it easier for them to do so is madness.

Aleksander_Machniak · March 19, 2025, 12:08pm

On 19.03.2025 12:51, Edmond Dantes wrote:

>
> or other weird shenanigans? I think it would be better as a statement.
>
Your example was absolutely convincing.
I have nothing to argue with
So, `suspend` is 100% an operator.

Please, don't use the word operator in this context. It's a keyword, statement or language construct, but not an operator. It's important especially when you write an RFC.

exit/die are also not operators.

--
Aleksander Machniak
Kolab Groupware Developer [https://kolab.org]
Roundcube Webmail Developer [https://roundcube.net]
----------------------------------------------------
PGP: 19359DC1 # Blog: https://kolabian.wordpress.com

Rowan_Tommins_IMSoP · March 19, 2025, 1:43pm

On 19 March 2025 11:27:11 GMT, Edmond Dantes <edmond.ht@gmail.com> wrote:

You're cheating again - you've put an extra pair of brackets around one
expression and not the other, and assumed they'll work differently, but that's
not the grammar you proposed.

Why am I cheating?

"Cheating" in the sense that you wrote out a "general syntax", and then dropped in examples that contradicted that syntax. Saying "it follows this grammar, except when it doesn't" isn't very enlightening.

spawn (getClosure());

This is an honest statement, provided that the second parentheses are
optional. The full notation would be:

spawn (getClosure())();

That's not the problem; the problem is that the following are all equivalent expressions:

foo()
(foo())
((foo()))
(((foo())))
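
A quick check in current PHP shows the equivalence. In this sketch `foo` is a placeholder that returns a Closure, so that calling its result is meaningful.

```php
<?php
declare(strict_types=1);

function foo(): Closure
{
    return fn(): string => 'inner closure called';
}

// Redundant parentheses never change what an expression evaluates to.
var_dump(foo() instanceof Closure);     // bool(true)
var_dump((foo()) instanceof Closure);   // bool(true)
var_dump(((foo())) instanceof Closure); // bool(true)

// Calling the result always needs its own argument list.
var_dump((foo())());                    // string(20) "inner closure called"
```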

But you want these to mean different things:

spawn foo();
spawn (foo());

I think that would be achievable with a new grammar rule for "expression other than function call", so you could have:

spawn_statement:
    'spawn' function_call { compile_function_spawn }
    | 'spawn' expression_not_function_call { compile_closure_spawn }

But from a user's point of view, I hate rules like that which mean subtle changes completely change the meaning of the code, in a very specific context.

The reverse meaning is that if `spawn` is not followed by the `function`
keyword, then it must be a valid expression that can be enclosed in
parentheses.

There are some doubts about whether all situations are correct, but so far,
this is how it works for me:

All of these are examples of the keyword being followed by *a function call*, not *any expression*. Which honestly I think is the right way to go.

The "expression" part came into the conversation because I think it's weird to force the user to write "function() { ... }" as though it's a Closure declaration, but not let them perform obvious refactoring like moving that declaration into a variable.

If it's going to be a special case for an "inline coroutine", just use a keyword other than "function", so it doesn't look like an expression when it's not, like "spawn block { ... }"; or no keyword at all, just "spawn { ... }"

Rowan Tommins
[IMSoP]

EdmondDantes · March 19, 2025, 2:21pm

Please, don’t use the word operator in this context. It’s a keyword,
statement or language construct, but not operator. It’s important
especially when you write an RFC.

Thank you so much for paying attention to this!

EdmondDantes · March 19, 2025, 3:51pm

“Cheating” in the sense that you wrote out a “general syntax”,

I got it.

That’s not the problem; the problem is that the following are all equivalent expressions:
(((foo())))

In principle, this is not a problem because the expression in parentheses preceded by spawn is unambiguously a call expression.
That is, from an analysis perspective, everything is fine here because the expression spawn () is a syntax error.

On the other hand, the expression spawn (()) is also an error, but a semantic one, because there is nothing to call.

The expression spawn ((), (), ()) is also an error because it attempts to specify a list of parameters without a call expression.

It seems I didn’t make any mistakes anywhere?

But from a user’s point of view, I hate rules like that which mean subtle changes completely change the meaning of the code, in a very specific context.

From the user’s perspective, what is the difference between these two expressions?

spawn function(): string { ... };
spawn (function(): string { ... });

All of these are examples of the keyword being followed by a function call, not any expression. Which honestly I think is the right way to go.

I created these examples according to the syntax rules from the Bison file for function_call.
Of course, I might have missed something, but overall, the list above essentially represents what function_call should expand into.

If we define the rule spawn function_call, then spawn acts as a prefix in this expression. However, everything that is already defined for function_call in PHP must remain valid.

If we take the most complex part:

    | callable_expr { $<num>$ = CG(zend_lineno); } argument_list {
          $$ = zend_ast_create(ZEND_AST_CALL, $1, $3);
          $$->lineno = $<num>2;
      }

callable_expr:
      callable_variable { $$ = $1; }
    | '(' expr ')' { $$ = $2; }
    | dereferenceable_scalar { $$ = $1; }
    | new_dereferenceable { $$ = $1; }
;

we can see that the expression (((some()))) corresponds to the second alternative of callable_expr, '(' expr ')', and an argument list must follow it.
In this case the arguments are mandatory: after (((some()))) there must also be ().
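
A minimal sketch of that rule in present-day PHP, with no spawn keyword involved. The helper some() is hypothetical and simply returns a closure, so that the parenthesized expression has something to call:

```php
<?php
declare(strict_types=1);

// Hypothetical helper: returns a closure so the result can itself be called.
function some(): Closure
{
    return fn(): string => 'called';
}

// Extra parentheses do not change the value: this is still just the closure.
$closure = (((some())));

// '(' expr ')' followed by an argument list is a function_call:
// the trailing () invokes the closure that some() returned.
$result = (((some())))();

var_dump($closure instanceof Closure); // bool(true)
var_dump($result);                     // string(6) "called"
```

Under a spawn function_call rule, only the second form would presumably be accepted after the keyword, because only it ends in an argument list.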

And another good thing about this syntax: a coroutine is an execution context, and a function call is a transition to a block of code.
This means that the expression spawn function_call can be understood literally as:

Create a new execution context
Execute function_call

The syntax for function_call is already defined in PHP, and nothing changes in it.

The second form of the spawn expression is:

spawn inline_function

where inline_function is a definition from bison:

inline_function:
      function returns_ref backup_doc_comment '(' parameter_list ')' lexical_vars return_type
      ...;

Another advantage of this solution is that if function_call constructs change in the future, we won’t need to modify the definitions for spawn.

If it’s going to be a special case for an “inline coroutine”, just use a keyword other than “function”, so it doesn’t look like an expression when it’s not, like “spawn block { … }”; or no keyword at all, just “spawn { … }”

Well, yes, but here we again face the issue with returnType syntax, which ends up hanging in the air…

Rowan_Tommins_IMSoP · March 19, 2025, 5:10pm

On 19 March 2025 15:51:38 GMT, Edmond Dantes <edmond.ht@gmail.com> wrote:

"Cheating" in the sense that you wrote out a "general syntax",

I got it.

That's not the problem; the problem is that the following are all
equivalent expressions:

(((foo())))

In principle, this is not a problem because the expression in parentheses
preceded by `spawn` is unambiguously a call expression.
That is, from an analysis perspective, everything is fine here because the
expression `spawn ()` is a syntax error.

This has nothing to do with my examples.

If you start with a valid expression, you can add any number of parentheses around it, and get another valid expression, with the same meaning. A function call is an expression, and a function call wrapped in parentheses is another way of writing the same expression.

But even though we're talking in circles about why, your latest examples do avoid the particular problem I was trying to describe.

From the user's perspective, what is the difference between these two
expressions?
spawn function(): string { ... };
spawn (function(): string { ... });

Yes, that would probably be a bad choice as well. Which is why I've repeatedly suggested a different keyword, and AFAIK you still haven't actually voiced an opinion on that.

If we define the rule `spawn function_call`, then `spawn` acts as a prefix
in this expression. However, everything that is already defined for
`function_call` in PHP must remain valid.

Yep, I'm totally fine with that. But you kept referring to that token as "expression", which confused me, because that's a different thing in the grammar.

The second form of the `spawn` expression is:
spawn inline_function

This part totally makes sense from a syntax point of view, I just think it has bad usability - the user has to type a bunch more boilerplate, which looks like something it's not.

If it's going to be a special case for an "inline coroutine", just use a
keyword other than "function", so it doesn't look like an expression when
it's not, like "spawn block { ... }"; or no keyword at all, just "spawn {
... }"

Well, yes, but here we again face the issue with `returnType` syntax, which
ends up hanging in the air...

I think I asked this before: why would anyone want to specify a return type there?

I assumed the actual user scenario we're trying to solve is "I have a bunch of statements I want to run in a new Coroutine, but they're not worth putting in a function". So to the user, having all the features of a function isn't relevant. We don't allow specifying the return type of a match statement, for example.

Do you have a different scenario in mind?

Rowan Tommins
[IMSoP]

EdmondDantes · March 19, 2025, 7:24pm

But even though we’re talking in circles about why,
your latest examples do avoid the particular problem I was trying to describe.

I thought the problem was that the syntax wouldn’t work. Is there any other issue?

Yes, that would probably be a bad choice as well. Which is why I’ve repeatedly suggested a different keyword, and AFAIK you still haven’t actually voiced an opinion on that.

Does this concern the syntax of spawn block {} or did I miss something?

I will describe the reasons why I rejected the concise syntax in favor of a more verbose one below.

But you kept referring to that token as “expression”, which confused me, because that’s a different thing in the grammar.

By the word “expression,” I mean a language construct along with keywords.
If spawn function_call returns a value, then it can be considered an expression, right? Or am I mistaken in the terminology?

I think I asked this before: why would anyone want to specify a return type there?

A simple answer (though not necessarily the correct one): because it’s a closure. And in PHP, a closure has a return type.
I understand what you’re asking: what is the practical significance of this? Perhaps none, but it provides consistency in syntax.

The syntax spawn {}; is the most elegant in terms of brevity. There’s no doubt that it’s shorter by the number of characters in “function”.

spawn block {}; also looks decent. However, the keyword block does not accurately reflect what is happening, because what follows is not a block but a closure.
spawn closure {}; is a possibility, but it raises the question: why introduce the keyword closure when we already have function? The difference in characters is minimal.
spawn fn {}; is the shortest option, but fn is already used for the shorthand function syntax fn() => ....
But we can forget about ReturnType, right? Okay, but there’s another point.

In PHP, code blocks are not all the same.

if / else / try / switch do not create a new scope.
function does create a new scope.

When a programmer writes:

$x = 5;
spawn {$x++};

Will they easily understand that $x++ is not modifying the same $x as before?
No, they won’t. They will have to remember that spawn {} creates a closure, just like function creates a closure with a separate scope.

This means the programmer has to remember one extra rule. The question is: is it worth the additional “function” characters?
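
The scoping difference can be checked in present-day PHP without any new syntax. This is only a sketch of the existing rules, using an ordinary closure in place of the proposed spawn {} block; all variable names are illustrative:

```php
<?php
declare(strict_types=1);

$x = 5;

// An if block shares the enclosing scope, so this changes the outer $x.
if (true) {
    $x++;
}
$afterIf = $x; // 6

// A closure has its own scope; use ($x) copies the current value into it.
$byValue = function () use ($x): int {
    $x++;       // increments the closure's private copy only
    return $x;
};
$insideClosure = $byValue(); // 7
$afterByValue = $x;          // still 6

// Capturing by reference is the explicit way to touch the outer variable.
$byReference = function () use (&$x): void {
    $x++;
};
$byReference();
$afterByReference = $x; // 7
```

If spawn {} compiled to a closure, a reader would have to know which of these behaviours applies to $x++ inside the braces, with nothing in the syntax to say so.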

Though, I don’t mind spawn fn {};—this option takes the best of everything. But if we implement it, I would also introduce the fn() {} syntax.

spawn fn use($x) {
...
};

Apart from violating the language’s style, I don’t see any drawbacks for this.

Rowan_Tommins_IMSoP · March 19, 2025, 11:04pm

On 19/03/2025 19:24, Edmond Dantes wrote:

> Yes, that would probably be a bad choice as well. Which is why I've repeatedly suggested a different keyword, and AFAIK you still haven't actually voiced an opinion on that.

Does this concern the syntax of `spawn block {}` or did I miss something?

We're talking around in circles a lot here, I think, let's try to reset and list out a load of different options, as abstractly as possible.

I'm going to use the word "keyword" in place of "spawn", just to separate *syntax* from *naming*; and where possible, I'm going to use the names from the current grammar, not any other placeholders.

1: keyword expr
2: keyword function_call
3: keyword expr_except_function_call
4: keyword inline_function
5: keyword_foo expr
6: keyword_bar function_call
7: keyword '{' inner_statement_list '}'

#1 is the most flexible: you have some way of specifying a callable value, which you pass to the engine, maybe with some arguments (concrete example: "spawn $my_closure;")

#2 is the most concise for the common case of using an existing function, because you don't need to make a callable / closure pointing to the function first (concrete example: "spawn my_function('my argument');")

BUT these two rules conflict: any function_call is also an expr, so we can't say "if it's an expr, do this; if it's a function_call, do that", because both are true at once.
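
To make the conflict concrete, here is a sketch in present-day PHP with the keyword left out. make_task() is a hypothetical function that returns a closure; the two readings of "keyword make_task()" would run different code in the new coroutine:

```php
<?php
declare(strict_types=1);

$log = [];

// Hypothetical function: does some setup, then returns the real work as a closure.
function make_task(array &$log): Closure
{
    $log[] = 'make_task ran';
    return function () use (&$log): void {
        $log[] = 'task body ran';
    };
}

// Reading under rule #2 (function_call): the call to make_task() is the work.
// Only the setup runs; the returned closure is a result nobody has invoked.
$returned = make_task($log);
$afterRule2 = $log; // ['make_task ran']

// Reading under rule #1 (expr): the expression is evaluated to get a callable,
// and that callable is the work. Both the setup and the body run.
$log = [];
$callable = make_task($log);
$callable();
$afterRule1 = $log; // ['make_task ran', 'task body ran']
```

The same source text gives two different logs, which is why a parser cannot offer both rules under one keyword.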

The next four are compromises to work around that conflict:

#3 is a version of #1 which doesn't conflict with #2 - introduce a new grammar rule which has all the things in expr, but not function_call. This is technically fine, but maybe confusing to the user, because "spawn $foo == spawn $foo()", but "spawn foo() != spawn foo()()", and "spawn $obj->foo != spawn $obj->foo()".

#4 is really the same principle, but restricted even further - the only expression allowed is the declaration of an inline function. This is less confusing, but has one surprising effect: if you refactor the inline function to be a variable, you have to replace it with "$foo()" not just "$foo", so that you hit rule #2

#5 and #6 are the "different keywords" options: they don't conflict with each other, and #1 could be used with #6, or #2 with #5. In concrete terms, you can have "spawn_func_call $foo() == spawn_closure $foo", or "spawn $foo() == spawn_closure $foo", or "spawn_func_call $foo() == spawn $foo".

#7 is the odd one out: it hides the closure from the user completely. It could be combined with any of the others, but most likely with #2

In PHP, code blocks are not all the same.
- `if` / `else` / `try` / `switch` **do not create a new scope**.
- `function` **does create a new scope**.

When a programmer writes:
$x = 5;
spawn {$x++};
Will they easily understand that `$x++` is not modifying the same `$x` as before?
No, they won’t. They will have to remember that `spawn {}` creates a closure, just like `function` creates a closure with a separate scope.

OK, thanks; I think this is the point that I was missing - I was thinking that the engine creating a closure was just an implementation detail, which could be made invisible to the user. But you're right, the way variable scope works would be completely unlike anything else in the language. That probably rules out #7

But I do want to come back to the question I asked in my last e-mail: what is the use case we're trying to cater for?

If the aim is "a concise way to wrap a few statements", we've already failed if the user's writing out "function" and a "use" clause.

If the aim is "a readable way to use a closure", rule #2 is fine.

Yes, it means some extra parentheses if you squeeze it all into one statement, but it's probably more readable to assign the closure to a temporary variable anyway:

// Legal under rule #2, but ugly
spawn (function() use($whatever) {
    do_something($whatever);
})();

// Requires rule #4, saves a few brackets
spawn function() use($whatever) {
    do_something($whatever);
};

// Only needs rule #2, and perfectly readable
$foo = function() use($whatever) {
    do_something($whatever);
};
spawn $foo();

--
Rowan Tommins
[IMSoP]

Crell · March 20, 2025, 4:56am

First, side note: When I said "Tim" in my earlier messages, I was in fact referring to Rowan. I do not know why I confused Tim and Rowan. My apologies to both Tim and Rowan for the confusion.

On Tue, Mar 18, 2025, at 2:26 AM, Edmond Dantes wrote:

Hello, Larry.

First off, it desperately needs an "executive summary" section up at the top.
There's a *lot* going on, and having a big-picture overview would help a ton. (For
examples, see property hooks[1] and pattern matching[2].)

I looked at the examples you provided, but I still don't understand
what exactly I could put in this section.
Key usage examples without explanation?
Do you think that would make the RFC better? I don’t really understand
how.

For a large RFC like this, it's helpful to get a surface-level "map" of the new feature first. What it is trying to solve, and how it solves it. Basically what would be the first few paragraphs of the documentation page, with an example or three. That way, a reader can get a sort of mental boundary of what's being discussed, and then can refer back to that when later sections go into all the nitty gritty details (which are still needed). As is, there's "new" syntax being introduced for the first time halfway through the document. I have a hard time mentally fitting that into the model that the previous paragraphs built in my head.

So as an outline, I would recommend:

* Statement of problem being solved
* Brief but complete overview of the new syntax being introduced, with minimal/basic explanation
* Theory / philosophy / background that people should know (eg, the top-down/bottom-up discussion)
* Detailed dive into the syntax and how it all fits together, and the edge cases
* Implementation related details (this should be the first place you mention the Scheduler that doesn't exist, or whatever)

Second, please include realistic examples. Nearly all of the examples are contrived,

Which examples do you consider contrived, and why?

The vast majority of the examples are "print Hello World" and "sleep()", in various combinations. That doesn't tell me how to use the syntax for more realistic examples. There's a place for trivial examples, definitely, but also for "so what would I do with it, really?" examples. It shows what you expect the "best practices" to be, that you're designing for.

The first non-foobar example includes a comment "of course you should
never do it like this", which makes the example rather useless

Do you mean working with a closure that captures a reference to `$this`?
But that has nothing to do with this RFC, either directly or
indirectly. And it’s not relevant to the purpose of the example.

The Coroutine Scope Lifetime example. It says after it:

"Note: This example contains a circular dependency between objects, which should be avoided in real-world development."

Which, as above, means it is not helpful in telling me what real-world development with this feature would look like. How *should* I address the use case in that example, if not with, well, that example? That's unclear.

And the second is
built around a code model that I would never, ever accept into a code base, so it's
again unhelpful.

Why?

The second code block under "Coroutine local context". It's all a series of functions that call each other to end up on a DB factory that uses a static variable, so nothing there is injectable or testable. It has the same "should be avoided in real-world development" problem, which means it doesn't tell me anything useful about when/why I'd want to use local context, whatever that is.

So would the functions/keywords be shortcuts for
some of the common functionality of a Scope object?

Were you confused by the fact that the `Scope` object has a `spawn`
method?
(Which is semantically close to the operator?)
I understand that having both a method and an operator can create
ambiguity, but it's quite surprising that it could be so confusing.

Yes, that, for example. It suggested in my mind that the `spawn` keyword would be an alias for currentScope()->spawn(). Whether that would be wise or not I don't know, but if that wasn't your intent, then it was confusing.

How should I have written about this? It's simply a part of reality as
it is. Why did this cause confusion?
Yes, I split this RFC into several parts because it's the only way to
decompose it.
It’s logical that this needs to be mentioned so that those who haven’t
followed the discussion can have the right understanding.
What’s wrong with that?

I am fully in favor of explicitly "linking" RFCs together, such that each sub-feature can be discussed separately. However, each RFC needs to, on its own, be self-contained and useful. Maybe less useful than if the whole set were passed, but at least useful on its own.

Sometimes that means the individual RFCs are still quite large (eg, property hooks, which was split from aviz). Other times they're quite small (eg, pipes, which is part one of like 4, all noted in Future Scope). But each RFC needs to "work" on its own.

Consider: Suppose this RFC passed, but the follow-up to add a Scheduler did not pass, for whatever reason. What does that leave us with? I think it means we have an approved RFC that cannot be implemented. That's not-good.

If this RFC adds a keyword "async" or "spawn" or whatever we end up with, then that keyword needs to be able to do something useful with just the functionality approved in this RFC. Some additional functionality may not be available until a later RFC, which makes the whole thing better, but it at least still does something useful with the approved RFC.

For example, I suspect all the context values bits could be punted to a future RFC. Yes, that means some things won't be possible in the 1.0 version of the feature; that's OK. It may mean designing the syntax in such a way that it will evolve cleanly to include it later. That's also OK. But whatever plumbing needs to exist for the user-facing functionality to work (eg, the scheduler) needs to be there at the same time the user-facing functionality is.

Are those
standard terms you're borrowing from elsewhere, or your own creation?

Unfortunately, I couldn't find terminology that describes these models.
However, the RFC itself provides definitions. What is confusing to you?

Mainly that I wasn't sure if this was drawing on existing literature and I should be googling for these terms or not. If there is no existing literature then defining your own terms is fine, just be clear that's what you're doing.

I cannot really tell which one the "playpen" model
would fit into.

If we're talking about the "nursery" model in Python, there is no
direct analogy because a *nursery is not a coroutine*, but rather a
*Scope* in this RFC.

Yes, I meant nursery. (I have a 6 month old baby; all these baby-related terms blend together in my head.)

In this context, the model in the RFC is essentially no different from
nurseries in Python.

That was not at all evident to me from reading it.

The key difference lies elsewhere.
To define the structure, two elements are used:

• The coroutine itself
• An object of type *Nursery* or *Scope*
So in Python, coroutines work exactly the same way as in Go, and the
*nursery* is an additional mechanism to ensure structured concurrency.

In this RFC, there is a *nursery (Scope)*, but in addition to that,
*coroutines themselves are also part of the structure*.
(So this RFC uses a stricter approach than Python)

Does this mean it's not clear from the text?

Yes, because I didn't get that message from it. (This is where a 10,000 foot overview early on would be helpful.)

I honestly cannot see a use case at this point for starting coroutines in arbitrary scopes.

If we're talking about launching a coroutine in *GlobalScope*, then
it's 99% likely to be an *anti-pattern*, and it might be worth removing
entirely. It's the same as creating a global variable using `$GLOBALS`.
However, if we're referring to a *pattern where a service defines its
own `$scope`*, then this is probably one of the most useful aspects of
this RFC.

This is where more practical examples would be helpful. Eg, when and why would a service define its own scope, and what are the implications?

Elsewhere in the thread, Tim noted that
we should unify the function call vs closure question.

It would be great if this were possible, but so far, I haven't found a
syntax that satisfies both requirements.

Of course, the syntax could be unified like this:
spawn function($params): string use() {
}('param');
and
function test($x);
spawn test($x);
But it looks unnatural...

Right now, I'm mentally close to the approach that Rowan_Tommins also
described:
spawn use($parameters): string {};
spawn test($x);

Let me ask this: With the spawn/start/whatever keyword, what is the expected return value? Does it block until that's done? Do I get back a future?

If the mental model is "take a function call like you already had and stick a 'start new coroutine' keyword on it", then that leads to one set of assumptions and therefore syntax behavior. If it's more like "fork a subprocess that I can communicate with later", that leads to a different set of assumptions.

On Tue, Mar 18, 2025, at 2:59 AM, Rowan Tommins [IMSoP] wrote:

I actually think what you're describing is very similar to the RFC,
just with different syntax; but your examples are different, so you're
talking past each other a bit.

That is quite possible. Given the other comments above, I'd say likely.

The "request handler" use case could easily benefit from a
"pseudo-global" scope for each request - i.e. "tie this to the current
request, but not to anything else that's started a scope in between".

Potentially, though I would still question why it cannot just be explicitly passed values. (All data flow is explicit; sometimes it's painfully explicit.)

--Larry Garfield

EdmondDantes · March 20, 2025, 7:06am

This is simply a wonderful explanation. I will be able to go through each point.

But before that, let’s recall what spawn essentially is.
Spawn is an operation that creates a separate execution context and then calls a function within it.
To perform this, spawn requires two things:

callable – something that can be called; this is an expression or the result of an expression.
argument list – a list of arguments.

1: keyword expr

Then, this construct is a special case of another one:
keyword expr argument_list

However, PHP already has an expression that includes expr argument_list—this is function_call.
Therefore, the keyword function_call variant is inherently a valid and complete form that covers all possible cases.

So in the general case, keyword expr does not have a meaningful interpretation and does not necessarily need to be considered, especially if it leads to a contradiction.

In other words:
Option #1 is a special case.
Option #2 is the general case.
So, Option #2 should be our Option #1 because it describes everything.

3: keyword expr_except_function_call

If this expression is intended to call a closure, then essentially it is almost the same as #1.
That means all our conclusions about #1 also apply to this option.

4: keyword inline_function

This option can be considered a special case of option #2. And that’s exactly the case.

This is less confusing, but has one surprising effect: if you refactor the inline function to be a variable, you have to replace it with “$foo()” not just “$foo”, so that you hit rule #2

A completely logical transformation that does not contradict anything. If I want to use a variable, this means:

I want to define a closure at point A.
I want to use it at point B.
Point A does not know where point B is.
Point B does not know what arguments A will have.
Therefore, I need to define a list of arguments to explicitly state what I want to do.

The meaning of option #4 is different:

I want to define a closure at point A.
I want to use it at point A.
Point A knows what the closure looks like, so there is no need to define arguments — it’s the same place in the code.

Therefore, `keyword closure` does not break the clarity of the description, whereas `keyword $something` does.

5: keyword_foo expr

The same as #1.

6: keyword_bar function_call

This contains even more characters than the original and yet adds nothing useful.

7: keyword ‘{’ inner_statement_list ‘}’

Let me add another nail to the coffin of this option:

class Some {
    public function someMethod()
    {
        spawn static function() {
            ...
        };
    }
}

Another point in favor of option #4 is the static keyword, which is also part of the closure.

But I do want to come back to the question I asked in my last e-mail: what is the use case we’re trying to cater for?

Goal #1: Improve code readability. Make it easier to understand.
Goal #2: Reduce the number of characters.

The spawn keyword simplifies constructs by removing two parentheses and commas. For example:

spawn file_get_contents("file");

vs
spawn("file_get_contents", "file");

The first option looks as natural as possible, and spawn is perceived as a prefix. And that’s exactly what it is — it essentially works like a prefix operation.
In other words, its appearance reflects its behavior, assuming, of course, that we are used to reading from left to right.
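
For comparison, here is a rough sketch of what the function-style form amounts to in present-day PHP. Everything in it is an assumption made for illustration: spawn_call() and shout() are hypothetical names, and Fiber (PHP 8.1+) is used only as a stand-in for "a separate execution context", not as the RFC's implementation:

```php
<?php
declare(strict_types=1);

// Hypothetical user function standing in for any existing function.
function shout(string $text): string
{
    return strtoupper($text);
}

// Hypothetical function-style API: the callable and its arguments are
// passed separately, as in spawn("file_get_contents", "file").
// Fiber is only a stand-in for a new execution context here;
// without a scheduler it simply runs to completion on start().
function spawn_call(callable $fn, mixed ...$args): Fiber
{
    $fiber = new Fiber($fn);
    $fiber->start(...$args); // runs until the callable suspends or returns
    return $fiber;
}

$fiber = spawn_call('shout', 'file');
$result = $fiber->getReturn(); // 'FILE'
```

With the keyword form the same call would read spawn shout('file'); the function name and its arguments keep their usual call shape instead of being split into a string and a trailing list.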

If the aim is “a concise way to wrap a few statements”, we’ve already failed if the user’s writing out “function” and a “use” clause.

Yes, but the function ... use syntax was not invented by us. We can’t blame ourselves for something we didn’t create.
If closures in PHP were more concise, then spawn + closure would also look shorter.
We cannot take responsibility for this part.

If the aim is “a readable way to use a closure”, rule #2 is fine.

Great. All that’s left is to approve option #4

but it’s probably more readable to assign the closure to a temporary variable anyway

In this case, we increase the verbosity of the code and force the programmer to create an unnecessary variable (not good).

The advantage of option #4 is not just that it removes parentheses, but also that it keeps the code readable.
It’s the same as (new Class(...))->method() — the point is not about saving keystrokes (after all, nowadays ChatGPT writes the code anyway), but about the level of readability. It improves readability by about 1.5 times.
That’s the key achievement.

The second reason why option #4 makes sense: it will be used frequently.
And that means the programmer will often create unnecessary variables.
Do you really want that?

For example, spawn fn() => file_get_contents() won’t be used often, because it doesn’t make sense.