[PHP-DEV] PHP True Async RFC

Rowan_Tommins_IMSoP · March 5, 2025, 8:39am

On 4 March 2025 18:36:37 GMT, Larry Garfield <larry@garfieldtech.com> wrote:

PHP doesn't have Python-style context managers (though I would like them)

So would I, I've actually thought about it a lot...

But more importantly, this highlights something important about that Python library: it is built *on top of* a native async/await system which is baked into the language (note the example uses "async with", not normal "with").

That reinforces my earlier feeling that this RFC is trying to do far too much at once - it's not just about "low-level vs high-level", there's multiple whole features here:

- asynchronous versions of native functions, and presumably a C API for writing those in extensions
- facilities for writing coroutines (async/await, but not as keywords)
- deferrable "microtasks"
- event/signal handling functionality
- communication between threads/fibers via Channels
- a facility for launching coroutines concurrently (as Python demonstrates, this can be separate from how the coroutines themselves are written)
- maybe more that I've overlooked while trying to digest the RFC

Having all of those would be amazing, but every one of them deserves its own discussion, and several can be left to userland or as future scope in an initial implementation.

Rowan Tommins
[IMSoP]

EdmondDantes · March 5, 2025, 9:37am

Good day, Larry.

First off, as others have said, thank you for a thorough and detailed proposal.

Thanks!

A series of free-standing functions.

That only work if the scheduler is active.

The scheduler being active is a run-once global flag.

So code that uses those functions is only useful based on a global state not present in that function.

And a host of other seemingly low-level objects that have a myriad of methods on them that do, um, stuff.

Oh, and a lot of static methods, too, instead of free-standing functions.

Suppose these shortcomings don’t exist, and we have implemented the boldest scenario imaginable. We introduce Structured Concurrency, remove low-level elements, and possibly even get rid of Future. Of course, there are no functions like startScheduler or anything like that.

1. In this case, how should PHP handle Fiber and all the behavior associated with it? Should Fiber be declared deprecated and removed from the language? What should the flow be?
2. What should be done with I/O functions? Should they remain blocking, with a separate API provided as an extension?
3. Would it be possible to convince the maintainers of XDEBUG and other extensions to rewrite their code to support the new model? (If you’re reading this question now, please share your opinion.)
4. If transparent concurrency is introduced for I/O in point 2, what should be done with Revolt + AMPHP? This would break their code. Should an additional function or option be introduced to switch PHP into “legacy mode”?

I share your feelings on many points, but I would like to see some real-world alternative.

I commend to your attention this post about a Python async library

Structured concurrency is a great thing. However, I’d like to avoid changing the language syntax and make something closer to Go’s semantics. I’ll think about it and add this idea to my TODO.

async $context {
// $context is an object of AsyncContext, and can be passed around as such.
// It is the only way to span anything async, or interact with the async controls.
// If a function doesn’t take an AsyncContext param, it cannot control async. This is good.

This is a very elegant solution. Theoretically.

However, in practice, if you require explicitly passing the context to all functions, it leads to the following consequences:

1. The semantics of all functions increase by one additional parameter (Signature bloat).
2. If an asynchronous call needs to be added to a function, and other functions depend on it, then the semantics of all dependent functions must be changed as well.

In strict languages, a hybrid model is often used, or like in Go, where the context is passed explicitly as a synchronization object, but only when necessary.

In this example, there is another aspect: the fact that async execution is explicitly limited to a specific scope. This is essentially the same as startScheduler, and it is one of the options I was considering.

Of course, startScheduler can be replaced with a construction like async(function() { ... }).
This means that async execution is only active within the closure, and coroutines can only be created inside that closure.

This is one of the semantic solutions that allows removing startScheduler, but at the implementation level, it is exactly the same.

What do you think about this?

I’m not convinced that sticking arbitrary key/value pairs into the Context object is wise;

Why not?

that’s global state by another name

Static variables inside a function are also global state. Are you against static variables?

But if we must, the above would handle all the inheritance and override stuff quite naturally. Possibly with:

How will a context with open string keys help preserve service data that the service doesn’t want to expose to anyone? The Key() solution is essentially the same as Symbol in JS, which is used for the same purpose. Of course, we could add a coroutine static $var construct to the language syntax. But that is still just syntactic sugar that would require more code to support.

[$in, $out] = Channel::create($buffer_size);

These semantics require the programmer to remember that two variables actually point to the same object. If a function has multiple channels, this makes the code quite verbose. Additionally, such channels are inconvenient to store in lists because their structure becomes more complex.

I would suggest a slightly different solution:


$in = new Channel()->getProducer();
async myFunction($in->getConsumer());

These semantics do not restrict the programmer in usage patterns while still allowing interaction with the channel through a well-defined contract.

Thanks for the great examples, and a special thanks for the article.

I also like the definition of context.
Ed

EdmondDantes · March 5, 2025, 10:30am

Hello, Eugene!

What I think it could be.

async function baz(): int {
$foo = foo();
$bar = bar();

return $foo + $bar;
}

// value is returned just like from any other ordinary function
$val = baz();

If we have code like $x + $y, and in one block it follows rule 1 while in another block it follows rule 2, this increases the complexity of the language. The worst part is that the same operators exhibit DIFFERENT behavior in different contexts. This violates semantic integrity. (A similar issue occurred in C++ with operator overloading, where a theoretically elegant solution turned out to be terrible in practice).

If you want to achieve a clean syntax for concurrency in PHP, I would suggest considering pipes in the long run.

For example:

|> $users = getUsers() ||| $orders = getOrders()
|> mergeByColumn($users, $orders, 'orders')

–

Ed.

bukka · March 5, 2025, 10:58am

Hi,

https://wiki.php.net/rfc/true_async

I believe this version is not perfect and requires analysis. And I strongly believe that things like this shouldn’t be developed in isolation. So, if you think any important (or even minor) aspects have been overlooked, please bring them to attention.

I thought about this quite a bit and I think we should first try to clarify the primary design that we want to go for. What I mean is whether we would ever like to support true concurrency (threads) in it. If we think it would be worth it (even though it wouldn’t be initially supported), then we should take it into account from the beginning and add restrictions to prevent race conditions. That probably means disallowing global variables (e.g. global $var;), or at least making them context specific, as well as disallowing object sharing. I think PHP users should not have to deal with synchronization primitives. Basically, what I want to say is that multithreading should not be just something mentioned in the future scope; the whole design should be done in a way that makes sure everything will work fine for users. Ideally, there would also be some simplified implementation that verifies it.

I also agree that the scope is currently too big. It should be reduced to the absolute minimum and just show what’s possible. It’s great to have a proof of concept for that but the initial proposal should be mainly about the design and introducing the core components.

Regards,

Jakub

EdmondDantes · March 5, 2025, 12:23pm

Hello, Jakub.

I thought about this quite a bit and I think we should first try to clarify the primary design that we want to go for.
What I mean is whether we would like to ever support a true concurrency (threads) in it.
If we think it would be worth it (even though it wouldn’t be initially supported), then we should take it into account from the beginning and add restrictions to prevent race conditions.

If you mean multitasking, i.e., executing coroutines in different OS threads, then this feature is far beyond the scope of this RFC and would require significant changes to the PHP core (Memory manager first).
And even if we imagine that such changes are made, eliminating data races without breaking the language is, to put it mildly, a questionable task from the current perspective.

Although this RFC raises the question of whether a concurrent version without multitasking is worth implementing at all, my opinion is positive. For PHP, this could be sufficient as a language primarily used in the context of asynchronous I/O, whereas a multitasking version may never happen.

I will likely pose this as a direct question in the final part of this RFC. Thanks!

Ed.

bukka · March 5, 2025, 4:55pm

Hi,

I thought about this quite a bit and I think we should first try to clarify the primary design that we want to go for.
What I mean is whether we would like to ever support a true concurrency (threads) in it.
If we think it would be worth it (even though it wouldn’t be initially supported), then we should take it into account from the beginning and add restrictions to prevent race conditions.

If you mean multitasking, i.e., executing coroutines in different OS threads, then this feature is far beyond the scope of this RFC and would require significant changes to the PHP core (Memory manager first).

You might want to look at the parallel extension, as it’s already dealing with that and mostly works. Of course, combining it with coroutines will certainly complicate things, but the point is that memory is not shared.

And even if we imagine that such changes are made, eliminating data races without breaking the language is, to put it mildly, a questionable task from the current perspective.

That’s exactly what I meant: to make sure that there won’t be any data races, which means using only channel communication (more below).

Although this RFC raises the question of whether a concurrent version without multitasking is worth implementing at all, my opinion is positive. For PHP, this could be sufficient as a language primarily used in the context of asynchronous I/O, whereas a multitasking version may never happen.

I didn’t really mean to introduce it as part of this RFC. What I meant is to design the API so that there is still a possibility to add it in the future without risking various race conditions in the code. That primarily means putting certain restrictions in place to prevent them, such as limiting access to globals and the passing of anything by reference (including objects) to the running tasks.

Regards

Jakub

EdmondDantes · March 5, 2025, 5:50pm

You might want to look to parallel extension as it’s already dealing
with that and mostly works - of course combination with coroutines will certainly complicate it but the point is that memory is not shared.

Do you mean this extension: https://www.php.net/manual/en/book.parallel.php?

Yes, I studied it before starting the development of True Async. My very first goal was to enable, if not coroutine execution across threads, then at least interaction, because the same thing is already possible with Swoole.

However, parallel does not provide multitasking and will not be able to in the future. The best it can offer is interaction via Channel between two different threads, which will be useful for job processing and built-in web servers.

And here’s the frustrating part. It turns out that parallel has to copy PHP bytecode for correct execution in another thread. This means that not only does the memory manager need to be replaced with a multi-threaded version, but the virtual machine itself must also be refactored.

For PHP to work correctly in multiple threads with context switching, it will be necessary to find a way to rewrite all the code that reads/writes global variables. (Where TLS macros are used, this shouldn’t be too difficult. But is that the case everywhere?) This applies to all extensions, both built-in and third-party.

Such a language update would create an “extension vacuum.” When a new version is released, many extensions will become unavailable due to the need to adapt to the new multitasking model.

I didn’t really mean to introduce it as part of this RFC.
What I meant is to design the API so there is still possibility to add it in the future without risking various race condition in the code.
It means primarily to put certain restrictions that will prevent it like limited access to global and passing anything by
reference (including objects) to the running tasks.

Primitives like Context are unlikely to be the main issue for multitasking. The main problem will be the code that has been developed for many years with single-threaded execution in mind. This is another factor that raises doubts about the rationale for introducing real multitasking in PHP.

If we are talking about a model similar to Python’s, the current RFC already works with it, as a separate thread is used on Windows to wait for processes and send events to the PHP thread. Therefore, integrating this RFC with parallel is not an issue.

It would be great to solve the bytecode problem in a way that allows it to be freely executed across different threads. This would enable running any closure as a coroutine in another OS thread and interacting with it through a channel. If this is the functionality you mean, it does not block concurrency or the current RFC. This capability should be considered an additional feature that can be implemented later without modifying the existing primitives.

While working on this RFC, I also considered finding a way to create SharedObject instances that could be passed between threads. However, I ultimately concluded that this solution would require changes to the memory manager, so these objects were not included in the final document.

Ed.

Crell · March 5, 2025, 6:30pm

On Wed, Mar 5, 2025, at 3:37 AM, Edmond Dantes wrote:

Good day, Larry.

First off, as others have said, thank you for a thorough and detailed proposal.

Thanks!

* A series of free-standing functions.
* That only work if the scheduler is active.
* The scheduler being active is a run-once global flag.
* So code that uses those functions is only useful based on a global state not present in that function.
* And a host of other seemingly low-level objects that have a myriad of methods on them that do, um, stuff.
* Oh, and a lot of static methods, too, instead of free-standing functions.

Suppose these shortcomings don’t exist, and we have implemented the
boldest scenario imaginable. We introduce Structured Concurrency,
remove low-level elements, and possibly even get rid of `Future`. Of
course, there are no functions like `startScheduler` or anything like
that.

1. In this case, how should PHP handle `Fiber` and all the behavior
associated with it? Should `Fiber` be declared deprecated and removed
from the language? What should the flow be?

I'm not sure yet. I was quite hesitant about Fibers when they went in because they were so low-level, but the authors were confident that it was enough for a user-space toolchain to be iterated on quickly that everyone could use. That clearly didn't pan out as intended (Revolt exists, but usage of it is still rare), so here we are with a half-finished API.

Thinking aloud, perhaps we could cause `new Fiber` to create an automatic async block? Or we do deprecate it and discourage its use. Something to think through, certainly.

2. What should be done with I/O functions? Should they remain
blocking, with a separate API provided as an extension?

The fact that IO functions become transparently async when appropriate is the best part of the current RFC. Please keep that.

3. Would it be possible to convince the maintainers of XDEBUG and
other extensions to rewrite their code to support the new model? ( *If
you're reading this question now, please share your opinion.* )

I cannot speak for Derick.

4. If transparent concurrency is introduced for I/O in point 2, what
should be done with `Revolt` + `AMPHP`? This would break their code.
Should an additional function or option be introduced to switch PHP
into "legacy mode"?

Also an excellent question, to which I do not yet have an answer. (See previous point about Fibers being half-complete.) I would want to involve Aaron, Christian, and Cees-Jan before trying to make any suggestions here.

Structured concurrency is a great thing. However, I’d like to avoid
changing the language syntax and make something closer to Go’s
semantics. I’ll think about it and add this idea to my TODO.

Well, as noted in the article, structured concurrency done right means *not* having unstructured concurrency. Having Go-style async and then building a structured nursery system on top of it means you cannot have any of the guarantees of the structured approach, because the other one is still poking out the side and leaking. We're already stuck with mutable-by-default, global variables, and other things that prevent us from making helpful assumptions. Please, let's try to avoid that for async. We don't need more gotos.

async $context {
// $context is an object of AsyncContext, and can be passed around as such.
// It is the *only* way to span anything async, or interact with the async controls.
// If a function doesn't take an AsyncContext param, it cannot control async. This is good.

This is a very elegant solution. Theoretically.

However, in practice, if you require explicitly passing the context to
all functions, it leads to the following consequences:

1. The semantics of all functions increase by one additional parameter
(*Signature bloat*).

No, just those functions/objects that necessarily involve running async control commands. Most wouldn't. They would just silently context switch when they hit an IO operation (which as noted above is transparently supported, which is what makes this work) and otherwise behave the same.

But if something does actively need to do async stuff, it should have a context to work within. It's the same discussion as:

A: "Pass/inject a DB connection to a class that needs it, don't just call a global db() function."
B: "But then I have to pass it to all these places explicitly!"
A: "That's a sign your SQL is too scattered around the code base. Fix that first and your problem goes away."

Explicit flow control is how you avoid bugs. It's also self-documenting, as it's patently obvious what code expects to run in an async context and which doesn't care.

2. If an asynchronous call needs to be added to a function, and other
functions depend on it, then the semantics of all dependent functions
must be changed as well.

This is no different than DI of any other service. I have restructured code to handle temporary contexts before. (My AttributeUtils and Serde libraries.) The result was... much better code than I had before. I'm glad I made those refactors.

In this example, there is another aspect: the fact that async execution
is explicitly limited to a specific scope. This is essentially the same
as `startScheduler`, and it is one of the options I was considering.

Of course, `startScheduler` can be replaced with a construction like
`async(function() { ... })`.
This means that async execution is only active within the closure, and
coroutines can only be created inside that closure.

This is one of the semantic solutions that allows removing
`startScheduler`, but at the implementation level, it is exactly the
same.

What do you think about this?

That looks mostly like the async block syntax I proposed, spelled differently. The main difference is that the body of the wrapped function would need to explicitly `use` any variables from scope that it wanted, rather than getting them implicitly. Whether that's good or bad is probably subjective.

But it would allow for a syntax like this for the context, which is quite similar to how database transactions are often done:

$val = async(function(AsyncContext $ctx) use ($stuff, $fn) {
  $result = [];
  foreach ($stuff as $item) {
    $result[] = $ctx->run($fn);
  }

  // We block/wait here until all subtasks are complete, then the async() call returns this value.
  return $result;
});

And of course in both cases you could use a pre-defined callable instead of inlining one. At this point I think it's mostly a stylistic difference, function vs block.
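
A minimal sketch of the "block until all subtasks are complete" behaviour described above, built on the existing Fiber class. The names AsyncContext, run(), drain() and async() are hypothetical and are not taken from the RFC. Here run() only schedules a task and does not return its result, Fiber::suspend() stands in for a non-blocking I/O wait, and exceptions and cancellation are not handled.

```php
<?php
declare(strict_types=1);

// Hypothetical context object: the only way to spawn subtasks.
final class AsyncContext
{
    /** @var Fiber[] tasks that have suspended and still need to finish */
    private array $pending = [];

    // Start a task immediately; remember it if it suspended before finishing.
    public function run(callable $task, mixed ...$args): void
    {
        $fiber = new Fiber($task);
        $fiber->start(...$args);
        if (!$fiber->isTerminated()) {
            $this->pending[] = $fiber;
        }
    }

    // Round-robin over suspended tasks until every one has terminated.
    public function drain(): void
    {
        while ($this->pending !== []) {
            $fiber = array_shift($this->pending);
            $fiber->resume();
            if (!$fiber->isTerminated()) {
                $this->pending[] = $fiber;
            }
        }
    }
}

// Hypothetical async(): returns only after the body and all its subtasks finish.
function async(callable $body): mixed
{
    $ctx = new AsyncContext();
    $result = $body($ctx);
    $ctx->drain();
    return $result;
}

$log = [];
$val = async(function (AsyncContext $ctx) use (&$log): string {
    foreach (['a', 'b'] as $name) {
        $ctx->run(function (string $name) use (&$log): void {
            $log[] = "$name: step 1";
            Fiber::suspend(); // stands in for a non-blocking I/O wait
            $log[] = "$name: step 2";
        }, $name);
    }
    $log[] = 'body returned';
    return 'done';
});
```

In this sketch the body's return statement runs before the subtasks finish, but the caller only receives the value once drain() has completed, so no subtask outlives the async() call.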

I'm not convinced that sticking arbitrary key/value pairs into the Context object is wise;

Why not?

that's global state by another name

Static variables inside a function are also global state. Are you
against static variables?

Vocally, in fact.

But if we must, the above would handle all the inheritance and override stuff quite naturally. Possibly with:

How will a context with open string keys help preserve service data
that the service doesn't want to expose to anyone? The `Key()` solution
is essentially the same as `Symbol` in JS, which is used for the same
purpose. Of course, we could add a `coroutine static $var` construct to
the language syntax. But it's all the same just syntactic sugar that
would require more code to support.

I cannot speak to JS Symbols as I haven't used them. I am just vehemently opposed to globals, no matter how many layers they're wrapped in. Most uses could be replaced by proper DI or partial application.

[$in, $out] = Channel::create($buffer_size);

This semantics require the programmer to remember that two variables
actually point to the same object. If a function has multiple channels,
this makes the code quite verbose. Additionally, such channels are
inconvenient to store in lists because their structure becomes more
complex.

I would suggest a slightly different solution:

$in = new Channel()->getProducer();
async myFunction($in->getConsumer());

This semantics do not restrict the programmer in usage patterns while
still allowing interaction with the channel through a well-defined
contract.

I'd go slightly differently if you wanted to go that route:

$ch = new Channel($buffer_size);
$in = $ch->producer();
$out = $ch->consumer();

// You do most interaction with $in and $out.

I could probably work with that as well.

(Or even just $ch->inPipe and $ch->outPipe, now that we have nice property support.)
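
A rough sketch of the "one channel, two ends" shape discussed above, assuming readonly properties for the two ends. The Channel, Producer and Consumer classes and their send()/receive() methods are hypothetical and are not taken from the RFC. This version never suspends: send() returns false when the buffer is full and receive() returns null when it is empty, where a real channel would presumably suspend the coroutine instead.

```php
<?php
declare(strict_types=1);

// Hypothetical write end: can only put values into the shared buffer.
final class Producer
{
    public function __construct(private SplQueue $buffer, private int $capacity) {}

    // Returns false instead of waiting when the buffer is full.
    public function send(mixed $value): bool
    {
        if (count($this->buffer) >= $this->capacity) {
            return false;
        }
        $this->buffer->enqueue($value);
        return true;
    }
}

// Hypothetical read end: can only take values out of the shared buffer.
final class Consumer
{
    public function __construct(private SplQueue $buffer) {}

    // Returns null instead of waiting when the buffer is empty.
    public function receive(): mixed
    {
        return $this->buffer->isEmpty() ? null : $this->buffer->dequeue();
    }
}

// The channel owns the buffer and exposes the two ends as readonly properties.
final class Channel
{
    public readonly Producer $inPipe;
    public readonly Consumer $outPipe;

    public function __construct(int $bufferSize)
    {
        $buffer = new SplQueue();
        $this->inPipe = new Producer($buffer, $bufferSize);
        $this->outPipe = new Consumer($buffer);
    }
}

$ch = new Channel(2);
$in = $ch->inPipe;
$out = $ch->outPipe;

$in->send('a');
$in->send('b');
$accepted = $in->send('c');   // buffer of size 2 is already full
$first = $out->receive();     // values come out in FIFO order
```

Code that receives only $in or only $out gets exactly one capability, and the channel itself can still be stored in a list as a single object.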

But the overall point, I think, is avoiding implicit modal logic. If my code doesn't need to care if it's in an async world, it doesn't care. If it does, then I need an explicit async world to work within, rather than relying on one implicitly existing, I hope. And I shouldn't have to think about "who owns this end of this channel". I just have an in and out hose I stick stuff into and pull out from, kthxbye.

Thanks for the great examples, and a special thanks for the article.
I also like the definition of context.

Ed

--Larry Garfield

Crell · March 5, 2025, 6:38pm

On Wed, Mar 5, 2025, at 11:50 AM, Edmond Dantes wrote:

You might want to look to parallel extension as it's already dealing
with that and mostly works - of course combination with coroutines will certainly complicate it but the point is that memory is not shared.

Do you mean this extension: https://www.php.net/manual/en/book.parallel.php?

Yes, I studied it before starting the development of True Async. My
very first goal was to enable, if not coroutine execution across
threads, then at least interaction, because the same thing is already
possible with Swoole.

*snip*

Such a language update would create an "extension vacuum." When a new
version is released, many extensions will become unavailable due to the
need to adapt to the new multitasking model.

It would necessitate a major version release, certainly.

I didn't really mean to introduce it as part of this RFC.
What I meant is to design the API so there is still possibility to add it in the future without risking various race condition in the code.
It means primarily to put certain restrictions that will prevent it like limited access to global and passing anything by
reference (including objects) to the running tasks.

Primitives like *Context* are unlikely to be the main issue for
multitasking. The main problem will be the code that has been developed
for many years with single-threaded execution in mind. This is another
factor that raises doubts about the rationale for introducing real
multitasking in PHP.

I think the point is more that the concurrency primitives that are introduced (async block, async() function, whatever it is) should be designed in such a way that PHP could introduce multiple parallel threads in the future to run multiple async blocks simultaneously... without any impact on the *user* code. To reuse my earlier example:

function parallel_map(iterable $it, Closure $fn) {
  $result = [];
  async $ctx {
    foreach ($it as $k => $v) {
      $result[$k] = $ctx->run($fn($v));
    }
  }
  return $result;
}

Whether each run() invocation is handled by one thread switching between them or 3 threads switching between them is not something the above code should care about. Which means designing that API in such a way that I... don't need to care. Which probably means something like "no inter-fiber communication other than channels", as in Go. And thinking through what it means for the context object if it does have some kind of global property bag. (This is one reason I don't want one.) And it means there's no way to control threads directly from user-space. You just get async blocks, and that's it. This is an area that Go got pretty solidly right, and is worth emulating.

The implications on C code of adding true-thread support in the future is a separate question; the async API should be built such that it *can* be a separate future question.

--Larry Garfield

EdmondDantes · March 5, 2025, 9:10pm

Thinking aloud, perhaps we could cause new Fiber to create an automatic async block?

The main issue with Fibers is their switching logic:
If you create Fiber A and call another Fiber B inside it, Fiber B can only return to the Fiber that created it, not just anywhere. However, the Scheduler requires entirely different behavior.
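
The switching rule above can be seen with the existing Fiber API. In this small sketch (the variable names are only for illustration), Fiber B is started from inside Fiber A, and when B suspends, control returns to A rather than to the top-level code or to any scheduler.

```php
<?php
declare(strict_types=1);

$log = [];

$fiberB = new Fiber(function () use (&$log): void {
    $log[] = 'B: started';
    Fiber::suspend();            // control goes back to whoever started or resumed B
    $log[] = 'B: resumed';
});

$fiberA = new Fiber(function () use (&$log, $fiberB): void {
    $log[] = 'A: started';
    $fiberB->start();            // B runs until it suspends, then we continue here in A
    $log[] = 'A: B suspended, control came back to A';
    Fiber::suspend();            // now A itself yields to the top-level code
    $log[] = 'A: resumed';
    $fiberB->resume();           // B finishes and control returns to A again
    $log[] = 'A: finished';
});

$fiberA->start();
$log[] = 'main: A suspended';
$fiberA->resume();
```

A scheduler needs the opposite: a suspended coroutine should hand control to the scheduler itself, regardless of which fiber happened to start it.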

This creates a conflict with the Scheduler. Moreover, it can even break the Scheduler if it operates based on Fibers. That’s why all these strange solutions in the RFC are just workarounds to somehow bypass this problem. But it seems we’ve already found an alternative solution.

I cannot speak for Derick.

Of course, I just mean that he probably won’t be happy about it

No, just those functions/objects that necessarily involve running async control commands. Most wouldn’t.
They would just silently context switch when they hit an IO operation (which as noted above is transparently supported, which is what makes this
work) and otherwise behave the same.

So it’s something more like Go or Python.

$val = async(function(AsyncContext $ctx) use ($stuff, $fn) {
  $result = [];
  foreach ($stuff as $item) {
    $result[] = $ctx->run($fn);
  }

  // We block/wait here until all subtasks are complete, then the async() call returns this value.
  return $result;
});

Do I understand correctly that at the point $val = async(function(AsyncContext $ctx) use ($stuff, $fn) execution stops until everything inside is completed?

If so, let me introduce a second semantic option (for now, I’ll remove the context and focus only on the function).

$url1 = 'https://domain1.com/';
$url2 = 'https://domain2.com/';

$url_handle = fn(string $url) => file_get_contents($url);

$res = Async\start(function() use ($url1, $url2, $url_handle) {
$res1 = Async\run($url_handle, $url1);
$res2 = Async\run($url_handle, $url2);

Async\run(fn() => sleep(5));

// some logic here

return $merged_result;
});

What’s Happening Here:

1. After calling `$res = Async\start()`, the code waits until the entire block completes.
2. Inside Async\start, the code waits for all nested coroutines to finish.
3. If a coroutine has other nested coroutines, the same rule applies.

Rules Inside an Asynchronous Block:

1. I/O functions do not block coroutines within the block.
2. Creating a new Fiber is not allowed: an exception will be thrown (you cannot use Fiber).
3. Unhandled exceptions will be thrown at the point of $res = Async\start().

Coroutine Cancellation Rules:

Canceling a coroutine cancels it and all its child coroutines (this cannot be bypassed unless the coroutine is created in a different context).

How does this option sound to you?

Essentially, this is Kotlin, but it should also resemble Python. However, unlike Kotlin, there are no special language constructs here—code blocks naturally serve that role. Of course, syntactic sugar can be added later for better readability.

And if you like this, I have good news: there are no implementation issues at this level.

In terms of semantic elegance, the only thing that bothers me is that return behavior is slightly altered — meaning the actual “return” won’t happen until all child functions complete. This isn’t very good, and Kotlin’s style would fit better here.

But on the other hand — can we live with this?

I cannot speak to JS Symbols as I haven’t used them.
I am just vehemently opposed to globals, no matter how many layers they’re wrapped in. Most uses could be replaced by proper DI or partial application.

You won’t be able to use DI because you have only one service (instance of class) for the entire application, not a separate service for each coroutine. This service is shared across the application and can be called from any coroutine. As a result, the service needs memory slots to store or retrieve data. DI is a mechanism used once during service initialization, not every time a method is called.

The only question is whether to use open text keys in the context, which is unsafe and can lead to collisions, or to use a unique key-object that is known only to the one who created it. (If PHP introduces object constants, this syntax would also look elegant.)
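
To make the key-object option concrete, here is a minimal sketch using WeakMap, assuming a per-coroutine Context object. The Key, Context and AuthService classes are hypothetical and are not the RFC's API. A slot can only be read by code that holds the exact Key instance, so a second key with the same description does not collide with it.

```php
<?php
declare(strict_types=1);

// Hypothetical key object: identity matters, the description is only for debugging.
final class Key
{
    public function __construct(public readonly string $description = '') {}
}

// Hypothetical per-coroutine context with object-keyed slots.
final class Context
{
    private WeakMap $slots;

    public function __construct()
    {
        $this->slots = new WeakMap();
    }

    public function set(Key $key, mixed $value): void
    {
        $this->slots[$key] = $value;
    }

    public function get(Key $key, mixed $default = null): mixed
    {
        return isset($this->slots[$key]) ? $this->slots[$key] : $default;
    }
}

// A single shared service that keeps its per-coroutine data in the context.
final class AuthService
{
    private Key $userKey;

    public function __construct()
    {
        $this->userKey = new Key('auth.user'); // never exposed outside the service
    }

    public function login(Context $ctx, string $user): void
    {
        $ctx->set($this->userKey, $user);
    }

    public function currentUser(Context $ctx): ?string
    {
        return $ctx->get($this->userKey);
    }
}

$ctx = new Context();
$auth = new AuthService();
$auth->login($ctx, 'alice');

$foreignKey = new Key('auth.user');          // same description, different object
$seenByService = $auth->currentUser($ctx);
$seenByOthers = $ctx->get($foreignKey);
```

Because the slots live in a WeakMap, an entry disappears once its Key object is no longer referenced anywhere else.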

There is, of course, another approach: making Context any arbitrary object defined by the user. But this solution has a known downside — lack of a standard interface.

(Or even just $ch->inPipe and $ch->outPipe, now that we have nice property support.)

Just a brilliant idea.

Have a good day!

Ed.

Morgan · March 5, 2025, 10:04pm

On 2025-03-05 11:54, Eugene Sidelnyk wrote:
> Hi there,
>
> I would also like to highlight some interesting ideas that I find being useful to consider.
>
> Recently Bend programming language has been released, and it incorporates a completely different view on the conception of "code", in the definition of "what it is" and "how it should be interpreted".
>
> While we interpret it as a sequence of instructions, the proper way of seeing it is the graph of instructions. On every step we reduce that graph, by running the code of the nodes current node depends on.
>

I've always kind of liked this model.

Rowan_Tommins_IMSoP · March 5, 2025, 11:10pm

On 05/03/2025 21:10, Edmond Dantes wrote:

Essentially, this is Kotlin, but it should also resemble Python. However, unlike Kotlin, there are no special language constructs here—code blocks naturally serve that role. Of course, syntactic sugar can be added later for better readability.

To pick up on this point: PHP doesn't have any generalised notion of "code blocks", only Closures, and those have a "weight" which is more fundamental than syntax: creating the Closure object, copying or referencing captured variables, creating a new execution stack frame, and arranging for parameters to be passed in and a return value passed out.

Perhaps more importantly, there's a reason most languages don't represent flow control purely in terms of functions and objects: it's generally far simpler to define "this is the semantics of a while loop" and implement it in the compiler or VM, than "these building blocks are sufficient that any kind of loop can be built in userland without explicit compiler support".

Defining new syntax would encourage us to define a minimum top-level behaviour, such as "inside an async{} block, these things are possible, and these things are guaranteed to be true". Then we simply make that true by having the compiler inject whatever actions it needs before, during, and after that block. Any additional keywords, functions, or objects, are then ways for the user to vary or make use of that flow, rather than ways to define the flow itself.

This is roughly what happened with Closures themselves in PHP: first, decide that "$foo = function(){};" will be valid syntax, and define Closure as the type of $foo; then over time, add additional behaviour to the Closure class, the ability to add __invoke() hooks on other classes, etc

Regards,

--
Rowan Tommins
[IMSoP]

Rowan_Tommins_IMSoP · March 5, 2025, 11:50pm

On 05/03/2025 23:10, Rowan Tommins [IMSoP] wrote:

This is roughly what happened with Closures themselves in PHP: first, decide that "$foo = function(){};" will be valid syntax, and define Closure as the type of $foo; then over time, add additional behaviour to the Closure class, the ability to add __invoke() hooks on other classes, etc

Sorry to double-post, but Generators are probably a better example: you can write "$foo = yield $bar;" and there are well-defined semantics; on the outside of the function, we represent the state as a Generator object, and make it implement Iterable to explain how foreach() works; but on the inside of the function, it's pure magic: $bar is passed into an invisible channel, an invisible continuation is created, and when it's resumed another invisible channel passes out a value for $foo.
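For anyone who hasn't used the two-way form, here is a minimal illustration of those semantics with today's generators. The function name `echoer` is made up for the example.

```php
<?php
declare(strict_types=1);

// Hypothetical generator, named only for this illustration.
function echoer(): Generator
{
    $received = [];
    // The value on the right of yield goes out to the caller and the function
    // suspends; the value passed to send() becomes the result of the yield.
    $foo = yield 'first';
    $received[] = $foo;
    $foo = yield 'second';
    $received[] = $foo;
    return $received;
}

$gen = echoer();
$out1 = $gen->current();     // runs up to the first yield
$out2 = $gen->send('A');     // first yield evaluates to 'A'; runs to the second yield
$gen->send('B');             // second yield evaluates to 'B'; the function returns
$result = $gen->getReturn(); // what the generator returned
```

Nothing in the function body names the channel or the continuation; the only visible API is the Generator object on the outside.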

--
Rowan Tommins
[IMSoP]

Crell · March 6, 2025, 4:11am

On Wed, Mar 5, 2025, at 3:10 PM, Edmond Dantes wrote:

No, just those functions/objects that necessarily involve running async control commands. Most wouldn't. They would just silently context switch when they hit an IO operation (which, as noted above, is transparently supported, which is what makes this work) and otherwise behave the same.

So it's something more like Go or Python.

$val = async(function(AsyncContext $ctx) use ($stuff, $fn) {
    $result = [];
    foreach ($stuff as $item) {
        $result[] = $ctx->run($fn);
    }

    // We block/wait here until all subtasks are complete, then the async() call returns this value.
    return $result;
});

Do I understand correctly that at the point `$val =
async(function(AsyncContext $ctx) use ($stuff, $fn)` execution stops
until everything inside is completed?

Correct. By the time $val is populated, all fibers/coroutines/tasks started inside that block have completed and closed, guaranteed. If an exception was thrown or something else went wrong, then by the time the exception escapes the async {} block, all fibers inside it are done and closed, guaranteed. (If there's another async {} block further up the stack somewhere, there may still be other background fibers running, but anything created inside that block is guaranteed done.)

If so, let me introduce a second semantic option (for now, I'll remove the context and focus only on the function).

$url1 = 'https://domain1.com/';
$url2 = 'https://domain2.com/';

$url_handle = fn(string $url) => file_get_contents($url);

$res = Async\start(function() use ($url1, $url2, $url_handle) {
    $res1 = Async\run($url_handle, $url1);
    $res2 = Async\run($url_handle, $url2);

    Async\run(fn() => sleep(5));

    // some logic here

    return $merged_result;
});

What's Happening Here:

1. After calling `$res = Async\start()`, the code waits until the entire block completes.
2. Inside `Async\start`, the code waits for all nested coroutines to finish.
3. If a coroutine has other nested coroutines, the same rule applies.

Rules Inside an Asynchronous Block:

1. I/O functions do not block coroutines within the block.
2. Creating a new `Fiber` is not allowed; an exception will be thrown (you cannot use `Fiber`).
3. Unhandled exceptions will be thrown at the point of `$res = Async\start()`.

Coroutine Cancellation Rules:

Canceling a coroutine cancels it and all its child coroutines (this cannot be bypassed unless the coroutine is created in a different context).

How does this option sound to you?

We can quibble on the details and spelling, but I think the overall logic is sound. One key question, if we disallow explicitly creating Fibers inside an async block, can a Fiber be created outside of it and not block async, or would that also be excluded? Viz, this is illegal:

async {
$f = new Fiber(some_func(...));
}

But would this also be illegal?

$f = new Fiber(some_func(...));
$f->start();

async {
do_stuff();
}

Essentially, this is Kotlin, but it should also resemble Python.
However, unlike Kotlin, there are no special language constructs
here—code blocks naturally serve that role. Of course, syntactic sugar
can be added later for better readability.

My brief foray into Kotlin in a previous job didn't get as far as coroutines, so I will take your word for it. From a very cursory glance at the documentation, I think runBlocking {} is approximately what I am describing, yes. I don't know whether the various other block types are necessary.

And if you like this, I have good news: there are no implementation
issues at this level.

In terms of semantic elegance, the only thing that bothers me is that
`return` behavior is slightly altered — meaning the actual "return"
won’t happen until all child functions complete. This isn’t very good,
and Kotlin’s style would fit better here.

I'm not sure I follow. The main guarantee we want is that "once you pass this }, all fibers/coroutines have ended, count on it." Do you mean something like this?

async $ctx {
$ctx->run(foo(...));
$ctx->run(bar(...));

// This return statement blocks until foo() and bar() complete.
return "all done";
}

That doesn't seem any weirder than return and finally{} blocks. (Note that we can and should consider if async {} makes sense to have its own catch and finally blocks built in.)
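To make the finally comparison concrete, here is a rough userland sketch using today's Fibers. It assumes a hypothetical `AsyncContext` class and `async()` function (neither is the RFC's API), uses `Fiber::suspend()` as a stand-in for a non-blocking I/O wait, and leaves out error handling for children that throw.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for the AsyncContext spitballed above; not an RFC API.
final class AsyncContext
{
    /** @var Fiber[] child tasks that have been started but not yet finished */
    private array $children = [];

    // Start a child task immediately; it runs until it finishes or suspends.
    public function run(callable $task): void
    {
        $fiber = new Fiber($task);
        $fiber->start();
        $this->children[] = $fiber;
    }

    // Resume suspended children round-robin until every one has terminated.
    public function join(): void
    {
        while ($this->children !== []) {
            foreach ($this->children as $i => $fiber) {
                if ($fiber->isTerminated()) {
                    unset($this->children[$i]);
                } else {
                    $fiber->resume();
                }
            }
        }
    }
}

function async(callable $block): mixed
{
    $ctx = new AsyncContext();
    try {
        return $block($ctx); // the return value is computed here...
    } finally {
        $ctx->join();        // ...but the caller only sees it once all children end
    }
}

$log = [];
$val = async(function (AsyncContext $ctx) use (&$log) {
    foreach (['foo', 'bar'] as $name) {
        $ctx->run(function () use ($name, &$log): void {
            $log[] = "$name:start";
            Fiber::suspend();   // stand-in for waiting on I/O
            $log[] = "$name:end";
        });
    }
    $log[] = 'block:return';
    return 'all done';
});
$log[] = 'after';
```

The return statement runs before the children finish, but `$val` is not assigned until `join()` has drained them, which is the same ordering you get from any return inside try/finally.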

But on the other hand — can we live with this?

This seems far closer to something I'd support than the current RFC, yes.

I cannot speak to JS Symbols as I haven't used them.
I am just vehemently opposed to globals, no matter how many layers they're wrapped in. Most uses could be replaced by proper DI or partial application.

You won’t be able to use DI because you have only *one service
(instance of class)* for the entire application, not a separate service
for each coroutine. This service is shared across the application and
can be called from any coroutine. As a result, the service needs memory
slots to store or retrieve data. DI is a mechanism used once during
service initialization, not every time a method is called.

Not true. DI doesn't imply singleton objects. Most good DI *containers* default to singleton objects, as they should, but for example Laravel's container does not. You have to opt-in to singleton behavior. (I think that's a terrible design, but it's still DI.)

DI just means "a scope gets the stuff it needs given to it, it never asks for it." How that stuff is passed in is, deliberately, undefined. A DI container is but one way.

In Crell/Serde, I actually use "runner objects" a lot. I have an example here:

https://presentations.garfieldtech.com/slides-serialization/longhornphp2023/#/7/4/3

That is still dependency injection, because ThingRunner is still taking all of its dependencies via the constructor. And being readonly, it's still immutable-friendly.

That's the sort of thing I'm thinking of here for the async context. To spitball again:

class ClientManager {
  // $base is promoted to a property so that client() can read $this->base.
  public function __construct(private string $base) {}

  public function client(AsyncContext $ctx) {
    return new HttpClient($this->base, $ctx);
  }
}

class HttpClient {
  public function __construct(private string $base, private AsyncContext $ctx) {}

  public function get(string $path) {
    $this->ctx->defer(fn() => print "Read $path\n");
    return $this->ctx->run(fn() => file_get_contents($this->base . $path));
  }
}

$manager = $container->get(ClientManager::class);

async $ctx {
  $client = $manager->client($ctx);
  $client->get('/foo');
  $client->get('/bar');
}
// We don't get here until all file_get_contents() calls are complete.
// The deferred functions all get called right here.

// There is no async happening anymore.
print "Done";

I'm pretty sure the return values are all messed up there, but hopefully you get the idea. Now HttpClient has a fully injected context that controls what async scope it's working in. The same class can be used in a bunch of different async blocks, each with their own context. You can even mock AsyncContext for testing purposes just like any other constructor argument. And not a global function or variable in sight!

--Larry Garfield

EdmondDantes · March 6, 2025, 7:49am

Defining new syntax would encourage us to define a minimum top-level
behaviour, such as “inside an async{} block, these things are possible,
and these things are guaranteed to be true”

True. This is precisely the main reason not to change the syntax. The issue is not even about how many changes need to be made in the code, but rather about how many agreements need to be considered.

Ed.

EdmondDantes · March 6, 2025, 8:52am

One key question, if we disallow explicitly creating Fibers inside an async block,
can a Fiber be created outside of it and not block async, or would that also be excluded? Viz, this is illegal:

Creating a Fiber outside of an asynchronous block is allowed; this ensures backward compatibility.
According to the logic integrity rule, an asynchronous block cannot be created inside a Fiber. This is a correct statement.

However, if the asynchronous block blocks execution, then it does not matter whether a Fiber was created or not, because it will not be possible to switch it in any way.
So, the answer to your question is: yes, such code is legal, but the Fiber will not be usable for switching.

In other words, Fiber and an asynchronous block are mutually exclusive. Only one of them can be used at a time: either Fiber + Revolt or an asynchronous block.

Of course, this is not an elegant solution, as it adds one more rule to the language, making it more complex. However, from a legacy perspective, it seems like a minimal scar.

(To all: please leave your opinion if you are reading this.)

// This return statement blocks until foo() and bar() complete.

Yes, that’s correct. That’s exactly what I mean.

Of course, under the hood, return will execute immediately if the coroutine is not waiting for anything. However, the Scheduler will store its result and pause it until the child coroutines finish their work.

In essence, this follows the parent-child coroutine pattern, where they are always linked. The downside is that it requires more code inside the implementation, and some people might accuse us of a paternalistic approach.

(Note that we can and should consider if async {} makes sense to have its own catch and finally blocks built in.)

We can use the approach from the RFC to catch exceptions from child coroutines: explicit waiting, which creates a handover point for exceptions.

Alternatively, a separate handler like Context::catch() could be introduced, which can be defined at the beginning of the coroutine.

Or both approaches could be supported. There’s definitely something to think about here.

That is still dependency injection, because ThingRunner is still taking all of its dependencies via the constructor. And being readonly, it’s still immutable-friendly.

Yeah, so basically, you’re creating the service again and again for each coroutine if the coroutine needs to use it. This is a good solution in the context of multitasking, but it loses in terms of performance and memory, as well as complexity and code size, because it requires more factory classes.

The main advantage of LongRunning is initializing once and using it multiple times. On the other hand, this approach explicitly manages memory, ensuring that all objects are created within the coroutine’s context rather than in the global context.

Ah, now I see how much you dislike global state!

However, in a scenario where a web server handles many similar requests, “global state” might not necessarily win in terms of speed but rather due to the simplicity of implementation and the overall maintenance cost of the code. (I know that in programming, there is an entire camp of immutability advocates who preach that their approach is the key remedy for errors.)

I would support both paradigms, especially since it doesn’t cost much.

A coroutine will own its internal context anyway, and this context will be carried along with it, even across threads. How to use this context is up to the programmer to decide. But at the same time, I will try to make the pattern you described fit seamlessly into this logic.

Ed.

Rowan_Tommins_IMSoP · March 6, 2025, 9:17am

On 06/03/2025 07:49, Edmond Dantes wrote:

> Defining new syntax would encourage us to define a minimum top-level
> behaviour, such as "inside an async{} block, these things are possible,
> and these things are guaranteed to be true"

True. This is precisely the main reason not to change the syntax. The issue is not even about how many changes need to be made in the code, but rather about how many agreements need to be considered.

Quite the opposite: with a function-and-object approach everything needs a name, an API, and a way of being described in relation to how the language already works. In a syntax-and-semantics approach, we only need to describe the things people actually need.

The generator implementation doesn't have a name or API for where the value on the right of a "yield" goes, or where the value on its left comes from; we just describe the behaviour: values passed to yield somehow end up in the calling scope's Generator object, and values passed to that object somehow end up back at the yield statement. We don't have to define the API for a GeneratorContext object, and the semantics of what happens when users pass it around and store it in different scopes.

In the same way, do we actually need to design what an "async context" looks like to the user? Do we actually want the user to be able to have access to two (nested) async contexts at once, and choose which one to spawn a task into? Or would we prefer, at least in the minimum implementation, to say "when you spawn a task, it spawns in the current async context, and if there is no current async context, an error is thrown"?
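As a rough illustration of that minimum behaviour, here is a userland sketch on top of today's Fibers. `Scope`, `Scope::run()` and `Scope::spawn()` are hypothetical names, not anything from the RFC, and tracking the current scope with a simple stack is an assumption that would not survive tasks from different scopes interleaving.

```php
<?php
declare(strict_types=1);

// Hypothetical names throughout; this is not the RFC's API.
final class Scope
{
    /** @var Scope[] active scopes, innermost last */
    private static array $stack = [];

    /** @var Fiber[] tasks spawned into this scope that have not finished */
    private array $tasks = [];

    // Run $block inside a new scope and return only once its tasks are done.
    public static function run(callable $block): mixed
    {
        $scope = new self();
        self::$stack[] = $scope;
        try {
            return $block();
        } finally {
            $scope->join();
            array_pop(self::$stack);
        }
    }

    // Spawn into the current scope; the caller never holds a context object.
    public static function spawn(callable $task): void
    {
        if (self::$stack === []) {
            throw new LogicException('spawn() requires an enclosing async scope');
        }
        $current = self::$stack[array_key_last(self::$stack)];
        $fiber = new Fiber($task);
        $fiber->start();
        $current->tasks[] = $fiber;
    }

    // Resume suspended tasks until all of them have terminated.
    private function join(): void
    {
        while ($this->tasks !== []) {
            foreach ($this->tasks as $i => $fiber) {
                if ($fiber->isTerminated()) {
                    unset($this->tasks[$i]);
                } else {
                    $fiber->resume();
                }
            }
        }
    }
}

$log = [];
Scope::run(function () use (&$log): void {
    Scope::spawn(function () use (&$log): void {
        Fiber::suspend();        // stand-in for waiting on I/O
        $log[] = 'child done';
    });
    $log[] = 'block end';
});
$log[] = 'after scope';

$threw = false;
try {
    Scope::spawn(fn() => null);  // no enclosing scope
} catch (LogicException $e) {
    $threw = true;
}
```

Note that user code never sees or passes around a context object here, so there is nothing to specify about storing it or choosing between two of them.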

--
Rowan Tommins
[IMSoP]

Daniil_Gentili · March 6, 2025, 10:00am

Of course, this is not an elegant solution, as it adds one more rule to the language, making it more complex. However, from a legacy perspective, it seems like a minimal scar.

(To all: please leave your opinion if you are reading this.)

Larry’s approach seems like a horrible idea to me: it increases complexity, prevents easy migration of existing code to an asynchronous model and is incredibly verbose for no good reason.

The arguments mentioned in https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/ are not good arguments at all, as they essentially propose explicitly reducing concurrency (by allowing it only within async blocks) or making it harder to use by forcing users to pass around contexts (which is even worse than function colouring https://journal.stuffwithstuff.com/2015/02/01/what-color-is-your-function/).
This (supposedly) reduces issues with resource contention/race conditions: sure, if you don’t use concurrency or severely limit it, you will have fewer issues with race conditions, but that’s not an argument in favour of nurseries, that’s an argument against concurrency.

Race conditions and deadlocks are possible either way when using concurrency, and the way to avoid them is to introduce synchronisation primitives (locks, mutexes similar to the ones in https://github.com/amphp/sync/, or lockfree solutions like actors, which I am a heavy user of), not bloating signatures by forcing users to pass around contexts, reducing concurrency and completely disallowing global state.

Golang is the perfect example of a language that does colourless, (mostly) contextless concurrency without the need for coloured (async/await keywords) functions and other complications.
Race conditions and deadlocks are avoided, like in any concurrent model, by using appropriate synchronisation primitives, and by communicating with channels (actor model) instead of sharing memory, where appropriate.

Side note, I very much like the current approach of implicit cancellations, because they even remove the need to pass contexts to make use of cancellations, like in golang or amphp (though the RFC could use some further work regarding cancellation inheritance between fibers, but that’s a minor issue).

Yeah, so basically, you’re creating the service again and again for each coroutine if the coroutine needs to use it. This is a good solution in the context of multitasking, but it loses in terms of performance and memory, as well as complexity and code size, because it requires more factory classes.

^ this

Regarding backwards compatibility (especially with revolt), since I also briefly considered submitting an async RFC and thought about it a bit, I can suggest exposing an event loop interface like https://github.com/revoltphp/event-loop/blob/main/src/EventLoop.php, which would allow userland event loop implementations to simply switch to using the native event loop as backend (this’ll be especially simple to do for revolt, which is the main user of fibers, since the current implementation is clearly inspired by revolt’s event loop).

Essentially, the only thing that’s needed for backwards-compatibility in most cases is an API that can be used to register onWritable, onReadable callbacks for streams and a way to register delayed (delay) tasks, to completely remove the need to invoke stream_select.
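To make that surface concrete, here is a rough sketch of how a userland loop can offer readable-stream callbacks and delayed tasks on top of stream_select today. `TinyLoop` and its method names are made up for this example (they only echo the names mentioned above); this is neither revolt's implementation nor anything from the RFC. Write watchers and error handling are left out, and the Unix socket pair in the usage part assumes a non-Windows platform.

```php
<?php
declare(strict_types=1);

// Hypothetical minimal loop; class and method names are illustrative only.
final class TinyLoop
{
    private array $streams = [];  // watcher id => stream resource
    private array $readers = [];  // watcher id => callback
    private array $timers = [];   // list of [due timestamp, callback]

    public function onReadable($stream, callable $callback): int
    {
        $id = get_resource_id($stream);
        $this->streams[$id] = $stream;
        $this->readers[$id] = $callback;
        return $id;
    }

    public function cancel(int $id): void
    {
        unset($this->streams[$id], $this->readers[$id]);
    }

    public function delay(float $seconds, callable $callback): void
    {
        $this->timers[] = [microtime(true) + $seconds, $callback];
    }

    // Runs until no stream watchers and no timers remain.
    public function run(): void
    {
        while ($this->streams !== [] || $this->timers !== []) {
            $timeout = $this->runDueTimers();
            if ($this->streams === []) {
                if ($timeout !== null) {
                    usleep((int) ($timeout * 1_000_000));
                }
                continue;
            }
            $read = array_values($this->streams);
            $write = $except = null;
            // A null $sec means "wait indefinitely": no timers are pending.
            $sec = $timeout === null ? null : (int) $timeout;
            $usec = $timeout === null ? 0 : (int) (($timeout - $sec) * 1_000_000);
            if (stream_select($read, $write, $except, $sec, $usec) > 0) {
                foreach ($read as $stream) {
                    $id = get_resource_id($stream);
                    if (isset($this->readers[$id])) {
                        ($this->readers[$id])($stream);
                    }
                }
            }
        }
    }

    // Fires due timers; returns seconds until the next one, or null if none.
    private function runDueTimers(): ?float
    {
        $now = microtime(true);
        foreach ($this->timers as $i => [$due, $callback]) {
            if ($due <= $now) {
                unset($this->timers[$i]);
                $callback();
            }
        }
        if ($this->timers === []) {
            return null;
        }
        return max(0.0, min(array_column($this->timers, 0)) - microtime(true));
    }
}

[$a, $b] = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);
$loop = new TinyLoop();
$events = [];
$id = $loop->onReadable($b, function ($stream) use (&$events, $loop, &$id): void {
    $events[] = 'read:' . fread($stream, 1024);
    $loop->cancel($id);     // stop watching so the loop can finish
});
$loop->delay(0.01, function () use ($a, &$events): void {
    $events[] = 'timer';
    fwrite($a, 'ping');
});
$loop->run();
```

With a native loop exposing the same few entry points, a library like this would only need to forward the registrations instead of calling stream_select itself.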

I’d recommend chatting with Aaron to further discuss backwards compatibility and the overall RFC: I’ve already pinged him, he’ll chime in once he has more time to read the RFC.


To Edmond, as someone who submitted RFCs before: stand your ground, try not to listen too much to what people propose in this list, especially if it’s regarding radical changes like Larry's; avoid bloating the RFC with proposals that you do not really agree with.

Regards,
Daniil Gentili

—

Daniil Gentili - Senior software engineer

Portfolio: [https://daniil.it](https://daniil.it/)
Telegram: [https://t.me/danogentili](https://t.me/danogentili)

EdmondDantes · March 6, 2025, 11:31am

In a syntax-and-semantics approach, we only need to describe the things people actually need.

There is no doubt that syntax provides the programmer with a clear tool for expressing intent.

In the same way, do we actually need to design what an “async context” looks like to the user?

Its implementation is more about deciding which paradigms we want to support.

If we want to support global services that require local state within a coroutine, then they need a context. If there are no global “impure” services (i.e., those maintaining state within a coroutine), then a context may not be necessary. The first paradigm is not applicable to pure multitasking—almost all programming languages (as far as I know) have abandoned it in favor of ownership/memory passing. However, in PHP, it is popular.

For example, PHP has functions for working with HTTP. One of them writes the last received headers into a “global” variable, and another function allows retrieving them. This is where a context is needed. Or, for instance, when a request is made inside a coroutine, the service that handles socket interactions under the hood must:

1. Retrieve a socket from the connection pool.
2. Place the socket in the coroutine’s context for as long as it is needed.

However, this same scenario could be implemented more elegantly if PHP code explicitly used an object like “Connection” or “Transaction” and retrieved it from the pool. In that case, a context would not be needed.

Thus, the only question is: do we need to maintain state between function/method calls within a coroutine?

Do we actually want the user to be able to have access to two (nested) async contexts at once, and choose which one to spawn a task into?

If we discard the Go model, where the programmer decides what to do and which foot to shoot themselves in, and instead use parent-child coroutines, then such a function breaks this rule. This means it should not exist, as its presence increases system complexity. However, in the parent-child model, there is a case where a coroutine needs to be created in a different context.

For example:

1. A request to reset a password arrives at the server.
2. The API creates a coroutine in a separate context from the request to send an email.
3. The API returns a 201 response.

In this case, a special API is needed to accomplish this. The downside of any strict semantics is the presence of exceptional cases. However, such cases should be rare. If they are not, then the parent-child model is not suitable.

To resolve this issue, we need to know the opinions of framework maintainers. They should say one of: “Yes, this approach will reduce the amount of code”, “No, it will increase the codebase”, or “We don’t care, do as you like”.
