[PHP-DEV] PHP True Async RFC Stage 5

Rowan_Tommins_IMSoP · November 14, 2025, 7:29pm

On 14 November 2025 09:49:52 GMT, Edmond Dantes <edmond.ht@gmail.com> wrote:

As far as I understand, opening the vote requires creating a separate
page on the Wiki. For some reason, I couldn’t find clear instructions
for this in the documentation, which is a bit surprising.

The voting stage is described in step 6 of the How-To here: PHP: rfc:howto

There's no separate page, you just edit the attributes on the voting widget tag. You can see how it looks on the two RFCs that are currently "in voting" on the RFC index: PHP: rfc

Rowan Tommins
[IMSoP]

Rob_Landers · November 15, 2025, 11:37am

On Thu, Nov 13, 2025, at 10:01, Edmond Dantes wrote:

Hello all.

Today marks two weeks since the RFC was published.

I need to apply a few minor fixes that Luis pointed out.

If anyone else is working on comments for the RFC, please let me know.
If there are no objections, we can start the vote on Monday.

Best regards, Ed

I have concerns about the clarity of when suspension occurs in this RFC.

The RFC states as a core goal:

“Code that was originally written and intended to run outside of a Coroutine must work EXACTLY THE SAME inside a Coroutine without modifications.”

And:

“A PHP developer should not have to think about how Coroutine switch and should not need to manage their switching—except in special cases where they consciously choose to intervene in this logic.”

However, the RFC doesn’t clearly define what these “special cases” are or provide guidance on when developers need to intervene.

Specific questions:

1. CPU-bound operations: If I have a tight loop processing data in memory (no I/O), will it monopolise the coroutine scheduler? Do I need to manually insert suspend() calls? How do I know when and where?
2. The RFC suggests that existing PHP functions won’t automatically be non-blocking. So which will? Is there a way to identify suspension points at the language/API level?
3. Performance implications: Without knowing where suspensions occur, how do developers avoid either:

Starving other coroutines (too few suspensions)
Degrading performance (too many suspensions)

With explicit async/await (“coloured functions”), developers know exactly where suspension can occur. This RFC’s implicit model seems convenient, but without clear rules about suspension points, I’m unclear how developers can write correct concurrent code or reason about performance.

Could the RFC clarify the rules for when automatic suspension occurs versus when manual suspend() calls are required? Is this RFC following Go’s model where suspension timing is an implementation detail developers shouldn’t rely on? If so, that should be stated explicitly. Keep in mind that Go didn’t start that way and took nearly a decade to get there. Earlier versions of Go explicitly stated where suspensions were.

— Rob

EdmondDantes · November 15, 2025, 12:16pm

Hello all.

Some of these questions sound familiar. Let’s try to sort them out.

If I have a tight loop processing data in memory (no I/O), will it monopolise the coroutine scheduler? Do I need to manually insert suspend() calls? How do I know when and where?

A coroutine must yield control on its own. If it keeps it through an
endless loop, then it will be the only one running.
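
To make this concrete, here is a minimal sketch that uses PHP 8.1 Fibers as a stand-in for the RFC's coroutines. The helpers runRoundRobin() and makeWorker() are hypothetical and exist only for this illustration; the RFC's real scheduler may order things differently. It shows that a worker which never yields keeps control until it finishes, while the same worker with a yield per iteration interleaves with the other one.

```php
<?php
// Simulation with PHP 8.1 Fibers. runRoundRobin() and makeWorker() are
// hypothetical helpers for illustration, not part of the RFC.

/** Resume each unfinished fiber in turn until all have finished. */
function runRoundRobin(array $fibers): void
{
    $queue = $fibers;
    while ($queue !== []) {
        $fiber = array_shift($queue);
        $fiber->isStarted() ? $fiber->resume() : $fiber->start();
        if (!$fiber->isTerminated()) {
            $queue[] = $fiber; // still has work: back of the queue
        }
    }
}

/** A worker that logs $steps entries and optionally yields after each one. */
function makeWorker(string $name, int $steps, bool $yields, array &$log): Fiber
{
    return new Fiber(function () use ($name, $steps, $yields, &$log): void {
        for ($i = 1; $i <= $steps; $i++) {
            $log[] = "$name:$i";
            if ($yields) {
                Fiber::suspend(); // cooperative yield back to the scheduler
            }
        }
    });
}

// "loop" never yields: once it gets control it keeps it until it finishes.
$greedy = [];
runRoundRobin([
    makeWorker('other', 3, true, $greedy),
    makeWorker('loop', 3, false, $greedy),
]);

// Same workers, but "loop" now yields after every iteration.
$fair = [];
runRoundRobin([
    makeWorker('other', 3, true, $fair),
    makeWorker('loop', 3, true, $fair),
]);

echo implode(' ', $greedy), "\n"; // other:1 loop:1 loop:2 loop:3 other:2 other:3
echo implode(' ', $fair), "\n";   // other:1 loop:1 other:2 loop:2 other:3 loop:3
```

In the first run, "other" cannot make progress between its first and second step until "loop" has completely finished.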

The RFC suggests that existing PHP functions won’t automatically be non-blocking. So which will? Is there a way to identify suspension points at the language/API level?

The RFC does not say that. PHP I/O functions automatically become
non-blocking relative to the whole process. In other words, an I/O
function calls suspend() on its own when needed. The programmer writes
code exactly as before, under the illusion that operations are
executed one after another.

Performance implications: Without knowing where suspensions occur, how do developers avoid either:

In most cases, this is not the developer’s concern. Situations where
performance is critical should be handled with dedicated tools.
A PHP developer should not have to drop down to the C level. Properly
designed abstractions must provide the required performance.

I can already anticipate the question: but a developer could write
something like “for i < 10000 suspend” or something similar.
The answer is this: a developer must know how to use abstractions. As
always. Everywhere. In any area of programming.
It’s just as important as respecting proper layering in code.

Provided that PHP does not try to play the role of a C-level language
(there have already been such attempts, and they keep resurfacing),
and does not try to play a web server or a database system, the
current approach is more than sufficient for most web scenarios.
This has been proven by Swoole, which has been on the market for many
years. Therefore, performance questions are outside the scope of this
RFC.

As for the concurrency model, let me remind you that Go has true
multitasking, even though goroutines have a “synthetic” stack.
But you already know all this well.
This RFC and its implementation describe coroutines in a single
thread. That is very far from what Go provides.

There is no preemptive multitasking in this RFC because it is
completely unrelated.
This RFC and its implementation provide cooperative concurrency in a
single thread, where coroutine code yields control on its own.
Why was this model chosen? The simple answer is: because it is the
only one that can realistically be implemented within a finite
timeframe.

Any more questions?

-- Ed

Rob_Landers · November 15, 2025, 2:09pm

Hey Ed,

I feel like we’re talking past each other a bit. I’m not questioning the choice of a cooperative scheduler; that’s totally fine. I’m trying to understand what the actual suspension points are. Every other language/runtime with cooperative concurrency spells this out, because without it developers can’t reason about performance or why a deadlock is happening.

Based on the conversation so far, I’d imagine the list to look something like:

network i/o (streams/sockets/dns/curl/etc)
sleeps
subprocess waits? proc_open and friends?
extensions with a reactor implementation
awaiting FutureLike
suspend()

If that’s the intended model, it’d help to have that spelled out directly; it makes it immediately clear which functions can or will suspend and prevents surprises.

I also think the RFC needs at least minimal wording about scheduler guarantees, even if the details are implementation-specific. For example, is the scheduler run-to-suspend? FIFO or round-robin wakeup? And non-preemptive behaviour only appears here in the thread. It isn’t mentioned in the RFC itself. That’s important for people writing long, CPU-bound loops, since nothing will interrupt them unless they explicitly yield.

Lastly, cancellation during a syscall is still unclear. If a coroutine is cancelled while something like fwrite() or a DB write is in progress, what should happen? Does fwrite() still return the number of bytes written? Does it throw? For write-operations in particular, this affects whether applications can maintain a consistent state.

Clarifying these points would really help people understand how to reason about concurrency with this API.

— Rob

EdmondDantes · November 15, 2025, 2:41pm

Hello.

Based on the conversation so far, I’d imagine the list to look something like:

Yes, that’s absolutely correct. When a programmer uses an operation
that would normally block the entire thread, control is handed over to
the Scheduler instead.
The suspend function is called inside all of these operations.

If that’s the intended model, it’d help to have that spelled out directly; it makes it immediately clear which functions can or will suspend and prevents surprises.

In the Async implementation, it will be specified which functions are supported.

I also think the RFC needs at least minimal wording about scheduler guarantees, even if the details are implementation-specific.

The Scheduler guarantees that a coroutine will be invoked if it is in the queue.

For example, is the scheduler run-to-suspend? FIFO or round-robin wakeup? And non-preemptive behaviour only appears here in the thread. It isn’t mentioned in the RFC itself.

In Go, for example, when it was still cooperative, these details were
also not part of any public contract. The only guarantee Go provided
was that a coroutine would not be interrupted arbitrarily. The same
applies to this RFC: coroutines are interrupted only at designated
suspension points.
However, neither Go nor any other language exposes the internal
details of the Scheduler as part of a public contract, because those
details may change without notice.

That’s important for people writing long, CPU-bound loops, since nothing will interrupt them unless they explicitly yield.

Hypothetically, in the future it may become possible to interrupt
loops, just like Go eventually did. This would likely require an
additional RFC. PHP does have the ability to interrupt a loop at any
point, but most likely only for terminating execution.
This RFC does nothing of the sort.

Lastly, cancellation during a syscall is still unclear. If a coroutine is cancelled while something like fwrite() or a DB write is in progress, what should happen?
Does fwrite() still return the number of bytes written? Does it throw? For write-operations in particular, this affects whether applications can maintain a consistent state.

If the write operation is interrupted, the function will return an
error according to its contract. In this case, it will return false.

Clarifying these points would really help people understand how to reason about concurrency with this API.

This is described in the document.
There is, of course, a nuance regarding extended error descriptions,
but at the moment no such changes are planned.

EdmondDantes · November 15, 2025, 3:21pm

As for:

For example, is the scheduler run-to-suspend?
And non-preemptive behaviour only appears here in the thread. It isn’t mentioned in the RFC itself.

There is no direct statement in the RFC that cooperative multitasking
is implemented.
I think this text was removed, and that needs to be fixed. But on the
other hand, there is a clear description of the contract expressed in
different words:

RFC: "A coroutine can stop itself passing control to the scheduler.
However, it cannot be stopped externally."

which essentially means the same thing.

This is exactly what constitutes the public contract between PHP and
the developer.

I included a list of I/O functions for demonstration purposes, but
this list is not part of the RFC.
It is part of the implementation. This means that not all I/O
functions can or should be adapted immediately.

John_Bafford · November 15, 2025, 4:20pm

Hi Rob, Edmond,

On Nov 15, 2025, at 06:37, Rob Landers <rob@bottled.codes> wrote:

I have concerns about the clarity of when suspension occurs in this RFC.

The RFC states as a core goal:

"Code that was originally written and intended to run outside of a Coroutine must work EXACTLY THE SAME inside a Coroutine without modifications."

And:

"A PHP developer should not have to think about how Coroutine switch and should not need to manage their switching—except in special cases where they consciously choose to intervene in this logic."

[...]

With explicit async/await ("coloured functions"), developers know exactly where suspension can occur. This RFC’s implicit model seems convenient, but without clear rules about suspension points, I’m unclear how developers can write correct concurrent code or reason about performance.

Could the RFC clarify the rules for when automatic suspension occurs versus when manual suspend() calls are required? Is this RFC following Go’s model where suspension timing is an implementation detail developers shouldn’t rely on? If so, that should be stated explicitly. Keep in mind that Go didn’t start that way and took nearly a decade to get there. Earlier versions of Go explicitly stated where suspensions were.

— Rob

To provide an explicit example for this, code that fits this pattern is going to be problematic:

function writeData() {
  $count = count($this->data);
  for($x = 0; $x < $count; $x++) {
    [$path, $content] = $this->data[$x];
    file_put_contents($path, $content);
  }
  $this->data = [];
}

While there are better ways to write this function, in normal PHP code, there's no problem here. But if file_put_contents() can block and cause a different coroutine to run, $this->data can be changed out from under writeData(), which leads to unexpected behavior. (e.g. $this->data changes length, and now writeData() no longer covers all of it; or it runs past the end of the array and errors; or doesn't see there's a change and loses it when it clears the data).

Now, yes, the programmer would have to do something to cause there to be two coroutines running in the first place. But if _this_ code was correct when "originally written and intended to run outside of a Coroutine", and with no changes is incorrect when run inside a coroutine, one can only say that it is working "exactly the same" with coroutines by ignoring that it is now wrong.

Suspension points, whether explicit or hidden, allow for the entire rest of the world to change out from under the caller. The only way for non-async-aware code to operate safely is for suspension to be explicit (which, of course, means the code now must be async-aware). There is no way in general for code written without coroutines or async suspensions in mind to work correctly if it can be suspended.
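
The hazard described above can be reproduced today with PHP 8.1 Fibers. This is only a sketch under stated assumptions: the Spool class and its fakeWrite() method are hypothetical, and fakeWrite() stands in for a file_put_contents() that suspends the current coroutine. The writeData() body is otherwise the same as in the example.

```php
<?php
// Simulation with PHP 8.1 Fibers. Spool and fakeWrite() are hypothetical;
// fakeWrite() stands in for an I/O call that suspends the coroutine.

final class Spool
{
    /** @var list<array{string, string}> queued [path, content] pairs */
    public array $data = [];
    /** @var list<string> paths that were "written" */
    public array $written = [];

    private function fakeWrite(string $path, string $content): void
    {
        $this->written[] = $path;
        if (Fiber::getCurrent() !== null) {
            Fiber::suspend(); // other coroutines may run here
        }
    }

    public function writeData(): void
    {
        $count = count($this->data);
        for ($x = 0; $x < $count; $x++) {
            [$path, $content] = $this->data[$x];
            $this->fakeWrite($path, $content);
        }
        $this->data = [];
    }
}

// Plain synchronous call: every queued entry is written.
$sync = new Spool();
$sync->data = [['a.txt', 'A'], ['b.txt', 'B']];
$sync->writeData();

// Same method inside a fiber, while a second fiber queues one more entry.
$shared = new Spool();
$shared->data = [['a.txt', 'A'], ['b.txt', 'B']];

$writer = new Fiber(fn () => $shared->writeData());
$producer = new Fiber(function () use ($shared): void {
    $shared->data[] = ['c.txt', 'C'];
});

$writer->start();      // writes a.txt, then suspends inside fakeWrite()
$producer->start();    // queues c.txt while the writer is suspended
while (!$writer->isTerminated()) {
    $writer->resume(); // writer finishes its loop and clears $data
}

echo implode(',', $shared->written), "\n"; // a.txt,b.txt
echo count($shared->data), "\n";           // 0: c.txt was dropped unwritten
```

The cached $count no longer matches the array once the producer has run, so the third entry is cleared without ever being written.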

-John

EdmondDantes · November 15, 2025, 5:22pm

To provide an explicit example for this, code that fits this pattern is going to be problematic

Why is this considered a problem if this behavior is part of the
language’s contract?
Exactly the same way as in Go for example, this is also part of the
contract between the language and the programmer.

$this->data can be changed out from under writeData(), which leads to unexpected behavior.

So the developer must intentionally create two different coroutines.
Intentionally pass them the same object.
Intentionally write this code.
And the behavior is called “unexpected”?

that it is working "exactly the same" with coroutines by ignoring that it is now wrong

I understand that a word written by one person can be interpreted
however another person feels like.
Language is not a reliable carrier of information, so people must take
context into account to extract useful information with minimal
distortion.

The changes described in the RFC refer to the algorithm for handling
I/O functions in blocking mode. And of course these words assume that
we haven’t lost our minds and understand that you cannot write
completely different message sequences to the same socket at the same
time. In practice, changes are of course sometimes necessary, but
throughout my entire experience working with coroutines, I should note
that I have never once run into the example you mentioned. Even when
adapting older projects. And do you know why? Because the first thing
we refactored in the old code was the places with shared variables.

There is no way in general for code written without coroutines or async suspensions in mind to work correctly if it can be suspended.

Agreed.

A developer must understand that potentially any function can
interrupt execution. This is a consequence of transparent asynchrony.
It is both its strength and its weakness. I will repeat it again: not
some specific function, but almost ANY function. Because under
transparent asynchrony you can use suspend() inside any function. This
does not negate the fact that documentation should list all functions
that switch context, but a certain coding style encourages this way of
thinking.

Modern programming languages strive for clarity and cleanliness. In
other words, colored functions provide code clarity and prevent errors
caused by misunderstanding what a function does. Critics of colored
functions criticize them precisely for what is actually their
strength, not their weakness. Color is an advantage.

But in PHP, colored functions are inconvenient. Overloading I/O
functions does not lead to serious errors that make developers suffer;
on the contrary, it saves time and gives the language more
flexibility.

I can explain why. The amount of code that works with sockets in PHP
is generally several times smaller than the code that works with
databases. In other words, the modules where such errors could occur
are simply not that many. They do exist (library clients, for
example), but compared to all other code, there are far fewer of them.
And as it turns out, refactoring them for coroutines requires very few changes.
Especially if the code was already well-written with best practices in
mind, then it will most likely work excellently with coroutines with
minimal adjustments.

How did we refactor old code for coroutines?

1. We took the modules that had global state. There were not many of them.
2. We used a Context, which is essential, and moved the global state
into the context.

I don’t remember exactly how many thousands of lines of code there
were, but definitely more than 20,000.
But why would anyone intentionally pass the same object to different
coroutines and then complain that the code broke? I have no idea who
would need that.

A developer should strive to minimize asynchronous code in a project.
The less of it there is, the better. Asynchronous code is evil. An
anti-pattern. A high-complexity zone. But if a developer chooses to
use asynchronous code, they shouldn’t act like they’re three years old
and seeing a computer for the first time. Definitely not. This
technology requires steady, capable hands.

Best Regards, Ed.

Rob_Landers · November 15, 2025, 7:55pm

On Sat, Nov 15, 2025, at 15:41, Edmond Dantes wrote:

Hello.

Based on the conversation so far, I’d imagine the list to look something like:

Yes, that’s absolutely correct. When a programmer uses an operation
that would normally block the entire thread, control is handed over to
the Scheduler instead.
The suspend function is called inside all of these operations.

I think that “normally” is doing a lot of work here. fwrite() can block, but often doesn’t. file_get_contents() is usually instant for local files but can take seconds on NFS or with an HTTP URL. An array_map() always blocks the thread but should never suspend.

Without very clear rules, it becomes impossible to reason about what’ll suspend and what won’t.

If that’s the intended model, it’d help to have that spelled out directly; it makes it immediately clear which functions can or will suspend and prevents surprises.

In the Async implementation, it will be specified which functions are supported.

This is exactly the kind of thing that needs to be in the RFC itself. Relying on “the implementation will document it” creates an unstable contract.

Even something simple would help, such as stating that a function may suspend:

if it can perform network IO
if it can perform file/stream IO
if it can sleep or wait on timers
if it awaits a FutureLike
if it calls suspend()

This would then create a stable baseline and require an RFC to change the rules, forcing people to think through BC breakages and ecosystem impact.

I also think the RFC needs at least minimal wording about scheduler guarantees, even if the details are implementation-specific.
The Scheduler guarantees that a coroutine will be invoked if it is in the queue.

That’s not quite enough. The order really matters. Different schedulers produce different observable results.

For example:

function step(string $name, string $msg) {
    echo "$name: $msg\n";
    suspend();
}

spawn(function() { step("A", "1"); step("A", "2"); step("A", "3"); });
spawn(function() { step("B", "1"); step("B", "2"); step("B", "3"); });
spawn(function() { step("C", "1"); step("C", "2"); step("C", "3"); });

Under different scheduling strategies you get different, but stable patterns.

Consider FIFO or round-robin, run-to-suspend:

A: 1
B: 1
C: 1
A: 2
B: 2
C: 2
A: 3
B: 3
C: 3

But with a stack-like or LIFO strategy, running-to-suspend:

A: 1
B: 1
C: 1
C: 2
C: 3
B: 2
B: 3
A: 2
A: 3

Both are valid, but it is important to know which one is implemented, and if someone wants to replace the scheduler, they also need to guarantee the same behaviour.
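
Both orderings can be reproduced with PHP 8.1 Fibers. The sketch below is a simulation, not the RFC's scheduler: runSchedule() is a hypothetical helper, and it assumes that a coroutine starts running as soon as it is spawned, which is what the two listings above imply (A, B and C each print their first step before anything is resumed).

```php
<?php
// Simulation with PHP 8.1 Fibers. runSchedule() is a hypothetical helper; it
// assumes each coroutine starts running as soon as it is spawned.

/**
 * Runs three coroutines A, B, C that each record three steps and yield after
 * each one. $lifo selects which suspended coroutine is resumed next.
 *
 * @return list<string> the lines in the order they were produced
 */
function runSchedule(bool $lifo): array
{
    $out = [];
    $ready = []; // suspended coroutines waiting to be resumed

    foreach (['A', 'B', 'C'] as $name) {
        $fiber = new Fiber(function () use ($name, &$out): void {
            foreach (['1', '2', '3'] as $msg) {
                $out[] = "$name: $msg";
                Fiber::suspend(); // plays the role of suspend() inside step()
            }
        });
        $fiber->start(); // assumption: spawn runs the body to its first suspend
        if (!$fiber->isTerminated()) {
            $ready[] = $fiber;
        }
    }

    while ($ready !== []) {
        // FIFO resumes the longest-waiting coroutine, LIFO the most recent one.
        $fiber = $lifo ? array_pop($ready) : array_shift($ready);
        $fiber->resume();
        if (!$fiber->isTerminated()) {
            $ready[] = $fiber;
        }
    }

    return $out;
}

$fifo = runSchedule(false);
$lifo = runSchedule(true);

echo implode(', ', $fifo), "\n"; // A: 1, B: 1, C: 1, A: 2, B: 2, C: 2, A: 3, B: 3, C: 3
echo implode(', ', $lifo), "\n"; // A: 1, B: 1, C: 1, C: 2, C: 3, B: 2, B: 3, A: 2, A: 3
```

The only difference between the two runs is which end of the ready list is taken, yet the observable output order changes completely.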

For example, is the scheduler run-to-suspend? FIFO or round-robin wakeup? And non-preemptive behaviour only appears here in the thread. It isn’t mentioned in the RFC itself.

In Go, for example, when it was still cooperative, these details were
also not part of any public contract. The only guarantee Go provided
was that a coroutine would not be interrupted arbitrarily. The same
applies to this RFC: coroutines are interrupted only at designated
suspension points.
However, neither Go nor any other language exposes the internal
details of the Scheduler as part of a public contract, because those
details may change without notice.

Go did document these details during its cooperative era, including exactly where goroutines might yield. Unfortunately, I can’t find a link to documentation that old. I did come across the old design docs that might shed some light on how things worked back then: https://go.dev/wiki/DesignDocuments

The key point is that Go made cooperative scheduling predictable enough that developers could write performant code without guessing.

That’s important for people writing long, CPU-bound loops, since nothing will interrupt them unless they explicitly yield.
Hypothetically, in the future it may become possible to interrupt
loops, just like Go eventually did. This would likely require an
additional RFC. PHP does have the ability to interrupt a loop at any
point, but most likely only for terminating execution.
This RFC does nothing of the sort.

My concern isn’t the lack of loop preemption. My concern is that the RFC never says CPU loops don’t yield. If it isn’t stated explicitly, it won’t be documented, and users will discover it the hard way. That’s exactly the sort of footgun we should avoid at the language level.

Lastly, cancellation during a syscall is still unclear. If a coroutine is cancelled while something like fwrite() or a DB write is in progress, what should happen?
Does fwrite() still return the number of bytes written? Does it throw? For write-operations in particular, this affects whether applications can maintain a consistent state.

If the write operation is interrupted, the function will return an
error according to its contract. In this case, it will return false.

fwrite() almost never returns false; its contract is “bytes written OR false”. Partial successful writes are normal and extremely common. So, cancellation does change the behaviour unless this is spelled out very carefully so calling code can recover appropriately.
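
For reference, this is the kind of calling code that depends on the existing fwrite() contract. It is a minimal sketch: writeAll() is a hypothetical helper, not part of PHP or the RFC, and how a cancellation would surface inside it is exactly the open question. It only shows that callers already track partial progress and need to know how many bytes went out before a failure.

```php
<?php
// writeAll() is a hypothetical helper, not part of PHP or the RFC.

/**
 * Writes all of $data to $stream, retrying after partial writes.
 *
 * @param resource $stream
 * @return int bytes actually written; less than strlen($data) if a write failed
 */
function writeAll($stream, string $data): int
{
    $total = strlen($data);
    $written = 0;
    while ($written < $total) {
        $n = fwrite($stream, substr($data, $written));
        if ($n === false || $n === 0) {
            break; // error or no progress: report how far we got
        }
        $written += $n; // a partial write is normal; continue with the rest
    }
    return $written;
}

$stream = fopen('php://memory', 'w+');
$sent = writeAll($stream, 'hello world');
rewind($stream);
$stored = stream_get_contents($stream);
fclose($stream);

echo $sent, ' ', $stored, "\n"; // 11 hello world
```

If cancellation simply made fwrite() return false, a loop like this could not tell whether the last chunk was partially sent, which is why the semantics matter for consistent state.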

Clarifying these points would really help people understand how to reason about concurrency with this API.

This is described in the document.

I may be missing something, but I don’t see this spelled out anywhere in the RFC.

There is, of course, a nuance regarding extended error descriptions,
but at the moment no such changes are planned.

That’s fine, but then do you expect the RFC to pass as-is? Right now, without suspension rules, scheduler guarantees, or defined syscall-cancellation semantics, it’s tough to evaluate the correctness and performance implications. Leaving some of the most important aspects as an “implementation detail” seems like asking for trouble.

— Rob

Rob_Landers · November 15, 2025, 8:29pm

On Sat, Nov 15, 2025, at 18:22, Edmond Dantes wrote:

To provide an explicit example for this, code that fits this pattern is going to be problematic

Why is this considered a problem if this behavior is part of the
language’s contract?
Exactly the same way as in Go for example, this is also part of the
contract between the language and the programmer.

One of the stated goals of the RFC:

Code that was originally written and intended to run outside of a Coroutine must work EXACTLY THE SAME inside a Coroutine without modifications.

The examples you give here seem to contradict that. You are now saying that developers must refactor shared state, must avoid passing objects to multiple coroutines, and must adopt a certain programming style to avoid breaking existing code. That’s the opposite of “works exactly the same without modification”.

$this->data can be changed out from under writeData(), which leads to unexpected behavior.

So the developer must intentionally create two different coroutines.
Intentionally pass them the same object.
Intentionally write this code.
And the behavior is called “unexpected”?

The original claim of the RFC is that code not written with coroutines in mind should still behave the same inside them. If any function can suspend at arbitrary points, the ordinary synchronous assumptions, including read/modify/write patterns on properties, no longer hold. Whether that pattern is good style or not doesn’t change the fact that the behaviour is different once asynchrony is introduced.

A developer must understand that potentially any function can
interrupt execution. This is a consequence of transparent asynchrony.

This is also a direct tension with another major goal:

A PHP developer should not have to think about how Coroutine switch and should not need to manage their switching—except in special cases where they consciously choose to intervene in this logic.

If any function can suspend, then developers MUST reason about all the usual concurrency hazards: torn writes, interleaving, race conditions, and the entire class of bugs that coloured function models prevent. That absolutely counts as “thinking about coroutine switching”.

It is both its strength and its weakness. I will repeat it again: not
some specific function, but almost ANY function. Because under
transparent asynchrony you can use suspend() inside any function. This
does not negate the fact that documentation should list all functions
that switch context, but a certain coding style encourages this way of
thinking.

These statements also seem to go against another goal of the RFC:

A PHP developer should not have to think about how Coroutine switch and should not need to manage their switching—except in special cases where they consciously choose to intervene in this logic.

How did we refactor old code for coroutines?

1. We took the modules that had global state. There were not many of them.
2. We used a Context, which is essential, and moved the global state
into the context.
[snip]
But why would anyone intentionally pass the same object to different
coroutines and then complain that the code broke? I have no idea who
would need that.

Existing PHP code does this today without issue. Many libraries, parsers, database clients, stream decorators, in-memory caches, DTOs, middleware chains … are built around shared mutable objects. That style is extremely common in PHP, and today it’s perfectly fine to share these things.

Saying “just refactor all your shared-state code” seems to contradict the goals given in the RFC.

A developer should strive to minimize asynchronous code in a project.
The less of it there is, the better. Asynchronous code is evil. An
anti-pattern. A high-complexity zone. But if a developer chooses to
use asynchronous code, they shouldn’t act like they’re three years old
and seeing a computer for the first time. Definitely not. This
technology requires steady, capable hands.

Best Regards, Ed.

Right now, today, PHP has almost zero async code in the ecosystem. If the position of the RFC is that transparent asynchrony is inherently dangerous, requires careful discipline, breaks common patterns, and requires refactoring shared state, then it isn’t clear how the central value proposition “existing code works unchanged” is meant to hold.

This is why the semantics need to be written down explicitly, not left to implication or the experience of those who already work with coroutines.

— Rob

Rob_Landers · November 15, 2025, 9:11pm

On Sat, Nov 15, 2025, at 17:20, John Bafford wrote:

Hi Rob, Edmond,

On Nov 15, 2025, at 06:37, Rob Landers <rob@bottled.codes> wrote:

I have concerns about the clarity of when suspension occurs in this RFC.

The RFC states as a core goal:

“Code that was originally written and intended to run outside of a Coroutine must work EXACTLY THE SAME inside a Coroutine without modifications.”

And:

“A PHP developer should not have to think about how Coroutine switch and should not need to manage their switching—except in special cases where they consciously choose to intervene in this logic.”

[…]

With explicit async/await (“coloured functions”), developers know exactly where suspension can occur. This RFC’s implicit model seems convenient, but without clear rules about suspension points, I’m unclear how developers can write correct concurrent code or reason about performance.

Could the RFC clarify the rules for when automatic suspension occurs versus when manual suspend() calls are required? Is this RFC following Go’s model where suspension timing is an implementation detail developers shouldn’t rely on? If so, that should be stated explicitly. Keep in mind that Go didn’t start that way and took nearly a decade to get there. Earlier versions of Go explicitly stated where suspensions were.

— Rob

To provide an explicit example for this, code that fits this pattern is going to be problematic:

function writeData() {
    $count = count($this->data);
    for ($x = 0; $x < $count; $x++) {
        [$path, $content] = $this->data[$x];
        file_put_contents($path, $content);
    }
    $this->data = [];
}

While there are better ways to write this function, in normal PHP code, there’s no problem here. But if file_put_contents() can block and cause a different coroutine to run, $this->data can be changed out from under writeData(), which leads to unexpected behavior. (e.g. $this->data changes length, and now writeData() no longer covers all of it; or it runs past the end of the array and errors; or doesn’t see there’s a change and loses it when it clears the data).
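
The hazard can be reproduced today with plain Fibers (PHP 8.1+). This is only a sketch: the `Buffer` class and its `write()` method are hypothetical stand-ins, and `Fiber::suspend()` simulates a `file_put_contents()` that yields to a scheduler, which is an assumption about the proposed behaviour rather than something the RFC text confirms.

```php
<?php
final class Buffer
{
    public array $data = [];
    public array $written = [];

    // Hypothetical stand-in for a file_put_contents() that suspends.
    private function write(string $path, string $content): void
    {
        Fiber::suspend();              // simulated hidden suspension point
        $this->written[] = $path;
    }

    public function writeData(): void
    {
        $count = count($this->data);   // length captured once, before any suspension
        for ($x = 0; $x < $count; $x++) {
            [$path, $content] = $this->data[$x];
            $this->write($path, $content);
        }
        $this->data = [];              // also discards anything appended meanwhile
    }
}

$buffer = new Buffer();
$buffer->data = [['a.txt', 'A'], ['b.txt', 'B']];

$writer = new Fiber(fn () => $buffer->writeData());
$writer->start();                      // runs until the first simulated write suspends

$buffer->data[] = ['c.txt', 'C'];      // another "coroutine" queues more work

while (!$writer->isTerminated()) {
    $writer->resume();
}

var_dump($buffer->written);            // only a.txt and b.txt were written
var_dump($buffer->data);               // empty: the c.txt entry was lost
```

The entry for `c.txt` is neither written nor kept, even though `writeData()` itself was not changed.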

Now, yes, the programmer would have to do something to cause there to be two coroutines running in the first place. But if this code was correct when “originally written and intended to run outside of a Coroutine”, and with no changes is incorrect when run inside a coroutine, one can only say that it is working “exactly the same” with coroutines by ignoring that it is now wrong.

Suspension points, whether explicit or hidden, allow for the entire rest of the world to change out from under the caller. The only way for non-async-aware code to operate safely is for suspension to be explicit (which, of course, means the code now must be async-aware). There is no way in general for code written without coroutines or async suspensions in mind to work correctly if it can be suspended.

-John

I should have put all these emails combined into a single email … but here we are.

John’s example captures the core issue, and I want to take a moment and expand on it from a different angle. My concern with implicit suspensions isn’t theoretical. It’s exactly why nearly every modern language abandoned this model.

Transparent, implicit suspension means that any line of code can become an interleaving point. That makes a large class of patterns, which are perfectly safe in synchronous PHP today, unsafe the moment they run inside a coroutine. A few concrete examples:

With property hooks and implicit suspension, even this becomes unsafe:

$this->counter++;

A suspension can happen between the read and the write. Another coroutine can mutate the counter in between. The programmer did nothing wrong; it’s just a hazard introduced by invisible suspension.
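
A minimal sketch of that lost update, using plain Fibers rather than the proposed API. The `Counter` class is hypothetical, and explicit `read()`/`write()` methods stand in for property hooks; the `Fiber::suspend()` inside `write()` models a setter that performs I/O before storing the value.

```php
<?php
final class Counter
{
    private int $value = 0;

    public function read(): int
    {
        return $this->value;
    }

    // Hypothetical setter that suspends (e.g. for logging I/O) before storing.
    public function write(int $value): void
    {
        Fiber::suspend();
        $this->value = $value;
    }

    // Equivalent of $this->counter++ split into its read and write halves.
    public function increment(): void
    {
        $current = $this->read();
        $this->write($current + 1);
    }
}

$counter = new Counter();
$a = new Fiber(fn () => $counter->increment());
$b = new Fiber(fn () => $counter->increment());

$a->start();    // A reads 0, then suspends inside write()
$b->start();    // B also reads 0, then suspends inside write()
$a->resume();   // A stores 1
$b->resume();   // B stores 1 as well

echo $counter->read(), "\n";   // prints 1, although increment() ran twice
```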

And consider how this can break invariants:

$this->balance -= $amount;
$this->ledger->writeEntry($this->id, -$amount);

If the first line suspends, the balance can be changed somewhere else before the ledger entry is written (which breaks an invariant that the balance is a reflection of the ledger). With transparent async, it’s suddenly a race condition.

Then you can have time pass invisibly:

if (!$cache->has($key)) {
    $cache->set($key, $value);
}

If has() suspends, anything can happen to that cache key before the set. The invariant becomes incorrect.
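
The same check-then-act pattern can be sketched with Fibers. The `Cache` class and `fillOnce()` helper below are hypothetical, and the `Fiber::suspend()` in `has()` simulates a lookup that yields after it has already computed its answer.

```php
<?php
final class Cache
{
    private array $items = [];
    public int $sets = 0;

    public function has(string $key): bool
    {
        $found = array_key_exists($key, $this->items);
        Fiber::suspend();          // simulated suspension; $found may now be stale
        return $found;
    }

    public function set(string $key, string $value): void
    {
        $this->items[$key] = $value;
        $this->sets++;
    }

    public function get(string $key): ?string
    {
        return $this->items[$key] ?? null;
    }
}

function fillOnce(Cache $cache, string $key, string $value): void
{
    if (!$cache->has($key)) {
        $cache->set($key, $value);
    }
}

$cache = new Cache();
$a = new Fiber(fn () => fillOnce($cache, 'k', 'from A'));
$b = new Fiber(fn () => fillOnce($cache, 'k', 'from B'));

$a->start();    // A sees "missing", then suspends
$b->start();    // B sees "missing" too
$a->resume();   // A sets the key
$b->resume();   // B overwrites it despite the guard

echo $cache->sets, ' ', $cache->get('k'), "\n";   // prints "2 from B"
```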

Implicit suspension allows any function to be re-entered before it returns. That can lead to partially updated objects, state machines appearing to skip states, “method called twice before return” bugs, double writes, and re-entrant callbacks being invoked with inconsistent state.

The bugs are extremely challenging to debug because the programmer never actually wrote any async code.

I’ve had the “pleasure” of working on Fiber frameworks that use raw fibers (no async/await you get from React/Amp, though I’ve worked with those pretty extensively as well). These are the bugs you run into all the time, where you sometimes have to literally put a suspension in a seemingly random place to fix a bug.

Implicit async blurs one of the most important boundaries in software design: “this code cannot be interrupted” vs “this code can be interrupted”.

JavaScript moved from implicit async → promises → async/await
Python moved from callbacks/greenlets → async/await
Ruby moved from fibers → explicit schedulers
Go eventually added true preemption

Even the creators of Fibers eventually wrote async/await on top of them, because implicit async is broken and coloured functions close off entire classes of bugs and make reasoning possible again.

I understand the desire for “transparent async”, but once a language allows suspension at arbitrary points, it can no longer promise invariants, atomic sequences, non-reentrancy, predictable control flow, or even correctness in general.

— Rob

bukka · November 15, 2025, 9:17pm

Hi,

On Sat, Nov 15, 2025 at 8:56 PM Rob Landers rob@bottled.codes wrote:

On Sat, Nov 15, 2025, at 15:41, Edmond Dantes wrote:

Hello.

Based on the conversation so far, I’d imagine the list to look something like:

Yes, that’s absolutely correct. When a programmer uses an operation
that would normally block the entire thread, control is handed over to
the Scheduler instead.
The suspend function is called inside all of these operations.

I think that “normally” is doing a lot of work here. fwrite() can block, but often doesn’t. file_get_contents() is usually instant for local files but can take seconds on NFS or with an HTTP URL. An array_map() always blocks the thread but should never suspend.

Without very clear rules, it becomes impossible to reason about what’ll suspend and what won’t.

If that’s the intended model, it’d help to have that spelled out directly; it makes it immediately clear which functions can or will suspend and prevents surprises.

In the Async implementation, it will be specified which functions are supported.

This is exactly the kind of thing that needs to be in the RFC itself. Relying on “the implementation will document it” creates an unstable contract.

Even something simple like:

if it can perform network IO

if it can perform file/stream IO

if it can sleep or wait on timers

None of the above is part of this RFC, so why is this being discussed? Any changes to the stream layer and extensions will require a separate RFC and, mainly, a clean implementation. We will need to carefully consider where the suspension is going to be done.

I think if there are parts of the RFC that mention IO, they should be removed here. I think this RFC should also remove any mention of the reactor, as it’s irrelevant for this.

Kind regards,

Jakub

Rob_Landers · November 15, 2025, 9:19pm

On Sat, Nov 15, 2025, at 22:17, Jakub Zelenka wrote:

None of the above is part of this RFC, so why is this being discussed? Any changes to the stream layer and extensions will require a separate RFC and, mainly, a clean implementation. We will need to carefully consider where the suspension is going to be done.

My point is that it should be a part of the RFC.

— Rob

EdmondDantes · November 15, 2025, 9:20pm

Hello.

An array_map() always blocks the thread but should never suspend.

This function is not related to this discussion or to the RFC.

Without very clear rules, it becomes impossible to reason about what’ll suspend and what won’t.

As I mentioned earlier, this RFC clearly defines the rules for
integrating functions. The functions themselves will be documented.

That’s not quite enough. The order really matters. Different schedulers produce different observable results.

No modern language guarantees a fixed execution order of coroutines.
Go, Kotlin, Python, JavaScript, C#, Rust. All only guarantee order
when explicit synchronization is used.
Everything else is an implementation detail of the scheduler, and user
code must not rely on it.
(Because concurrency naturally has many valid execution paths, not one.
Forcing a single fixed order is impossible without adding heavy
synchronization everywhere, which destroys performance and breaks the
concurrency model.)
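
To illustrate how the observable order depends on the scheduler, here is a small sketch with two toy schedulers built on Fibers. The `runAll()` function and its FIFO/LIFO policies are hypothetical and are not the scheduler proposed in the RFC.

```php
<?php
/**
 * Runs the given tasks to completion and returns the order of their steps.
 * $tasks maps a task name to its number of steps; $lifo picks the policy.
 */
function runAll(array $tasks, bool $lifo): array
{
    $log = [];
    $queue = [];
    foreach ($tasks as $name => $steps) {
        $queue[] = new Fiber(function () use ($name, $steps, &$log): void {
            for ($i = 1; $i <= $steps; $i++) {
                $log[] = "$name$i";
                Fiber::suspend();      // yield to the toy scheduler after each step
            }
        });
    }

    while ($queue !== []) {
        // FIFO takes the oldest ready fiber, LIFO the most recently queued one.
        $fiber = $lifo ? array_pop($queue) : array_shift($queue);
        $fiber->isStarted() ? $fiber->resume() : $fiber->start();
        if (!$fiber->isTerminated()) {
            $queue[] = $fiber;
        }
    }

    return $log;
}

echo implode(',', runAll(['A' => 2, 'B' => 2], false)), "\n";   // A1,B1,A2,B2
echo implode(',', runAll(['A' => 2, 'B' => 2], true)), "\n";    // B1,B2,A1,A2
```

Both runs execute the same coroutine bodies; only the scheduling policy differs, so code that depends on one particular interleaving would behave differently under the other.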

Unfortunately, I can’t find a link to documentation that old.

no one can

I may be missing something, but I don’t see this spelled out anywhere in the RFC.

What exactly were you unable to find?

Right now, without suspension rules, scheduler guarantees, defined syscall-cancellation semantics, it’s tough to evaluate the correctness and performance implications.
Leaving some of the most important aspects as an "implementation detail" seems like asking for trouble.

I’m sorry, but it’s difficult for me to understand what this is referring to.

EdmondDantes · November 15, 2025, 9:24pm

The examples you give here seem to contradict that. You are now saying that developers must refactor shared state, must avoid passing objects to multiple coroutines,
and must adopt a certain programming style to avoid breaking existing code. That’s the opposite of "works exactly the same without modification".

I provided an explanation in my earlier messages.

Right now, today, PHP has almost zero async code in the ecosystem.

That is not accurate. PHP already has a significant amount of
async-style code in the ecosystem: Amphp, ReactPHP, Swoole, Swow, and
multiple async HTTP clients, database drivers, and event-loop
libraries. The ecosystem is not “almost zero”; it’s simply fragmented
across several implementations.

If the position of the RFC is that transparent asynchrony

My comment was about programming languages in general.

bukka · November 15, 2025, 9:27pm

Hi,

On Thu, Nov 13, 2025 at 4:25 PM Edmond Dantes <edmond.ht@gmail.com> wrote:

Hello Jakub.

I think it would be good to see the implementation that can cover the currently proposed API and try to strip it as much as possible so it doesn’t contain much more than that. We saw that PR for the async API was already quite
big and we didn’t really get any agreement there partially also because there was no user of that and it was not possible to have any tests for it (without writing them in C).
So what I’m thinking is that if there is a minimal version that implements just this (e.g. the reactor can just be a dummy because there is no IO at the moment, and other things can be stripped too), then the voters would get a better idea of what they are dealing with and could even try it out.

I understand what you mean.

**Regarding simplifying the code.**
Any “simplification” essentially comes down to removing stub files and
the C classes that implement the PHP classes. This is a relatively
small part of the project.

It’s 1.6k lines so it might help a little bit

For example, removing Scope from the C code doesn’t make much sense,
because it turned out (even unintentionally) to be a very convenient
structure for tracking a group of coroutines.
In other words, no major changes to the code are expected before the
PR review begins.

I don’t think you can create a PR with the whole project. It’s not going to get reviewed and merged. It might not even open in GH. So you will need to come up with a way to split it into small pieces, and I think this is the first self-contained bit that should be offered in minimal form.

**Reactor.**
Since the reactor uses libUV and we currently do not plan to provide a
pure-C implementation, we agreed to move it into a separate library.

Why do you need a reactor for this specific part of the proposal? The thing is that there shouldn’t be any IO, so you can reduce the scheduler code as well and make it simpler and more reviewable.

Kind regards,

Jakub

bukka · November 15, 2025, 9:40pm

On Sat, Nov 15, 2025 at 10:19 PM Rob Landers rob@bottled.codes wrote:

On Sat, Nov 15, 2025, at 22:17, Jakub Zelenka wrote:

None of the above is part of this RFC, so why is this being discussed? Any changes to the stream layer and extensions will require a separate RFC and, mainly, a clean implementation. We will need to carefully consider where the suspension is going to be done.

My point is that it should be a part of the RFC.

But this is hard to know exactly. Also, there will always be 3rd-party extensions that can block, so we will need to do it piece by piece. You can just take it that ideally everything that can block would be suspendable. The first candidate is surely the stream internal poll, which is used for stream IO in various places and could handle most suspensions, including in mysqlnd. Then curl and sockets would probably be added. There are various other bits already present in Edmond’s PoC, but we will need to consider them one by one.

In other words, we can’t really know that until we have some base pieces merged (this RFC) and there is an acceptable implementation that can be merged for those parts.

Kind regards,

Jakub

EdmondDantes · November 15, 2025, 9:43pm

With property hooks and implicit suspension, even this becomes unsafe:
A suspension can happen between the read and the write. Another coroutine can mutate the counter in between. The programmer did nothing wrong; it's just a hazard introduced by invisible suspension.

The risk of a variable being modified by different coroutines does not
depend on the transparency model. This effect is possible in both
implementations.
Even if a setter triggers a suspension, it does not affect the logical
execution flow. Therefore, no danger arises.

The difference between the transparent model and the explicit one lies
in other aspects. It seems this discussion took place in March of this
year.

EdmondDantes · November 15, 2025, 10:02pm

Hello

It's 1.6k lines so it might help a little bit

Yeah

Why do you need reactor for this specific part of proposal? The thing is that there shouldn't be any IO so you reduce scheduler code as well and make it simpler and more reviewable.

So you are suggesting removing all I/O from the RFC. On the one hand,
that sounds appealing. It immediately eliminates a bunch of tests. But
there is a trap we could all fall into.
By separating the reactor and the scheduler, as well as the rules of
how they work together, we might accidentally introduce an error into
the document simply because the documents would be split.
(interface drift)

The fact that the I/O rules and coroutine rules are part of a single
document developed together is actually an advantage, just as the
existence of separate RFCs for await and Scope is. And this does not
prevent splitting the implementation code into many small parts.

On the other hand, STREAM has corner cases, especially in error
situations, that would be appropriate to discuss and formalize in a
separate RFC. Ideally, this should probably be done right before
accepting the STREAM PR. However, such a description does not carry
significant risks.

This requires some thought.

Rob_Landers · November 15, 2025, 10:00pm

On Sat, Nov 15, 2025, at 22:40, Jakub Zelenka wrote:

On Sat, Nov 15, 2025 at 10:19 PM Rob Landers rob@bottled.codes wrote:

On Sat, Nov 15, 2025, at 22:17, Jakub Zelenka wrote:

None of the above is part of this RFC, so why is this being discussed? Any changes to the stream layer and extensions will require a separate RFC and, mainly, a clean implementation. We will need to carefully consider where the suspension is going to be done.

My point is that it should be a part of the RFC.

But this is hard to know exactly. Also, there will always be 3rd-party extensions that can block, so we will need to do it piece by piece. You can just take it that ideally everything that can block would be suspendable. The first candidate is surely the stream internal poll, which is used for stream IO in various places and could handle most suspensions, including in mysqlnd. Then curl and sockets would probably be added. There are various other bits already present in Edmond’s PoC, but we will need to consider them one by one.

In other words, we can’t really know that until we have some base pieces merged (this RFC) and there is an acceptable implementation that can be merged for those parts.

Kind regards,

Jakub

I guess my main point is that this RFC should only cover the coroutine machinery. It should not promise “transparent async” or “code that works exactly the same”; or, if it wants to make those claims, it should actually demonstrate how, instead of hand-waving everything away as an “implementation detail” when none of those claims can be validated without those details.

— Rob