[PHP-DEV] PHP True Async RFC

I can give you several examples where such logic is used in Amphp libraries, and it will break if they are invoked within an async block.

Got it, it looks like I misunderstood the post due to my focus. So, essentially, you’re talking not so much about wait_all itself, but rather about the parent-child vs. free model.

This question is what concerns me the most right now.

If you have real examples of how this can cause problems, I would really appreciate it if you could share them. Code is the best criterion of truth.

You misunderstand:

Yes, I misunderstood. It would be interesting to see the code with the destructor to analyze this approach better.

Let me summarize the current state for today:

  1. I am abandoning startScheduler and the idea of preserving backward compatibility with await_all or anything else in that category. The scheduler will be initialized implicitly, and this does not concern user-land. Consequently, the spawn function() code will work everywhere and always.

  2. I will not base the implementation on Fiber (perhaps only on the low-level part). Instead of Fiber, there will be a separate class. There will be no changes to Fiber at all. This decision follows the principle of Win32 COM/DCOM: old interfaces should never be changed. If an old interface needs modification, it should be given a new name. This should have been done from the start.

  3. I am abandoning low-level objects in PHP-land (FiberHandle, SocketHandle, etc.). Over time, no one has voted for them, which means they are unnecessary. There might be a low-level interface for compatibility with Revolt.

  4. It might be worth restricting microtasks in PHP-land and keeping them only for C code. This would simplify the interface, but we need to ensure that it doesn’t cause any issues.

The remaining question on the agenda: deciding which model to choose — parent-child or the Go-style model.

Thanks


Ed

I think the same thing applies to scheduling coroutines: we want the Scheduler to take over the “null fiber”,

Yes, you have quite accurately described a possible implementation.
When a programmer loads the initial index.php, its code is already running inside a coroutine.
We can call it the main coroutine or the root coroutine.

When the index.php script reaches its last instruction, the coroutine finishes, execution is handed over to the Scheduler, and then everything proceeds as usual.

Accordingly, if the Scheduler has more coroutines in the queue, reaching the last line of index.php does not mean the script terminates. Instead, it continues executing the queue until… there is nothing left to execute.
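This drain-until-empty behaviour can be sketched with a toy cooperative scheduler in Python (plain generators; purely illustrative, with no relation to the actual C implementation):

```python
from collections import deque

events = []

def scheduler(root):
    """Run `root`, then keep draining the queue until nothing is left."""
    queue = deque([root])
    while queue:                      # reaching the end of `root` does not stop this loop
        task = queue.popleft()
        try:
            child = next(task)        # a task may yield a generator to "spawn" it
            if child is not None:
                queue.append(child)
            queue.append(task)        # cooperative: task goes to the back of the queue
        except StopIteration:
            pass                      # this task finished; others may still be queued

def child_task():
    yield                             # hand control back once
    events.append("child finishes after the last line of main")

def main_script():                    # stands in for index.php
    events.append("main start")
    yield child_task()                # like `spawn`: enqueue, do not wait
    events.append("main end")         # the "last line" of the script

scheduler(main_script())
```

After `main_script` reaches its last instruction, the loop keeps running the remaining queued task, which is exactly the "continues executing the queue until there is nothing left" behaviour described above.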

At that point, the relationship to a block syntax perhaps becomes clearer:

Thanks to the extensive discussion, I realized that the implementation with startScheduler raises too many questions, and it’s better to sacrifice a bit of backward compatibility for the sake of language elegance.

After all, Fiber is unlikely to be used by ordinary programmers.

Edmond,
The language barrier is a big one (because of me, I cannot properly explain it), so I will keep it simple. Having “await” makes it sync, not async. In hardware we use interrupts, but we have to do it grandma style: the main loop checks variables set by the interrupts, which is async. So you have a main loop that checks a variable, but that variable is set from another part of the processor cycle that has nothing to do with the main loop (it is not fire-and-forget style; it happens in real time). Basically, you can have a standard int main() function that is sync, because you can delay in it (yep, sleep(0)), and while you block it, an event can interrupt into a function that works on another register, independent of the main function. More details of this would probably not be interesting, so I will stop. If you want to make async PHP with multiple processes, you have to check semaphored variables to make it work.
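For what it is worth, this interrupt-plus-polling description maps onto a small Python sketch, with threading.Event standing in for the variable set by the interrupt (names are mine, purely illustrative):

```python
import threading
import time

flag = threading.Event()              # the "variable set by the interrupt"

def interrupt_handler():
    # Runs independently of the main loop, like an ISR setting a register.
    time.sleep(0.01)
    flag.set()

def main_loop():
    # The "sync" main loop: it blocks, and just polls the flag.
    threading.Thread(target=interrupt_handler).start()
    polls = 0
    while not flag.is_set():          # busy-wait on the variable
        polls += 1
        time.sleep(0.001)
    return polls
```

The main loop itself is synchronous; the asynchrony lives entirely in the other thread of execution setting the flag.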

···

Iliya Miroslavov Iliev
i.miroslavov@gmail.com

Edmond,

If you want to make async PHP with multiple processes you have to check variables semaphored to make it work.

Hello, Iliya.

Thank you for your feedback. I’m not sure if I fully understood the entire context. But.

At the moment, I have no intention of adding multitasking to PHP in the same way it works in Go.

Therefore, code will not require synchronization. The current RFC proposes adding only asynchronous execution. That means each thread will have its own event loop, its own memory, and its own coroutines.

P.s. I know also Russian and a bit asm.
Ed.

On Sun, Mar 9, 2025, at 14:17, Rowan Tommins [IMSoP] wrote:

On 08/03/2025 20:22, Edmond Dantes wrote:

For coroutines to work, a Scheduler must be started. There can be only one Scheduler per OS thread. That means creating a new async task does not create a new Scheduler.

Apparently, async {} in the examples above is the entry point for the Scheduler.

I’ve been pondering this, and I think talking about “starting” or “initialising” the Scheduler is slightly misleading, because it implies that the Scheduler is something that “happens over there”.

It sounds like we’d be writing this:

// No scheduler running, this is probably an error
Async\runOnScheduler( something(…) );

Async\startScheduler();

// Great, now it’s running…
Async\runOnScheduler( something(…) );

// If we can start it, we can stop it I guess?
Async\stopScheduler();

But that’s not what we’re talking about. As the RFC says:

Once the Scheduler is activated, it will take control of the Null-Fiber context, and execution within it will pause until all Fibers, all microtasks, and all event loop events have been processed.

The actual flow in the RFC is like this:

// This is queued somewhere special, ready for a scheduler to pick it up later
Async\enqueueForScheduler( something(…) );

// Only now does anything actually run
Async\runSchedulerUntilQueueEmpty();

// At this point, the scheduler isn’t running any more
// If we add to the queue now, it won’t run unless we run another scheduler
Async\enqueueForScheduler( something(…) );

Pondering this, I think one of the things we’ve been missing is what Unix[-like] systems call “process 0”. I’m not an expert, so may get details wrong, but my understanding is that if you had a single-tasking OS, and used it to bootstrap a Unix[-like] system, it would look something like this:

  1. You would replace the currently running single process with the new kernel / scheduler process

  2. That scheduler would always start with exactly one process in the queue, traditionally called “init”

  3. The scheduler would hand control to process 0 (because it’s the only thing in the queue), and that process would be responsible for starting all the other processes in the system: TTYs and login prompts, network daemons, etc

Slightly off-topic, but you may find the following article interesting: https://manybutfinite.com/post/kernel-boot-process/

It’s a bit old, but probably still relevant for the most part. At least for x86.

— Rob

On Sun, Mar 9, 2025, at 8:17 AM, Rowan Tommins [IMSoP] wrote:

That leaves the question of whether it would ever make sense to nest
those blocks (indirectly, e.g. something() itself contains an async{}
block, or calls something else which does).

I guess in our analogy, nested blocks could be like running Containers
within the currently running OS: they don't actually start a new
Scheduler, but they mark a namespace of related coroutines, that can be
treated specially in some way.

Alternatively, it could simply be an error, like trying to run the
kernel as a userland program.

Support for nested blocks is absolutely mandatory, whatever else we do. If you cannot nest one async block (scheduler instance, coroutine, whatever it is) inside another, then basically no code can do anything async except the top level framework.

This function needs to be possible, and work anywhere, regardless of whether there's an "open" async session 5 stack calls up.

function par_map(iterable $it, callable $c) {
  $result = [];
  async {
    foreach ($it as $val) {
      $result[] = $c($val);
    }
  }
  return $result;
}

However it gets spelled, the above code needs to be supported.

--Larry Garfield

On Sun, Mar 9, 2025, at 11:56 AM, Edmond Dantes wrote:

*Let me summarize the current state for today:*

1. I am abandoning `startScheduler` and the idea of preserving
backward compatibility with `await_all` or anything else in that
category. The scheduler will be initialized implicitly, and this does
not concern user-land. Consequently, the `spawn function()` code will
work everywhere and always.

2. I will not base the implementation on `Fiber` (perhaps only on the
low-level part). Instead of `Fiber`, there will be a separate class.
There will be no changes to `Fiber` at all. This decision follows the
principle of Win32 COM/DCOM: old interfaces should never be changed. If
an old interface needs modification, it should be given a new name.
This should have been done from the start.

3. I am abandoning low-level objects in PHP-land (FiberHandle,
SocketHandle etc). Over time, no one has voted for them, which means
they are unnecessary. There might be a low-level interface for
compatibility with Revolt.

4. It might be worth restricting microtasks in PHP-land and keeping
them only for C code. This would simplify the interface, but we need to
ensure that it doesn’t cause any issues.

The remaining question on the agenda: deciding which model to choose —
*parent-child* or the *Go-style model*.

As noted, I am in broad agreement with the previously linked article on "playpens" (even if I hate that name), that the "go style model" is too analogous to goto statements.

Basically, this is asking "so do we use gotos or for loops?" For which the answer is, I hope obviously, for loops.

Offering both, frankly, undermines the whole point of having structured, predictable concurrency. The entire goal of that is to be able to know if there's some stray fiber running off in the background somewhere still doing who knows what, manipulating shared data, keeping references to objects, and other nefarious things. With a nursery, you don't have that problem... *but only if you remove goto*. A language with both a for loop and an arbitrary goto statement gets basically no systemic benefit from having the for loop, because neither developers nor compilers get any guarantees of what will or won't happen.

Especially when, as demonstrated, the "this can run in the background and I don't care about the result" use case can be solved more elegantly with nested blocks and channels, and in a way that, in practice, would probably get subsumed into DI Containers eventually so most devs don't have to worry about it.

Of interesting note along similar lines would be Rust, and... PHP.

Rust's whole thing is memory safety. The language simply will not let you write memory-unsafe code, even if it means the code is a bit more verbose as a result. In exchange for the borrow checker, you get enough memory guarantees to write extremely safe parallel code. However, the designers acknowledge that occasionally you do need to turn off the checker and do something manually... in very edge-y cases in very small blocks set off with the keyword "unsafe". Viz, "I know what I'm doing is stupid, but trust me." The discouragement of doing so is built into the language, and tooling, and culture.

PHP... has a goto operator. It was added late, kind of as a joke, but it's there. However, it is not a full goto. It can only jump within the current function, and only "up" out of control structures. It's basically a named break. While it only rarely has value, it's not all that harmful unless you do something really dumb with it. And then it's only harmful within the scope of the function that uses it. And, very very rarely, there's some micro-optimization to be had. (cf. this classic: Recursion instead of goto · Issue #3 · igorw/retry · GitHub). But PHP has survived quite well for 30 years without an arbitrary goto statement.

So if we start from a playpen-like, structured concurrency assumption, which (as demonstrated) gives us much more robust code that is easier to follow and still covers nearly all use cases, there's two questions to answer:

1. Is there still a need for an "unsafe {}" block or in-function goto equivalent?
2. If so, what would that look like?

I am not convinced of 1 yet, honestly. But if it really is needed, we should be targeting the least-uncontrolled option possible to allow for those edge cases. A quick-n-easy "I'mma violate the structured concurrency guarantees, k?" undermines the entire purpose of structured concurrency.

During our discussion, everything seems to be converging on the idea
that the changes introduced by the RFC into `Fiber` would be better
moved to a separate class. This would reduce confusion between the old
and new solutions. That way, developers wouldn't wonder why `Fiber` and
coroutines behave differently—they are simply different classes.
The new *Coroutine* class could have a different interface with new
logic. This sounds like an excellent solution.

The interface could look like this:

• *`suspend`* (or another clear name) – a method that explicitly hands
over execution to the *Scheduler*.
• *`defer`* – a handler that is called when the coroutine completes.
• *`cancel`* – a method to cancel the coroutine.
• *`context`* – a property that stores the execution context.
• *`parent`* (public property or `getParent()` method) – returns the
parent coroutine.
(*Just an example for now.*)
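For a sense of how these members behave in an existing runtime: Python’s asyncio.Task has a rough counterpart for each (suspend ≈ await sleep(0), defer ≈ add_done_callback, cancel ≈ Task.cancel, context ≈ contextvars). A small illustrative sketch, not a proposal for the PHP API:

```python
import asyncio
import contextvars

request_id = contextvars.ContextVar("request_id", default=None)

async def demo():
    finished = []

    async def worker():
        await asyncio.sleep(0)                  # `suspend`: hand control to the scheduler
        return request_id.get()                 # `context`: captured when the task is spawned

    request_id.set("req-1")
    task = asyncio.create_task(worker())        # spawn a coroutine
    task.add_done_callback(lambda t: finished.append("defer ran"))   # `defer`
    result = await task

    hung = asyncio.create_task(asyncio.sleep(60))
    hung.cancel()                               # `cancel`
    try:
        await hung
    except asyncio.CancelledError:
        finished.append("cancelled")
    return result, finished
```

The one member with no asyncio counterpart is `parent`: asyncio tasks are free-floating (Go-style), which is precisely the design question on the table.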

The *Scheduler* would be activated automatically when a coroutine is
created. If the `index.php` script reaches the end, the interpreter
would wait for the *Scheduler* to finish its work under the hood.

Do you like this approach?

That API is essentially what I was calling "AsyncContext" before. I am flexible on the name, as long as it is descriptive and gives the user the right mental model. :-) (I'm not sure if Coroutine would be the right name either, since in what I was describing it's the spawn command that starts a coroutine; the overall async scope is the container for several coroutines.)

But perhaps that is a sufficient "escape hatch"? Spitballing again:

async $nursery { // Formerly AsyncContext
  // Runs at the end of this nursery scope
  $nursery->defer($fn);

  // This creates and starts a coroutine, in this scope.
  $future = $nursery->spawn($fn);

  // A short-hand for "spawn this coroutine, in whatever the nearest async nursery scope is."
  // aka, an alias for the above line, but doesn't require passing $nursery around.
  $future = spawn $fn;

  // If you want.
  $future->cancel();

  // See below.
  $nursery->spawn(stuff(...));
} // This blocks until escape() finishes, too, because it was bound to this scope.

function stuff() {
  async $inner {
    // This is bound to the $inner scope; $inner cannot end
    // until this is complete. This is by design.
    spawn $fn;

    // This spawns a new coroutine on the parent scope, if any.
    // If there isn't one, $inner->parent is null so it falls back
    // to the current scope.
    // One could technically climb the entire tree to the top-most
    // scope and spawn a coroutine there. It would be a bit annoying to do,
    // but, as noted, that's a good thing, because you shouldn't be doing that 99.9% of the time!
    // Channels are better 99.9% of the time.
    ($inner->parent ?? $inner)->spawn(escape(...));
  }
}

I'm not sure I fully like the above. I don't know if it makes the guarantees too weak still. But it does offer a limited, partial escape hatch, so may be an acceptable compromise.

It would be valuable to take this idea (or whatever we end up with) to experts in other languages with better async models than JS, and maybe a few academics, to let them poke obvious-to-them holes in it.

Edmund, does that make any sense to you?

--Larry Garfield

As noted, I am in broad agreement with the previously linked article on “playpens” (even if I hate that name), that the “go style model” is too analogous to goto statements.

The syntax and logic you describe are very close to Kotlin’s implementation.

I would say that Kotlin is probably the best example of structured concurrency organization, which is closest to PHP in terms of abstraction level.
One downside I see in Kotlin’s syntax is its complexity.

However, what stands out is the CoroutineScope concept. Instead of linking coroutines through Parent-Child relationships, Kotlin binds them to execution contexts. At the same time, the GlobalScope context is accessible everywhere.

https://kotlinlang.org/docs/coroutines-and-channels.html#structured-concurrency

So, it is not the coroutine that maintains the hierarchy, but the Scope object, which essentially aligns with the RFC proposal: contexts can be hierarchically linked.

This model sounds promising because it relieves coroutines from the responsibility of waiting for their child coroutines. It is not the coroutine that should wait, but the Scope that should “wait.”


spawn function {
    echo "c1\n";

    spawn function {
        echo "c2\n";
    };

    echo "c1 end\n";
};

There is no reason to keep coroutine1 in memory if it does not need coroutine2. However, the Scope will remain in memory until all associated coroutines are completed. This memory model is entirely fair.

Let’s consider a scenario with libraries. A library may want to run coroutines under its own control. This means that the library wants to execute coroutines within its own Scope. For example:


class Logger {
    private CoroutineScope $scope;

    public function __construct() {
        $this->scope = new CoroutineScope();
    }

    public function log(mixed $data) {
        // Adding another coroutine to our personal Scope
        $this->scope->spawn($this->handle_log(...), $data);
    }

    public function __destruct()
    {
        // We can explicitly cancel all coroutines in the destructor if we find it appropriate
        $this->scope->cancel();
    }
}

Default Behavior

By default, the context is always inherited when calling spawn, so there is no need to pass it explicitly. The expression spawn function {} is essentially equivalent to currentScope->spawn.

The behavior of an HTTP server in a long-running process would look like this:


function receiveLoop()
{
    while (true) {
        // Simulating waiting for an incoming connection
        $connection = waitForIncomingConnection();

        // Creating a new Scope for handling the request
        $requestScope = new CoroutineScope();

        // Processing the request inside its own scope
        $requestScope->spawn(function () use ($connection) {
            handleRequest($connection);
        });
    }
}

Scope allows the use of the “task group” and “await all” patterns without additional syntax, making it convenient.

$scope = new CoroutineScope();

$scope->spawn(function () {
    echo "Task 1 started\n";
    sleep(1);
    echo "Task 1 finished\n";
});

$scope->spawn(function () {
    echo "Task 2 started\n";
    sleep(2);
    echo "Task 2 finished\n";
});

// Wait for all tasks to complete
$scope->awaitAll();

What is the advantage of using Scope instead of parent-child relationships in coroutines?

If a programmer never uses Scope, then the behavior is literally the same as in Go. This means that the programmer does not need structured relationships, and it does not matter when a particular coroutine completes.

At the same time, code at a higher level can control the execution of coroutines created at lower levels. The known downside of this approach is that if lower-level code needs its coroutines to run independently, it must explicitly define this behavior.

According to analysis, this model is effective in most modern languages, especially those designed for business logic.

Who Will Use Coroutines?

I believe that the primary consumers of coroutines are libraries and frameworks, which will provide developers with services and logic to solve tasks.

If a library refuses to consider that its coroutine may be canceled by the user’s code, or how it should be canceled, it means the library is neglecting its responsibility to provide a proper contract.

The default contract must give the user the power to terminate all coroutines launched within a given context because only the user of the library knows when this needs to be done. If a library has a different opinion, then it is obligated to explicitly implement this behavior.

This means that libraries and services bear greater responsibility than their users. But isn’t that already the case?

The language simply will not let you write memory-unsafe code

And this is more of an anti-example than an example.

But this analogy, like any other, cannot be used as an argument for or against. Memory safety is not the same as launching a coroutine in GlobalScope.

Where is the danger here? That the coroutine does something? But why is that inherently bad?

It only becomes a problem when a coroutine accidentally captures memory from the current context via use(), leaving an object in memory that logically should have been destroyed when the request Scope was destroyed.

However, you cannot force a programmer to avoid writing such code. Neither nurseries nor structured concurrency will take away this possibility.

If a coroutine does not capture incorrect objects and does not wait on the wrong $channel, then it should not be considered an issue.

I’m not sure if Coroutine would be the right name either

I’m not an expert in choosing good names, so I rely on others’ opinions. If the term coroutine feels overused, we can look for something else. But what?

> ($inner->parent ?? $inner)->spawn(escape(...));

But the meaning of this code raises a question: why am I placing my coroutine in the parent’s context if my own context is already inherited from the parent?

Or do I want my coroutine to be destroyed only with the parent’s context, but not with the current one? But then, how do I know that I should place the coroutine in the parent, rather than in the parent’s parent?

It makes an assumption about something it cannot possibly know.

I would suggest explicitly specifying which Scope we want:


function stuff() {
    async $inner {
        // While the request is active
        ($inner->find('requestScope') ?? $inner)->spawn(escape(...));

        // Or while the server is running
        ($inner->find('serverScope') ?? $inner)->spawn(escape(...));
    }
}

Edmund, does that make any sense to you?

If there are expert-level people who have spent years working on language syntax while also having a deep understanding of asynchrony, and they are willing to help us, I would say that this is not just reasonable — it is more like a necessary step that absolutely must be taken.

However, not for the current RFC, but rather for a draft of the next one, which will focus on a much narrower topic. So 100% yes.

In addition to expert input, I would also like to create a database of real-world use cases from code and examples. This would allow us to use code as an argument against purely logical reasoning.


Ed

Let me summarize the current state for today:

  1. I am abandoning startScheduler and the idea of preserving backward compatibility with await_all or anything else in that category. The scheduler will be initialized implicitly, and this does not concern user-land. Consequently, the spawn function() code will work everywhere and always.

Very glad to hear this, this is the correct approach for concurrency, one that will not break all existing libraries and give them the freedom to handle their own resource cleanup.

I’ve also seen your latest email about kotlin-like contexts, and they also make more sense than an await_all block (which can only cause deadlocks): note how a kotlin coroutine context may only be cancelled (cancelling all inner coroutines with CancelledExceptions), never awaited.

I can give you several examples where such logic is used in Amphp libraries, and it will break if they are invoked within an async block.

Got it, it looks like I misunderstood the post due to my focus. So, essentially, you’re talking not so much about wait_all itself, but rather about the parent-child vs. free model.

This question is what concerns me the most right now.

If you have real examples of how this can cause problems, I would really appreciate it if you could share them. Code is the best criterion of truth.

Sure:

This is the main example where it is most evident that background fibers are needed: logic which requires periodic background pings to be sent in order to keep a connection alive, a mutex held or something similar.

Constructing a PeriodicHeartbeatQueue inside of a wait_all block invoking someClass::connect(), storing it in a property and destroying it outside of it in someClass::close or someClass::__destruct, would cause a deadlock (EventLoop::repeat doesn’t technically spawn a fiber immediately, it spawns one every $interval, but it behaves as though a single background fiber is spawned with a sleep($interval), so essentially it’s a standalone thread of execution, collected only on __destruct).

https://github.com/danog/MadelineProto/tree/v8/src/Loop/Connection contains multiple examples of tasks of the same kind in my own library (ping loops to keep connections alive, read loops to handle updates (which contain vital information needed to keep the client running correctly) in the background, etc…), all started on __construct when initialising the library, and stopped in __destruct when they are not needed anymore.
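This heartbeat pattern translates directly to asyncio, and the sketch below shows why it is incompatible with an implicit await-all at block exit: connect() starts an infinite background loop that only close() cancels, so any scope that awaited it would never finish (class and method names are mine, purely illustrative):

```python
import asyncio

class Connection:
    """Background heartbeat owned by the object, stopped explicitly on close."""

    def __init__(self):
        self.pings = 0
        self._heartbeat = None

    async def connect(self):
        # If a surrounding wait_all-style block awaited this task,
        # the block could never exit: the loop below runs forever.
        self._heartbeat = asyncio.create_task(self._ping_loop())

    async def _ping_loop(self):
        while True:                      # EventLoop::repeat-style periodic task
            await asyncio.sleep(0.005)
            self.pings += 1

    async def close(self):
        self._heartbeat.cancel()         # cleanup belongs to close/__destruct
        try:
            await self._heartbeat
        except asyncio.CancelledError:
            pass

async def demo():
    conn = Connection()
    await conn.connect()                 # returns immediately; ping loop runs on
    await asyncio.sleep(0.03)
    await conn.close()
    return conn.pings
```

The task outlives every call frame that touches it and is collected only when its owner decides, which is the contract Daniil is describing.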

A coroutine context/scope a-la kotlin is fine, but it should absolutely not have anything to await all coroutines in the scope, or else it can cause deadlocks with the very common logic listed above.

Regards,
Daniil Gentili.

On 10 March 2025 03:55:21 GMT, Larry Garfield <larry@garfieldtech.com> wrote:

Support for nested blocks is absolutely mandatory, whatever else we do. If you cannot nest one async block (scheduler instance, coroutine, whatever it is) inside another, then basically no code can do anything async except the top level framework.

To stretch the analogy slightly, this is like saying that no Linux program could call fork() until containers were invented. That's quite obviously not true; in a system without containers, the forked process is tracked by the single global scheduler, and has a default relationship to its parent but also with other top-level processes.

Nested blocks are necessary *if* we want automatic resource management around user-selected parts of the program - which is close to being a tautology. If we don't provide them, we just need a defined start and end of the scheduler - and Edmond's current suggestion is that that could be an automatic part of the process / thread lifecycle, and not visible to the user at all.

This function needs to be possible, and work anywhere, regardless of whether there's an "open" async session 5 stack calls up.

function par_map(iterable $it, callable $c) {
  $result = [];
  async {
    foreach ($it as $val) {
      $result[] = $c($val);
    }
  }
  return $result;
}

This looks to me like an example where you should not be creating an extra context/nursery/whatever. A generic building block like map() should generally not impose resource restrictions on the code it's working with. In fact as written there's no reason for this function to exist at all - if $c returns a Future, a normal array_map will return an array of Futures, and can be composed with await_all, await_any, etc as necessary.
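Rowan’s point can be shown concretely in asyncio, where a plain comprehension plays the role of array_map and gather plays the role of await_all (helper names are mine):

```python
import asyncio

async def fetch(x):
    await asyncio.sleep(0)        # stand-in for real async work
    return x * 10

async def main():
    # "a normal array_map will return an array of Futures": a plain comprehension
    futures = [asyncio.ensure_future(fetch(v)) for v in [1, 2, 3]]
    # "...composed with await_all": gather awaits them all, preserving input order
    return await asyncio.gather(*futures)
```

No extra context or nursery is created; the generic building block just composes futures.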

If an explicit nursery/context was required in order to use async features, you'd probably want instead to have a version of array_map which took one as an extra parameter, and passed it to along to the callback:

function par_map(iterable $it, callable $c, AsyncContext $ctx) {
  $result = [];
  async {
    foreach ($it as $val) {
      $result[] = $c($val, $ctx);
    }
  }
  return $result;
}

This is pretty much just coloured functions, but with uglier syntax, since par_map itself isn't doing anything useful with the context, just passing along the one from an outer scope. An awful lot of functions would be like this; maybe FP experts would like it, authors of existing PHP code would absolutely hate it.

The place I see nested async{} blocks *potentially* being useful is to have a handful of key "firewalls" in the application, where any accidentally orphaned coroutines can be automatically awaited before declaring a particular task "done". But Daniil is probably right to ask for concrete use cases, and I have not used enough existing async code (in PHP or any other language) to answer that confidently.

Rowan Tommins
[IMSoP]

Sure:

Yeah, this is a Watcher, a periodic function that is called to clean up or check something. Yes, it’s a very specific pattern. And of course, the Watcher belongs to the service. If the service is destroyed, the Watcher should also be stopped.

In the context of this RFC, it’s better to use the Async\Interval class.

contains multiple examples of tasks of the same kind in my own library (ping loops to keep connections alive,
read loops to handle updates (which contain vital information needed to keep the client running correctly) in the background, etc…), all started on
__construct when initialising the library, and stopped in __destruct when they are not needed anymore.

Thank you, I will read it.

Ed.

Yeah, this is a Watcher, a periodic function that is called to clean up or check something. Yes, it’s a very specific pattern. And of course, the Watcher belongs to the service. If the service is destroyed, the Watcher should also be stopped.

It’s an Async\Interval, but it behaves entirely like a background fiber (and it can be implemented using a background fiber as well): what I mean is, it can be treated in the same way as a background fiber, because it’s a background task that can be spawned by the library in any method. If await_all was used during construction but not during destruction, it would cause a deadlock (because it would wait for an uncontrolled background task when exiting the block, according to the proposed functionality of wait_all).

Regards,
Daniil Gentili.

function par_map(iterable $it, callable $c) {
  $result = [];
  async {
    foreach ($it as $val) {
      $result[] = $c($val);
    }
  }
  return $result;
}

If the assumption is that each call can be asynchronous and all elements need to be processed, the only proper tool is a concurrent iterator.
Manually using a foreach loop is not the best idea because the iterator does not necessarily create a coroutine for each iteration.
And, of course, such an iterator should have a getFuture method that allows waiting for the result.

Yes, Kotlin has an explicit blocking Scope, but I don’t see much need for it. So far, all the cases we’re considering fit neatly into a framework:

  1. I want to launch a coroutine and wait: await spawn
  2. I want to launch a coroutine and not wait: spawn
  3. I want to launch a group of coroutines and wait: await CoroutineScope
  4. I want to launch a group of coroutines and not wait: spawn
  5. I want a concurrent iteration: special iterator.

What else are we missing?