There should be a defined ordering (or at least, some guarantees).
The execution order, which is part of the contract, is as follows:
- Microtasks are executed first.
- Then I/O events and OS signals are processed.
- Then timer events are executed.
- Only after that are fibers scheduled for execution.
In the current implementation, fibers are stored in a queue without priorities (a deliberate choice). During each loop iteration, only one fiber is taken from the queue.
This results in the following code (I’ve removed unnecessary details):
do {
    execute_microtasks_handler();

    has_handles = execute_callbacks_handler(circular_buffer_is_not_empty(&ASYNC_G(deferred_resumes)));

    execute_microtasks_handler();

    bool was_executed = execute_next_fiber_handler();

    if (UNEXPECTED(
        false == has_handles
        && false == was_executed
        && zend_hash_num_elements(&ASYNC_G(fibers_state)) > 0
        && circular_buffer_is_empty(&ASYNC_G(deferred_resumes))
        && circular_buffer_is_empty(&ASYNC_G(microtasks))
        && resolve_deadlocks()
    )) {
        break;
    }
} while (zend_hash_num_elements(&ASYNC_G(fibers_state)) > 0
    || circular_buffer_is_not_empty(&ASYNC_G(microtasks))
    || reactor_loop_alive_fn()
);
Looking at the details, you can also see that microtasks are executed twice - before and after event processing - because an event handler might enqueue a microtask, and the loop ensures that such code executes as early as possible.
The contract for the execution order of microtasks and events is important because it must be considered when developing event handlers. The concurrent iterator relies on this rule.
However, making assumptions about when a fiber will be executed is not part of the contract, if only because this algorithm can be changed at any moment.
// Execution is paused until the fiber completes
$result = Async\await($fiber); // immediately enter $fiber without queuing
So is it possible to change the execution order and optimize context switches? Yes, there are ways to do this. However, it would require modifying the Fiber code, possibly in a significant way (I haven’t explored this aspect in depth).
But… let’s consider whether this would be a good idea.
We have a web server. A single thread is handling five requests. They all compete with each other because this is a typical application interacting with MySQL.
In each Fiber, you send a query and wait for the result as quickly as possible.
In what case should we create a new coroutine within a request handler?
The answer: usually, we do this when we want to run something in the background while continuing to process the request and return a response as soon as possible.
In this paradigm, it is beneficial to execute coroutines in the order they were enqueued.
For other scenarios, it might be a better approach for a child coroutine to execute immediately. In that case, these scenarios should be considered, and it may be worth introducing specific semantics for such cases.
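To make the first scenario concrete, here is a sketch against the proposed True Async API (not yet part of PHP); writeAuditLog() and buildResponse() are hypothetical placeholders:

```php
<?php
// Sketch only: Async\async is from the proposed True Async API;
// writeAuditLog() and buildResponse() are hypothetical helpers.
function handleRequest(array $request): string
{
    // Fire-and-forget: the coroutine is enqueued and will run
    // later, in the order it was enqueued.
    Async\async(function () use ($request) {
        writeAuditLog($request); // runs in the background
    });

    // Meanwhile, keep processing the request and return the
    // response as soon as possible.
    return buildResponse($request);
}
```

Enqueue-order execution is exactly what this paradigm wants: the response is not delayed by the background work.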
Won’t PHP code behave exactly the same as it did before, once the scheduler is enabled?
Suppose we have a sleep() function. Normally, it calls php_sleep((unsigned int) num). The php_sleep function blocks the execution of the thread.
But we need to add an alternative path:
if (IN_ASYNC_CONTEXT) {
    /* Suspend the current fiber via the event loop instead of blocking the thread. */
    async_wait_timeout((unsigned int) num * 1000, NULL);
    RETURN_LONG(0);
}
The IN_ASYNC_CONTEXT condition consists of two points:
- The current execution context is inside a Fiber.
- The Scheduler is active.
What’s the difference?
If the Scheduler is not active, calling sleep() will block the entire thread because, without an event loop, it simply cannot correctly handle concurrency.
However, if the Scheduler is active, the code will set up handlers and return control to the “main loop”, which will pick the next Fiber from the queue, and so on.
This means that without a Scheduler and Reactor, concurrent execution is impossible (without additional effort).
From the perspective of a PHP developer, if they are working with AMPHP/Swoole, nothing changes, because the code inside the if condition will never execute in their case.
Does this change the execution order inside a Fiber? No.
If you had code working with RabbitMQ sockets, and you copied this code into a Fiber, then enabled concurrency, it would work exactly the same way. If the code used blocking sockets, the Fiber would yield control to the Scheduler. And if two such Fibers are running, they will start working with RabbitMQ sequentially. Of course, each Fiber should use a different socket.
The same applies to CURL. Do you have an existing module that sends requests to a service using CURL in a synchronous style? Just copy the code into a coroutine.
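As a sketch of this transparency (again against the proposed True Async API, so not runnable on current PHP), the same blocking sleep() call can simply be reused inside two coroutines:

```php
<?php
// Sketch only: Async\async / Async\launchScheduler are from the
// proposed True Async API. With the Scheduler active, sleep()
// suspends the fiber instead of blocking the thread.
Async\async(function () {
    sleep(1);            // yields to the Scheduler, not the OS
    echo "first done\n";
});

Async\async(function () {
    sleep(1);            // waits concurrently with the first fiber
    echo "second done\n";
});

Async\launchScheduler(); // both fibers wait in parallel
```

The code inside each coroutine is written exactly as it would be in a synchronous script; only the launch changes.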
This means almost 98% transparency. Why almost? Because there might be nuances in helper functions and internal states. There may also be differences in OS state management or the file system that could affect the final result.
How will a library take advantage of this feature if it cannot be certain whether the scheduler is running? Do I need to write one library for async and another version for non-async? Or do all the async functions with this feature work without the scheduler running, or do they throw a catchable error?
This means that the launchScheduler() function should be called only once during the entire lifecycle of the application. If an error occurs and is not handled, the application should terminate. This is not a technical limitation but rather a logical constraint.
If launchScheduler() were replaced with a CLI option, such as php --enable-scheduler, where the Scheduler is implicitly activated, the same constraint would apply: like the last line of code, it must exist only once.
Will this change the way OS signals are handled, then? Will it break compatibility if a library uses pcntl traps and I'm using True Async traps too? Note there are several different ways (timeout) signals are handled in PHP -- so if (perchance) the scheduler could always be running, maybe we can unify the way signals are handled in PHP.
Regarding this phrase in the RFC: it refers to the window close event in Windows, which provides a few seconds before the process is forcibly terminated.
There are signals intended for application termination, such as SIGBREAK or CTRL-C, which should typically be handled in only one place in the application. Developers are often tempted to insert signal handlers in multiple locations, making the code dependent on the environment. But more importantly, this should not happen at all.
True Async explicitly defines a Flow for emergency or unexpected application termination. Attempting to disrupt this Flow by adding a custom termination signal handler introduces ambiguity.
There should be only one termination handler. And at the end of its execution, it must call gracefulShutdown.
As for pcntl, this will need to be tested.
What if it never resumes at all?
If a Fiber is never resumed, it means the application has completely crashed with no way to recover.
The RFC has two sections dedicated to this issue:
Cancellation Operation + Graceful Shutdown.
If the application terminates due to an unhandled exception, all Fibers must be cancelled.
Any Fiber can be canceled at any time, and there is no need to use explicit Cancellation, which I personally find an inconvenient pattern.
The RFC doesn’t mention the stack trace. Will it throw away any information about the inner exception?
This is literally “exception transfer”. The stack trace will be exactly the same as if the exception were thrown at the call site.
To be honest, I haven’t had enough time to thoroughly test this. Let’s try it:
<?php

Async\async(function() {
    echo "async function 1\n";

    Async\async(function() {
        echo "2\n";
        throw new Error("Error");
    });
});

echo "start\n";
try {
    Async\launchScheduler();
} catch (\Throwable $exception) {
    print_r($exception);
}
echo "end\n";
?>
Error Object
(
    [message:protected] => Error
    [string:Error:private] =>
    [code:protected] => 0
    [file:protected] => async.php
    [line:protected] => 8
    [trace:Error:private] => Array
        (
            [0] => Array
                (
                    [function] => {closure:{closure:async.php:3}:6}
                    [args] => Array
                        (
                        )
                )
            [1] => Array
                (
                    [file] => async.php
                    [line] => 14
                    [function] => Async\launchScheduler
                    [args] => Array
                        (
                        )
                )
        )
    [previous:Error:private] =>
)
Seems perfectly correct.
What will calling exit or die do?
I completely forgot about them! Well, of course, Swoole overrides them. This needs to be added to the TODO.
Why is this the case?
For example, consider a long-running application where a service is a class that remains in memory continuously. The web server receives an HTTP request and starts a Fiber for each request. Each request has its own User Session ID.
You want to call a service function, but you don’t want to pass the Session ID every time, because there are also 5-10 other request-related variables. However, you cannot simply store the Session ID in a class property, because context switching is unpredictable. At one moment, you’re handling Request #1, and a second later, you’re already processing Request #2.
When a Fiber creates another Fiber, it copies a reference to the context object, which has minimal performance impact while maintaining execution environment consistency.
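A sketch of how this might look in practice; note that currentContext(), set(), and get() are purely illustrative names here, not the RFC's actual Context API:

```php
<?php
// Hypothetical sketch: currentContext()/set()/get() are
// illustrative names, not the RFC's actual Context signatures.
function handle(string $sessionId): void
{
    // Bind the Session ID to the current fiber's context instead
    // of a class property shared by all concurrent requests.
    currentContext()->set('session.id', $sessionId);

    Async\async(function () {
        // The child fiber holds a reference to the parent's
        // context object, so the value is visible here too.
        $sessionId = currentContext()->get('session.id');
        // ... call services without passing the ID explicitly
    });
}
```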
Closure variables work as expected: they are pure closures with no modifications.
I didn’t mean that True Async breaks anything at the language level. The issue is logical:
You cannot use a global variable in two Fibers, modify it, read it, and expect its state to remain consistent.
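The hazard can be sketched like this (proposed True Async API; both fibers share one variable, and a context switch may happen at any suspension point between the read and the write):

```php
<?php
// Sketch only (proposed API): a classic lost update.
$counter = 0;

Async\async(function () use (&$counter) {
    $value = $counter;     // read
    sleep(1);              // suspension point: the other fiber runs
    $counter = $value + 1; // write based on a stale read
});

Async\async(function () use (&$counter) {
    $value = $counter;     // also reads 0
    sleep(1);
    $counter = $value + 1;
});

Async\launchScheduler();
echo $counter; // 1, not the expected 2: one increment is lost
```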
By this point we have covered FiberHandle, Resume, and Contexts. Now we have Futures? Can we simplify this to just Futures? Why do we need all these different ways to handle execution?
Futures and Notifiers are two different patterns.
- A Future changes its state only once.
- A Notifier generates one or more events.
- Internally, Future uses Notifier.
In the RFC, I mention that these are essentially two APIs:
- High-level API
- Low-level API
One of the open questions is whether both APIs should remain in PHP-land.
The low-level API allows for close interaction with the event loop, which might be useful if someone wants to write a service in PHP that requires this level of control.
Additionally, this API helps minimize Fiber context switches, since its callbacks execute without switching.
This is both an advantage and a disadvantage.
It's also not clear what the value of most of these functions is. For example:
Your comment made me think, especially in the context of anti-patterns. And I agree that it’s better to remove unnecessary methods than to let programmers shoot themselves in the foot.
As for the single producer method, I am not sure why you would use this.
Yes, in other languages there are no explicit restrictions. If the single producer approach is indeed rarely used, then it’s not such an important feature to include. However, I lack certainty on whether it’s truly a rare case. On the other hand, these functions are inexpensive to implement and do not affect performance. Moreover, they have another drawback: they increase the number of behavioral variants in a single class, which seems a more significant disadvantage than the frequency of use.
It isn't clear what happens when `trySend` fails. Is this an error or does nothing?
Yes, this is a documentation oversight. I’ll add it to the TODO.
Thinking through it, there may be cases where trySend is valid. tryReceive could be useful where a channel is used to implement a pool: suppose you need to retrieve an object from the pool, but if one is not available, you’d prefer to do something else (like throw an exception) rather than block the fiber.
Overall, though, you’re right — it’s an antipattern. It’s better to implement the pool as an explicit class and reserve channels for their classic use.
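For completeness, a sketch of the pool case mentioned above; the Channel object and the assumption that tryReceive returns null when the channel is empty are mine, only the tryReceive/trySend names come from the discussion:

```php
<?php
// Sketch only: the channel object and the null-on-empty behavior
// of tryReceive are assumptions, not defined by the RFC text here.
function acquireConnection(object $channel): object
{
    $connection = $channel->tryReceive();

    if ($connection === null) {
        // Pool exhausted: fail fast instead of blocking the fiber.
        throw new RuntimeException('No free connections in the pool');
    }

    return $connection;
}
```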
Can you expand on what this means in the RFC? Why expose it if it shouldn’t be used?
I answered a similar question above.
I also noticed that you seem to be relying heavily on the current implementation to define
behavior.
I love an iterative approach: prototype => RFC => prototype => RFC.
Thank you for the excellent remarks and analysis!
Ed.