Re: PHP True Async RFC

From: Date: Sun, 09 Mar 2025 09:04:04 +0000
Subject: Re: PHP True Async RFC
Groups: php.internals
On Sun, Mar 9, 2025, at 09:05, Edmond Dantes wrote:
> Good day, Alex.
> 
> >
> >  Can you please share a bit more details on how the Scheduler is implemented, to make sure
> > that I understand why this contradiction exists? Also with some examples, if possible.
> >
> 
> ```php
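> $fiber1 = new Fiber(function () {
>     echo "Fiber 1 starts\n";
> 
>     $fiber2 = new Fiber(function () {
>         echo "Fiber 2 starts\n";
> 
>         Fiber::suspend(); // Suspend the inner fiber; control returns to Fiber 1
>         echo "Fiber 2 resumes\n";
> 
>     });
> 
>     $fiber2->start();  // Run Fiber 2 until its first suspend
>     echo "Fiber 1 continues\n";
>     $fiber2->resume(); // Let Fiber 2 finish; control then returns here
> });
> 
> $fiber1->start();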
> ```
> 
> 
> Yes, of course, let's look at this in more detail.
> The code above is the classic demonstration of how Fiber works. Fiber1
> creates Fiber2. When Fiber2 suspends, execution returns to
> Fiber1.
> 
> Now, let's try to do the same thing with Fiber3. Inside Fiber2,
> we create Fiber3. Everything will work perfectly—Fiber3 will return
> control to Fiber2, and Fiber2 will return it to Fiber1—this
> forms a hierarchy.
> 
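
A minimal sketch of the three-level case described above, not taken from the RFC. The `$trace` array and its labels are illustrative names; only the `Fiber` class itself is real API. It records the order in which control moves, so the hierarchy (Fiber3 back to Fiber2, Fiber2 back to Fiber1) can be read off directly.

```php
<?php
// Illustrative only: $trace records the order in which control moves between fibers.
$trace = [];

$fiber1 = new Fiber(function () use (&$trace) {
    $trace[] = 'F1 start';

    $fiber2 = new Fiber(function () use (&$trace) {
        $trace[] = 'F2 start';

        $fiber3 = new Fiber(function () use (&$trace) {
            $trace[] = 'F3 start';
            Fiber::suspend();      // control goes back to Fiber 2, which started us
            $trace[] = 'F3 end';
        });

        $fiber3->start();
        $trace[] = 'F2 after F3 suspended';
        Fiber::suspend();          // control goes back to Fiber 1
        $trace[] = 'F2 resumed';
        $fiber3->resume();         // let Fiber 3 finish
        $trace[] = 'F2 end';
    });

    $fiber2->start();
    $trace[] = 'F1 after F2 suspended';
    $fiber2->resume();
    $trace[] = 'F1 end';
});

$fiber1->start();
echo implode("\n", $trace), "\n";
```

Each `Fiber::suspend()` hands control to whichever fiber called `start()` or `resume()` on the suspending one, never to a sibling or to some outer loop.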
> 
> Now, imagine that we want to turn Fiber1 into a *Scheduler* while following these
> rules.
> To achieve this, we need to ensure that all Fiber instances are created from the
> *Scheduler*, so that control can always be properly returned.
> 
> ```php
> 
> 
> class Scheduler {
>     private array $queue = [];
> 
>     public function add(callable $task) {
>         $fiber = new Fiber($task);
>         $this->queue[] = $fiber;
>     }
> 
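>     public function run() {
>         while (!empty($this->queue)) {
>             $fiber = array_shift($this->queue);
> 
>             if (!$fiber->isStarted()) {
>                 // A new Fiber is not "suspended" yet, so it must be started first;
>                 // the scheduler is passed to the task as its argument.
>                 $fiber->start($this);
>             } elseif ($fiber->isSuspended()) {
>                 $fiber->resume();
>             }
>         }
>     }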
> 
>     public function yield() {
>         $fiber = Fiber::getCurrent();
>         if ($fiber) {
>             $this->queue[] = $fiber;
>             Fiber::suspend();
>         }
>     }
> }
> 
> $scheduler = new Scheduler();
> 
> $scheduler->add(function (Scheduler $scheduler) {
>     echo "Task 1 - Step 1\n";
>     $scheduler->yield();
>     echo "Task 1 - Step 2\n";
> });
> 
> $scheduler->add(function (Scheduler $scheduler) {
>     echo "Task 2 - Step 1\n";
>     $scheduler->yield();
>     echo "Task 2 - Step 2\n";
> });
> 
> $scheduler->run();
> 
> ```
> 
> So, to successfully switch between Fibers:
> 
>  1. A Fiber must return control to the *Scheduler*.
>  2. The *Scheduler* selects the next Fiber from the queue and switches to it.
>  3. That Fiber then returns control back to the *Scheduler* again.
> 
> 
> This algorithm has one drawback: *it requires two context switches instead of one*. Without the
> *Scheduler* in the middle, we could switch from *FiberX* to *FiberY* directly.
> 
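
To make the cost of the three steps above concrete, here is a hedged sketch that counts switches. `CountingScheduler`, its `$switches` counter and its `$log` are hypothetical names for illustration and are not part of the RFC or of Revolt. The sketch counts every `start()`, `resume()` and explicit `Fiber::suspend()` as one context switch, and does not count the implicit return when a fiber terminates.

```php
<?php
// Hypothetical helper for illustration: a round-robin loop that counts context switches.
final class CountingScheduler
{
    /** @var Fiber[] */
    private array $queue = [];
    public int $switches = 0;   // start/resume/suspend each count as one switch
    public array $log = [];

    public function add(callable $task): void
    {
        $this->queue[] = new Fiber($task);
    }

    public function yield(): void
    {
        $fiber = Fiber::getCurrent();
        if ($fiber !== null) {
            $this->queue[] = $fiber;
            $this->switches++;          // task -> scheduler
            Fiber::suspend();
        }
    }

    public function run(): void
    {
        while ($this->queue !== []) {
            $fiber = array_shift($this->queue);
            $this->switches++;          // scheduler -> task
            if (!$fiber->isStarted()) {
                $fiber->start($this);
            } elseif ($fiber->isSuspended()) {
                $fiber->resume();
            }
        }
    }
}

$scheduler = new CountingScheduler();

$scheduler->add(function (CountingScheduler $s) {
    $s->log[] = "A1@{$s->switches}";
    $s->yield();
    $s->log[] = "A2@{$s->switches}";
});

$scheduler->add(function (CountingScheduler $s) {
    $s->log[] = "B1@{$s->switches}";
    $s->yield();
    $s->log[] = "B2@{$s->switches}";
});

$scheduler->run();
echo implode(' ', $scheduler->log), "\n";
```

The number after `@` is the counter value when that step ran. Going from the first step of task A to the first step of task B moves the counter by two: one suspend into the scheduler, then one start out of it.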
> Breaking the contract not only disrupts the code in this RFC but also affects Revolt's
> functionality. However, in the case of Revolt, you can say: *"If you use this library, follow
> the library's contracts and do not use Fiber directly."*
> 
> 
> 
> But PHP is not just a library, it's a language that must remain consistent and cohesive.
> 
> 
> >
> >  Reading the RFC initially, I thought that the Scheduler is using fibers for everything
> > that runs. 
> >
> 
> Exactly.  
> 
> 
> >
> >  You mean that when one of the fibers started by the Scheduler starts other fibers,
> > it would usually wait for them to finish, and that is a blocking operation that also blocks the
> > Scheduler?
> >
> 
> When a *Fiber* from the *Scheduler* decides to create another *Fiber* and then tries to call
> blocking functions inside it, control can no longer return to the *Scheduler* from those functions.
> 
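
A small sketch of the situation described above, under a loud assumption: a "blocking function" is modelled here as a plain `Fiber::suspend()` call, which simplifies whatever the RFC's scheduler would actually do. The top-level script stands in for the scheduler, and `$events` is an illustrative name.

```php
<?php
// Illustrative only: $events records who regains control after each suspend.
$events = [];

// A task that the (stand-in) scheduler runs as a fiber.
$task = new Fiber(function () use (&$events) {
    $events[] = 'task: start';

    // The task creates its own fiber by hand.
    $inner = new Fiber(function () use (&$events) {
        $events[] = 'inner: start';
        // Assumption: stand-in for a blocking function that suspends to wait for I/O.
        Fiber::suspend('waiting for I/O');
        $events[] = 'inner: resumed';
    });

    // start() returns the value passed to the first Fiber::suspend().
    $reason = $inner->start();
    // The suspend lands here, in the task, not in the scheduler loop.
    $events[] = "task: got control back ($reason)";
    $inner->resume();
    $events[] = 'task: end';
});

// Stand-in for the scheduler loop.
$events[] = 'scheduler: starting task';
$task->start();
$events[] = 'scheduler: task finished, never saw the inner suspend';

echo implode("\n", $events), "\n";
```

The suspend inside `$inner` is delivered to the task that started it, so the code playing the scheduler role only regains control once the whole task has run to completion.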
> Of course, it would be possible to track the state and disable the concurrency mode flag when
> the user manually creates a *Fiber*. But… this wouldn't lead to anything good. Not only would
> it complicate the code, but it would also result in a mess with different behavior inside and
> outside of *Fiber*.
> 
> 
> 
> This is even worse than calling *startScheduler*.
> 
> The hierarchical switching rule is a *design flaw* that happened because a *low-level
> component* was introduced into the language as part of the implementation of a *higher-level
> component*. However, the high-level component is in *User-land*, while the low-level component is in
> *PHP core*.
> 
> It's the same as implementing $this in OOP but requiring it to be explicitly
> passed in every method. This would lead to inconsistent behavior.
> 
> 
> 
> So, this situation needs to be resolved one way or another.  
> 
> --
> 
> Ed
> 

Hi Ed,

If I remember correctly, the original implementation of Fibers was built in such a way that
extensions could create their own fiber types that were distinct from fibers but reused the context
switch code.

From the original RFC:

> An extension may still optionally provide their own custom fiber implementation, but an
> internal API would allow the extension to use the fiber implementation provided by PHP.

Maybe we could create a different version of fibers ("managed fibers", maybe?), distinct
from the current implementation, with the idea of deprecating the current ones in PHP 10? Then, at
least, the scheduler could always be running. Existing code that uses the current fibers couldn't
use the new fibers, but it would "just work" as long as it doesn't touch them (since
the scheduler would never pick up the old fibers).

Something to think about.

— Rob

