Re: [RFC] Data Classes

Date: Sat, 23 Nov 2024 20:27:15 +0000
Subject: Re: [RFC] Data Classes
Groups: php.internals

On Sat, Nov 23, 2024, at 18:34, Rowan Tommins [IMSoP] wrote:
> On 23/11/2024 16:05, Rob Landers wrote:
>>> Your RFC doesn't discuss this - the changeName example shows behaviour
>>> *inside* the method, but not behaviour when *calling* it
>> 
>> An interesting observation, can you explain more as to what you mean?
> 
> Looking closer, there's a hint at what you expect to happen in your Rectangle example:
> 
> 
> 
> $bigRectangle = $rectangle->resize(10, 20);
> assert($bigRectangle !== $rectangle); // true
> 
> 
> It seems that modifications to $this aren't visible outside the method, creating a purely
> local clone, which would be discarded if it wasn't returned (or saved somewhere).
> 
> I can see the logic, but the result is a bit unintuitive:
> 
> 
> 
> data class Example {
>    public function __construct(public int $x) {}
>    public function inc(): void {
>       $this->x++;
>    }
> }
> $foo = new Example(0);
> $foo->x++;
> $foo->inc();
> echo $foo->x; // 1, not 2
> 
> 

Interesting! I actually found it to be intuitive.

Think of it like this:

function increment(array $array) {
  $array[0]++;
}

$arr = [0];
increment($arr);
echo $arr[0]; // is 0

We don't expect $arr to be any different outside of the function because $arr is a value, not a
reference. "data classes" are "values" and not references to values, thus when
you modify $this, you modify the value, and it doesn't affect values elsewhere. If you want to
keep track of that value, you have to put it somewhere where you can reference it—a return value,
global variable, property in a regular class, etc. In any case, let's keep going to see if there is a
better way.
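
To make the comparison concrete, here is a minimal sketch in today's PHP that puts a regular object next to an array. The names Counter, incrementObject and incrementArray are made up for illustration and are not from the RFC. The object is passed as a handle, so the change leaks out; the array is passed as a value, so it does not. Data classes, as I read the proposal, would behave like the array.

```php
<?php
declare(strict_types=1);

// Hypothetical regular class: instances are passed around as handles.
final class Counter {
    public function __construct(public int $x) {}
}

// Mutates the object the caller's variable points at.
function incrementObject(Counter $counter): void {
    $counter->x++;
}

// Mutates a local copy of the array; the caller's array is untouched.
function incrementArray(array $values): void {
    $values['x']++;
}

$obj = new Counter(0);
incrementObject($obj);

$arr = ['x' => 0];
incrementArray($arr);

echo $obj->x, "\n";    // 1: the object was changed through its handle
echo $arr['x'], "\n";  // 0: the array was copied on write inside the function
```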

> I think it would be clearer to prevent direct modification of $this:
> 
> 
> 
> data class Example { 
>    public function __construct(public int $x) {}
>    public function inc(): void {
>       $this->x++; // ERROR: Can not mutate $this in data class
>    }
>    public function withInc(): static {
>       $new = $this; // explicitly make a local copy of $this
>       $new->x++; // copy-on-write separates $new from $this
>       return $new;
>    }
> }
> 
> 

Not that I disagree (see the records RFC), but at that point, why not make data classes implicitly
readonly?
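
For reference, this is roughly what the implicitly-readonly variant looks like if you write it by hand today. It is only a sketch, assuming PHP 8.2+ for readonly classes; the class is a regular class, not the proposed data class, and withInc() builds a new instance instead of mutating a copy.

```php
<?php
declare(strict_types=1);

// Assumption: PHP 8.2+, where a class can be declared readonly.
final readonly class Example {
    public function __construct(public int $x) {}

    // "Wither": returns a new instance and leaves $this untouched.
    public function withInc(): static {
        return new static($this->x + 1);
    }
}

$foo = new Example(0);
$bar = $foo->withInc();

echo $foo->x, "\n"; // 0
echo $bar->x, "\n"; // 1
```

Writing to $foo->x directly would throw an Error here, which is the same restriction as the "Can not mutate $this" proposal, just applied everywhere.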

> 
> 
> That would still be compatible with Ilija's suggestion, which was to add special
> "mutating methods":
> 
> 
> 
> 
> data class Example {
>     public function __construct(public int $x) {}
>     public mutating function inc(): void {
>         $this->x++;
>     }
> }
> $foo = new Example(0);
> $foo->x++;
> $foo->inc!(); // copy-on-write triggered *before* the method is called
> echo $foo->x; // 2
> 
> 

I actually find this appealing, but it is strange to me to allow this syntax on classes. Is there
precedent for that? Or is there a way we can do it using "regular looking PHP"; or are
structs the way to go?

Another alternative would be that mutations still trigger a copy-on-write, but the outer variable is
updated with $this upon return. So this would work:

data class Example {
   public function __construct(public int $x) {}
   public function inc(): void {
      $this->x++;
   }
}

$foo = new Example(1);
$bar = $foo;
$foo->inc(); // foo is copied on mutation, and $foo points at the new value on return.
echo $bar->x; // 1
echo $foo->x; // 2

To me, this seems like it would be even more intuitive; $foo has the value you would expect from
outside the class and doesn't require you keeping track of the value yourself. Though, there
are some footguns here too:

class Foo {
  public Example $bar;

  public function baz(): void {
    $bar = $this->bar; // copies the value; should be $bar = &$this->bar
    $bar->inc();
    echo $this->bar->x; // not incremented
  }
}

I could go either way on this one, honestly; so it makes sense to me why structs would have a
dedicated syntax for whichever you prefer. On the one hand, the RFC as written requires you to be
explicit all the time (which can be annoying), and on the other hand, there's the implicit copy here,
which requires you to be explicit when you want a reference.
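
The write-back idea and its footgun can be approximated in today's PHP with a by-reference helper. This is only an emulation under my own assumptions: Example is a regular class here, and inc() and Holder are hypothetical names, not anything the RFC defines.

```php
<?php
declare(strict_types=1);

// Regular class standing in for the proposed data class.
final class Example {
    public function __construct(public int $x) {}
}

// Hypothetical helper: copy, mutate the copy, then rebind the caller's variable.
function inc(Example &$target): void {
    $copy = clone $target;
    $copy->x++;
    $target = $copy;
}

// Hypothetical regular class that stores the value in a property.
final class Holder {
    public function __construct(public Example $bar) {}
}

$foo = new Example(1);
$bar = $foo;
inc($foo);             // $foo now points at the mutated copy
echo $bar->x, "\n";    // 1
echo $foo->x, "\n";    // 2

$holder = new Holder(new Example(1));

$local = $holder->bar; // footgun: $local is a separate variable
inc($local);           // only $local is rebound
echo $holder->bar->x, "\n"; // 1

$ref = &$holder->bar;  // explicit reference to the property
inc($ref);             // the property itself is rebound
echo $holder->bar->x, "\n"; // 2
```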

I suppose there is a third option as well, and that is to not do any of these options and just have
data classes that always compare by value.

> -- 
> Rowan Tommins
> [IMSoP]

— Rob

