Re: [PHP-DEV] [RFC[ Property accessor hooks, take 2

if a `get` hook for property `$foo` calls method `bar()`, then inside that method `$this->foo` will refer to the raw property, both read and write. If `bar()` is called from somewhere other than the hook, reading from `$this->foo` will trigger the `get` hook. This behavior is identical to that already used by `__get` and `__set` today.

I'm slightly confused by this.

If there is an actual property called $foo, then __get and __set will
be called only when it is out of visibility, regardless of the call
stack - e.g. a private property will always trigger __get from public
scope, and always access it directly from private scope:
Online PHP editor | output for R5Yos That seems differ from what's proposed, where
even a private call to bar() would trigger the hook.

The protection against recursion appears to only be relevant for
completely undefined properties. For __get, the direct access can never
do anything useful - there's nothing to access: Online PHP editor | output for 2nDZS
For __set, it is at least possible for the non-recursive write to
succeed, but only in the niche case of creating a dynamic property:
Online PHP editor | output for dpYOj I'm not sure that there's any equivalent to this
scenario for property hooks, since they can never be undefined/dynamic.

It's slightly different, yes. The point is that the special behavior of a hook is disabled if you are within the call stack of a hook, just like the special behavior of __get/__set is disabled if you are within the call stack of __get/__set. What happens when you hit an operation that would otherwise go into an infinite loop is a bit different, but the "disable to avoid an infinite loop" logic is the same.

So maybe "is the same" rather than "is identical", but otherwise it's the same concept.

There is one exception to the above: if a property is virtual, then there is no presumed connection between the get and set operations. [...] For that reason, `&get` by reference is allowed for virtual properties, regardless of whether or not there is a `set` hook.

I don't agree with this, and the example immediately following it
demonstrates the exact opposite: the &get and set hooks are both
proxying to the same backing value, and have all the same problems as
if the property was non-virtual. I would imagine a lot of real-life
virtual properties would be doing something similar: converting to/from
a different type, proxying to another object, etc.

I think this exception is unnecessarily complicated: either trust users
to handle the implications of combining &get with set, or forbid it.

The point is to give the user the option for full backwards compatibility when it makes sense. This requires jumping through some hoops, which is the point. This is essentially equivalent to creating a by-ref getter + a setter, exposing the underlying property. By creating a virtual property, we are "accepting" that the two are detached. While we could disallow this, we recognize that there may be valid use-cases that we'd like to enable. It also parallels __get/__set, where using &__get means you can write to something without going through __set.

In practice I expect it virtual properties with both hooks to be very rare. Most virtual properties will, I expect, be lazy-computed get-only values.

Additionally, `&get` hooks are allowed for arrays as well, provided there is no `set` hook.

I mentioned in a previous e-mail the possibility of using the &get hook
for array writes. Has this been considered?

That is:

$c->arr['beep'] = 'boop';

Would be equivalent to:

$temp =& $c->arr;
$temp['beep'] = 'boop';
unset($temp);

Which would be valid if $arr had an &get hook defined.

With the change to allow &get in the absence of set, I believe that would already work.

cf: Online PHP editor | rfc for 3Gnti

A `set` hook on a typed property must declare a parameter type that is the same as or contravariant (wider) from the type of the property.

Once a property has both a `get` and `set` operation, however, it is no longer covariant or contravariant for further extension.

How do these two rules interact?

Could this:

public string $foo {
   get => $this->_foo;
   set(string|Stringable $value) {
       $this->_foo = (string)$value;
   }
}

be over-ridden by this, where the property's "main type" remains
invariant but its "settable type" is contravariant?

public string $foo {
   get => $this->_foo;
   set(string|Stringable|SomethingElse $value) {
       $this->_foo = $value instanceof SomethingElse ?
$value->asString() : (string)$value;
   }
}

That would be legal.

ReflectionProperty has several new methods to work with hooks.

There should be some way to reliably determine the "settable type" of a
property. At the moment, I think you would have to do something like
this:

$setHook = $property->getHook(PropertyHookType::Set);
$writeType = $setHook === null ? $property->getType() :
$setHook->getParameters()[0]->getType();

Hm. Good point here. We'll probably need to add another method to ReflectionProperty for that. Stay tuned.

--Larry Garfield

I think it would be more helpful to justify this design on its own merits, particularly because it’s a significant difference from other languages (which either don’t have a “real property” behind the hooks, or in Kotlin’s case allow access to it only directly inside the hook definitions, via the “field” keyword).

On Wed, Mar 13, 2024, at 10:26 PM, Rowan Tommins [IMSoP] wrote:

On 12/03/2024 22:43, Larry Garfield wrote:

It's slightly different, yes. The point is that the special behavior of a hook is disabled if you are within the call stack of a hook, just like the special behavior of __get/__set is disabled if you are within the call stack of __get/__set. What happens when you hit an operation that would otherwise go into an infinite loop is a bit different, but the "disable to avoid an infinite loop" logic is the same.

I guess I'm looking at it more from the user's point of view: it's very
rare with __get and __set to have a method that sometimes accesses the
"real" property, and sometimes goes through the "hook". Either there is
no real property, or the property has private/protected scope, so any
method on the classes sees the "real" property *regardless* of access
via the hook.

I think it would be more helpful to justify this design on its own
merits, particularly because it's a significant difference from other
languages (which either don't have a "real property" behind the hooks,
or in Kotlin's case allow access to it only *directly* inside the hook
definitions, via the "field" keyword).

I'm not sure I follow. The behavior we have currently is very close to how Kotlin works, from a user perspective. (The internal implementation is backwards from Kotlin, but that doesn't matter to the user.)

I've lost track of which specific issue you have an issue with or would want changed. The guards to prevent an infinite loop are necessary, for the same reasons as they are necessary for __get/__set. We couldn't use a backing field otherwise, without some other syntax. (This is where Kotlin uses 'field'.) So, I'm not really sure what we're discussing at this point. What specific changes are you suggesting?

With the change to allow &get in the absence of set, I believe that would already work.

cf: Online PHP editor | rfc for 3Gnti

Awesome! The RFC should probably highlight this, as it gives a
significant extra option for array properties.

Updated. I may try to rewrite the array and references section this weekend, as with the changes in the design to be more permissive I'm not sure it's entirely clear anymore what the net result is.

--Larry Garfield

On 15 March 2024 17:11:29 GMT, Larry Garfield <larry@garfieldtech.com> wrote:

On Wed, Mar 13, 2024, at 10:26 PM, Rowan Tommins [IMSoP] wrote:

I think it would be more helpful to justify this design on its own
merits, particularly because it's a significant difference from other
languages (which either don't have a "real property" behind the hooks,
or in Kotlin's case allow access to it only *directly* inside the hook
definitions, via the "field" keyword).

I'm not sure I follow. The behavior we have currently is very close to how Kotlin works, from a user perspective.

Unless I'm misunderstanding something, the backing field in Kotlin is accessible only inside the hooks, nowhere else. I don't know what would happen if a hook caused a recursive call to itself, but there's no mention in the docs of it bypassing the hooks, only this:

This backing field can be referenced in the accessors using the `field` identifier

and

The `field` identifier can only be used in the accessors of the property.

And then a section explaining that more complex hooks should use a separate backing property - which is the only option in C#, and roughly what people would do in PHP today with __get and __set.

Kotlin does have a special syntax for "delegating" hooks, but looking at the examples, they do not use the backing field at all, they have to provide their own storage.

I've lost track of which specific issue you have an issue with or would want changed. The guards to prevent an infinite loop are necessary, for the same reasons as they are necessary for __get/__set.

I understand that *something* needs to happen if a recursive call happens, but it could just be an error, like any other unbounded recursion.

I can also understand the temptation to make it something more useful than an error, and provide a way to access the "backing field" / "raw value" from outside the hook. But it does lead to something quite surprising: the same line of code does different things depending on how it is called.

I doubt many people have ever discovered that __get and __set work that way, since as far as I can see it's only possible to use deliberately if you're dynamically adding and unsetting properties inside your class.

So, I don't necessarily think hooks working that way is the wrong decision, I just think it's a decision we should make consciously, not one that's obvious.

Regards,
Rowan Tommins
[IMSoP]

On Fri, Mar 15, 2024, at 11:25 PM, Rowan Tommins [IMSoP] wrote:

On 15 March 2024 17:11:29 GMT, Larry Garfield <larry@garfieldtech.com> wrote:

On Wed, Mar 13, 2024, at 10:26 PM, Rowan Tommins [IMSoP] wrote:

I think it would be more helpful to justify this design on its own
merits, particularly because it's a significant difference from other
languages (which either don't have a "real property" behind the hooks,
or in Kotlin's case allow access to it only *directly* inside the hook
definitions, via the "field" keyword).

I'm not sure I follow. The behavior we have currently is very close to how Kotlin works, from a user perspective.

Unless I'm misunderstanding something, the backing field in Kotlin is
accessible only inside the hooks, nowhere else. I don't know what would
happen if a hook caused a recursive call to itself, but there's no
mention in the docs of it bypassing the hooks, only this:

This backing field can be referenced in the accessors using the `field` identifier

and

The `field` identifier can only be used in the accessors of the property.

And then a section explaining that more complex hooks should use a
separate backing property - which is the only option in C#, and roughly
what people would do in PHP today with __get and __set.

Kotlin does have a special syntax for "delegating" hooks, but looking
at the examples, they do not use the backing field at all, they have to
provide their own storage.

I've lost track of which specific issue you have an issue with or would want changed. The guards to prevent an infinite loop are necessary, for the same reasons as they are necessary for __get/__set.

I understand that *something* needs to happen if a recursive call
happens, but it could just be an error, like any other unbounded
recursion.

I can also understand the temptation to make it something more useful
than an error, and provide a way to access the "backing field" / "raw
value" from outside the hook. But it does lead to something quite
surprising: the same line of code does different things depending on
how it is called.

I doubt many people have ever discovered that __get and __set work that
way, since as far as I can see it's only possible to use deliberately
if you're dynamically adding and unsetting properties inside your class.

So, I don't necessarily think hooks working that way is the wrong
decision, I just think it's a decision we should make consciously, not
one that's obvious.

Well, reading/writing from within a set/get hook is an obvious use case to support. We cannot do cached properties easily otherwise:

public string $expensive {
  get => $this->expensive ??= $this->compute();
  set {
    if (strlen($value) < 50) throw new Exception();
    $this->expensive = $value;
  }
}

So disabling the hooks from within the hooks seems like the only logical solution there. (Short of bringing back $field and making it mandatory, which is actually much harder than it sounds because of the ++ et al operators that would need to be supported.)

The other case then becomes:

class Foo {
  public string $a {
    get => $this->expensive ??= $this->compute();
    set {
      if (strlen($value) < 50) throw new Exception();
      $this->expensive = $value;
    }
  }

  public function compute() {
    $start = $this->expensive ?? 'a';
     return $start . 'b';
  }
}

Inside compute(), the logic requires reading from $this->expensive. If we have no guard, that would cause an infinite loop. If the guard extends down the call stack, then the loop is eliminated.

That does mean, as you note, that if you call $foo->compute(), the first call to $this->expensive will invoke the get hook, but subsequent calls to it will not. That may seem odd, but in practice, we do not see any other alternative that doesn't make infinite loops very easy to write. This is, of course, a highly contrived example. In practice, I don't expect it to come up much in the first place.

It was this logic that led us to the current implementation: Once you're within the callstack of a hook, you bypass that property's hooks. We then noted that __get/__set have essentially the same guard logic. It doesn't come up often because, again as you note, it's only relevant in even more contrived examples.

So yes, the current logic is very deliberate and based on a process of elimination to arrive at what we feel is the only logical design. Likely the same process of elimination that led to the behavior of the guards on __get/__set, which is why they are essentially the same. Being the same also makes the language more predictable, which is also a design goal for this RFC. (Hence why "this is the same logic as methods/__get/other very similar thing" is mentioned several times in the RFC. Consistency in expectations is generally a good thing.)

In theory we could also forbid accessing a property within the call stack of its hooks; in that case, the above code would simply error. However, that is a less-optimal solution because then compute() is only sometimes callable, depending on the callstack. That is no less weird than $this->expensive skipping hooks only sometimes. It would also be inconsistent with how __get/__set work, which makes hooks less capable and less consistent. That's why we don't think that's a good way of doing it.

Also, we've rewritten the references and arrays sections in the RFC. Nothing actually changed in the implementation, but it should be a lot clearer now that only a fairly small subset of impossible situations are blocked. Most things "just work," and when they don't, they (again) don't for a logical reason that parallels the behavior of the language elsewhere. There's even some nice summary tables. :slight_smile:

--Larry Garfield

On 16 March 2024 00:19:57 GMT, Larry Garfield <larry@garfieldtech.com> wrote:

Well, reading/writing from within a set/get hook is an obvious use case to support. We cannot do cached properties easily otherwise:

public string $expensive {
get => $this->expensive ??= $this->compute();
set {
   if (strlen($value) < 50) throw new Exception();
   $this->expensive = $value;
}
}

To play devil's advocate, in an implementation with only virtual properties, this is still perfectly possible, just one declaration longer:

private string $_expensive;
public string $expensive {
  get => $this->_expensive ??= $this->compute();
  set {
    if (strlen($value) < 50) throw new Exception();
    $this->_expensive = $value;
  }
}

Note that in this version there is an unambiguous way to refer to the raw value from anywhere else in the class, if you wanted a clearAll() method for instance.

I can't stress enough that this is where a lot of my thinking comes from: that backed properties are really the special case, not the default. Anything you can do with a backed property you can do with a virtual one, but the opposite will never be true.

The minimum version of backed properties is basically just sugar for that - the property is still essentially virtual, but the language declares the backing property for you, leading to:

public string $expensive {
  get => $field ??= $this->compute();
  set {
    if (strlen($value) < 50) throw new Exception();
    $field = $value;
  }
}

I realise now that this isn't actually how the current implementation works, but again I wanted to illustrate where I'm coming from: that backed properties are just a convenience, not a different type of property with its own rules.

Being the same also makes the language more predictable, which is also a design goal for this RFC. (Hence why "this is the same logic as methods/__get/other very similar thing" is mentioned several times in the RFC. Consistency in expectations is generally a good thing.)

I can only speak for myself, but my expectations were based on:

a) How __get and __set are used in practice. That generally involves reading and writing a private property, of either the same or different name from the public one; and that private property is visible everywhere equally, no special handling based on the call stack.

b) What happens if you accidentally cause infinite recursion in a normal function or method, which is that the language eventually hits a stack depth limit and throws an error.

So the assertion that the proposal was consistent with expectations surprised me. It feels to me like something that will seem surprising to people when they first encounter it, but useful once they understand the implications.

Regards,
Rowan Tommins
[IMSoP]

Hi Rowan

On Sat, Mar 16, 2024 at 9:32 AM Rowan Tommins [IMSoP]
<imsop.php@rwec.co.uk> wrote:

On 16 March 2024 00:19:57 GMT, Larry Garfield <larry@garfieldtech.com> wrote:

>Well, reading/writing from within a set/get hook is an obvious use case to support. We cannot do cached properties easily otherwise:
>
>public string $expensive {
> get => $this->expensive ??= $this->compute();
> set {
> if (strlen($value) < 50) throw new Exception();
> $this->expensive = $value;
> }
>}

To play devil's advocate, in an implementation with only virtual properties, this is still perfectly possible, just one declaration longer:

private string $_expensive;
public string $expensive {
  get => $this->_expensive ??= $this->compute();
  set {
    if (strlen($value) < 50) throw new Exception();
    $this->_expensive = $value;
  }
}

Note that in this version there is an unambiguous way to refer to the raw value from anywhere else in the class, if you wanted a clearAll() method for instance.

I can't stress enough that this is where a lot of my thinking comes from: that backed properties are really the special case, not the default. Anything you can do with a backed property you can do with a virtual one, but the opposite will never be true.

The minimum version of backed properties is basically just sugar for that - the property is still essentially virtual, but the language declares the backing property for you, leading to:

public string $expensive {
  get => $field ??= $this->compute();
  set {
    if (strlen($value) < 50) throw new Exception();
    $field = $value;
  }
}

I realise now that this isn't actually how the current implementation works, but again I wanted to illustrate where I'm coming from: that backed properties are just a convenience, not a different type of property with its own rules.

That's not really how we think about it. Our design decisions have
been guided by a few factors:

1. The RFC intentionally makes plain properties and properties with
hooks as fully compatible as possible.

A subclass can override a plain property by adding hooks to it. Many
other languages only allow doing so if the parent property already has
generated accessors (`{ get; set; }`). For many of them, switching
from a plain property to one with accessors is actually an ABI break.
One requires generating assembly/IR instructions that access a field
in some structure, the other one is a method call. This is not
relevant in our case.

In most languages, a consequence of `{ get; set; }` is that such
properties cannot be passed by reference. This part _is_ relevant to
PHP, because PHP makes heavy use of explicit by-reference passing for
arrays, but not much else. However, as outlined in the RFC, arrays are
not a good use-case for hooks to begin with. So instead of fragmenting
the entirety of all PHP code bases into plain and `{ get; set; }`
properties where it doesn't actually make a semantic difference, and
then not even using them when it would matter (arrays), we have
decided to avoid generated hooks altogether.

The approach of making plain and hooked properties compatible also
immediately means that a property can have both a "backing value"
(inherited from the parent property) and hooks (from the child
property). This goes against your model that backed properties are
really just two properties, one for the backing value and a virtual
one for the hooks.

Our approach has the nice side effect of properties only containing
hooks when they actually do something. We don't need to deal with
optimizations like "the hook is auto-generated, revert to accessing
the property directly to make it faster", or even just having the
generated hook taking up unnecessary memory. You can think of our
properties this way:

class Property {
    public ?Data $storage;
    public ?callable $getHook;
    public ?callable $setHook;

    public function get() {
        if ($hook = $this->getHook) {
            return $hook();
        } else if ($storage) {
            return $storage->get();
        } else {
            throw new Error('Property is write-only');
        }
    }

    public function set($value) {
        if ($hook = $this->setHook) {
            $hook($value);
        } else if ($storage) {
            $storage->set($value);
        } else {
            throw new Error('Property is read-only');
        }
    }
}

Properties can inherit both storage and hooks from their parent.
Hopefully, that helps with the mental model. Of course, in reality it
is a bit more complicated due to guards and references.

2. Although you say backed properties are just syntactic, they really
are not. For example, renaming a public property, making it private
and replacing it with a new passthrough virtual property breaks
serialization, as serialization works on the object's raw values. On
the other hand, adding a hook to an existing property doesn't
influence its backing value, so there is no impact on serialization.

> Being the same also makes the language more predictable, which is also a design goal for this RFC. (Hence why "this is the same logic as methods/__get/other very similar thing" is mentioned several times in the RFC. Consistency in expectations is generally a good thing.)

I can only speak for myself, but my expectations were based on:

a) How __get and __set are used in practice. That generally involves reading and writing a private property, of either the same or different name from the public one; and that private property is visible everywhere equally, no special handling based on the call stack.

b) What happens if you accidentally cause infinite recursion in a normal function or method, which is that the language eventually hits a stack depth limit and throws an error.

So the assertion that the proposal was consistent with expectations surprised me. It feels to me like something that will seem surprising to people when they first encounter it, but useful once they understand the implications.

Guards are used for dynamic property creation within `__get`/`__set`:

When `__get` or `__set` are called, the object remembers that this
property is being accessed via magic method. When you're already
inside this magic method, another call will not be triggered, thus
falling back to accessing the actual property of the object. In this
case, this means adding a dynamic property.

Dynamic properties are not particularly relevant today. The point was
not to show how similar these two cases are, but to explain that
there's an existing mechanism in place that works very well for hooks.
We may invent some new mechanism to access the backing value, like
`field = 'value'`, but for what reason? This would only make sense if
the syntax we use is useful for something else. However, given that
without guards it just leads to recursion, which I really can't see
any use for, I don't see the point.

Ilija

On 16/03/2024 17:51, Ilija Tovilo wrote:

Properties can inherit both storage and hooks from their parent.
Hopefully, that helps with the mental model. Of course, in reality it
is a bit more complicated due to guards and references.

That is a really helpful explanation, thanks; I hadn't thought about the significance of inheritance between hooked and non-hooked properties.

I still think there will be a lot of users coming from other languages, or from using __get and __set, who will look at virtual properties first. Making things less surprising for those people seems worth some effort, but I'm not asking for a complete redesign.

Dynamic properties are not particularly relevant today. The point was
not to show how similar these two cases are, but to explain that
there's an existing mechanism in place that works very well for hooks.
We may invent some new mechanism to access the backing value, like
`field = 'value'`, but for what reason? This would only make sense if
the syntax we use is useful for something else. However, given that
without guards it just leads to recursion, which I really can't see
any use for, I don't see the point.

I can think of several reasons we *could* explore other syntax:

1) To make it clearer in code whether a particular line is accessing via the hooks, or by-passing them 2) To make the code in the hooks shorter (e.g. `$field` is significantly shorter than `$this->someDescriptiveName`) 3) To allow code to by-pass the hooks at will, rather than only when called from the hooks (e.g. having a single method that resets the state of several lazy-loaded properties)

Those reasons are probably not enough to rule out the current syntax; but they show there are trade-offs being made.

To be honest, my biggest hesitation with the RFC remains asymmetric types (the ability to specify types in the set hook). It's quite a significant feature, with no precedent I know of, and I'm worried we'll overlook something by including it immediately. For instance, what will be the impact on people using reflection or static analysis to reason about types? I would personally be more comfortable leaving that to a follow-up RFC to consider the details more carefully.

Nobody else has raised that, beyond the syntax; I'm not sure if that's because everyone is happy with it, or because the significance has been overlooked.

Regards,

--
Rowan Tommins
[IMSoP]

Hi Rowan

On Sat, Mar 16, 2024 at 8:23 PM Rowan Tommins [IMSoP]
<imsop.php@rwec.co.uk> wrote:

I still think there will be a lot of users coming from other languages, or from using __get and __set, who will look at virtual properties first. Making things less surprising for those people seems worth some effort, but I'm not asking for a complete redesign.

For clarity, you are asking for a way to make the "virtualness" of
properties more explicit, correct? We touch on a keyword and why we
think it's suboptimal in the FAQ section. Unfortunately, I cannot
think of many alternatives. The `$field` variable made it a bit more
obvious, but only marginally.

I do believe that, for the most part, the user should not have to
think much about whether the property is backed or virtual. The
behavioral differences are mostly intuitive. For example:

class Test {
    // This property has a set hook that writes to the backing value. Since
    // we're using the backing value, it makes sense for there to be a way to
    // retrieve it. Without that, it wouldn't be useful.
    public $prop {
        set {
            $this->prop = strtoupper($value);
        }
    }

    // Similarly, a property with only a get hook that accesses the backing
    // value would need a way to write to the property for the get to be useful.
    public $prop {
        get => strtoupper($this->prop);
    }

    // A property with a get hook that does not use the backing value does not
    // need an implicit set operation, as writing to the backing value would be
    // useless, given that nobody will read it.
    public $prop {
        get => 42;
    }

    // Similarly, in the esoteric write-only case that does not use the backing
    // value, having an implicit get operation would always lead to a
    // "uninitialized property" error, and is not useful as such.
    public $prop {
        set {
            echo "Prop set\n";
        }
    }
}

Furthermore, `serialize`, `var_dump` and the other functions operating
on raw property values will include the property only if it is backed.
This also seems intuitive to me: If you never use the backing value,
the backing value would always be uninitialized, so there's no reason
to include it.

One case that is not completely obvious is lazy-initialized properties.

class Test {
    public $prop {
        get => $this->prop ??= expensiveOperation();
    }
}

It's not immediately obvious that there is a public set operation
here. The correct way to fix this would be with asymmetric visibility,
which was previously declined. Either way, I don't consider this case
alone enough to completely switch our approach. Please let me know if
you are aware of any other potentially non-intuitive cases.

I will admit that it is unfortunate that a user of the property has to
look through the hook implementation to understand whether a property
is writable. As you have previously suggested, one option might be to
add an explicit `set;` declaration. Maybe it's a bit more obvious now,
after my previous e-mail, why we are trying to avoid this.

Apart from the things already mentioned, it's unclear to me whether,
with such `set;` declarations, a `get`-only backed property should
even be legal. With the complete absence of a write operation, the
assignment within the `set` itself would fail. To make this work, the
absence of `set;` would need to mean something like "writable, but
only within another hook", which introduces yet another form of
asymmetric visibility.

> Dynamic properties are not particularly relevant today. The point was
> not to show how similar these two cases are, but to explain that
> there's an existing mechanism in place that works very well for hooks.
> We may invent some new mechanism to access the backing value, like
> `field = 'value'`, but for what reason? This would only make sense if
> the syntax we use is useful for something else. However, given that
> without guards it just leads to recursion, which I really can't see
> any use for, I don't see the point.

I can think of several reasons we *could* explore other syntax:

1) To make it clearer in code whether a particular line is accessing via the hooks, or by-passing them 2) To make the code in the hooks shorter (e.g. `$field` is significantly shorter than `$this->someDescriptiveName`) 3) To allow code to by-pass the hooks at will, rather than only when called from the hooks (e.g. having a single method that resets the state of several lazy-loaded properties)

Those reasons are probably not enough to rule out the current syntax; but they show there are trade-offs being made.

Fair enough. 1 and 2 are reasons why we added the `$field` macro as an
alternative syntax in the original draft. I don't quite understand
point 3. In Kotlin, `field` is only usable within its associated hook.
Other languages I'm aware of do not provide a way to access the
backing value directly, neither inside nor outside the accessor.

To be honest, my biggest hesitation with the RFC remains asymmetric types (the ability to specify types in the set hook). It's quite a significant feature, with no precedent I know of, and I'm worried we'll overlook something by including it immediately. For instance, what will be the impact on people using reflection or static analysis to reason about types? I would personally be more comfortable leaving that to a follow-up RFC to consider the details more carefully.

I personally do not feel strongly about whether asymmetric types make
it into the initial implementation. Larry does, however, and I think
it is not fair to exclude them without providing any concrete reasons
not to. I will spend time in the following days cleaning up tests, and
I will try my best to try to break asymmetric types. If I (or anybody
else) can't find a way to do so, I don't see a reason to remove them.

Nobody else has raised that, beyond the syntax; I'm not sure if that's because everyone is happy with it, or because the significance has been overlooked.

Yes, unfortunately that's a classic problem in RFC discussions: Syntax
gets a disproportionate amount of attention.

Ilija

On 17/03/2024 00:01, Ilija Tovilo wrote:

For clarity, you are asking for a way to make the "virtualness" of
properties more explicit, correct?

Either more explicit, or less important: the less often the user needs to know whether a property is virtual, the less it matters how easily they can find out.

Please let me know if
you are aware of any other potentially non-intuitive cases.

I agree that while they may not be immediately obvious to the user, most of the distinctions do make sense once you think about them.

The remaining difference I can see in the current RFC which seems to be unnecessary is that combining &get with set is only allowed on virtual properties. Although it may be "virtual" in the strict sense, any &get hook must actually be referring to some value stored somewhere - that might be a backed property, another field on the current class, a property of some other object, etc:

public int $foo { &get => $this->foo; set { $this->foo = $value; } }

public int $bar { &get => $this->_bar; set { $this->_bar = $value; } }

public int $baz { &get => $this->delegatedObj->baz; set { $this->delegatedObj->baz = $value; } }

This sentence from the RFC applies equally to all three of these examples:

> That is because any attempted modification of the value by reference would bypass a |set| hook, if one is defined.

I suggest that we either trust the user to understand that that will happen, and allow combining &get and set on any property; or we do not trust them, and forbid it on any property.

Apart from the things already mentioned, it's unclear to me whether,
with such `set;` declarations, a `get`-only backed property should
even be legal. With the complete absence of a write operation, the
assignment within the `set` itself would fail. To make this work, the
absence of `set;` would need to mean something like "writable, but
only within another hook", which introduces yet another form of
asymmetric visibility.

Any write inside the get hook already by-passes the set hook and refers to the underlying property, so there would be no need for any default set behaviour other than throwing an error.

It's not likely to be a common scenario, but the below works with the current implementation Online PHP editor | rfc for t7qhR

class Example {
public int $nextNumber {
get {
$this->nextNumber ??= 0;
return $this->nextNumber++;
}
// Mimic the current behaviour of a virtual property: Online PHP editor | rfc for cAfAI
set => throw new Error('Property Example::$nextNumber is read-only');
}
}

Fair enough. 1 and 2 are reasons why we added the `$field` macro as an
alternative syntax in the original draft. I don't quite understand
point 3. In Kotlin, `field` is only usable within its associated hook.
Other languages I'm aware of do not provide a way to access the
backing value directly, neither inside nor outside the accessor.

We are already allowing more than Kotlin by letting hooks call out to a method, and have that method refer back to the raw value. Hypothetically, we could allow *any* method to access it, using some syntax like $this->foo::raw. As a spectrum from least access to most access:

1) $field - accessible only in the lexical scope of the hook

2) $this->foo - accessible in the dynamic scope of the hook, e.g. a hook calling $this->doSomething(__PROPERTY__);

3) $this->foo::raw - accessible anywhere in the class, e.g. a public clearAll() method by-passing hooks

Whichever we provide for backed properties, option 3 is available for virtual properties anyway, and common with __get/__set: store a value in a private property, and have a public hooked property providing access to it.

I understand now that option 2 fits most easily with the implementation, and with decisions around inheritance and upgrade of existing code; but the other options do have their advantages from a user's point of view.

I personally do not feel strongly about whether asymmetric types make
it into the initial implementation. Larry does, however, and I think
it is not fair to exclude them without providing any concrete reasons
not to. I will spend time in the following days cleaning up tests, and
I will try my best to try to break asymmetric types. If I (or anybody
else) can't find a way to do so, I don't see a reason to remove them.

My concern is more about the external impact of what is effectively a change to the type system of the language: will IDEs give correct feedback to users about which assignments are legal? will tools like PhpStan and Psalm require complex changes to analyse code using such properties? will we be prevented from adding some optimisation to OpCache because these properties break some otherwise safe assumption?

Maybe I'm being over-cautious, but those are the kinds of questions I would expect to come up if this feature had its own RFC.

Regards,

--
Rowan Tommins
[IMSoP]

Hi Rowan

On Sun, Mar 17, 2024 at 3:41 PM Rowan Tommins [IMSoP]
<imsop.php@rwec.co.uk> wrote:

The remaining difference I can see in the current RFC which seems to be
unnecessary is that combining &get with set is only allowed on virtual
properties. Although it may be "virtual" in the strict sense, any &get
hook must actually be referring to some value stored somewhere - that
might be a backed property, another field on the current class, a
property of some other object, etc:

public int $foo { &get => $this->foo; set { $this->foo = $value; } }

public int $bar { &get => $this->_bar; set { $this->_bar = $value; } }

public int $baz { &get => $this->delegatedObj->baz; set {
$this->delegatedObj->baz = $value; } }

This sentence from the RFC applies equally to all three of these examples:

> That is because any attempted modification of the value by reference
would bypass a |set| hook, if one is defined.

I suggest that we either trust the user to understand that that will
happen, and allow combining &get and set on any property; or we do not
trust them, and forbid it on any property.

I'm indeed afraid that people will blindly make their array properties
by-reference, without understanding the implications. Allowing
by-reference behavior for virtual read/write properties is a tradeoff,
for cases where it may be necessary. Exposing private properties
by-reference is already possible outside of hooks
(Online PHP editor | output for VNhf7), that's not something we can prevent for
secondary backing properties. However, we can at least make sure that
a reference to the baking value of a hooked property doesn't escape.

I realize this is somewhat inconsistent, but I believe it is
reasonable. If you want to expose the underlying property
by-reference, you need to jump through some additional hoops.

> Apart from the things already mentioned, it's unclear to me whether,
> with such `set;` declarations, a `get`-only backed property should
> even be legal. With the complete absence of a write operation, the
> assignment within the `set` itself would fail. To make this work, the
> absence of `set;` would need to mean something like "writable, but
> only within another hook", which introduces yet another form of
> asymmetric visibility.

Any write inside the get hook already by-passes the set hook and refers
to the underlying property, so there would be no need for any default
set behaviour other than throwing an error.

It's not likely to be a common scenario, but the below works with the
current implementation Online PHP editor | rfc for t7qhR

class Example {
     public int $nextNumber {
         get {
             $this->nextNumber ??= 0;
             return $this->nextNumber++;
         }
         // Mimic the current behaviour of a virtual property:
Online PHP editor | rfc for cAfAI
         set => throw new Error('Property Example::$nextNumber is
read-only');
     }
}

Again, it depends on how you think about it. As you have argued, for a
get-only property, the backing value should not be writable without an
explicit `set;` declaration. You can interpret `set;` as an
auto-generated hook, or as a marker that indicates that the backing
value is accessible without a hook. As mentioned in my previous
e-mail, auto-generated hooks is something we'd really like to avoid.
So, if the absence of `set;` means that the backing value is not
writable, the hook itself must be exempt from this rule.

Another thing to consider: The current implementation inherits the
backing value and all hooks from its parent. If the suggestion is to
add an explicit `set;` declaration to make it more obvious that the
property is writable, how does this help overridden properties?

class P {
    public $prop {
        get => strtolower($this->prop);
        set;
    }
}

class C extends P {
    public $prop {
        get => strtoupper(parent::$prop::get());
    }
}

Even though `P::$prop` signals that it is writable, there is no such
indication in `C::$prop`. You may suggest to also add `set;` to the
child, but then what if the parent adds a custom implementation for
`set;`?

class P {
    public $prop {
        get => strtolower($this->prop);
        set {
            echo $value, "\n";
            $this->prop = $value;
        }
    }
}

class C extends P {
    public $prop {
        get => strtoupper(parent::$prop::get());
        set;
    }
}

The meaning for `set;` is no longer clear. Does it mean that there's a
generated hook that accesses the backing field? Does it mean that the
backing field is accessible without a hook? Or does it mean that it
accesses the parent hook? The truth is, with inheritance there's no
way to look at the property declaration and fully understand what's
going on, unless all hooks must be spelled out for the sake of clarity
(e.g. `get => parent::$prop::get()`).

We are already allowing more than Kotlin by letting hooks call out to a
method, and have that method refer back to the raw value.
Hypothetically, we could allow *any* method to access it, using some
syntax like $this->foo::raw. As a spectrum from least access to most access:

1) $field - accessible only in the lexical scope of the hook

2) $this->foo - accessible in the dynamic scope of the hook, e.g. a hook
calling $this->doSomething(__PROPERTY__);

3) $this->foo::raw - accessible anywhere in the class, e.g. a public
clearAll() method by-passing hooks

Whichever we provide for backed properties, option 3 is available for
virtual properties anyway, and common with __get/__set: store a value in
a private property, and have a public hooked property providing access
to it.

I seriously doubt accessing the backing value outside of the current
hook is useful. The backing value is an implementation detail. If it
is absolutely needed, `ReflectionProperty::setRawValue()` offers a way
to do it. I understand the desire for a shorter alternative like
`$field`, but it doesn't seem like the majority shares this desire at
this point in time.

I understand now that option 2 fits most easily with the implementation,
and with decisions around inheritance and upgrade of existing code; but
the other options do have their advantages from a user's point of view.

A different syntax like `$this->prop::raw` comes with similar
complexity issues, similar to those previously discussed for
`parent::$prop`/`parent::$prop = 'prop'`. All operations that
currently work out-of-the-box for the currently proposed syntax (read,
assign, assign by ref, assign with operation (+=, -=, etc.), inc/dec,
isset, send by-ref) would need new implementations. We could limit the
new syntax to read/assign/assign by ref, but that means more typing
for all other cases (e.g. `$this->prop::raw = $this->prop::raw + 1`
vs. `$this->prop++`). So, is that really better?

My concern is more about the external impact of what is effectively a
change to the type system of the language: will IDEs give correct
feedback to users about which assignments are legal? will tools like
PhpStan and Psalm require complex changes to analyse code using such
properties? will we be prevented from adding some optimisation to
OpCache because these properties break some otherwise safe assumption?

I don't consider Opcache a problem. An assignment on a non-final
property with mismatched types is no longer guaranteed to fail.
However, Opcache doesn't usually optimize error cases, e.g. replacing
them with direct exceptions. I can't speak for IDEs or static
analyzers, but I'm not sure what makes this case special. We can ask
some of their maintainers for feedback.

Maybe I'm being over-cautious, but those are the kinds of questions I
would expect to come up if this feature had its own RFC.

Of course, no worries. I'd rather hear it now than when the voting has
started. :wink:

Ilija

I do understand that falling back to (b) makes the implementation simpler, and works well with inheritance and some use cases; but falling back to (c) wouldn’t necessarily need a “default hook”, just a marker of “has hooks”.

···

On 18/03/2024 00:04, Ilija Tovilo wrote:

I realize this is somewhat inconsistent, but I believe it is
reasonable. If you want to expose the underlying property
by-reference, you need to jump through some additional hoops.

I disagree with this reasoning, because I foresee plenty of cases where a virtual property is necessary anyway, so doesn’t provide any additional hoop to jump through.

But there’s not much more to say on this point, so I guess we’ll leave it there.

Again, it depends on how you think about it. As you have argued, for a
get-only property, the backing value should not be writable without an
explicit `set;` declaration. You can interpret `set;` as an
auto-generated hook, or as a marker that indicates that the backing
value is accessible without a hook.

Regardless of which of these views you start with, it still seems intuitive to me that accesses inside the get hook would bypass the normal rules and write to the raw value.

Leaving aside the implementation, there are three things that can happen when you write to a property:

a) the set hook is called
b) the raw property is written to
c) an error is thrown

Inside the dynamic scope of a hook, the behaviour is always (b), and I don’t see any reason for that to change. From anywhere else, backed properties currently try (a) and fall back to (b); virtual properties try (a) and fall back to (c).

It occurred to me you could implement it in reverse: auto-generate a hook “set => throw new Error;” and then remove it if the user opts in to the default set behaviour. That would keep the “write directly” case optimised “for free”; but it would be awkward for inheritance, as you’d have to somehow avoid calling the parent’s hook.

The meaning for `set;` is no longer clear. Does it mean that there's a
generated hook that accesses the backing field? Does it mean that the
backing field is accessible without a hook? Or does it mean that it
accesses the parent hook? The truth is, with inheritance there's no
way to look at the property declaration and fully understand what's
going on, unless all hooks must be spelled out for the sake of clarity
(e.g. `get => parent::$prop::get()`).

Yes, I think this is probably a good argument against requiring “set;”

I think “be careful when inheriting only one hook” will always be a key rule to teach anyway, because it’s easy to mess up (e.g. assuming the parent is backed and accessing $this->foo, rather than calling the parent’s hook implementation). But adding “set;” into the mix probably just makes it worse.

I seriously doubt accessing the backing value outside of the current
hook is useful. The backing value is an implementation detail. If it
is absolutely needed, `ReflectionProperty::setRawValue()` offers a way
to do it. I understand the desire for a shorter alternative like
`$field`, but it doesn't seem like the majority shares this desire at
this point in time.

The example of clearAll() is a real use case, which people will currently achieve with __get and __set (e.g. the Yii ActiveRecord implementation I linked in one of my previous messages).

The alternative wouldn’t be reflection, it would just be switching to a virtual property with the value stored in a private field. I think that’s fine, it’s just drawing the line of which use cases backed properties cover: Kotlin covers more use cases than C#; PHP will cover more than Kotlin (methods able to by-pass a hook when called from that hook); but it will draw the line here.

A different syntax like `$this->prop::raw` comes with similar
complexity issues, similar to those previously discussed for
`parent::$prop`/`parent::$prop = 'prop'`.

Yeah, I can’t even think of a nice syntax for it, let alone a nice implementation. Let’s leave it as a thought experiment, no further action needed. :slight_smile:

Regarding asymmetric types:

I can't speak for IDEs or static
analyzers, but I'm not sure what makes this case special. We can ask
some of their maintainers for feedback.

In order to reliably tell the user whether “$a->foo = $b->bar;” is a type-safe operation, the analyser will need to track two types for every property, the “gettable type” and the “settable type”, and apply them in the correct contexts.

I’ve honestly no idea whether that will be easy or hard; it will probably vary between tools. In particular, I get the impression IDEs / editor plugins sometimes have a base implementation used for multiple programming languages, and PHP might be the only one that needed this extra tracking.

Regards,

-- 
Rowan Tommins
[IMSoP]