[PHP-DEV] Property hooks, and &get-only write behavior

Hi everyone

While reviewing the property hooks patch, Bob found a discrepancy in
the implementation and the specification. There is an unfortunate
mistake in one of the examples that suggests specific by-reference
behavior that we did not intend to support.

Specifically, from PHP: rfc:property-hooks
(you'll have to scroll down a bit):

class C {
    public string $a {
        &get {
            $b = $this->a;
            return $b;
        }
    }
}
$c = new C();
$c->a = 'beep';
// $c is unchanged.

This example implies that the assignment `$c->a = 'beep';` will invoke
`&get` to write to the contents of the reference. However, this is not
the case. Direct assignments will, in the absence of `set`, fall back
to the builtin write operation for backed properties, or throw for
virtual properties.

Instead of `$c->a = 'beep';`, the example _should_ have said:

$ref = &$c->a;
$ref = 'beep';

Here, `$ref = &$c->a;` _will_ call `&get`. What the example intended
to show is that the reference returned by `&get` is detached from
`$c->a` itself.

That said, I did give this idea some thought. I think it would not
have been unreasonable to automatically call `&get` and assign to the
returned reference when there is a `&get` hook but _no_ `set` hook.
However, when trying to implement this, I encountered some technical
issues that make this currently unfeasible. So, we'd like to move
forward with the original semantics: If there is no `set` hook, then
the assignment will trigger a builtin write for backed properties, or
an error for virtual properties.

If you wish to achieve the behavior described above (using `&get` to
write to the property), you can implement `set` to perform the write
to the reference yourself.

I hope that these semantics are acceptable for everyone.

Ilija

On Wednesday, 12 June 2024 at 16:06, Ilija Tovilo <tovilo.ilija@gmail.com> wrote:

Hi everyone

[...]

I hope that these semantics are acceptable for everyone.

Ilija

Hello Ilija,

I might know what sort of technical issues you've been running into, as I have some similar issues correctly handling references with my Container/Offset RFC implementation.
The way I'm handling this is by splitting the "get"/read handler/hook to *just* be a by-value return, and for fetching creating a new "fetch" handler that is called in the respective situations.
Therefore, I am wondering if it wouldn't possibly make more sense to add a "fetch" hook, and ban returning by-ref for the get hook.

Best regards,

Gina P. Banyard

Hi Gina

On Thu, Jun 13, 2024 at 3:24 AM Gina P. Banyard <internals@gpb.moe> wrote:

On Wednesday, 12 June 2024 at 16:06, Ilija Tovilo <tovilo.ilija@gmail.com> wrote:

> Hi everyone
>
> [...]
>
> I hope that these semantics are acceptable for everyone.
>
> Ilija

Hello Ilija,

I might know what sort of technical issues you've been running into, as I have some similar issues correctly handling references with my Container/Offset RFC implementation.
The way I'm handling this is by splitting the "get"/read handler/hook to *just* be a by-value return, and for fetching creating a new "fetch" handler that is called in the respective situations.
Therefore, I am wondering if it wouldn't possibly make more sense to add a "fetch" hook, and ban returning by-ref for the get hook.

I don't know exactly what issue you encountered, but it sounds
unrelated to mine. The issue was not related to the design of &get,
but the way object handlers work for properties. In particular,
get_property_ptr_ptr does not provide zval space for temporary values.

There are three object handlers relevant:

* get_property_ptr_ptr, which returns a direct pointer to the property
zval. This is usually used for read+write operations such as ++, or
when assigning by-ref, like =&.
* read_property
* write_property

read_property and write_property are also used as fallbacks when
get_property_ptr_ptr returns NULL. This is the case for hooks.
Essentially, for hooks, we want to make sure the engine does not
bypass the set hook by writing to the zval directly.

Pure assignments aren't really an issue, I was able to implement them
using write_property. When a value is assigned to a &get-only
property, we call &get and assign the value to the resulting
reference. However, for read+write operations, the flow works like
this:

* get_property_ptr_ptr returns NULL, indicating that a direct
modification of the zval is not possible.
* The engine calls read_property instead.
* It modifies the resulting value.
* It calls write_property with the new value.

With this workflow, &get gets called twice, once for read_property and
once for write_property, which is the unexpected part. I would expect
&get only to be called once, and for this value to be modified
in-place. To implement this properly, get_property_ptr_ptr should
return the reference instead.

However, the way get_property_ptr_ptr is built doesn't allow for this,
because unlike read_property, it doesn't have a `zval *rv` parameter
the caller must supply that the callee may use in the case of
temporary values (such as &__get, or here, &get).

So, rather than causing a big internal API change to add the `zend
*rv` parameter (this API is very old), I opted to go with the original
plan.

I hope that helps.

Ilija