Re: [PHP-DEV] [RFC] Property accessor hooks, take 2

On Wed, Feb 21, 2024, 19:57 Larry Garfield <larry@garfieldtech.com> wrote:

Hello again, fine Internalians.

After much on-again/off-again work, Ilija and I are back with a more polished property access hooks/interface properties RFC. It’s 99% unchanged from last summer; the PR is now essentially complete and more robust, and we were able to squish the last remaining edge cases.

Barring any major changes, we plan to bring this to a vote in mid-March.

It’s long, but that’s because we’re handling every edge case we could think of. Properties involve dealing with both references and inheritance, both of which have complex implications. We believe we’ve identified the most logical handling for all cases, though.

Note the FAQ question at the end, which explains some design choices.

There’s one outstanding question, which is slightly painful to ask: Originally, this RFC was called “property accessors,” which is the terminology used by most languages. During early development, when we had 4 accessors like Swift, we changed the name to “hooks” to better indicate that one was “hooking into” the property lifecycle. However, later refinement brought it back down to 2 operations, get and set. That makes the “hooks” name less applicable, and inconsistent with what other languages call it.

However, changing it back at this point would be a non-small amount of grunt work. There would be no functional changes from doing so, but it’s lots of renaming things both in the PR and the RFC. We are willing to do so if the consensus is that it would be beneficial, but want to ask before putting in the effort.


Larry Garfield
larry@garfieldtech.com

Hi, thanks for the RFC and the effort put into trying to make it palatable to skeptical minds!

After reading most of the discussion in this thread I believe that the RFC in its current form can work and that I will get used to its “peculiarities”, but an idea occurred to me that may have some advantages, so here goes:

Use the “set” keyword that you’ve already introduced to set the raw value of a “backed” property:

    public string $name {
        set {
            set strtoupper($value);
        }
    }

Or when used in short form:

    public string $name {
        set => set strtoupper($value);
    }

Advantages in no particular order:

  1. Shorter than $this->name
  2. No magic $field
  3. Short and long form work the same

Disadvantage: “set” can only be used to set the raw value inside the hook method itself. Or maybe that’s a good thing too. To be honest, I don’t love that $this->name sometimes goes through the hook and sometimes not. I’d prefer if the raw value could only be accessed inside the hooks or via a special syntax, e.g. $this->name:raw

If there are any use cases or technical details that I’ve missed that would make this syntax unfavourable, I apologize.

Another observation (I apologize for being late to the game but it was a long RFC and thread to read through):

What would happen if we stopped talking about virtual vs. backed properties? Couldn’t we just treat a property that was never set the same as any other uninitialized property?
What I mean is that if you try to access the raw value of a property with a set hook that never sets its own raw value, you’d get either null or “Typed property […] must not be accessed before initialization”, just like you’d expect if you’re already used to modern PHP. Of course you’d just write your code correctly so that that never happens. It’s already the case that uninitialized properties are omitted when serializing the object, so there would be no difference there either.

The advantage here would be that there’s no need to detect the virtual or backed nature of the property at compile time and the RFC would be a lot shorter.

Thank you for your consideration!

Best,
Jakob


On Tue, Mar 26, 2024, at 8:18 PM, Jakob Givoni wrote:

Hi, thanks for the RFC and the effort put into trying to make it
palatable to skeptical minds!

After reading most of the discussion in this thread I believe that the
RFC in its current form can work and that I will get used to its
"peculiarities", but an idea occurred to me that may have some
advantages, so here goes:

Use the "set" keyword that you've already introduced to set the raw
value of a "backed" property:

    public string $name {
        set {
            set strtoupper($value);
        }
    }

Or when used in short form:

    public string $name {
        set => set strtoupper($value);
    }

Advantages in no particular order:
1. Shorter than $this->name
2. No magic $field
3. Short and long form work the same

Disadvantage: "Set" can only be used to set the raw value inside the
hook method itself. Or maybe that's a good thing too. To be honest, I
don't love that $this->name sometimes goes through the hook and
sometimes not. I'd prefer if the raw value could only be accessed
inside the hooks or via a special syntax, e.g. $this->name:raw

If there are any use cases or technical details that I've missed that
would make this syntax unfavourable, I apologize.

Interesting idea. Not being able to write the raw value except in the set hook isn't a bug, but an important feature, so that's not a downside. (Modulo reflection, which is a reasonable back-door.)

However, there are a few other disadvantages that probably make it not worth it.

1. `set` is not actually a keyword at the moment. It's contextually parsed in the lexer, so it doesn't preclude using `set` as a constant or function name the way a full keyword does. (PHP has many of these context-only keywords.) Making it a keyword inside the body of the hook would do that, however.
2. Like $field, it would be a syntax you just "have to know". Most people seem to hate that idea, right or wrong.
3. Like the considered syntaxes for parent-access, it wouldn't be possible to do anything but a direct write. So `set => set++` wouldn't be possible, whereas with $this->prop all existing operations should "just work."
4. Would we then also want a `get` keyword in the get hook to be parallel? What does that even do there? It would have the same implications as point 3 in get, so we're back to $field by a different spelling.

So it's an interesting concept, but the knock-on effects would lead to a lot more complications.
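
For illustration, points 1 and 3 can be sketched as follows, assuming the hook syntax described in the RFC (and later released in PHP 8.4). The class `Person`, its properties, and the free function `set()` are hypothetical names made up for this sketch, not anything taken from the RFC or the PR.

```php
<?php
// Point 1: `set` is only a contextual keyword, so it stays legal as an
// ordinary function name. (Hypothetical helper, not part of the RFC.)
function set(string $raw): string
{
    return strtoupper(trim($raw));
}

final class Person // hypothetical example class
{
    public string $name = '' {
        // Inside its own hook, $this->name addresses the backing value
        // directly, so this plain assignment does not re-enter the hook.
        set {
            $this->name = set($value);
        }
    }

    public string $history = '' {
        // Point 3: because the backing value is reached via $this->history,
        // a compound operator such as .= works, not only a direct write.
        set {
            $this->history .= $value . ';';
        }
    }
}

$p = new Person();
$p->name = '  larry ';
$p->history = 'created';
$p->history = 'renamed';

echo $p->name, "\n";    // LARRY
echo $p->history, "\n"; // created;renamed;
```

With a dedicated `set <expr>` statement, the second hook would presumably have to read the old value some other way, which is where the `get`/`$field` question in point 4 comes from.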

Another observation (I apologize for being late to the game but it was
a long RFC and thread to read through):

What would happen if we stopped talking about virtual vs. backed
properties? Couldn't we just treat a property that was never set the
same as any other uninitialized property?
What I mean is, that if you try to access the raw value of a property
with a set hook that never sets its own raw value, you'd get either
null or Typed property [...] must not be accessed before
initialization, just like you'd expect if you're already used to modern
PHP. Of course you'd just write your code correctly so that that never
happens. It's already the case that uninitialized properties are
omitted when serializing the object so there would be no difference
there either.

The advantage here would be that there's no need to detect the virtual
or backed nature of the property at compile time and the RFC would be a
lot shorter.

Unfortunately the backed-vs-virtual distinction is quite important at an implementation level for a few reasons.

1. A backed property reserves memory space for that property. A virtual property does not. Making virtual properties "unused backed" properties would increase memory usage for values that would never be usable.
2. There would be no realistic way to differentiate between a get-only virtual property with no storage, and a backed property that just happens to have a get hook but no set hook. Meaning you would be able to write to an otherwise-inaccessible backing value of the property.
3. That would then appear in serialization, even though it's impossible to get to from code without using reflection. Which is just all kinds of confusing.

So for practical reasons, the distinction isn't just a user-facing difference but an important engine-level distinction we cannot avoid.
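
A minimal sketch of the two kinds of property, assuming the syntax from the RFC as it later shipped in PHP 8.4. The class `User` and its property names are hypothetical, and `ReflectionProperty::isVirtual()` is used here only as a convenient way to observe the distinction.

```php
<?php
final class User // hypothetical example class
{
    public function __construct(
        private string $first,
        private string $last,
    ) {}

    // Virtual: no hook refers to $this->fullName, so no storage is reserved.
    public string $fullName {
        get => $this->first . ' ' . $this->last;
    }

    // Backed: the hook writes $this->login, so a backing value exists.
    public string $login = '' {
        set {
            $this->login = strtolower($value);
        }
    }
}

$u = new User('Larry', 'Garfield');
$u->login = 'Crell';

echo $u->fullName, "\n"; // Larry Garfield
echo $u->login, "\n";    // crell

// Reflection exposes the engine-level distinction.
var_dump((new ReflectionProperty(User::class, 'fullName'))->isVirtual()); // bool(true)
var_dump((new ReflectionProperty(User::class, 'login'))->isVirtual());    // bool(false)

try {
    $u->fullName = 'Someone Else'; // get-only virtual property: nothing to write to
} catch (Error $e) {
    echo "write rejected\n";
}
```

If `$fullName` were treated as an "unused backed" property instead, that last assignment would have somewhere to land, which is the situation described in points 2 and 3.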

Cheers.

--Larry Garfield