[PHP-DEV] [RFC][Vote announcement] Property hooks

Erick_de_Azevedo_Lim · April 10, 2024, 4:51pm

Reminder to not run into a sunk cost fallacy.

My main point was not the cost/work itself, but the fact that big concerns should be raised ASAP, and not on the 11th hour, as Robert said.
Whether we like it or not, the voting right beholders’ concerns should be taken into account with “special attention”, because, well…
They are the ones that, in the end, decide if your RFC will make it to the language or not.
So having no negative feedback or having it only at the last minute before voting is not good at all.
I think doing this and voting for a “no” for a feature that was so well received and discussed extensively by the internalians discourages
OSS developers from contributing to the language we care so much about.

Regards,
Erick

Jeffrey_Dafoe · April 10, 2024, 4:57pm

On Wed, Apr 10, 2024 at 4:01 PM Erick de Azevedo Lima <mailto:ericklima.comp@gmail.com> wrote:
Maybe if such a feedback was given before and it was decided to go for a trimmed version of the feature, maybe Ilija/Larry could have had less work to implement and test all the variantes they've done. I think if a person has such numerous concerns, it should be exposed ASAP even if your thoughts are not
totally clear. If you had a chat directly with Ilija/Larry, they would be aware of such concerns and would expend efforts to address these concerns.

To an outsider, it looks wild when feedback starts coming in right before the vote starts. What's even more startling is that there are people with voting rights who have never participated in the discussion at all, yet have a right to wordlessly affect the vote's outcome. I sincerely hope Ilija and Larry's work don't
go to waste here.

I hope that the property hooks feature goes through. I've used it in C# and it reads nice and saves time.

Waiting until the last minute to express a significant design disagreement is disrespectful of the time the RFC authors have spent working on this proposal. It's also counterproductive to the PHP project overall.

-Jeff

Marco_Pivetta · April 10, 2024, 4:58pm

Agreed, but to give you some more context:

I know JRF has been holding off feedback about it for a long time, waiting for the RFC to be more “stable” (i.e. not being a moving flubber) before posting thoughts
I personally usually only get into RFCs very early on (when I have a clear idea/opinion formed on a topic upfront) or at the end, when the RFC has gone through the ironing out a lot: the in-between is a brawl for which 24h/day aren’t sufficient.

The nature of email threads in a mailing list makes RFCs extremely hard to approach anything that is “in-flight”.

···

Marco Pivetta

https://mastodon.social/@ocramius

https://ocramius.github.io/

Mark_Trapp · April 10, 2024, 8:41pm

On Wed, Apr 10, 2024 at 9:59 AM Jeffrey Dafoe <JDafoe@fsx.com> wrote:

> On Wed, Apr 10, 2024 at 4:01 PM Erick de Azevedo Lima <mailto:ericklima.comp@gmail.com> wrote:
> Maybe if such a feedback was given before and it was decided to go for a trimmed version of the feature, maybe Ilija/Larry could have had less work to implement and test all the variantes they've done. I think if a person has such numerous concerns, it should be exposed ASAP even if your thoughts are not
> totally clear. If you had a chat directly with Ilija/Larry, they would be aware of such concerns and would expend efforts to address these concerns.

> To an outsider, it looks wild when feedback starts coming in right before the vote starts. What's even more startling is that there are people with voting rights who have never participated in the discussion at all, yet have a right to wordlessly affect the vote's outcome. I sincerely hope Ilija and Larry's work don't
> go to waste here.

I hope that the property hooks feature goes through. I've used it in C# and it reads nice and saves time.

Waiting until the last minute to express a significant design disagreement is disrespectful of the time the RFC authors have spent working on this proposal. It's also counterproductive to the PHP project overall.

-Jeff

Just a friendly reminder that feedback on why people vote against RFCs
has been desperately sought after and requested each time a seemingly
popular RFC is rejected, and the feedback provided by others for this
RFC was done so in good faith, before voting started, allowing the RFC
authors to address said feedback should they choose to acknowledge it.

And in general, the wider PHP community does not pay attention to RFCs
until/unless they come up for a vote.

Admonishing people for providing unexpected or negative feedback will
have a chilling effect for future RFC attempts, as it's much easier to
just silently vote no than to be attacked simply for providing a
dissenting opinion at an inconvenient time.

- Mark

Tim_Dusterhus · April 10, 2024, 7:31pm

Hi

On 4/10/24 10:28, Arvids Godjuks wrote:

The amount of complexity in these two hooks is on the level of the whole
PHP OOP model with the number of footguns and WTF cases to rival magic
quotes, register_globals=on and every other infamous gotcha we had in the
engine that got deleted out of existence.

As someone who read the RFC like 20 times while it evolved (and just read it once more before submitting this email):

Most of what the RFC spells out explicitly, because that's what RFCs need to, is a pretty much natural consequence of how the existing PHP features work. You cannot really shorten the RFC without either making it underspecified or useless.

Is there some syntax that could technically be omitted? Sure. Would doing so meaningfully simplify the RFC? No, because most of the text length comes from the interaction of existing syntax and semantics with hooks and thus is an inherent property of property hooks (pun not intended). It's not the RFCs fault that references exist, or that casting an object to an array is legal, or whatever else needs to be explicitly spelled out.

I also believe that the chosen syntax fits nicely with the existing syntax, borrowing syntax where useful and inventing new syntax where the existing syntax does not provide anything.

It's a clear improvement over the status quo of __get() and __set(), especially from a third-party tooling perspective. I'm in favor.

Best regards
Tim Düsterhus

Arvids_Godjuks · April 10, 2024, 8:17pm

Yeah we had a discussion with Larry on SF Slack, he made some things clear that it’s either all or nothing, because once they dove in, it became clear that there’s no way to gradually to introduce it. So, if it passes, the community is just gonna have to deal with the massive scope of it if they want the hooks or forget about them.

···

Arvīds Godjuks+371 26 851 664
arvids.godjuks@gmail.com
Telegram: @psihius https://t.me/psihius

Matthew_Weier_O_Phin · April 10, 2024, 10:07pm

Hi everyone

Heads-up: Larry and I would like to start the vote of the property
hooks RFC tomorrow:
https://wiki.php.net/rfc/property-hooks

We have worked long and hard on this RFC, and hope that we have found
some middle-ground that works for the majority. One last concern we
have not officially clarified on the list:

https://externals.io/message/122445#122667

After reading JRF’s reply, I found myself nodding along. I feel like while a lot of the details of my original feedback were addressed, the themes largely were not, and many overlap with what Juliette noted.

I’ve taken some time again today to review the proposal, and I think I can boil my main concern down to consistency, and ensuring reviewer comprehension when somebody is reviewing code that includes hooks. New features in PHP really should veer away from allowing implicit behavior, and should be internally consistent with existing syntax (unless they are attempting to make an explicit change to existing syntax).

On this latest review, I’ve got fewer concerns, but I still have some.

I’m not a fan of implicit. I strongly feel that there’s no use case for set without an argument. Requiring the argument makes explicit that it is receiving a value, and also makes it explicit at the definition point what value type is accepted, and what it is named. (I realize I might be in a minority here, though, and the fact that I can require this as a CS rule (eventually) mitigates it to a degree. It feels like an opportunity to reduce errors by requiring it, however.)
I’m not a huge fan of the short syntax, but the improvements in the most recent draft are mostly ones I can live with. The part that’s still unclear is when and where hooks need a ; termination, and why the ; is used, instead of ,. When using match() expressions, you use , to separate the expressions, but for some reason, the proposal uses ; … but only when using short expressions. And if you have a full-form mixed with a short form, the ; is only needed for the short-form expression. This feels arbitrary, and it will be easy to get it wrong for people comfortable with match() statements.

My gut take is that the syntax should use , to separate hooks in ALL cases:

public string $content {
get {
$content = str_replace(' ', ' ', $this->content);
return $this->convert($content);
},
set(string $value) => $value,
}

public string $summary {
get => $this->convert($this->summary),
set(string $value) {
$value = str_replace((' ', ' ', $value);
$this->summary = $value;
},
}

This more closely matches match() expressions (pardon the pun), and provides a template for how we might allow blocks in match() expressions in the future.

(Larry tells me that it’s match() being weird here, but considering that for many developers, their only point of reference for this sort of syntax IS match(), making it feel like the language is ignoring its own syntax when creating new syntax.)

While I’d likely prefer Marco’s approach to references (just don’t allow them), the fact that they mirror how __get() and __set() currently work gives a migration path for users who are familiar with that paradigm’s gotchas. In other words, it’s consistent with the current language, and will make migrating from __set/__get to hooks easier. It’s a lot of complexity, but the table you created helps with that. That table MUST make it to the docs for the feature!
The interaction with serialization has a LOT of different cases, and looks like a great place to observe foot guns in the future. I’d like to see a more consistent, predictable approach here, but knowing how many pitfalls there are in serialization normally, I’m not expecting one; at least it’s being consistent with how each of the serialization methods already work. What I DO expect is that you create a table detailing the behavior for the docs, similar to the one you did for references/backed values/virtual.

While I definitely feel for QA/CS tool makers (this is a HUGE chunk of new syntax, and non-trivial), I also don’t know how this could be done in a way that would be simpler. My only suggestions would be to remove short-form syntax and require type/varname for set always, but even with those changes, I think similar levels of complexity will still exist to write parsers/formatters for these anyways.

I personally have wanted these features in the language for likely 15 years. Maybe the foundation might be able to sponsor some developer time towards helping the QA/CS tool makers with these features once merged (assuming the RFC passes)?

···

Matthew Weier O’Phinney
mweierophinney@gmail.com
https://mwop.net/
he/him

Robert_Landers · April 10, 2024, 10:23pm

On Wed, Apr 10, 2024 at 10:41 PM Mark Trapp <mark@itafroma.com> wrote:

On Wed, Apr 10, 2024 at 9:59 AM Jeffrey Dafoe <JDafoe@fsx.com> wrote:
>
> > On Wed, Apr 10, 2024 at 4:01 PM Erick de Azevedo Lima <mailto:ericklima.comp@gmail.com> wrote:
> > Maybe if such a feedback was given before and it was decided to go for a trimmed version of the feature, maybe Ilija/Larry could have had less work to implement and test all the variantes they've done. I think if a person has such numerous concerns, it should be exposed ASAP even if your thoughts are not
> > totally clear. If you had a chat directly with Ilija/Larry, they would be aware of such concerns and would expend efforts to address these concerns.
>
> > To an outsider, it looks wild when feedback starts coming in right before the vote starts. What's even more startling is that there are people with voting rights who have never participated in the discussion at all, yet have a right to wordlessly affect the vote's outcome. I sincerely hope Ilija and Larry's work don't
> > go to waste here.
>
> I hope that the property hooks feature goes through. I've used it in C# and it reads nice and saves time.
>
> Waiting until the last minute to express a significant design disagreement is disrespectful of the time the RFC authors have spent working on this proposal. It's also counterproductive to the PHP project overall.
>
> -Jeff

Just a friendly reminder that feedback on why people vote against RFCs
has been desperately sought after and requested each time a seemingly
popular RFC is rejected, and the feedback provided by others for this
RFC was done so in good faith, before voting started, allowing the RFC
authors to address said feedback should they choose to acknowledge it.

And in general, the wider PHP community does not pay attention to RFCs
until/unless they come up for a vote.

Admonishing people for providing unexpected or negative feedback will
have a chilling effect for future RFC attempts, as it's much easier to
just silently vote no than to be attacked simply for providing a
dissenting opinion at an inconvenient time.

- Mark

Hey Mark,

I think that is fair, but I'd also point out that RFCs have been
rejected in the past simply because voters felt it was "too last
minute" before a release. I don't think it's fair to decline RFCs for
being too last minute, but when the inverse happens to an RFC: a huge
dissenting opinion at the last second, call foul.

Largely, I agree with you but it's worth pointing out the incongruity here.

Robert Landers
Software Engineer
Utrecht NL

ilutov · April 11, 2024, 11:00pm

Hi Juliette

On 9-4-2024 16:03, Juliette Reinders Folmer wrote:

On 8-4-2024 23:39, Ilija Tovilo wrote:

https://wiki.php.net/rfc/property-hooks

I realize it is late in the discussion period to speak up, but for months I've been trying to find the words to express my concerns in a polite and constructive way and have failed.

First of all, thank you for taking the time to analyze the RFC and
form your thoughts. I know how much time and effort it takes.
Nonetheless, I hope you understand that receiving feedback of this
sort literally minutes before a vote is planned is close to the worst
way to go about it, both from a practical standpoint, but also RFC
author and reviewer morale.

Many of the points you raise have been there for many months to a
year, and many have been raised and discussed many times over. I will
try to address each, and give people some time to respond.

And not just one syntax, but five and the differences in the semantics of the function syntaxes are significant.

I think this is somewhat of a misrepresentation. For reference, you
can find the grammar here. It's only about <50 lines.

github.com

php/php-src/blob/bf390b47c7522fd4d130be26b8ee97520f985275/Zend/zend_language_parser.y#L1088-L1131


      
          hooked_property:
          		T_VARIABLE backup_doc_comment '{' property_hook_list '}'
          			{ $$ = zend_ast_create(ZEND_AST_PROP_ELEM, $1, NULL, ($2 ? zend_ast_create_zval_from_str($2) : NULL), $4); }
          	|	T_VARIABLE '=' expr backup_doc_comment '{' property_hook_list '}'
          			{ $$ = zend_ast_create(ZEND_AST_PROP_ELEM, $1, $3, ($4 ? zend_ast_create_zval_from_str($4) : NULL), $6); }
          ;
          
          property_hook_list:
          		%empty { $$ = zend_ast_create_list(0, ZEND_AST_STMT_LIST); }
          	|	property_hook_list property_hook { $$ = zend_ast_list_add($1, $2); }
          	|	property_hook_list attributes property_hook {
          			$$ = zend_ast_list_add($1, zend_ast_with_attributes($3, $2));
          		}
          ;
          
          optional_property_hook_list:
          		%empty	{ $$ = NULL; }
          	|	'{' property_hook_list '}'	{ $$ = $2; }
          ;

This file has been truncated. show original

Notably, `get` and `set` hooks don't actually have a different
grammar. Instead, hooks have names, optional parameter lists, and an
optional short (`=>`) or long (`{}`) body. Whether all possible
combinations are considered different syntax is open to
interpretation. It's true that additional rules apply to each hook,
e.g. whether they are allowed to declare a parameter list, and how the
return value is used. I could see an argument being made for allowing
an empty parameter list (`()`) for `get`, for consistency. Let us know
if you have any other ideas to make them more congruent.

In any case, I don't believe PHP_CodeSniffer would need to handle each
combination separately.

What would your optimal solution look like? I feel this discussion
cannot progress unless we have more concrete discussion points.

* The implicit "set" parameter, which does not have a explicit parameter, does not have parentheses and has the magically created $value variable.

To be a bit more exact, the parameter list is simply inferred to
`(<property type> $value)` so that the user does not need to spell it
out. This is not akin to other magic variables we had previously, like
`$http_response_header`.

TL;DR: this RFC tries to do too much in one go and introduces a huge amount of cognitive complexity with all the exceptions and the differences in behaviour between virtual and backed properties. This cognitive complexity is so high that I expect that the feature will catch most developers out a lot of the time.

Many people still say that this RFC doesn't do enough because it
doesn't support x/y/z. This is including you:

* The arrow function variant for `set` with all the above differences + the differences inherent to arrow functions with the above mentioned exceptions + the implicit assignment, which breaks the expected behaviour of arrow functions by assigning the result of the expression instead of returning it (totally understandable, but still a difference).
* Properties _may_ or _may not_ have a default value anymore, depending on whether the hooks cause it to be a backed or a virtual property.
* Properties with hooks can _be_ readonly, but cannot be declared as readonly.
* Only available for object properties, static properties are excluded.
* Readonly properties which are not denoted as readonly, but still are due to their virtual nature (get without access to $this->prop), which can only be figured out by, again, studying the contents of the hook function.

There are more below that I will be commenting on in more detail.

I understand that, in a perfect world, each language feature would
automatically work with every other. In reality, this is either not
possible, or complex and time consuming. We opted for the middle
ground, allowing things that were workable and omitting things that
weren't, or required unrelated changes. If this is not acceptable,
then I genuinely don't know how to approach non-trivial RFCs.

* Properties can now be declared as abstract, but only with explicit hook requirements, otherwise the abstract keyword is not allowed.
* Properties can now be declared on interfaces, but only with explicit hook requirements, otherwise they are not allowed.

The semantics of plain interface properties are not clear.
Intuitively, `public $prop;` looks equivalent to `public $prop { get;
set; }`, but due to reference semantics it is not. It's not even
equivalent to `{ &get; set; }`, because by-reference hooked properties
still have some inherent limitations
(PHP: rfc:property-hooks).
Again, this could be addressed, but only at the cost of complexity.

* Abstract/Interface properties don't comply with the same covariance/contravariance rules as other properties

Do you mean that they are not invariant as other properties? I think
it's generally accepted that LSP rules should be as lax as possible,
while being sound. Note that this rule isn't unique to
read-/write-only abstract properties either, but also applies to
read-/write-only virtual properties.

class P {
    public string|int $readonly { get => 'foo'; }
}

class C extends P {
    public int $readonly { get => 42; }
}

And last but not least, there is the ability to declare property hooks in constructor property promotion ... where we can now basically have a function declaration within the signature of a method declaration.... with all five possible syntax types.

The grammar used for hooks in constructor property promotion is the
same as the one used for properties. Am I wrong to assume this
shouldn't cause any additional work for PHP_CodeSniffer?

* `var` for property declarations is not supported according to the RFC.

This is a misunderstanding between Larry and me. I personally never
intended to deprecate `var` in this RFC. We've briefly discussed this
and decided to treat `var` as normal, meaning as an alias of `public`.

* Disallows references to properties, but only when there is a set hook.
* Disallows indirect modification of a property, but only when there is a set hook.

The RFC spends much time explaining these limitations. If the RFC is
ever accepted, in any form, then we must accept either of these
approaches:

1. The engine places limitations on by-reference usage of properties
when there is a `set` hook.
2. The user can trivially circumvent the `set` hook.

This limitation is not unique to PHP either.

* Write-only (virtual) properties, which cannot be read and you need to study the hook declaration in detail to figure out if a property is or is not write-only.

Indeed. But this has also been discussed in much detail. One proposed
solution was explicitly marking properties as virtual. This comes with
the downside of changing the property signature, making conversion of
a backed parent property to virtual (and vice versa) a BC break. If
you have a different solution in mind, please share.

Oh, and hang on, they *can* be read from a method in the same class as long as that method is called from within the set hook, so now the method will need to either do a backtrace or use Reflection to figure out whether it has access to the property value.

Right. We use the same mechanism as `__get`/`__set` uses to protect
against recursion. In particular, when entering a magic method for a
particular property, PHP remembers this using a "property guard". It's
really just a marker in the object that sets the magic method type,
along with the property name.

When the same property is accessed again, the magic method or hook is
skipped, and PHP behaves as if the magic method or hook wasn't there
at all. For `__get`, `__set` and hooks of virtual properties, this
means accessing a dynamic property. For backed properties, this means
accessing its backing value.

After discussing it, we've decided to make virtual properties slightly
less surprising by throwing, instead of trying to create a dynamic
property. This would be consistent with how virtual read-, and
write-only properties are handled outside of hooks, when they are
written to or read, respectively.

now why does that remind me of the magic __...() methods ?

That's no secret. We have mentioned many times that we use the same
mechanism for accessing backing value as `__get`/`__set` uses for
accessing dynamic properties.

* Whether a type can be specified on the parameter on `set` depends on whether the property is typed. You cannot declare `set(mixed $value)` for an untyped property, even though it would effectively be compatible. This is inconsistent with the behaviour for, for instance method overloads, where this is acceptable: Online PHP editor | rfc for hbCor , though it is consistent with the behaviour of property overloads, where this is not acceptable: Online PHP editor | output for seDWM (anyone up for an RFC to fix this inconsistency ?)

For child properties, it is not legal to add `mixed` to a previously
untyped property. Online PHP editor | output for mScQS I was honestly not
aware of the case you outlined. Anyway, either approach we chose will
be inconsistent with the other.

* Changes the meaning of `$this->property` read/write access for all methods called from within a hook while executing the hook.

What is the right solution to this in your opinion? Recursion surely
isn't it. Throwing when accessing the backing value outside of its
corresponding hook? I will investigate whether this is possible in an
efficient way, and whether it's worth considering.

* Creates two different flavour of properties: "backed" properties and "virtual" properties with significantly different behaviours, raising cognitive complexity.
This includes, but is not limited to the fact that set hooks can be bypassed on virtual properties via a get reference.

Sure, but is there actually a solution to this? Making all properties
backed means that serialization includes backing values that aren't
used, and that the property supports useless reads/writes for
properties that are only read or only written to. Only supporting
virtual properties means that there's an inevitable BC break when
converting a backed property to a virtual one, because the explicit
backing property cannot share the same name as the virtual one, which
will once again affect serialization.

* The range of different behaviours for various forms of serialization.

Indeed, we carefully picked the behavior that should make sense for
each case. Isn't that better than the user having to remember that
they need to handle serialization/JSON themselves, depending on the
case, which they also need to identify themselves?

[1] The RFC also states in "Interaction with isset() and unset()" that the behaviour of property hooks is consistent with how `isset()` interacts with `__get()` today. This is not true, as in: the behaviour as described for isset with magic methods is different than what the RFC states it is: Online PHP editor | rfc for 2Arkh

Right, this paragraph is a bit misleading. In essence, `isset`
triggers both `__isset` and `__get` to 1. verify that the property
even exists and 2. if it does, that it is non-null. For hooked
properties, we don't need to verify the former, because the property
is explicitly declared. So we only call `get` and make sure its value
is non-null.

We'll make sure to clarify this better.

Additionally, it proposes for isset() to throw an Error for virtual properties without a get hook. This is surprising behaviour as isset() by its nature is expected to *not* throw errors, but return false for whatever is not set or inaccessible from the current context: Online PHP editor | output for 3OlgM

Personally, I would rather know when checking `isset` on something
that can never return `true`. But I admit that this case is not
clear-cut. The throwing approach can be relaxed to return false
without a BC break though, but not the opposite.

[2] PHP supports "multi-property declarations", but I see no mention of these in the RFC, which makes it unclear if and if so, how property hooks would work with multi-property declarations.
class Foo {
public string $bar, $baz, $bab;
}

The implementation does not allow mixing multi-property declarations
with hooks. I find them confusing (e.g. are `public int $foo, $bar;`
two typed properties, or one typed and one untyped?). The same applies
for hooks. We'll make sure to spell this out explicitly.

[3] If a `set` hook gets a named parameter, is the magic `$value` variable still available or not ? I'd expect not, but it is unclear from the RFC.

No. As mentioned above, on omission of the parameter list, it is
simply inferred. There's no "truly" magic `$value` variable.

[4] Why should ReflectionProperty::getRawValue() throw an error for static properties ? Instead of returning the value, identically to ReflectionProperty::getValue() ? This is surprising to me and I see no reason for it.

Because it's equivalent to `getValue`, and indicates the user is doing
something they don't fully understand.

As for the impact on static analysis tools....

I won't comment too much on this, because it's your area of expertise.
Of course, I sympathize. If there's a solution that works well for us
_and_ you, let me know. But it doesn't sound like some simple tweaks
can significantly improve this problem for you.

Here are the things we're changing:

* `var` continues to be supported.
* We will change accessing a virtual property with the same name from
within a method called from a hook to throw, rather than create a
dynamic property.
* We will change accessing a backed property from within a method
called from a hook to throw as well, assuming it can be done
performantly. (We need to verify this.)

To summarize: I understand your concerns and why you raise them.
However, none of the complexity in this RFC is arbitrary. Most of it
is inherent in the problem space, and the RFC’s approach is the result
of extensive experimentation to determine possible solutions. While
some superficial changes are possible, the actual complex parts
(reference handling, inheritance, interfaces, virtual properties,
etc.) are all necessary complexity, not accidental complexity. Absent
a novel alternative, which after over a year of work and extensive
discussions on this list and elsewhere we do not believe exists, we
believe this RFC represents "the best hooks implementation that PHP
can support." And that implementation deserves a final vote to
determine if the internals community wants hooks at all or not.

Ilija

Juliette_Reinders_Fo · April 12, 2024, 2:55am

Thank you for your extensive response. No matter how much time I spend trying to voice my concerns properly, I clearly still failed. My email was not a criticism of the work on property hooks as such. It was, more than anything, an opinion piece. The first part of my email (up to the “Furthermore”) was mostly me trying to summarize the complexity of the proposal and to compare it to existing syntaxes and list those things which - to me - appeared inconsistent with other PHP features. It was not intended as criticism on individual choices made and I’m not sure whether, if I were designing this feature, I would have made different choices, other than to have just one syntax - the “full” syntax (with a get();/set(); variant - yes, always with parentheses - for abstract/interface properties). This first part was meant as a summation to highlight how incredibly complex this new feature is and how many opportunities it offers developers to shoot themselves in the foot. Let alone, how many problems this can cause when developers use classes in dependencies which use this new feature and they have limited to no awareness of whether they are using virtual or backed properties on the dependency. As the proposal purports to be a better alternative to the magic _get()/__set() methods, I would expect it to prevent a lot of the problems associated with those methods. However, what I found - and what you seemingly confirmed in your reply - is that this new feature does not in actual fact solve the problems magic methods have. It has the same problems + more problems associated with the new feature itself. Part of my concern is therefore that this feature adds significant extra complexity to the engine, for little to no real benefit, other than to have a nicer syntax for creating the same problems we already had before. In that sense, my email was not to ask you to make changes to the proposal [], but far more to try and make sure that those people with the power to vote and without the time to fully read and really grep the (really really long) proposal, are put in a position to make an informed choice. [] The part after the “Furthermore” did contain some criticism, where my points marked with [2] and [3] are AFAICS just a matter of clarifying things in the RFC, while [1] and [4] would possibly warrant changes, though I would expect (hope) the impact of those in dev time to be small. Oh and yes, my remark about var would also warrant a change in the RFC text. I can see that those textual changes in the RFC have been made in the mean time, so thank you for adding the additional clarifications for those points to the RFC. As for [1] and [4], let me respond to your answers to those points: In my opinion, a plain call to isset() should never have to be put in a try-catch, but I understand your point about being conservative now and being able to relax the approach later without BC-break. I would definitely be in favour of relaxing the approach to make isset() return false, but let’s see what other have to say about it. Except that it isn’t equivalent behaviour, as getValue() will return the value for both static and non-static properties without any problems. See: which is based on Example #1 in the documentation of Forgive me, but I had to chuckle at the mention of virtual properties needing to access a dynamic property in the context of a property guard. My thoughts went straight to the following question “Does this mean that support for dynamic properties could never be removed if this RFC gets voted in ? And if so, can we please get rid of the deprecation notice (and need for the attribute) ?” But I see you’ve already answered that question by changing the behaviour to throw now. Now, as for the impact on PHP_CodeSniffer: For the purposes of PHP_CodeSniffer, having all those different variations of the syntax does actually make things exponentially more complex. I’ll try to give you some idea, though I fear this is not really PHP Internals mailinglist material as it is a bit too detailed and too specific to one tool. The Tokenizer layer in PHPCS will need extensive changes to accommodate recognizing set and get as “function” keywords, but only when used in the context of a property declaration. Though, hang on, not only in the context of a property declaration, also in the context of a function parameter declaration in the case of constructor property promotion. Next PHPCS will hit another problem (still in the Tokenizer layer), which is that “function” keywords are expected to get markers for the parentheses, which is now a problem. Similarly, “function” keywords (for non-abstract, non-interface functions) are expected to have markers for the start and end of the scope, but PHPCS searches for those starting from the close parenthesis… Only once all that is solved and stable, can we even start to look at the sniffs themselves. Let’s take as an example a sniff trying to examine the contents of a function. Such a sniff will typically first check if the function has a body (based on the scope markers) and bow out if not, then gather the declared parameters and associated information from between the parentheses and then examine the contents of the function between the curlies (or between the => and the “guessed” end of the expression for arrow functions (as this is still not 100% solved)) for whatever the sniff is looking for. Now, for the property hooks, this would mean each of these sniffs will need changes along the following lines: - the hook function keywords would need to be added to indicate the sniff should examine property hooks as well - the “gathering of the parameters” part is broken as the hook function may not have parentheses, which would typically be seen by sniffs as indicative of use in an IDE (live coding) or a parse error and would be a reason to bow out, so special handling would need to be put in place for this. - but even then the “gathering of the parameters” is broken as if there are no parentheses, the set hook will automatically get the $value parameter, but none of the other info about the parameter is available, like the type, so now the sniff will need to examine the property itself to figure out the type of $value. - etc Now, all of this is, of course, solvable, but it will take years to update all actively used sniffs to account for all these different situations. From a PHPCS point of view, things would be much more straight forward if the hooks would always take parentheses, the set hook would always need a declared parameter, return types were allowed and only the syntax with curlies was supported. That way the syntax matches the pre-existing function syntaxes much more closely. And while writing this, I already know that limiting the RFC to only the full syntax + parentheses and return types will be extremely unpopular among the readers here, so kind request to people here: no need to flame me. I know. I’m only providing an answer to the question posed, I don’t expect the RFC to be changed to accommodate PHPCS. Yes, your assumption is unfortunately wrong. See above. Agreed. But just to clarify: please don’t see what I highlighted regarding PHP_CodeSniffer as a “me” problem. This problem will affect all users of PHP_CodeSniffer who want to use the new syntax for years to come. I fully agree. It does deserve a final vote and the main thing I was trying to do with my mail was add another perspective to inform voters. I realize it may not be a popular perspective and that was part of why I struggled so much writing the email, as I know how much thought you both have put into this and I so very much respect all that attention to detail and all the careful considerations you have both made. It pains me to know how much work you and Larry have put into this feature, especially as I really kind of like the idea of the feature. Having said that, sometimes one has to take a step back and look at the wider picture and ask oneself if something solves the original problem or exacerbates it. Unfortunately, my conclusion was the latter. We’ll see via the vote, what the conclusion drawn by the voters is. With all my regards, Juliette

···

Hi Ilija,

On 12-4-2024 1:00, Ilija Tovilo wrote:

On 9-4-2024 16:03, Juliette Reinders Folmer wrote:

I realize it is late in the discussion period to speak up, but for months I've been trying to find the words to express my concerns in a polite and constructive way and have failed.

First of all, thank you for taking the time to analyze the RFC and
form your thoughts. I know how much time and effort it takes.
Nonetheless, I hope you understand that receiving feedback of this
sort literally minutes before a vote is planned is close to the worst
way to go about it, both from a practical standpoint, but also RFC
author and reviewer morale.
Many of the points you raise have been there for many months to a
year, and many have been raised and discussed many times over.

[1] Additionally, it proposes for isset() to throw an Error for virtual properties without a get hook. This is surprising behaviour as isset() by its nature is expected to *not* throw errors, but return false for whatever is not set or inaccessible from the current context: [https://3v4l.org/3OlgM](https://3v4l.org/3OlgM)

Personally, I would rather know when checking `isset` on something
that can never return `true`. But I admit that this case is not
clear-cut. The throwing approach can be relaxed to return false
without a BC break though, but not the opposite.

[4] Why should ReflectionProperty::getRawValue() throw an error for static properties ? Instead of returning the value, identically to ReflectionProperty::getValue() ? This is surprising to me and I see no reason for it.

Because it's equivalent to `getValue`, and indicates the user is doing
something they don't fully understand.

https://3v4l.org/v4F0S https://www.php.net/manual/en/reflectionproperty.getvalue.php

Oh, and hang on, they *can* be read from a method in the same class as long as that method is called from within the set hook, so now the method will need to either do a backtrace or use Reflection to figure out whether it has access to the property value.

Right. We use the same mechanism as `__get`/`__set` uses to protect
against recursion. In particular, when entering a magic method for a
particular property, PHP remembers this using a "property guard". It's
really just a marker in the object that sets the magic method type,
along with the property name.

When the same property is accessed again, the magic method or hook is
skipped, and PHP behaves as if the magic method or hook wasn't there
at all. For `__get`, `__set` and hooks of virtual properties, this
means accessing a dynamic property. For backed properties, this means
accessing its backing value.

After discussing it, we've decided to make virtual properties slightly
less surprising by throwing, instead of trying to create a dynamic
property. This would be consistent with how virtual read-, and
write-only properties are handled outside of hooks, when they are
written to or read, respectively.

And not just one syntax, but five and the differences in the semantics of the function syntaxes are significant.

Notably, `get` and `set` hooks don't actually have a different
grammar. Instead, hooks have names, optional parameter lists, and an
optional short (`=>`) or long (`{}`) body. Whether all possible
combinations are considered different syntax is open to
interpretation. It's true that additional rules apply to each hook,
e.g. whether they are allowed to declare a parameter list, and how the
return value is used. I could see an argument being made for allowing
an empty parameter list (`()`) for `get`, for consistency. Let us know
if you have any other ideas to make them more congruent.
In any case, I don't believe PHP_CodeSniffer would need to handle each
combination separately.

What would your optimal solution look like? I feel this discussion
cannot progress unless we have more concrete discussion points.

And last but not least, there is the ability to declare property hooks in constructor property promotion ... where we can now basically have a function declaration within the signature of a method declaration.... with all five possible syntax types.

The grammar used for hooks in constructor property promotion is the
same as the one used for properties. Am I wrong to assume this
shouldn't cause any additional work for PHP_CodeSniffer?

I won't comment too much on this, because it's your area of expertise.
Of course, I sympathize. If there's a solution that works well for us
_and_ you, let me know. But it doesn't sound like some simple tweaks
can significantly improve this problem for you.

To summarize: I understand your concerns and why you raise them.
However, none of the complexity in this RFC is arbitrary. Most of it
is inherent in the problem space, and the RFC’s approach is the result
of extensive experimentation to determine possible solutions. While
some superficial changes are possible, the actual complex parts
(reference handling, inheritance, interfaces, virtual properties,
etc.) are all necessary complexity, not accidental complexity. Absent
a novel alternative, which after over a year of work and extensive
discussions on this list and elsewhere we do not believe exists, we
believe this RFC represents "the best hooks implementation that PHP
can support." And that implementation deserves a final vote to
determine if the internals community wants hooks at all or not.

mabe · April 12, 2024, 7:29am

Hi,

Kind regards, Marc

···

On 12.04.24 04:55, Juliette Reinders Folmer wrote:

Forgive me, but I had to chuckle at the mention of virtual properties needing to access a dynamic property in the context of a property guard. My thoughts went straight to the following question “Does this mean that support for dynamic properties could never be removed if this RFC gets voted in ? And if so, can we please get rid of the deprecation notice (and need for the attribute) ?” But I see you’ve already answered that question by changing the behaviour to throw now.

This was something I was confused by as well but wasn’t sure at all if this would actually be an issue in a real world. So I tried to come up with a real world example.

If I understand correctly you already changed the behavior to throw now but I still want to share anyway …

Imaging, as library author, you have a property with hooks that contain a typo and you want to rename it in a BC way. Let’s say you want to rename “oldProp” to “newProp” but both should work the same as if it would be the same property.

class Test {
const PREFIX = ‘backed:’;

public string $newProp {
set => $this->prefixize($value);
get => $this->getProp();
}

public string $oldProp {
set {
$this->newProp = $value;
}
get => $this->getProp();
}

/** Complex value transformer */
private function prefixize($value) {
return self::PREFIX . $value;
}

/** Complex value reverse transformer */
private function unprefixize($prefixed) {
return substr($prefixed, strlen(self::PREFIX));
}

private function getProp() {
return $this->unprefixize($this->newProp);
}
}

$t = new Test();
$t->newProp = ‘foo’;
var_dump($t); // object(Test)#1 (1) { [“newProp”]=> string(10) “backed:foo” }
var_dump($t->newProp); // string(3) “foo”
var_dump($t->oldProp); // string(0) “”

$t->oldProp = ‘bar’;
var_dump($t); // { [“newProp”]=> string(10) “backed:bar” }
var_dump($t->newProp); // string(3) “bar”
var_dump($t->oldProp); // string(0) “”

Hi Ilija,

On 12-4-2024 1:00, Ilija Tovilo wrote:

Oh, and hang on, they *can* be read from a method in the same class as long as that method is called from within the set hook, so now the method will need to either do a backtrace or use Reflection to figure out whether it has access to the property value.

Right. We use the same mechanism as `__get`/`__set` uses to protect
against recursion. In particular, when entering a magic method for a
particular property, PHP remembers this using a "property guard". It's
really just a marker in the object that sets the magic method type,
along with the property name.

When the same property is accessed again, the magic method or hook is
skipped, and PHP behaves as if the magic method or hook wasn't there
at all. For `__get`, `__set` and hooks of virtual properties, this
means accessing a dynamic property. For backed properties, this means
accessing its backing value.

After discussing it, we've decided to make virtual properties slightly
less surprising by throwing, instead of trying to create a dynamic
property. This would be consistent with how virtual read-, and
write-only properties are handled outside of hooks, when they are
written to or read, respectively.

Rowan_Tommins_IMSoP · April 12, 2024, 1:27pm

On 10 April 2024 04:40:13 BST, Juliette Reinders Folmer <php-internals_nospam@adviesenzo.nl> wrote:

* Whether a type can be specified on the parameter on `set` depends on whether the property is typed. You cannot declare `set(mixed $value)` for an untyped property, even though it would effectively be compatible. This is inconsistent with the behaviour for, for instance method overloads, where this is acceptable: Online PHP editor | rfc for hbCor , though it is consistent with the behaviour of property overloads, where this is not acceptable: Online PHP editor | output for seDWM (anyone up for an RFC to fix this inconsistency ?)

Just picking up on this point, because it's a bit of a tangle: PHP currently makes a hard distinction between "typed properties" and "untyped properties". For instance, unset() works differently, and the "readonly" attribute can only be added to a typed property.

That's actually rather relevant to your point, because if this RFC passes we would probably need to consider that PHP has at least 4 types of properties:

- dynamic properties (deprecated by default, but allowed with an attribute)
- declared but untyped properties
- typed properties
- virtual properties

But maybe 6, with:

- untyped properties with hooks
- typed properties with hooks

Of course, most of the time, users aren't aware of the current 3-way split, and they won't need to think about all 6 of these variations. But there are going to be cases where documentation or a future RFC has to cover edge cases of each.

I do think there is scope for removing some features from the RFC which are nice but not essential, and reducing these combinations. For instance, if we limit the access to the underlying property, we might be able to treat "virtual properties" as just an optimisation: the engine doesn't allocate a property it knows will never be accessed, and accesses to it, e.g. via reflection, just return "uninitialized".

I am however conscious that RFCs have failed in the past for being "not complete enough" as well as for being "too complex".

Regards,
Rowan Tommins
[IMSoP]

Crell · April 12, 2024, 4:25pm

On Fri, Apr 12, 2024, at 2:55 AM, Juliette Reinders Folmer wrote:

The first part of my email (up to the "Furthermore") was mostly me
trying to summarize the complexity of the proposal and to compare it to
existing syntaxes and list those things which - to me - appeared
inconsistent with other PHP features. It was not intended as criticism
on individual choices made and I'm not sure whether, if I were
designing this feature, I would have made different choices, other than
to have just one syntax - the "full" syntax (with a `get();`/`set();`
variant - yes, always with parentheses - for abstract/interface
properties).

This first part was meant as a summation to highlight how incredibly
complex this new feature is and how many opportunities it offers
developers to shoot themselves in the foot.
Let alone, how many problems this can cause when developers use classes
in dependencies which use this new feature and they have limited to no
awareness of whether they are using virtual or backed properties on the
dependency.

The RFC itself is complex because PHP is complex, and the problem space is complex. For instance, when hooks exist, what does that do for the return value of `=`? Most devs probably don't even remember that there is one, but as a language design element it has to be considered.

Overall, the guiding design principle for this RFC was to make it as transparent as possible, so that devs who "use classes in dependencies which use this new feature and they have limited to no awareness of whether they are using virtual or backed properties on the dependency" do not need to care. The whole point is that it should be a transparent change, or as transparent as possible. Most of the complexity of the RFC is figuring out what the most transparent behavior would be, and ensuring that works.

As indicated in the introduction, a primary use case for hooks is to not need to use them at all; 90%+ of all Doctrine entities I've seen have dumb boilerplate do-nothing get/set methods, placed there as a "just in case we need it" or "because That's How It's Done In Java Beans." Occasionally they do a little extra set validation, but not in the majority case.

Hooks allow eliminating that "just in case" boilerplate. If hooks get almost no usage in the wild, but the option to use them eliminates hundreds of thousands of lines of effectively no-op getX() methods, I will consider it a resounding success.

As the proposal purports to be a better alternative to the magic
`_get()`/`__set()` methods, I would expect it to prevent a lot of the
problems associated with those methods. However, what I found - and
what you seemingly confirmed in your reply - is that this new feature
does *not* in actual fact solve the problems magic methods have. It has
the same problems + more problems associated with the new feature
itself.

Part of my concern is therefore that this feature adds significant
extra complexity to the engine, for little to no real benefit, other
than to have a nicer syntax for creating the same problems we already
had before.

I believe this is a highly disingenuous description of the RFC.

* Hooks are automatically static-analysis-friendly. __get is not.
* Hooks are automatically IDE autocomplete-friendly. __get is not.
* Hooks provide built-in type enforcement. __get does not.
* Hooks allow properties to be self-documenting. __get does not.
* Hooks avoid smooshing multiple different virtual properties into a single mega-method. __get requires that.
* Hooks allow opting-in to reference semantics per-property. __get is all-or-nothing.
* Hooks have clear, consistent integration with serialization (in that the behavior with each serialization mechanism is consistent with that mechanism's expectations). __get doesn't interact with serialization at all, leaving you to do everything manually.
* Hooks allow properties to be declared without special behavior, and add special behavior later without a BC break. __get... only kinda sorta does that, if you use it just right.
* Hooks cleanly allow pre-defining properties to get the substantial memory savings of well-defined properties. __get may or may not, depending on how you impelement it.

From a usability perspective, hooks are vastly superior to using __get/__set directly.

However, both __get/__set and hooks operate in the same problem space: Inserting extra user-space code in between what otherwise seems like a boring variable read/write. *That* concept naturally involves complexity, by definition, and has lots of edge cases, by definition. Any attempt to insert user-space code into that operation would run into the exact same set of complexities; References exist, we have to deal with them somehow. `=` evaluates to a value, we have to deal with that somehow. Inheritance exists, we have to deal with that somehow. isset() exists, we have to deal with that somehow. This is all essential complexity.

In most cases, we chose to address issues in the same way that __get/__set do precisely to keep it as simple as possible and reduce cognitive overhead for developers. The corner cases where the code insertion cannot be perfectly transparent already exist with __get/__set, and we therefore defaulted to "make them not-quite-transparent in the same way it's already not-quite-transparent," precisely to minimize the amount of thinking developers have to do.

In that sense, my email was not to ask you to make changes to the
proposal [*], but far more to try and make sure that those people with
the power to vote and without the time to fully read and really grep
the (really really long) proposal, are put in a position to make an
informed choice.

Given the extensive feedback the RFC has gotten, both on list and off, I am confident that those who care to understand the level of complexity already do.

[*] The part after the "Furthermore" did contain some criticism, where
my points marked with [2] and [3] are AFAICS just a matter of
clarifying things in the RFC, while [1] and [4] would possibly warrant
changes, though I would expect (hope) the impact of those in dev time
to be small.
Oh and yes, my remark about `var` would also warrant a change in the
RFC text.

I can see that those textual changes in the RFC have been made in the
mean time, so thank you for adding the additional clarifications for
those points to the RFC.

And we appreciate you calling out those inconsistencies. We just wish you had done so 6 weeks ago, when it could certainly have been done in a polite and constructive way, as you said you wanted to do.

[4] Why should ReflectionProperty::getRawValue() throw an error for static properties ? Instead of returning the value, identically to ReflectionProperty::getValue() ? This is surprising to me and I see no reason for it.

Because it's equivalent to `getValue`, and indicates the user is doing
something they don't fully understand.

Except that it isn't equivalent behaviour, as `getValue()` will return
the value for both static and non-static properties without any
problems.
See: Online PHP editor | output for v4F0S which is based on Example #1 in the
documentation of
PHP: ReflectionProperty::getValue - Manual

Static properties have no concept of virtual or backed values, so anyone calling getRawValue() on a static property is already doing something illogical. That said, if in practice we find this is problematic it is trivial to relax the throw to it being a pure alias of getValue() in the future; tightening it up in the future would be a BC break.

Forgive me, but I had to chuckle at the mention of virtual properties
needing to access a dynamic property in the context of a property guard.
My thoughts went straight to the following question "Does this mean
that support for dynamic properties could never be removed if this RFC
gets voted in ? And if so, can we please get rid of the deprecation
notice (and need for the attribute) ?"

As noted above, this was simply "what do __get/__set do?", which in some cases would result in a dynamic property. Whether that is wise or not is a different question. Based on this thread we've decided that there are no good use cases for this pathway, so we're going to deviate from __get/__set in this instance.

This is the text I just added to the RFC (in the Scoping section):

If a hook calls a method that in turn tries to read or write from the property again, that would normally result in an infinite loop. To prevent that, accessing the backing value of a property from a method called from a hook on that property will throw an Error. That is somewhat different than the existing behavior of __get and __set, where such sub-called methods would bypass the magic methods. However, as valid use cases for such circular logic are difficult to identify and there is added risk of confusion with dynamic properties, we have elected to simply block that access entirely.

I'll try to give you some idea, though I fear this is not really PHP
Internals mailinglist material as it is a bit too detailed and too
specific to one tool.

I am going to skip the weeds of the PHPCS implementation, and just ask you, and everyone, to provide this kind of feedback on future RFCs early, not at the literal last minute when everything was already polished. If there are small tweaks to the syntax that would have made user-space meta tools (formatters, SA tools, etc.) substantially easier, that's something we would have been open to hearing 8 weeks ago, or a year ago when this was first proposed. Addressing it now would require non-trivial design work from us, non-trivial implementation changes from Ilija, and non-trivial discussion and review from the half-dozen or so other reviewers that have taken an active interest in this RFC (which we very much appreciate!). Trying to do that now would burn several weeks more time, with only one person involved being paid for it (Ilija works for the Foundation), and like every RFC it's all still spec work.

Even if you're not sure how to make it "polite", provide that kind of feedback early and often, in bite-sized pieces, so that it can save everyone's time, including yours.

This is a major structural failing of the current RFC process, something many have pointed out already. It desperately needs to be addressed. Until it does, though, there are ways we can make the process at least a little less painful for all involved.

--Larry Garfield

ilutov · April 14, 2024, 11:44pm

Hi Matthew

We're going to skip over the reiterations of Juliettes arguments, as
they have already been responded to.

On Thu, Apr 11, 2024 at 12:08 AM Matthew Weier O'Phinney
<mweierophinney@gmail.com> wrote:

On Mon, Apr 8, 2024 at 4:41 PM Ilija Tovilo <tovilo.ilija@gmail.com> wrote:

https://externals.io/message/122445#122667

2. I'm not a huge fan of the short syntax, but the improvements in the most recent draft are _mostly_ ones I can live with. The part that's still unclear is when and where hooks need a `;` termination, and _why_ the `;` is used, instead of `,`. When using `match()` expressions, you use `,` to separate the expressions, but for some reason, the proposal uses `;` ... but only when using short expressions. And if you have a full-form mixed with a short form, the `;` is only needed for the short-form expression. This feels arbitrary, and it will be easy to get it wrong for people comfortable with `match()` statements.

I agree that the distinction of `,` and `;` isn't clear-cut. I would
categorize hooks as declarations, because they are really just
functions attached to the property. Declarations are `;`-terminated,
or have a body (`{}`) (properties, (non-)abstract methods, class
consts, etc). `,`, on the other hand, is used for lists of expressions
and other things (array elements, match cases, function arguments).

The other argument for the syntax we chose is that it's already used in C#.

(Larry tells me that it's `match()` being weird here, but considering that for many developers, their only point of reference for this sort of syntax IS `match()`, making it feel like the language is ignoring its own syntax when creating new syntax.)

Given my description from above, I don't believe this is true either.
Match arms most closely relate to the array element syntax, which
already uses `,`.

3. While I'd likely prefer Marco's approach to references (just don't allow them), the fact that they mirror how `__get()` and `__set()` _currently_ work gives a migration path for users who are familiar with that paradigm's gotchas. In other words, it's consistent with the current language, and will make migrating from `__set/__get` to hooks easier. It's a lot of complexity, but the table you created helps with that. That table MUST make it to the docs for the feature!

Precisely. Our decision to support references (as much as possible)
comes from the desire to maintain compatibility (as much as possible),
not only with `__get`/`__set` but also plain props.

Ilija