[PHP-DEV] [RFC] Clone with v2

On Thu, May 15, 2025, at 2:56 PM, Rob Landers wrote:

On Thu, May 15, 2025, at 17:32, Tim Düsterhus wrote:

Hi

On 2025-05-15 14:14, Rob Landers wrote:
> For example, if you have a Money type, you'd want to be able to ensure
> it cannot be negative when updating via `with()`. This is super
> important for ensuring constraints are met during the clone.

That's why the assignments during cloning work exactly like regular
property assignments, observing visibility and property hooks.

The only tiny difference is that an “outsider” is able to change a
`public(set) readonly` property after a `__clone()` method ran to
completion and relied on the property in question not changing on the
cloned object after it observed its value. This seems not to be
something relevant in practice, because why would the exact value of the
property only matter during cloning, but not at any other time?

Best regards
Tim Düsterhus

Hey Tim,

why would the exact value of the
property only matter during cloning, but not at any other time?

For example, queueing up patches to store/db to commit later; during
the clone, it may register various states to ensure the patches are
accurate from that point. That's just one example, though, and it
suggests calling __clone *before* setting the values is the right
answer.

I think Larry's idea of just using hooks for validation is also pretty
good. As Larry said, the only thing you can really do is throw an
exception, and the same would be true in a constructor as well.

— Rob

The limit of hooks is that they're single-property. So depending on how your derived properties are implemented, it may be insufficient. I could easily write such an example (the hooks RFC included some), but how contrived they are, I don't know.

--Larry Garfield

I really like a way with arrays.
It allows users to combine what properties they want to re-set and call the clone function only once. Really good catch.

···

Volker Dusch
Head of Engineering

Tideways GmbH
Königswinterer Str. 116
53227 Bonn
https://tideways.io/imprint

Sitz der Gesellschaft: Bonn
Geschäftsführer: Benjamin Außenhofer (geb. Eberlei)
Registergericht: Amtsgericht Bonn, HRB 22127


Yeah, the validation won’t be too automatable (i.e., in a base class) without at least having reusable hooks. It’s important to be mindful that there are approximately three different paradigms for validating objects in PHP (and they apply to frameworks differently).

– validate on serialization –

This paradigm is mostly used in Symfony via doctrine/serializer. An object is allowed to be in an “invalid” state and is usually constructed in a “zero” state, and then state is applied over the lifetime of the application. It is usually validated only during serialization to the wire. So, it usually looks something like this:

$user = new User();
$user->name = "Rob";
$user->id = 123;

When you receive an object in a function, you will likely have to validate that specific properties are set before operating on it, otherwise you can end up with bugs. The nice thing about this style is that you can build up an object’s state over a longer period (such as processing a form input or a database query result).

– validate on construction –

This paradigm is commonly used in value objects and more functional-style PHP. The idea is that an object must be valid by the time it is constructed and does not allow partially formed objects to exist. All properties are validated in the constructor, and invalid combinations throw exceptions immediately. This tends to lead to more robust code, particularly when the object represents a meaningful invariant (e.g., Money, EmailAddress, Uuid). Here’s what it looks like:

$user = new User(name: "Rob", id: 123);

The downside is that it’s harder to build objects piece by piece, and it usually requires factories, DTOs, and builder patterns for times when not all the data is available upfront. This approach is usually seen in functional, DDD (domain-driven design), or strict typing contexts and plays nicely with immutability. This style is less common in popular PHP frameworks like Symfony and Laravel, which tend to favor more flexible object construction.

– validate on mutation –

This paradigm is popular in both Symfony and Laravel and favors an active-record-esque approach. In this paradigm, each mutator (setter) is responsible for ensuring the property remains valid at the point of modification. The downside is that inter-property constraints require redundant checks that can be difficult to maintain. The nice thing is that they’re easy to enforce.

Of course, these can be mixed and matched as needed. The downside with the cloning method here is that it really puts “validation on construction” on the back foot. Developers will likely have no way to perform validation without rewriting their entire class structures, which makes validation during construction basically impossible for immutable objects.
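To make the "validate on construction" concern concrete, here is a minimal sketch of how an immutable value object can re-validate today: the wither goes back through the constructor, so every check runs again. `Money`, `withAmount()` and the currency rule are hypothetical names for illustration and do not come from the RFC; the sketch assumes PHP 8.1+ for readonly properties.

```php
<?php
declare(strict_types=1);

// Hypothetical value object; all validation lives in the constructor.
final class Money
{
    public function __construct(
        public readonly int $amount,
        public readonly string $currency,
    ) {
        if ($amount < 0) {
            throw new InvalidArgumentException('Amount cannot be negative.');
        }
        if (strlen($currency) !== 3) {
            throw new InvalidArgumentException('Currency must be a 3-letter code.');
        }
    }

    // Today's "wither": building a new instance re-runs every constructor check.
    public function withAmount(int $amount): self
    {
        return new self($amount, $this->currency);
    }
}

$five = new Money(5, 'EUR');
$ten = $five->withAmount(10);

// An invalid update is rejected because it passes through the constructor.
try {
    $five->withAmount(-1);
    $rejected = false;
} catch (InvalidArgumentException $e) {
    $rejected = true;
}
```

An update path that assigns properties directly on a clone would not pass through this constructor, which is the situation the paragraph above worries about.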

— Rob

On 15/05/2025 18:56, Stephen Reay wrote:

I agree that no __clone and an empty __clone should behave the same way. But as I said, I believe they should behave the same way as they do *now*, until the developer opts in to support cloning with new values.

I think what Andreas is saying is that the behaviour of this:

$foo = clone($bar, someProperty: 42);

Should be consistent with the existing behaviour of this:

$foo = clone $bar;
$foo->someProperty = 42;

An empty or missing __clone method won't prevent `someProperty` being assigned in the existing case, and for everything other than readonly properties, the new syntax is purely sugar for the existing one.
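A small sketch of the existing two-step behaviour being referred to, runnable on PHP 8.1+ without any new syntax. The class `Bar` and the property `lockedProperty` are hypothetical names chosen for illustration.

```php
<?php
declare(strict_types=1);

// Hypothetical class with no __clone() method at all.
final class Bar
{
    public function __construct(
        public int $someProperty,
        public readonly int $lockedProperty,
    ) {}
}

$bar = new Bar(1, 2);

// Existing two-step form: clone, then assign on the copy.
$foo = clone $bar;
$foo->someProperty = 42;

// A readonly property cannot be assigned this way from outside the class.
$readonlyError = null;
try {
    $foo->lockedProperty = 42;
} catch (Error $e) {
    $readonlyError = $e;
}
```

The regular property is updated on the copy only, while the readonly assignment throws an `Error`; this second case is where the proposed syntax would differ.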

If I understand the RFC, the only change in behaviour is for readonly properties, which are "unlocked" during the "clone with" process. That means that if they were previously validated only in the constructor, this syntax can put the object in an unexpected state.

However, readonly properties are "protected(set)" by default, so the situations where this can happen are actually quite limited:

- Code inside the class itself (private scope) can reasonably be considered to be "opting in" to the feature it's using.
- Code in a sub-class (protected scope) can by default over-ride the constructor anyway.
- Code outside the class (public scope) will fail unless the property is explicitly "readonly public(set)", which would be pointless if it was always initialised in the constructor.

So the only case I can think of where something surprising could happen is:

1. A public or protected readonly property is initialised in a constructor marked "final"
2. A sub-class adds code that uses "clone with" to set that property to a new value

The question then is, how worried are we about that scenario?

--
Rowan Tommins
[IMSoP]

On Thu, 15 May 2025 at 15:55, Tim Düsterhus <tim@bastelstu.be> wrote:

Hi

On 2025-05-15 00:04, Larry Garfield wrote:

Subtle point here. If the __clone() method touches a readonly
property, does that make the property inaccessible to the new
clone-with?

Yes. Quoting from the RFC:

The currently linked implementation “locks” a property if it modified
within __clone(), if this is useful is up for debate.

A single unlock block would be confusing to me.

We’ve implemented it like that, because it felt most in line with what
was decided in
https://wiki.php.net/rfc/readonly_amendments#proposal_2readonly_properties_can_be_reinitialized_during_cloning,
which says:

Reinitialization of each property is possible once and only once:

We expect “public(set) readonly” + “__clone()” to be rare and from
within the class, the author knows how their __clone() implementation
works and can make sure it is compatible with whatever properties they
might want to update during cloning. The lack of “use cases” is the
primary reason we made the more conservative choice, but we are not
particularly attached to this specific behavior.

Being able to update a readonly property even if __clone already touched it looks critical to me because otherwise, it’d mean that adding a __clone method after publishing a first version of some class that has no __clone method would be a BC break.
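For context, a minimal sketch of what "touching" a readonly property in __clone() looks like today. It assumes PHP 8.3+, where the readonly amendments RFC linked above allows one reinitialization per property during cloning. `Revision` and `number` are hypothetical names.

```php
<?php
declare(strict_types=1);

// Hypothetical class; requires PHP 8.3+ for readonly reinitialization in __clone().
final class Revision
{
    public function __construct(
        public readonly int $number,
    ) {}

    public function __clone(): void
    {
        // Allowed once per property while the clone is being created.
        $this->number = $this->number + 1;
    }
}

$first = new Revision(1);
$second = clone $first;
```

Under the "locking" behaviour quoted above, adding such a __clone() to a class that previously had none would make a later clone-with update of `number` fail, which is the compatibility concern raised here.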

Nicolas

On Thu, 15 May 2025 at 16:06, Larry Garfield <larry@garfieldtech.com> wrote:

On Thu, May 15, 2025, at 1:22 AM, Stephen Reay wrote:

I may be missing something here..

So far the issues are “how do we deal with a parameter for the actual
object, vs new properties to apply”, “should __clone be called before
or after the changes” and “this won’t allow regular readonly properties
to be modified”.

Isn’t the previous suggestion of passing the new property arguments
directly to the __clone method the obvious solution to all three
problems?

There’s no potential for a conflicting property name, the developer can
use the new property values in the order they see fit relative to the
logic in the __clone call, and it’s inherently in scope to write to any
(unlocked during __clone) readonly properties.

I did some exploratory design a few years ago on this front, looking at the implications of different possible syntaxes.

https://peakd.com/hive-168588/@crell/object-properties-part-2-examples

What that article calls “initonly” is essentially what became readonly. The second example is roughly what this RFC would look like if the extra arguments were passed to __clone(). As noted in the article, the result is absolutely awful.

Auto-setting the values while using the clone($object, ...$args) syntax is the cleanest solution. Given that experimentation, I would not support an implementation that passes args to __clone and makes the developer figure it out. That just makes a mess.

Rob makes a good point elsewhere in thread that running __clone() afterward is a way to allow the object to re-inforce validation if necessary. My concern is whether the method knows it needs to do the extra validation or not, since it may be arbitrarily complex. It would also leave no way to reject the changes other than throwing an exception, though in fairness the same is true of set hooks. Which also begs the question of whether a set hook would be sufficient that __clone() doesn’t need to do extra validation? At least in the typical case?

One possibility (just brainstorming) would be to update first, then call __clone(), but give clone a new optional arg that just tells it what properties were modified by the clone call. It can then recheck just those properties or ignore it entirely, as it prefers. If that handles only complex cases (eg, firstName was updated so the computed fullName needs to be updated) and set hooks handle the single-property ones, that would probably cover all bases reasonably well.

I like where this is going but here is a variant that’d be even more capable:

we could pass the original object to __clone.

The benefits I see:

  • Allow implementing this validation logic you’re talking about.
  • Allow to skip deep-cloning of already updated properties (that’s a significant drawback of the current proposal - deep cloning before setting is a potential perf/mem hog built into the engine) : guarding deep-cloning with a strict comparison would be enough.
  • Allow implementing WeakMap that are able to clone weak-properties as objects are cloned.

On this last aspect, I think it’s new to the discussion, but it’s something I’ve always found very limiting when using WeakMap: let’s say some metadata is stored about object $foo in a WeakMap; it’s currently not possible to track that metadata across clones without using some nasty tricks. If __clone were given the original object, it’d be easy to duplicate metadata from $foo to $clone.

I have just one significant concern with adding an argument to __clone: it’d be a BC break to mandate this argument at the declaration level, and adding one right now generates an error with current versions of PHP.
However, I think we could (and should if confirmed) provide some FC/BC layer by allowing one to use func_get_args() in __clone. The engine could then verify at compile time that __clone has either one non-nullable “object” argument, or zero.
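The WeakMap limitation mentioned above can be sketched as follows (PHP 8.0+). The `$metadata` registry and its contents are hypothetical; the point is only that a clone is a distinct key, so the entry has to be copied by hand.

```php
<?php
declare(strict_types=1);

// Hypothetical metadata registry keyed by object identity.
$metadata = new WeakMap();

$foo = new stdClass();
$metadata[$foo] = ['source' => 'import'];

$clone = clone $foo;

// The clone is a different object, so the WeakMap has no entry for it.
$tracked = isset($metadata[$clone]);

// Today the copy has to be made by hand at every clone site.
$metadata[$clone] = $metadata[$foo];
```

If __clone() received the original object, this last step could live inside the class instead of at each call site, which is the benefit argued for above.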

Nicolas

Great input. Thank you Nicolas.

While this only applies to public(set) readonly and protected readonly properties that are then also touched in the __clone method, it is indeed an E_FATAL with the current implementation.

For these cases, that would indeed be an annoying gotcha. Even though I don’t have an example at hand, it makes sense to me to account for it. I’ll update the RFC and publish a changelog on Monday, and I’ll mention it there once I’ve had another look at the implementation.

Volker Dusch

I forgot to mention that on the flip side, this would allow public(set) readonly properties to be “overwritten” after __clone touched them, which is why we chose the more “locked down” version in the first place. So I want to consider this before just updating the RFC :)

Hey everyone,

Thank you for the participation so far, since the start of the discussion, from feedback on and off list, I’ve added a couple of examples:

We’re still looking for feedback on the ...variadic approach to the Syntax: https://wiki.php.net/rfc/clone_with_v2#open_issues, as we only got one reply so far on the topic.

Also thanks to theodorejb for touching up phrasing, correcting spelling mistakes and so on: https://wiki.php.net/rfc/clone_with_v2?do=revisions

Kind Regards,
Volker


Your summary matches my ideas around the topic and I’ve updated the RFC [1] to better clarify that we consider touching __clone to be out of scope.

From our perspective, we are not worried about the Points (1&2) you raised and are not interested in solving this by providing parameters to __clone as the additional cost, issues, and complexity are not a worthwhile tradeoff in our eyes.

Kind Regards,
Volker

[1] https://news-web.php.net/php.internals/127397


On Mon, May 19, 2025, at 5:48 AM, Volker Dusch wrote:

Hey everyone,

Thank you for the participation so far, since the start of the
discussion, from feedback on and off list, I've added a couple of
examples:

- PHP: rfc:clone_with_v2
- PHP: rfc:clone_with_v2
- We removed the behavior of __clone locking readonly properties when
the function updates them, due to forward compatibility concerns raised in
[RFC] Clone with v2 - Externals
- Updated PHP: rfc:clone_with_v2,
mentioning there that we consider touching __clone or adding parameters a
non-goal due to complexity and BC implications.
- Updated the property name of the variadic parameter from
$updatedProperties to $withProperties

We're still looking for feedback on the ...variadic approach to the
Syntax: PHP: rfc:clone_with_v2, as we only
got one reply so far on the topic.

Also thanks to theodorejb for touching up phrasing, correcting spelling
mistakes and so on: PHP: rfc:clone_with_v2

Kind Regards,
Volker

For positional parameters, I don't see any way that they'd work or do what someone expects. So why not just block them entirely instead of relying on dynamic properties to warn-but-sorta-work?

--Larry Garfield

On Fri, 16 May 2025 at 21:59, Nicolas Grekas
<nicolas.grekas+php@gmail.com> wrote:

I like where this is going but here is a variant that'd be even more capable:

we could pass the original object to __clone.

My proposal earlier was to pass the original object _and_ the values
that were passed to the clone call, by reference.
And this would happen before those values are assigned to the object.

class MyClass {
  public function __construct(
    public readonly int $x,
    public readonly int $y,
    public readonly int $z,
  ) {}
  public function __clone(object $original, array &$values): void {
    // Set a value directly, and modify it.
    if (isset($values['x'])) {
      $this->x = $values['x'] * 10;
      // Prevent that the same property is assigned again.
      unset($values['x']);
    }
  }
}

$obj = new MyClass(5, 7, 9);
$clone = clone($obj, x: 2, y: 3);
assert($clone->x === 20); // x was updated in __clone().
assert($clone->y === 3); // y was auto-updated after __clone().
assert($clone->z === 9); // z was not touched at all.

The benefits I see:
- Allow implementing this validation logic you're talking about.
- Allow to skip deep-cloning of already updated properties (that's a significant drawback of the current proposal - deep cloning before setting is a potential perf/mem hog built into the engine) : guarding deep-cloning with a strict comparison would be enough.
- Allow implementing WeakMap that are able to clone weak-properties as objects are cloned.

On this last aspect, I think it's new to the discussion, but it's something I've always found very limiting when using WeakMap: let's say some metadata is stored about object $foo in a WeakMap; it's currently not possible to track that metadata across clones without using some nasty tricks. If __clone were given the original object, it'd be easy to duplicate metadata from $foo to $clone.

I have just one significant concern with adding an argument to __clone: it'd be a BC break to mandate this argument at the declaration level, and adding one right now generates an error with current versions of PHP.
However, I think we could (and should if confirmed) provide some FC/BC layer by allowing one to use func_get_args() in __clone. The engine could then verify at compile time that __clone has either one non-nullable "object" argument, or zero.

This seems reasonable.

Nicolas

On Mon, 19 May 2025 at 16:30, Andreas Hennings <andreas@dqxtech.net> wrote:

My proposal earlier was to pass the original object and the values
that were passed to the clone call, by reference.

And this would happen before those values are assigned to the object.

class MyClass {
  public function __construct(
    public readonly int $x,
    public readonly int $y,
    public readonly int $z,
  ) {}
  public function __clone(object $original, array &$values): void {
    // Set a value directly, and modify it.
    if (isset($values['x'])) {
      $this->x = $values['x'] * 10;
      // Prevent that the same property is assigned again.
      unset($values['x']);
    }
  }
}

$obj = new MyClass(5, 7, 9);
$clone = clone($obj, x: 2, y: 3);
assert($clone->x === 20); // x was updated in __clone().
assert($clone->y === 3); // y was auto-updated after __clone().
assert($clone->z === 9); // z was not touched at all.

I’m not sure I understand; there might be missing bits to your idea, e.g. where is visibility enforced? Why is pass-by-ref needed at all?
Pass-by-ref makes me think this is a bad idea already :)

Also, WDYT of my simpler proposal itself? Wouldn’t it cover all use cases?

Note that I don’t see the need for operations like in your example (transforming a value while cloning).
This looks like the job of a setter or a hook instead, not a cloner.

The benefits I see:

  • Allow implementing this validation logic you’re talking about.
  • Allow to skip deep-cloning of already updated properties (that’s a significant drawback of the current proposal - deep cloning before setting is a potential perf/mem hog built into the engine) : guarding deep-cloning with a strict comparison would be enough.
  • Allow implementing WeakMap that are able to clone weak-properties as objects are cloned.

On this last aspect, I think it’s new to the discussion, but it’s something I’ve always found very limiting when using WeakMap: let’s say some metadata is stored about object $foo in a WeakMap; it’s currently not possible to track that metadata across clones without using some nasty tricks. If __clone were given the original object, it’d be easy to duplicate metadata from $foo to $clone.

I have just one significant concern with adding an argument to __clone: it’d be a BC break to mandate this argument at the declaration level, and adding one right now generates an error with current versions of PHP.
However, I think we could (and should if confirmed) provide some FC/BC layer by allowing one to use func_get_args() in __clone. The engine could then verify at compile time that __clone has either one non-nullable “object” argument, or zero.

This seems reasonable.

I’d be happy to know what Volker and Tim think about this? I read they excluded any change to __clone in the RFC, but I think it should still be possible to discuss this, especially if it provides the path to the desired solution.

Nicolas

On Mon, 19 May 2025 at 17:13, Nicolas Grekas
<nicolas.grekas+php@gmail.com> wrote:

On Mon, 19 May 2025 at 16:30, Andreas Hennings <andreas@dqxtech.net> wrote:

My proposal earlier was to pass the original object _and_ the values
that were passed to the clone call, by reference.

And this would happen before those values are assigned to the object.

class MyClass {
  public function __construct(
    public readonly int $x,
    public readonly int $y,
    public readonly int $z,
  ) {}
  public function __clone(object $original, array &$values): void {
    // Set a value directly, and modify it.
    if (isset($values['x'])) {
      $this->x = $values['x'] * 10;
      // Prevent that the same property is assigned again.
      unset($values['x']);
    }
  }
}

$obj = new MyClass(5, 7, 9);
$clone = clone($obj, x: 2, y: 3);
assert($clone->x === 20); // x was updated in __clone().
assert($clone->y === 3); // y was auto-updated after __clone().
assert($clone->z === 9); // z was not touched at all.

I'm not sure I understand, there might be missing bits to your idea, eg where is visibility enforced? why is pass-by-ref needed at all?
Pass-by-ref makes me think this is a bad idea already :)

Maybe we are looking at different problems to be solved.
To me, the main questions are:
- Could a __clone() method want to behave differently depending which
property values are passed with the clone call?
- Can there be conflicts between operations we would normally do in
__clone() and values passed to __clone()?
- Should we prevent or allow double-write to a readonly property? That
is, if one write happens in __clone(), and the other write happens
automatically due to the property value passed to clone(..).

And to clarify my proposal:
- Everything is the same as in the RFC (except points below)
- Same as in the RFC, the __clone() method is called _after_ the
original object values have been copied over, but _before_ any of the
property values passed as arguments to clone($obj, ...$values) are
assigned.
- Same as in the RFC, the values passed to clone($obj, ...$values) are
assigned automatically after the __clone() method.
- Unlike the RFC, the __clone() method can see (and validate) the
values that were passed to clone($obj, ...$values)
- Unlike the RFC, the __clone() method can _alter_ the values passed
to clone($obj, ...$values) before they are assigned.
- As in the RFC, readonly properties can be written only once on
clone. The __clone() method can prevent a double write by unsetting
that key in $values.

Consequence:
By leaving the __clone() method empty, all values are assigned
automatically, as in the RFC.
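
The ordering described above can be emulated in userland on current PHP. This is only a sketch under assumptions: the `CloneHook` interface, its `onCloneWith()` method and the `clone_with_hook()` helper are hypothetical names standing in for the proposed `__clone(object $original, array &$values)`, and the properties are not readonly because a userland helper cannot reopen readonly properties.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for the proposed __clone(object $original, array &$values).
interface CloneHook
{
    public function onCloneWith(object $original, array &$values): void;
}

// Hypothetical helper, not the RFC's clone().
function clone_with_hook(object $object, array $values): object
{
    $clone = clone $object;                     // 1. copy the original values
    if ($clone instanceof CloneHook) {
        $clone->onCloneWith($object, $values);  // 2. hook may inspect, alter or unset values
    }
    foreach ($values as $name => $value) {      // 3. whatever is left is assigned automatically
        $clone->$name = $value;
    }
    return $clone;
}

final class Point implements CloneHook
{
    public function __construct(public int $x, public int $y, public int $z) {}

    public function onCloneWith(object $original, array &$values): void
    {
        if (array_key_exists('x', $values)) {
            $this->x = $values['x'] * 10;  // set the value directly, modified
            unset($values['x']);           // prevent the automatic second write
        }
    }
}

$obj = new Point(5, 7, 9);
$clone = clone_with_hook($obj, ['x' => 2, 'y' => 3]);
```

A class that does not implement the hook gets all values assigned automatically, which matches the "empty __clone()" consequence.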

where is visibility enforced?

Exactly as in the RFC.
When clone($obj, ...$values) is called, and before __clone() is
invoked, php has to verify which of the properties are legal to be
updated in this way, based on property visibility and readonly status,
and depending on the scope from which it is called.

Actually this raises some questions that I did not think of before:
- After __clone() is invoked, does php need to validate again?
- What happens to private properties that are not accessible from the
scope of the __clone() method? Are they also passed in the $values
array?

Also, WDYT of my simpler proposal itself? Wouldn't it cover all use cases?

First, I want to make sure I understand correctly:
- Unlike the RFC, we want to call __clone() _after_ the values from
clone($obj, ...$values) are assigned.
  (I assume this because otherwise the two objects would be identical,
and the original object would be useless)
- Unlike the RFC, we pass the original object as a parameter to __clone().

Tbh I am not sure if the use cases I think of are relevant or not :)
I mostly think of it in terms of functional completeness, without
trying to speculate why a developer would want to do this or that
during __clone().

Let's look at the "benefits" section from your earlier mail.

The benefits I see:
- Allow implementing this validation logic you're talking about.

Having __clone() called after the values are assigned, as you propose,
makes it possible to run an integrity check on the object itself. On
the other hand, this may leave us with a short moment of possibly
"bad" property values.
Having __clone() called before the values are assigned means we have
to validate the values array, not the object.

- Allow to skip deep-cloning of already updated properties (that's a significant drawback of the current proposal - deep cloning before setting is a potential perf/mem hog built into the engine) : guarding deep-cloning with a strict comparison would be enough.

With __clone() called after the values are assigned, and with access
to the original object, we can check $this->prop === $old->prop to see
whether a specific property was updated (and therefore should not be
deep cloned).
With __clone() called before the values are assigned, and with access
to the values array, we would check isset($values['prop']) instead.
(or really array_key_exists())
In general this would produce the same result, unless a property was
assigned the same value that it had before.
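
A small sketch of where the two checks diverge, runnable on current PHP. The `Node` class and the manual assignment loop are assumptions made for illustration; they only imitate what clone-with would do.

```php
<?php
declare(strict_types=1);

final class Node
{
    public function __construct(public object $payload, public ?string $label = 'n/a') {}
}

$old = new Node(new stdClass());

// The caller passes the same payload instance again and resets the label to null.
$values = ['payload' => $old->payload, 'label' => null];

$new = clone $old; // shallow copy, the payload object is shared
foreach ($values as $name => $value) {
    $new->$name = $value;
}

// With access to the original object: compare old and new.
$payloadChanged = $new->payload !== $old->payload;       // false, same instance

// With access to the values array: check which keys were passed.
$payloadPassed = array_key_exists('payload', $values);   // true
$labelPassedIsset = isset($values['label']);             // false, the value is null
$labelPassedKey = array_key_exists('label', $values);    // true
```

The comparison reports "unchanged" for the payload although it was passed explicitly, and isset() misses the null label, which is why array_key_exists() is the safer check on the array side.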

Another question is about "dependent properties", e.g. for lazily
filled calculated values.
With access to the old object we would have to check $this->prop !==
$old->prop to then see which dependent properties need to be reset or
recalculated.
With access to the values we would check isset($values['prop']) instead.
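
For the dependent-property case, a minimal sketch using the firstName / fullName example mentioned earlier in the thread. The `Person` class and its `with()` method are hypothetical; `with()` stands in for a clone call followed by a `__clone()` that knows which properties were passed.

```php
<?php
declare(strict_types=1);

final class Person
{
    private ?string $fullName = null; // lazily filled cache

    public function __construct(public string $firstName, public string $lastName) {}

    public function fullName(): string
    {
        return $this->fullName ??= $this->firstName . ' ' . $this->lastName;
    }

    // Hypothetical stand-in for clone-with plus a __clone() that sees the changes.
    public function with(array $values): static
    {
        $clone = clone $this;
        foreach ($values as $name => $value) {
            $clone->$name = $value;
        }
        // Reset the cache only if one of its inputs was passed.
        if (array_key_exists('firstName', $values) || array_key_exists('lastName', $values)) {
            $clone->fullName = null;
        }
        return $clone;
    }
}

$a = new Person('Ada', 'Lovelace');
$a->fullName();                              // fills the cache on $a
$b = $a->with(['firstName' => 'Augusta']);   // cache is reset on the clone
```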

- Allow implementing WeakMap that are able to clone weak-properties as objects are cloned.

I don't feel informed or qualified to talk about this one..

So, with __clone() called after and with the original object, we
compare old and new when looking for a specific property.
With access to an array of updated (or to be updated) properties, we
could iterate over the changes, or we could check whether the
changelist is empty, which is less obvious to do by comparing old and
new instance.

Note that I don't see the need for operations like in your example (transforming a value while cloning).
This looks like the job of a setter or a hook instead, not a cloner.

Tbh, most of the classes I wrote that would benefit from a "clone
with" did not really need a __clone() method, because everything in
there was already immutable.
To me, the scenarios where we want both are quite speculative, so this
is why my examples might not be the most realistic.

>
> The benefits I see:
> - Allow implementing this validation logic you're talking about.
> - Allow to skip deep-cloning of already updated properties (that's a significant drawback of the current proposal - deep cloning before setting is a potential perf/mem hog built into the engine) : guarding deep-cloning with a strict comparison would be enough.
> - Allow implementing WeakMap that are able to clone weak-properties as objects are cloned.
>
> On this last aspect, I think it's new to the discussion but it's something I've always found very limiting when using weak-map: let's say some metadata are stored about object $foo in a weakmap, it's currently not possible to track those metadata across clones without using some nasty tricks. If __clone were given the original object, it'd be easy to duplicate meta-data from $foo to $clone.
>
> I have just one significant concern with adding an argument to __clone: it'd be a BC break to mandate this argument at the declaration level, and adding one right now generates an error with current versions of PHP.
> However, I think we could (and should if confirmed) provide some FC/BC layer by allowing one to use func_get_args() in __clone. The engine could then verify at compile time that __clone has either one non-nullable "object" argument, or zero.

This seems reasonable.

I'd be happy to know what Volker and Tim think about this? I read they excluded any change to __clone in the RFC, but I think it should still be possible to discuss this, especially if it provides the path to the desired solution.

Nicolas

On Mon, 19 May 2025 at 19:06, Andreas Hennings <andreas@dqxtech.net> wrote:

On Mon, 19 May 2025 at 17:13, Nicolas Grekas
<nicolas.grekas+php@gmail.com> wrote:

On Mon, 19 May 2025 at 16:30, Andreas Hennings <andreas@dqxtech.net> wrote:

On Fri, 16 May 2025 at 21:59, Nicolas Grekas
<nicolas.grekas+php@gmail.com> wrote:

On Thu, 15 May 2025 at 16:06, Larry Garfield <larry@garfieldtech.com> wrote:

On Thu, May 15, 2025, at 1:22 AM, Stephen Reay wrote:

I may be missing something here..

So far the issues are “how do we deal with a parameter for the actual
object, vs new properties to apply”, “should __clone be called before
or after the changes” and “this won’t allow regular readonly properties
to be modified”.

Isn’t the previous suggestion of passing the new property arguments
directly to the __clone method the obvious solution to all three
problems?

There’s no potential for a conflicting property name, the developer can
use the new property values in the order they see fit relative to the
logic in the __clone call, and it’s inherently in scope to write to any
(unlocked during __clone) readonly properties.

I did some exploratory design a few years ago on this front, looking at the implications of different possible syntaxes.

https://peakd.com/hive-168588/@crell/object-properties-part-2-examples

What that article calls “initonly” is essentially what became readonly. The second example is roughly what this RFC would look like if the extra arguments were passed to __clone(). As noted in the article, the result is absolutely awful.

Auto-setting the values while using the clone($object, …$args) syntax is the cleanest solution. Given that experimentation, I would not support an implementation that passes args to __clone and makes the developer figure it out. That just makes a mess.

Rob makes a good point elsewhere in thread that running __clone() afterward is a way to allow the object to re-inforce validation if necessary. My concern is whether the method knows it needs to do the extra validation or not, since it may be arbitrarily complex. It would also leave no way to reject the changes other than throwing an exception, though in fairness the same is true of set hooks. Which also begs the question of whether a set hook would be sufficient that __clone() doesn’t need to do extra validation? At least in the typical case?

One possibility (just brainstorming) would be to update first, then call __clone(), but give clone a new optional arg that just tells it what properties were modified by the clone call. It can then recheck just those properties or ignore it entirely, as it prefers. If that handles only complex cases (eg, firstName was updated so the computed fullName needs to be updated) and set hooks handle the single-property ones, that would probably cover all bases reasonably well.

I like where this is going but here is a variant that’d be even more capable:

we could pass the original object to __clone.

My proposal earlier was to pass the original object and the values
that were passed to the clone call, by reference.

And this would happen before those values are assigned to the object.

class MyClass {
  public function __construct(
    public readonly int $x,
    public readonly int $y,
    public readonly int $z,
  ) {}
  public function __clone(object $original, array &$values): void {
    // Set a value directly, and modify it.
    if (isset($values['x'])) {
      $this->x = $values['x'] * 10;
      // Prevent that the same property is assigned again.
      unset($values['x']);
    }
  }
}

$obj = new MyClass(5, 7, 9);
$clone = clone($obj, x: 2, y: 3);
assert($clone->x === 20); // x was updated in __clone().
assert($clone->y === 3); // y was auto-updated after __clone().
assert($clone->z === 9); // z was not touched at all.

I’m not sure I understand, there might be missing bits to your idea, eg where is visibility enforced? why is pass-by-ref needed at all?
Pass-by-ref makes me think this is a bad idea already :)

Maybe we are looking at different problems to be solved.
To me, the main questions are:

  • Could a __clone() method want to behave differently depending which
    property values are passed with the clone call?

Definitely yes, at least to skip triggering a costly deep cloning operation.

  • Can there be conflicts between operations we would normally do in
    __clone() and values passed to __clone()?

I don’t see any. BTW, I had a look at what is done within __clone methods in Symfony, and all implementations fall into 4 categories:

  • incrementing some counter for tracking management purposes
  • resetting the state of the clone (at least for some transient properties)
  • deep-cloning
  • forbidding clone at all (throwing or making the method private)

  • Should we prevent or allow double-write to a readonly property? That
    is, if one write happens in __clone(), and the other write happens
    automatically due to the property value passed to clone(..).

This question doesn’t really make sense IMHO. What matters is readonly semantics, which must be preserved. The fact that there are two or more steps to achieve the target state doesn’t matter.

And to clarify my proposal:

  • Everything is the same as in the RFC (except points below)
  • Same as in the RFC, the __clone() method is called after the
    original object values have been copied over, but before any of the
    property values passed as arguments to clone($obj, …$values) are
    assigned.
  • Same as in the RFC, the values passed to clone($obj, …$values) are
    assigned automatically after the __clone() method.
  • Unlike the RFC, the __clone() method can see (and validate) the
    values that were passed to clone($obj, …$values)
  • Unlike the RFC, the __clone() method can alter the values passed
    to clone($obj, …$values) before they are assigned.
  • As in the RFC, readonly properties can be written only once on
    clone. The __clone() method can prevent a double write by unsetting
    that key in $values.

Thanks for the clarification.
About this last item: the RFC has been updated on this topic.
Also preventing double-writes by unsetting a by-ref array looks terrible, no chance this can be the best API, sorry :)

Consequence:
By leaving the __clone() method empty, all values are assigned
automatically, as in the RFC.

where is visibility enforced?

Exactly as in the RFC.
When clone($obj, …$values) is called, and before __clone() is
invoked, php has to verify which of the properties are legal to be
updated in this way, based on property visibility and readonly status,
and depending on the scope from which it is called.

Actually this raises some questions that I did not think of before:

  • After __clone() is invoked, does php need to validate again?
  • What happens to private properties that are not accessible from the
    scope of the __clone() method? Are they also passed in the $values
    array?

That’s definitely an issue with any approach that relies on passing property names.
This doesn’t happen of course if we pass only the original object and rely on === to know what changed.

Also, WDYT of my simpler proposal itself? Wouldn’t it cover all use cases?

First, I want to make sure I understand correctly:

  • Unlike the RFC, we want to call __clone() after the values from
    clone($obj, …$values) are assigned.
    (I assume this because otherwise the two objects would be identical,
    and the original object would be useless)
  • Unlike the RFC, we pass the original object as a parameter to __clone().

You’ve got it right, yes.

Tbh I am not sure if the use cases I think of are relevant or not :)
I mostly think of it in terms of functional completeness, without
trying to speculate why a developer would want to do this or that
during __clone().

Let’s look at the “benefits” section from your earlier mail.

The benefits I see:

  • Allow implementing this validation logic you’re talking about.

Having __clone() called after the values are assigned, as you propose,
makes it possible to run an integrity check on the object itself. On
the other hand, this may leave us with a short moment of possibly
“bad” property values.

Having __clone() called before the values are assigned means we have
to validate the values array, not the object.

This concern matters only if that temporary state is observable from the outside. Which isn’t going to be the case.

  • Allow to skip deep-cloning of already updated properties (that’s a significant drawback of the current proposal - deep cloning before setting is a potential perf/mem hog built into the engine) : guarding deep-cloning with a strict comparison would be enough.

With __clone() called after the values are assigned, and with access
to the original object, we can check $this->prop === $old->prop to see
whether a specific property was updated (and therefore should not be
deep cloned).
With __clone() called before the values are assigned, and with access
to the values array, we would check isset($values['prop']) instead.
(or really array_key_exists())
In general this would produce the same result, unless a property was
assigned the same value that it had before.

Another question is about “dependent properties”, e.g. for lazily
filled calculated values.
With access to the old object we would have to check $this->prop !==
$old->prop to then see which dependent properties need to be reset or
recalculated.
With access to the values we would check isset($values['prop']) instead.

  • Allow implementing WeakMap that are able to clone weak-properties as objects are cloned.

I don’t feel informed or qualified to talk about this one..

So, with __clone() called after and with the original object, we
compare old and new when looking for a specific property.
With access to an array of updated (or to be updated) properties, we
could iterate over the changes, or we could check whether the
changelist is empty, which is less obvious to do by comparing old and
new instance.

I think the above concern with private properties and the strange by-ref API both rule out the approach.
The one I propose actually works and is simpler.

Note that I don’t see the need for operations like in your example (transforming a value while cloning).
This looks like the job of a setter or a hook instead, not a cloner.

Tbh, most of the classes I wrote that would benefit from a “clone
with” did not really need a __clone() method, because everything in
there was already immutable.
To me, the scenarios where we want both are quite speculative, so this
is why my examples might not be the most realistic.

Think deep-cloning. This does require __clone + has to support clone-with.

The benefits I see:

  • Allow implementing this validation logic you’re talking about.
  • Allow to skip deep-cloning of already updated properties (that’s a significant drawback of the current proposal - deep cloning before setting is a potential perf/mem hog built into the engine) : guarding deep-cloning with a strict comparison would be enough.
  • Allow implementing WeakMap that are able to clone weak-properties as objects are cloned.

On this last aspect, I think it’s new to the discussion but it’s something I’ve always found very limiting when using weak-map: let’s say some metadata are stored about object $foo in a weakmap, it’s currently not possible to track those metadata across clones without using some nasty tricks. If __clone were given the original object, it’d be easy to duplicate meta-data from $foo to $clone.
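
To make the WeakMap limitation concrete, here is a sketch on current PHP. The `$cloneWithMeta` closure is a hypothetical helper that does by hand what a `__clone()` receiving the original object could do on its own; the metadata shape is made up for the example.

```php
<?php
declare(strict_types=1);

// Metadata kept outside the objects themselves.
$meta = new WeakMap();

// Hypothetical helper: clone and carry the metadata over explicitly.
$cloneWithMeta = static function (object $original) use ($meta): object {
    $clone = clone $original;
    if (isset($meta[$original])) {
        $meta[$clone] = $meta[$original];
    }
    return $clone;
};

$foo = new stdClass();
$meta[$foo] = ['createdBy' => 'importer'];

$plain = clone $foo;              // a plain clone loses the association
$tracked = $cloneWithMeta($foo);  // the helper keeps it
```

Every call site has to go through the helper today, which is the kind of trick the paragraph above refers to.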

I have just one significant concern with adding an argument to __clone: it’d be a BC break to mandate this argument at the declaration level, and adding one right now generates an error with current versions of PHP.
However, I think we could (and should if confirmed) provide some FC/BC layer by allowing one to use func_get_args() in __clone. The engine could then verify at compile time that __clone has either one non-nullable “object” argument, or zero.

This seems reasonable.

I’d be happy to know what Volker and Tim think about this? I read they excluded any change to __clone in the RFC, but I think it should still be possible to discuss this, especially if it provides the path to the desired solution.

Nicolas

Hi

On 2025-05-19 15:30, Larry Garfield wrote:

For positional parameters, I don't see any way that they'd work or do what someone expects. So why not just block them entirely instead of relying on dynamic properties to warn-but-sorta-work?

For better or worse PHP supports numeric properties in objects and it does not seem correct to make this an artificial limitation when the behavior of numeric array keys follows the existing semantics (e.g. when casting an array to an object).
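
The existing semantics referred to here can be seen with a plain array-to-object cast. A short sketch, with made-up values:

```php
<?php
declare(strict_types=1);

// Integer array keys become properties named "0" and "1" when cast to an object.
$obj = (object) ['first', 'second', 'name' => 'third'];

$zero = $obj->{'0'};      // numeric property access needs the brace syntax
$roundTrip = (array) $obj; // casting back restores the integer keys
```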

Best regards
Tim Düsterhus

Hi

On 2025-05-19 12:48, Volker Dusch wrote:

We're still looking for feedback on the ...variadic approach to the Syntax:
https://wiki.php.net/rfc/clone_with_v2#open_issues, as we only got one
reply so far on the topic.

I was hoping for some additional opinions here before adding my own, but since this does not appear to happen, adding my personal opinion on this matter now:

*Some* property name being completely incompatible with “clone with” (no matter how the first parameter is going to be called) is a limitation that should not exist, it feels like a feature that is broken by design and I think I really would hate it if the documentation of this RFC would need a “Caution: It is not possible to reassign a property called '$object', due to a parameter name conflict”.

Adjusting the signature to `clone(object $object, array $withProperties)` would not have this problem and I don't consider the additional verbosity of an array literal to be a problem. Static analysis tools already understand array shapes and would need adjustments either way to understand the semantics.

From an implementation PoV a regular array parameter would also be simpler, since the implementation would be able to simply pass along the input array, whereas the “variadic” syntax needs special handling to combine positional parameters with named parameters into a single array that is then used in the cloning process.

Syntax-wise there might also be a middle-ground. Similarly to how this RFC turns `clone()` into a function, the `array()` syntax also looks like a function call and would naturally extend to named parameters. While mixing positional and named parameters would probably get complex, allowing purely named parameters would trivially be possible (without any function call overhead). It would also allow using the first-class-callable syntax with `array(...)`, something I would have liked to have in the past. A proof of concept PR is at:

https://github.com/php/php-src/pull/18613

Combining named-parameter `array()` syntax with clone taking an array as the second parameter would allow for the following, which might combine the best of both worlds?

     clone($obj, array(foo: 1, bar: "baz", object: "this is not blocked"));
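
The shape of the resulting array can be approximated today with a variadic function, since unknown named arguments are collected under their names. This is only a userland sketch: `named_array()` is a hypothetical name, and unlike the proof of concept it does incur a function call.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for a named-parameter array().
function named_array(mixed ...$entries): array
{
    return $entries; // named arguments arrive as string keys
}

$withProperties = named_array(foo: 1, bar: 'baz', object: 'this is not blocked');
```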

Best regards
Tim Düsterhus

Hi Tim,

On Wed, 21 May 2025 at 16:15, Tim Düsterhus <tim@bastelstu.be> wrote:

Hi

On 2025-05-19 12:48, Volker Dusch wrote:

We’re still looking for feedback on the …variadic approach to the
Syntax:
https://wiki.php.net/rfc/clone_with_v2#open_issues, as we only got one
reply so far on the topic.

I was hoping for some additional opinions here before adding my own, but
since this does not appear to happen, adding my personal opinion on this
matter now:

Some property name being completely incompatible with “clone with” (no
matter how the first parameter is going to be called) is a limitation
that should not exist, it feels like a feature that is broken by design
and I think I really would hate it if the documentation of this RFC
would need a “Caution: It is not possible to reassign a property called
‘$object’, due to a parameter name conflict”.

Adjusting the signature to clone(object $object, array $withProperties) would not have this problem and I don’t consider the
additional verbosity of an array literal to be a problem. Static
analysis tools already understand array shapes and would need
adjustments either way to understand the semantics.

From an implementation PoV a regular array parameter would also be
simpler, since the implementation would be able to simply pass along the
input array, whereas the “variadic” syntax needs special handling to
combine positional parameters with named parameters into a single array
that is then used in the cloning process.

Syntax-wise there might also be a middle-ground. Similarly to how this
RFC turns clone() into a function, the array() syntax also looks
like a function call and would naturally extend to named parameters.
While mixing positional and named parameters would probably get complex,
allowing purely named parameters would trivially be possible (without
any function call overhead). It would also allow using the
first-class-callable syntax with array(...), something I would have liked
to have in the past. A proof of concept PR is at:

https://github.com/php/php-src/pull/18613

Combining named-parameter array() syntax with clone taking an array as
the second parameter would allow for the following, which might combine
the best of both worlds?

clone($obj, array(foo: 1, bar: "baz", object: "this is not blocked"));

Thanks for sharing your insights. This looks a bit far reaching for the RFC.

On my side, my opinion is: don’t make clone a function call. I’ve never missed not being able to call clone as a callback. It’s trivial to write a short function using the operator when in need.

Nicolas

Hi

On 2025-05-21 16:27, Nicolas Grekas wrote:

Thanks for sharing your insights. This looks a bit far reaching for the RFC.

Making `array()` a function / allowing named parameter syntax with `array()` would be a separate RFC.

On my side, my opinion is: don't make clone a function call. I've never
missed not being able to call clone as a callback. It's trivial to write a
short function using the operator when in need.

It's an intentional design goal of this RFC to borrow the function call syntax to avoid inventing something that does not yet exist in PHP and to avoid blocking additional keywords (“with”). Making `clone()` an actual function greatly simplifies the implementation, since all the heavy lifting around parameter parsing is already provided by the engine [1] and it also ensures that the behavior is consistent with the behavior implied by the used syntax. If the second parameter would be a regular array rather than using the named parameter syntax, making clone() a function would not be necessary (but wouldn't make things harder either).

Best regards
Tim Düsterhus

[1] As an example, with the currently proposed named parameter syntax, unpacking arbitrary Traversables with `clone($obj, ...$traversable)` would need to be reimplemented specifically for `clone()`.
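
As an illustration of what the engine already provides for ordinary functions, the sketch below unpacks a generator with string keys into a variadic parameter. `collect()` and `changes()` are hypothetical names; `collect()` merely has the same parameter shape as the proposed `clone()`.

```php
<?php
declare(strict_types=1);

// Hypothetical function with the same parameter shape as the proposed clone().
function collect(object $object, mixed ...$props): array
{
    return $props;
}

// Any Traversable works; a generator is the shortest one to write.
function changes(): Generator
{
    yield 'foo' => 1;
    yield 'bar' => 'baz';
}

// String keys are treated as named arguments and end up in $props.
$received = collect(new stdClass(), ...changes());
```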

On Wed, May 21, 2025 at 09:13 Tim Düsterhus wrote:

On 2025-05-19 12:48, Volker Dusch wrote:

We're still looking for feedback on the ...variadic approach to the
Syntax:
https://wiki.php.net/rfc/clone_with_v2#open_issues, as we only got one
reply so far on the topic.

...

*Some* property name being completely incompatible with “clone with” (no
matter how the first parameter is going to be called) is a limitation
that should not exist, it feels like a feature that is broken by design
and I think I really would hate it if the documentation of this RFC
would need a “Caution: It is not possible to reassign a property called
'$object', due to a parameter name conflict”.

Adjusting the signature to `clone(object $object, array $withProperties)`
would not have this problem and I don't consider the additional verbosity
of an array literal to be a problem. Static analysis tools already
understand array shapes and would need adjustments either way to
understand the semantics.

From an implementation PoV a regular array parameter would also be
simpler, since the implementation would be able to simply pass along the
input array, whereas the “variadic” syntax needs special handling to
combine positional parameters with named parameters into a single array
that is then used in the cloning process.

The more I've thought about it, the more I also don't like the variadic
approach. It seems like a hack to try to get `property: value` syntax
for "free", but at a fundamental level the API doesn't make sense for
its purpose.

Not only does it prevent setting a property with the name of the first
parameter (with no workaround), but it also means that someone could
call the function with positional rather than named arguments, which
as the RFC admits "for clone this is usually not useful".

Even if it's slightly less ergonomic to quote property names in array keys,
I agree with Tim that it would be better to change the function signature to:

    function clone(object $object, array $withProperties): object {}

This simply removes the issue of not being able to set a certain property
name, as well as the confusing positional parameter behavior.
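
A userland sketch of the difference between the two signatures, runnable on current PHP. `clone_variadic()`, `clone_array()` and the `Wrapper` class are hypothetical and only imitate the assignment step; they are not the RFC's implementation.

```php
<?php
declare(strict_types=1);

// Array-based signature: any property name can be used as a key.
function clone_array(object $object, array $withProperties): object
{
    $clone = clone $object;
    foreach ($withProperties as $name => $value) {
        $clone->$name = $value;
    }
    return $clone;
}

// Variadic signature: property names share a namespace with the first parameter.
function clone_variadic(object $object, mixed ...$withProperties): object
{
    return clone_array($object, $withProperties);
}

final class Wrapper
{
    public function __construct(public string $object = 'initial', public int $foo = 0) {}
}

$source = new Wrapper();

$viaArray = clone_array($source, ['object' => 'updated', 'foo' => 1]);

$conflict = null;
try {
    // The named argument collides with the parameter that already holds $source.
    clone_variadic($source, object: 'updated');
} catch (Error $e) {
    $conflict = $e->getMessage();
}
```

With the array signature the property called `object` is updated like any other, while the variadic call fails before the function body runs.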

Syntax-wise there might also be a middle-ground. Similarly to how this
RFC turns `clone()` into a function, the `array()` syntax also looks
like a function call and would naturally extend to named parameters.
While mixing positional and named parameters would probably get complex,
allowing purely named parameters would trivially be possible (without
any function call overhead). It would also allow using the
first-class-callable syntax with `array(...)`, something I would have liked
to have in the past. A proof of concept PR is at:

https://github.com/php/php-src/pull/18613

Combining named-parameter `array()` syntax with clone taking an array as
the second parameter would allow for the following, which might combine
the best of both worlds?

    clone($obj, array(foo: 1, bar: "baz", object: "this is not blocked"));

I really like this idea! If it can work without any function call overhead,
it would enable more ergonomic array creation not only for `clone()` but
also in many other common scenarios.

Regards,
Theodore