[PHP-DEV] [RFC] Lazy Objects

Hi Larry,

On Fri, Jun 14, 2024 at 10:18 PM Larry Garfield <larry@garfieldtech.com> wrote:

> The actual instance is allowed to escape the proxy and to create direct references to itself.

How? Is this a "return $this" type of situation? This could use more fleshing out and examples.

"return $this" will return the proxy object, but it is possible to
create references to the actual instance during initialization, either
directly in the initializer function, or in methods called by the
initializer. The "About Proxies" section discusses this a bit. I've
added an example.

The terms "virtual" and "proxy" seem to be used interchangeably in different places, including in the API. Please just use one, and purge the other. It's confusing as is. :) (I'd favor "proxy", as it seems more accurate to what is happening.)

Agreed

Under Common Behavior, you have an example of calling the constructor directly, using the reflection API, but not of binding the callable, which the text says is also available. Please include an example of that so we can evaluate how clumsy (or not) it would be.

I've clarified that binding can be achieved with Closure::bind(). In
practice I expect there will be two kinds of ghost initializers:
- Those that just call one public method of the object, such as the constructor
- Those that initialize everything with ReflectionProperty::setValue()
as in the Doctrine example in the "About Lazy-Loading strategies"
section

> After calling newLazyGhostInstance(), the behavior of the object is the same as an object created by newLazyGhostInstance().

I think the first is supposed to be a make* call?

Thank you!

> When making an existing object lazy, the makeInstanceLazy*() methods call the destructor unless the SKIP_DESTRUCTOR flag is given.

I don't quite get why this is. Admittedly destructors are rarely used, but why does it need to call the destructor?

The rationale is that unless specified otherwise, we must assume that
the constructor has been called on the object. Therefore we must call
the destructor before resetting the object's state entirely. See also
the Mutex example given by Tim. I've added the rationale and an
example.

In practice we expect that makeInstanceLazy*() methods will not be
used on fully initialized objects, and that the flag will be set most
of the time, but as it is the API is safe by default.
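
A minimal sketch of this rationale, assuming PHP 8.4, where (to my knowledge) the operation shipped as ReflectionClass::resetAsLazyGhost() with a ReflectionClass::SKIP_DESTRUCTOR flag; the makeInstanceLazy*() name used above is the earlier draft. The Lock class and its counter are hypothetical and only stand in for a resource acquired in the constructor:

```php
<?php
// Hypothetical resource holder: the constructor acquires, the destructor releases.
final class Lock
{
    public static int $held = 0;

    public function __construct() { self::$held++; }
    public function __destruct()  { self::$held--; }
}

// Returns the number of locks still held right after the reset,
// or null when the lazy-object API is not available.
function demoReset(int $options): ?int
{
    // Assumption: the lazy-object API only exists on PHP >= 8.4.
    if (!method_exists(ReflectionClass::class, 'resetAsLazyGhost')) {
        return null;
    }

    Lock::$held = 0;
    $lock = new Lock(); // the constructor ran, so one lock is held

    $reflector = new ReflectionClass(Lock::class);
    $reflector->resetAsLazyGhost($lock, function (Lock $object): void {
        $object->__construct();
    }, $options);

    return Lock::$held;
}
```

With the default options the destructor should run before the state is wiped, so nothing is left held; with the skip flag the lock acquired by the constructor would stay held until something releases it explicitly.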

I find it interesting that your examples list DICs as a use case for proxies, when I would have expected that to fit ghosts better. The common pattern, I would think, would be:

class Service {
    public function __construct(private ServiceA $a, private ServiceB $b) {}
}

$c = some_container();

$init = fn() => $this->__construct($c->get(ServiceA::class), $c->get(ServiceB::class));

$service = new ReflectionLazyObjectFactory(Service::class, $init);

(Most likely in generated code that can dynamically sort out the container calls to inline.)

Am I missing something?

No, you are right, but they must fall back to the proxy strategy when
the user provides a factory.

E.g. this will use the ghost strategy because the DIC
instantiates/initializes the service itself:

my_service:
    class: MyClass
    arguments: ['@service_a', '@service_b']
    lazy: true

But this will use the proxy strategy because the DIC doesn't
instantiate/initialize the service itself:

my_service:
    class: MyClass
    arguments: ['@service_a', '@service_b']
    factory: ['@my_service_factory', createService]
    lazy: true

The RFC didn't make it clear enough that the example was about the
factory case specifically.
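
A rough sketch of the two strategies, assuming PHP 8.4 and the newLazyGhost() / newLazyProxy() method names proposed later in this thread. MyClass, MyServiceFactory and the string arguments are hypothetical stand-ins for what a container would generate:

```php
<?php
final class MyClass
{
    public function __construct(public string $a, public string $b) {}
}

// Hypothetical user-provided factory: the container cannot know how it builds the service.
final class MyServiceFactory
{
    public function createService(string $a, string $b): MyClass
    {
        return new MyClass($a, $b);
    }
}

// Returns [ghost, proxy], or null when the lazy-object API is not available.
function buildLazyServices(): ?array
{
    // Assumption: the lazy-object API only exists on PHP >= 8.4.
    if (!method_exists(ReflectionClass::class, 'newLazyGhost')) {
        return null;
    }

    $reflector = new ReflectionClass(MyClass::class);

    // Ghost strategy: the container initializes the lazy object in place.
    $ghost = $reflector->newLazyGhost(function (MyClass $object): void {
        $object->__construct('service_a', 'service_b');
    });

    // Proxy strategy: the factory returns the real instance and the proxy forwards to it.
    $factory = new MyServiceFactory();
    $proxy = $reflector->newLazyProxy(function (MyClass $proxy) use ($factory): MyClass {
        return $factory->createService('service_a', 'service_b');
    });

    return [$ghost, $proxy];
}
```

In the ghost case the object handed to the caller is the service itself; in the proxy case it wraps whatever instance the factory returned.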

ReflectionLazyObjectFactory is a terrible name. Sorry, it is. :) Especially if it's subclassing ReflectionClass. If it were its own thing, maybe, but it's still too verbose. I know you don't want to put more on the "dumping ground" of ReflectionClass, but honestly, that feels more ergonomic to me. That way the following are all siblings:

newInstance(...$args)
newInstanceWithoutConstructor()
newGhostInstance($init)
newProxyInstance($init)

That feels a lot more sensible and ergonomic to me. isInitialized(), initialized(), etc. also feel like they make more sense as methods on ReflectionObject, not as static methods on a random new class.

Thank you for the suggestion. We will check if this fits the
use-cases. Moving some methods on ReflectionObject may have negative
performance implications as it requires creating a dedicated instance
for each object. Some use-cases rely on caching the reflectors for
performance.

Best Regards,
Arnaud

On Sat, Jun 15, 2024, at 8:28 AM, Arnaud Le Blanc wrote:

Hi Larry,

Under Common Behavior, you have an example of calling the constructor directly, using the reflection API, but not of binding the callable, which the text says is also available. Please include an example of that so we can evaluate how clumsy (or not) it would be.

I've clarified that binding can be achieved with Closure::bind(). In
practice I expect there will be two kinds of ghost initializers:
- Those that just call one public method of the object, such as the constructor
- Those that initialize everything with ReflectionProperty::setValue()
as in the Doctrine example in the "About Lazy-Loading strategies"
section

I'm still missing an example with ::bind(). Actually, I tried to write a version of what I think the intent is and couldn't figure out how. :)

$init = function() use ($c) {
  $this->a = $c->get(ServiceA::class);
  $this->b = $c->get(ServiceB::class);
};

$service = new ReflectionLazyObjectFactory(Service::class, $init);

// We need to bind $init to $service now, but we can't because $init is already registered as the initializer for $service, and binding creates a new closure object, not modifying the existing one. So, how does this even work?

In practice we expect that makeInstanceLazy*() methods will not be
used on fully initialized objects, and that the flag will be set most
of the time, but as it is the API is safe by default.

In the case an object does not have a destructor, it won't make a difference either way, correct?

I find it interesting that your examples list DICs as a use case for proxies, when I would have expected that to fit ghosts better. The common pattern, I would think, would be:

class Service {
    public function __construct(private ServiceA $a, private ServiceB $b) {}
}

$c = some_container();

$init = fn() => $this->__construct($c->get(ServiceA::class), $c->get(ServiceB::class));

$service = new ReflectionLazyObjectFactory(Service::class, $init);

(Most likely in generated code that can dynamically sort out the container calls to inline.)

Am I missing something?

No you are right, but they must fallback to the proxy strategy when
the user provides a factory.

E.g. this will use the ghost strategy because the DIC
instantiates/initializes the service itself:

my_service:
    class: MyClass
    arguments: ['@service_a', '@service_b']
    lazy: true

But this will use the proxy strategy because the DIC doesn't
instantiate/initialize the service itself:

my_service:
    class: MyClass
    arguments: ['@service_a', '@service_b']
    factory: ['@my_service_factory', createService]
    lazy: true

The RFC didn't make it clear enough that the example was about the
factory case specifically.

Ah, got it. That makes more sense.

Which makes me ask if the $initializer of a proxy should actually be called $factory? Since that's basically what it's doing, and I'm unclear what it would do with the proxy object itself that's passed in.

ReflectionLazyObjectFactory is a terrible name. Sorry, it is. :) Especially if it's subclassing ReflectionClass. If it were its own thing, maybe, but it's still too verbose. I know you don't want to put more on the "dumping ground" of ReflectionClass, but honestly, that feels more ergonomic to me. That way the following are all siblings:

newInstance(...$args)
newInstanceWithoutConstructor()
newGhostInstance($init)
newProxyInstance($init)

That feels a lot more sensible and ergonomic to me. isInitialized(), initialized(), etc. also feel like they make more sense as methods on ReflectionObject, not as static methods on a random new class.

Thank you for the suggestion. We will check if this fits the
use-cases. Moving some methods on ReflectionObject may have negative
performance implications as it requires creating a dedicated instance
for each object. Some use-cases rely on caching the reflectors for
performance.

Best Regards,
Arnaud

I'm not clear why there's a performance difference, but I haven't looked at the reflection implementation in, well, ever. :)

If it has to be a separate object, please don't make it extend ReflectionClass but still give it useful dynamic methods rather than static methods. Or perhaps even do something like

$ghost = new ReflectionGhostInstance(SomeClass::class, $init);
$proxy = new ReflectionProxyInstance(SomeClass::class, $init);

And be done with it. (I'm just spitballing here. As I said, I like the feature, I just want to ensure the ergonomics are as good as possible.)

--Larry Garfield

On Sat, Jun 15, 2024 at 7:13 PM Larry Garfield <larry@garfieldtech.com> wrote:

> In practice I expect there will be two kinds of ghost initializers:
> - Those that just call one public method of the object, such as the constructor
> - Those that initialize everything with ReflectionProperty::setValue()
> as in the Doctrine example in the "About Lazy-Loading strategies"
> section
I'm still missing an example with ::bind(). Actually, I tried to write a version of what I think the intent is and couldn't figure out how. :)

$init = function() use ($c) {
  $this->a = $c->get(ServiceA::class);
  $this->b = $c->get(ServiceB::class);
};

$service = new ReflectionLazyObjectFactory(Service::class, $init);

// We need to bind $init to $service now, but we can't because $init is already registered as the initializer for $service, and binding creates a new closure object, not modifying the existing one. So, how does this even work?

Oh I see. Yes you will not be able to bind $this in a simple way here,
but you could bind the scope. This modified example will work:

$init = function($object) use ($c) {
  $object->a = $c->get(ServiceA::class);
  $object->b = $c->get(ServiceB::class);
};
$service = new ReflectionLazyObjectFactory(Service::class, $init->bindTo(null, Service::class));

If you really want to bind $this you could achieve it in a more convoluted way:

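$init = function($object) use ($c) {
  // Pass the class scope too: binding $this alone does not grant access to private properties
  (function () use ($c) {
    $this->a = $c->get(ServiceA::class);
    $this->b = $c->get(ServiceB::class);
  })->bindTo($object, Service::class)();
};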
$service = new ReflectionLazyObjectFactory(Service::class, $init);

This is inconvenient, but the need or use-case is not clear to me.
Could you describe some use-cases where you would hand-write
initializers like this? Do you feel that the proposal should provide
an easier way to change $this and/or the scope?
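
The difference between binding the scope and binding $this can be checked without any lazy-object API. In this minimal sketch the initializers are simply called by hand, and the Service class with its two private properties is hypothetical:

```php
<?php
final class Service
{
    private string $a = '';
    private string $b = '';

    public function describe(): string
    {
        return $this->a . '/' . $this->b;
    }
}

// Scope binding: no $this, but private members of Service become accessible.
$scoped = (function (Service $object): void {
    $object->a = 'A';
    $object->b = 'B';
})->bindTo(null, Service::class);

// $this binding: the class scope is passed as well, otherwise private access fails.
$bound = function (Service $object): void {
    (function (): void {
        $this->a = 'A2';
        $this->b = 'B2';
    })->bindTo($object, Service::class)();
};

$first = new Service();
$scoped($first);

$second = new Service();
$bound($second);

echo $first->describe(), "\n";  // A/B
echo $second->describe(), "\n"; // A2/B2
```

Scope binding is done once, up front, because it does not depend on the object; binding $this has to happen inside the initializer, since the object is only known when the initializer is called.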

> In practice we expect that makeInstanceLazy*() methods will not be
> used on fully initialized objects, and that the flag will be set most
> of the time, but as it is the API is safe by default.

In the case an object does not have a destructor, it won't make a difference either way, correct?

Yes

>> I find it interesting that your examples list DICs as a use case for proxies, when I would have expected that to fit ghosts better. The common pattern, I would think, would be:
> The RFC didn't make it clear enough that the example was about the
> factory case specifically.

Ah, got it. That makes more sense.

Which makes me ask if the $initializer of a proxy should actually be called $factory? Since that's basically what it's doing,

Good point, $factory would be a good name for this parameter.

and I'm unclear what it would do with the proxy object itself that's passed in.

Passing the object itself as an argument could be used to make decisions
based on the value of some initialized field, or on the class of the
object, or on its identity. I think Nicolas had a real use-case where
he detects clones based on the identity of the object:

$init = function ($object) use (&$originalObject) {
    if ($object !== $originalObject) {
        // we are initializing a clone
    }
};
$originalObject = $reflector->newProxyInstance($init);

This was on ghosts, but I think it's also a valid use-case example for proxies.
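
The identity check itself can be illustrated without the lazy-object API. This sketch assumes, as described above, that the engine passes the object being initialized to the initializer; here the Entity class is hypothetical and the initializer is invoked by hand as a stand-in for the engine:

```php
<?php
final class Entity
{
    public string $origin = 'unset';
}

$originalObject = null;

// The initializer receives the object being initialized and compares identities.
$init = function (Entity $object) use (&$originalObject): void {
    $object->origin = ($object !== $originalObject) ? 'clone' : 'original';
};

$originalObject = new Entity();
$copy = clone $originalObject;

// Stand-in for the engine triggering initialization on each object.
$init($originalObject);
$init($copy);

echo $originalObject->origin, "\n"; // original
echo $copy->origin, "\n";           // clone
```

The by-reference capture matters: the initializer is created before the object exists, so it has to see the value assigned afterwards.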

>> ReflectionLazyObjectFactory is a terrible name. Sorry, it is. :slight_smile: Especially if it's subclassing ReflectionClass. If it were its own thing, maybe, but it's still too verbose. I know you don't want to put more on the "dumping ground" fo ReflectionClass, but honestly, that feels more ergonomic to me. That way the following are all siblings:
>>
>> newInstance(...$args)
>> newInstanceWithoutConstructor(...$args)
>> newGhostInstance($init)
>> newProxyInstance($init)
>>
>> That feels a lot more sensible and ergonomic to me. isInitialized(), initialized(), etc. also feel like they make more sense as methods on ReflectionObject, not as static methods on a random new class.
>
> Thank you for the suggestion. We will check if this fits the
> use-cases. Moving some methods on ReflectionObject may have negative
> performance implications as it requires creating a dedicated instance
> for each object. Some use-cases rely on caching the reflectors for
> performance.
>
> Best Regards,
> Arnaud

I'm not clear why there's a performance difference, but I haven't looked at the reflection implementation in, well, ever. :)

What I meant is that creating an instance (not necessarily of
ReflectionObject, but of any class) is more expensive than just doing
nothing. The first two loops below would be fine, but the last one
would be slower. This can make an important difference in a hot path.

foreach ($objects as $object) {
    ReflectionLazyObject::isInitialized($object);
}

$reflector = new ReflectionClass(SomeClass::class);
foreach ($objects as $object) {
    $reflector->isInitialized($object);
}

foreach ($objects as $object) {
    $reflector = new ReflectionObject($object);
    $reflector->isInitialized($object);
}
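
A rough way to measure the difference on any PHP version is sketched below. It uses ReflectionClass::isInstance() purely as a stand-in for the proposed isInitialized(), and the SomeClass definition and object count are arbitrary; absolute timings will vary by machine, so none are claimed here:

```php
<?php
final class SomeClass
{
    public int $id = 0;
}

$objects = [];
for ($i = 0; $i < 10000; $i++) {
    $objects[] = new SomeClass();
}

// One cached reflector shared by all objects.
$start = hrtime(true);
$reflector = new ReflectionClass(SomeClass::class);
$cachedHits = 0;
foreach ($objects as $object) {
    if ($reflector->isInstance($object)) {
        $cachedHits++;
    }
}
$cachedNs = hrtime(true) - $start;

// One new reflector allocated per object.
$start = hrtime(true);
$perObjectHits = 0;
foreach ($objects as $object) {
    $perObject = new ReflectionObject($object);
    if ($perObject->isInstance($object)) {
        $perObjectHits++;
    }
}
$perObjectNs = hrtime(true) - $start;

printf("cached: %d ns, per-object: %d ns\n", $cachedNs, $perObjectNs);
```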

If it has to be a separate object, please don't make it extend ReflectionClass but still give it useful dynamic methods rather than static methods. Or perhaps even do something like

$ghost = new ReflectionGhostInstance(SomeClass::class, $init);
$proxy = new ReflectionProxyInstance(SomeClass::class, $init);

And be done with it. (I'm just spitballing here. As I said, I like the feature, I just want to ensure the ergonomics are as good as possible.)

Thank you for your help. We will think about a better API.

On Sun, Jun 16, 2024, at 8:46 AM, Arnaud Le Blanc wrote:

On Sat, Jun 15, 2024 at 7:13 PM Larry Garfield <larry@garfieldtech.com> wrote:

> In practice I expect there will be two kinds of ghost initializers:
> - Those that just call one public method of the object, such as the constructor
> - Those that initialize everything with ReflectionProperty::setValue()
> as in the Doctrine example in the "About Lazy-Loading strategies"
> section
I'm still missing an example with ::bind(). Actually, I tried to write a version of what I think the intent is and couldn't figure out how. :)

$init = function() use ($c) {
  $this->a = $c->get(ServiceA::class);
  $this->b = $c->get(ServiceB::class);
};

$service = new ReflectionLazyObjectFactory(Service::class, $init);

// We need to bind $init to $service now, but we can't because $init is already registered as the initializer for $service, and binding creates a new closure object, not modifying the existing one. So, how does this even work?

Oh I see. Yes you will not be able to bind $this in a simple way here,
but you could bind the scope. This modified example will work:

$init = function($object) use ($c) {
  $object->a = $c->get(ServiceA::class);
  $object->b = $c->get(ServiceB::class);
};
$service = new ReflectionLazyObjectFactory(Service::class, $init->bindTo(null, Service::class));

If you really want to bind $this you could achieve it in a more convoluted way:

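$init = function($object) use ($c) {
  // Pass the class scope too: binding $this alone does not grant access to private properties
  (function () use ($c) {
    $this->a = $c->get(ServiceA::class);
    $this->b = $c->get(ServiceB::class);
  })->bindTo($object, Service::class)();
};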
$service = new ReflectionLazyObjectFactory(Service::class, $init);

This is inconvenient, but the need or use-case is not clear to me.
Could you describe some use-cases where you would hand-write
initializers like this? Do you feel that the proposal should provide
an easier way to change $this and/or the scope?

Primarily I was just reacting to this line:

However, for more complex use-cases where the initializer wishes to access non-public properties, it is required to bind the initializer function to the right scope (with Closure::bind()), or to access properties with ReflectionProperty.

And asking "OK, um, how?" Because what I was coming up with didn't make any sense. :)

It's debatable if the use case of wanting to assign to private properties in the initializer without using Reflection is common enough to warrant more. I'm not sure at this point. If we wanted to, I could see an extra flag that would tell the system to bind the closure to the object before calling it. I don't know how common a need that will be, though, so I won't insist it be included. But I would like to see that line clarified with one of the above examples, because as is, I would expect to be able to bind to the object as I was trying to do and it (obviously) didn't work.

and I'm unclear what it would do with the proxy object itself that's passed in.

Passing the object itself as an argument could be used to make decisions
based on the value of some initialized field, or on the class of the
object, or on its identity. I think Nicolas had a real use-case where
he detects clones based on the identity of the object:

$init = function ($object) use (&$originalObject) {
    if ($object !== $originalObject) {
        // we are initializing a clone
    }
};
$originalObject = $reflector->newProxyInstance($init);

This was on ghosts, but I think it's also a valid use-case example for proxies.

Hm, interesting. Please include this sort of example in the RFC, so we know what the use is (and when it won't matter, which seems like it would be the more common case).

>> ReflectionLazyObjectFactory is a terrible name. Sorry, it is. :slight_smile: Especially if it's subclassing ReflectionClass. If it were its own thing, maybe, but it's still too verbose. I know you don't want to put more on the "dumping ground" fo ReflectionClass, but honestly, that feels more ergonomic to me. That way the following are all siblings:
>>
>> newInstance(...$args)
>> newInstanceWithoutConstructor(...$args)
>> newGhostInstance($init)
>> newProxyInstance($init)
>>
>> That feels a lot more sensible and ergonomic to me. isInitialized(), initialized(), etc. also feel like they make more sense as methods on ReflectionObject, not as static methods on a random new class.
>
> Thank you for the suggestion. We will check if this fits the
> use-cases. Moving some methods on ReflectionObject may have negative
> performance implications as it requires creating a dedicated instance
> for each object. Some use-cases rely on caching the reflectors for
> performance.
>
> Best Regards,
> Arnaud

I'm not clear why there's a performance difference, but I haven't looked at the reflection implementation in, well, ever. :)

What I meant is that creating an instance (not necessarily of
ReflectionObject, but of any class) is more expensive than just doing
nothing. The first two loops below would be fine, but the last one
would be slower. This can make an important difference in a hot path.

foreach ($objects as $object) {
    ReflectionLazyObject::isInitialized($object);
}

$reflector = new ReflectionClass(SomeClass::class);
foreach ($objects as $object) {
    $reflector->isInitialized($object);
}

foreach ($objects as $object) {
    $reflector = new ReflectionObject($object);
    $reflector->isInitialized($object);
}

Ah, now I see what you mean. Interesting. Including that reasoning in the RFC would be good. Though, I don't know how often I'd be calling isInitialized() on a larger set of objects, hot path or no.

--Larry Garfield

Hi Larry,

Following your feedback we propose to amend the API as follows:

class ReflectionClass
{
    public function newLazyProxy(callable $factory, int $options): object {}

    public function newLazyGhost(callable $initializer, int $options): object {}

    public function resetAsLazyProxy(object $object, callable $factory, int $options): void {}

    public function resetAsLazyGhost(object $object, callable $initializer, int $options): void {}

    public function initialize(object $object): object {}

    public function isInitialized(object $object): bool {}

    // existing methods
}

class ReflectionProperty
{
    public function setRawValueWithoutInitialization(object $object, mixed $value): void {}

    public function skipInitialization(object $object): void {}

    // existing methods
}

Comments / rationale:
- Adding methods on ReflectionClass instead of ReflectionObject is
better from a performance point of view, as mentioned earlier
- Keeping the word "Lazy" in method names is clearer, especially for
"newLazyProxy", as the "Proxy" pattern has many use cases that are
not related to laziness. However, we removed the word "Instance" to
make the names shorter.
- We have renamed "make" methods to "reset", following your feedback
about the word "make". It should better convey the behavior of these
methods, and clarify that it's modifying the object in-place as well
as resetting its state
- setRawValueWithoutInitialization() has the same behavior as
setRawValue() (from the hooks RFC), except it doesn't trigger
initialization
- Renamed $initializer to $factory for proxy methods
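
As a usage sketch of the amended API, assuming PHP 8.4 (where newLazyGhost() is available) and that $options may be omitted there; the Report class and its counter are hypothetical. Only newLazyGhost() is used, since the other method names above were still under discussion:

```php
<?php
final class Report
{
    public static int $initCount = 0;
    public array $rows = [];

    public function __construct()
    {
        self::$initCount++;
        $this->rows = ['loaded'];
    }
}

// Returns [count before access, count after access, rows],
// or null when the lazy-object API is not available.
function lazyReportDemo(): ?array
{
    // Assumption: the lazy-object API only exists on PHP >= 8.4.
    if (!method_exists(ReflectionClass::class, 'newLazyGhost')) {
        return null;
    }

    Report::$initCount = 0;
    $reflector = new ReflectionClass(Report::class);

    $report = $reflector->newLazyGhost(function (Report $object): void {
        $object->__construct();
    });

    $before = Report::$initCount; // the constructor has not run yet
    $rows = $report->rows;        // first property access triggers the initializer
    $after = Report::$initCount;

    return [$before, $after, $rows];
}
```

The static counter is read instead of an instance property so that observing it does not itself trigger initialization.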

WDYT?

Best Regards,
Arnaud

On Tue, Jun 18, 2024, at 5:45 PM, Arnaud Le Blanc wrote:

Hi Larry,

Following your feedback we propose to amend the API as follows:

class ReflectionClass
{
    public function newLazyProxy(callable $factory, int $options): object {}

    public function newLazyGhost(callable $initializer, int $options): object {}

    public function resetAsLazyProxy(object $object, callable $factory, int $options): void {}

    public function resetAsLazyGhost(object $object, callable $initializer, int $options): void {}

    public function initialize(object $object): object {}

    public function isInitialized(object $object): bool {}

    // existing methods
}

class ReflectionProperty
{
    public function setRawValueWithoutInitialization(object $object, mixed $value): void {}

    public function skipInitialization(object $object): void {}

    // existing methods
}

Comments / rationale:
- Adding methods on ReflectionClass instead of ReflectionObject is
better from a performance point of view, as mentioned earlier
- Keeping the word "Lazy" in method names is clearer, especially for
"newLazyProxy", as the "Proxy" pattern has many use cases that are
not related to laziness. However, we removed the word "Instance" to
make the names shorter.
- We have renamed "make" methods to "reset", following your feedback
about the word "make". It should better convey the behavior of these
methods, and clarify that it's modifying the object in-place as well
as resetting its state
- setRawValueWithoutInitialization() has the same behavior as
setRawValue() (from the hooks RFC), except it doesn't trigger
initialization
- Renamed $initializer to $factory for proxy methods

WDYT?

Best Regards,
Arnaud

Oh, that looks *so* much more self-explanatory and readable. I love it. Thanks! (Looks like the RFC text hasn't been updated yet.)

--Larry Garfield

On Tue, Jun 18, 2024 at 10:59 PM, Larry Garfield <larry@garfieldtech.com> wrote:

Oh, that looks so much more self-explanatory and readable. I love it. Thanks! (Looks like the RFC text hasn’t been updated yet.)

Happy you like it so much! The text of the RFC is now up to date. Note that we renamed ReflectionProperty::skipInitialization() and setRawValueWithoutInitialization() to skipLazyInitialization() and setRawValueWithoutLazyInitialization() after we realized that ReflectionProperty already has an isInitialized() method for something quite different.

While Arnaud works on moving the code to the updated API, are there more comments on this RFC before we consider opening the vote?

Cheers,
Nicolas

Hey Nicolas,


Marco Pivetta

https://mastodon.social/@ocramius

https://ocramius.github.io/

On Thu, Jun 20, 2024 at 10:52 AM Nicolas Grekas <nicolas.grekas+php@gmail.com> wrote:

Le mar. 18 juin 2024 à 22:59, Larry Garfield <larry@garfieldtech.com> a écrit :

On Tue, Jun 18, 2024, at 5:45 PM, Arnaud Le Blanc wrote:

Hi Larry,

Following your feedback we propose to amend the API as follows:

class ReflectionClass
{
public function newLazyProxy(callable $factory, int $options): object {}

public function newLazyGhost(callable $initializer, int $options): object {}

public function resetAsLazyProxy(object $object, callable $factory, int $options): void {}

public function resetAsLazyGhost(object $object, callable $initializer, int $options): void {}

public function initialize(object $object): object {}

public function isInitialized(object $object): bool {}

// existing methods
}

class ReflectionProperty
{
public function setRawValueWithoutInitialization(object $object, mixed $value): void {}

public function skipInitialization(object $object): void {}

// existing methods
}

Comments / rationale:

  • Adding methods on ReflectionClass instead of ReflectionObject is
    better from a performance point of view, as mentioned earlier
  • Keeping the word “Lazy” in method names is clearer, especially for
    “newLazyProxy”, as the “Proxy” pattern has many use cases that are
    not related to laziness. However, we removed the word “Instance” to
    make the names shorter.
  • We have renamed “make” methods to “reset”, following your feedback
    about the word “make”. It should better convey the behavior of these
    methods, and clarify that it’s modifying the object in-place as well
    as resetting its state
  • setRawValueWithoutInitialization() has the same behavior as
    setRawValue() (from the hooks RFC), except it doesn’t trigger
    initialization
  • Renamed $initializer to $factory for proxy methods

WDYT?

Best Regards,
Arnaud

Oh, that looks so much more self-explanatory and readable. I love it. Thanks! (Looks like the RFC text hasn’t been updated yet.)

Happy you like it so much! The text of the RFC is now up to date. Note that we renamed ReflectionProperty::skipInitialization() and setRawValueWithoutInitialization() to skipLazyInitialization() and setRawValueWithoutLazyInitialization() after we realized that ReflectionProperty already has an isInitialized() method for something quite different.

While Arnaud works on moving the code to the updated API, are there more comments on this RFC before we consider opening the vote?

Thank you for updating the API, the RFC is now much easier to grasp.

My few comments on the updated RFC:

1.) The ReflectionClass API is already very large, so new methods should be named carefully to make sure that users identify them as belonging to a particular sub-feature (lazy objects). I would prefer we rename some of the new methods to:

isInitialized => isLazyObject (with inverted logic)

initialize => one of initializeLazyObject / initializeWhenLazy / lazyInitialize - other methods in this RFC are already very outspoken, so I don’t mind being very specific here as well.

The reason is that “initialized” is such a generic word; it is best not to have API users make assumptions about what it relates to (readonly, lazy, …).

2.) I am 100% behind the implementation of lazy ghosts; it’s really great work with all the behaviors. Speaking with my Doctrine ORM core developer hat on, this has my full support.

3.) The lazy proxies have me worried that we are opening up a can of worms by having the two objects and the magic of using only the properties of one and the methods of the other.

Knowing Symfony DIC, the use case of a factory method for the proxy is a compelling argument for having it. But it is a leaky abstraction that solves the identity issue on one side only: the factory code might not know it’s used for a proxy, and might make all sorts of decisions based on identity that lead to problems.

Correct me if I am wrong or missing something, but if the factory does not know about proxying, then it would also be fine to build a lazy ghost and copy over all state after using the factory. This creates a similar amount of problems with identity, but is less magic while doing so. All of the “inheritance hierarchy and properties must exist” logic can also be implemented in the userland initializer, passing the responsibility for the mess over to userland :wink:

class Container
{
    public function getClientService(): Client
    {
        $reflector = new ReflectionClass(Client::class);

        // Closures created inside a method are bound to $this automatically
        $client = $reflector->newLazyGhost(function (Client $ghost) {
            $clientFactory = $this->get('client_factory');
            $client = $clientFactory->createClient();

            // not sure this is 100% right, the idea is to copy all state over
            // (note: keys of private/protected properties are mangled with NUL bytes,
            // so a plain $ghost->$k assignment only works for public properties)
            $vars = get_mangled_object_vars($client);
            foreach ($vars as $k => $v) { $ghost->$k = $v; }
        });

        return $client;
    }
}

This would also allow “initialize” to return void, simplifying this part of the API.

4.) I am wondering: do we need the resetAs* methods? You can already implement lazy proxies in userland by writing the code manually; we don’t need engine support for that. Not having these two methods would reduce the surface of the RFC / API considerably. And given that the “real world” example is not really real world, and only the Doctrine (createLazyGhost) and Symfony (createLazyGhost or createLazyProxy) ones are, this suggests they may not be needed.
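
As a rough illustration of the “copy all state over” idea: get_mangled_object_vars() returns private and protected properties under NUL-prefixed keys, so a plain `$ghost->$k = $v` only works for public properties. The sketch below copies state through a closure bound to the class scope instead. The `Client` class and the `copyState()` helper are hypothetical, PHP 8.0+ is assumed, and properties declared private in parent classes are not handled.

```php
<?php
declare(strict_types=1);

// Hypothetical service class with non-public state.
final class Client
{
    private string $endpoint = '';
    protected int $timeout = 0;
    public bool $debug = false;

    public static function create(): self
    {
        $c = new self();
        $c->endpoint = 'https://example.test';
        $c->timeout = 30;
        $c->debug = true;
        return $c;
    }

    public function describe(): string
    {
        return sprintf('%s (%ds, debug=%s)', $this->endpoint, $this->timeout, $this->debug ? 'on' : 'off');
    }
}

// Hypothetical helper: copies every property visible from $source's class scope.
function copyState(object $source, object $target): void
{
    $copier = function (object $from): void {
        // get_object_vars() respects the calling scope, so private and
        // protected properties of the bound class are included here.
        foreach (get_object_vars($from) as $name => $value) {
            $this->$name = $value;
        }
    };
    Closure::bind($copier, $target, $source::class)($source);
}

$real = Client::create();
$copy = (new ReflectionClass(Client::class))->newInstanceWithoutConstructor();
copyState($real, $copy);

// For comparison: the mangled keys that a direct $ghost->$k assignment would receive.
$mangledKeys = array_keys(get_mangled_object_vars($real));
```

Even with a working copy, `$real` and `$copy` remain two distinct objects, which is the identity concern raised in this point.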

5.) The RFC does not spell it out, but I assume this does not have any effect on stacktraces, i.e. since properties are proxied, there are no „magic“ frames appearing in the stacktraces?

Cheers,
Nicolas

Hi Ben,

1.) The ReflectionClass API is already very large, so new methods should be named carefully to make sure that users identify them as belonging to a particular sub-feature (lazy objects). I would prefer we rename some of the new methods to:

isInitialized => isLazyObject (with inverted logic)

initialize => one of initializeLazyObject / initializeWhenLazy / lazyInitialize - other methods in this RFC are already very outspoken, so I don’t mind being very specific here as well.

The reason is that “initialized” is such a generic word; it is best not to have API users make assumptions about what it relates to (readonly, lazy, …).

I get this aspect, I’m fine with either option, dunno if anyone has a strong preference?
Under this argument, mine is isLazyObject + initializeLazyObject.

2.) I am 100% behind the implementation of lazy ghosts; it’s really great work with all the behaviors. Speaking with my Doctrine ORM core developer hat on, this has my full support.

\o/

3.) The lazy proxies have me worried that we are opening up a can of worms by having the two objects and the magic of using only the properties of one and the methods of the other.

Knowing Symfony DIC, the use case of a factory method for the proxy is a compelling argument for having it. But it is a leaky abstraction that solves the identity issue on one side only: the factory code might not know it’s used for a proxy, and might make all sorts of decisions based on identity that lead to problems.

Correct me if I am wrong or missing something, but if the factory does not know about proxying, then it would also be fine to build a lazy ghost and copy over all state after using the factory.

Unfortunately no, copying doesn’t work in the generic case: when the object’s dependencies involve a circular reference with the object itself, the copying strategy can lead to a sort of “brain split” situation where we have two objects (the proxy and the real object) which still coexist but can have diverging states.

This is what virtual state proxies solve, by making sure that while we have two objects, we’re sure by design that they have synchronized state.

Yes, $this can leak with proxies, but this is reduced to the strict minimum in the state-proxy design. Compared to the “brain split” I mentioned, this is a minor concern.

State-synchronization is costly currently since it relies on magic methods on every single property access.

From this angle, state-proxies are the ones that benefit the most from being in the engine.
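
The divergence described in this reply can be reproduced without any engine support. The sketch below is only an illustration under stated assumptions: `Mailer` and `Logger` are hypothetical classes, and the “copy” step is a plain property loop rather than anything from the RFC. The factory wires a circular reference, so after copying, the dependency still points at the original object while callers hold the stand-in.

```php
<?php
declare(strict_types=1);

// Hypothetical dependency that keeps a back-reference to its owner.
final class Logger
{
    public function __construct(public Mailer $owner) {}
}

final class Mailer
{
    public int $sent = 0;
    public ?Logger $logger = null;

    // Hypothetical factory: wires a circular reference mailer <-> logger.
    public static function create(): self
    {
        $mailer = new self();
        $mailer->logger = new Logger($mailer);
        return $mailer;
    }
}

// "Copy" strategy: build the real object, then copy its state into a stand-in.
$real = Mailer::create();
$standIn = new Mailer();
foreach (get_object_vars($real) as $name => $value) {
    $standIn->$name = $value;
}

// Callers only hold $standIn, but the logger still points at $real.
$standIn->sent++;

$ownerSeenByLogger = $standIn->logger->owner;
```

Here `$standIn->sent` and `$ownerSeenByLogger->sent` no longer agree, which is the two-objects-with-diverging-state situation that forwarding property accesses to a single instance is meant to avoid.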

4.) I am wondering: do we need the resetAs* methods? You can already implement lazy proxies in userland by writing the code manually; we don’t need engine support for that. Not having these two methods would reduce the surface of the RFC / API considerably. And given that the “real world” example is not really real world, and only the Doctrine (createLazyGhost) and Symfony (createLazyGhost or createLazyProxy) ones are, this suggests they may not be needed.

Yes, this use case of making an object lazy after it’s been created is quite useful. It makes it straightforward to turn a class lazy using inheritance for example (LazyClass extends NonLazyClass), without having to write nor maintain any decorating logic. From a technical pov, this is just a different flavor of the same code infrastructure, so this is pretty aligned with the rest of the proposed API.

5.) The RFC does not spell it out, but I assume this does not have any effect on stacktraces, i.e. since properties are proxied, there are no „magic“ frames appearing in the stacktraces?

Nothing special on this domain indeed, there are no added frames (unlike inheritance proxies since they’d decorate methods).

As a general note, an important design criterion for the RFC has been to make it a superset of what we can achieve in userland already. Ghost objects, state proxies, capabilities of resetAsLazy* methods, etc are all possible today. Making the RFC a subset of those existing capabilities would defeat the purpose of this proposal, since it would mean we’d have to keep maintaining the existing code to support the use cases it enables, with all the associated drawbacks for the PHP community at large.

Nicolas

Hi

On 6/20/24 10:49, Nicolas Grekas wrote:

While Arnaud works on moving the code to the updated API, are there more
comments on this RFC before we consider opening the vote?

I plan to give the RFC another read-through, but will likely not get around to it before the next week.

Best regards
Tim Düsterhus

On Fri, Jun 21, 2024 at 12:24 PM Nicolas Grekas <nicolas.grekas+php@gmail.com> wrote:

3.) The lazy proxies have me worried that we are opening up a can of worms by having the two objects and the magic of using only the properties of one and the methods of the other.

Knowing Symfony DIC, the use case of a factory method for the proxy is a compelling argument for having it. But it is a leaky abstraction that solves the identity issue on one side only: the factory code might not know it’s used for a proxy, and might make all sorts of decisions based on identity that lead to problems.

Correct me if I am wrong or missing something, but if the factory does not know about proxying, then it would also be fine to build a lazy ghost and copy over all state after using the factory.

Unfortunately no, copying doesn’t work in the generic case: when the object’s dependencies involve a circular reference with the object itself, the copying strategy can lead to a sort of “brain split” situation where we have two objects (the proxy and the real object) which still coexist but can have diverging states.

This is what virtual state proxies solve, by making sure that while we have two objects, we’re sure by design that they have synchronized state.

Yes, $this can leak with proxies, but this is reduced to the strict minimum in the state-proxy design. Compared to the “brain split” I mentioned, this is a minor concern.

State-synchronization is costly currently since it relies on magic methods on every single property access.

From this angle, state-proxies are the ones that benefit the most from being in the engine.

Makes sense to me.

4.) I am wondering: do we need the resetAs* methods? You can already implement lazy proxies in userland by writing the code manually; we don’t need engine support for that. Not having these two methods would reduce the surface of the RFC / API considerably. And given that the “real world” example is not really real world, and only the Doctrine (createLazyGhost) and Symfony (createLazyGhost or createLazyProxy) ones are, this suggests they may not be needed.

Yes, this use case of making an object lazy after it’s been created is quite useful. It makes it straightforward to turn a class lazy using inheritance for example (LazyClass extends NonLazyClass), without having to write nor maintain any decorating logic. From a technical pov, this is just a different flavor of the same code infrastructure, so this is pretty aligned with the rest of the proposed API.

Will you use this in Symfony DIC for something? While I can understand the argument that it’s easy to integrate with the lazy object code that you have, I don’t see how the argument “without having to write nor maintain any decorating logic” is true.

The example in the RFC is very much written code. Yes, it’s only one line of new ReflectionClass()->initialize($this)->send($data), but something like https://gist.github.com/beberlei/568719a1c5536cc5f59a60381c37aa05 is not more code and works fine.

The (Lazy-)Connection example can be re-written as:

class LazyConnection extends Connection
{
    // static: the private constructor means callers cannot obtain an instance first
    public static function create(): Connection
    {
        return (new ReflectionClass(Connection::class))->newLazyGhost(function (Connection $connection) {
            $connection->__construct(); // Or any heavier initialization logic
            $connection->ttl = 2.0;
        });
    }

    private function __construct()
    {
        parent::__construct();
    }
}

To me this reads easier; especially when Connection has more than one public method (send), it requires way less code.

Given the complexities of newLazy* already, I am just trying to find arguments to keep the public surface of this API as small as possible, as its intricacies are hard to grasp, and simplicity / fewer ways to use it will be a benefit.

So far I don’t see that resetAsLazy* lets you implement anything new that cannot also be done with the newLazy* methods.

As a general note, an important design criterion for the RFC has been to make it a superset of what we can achieve in userland already. Ghost objects, state proxies, capabilities of resetAsLazy* methods, etc are all possible today. Making the RFC a subset of those existing capabilities would defeat the purpose of this proposal, since it would mean we’d have to keep maintaining the existing code to support the use cases it enables, with all the associated drawbacks for the PHP community at large.

I very much appreciate the benefits this brings as a primary language concept.

Nicolas

Hey Nicolas, Arnaud,

Please find all the details here:
https://wiki.php.net/rfc/lazy-objects

First of all, let me say that this is a fantastic RFC: having maintained both mine and Doctrine’s version of lazy proxies for many years, this is indeed a push in the right direction, making laziness an engine detail, rather than a complex userland topic with hacks.

Moving this forward will allow us (in a far future) to get rid of some BC boundaries around the weird unset() semantics of properties, which were indeed problematic with typed properties and readonly properties.

Stellar work: well done!

TL;DR: of my feedback:

  • RFC is good / useful / needed: will vote for it

  • ghost proxies are well designed / fantastic feature

  • lazy proxies should probably explore replacing object identity further

  • don’t like expanding ReflectionClass further: LazyGhost and LazyProxy classes (or such) instead

  • initialize() shouldn’t have a boolean behavioral switch parameter: make 2 methods

  • flags should be a list<SomeEnumAroundProxies> instead. A bitmask for a new API feels unsafe and anachronistic, given the tiny performance hit.

  • don’t touch readonly because of lazy objects: this feature is too niche to cripple a major-major feature like readonly. I would suggest deferring until after the first bits of this RFC landed.

That said, I took some notes that I’d like you to both consider / answer.

Raw feedback

I did skim through the thread, but did not read it all, so please excuse me if some feedback is potentially duplicate.
Notes are succinct/to the point: I abuse bullet points heavily, sorry :slight_smile:

From an abstraction point of view, lazy objects from this RFC are indistinguishable from non-lazy ones

  • do the following aspects always apply? I understand they don’t for lazy proxies, just for ghosts?

  • spl_object_id($object) === spl_object_id($proxy)?

  • get_class($object) === get_class($proxy)?

Execution of methods or property hooks does not trigger initialization until one of them accesses a backed property.

  • excellent! The entire design revolving around object state makes it so much easier to reason about the entire behavior too!

Proxies: The initializer returns a new instance, and interactions with the proxy object are forwarded to this instance

  • I am not sure why this is needed, given ghost objects.
  • I understand that you want this for when instantiation is delegated to a third party (in Symfony’s DIC, a factory), but I feel like the ghost design is so much more fool-proof than the proxy approach.
  • Perhaps worth taking another stab at sharing object identity, before implementing these?
  • another note on naming: I used “value holder” inside ProxyManager, because using just “proxy” as a name led to a lot of confusion. This also applies to me trying to distinguish proxy types inside the RFC and this discussion.

Internal objects are not supported because their state is usually not managed via regular properties.

  • sad, but understandable limitation
  • what happens if a class extends an internal engine class, like the gruesome ArrayObject?

The API uses various flags given as options

public const int SKIP_INITIALIZATION_ON_SERIALIZE = 1;
public const int SKIP_DESTRUCTOR = 2;
public const int SKIP_INITIALIZED_READONLY = 4;
public const int RESET_INITIALIZED_READONLY = 8;

  • IMO, these should be enum types, given in as a list<TOption> to the call site
  • bitmasks are really only relevant in serialization / storage contexts, IMO, for compressing space as much as possible
  • the bitmasks in the reflection API are already very confusing and hard to use, and I say that as someone that wrapped the entire reflection API various times, out of despair
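
To make the enum-list suggestion concrete, here is a minimal sketch of how a list of enum cases could map onto the integer flags. The `LazyOption` enum and its helper methods are hypothetical and not part of the RFC; only two of the quoted flags are mirrored, and PHP 8.1+ is assumed.

```php
<?php
declare(strict_types=1);

// Hypothetical enum mirroring two of the option names quoted above; not part of the RFC.
enum LazyOption: int
{
    case SkipInitializationOnSerialize = 1;
    case SkipDestructor = 2;

    /** Folds a list of options into the equivalent bitmask. */
    public static function toMask(self ...$options): int
    {
        $mask = 0;
        foreach ($options as $option) {
            $mask |= $option->value;
        }
        return $mask;
    }

    /** Expands a bitmask back into the list of options it contains. */
    public static function fromMask(int $mask): array
    {
        return array_values(array_filter(
            self::cases(),
            static fn (self $option): bool => ($mask & $option->value) !== 0,
        ));
    }
}

$mask = LazyOption::toMask(LazyOption::SkipDestructor, LazyOption::SkipInitializationOnSerialize);
$options = LazyOption::fromMask($mask);
```

A wrapper like this keeps call sites type-checked while still producing the `int $options` value that the proposed signatures accept.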

public function newLazyGhost(callable $initializer, int $options = 0): object {}
public function newLazyProxy(callable $factory, int $options = 0): object {}

  • Given the recent improvements around closures and the ... syntax (https://wiki.php.net/rfc/first_class_callable_syntax), is it worth having Closure only as argument type?
  • should we declare generic types in the RFC/docs, even if just in the stubs?
  • they would serve as massive documentation improvement for Psalm and PHPStan
  • it would be helpful to document $initializer and $factory as :void or :object functions
  • can the engine check that, perhaps? I have no idea if Closure can provide such information on return types, inside the engine.

public function initialize(object $object, bool $skipInitializer = false): object {}

  • worth dividing this into

  • initialize()

  • markAsAlreadyInitialized()

  • don’t use a boolean flag for two functions that do different things

  • all methods were added to ReflectionClass

  • IMO worth having this as separate class/object where this is attached

  • if one needs to decorate/stub such API in userland, it is therefore a completely separate decoration

  • ReflectionClass is already gigantic

  • a smaller API surface that only does proxies (perhaps different based on proxy strategy) would be very beneficial

  • suggestion: something like new GhostObject($className) and new Proxy($className)

  • I understand that the interactions with ReflectionClass#getProperty() felt natural, but the use-case is narrow, and ReflectionClass is already really humongous

The resetAsLazy*() methods accept an already created instance.
This allows writing classes that manage their own laziness

  • overall understandable
  • a bit weird to support this, for now
  • useful for resettable interfaces: Symfony DIC could benefit from this
  • what happens if ReflectionClass#reset*() methods are used on a different class instance?
  • considered?

$reflector->getProperty('id')->skipLazyInitialization($post);

  • perfect for partial objects / understanding why this was implemented
  • would it make sense to have an API to set “bulk” values in an object this way, instead of having to do this for each property manually?
  • avoids instantiating reflection properties / marking individual properties manually
  • perhaps future scope?
  • thinking (new GhostObject($class))->initializePropertiesTo(['foo' => 'bar', 'baz' => 'tab'])

Initialization Triggers

  • really happy to see all these edge cases being considered here!
  • how much of this new API has been tried against the test suite of (for example) ocramius/proxy-manager?
  • mostly asking because there’s tons of edge cases noted in there

Cloning, unless __clone() is implemented and accesses a property.

  • how is the initializer of a cloned proxy used?
  • Is the initializer cloned too?
  • what about the object that a lazy proxy forwards state access to? Is it cloned too?

The following special cases do not trigger initialization of a lazy object:

  • Will accessing a property via a debugger (such as XDebug) trigger initialization here?

  • asking because debugging proxy initialization often led to problems, in the past

  • sometimes even IDEs crashing, or segfaults

  • this wording is a bit confusing:

Proxy Objects
The actual instance is set to the return value.

  • considering the following paragraph:

The proxy object is not replaced or substituted for the actual instance.

After initialization, property accesses on the proxy are forwarded to the actual instance.
Observing properties of the proxy has the same result as observing properties of the actual instance.

  • This is some sort of “quantum locking” of both objects?
  • How hard is it to break this linkage?
  • Can properties be unset(), for example?
  • what happens to dynamic properties?
  • I don’t use them myself, and I discourage their usage, but it would be OK to just document the expected behavior

The proxy and actual instance have distinct identities.

  • Given that we went to great lengths to “quantum lock” two objects’ properties, wasn’t it perhaps feasible to replace the proxy instance?
  • I have no idea if that would be possible/wished
  • would require merging spl_object_id() within the scope of the initializer stack frame, with any outer ones
  • would solve any identity problems that still riddle the lazy proxy design (which I think is incomplete, right now)

The scope and $this of the initializer function is not changed

  • good / thoughtful design
  • using __construct() or reflection properties suffices for most users

If the initializer throws, the object properties are reverted to their pre-initialization state and the object is
marked as lazy again.

  • this is some sort of “transactional” behavior
  • welcome API, but is it worth having this complexity?
  • is there a performance tradeoff?
  • is a copy of the original state kept during initializer calls?
  • OK with it myself, just probing design considerations
  • the example uses setRawValueWithoutLazyInitialization(), and initialization then accesses public properties
  • shouldn’t a property that now has a value not trigger initialization anymore?
  • or does that require ReflectionProperty#skipLazyInitialization() calls, for that to work?

ReflectionClass::SKIP_INITIALIZATION_ON_SERIALIZE: By default, serializing a lazy object triggers its initialization
This flag disables that behavior, allowing lazy objects to be serialized as empty objects.

  • how would one deserialize an empty object into a proxy again?
  • would this understanding be deferred to the (de-)serializer of choice?
  • exercise for userland?

ReflectionClass::newLazyProxy()
The factory should return a new object: the actual instance.

  • what happens if the user mis-implements the factory as function (object $proxy): object { return $proxy; }?
  • this is obviously a mistake on their end, but is it somehow preventable?

The resetAsLazyGhost() method resets an existing object and marks it as lazy.
The intended use-case is for an object to manage its own laziness by calling the method in its constructor.

  • this certainly makes it easier to design “out of the box” lazy objects
  • perhaps more useful for tools like ORMs, (de-)serializers and DICs though
  • using the proxy API internally in classes like DB connections feels a bit overkill, to me

ReflectionClass::SKIP_INITIALIZED_READONLY
If this flag is set, these properties are skipped and no exception is thrown.
The behavior around readonly properties is explained in more detail later.
ReflectionClass::RESET_INITIALIZED_READONLY

  • while I can see this as useful, it effectively completely breaks the readonly design

  • this is something I’d probably vote against: not worth breaking readonly for just the reset*() API here

  • reset*() is already a niche API inside the (relatively) niche use-case of laziness: I wouldn’t bypass readonly for it

  • readonly provided developer value is bigger than lazy object value, in my mind

ReflectionClass::resetAsLazyProxy()
The proxy and the actual instance are distinct objects, with distinct identities.

  • When creating a lazy proxy, all property accesses are forwarded to a new instance
  • are all property accesses re-bound to the new instance?
  • are there any leftovers pointing to the old instance anywhere?
  • thinking dynamic properties and similar

If $skipInitializer is true, the behavior is the one described for Ghost Objects in the Initialization Sequence
section, except that the initializer is not called.

  • please make a separate method for this
  • it’s not worth cramming a completely different behavior in the same method
  • can document the methods completely independently, this way
  • can deprecate the new method separately, if a design flaw is found in future

ReflectionProperty::setRawValueWithoutLazyInitialization()
The method does not call hooks, if any, when setting the property value.

  • So far, it has been possible to unset($object->property) to force __get and __set to be called
  • will setRawValueWithoutLazyInitialization also skip this “unset properties” behavior that is possible in userland?
  • this is fine, just a documentation detail to note
  • if it is like that, is it worth renaming the method setValueWithoutCallingHooks or such?
  • not important, just noting this opportunity
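
The existing userland behavior referred to in the first bullet can be shown with a small sketch. The Profile class and its $log property are hypothetical and only there to record which magic methods run; nothing here uses the proposed lazy-object API.

```php
<?php
declare(strict_types=1);

// Hypothetical class, used only to illustrate the unset() trick.
class Profile
{
    public string $name = 'default';

    /** @var list<string> Records which magic methods were called. */
    public array $log = [];

    public function __get(string $prop): mixed
    {
        $this->log[] = "get:$prop";
        return 'from __get';
    }

    public function __set(string $prop, mixed $value): void
    {
        // Only record the call; the property itself stays unset.
        $this->log[] = "set:$prop";
    }
}

$p = new Profile();
$before = $p->name;   // declared and initialized: read directly, __get is not called
unset($p->name);      // the declared property is now unset
$after = $p->name;    // the read falls through to __get
$p->name = 'changed'; // the write falls through to __set
```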

Readonly properties

  • while I appreciate the effort put into digging into readonly properties, this section feels extremely tricky
  • I would suggest not implementing (for now) either of:
    • SKIP_INITIALIZED_READONLY
    • RESET_INITIALIZED_READONLY
  • I would suggest leaving some time for these, and re-evaluating after the RFC lands / starts being used

Destructors
The destructor of proxy objects is never called. We rely on the destructor of the proxied instance instead.

  • raising an edge case here: spl_object_* and object identity checks may be used inside a destructor
  • for example, a DB connection de-registering itself from a connection pool somewhere, such as $pool->deRegister($this)
  • the connection pool may have the spl_object_id() of the proxy, not the real instance
  • this is not a blocker, just an edge case that may require documentation
  • this ties back to the “can we replace the object in-place?” question above: replacing objects may be worth exploring
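
The identity mismatch described in these bullets can be reproduced without the proposed API. The sketch below uses a hand-written wrapper as a stand-in for a lazy proxy; Pool, Connection and ConnectionProxy are hypothetical names, and a native proxy may behave differently in its details.

```php
<?php
declare(strict_types=1);

// Hypothetical pool that tracks connections by object id.
final class Pool
{
    /** @var array<int, true> */
    public array $ids = [];

    public function register(object $conn): void
    {
        $this->ids[spl_object_id($conn)] = true;
    }

    public function deRegister(object $conn): void
    {
        unset($this->ids[spl_object_id($conn)]);
    }
}

class Connection
{
    public function __construct(private Pool $pool) {}

    public function __destruct()
    {
        // $this is the real instance here, not the proxy.
        $this->pool->deRegister($this);
    }
}

// Hand-written stand-in for a lazy proxy: a distinct object wrapping the real one.
final class ConnectionProxy
{
    public ?Connection $real = null;

    public function __construct(private Pool $pool) {}

    public function initialize(): void
    {
        $this->real = new Connection($this->pool);
    }
}

$pool = new Pool();
$proxy = new ConnectionProxy($pool);
$pool->register($proxy);              // the pool stores the proxy's id
$proxy->initialize();

$proxyId = spl_object_id($proxy);
$realId = spl_object_id($proxy->real);

$proxy->real = null;                  // last reference dropped: the destructor runs
$stillRegistered = isset($pool->ids[$proxyId]);
```

The destructor de-registers the id of the real instance, so the entry stored under the proxy's id stays in the pool.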

It employs the ghost strategy by default unless the dependency is to be instantiated
and initialized by a user-provided factory

  • one question arises here: can this RFC create proxies of interfaces at all?

  • if not, does it throw appropriate exceptions?

  • the reason this question comes up is that, especially in the context of DICs, factories are for interfaces

  • concrete classes are implementation details of factories, usually

  • very difficult to use lazy proxy and ghost object design with services that decorate each other

  • this is semi-discussed in “About Proxies” below, around “inheritance-proxy”

  • still worth mentioning “no interfaces” as a clear limitation, perhaps?

Marco Pivetta

https://mastodon.social/@ocramius

https://ocramius.github.io/

On Tuesday, 4 June 2024 at 13:28, Nicolas Grekas <nicolas.grekas+php@gmail.com> wrote:

Dear all,

Arnaud and I are pleased to share with you the RFC we've been shaping for over a year to add native support for lazy objects to PHP.

Please find all the details here:
https://wiki.php.net/rfc/lazy-objects

We look forward to your thoughts and feedback.

Cheers,
Nicolas and Arnaud

Hello,

I don't have any strong opinions about the feature in general, mainly because I don't understand the problem space.

However, I have some remarks.

The fact that an initialize() method has a $skipInitializer parameter doesn't make a lot of sense to me.
At a glance, I don't see how passing true to it differs from not calling the method at all.
This should probably be split into two distinct methods.

Does get_mangled_object_vars() trigger initialization or not?
This should behave like an (array) cast (and should be favoured instead of an array cast as it was introduced for that purpose).

What does a lazy object look like when it has been dumped?

> The initializer must return null or no value

*Technically* all functions in PHP return a value, which by default is null, so this is somewhat redundant.
Also, would this throw a TypeError if a value other than null is returned?

Best regards,
Gina P. Banyard

On Tue, Jun 4, 2024 at 6:31 AM Nicolas Grekas
<nicolas.grekas+php@gmail.com> wrote:

Dear all,

Arnaud and I are pleased to share with you the RFC we've been shaping for over a year to add native support for lazy objects to PHP.

Please find all the details here:
https://wiki.php.net/rfc/lazy-objects

We look forward to your thoughts and feedback.

Cheers,
Nicolas and Arnaud

I will vote no on this one. I do not believe the internal complexity
and maintenance is worth the feature. Additionally, I do not feel like
changing the language to support this feature is a good idea; if this
were a library only thing, I would not care.

Hi Levi,

On Wed, Jun 26, 2024 at 12:07 AM Levi Morrison <levi.morrison@datadoghq.com> wrote:

> I will vote no on this one. I do not believe the internal complexity
> and maintenance is worth the feature. Additionally, I do not feel like
> changing the language to support this feature is a good idea; if this
> were a library only thing, I would not care.

The proposed implementation is adding very little complexity as it’s not adding any special case outside of object handlers (except in json_encode() and serialize() because these functions trade abstractions for speed). Furthermore all operations that may trigger an object initialization are already effectful, due to magic methods or hooks (so we are not making pure operations effectful). This means that we do not have to worry about lazy objects or to be aware of them anywhere in the code base, outside of object handlers.

To give you an idea, it’s implemented by hooking into the code path that handles accesses to undefined properties. This code path may call __get or __set methods if any, or trigger errors, and with this proposal, may trigger the initialization. Userland implementations achieve this functionality in a very similar way (with unset() and a generated sub-class with magic methods), but they have considerably more edge cases to handle due to being at a different abstraction level.
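
As a rough illustration of the userland approach mentioned above (unset() plus a sub-class with magic methods), here is a minimal sketch. LazyUser is a hand-written stand-in for a generated sub-class, User is a hypothetical entity, and the initializer is assumed to set every property. It deliberately ignores the edge cases a real library has to handle.

```php
<?php
declare(strict_types=1);

class User
{
    public string $name = '';
    public string $email = '';
}

// Hypothetical stand-in for a generated sub-class.
final class LazyUser extends User
{
    private ?Closure $initializer;

    public function __construct(Closure $initializer)
    {
        $this->initializer = $initializer;
        // Unset the declared properties so that accesses reach __get/__set.
        unset($this->name, $this->email);
    }

    private function initialize(): void
    {
        $init = $this->initializer;
        if ($init === null) {
            return; // already initialized (or currently initializing)
        }
        $this->initializer = null;
        $init($this);
    }

    public function __get(string $prop): mixed
    {
        $this->initialize();
        return $this->$prop;
    }

    public function __set(string $prop, mixed $value): void
    {
        // Writes made by the initializer also arrive here, then store the value directly.
        $this->initialize();
        $this->$prop = $value;
    }
}

$calls = 0;
$user = new LazyUser(function (User $u) use (&$calls): void {
    $calls++;
    $u->name = 'Ada';
    $u->email = 'ada@example.org';
});

$callsBeforeAccess = $calls; // nothing has run yet
$name = $user->name;         // first access runs the initializer
$email = $user->email;       // property now exists: read directly
```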

Best Regards,
Arnaud

Hi Gina,

On Tue, Jun 25, 2024 at 5:59 PM Gina P. Banyard <internals@gpb.moe> wrote:

> The fact that an initialize() method has a $skipInitializer parameter doesn’t make a lot of sense to me.
> At a glance, I don’t see how passing true to it differs from not calling the method at all.

Calling initialize() with $skipInitializer set to true has the same effect as calling skipLazyInitialization() on all properties that are still lazy on the object. This can be used to manually initialize a lazy object afterwards, as property accesses will not trigger initialization anymore. This also ensures that the initializer function can be decref’ed.

> This should probably be split into two distinct methods.

Agreed

> Does get_mangled_object_vars() trigger initialization or not?

No. get_mangled_object_vars() and array cast are among the few cases that do not trigger initialization. They are listed in https://wiki.php.net/rfc/lazy-objects#initialization_triggers.
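
For context on why the two are grouped together: on an ordinary object, get_mangled_object_vars() returns the same raw, name-mangled view as an (array) cast. The sketch below shows this on a plain, non-lazy object. Point is a hypothetical class, and the example does not exercise lazy objects at all.

```php
<?php
declare(strict_types=1);

// Hypothetical class with one property per visibility.
class Point
{
    public int $x = 1;
    protected int $y = 2;
    private int $z = 3;
}

$point = new Point();

$viaCast = (array) $point;
$viaFunction = get_mangled_object_vars($point);

// Both return the raw property table with mangled keys for non-public properties.
$sameResult = ($viaCast === $viaFunction);
$keys = array_keys($viaFunction);
```

The function is the more predictable of the two because some classes, such as ArrayObject, overload the array cast.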

> This should behave like an (array) cast (and should be favoured instead of an array cast as it was introduced for that purpose).

Exactly

> What does a lazy object look like when it has been dumped?

The output of var_dump() on a lazy object is the same as on an object whose properties have all been unset() (except those initialized with setRawValueWithoutLazyInitialization() or skipLazyInitialization()). For convenience, we also prefix the output with either lazy ghost or lazy proxy.

I’ve added a var_dump section in the RFC.

> The initializer must return null or no value
> Technically all functions in PHP return a value, which by default is null, so this is somewhat redundant.

Agreed. I believe that phrasing it this way makes it clear that any of “return null;”, “return;”, or an implicit return will work.
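
The three forms are indistinguishable to the caller, as this short sketch shows. The closure names are hypothetical, and nothing here calls the lazy-object API itself.

```php
<?php
declare(strict_types=1);

// Three ways for a callable to "return null or no value".
$explicitNull = function (): ?int {
    return null;
};
$bareReturn = function () {
    return;
};
$implicitReturn = function (): void {
    // no return statement at all
};

// The caller observes null in every case.
$results = [$explicitNull(), $bareReturn(), $implicitReturn()];
```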

> Also, would this throw a TypeError if a value other than null is returned?

Agreed. Currently we throw an Error, but I will change that to TypeError.

Best Regards,
Arnaud

On Tue, Jun 4, 2024, at 14:28, Nicolas Grekas wrote:

Dear all,

Arnaud and I are pleased to share with you the RFC we’ve been shaping for over a year to add native support for lazy objects to PHP.

Please find all the details here:

https://wiki.php.net/rfc/lazy-objects

We look forward to your thoughts and feedback.

Cheers,

Nicolas and Arnaud

Can you add to the RFC how to proxy final classes as well? This is mentioned (unless I misunderstood) but in the proxy example it shows the proxy class extending the proxied class (which I think is an error if the base class is final). How would this work? Or would it need to implement a shared interface (this is totally fine IMHO)?

— Rob

On Thu, Jun 27, 2024, at 12:32, Marco Pivetta wrote:

Hey Arnaud,

On Wed, 26 Jun 2024 at 21:06, Arnaud Le Blanc <arnaud.lb@gmail.com> wrote:

The proposed implementation is adding very little complexity as it’s not adding any special case outside of object handlers (except in json_encode() and serialize() because these functions trade abstractions for speed). Furthermore all operations that may trigger an object initialization are already effectful, due to magic methods or hooks (so we are not making pure operations effectful). This means that we do not have to worry about lazy objects or to be aware of them anywhere in the code base, outside of object handlers.

To give you an idea, it’s implemented by hooking into the code path that handles accesses to undefined properties. This code path may call __get or __set methods if any, or trigger errors, and with this proposal, may trigger the initialization. Userland implementations achieve this functionality in a very similar way (with unset() and a generated sub-class with magic methods), but they have considerably more edge cases to handle due to being at a different abstraction level.

Assuming this won’t pass a vote (I hope it does, but I want to be optimistic): is this something that could be implemented in an extension, or is it only feasible in core?

Greets,

Marco Pivetta

I really hope it passes, not just for your libraries but also for mine. I’m looking forward to going on a deletion spree and having a nice standardized proxy API.

— Rob
