[PHP-DEV] [Pre-RFC Discussion] User Defined Operator Overloads (again)

On Tue, Sep 17, 2024 at 2:56 AM Lynn <kjarli@gmail.com> wrote:

On Sat, Sep 14, 2024 at 11:51 PM Jordan LeDoux <jordan.ledoux@gmail.com> wrote:

Hello internals,

This discussion will use my previous RFC as the starting point for conversation: https://wiki.php.net/rfc/user_defined_operator_overloads

There has been discussion on list recently about revisiting the topic of operator overloads after the previous effort which I proposed was declined. There are a variety of reasons, I think, this is being discussed, both on list and off list.

  1. As time has gone on, more people have come forward with use cases. Often they are use cases that have been mentioned before, but it has become more clear that these use cases are more common than was suggested previously.

  2. Several voters, contributors, and participants have had more time (years now) to investigate and research some of the related issues, which naturally leads to changes in opinion or perspective.

  3. PHP has considered and been receptive toward several RFCs since my original proposal which update the style of PHP in ways which are congruent with the KIND of language that has operator overloads.

I mentioned recently that I would not participate in another operator overload RFC unless I felt that the views of internals had become more receptive to the topic, and after some discussion with several people off-list, I feel that it is at least worth discussing for the next version.

Operator overloads have come up as a missing feature in several discussions on list since the previous proposal was declined. This includes:

  • [RFC] [Discussion] Support object type in BCMath [1]
  • Native decimal scalar support and object types in BcMath [2]
  • Custom object equality [3]
  • pipes, scalar objects and on? [4]
  • [RFC][Discussion] Object can be declared falsifiable [5]

The request to support comparison operators (>, >=, ==, !=, <=, <, <=>) has come up more frequently, but particularly in discussion around linear algebra, arbitrary precision mathematics, and dimensional numbers (such as currency or time), the rest of the operators have also come up.

Typically, these use cases are themselves very niche, but the capabilities operator overloads enable would be much more widely used. From discussion on list, it seems likely that very few libraries would need to implement operator overloads, but the libraries that do would be well used and thus MANY devs would be consumers of operator overloads.

I want to discuss what changes to the previous proposal people would be seeking, and why. The most contentious design choice of the previous proposal was undoubtedly the operator keyword and the decision to make operator overload implementations distinct from normal magic methods. For some of the voters who voted yes on the previous RFC, this was a “killer feature” of the proposal, while for some of the voters who voted no it was the primary reason they were against the feature.

There are also several technical and tangentially related items that are being worked on that would be necessary for operator overloads (and were originally included in my implementation of the previous RFC). This includes:

  1. Adding a new opcode for LARGER and LARGER_OR_EQUAL so that operand position can be preserved during ALL comparisons.

  2. Updating ZEND_UNCOMPARABLE such that it has a value other than -1, 0, or 1 which are typically reserved during an ordering comparison.

  3. Allowing values to be equatable without also being orderable (such as with matrices, or complex numbers).

These changes could and should be provided independent of operator overloads. Gina has been working on a separate RFC which would cover all three of these issues. You can view the work-in-progress on that RFC here: https://github.com/Girgias/php-rfcs/blob/master/comparison-equality-semantics.md

I hope to start off this discussion productively and work towards improving the previous proposal into something that voters are willing to pass. To do that, I think these are the things that need to be discussed in this thread:

  1. Should the next version of this RFC use the operator keyword, or should that approach be abandoned for something more familiar? Why do you feel that way?

  2. Should the capability to overload comparison operators be provided in the same RFC, or would it be better to separate that into its own RFC? Why do you feel that way?

  3. Do you feel there were any glaring design weaknesses in the previous RFC that should be addressed before it is re-proposed?

  4. Do you feel that there is ANY design, version, or implementation of operator overloads possible that you would support and be in favor of, regardless of whether it matches the approach taken previously? If so, can you describe any of the core ideas you feel are most important?

Jordan

External Links:

I’m not experienced with other languages and overloading, so consider this reply as me not knowing enough about the subject. Rowan asked an interesting question: “Are we over-riding operators or operations?” which made me think about behaviors as a 3rd alternative. Instead of individual operator overloading, could classes define how they would act as certain primitives or types that have overloading under the hood? We have Stringable with __toString, which might not be the best example but does point in a similar direction. I don’t know if this is a direction worth exploring but wanted to at least bring it up.

interface IntBehavior {
    public function asInt(): int;
}

class PositiveInt implements IntBehavior {
    public readonly int $value;

    public function __construct(int $value) {
        $this->value = max(0, $value);
    }

    public function asInt(): int {
        return $this->value;
    }
}

var_dump(10 + new PositiveInt(5)); // 15
var_dump(new PositiveInt(10) + 15); // 25
var_dump(new PositiveInt(100) + new PositiveInt(100)); // 200

// leaves it to the developer to do:
$number = new PositiveInt(new PositiveInt(10) + 5);

I actually did explore something like this during my initial design phases before ever bringing it up on list the first time. I decided that it was certainly useful, and perhaps even something I would also want, but did not solve the problem I was trying to solve.

The problem I was trying to solve involved lots of things that cannot be represented well by primitive types (which is presumably why they are classes in the first place). Things like Complex Numbers, Matrices, or Money. Money can be converted to a float of course (or an int depending on implementation), but Money does not want to be added with something like request count, which might also be an int. Or if it does, it probably wants to know exactly what the context is. There are lots of these kinds of value classes that might be representable with scalars, but would lose a lot of their context and safety if that is done.

On the other hand, Money would probably not want to be multiplied with other Money values. What would Money squared mean exactly? Things like this are very difficult to control for if all you provide is a way to control casting to scalar types.
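To make the point about context concrete, here is a minimal sketch in current PHP (8.1 or later assumed), using ordinary methods in place of overloads. The Money class, its add() and multiply() methods, and the minor-units representation are hypothetical names chosen for illustration; none of them come from the RFC.

```php
<?php
declare(strict_types=1);

// Hypothetical value object; named methods stand in for overloaded operators.
final class Money
{
    public function __construct(
        public readonly int $minorUnits, // e.g. cents
        public readonly string $currency,
    ) {}

    // Money + Money is meaningful only within the same currency.
    public function add(Money $other): Money
    {
        if ($other->currency !== $this->currency) {
            throw new InvalidArgumentException('Currency mismatch');
        }
        return new Money($this->minorUnits + $other->minorUnits, $this->currency);
    }

    // Money * scalar is meaningful; Money * Money is ruled out by the parameter type.
    public function multiply(int $factor): Money
    {
        return new Money($this->minorUnits * $factor, $this->currency);
    }
}

$price = new Money(1999, 'USD');
$total = $price->add(new Money(501, 'USD'))->multiply(2);
echo $total->minorUnits, PHP_EOL; // 5000

$rejected = false;
try {
    $requestCount = 42;
    $price->add($requestCount); // a bare int carries no currency context
} catch (TypeError $e) {
    $rejected = true;
}
var_dump($rejected); // bool(true)
```

A cast-to-int approach would let both the request count and the Money-squared case through silently, which is the loss of safety described above.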

Jordan

On Sep 17, 2024, at 10:15, Jordan LeDoux <jordan.ledoux@gmail.com> wrote:

On Tue, Sep 17, 2024 at 1:18 AM Rowan Tommins [IMSoP] <imsop.php@rwec.co.uk> wrote:

On 14/09/2024 22:48, Jordan LeDoux wrote:

  1. Should the next version of this RFC use the operator keyword, or
    should that approach be abandoned for something more familiar? Why do
    you feel that way?

  2. Should the capability to overload comparison operators be provided
    in the same RFC, or would it be better to separate that into its own
    RFC? Why do you feel that way?

  3. Do you feel there were any glaring design weaknesses in the
    previous RFC that should be addressed before it is re-proposed?

I think there are two fundamental decisions which inform a lot of the
rest of the design:

  1. Are we over-riding operators or operations? That is, is the user
    saying “this is what happens when you put a + symbol between two Foo
    objects”, or “this is what happens when you add two Foo objects together”?

If we allow developers to define arbitrary code which is executed as a result of an operator, we will always end up allowing the first one.

  2. How do we despatch a binary operator to one of its operands? That is,
    given $a + $b, where $a and $b are objects of different classes, how do
    we choose which implementation to run?

This is something not many other people have been interested in so far, but interestingly there is a lot of prior art on this question in other languages! :)

The best approach, from what I have seen and developer usage in other languages, is somewhat complicated to follow, but I will do my best to make sure it is understandable to anyone who happens to be following this thread on internals.

The approach I plan to use for this question has a name: Polymorphic Handler Resolution. The overload that is executed will be decided by the following series of decisions:

  1. Are both of the operands objects? If not, use the overload on the one that is. (NOTE: if neither are objects, the new code will be bypassed entirely, so I do not need to handle this case)
  2. If they are both objects, are they both instances of the same class? If they are, use the overload of the one on the left.
  3. If they are not objects of the same class, is one of them a direct descendant of the other? If so, use the overload of the descendant.
  4. If neither of them are direct descendants of the other, use the overload of the object on the left. Does it produce a type error because it does not accept objects of the type in the other position? Return the error and abort instead of re-trying by using the overload on the right.

This results from what it means to extend a class. Suppose you have a class Foo and a class Bar that extends Foo. If both Foo and Bar implement an overload, that means Bar inherited an overload. It is either the same as the overload from Foo, in which case it shouldn’t matter which is executed, or it has been updated with even more specific logic which is aware of the extra context that Bar provides, in which case we want to execute the updated implementation.

So the implementation on the left would almost always be executed, unless the implementation on the right comes from a class that is a direct descendant of the class on the left.

Foo + Bar
Bar + Foo
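The four decisions above can be sketched as a small function in current PHP (8.0 or later assumed). This is only an illustration of the described order of checks, not the engine implementation: pickOverloadSide() is a hypothetical name, and "direct descendant" is assumed here to mean any subclass, tested with instanceof.

```php
<?php
declare(strict_types=1);

// Returns 'left' or 'right': which operand's overload would be called.
// Assumes at least one operand is an object; if neither is, the new
// code path is never reached.
function pickOverloadSide(mixed $left, mixed $right): string
{
    // Step 1: only one operand is an object, so use that one.
    if (!is_object($left)) {
        return 'right';
    }
    if (!is_object($right)) {
        return 'left';
    }

    // Step 2: same class, so use the left operand.
    if ($left::class === $right::class) {
        return 'left';
    }

    // Step 3: the right operand is a descendant of the left operand's class.
    if ($right instanceof $left) {
        return 'right';
    }

    // Step 3 with the left operand as the descendant, and step 4
    // (unrelated classes), both resolve to the left operand.
    return 'left';
}

class Foo {}
class Bar extends Foo {}
class Baz {}

echo pickOverloadSide(new Foo(), new Bar()), PHP_EOL; // right: Bar is the descendant
echo pickOverloadSide(new Bar(), new Foo()), PHP_EOL; // left: Bar again
echo pickOverloadSide(new Foo(), new Baz()), PHP_EOL; // left: unrelated classes
echo pickOverloadSide(5, new Foo()), PHP_EOL;         // right: the only object operand
```

In both Foo + Bar and Bar + Foo the overload on Bar is selected, which is the behaviour the two expressions above are meant to show.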

In practice, you would very rarely (if ever) use two classes from entirely different class inheritance hierarchies in the same overload. That would closely tie the two classes together in a way that most developers try to avoid, because the implementation would need to be aware of how to handle the classes it accepts as an argument.

The exception to this that I can imagine is something like a container, that maybe does not care what class the other object is because it doesn’t mutate it, only stores it.

But for virtually every real-world use case, executing the overload for the child class regardless of its position would be preferred, because overloads will tend to be confined to the core types of PHP + the classes that are part of the hierarchy the overload is designed to interact with.

Finally, a very quick note on the OperandPosition enum: I think just a
“bool $isReversed” would be fine - the “natural” expansion of “$a+$b” is
“$a->operator+($b, false)”; the “fallback” is “$b->operator+($a, true)”

Regards,


Rowan Tommins
[IMSoP]

This is similar to what I originally designed, and I actually moved to an enum based on feedback. The argument was something like $isReversed or $left or so on is somewhat ambiguous, while the enum makes it extremely explicit.

However, it’s not a design detail I am committed to. I just want to let you know why it was done that way.
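For readers weighing the two options, here is a rough sketch of why operand position matters for a non-commutative operator, written in current PHP (8.1 or later assumed). The enum case names, the Meters class, and the subtract() method are assumptions made for the example; a bool $isReversed parameter would carry the same information.

```php
<?php
declare(strict_types=1);

// Hypothetical enum; the case names are assumptions for this sketch.
enum OperandPosition
{
    case LeftSide;
    case RightSide;
}

final class Meters
{
    public function __construct(public readonly float $value) {}

    // Stand-in for an overloaded "-": $position says where $this sat in the expression.
    public function subtract(int|float $other, OperandPosition $position): Meters
    {
        return match ($position) {
            OperandPosition::LeftSide  => new Meters($this->value - $other), // $this - $other
            OperandPosition::RightSide => new Meters($other - $this->value), // $other - $this
        };
    }
}

$m = new Meters(3.0);
$asLeft  = $m->subtract(10, OperandPosition::LeftSide);  // models $m - 10
$asRight = $m->subtract(10, OperandPosition::RightSide); // models 10 - $m

echo $asLeft->value, PHP_EOL;  // -7
echo $asRight->value, PHP_EOL; // 7
```

At the call site, OperandPosition::RightSide reads unambiguously, whereas a bare true requires knowing the parameter name.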

Jordan

To be clear: I’m very much in favor of operator overloading. I frequently work with both Money value objects, and DateTime objects that I need to manipulate through arithmetic with others of the same type.

What if I wanted to create a generic add($a, $b) function, how would I type hint the params to ensure that I only get “addable” things? I would expect that to be:

  • Ints
  • Floats
  • Objects of classes with “operator+” defined

I think that an interface is the right solution for that, and you can just union with int/float type hints: add(int | float | Addable ...$operands) (or add(int | float | (Foo & Addable) ...$operands))

Is this type of behavior even allowed? I think the intention is that it must be, otherwise the decision over which overload method gets called is drastically simplified.

Perhaps for a first iteration, operator overloads only work between objects of the same type or their descendants — and if a descendant overrides the overload, the descendant’s version is used regardless of left/right precedence.

I suspect this will simplify the complexity of the magic, and solve the majority of cases where operator overloading is desired.
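A minimal sketch of that generic function in current PHP (8.1 or later assumed) might look like the following. Since "operator+" does not exist today, a named add() method stands in for it; the Addable interface, the Seconds class, and the choice to return 0 for an empty argument list are all assumptions made for the example.

```php
<?php
declare(strict_types=1);

// Hypothetical interface; a named method stands in for "operator+".
interface Addable
{
    public function add(int|float|Addable $other): Addable;
}

final class Seconds implements Addable
{
    public function __construct(public readonly int|float $value) {}

    public function add(int|float|Addable $other): Addable
    {
        $rhs = match (true) {
            $other instanceof self           => $other->value,
            is_int($other), is_float($other) => $other,
            default => throw new TypeError('Unsupported operand'),
        };
        return new self($this->value + $rhs);
    }
}

// Accepts only "addable" things: ints, floats, and Addable objects.
function add(int|float|Addable ...$operands): int|float|Addable
{
    $sum = array_shift($operands) ?? 0;
    foreach ($operands as $operand) {
        if ($sum instanceof Addable) {
            $sum = $sum->add($operand);
        } elseif ($operand instanceof Addable) {
            $sum = $operand->add($sum);
        } else {
            $sum += $operand;
        }
    }
    return $sum;
}

var_dump(add(1, 2.5));                                        // float(3.5)
var_dump(add(new Seconds(30), 15, new Seconds(15))->value);   // int(60)
```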

- Davey

On Tue, Sep 17, 2024 at 10:55 AM Davey Shafik <me@daveyshafik.com> wrote:

The problem with providing interfaces is something nikic addressed very early in my design process and convinced me of: an Addable interface will not actually tell you if two objects can be added together. A Money class and a Vector2D class might both have an implementation for operator +() and implement some kind of Addable interface. But there is no sensible way in which they could actually be added. Knowing that an object implements an overload is not enough in most cases to use operators with them. This is part of the reason that I am skeptical of people who worry about accidentally using random overloads.

The signature for the implementation in the Money class might look something like this:

operator +(Money $other, OperandPosition $position): Money

while the signature for the implementation in the Vector2D class might look something like this:

operator +(Vector2D|array $other, OperandPosition $position): Vector2D

Any attempt to add these two together will result in a TypeError.
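The same effect can be observed in current PHP (8.1 or later assumed) with ordinary methods standing in for the overloads. In this sketch the Addable marker interface, the add() method name, and the class bodies are assumptions; only the parameter types mirror the two signatures above.

```php
<?php
declare(strict_types=1);

// Hypothetical marker interface: it says "has an addition", not "adds to what".
interface Addable {}

final class Money implements Addable
{
    public function __construct(public readonly int $cents) {}

    public function add(Money $other): Money
    {
        return new Money($this->cents + $other->cents);
    }
}

final class Vector2D implements Addable
{
    public function __construct(
        public readonly float $x,
        public readonly float $y,
    ) {}

    public function add(Vector2D|array $other): Vector2D
    {
        [$x, $y] = $other instanceof Vector2D ? [$other->x, $other->y] : $other;
        return new Vector2D($this->x + $x, $this->y + $y);
    }
}

$money  = new Money(500);
$vector = new Vector2D(1.0, 2.0);

// Both objects satisfy the interface...
var_dump($money instanceof Addable && $vector instanceof Addable); // bool(true)

// ...yet the parameter types still reject the combination.
try {
    $money->add($vector);
    $outcome = 'added';
} catch (TypeError $e) {
    $outcome = 'TypeError';
}
echo $outcome, PHP_EOL; // TypeError
```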

Classes which have overloads that look like the following would be something I think developers should be IMMEDIATELY suspicious of:

operator +(object $other, OperandPosition $position)
operator +(mixed $other, OperandPosition $position)

Does your implementation really have a plan for how to + with a stream resource like a file handle, as well as an int? Can you just as easily use + with the DateTime class as you can with a Money class in your implementation?

I think there are very few use cases that would survive code reviews or feedback or testing that look like any of these signatures.

There are situations in which objects might accept objects from a different class hierarchy. For instance, with the changes Saki has made there are now objects for numbers in the BcMath extension. Those are objects that might be quite widely accepted in overload implementations, since they represent numbers in the same way that just an int or float might. But I highly doubt that it’s even possible for the overload to accept those sorts of things without also being aware of them, and if the overload is aware of them it can type-hint them in the signature.

Jordan

On Sep 17, 2024, at 11:11, Jordan LeDoux <jordan.ledoux@gmail.com> wrote:

Good points. While Money objects are frequently added together, I would typically add DateInterval instances to DateTime instances, which breaks the limitation.

- Davey

On 17/09/2024 18:15, Jordan LeDoux wrote:

    1. Are we over-riding *operators* or *operations*? That is, is the
    user
    saying "this is what happens when you put a + symbol between two Foo
    objects", or "this is what happens when you add two Foo objects
    together"?

If we allow developers to define arbitrary code which is executed as a result of an operator, we will always end up allowing the first one.

I don't think that's really true. Take the behaviour of comparisons in your previous RFC: if that RFC had been accepted, the user would have had no way to make $a < $b and $a > $b have different behaviour, because the same overload would be called, with the same parameters, in both cases.
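As a rough illustration of that constraint in current PHP (8.1 or later assumed): when a single comparison method backs every relational operator, "less than" and "greater than" are both derived from one result and cannot diverge. The Version class, compareTo(), lessThan(), and greaterThan() are hypothetical names for this sketch, not the RFC's API.

```php
<?php
declare(strict_types=1);

final class Version
{
    public function __construct(
        public readonly int $major,
        public readonly int $minor,
    ) {}

    // Stand-in for a single comparison overload: negative, zero, or positive.
    public function compareTo(Version $other): int
    {
        return [$this->major, $this->minor] <=> [$other->major, $other->minor];
    }
}

// Both relations are derived from the same call with the same arguments.
function lessThan(Version $a, Version $b): bool
{
    return $a->compareTo($b) < 0;
}

function greaterThan(Version $a, Version $b): bool
{
    return $a->compareTo($b) > 0;
}

$old = new Version(8, 1);
$new = new Version(8, 3);

var_dump(lessThan($old, $new));    // bool(true)
var_dump(greaterThan($old, $new)); // bool(false)
```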

Slightly less strict is requiring groups of operators: the Haskell "Num" typeclass (roughly similar to an interface) requires definitions for all of "+", "*", "abs", "signum", "fromInteger", and either unary or binary "-". It also defines the type signatures for each. If this was the only way to overload the "+" operator, users would have to really go out of their way to use it to mean something unrelated to addition.

As it happens, Haskell *does* allow arbitrary operator overloads, and in fact goes to the other extreme and allows entirely new operators to be invented. The same is true in PostgreSQL - you can implement the <<//-^+^-//>> operator if you want to.

I think it's absolutely possible - and desirable - to choose a philosophical position on that spectrum, and use it to drive design decisions. The choice of "__add" vs "operator+" is one such decision.

The approach I plan to use for this question has a name: Polymorphic Handler Resolution. The overload that is executed will be decided by the following series of decisions:

1. Are both of the operands objects? If not, use the overload on the one that is. (NOTE: if neither are objects, the new code will be bypassed entirely, so I do not need to handle this case)
2. If they are both objects, are they both instances of the same class? If they are, use the overload of the one on the left.
3. If they are not objects of the same class, is one of them a direct descendant of the other? If so, use the overload of the descendant.
4. If neither of them are direct descendants of the other, use the overload of the object on the left. Does it produce a type error because it does not accept objects of the type in the other position? Return the error and abort instead of re-trying by using the overload on the right.

This is option (g) in my list, with the additional "prefer sub-classes" rule (step 3), which I agree would be a good addition.

As noted, it doesn't provide symmetry, because step 4 depends on the order in the source code. Option (c) is the same algorithm without step 4, so guarantees that $a + $b and $b + $a will always call the same method.
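For anyone following along, the four steps quoted above can be sketched in plain PHP. This is only an illustration under stated assumptions: `resolveOverloadSide()` and the three classes are hypothetical names, every object is assumed to implement the operator, and "descendant" is read as any depth of inheritance rather than only a direct child.

```php
<?php
declare(strict_types=1);

/**
 * Returns 'left' or 'right' for the operand whose overload would run,
 * or null when neither operand is an object (existing engine behaviour).
 * Assumption: every object passed in implements the operator.
 */
function resolveOverloadSide(mixed $left, mixed $right): ?string
{
    $leftIsObject = is_object($left);
    $rightIsObject = is_object($right);

    // Step 1: fewer than two objects.
    if (!$leftIsObject && !$rightIsObject) {
        return null;
    }
    if (!$leftIsObject || !$rightIsObject) {
        return $leftIsObject ? 'left' : 'right';
    }

    // Step 2: same class, so the left operand wins.
    if (get_class($left) === get_class($right)) {
        return 'left';
    }

    // Step 3: prefer the descendant. With an object on the right-hand side,
    // instanceof tests against that object's class (any inheritance depth).
    if ($left instanceof $right) {
        return 'left';
    }
    if ($right instanceof $left) {
        return 'right';
    }

    // Step 4: unrelated classes, so the left operand wins with no retry.
    return 'left';
}

// Hypothetical classes, for illustration only.
class BaseNumber {}
class Fraction extends BaseNumber {}
class Matrix {}

$subclassOnRight = resolveOverloadSide(new BaseNumber(), new Fraction()); // 'right'
$matrixFirst     = resolveOverloadSide(new Matrix(), new BaseNumber());   // 'left'
$numberFirst     = resolveOverloadSide(new BaseNumber(), new Matrix());   // 'left'
```

The last two calls show the asymmetry: swapping two unrelated operands changes which class's overload runs.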

Options (d), (e), and (f) each add an extra step: one operand can signal "I don't know" and the other operand gets a chance to answer. They're essentially ways to "partially implement" an operator.

Options (a) and (b) perform the same kind of polymorphic resolution on *both* operands, which is how many languages work for functions and/or methods already.

Reading the C# spec, if there is more than one candidate overload which is equally specific, an error is raised. I guess you could do the same even with one implementation per class, by replacing step 4 in your algorithm:

> 4. If neither of them are direct descendants of the other, and only one implements the operator, use it.
> 5. If neither of them are direct descendants of the other, and both implement the operator, throw an error.

Let's call that option (h) :slight_smile:
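A rough sketch of how option (h) might behave, again in plain PHP. The `OverloadsAdd` marker interface, the exception class, and `resolveOptionH()` are all made up for illustration; nothing here is taken from the RFC or from C#.

```php
<?php
declare(strict_types=1);

// Hypothetical marker: "this class implements the + operator".
interface OverloadsAdd {}

// Hypothetical error for the "equally specific" case.
final class AmbiguousOverloadError extends LogicException {}

/** Returns the operand whose overload would be called under option (h). */
function resolveOptionH(object $left, object $right): object
{
    // Same class or left is the descendant: use the left operand.
    if ($left instanceof $right) {
        return $left;
    }
    // Right is the descendant: use the right operand.
    if ($right instanceof $left) {
        return $right;
    }

    $leftHas = $left instanceof OverloadsAdd;
    $rightHas = $right instanceof OverloadsAdd;

    // Replacement step 5: both unrelated classes implement it, so refuse to guess.
    if ($leftHas && $rightHas) {
        throw new AmbiguousOverloadError('Both operands implement the operator');
    }
    // Replacement step 4: exactly one implements it, so use that one.
    if ($leftHas) {
        return $left;
    }
    if ($rightHas) {
        return $right;
    }
    throw new TypeError('Neither operand implements the operator');
}

// Hypothetical classes, for illustration only.
final class Metres implements OverloadsAdd {}
final class Seconds implements OverloadsAdd {}
final class PlainObject {}

$metres = new Metres();
$chosen = resolveOptionH(new PlainObject(), $metres); // $metres, regardless of order
```

Unlike step 4 of the original algorithm, the result here does not depend on which operand is written first.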

By the way, searching online for the phrase "Polymorphic Handler Resolution" finds no results other than you saying it is the name for this algorithm.

This is similar to what I originally designed, and I actually moved to an enum based on feedback. The argument was something like `$isReversed` or `$left` or so on is somewhat ambiguous, while the enum makes it extremely explicit.

Ah, fair enough. Explicitness vs conciseness is always a trade-off. My thinking was that the "reversed" form would be far more rarely called than the "normal" form; but that depends a lot on which resolution algorithm is used.

Regards,

--
Rowan Tommins
[IMSoP]

On Tue, Sep 17, 2024 at 12:27 PM Rowan Tommins [IMSoP] <imsop.php@rwec.co.uk> wrote:

On 17/09/2024 18:15, Jordan LeDoux wrote:

  1. Are we over-riding operators or operations? That is, is the user
    saying “this is what happens when you put a + symbol between two Foo
    objects”, or “this is what happens when you add two Foo objects together”?

If we allow developers to define arbitrary code which is executed as a result of an operator, we will always end up allowing the first one.

I don’t think that’s really true. Take the behaviour of comparisons in your previous RFC: if that RFC had been accepted, the user would have had no way to make $a < $b and $a > $b have different behaviour, because the same overload would be called, with the same parameters, in both cases.

Slightly less strict is requiring groups of operators: the Haskell “Num” typeclass (roughly similar to an interface) requires definitions for all of “+”, “*”, “abs”, “signum”, “fromInteger”, and either unary or binary “-”. It also defines the type signatures for each. If this was the only way to overload the “+” operator, users would have to really go out of their way to use it to mean something unrelated to addition.

As it happens, Haskell does allow arbitrary operator overloads, and in fact goes to the other extreme and allows entirely new operators to be invented. The same is true in PostgreSQL - you can implement the <<//-^+^-//>> operator if you want to.

I think it’s absolutely possible - and desirable - to choose a philosophical position on that spectrum, and use it to drive design decisions. The choice of “__add” vs “operator+” is one such decision.

Ah, I see. I suppose I never really entertained an idea like this because in my mind it can’t even handle non-trivial math, let alone the sorts of things that people might want to use overloads for. Once you get past arithmetic with real numbers into almost any other kind of math, which operators are meaningful, and what they mean exactly, begins to depend a lot on context. This is why I felt like even if we were limiting the use cases to math projects, things like commutativity should not necessarily be enforced.

The expressions $a + $b and $b + $a are SUPPOSED to give different results for certain types of math objects, for instance. The expressions $a - $b and $b - $a more obviously give different results to most people, because subtraction is not commutative even for real numbers.

My personal opinion is that the RFC should not assume the overloads are used in a particular domain (like real number arithmetic), and thus should not attempt to enforce these kinds of behaviors. But, opinions like this are actually what I was hoping to receive from this thread. This could be the way forward that voters are more interested in, even if it wouldn’t be my own first preference as it will be highly limiting to the applicable domains.

The approach I plan to use for this question has a name: Polymorphic Handler Resolution. The overload that is executed will be decided by the following series of decisions:

  1. Are both of the operands objects? If not, use the overload on the one that is. (NOTE: if neither are objects, the new code will be bypassed entirely, so I do not need to handle this case)
  2. If they are both objects, are they both instances of the same class? If they are, use the overload of the one on the left.
  3. If they are not objects of the same class, is one of them a direct descendant of the other? If so, use the overload of the descendant.
  4. If neither of them are direct descendants of the other, use the overload of the object on the left. Does it produce a type error because it does not accept objects of the type in the other position? Return the error and abort instead of re-trying by using the overload on the right.

This is option (g) in my list, with the additional “prefer sub-classes” rule (step 3), which I agree would be a good addition.

As noted, it doesn’t provide symmetry, because step 4 depends on the order in the source code. Option (c) is the same algorithm without step 4, so guarantees that $a + $b and $b + $a will always call the same method.

Options (d), (e), and (f) each add an extra step: one operand can signal “I don’t know” and the other operand gets a chance to answer. They’re essentially ways to “partially implement” an operator.

Options (a) and (b) perform the same kind of polymorphic resolution on both operands, which is how many languages work for functions and/or methods already.

Reading the C# spec, if there is more than one candidate overload which is equally specific, an error is raised. I guess you could do the same even with one implementation per class, by replacing step 4 in your algorithm:

  4. If neither of them are direct descendants of the other, and only one implements the operator, use it.
  5. If neither of them are direct descendants of the other, and both implement the operator, throw an error.

Let’s call that option (h) :slight_smile:

By the way, searching online for the phrase “Polymorphic Handler Resolution” finds no results other than you saying it is the name for this algorithm.

Hmmm, I will see if I can find where I came across the term in my original research then. I did about 4 months of research for my RFC, but that was several years ago at this point, so I might be mistaken.

So I understand here that you’re looking for commutativity in which overload is actually called, even if it doesn’t create commutativity in the result of the operation. That the executed overload should be the same no matter the order of the operands.

This was something I also was interested in, but I could not find a solution I was happy with. All of the things you have detailed here have tradeoffs that I’m unsure about. This is an open question of design that I feel requires more input and more voices from others who are interested, because I don’t feel like any of these approaches (including the one that I went with) are better, they are just different.

This is similar to what I originally designed, and I actually moved to an enum based on feedback. The argument was something like $isReversed or $left or so on is somewhat ambiguous, while the enum makes it extremely explicit.

Ah, fair enough. Explicitness vs conciseness is always a trade-off. My thinking was that the “reversed” form would be far more rarely called than the “normal” form; but that depends a lot on which resolution algorithm is used.

Regards,

-- 
Rowan Tommins
[IMSoP]

It would also depend on whether it is used with scalars.

For instance, $numObj - 5 and 5 - $numObj. For both of these, you want to call the overload on $numObj, because it’s the only avenue that won’t result in a fatal error (assuming that the overload knows how to work with int values). The case of an object with an overload being used with an operand that is a non-object will most likely result in reversed calls quite frequently. This will be a prominent issue for some use cases (like arbitrary precision math), and an almost non-existent issue for other use cases (like currency or time).
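To make the reversed-call case concrete, here is a minimal sketch using an ordinary method as a stand-in for the overload. The `OperandPosition::RightSide` name comes from this thread; `LeftSide`, the `Num` class, and the `subtract()` method are assumptions for illustration.

```php
<?php
declare(strict_types=1);

// RightSide is mentioned in the thread; LeftSide is assumed as its counterpart.
enum OperandPosition
{
    case LeftSide;
    case RightSide;
}

// Hypothetical numeric value class.
final class Num
{
    public function __construct(public readonly int $value) {}

    // Stand-in for an overloaded binary "-". $position says where $this
    // sits in the original expression.
    public function subtract(int|Num $other, OperandPosition $position): Num
    {
        $otherValue = $other instanceof Num ? $other->value : $other;

        return $position === OperandPosition::LeftSide
            ? new Num($this->value - $otherValue)   // $this - $other
            : new Num($otherValue - $this->value);  // $other - $this
    }
}

$numObj = new Num(8);

// What "$numObj - 5" would dispatch to:
$normal = $numObj->subtract(5, OperandPosition::LeftSide);

// What "5 - $numObj" would dispatch to (the "reversed" call):
$reversed = $numObj->subtract(5, OperandPosition::RightSide);
```

Both expressions end up in the same method on $numObj; only the position flag differs, which is why the overload has to know about it for non-commutative operators.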

Jordan

On Tue, Sep 17, 2024, at 21:25, Rowan Tommins [IMSoP] wrote:

On 17/09/2024 18:15, Jordan LeDoux wrote:

  1. Are we over-riding operators or operations? That is, is the user
    saying “this is what happens when you put a + symbol between two Foo
    objects”, or “this is what happens when you add two Foo objects together”?

If we allow developers to define arbitrary code which is executed as a result of an operator, we will always end up allowing the first one.

I don’t think that’s really true. Take the behaviour of comparisons in your previous RFC: if that RFC had been accepted, the user would have had no way to make $a < $b and $a > $b have different behaviour, because the same overload would be called, with the same parameters, in both cases.

Slightly less strict is requiring groups of operators: the Haskell “Num” typeclass (roughly similar to an interface) requires definitions for all of “+”, “*”, “abs”, “signum”, “fromInteger”, and either unary or binary “-”. It also defines the type signatures for each. If this was the only way to overload the “+” operator, users would have to really go out of their way to use it to mean something unrelated to addition.

As it happens, Haskell does allow arbitrary operator overloads, and in fact goes to the other extreme and allows entirely new operators to be invented. The same is true in PostgreSQL - you can implement the <<//-^+^-//>> operator if you want to.

I think it’s absolutely possible - and desirable - to choose a philosophical position on that spectrum, and use it to drive design decisions. The choice of “__add” vs “operator+” is one such decision.

The approach I plan to use for this question has a name: Polymorphic Handler Resolution. The overload that is executed will be decided by the following series of decisions:

  1. Are both of the operands objects? If not, use the overload on the one that is. (NOTE: if neither are objects, the new code will be bypassed entirely, so I do not need to handle this case)

  2. If they are both objects, are they both instances of the same class? If they are, use the overload of the one on the left.

  3. If they are not objects of the same class, is one of them a direct descendant of the other? If so, use the overload of the descendant.

  4. If neither of them are direct descendants of the other, use the overload of the object on the left. Does it produce a type error because it does not accept objects of the type in the other position? Return the error and abort instead of re-trying by using the overload on the right.

This is option (g) in my list, with the additional “prefer sub-classes” rule (step 3), which I agree would be a good addition.

As noted, it doesn’t provide symmetry, because step 4 depends on the order in the source code. Option (c) is the same algorithm without step 4, so guarantees that $a + $b and $b + $a will always call the same method.

Options (d), (e), and (f) each add an extra step: one operand can signal “I don’t know” and the other operand gets a chance to answer. They’re essentially ways to “partially implement” an operator.

Options (a) and (b) perform the same kind of polymorphic resolution on both operands, which is how many languages work for functions and/or methods already.

Reading the C# spec, if there is more than one candidate overload which is equally specific, an error is raised. I guess you could do the same even with one implementation per class, by replacing step 4 in your algorithm:

  4. If neither of them are direct descendants of the other, and only one implements the operator, use it.
  5. If neither of them are direct descendants of the other, and both implement the operator, throw an error.

Let’s call that option (h) :slight_smile:

By the way, searching online for the phrase “Polymorphic Handler Resolution” finds no results other than you saying it is the name for this algorithm.

This is similar to what I originally designed, and I actually moved to an enum based on feedback. The argument was something like $isReversed or $left or so on is somewhat ambiguous, while the enum makes it extremely explicit.

Ah, fair enough. Explicitness vs conciseness is always a trade-off. My thinking was that the “reversed” form would be far more rarely called than the “normal” form; but that depends a lot on which resolution algorithm is used.

Regards,

-- 
Rowan Tommins
[IMSoP]

To be honest, this juggling of caller orders has me a bit concerned. For example, matrix multiplication isn’t commutative, and neither are non-abelian groups in general (quaternions being another popular system), but I am used to Scala, where the left-hand operand is always the one called.

I understand that this is what the operand position is for, but it strikes me that extreme care is called for when working with these types of objects when another object is involved. For example, quaternions can be multiplied by a matrix and the order is super important (used for 3D rotations), but it appears the actual method called may not be deterministic: because these classes may be unrelated, the one on the left is called, which may or may not result in a correct answer. All that is just to illustrate how complex this ordering algorithm seems to be, depending on how the libraries are implemented and whether they are designed to work together.

I would prefer to see something simple, and easy to reason about. We can abuse some mathematical properties to result in something quite simple:

  1. If both are scalar, use existing logic.

  2. If one is scalar and the other is not, use existing logic.

  3. If one is scalar and the other overrides the operation, rearrange the operation per its commutative rules so the object is on the left. $scalar + $obj == $obj + $scalar; $scalar - $obj == -$obj + $scalar, i.e. -($obj - $scalar). It is generally accepted (IIRC) that when scalars are involved, we don’t need to be concerned with non-abelian groups.

  4. If both are objects, use the one on the left.

I think this is much easier to reason about (you either get a scalar or another object), and it doesn’t require a developer to deeply understand the inheritance of the objects in question or the algorithm for choosing which one will be called.

— Rob

On 2024-09-18 05:43, Jordan LeDoux wrote:

The problem I was trying to solve involved lots of things that cannot be represented well by primitive types (which is presumably why they are classes in the first place). Things like Complex Numbers, Matrices, or Money. Money can be converted to a float of course (or an int depending on implementation), but Money does not want to be added with something like request count, which might also be an int. Or if it does, it probably wants to know exactly what the context is. There are lots of these kinds of value classes that might be representable with scalars, but would lose a lot of their context and safety if that is done.

Even plain addition with plain numbers can be fraught. Cardinal numbers (1,2,3...) can be added to and subtracted from each other. Ordinal numbers (1st,2nd,3rd...), on the other hand, cannot be added together and subtracting one from another results in an interval, not a number (which means that addition involving ordinal numbers means things like ordinal+interval=ordinal).
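PHP's own date classes already behave this way, which makes for a small runnable illustration: a point in time plays the role of the ordinal, and `DateInterval` the role of the interval. The dates below are arbitrary examples.

```php
<?php
declare(strict_types=1);

$utc = new DateTimeZone('UTC');
$first = new DateTimeImmutable('2024-09-01', $utc);
$second = new DateTimeImmutable('2024-09-18', $utc);

// "Subtracting" one point from another yields an interval, not a point.
$gap = $first->diff($second);

// point + interval = point
$later = $second->add($gap);

// There is deliberately no way to add two DateTimeImmutable values together.
echo $gap->days, ' days; ', $later->format('Y-m-d'), PHP_EOL;
```

An overloaded "-" on such a class would therefore need a return type that differs from its operand types, and "+" would need to accept a different type on each side.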

Then there are quantities. Quantities of arbitrary dimension (length, duration, monetary value...) can be multiplied, resulting in a quantity of a new dimension, but only quantities of the same dimension can be added.
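A minimal sketch of that rule, using ordinary methods in place of operators. The `Quantity` class and its dimension-exponent array are hypothetical and not from any existing library.

```php
<?php
declare(strict_types=1);

// Hypothetical quantity type: a value plus an exponent per base dimension,
// e.g. ['m' => 2] for an area or ['m' => 1, 's' => -1] for a speed.
final class Quantity
{
    /** @param array<string, int> $dims */
    public function __construct(
        public readonly float $value,
        public readonly array $dims,
    ) {}

    // Addition is only defined for quantities of the same dimension.
    public function add(Quantity $other): Quantity
    {
        // Loose array comparison ignores key order but checks key/value pairs.
        if ($this->dims != $other->dims) {
            throw new DomainException('Cannot add quantities of different dimensions');
        }
        return new Quantity($this->value + $other->value, $this->dims);
    }

    // Multiplication is always defined and produces a new dimension.
    public function multiply(Quantity $other): Quantity
    {
        $dims = $this->dims;
        foreach ($other->dims as $name => $exponent) {
            $dims[$name] = ($dims[$name] ?? 0) + $exponent;
        }
        // Drop dimensions that cancelled out.
        $dims = array_filter($dims, fn (int $exponent): bool => $exponent !== 0);

        return new Quantity($this->value * $other->value, $dims);
    }
}

$length = new Quantity(3.0, ['m' => 1]);
$duration = new Quantity(2.0, ['s' => 1]);

$area = $length->multiply($length);          // dims: ['m' => 2]
$lengthTime = $length->multiply($duration);  // dims: ['m' => 1, 's' => 1]
```

Calling `$length->add($duration)` throws, so whether an operation is valid depends on runtime state rather than only on the operand classes.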

>
> On the other hand, Money would probably not want to be multiplied with
> other Money values. What would Money squared mean exactly? Things like
> this are very difficult to control for if all you provide is a way to
> control casting to scalar types.
>
While I can't think off the top of my head of a case where money quantities might be multiplied by other money quantities, I can think of situations where one might want to _divide_ them...

On 17/09/2024 21:37, Rob Landers wrote:

I would prefer to see something simple, and easy to reason about. We can abuse some mathematical properties to result in something quite simple:

1. If both are scalar, use existing logic.
2. If one is scalar and the other is not, use existing logic.
3. If one is scalar and the other overrides the operation, rearrange
    the operation per its commutative rules so the object is on the
    left. $scalar + $obj == $obj + $scalar; $scalar - $obj == -$obj +
    $scalar, i.e. -($obj - $scalar). It is generally accepted (IIRC) that
    when scalars are involved, we don’t need to be concerned with
    non-abelian groups.
4. If both are objects, use the one on the left.

Step 3 requires operators to be overloaded in groups: the rearrangement of the binary "-" operator requires definitions of both the unary "-" operator and the binary "+" operator; and definitions that meet the appropriate mathematical rules.
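To illustrate the dependency, here is a small sketch of what the rearrangement would have to do for "$scalar - $obj". The `Amount` class and the `scalarMinusObject()` helper are hypothetical; `negate()` and `add()` stand in for overloads of unary "-" and binary "+".

```php
<?php
declare(strict_types=1);

// Hypothetical value class.
final class Amount
{
    public function __construct(public readonly int $value) {}

    // Stand-in for an overloaded unary "-".
    public function negate(): self
    {
        return new self(-$this->value);
    }

    // Stand-in for an overloaded binary "+".
    public function add(int|self $other): self
    {
        $otherValue = $other instanceof self ? $other->value : $other;
        return new self($this->value + $otherValue);
    }
}

// "$scalar - $obj" rewritten as "-$obj + $scalar": note that no subtraction
// overload is involved at all, only negate() and add().
function scalarMinusObject(int $scalar, Amount $obj): Amount
{
    return $obj->negate()->add($scalar);
}

$result = scalarMinusObject(5, new Amount(8)); // value is 5 - 8
```

If `Amount` defined only a subtraction overload, the rewritten expression would have nothing to call, and if `negate()` and `add()` did not obey the usual arithmetic laws the rewritten form would give a different answer.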

IMO, that's a lot more complicated than calling the "+" overload with an OperandPosition::RightSide flag; or Python's approach of separate "add" and "reflected add" magic methods.

Since you mentioned Scala, I looked it up, and it seems to be on the other end of the spectrum: operators are just methods, with no mathematical meaning or special dispatch behaviour. In fact, "a plus b" is just another way of writing "a.plus(b)", so "a + b" is just a way of writing "a.+(b)".

Maybe it would be "useful enough" to just restrict to left-hand side:

1. If the left operand is an object which implements the specified operator, call that implementation with the right operand as argument
2. Else, proceed as current PHP.

This is where gathering a good catalogue of use cases would come in handy: which of them would be impossible, or annoyingly difficult, with a more restrictive resolution method?

Regards,

--
Rowan Tommins
[IMSoP]

When you see a method call, you know 100% of the time that it calls a method.

Assuming operator overloading, when you see an operator used with variables, you will no longer be able to rely on your knowledge of how the operator works, as it might use the built-in operator OR it might call a method. Most of the time it will be the former, occasionally the latter.

And since a method would only be called occasionally I would likely get tired of searching to see if there is a method being called and just assume there isn’t. Not smart, but having to track down potential methods for every operator use would just be far too tedious.

Also, getting to the implementation of operators would likely be just as easy as looking up methods nowadays, because IDEs would support that kind of lookup.

The more generically named the method, the more IDEs struggle with finding methods, and operators are about as generic as you can get.

Most framework and library code is by now type-hinted - I would have understood this argument if operator overloading had been implemented in PHP 5.x, where many classes and functions received values with no enforced types, so they might have expected an int, yet an object with operator overloading might work but in weird ways, because there were no type hints for int. I cannot see this situation now at all - if a Symfony component wants an int, you cannot pass in an object. If a Symfony component wants a Request object, it will not use operators that it does not expect to be there, even if you would extend the class and add operator support. Using operators implies you know you can use operators, otherwise it will be a fatal error (except for comparisons).

ANY object could have an operator overload on == and/or === and have those behave differently.

I also just checked and Symfony has many parameters type hinted as Stringable and numerous interfaces that extend Stringable. Any object that implements Stringable could also add operator overloads and introduce subtle bugs into those functions.

Symfony has numerous interfaces that extend Countable. Any object that implements Countable could also add operator overloads and introduce subtle bugs into those functions.

Symfony has numerous interfaces that extend ArrayAccess. Any object that implements ArrayAccess could also add operator overloads and introduce subtle bugs into those functions.

Further — if operator overloads are added — there will likely be demand for more “-able” type interfaces, or some other way to make objects behave more like scalar types. If we have opt-in then code is safe. If we do not have opt-in then unintended consequences are possible, if not likely.

From your arguments it also seems you are afraid everybody will use operator overloading excessively and unnecessarily. This seems very unlikely to me - it is not that useful a feature, except for certain situations.

We will have to agree to disagree on that then. Look at what happened when it was added to C++; people went crazy with it, and it was many years before the community moderated its behavior.

Many other languages have had operator overloading for many years or even decades - are there really huge problems in those languages?

Depends on what you consider to be “huge” problems. Many people feel operator overloading is a cure for an ailment where the medicine is worse than the disease.

Just google “operator overloading considered harmful.” But in case you can’t be bothered to google, here are just three of many opinions:

If yes, maybe PHP can learn from some of the problems there (which I think the original RFC tried to carefully consider), but as far as I know the usage of operator overloading is niche in languages which support it, depending on use case - some people like it, some don’t, but they do not seem to be a big problem for these languages or their code in general. Maybe you have some sources on actual problems in other languages?

Besides the links above, my sources are personal experience over 30+ years of programming.

Personally I would love my Money class to finally have operators instead of the current “plus”, “minus”, “multipliedBy” (and so on) methods which are far less readable. I would only use operator overloading on a few specific classes, but for those the readability improvement would be huge. Also, being able to override comparison operators for objects would be very useful, because currently using == and === with objects is almost never helpful or sufficient.

And I support that. Just have developers or yourself opt-in to use operators on your Money class.

Requiring an opt-in is a safer way to go, and NOT requiring an opt-in means we would forever have to live with any downsides that opt-in would rein in.

So how is requiring an opt-in a bad idea and not a reasonable compromise?

While I do not presume to speak for all voters (I don’t even have voting rights myself), my feeling from all of the conversations I have had over almost the last 4 years is that implementing your suggestion would virtually guarantee that the RFC is declined.

Well, given that few have commented on the idea yet, it seems premature to make that assumption.

Why not wait until we get more feedback rather than nix it up front?

You are suggesting providing a new syntax (which voters tend to be skeptical of) to create a situation where more errors occur

(which voters tend to be skeptical of)

So, type hinting is all about creating errors. Yet almost everyone on the list is super excited about adding more and better type hinting.

Given that, I would argue that your claim is, if not false, at least far too simplistic to be valid.

to solve a problem which can be solved with existing syntax by simply type guarding your code to not allow any objects near your operators (which voters tend to be skeptical of)

You mean you assume it can be solved with existing syntax.

for which I cannot find any code examples that explain the problem it is solving (which voters tend to be skeptical of).

Frankly, it is difficult to come up with examples because examples are predicated on junior developers doing dumb things that would never occur to a senior developer because they know better.

But I prepared one such hypothetical example using function equals() as a stand-in for operator == so the code will run:

https://3v4l.org/6GTKb#v8.3.11

I feel certain I could come up with several others, but I unfortunately have already exceeded my time allotment for the day.


On 17.09.24 11:14, Mike Schinkel wrote:

How would a developer know if they are using an object that has operators, unless they study all the source code or at least the docs (assuming there are good docs, which there probably are not?)

How is that different from using a method on an object? Operators are roughly speaking a different syntax for methods, and if you use an object method, you have to know about it or look up what it does.

Operator overloading is indeed a very overloaded topic concerning computer languages.

But my 0.02 cents is that it's a good thing to have provided it solves one or more of the problems such as:

* Will the code be easier to write, maintain, and read?
* Will this help with optimizations?
* Will this provide new paradigms that are useful for architectural solutions?
* or anything else practical, and not based on the idea that "PHP needs it as it's available in other languages"?

--Kent

On Wed, Sep 18, 2024, at 01:12, K Sandvik wrote:

Operator overloading is indeed a very overloaded topic concerning computer languages.

But my 0.02 cents is that it’s a good thing to have provided it solves one or more of the problems such as:

Hey Kent,

  • Will the code be easier to write, maintain, and read?

Just imagine being able to ignore the whole DateInterval class. Maybe one day we could write $datetime + ‘2 minutes’ and have it “just work” instead of writing PT2M or using special libraries.

Another example is to consider how most people do time durations in PHP: 5 * MINUTE_IN_SECONDS, or something like this. They are usually stored as integers. If you had a units library of time, you could write 5 * MINUTE, where MINUTE is a one-minute unit object (which also begs the question of whether operators are “constant expressions” when defined…), so that a unit object is stored instead of an integer, preventing mistakes or having to dig through the code to find out what units “$timeout” is in.
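A rough sketch of what such a unit object could look like today, with methods instead of operators. The `Duration` class, its factory methods, and `describeTimeout()` are hypothetical names, not part of any existing library.

```php
<?php
declare(strict_types=1);

// Hypothetical duration value object; internally stored in seconds.
final class Duration
{
    private function __construct(public readonly int $seconds) {}

    public static function ofSeconds(int $count): self
    {
        return new self($count);
    }

    public static function ofMinutes(int $count): self
    {
        return new self($count * 60);
    }

    // Stand-in for "5 * MINUTE" (an overloaded "*" with an int operand).
    public function times(int $factor): self
    {
        return new self($this->seconds * $factor);
    }
}

// The parameter type documents the unit; a bare int is rejected.
function describeTimeout(Duration $timeout): string
{
    return "timeout after {$timeout->seconds} seconds";
}

$timeout = Duration::ofMinutes(1)->times(5);
echo describeTimeout($timeout), PHP_EOL;
```

With an overload, `Duration::ofMinutes(1)->times(5)` could be written as `5 * MINUTE`, but the type safety shown here does not depend on the operator syntax.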

  • Will this help with optimizations?

I’m not sure what you mean by this.

  • Will this provide new paradigms that are useful for architectural solutions?

I think it could. For example a Host + Port + Path objects could result in a URI object. Or a DI container could use it to express service dependencies more succinctly.

  • or anything else practical, and not based on the idea that “PHP needs it as it’s available in other languages”?

–Kent

I think it is worth pointing out that you can already do “operator overloading” in PHP, so long as your operators are emojis and you don’t mind how it looks… :joy:

https://3v4l.org/gQKTb

In all seriousness though, php has classes because other languages have classes. It’s a valid argument but maybe not a strong one.

— Rob

On Tue, Sep 17, 2024 at 3:44 PM Mike Schinkel <mike@newclarity.net> wrote:

On Sep 17, 2024, at 6:04 AM, Andreas Leathley <a.leathley@gmx.net> wrote:

On 17.09.24 11:14, Mike Schinkel wrote:

How would a developer know if they are using an object that has operators, unless they study all the source code or at least the docs (assuming there are good docs, which there probably are not?)

How is that different from using a method on an object? Operators are roughly speaking a different syntax for methods, and if you use an object method, you have to know about it or look up what it does.

When you see a method call, you know 100% of the time that it calls a method.

Assuming operator overloading, when you see an operator used with variables, you will no longer be able to rely on your knowledge of how the operator works, as it might use the built-in operator OR it might call a method. Most of the time it will be the former, occasionally the latter.

And since a method would only be called occasionally I would likely get tired of searching to see if there is a method being called and just assume there isn’t. Not smart, but having to track down potential methods for every operator use would just be far too tedious.

Also, getting to the implementation of operators would likely be just as easy as looking up methods nowadays, because IDEs would support that kind of lookup.

The more generically named the method, the more IDEs struggle with finding methods, and operators are about as generic as you can get.

Most framework and library code is by now type-hinted - I would have understood this argument if operator overloading had been implemented in PHP 5.x, where many classes and functions received values with no enforced types, so they might have expected an int, yet an object with operator overloading might work but in weird ways, because there were no type hints for int. I cannot see this situation now at all - if a Symfony component wants an int, you cannot pass in an object. If a Symfony component wants a Request object, it will not use operators that it does not expect to be there, even if you would extend the class and add operator support. Using operators implies you know you can use operators, otherwise it will be a fatal error (except for comparisons).

ANY object could have an operator overload on == and/or === and have those behave differently.

I also just checked and Symfony has many parameters type hinted as Stringable and numerous interfaces that extend Stringable. Any object that implements Stringable could also add operator overloads and introduce subtle bugs into those functions.

Symfony has numerous interfaces that extend Countable. Any object that implements Countable could also add operator overloads and introduce subtle bugs into those functions.

Symfony has numerous interfaces that extend ArrayAccess. Any object that implements ArrayAccess could also add operator overloads and introduce subtle bugs into those functions.

Further — if operator overloads are added — there will likely be demand for more “-able” type interfaces, or some other way to make objects behave more like scalar types. If we have opt-in then code is safe. If we do not have opt-in then unintended consequences are possible, if not likely.

From your arguments it also seems you are afraid everybody will use operator overloading excessively and unnecessarily. This seems very unlikely to me - it is not that useful a feature, except for certain situations.

We will have to agree to disagree on that then. Look at what happened when it was added to C++; people went crazy with it, and it was many years before the community moderated its behavior.

Many other languages have had operator overloading for many years or even decades - are there really huge problems in those languages?

Depends on what you consider to be “huge” problems. Many people feel operator overloading is a cure for an ailment where the medicine is worse than the disease.

Just google “operator overloading considered harmful.” But in case you can’t be bothered to google, here are just three of many opinions:

If yes, maybe PHP can learn from some of the problems there (which I think the original RFC tried to carefully consider), but as far as I know the usage of operator overloading is niche in languages which support it, depending on use case - some people like it, some don’t, but they do not seem to be a big problem for these languages or their code in general. Maybe you have some sources on actual problems in other languages?

Besides the links above, my sources are personal experience over 30+ years of programming.

Personally I would love my Money class to finally have operators instead of the current “plus”, “minus”, “multipliedBy” (and so on) methods which are far less readable. I would only use operator overloading on a few specific classes, but for those the readability improvement would be huge. Also, being able to override comparison operators for objects would be very useful, because currently using == and === with objects is almost never helpful or sufficient.

And I support that. Just have developers or yourself opt-in to use operators on your Money class.

Requiring an opt-in is a safer way to go, and NOT requiring an opt-in means we would forever have to live with any downsides that opt-in would rein in.

So how is requiring an opt-in a bad idea and not a reasonable compromise?

On Sep 17, 2024, at 1:22 PM, Jordan LeDoux <jordan.ledoux@gmail.com> wrote:
While I do not presume to speak for all voters (I don’t even have voting rights myself), my feeling from all of the conversations I have had over almost the last 4 years is that implementing your suggestion would virtually guarantee that the RFC is declined.

Well, given that few have commented on the idea yet, it seems premature to make that assumption.

Why not wait until we get more feedback rather than nix it up front?

You are suggesting providing a new syntax (which voters tend to be skeptical of) to create a situation where more errors occur

(which voters tend to be skeptical of)

So, type hinting is all about creating errors. Yet almost everyone on the list is super excited about adding more and better type hinting.

Given that, I would argue that your claim is, if not false, at least far too simplistic to be a valid claim.

to solve a problem which can be solved with existing syntax by simply type guarding your code to not allow any objects near your operators (which voters tend to be skeptical of)

You mean you assume it can be solved with existing syntax.

for which I cannot find any code examples that explain the problem it is solving (which voters tend to be skeptical of).

Frankly, it is difficult to come up with examples because examples are predicated on junior developers doing dumb things that would never occur to a senior developer because they know better.

But I prepared one such hypothetical example using function equals() as a stand-in for operator == so the code will run:

https://3v4l.org/6GTKb#v8.3.11

I feel certain I could come up with several others, but I unfortunately have already exceeded my time allotment for the day.

On Sep 17, 2024, at 1:43 PM, Jordan LeDoux <jordan.ledoux@gmail.com> wrote:

The problem I was trying to solve involved lots of things that cannot be represented well by primitive types (which is presumably why they are classes in the first place). Things like Complex Numbers, Matrices, or Money. Money can be converted to a float of course (or an int depending on implementation), but Money does not want to be added with something like request count, which might also be an int. Or if it does, it probably wants to know exactly what the context is. There are lots of these kinds of value classes that might be representable with scalars, but would lose a lot of their context and safety if that is done.

On the other hand, Money would probably not want to be multiplied with other Money values. What would Money squared mean exactly? Things like this are very difficult to control for if all you provide is a way to control casting to scalar types.
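To make the point about context concrete, here is a minimal sketch of how such a value class can restrict its operations today with ordinary typed methods. The class shape, the `cents` and `currency` properties, and the choice of exception are assumptions made for this example only; none of it comes from the RFC.

```php
<?php
declare(strict_types=1);

// Hypothetical value class: the parameter types decide which combinations are legal.
final class Money
{
    public function __construct(
        public readonly int $cents,
        public readonly string $currency,
    ) {}

    // Money + Money is meaningful, but only within one currency.
    public function plus(Money $other): self
    {
        if ($other->currency !== $this->currency) {
            throw new InvalidArgumentException('Currency mismatch');
        }
        return new self($this->cents + $other->cents, $this->currency);
    }

    // Money * int is meaningful; Money * Money is rejected by the type declaration.
    public function multipliedBy(int $factor): self
    {
        return new self($this->cents * $factor, $this->currency);
    }
}

$price = new Money(150, 'USD');
$total = $price->plus(new Money(50, 'USD'))->multipliedBy(3); // 600 cents

$rejected = false;
try {
    $price->multipliedBy($price); // "Money squared" cannot be expressed
} catch (TypeError $e) {
    $rejected = true;
}
```

A cast to a scalar would lose exactly this information: once the value is a plain int, nothing stops it from being added to a request count.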

Given your pushback on even having an opt-in for operator overload, why not instead prepare RFCs for built-in Complex Number, Matrix, and Money classes?

That was what I argued three years ago, yet no action was taken on it, so I get the impression that not having operator overloading for those kinds of classes is actually not that big of a deal; otherwise someone would have pushed a class-specific RFC in those three years.

As an aside, I think it would be a LOT more valuable to have a built-in standard that becomes always available as a starting point for everyone doing maths rather than having a bunch of different, incompatible and non-composable and edge-case buggy classes for these uses out on Github.

-Mike

I really wish you would read the things that I have written.

  1. No objects would be able to overload the === operator, because that operator was not included in the RFC intentionally.

  2. The current semantics of the == with objects are possibly used by some, but in practice from what I have seen and researched, are mostly so useless that they aren’t used. But, even so, the comparison operators in the original RFC actually DO fall back to the existing behavior if no overload is present.

  3. I don’t know how else to explain this to you, but the existence of an operator in your code does NOT mean you would now need to do more work if you do not want to take advantage of operator overloads, even without an opt-in. Currently using objects with an operator results in a fatal error. Since you presumably have written code that can run in the last 30+ years of your experience, you have somehow avoided this. I presume you have avoided this situation by paying attention to the types of your variables. You may continue to do the same thing you MUST do now, and you will never encounter an operator overload. Ever.

  4. None of the Symfony code that allows Stringable or Countable will suddenly produce new bugs without intentionally adding operators. I know this because the code currently does not produce errors, and currently any object that is Stringable and/or Countable will produce errors when it encounters any of the operators specified in the original RFC (excluding comparisons, which I have mentioned). ArrayAccess… perhaps? Arrays can use + to union, and I’m not sure off the top of my head if ArrayAccess also does this. But that’s an internal implementation that could be forced to remain the same fairly trivially if so.

  5. I addressed this multiple times in my original RFC and the discussion surrounding it, but C++ is the most useless analogue anyone could choose for what I am proposing, and also just HAPPENS to be the language with the most negative impact from its operator overload implementation. I noticed this in my research prior to the RFC (which is the point of such research), and I made design choices SPECIFICALLY to avoid that and then communicated that to everyone via multiple channels including this mailing list and the original RFC. The vast majority of trouble that C++ operator overloads cause are related to overloads with pointers. Pointers do not exist in PHP, and their closest analogue (by-ref) are specifically not possible with overloads to prevent that very problem.

  6. The articles you linked to about the evils of operator overload seem to be essentially making the point “people will misuse operator overloading”. Sure. In fact the RFC I wrote had a section dedicated to that. People will misuse ANYTHING. I don’t see that on its own as a legitimate reason to avoid a feature that has legitimate uses. I spent a lot of time researching this aspect of it in different developer communities, and my conclusion eventually was “most developer communities for languages that have operator overloads do not appear to see them as a particularly troublesome source of quirks, bugs, and confusion”. I don’t think that the Python or C# communities for instance would support removing the feature, even if there were a way to magically fix the BC issues it would cause. It is seen as a useful but niche feature, which is exactly what it is.

  7. Your opt-in concept is a bad idea because it creates multiple switchable modes for the VM to run in, which is something that I think most developers would find MUCH more mentally taxing than simply adjusting to the new feature.

  8. It is not merely an “assumption” that existing syntax can solve the issue you are trying to solve. The following operators ALWAYS result in a fatal error when used with an object that doesn’t have an overload: +, -, *, /, %, **, ^, &, |, <<, >>, ~. These operators are currently used in code that does not produce errors. Ergo, there is an existing syntax that prevents their use with objects. Ergo, there is an existing syntax that solves the issue. In fact there are multiple.

  9. I have not prepared RFCs for bundled Complex Number, Matrix, etc. types because those are just the use cases that brought me to this feature, not the only ones I have encountered, and I don’t think that special-casing an internal class for every single one is a sustainable solution. How long before we end up with a Currency class also? Or a Length, Area, and Volume class that all understand how to multiply and divide into each other? What about other measurement units getting their own? I think that same thought process would also work against the idea of a Complex Number or Matrix class, and frankly, something like a Complex Number class is HIGHLY specific and would be virtually unused. I might have a use for it, but including it in the language is fairly dubious.
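Points 3 and 8 can be checked on a current PHP 8 release, which has no userland overloads. The following is a minimal sketch; the `Plain` class and the `addInts()` helper are made up for illustration. It shows an arithmetic operator rejecting an object operand with a `TypeError`, and a scalar type declaration acting as one of the existing guards.

```php
<?php
declare(strict_types=1);

// Hypothetical class with no operator support of any kind.
final class Plain
{
    public function __construct(public readonly int $value) {}
}

// Existing guard: a scalar parameter type keeps objects away from the operator.
function addInts(int $a, int $b): int
{
    return $a + $b;
}

$errors = [];

try {
    $result = new Plain(100) + 1; // arithmetic on a plain object
} catch (TypeError $e) {
    $errors[] = 'operator rejected the object';
}

try {
    addInts(new Plain(100), 1); // never reaches the + inside the function
} catch (TypeError $e) {
    $errors[] = 'type declaration rejected the object';
}

$sum = addInts(2, 3); // code that only ever sees ints behaves as before
```

Both failures happen before any arithmetic is performed, which is why code that already runs without errors cannot be handing plain objects to these operators.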

Almost everything you have said are things that I have talked about or been asked about for years. That is why I have been fairly dismissive about them so far, not because I disagree. I am well familiar with these discussions, and I was hoping to get new feedback and thoughts through this thread. To hear specifics.

The specifics I hear from you is something along the lines of “as long as you give me a way to force all my code, even code other developers wrote, to ignore this feature, I am fine with it”. That’s not really a helpful position at all, even if it IS a valid one.

Jordan

On Sep 17, 2024, at 8:05 PM, Jordan LeDoux <jordan.ledoux@gmail.com> wrote:

On Tue, Sep 17, 2024 at 3:44 PM Mike Schinkel <mike@newclarity.net> wrote:

Almost everything you have said are things that I have talked about or been asked about for years. That is why I have been fairly dismissive about them so far, not because I disagree. I am well familiar with these discussions, and I was hoping to get new feedback and thoughts through this thread. To hear specifics.

The specifics I hear from you is something along the lines of “as long as you give me a way to force all my code, even code other developers wrote, to ignore this feature, I am fine with it”. That’s not really a helpful position at all, even if it IS a valid one.

Since you do not share my concerns related to your RFC I will not spend any more time on this topic.

I would close with best of luck, but honestly, I hope for PHP’s sake the results are the same as last time.

-Mike

On Tue, Sep 17, 2024, at 3:14 PM, Jordan LeDoux wrote:

I think it's absolutely possible - and desirable - to choose a philosophical position on that spectrum, and use it to drive design decisions. The choice of "__add" vs "operator+" is one such decision.

Ah, I see. I suppose I never really entertained an idea like this
because in my mind it can't even handle non-trivial math, let alone the
sorts of things that people might want to use overloads for. Once you
get past arithmetic with real numbers into almost any other kind of
math, which operators are meaningful, and what they mean exactly,
begins to depend a lot on context. This is why I felt like even if we
were limiting the use cases to math projects, things like commutativity
should not necessarily be enforced.

The line `$a + $b` and `$b + $a` are SUPPOSED to give different results
for certain types of math objects, for instance. The line `$a - $b` and
`$b - $a` more obviously give different results to most people, because
subtraction is not commutative even for real numbers.

My personal opinion is that the RFC should not assume the overloads are
used in a particular domain (like real number arithmetic), and thus
should not attempt to enforce these kinds of behaviors. But, opinions
like this are actually what I was hoping to receive from this thread.
This could be the way forward that voters are more interested in, even
if it wouldn't be my own first preference as it will be highly limiting
to the applicable domains.

I'm not sure where exactly in this thread to put this, so I'm putting it here...

Rowan makes an interesting point regarding operators vs operations. In particular, the way the <=> logic is defined, it is defining an operation: comparison. Using it for anything other than ordering comparison is simply not viable, nor useful. It's defining a custom implementation of a specific pre-existing action.

For all the other operators, the logic seems to be defined for an operator, the behavior of which is "whatever makes sense in your use case, idk." That is, to use Rowan's distinction, a philosophically different approach. Not a bad one, necessarily. In fact, I think it's a very good one.

But, as they are different, perhaps that suggests that comparison should instead not be implemented as an operator overload per se, but as a named magic method. The existing logic for it is, I think, fine, but it's a fair criticism that you're not defining "what happens for a method-ish named <=>", you're defining "how do objects compare." So I think it would make sense to replace the <=> override with a `__compare(mixed $other): int`, which any class could implement to opt-in to ordering comparisons, and thus work with <, >, ==, <=>, etc. (And, importantly, still keep the "specify the type(s) you want to be able to compare against" logic, already defined.)

A similar argument could probably be made for ==, though I've not fully thought through if I agree or not. Again, I think the previously defined logic is fine. It would be just changing the spelling from `operator ==(mixed $other): bool` to `public function __equals(mixed $other): bool`. But that again better communicates that it is a core language behavior that is being overridden, rather than an arbitrarily defined symbol-function-thing with domain-specific meaning.
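As a rough sketch of what a named comparison hook might feel like: `__compare` below is only the proposed spelling, and the engine does not call it today, so the example invokes it explicitly. The `Version` class and its properties are assumptions made for illustration.

```php
<?php
declare(strict_types=1);

final class Version
{
    public function __construct(
        public readonly int $major,
        public readonly int $minor,
    ) {}

    // Proposed hook name; today this is an ordinary method with no engine support.
    public function __compare(mixed $other): int
    {
        if (!$other instanceof self) {
            throw new TypeError('Version can only be compared to Version');
        }
        // Arrays of equal length compare element by element.
        return [$this->major, $this->minor] <=> [$other->major, $other->minor];
    }
}

$versions = [new Version(8, 3), new Version(7, 4), new Version(8, 0)];

// With engine support, sort() and < could use the hook directly;
// here a callback stands in for that dispatch.
usort($versions, fn (Version $a, Version $b): int => $a->__compare($b));

$sorted = array_map(fn (Version $v): string => "{$v->major}.{$v->minor}", $versions);
$isOlder = (new Version(7, 4))->__compare(new Version(8, 0)) < 0;
```

The single integer result is what lets one method answer <, >, <= and >= at once, which is the sense in which it defines an operation rather than one operator.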

There was an RFC for a Comparable interface back in the stone age (2010), but it looks like it never went to a vote: https://wiki.php.net/rfc/comparable

Arguably, this would then make more sense as a stand-alone RFC that happens to reuse a lot of the existing code and logic defined for operator overloads, which are all still just as valid.

That does not apply to the arithmetic, bitwise, or logic operators. Overriding + or / for a specific domain is not the same, as you're not hooking into engine behavior the way <=> or == are. For those, I'd prefer to stick to the current/previous implementation, with the `operator` keyword, for reasons I explained before.

Jordan, does that distinction make sense to you?

--Larry Garfield

On Tue, Sep 17, 2024 at 6:49 PM Larry Garfield <larry@garfieldtech.com> wrote:

On Tue, Sep 17, 2024, at 3:14 PM, Jordan LeDoux wrote:

I think it’s absolutely possible - and desirable - to choose a philosophical position on that spectrum, and use it to drive design decisions. The choice of “__add” vs “operator+” is one such decision.

Ah, I see. I suppose I never really entertained an idea like this
because in my mind it can’t even handle non-trivial math, let alone the
sorts of things that people might want to use overloads for. Once you
get past arithmetic with real numbers into almost any other kind of
math, which operators are meaningful, and what they mean exactly,
begins to depend a lot on context. This is why I felt like even if we
were limiting the use cases to math projects, things like commutativity
should not necessarily be enforced.

The line $a + $b and $b + $a are SUPPOSED to give different results
for certain types of math objects, for instance. The line $a - $b and
$b - $a more obviously give different results to most people, because
subtraction is not commutative even for real numbers.

My personal opinion is that the RFC should not assume the overloads are
used in a particular domain (like real number arithmetic), and thus
should not attempt to enforce these kinds of behaviors. But, opinions
like this are actually what I was hoping to receive from this thread.
This could be the way forward that voters are more interested in, even
if it wouldn’t be my own first preference as it will be highly limiting
to the applicable domains.

I’m not sure where exactly in this thread to put this, so I’m putting it here…

Rowan makes an interesting point regarding operators vs operations. In particular, the way the <=> logic is defined, it is defining an operation: comparison. Using it for anything other than ordering comparison is simply not viable, nor useful. It’s defining a custom implementation of a specific pre-existing action.

For all the other operators, the logic seems to be defined for an operator, the behavior of which is “whatever makes sense in your use case, idk.” That is, to use Rowan’s distinction, a philosophically different approach. Not a bad one, necessarily. In fact, I think it’s a very good one.

But, as they are different, perhaps that suggests that comparison should instead not be implemented as an operator overload per se, but as a named magic method. The existing logic for it is, I think, fine, but it’s a fair criticism that you’re not defining “what happens for a method-ish named <=>”, you’re defining “how do objects compare.” So I think it would make sense to replace the <=> override with a __compare(mixed $other): int, which any class could implement to opt-in to ordering comparisons, and thus work with <, >, ==, <=>, etc. (And, importantly, still keep the “specify the type(s) you want to be able to compare against” logic, already defined.)

A similar argument could probably be made for ==, though I’ve not fully thought through if I agree or not. Again, I think the previously defined logic is fine. It would be just changing the spelling from operator ==(mixed $other): bool to public function __equals(mixed $other): bool. But that again better communicates that it is a core language behavior that is being overridden, rather than an arbitrarily defined symbol-function-thing with domain-specific meaning.

There was an RFC for a Comparable interface back in the stone age (2010), but it looks like it never went to a vote: https://wiki.php.net/rfc/comparable

Arguably, this would then make more sense as a stand-alone RFC that happens to reuse a lot of the existing code and logic defined for operator overloads, which are all still just as valid.

That does not apply to the arithmetic, bitwise, or logic operators. Overriding + or / for a specific domain is not the same, as you’re not hooking into engine behavior the way <=> or == are. For those, I’d prefer to stick to the current/previous implementation, with the operator keyword, for reasons I explained before.

Jordan, does that distinction make sense to you?

–Larry Garfield

Yes, I certainly understand the distinction. The RFC does not treat all operators equally. For many of them, the only opinion it holds is whether or not the operator is unary or binary, which is something enforced by the compiler anyway. But for comparisons, the RFC went out of its way to ensure that the overloads cannot repurpose any comparisons for non-comparison, non-ordering tasks usefully (as far as return value goes). In that sense, yes, I see how that feels more like an operation instead of an operator.

The only hesitation I would have about that is the clunky/ugly feeling I get of having some of them be symbols and some of them be names for a reason that will be totally inscrutable to 95% of developers and just be “one of those PHP quirks”. In principle though, I do get what you’re saying here.

Jordan

On Sat, Sep 14, 2024, at 4:48 PM, Jordan LeDoux wrote:

Hello internals,

This discussion will use my previous RFC as the starting point for
conversation: https://wiki.php.net/rfc/user_defined_operator_overloads

Replying to the top to avoid dragging any particular side discussion into this...

I've seen a few people, both in this thread and previously, make the argument that "operator overloads are only meaningful for math, and seriously no one does fancy math in PHP, so operator overloads are not necessary." This statement is patently false, and I would ask that everyone stop using it as it is simply FUD. I don't mean just the "no one does fancy math in PHP", but "operators are only meaningful for math" is just an ignorant statement.

As someone asked for use cases, here's a few use cases for operator overloads that do not fall into the category of fancy numeric math. Jordan, feel free to borrow any of these verbatim for your next RFC draft if you wish. (I naturally haven't tried running any of them, so forgive any typos or bugs. It's just to get the point across of each use case.)

## Pathlib

As I mentioned previously, Python's pathlib uses / to join different path fragments together. In PHP, one could implement such a library very simply with overloads. (A not-late-night implementation would probably be more performant than this, but it's just a POC.)

class Path implements Stringable
{
  private array $parts;

  public function __construct(?string $path = null) {
    $this->parts = array_filter(explode('/', $path ?? ''));
  }

  public static function fromArray(array $parts): self {
    $new = new self();
    $new->parts = $parts;
    return $new;
  }

  public function __toString(): string {
    return implode('/', $this->parts);
  }

  operator /(Path|string $other, OperandPosition $pos): Path {
    if ($other instanceof Path) {
      $other = (string)$other;
    }
    $otherParts = array_filter(explode('/', $other));

    return match ($pos) {
      OperandPosition::LeftSide => self::fromArray([...$this->parts, ...$otherParts]),
      OperandPosition::RightSide => self::fromArray([...$otherParts, ...$this->parts]),
    };
  }
}

$p = new Path('/foo/bar');
$p2 = $p / 'beep' / 'narf/poink';

## Collections

In my research into collections in other languages, I found it was extremely common for collections to have operator overloads on them. Rather than repeat it here, I will just link to my results and recommendations for what operators would make sense for what operation:

## Enum sets

Ideally, we would just use generic collections for this directly. However, even without generics, bitwise overloads would allow for this to be implemented fairly easily for a given enum. (Again, a smarter implementation is likely possible with actual effort.)

enum Perm {
  case Read;
  case Write;
  case Exec;
}

class Permissions {
  private array $cases = [];

  public function __construct(Perm ...$perms) {
     foreach ($perms as $case) {
      $this->cases[$case->name] = 1;
    }
  }

  operator +(Perm $other, OperandPosition $pos): Permissions {
    $new = clone($this);
    $new->cases[$other->name] = 1;
    return $new;
  }

  operator -(Perm $other, OperandPosition $pos): Permissions {
    $new = clone($this);
    unset($new->cases[$other->name]);
    return $new;
  }

  operator |(Permissions $other, OperandPosition $pos): Permissions {
    $new = clone($this);
    foreach ($other->cases as $caseName => $v) {
      $new->cases[$caseName] = 1;
    }
    return $new;
  }

  operator &(Permissions $other, OperandPosition $pos): Permissions {
    $new = new self();
    $new->cases = array_intersect_key($this->cases, $other->cases);
    return $new;
  }
  
  // Not sure what operator makes sense here, so punting as this is just an example.
  public function has(Perm $p): bool {
    return array_key_exists($p->name, $this->cases);
  }
}

$p = new Permissions(Perm::Read);
$p2 = $p + Perm::Exec;
$p3 = $p2 | new Permissions(Perm::Write);
$p3 -= Perm::Exec;
$p3->has(Perm::Read);

## Function composition

I have long argued that PHP needs both a pipe operator and a function composition operator. It wouldn't be ideal, but something like this is possible. (Ideally we'd use the string concat operator here, but the RFC doesn't show it. It would be a trivial change to use instead.)

class Composed {
  /** @var \Closure[] */
  private array $steps = [];

  public function __construct(?\Closure $c = null) {
    // Start with zero or one step; $steps must always be an array.
    $this->steps = $c === null ? [] : [$c];
  }

  private static function fromList(array $cs): self {
    $new = new self();
    $new->steps = $cs;
    return $new;
  }

  public function __invoke(mixed $arg): mixed {
    foreach ($this->steps as $step) {
      $arg = $step($arg);
    }
    return $arg;
  }

  operator +(\Closure $other, OperandPosition $pos): self {
    return match ($pos) {
      OperandPosition::LeftSide => self::fromList([...$this->steps, $other]),
      OperandPosition::RightSide => self::fromList([$other, ...$this->steps]),
    };
  }
}

$fun = new Composed()
  + someFunc(...)
  + $obj->someMethod(...)
  + (fn(string $a) => $a . ' (archived)') // parentheses keep the next + out of the arrow function body
  + strlen(...);

$fun($input); // Calls each closure in turn.

Note that there are a half-dozen libraries in the wild that do something akin to this, just much more clumsily, including in Laravel. The code above would be vastly simpler and easier to maintain and debug.

## Units

Others have mentioned this before, but to make clear what it could look like:

abstract readonly class MetricDistance implements Stringable {
  // Size of one unit in meters. A class constant, because a readonly
  // class cannot declare properties with default values.
  protected const FACTOR = 1;

  public function __construct(private int $length) {}

  operator +(MetricDistance $other, OperandPosition $pos): static {
    return new static((int) floor(($this->length * static::FACTOR + $other->length * $other::FACTOR) / static::FACTOR));
  }

  operator -(MetricDistance $other, OperandPosition $pos): static {
    return match ($pos) {
      OperandPosition::LeftSide => new static((int) floor(($this->length * static::FACTOR - $other->length * $other::FACTOR) / static::FACTOR)),
      OperandPosition::RightSide => new static((int) floor(($other->length * $other::FACTOR - $this->length * static::FACTOR) / static::FACTOR)),
    };
  }

  public function __toString(): string {
    return (string) $this->length;
  }
}

readonly class Meters extends MetricDistance {
  protected const FACTOR = 1;
}

readonly class Kilometers extends MetricDistance {
  protected const FACTOR = 1000;
}

$m1 = new Meters(500);
$k1 = new Kilometers(3);
$m1 += $k1;
print $m1; // prints 3500

$m1 + 12; // Error. 12 what?

There's likely a bug in the above somewhere, but it's late and it still gets the point across for now.

(Side note: The previous RFC supported abstract operator declarations, but not declarations on interfaces. That seems necessary for completeness.)

## Date and time

DateTimeImmutable and DateInterval already do this, and they're not "fancy math."
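One documented instance of this is comparison: the built-in date objects already respond to the comparison operators in a domain-specific way. A minimal sketch, with dates chosen arbitrarily for the example:

```php
<?php
declare(strict_types=1);

$start = new DateTimeImmutable('2024-01-01');
$end   = new DateTimeImmutable('2024-06-01');

// The operators compare the instants the objects represent.
$isBefore    = $start < $end;
$sameInstant = $start == new DateTimeImmutable('2024-01-01');
$order       = $end <=> $start;
```

User-defined classes have no way to opt in to the same treatment today.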

I consider all of the above to be reasonable, viable, and useful applications of operator overloading, none of which are fancy or esoteric math cases. Others may dislike them, stylistically. That's a subjective question, so opinions can differ. But the viability of the above cases is not disputable, so the claim that operator overloading is too niche to be worth it is, I would argue, demonstrably false.

--Larry Garfield

On 18.09.2024 at 00:13, Rowan Tommins [IMSoP] wrote:

Since you mentioned Scala, I looked it up, and it seems to be on the
other end of the spectrum: operators are just methods, with no
mathematical meaning or special dispatch behaviour. In fact, "a plus b"
is just another way of writing "a.plus(b)", so "a + b" is just a way of
writing "a.+(b)"

Maybe it would be "useful enough" to just restrict to left-hand side:

In my opinion, this is the only reasonable way to implement operator
overloads in PHP. It is easy to understand, and what can easily be
understood is easy to explain, document, and to reason about. I do not
understand why we're talking about commutative operations; even the
inconspicuous plus operator is not commutative in PHP
(https://3v4l.org/nQcL5). Those who want to implement a numerical
tower (or whatever) can still implement the operations as being
commutative (where appropriate) by doing manual double-dispatch. Yeah,
doesn't fit with scalars, but where is the actual problem? And I
wouldn't want to restrict the functionality of overloading *existing*
operators. If a library completely goes overboard with operator
overloading, either only few will use it, or it might be a fantastic
tool that none of us could have even thought of.
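A well-known case of a non-commutative plus in PHP is array union. It may or may not be what the linked snippet demonstrates. The keys and values in this minimal sketch are arbitrary:

```php
<?php
declare(strict_types=1);

$a = ['x' => 1, 'y' => 2];
$b = ['y' => 20, 'z' => 30];

// For duplicate keys the left operand wins, so operand order changes the result.
$ab = $a + $b; // ['x' => 1, 'y' => 2, 'z' => 30]
$ba = $b + $a; // ['y' => 20, 'z' => 30, 'x' => 1]
```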

Now, comparison operators pose a particular issue if overloads were
implemented this way, namely that the engine already swaps them; there
are no greater than (or equal) OPcodes. However, this already doesn't
work for uncomparable values, yielding "surprising" results (e.g.
#15773). As such, it might be worth considering to have a separate PR
regarding (overloading of) comparison operators.

And I would consider equality operator overloading as yet a different
issue, since that operation is (or at least should be) inherently
commutative. What we have now, however, is not that helpful, and breaks
encapsulation (https://3v4l.org/hTR2v); although without that it would
be completely useless.
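A minimal sketch of the current behaviour, which is not necessarily what the linked snippet shows; the `Token` class is made up for illustration. For two instances of the same class, `==` compares property values, including private ones, so outside code can learn whether hidden state matches.

```php
<?php
declare(strict_types=1);

// Hypothetical class whose only state is private.
final class Token
{
    public function __construct(private string $secret) {}
}

$a = new Token('abc');
$b = new Token('abc');
$c = new Token('xyz');

$looseSame  = ($a == $b);  // true: same class, equal property values
$strictSame = ($a === $b); // false: two distinct instances
$looseDiff  = ($a == $c);  // false: reveals that the private values differ
```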

Christoph

On Wed, 18 Sep 2024, at 15:19, Christoph M. Becker wrote:

Maybe it would be "useful enough" to just restrict to left-hand side:

In my opinion, this is the only reasonable way to implement operator
overloads in PHP. It is easy to understand, and what can easily be
understood is easy to explain, document, and to reason about. I do not
understand why we're talking about commutative operations; even the
inconspicuous plus operator is not commutative in PHP
(https://3v4l.org/nQcL5).

There are really three different things we shouldn't confuse:

1) Commutativity of the operation, as in $a + $b and $b + $a having the same result. As you say, this is a non-goal; we already have examples of non-commutative operators in PHP, and there are plenty more that have been given.

2) Commutativity of the *resolution*. This is slightly subtler: if $a and $b both have implementations of the operator, should $a + $b and $b + $a call the same implementation? We can say "no", but it may be surprising to some users that if $b is a sub-class of $a, its version of + isn't used by preference.

3) Resolution when *only one side has an implementation*. For instance, how do you define an overload for 1 / $object? Or for (new DateTime) + (new MySpecialDateOffset)? It's possible to work around this if the custom class has to be on the left, but probably not very intuitive.

It's also worth considering that the *resolution* of PHP's operators isn't currently determined by their left-hand side, e.g. int + float and float + int both return a float, which certainly feels like "preferring the float implementation regardless of order", even if PHP doesn't technically implement it that way.

Regards,
--
Rowan Tommins
[IMSoP]

On Wed, Sep 18, 2024 at 6:11 PM Rowan Tommins [IMSoP] <imsop.php@rwec.co.uk> wrote:

On Wed, 18 Sep 2024, at 15:19, Christoph M. Becker wrote:

Maybe it would be “useful enough” to just restrict to left-hand side:

In my opinion, this is the only reasonable way to implement operator
overloads in PHP. It is easy to understand, and what can easily be
understood is easy to explain, document, and to reason about. I do not
understand why we’re talking about commutative operations; even the
inconspicuous plus operator is not commutative in PHP
(https://3v4l.org/nQcL5).

There are really three different things we shouldn’t confuse:

  1. Commutativity of the operation, as in $a + $b and $b + $a having the same result. As you say, this is a non-goal; we already have examples of non-commutative operators in PHP, and there are plenty more that have been given.

  2. Commutativity of the resolution. This is slightly subtler: if $a and $b both have implementations of the operator, should $a + $b and $b + $a call the same implementation? We can say “no”, but it may be surprising to some users that if $b is a sub-class of $a, its version of + isn’t used by preference.

  3. Resolution when only one side has an implementation. For instance, how do you define an overload for 1 / $object? Or for (new DateTime) + (new MySpecialDateOffset)? It’s possible to work around this if the custom class has to be on the left, but probably not very intuitive.

It’s also worth considering that the resolution of PHP’s operators isn’t currently determined by their left-hand side, e.g. int + float and float + int both return a float, which certainly feels like “preferring the float implementation regardless of order”, even if PHP doesn’t technically implement it that way.

How about doing it like in Python, where there is __add__ and __radd__?

And the engine could call $op1->add($op2) if $op1 is an object and add() is implemented, or otherwise call $op2->rightAdd($op1) if $op2 is an object and rightAdd() is implemented, or otherwise fail with an error.

We could have (distinct) interfaces for both add() and rightAdd().
Or use magic methods like __add() and __rightAdd() to allow stricter types instead of mixed on the other operand. I think there is no extra complexity for the engine by using magic methods or interfaces.
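The resolution order described above could be sketched in userland roughly as follows. This is only an illustration: `resolveAdd()` stands in for what the engine would do for `$op1 + $op2`, and the `Amount` class is hypothetical.

```php
<?php
declare(strict_types=1);

// Stand-in for engine dispatch: left operand's add() first, then right operand's rightAdd().
function resolveAdd(mixed $op1, mixed $op2): mixed
{
    if (is_object($op1) && method_exists($op1, 'add')) {
        return $op1->add($op2);
    }
    if (is_object($op2) && method_exists($op2, 'rightAdd')) {
        return $op2->rightAdd($op1);
    }
    throw new TypeError('Unsupported operand types');
}

final class Amount
{
    public function __construct(public readonly int $value) {}

    // Called when the Amount is the left operand.
    public function add(Amount|int $other): self
    {
        return new self($this->value + ($other instanceof self ? $other->value : $other));
    }

    // Called when the left operand has no add() of its own, e.g. a plain int.
    public function rightAdd(int $other): self
    {
        return new self($other + $this->value);
    }
}

$left  = resolveAdd(new Amount(5), 2); // dispatches to add()
$right = resolveAdd(2, new Amount(5)); // dispatches to rightAdd()
```

Because each method declares its own parameter types, the two directions can accept different operand types, which is the stricter typing mentioned above.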


Alex