[PHP-DEV] [RFC] Default expression

On 25/08/2024 14:35, Larry Garfield wrote:

My other concern is the list of supported expression types. I
understand how the implementation would naturally make all of those
syntactically valid, but it seems many of them, if not most, are
semantically nonsensical.

I tend to agree with Larry and John that the list of operators should be restricted - we can always allow more in future, but restricting later is much harder.

A few rules that seem logical to me:

1) The expression should be reasonably guaranteed to produce the same type as the actual default.

- No casts
- No comparison operators, because they produce booleans from non-boolean input
- No "<=>". Technically, it has an integer result, but it's rare to use it as one, rather than a kind of three-value boolean
- No "instanceof"
- No "empty"

2) The expression should not have side effects (outside of exotic operator overloads).

- No "include", "require", etc
- No "throw"
- No "print"
- Borderline, but I would also say no "clone"

3) The expression should be passing additional information into the function, not pulling information out of it. The syntax shouldn't be a way to write obfuscated reflection, or invert data flow from callee to caller.

- No assignments.
- No ternaries with "default" on the left-hand side - "$foo ? $bar : default" is acting on local knowledge, but "default ? $foo : $bar" is acting on information the caller shouldn't know
- Same for "?:" and "??"
- No "match" with "default" as the condition or branch, for the same reason. "match($foo) { $bar => default }" is fine, "match(default) { ... }" or "match($foo) { default => ... }" are not.

Note that these can be seen as aspects of the same rule: the aim of the expression should be to transform the default value into another value of the same type, not to pull it out and perform arbitrary operations based on it.

I believe that leaves us with:

- Arithmetic operators: binary + - * / % **, unary + -
- Bitwise operators: & | ^ << >> ~
- Boolean operators: && || and or xor !
- Conditions with default on the RHS: $foo ? $bar : default, $foo ?: default, $foo ?? default, match($foo) { $bar => default }
- Parentheses: (((default)))

Even then, I look at that list and see more problems than use cases. As the RFC points out, library authors already worry about the maintenance burden of named argument support, will they now also need to question whether someone is relying on "default + 1" having some specific effect?

Maybe we should instead require justification for each addition:

- Bitwise | is nicely demonstrated in the RFC
- Bitwise & could probably be justified on similar grounds
- "$foo ? $bar : default" is discussed in the RFC
- The other "conditions with default on the RHS" in my shortlist above fit the same basic use case

Beyond that, I'm struggling to think of meaningful uses: "whatever the function sets as its default, do the opposite"; "whatever number the function sets as default, raise it to the power of 3"; etc. Again, they can easily be added in later versions, if a use case is pointed out.

Regards,

--
Rowan Tommins
[IMSoP]

On 25/08/2024 15:04, John Coggeshall wrote:

Other thoughts here are what happens when `default` resolves to an object or enumeration or something complex? Your original example had `CuteTheme`, so can you call a method of `default`? I could entirely see someone doing something like this for example:

enum Foo: string {
    // cases

    public function buildSomeValidBasedOnCase(): int { /* ... */ }
}

F(MyClass::makeBasedOnValue(default->buildSomeValidBasedOnCase()))

As you have written it, no, you will get a parser error: Parse error: syntax error, unexpected token "->", expecting ")"

However, you can wrap the `default` in parens as in the following example:

class C {
    function F() {
        echo 'lol';
    }
}

function G($V = new C) {}

G((default)->F()); // lol

On Sun, Aug 25, 2024, at 17:31, Rowan Tommins [IMSoP] wrote:

On 25/08/2024 14:35, Larry Garfield wrote:

My other concern is the list of supported expression types.  I 
understand how the implementation would naturally make all of those 
syntactically valid, but it seems many of them, if not most, are 
semantically nonsensical.

I tend to agree with Larry and John that the list of operators should be restricted - we can always allow more in future, but restricting later is much harder.

A few rules that seem logical to me:

  1. The expression should be reasonably guaranteed to produce the same type as the actual default.
  • No casts

  • No comparison operators, because they produce booleans from non-boolean input

  • No “<=>”. Technically, it has an integer result, but it’s rare to use it as one, rather than a kind of three-value boolean

  • No “instanceof”

  • No “empty”

  2. The expression should not have side effects (outside of exotic operator overloads).
  • No “include”, “require”, etc

  • No “throw”

  • No “print”

  • Borderline, but I would also say no “clone”

  3. The expression should be passing additional information into the function, not pulling information out of it. The syntax shouldn’t be a way to write obfuscated reflection, or invert data flow from callee to caller.
  • No assignments.

  • No ternaries with “default” on the left-hand side - “$foo ? $bar : default” is acting on local knowledge, but “default ? $foo : $bar” is acting on information the caller shouldn’t know

  • Same for “?:” and “??”

  • No “match” with “default” as the condition or branch, for the same reason. “match($foo) { $bar => default }” is fine, “match(default) { … }” or “match($foo) { default => … }” are not.

Note that these can be seen as aspects of the same rule: the aim of the expression should be to transform the default value into another value of the same type, not to pull it out and perform arbitrary operations based on it.

I believe that leaves us with:

  • Arithmetic operators: binary + - * / % **, unary + -

  • Bitwise operators: & | ^ << >> ~

  • Boolean operators: && || and or xor !

  • Conditions with default on the RHS: $foo ? $bar : default, $foo ?: default, $foo ?? default, match($foo) { $bar => default }

  • Parentheses: (((default)))

Even then, I look at that list and see more problems than use cases. As the RFC points out, library authors already worry about the maintenance burden of named argument support, will they now also need to question whether someone is relying on “default + 1” having some specific effect?

Maybe we should instead require justification for each addition:

  • Bitwise | is nicely demonstrated in the RFC

  • Bitwise & could probably be justified on similar grounds

  • “$foo ? $bar : default” is discussed in the RFC

  • The other “conditions with default on the RHS” in my shortlist above fit the same basic use case

Beyond that, I’m struggling to think of meaningful uses: “whatever the function sets as its default, do the opposite”; “whatever number the function sets as default, raise it to the power of 3”; etc. Again, they can easily be added in later versions, if a use case is pointed out.

Regards,

-- 
Rowan Tommins
[IMSoP]

Hi Rowan, you went through a lot of trouble to write this out, and the reasoning makes sense to me. However, all the nonsensical things you say shouldn’t be allowed are already perfectly allowed today, you just have to type a bunch of boilerplate reflection code. There is no new behavior here, just new syntax.
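
For context, a minimal sketch of the kind of reflection boilerplate referred to here: reading a parameter's declared default and combining it with a caller-side value. The function `connect`, its `$options` parameter and the helper `declaredDefault` are hypothetical names invented for illustration; only the `ReflectionFunction` and `ReflectionParameter` calls are the standard Reflection API.

```php
<?php
// Hypothetical library function; the name and the default are illustrative only.
function connect(int $options = 0b0011): int
{
    return $options;
}

// Hypothetical helper: look up the declared default of one parameter by name.
function declaredDefault(string $function, string $parameter): mixed
{
    foreach ((new ReflectionFunction($function))->getParameters() as $param) {
        if ($param->getName() === $parameter && $param->isDefaultValueAvailable()) {
            return $param->getDefaultValue();
        }
    }

    throw new InvalidArgumentException("No default available for \$$parameter");
}

// Roughly what `connect(default | 0b0100)` would express under the RFC.
$result = connect(declaredDefault('connect', 'options') | 0b0100);
var_dump($result); // int(7)
```

The lookup is keyed by function and parameter name, so unlike the proposed syntax it has to be kept in step with the call site by hand.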

— Rob

On 25/08/2024 16:29, Bilge wrote:

You can write, `include(1 + 1);`, because `include()` accepts an expression. You will get: "Failed opening '2' for inclusion". Should we restrict that? No, because that's just how expressions work in any context where they're allowed.

I think a better comparison might be the "new in initializers" and "fetch property in const expressions" RFCs, which both forbid uses which would naturally be allowed by the grammar. The rationale in those cases was laid out in PHP: rfc:new_in_initializers and PHP: rfc:fetch_property_in_const_expressions

To pull out a point that might be overlooked at the bottom of my longer response earlier:

> As the RFC points out, library authors already worry about the maintenance burden of named argument support, will they now also need to question whether someone is relying on "default + 1" having some specific effect?

By saying "default can be used in any expression, as complex as the caller can imagine", we're implicitly saying "if you add a default to your function signature, that is now information a user can pull *out* as part of your API".

Regards,

--
Rowan Tommins
[IMSoP]

On 25/08/2024 16:54, Rob Landers wrote:

Hi Rowan, you went through a lot of trouble to write this out, and the reasoning makes sense to me. However, all the nonsensical things you say shouldn’t be allowed are already perfectly allowed today, you just have to type a bunch of boilerplate reflection code. There is no new behavior here, just new syntax.

Firstly, your response to John was essentially "please give more details" [https://externals.io/message/125183#125214], and your response to me is "thanks for the details, but I'm not going to engage with them". That's a bit frustrating.

Secondly, I don't think "it's possible with half a dozen lines of reflection, so it's fine for it to be a first-class feature of the language syntax" is a strong argument. The Reflection API is a bit like the Advanced Settings panel in a piece of software, it comes with a big "Proceed with Caution" warning. You only move something from that Advanced Settings panel to the main UI when it's going to be commonly used, and generally safe to use. I don't think allowing arbitrary operations on a value that's declared as the default of some other function passes that test.

Regards,

--
Rowan Tommins
[IMSoP]

On Aug 25 2024, at 11:31 am, Rowan Tommins [IMSoP] imsop.php@rwec.co.uk wrote:

Even then, I look at that list and see more problems than use cases. As the RFC points out, library authors already worry about the maintenance burden of named argument support, will they now also need to question whether someone is relying on “default + 1” having some specific effect?

Maybe we should instead require justification for each addition:

  • Bitwise | is nicely demonstrated in the RFC
  • Bitwise & could probably be justified on similar grounds
  • “$foo ? $bar : default” is discussed in the RFC
  • The other “conditions with default on the RHS” in my shortlist above fit the same basic use case

IMO the operations that make sense in this context are:

  • Some Bitwise operators: & | ^
  • Conditions with default on the RHS: $foo ? $bar : default, $foo ?: default, $foo ?? default, match($foo) { $bar => default }
  • Parentheses: (((default)))

Beyond that, I’m struggling to think of meaningful uses: “whatever the function sets as its default, do the opposite”; “whatever number the function sets as default, raise it to the power of 3”; etc. Again, they can easily be added in later versions, if a use case is pointed out.

I 100% agree.

G((default)->F()); // lol

Special-casing the T_DEFAULT grammar would not only bloat the grammar rules but also increase the chance that new expression grammars introduced in future, which could conveniently interoperate with default, would be unintentionally excluded by omission.

I won’t vote for this RFC if the above code is valid, FWIW. Unlike include, default is a special case with a very specific purpose – one that is reaching into someone else’s API in a way the developer of that library doesn’t explicitly permit. It should not become a fast, easy way to inject a new, potentially complex dependency, which is what full expression support would allow.

The fact that Reflection allows me to pull out a private member doesn’t mean accessing private members of objects should be given its own language syntax.

Frankly, not only should the op list be limited but ideally it should also only be valid based on the type of the upstream API call (e.g. bitwise operators should only be valid if the upstream API call has a type int).

Special-casing the T_DEFAULT grammar would not only bloat the grammar rules but also increase the chance that new expression grammars introduced in future, which could conveniently interoperate with default, would be unintentionally excluded by omission.

Forgot to add that I don’t think the fact that doing this properly requires a more complex grammar is a strong argument for doing it “the easy way” of allowing all expressions. It’s a special case, and that should be reflected in the grammar.

Hi Rowan

On Sun, Aug 25, 2024 at 6:06 PM Rowan Tommins [IMSoP]
<imsop.php@rwec.co.uk> wrote:

On 25/08/2024 16:29, Bilge wrote:
> You can write, `include(1 + 1);`, because `include()` accepts an
> expression. You will get: "Failed opening '2' for inclusion". Should
> we restrict that? No, because that's just how expressions work in any
> context where they're allowed.

I think a better comparison might be the "new in initializers" and
"fetch property in const expressions" RFCs, which both forbid uses which
would naturally be allowed by the grammar. The rationale in those cases
was laid out in
PHP: rfc:new_in_initializers and
PHP: rfc:fetch_property_in_const_expressions

I don't agree with that. Constant expressions in PHP already only
support a subset of operations that expressions do. However, default
is proposed to be a true expression, i.e. one that compiles to
opcodes. Looking at the `expr` nonterminal [1] I can't see any
productions that are restricted in the context they can be used in,
even though plenty of them are nonsensical (e.g. exit(1) + 2).

Furthermore, new in initializers was disallowed in some contexts not
because it would be nonsensical, but because it posed technical
difficulties.

I also believe some of the rules you've laid out would be hard to enforce.

1) The expression should be reasonably guaranteed to produce the same type as the actual default.

Even the simple cases of ??, ?: can easily break this rule.

Furthermore, context restriction is easily circumvented. E.g.

foo((int) default); // This is not allowed
foo((int) match (true) { default => default }); // Let me just do that

I'm not sure context restriction is worthwhile, if 1. we can't do it
properly anyway and 2. there are no technical reasons to do so.

Ilija

[1] php-src/Zend/zend_language_parser.y at 3f4028d3d9d63e1dae012a9c350141493b30825f · php/php-src · GitHub

On 25/08/2024 17:36, Ilija Tovilo wrote:

I don't agree with that. Constant expressions in PHP already only
support a subset of operations that expressions do. However, default
is proposed to be a true expression, i.e. one that compiles to
opcodes.

This is circular: obviously, changing the proposal requires making changes to what is proposed.

I'm arguing that allowing default as a token that's usable in arbitrary expressions is unnecessary and problematic, and that we should instead define the specific use cases, and build the feature around those.

I also believe some of the rules you've laid out would be hard to enforce.

The rules were intended to guide the design of the feature, not be things that someone needed to enforce in code somewhere. If you start with the aim of implementing:

- Use in place of an argument
- Use with bitwise | and &
- Use on the RHS of ?: and ??

Then maybe you end up with a completely different implementation from what's currently been written.

For instance, rather than adding "default" to the "expr" rule in the grammar, and then restricting it at compile-time, maybe we add a new grammar rule "expr_with_default", usable only in expressions and with a very limited set of productions. Maybe that means we can't support match() expressions, because it would bloat the grammar too much, but "match($foo) { 'blah'=> 'bleugh', default => default }" is pretty ugly anyway.

Or maybe, the expressions are allowed, but they're compiled down with "default" as a special pseudo-type that has limited legal operations, so that the result of "(int)default" is undefined, no matter how you try to obfuscate it.

Just because it's easy to implement a feature a particular way, doesn't mean that's necessarily the right way.

--
Rowan Tommins
[IMSoP]

On 25/08/2024 18:12, Rowan Tommins [IMSoP] wrote:

For instance, rather than adding “default” to the “expr” rule in the grammar, and then restricting it at compile-time, maybe we add a new grammar rule “expr_with_default”, usable only in expressions and with a very limited set of productions.

Like the original commit? Yeah, we did that.

Just because it’s easy to implement a feature a particular way, doesn’t mean that’s necessarily the right way.

With respect, you do not know what you’re talking about here. The original approach was to start manually whitelisting each expression grammar I thought made sense. THAT was the easiest way because both Ilija and I failed in our first attempts to expand the grammar to support default as a general expression, and not for lack of trying. It took a Bison grammar expert to drop a patch that certainly wowed me, because hitherto I wasn’t even certain it was possible, mainly because of the conflicts with match (but also switch to some extent). As an aside, with respect to match, there is still an unresolved case and the RFC needs updating with the semantics we want to enforce there. So we pursued default as an expression not because it was easy, but despite the fact that it was hard, because it was precisely what we wanted to do.

I apologise for coming on strong, but I put a lot of effort into this, so I take exception to the implication that anyone involved took the easy way out to arrive at this (our best) solution.

Kind regards,
Bilge

Hi Rowan,

On Aug 25, 2024, at 11:31, Rowan Tommins [IMSoP] <imsop.php@rwec.co.uk> wrote:

3) The expression should be passing additional information into the function, not pulling information out of it. The syntax shouldn't be a way to write obfuscated reflection, or invert data flow from callee to caller.
- No assignments.
- No ternaries with "default" on the left-hand side - "$foo ? $bar : default" is acting on local knowledge, but "default ? $foo : $bar" is acting on information the caller shouldn't know
- Same for "?:" and "??"
- No "match" with "default" as the condition or branch, for the same reason. "match($foo) { $bar => default }" is fine, match(default) { ... }" or "match($foo) { default => ... }" are not.

I think this brings up a good question on what exactly should be intended to be public API. Currently, there are two main ways to effectively write a default parameter:

  function foo(int $param = 42) {}
  function bar(?int $param) { $param ??= 42; }

In the former, the default value is listed in the function declaration, along with the function name, and parameter type and name, which are already part of the public interface.
In the latter, the default value is an implementation detail of the function, and is not part of the function declaration.

(You could also write `?int $param = 42`, but I'd argue that at that point, if you really need to distinguish among the set of (unspecified, null, value), you're better off with an ADT, which we don't have yet. And you can also explicitly expose a default value as a static value on a type, when a function itself doesn't want to/shouldn't make the policy decision of what the default is.)

Although I'm not sold on the idea of using default as part of an expression, I would argue that a default function parameter value is fair game to be read and manipulated by callers. If the default value was intended to be private, it shouldn't be in the function declaration.

One important case where reading the default value could be important is in interoperability with different library versions. For example, a library might change a default parameter value between versions. If you're using the library, and want to support both versions, you might both not want to set the value, and yet also care what the default value is from the standpoint of knowing what to expect out of the function.

-John

On 25/08/2024 18:30, Bilge wrote:

I apologise for coming on strong, but I put a lot of effort into this, so I take exception to the implication that anyone involved took the easy way out to arrive at this (our best) solution.

I apologise for the inadvertent offence.

It was based solely on this comment from Ilija:

> I also believe some of the rules you've laid out would be hard to enforce.

I took that to mean that supporting generic expressions was straight-forward, but supporting a limited set would be complex in some way. Apparently I was wrong in that interpretation; in which case, I've no idea what that sentence was referring to.

--
Rowan Tommins
[IMSoP]

On Sun, Aug 25, 2024, at 18:21, Rowan Tommins [IMSoP] wrote:

On 25/08/2024 16:54, Rob Landers wrote:

Hi Rowan, you went through a lot of trouble to write this out, and the reasoning makes sense to me. However, all the nonsensical things you say shouldn’t be allowed are already perfectly allowed today, you just have to type a bunch of boilerplate reflection code. There is no new behavior here, just new syntax.

Firstly, your response to John was essentially "please give more details" [https://externals.io/message/125183#125214], and your response to me is "thanks for the details, but I’m not going to engage with them". That’s a bit frustrating.

Oh, my apologies! That wasn’t my intention! With John and yourself, I do agree with you. I’m just trying to understand the logic in limiting it. As in, “I intuitively feel the same way but I don’t know why but maybe you do.” Intuition sucks sometimes.

Secondly, I don’t think "it’s possible with half a dozen lines of reflection, so it’s fine for it to be a first-class feature of the language syntax" is a strong argument. The Reflection API is a bit like the Advanced Settings panel in a piece of software, it comes with a big “Proceed with Caution” warning. You only move something from that Advanced Settings panel to the main UI when it’s going to be commonly used, and generally safe to use. I don’t think allowing arbitrary operations on a value that’s declared as the default of some other function passes that test.

Regards,

Rowan Tommins
[IMSoP]

That makes sense, but is it uncommon because it is hard and slow, or because it is genuinely not a common need?

— Rob

On 25/08/2024 18:44, John Bafford wrote:

Although I'm not sold on the idea of using default as part of an
expression, I would argue that a default function parameter value is
fair game to be read and manipulated by callers. If the default value
was intended to be private, it shouldn't be in the function declaration.

There's an easy argument against this interpretation: child classes can freely change the default value for a parameter, as long as they do not make it mandatory. https://3v4l.org/SEsRm

That matches my intuition: that the public API, as a contract, states that the parameter is optional; the specification of what happens when it is not provided is an implementation detail.
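
A small sketch of the point about child classes, assuming hypothetical classes `Base` and `Child` and a hypothetical caller `welcome()`. It is not a copy of the linked snippet, only an illustration of the same rule: an override may change a default as long as the parameter stays optional.

```php
<?php
class Base
{
    public function greet(string $name = 'world'): string
    {
        return "Hello, $name";
    }
}

class Child extends Base
{
    // Allowed: the parameter stays optional, only the default value changes.
    public function greet(string $name = 'everyone'): string
    {
        return parent::greet($name);
    }
}

function welcome(Base $greeter): string
{
    // The caller only knows the static type Base, not which default applies.
    return $greeter->greet();
}

echo welcome(new Base()), PHP_EOL;  // Hello, world
echo welcome(new Child()), PHP_EOL; // Hello, everyone
```

The same call site sees two different defaults depending on the runtime class, which is the sense in which the default behaves like an implementation detail here.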

For comparison, consider constructor property promotion; the caller shouldn't know or care whether a class is defined as:

public function __construct(private int $bar) {}

or:

private int $my_bar;
public function __construct(int $bar) { $this->my_bar = $bar; }

The syntax sits in the function signature because it's convenient, not because it's part of the API.

One important case where reading the default value could be important is
  in interoperability with different library versions. For example, a
library might change a default parameter value between versions. If
you're using the library, and want to support both versions, you might
both not want to set the value, and yet also care what the default value
  is from the standpoint of knowing what to expect out of the function.

This seems contradictory to me. If you use the default, you're telling the library that you don't care about that parameter, and trust it to provide a default.

If you want to know what the library did with its arguments, reflecting the signature will never be enough anyway. For example, it's quite common to write code like this:

function foo(?SomethingInterface $blah = null) {
    if ( $blah === null ) {
        $blah = self::_setup_default_blah();
    }
    // ...
}

A caller can't tell by looking at the signature that a new version of the library has changed what _setup_default_blah() returns. If the library doesn't provide an API to get $blah out later, then it's a private detail that the caller has no business inspecting.
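
To make that concrete, here is a sketch in which reflection on the signature reports only `null`, while the effective default is chosen inside the method body. `Service`, `DefaultSomething` and `setupDefaultBlah()` are hypothetical stand-ins for the names in the example above.

```php
<?php
interface SomethingInterface {}

final class DefaultSomething implements SomethingInterface {}

class Service
{
    public function foo(?SomethingInterface $blah = null): SomethingInterface
    {
        // The effective default is decided here, not in the signature.
        return $blah ?? self::setupDefaultBlah();
    }

    private static function setupDefaultBlah(): SomethingInterface
    {
        return new DefaultSomething();
    }
}

// Reflection only sees what the signature declares.
$param = (new ReflectionMethod(Service::class, 'foo'))->getParameters()[0];
var_dump($param->getDefaultValue()); // NULL

// The object actually used when the argument is omitted.
echo get_class((new Service())->foo()), PHP_EOL; // DefaultSomething
```

A new library version could return a different class from `setupDefaultBlah()` without the reflected default changing at all.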

Regards,

--
Rowan Tommins
[IMSoP]

On Sun, Aug 25, 2024, at 20:46, Rowan Tommins [IMSoP] wrote:

On 25/08/2024 18:44, John Bafford wrote:

Although I'm not sold on the idea of using default as part of an 
expression, I would argue that a default function parameter value is 
fair game to be read and manipulated by callers. If the default value 
was intended to be private, it shouldn't be in the function declaration.

There’s an easy argument against this interpretation: child classes can freely change the default value for a parameter, as long as they do not make it mandatory. https://3v4l.org/SEsRm

That matches my intuition: that the public API, as a contract, states that the parameter is optional; the specification of what happens when it is not provided is an implementation detail.

For comparison, consider constructor property promotion; the caller shouldn’t know or care whether a class is defined as:

public function __construct(private int $bar) {}

or:

private int $my_bar;

public function __construct(int $bar) { $this->my_bar = $bar; }

The syntax sits in the function signature because it’s convenient, not because it’s part of the API.

One important case where reading the default value could be important is
 in interoperability with different library versions. For example, a 
library might change a default parameter value between versions. If 
you're using the library, and want to support both versions, you might 
both not want to set the value, and yet also care what the default value
 is from the standpoint of knowing what to expect out of the function.

This seems contradictory to me. If you use the default, you’re telling the library that you don’t care about that parameter, and trust it to provide a default.

If you want to know what the library did with its arguments, reflecting the signature will never be enough anyway. For example, it’s quite common to write code like this:

function foo(?SomethingInterface $blah = null) {
    if ( $blah === null ) {
        $blah = self::_setup_default_blah();
    }
    // …
}

A caller can’t tell by looking at the signature that a new version of the library has changed what _setup_default_blah() returns. If the library doesn’t provide an API to get $blah out later, then it’s a private detail that the caller has no business inspecting.

Regards,

-- 
Rowan Tommins
[IMSoP]

I think you’ve hit an interesting point here, but probably not what you intended.

For example, let’s consider this function:

json_encode(mixed $value, int $flags = 0, int $depth = 512): string|false

Already, you have to look up the default value of depth or set it to something that makes sense, as well as $flags. So you do this:

json_encode($value, JSON_THROW_ON_ERROR, 512);

You are doing this even when you omit the default. If you set it to a variable to spell it out:

$default_flags = 0 | JSON_THROW_ON_ERROR;

$default_depth = 512; // according to docs on DATE

json_encode($value, $default_flags, $default_depth);

Can now be rewritten:

json_encode($value, $default_flags = default | JSON_THROW_ON_ERROR, $default_depth = default);

This isn’t just reflection, this is saving me from having to look up the docs/implementation and hardcode values. The implementation is free to change them, and my code will “just work.”

Now, let’s look at a more non-trivial case from some real-life use-cases, in the form of a plausible story:


public function __construct(
    private LoggerInterface|null $logger = null,
    private string|null $name = null,
    Level|null $level = null,
)

This code constructs a new logger composed from an already existing logger. When constructing it, I may look up what the default values are and decide if I want to override them or not. Otherwise, I will leave it as null.

A coworker and I got to talking about this interface. It kind of sucks, and we don’t like it. It’s been around for ages, so we are worried about changing it. Specifically, we are wondering if we should use SuperNullLogger as the default instead of null (which happens to just create a NullLogger a few lines later). We are pretty sure making this change won’t cause any issues, but to be extra safe, we will do it only on a single code path; further, we are 100% sure we are going to change this signature, so we need to do it in a forward-compatible way. Thus, we will set it to SuperNullLogger if-and-only-if the default value is null:

default ?? new SuperNullLogger()

Now, we can run this in production and see how well it performs. Incidentally, we discover that NullLogger implementation is superior and we can now change the default:


public function __construct(
    private LoggerInterface $logger = new NullLogger(),
    private string|null $name = null,
    Level|null $level = null,
)

That one code path “magically” updates as soon as the library is updated, without having to make further changes. Anything that is hardcoded “null” will break in tests/static analysis, making it easy to locate. Further, we can test other types of NullLoggers just as easily:

default instanceof NullLogger ? new BasicNullLogger() : default

So, yes, I think in isolation the feature might look strange, and some operations might look nonsensical, but I believe there is a use case here that was previously rather hard to do; or statically done via someone looking up some documentation/code and doing a search-and-replace.

— Rob

On 25/08/2024 17:05, Rowan Tommins [IMSoP] wrote:

On 25/08/2024 16:29, Bilge wrote:

You can write, `include(1 + 1);`, because `include()` accepts an expression. You will get: "Failed opening '2' for inclusion". Should we restrict that? No, because that's just how expressions work in any context where they're allowed.

I think a better comparison might be the "new in initializers" and "fetch property in const expressions" RFCs, which both forbid uses which would naturally be allowed by the grammar. The rationale in those cases was laid out in PHP: rfc:new_in_initializers and PHP: rfc:fetch_property_in_const_expressions

They do not seem like better comparisons because, in both cases, support for those respective features was limited due to technical obstructions. My implementation already permits `default` as a general expression (thanks to Bob's Bison patch), Q.E.D. there is no technical constraint precluding support for default as a general expression grammar. What you are proposing is an artificial limitation on the language, which is an entirely different proposition (and not a healthy one, in my view).

Allow me to address the point made in your previous email which ended up hinting that `default + 1` should also be prohibited because it hasn't been explicitly justified.

Notwithstanding I don't have the energy to justify every single permutation of expressions, I'll humour the arithmetic operators criticism with an example just to demonstrate that one can justify just about anything with sufficient enthusiasm and creativity.

Suppose we have a Suspension class that suspends the current process for a specified delay in milliseconds, but our subclass wants to present an interface that deals with whole seconds (including fractional seconds using floats).

class Suspension {
    /**
     * @param int $delay Specifies the delay in milliseconds.
     */
    public function suspend(int $delay = 1_000) {
        var_dump($delay);
    }
}

class MySuspension extends Suspension {
    /**
     * @param float|int|null $delay Specifies the delay in seconds.
     */
    public function suspend(float|int|null $delay = null) {
        parent::suspend((int)(($delay ?? 0) * 1000) ?: default);
    }
}

new MySuspension()->suspend(2.2345); // int(2234)

Not only have I demonstrated the need to use multiplication or division to change the scale, but also the need to cast.
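
For comparison, here is a rough sketch of how the same subclass might be written today, without `default` expressions, by reading the parent's declared default through the public reflection API. This is only an approximation of what the RFC proposes; the `$parentDefault` variable is an illustrative stand-in for `default`, and the methods return the delay instead of dumping it (an assumption made here so the result is easy to check).

```php
<?php
class Suspension {
    /** @param int $delay Specifies the delay in milliseconds. */
    public function suspend(int $delay = 1_000): int {
        return $delay;
    }
}

class MySuspension extends Suspension {
    /** @param float|int|null $delay Specifies the delay in seconds. */
    public function suspend(float|int|null $delay = null): int {
        // Stand-in for `default`: look up the parent's declared default value.
        $parentDefault = (new ReflectionMethod(parent::class, 'suspend'))
            ->getParameters()[0]
            ->getDefaultValue();

        // Scale seconds to milliseconds, cast to int, and fall back to the
        // parent's default when the result is falsy (null or zero delay).
        return parent::suspend((int)(($delay ?? 0) * 1000) ?: $parentDefault);
    }
}
```

Calling `(new MySuspension())->suspend(2.2345)` goes through the scaled branch, while calling it with no argument falls back to whatever the parent declares, so the subclass never hard-codes the parent's value.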

Cheers,
Bilge

On 25/08/2024 18:46, Rowan Tommins [IMSoP] wrote:

On 25/08/2024 18:30, Bilge wrote:

I apologise for coming on strong, but I put a lot of effort into this, so I take exception to the implication that anyone involved took the easy way out to arrive at this (our best) solution.

I apologise for the inadvertent offence.

That's OK, I can tell you're passionate about PHP and you're interested in having a constructive discussion about this RFC, so we have that in common :)

Cheers,
Bilge

On Aug 25, 2024, at 14:46, Rowan Tommins [IMSoP] <imsop.php@rwec.co.uk> wrote:

On 25/08/2024 18:44, John Bafford wrote:

Although I'm not sold on the idea of using default as part of an
expression, I would argue that a default function parameter value is
fair game to be read and manipulated by callers. If the default value
was intended to be private, it shouldn't be in the function declaration.

There's an easy argument against this interpretation: child classes can freely change the default value for a parameter, as long as they do not make it mandatory. Online PHP editor | output for SEsRm
That matches my intuition: that the public API, as a contract, states that the parameter is optional; the specification of what happens when it is not provided is an implementation detail.
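
The point about child classes can be sketched as follows. The class and method names are hypothetical, chosen only for illustration; the sketch assumes nothing beyond the ordinary rule that an override may change a default as long as the parameter stays optional.

```php
<?php
class Greeter {
    public function greet(string $name = 'world'): string {
        return "Hello, $name";
    }
}

class LoudGreeter extends Greeter {
    // Same signature, different default: still a valid override.
    public function greet(string $name = 'WORLD'): string {
        return parent::greet($name) . '!';
    }
}

// The caller is typed against Greeter and cannot know which default applies.
function welcome(Greeter $greeter): string {
    return $greeter->greet();
}
```

Here `welcome()` omits the argument in both cases, yet the value actually used depends on the runtime class of the object, not on the signature the caller was written against.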
For comparison, consider constructor property promotion; the caller shouldn't know or care whether a class is defined as:
public function __construct(private int $bar) {}
or:
private int $my_bar;
public function __construct(int $bar) { $this->my_bar = $bar; }
The syntax sits in the function signature because it's convenient, not because it's part of the API.

This is only by current convention. It used to be that parameter names were not part of the API contract, but now with named parameters, they are. There's no reason default values couldn't (or shouldn't) become part of the API contract in the same way.

(Note that in some other languages, default parameter values are not only part of the API contract, but they're emitted into the clients when compiled, so an API can change/add/remove its default values and the client continues to function as it used to with the value as defined at compile time. This doesn't currently matter for PHP, where you have the full source to anything you run, but could become important later if PHP gained ahead-of-time compiled binary modules.)

One important case where reading the default value could be important is
in interoperability with different library versions. For example, a
library might change a default parameter value between versions. If
you're using the library, and want to support both versions, you might
both not want to set the value, and yet also care what the default value
is from the standpoint of knowing what to expect out of the function.

This seems contradictory to me. If you use the default, you're telling the library that you don't care about that parameter, and trust it to provide a default.
If you want to know what the library did with its arguments, reflecting the signature will never be enough anyway. For example, it's quite common to write code like this:
function foo(?SomethingInterface $blah = null) {
    if ( $blah === null ) {
        $blah = self::_setup_default_blah();
    }
    // ...
}
A caller can't tell by looking at the signature that a new version of the library has changed what _setup_default_blah() returns. If the library doesn't provide an API to get $blah out later, then it's a private detail that the caller has no business inspecting.
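
A minimal sketch of that limitation, filling in the pattern above with hypothetical names (`Library`, `DefaultBlah`, `setupDefaultBlah`): reflecting the signature only ever yields the `null` placeholder, never the object the function actually falls back to.

```php
<?php
interface SomethingInterface {}

final class DefaultBlah implements SomethingInterface {}

final class Library {
    public static function foo(?SomethingInterface $blah = null): SomethingInterface {
        if ($blah === null) {
            // The effective default is decided here, out of sight of the signature.
            $blah = self::setupDefaultBlah();
        }
        return $blah;
    }

    private static function setupDefaultBlah(): SomethingInterface {
        return new DefaultBlah();
    }
}

// What a caller can learn from the signature alone.
$declared = (new ReflectionMethod(Library::class, 'foo'))
    ->getParameters()[0]
    ->getDefaultValue();

// What the function actually uses when the argument is omitted.
$effective = Library::foo();
```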

Well, but that's the private default example I described. In that case you're not intended to be able to reason about what the default is, because it's a private implementation detail of the function, as opposed to being expressed in the parameter list. Although, if it weren't intended to be an implementation detail, the only thing stopping you from writing in the parameter list like this:

  function foo(SomethingInterface $blah = self::_setup_default_blah()) {...}

is that PHP doesn't currently allow default values to be computed at runtime. (Maybe it should.)

-John

    public function suspend(float|int|null $delay = null) {
        parent::suspend((int)(($delay ?? 0) * 1000) ?: default);
    }
}

new MySuspension()->suspend(2.2345); // int(2234)

Not only have I demonstrated the need to use multiplication or division
to change the scale, but also the need to cast.

I appreciate what you’re saying here.

I’ve been struggling a little bit to really nail my language here on what I think should and shouldn’t be allowed. Essentially, what I’m trying to say (and I think others are too) is this:

The engine should not allow the use of default in an expression that doesn’t ultimately evaluate to default*.

In the above example, the left-hand side of the ?: operator doesn’t use default, so its evaluation is whatever its evaluation is. The right-hand side of the ?: operator DOES use default, and thus it must ultimately evaluate to default, or that would be an error. Another example:

parent::foo((default >= 10) ? default : 10)

Would be permitted because, although the left-hand side uses default, the conditional evaluates to default in the true case, which is the case where default is used. Likewise, the right-hand side is just 10 and irrelevant.

This would not be permitted

parent::foo((default >= 10) ? (default + 1) : 10)

Because now the ultimate evaluation of default is default + 1 – not default

*The exception to the rule I’ve described above would be if, and only if, the expression that default appears in uses only specific allowed operators, such as a subset of the bitwise operators.

I very much appreciate that what is being described here is a significant effort to achieve, I’m not even sure it’s reasonably possible… but I just can’t get behind the idea that (default)->foobar() is a valid expression in this context or a good idea for the language. The use of this proposed default keyword must have guardrails IMO. I think my definition above is a pretty reasonable attempt at capturing where I think the line is here and hopefully that helps guide this discussion.

John

On Sun, Aug 25, 2024, at 10:29 AM, Bilge wrote:

On 25/08/2024 14:35, Larry Garfield wrote:

The approach here seems reasonable overall. The mental model I have from the RFC is "yoink the default value out of the function, drop it into this expression embedded in the function call, and let the chips fall where they may." Is that about accurate?

Yes, as it happens. That is the approach we took, because the alternative would have been changing how values are sent to functions, which would have required a lot more changes to the engine with no clear benefit. Internally it literally calls the reflection API, but a low-level call, that elides the class instantiation and unnecessary hoops of the public interface that would just slow it down.

My main holdup is the need. I... can't recall ever having a situation where this is something I needed. Some of the examples show valid use cases (eg, the "default plus this binary flag" example), but again, I've never actually run into that myself in practice.

That's fine. Not everyone will have such a need, and of those that do, I'm willing to bet it will be rare or uncommon at best. But for those times it is needed, the frequency by which it is needed in no way diminishes its usefulness. I rarely use `goto` but that doesn't mean we shouldn't have the feature.

My other concern is the list of supported expression types. I understand how the implementation would naturally make all of those syntactically valid, but it seems many of them, if not most, are semantically nonsensical. Eg, `default > 1` would take a presumably numeric default value and output a boolean, which should really never be type compatible with the function being called. (A param type of int|bool is a code smell at best, and a fatal waiting to happen at worst.) In practice, I think a majority of those expressions would be logically nonsensical, so I wonder if it would be better to only allow a few reasonable ones and block the others, to keep people from thinking nonsensical code would do something useful.
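
To make the two cases above concrete, here is a small sketch that emulates `default` in today's PHP with the public reflection API (the RFC describes a lower-level internal call; this is only an approximation). The `Renderer` class and its flags are hypothetical, and `$default` is an illustrative stand-in for the proposed keyword.

```php
<?php
// Hypothetical renderer; names and flags are illustrative only.
final class Renderer {
    public const TRIM  = 1;
    public const UPPER = 2;
    public const WRAP  = 4;

    public static function render(string $text, int $flags = self::TRIM | self::WRAP): string {
        if ($flags & self::TRIM)  { $text = trim($text); }
        if ($flags & self::UPPER) { $text = strtoupper($text); }
        if ($flags & self::WRAP)  { $text = "[$text]"; }
        return $text;
    }
}

// Stand-in for `default`: read the declared default via reflection.
$default = (new ReflectionMethod(Renderer::class, 'render'))
    ->getParameters()[1]
    ->getDefaultValue();

// "Default plus this binary flag": the result is still an int, like the default.
$withUpper = Renderer::render(' hi ', $default | Renderer::UPPER);

// A comparison turns the int default into a bool, which no longer
// matches the declared parameter type.
$compared = $default > 1;
```

The bitwise case keeps the value within the parameter's own type, whereas the comparison changes its type entirely, which is the distinction the concern above turns on.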

Since you're not the only one raising this, I will address it, but just
to say there is no good reason, in my mind, to ever prohibit the
expressiveness. To quote Rob:

I'm reasonably certain you can write nonsensical PHP without this feature. I don't think we should be the nanny of developers.

See, I approach it from an entirely different philosophical perspective:

To the extent possible, the language and compiler should prevent you from doing stupid things, or at least make doing stupid things harder.

This is the design philosophy behind, well, most good user interfaces. It's why it's good that US and EU power outlets are different, because they run different voltages, and blindly plugging one into the other can cause damage or death.

This is the design philosophy behind all type systems: Make illogical or dangerous or "we know it can't work" code paths a compile error, or even impossible to express at all.

This is the design philosophy behind password_hash() and friends: The easy behavior is, 99% of the time, the right one, so doing the "right thing" is easy. Doing something dumb (like explicitly setting password_hash() to use md5 or something) may be possible, but it requires extra work to be dumb.

Good design makes the easy path the safe path.

Now, certainly, the definition of "stupid things" is subjective and squishy, and reasonable people can disagree on where that threshold is. That's what a robust discussion is for, to figure out what qualifies as a "stupid thing" in this case.

Rob has shown some possible, hypothetical uses for some of the seemingly silly possible combinations, which may or may not carry weight with people. But there are others that are still unjustified, so for now, I would still put "default != 5" into the "stupid things" category, for example.

As you've noted, this is already applicable only in some edge cases to begin with, so enabling edge cases of edge cases that only maybe make sense if you squint is very likely in the "stupid things" territory.

I fully agree with that sentiment. It seems to be biting me that I went
to the trouble of listing out every permutation of what *expression*
means, where perhaps this criticism would not have been levelled at all
had I chosen not to do so.

From one RFC author to another, it's better to make that list explicitly and let us collectively think through the logic of it than to be light on details and not realize what will break until later. We've had RFCs that did that, and it caused problems. The discussion can absolutely be frustrating (boy do I know), but the language is better for it. So I'm glad you did call it out so we could have this discussion.

--Larry Garfield