[PHP-DEV] Consensus on argument validation for built-in functions

Hello everyone,

I’d like to align on the approach to validating arguments for built-in functions (usually for flag inputs). Some ongoing discussions in PRs:

In some cases, changes introduced ValueError immediately in the next version, without a deprecation phase. To ensure a consistent approach, I propose the following:

  1. Introduce a deprecation notice in the next minor version.
  2. Raise a ValueError in the following minor version.

If needed, I can create RFC, but as described by a few people in the discussions, we can avoid it having the consensus. What do you think?

Kind regards,

Jorg Sowa

My vote is on straight to ValueError without RFC. Deprecations should
only be limited to meaningful changes in functionality, not fixing
bugs or undefined behaviour. Lack of argument validation is not a
behaviour users would have relied on in past.

On Monday, 10 March 2025 at 23:07, Jorg Sowa <jorg.sowa@gmail.com> wrote:

Hello everyone,

I’d like to align on the approach to validating arguments for built-in functions (usually for flag inputs). Some ongoing discussions in PRs:
- ext/standard: validate mode in array_filter() by jorgsowa · Pull Request #15647 · php/php-src · GitHub
- array_change_key_case(): Throw ValueError on invalid argument by Girgias · Pull Request #15883 · php/php-src · GitHub
- ext/standard: pathinfo() check flags argument validity. by devnexen · Pull Request #17859 · php/php-src · GitHub

In some cases, changes introduced ValueError immediately in the next version, without a deprecation phase. To ensure a consistent approach, I propose the following:

1. Introduce a deprecation notice in the next minor version.
2. Raise a ValueError in the following minor version.

If needed, I can create RFC, but as described by a few people in the discussions, we can avoid it having the consensus. What do you think?

My opinion on this topic is well known, validation errors should go straight to a ValueError.
And if not a ValueError then it should be an E_WARNING.
Raising a deprecation is frankly non-sensical.
Validation errors exist to inform the person programing that there is an error with the call site that is preventable very easily.
If we start emitting E_DEPRECATED instead of E_WARNING/throwing a ValueError, we are incentivising people to add throwing error handlers for E_DEPRECATED,
something that I (and I know others too) lament constantly, as a deprecation is not an error.

Best regards,

Gina P. Banyard

Am 11.03.2025 um 07:32 schrieb Gina P. Banyard <internals@gpb.moe>:

On Monday, 10 March 2025 at 23:07, Jorg Sowa <jorg.sowa@gmail.com> wrote:

I’d like to align on the approach to validating arguments for built-in functions (usually for flag inputs). Some ongoing discussions in PRs:
- ext/standard: validate mode in array_filter() by jorgsowa · Pull Request #15647 · php/php-src · GitHub

My opinion on this topic is well known, validation errors should go straight to a ValueError.
And if not a ValueError then it should be an E_WARNING.
Raising a deprecation is frankly non-sensical.

I'll sound a bit like a broken record :slight_smile:

The very first example shows why we should have an E_WARNING phase to smoothen BC changes:
Code like
  $a = array_filter(["a", "", "b"], strlen(...), 666);
which currently ignores the bogus value and allows the code to continue would suddenly stop with en Exception.

Fixing the code is trivial and a warning helps identify those code locations but decoupling the moment of upgrading PHP from having to fix all code is IMHO still valuable. Especially when using third-party packages which might have a delay in fixing it.
I'm still waiting on some packages to fix new warnings for PHP 8.4 even though we've upgraded to the new version already.

Conclusion:
Yes, I'd very much like the 1) warning 2) ValueError approach from the original mail to be the default.

Regards,
- Chris

On Tue, Mar 11, 2025, at 10:07, Christian Schneider wrote:

Am 11.03.2025 um 07:32 schrieb Gina P. Banyard <internals@gpb.moe>:

On Monday, 10 March 2025 at 23:07, Jorg Sowa <jorg.sowa@gmail.com> wrote:

I’d like to align on the approach to validating arguments for built-in functions (usually for flag inputs). Some ongoing discussions in PRs:

My opinion on this topic is well known, validation errors should go straight to a ValueError.

And if not a ValueError then it should be an E_WARNING.

Raising a deprecation is frankly non-sensical.

I’ll sound a bit like a broken record :slight_smile:

The very first example shows why we should have an E_WARNING phase to smoothen BC changes:

Code like

$a = array_filter([“a”, “”, “b”], strlen(…), 666);

which currently ignores the bogus value and allows the code to continue would suddenly stop with en Exception.

Fixing the code is trivial and a warning helps identify those code locations but decoupling the moment of upgrading PHP from having to fix all code is IMHO still valuable. Especially when using third-party packages which might have a delay in fixing it.

I’m still waiting on some packages to fix new warnings for PHP 8.4 even though we’ve upgraded to the new version already.

Conclusion:

Yes, I’d very much like the 1) warning 2) ValueError approach from the original mail to be the default.

Regards,

  • Chris

Well, when you make it an exception, it usually gets upgraded/fixed faster :slight_smile:

— Rob

On 11 March 2025 10:20:52 GMT, Rob Landers <rob@bottled.codes> wrote:

Well, when you make it an exception, it usually gets upgraded/fixed faster :slight_smile:

Not necessarily. When people see long lists of breaking changes in a release, they may just put off upgrading / marking the library compatible.

I think we should be very wary of how far we bend the difference between "minor" and "major" releases.

For these changes, I'd like to hear the argument *against* starting with a Warning. Is there any significant burden to waiting until 9.0 for these to become errors?

Rowan Tommins
[IMSoP]

On 11/03/2025 06:32, Gina P. Banyard wrote:

And if not a ValueError then it should be an E_WARNING.
Raising a deprecation is frankly non-sensical.
Validation errors exist to inform the person programing that there is an error with the call site that is preventable very easily.
If we start emitting E_DEPRECATED instead of E_WARNING/throwing a ValueError, we are incentivising people to add throwing error handlers for E_DEPRECATED,
something that I (and I know others too) lament constantly, as a deprecation is not an error.

I dedicated a whole section in a previous RFC to making essentially this same argument: PHP: rfc:deprecate-bareword-strings

The underlying problem is that "deprecation" isn't really a "severity", it's more a "category" of messages. It should maybe be a separate flag combined with the severity, so that E_DEPRECATION_NOTICE=E_NOTICE|E_DEPRECATED, and E_DEPRECATION_WARNING=E_WARNING|E_DEPRECATED

That way we could signal the difference between "this is bad, but the behaviour will remain well-defined" (e.g. file I/O warnings, which would require a completely new API to eliminate) and "this is bad, and there are concrete plans to change it" (e.g. it will become an error, or start doing something different, in the next major version).

--
Rowan Tommins
[IMSoP]

On Tuesday, 11 March 2025 at 11:12, Rowan Tommins [IMSoP] <imsop.php@rwec.co.uk> wrote:

On 11 March 2025 10:20:52 GMT, Rob Landers rob@bottled.codes wrote:

> Well, when you make it an exception, it usually gets upgraded/fixed faster :slight_smile:

Not necessarily. When people see long lists of breaking changes in a release, they may just put off upgrading / marking the library compatible.

I think we should be very wary of how far we bend the difference between "minor" and "major" releases.

For these changes, I'd like to hear the argument against starting with a Warning. Is there any significant burden to waiting until 9.0 for these to become errors?

Rowan Tommins
[IMSoP]

Consistency.
Just staring at ext/pcntl and the git blame you can see that some functions validate the signal number (or flags) since PHP 8.0, others do not or only recently (commit from November 2023).
The reason being that the code was inconsistent in when it would actually warn in the first place.

The other one is data loss concerns, many functions that forward strings to C APIs don't check that they do not contain nul bytes, and thus the C API receives a truncated string.
Or C API require a C int, which is 32bit compared to a PHP int which can be 64 bit, so truncation may happen for large negative/positive values.

It also means that we need to do *multiple* passes, on the same code path, resulting in somewhat of a code churn and possibly not using a common abstraction.
Considering that we didn't even have the time to properly remove some deprecations from PHP 7 with the PHP 8.0.0 release, nor convert all relevant E_WARNINGs to Value/Type errors, I expect that we would be missing some of them again when PHP 9 comes around.
(and this is ignoring the fact that PHP does not follow semver, and I'm starting to really believe that any "semver-like" versioning system just does not work for a PL, especially not one like PHP which has an insanely vast API surface.)
Moreover, changing a silent error condition to throwing a ValueError on a programming error that really never should be happening in the first place is not a disruptive BC break IMHO.

And I will be repeating what I've been saying again and again, none of these issues would exist if bundled extensions did not exist.
The fact that these extensions are tied to the "core PHP language" release cadence is.... incredibly annoying for everyone, from maintainers to users.
Taking the example of ext/pcntl again, if it were a standalone extension, having it follow semver is a way more reasonable proposition.
Because we could just release 2 versions the same day, a x.y+1.0 introducing a warning, and a x+1.0.0 which would convert them into proper errors.
Meaning as a user, you could be running whatever PHP version and have the stricter behaviour, or upgrade PHP while still holding the ext/pcntl version behind while you deal with the issues.

(Bonus points, the issue where PIE is currently unable to install bundled extension [1] would be solved with unbundling too.)

Best regards,

Gina P. Banyard

[1] Make core bundled extensions available (PDO, curl, intl, etc.) · Issue #133 · php/pie · GitHub

Am 11.03.2025 um 12:54 schrieb Gina P. Banyard <internals@gpb.moe>:

Taking the example of ext/pcntl again, if it were a standalone extension, having it follow semver is a way more reasonable proposition.
Because we could just release 2 versions the same day, a x.y+1.0 introducing a warning, and a x+1.0.0 which would convert them into proper errors.
Meaning as a user, you could be running whatever PHP version and have the stricter behaviour, or upgrade PHP while still holding the ext/pcntl version behind while you deal with the issues.

I live in a world where a) people don't always control the PHP version they are using (hosted websites) and b) are using more and more packages which have to stay compatible for (e.g. security) updates to be applicable. My (sad) experience is that semver *does not* solve this, it just provides the concept of differentiating different types of changes and their respective compatibility assumptions.

To be honest the pcntl example sounds like worst of both worlds:
- Maintainers would have two (and later maybe even more) versions to maintain (for how long?). I'd rather have one version at a time and if I do not want to forget switching the warning to an exception at a specific point there are ample ways of doing this, e.g. a TODO/FIXME/whatever you want to call it marker somewhere which is automatically reported once the given criteria (time, version, ...) is met.
- Would this be the only difference between these two version? If Yes then why have them in the first place (if you want to be strict about warnings then just convert them to an exception, that's easy enough). If No then you open yet another can of worms, e.g. one package requiring x.y+1 and another package requiring x+1.0. Been there, hated it :slight_smile:

Regards,
- Chris

Am 11.03.2025 um 16:29 schrieb Christian Schneider <cschneid@cschneid.com>:

...(if you want to be strict about warnings then just convert them to an exception, that's easy enough)...

Reading it back I maybe wasn't clear: I mean if you as an application developer want to be strict then you can use an error handler to catch warnings and throw an exception yourself.

- Chris

On 11/03/2025 11:54, Gina P. Banyard wrote:

It also means that we need to do *multiple* passes, on the same code path, resulting in somewhat of a code churn and possibly not using a common abstraction.
Considering that we didn't even have the time to properly remove some deprecations from PHP 7 with the PHP 8.0.0 release, nor convert all relevant E_WARNINGs to Value/Type errors, I expect that we would be missing some of them again when PHP 9 comes around.

I wonder if in some cases, we could automate this within the source code - e.g. with a macro like WARNING_OR_ERROR(message, first_version_for_error) which checks a compile-time constant for the current version. I realise that still requires code cleanup later, but it means we can't miss our chance when the appropriate version does come around.

and this is ignoring the fact that PHP does not follow semver

This is true, but only in the sense that something under the MIT license isn't under the BSD license. Officially, PHP's release policy is extremely similar to SemVer.

https://semver.org/ says:

> Major version X (X.y.z | X > 0) MUST be incremented if any backward incompatible changes are introduced to the public API. It MAY also include minor and patch level changes

Meanwhile policies/release-process.rst at main · php/policies · GitHub says:

> Major Version Number
> x.y.z to x+1.0.0
> - Backward compatibility can be broken
> - API compatibility can be broken (internals and userland)

> Minor Version Number
> x.y.z to x.y+1.z
> - Backward compatibility must be kept
> - API compatibility must be kept (userland)

Now, what exactly constitutes "backward compatibility" and "userland API compatibility" can be debated, but there is currently no agreed policy allowing "small compatibility breaks".

The fact that everyone acts as though there is such a policy, probably means a rewrite of that document is overdue. A policy that nobody follows is no use to either contributors or users.

Maybe we could work on some criteria that could be applied (and publicised to users) about what is and isn't allowed in minor versions. For instance:

- Code that already causes fatal behaviour might cause different fatal behaviour (e.g. throwing an Error instead of raising an E_ERROR)
- Code that directly violates documented types might start throwing TypeError
- Code that produced NULL as an error case only for invalid inputs might start throwing ValueError

Some of the cases linked at the start of this thread might then meet the agreed criteria, but some might not.

For instance, it looks like pathinfo does give reliable outputs if the flag argument is treated as a bitmask, even if doing so makes no logical sense. It's not strictly the wrong type, and the behaviour isn't obviously attempting to indicate an error, or doing something immediately fatal.

So to allow that as well, we'd need some broader criteria, like:

- Code that is clearly an error on the part of the programmer might starting throwing Errors

But that's probably too broad and debatable to be meaningful policy.

Another alternative is to scrap x.y.z versioning completely, and give every annual release equal status. This is the approach that PostgreSQL has taken, I believe.

We'd probably still want some kind of deprecation policy - some changes should be deprecated for X releases before removal/change. Which brings us back to some kind of criteria for which changes need that, so doesn't really solve the problem.

--
Rowan Tommins
[IMSoP]

On Tuesday, 11 March 2025 at 21:17, Rowan Tommins [IMSoP] <imsop.php@rwec.co.uk> wrote:

On 11/03/2025 11:54, Gina P. Banyard wrote:

> It also means that we need to do multiple passes, on the same code path, resulting in somewhat of a code churn and possibly not using a common abstraction.
> Considering that we didn't even have the time to properly remove some deprecations from PHP 7 with the PHP 8.0.0 release, nor convert all relevant E_WARNINGs to Value/Type errors, I expect that we would be missing some of them again when PHP 9 comes around.

I wonder if in some cases, we could automate this within the source code
- e.g. with a macro like WARNING_OR_ERROR(message,
first_version_for_error) which checks a compile-time constant for the
current version. I realise that still requires code cleanup later, but
it means we can't miss our chance when the appropriate version does come
around.

I don't think this is an improvement for the maintenance side.

> and this is ignoring the fact that PHP does not follow semver

This is true, but only in the sense that something under the MIT license
isn't under the BSD license. Officially, PHP's release policy is
extremely similar to SemVer.

[...]

Now, what exactly constitutes "backward compatibility" and "userland API
compatibility" can be debated, but there is currently no agreed policy
allowing "small compatibility breaks".

The fact that everyone acts as though there is such a policy, probably
means a rewrite of that document is overdue. A policy that nobody
follows is no use to either contributors or users.

Yes, rewriting this policy would be a good idea.
There have been BC breaks in every minor release going all the way back to PHP 4.

Maybe we could work on some criteria that could be applied (and
publicised to users) about what is and isn't allowed in minor versions.

[...]

We'd probably still want some kind of deprecation policy - some changes
should be deprecated for X releases before removal/change. Which brings
us back to some kind of criteria for which changes need that, so doesn't
really solve the problem.

Indeed, none of this really solves the problem.
I think it is *fair* that the Core language has more stringent requirements on BC than anything else.
The standard library being second in line for this sort of considerations.
But every other bundled extension should be allowed to have way more leeway in what said extension maintainer wants to do.

Making a distinction between Type and Value errors is meaningless to me,
if PHP had a stronger type system, e.g. dependent types,
then, a lot of things we currently consider ValueErrors would be TypeErrors.

If we were also designing some of these functions from scratch today, a lot of parameters wouldn't integers, but enums.
The only exception to this are bitmask parameters, as we don't have Enum Sets (or Intersection Types as values) which would replace those.

And again, ValueErrors are only ever added when it *easy* to check if the condition is satisfied, and it very clearly points to a programming error.

I still find it baffling that telling someone that the code they wrote, even if it is decades old, is incorrect is somehow bad.
Because the alternative is letting users have bugs in their software that they can ignore.

Best regards,

Gina P. Banyard

Hi

Am 2025-03-20 17:00, schrieb Gina P. Banyard:

And again, ValueErrors are only ever added when it *easy* to check if the condition is satisfied, and it very clearly points to a programming error.

I still find it baffling that telling someone that the code they wrote, even if it is decades old, is incorrect is somehow bad.
Because the alternative is letting users have bugs in their software that they can ignore.

I agree with that (and Kamil, who said the same thing). Passing an undocumented value that does *something* is a clear programmer error that would also break when adding new values, which is generally considered a backwards compatible change. Pointing out this error as an actual error before it causes a silent behavioral change is a good thing.

The `round()` function would be a good example. PHP 8.4 both added a validation of the `$mode` parameter if an integer value is given and also added new rounding modes (just as an enum in the gold release, though). Before PHP 8.4 it was possible to use `round($value, 5)` by accident and it would be interpreted as `PHP_ROUND_HALF_UP`. Now if we would've added a new `const PHP_ROUND_WHATEVER = 5;` constant this code would silently have changed its behavior. As a user I certainly would be interested in finding out about this mistake. A clear ValueError is easy to fix in a backwards compatible way, but incorrectly rounded values can remain undetected for so long that it becomes impossible to fix them, since the stored data might already be poisoned, including all backups.

Best regards
Tim Düsterhus

Hi,

On Tue, Mar 11, 2025 at 12:09 AM Jorg Sowa <jorg.sowa@gmail.com> wrote:

Hello everyone,

I’d like to align on the approach to validating arguments for built-in functions (usually for flag inputs). Some ongoing discussions in PRs:

In some cases, changes introduced ValueError immediately in the next version, without a deprecation phase. To ensure a consistent approach, I propose the following:

  1. Introduce a deprecation notice in the next minor version.
  2. Raise a ValueError in the following minor version.

If needed, I can create RFC, but as described by a few people in the discussions, we can avoid it having the consensus. What do you think?

I would personally prefer also go through the deprecation first to be on the safe side.

This thread clearly shows that there is no consensus so I think the only way forward would be to create a policy RFC to make decision about this approach. Until then no PR introducing exception, deprecation or just plain warning for this sort of things should be merged.

Kind regards

Jakub

This thread clearly shows that there is no consensus so I think the only way forward would be to create a policy RFC to make decision about this approach. Until then no PR introducing exception, deprecation or just plain warning for this sort of things should be merged.

In terms of RFC, in this case we need some decision so it might make sense to do something like primary vote just to be an approval to make any changes (e.g. “Add clarification to policy about argument validation”) and then just add secondary vote as a choice between deprecation or straight going to value error.

Thank you all for inputs. There is no consensus at all, so we must go through RFC process. I will sum up feedback and I will create the RFC this weekend.

Kind regards,

Jorg