[PHP-DEV] [RFC] Throwable Hierarchy Policy for Extensions

Tim_Dusterhus · April 27, 2025, 8:12pm

Hi

as announced in the URI RFC discussion thread ([RFC] [Discussion] Add WHATWG compliant URL parsing API - Externals), I've now written up an “Exception Hierarchy” policy RFC together with Gina.

Please find the following links:

RFC: PHP: rfc:extension_exceptions
Policy PR: Add Throwable policy by TimWolla · Pull Request #17 · php/policies · GitHub

The RFC itself also contains additional references.

This message is intended to begin the official discussion period. Please do not comment on the PR itself, but reply to this discussion thread for proper visibility.

Best regards
Tim Düsterhus

Kamil_Tekiela · April 27, 2025, 8:40pm

The exception message MUST NOT be the only property that allows to
differentiate different types of error that the user may be interested in.

What does this mean exactly? Can you give an example?

Tim_Dusterhus · April 27, 2025, 8:57pm

Hi

On 4/27/25 22:40, Kamil Tekiela wrote:

The exception message MUST NOT be the only property that allows to
differentiate different types of error that the user may be interested in.

What does this mean exactly? Can you give an example?

This is intended to say “if a user calls ->getMessage() and consumes it in an `if()` statement, then you are doing it wrong” and “the error message is intended for human consumption and changes to the message are not considered breaking changes”.

Ideally different exception classes should be used, but using the `$code` property to differentiate between different types of error is also acceptable when there is a wide range of errors the user might be interested in. For PDO this might be:

     PdoException extends Exception
     PdoError extends Error
     UnsuccessfulQueryException extends PdoException
     QuerySyntaxError extends PdoError

And then in UnsuccessfulQueryException, use a different code for "Duplicate entry", "Query timed out", and "Deadlock" to avoid adding a separate class for each failure case. It might make sense to add classes for the failure cases that are most likely to require special handling. e.g. a DeadlockException to retry the transaction.

But as a user I should not need to do:

     again:
     try {
         $query->execute();
     } catch (UnsuccessfulQueryException $e) {
         if (str_contains($e->getMessage(), 'Timeout')) {
             goto again;
         }
         throw $e;
     }

to determine whether it was a timeout, a deadlock or a duplicate entry.
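
A minimal sketch of what the user-side code could look like instead, assuming the hypothetical hierarchy above. The `Example` namespace, the code constants and the `executeWithRetry()` helper are invented for illustration and are not an existing PDO API:

```php
<?php
declare(strict_types=1);

namespace Example;

// Hypothetical classes for illustration only; PDO does not define these.
class PdoException extends \Exception {}

class UnsuccessfulQueryException extends PdoException
{
    // Stable codes are part of the contract; the message text is not.
    public const DUPLICATE_ENTRY = 1;
    public const QUERY_TIMED_OUT = 2;
}

// A dedicated class for the case most likely to need special handling.
class DeadlockException extends UnsuccessfulQueryException {}

/**
 * Runs $query and retries on deadlocks and timeouts, without ever
 * inspecting the exception message.
 */
function executeWithRetry(callable $query, int $maxAttempts = 3): mixed
{
    for ($attempt = 1; ; $attempt++) {
        try {
            return $query();
        } catch (DeadlockException $e) {
            // Identified by class name.
            if ($attempt >= $maxAttempts) {
                throw $e;
            }
        } catch (UnsuccessfulQueryException $e) {
            // Identified by stable code; anything but a timeout is rethrown.
            if ($e->getCode() !== UnsuccessfulQueryException::QUERY_TIMED_OUT
                || $attempt >= $maxAttempts) {
                throw $e;
            }
        }
    }
}
```

The message can then be reworded freely without breaking callers, since only the class name and the code are consulted.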

Do you have a wording suggestion to make it clearer what is meant by that sentence?

Best regards
Tim Düsterhus

Rob_Landers · April 27, 2025, 8:57pm

On Sun, Apr 27, 2025, at 22:40, Kamil Tekiela wrote:

The exception message MUST NOT be the only property that allows to
differentiate different types of error that the user may be interested in.

What does this mean exactly? Can you give an example?

Many people don’t know about the $previous property, which allows you to chain exceptions, for example. I’ve worked in more than one codebase with custom exceptions that are missing that. It makes throwing from a catch block more nebulous, as you lose how you got there in the first place.
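
As a rough sketch of that chaining (the `ConfigLoadException` class and `loadConfig()` function are made up for this example), a custom exception only needs to forward `$previous` to the parent constructor:

```php
<?php
declare(strict_types=1);

namespace Example;

// Hypothetical custom exception. Forwarding $previous to the parent
// constructor is the part that is frequently forgotten.
class ConfigLoadException extends \Exception
{
    public function __construct(string $path, ?\Throwable $previous = null)
    {
        parent::__construct("Unable to load config from {$path}", 0, $previous);
    }
}

function loadConfig(string $json): mixed
{
    try {
        return json_decode($json, true, 512, JSON_THROW_ON_ERROR);
    } catch (\JsonException $e) {
        // Rethrow with more context, keeping the original as $previous.
        throw new ConfigLoadException('config.json', $e);
    }
}
```

A caller can then walk the chain with `getPrevious()` to see how the failure came about.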

— Rob

Kamil_Tekiela · April 27, 2025, 9:16pm

Do you have a wording suggestion to make it clearer what is meant by
that sentence?

How about this:
The exception message MUST NOT be the only means of distinguishing
exceptions. Any two exceptions with different messages MUST be
identifiable either by a unique exception class name or code.

Tim_Dusterhus · April 27, 2025, 9:45pm

Hi

On 4/27/25 23:16, Kamil Tekiela wrote:

The exception message MUST NOT be the only means of distinguishing
exceptions. Any two exceptions with different messages MUST be
identifiable either by a unique exception class name or code.

Thank you. I have used that as the basis for this change:

Add Throwable policy by TimWolla · Pull Request #17 · php/policies · GitHub

I've intentionally adjusted the wording to "different cause" rather than "different message" to avoid defining what constitutes a different message.

As an example, when the CSPRNG fails, we might want to emit different error messages depending on the CSPRNG driver (e.g. /dev/urandom vs the getrandom() syscalls), but it's not useful to distinguish these cases with a different code, since the user does not decide which driver is used and can't do anything useful with that information.

Even distinguishing between "/dev/urandom does not exist" and "/dev/urandom exists, but is a regular file instead of a character device" probably is only useful within the message itself, since neither is really recoverable from within PHP and allocating and documenting codes is likely work that helps no one.

Best regards
Tim Düsterhus

Crell · April 28, 2025, 8:04pm

On Sun, Apr 27, 2025, at 4:45 PM, Tim Düsterhus wrote:

Hi

On 4/27/25 23:16, Kamil Tekiela wrote:

The exception message MUST NOT be the only means of distinguishing
exceptions. Any two exceptions with different messages MUST be
identifiable either by a unique exception class name or code.

Thank you. I have used that as the basis for this change:

Add Throwable policy by TimWolla · Pull Request #17 · php/policies · GitHub

I've intentionally adjusted the wording to "different cause" rather than
"different message" to avoid defining what constitutes a different message.

As an example, when the CSPRNG fails, we might want to emit different
error messages depending on the CSPRNG driver (e.g. /dev/urandom vs the
getrandom() syscalls), but it's not useful to distinguish these cases
with a different code, since the user does not decide which driver is
used and can't do anything useful with that information.

Even distinguishing between "/dev/urandom does not exist" and
"/dev/urandom exists, but is a regular file instead of a character
device" probably is only useful within the message itself, since neither
is really recoverable from within PHP and allocating and documenting
codes is likely work that helps no one.

Best regards
Tim Düsterhus

Holy cow, thank you for this bit. The inability to tell what went wrong programmatically without string parsing the exception message is one of my biggest pet peeves in current exceptions.

A few other notes:

* Should the property be specified as public/readonly? Should it be conventional to have accessor methods? (IMO, property FTW, no need for a method. I already do this in all my exceptions.)

* "Non-base exceptions MAY define additional properties to provide additional metadata about the nature of the error." I am tempted to strengthen that to SHOULD, to help drive the point home. Maybe use a SHOULD, and at the end add "unless the nature and details of the error is fully defined by the exceptions' type."

* Would allowing an extension-tagging interface instead of a base class be an option? It still allows for catching "anything thrown by this extension", which I presume is the goal. If not, why?

--Larry Garfield

Tim_Dusterhus · April 28, 2025, 8:30pm

Hi

On 4/28/25 22:04, Larry Garfield wrote:

Holy cow, thank you for this bit. The inability to tell what went wrong programmatically without string parsing the exception message is one of my biggest pet peeves in current exceptions.

Anything particular from the standard library? It might be possible to improve this for existing extensions without creating an entirely new hierarchy and without an RFC.

* Should the property be specified as public/readonly? Should it be conventional to have accessor methods? (IMO, property FTW, no need for a method. I already do this in all my exceptions.)

I would not specify this and let authors make a choice here to determine what is appropriate. The URI RFC has `public readonly array $errors;` and I think that is appropriate in that case, but in other situations, a method might be more appropriate.
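
To illustrate the two options side by side, here is a sketch with hypothetical exception classes (neither is taken from the URI RFC; only the `public readonly array $errors` shape is borrowed from it):

```php
<?php
declare(strict_types=1);

namespace Example;

// Option 1: expose the metadata as a public readonly property (PHP 8.1+).
class InvalidInputException extends \Exception
{
    /** @param list<string> $errors */
    public function __construct(
        string $message,
        public readonly array $errors = [],
        ?\Throwable $previous = null,
    ) {
        parent::__construct($message, 0, $previous);
    }
}

// Option 2: keep the property private and expose an accessor method.
class LookupFailedException extends \Exception
{
    private string $host;

    public function __construct(string $host, ?\Throwable $previous = null)
    {
        parent::__construct("Lookup of {$host} failed", 0, $previous);
        $this->host = $host;
    }

    public function getHost(): string
    {
        return $this->host;
    }
}
```

Both variants are read-only from the outside; which one fits better depends on the extension.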

* "Non-base exceptions MAY define additional properties to provide additional metadata about the nature of the error." I am tempted to strengthen that to SHOULD, to help drive the point home. Maybe use a SHOULD, and at the end add "unless the nature and details of the error is fully defined by the exceptions' type."

I would not want to encourage authors to add additional properties “just in case” they might be useful by using a SHOULD phrasing. I also expect this to be something that can be resolved by simple agreement during the RFC discussion or review of the implementation.

I historically also had almost no cases where additional properties on an exception provided value for *programmatic consumption*. Most of the cases could be decided by class name alone and the message was sufficient to provide additional details for the human reader in the application logs (e.g. the exact nature of a DNS resolution error). Keep in mind that additional properties cannot be handled in a generic fashion, so they are useful for programmatic consumption when catching a specific exception class only.

* Would allowing an extension-tagging interface instead of a base class be an option? It still allows for catching "anything thrown by this extension", which I presume is the goal. If not, why?

See the “Choice of Base Exception” section in Add ext/random Exception hierarchy by TimWolla · Pull Request #9220 · php/php-src · GitHub.

Best regards
Tim Düsterhus

Crell · April 28, 2025, 9:09pm

On Mon, Apr 28, 2025, at 3:30 PM, Tim Düsterhus wrote:

Hi

On 4/28/25 22:04, Larry Garfield wrote:

Holy cow, thank you for this bit. The inability to tell what went wrong programmatically without string parsing the exception message is one of my biggest pet peeves in current exceptions.

Anything particular from the standard library? It might be possible to
improve this for existing extensions without creating an entirely new
hierarchy and without an RFC.

I was thinking the same thing. The main one that comes to mind is ArgumentCountError, where while doing some interesting meta-coding I had to do this:

github.com/Crell/AttributeUtils, src/Analyzer.php (master):

          /**
           * Throws a domain-specific exception based on an ArgumentCountError.
           *
           * This is absolutely hideous, but this is what happens when your throwable
           * puts all the useful information in the message text rather than as useful
           * properties or methods or something.
           *
           * Conclusion: Write better, more debuggable exceptions than PHP does.
           */
          protected function translateArgumentCountError(\ArgumentCountError $error): never
          {
              $message = $error->getMessage();
              // PHPStan doesn't understand this syntax style of sscanf(), so skip it.
              // @phpstan-ignore-next-line
              [$classAndMethod, $passedCount, $file, $line, $expectedCount] = sscanf(
                  string: $message,
                  format: "Too few arguments to function %s::%s, %d passed in %s on line %d and exactly %d expected"
              );
              [$className, $methodName] = \explode('::', $classAndMethod ?? '');

* "Non-base exceptions MAY define additional properties to provide additional metadata about the nature of the error." I am tempted to strengthen that to SHOULD, to help drive the point home. Maybe use a SHOULD, and at the end add "unless the nature and details of the error is fully defined by the exceptions' type."

I would not want to encourage authors to add additional properties “just
in case” they might be useful by using a SHOULD phrasing. I also expect
this to be something that can be resolved by simple agreement during the
RFC discussion or review of the implementation.

I historically also had almost no cases where additional properties on
an exception provided value for *programmatic consumption*. Most of the
cases could be decided by class name alone and the message was
sufficient to provide additional details for the human reader in the
application logs (e.g. the exact nature of a DNS resolution error). Keep
in mind that additional properties cannot be handled in a generic
fashion, so they are useful for programmatic consumption when catching a
specific exception class only.

* Would allowing an extension-tagging interface instead of a base class be an option? It still allows for catching "anything thrown by this extension", which I presume is the goal. If not, why?

See the “Choice of Base Exception” section in
Add ext/random Exception hierarchy by TimWolla · Pull Request #9220 · php/php-src · GitHub.

That seems to be about not having a common interface for both the Error and the Exception, which makes sense. I'm talking about `interface ExampleException {}` and `interface ExampleError {}`, instead of `class ExampleException extends Exception {}`, etc.

--Larry Garfield

Tim_Dusterhus · April 28, 2025, 9:27pm

Hi

On 4/28/25 23:09, Larry Garfield wrote:

* Would allowing an extension-tagging interface instead of a base class be an option? It still allows for catching "anything thrown by this extension", which I presume is the goal. If not, why?

See the “Choice of Base Exception” section in
Add ext/random Exception hierarchy by TimWolla · Pull Request #9220 · php/php-src · GitHub.

That seems to be about not having a common interface for both the Error and the Exception, which makes sense. I'm talking about `interface ExampleException {}` and `interface ExampleError {}`, instead of `class ExampleException extends Exception {}`, etc.

Besides not following the de facto standard (which is what this proposal is trying to codify), I'm also not sure what benefit an interface would have over a base exception for the problem we're trying to solve here. So I can return the “why (interface)?” question. It would just make it tempting to extend some SPL exception.

I see the value of using interfaces for exceptions when the functionality implements an interface that defines specific types of exception (e.g. PSR-18), but this is (literally) orthogonal to base exceptions that group exceptions by “library” [1].

Best regards
Tim Düsterhus

[1] Writing down these words, it would probably have made sense for ext/random to define a Random\EngineFailureExceptionInterface and specify that Random\Engine::generate() must throw that one, rather than directly throwing the Random\RandomException base exception - especially for userland engines. But on the other hand, an engine failure is not really programmatically recoverable anyway, so that's probably why I used the "simplification" back when I designed the hierarchy.

Crell · April 28, 2025, 10:36pm

On Mon, Apr 28, 2025, at 4:27 PM, Tim Düsterhus wrote:

Hi

On 4/28/25 23:09, Larry Garfield wrote:

* Would allowing an extension-tagging interface instead of a base class be an option? It still allows for catching "anything thrown by this extension", which I presume is the goal. If not, why?

See the “Choice of Base Exception” section in
Add ext/random Exception hierarchy by TimWolla · Pull Request #9220 · php/php-src · GitHub.

That seems to be about not having a common interface for both the Error and the Exception, which makes sense. I'm talking about `interface ExampleException {}` and `interface ExampleError {}`, instead of `class ExampleException extends Exception {}`, etc.

Besides not following the de facto standard (which is what this proposal
is trying to codify), I'm also not sure what benefit an interface would
have over a base exception for the problem we're trying to solve here?
So I can return the “why (interface)?”. It would just make it tempting
to extend some SPL exception

It's a common recommendation in userland, as it allows implementers to extend an existing exception (eg, InvalidArgumentException) of their choice while still being tagged as coming from a given library. Though I suppose if the policy doc also says to never do that, that becomes an irrelevant consideration.
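
For readers unfamiliar with that userland pattern, a small sketch (the "Acme" library and all names in it are hypothetical):

```php
<?php
declare(strict_types=1);

namespace Example;

// Marker interface for a hypothetical userland library "Acme". Extending
// Throwable means only Exception/Error subclasses can implement it.
interface AcmeException extends \Throwable {}

// Each class picks an SPL parent of its choice and is still tagged.
class InvalidWidgetName extends \InvalidArgumentException implements AcmeException {}
class WidgetStoreUnavailable extends \RuntimeException implements AcmeException {}

function createWidget(string $name): string
{
    if ($name === '') {
        throw new InvalidWidgetName('Widget name must not be empty');
    }

    return $name;
}

function tryCreate(string $name): ?string
{
    try {
        return createWidget($name);
    } catch (AcmeException $e) {
        // Catches anything tagged as coming from the library.
        return null;
    }
}
```

The same exception is also catchable as `\InvalidArgumentException`, which is exactly the property that becomes irrelevant if extending SPL exceptions is ruled out.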

--Larry Garfield

Tim_Dusterhus · April 28, 2025, 10:56pm

Hi

On 4/29/25 00:36, Larry Garfield wrote:

It's a common recommendation in userland, as it allows implementers to extend an existing exception (eg, InvalidArgumentException) of their choice while still being tagged as coming from a given library. Though I suppose if the policy doc also says to never do that, that becomes an irrelevant consideration.

Yes, the SPL exceptions are so awfully generic that there is no value in catching them, since they can refer to *anything*. And when you can't usefully catch them, then extending them doesn't make sense either. And a TypeError or ValueError is a clear programming error (failure to check preconditions), so it is also incorrect to catch them.

Best regards
Tim Düsterhus

Derick_Rethans · April 30, 2025, 11:18am

On Sun, 27 Apr 2025, Tim Düsterhus wrote:

Hi

as announced in the URI RFC discussion thread
([RFC] [Discussion] Add WHATWG compliant URL parsing API - Externals), I've now written up an
“Exception Hierarchy” policy RFC together with Gina.

Please find the following links:

RFC: PHP: rfc:extension_exceptions
Policy PR: Add Throwable policy by TimWolla · Pull Request #17 · php/policies · GitHub

The RFC itself also contains additional references.

This message is intended to begin the official discussion period. Please do
not comment on the PR itself, but reply to this discussion thread for proper
visibility.

- Exceptions MUST NOT be ``final``.

Could the RFC explain why not?

- The name of the extension SHOULD NOT be used as a prefix or suffix of
the unqualified class name of additional exceptions.

Could you add an example of how to do it instead (or a "not this" "but
that" example)?

- Any two exceptions with different causes MUST be identifiable either
  by a unique exception class name, a stable ``$code``, or a
  class-specific additional property suitable for programmatic
  consumption (e.g. an enum).

I would probably not even allow the stable ``$code`` in here, as I have
seen from experience people don't really check for them.

cheers,
Derick

Crell · April 30, 2025, 1:33pm

On Wed, Apr 30, 2025, at 6:18 AM, Derick Rethans wrote:

On Sun, 27 Apr 2025, Tim Düsterhus wrote:

- Any two exceptions with different causes MUST be identifiable either
  by a unique exception class name, a stable ``$code``, or a
  class-specific additional property suitable for programmatic
  consumption (e.g. an enum).

I would probably not even allow the stable ``$code`` in here, as I have
seen from experience people don't really check for them.

The only time I've seen anyone use $code is in TYPO3. Their coding standards say that any time you throw an exception, you use the current timestamp (determined manually) as a code. That way there is a globally unique code regardless of exception type that can be grepped to find the exact line it came from.

I am not saying this is a good strategy, just that it's the only time I've seen $code used in the wild...

--Larry Garfield

Tim_Dusterhus · April 30, 2025, 6:06pm

Hi

On 4/30/25 15:33, Larry Garfield wrote:

The only time I've seen anyone use $code is in TYPO3. Their coding standards say that any time you throw an exception, you use the current timestamp (determined manually) as a code. That way there is a globally unique code regardless of exception type that can be grepped to find the exact line it came from.

To my understanding this would result in effectively identical exceptions having different codes, just because checking the error condition is split across different `if()` statements for readability? That doesn't seem like a good idea - and that's why the RFC uses “cause” as the wording of choice.

[…] just that it's the only time I've seen $code used in the wild...

PDO (for better or worse) also uses the `$code` for the error code returned by the database. Unfortunately it also widens the (untyped) $code from int to string|int, which causes some issues, since folks only expect int, since Exception::__construct() types the `$code` parameter as `int`.
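
The gotcha can be reproduced without a database. The following sketch uses a hypothetical `DriverException` that mimics the behaviour by overwriting the inherited, untyped `$code` property with a string; it is not how PDO is implemented internally, only an illustration of the effect on callers:

```php
<?php
declare(strict_types=1);

namespace Example;

// Hypothetical exception that stores a string in the inherited $code.
class DriverException extends \Exception
{
    public function __construct(string $message, string $sqlState)
    {
        // Exception::__construct() only accepts an int code ...
        parent::__construct($message);
        // ... but the untyped property itself accepts any value.
        $this->code = $sqlState;
    }
}

function isIntegrityViolation(\Throwable $e): bool
{
    // getCode() may return a string here, so compare as strings
    // instead of assuming an int.
    return (string) $e->getCode() === '23000';
}
```

Code that does `$e->getCode() === 23000` or passes the code on to an `int` parameter under strict types will not behave as expected with such an exception.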

Best regards
Tim Düsterhus

Tim_Dusterhus · April 30, 2025, 6:53pm

Hi

On 4/30/25 13:18, Derick Rethans wrote:

- Exceptions MUST NOT be ``final``.

Could the RFC explain why not?

I'm not sure it is useful to add this to the RFC itself as an “extra explanation only”, since the discussion is an equally official resource:

The reason is to allow flexible extensions of the exception hierarchy, e.g. when adding a more specific type of exception to provide further context to some parent exception that just identifies some “concept”.

An HTTP request can fail (HttpRequestFailedException) for different reasons, e.g. due to a connection failure (ConnectionFailedException) or due to a server error (ServerErrorException). Now we might also want to clarify why the connection failed. It could be a timeout (ConnectionTimeoutException) or a DNS resolution error (DnsResolutionFailedException). This already requires several of the exceptions not to be final to allow extending them. Making the leaf exceptions final would not bring any value and would just cause additional churn when realizing that having more child exceptions would be helpful to the user.

Also, some extensions might intentionally want to allow subclassing for some classes. When subclassing a class, it also makes sense to be able to subclass the corresponding exceptions. I have made it legal to throw “unowned” exceptions when subclassing something in:

- The name of the extension SHOULD NOT be used as a prefix or suffix of
the unqualified class name of additional exceptions.

Could you add an example of how to do it instead (or a "not this" "but
that" example)?

I've expanded on this paragraph and added an example in:

Basically the intention is to avoid “oddly specific” class names that just concatenate some random words. Ideally the class name would be a succinct English phrase that matches what you would communicate to a co-worker. I would say “The HTTP request failed”, but not “The HTTP request that we perform using the network request library called curl failed”. Or I would say “The timezone is invalid” rather than “The timezone, which relates to the concept called ‘date’ (and not any other use of the term timezone) is invalid”.

How exactly that works in practice greatly depends on the extension, that's why it's just a SHOULD (NOT). I trust that folks make good choices when they have a reminder to make a good choice.

- Any two exceptions with different causes MUST be identifiable either
   by a unique exception class name, a stable ``$code``, or a
   class-specific additional property suitable for programmatic
   consumption (e.g. an enum).

I would probably not even allow the stable ``$code`` in here, as I have
seen from experience people don't really check for them.

My goal here is to avoid making exception messages part of the backwards compatibility promise. Whether or not the `$code` is useful in practice will be something that folks can figure out when writing an RFC. It probably greatly depends on the type of extension what makes sense. Perhaps it would also make sense to officially widen the code from `int` to `int|string|UnitEnum` to avoid the PDO gotcha. Since `__construct()` does not participate in LSP checks and since `getCode()` is already final, this seems safe to me.
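
As a sketch of the "class-specific additional property (e.g. an enum)" option, which sidesteps the type of `$code` entirely (all names are hypothetical, PHP 8.1+):

```php
<?php
declare(strict_types=1);

namespace Example;

// Hypothetical names throughout; only the pattern matters.
enum QueryFailureReason
{
    case DuplicateEntry;
    case Timeout;
    case Deadlock;
}

class QueryFailedException extends \Exception
{
    // This signature differs from Exception::__construct(), which is
    // allowed because constructors are exempt from LSP checks.
    public function __construct(
        public readonly QueryFailureReason $reason,
        string $message = '',
        ?\Throwable $previous = null,
    ) {
        parent::__construct($message, 0, $previous);
    }
}

function shouldRetry(QueryFailedException $e): bool
{
    // The decision depends on the enum only, never on the message.
    return match ($e->reason) {
        QueryFailureReason::Timeout, QueryFailureReason::Deadlock => true,
        QueryFailureReason::DuplicateEntry => false,
    };
}
```

Because `match` is exhaustive here, adding a new enum case without handling it throws an `\UnhandledMatchError` as soon as that case is passed in, rather than being silently ignored.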

Best regards
Tim Düsterhus

Crell · April 30, 2025, 8:12pm

On Wed, Apr 30, 2025, at 1:06 PM, Tim Düsterhus wrote:

Hi

On 4/30/25 15:33, Larry Garfield wrote:

The only time I've seen anyone use $code is in TYPO3. Their coding standards say that any time you throw an exception, you use the current timestamp (determined manually) as a code. That way there is a globally unique code regardless of exception type that can be grepped to find the exact line it came from.

To my understanding this would result in effectively identical
exceptions having different codes, just because checking the error
condition is split across different `if()` statements for readability?
That doesn't seem like a good idea - and that's why the RFC uses “cause”
as the wording of choice.

Correct. There's 400 `throw new InvalidArgumentException('...', 123456798)` calls across the code base, each with a unique code number timestamp.

I didn't care for this approach either when I worked at TYPO3. My point being that I've rarely if ever seen $code used in a constructive and useful fashion.

[…] just that it's the only time I've seen $code used in the wild...

PDO (for better or worse) also uses the `$code` for the error code
returned by the database. Unfortunately it also widens the (untyped)
$code from int to string|int, which causes some issues, since folks only
expect int, since Exception::__construct() types the `$code` parameter
as `int`.

Best regards
Tim Düsterhus

In my experience, worse. But that's another topic.

--Larry Garfield

Kamil_Tekiela · April 30, 2025, 8:56pm

On Wed, 30 Apr 2025 at 21:13, Larry Garfield <larry@garfieldtech.com> wrote:

>> […] just that it's the only time I've seen $code used in the wild...
>>
>
> PDO (for better or worse) also uses the `$code` for the error code
> returned by the database. Unfortunately it also widens the (untyped)
> $code from int to string|int, which causes some issues, since folks only
> expect int, since Exception::__construct() types the `$code` parameter
> as `int`.
>
> Best regards
> Tim Düsterhus

In my experience, worse. But that's another topic.

PDO is a bad example because the code is pretty much useless. You need
to get the actual code from `errorInfo[1]` if you want to know the
reason.

Tim_Dusterhus · May 11, 2025, 1:46pm

Hi

Am 2025-04-27 22:12, schrieb Tim Düsterhus:

as announced in the URI RFC discussion thread ([RFC] [Discussion] Add WHATWG compliant URL parsing API - Externals), I've now written up an “Exception Hierarchy” policy RFC together with Gina.

Please find the following links:

RFC: PHP: rfc:extension_exceptions
Policy PR: Add Throwable policy by TimWolla · Pull Request #17 · php/policies · GitHub

14 days of discussion are over this evening. There were some minor changes and clarifications to the policy text, but nothing significant and discussion has been silent for the past 10 days (and no changes, except for formatting, were made either).

Therefore we assume that everyone has said what they wanted to say and plan to open the vote in the next days.

Best regards
Tim Düsterhus