[PHP-DEV] function autoloading v4 RFC

Hello internals,

I’ve decided to attempt an RFC for function autoloading. After reading hundreds of ancient (and recent) emails relating to the topic along with several abandoned RFCs from the past, and after much review, I’ve decided to put forth a variation of a previous RFC, as it seemed the least ambitious and the most likely to work:

https://wiki.php.net/rfc/function_autoloading4

Please let me know what you think. An implementation should be along before opening it for a vote (now that I realize how important that is).

— Rob

On 15 Aug 2024, at 22:22, Rob Landers rob@bottled.codes wrote:

Hello internals,

I’ve decided to attempt an RFC for function autoloading. After reading hundreds of ancient (and recent) emails relating to the topic along with several abandoned RFCs from the past, and after much review, I’ve decided to put forth a variation of a previous RFC, as it seemed the least ambitious and the most likely to work:

https://wiki.php.net/rfc/function_autoloading4

Please let me know what you think. An implementation should be along before opening it for a vote (now that I realize how important that is).

— Rob

Hi Rob,

I like the simplicity of this, however your RFC doesn’t document the changes required to spl_autoload[1] to allow it to keep working with this new functionality.

The same issue (unexpected additional argument) potentially affects userland autoloaders too, but obviously the individual authors can fix that themselves (whether this would count as a BC break is not immediately clear to me)

Slightly tangentially, you may also want to look at a change to spl_autoload_call to accept a SPL_AUTOLOAD_* argument, so that it works consistently.

Cheers

Stephen

1: https://www.php.net/spl_autoload

On 15/08/2024 16:22, Rob Landers wrote:

Hello internals,

I've decided to attempt an RFC for function autoloading. After reading hundreds of ancient (and recent) emails relating to the topic along with several abandoned RFCs from the past, and after much review, I've decided to put forth a variation of a previous RFC, as it seemed the least ambitious and the most likely to work:

PHP: rfc:function_autoloading4

Hi Rob,

While brevity can sometimes be a virtue, I feel like there's a lot left to the reader's interpretation here.

Specifically, one of the main issues that has come up in previous discussions of the topic is the behaviour of unqualified function names, which check the current namespace first, then fall back to global scope. Your RFC implies an approach to this, but doesn't actually spell it out, nor discuss its pros and cons.

The fully qualified case is straight-forward: the autoloader is called, and if still not defined, an error is thrown. But for the unqualified case, there are multiple scenarios, and you only give the behaviour for one of them:

Defined in current namespace? | Defined in global namespace? | Proposed behaviour
------------------------------+------------------------------+--------------------------------------
No | No | Autoloader called, with prefixed name
No | Yes | ???
Yes | No | ???
Yes | Yes | ???

The third and fourth cases (where the function exists in the current namespace) are straight-forward, although it wouldn't hurt to spell them out: presumably, the namespaced function is used as now, so no autoloading is needed.

The complex case has always been the second one: the function doesn't exist in the current namespace, but *does* exist in the global namespace. (Or, an autoloader *defines* it in the global namespace.)

In concrete terms, what does this code output:

spl_autoload_register( function($function, $type) { echo "$function..."; }, type:SPL_AUTOLOAD_FUNCTION);

namespace Foo {
foreach (['hello', 'goodbye'] as $word) {
echo strlen($word), ';';
}
}

a) "Foo\strlen...5;Foo\strlen...7;" (the autoloader is called every time the function is encountered)
b) "Foo\strlen...5;7;" (the autoloader is called once, then somehow marked not to run again for this name
c) "5;7;" (the autoloader is never run for this code)

Note that there is an open RFC implementing option (b) by Gina and Dan here: PHP: rfc:core-autoloading Last I heard, Gina was still hoping to get back to it.

On a different note, there is no mention of autoloading namespaced constants in this RFC, unlike in some previous proposals. Is this a conscious decision to leave them out of scope, or an oversight?

Regards,

--
Rowan Tommins
[IMSoP]

Le 15 août 2024 à 17:27, Rob Landers rob@bottled.codes a écrit :

Hello internals,

I’ve decided to attempt an RFC for function autoloading. After reading hundreds of ancient (and recent) emails relating to the topic along with several abandoned RFCs from the past, and after much review, I’ve decided to put forth a variation of a previous RFC, as it seemed the least ambitious and the most likely to work:

https://wiki.php.net/rfc/function_autoloading4

Please let me know what you think. An implementation should be along before opening it for a vote (now that I realize how important that is).

— Rob

Hi Rob,

There is the following RFC, presented last year; it appears currently inactive, but not necessarily abandoned:

https://wiki.php.net/rfc/core-autoloading

—Claude

On Thu, Aug 15, 2024, at 18:18, Stephen Reay wrote:

On 15 Aug 2024, at 22:22, Rob Landers rob@bottled.codes wrote:

Hello internals,

I’ve decided to attempt an RFC for function autoloading. After reading hundreds of ancient (and recent) emails relating to the topic along with several abandoned RFCs from the past, and after much review, I’ve decided to put forth a variation of a previous RFC, as it seemed the least ambitious and the most likely to work:

https://wiki.php.net/rfc/function_autoloading4

Please let me know what you think. An implementation should be along before opening it for a vote (now that I realize how important that is).

— Rob

Hi Rob,

I like the simplicity of this, however your RFC doesn’t document the changes required to spl_autoload[1] to allow it to keep working with this new functionality.

Ah, good catch. I’ve updated this and gone through other relevant functions.

The same issue (unexpected additional argument) potentially affects userland autoloaders too, but obviously the individual authors can fix that themselves (whether this would count as a BC break is not immediately clear to me)

Userland functions don’t throw that error, so it shouldn’t be an issue. (You can pass as many arguments as you want to a userland function, as long as there are enough of them).

Slightly tangentially, you may also want to look at a change to spl_autoload_call to accept a SPL_AUTOLOAD_* argument, so that it works consistently.

Done.

Thank you Stephen for pointing out this oversight.

Cheers

Stephen

1: https://www.php.net/spl_autoload

— Rob

On Thu, Aug 15, 2024, at 20:40, Rowan Tommins [IMSoP] wrote:

On 15/08/2024 16:22, Rob Landers wrote:

Hello internals,

I’ve decided to attempt an RFC for function autoloading. After reading hundreds of ancient (and recent) emails relating to the topic along with several abandoned RFCs from the past, and after much review, I’ve decided to put forth a variation of a previous RFC, as it seemed the least ambitious and the most likely to work:

https://wiki.php.net/rfc/function_autoloading4

Hi Rob,

While brevity can sometimes be a virtue, I feel like there’s a lot left to the reader’s interpretation here.

Specifically, one of the main issues that has come up in previous discussions of the topic is the behaviour of unqualified function names, which check the current namespace first, then fall back to global scope. Your RFC implies an approach to this, but doesn’t actually spell it out, nor discuss its pros and cons.

It doesn’t go too much into details here on purpose, especially since there was some recent discussion on changing the order. That being said, while writing the reply below, I realized it simply wasn’t clear enough. I’ve updated the RFC to be more clear, in regards to the current behavior.

The fully qualified case is straight-forward: the autoloader is called, and if still not defined, an error is thrown. But for the unqualified case, there are multiple scenarios, and you only give the behaviour for one of them:

To fill in your chart:

Defined in current namespace? | Defined in global namespace? | Proposed behaviour
------------------------------±-----------------------------±-------------------------------------
No | No | Prefixed name
No | Yes | Prefixed name
Yes | No | No change
Yes | Yes | No change

The third and fourth cases (where the function exists in the current namespace) are straight-forward, although it wouldn’t hurt to spell them out: presumably, the namespaced function is used as now, so no autoloading is needed.

The complex case has always been the second one: the function doesn’t exist in the current namespace, but does exist in the global namespace. (Or, an autoloader defines it in the global namespace.)

This should have the same behavior as in the class autoloader. In my attempt to be vague (in case the loading order is changed, which I assumed would affect class autoloading as well), I wasn’t very clear on this. Meaning if you create a class called “Fiber” and a function called “strlen” in the current namespace, in unloaded files, an autoloader should be given the opportunity to load them.

I should also probably define some vocabulary so this is all less confusing. and Done.

In concrete terms, what does this code output:

spl_autoload_register( function($function, $type) { echo “$function…”; }, type:SPL_AUTOLOAD_FUNCTION);

namespace Foo {
foreach ([‘hello’, ‘goodbye’] as $word) {
echo strlen($word), ‘;’;
}
}

a) “Foo\strlen…5;Foo\strlen…7;” (the autoloader is called every time the function is encountered)
b) “Foo\strlen…5;7;” (the autoloader is called once, then somehow marked not to run again for this name
c) “5;7;” (the autoloader is never run for this code)

I believe the “most correct” answer is (a) – and is what the current class autoloader does as well. Option (b) would make it impossible to do any kind of dynamic code generation or old-fashioned file-based deployments.


namespace global {
    spl_autoload_register(function ($function) {
       echo "$function...";
    });

}

namespace Foo {

    foreach (['hello', 'goodbye'] as $word) {
       if (!class_exists(Fiber::class)) {
          echo "Test...";
       }
    }
}

Since I foresee (based on previous conversations) about people being worried about performance:

I think this is best left to autoloader authors to manage. If there isn’t a function autoloader, then there won’t be any performance impact (or at least, minimal). However, if there is one, the author can take into account how their codebase works. For example, I highly suspect FIG will create a PSR for this (eventually), based on how they expect things to work. I suspect a project like WordPress could make use of it and implement things how they expect things to work, etc.

For the case where you have a file full of unqualified strlen()'s, the autoloader author could create an option “do not shadow global functions” where it just returns immediately for built-in functions and doesn’t even try autoloading (I also suspect there will be someone who code-golfs this to death so this check is extremely efficient). Is there a performance impact there? Probably not as big as you’d think – and there already is a performance impact by using unqualified global functions (as shown in a recent proposal to change the order). So, if you are doing that already; you likely don’t care about performance anyway.

But I’m going to reserve any serious discussions about performance for when I have an actual implementation to play with (hopefully a PoC this weekend).

Note that there is an open RFC implementing option (b) by Gina and Dan here: https://wiki.php.net/rfc/core-autoloading Last I heard, Gina was still hoping to get back to it.

I went looking for this (I had sworn I had seen it last year), but unfortunately, I did not find it while searching the RFCs for prior work. I had thought it abandoned so I was looking in the wrong place. This RFC does not conflict with that RFC, if Gina/Dan were to pick it back up again. I may borrow the function_exists() change, which I just updated the RFC to do.

On a different note, there is no mention of autoloading namespaced constants in this RFC, unlike in some previous proposals. Is this a conscious decision to leave them out of scope, or an oversight?

I’m leaving them out on purpose to keep the scope tight. I’ve got a whole year ahead of me.

Regards,

-- 
Rowan Tommins
[IMSoP]

— Rob

Sent from my iPhone

On 16 Aug 2024, at 04:45, Rob Landers <rob@bottled.codes> wrote:

Userland functions don't throw that error, so it shouldn't be an issue. (You can pass as many arguments as you want to a userland function, as long as there are enough of them).

Hi Rob,

I didn't mean the actual internal error about too many args I meant the concept - it's possible a userland autoloader callback has some signature other than a single string parameter.

I don't know if that happens in the wild or if that's considered a BC or not but there is a hypothetical case there where old code would stop working.

Cheers

Stephen

On Fri, Aug 16, 2024, at 02:22, Stephen Reay wrote:

Sent from my iPhone

On 16 Aug 2024, at 04:45, Rob Landers <rob@bottled.codes> wrote:

Userland functions don’t throw that error, so it shouldn’t be an issue. (You can pass as many arguments as you want to a userland function, as long as there are enough of them).

Hi Rob,

I didn’t mean the actual internal error about too many args I meant the concept - it’s possible a userland autoloader callback has some signature other than a single string parameter.

I don’t know if that happens in the wild or if that’s considered a BC or not but there is a hypothetical case there where old code would stop working.

Cheers

Stephen

I wouldn’t consider it a BC break, no. But (ironically?), Symfony crashes with this change. It really shouldn’t but if you take a look at https://github.com/symfony/symfony/blob/7.2/src/Symfony/Component/Config/Resource/ClassExistenceResource.php#L70 and it’s definition in https://github.com/symfony/symfony/blob/7.2/src/Symfony/Component/Config/Resource/ClassExistenceResource.php#L142, you’ll see that they define a function that accepts two parameters. It crashes when it receives a second parameter of int instead of it’s expected exception.

I’ve opened a drive-by PR as an attempt to fix it (https://github.com/symfony/symfony/pull/58030), but I suspect there will be other projects as well that do something silly like that. They’ll just have to fix it.

— Rob

On 17 August 2024 22:33:03 BST, Rob Landers <rob@bottled.codes> wrote:

I wouldn't consider it a BC break, no. But (ironically?), Symfony crashes with this change. It really shouldn't but ...

I don't think it makes sense to say "it breaks existing code, but it's not a compatibility break".

Perhaps what you're saying is "it's only a BC break for code that's not following best practices"?

But more relevant than whether you think the current code is "correct" is the fact that a) it will need to be changed to work with your proposal; and b) the change is simple and can be done in advance.

So the RFC should acknowledge this BC break, but could argue that it's small enough to include in a minor version. This is actually really common - RFCs that introduce a new global function often acknowledge that it would break existing userland functions with that name. Between that and obviously serious BC breaks like *removing* a function, there's a big grey area where we have to make a judgement call.

Regards,
Rowan Tommins
[IMSoP]

On Sun, Aug 18, 2024, at 00:40, Rowan Tommins [IMSoP] wrote:

On 17 August 2024 22:33:03 BST, Rob Landers <rob@bottled.codes> wrote:

I wouldn’t consider it a BC break, no. But (ironically?), Symfony crashes with this change. It really shouldn’t but …

I don’t think it makes sense to say “it breaks existing code, but it’s not a compatibility break”.

Perhaps what you’re saying is “it’s only a BC break for code that’s not following best practices”?

But more relevant than whether you think the current code is “correct” is the fact that a) it will need to be changed to work with your proposal; and b) the change is simple and can be done in advance.

So the RFC should acknowledge this BC break, but could argue that it’s small enough to include in a minor version. This is actually really common - RFCs that introduce a new global function often acknowledge that it would break existing userland functions with that name. Between that and obviously serious BC breaks like removing a function, there’s a big grey area where we have to make a judgement call.

Regards,

Rowan Tommins

[IMSoP]

Hey Rowan,

Ah, that’s a good tip and point. Thank you. I’ll update the RFC!

— Rob

Hi Rob,

On 18 Aug 2024, at 04:33, Rob Landers rob@bottled.codes wrote:

I wouldn’t consider it a BC break, no. But (ironically?), Symfony crashes with this change.

I wasn’t aware of that specific code before but it’s exactly the type of issue I was talking about earlier.

Ah, good catch. I’ve updated this and gone through other relevant functions.

The RFC now says “The spl_autoload function will not be updated.”, but that will also break if it isn’t updated to at least account for, even if it doesn’t use the second argument given.

However I’m also curious why you would specifically make it not support function loading?
The current implementation should work unmodified, once the signature is changed to accept an int as the second parameter (and move the current 2nd parameter to 3rd), There is nothing “class specific” in the existing implementation except for a couple of variable names.

I would imagine you also need to update spl_autoload_unregister[1] - it also needs to be able to identify the type of autoloader it’s operating on (because the same autoloader might be defined for both types).

And lastly I think you’d need to adapt spl_autoload_functions[2] too - perhaps the same as the others, introduce a parameter int $type = SPL_AUTOLOAD_CLASS, so existing code works as-is, otherwise it’d be impossible to know how a given autoloader was registered.

1: https://www.php.net/manual/en/function.spl-autoload-unregister.php
2: https://www.php.net/manual/en/function.spl-autoload-functions.php

Cheers

Stephen

On Sun, Aug 18, 2024, at 11:36, Stephen Reay wrote:

Hi Rob,

On 18 Aug 2024, at 04:33, Rob Landers rob@bottled.codes wrote:

I wouldn’t consider it a BC break, no. But (ironically?), Symfony crashes with this change.

I wasn’t aware of that specific code before but it’s exactly the type of issue I was talking about earlier.

Ah, good catch. I’ve updated this and gone through other relevant functions.

The RFC now says “The spl_autoload function will not be updated.”, but that will also break if it isn’t updated to at least account for, even if it doesn’t use the second argument given.

However I’m also curious why you would specifically make it not support function loading?

The current implementation should work unmodified, once the signature is changed to accept an int as the second parameter (and move the current 2nd parameter to 3rd), There is nothing “class specific” in the existing implementation except for a couple of variable names.

I would imagine you also need to update spl_autoload_unregister[1] - it also needs to be able to identify the type of autoloader it’s operating on (because the same autoloader might be defined for both types).

And lastly I think you’d need to adapt spl_autoload_functions[2] too - perhaps the same as the others, introduce a parameter int $type = SPL_AUTOLOAD_CLASS, so existing code works as-is, otherwise it’d be impossible to know how a given autoloader was registered.

1: https://www.php.net/manual/en/function.spl-autoload-unregister.php

2: https://www.php.net/manual/en/function.spl-autoload-functions.php

Cheers

Stephen

Hello again internals,

The implementation has now reached a stable point and things discovered during the implementation have been reflected in the RFC text itself.

I did my best to assess the impact to Opcache, but I suspect someone much more familiar with how opcache works will need to take a look. From my understanding of the changes needed in zend_execute, it looks like there are some changes needed to the JIT helpers, but there may be more changes required that aren’t so obvious. I’ve added a helper that can be used as a drop-in replacement for looking up functions from the function table directly, which should help.

Lastly, I do plan to open a docs PR in the coming week that will probably trigger some smaller updates to the RFC; mostly to smooth out the wording and make it more friendly for non-technical people, but the technical aspects shouldn’t change (barring the discussion here, of course).

The RFC now says “The spl_autoload function will not be updated.”, but that will also break if it isn’t updated to at least account for, even if it doesn’t use the second argument given.

I’ve updated the text to reflect exactly that, it did require updating. :wink:

However I’m also curious why you would specifically make it not support function loading?

The current implementation should work unmodified, once the signature is changed to accept an int as the second parameter (and move the current 2nd parameter to 3rd), There is nothing “class specific” in the existing implementation except for a couple of variable names.

I mostly decided not to support it to avoid the inevitable bikeshedding of how a “default function autoloader” would work. I really want to push that to a separate RFC, unless there was a general consensus of an implementation. If we are fine reusing the existing default class autoloading, then I am fine with that.

And lastly I think you’d need to adapt spl_autoload_functions[2] too - perhaps the same as the others, introduce a parameter int $type = SPL_AUTOLOAD_CLASS, so existing code works as-is, otherwise it’d be impossible to know how a given autoloader was registered.

I’ve also added those functions as well.

— Rob

On Thursday, 15 August 2024 at 17:22, Rob Landers <rob@bottled.codes> wrote:

Hello internals,

I've decided to attempt an RFC for function autoloading. After reading hundreds of ancient (and recent) emails relating to the topic along with several abandoned RFCs from the past, and after much review, I've decided to put forth a variation of a previous RFC, as it seemed the least ambitious and the most likely to work:

PHP: rfc:function_autoloading4

Please let me know what you think. An implementation should be along before opening it for a vote (now that I realize how important that is).

I had a quick glance at the RFC, and I really don't think this is a good approach or good API.
Having autoloading in SPL causes dumb issues, for example ext/session must register it has a dependency on ext/spl just for the autoloader. [1]
This means that currently ext/session depends on ext/spl, ext/spl depends on ext/standard, and ext/standard depends on ext/session, a glorious circular dependency.

This main problem has pushed me again to rebase my PR [2] for the Core Autoloading RFC. [3]
The wording of the RFC hasn't been updated, as this is frankly my least favourite part of any RFC.
The PR passes CI and has also produced benchmark results from CI. [4]

I would rather have some collaboration on a proper solution than, IMHO, this suboptimal solution.
I still need to get opinions from other people if it makes sense to remove the zend_autoload_class function pointer API and have the VM directly call zend_autoload.
Because from what I remember 2 years ago, some profiling tools hook into it to track autoloading time.
This might be improved by introducing new observer hooks.

Best regards,

Gina P. Banyard

[1] Update ext/spl dependencies by petk · Pull Request #14544 · php/php-src · GitHub
[2] Add new core autoloading mechanism for classes and functions by Girgias · Pull Request #8294 · php/php-src · GitHub
[3] PHP: rfc:core-autoloading
[4] Add new core autoloading mechanism for classes and functions · php/php-src@7b2b59f · GitHub

On Mon, Aug 19, 2024, at 14:03, Gina P. Banyard wrote:

On Thursday, 15 August 2024 at 17:22, Rob Landers rob@bottled.codes wrote:

Hello internals,

I’ve decided to attempt an RFC for function autoloading. After reading hundreds of ancient (and recent) emails relating to the topic along with several abandoned RFCs from the past, and after much review, I’ve decided to put forth a variation of a previous RFC, as it seemed the least ambitious and the most likely to work:

https://wiki.php.net/rfc/function_autoloading4

Please let me know what you think. An implementation should be along before opening it for a vote (now that I realize how important that is).

I had a quick glance at the RFC, and I really don’t think this is a good approach or good API.

Having autoloading in SPL causes dumb issues, for example ext/session must register it has a dependency on ext/spl just for the autoloader. [1]

This means that currently ext/session depends on ext/spl, ext/spl depends on ext/standard, and ext/standard depends on ext/session, a glorious circular dependency.

This main problem has pushed me again to rebase my PR [2] for the Core Autoloading RFC. [3]

The wording of the RFC hasn’t been updated, as this is frankly my least favourite part of any RFC.

The PR passes CI and has also produced benchmark results from CI. [4]

I would rather have some collaboration on a proper solution than, IMHO, this suboptimal solution.

I still need to get opinions from other people if it makes sense to remove the zend_autoload_class function pointer API and have the VM directly call zend_autoload.

Because from what I remember 2 years ago, some profiling tools hook into it to track autoloading time.

This might be improved by introducing new observer hooks.

Best regards,

Gina P. Banyard

[1] https://github.com/php/php-src/pull/14544#issuecomment-2294907817

[2] https://github.com/php/php-src/pull/8294

[3] https://wiki.php.net/rfc/core-autoloading

[4] https://github.com/php/php-src/actions/runs/10441267948

Hey Gina,

I’d love to collaborate on this feature. For what it’s worth though, I did a ton of research on it (mostly reading every discussion I could find on the topic, and prior RFCs) and I felt that this API was the most likely to be accepted. You even have a comment on your PR asking why not an API similar to this one (!) though your reasoning is sound for why it is a bad idea, and I believe it is the superior API.

There are some key differences between our two RFC’s that I think are worth discussing (besides the obvious API differences):

  1. in your RFC, it calls the autoloader with the global function it isn’t found in both scopes. In mine, it calls it once not found in the local scope and calls the autoloader once (for the local scope). This seemed to be a highly liked proposal in one of the older discussions (2013-ish?) that appeared to not result in a new RFC, as it would bypass a lot of perceived performance issues in earlier RFCs. If an autoloader so desires, the autoloader can check the global scope (by getting the “base name” of the function), but autoloading the global scope should be a niche application these days.

  2. I like the “pinning” aspect. I haven’t seen your code yet, but I suspect it just registers the global function in the current namespace? If so, does this affect the namespace global? Probably not, but I am just curious. What happens if I manually require a file with a pinned function in it?

  3. Should we update function_exists like I did in mine to include an autoload argument?

  4. You mention no default autoloader for classes and functions, while I agree that this should be the case, will the spl library still provide the default class autoloader that can be registered?
    As far as performance for ambiguous functions go, I was thinking of submitting an RFC, where ambiguous function calls are tagged during compilation and always resolve lexically, sorta like how it works now:

echo strlen($x); // resolves to global, always

require_once “my-strlen”;

echo strlen($x); // resolves to my strlen, always

This works by basically rewriting the function name once resolved and may make the code more predictable. If I can pull it off, it can be relegated to a technical change that doesn’t need an RFC. Still working on it.

In other words, maybe pinning could be solved more generally in a future RFC, decrease your RFC’s scope and chance for sharp edge cases.

In any case, if you are up for a high bandwidth conversation to collaborate, we can join a call, collaborate in the open, or whatever you feel most comfortable with. I’m very excited to see this feature.

— Rob

On August 15, 2024 5:22:51 PM GMT+02:00, Rob Landers <rob@bottled.codes> wrote:

Hello internals,

I've decided to attempt an RFC for function autoloading. After reading hundreds of ancient (and recent) emails relating to the topic along with several abandoned RFCs from the past, and after much review, I've decided to put forth a variation of a previous RFC, as it seemed the least ambitious and the most likely to work:

PHP: rfc:function_autoloading4

Please let me know what you think. An implementation should be along before opening it for a vote (now that I realize how important that is).

— Rob

Hello,

I've also noted that on another RFC, but I think it applies here too:
Any new constants that are shaped like an enum and used like an enum should just be
an actual enum. I think instead of `SPL_AUTOLOAD_CLASS` and `SPL_AUTOLOAD_FUNCTION`,
there should be an `enum AutoloadType` (to be bikeshed), and the new parameters
should be typed as such, instead of `int`.

Regards,
Mel

On 19/08/2024 17:23, Rob Landers wrote:

As far as performance for ambiguous functions go, I was thinking of submitting an RFC, where ambiguous function calls are tagged during compilation and always resolve lexically, sorta like how it works now:

echo strlen($x); // resolves to global, always
require_once "my-strlen";
echo strlen($x); // resolves to my strlen, always

This works by basically rewriting the function name once resolved and may make the code more predictable. If I can pull it off, it can be relegated to a technical change that doesn’t need an RFC. Still working on it.

I'm not entirely clear what you're suggesting, but I think it might be either what already happens, or the same as Gina is proposing.

Consider this code [https://3v4l.org/184k3\]:

namespace Foo;

foreach ( [1,2,3] as $i ) {
echo strlen('hello'), ' ';
shadow_strlen();
echo strlen('hello'), '; ';
}

function shadow_strlen() {
if ( ! function_exists('Foo\\strlen') ) {
function strlen($s) {
return 42;
}
}
}

In PHP 5.3, this outputs '5 42; 42 42; 42 42;' That's fairly straight-forward: each time the function is called, the engine checks if "Foo\strlen" is defined.

Since PHP 5.4, it outputs '5 42; 5 42; 5 42;' The engine caches the result of the lookup against each compiled opcode, so the first strlen() is cached as a call to \strlen() and the second as a call to \Foo\strlen().

As I understand it, Gina is proposing that it would instead output '5 5; 5 5; 5 5;' - the function would be "pinned" by making "\Foo\strlen" an alias to "\strlen" for the rest of the program, and the function_exists call would immediately return true.

Neither, as far as I can see, can happen at compile time, because the compiler doesn't know if and when a definition of \Foo\strlen() will be encountered.

In other words, maybe pinning could be solved more generally in a future RFC, decrease your RFC’s scope and chance for sharp edge cases.

If anything, I think it would need to be the other way around: change the name resolution logic, so that an autoloading proposal was more palatable, because it didn't require running the autoloader an unbounded number of times for the same name.

Proposing both at once seems reasonable, as the autoloading gives an extra benefit to outweigh the breaking change to shadowing behaviour.

Regards,

--
Rowan Tommins
[IMSoP]

On Mon, Aug 19, 2024, at 23:17, Rowan Tommins [IMSoP] wrote:

On 19/08/2024 17:23, Rob Landers wrote:

As far as performance for ambiguous functions go, I was thinking of

submitting an RFC, where ambiguous function calls are tagged during

compilation and always resolve lexically, sorta like how it works now:

echo strlen($x); // resolves to global, always

require_once “my-strlen”;

echo strlen($x); // resolves to my strlen, always

This works by basically rewriting the function name once resolved and

may make the code more predictable. If I can pull it off, it can be

relegated to a technical change that doesn’t need an RFC. Still

working on it.

I’m not entirely clear what you’re suggesting, but I think it might be

either what already happens, or the same as Gina is proposing.

Yeah, that would be a separate RFC, so I didn’t go into the weeds, but my point, is that it would result in no change. But here we go :smiley:

Consider this code [https://3v4l.org/184k3]:

namespace Foo;

foreach ( [1,2,3] as $i ) {

echo strlen(‘hello’), ’ ';

shadow_strlen();

echo strlen(‘hello’), '; ';

}

function shadow_strlen() {

if ( ! function_exists(‘Foo\strlen’) ) {

function strlen($s) {

return 42;

}

}

}

In PHP 5.3, this outputs ‘5 42; 42 42; 42 42;’ That’s fairly

straight-forward: each time the function is called, the engine checks if

“Foo\strlen” is defined.

Since PHP 5.4, it outputs ‘5 42; 5 42; 5 42;’ The engine caches the

result of the lookup against each compiled opcode, so the first strlen()

is cached as a call to \strlen() and the second as a call to \Foo\strlen().

As I understand it, Gina is proposing that it would instead output '5 5;

5 5; 5 5;’ - the function would be “pinned” by making “\Foo\strlen” an

alias to “\strlen” for the rest of the program, and the function_exists

call would immediately return true.

Neither, as far as I can see, can happen at compile time, because the

compiler doesn’t know if and when a definition of \Foo\strlen() will be

encountered.

To be fair, this isn’t even really completely figured out yet. I was mostly wanting to point out that it maybe could be a totally separate issue. But, the gist is that at compile time, we can mark a function as “ambiguous,” meaning we don’t really know if the function exists (because it isn’t fully qualified). The main issue with Gina’s implementation (if I am understanding it properly, and I potentially am not, so take this with a grain of salt) is that this (https://3v4l.org/0jCpW) could fail if it were pinned, where before, it would not.

In my idea, a function becomes pinned until it isn’t, with strict rules that kick it out of being pinned and I think those rules are complex enough to warrant being a completely different RFC or PR.

In other words, maybe pinning could be solved more generally in a

future RFC, decrease your RFC’s scope and chance for sharp edge cases.

If anything, I think it would need to be the other way around: change

the name resolution logic, so that an autoloading proposal was more

palatable, because it didn’t require running the autoloader an unbounded

number of times for the same name.

Proposing both at once seems reasonable, as the autoloading gives an

extra benefit to outweigh the breaking change to shadowing behaviour.

Regards,

I assume you are worried about something like this passing test?


--TEST--
show called only once
--FILE--
<?php

namespace test;

spl_autoload_register(function($name) {
    echo "name=$name\n";
}, true, false, SPL_AUTOLOAD_FUNCTION);

echo strlen('foo');
echo strlen('bar');
echo strlen('baz');
?>
--EXPECT--
name=test\strlen
333

In my RFC, I mention it is called exactly once. I could maybe add it as a test in the PR. I’ve committed it as another test on the RFC implementation.

Rowan Tommins

[IMSoP]

— Rob

On 20 August 2024 00:21:22 BST, Rob Landers <rob@bottled.codes> wrote:

I assume you are worried about something like this passing test?

--TEST--
show called only once
--FILE--
<?php

namespace test;

spl_autoload_register(function($name) {
   echo "name=$name\n";
}, true, false, SPL_AUTOLOAD_FUNCTION);

echo strlen('foo');
echo strlen('bar');
echo strlen('baz');
?>
--EXPECT--
name=test\strlen
333

In my RFC, I mention it is called exactly once.

I haven't looked at the PR, only the RFC, and I did not see this very important detail explained anywhere. The only thing I can see is this rather ambiguous sentence:

The function autoloader will not be called again.

That could mean not called again for the current call (compared with proposals that call it a second time with the unequalled name); it could mean not called again for the current line of code (based on the current caching behaviour); or never called again for that combination of namespace and name; or possibly, never called again for that combination of namespace, name, and callback function.

That's not a small detail of the implementation, it's a really fundamental difference from previous proposals.

So I would like to repeat my first response to your RFC: that it should sound more time explaining your approach to the multiple lookup problem.

Regards,
Rowan Tommins
[IMSoP]

On Tue, Aug 20, 2024, at 08:50, Rowan Tommins [IMSoP] wrote:

On 20 August 2024 00:21:22 BST, Rob Landers <rob@bottled.codes> wrote:

I assume you are worried about something like this passing test?

–TEST–

show called only once

–FILE–

<?php

namespace test;

spl_autoload_register(function($name) {

echo “name=$name\n”;

}, true, false, SPL_AUTOLOAD_FUNCTION);

echo strlen(‘foo’);

echo strlen(‘bar’);

echo strlen(‘baz’);

?>

–EXPECT–

name=test\strlen

333

In my RFC, I mention it is called exactly once.

I haven’t looked at the PR, only the RFC, and I did not see this very important detail explained anywhere. The only thing I can see is this rather ambiguous sentence:

The function autoloader will not be called again.

That could mean not called again for the current call (compared with proposals that call it a second time with the unequalled name); it could mean not called again for the current line of code (based on the current caching behaviour); or never called again for that combination of namespace and name; or possibly, never called again for that combination of namespace, name, and callback function.

That’s not a small detail of the implementation, it’s a really fundamental difference from previous proposals.

So I would like to repeat my first response to your RFC: that it should sound more time explaining your approach to the multiple lookup problem.

Regards,

Rowan Tommins

[IMSoP]

Thanks Rowan,

That’s a fair critique.

I expect some of the wording will be more clear once I write out the documentation – even if it isn’t used directly, I tend to write out documentation to force myself to reconcile the code with the plan, find logic bugs, perform larger scale tests, and create tests to verify assertions in the documentation. From there, I’ll update the plan or code to get everything to match and spend some time on clarity. It’s the hardest part, IMHO, as it requires diligently ensuring everything is correct. In other words, writing the documentation makes it feel like a “real thing” and it triggers what small amount of perfectionism I have.

— Rob

On Tue, Aug 20, 2024, at 18:07, Rob Landers wrote:

On Tue, Aug 20, 2024, at 08:50, Rowan Tommins [IMSoP] wrote:

On 20 August 2024 00:21:22 BST, Rob Landers <rob@bottled.codes> wrote:

I assume you are worried about something like this passing test?

–TEST–

show called only once

–FILE–

<?php

namespace test;

spl_autoload_register(function($name) {

echo “name=$name\n”;

}, true, false, SPL_AUTOLOAD_FUNCTION);

echo strlen(‘foo’);

echo strlen(‘bar’);

echo strlen(‘baz’);

?>

–EXPECT–

name=test\strlen

333

In my RFC, I mention it is called exactly once.

I haven’t looked at the PR, only the RFC, and I did not see this very important detail explained anywhere. The only thing I can see is this rather ambiguous sentence:

The function autoloader will not be called again.

That could mean not called again for the current call (compared with proposals that call it a second time with the unequalled name); it could mean not called again for the current line of code (based on the current caching behaviour); or never called again for that combination of namespace and name; or possibly, never called again for that combination of namespace, name, and callback function.

That’s not a small detail of the implementation, it’s a really fundamental difference from previous proposals.

So I would like to repeat my first response to your RFC: that it should sound more time explaining your approach to the multiple lookup problem.

Regards,

Rowan Tommins

[IMSoP]

Thanks Rowan,

That’s a fair critique.

I expect some of the wording will be more clear once I write out the documentation – even if it isn’t used directly, I tend to write out documentation to force myself to reconcile the code with the plan, find logic bugs, perform larger scale tests, and create tests to verify assertions in the documentation. From there, I’ll update the plan or code to get everything to match and spend some time on clarity. It’s the hardest part, IMHO, as it requires diligently ensuring everything is correct. In other words, writing the documentation makes it feel like a “real thing” and it triggers what small amount of perfectionism I have.

— Rob

I have an experimental library that I use for testing these kinds of things. There are aspects of it that I could work with to make use of function autoloading. Thus, I did so and benchmarked the performance of unit tests. The unit testing library makes a ton of “unqualified function calls”.

I spent some time working on two autoloaders:

  1. A naive autoloader: parses out the file to load, checks if it exists, and then requires the file.

  2. An optimized autoloader: only cares about the namespace it has registered. All others are an instant return.

In the “vanilla” case, I was mostly concerned with variation. I wanted a statistically significant result, so once I got my system into a stable state and I was no longer seeing any variance, I started benchmarking.

For the naive autoloader, I saw a performance degradation of about 6% and lots of variability. This is probably due to the “file_exists” check being done every time an unqualified name was called.

However, for the optimized autoloader, I ended up with less variability (:thinking:) than the vanilla approach and absolutely no measurable performance degradation.

Now, time to try this on a larger scale… WordPress. It’s pretty much the only large codebase I know of that makes use of tons of functions.

— Rob