[PHP-DEV] [Initial Feedback] PHP User Modules - An Adaptation of ES6 from JavaScript

This is a long reply rather than send a bunch of shorter emails.

On Jun 27, 2024, at 2:10 PM, Deleu <deleugyn@gmail.com> wrote:

Overall, I think PHP has already reached the limit of surviving with only PSR-4 and Composer. Single class files were a great solution to get us out of the nightmare of `require` and `import` on top of PHP files. But more than once I have had the desire to declare a couple of interfaces in a single file, or a handful of Enums, etc.

This.

I cannot overemphasize how nice it is to work in Go where I can put almost any code I want in any file I want without having to think about autoloading.

It is great when writing proofs of concept and having all the code in one place makes it easier to reason about. Once fleshed out you can then organize into multiple files, but still get to keep highly related code in the same files.

On Jun 27, 2024, at 3:02 PM, Michael Morris <tendoaki@gmail.com> wrote:
Thanks. The sticking point is what degree of change should be occurring. PHP isn't as behind an 8-ball as JavaScript is since the dev can choose their PHP version and hence deprecation works most of the time for getting rid of old stuff. But not always. Changes that are incompatible with what came before need a way to do things the old way during transition.

As I understand the proposal, this would have no BC issues for code not in modules. PHP could then set rules for code in modules that would not need to be directly compatible with code outside modules.

That's how it works in JavaScript, at least as I have experienced, and I'd say it works pretty well.

Again, see PHP 6 and unicode, which snowballed until it was clear that even if PHP 6 had been completed it wouldn't be able to run most PHP 5 code.

At least to me this does not feel as big as trying to implement unicode.

2. No need for autoloaders with modules; I assume this would be obvious, right?

Depends largely on whether modules can include and require to get access to old code. I also didn't discuss how they behave - do they share their variables with includes and requires?

I was presuming that all old code would use autoloaders but modules would be free to do it a better way.

If you need to call code from a namespace from inside a module, sure, the autoloader would be needed.

6. Modules should be directories, not .php files. Having each file be a module makes code org really hard.

Yes, but that is how JavaScript currently handles things. It is currently necessary when making large packages to have an index.js that exports out the public members of the module. This entry point is configurable through the package.json of the module.

I am envisioning that there could be a module metadata file that would have everything that PHP needs to handle the module. It could even be binary, using protobufs:

The php CLI could have an option to generate this file making it easy for IDEs to generate the file, or generic file watchers to generate. This would mean that within a module there would be no need for an autoloader.

If the module metadata file does not exist, PHP could generate it on the fly. If the file is obviously out-of-date given a new file, PHP could re-generate it. If PHP can't write the file, such as on a production server, it throws a warning and regenerates for in-memory use each page load.
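A minimal sketch of the "obviously out-of-date" check described above. Everything here is an assumption for illustration: the function name `moduleMetadataIsStale`, the file name `module.meta`, and the choice to compare modification times are hypothetical, and no such metadata format exists in PHP today.

```php
<?php
// Hypothetical helper: the metadata is stale when it is missing or when any
// .php file in the module directory was modified after it was built.
function moduleMetadataIsStale(string $moduleDir, string $metadataFile): bool
{
    clearstatcache();
    if (!is_file($metadataFile)) {
        return true;
    }
    $builtAt = filemtime($metadataFile);
    foreach (glob($moduleDir . '/*.php') ?: [] as $source) {
        if (filemtime($source) > $builtAt) {
            return true;
        }
    }
    return false;
}

// Demo in a throwaway directory, using explicit timestamps instead of sleeping.
$dir = sys_get_temp_dir() . '/module_demo_' . uniqid();
mkdir($dir);
$source = $dir . '/invoice.php';
$metadata = $dir . '/module.meta';   // hypothetical metadata file name
$now = time();

file_put_contents($source, "<?php\n");
touch($source, $now - 30);
$missing = moduleMetadataIsStale($dir, $metadata);   // no metadata file yet

file_put_contents($metadata, 'symbols');
touch($metadata, $now - 20);
$fresh = moduleMetadataIsStale($dir, $metadata);     // metadata newer than the source

touch($source, $now - 10);
$outdated = moduleMetadataIsStale($dir, $metadata);  // source changed after the build

// Clean up the demo files.
unlink($source);
unlink($metadata);
rmdir($dir);
```

A real implementation would presumably also have to notice deleted files and handle the read-only production case mentioned above; this sketch only covers the timestamp comparison.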

It is also possible that instead of protobuf the module file could actually be a phar file, or the equivalent of a phar file optimized to allow PHP to load, access and execute that code as fast as possible.

7. Modules would have a symbol table metadata file generated by IDEs and during deployment.

Node.js uses package.json and the attendant npm to do this sort of prep work. And it's a critical part of this since modules can be versioned, and different modules may need to run different specific versions of other modules.

node_modules IMO is one of the worst things about the JavaScript ecosystem. Who has not seen the meme about node_modules being worse than a black hole?

I would argue that PHP itself not be involved in trying to manage versions. Let Composer do that, or whatever other tool developers currently use to manage versions, or new tools developed later.

9. .php files in modules as identified by metadata file should not be loadable via HTTP(S).

Those are implementation details a little further down the road than we're ready for, I think.

But ensuring that it is possible to disallow loading needs to be contemplated in the design. PHP has to be able to know what is a module and what isn't without expensive processes.

10. Having exports separate from functions and classes seems like it would be problematic.

Again, this is how they work in JavaScript. Not saying that's the best approach, but even if problematic it's a solved problem.

I have evidently not written enough JavaScript to realize that.

I'm also interested in learning on how other module systems out there do work.

I am very familiar with modules (packages) in GoLang and think PHP could benefit from considering how they work, too.

On Jun 27, 2024, at 3:22 PM, Michael Morris <tendoaki@gmail.com> wrote:
Composer would need a massive rewrite to be a part of this since it currently requires the file once it determines it should do so. If we do a system where import causes the parser to act differently then that alone means imports can't be dealt with in the same manner as other autoloads.

That is why I am strongly recommending a modern symbol resolution system within modules vs. autoloading.

I'm not fond of this either.

There will need to be a way to define the entrypoint php. I think index.php is reasonable, and if another entry point is desired it can be called out -> "mypackage/myentry.php"

Why is an entry point needed? If there is a module metadata file as I am proposing PHP can get all the information it needs from that file. Maybe that is the `.phm` file?

On Jun 27, 2024, at 4:54 PM, Rob Landers <rob@bottled.codes> wrote:

Thanks. The sticking point is what degree of change should be occurring. PHP isn't as behind an 8-ball as JavaScript is since the dev can choose their PHP version and hence deprecation works most of the time for getting rid of old stuff. But not always. Changes that are incompatible with what came before need a way to do things the old way during transition. Again, see PHP 6 and unicode, which snowballed until it was clear that even if PHP 6 had been completed it wouldn't be able to run most PHP 5 code.

It’s not just up to the dev, but the libraries we use and whether or not we can easily upgrade (or remove) them to upgrade the php version.

By "upgrade" then, do you mean convert them into modules, or just be able to use them as-is?

As I read it and am envisioning it, there would be no changes needed to be able to use them as-is.

I think it would be a mistake to exclude old code and/or prevent templating. Not only is there now decades-old code in some orgs, but how would you write an email sender that sent templated emails, provide html, generate code, etc? There has to be an output from the code to be useful.

Excluding old code or templates from modules would not exclude them from working as they currently do outside modules. As I see it, modules would be more about exporting classes and functions, not generating output per se.

So all those decades of old code could continue to exist outside modules, as it currently does today.

I think it’s fine to use js as an inspiration, but it isn’t the only one out there. There is some precedent to consider directories as modules (go calls them “packages”) and especially in PHP where namespaces (due to PSR-4 autoloading) typically match directory structures.

Totally agree about inspiration for modules outside JS, but not sure that PHP namespaces are the best place to look for inspiration.

Namespaces by their very nature were designed to enable autoloading with a one-to-one file to class or interface, and by nature add conceptual scope and complexity to a project that would not be required if a modern module/package system were added to PHP.

Modules could and IMO should be a rethink that learns the lessons other languages have learned over the past decade+.

Node.js uses package.json and the attendant npm to do this sort of prep work. And it's a critical part of this since modules can be versioned, and different modules may need to run different specific versions of other modules.

Please, please, please do not make a json file a configuration language. You can’t comment in them, you can’t handle “if php version <9, load this, or if this extension is installed, use this.”

Maybe that is desirable, but doing things slightly different based on extensions loaded is def a thing.

I don't think commenting is important in this file, or even desired.

As I proposed above, these could be protobuf or phar. These should be build artifacts that can be generated on the fly during development or for newbies even during deployment, not hand-managed.

I could see the generation of two files; one in binary form and one that is readonly so a developer can double-check what is in the current protobuf or phar file.

Those are implementation details a little further down the road than we're ready for, I think.

Personally, if these are going to have any special syntax, we probably shouldn’t call them .php files. Maybe .phm?

I was going to suggest that, and then remembered earlier PHP when there were multiple file extensions and that was a nightmare.

This does remind me to mention that I think there should be a required "module" declaration at the top of each file just like Go requires a "package" declaration at the top of each file. That would make it trivial for tooling to differentiate, even with grep.

the only thing I don’t like about this import/export thing is that it reminds me of the days when we had to carefully order our require_once directives to make sure files were loaded before they were used. So, I think it is worth thinking about how loading will work and whether loading can be dynamic, hoisted out of function calls (like js), how order matters, whether packages can enrich other packages (like doctrine packages) and if so, how much they can gain access to internal state, etc. This is very much not “a solved problem.”

That is why I proposed having a "compiled" module symbol table to eliminate most (all?) of those issues.

On Jun 27, 2024, at 6:00 PM, Rowan Tommins [IMSoP] <imsop.php@rwec.co.uk> wrote:
I do think PHP badly needs a native concept of "module" or "package" - in fact, I'm increasingly convinced it's the inevitable path we'll end up on at some point. BUT I think any such concept needs to be built on top of what we have right now. That means:

- It should build on or work in harmony with namespaces, not ignore or replace them

It may be an unpopular opinion, but I would argue that namespaces were optimized for autoloading and the one class/interface per file paradigm, not to mention the regrettable choice of using the escape character to separate namespaces and the fact that PHP throws away a lot of information about namespaces at runtime.

IMO allowing modules to eventually deprecate namespaces (at least in a de facto form of deprecation) would allow modules to be much better than if they try to cling to a less desirable past.

- It should be easy to take existing code, and convert it to a module/package

Maybe, but not if that means modules retain baggage that should really be jettisoned.

and namespaces have proved an extremely successful way of sharing code without those names colliding.

At the expense of a lot more complexity than necessary, yes.

Managing symbols in a module need not be a hard problem if PHP recognizes modules internally rather than trying to munge everything into a global namespace like with namespaces.

Other parts of your e-mail are essentially an unrelated idea, to have some new "PHP++" dialect, where a bunch of "bad" things are removed. You're not the first person to be tempted by this, but I think the history of HHVM and Hack is educational here: initially, PHP and Hack were designed to interoperate on one run-time, but the more they tried to optimise for Hack, the harder it became to support PHP, and now Hack is a completely independent language.

While I agree that some things are unnecessary — such as unifying scope resolution operators for existing concepts — past failure does not guarantee future failure.

Hack tried to create an entirely new language yet still be PHP compatible. Learn from the Hack experience and rather than create an entirely new language, PHP modules could simply add constraints for code in modules, and then any "new" language features that are not module-specific by-nature should be considered to work everywhere in PHP, or not at all.

On Jun 27, 2024, at 6:41 PM, Larry Garfield <larry@garfieldtech.com> wrote:
What problem would packages/modules/whatever be solving that isn't already adequately solved?

Not speaking for Michael, obviously, but speaking for what I envision:

1. Adding a module/package system to PHP with modern module features
  - including module private, module function, and module properties
2. Providing an alternative to auto-loader-optimized namespaces.
  - better code management and better page load performance

Do we want:

1. Packages and namespaces are synonymous? (This is roughly how JVM languages work, I believe.)
2. Packages and files are synonymous? (This is how Python and Javascript work.)
3. All packages correspond to a namespace, but not all namespaces are a package?

I would argue packages (modules) should be orthogonal to namespaces to allow modules to be optimized for what other languages have learned about packages/modules over the past decade+.

The facts that namespaces use the escape character as a separator, that PHP does not keep track of namespaces after parsing, and that they were optimized for one-to-one symbol-to-file autoloading are enough reasons IMO to envision a way to move on from them.

And given the near-universality of PSR-4 file structure, what impact would each of those have in practice?

Orthogonal. Old way vs new way. But still completely usable, just not as modules without conversion.

PSR-4 is an artifact of autoloading single-symbol files and thus a sunk cost. The fact that it currently exists does not mean that PHP should cling to it for modules.

-Mike

On 28 June 2024 01:16:24 BST, Mike Schinkel <mike@newclarity.net> wrote:

It may be an unpopular opinion, but I would argue that namespaces were optimized for autoloading and the one class/interface per file paradigm

I don't see any particular relationship between namespaces and autoloading, or any reason we need to throw them away to introduce different conventions for loading files.

My opinions match Larry's almost exactly: I want package-level optimisation, and package-private declarations. But I don't want to rewrite my entire codebase to start using a completely different naming system.

Not to mention that working with a combination of existing namespaced packages and "new shiny module" packages is going to be inevitable, so we can't just hand-wave that away.

1. Adding a module/package system to PHP with modern module features

I find that "modern" often just means "fashionable". Please, let's be specific. What is different between imports and namespaces, and why is it a good thing?

What specifically stops us doing all the things you've been discussing around loading, and visibility, etc, in a way that's compatible with the 400_000 packages available on Packagist, and billions of lines of existing code?

Rowan Tommins
[IMSoP]

This is a very long reply to several emails.

On Thu, Jun 27, 2024 at 5:45 PM Jim Winstead <jimw@trainedmonkey.com> wrote:

The angle I am coming at this from is improving the developer experience around “packages” or “modules” or whatever you want to call them, and so much of this proposal doesn’t seem to be about that.

Ok, first problem - not a proposal really, but a ramble trying to get to a proposal. Before I made the first post the idea was knocking around in my head and wouldn’t go away, so I just stream of consciousness listed what’s going through my head. That leads to the second point you made.

I could have made that point in other ways, and I’m sorry that my first attempt came off as insulting. It really concerned me when I already saw discussion about taking this off-list and going into the weeds on technical details when the problem that is being addressed by this proposal is extremely unclear to me.

It is unclear even to me. Perhaps I shouldn’t have posted something this half baked. That said, pruning off large sections of language functionality is a distraction. For now let’s just note that improving the language this way is a possibility, afforded by the fact that import would be a new way of bringing scripts in. Could isn’t should. Again, at the moment it’s a distraction. Let’s focus on how code is imported.

First though, a history review, partially to get this straight in my own head but hopefully of use for those following along. Why? Knowing how we got where we are is important to some degree to chart a way forward.

PHP started as a template engine. By modern standards, and compared to the likes of twig, it’s a very bad template engine, but that doesn’t really matter because it’s evolved into a programming language in its own right over the last nearly 20 years.

Include, include_once, require, and require_once have been around since the beginning as the way to splice code files together. The behavior of these statements calls back to PHP’s origin as a template engine, as they do things that similar mechanisms such as JavaScript’s import (and, for that matter, their equivalents in C# and Java) do not. Their scope behavior is very different from import mechanisms in other languages: they see the variables in the scope of the function they were invoked from, or the global scope when called from there. Their parsing can be aborted early with a return. They can return a value, which is quite unusual to be honest. None of this is bad per se, but it is different, and the question arises whether it is necessary.

One artifact of their behavior that is bad in my opinion is that they start from the standpoint of being text or html files. If the included file has no PHP tags then the contents get echoed out. If there are no output buffers running this can cause headers to be sent and fun errors to be had. So they can’t be used to create files that can only echo explicitly (that is, via a call to the echo statement or the like).
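The include behaviors described above can be reproduced with a short script. This is only an illustrative sketch: the temp files, the `greet()` function, and the variable names are made up for the demo, and the nowdoc closing syntax assumes PHP 7.3 or later.

```php
<?php
// Two throwaway files in the system temp directory.
$phpFile = tempnam(sys_get_temp_dir(), 'inc');
$txtFile = tempnam(sys_get_temp_dir(), 'txt');

// An included PHP file: reads a variable from the caller's scope,
// can abort early with return, and hands a value back to the caller.
file_put_contents($phpFile, <<<'CODE'
<?php
if (!isset($name)) {
    return 'no name in scope';
}
return "hello, $name";
CODE);

// A file with no PHP tags at all: include echoes its contents.
file_put_contents($txtFile, "plain text, no tags\n");

function greet(string $phpFile): string
{
    $name = 'internals';      // local to greet(), yet visible inside the include
    return include $phpFile;
}

$fromFunction = greet($phpFile);  // include sees greet()'s local $name
$fromGlobal = include $phpFile;   // no $name in global scope, so the file returns early

// Capture what the tag-less file would otherwise send straight to the output.
ob_start();
include $txtFile;
$echoed = ob_get_clean();

unlink($phpFile);
unlink($txtFile);
```

Without the output buffer, the contents of the tag-less file would go directly to the client, which is the header problem mentioned above.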

Fast forward a bit: PHP 5.3 introduced namespaces to deal with the overloaded symbol tables. They are a bit of a hotwire as (if I’m not mistaken, it’s been a couple years since I read the discussion on it) they just quietly prepend the namespace string to the name of all new symbols declared in the namespace for use elsewhere. As a result, PHP namespaces don’t do some of the things we see in the namespaces of other languages (looking at Java and C# here). For example, privacy modifiers within a namespace aren’t a thing.
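A small sketch of the name-prefixing point above. The namespace `Acme\Billing`, the class `Invoice`, and the function `total()` are invented for the example.

```php
<?php
namespace Acme\Billing;

class Invoice {}

function total(): int
{
    return 42;
}

// The declared names carry the namespace string as a prefix.
$className = Invoice::class;
$functionName = __NAMESPACE__ . '\total';

// Any code can reach them through a plain string; there is no
// namespace-private visibility to stop it.
$invoice = new $className();
$result = $functionName();

echo $className, "\n";     // Acme\Billing\Invoice
echo $functionName, "\n";  // Acme\Billing\total
```

Nothing in this script marks `Invoice` or `total()` as internal to `Acme\Billing`, which is the gap that package-private declarations would fill.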

Very quickly after PHP 5.3 was released, autoloaders showed up. At some point support for multiple autoloaders was added. Several schemes were proposed, PSR-4 won out, and composer showed up to leverage this. Composer is based on NPM, even to the point where json is used to configure it, and the composer.json file is fairly close to npm’s package.json file even now. It’s a userland solution, but to my knowledge WordPress is the only widely used PHP application out there that doesn’t use it directly (there is a Composer WordPress project).

Before composer, and before namespaces, there was PECL. Composer has eclipsed it because PECL has the limitation of being server-wide. It never really caught on in the age of virtual hosting with multiple PHP sites running on one box. Today we have Docker, but that didn’t help PECL make a comeback because by the time Docker deployment of PHP sites became the norm composer had won out. Also, composer library publishing is more permissive than PECL. I’ll stop here lest this digress into a Composer v PECL discussion. Suffice to say that stabs at bringing code packages into PHP aren’t a new idea, and a survey of what’s been done before, what was right about those attempts and what was wrong, needs to be considered before adding yet another php package system into the mix.

The main influence of composer and autoloaders for preparing packages is that PHP has become far more Object Oriented than it was before. Prior to PHP 5.3 object oriented programming was a great option, but since autoloaders cannot bring in functions (at least not directly, they can be cheated in by bundling them in static classes which are all but namespaces) the whole ecosystem has become heavily object oriented.
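The point that autoloaders only fire for class-like symbols can be seen with a logging autoloader. This is a sketch; the names `Missing\Service` and `Missing\helper` are placeholders that deliberately do not exist.

```php
<?php
$requested = [];

// Record every symbol name the engine asks the autoloader for.
spl_autoload_register(function (string $name) use (&$requested): void {
    $requested[] = $name;
});

// Looking up an unknown class consults the autoloader.
$classFound = class_exists('Missing\Service');

// Looking up an unknown function does not.
$functionFound = function_exists('Missing\helper');

// Calling the unknown function fails with an Error, again without autoloading.
$callFailed = false;
try {
    \Missing\helper();
} catch (\Error $e) {
    $callFailed = true;
}

print_r($requested);  // only the class name was ever requested
```

The usual workaround, as noted above, is to hang the functions off a static class so that the class autoloader pulls them in.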

That isn’t a bad thing. But it does need to be acknowledged. Before I go further I’ll now respond to some other points made by others in this thread.

On Thu, Jun 27, 2024 at 6:01 PM Jordan LeDoux <jordan.ledoux@gmail.com> wrote:

Ah, yes, THAT’S a fair point. While the idea of optimizing the engine/parser for modules has merit as part of a user modules proposal, I agree that many of the specifics proposed here feel pretty scatter-shot and unclear.

The scoping operator change I simply ignored, as that feels to me like just asking “I would like to program in Node” and there’s no clear benefit to changing the scoping operator outlined, while there is a clear detriment to eliminating the concatenation operator entirely.

Mostly I ignored that aspect of it, because I assumed that all the people capable of implementing this proposal would just refuse stuff like that outright, and that the inclusion of it would guarantee the RFC fails, so no point in worrying.

But the broader question you are presenting about the focus and goals of the proposal, and how the specifics relate to that, is actually a question that I share.

I hope the above begins to address that. Package management I think should be the main topic, and from here forward I’ll leave aside any unnecessary parser changes that might occur when code is imported as they are distractions. Those I continue to bring up I’ll state why, and those who are more familiar with how the engine works can speak to whether such changes truly are useful or unnecessary. If I’m wrong, then dropping such suggestions entirely is the way to go.

On Thu, Jun 27, 2024 at 6:07 PM Rob Landers <rob@bottled.codes> wrote:

Internals has made it pretty clear: no more declare or ini entries (unless it is absolutely needed).

Noted.

I personally don’t like it because it uses arrays, which are opaque, easy to typo, and hard to document/check.

Instead, maybe consider a new Reflection API?

(new ReflectionModule)->import('MyModule')->run()

That doesn’t solve the problem of how the parser figures out where the code is. That’s got to happen somewhere. I’ll come back to this in a moment.

Keep in mind that extensions typically expose functions automatically, and under the original proposal those functions have to be imported to be used: import mysql_query

they also do now, unless you either prefix them with \ or rely on the fallback resolution system. I’m honestly not sure we need a new syntax for this, but maybe just disable the global fallback system in modules?
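For readers less familiar with the fallback being discussed, here is a minimal sketch of how unqualified names resolve inside a namespace today. The namespace `Demo` is invented for the example.

```php
<?php
namespace Demo;

// Unqualified function call: PHP looks for Demo\strlen() first and,
// finding none, falls back to the global strlen().
$viaFallback = strlen('module');

// Fully qualified call: resolved directly, no fallback involved.
$viaGlobal = \strlen('module');

// Classes get no such fallback: this is looked up as Demo\ArrayObject only.
$classFallback = true;
try {
    new ArrayObject();
} catch (\Error $e) {
    $classFallback = false;
}
```

Disabling the fallback inside modules would make the first call behave like the class lookup, so global functions would need a leading backslash or an explicit import.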

I’m not sure that’s a good idea, neither was this.

Perhaps PHP imports, unlike their JavaScript or even Java or C# counterparts, could be placed in try/catch blocks, with the catch resolving what to do if the import misses.

Which is something I wrote, yet a day later - yuck. I do not like. But I’m in brainstorm mode, playing with ideas with everyone.

I really don’t like the extension games seen in node with js, cjs and mjs, but there’s a precedent for doing it that way. In their setup if you’ve set modules as the default parse method then cjs can be used to identify files that still need to use CommonJS. And mjs can force the ES6 even in default mode. But it is a bit of a pain and feels like it should be avoided.

I would argue that it be something seriously considered. Scanning a directory in the terminal, in production systems, while diagnosing ongoing production issues, it can be very handy to distinguish between the “old way” and “new way”, at a glance.

Fair point.

the only thing I don’t like about this import/export thing is that it reminds me of the days when we had to carefully order our require_once directives to make sure files were loaded before they were used. So, I think it is worth thinking about how loading will work and whether loading can be dynamic, hoisted out of function calls (like js), how order matters, whether packages can enrich other packages (like doctrine packages) and if so, how much they can gain access to internal state, etc. This is very much not “a solved problem.”

In JavaScript import must be top of the file - you’ll get an error if you try an import following any other statement unless it’s a dynamic import(), which is a whole other Promise/Async/Kettle of fish that thankfully PHP does not have to take into account as, until you get used to it (and even after), async code is a headache.

Are you sure? I don’t remember them removing import hoisting, but it’s probably more of a typical linting rule because it is hard to reason about.

Likely correct - I do use linters heavily. Hoisting is evil (necessary, but still evil).

On Thu, Jun 27, 2024 at 6:13 PM Rowan Tommins [IMSoP] <imsop.php@rwec.co.uk> wrote:

Thank you for sharing. I think it’s valuable to explore radical ideas
sometimes.

I do think PHP badly needs a native concept of “module” or “package” -
in fact, I’m increasingly convinced it’s the inevitable path we’ll end
up on at some point. BUT I think any such concept needs to be built on
top of what we have right now. That means:

  • It should build on or work in harmony with namespaces, not ignore or
    replace them
  • It should be compatible with Composer, but not dependent on it
  • It should be easy to take existing code, and convert it to a
    module/package
  • It should be easy to carry on using that module/package after it’s
    been converted

On all these points, agreed.

If we can learn from other languages while we do that, I’m all for it;
but we have to remember that those languages had a completely different
set of constraints to work with.

For instance, JS has no concept of “namespaces”, but does treat function
names as dynamically scoped alongside variables. So the module system
needed to give a way of managing how you imported names from one scope
to another. That’s not something PHP needs, because it treats all names
as global, and namespaces have proved an extremely successful way of
sharing code without those names colliding.

Very good point.

Other parts of your e-mail are essentially an unrelated idea, to have
some new “PHP++” dialect, where a bunch of “bad” things are removed.

Let’s set that aside then. Better package management is a big enough dragon to slay.

On Thu, Jun 27, 2024 at 8:16 PM Mike Schinkel <mike@newclarity.net> wrote:

This is a long reply rather than send a bunch of shorter emails.

On Jun 27, 2024, at 2:10 PM, Deleu <deleugyn@gmail.com> wrote:

Overall, I think PHP has already reached the limit of surviving with only PSR-4 and Composer. Single class files were a great solution to get us out of the nightmare of require and import on top of PHP files. But more than once I have had the desire to declare a couple of interfaces in a single file, or a handful of Enums, etc.

This.

I cannot overemphasize how nice it is to work in Go where I can put almost any code I want in any file I want without having to think about autoloading.

Go is cool. I need to use it more. These days JavaScript gets most of my time, but PHP will always be the language that got me into programming professionally and for that I’ll be eternally grateful.

As I understand the proposal, this would have no BC issues for code not in modules. PHP could then set rules for code in modules that would not need to be directly compatible with code outside modules.

That is the goal. Module code should be allowed to be different if the optimization makes for faster running and easier to understand code (for the programmer, the IDE, and the parser itself). Changing things for the sake of changing them, no.

At least to me this does not feel as big as trying to implement unicode.

I would hope not, because that turned out to be well-nigh impossible.

  2. No need for autoloaders with modules; I assume this would be obvious, right?

Depends largely on whether modules can include and require to get access to old code. I also didn’t discuss how they behave - do they share their variables with includes and requires?

I was presuming that all old code would use autoloaders but modules would be free to do it a better way.

If you need to call code from a namespace from inside a module, sure, the autoloader would be needed.

This is correct and what I had in mind.

  6. Modules should be directories, not .php files. Having each file be a module makes code org really hard.

Yes, but that is how JavaScript currently handles things. It is currently necessary when making large packages to have an index.js that exports out the public members of the module. This entry point is configurable through the package.json of the module.

I am envisioning that there could be a module metadata file that would have everything that PHP needs to handle the module. It could even be binary, using protobufs:

An interesting idea. I need to research this some.

node_modules IMO is one of the worst things about the JavaScript ecosystem. Who has not seen the meme about node_modules being worse than a black hole?

Fair enough. Or maybe import maps would be a better way forward.

But ensuring that it is possible to disallow loading needs to be contemplated in the design. PHP has to be able to know what is a module and what isn’t without expensive processes.

One possible solution is that if modules do not have <?php ?> tags, ever, and someone directly tries to load a module through http(s) the file won’t execute. Only files with <?php ?> tags are executable by the web sapi.

  10. Having exports separate from functions and classes seems like it would be problematic.

Again, this is how they work in JavaScript. Not saying that’s the best approach, but even if problematic it’s a solved problem.

I have evidently not written enough JavaScript to realize that.

JavaScript is an odd prototypical duck. Everything ultimately is an object. Tha

I’m also interested in learning on how other module systems out there do work.

I am very familiar with modules (packages) in GoLang and think PHP could benefit from considering how they work, too.

I’ve only scratched the surface on how GoLang does things. Some of it was confusing to me at first. It’s also been a while so I’d need to refresh my memory to speak to it.

On Jun 27, 2024, at 3:22 PM, Michael Morris <tendoaki@gmail.com> wrote:
Composer would need a massive rewrite to be a part of this since it currently requires the file once it determines it should do so. If we do a system where import causes the parser to act differently then that alone means imports can’t be dealt with in the same manner as other autoloads.

That is why I am strongly recommending a modern symbol resolution system within modules vs. autoloading.

Ok.

I’m not fond of this either.

There will need to be a way to define the entrypoint php. I think index.php is reasonable, and if another entry point is desired it can be called out → “mypackage/myentry.php”

Why is an entry point needed? If there is a module metadata file as I am proposing PHP can get all the information it needs from that file. Maybe that is the .phm file?

Maybe. Again, I need to look over this metadata format. Also, how does it get created?

On Jun 27, 2024, at 4:54 PM, Rob Landers <rob@bottled.codes> wrote:

Thanks. The sticking point is what degree of change should be occurring. PHP isn’t as behind an 8-ball as JavaScript is since the dev can choose their PHP version and hence deprecation works most of the time for getting rid of old stuff. But not always. Changes that are incompatible with what came before need a way to do things the old way during transition. Again, see PHP 6 and unicode, which snowballed until it was clear that even if PHP 6 had been completed it wouldn’t be able to run most PHP 5 code.

It’s not just up to the dev, but the libraries we use and whether or not we can easily upgrade (or remove) them to upgrade the php version.

By “upgrade” then, do you mean convert them into modules, or just be able to use them as-is?

As I read it and am envisioning it, there would be no changes needed to be able to use them as-is.

Any system that blocks existing code from being used would be a non-starter for inclusion.

I think it would be a mistake to exclude old code and/or prevent templating. Not only is there now decades-old code in some orgs, but how would you write an email sender that sent templated emails, provide html, generate code, etc.? There has to be an output from the code to be useful.

Excluding old code or templates from modules would not exclude them from working as they currently do outside modules. As I see it, modules would be more about exporting classes and functions, not generating output per se.

So all those decades of old code could continue to exist outside modules, as they do today.

Exactly this.

I think it’s fine to use js as an inspiration, but it isn’t the only one out there. There is some precedent to consider directories as modules (go calls them “packages”) and especially in PHP where namespaces (due to PSR-4 autoloading) typically match directory structures.

Totally agree about inspiration for modules outside JS, but not sure that PHP namespaces are the best place to look for inspiration.

Namespaces by their very nature were designed to enable autoloading with a one-to-one file to class or interface, and by nature add conceptual scope and complexity to a project that would not be required if a modern module/package system were added to PHP.

Modules could and IMO should be a rethink that learns the lessons other languages have learned over the past decade+.

Agreed.

Node.js uses package.json and the attendant npm to do this sort of prep work. And it’s a critical part of this since modules can be versioned, and different modules may need to run different specific versions of other modules.

Please, please, please do not make a json file a configuration language. You can’t comment in them, you can’t handle “if php version <9, load this, or if this extension is installed, use this.”

Maybe that is desirable, but doing things slightly different based on extensions loaded is def a thing.

I don’t think commenting is important in this file, or even desired.

As I proposed above, these could be protobuf or phar. These should be build artifacts that can be generated on the fly during development or for newbies even during deployment, not hand-managed.

Hand management has value in learning the underlying concepts though.

I could see the generation of two files; one in binary form and one that is readonly so a developer can double-check what is in the current protobuf or phar file.

Those are implementation details a little further down the road than we’re ready for, I think.

Personally, if these are going to have any special syntax, we probably shouldn’t call them .php files. Maybe .phm?

I was going to suggest that, and then remembered earlier PHP when there were multiple file extensions and that was a nightmare.

This does remind me to mention that I think there should be a required “module” declaration at the top of each file just like Go requires a “package” declaration at the top of each file. That would make it trivial for tooling to differentiate, even with grep.

Fun idea, if the @ operator is ditched as an error suppression operator it could be used as the package operator. (If I manage to talk everyone into getting rid of one thing, it’s @).


The only thing I don’t like about this import/export thing is that it reminds me of the days when we had to carefully order our require_once directives to make sure files were loaded before they were used. So, I think it is worth thinking about how loading will work and whether loading can be dynamic, hoisted out of function calls (like js), how order matters, whether packages can enrich other packages (like doctrine packages) and if so, how much they can gain access to internal state, etc. This is very much not “a solved problem.”

That is why I proposed having a “compiled” module symbol table to eliminate most (all?) of those issues.

The more you bring it up, the more I am reminded of the import-map directive added to client-side JavaScript.

On Jun 27, 2024, at 6:00 PM, Rowan Tommins [IMSoP] <imsop.php@rwec.co.uk> wrote:
I do think PHP badly needs a native concept of “module” or “package” - in fact, I’m increasingly convinced it’s the inevitable path we’ll end up on at some point. BUT I think any such concept needs to be built on top of what we have right now. That means:

  • It should build on or work in harmony with namespaces, not ignore or replace them

It may be an unpopular opinion, but I would argue that namespaces were optimized for autoloading and the one class/interface per file paradigm, not to mention the regrettable choice of using the escape character to separate namespaces and the fact that PHP throws away a lot of information about namespaces at runtime.

I remember when the choice to use \ was made. I’ve rarely been so angry about a language design choice before or since. I’ve gotten used to it, but seeing \ all over the place in strings is still… yuck.

IMO allowing modules to eventually deprecate namespaces, at least in a de facto form of deprecation, would allow modules to be much better than if they try to cling to a less desirable past.

  • It should be easy to take existing code, and convert it to a module/package

Maybe, but not if that means modules retain baggage that should really be jettisoned.

and namespaces have proved an extremely successful way of sharing code without those names colliding.

At the expense of a lot more complexity than necessary, yes.

Managing symbols in a module need not be a hard problem if PHP recognizes modules internally rather than trying to munge everything into a global namespace like with namespaces.

I’m inclined to agree on these points, but I also don’t know the engine internals that well. Intuitively it would seem keeping the symbol table small would make the code go faster.

On Jun 27, 2024, at 6:41 PM, Larry Garfield <larry@garfieldtech.com> wrote:
What problem would packages/modules/whatever be solving that isn’t already adequately solved?

Not speaking for Michael, obviously, but speaking for what I envision:

  1. Adding a module/package system to PHP with modern module features
     • including module private, module function, and module properties
  2. Providing an alternative to auto-loader-optimized namespaces.
     • better code management and better page load performance

Couldn’t have said it better myself.

Do we want:

  1. Packages and namespaces are synonymous? (This is roughly how JVM languages work, I believe.)
  2. Packages and files are synonymous? (This is how Python and Javascript work.)
  3. All packages correspond to a namespace, but not all namespaces are a package?

I would argue packages (modules) should be orthogonal to namespaces to allow modules to be optimized for what other languages have learned about packages/modules over the past decade+.

The facts that namespaces use the escape character as a separator, that PHP does not keep track of namespaces after parsing, and that they were optimized for one-to-one symbol-to-file autoloading are enough reasons IMO to envision a way to move on from them.

And given the near-universality of PSR-4 file structure, what impact would each of those have in practice?

Orthogonal. Old way vs new way. But still completely usable, just not as modules without conversion.

PSR-4 is an artifact of autoloading single-symbol files and thus a sunk cost. The fact that it currently exists does not mean PHP should cling to it for modules.

I have nothing to add to the above.

On Fri, Jun 28, 2024, at 09:07, Michael Morris wrote:

This is a very long reply to several emails.

On Thu, Jun 27, 2024 at 5:45 PM Jim Winstead <jimw@trainedmonkey.com> wrote:

The angle I am coming at this from is improving the developer experience around “packages” or “modules” or whatever you want to call them, and so much of this proposal doesn’t seem to be about that.

Ok, first problem - not a proposal really, but a ramble trying to get to a proposal. Before I made the first post the idea was knocking around in my head and wouldn’t go away, so I just stream of consciousness listed what’s going through my head. That leads to the second point you made.

I could have made that point in other ways, and I’m sorry that my first attempt came off as insulting. It really concerned me when I already saw discussion about taking this off-list and going into the weeds on technical details when the problem that is being addressed by this proposal is extremely unclear to me.

It is unclear even to me. Perhaps I shouldn’t have posted out something this half baked. That said, pruning off large sections of language functionality is a distraction. For now let’s just note that it is a possibility to improve the language this way afforded by the fact that import would be a new way of bringing scripts in. Could isn’t should. Also, at the moment again it’s a distraction. Let’s focus down on how code is imported.

First though, a history review, partially to get this straight in my own head but hopefully of use for those following along. Why? Knowing how we got where we are is important to some degree to chart a way forward.

PHP started as a template engine. By modern standards, and compared to the likes of twig, it’s a very bad template engine, but that doesn’t really matter because it’s evolved into a programming language in its own right over the last nearly 20 years.

How do you think twig works, exactly? You should probably check it out, because templates compile down to regular PHP templates; at least they did the last time I looked at it a few years ago. How do you think emails are templated by php code? How do you think anything is output? By either “echo” or ?> content <?php, or fwrite/file_put_contents.

Without that, there is literally no purpose to php code (or any code).

Include, include_once, require, and require_once have been around since the beginning as the way to splice code files together. The behavior of these statements calls back to PHP’s origin as a template engine, as they do things that similar mechanisms like JavaScript’s import (and, for that matter, their equivalents in C# and Java) do not. Their scope behavior is very different from import mechanisms in other languages, as they see the variables in the scope of the function they were invoked from or the global scope when called from there. Their parsing can be aborted early with a return. They can return a value, which is quite unusual to be honest. None of this is bad per se, but it is different, and the question arises whether it is necessary.
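
The three behaviours listed above (caller-scope visibility, early return, and a return value) can be seen in a few lines. This is only a sketch, assuming PHP 7.3 or later for the nowdoc syntax; the temp file, the `$greeting` variable, and the `render()` helper are made up for illustration.

```php
<?php
declare(strict_types=1);

// Write a throwaway include file; the path and contents are illustrative only.
$path = tempnam(sys_get_temp_dir(), 'inc');
file_put_contents($path, <<<'CODE'
<?php
// $greeting is not defined in this file: it comes from the including scope.
if (!isset($greeting)) {
    return null; // a top-level return stops processing of this file early
}
return strtoupper($greeting);
CODE);

function render(string $path): ?string
{
    $greeting = 'hello';   // local to render(), yet visible inside the file
    return include $path;  // include evaluates to the file's return value
}

$fromFunction = render($path);  // the file saw render()'s $greeting
$fromGlobal   = include $path;  // no $greeting in the global scope here
unlink($path);

var_dump($fromFunction, $fromGlobal); // string(5) "HELLO" and NULL
```

An ES module, by contrast, gets its own scope and cannot see the importer's local variables, which is the difference the paragraph is pointing at.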

How do you think javascript import works, exactly? They load a file which returns a value via export.

There is nothing inherently wrong with requires/includes; some equivalent is literally required by every language via some mechanism or another (C’s include statements, Go’s go.mod file, JavaScript’s import/package.json, C#'s project config, etc). There’s no magical thing here, just abstractions and different levels of it.

One artifact of their behavior that is bad in my opinion is that they start from the standpoint of being text or html files.

Every single language starts with a text file… There’s nothing inherently special about the bytes in any source code file and they only have meaning due to creating a parser that can make sense of the stream of bytes. The fact that they also have meaning to humans is what makes it source code and not object code/byte code.

If the included file has no PHP tags then the contents get echoed out. If there are no output buffers running this can cause headers to be set and fun errors to be had. So they can’t be used to create files that can only echo explicitly (that is, a call to the echo statement or the like).
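
A small sketch of the tag-less include behaviour described above, with an output buffer used to catch what would otherwise be sent to the client. The file contents and the `hidden()` name are hypothetical.

```php
<?php
// A file that looks like PHP code but has no opening tag (illustrative content).
$path = tempnam(sys_get_temp_dir(), 'raw');
file_put_contents($path, "function hidden() { return 42; }\n");

ob_start();               // capture whatever the include emits
include $path;
$output = ob_get_clean();
unlink($path);

var_dump(function_exists('hidden')); // bool(false): nothing was compiled as PHP
var_dump($output);                   // the file's text, emitted verbatim
```

Without the `ob_start()` call the text would go straight to the output stream, which is where the "headers already sent" class of errors comes from.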

For headers, this is largely up to the SAPI (the program that executes the PHP code, eg, frankenphp, php-fpm, mod-cgi, roadrunner, etc) and the fact that most SAPIs want to be able to run existing code where developers have certain expectations of behavior. FrankenPHP did a little something different with the support of the 103 status code which isn’t supported in any other SAPI (AFAIK). The CLI sapi doesn’t output any headers whatsoever.

As far as PHP opening tags go… I don’t even notice it. They’re there, just like the “package” declaration on the top of every one of my Go files.

Fast forward a bit - PHP 5.3 introduced namespaces to deal with the overloaded symbol tables. They are a bit of a hotwire as (if I’m not mistaken, it’s been a couple years since I read the discussion on it) they just quietly prepend the namespace string in front of the name of all new symbols declared in the namespace for use elsewhere. As a result, PHP namespaces don’t do some of the things we see in the namespaces of other languages (looking at Java and C# here). For example, privacy modifiers within a namespace aren’t a thing.
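
The "prepend the namespace string" behaviour is easy to observe from userland. A minimal sketch follows; the `Acme\Mail` namespace and its symbols are invented for the example.

```php
<?php
namespace Acme\Mail;   // hypothetical namespace for illustration

class Message {}

function send(): void {}

// The declared names are stored with the namespace prefixed onto them.
$className  = Message::class;                        // the full prefixed name
$classKnown = \class_exists('Acme\\Mail\\Message');  // looked up by the full string
$funcKnown  = \function_exists('Acme\\Mail\\send');
$shortKnown = \class_exists('Message', false);       // no global class of that name

\var_dump($className, $classKnown, $funcKnown, $shortKnown);
```

At runtime there is only the long string: the engine has no separate "namespace" object that could carry visibility rules, which is why namespace-level privacy is not available today.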

This would be nice to have … maybe. But namespaces have been around, what, 10-15 years? I think if someone wanted to “fix” it, it would have been fixed by now.

Very quickly after PHP 5.3 was released, autoloaders showed up. At some point support for multiple autoloaders was added. Several schemes were added, PSR-4 won out, and composer showed up to leverage this. Composer is based on NPM, even to the point where json is used to configure it, and the composer.json file is fairly close to npm’s package.json file even now. It’s a userland solution, but to my knowledge WordPress is the only widely used PHP application out there that doesn’t use it directly (there is a Composer WordPress project).

WordPress predates composer et al. by more than 10 years (if you count the B2 code it was forked from). Why would it use it? From working at Automattic (I left a couple of years ago, and this is merely what I saw as an observer, I wasn’t closely involved with any of the open-source side), there was a bit of a push to make it happen as it looked like Composer would be around for a while, but then there was talk about an “official” loader/thing from php itself and I think they’d rather use that instead. When you have a project that has literally been around decades, you don’t change out your whole system on the whims of what is fashionable at the time; you either innovate or wait and see what becomes standard. Composer has only been around 50% of the time WordPress has been around now; and it didn’t start out as immediately popular.

Before composer, and before namespaces there was PECL. Composer has eclipsed it because PECL has the limitation of being server-wide. It never really caught on in the age of virtual hosting with multiple PHP sites running on one box. Today we have Docker, but that didn’t help PECL make a comeback because by the time Docker deployment of PHP sites became the norm composer had won out. Also, composer library publishing is more permissive than PECL. I’ll stop here lest this digress into a Composer v PECL discussion - suffice to say stabs at bringing code packages into PHP aren’t a new idea, and a survey of what’s been done before, what was right about those attempts and what was wrong needs to be considered before adding yet another php package system into the mix.

I don’t think PECL and Composer have much in common… at all. They are not even comparable, and PECL is still the best way to install extensions (in fact, it is the only way in a docker container AFAIK) on a self-compiled php.

The main influence of composer and autoloaders for preparing packages is that PHP has become far more Object Oriented than it was before. Prior to PHP 5.3 object oriented programming was a great option, but since autoloaders cannot bring in functions (at least not directly, they can be cheated in by bundling them in static classes which are all but namespaces) the whole ecosystem has become heavily object oriented.

require/include still works fine, as does using the “files” key under “autoload” in composer.json.

I don’t find most of these “problems” actually valid. There is some merit to the conclusions, but the premise feels shaky.

— Rob

First of all a quick note for the OP: I am all for it in general, but I don’t think copying the entire JS module system one to one makes sense. It contains a lot of compromises and mistakes that we should absolutely learn from as well as the good things they did.

Which of these are we trying to solve? (Solving all of them at once is unlikely, and some are mutually-incompatible.)

1. Adding a "strict pedantic mode" without messing with existing code?
2. Package-level visibility (public, package, protected, private)?
3. Avoid name clashes?
4. Improved information for autoloaders and preloading, possibly making class-per-file unnecessary in many cases?
5. A larger scope for the compiler to analyze in order to make optimizations?
6. Package-level declares, inherited by all files in the package?
7. Something else?

I agree with most of your analysis, and IMO Package-level visibility is the main direct win, with a larger scope for JIT optimization coming later.

It would however be very tempting to bake in 1, and remove a bunch of things which are not removable from the language at large due to BC, as that might be a once in a lifetime opportunity. Some features make JIT optimizations nearly impossible (Nikita had a list somewhere… but the main one is probably killing references).

As for the autoloader information, to be honest I am not sure how important it is. For everyone not wanting to do class-per-file, note that you can just use “classmap” autoloading in Composer. It is anyway the most performant option at runtime [1]. The only catch is you have to re-dump the autoloader when adding new classes/files to make them discoverable. But I think everyone’s kinda too stuck on PSR-4 because it is a standard.

Do we want:

1. Packages and namespaces are synonymous?  (This is roughly how JVM languages work, I believe.)
2. Packages and files are synonymous?  (This is how Python and Javascript work.)
3. All packages correspond to a namespace, but not all namespaces are a package?

And given the near-universality of PSR-4 file structure, what impact would each of those have in practice?  (Even if packages open up some new autoloading options and FIG publishes a new PSR for how to use them, there's only a billion or so PSR-4 class files in the wild that aren't going away any time soon.)  My gut feeling is we want 3, but I'm sure there's a debate to be had there.

I’d go for 3 as well. Every package having a single root namespace is probably true of 99% of packages due to the PSR-4 autoload root. Sub-namespaces are discretionary.

[1] https://getcomposer.org/doc/articles/autoloader-optimization.md

-- 
Jordi Boggiano
@seldaek - https://seld.be

On Fri, 28 Jun 2024, at 09:12, Mike Schinkel wrote:

On Jun 28, 2024 at 2:54 AM, <Rowan Tommins [IMSoP]> wrote:
I don't see any particular relationship between namespaces and autoloading, or any reason we need to throw them away to introduce different conventions for loading files.

Sure, you can make the argument they are not related, but then you have to ask if namespaces would look the way they do if it were not for the need to map them to be able to autoload symbols. I do not think they would.

Autoloading is by-nature one symbol per file. Namespaces were designed for mapping with "<namespace>/<className>.php" to allow autoloading.

Working in languages where I neither have to think about or run userland code to handle autoloading, nor have to be concerned about loading in the proper order, has been such a joy compared to the pain of working in PHP.

Namespaces don't require autoloading, and autoloading doesn't require one file per class.

To compile a program with multiple source files, in any language, you need one of two things:

a) A list of files you want to compile. Maybe auto-generated, maybe done with a recursive iteration over a directory, but ultimately the compiler needs a file path to process.
b) A way for the compiler to tell, based on some symbol it wants to resolve, which file should be compiled.

PHP originally provided only option (a), via the include and require keywords. Autoloading adds option (b), where you provide a function which takes a class name and does *whatever you want* to find the definition.
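
As a sketch of option (b), and of the earlier point that autoloading does not require one file per class, the loader below maps two symbols to the same file. It assumes PHP 8.0 or later (constructor promotion); the `Demo\Shapes` names and the hand-written class map are hypothetical, not anything Composer generates verbatim.

```php
<?php
declare(strict_types=1);

// One throwaway file that declares an interface AND a class.
$file = tempnam(sys_get_temp_dir(), 'shapes');
file_put_contents($file, <<<'CODE'
<?php
namespace Demo\Shapes;

interface Shape
{
    public function area(): float;
}

final class Square implements Shape
{
    public function __construct(private float $side) {}

    public function area(): float
    {
        return $this->side ** 2;
    }
}
CODE);

// Hand-written class map: both symbols resolve to the same file.
$classMap = [
    'Demo\Shapes\Shape'  => $file,
    'Demo\Shapes\Square' => $file,
];

// The autoloader receives a class name and may do whatever it wants with it.
spl_autoload_register(static function (string $class) use ($classMap): void {
    if (isset($classMap[$class])) {
        require_once $classMap[$class];
    }
});

$square = new \Demo\Shapes\Square(3.0);  // triggers the loader once
$area   = $square->area();
unlink($file);

var_dump($area);
```

The one-class-per-file habit comes from PSR-4's name-to-path convention; a map-based loader like this has no such constraint, at the cost of having to keep the map up to date.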

I think it might be time to re-visit the tooling around option (a), as OpCache makes the cost of eagerly loading a list of files much lower than it was when autoloading was added. That could be as simple as include_all($directory), or as fancy as include_from_manifest_file($some_crazy_binary_file_format); either could be implemented right now in userland, because it all eventually comes down to calling include or require.
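
A userland `include_all()` along these lines might look like the sketch below. Only the function name comes from the message; the behaviour (recursive, `*.php` files only, loaded in sorted path order) is an assumption made for the example.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical userland include_all(): eagerly loads every .php file
 * under $directory and returns the list of paths it loaded.
 */
function include_all(string $directory): array
{
    $files = [];
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($directory, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $file) {
        if ($file->isFile() && $file->getExtension() === 'php') {
            $files[] = $file->getPathname();
        }
    }
    sort($files);             // deterministic load order
    foreach ($files as $path) {
        require_once $path;   // note: runs in include_all()'s local scope
    }
    return $files;
}

// Demo with a throwaway directory holding two files.
$dir = sys_get_temp_dir() . '/include_all_' . uniqid();
mkdir($dir . '/sub', 0777, true);
file_put_contents("$dir/a.php", "<?php function demo_a(): string { return 'a'; }\n");
file_put_contents("$dir/sub/b.php", "<?php function demo_b(): string { return 'b'; }\n");

$loaded   = include_all($dir);
$combined = demo_a() . demo_b();

// Clean up the demo files.
unlink("$dir/sub/b.php");
unlink("$dir/a.php");
rmdir("$dir/sub");
rmdir($dir);

echo count($loaded), ' files loaded, functions returned: ', $combined, PHP_EOL;
```

A real version would need to think about load order, since a class generally has to be declared before a file that extends it is loaded when no autoloader is registered; sorting by path is just the simplest deterministic choice.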

My opinions match Larry's almost exactly: I want package-level optimisation, and package-private declarations. But I don't want to rewrite my entire codebase to start using a completely different naming system.

I can't see how package-privates would be of any value to you *unless* you rewrite your codebase.

Simple: I have private Composer packages, right now, that group all their classes under a particular namespace prefix. I want to be able to mark some classes in that namespace as "internal".

I do not want to change every place that uses the existing classes to reference "ModuleName@ClassName" instead of "NamespacePrefix\ClassName", or change every "use" to "import from".

And rewrite your entire codebase? Why? No need to rewrite if you don't need the specific features. Just because there is a new feature doesn't mean you have to use it if there is no benefit to you using it.

Code doesn't exist in isolation; if Symfony Mailer is re-published as a "module", every single application that uses it needs to change their code from referencing namespaced classes, to having "import" statements.

As for package-level optimisation, you'll need to give examples of what you mean there as I don't want to wrongly assume.

Currently, OpCache only optimises per file, because it can't guarantee how files will be used together.

A simple example is function fallback: if you could declare a package as "completely loaded", OpCache could replace all references to "strlen" with "\strlen", knowing that no namespaced function with that name could be added later.
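
The fallback rule can be seen in a short script. In this sketch the `App` namespace and both helper functions are hypothetical; the point is that the unqualified call cannot be bound when the file is compiled, because a namespaced `strlen` may still appear afterwards.

```php
<?php
namespace App;   // hypothetical namespace

function measure(string $s): int
{
    return strlen($s);    // unqualified: App\strlen if it exists, else \strlen
}

function measureGlobal(string $s): int
{
    return \strlen($s);   // fully qualified: always the built-in
}

// A conditional declaration only takes effect when execution reaches it,
// i.e. after this file has already been compiled.
if (!\function_exists('App\\strlen')) {
    function strlen(string $s): int
    {
        return -1;
    }
}

$unqualified = measure('abc');        // resolved at run time to App\strlen
$qualified   = measureGlobal('abc');  // the built-in, regardless of namespace

\var_dump($unqualified, $qualified);  // int(-1) and int(3)
```

If a package could be declared "completely loaded", the compiler would know no `App\strlen` can show up later and could treat `measure()` like `measureGlobal()`.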

Not to mention that working with a combination of existing namespaced packages and "new shiny module" packages is going to be inevitable, so we can't just hand-wave that away.

I do not follow your train of thought here.

What specifically are you accusing me of "hand-waving away?"

I didn't intend it as a personal accusation, apologies if it came across that way.

What I meant was: we can't just treat namespaces and modules as completely separate things, and assume that every code file will be using one style or the other. We have to imagine the user experience when there is a mix of the two.

I can't imagine it being pleasant to have a mix of "import" and "use" statements at the top of a file, with different and even conflicting semantics.

>1. Adding a module/package system to PHP with modern module features

I find that "modern" often just means "fashionable". Please, let's be specific.
What is different between imports and namespaces, and why is it a good thing?

1. Namespaces are a parsing construct but not an AST construct beyond scoping. This has many ramifications which have often been mentioned as limitations on this mailing list.

2. Namespaces cannot provide code isolation and encapsulation, unlike more "fashionable" modules/packages. ;-)

3. Namespaces have no runtime behavior, but more "fashionable" modules/packages often do.

Perhaps I didn't word the question well. What I'm really asking, as someone who's never used JS or Go modules, is why I'd want to write "import", rather than referencing a global name in a hierarchy.

That's really all I mean by "making it compatible with namespace": I want "new \Foo\Bar\Baz;" to be able to refer to a "packaged" class, probably by having a way to mark all classes under "\Foo\Bar" as belonging to a particular package.

4. The usage of the escape character for namespace separator makes dynamic programming tedious and error prone.

Sorry, not interested.

5. Because of one-to-one symbol-to-file for autoloading, Namespaces by nature result in a large number of files in a large number of directories and do not allow code organization optimized for cohesiveness.

See above - this is not related to namespaces.

6. Modules and packages are typically small-scoped to a single directory and it is a code smell to have many different packages tightly coupled, as is the case with namespaces. Forcing modules to munge with namespaces would mean most modules would be written with those code smells for years because that will be how everyone doing a quick shift from namespace to module will write their modules.

Again, this is entirely about code style, and not something the language can control.

Also, the JS insistence on having a separate package for every tiny function is a common source of criticism, so personally I am very happy that PHP packages are generally larger than that.

7. In designing modules, if modules and namespaces were munged together then every single design decision made for modules will have to be compatible with namespaces. I cannot currently know what all constraints will emerge but I can almost guarantee that modules would be less well-designed if they have to be shoehorned to be fully compatible with namespaces.

That said, maybe the best solution is to NOT put the stake in the ground right now and say "They must be namespace compatible" or "They must not be namespace compatible" but move forward with an open mind so that we can tease out exactly how namespaces would constrain modules and then make the decision later for what would be in the best interest of moving PHP into the future.

If and when an actual problem arises, let's discuss it.

What specifically stops us doing all the things you've been discussing around loading, and visibility, etc, in a way that's compatible with the 400_000 packages available on Packagist, and billions of lines of existing code?

You speak as if I am proposing getting rid of namespaces and making those 400_000 packages available on Packagist, and billions of lines of existing code not work in PHP. Of course not.

No, I'm saying that every one of those packages could benefit if we make incremental changes.

I don't want to couple it so that you can't have "package private" without also switching to some new "advanced" dialect of the language, and I don't see any reason why we need to do so.

Maybe package scoped declares could allow opting in to certain checks, but I don't think "is in a package" and "has been audited for a load of extra breaking changes" should be set by the same flag.

Leaving the rest of your reply here, since you accidentally sent it privately:

What I am saying is that we should design modules from a cleaner slate than namespaces, and allow solving problems that concerns for BC have always stopped PHP from solving.

Besides, when a language evolves and adds new features, it rarely works to shoehorn existing code AS-IS into the new constructs because doing so does not take advantage of the new capabilities. Just because you have a ton of code written for namespaces doesn't mean modules should be constrained to make it easy for you to move your namespaces to modules without a redesign.

But as I am proposing your namespaced code would continue to work exactly as before.

By their nature, beginners would not be as likely to use modules and would be likely to stick to existing PHP style. Intermediate to advanced programmers could instead be the target market for modules.

There has always been a divide in PHP between those who want a really advanced language and are happy to break compatibility to get there, and those who want PHP just the way it has always been. Modules could easily require typing for all things that can be typed, for example. Modules could be the thing that finally addresses the needs of intermediate to advanced developers while keeping everyone else who wants to keep PHP as a more beginner-friendly language happy.

--
Rowan Tommins
[IMSoP]

On Jun 28, 2024, at 3:07 AM, Michael Morris <tendoaki@gmail.com> wrote:

On Thu, Jun 27, 2024 at 8:16 PM Mike Schinkel <mike@newclarity.net> wrote:

node_modules IMO is one of the worst things about the JavaScript ecosystem. Who has not seen the meme about node_modules being worse than a black hole?

Fair enough. Or maybe import maps would be a better way forward.

Import maps are really a small part of what PHP actually needs. For example, is it a class, an interface, or a function? For a module, is it a property?

I envision basically that this file, whatever it would be called, would be a pre-compilation of everything that PHP can pre-compile about the files contained within the module/directory.

See below where I talk about a pre-compiled .php.module

But ensuring that it is possible to disallow loading needs to be contemplated in the design. PHP has to be able to know what is a module and what isn’t without expensive processes.

One possible solution is that if modules do not have <?php ?> tags, ever, and someone directly tries to load a module through http(s) the file won’t execute. Only files with <?php ?> tags are executable by the web sapi.

Except that would require parsing every file in the directory in full to know (unless everything were pre-compiled as I am advocating).

Still, I think it would be better to be explicit, and for that I would propose the first line in the file needs to start with “module” and have the name of the module.

I’ve only touched the surface on how GoLang does things. Some of it was confusing to me at first. It’s also been a while, so I’d need to refresh my memory to speak to it.

In Go, modules (or, in this context, the more correctly named “packages”) are:

  • A collection of files grouped into a directory and thus all files in that directory are in the same package.
  • Public or private scope is determined by the case of symbols; lowercase are private and uppercase are public. People coming from other languages tend to hate this, but I have come to love it because it makes code less dense while conveying the same information as “public” and “private” keywords. It also makes code across different developers more consistent.
  • Packages can be nested in package directories, but…
  • There is no concept of a “sub” package, meaning there are no hierarchies when packages are used in code (there is a file path hierarchy but that is only relevant for importing the package.) When I started working with Go I thought that was unfortunate. Now after 5+ years working with Go I see it as a really good decision.

  • Package files must have a “package” statement at the top, and all files in the directory must have the same “package” statement, with one caveat.

  • That caveat is that package files can have package <packagename>_test as a package name, and that file is assumed to contain a test, but it cannot see private members in package <packagename>.

  • Test files are typically named to pair with a <filename>.go and would be named <filename>_test.go. That file’s package name can either be just <packagename> or <packagename>_test, depending on if you want to reach into private members or not.

  • You can also find test packages that contain all <filename>_test.go files.

  • Testing is built into Go with go test ./... to run all tests in current and all subdirectories. (Idiomatic testing in Go is so much easier than idiomatic testing in PHP, resulting in a culture of testing among almost all Go developers.)

  • Package files can have types, vars, consts, and funcs as well as imports and directives, of course.

  • Types in Go can be struct (which is the closest Go has to a class), slice of type e.g. []<type>, array of type e.g. [<n>]<type>, map[<key>]<value>, and a few more that I won’t go into as I think they are out of scope for this explanation.

  • Packages can have one or more init() functions that are all called before the program’s main() func is called. There can also be multiple init() functions even in the same file.

  • vars can be initialized and those initializations are run before the program’s main() func is called.

  • consts are initialized before the program’s main() func is called but can only be initialized by literal scalar types. Unfortunately.

  • imports take the form of import "<package>" for standard library types, where a <package> can contain parent paths.

  • For local types imports take the form of import "<module>/<package>" where a <module> is defined by having a go.mod file in the directory or a parent directory, and a <package> can contain parent paths. A go.mod file has a module directive, a go version, and one or more require statements (I’m ignoring a bit of minutiae here.)

  • Modules allow grouping of packages together and were added in recent years to provide versioning for the collective dependencies of a module. The required versions are recorded in go.mod, their checksums are stored in go.sum, and both files are managed automatically with Go CLI commands.

  • For external third party modules imports take the form of import "<domain>/<package>" where <package> can contain parent paths and almost always does. An example is “github.com/stretchr/testify/assert”.

  • External modules are by definition HTTP(S) GETable, and Go developers use go get <module> on the command line to download the module. Go does not have or need a 3rd party package manager as that can become a single point of failure and is definitely a single point of control. To download testify for use in their Go module a Go dev would run go get github.com/stretchr/testify

  • Most external third party modules for Go are hosted on Github but can be hosted on a custom domain, Bitbucket, GitLab, etc.

  • The Go team manages a standard proxy for go get but organizations can run their own if desired.

  • imports are referenced by name internally where the package name is the last segment of the import after a /, or just the name if no slash. So “github.com/stretchr/testify/assert” is referenced in code as assert. For example, assert.Equal(t,1,value) would assert that value equals 1; if value != 1 it would use the testing variable t to mark the test as failed and generate appropriate output.

  • imports can be aliased, so you could import check "github.com/stretchr/testify/assert" and then call check.Equal(t,1,value) instead of assert.Equal(t,1,value), but needing to alias a package frequently is a code smell for a badly named package.

  • You can use . as an alias and then not need to use the alias, so we could import . "github.com/stretchr/testify/assert" and then just call Equal(t,1,value) instead of assert.Equal(t,1,value), but this is frowned on in the Go community except in very specific use-cases.

  • You can use _ to bring in a package even if you are not referencing it in case it has an init() function that you need to run. If that applied to testify it would look like this: import _ "github.com/stretchr/testify/assert".

  • All of import, var, and const support a multiline form using parentheses like so:

var (
    x = 1
    y = 2
)

  • Module names are idiomatically one word w/o underscores and lowercase.

  • There is no need to import specific symbols from a Go package like there is in JavaScript. I have programmed in both Go and JS, and I have not found a real benefit to having to reference everything explicitly in the import — since you have to mention the package name everywhere you use any package symbol — but I have noticed a benefit to not having nearly as much boilerplate at the top of the file for import when working with Go vs. working with Javascript. And my GoLang IDE just manages imports for me whereas WebStorm just calls out when I haven’t imported function names in Javascript.
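Several of the points in the list above (grouped declarations, case-based visibility, aliased imports, package-level initializers, and multiple init() functions) can be seen together in one small file. This is only an illustrative sketch: the names greeting, Version, shout, events and PackageName are made up for the example, and PackageName reflects the usual convention that a package is named after the last segment of its import path rather than a rule the compiler enforces.

```go
package main

import (
	"fmt"
	"path"
	str "strings" // aliased import: referenced below as str, not strings
)

// Grouped declarations using the parenthesized form.
const (
	greeting = "hello" // lowercase: private to this package
	Version  = "1.0"   // uppercase: visible to any importing package
)

var (
	events  []string          // appended to by the init() functions below
	shouted = shout(greeting) // initializer runs before init() and main()
)

func shout(s string) string { return str.ToUpper(s) }

// A single file may contain several init() functions; they run in source
// order, after package-level variables have been initialized.
func init() { events = append(events, "first init") }
func init() { events = append(events, "second init") }

// PackageName returns the conventional name an import is referenced by:
// the last segment of its import path.
func PackageName(importPath string) string { return path.Base(importPath) }

func main() {
	fmt.Println(shouted, events)
	fmt.Println(PackageName("github.com/stretchr/testify/assert"))
}
```

By the time main() starts, shouted already holds "HELLO" and events already holds both init entries, which is the behavior a PHP module-level init() would presumably need to mirror.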

I am sure there is more I missed, but that should cover the highlights.

The takeaways that I think would be useful for PHP modules are:

  1. Imports

  2. Import aliases

  3. Module-level consts

  4. Module-level init() functions

  5. Module-level vars with initialization

  6. Module-level functions

  7. One directory == one module

  8. No hierarchy for modules

  9. Single word module names in lowercase.

  10. Module syntax being, e.g., mymodule->MySymbol

Takeaways I wish the PHP community would consider but doubt there is any chance:

  1. Having modules be HTTP(S) GETtable with php get <module>
  2. Uppercase being public, lowercase being private, and no need for protected
  3. Test packages with testing built into PHP e.g. php test ./...

I’m not fond of this either.

There will need to be a way to define the entrypoint php. I think index.php is reasonable, and if another entry point is desired it can be called out → “mypackage/myentry.php”

Why is an entry point needed? If there is a module metadata file as I am proposing PHP can get all the information it needs from that file. Maybe that is the .phm file?

Maybe. Again, I need to look over this meta data format. Also, how does it get created?

As I am envisioning, PHP at the command line would have the ability to pre-compile a module — aka all files in the module directory — and then write a module-specific file, maybe .php.module? That could ideally be optimized for loading by PHP and have everything it needs to know to run the code in that module.

That file could be completely self-contained and include all source code, similar to a .phar file, or it could just have a complete symbol table and still require the PHP source code to exist as well. I have not pondered all the pros and cons of these alternatives yet.

Clearly though even if it compiled to a self-contained file the .PHP files would still be needed during development. Thus I envision that the PHP CLI would need a --watch option to watch directories and recompile the .php.module file upon PHP file change. IDEs like PhpStorm could run php --watch for users and non-IDE users could run it themselves.

When PHP comes across an import statement pointing to a module directory it would first look for the compiled .php.module file and, if found, use it; if not found it would recreate it. Maybe it could write to disk, or generate an error if it cannot write to disk. OTOH writing to disk might be a security issue, in which case it could issue a warning that the .php.module file does not exist and then compile the module to memory and continue on.

It would be nice if there was a mode where PHP would check the timestamps of all PHP files in the module directory and, if the compiled .php.module was older than any of the .php files, recompile, but you’d want that off for production. That could be a new function set_dev_mode(boolean) or a CLI option to create a .phpdev.module instead of a .php.module.

Clearly anyone using deployments could have their build generate all the required .php.module files for deployment, and hosting companies that host apps that don’t use deployments like WordPress could have processes that build the .php.module files for their users.

I think I have thought through this enough to identify there are no technical blockers, but I could certainly have missed something so please call it out if anyone can identify something that would keep this from working and/or significantly change the nature of PHP development.

BTW, this pre-compiling would ONLY apply to modules, so people not using modules would not have to be concerned about any of this at all.

-Mike

P.S.

I remember when the choice to use \ was made. I’ve rarely been so angry about a language design choice before or since. I’ve gotten used to it, but seeing \ all over the place in strings is still… yuck.

Ditto.

Not replying to anyone in particular and instead doing a mild reset taking into account the discussion that has gone before.

So, I want to import a package. I’ll create an index.php file at the root of my website and populate it with this.

<?php import "./src/mymodule";

Now I’ll create that directory and run a command php mod init in that directory. Stealing this from Go, it’s fairly straightforward though. Now if we look in the directory we will see two files.

php.mod

php.sum

The second file I’ll not be touching on but exists to track checksums of downloaded packages - Composer does the same with its composer.lock file which in turn was inspired by node’s package-lock.json.

The php.mod file stands in for composer.json, but it isn’t a json file. It would start something like this:

namespace mymodule
php 10.0
registry packagist.org/packages

We start with three directives - the root namespace is presumed to be the directory name. If that isn’t true this is a text file, change it. PHP min version should be straightforward. Registry details where we are going to go get code from. Suppose we want to use our own registry but fallback to packagist. That would be this:

namespace mymodule
php 10.0
registry (
    github.com/myaccount
    packagist.org/packages
)

Multiple registry entries will be checked for the code in order. Handling auth tokens for restricted registries is outside of scope at the moment.
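To make the file format and the "checked in order" rule concrete, here is a rough sketch of a reader for the three directives shown so far, plus an ordered lookup. It is written in Go purely for illustration. ModFile, ParseMod and Resolve are hypothetical names, and the has callback stands in for whatever protocol a real registry would expose, which the proposal does not define.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// ModFile holds the three directives shown in the proposed php.mod file.
type ModFile struct {
	Namespace  string
	PHP        string
	Registries []string // kept in file order
}

// ParseMod reads "key value" lines plus "registry ( ... )" blocks that
// list one registry per line.
func ParseMod(src string) (ModFile, error) {
	var m ModFile
	inBlock := false
	sc := bufio.NewScanner(strings.NewReader(src))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		switch {
		case line == "":
			continue
		case inBlock && line == ")":
			inBlock = false
		case inBlock:
			m.Registries = append(m.Registries, line)
		default:
			key, value, _ := strings.Cut(line, " ")
			value = strings.TrimSpace(value)
			switch {
			case key == "namespace":
				m.Namespace = value
			case key == "php":
				m.PHP = value
			case key == "registry" && value == "(":
				inBlock = true
			case key == "registry":
				m.Registries = append(m.Registries, value)
			default:
				return m, fmt.Errorf("unknown directive %q", key)
			}
		}
	}
	if inBlock {
		return m, fmt.Errorf("unterminated registry block")
	}
	return m, sc.Err()
}

// Resolve returns the first registry, in file order, for which has reports
// true. has is a stand-in for a real registry lookup.
func Resolve(pkg string, registries []string, has func(registry, pkg string) bool) (string, bool) {
	for _, r := range registries {
		if has(r, pkg) {
			return r, true
		}
	}
	return "", false
}

func main() {
	m, err := ParseMod("namespace mymodule\nphp 10.0\nregistry (\ngithub.com/myaccount\npackagist.org/packages\n)\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Pretend only packagist hosts the package.
	onlyPackagist := func(registry, pkg string) bool { return registry == "packagist.org/packages" }
	r, ok := Resolve("twig/twig", m.Registries, onlyPackagist)
	fmt.Println(m.Namespace, m.PHP, r, ok)
}
```

The private registry is consulted first and packagist only as a fallback, so a package published in both places would resolve to the private copy.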

So let’s build the module. We’ll make a file called hello.phm. The reason for phm and not php is so that web SAPIs will not try to parse this code. Further they can be configured to not even allow direct https access to these files at all.

import "twig/twig";

use \Twig\Loader\ArrayLoader;
use \Twig\Environment;

$loader = new ArrayLoader([
    'index' => 'Hello {{ name }}'
]);

$twig = new Environment($loader);

export $twig;

As mentioned in previous discussions, modules have their own variable scope. Back in our index we need to receive the variable

<?php

import $twig from "./src/mymodule";

echo $twig->render('index', ['name' => 'World']);

If we load index.php in the web browser we should see “Hello World”. If we look back in the mymodule folder we’ll see the php.mod file has been updated

namespace mymodule
php 10.0
registry packagist.org/packages
imports (
    twig/twig v3.10.3
    symfony/deprecation-contracts v2.5 //indirect
    symfony/polyfill-mbstring v1.3 //indirect
    symfony/polyfill-php80 v1.22 //indirect
)

Note the automatically entered comment that marks the imported dependencies of twig. Meanwhile the php.sum file will also be updated with the checksums of these packages.

So why this instead of composer? Well, a native implementation should be faster, but also it might be able to deal with php extensions.

import "@php_mysqli";

The @ marks that the extension is either a .so or .dll library, as I’ll hazard a guess that the resolution mechanic will be radically different from the php language modules themselves - if it is possible at all. If it can be done it will make working with packages that require extensions a hell of a lot easier since it will no longer be necessary to monkey the php.ini file to include them. At a minimum the parser needs to know that the import will not be in the registry and instead it should look to the extensions directory, hence the lead @. Speaking of, having the extension directory location be a directive of php.mod makes sense here. Each module can have its own extension directory, but if this is kept within the project instead of globally then web SAPIs definitely need to stay out of those directories.

Final thing to touch on is how the module namespaces behave. The export statement is used to call out what is leaving the module - everything else is private to that module.

class A {} // private
export class B {} // public

All the files of the package effectively have the same starting namespace - whatever was declared in php.mod. So it isn’t necessary to repeat the namespace on each file of the package. If a namespace is given, it will be a sub-namespace

namespace tests;
export function foo() {}

Then in the importing file

import "./src/mymodule";
use \mymodule\tests\foo;

Notice here that if there is no from clause everything in the module grafts onto the symbol table. Subsequent file loads need only use the use statement. Exported variables however must be explicitly pulled because the variable symbol table isn’t affected by namespaces (if I recall correctly, call me an idiot if I’m wrong).

The from clause is useful for permanently aliasing - if something is imported under an alias it will remain under that alias. Continuing the prior example

import tests\foo as boo from "./src/mymodule";
boo();

That’s enough to chew on I think.

On Sat, Jun 29, 2024, at 08:32, Michael Morris wrote:

Not replying to anyone in particular and instead doing a mild reset taking into account the discussion that has gone before.

So, I want to import a package. I’ll create an index.php file at the root of my website and populate it with this.

<?php import "./src/mymodule";

Now I’ll create that directory and run a command php mod init in that directory. Stealing this from Go, it’s fairly straightforward though. Now if we look in the directory we will see two files.

php.mod

php.sum

The second file I’ll not be touching on but exists to track checksums of downloaded packages - Composer does the same with its composer.lock file which in turn was inspired by node’s package-lock.json.

I don’t think that is correct… package-lock.json didn’t come about until what, 2016-7ish? with pressure from yarn which did a yarn.lock file. Pretty sure composer was doing that since the beginning. I remember this being a BIG reason we switched from npm to yarn when it came out, because dev A would have different versions of libraries than dev B. Bug hunting was FUN when it was in a library.

The php.mod file stands in for composer.json, but it isn’t a json file. It would start something like this:

namespace mymodule
php 10.0
registry packagist.org/packages

We start with three directives - the root namespace is presumed to be the directory name. If that isn’t true this is a text file, change it. PHP min version should be straightforward. Registry details where we are going to go get code from. Suppose we want to use our own registry but fallback to packagist. That would be this:

namespace mymodule
php 10.0
registry (
    github.com/myaccount
    packagist.org/packages
)

Multiple registry entries will be checked for the code in order. Handling auth tokens for restricted registries is outside of scope at the moment.

While this looks good on paper, you’re going to have to standardize how packages are accessed (API calls, etc) so they can be used in this file, or literally anyone who wants to add a competing registry will have to create an RFC to allow accessing their own registry, which is a ton of politics for something that is strictly technical – not to mention a bunch of if-this-registry-do-that type statements scattered throughout the code, which makes it harder to maintain.

So let’s build the module. We’ll make a file called hello.phm. The reason for phm and not php is so that web SAPIs will not try to parse this code. Further they can be configured to not even allow direct https access to these files at all.

import "twig/twig";

use \Twig\Loader\ArrayLoader;
use \Twig\Environment;

$loader = new ArrayLoader([
    'index' => 'Hello {{ name }}'
]);

$twig = new Environment($loader);

export $twig;

SAPIs are the programs that parse ALL php code and return it to the server (ie, nginx, apache, caddy, etc) to be displayed. The SAPI absolutely needs to parse these files in order to execute them. Servers are designed to display files, so any server configured today will just output the contents of these files because it won’t be configured to send the request to the SAPI instead. It’s better to suggest moving these files out of the web-root so it’s a non-issue.

In other news, I’m not a fan of how many times I have to write “twig” just to get Twig in the current file. The module already registers a namespace, why can’t the use-statement implicitly import the module?

As mentioned in previous discussions, modules have their own variable scope. Back in our index we need to receive the variable

<?php

import $twig from "./src/mymodule";

echo $twig->render('index', ['name' => 'World']);

If we load index.php in the web browser we should see “Hello World”. If we look back in the mymodule folder we’ll see the php.mod file has been updated

In real life, my code is going to be in a module/framework and I’m going to need to render it there. This example of exporting a dependency also kinda breaks encapsulation principles, and even though it is an example, things like this end up in documentation of a feature and cause all kinds of bad practices (like Symfony and anemic objects).

namespace mymodule
php 10.0
registry packagist.org/packages
imports (
    twig/twig v3.10.3
    symfony/deprecation-contracts v2.5 //indirect
    symfony/polyfill-mbstring v1.3 //indirect
    symfony/polyfill-php80 v1.22 //indirect
)

Note the automatically entered comment that marks the imported dependencies of twig. Meanwhile the php.sum file will also be updated with the checksums of these packages.

One of the first things I do in a composer.json file is remove polyfills through the replace key. It’s unnecessary, annoys me in my IDE with having multiple classes of the same name, and hides the fact that I should probably install an extension for better performance. How do we do that with this new setup?

In fact, it is worth asking: how would this system work with polyfills in general? Polyfills have their uses, especially for library/framework code where you don’t control the runtime environment. For example, how would someone polyfill mbstring, since people will be adding import @mbstring and not import symfony/polyfill-mbstring?

So why this instead of composer? Well, a native implementation should be faster, but also it might be able to deal with php extensions.

import "@php_mysqli";

The @ marks that the extension is either a .so or .dll library, as I’ll hazard a guess that the resolution mechanic will be radically different from the php language modules themselves - if it is possible at all. If it can be done it will make working with packages that require extensions a hell of a lot easier since it will no longer be necessary to monkey the php.ini file to include them. At a minimum the parser needs to know that the import will not be in the registry and instead it should look to the extensions directory, hence the lead @. Speaking of, having the extension directory location be a directive of php.mod makes sense here. Each module can have its own extension directory, but if this is kept within the project instead of globally then web SAPIs definitely need to stay out of those directories.

So … if we want to round, we have to use import @math and then we can call the global round() function? Or if we want to use DateTimeImmutable we have to add import @date? That seems like a step in the wrong direction since most people don’t even know that most (if not all) global library functions come from extensions – and virtually nobody knows the name of each extension and what functions they have. Also, installing extensions is not 100% straightforward as some environments need to use pecl, some need to use OS package managers.

Final thing to touch on is how the module namespaces behave. The export statement is used to call out what is leaving the module - everything else is private to that module.

class A {} // private
export class B {} // public

All the files of the package effectively have the same starting namespace - whatever was declared in php.mod. So it isn’t necessary to repeat the namespace on each file of the package. If a namespace is given, it will be a sub-namespace

namespace tests;
export function foo() {}

Then in the importing file

import "./src/mymodule";
use \mymodule\tests\foo;

Notice here that if there is no from clause everything in the module grafts onto the symbol table. Subsequent file loads need only use the use statement. Exported variables however must be explicitly pulled because the variable symbol table isn’t affected by namespaces (if I recall correctly, call me an idiot if I’m wrong).

The from clause is useful for permanently aliasing - if something is imported under an alias it will remain under that alias. Continuing the prior example

import tests\foo as boo from "./src/mymodule";
boo();

That’s enough to chew on I think.

— Rob

On Jun 28, 2024, at 10:12 AM, Rowan Tommins [IMSoP] <imsop.php@rwec.co.uk> wrote:

Namespaces don’t require autoloading, and autoloading doesn’t require one file per class.

No they do not, but the design of each was heavily intertwined with each other resulting in a less than optimal design, IMO.

So, are you arguing to keep one and eject the other for modules, and if so which are you arguing we eject? Autoloading?

Or are you arguing to keep both for modules, in which case your argument above is moot?

To compile a program with multiple source files, in any language, you need one of two things:

a) A list of files you want to compile. Maybe auto-generated, maybe done with a recursive iteration over a directory, but ultimately the compiler needs a file path to process.

Recursion is only needed if modules are hierarchical in nature.

b) A way for the compiler to tell, based on some symbol it wants to resolve, which file should be compiled.

That presumes the compiler did not simply generate an AST from the list of files.

PHP originally provided only option (a), via the include and require keywords. Autoloading adds option (b), where you provide a function which takes a class name and does whatever you want to find the definition.

And that “whatever you want” takes execution time (and tracing through when you are debugging.) But when you look at many other languages loading is an implementation detail that PHP chose to hoist onto userland developers when PHP could have established the rules to handle it more performantly without userland involvement.

Or is there some aspect of autoloading that could not be handled by PHP itself? Note I am asking only within the proposed scope of modules, which we could constrain to optimize their runtime use.

I think it might be time to re-visit the tooling around option (a), as OpCache makes the cost of eagerly loading a list of files much lower than it was when autoloading was added.

Now you are getting somewhere.

Imagine that each module — which could equal a single directory — could have a pre-compiled op-cache which is essentially what I proposed in other recent emails.

That could be as simple as include_all($directory), or as fancy as include_from_manifest_file($some_crazy_binary_file_format); either could be implemented right now in userland, because it all eventually comes down to calling include or require.

meh.

That sounds like a way to avoid discussing the ways in which smartly designed modules could really improve PHP.

My opinions match Larry’s almost exactly: I want package-level optimisation, and package-private declarations. But I don’t want to rewrite my entire codebase to start using a completely different naming system.

I can’t see how package-privates would be of any value to you unless you rewrite your codebase.

Simple: I have private Composer packages, right now, that group all their classes under a particular namespace prefix. I want to be able to mark some classes in that namespace as “internal”.

Not simple, although I admit I am being pedantic about words used here, but for a reason.

I asked about “package-privates,” you responded with “namespace-privates.”

Adding private to namespaces is orthogonal to the discussion of packages.

To require that packages be constrained to have all the same warts as namespaces and existing PHP code simply so you can have namespace-privates is short-sighted (and IMO a bit selfish.)

Alternately, namespaces could get private scope in parallel to having modules be considered.

That would allow modules to gain improvements that we could not get by having to maintain BC with namespaces.

Which causes me to ask: If you have really wanted namespace private why has it been six years since it was even last mentioned on the list, and four years since last discussed?

https://externals.io/message/101323

Why has there not been an RFC since this one https://wiki.php.net/rfc/namespace-visibility six years ago, that was not even voted on?

Why is it that when the topic of addressing modules/packages comes up — which has been talked about numerous times in the past six years — do you now bring up namespace privates in a manner that would effectively torpedo goals of the modules discussion, at least from the perspective of the OP and myself?

If namespace private were really something important to you, why haven’t you championed it before, rather than hijack a discussion about the benefits we could get from modules if not constrained by namespaces?

I do not want to change every place that uses the existing classes to reference “ModuleName@ClassName” instead of “NamespacePrefix\ClassName”, or change every “use” to “import from”.

Then don’t. Champion this RFC https://wiki.php.net/rfc/namespace-visibility and get what you want.

But please don’t argue against a discussion on modules because you want a feature that can be gotten orthogonally. (If you must argue against it, make arguments for which accommodations for your preferences cannot be found.)

Code doesn’t exist in isolation; if Symfony Mailer is re-published as a “module”, every single application that uses it needs to change their code from referencing namespaced classes, to having “import” statements.

And that is bad, how?

But before you answer, it just means that instead of a use statement in your existing code you change to an import statement.

You’d then of course need to make changes, if applicable, to call the new Symfony Mailer, but you’d have to do that with or without modules.

Or is there something else I am missing?

As for package-level optimisation, you’ll need to give examples of what you mean there as I don’t want to wrongly assume.

Currently, OpCache only optimises per file, because it can’t guarantee how files will be used together.

A simple example is function fallback: if you could declare a package as “completely loaded”, OpCache could replace all references to “strlen” with “\strlen”, knowing that no namespaced function with that name could be added later.
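The function-fallback optimisation can be modelled in a few lines. This is a toy sketch in Go, not a description of how OpCache works internally: Resolve and Rewrite are hypothetical names, and the defined map stands in for the engine's function table. Resolve mimics the runtime rule for an unqualified call (try the namespaced name, then fall back to the global one), and Rewrite applies that rule once, ahead of time, which is only safe when nothing can be added to the table later.

```go
package main

import "fmt"

// Resolve mimics the lookup for an unqualified function call made inside
// namespace ns: the namespaced name is tried first, then the global one.
func Resolve(ns, name string, defined map[string]bool) string {
	qualified := `\` + ns + `\` + name
	if defined[qualified] {
		return qualified
	}
	return `\` + name
}

// Rewrite resolves every call once. This is only valid if the set of
// defined functions can no longer grow, i.e. the package has been
// declared "completely loaded".
func Rewrite(ns string, calls []string, defined map[string]bool) []string {
	out := make([]string, len(calls))
	for i, c := range calls {
		out[i] = Resolve(ns, c, defined)
	}
	return out
}

func main() {
	// The package defines its own helper() but no strlen().
	defined := map[string]bool{`\App\helper`: true}
	fmt.Println(Rewrite("App", []string{"strlen", "helper"}, defined))
}
```

Without the "completely loaded" guarantee, a file loaded later could define \App\strlen and silently invalidate the rewritten call, which is why a per-file optimiser has to leave the lookup to runtime.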

Thank you for elaborating on that.

So, champion an RFC to improve OpCache for namespaces. That need not impose on the discussion about modules.

Further, and this is what is nice about being able to discuss modules not having to be compatible with namespaces, if there are aspects of namespaces that make optimization hard or impossible then we could potentially set up rules of modules that make similar optimizations easy and/or possible.

EVEN further, consider the fact that in PHP all class members are public by default. One thing we could have in modules is to go back to short var and eliminate both private and protected modifiers and only have public with the default behavior being what is private outside of modules. protected would no longer be needed as we would have module scope which is de facto protected.

Classes could be final by default in modules and then we could modify them with an open keyword (thanks to Lynn for that one.)

And so on. In other words, if we could treat modules as their own sandbox, we could fix many of the regrettable former design choices of the PHP language (some of which were made to keep PHP beginner friendly) and potentially re-energize people who once looked at PHP and dismissed it to give it another look.

What I meant was: we can’t just treat namespaces and modules as completely separate things, and assume that every code file will be using one style or the other. We have to imagine the user experience when there is a mix of the two.

Why can we not just treat namespaces and modules as completely separate things?

I can’t imagine it being pleasant to have a mix of “import” and “use” statements at the top of a file, with different and even conflicting semantics.

That feels like a frivolous concern when compared to the benefits we could see with modules, especially when there would be ways to mitigate your stated concerns here.

If you don’t like to see imports, put your imports in a namespace and then “use” that namespace.

Or we could allow “use module” instead of (or in addition to) “import” and then it could look more pleasant for you.

As for conflicting semantics:

1.) I’m not seeing how those could be significant in the using/importing file, and

2.) Isn’t dealing with conflicting semantics just a part of programming?

3.) Don’t “use” and “use function” have conflicting semantics?

God knows that “use” by itself has many confusing semantics, which “import” could avoid.

Perhaps I didn’t word the question well. What I’m really asking, as someone who’s never used JS or Go modules, is why I’d want to write “import”, rather than referencing a global name in a hierarchy.

“use module” would work just as well as “import”; the “import” is not special, the module scoping and features are what is valuable here.

For specifics see my other recent emails on the subject. If they do not explain, please ask again with specifics.

That’s really all I mean by “making it compatible with namespace”: I want “new \Foo\Bar\Baz;” to be able to refer to a “packaged” class, probably by having a way to mark all classes under “\Foo\Bar” as belonging to a particular package.

And that is what I am trying to get away from.

First the backslash — because when using in reflection or other dynamic programming they have to be escaped which can lead to escaping errors. I know you don’t care, but I and others do.

Second, the hierarchy. Because there is no constraint on hierarchy, PHP subtly encourages developers, like the sirens of the Odyssey, to create large hierarchies. I even find myself doing it as I fight against it.

The reasons hierarchy is bad are:

1.) larger hierarchies grow conceptual complexity,
2.) they place no limit on package growth as you can always create subdirectories,
3.) they make it harder to “see” all the code files in one place (a single directory),
4.) they constrain where code is located when there are benefits to a different layout

That’s really all I mean by “making it compatible with namespace”: I want “new \Foo\Bar\Baz;” to be able to refer to a “packaged” class, probably by having a way to mark all classes under “\Foo\Bar” as belonging to a particular package.

Revisiting this, why is it important to you that “new \Foo\Bar\Baz” refer to a “packaged” class vs a namespaced class, assuming you had namespace-private and OpCache improvements?

Why can’t you still just use the namespaces you prefer and let “packages” (modules) improve in other ways?

I am trying my best not to make this ad-hominem so forgive me but I do have to ask if this is just not a case of “I am comfortable doing it the way I have been doing it and do not want to consider changing,” maybe? Note I am asking that question limited to the one statement I quoted above, not on the broader discussion.

  1. Modules and packages are typically small-scoped to a single directory and it is a code smell to have many different packages tightly coupled, as is the case with namespaces. Forcing modules to munge with namespaces would mean most modules would be written with those code smells for years because that will be how everyone doing a quick shift from namespace to module will write their modules.

Again, this is entirely about code style, and not something the language can control.

A language cannot control it, but a language can encourage or discourage it.

And the PHP language encourages a large amount of file and directory bloat.

One need only compare the number of files in most PHP libraries to the number of files in a JS or Go package to see that the nature of a language clearly does influence it.

To bring stats rather than opinion, I asked ChatGPT what the two packages most equivalent to Symfony are for JS and Go respectively, and it suggested ExpressJS and Gin. So I cloned them to see the number of files and directories each has. From the root of each repo:

Project       Files    Dirs
Symfony      12,504   2,162
ExpressJS       259      87
Gin (Go)        145      30

The comparison might not be completely fair given how much longer Symfony has been around, but they all target the same use-case, even if there is less functionality in ExpressJS or Gin.

Given that, I think well over an order of magnitude more files is a really odiferous code smell, and it is thanks to the language, which admittedly cannot “control” layout but definitely influences it.

Am I wrong? Present any other relatively equivalent project comparisons you please. Here are the bash commands to count files and dirs:

find /path/to/subdirectory -type f | wc -l
find /path/to/subdirectory -type d | wc -l

Also, the JS insistence on having a separate package for every tiny function is a common source of criticism, so personally I am very happy that PHP packages are generally larger than that.

I can’t speak for the OP, but nothing I am proposing is advocating for separate packages for every tiny function. Nothing.

Instead I am advocating for packages that are mostly in a few directories instead of almost two orders of magnitude more!

That said, maybe the best solution is to NOT put the stake in the ground right now and say “They must be namespace compatible” or “They must not be namespace compatible” but move forward with an open mind so that we can tease out exactly how namespaces would constrain modules and then make the decision later for what would be in the best interest of moving PHP into the future.

If and when an actual problem arises, let’s discuss it.

Not “problems” but instead “opportunities.”

I have already pointed out numerous opportunities in this email and one of my recent emails.

What specifically stops us doing all the things you’ve been discussing around loading, and visibility, etc, in a way that’s compatible with the 400_000 packages available on Packagist, and billions of lines of existing code?

You speak as if I am proposing getting rid of namespaces and making those 400_000 packages available on Packagist, and billions of lines of existing code not work in PHP. Of course not.

No, I’m saying that every one of those packages could benefit if we make incremental changes.

Maybe.

What benefits can you envision you would get if PHP made namespaces==modules compared with the benefits I have mentioned for making modules not be constrained to compatibility with namespaces (besides private and OpCache as we already discussed you pursue for namespaces?)

Can we get precompiling for modules in a directory and written to a .php.module file? We can’t do that with namespaces because scanning recursively could take too long at runtime.

Can we get default private for all symbols and class members in namespaces? No, that would be a huge BC break.

Can we get namespaces to be first-class AST participants? If yes, why have we not done it before?

I could go on, but this email is getting loooong.

I don’t want to couple it so that you can’t have “package private” without also switching to some new “advanced” dialect of the language, and I don’t see any reason why we need to do so.

And I am not advocating that. I am advocating you should get “namespace private.” Hey, the RFC is already written! https://wiki.php.net/rfc/namespace-visibility

And most of the other benefits of modules as I am proposing would be BC breaks so you could not get them in namespaces anyway.

Unless you can come up with something besides private and OpCache that I had not considered.

Maybe package scoped declares could allow opting in to certain checks, but I don’t think “is in a package” and “has been audited for a load of extra breaking changes” should be set by the same flag.

I am not aware of any discussion of opting in, flags, nor auditing with respect to modules.

-Mike

On Jun 29, 2024, at 2:32 AM, Michael Morris <tendoaki@gmail.com> wrote:

Not replying to anyone in particular and instead doing a mild reset taking into account the discussion that has gone before.

So, I want to import a package. I’ll create an index.php file at the root of my website and populate it with this.

<?php import "./src/mymodule";

Now I’ll create that directory and run a command php mod init in that directory. Stealing this from Go, it’s fairly straightforward though. Now if we look in the directory we will see two files.

php.mod

php.sum

The second file I’ll not be touching on but exists to track checksums of downloaded packages - Composer does the same with its composer.lock file which in turn was inspired by npm’s package-lock.json.

The php.mod file stands in for composer.json, but it isn’t a json file. It would start something like this:

namespace mymodule

php 10.0

registry packagist.org/packages

We start with three directives - the root namespace is presumed to be the directory name. If that isn’t true this is a text file, change it. PHP min version should be straightforward. Registry details where we are going to go get code from. Suppose we want to use our own registry but fallback to packagist. That would be this:

namespace mymodule

php 10.0

registry (
    github.com/myaccount
    packagist.org/packages
)

Multiple registry entries will be checked for the code in order. Handling auth tokens for restricted registries is outside of scope at the moment.

That is very Go-like, as you stated.

However, be aware that in a Go project repo you are likely to have only one go.mod (or multiple if you have numerous CLI apps being generated), whereas every directory with Go code is a package (which I think is equivalent to what you are calling “module”).

So I think your use of them here is conflating the two concepts. One is a project-wide concept and the other is a “package” concept.

Maybe you would be better to adopt module to mean project and package to mean packaged code as Go has them?

From here on I will refer to directory rather than module or package to avoid confusion. By directory I will mean what Go calls a “package” and what I think your original proposal called a “module.”

A big difference between Go and PHP is that Go has a compiler that compiles into an executable before it runs. That is clearly not compatible with PHP, which is why I was proposing that each directory could have a .php.module file that is either pre-compiled or compiled on the fly at first import.

Also, it is problematic to have php.mod and php.sum because web servers would serve them if not carefully configured hence why I went with a leading dot, e.g. .php.module

So let’s build the module. We’ll make a file called hello.phm. The reason for phm and not php is so that web SAPIs will not try to parse this code. Further they can be configured to not even allow direct https access to these files at all.

import "twig/twig";

use \Twig\Loader\ArrayLoader;
use \Twig\Environment;

$loader = new ArrayLoader([
    'index' => 'Hello {{ name }}'
]);

$twig = new Environment($loader);

export $twig;

As mentioned in previous discussions, modules have their own variable scope. Back in our index we need to receive the variable

<?php

import $twig from "./src/mymodule";

echo $twig->render('index', ['name' => 'World']);

Aside from being familiar from JavaScript, what is the argument for requiring the import of specific symbols vs just a package import, e.g.:

<?php

import "./src/mymodule";

echo mymodule->twig->render('index', ['name' => 'World']);

To me it seems to just add to the boilerplate required. Note that having mymodule everywhere you reference twig makes code a lot more self-documenting, especially on line 999 of a PHP file. 🙂

If we load index.php in the web browser we should see “Hello World”. If we look back in the mymodule folder we’ll see the php.mod file has been updated

namespace mymodule

php 10.0

registry packagist.org/packages

imports (
    twig/twig v3.10.3
    symfony/deprecation-contracts v2.5 //indirect
    symfony/polyfill-mbstring v1.3 //indirect
    symfony/polyfill-php80 v1.22 //indirect
)

Having a php.sum file is interesting but again, it should start with a period if so.

That said, I wonder if incorporating versioning does not make the scope of modules too big to complete?

Note the automatically entered comment that marks the imported dependencies of twig. Meanwhile the php.sum file will also be updated with the checksums of these packages.

So why this instead of composer? Well, a native implementation should be faster, but also it might be able to deal with php extensions.

import "@php_mysqli";

I would like this, but I think hosting vendors would block it since extensions can have C bugs and create vulnerabilities for servers.

I have long thought PHP should kick off a new type of extension using WASM, which can be sandboxed.

But I digress.

The @ marks that the extension is either a .so or .dll library, as I’ll hazard a guess that the resolution mechanic will be radically different from the php language modules themselves - if it is possible at all. If it can be done it will make working with packages that require extensions a hell of a lot easier since it will no longer be necessary to monkey the php.ini file to include them. At a minimum the parser needs to know that the import will not be in the registry and instead it should look to the extensions directory, hence the lead @. Speaking of, having the extension directory location be a directive of php.mod makes sense here. Each module can have its own extension directory, but if this is kept within the project instead of globally then web SAPIs definitely need to stay out of those directories.

Final thing to touch on is how the module namespaces behave. The export statement is used to call out what is leaving the module - everything else is private to that module.

class A {} // private

export class B {} // public

All the files of the package effectively have the same starting namespace - whatever was declared in php.mod. So it isn’t necessary to repeat the namespace on each file of the package. If a namespace is given, it will be a sub-namespace

namespace tests;

export function foo() {}

Then in the importing file

import "./src/mymodule";

use \mymodule\tests\foo;

Notice here that if there is no from clause everything in the module grafts onto the symbol table. Subsequent file loads need only use the use statement. Exported variables however must be explicitly pulled because the variable symbol table isn’t affected by namespaces (if I recall correctly, call me an idiot if I’m wrong).
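
On the parenthetical about variables: in current PHP that recollection appears to be right, since namespaces apply to classes, interfaces, functions and constants but not to variables. A small sketch using today's syntax, with hypothetical namespaces `one` and `two` in a single file:

```php
<?php
namespace one;

// Top-level variable: lives in the global variable scope, not in "one".
$shared = 'set in namespace one';

function where(): string
{
    return __NAMESPACE__;
}

namespace two;

function where(): string
{
    return __NAMESPACE__;
}

echo $shared, "\n";       // set in namespace one
echo \one\where(), "\n";  // one
echo where(), "\n";       // two
```

The functions are kept apart by their namespaces, while `$shared` is reachable from both, which is why an exported variable would need its own import mechanism under the proposal.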

The from clause is useful for permanently aliasing - if something is imported under an alias it will remain under that alias. Continuing the prior example

import tests\foo as boo from "./src/mymodule";

boo();

That’s enough to chew on I think.

I don’t think it is wise to intertwine this concept of modules with namespaces like that, but I am replied out for the night. 🙂

-Mike

On Jun 28, 2024, at 7:45 AM, Rob Landers <rob@bottled.codes> wrote:

Fast forward a bit - PHP 5.3 introduced namespaces to deal with the overloaded symbol tables. They are a bit of a hotwire as (if I’m not mistaken, it’s been a couple years since I read the discussion on it) they just quietly prepend the namespace string in front of the name of all new symbols declared in the namespace for use elsewhere. As a result, PHP namespaces don’t do some of the things we see in the namespaces of other languages (looking at Java and C# here). For example, privacy modifiers within a namespace aren’t a thing.
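
The "prepend the namespace string" behaviour described above can be observed from userland. A minimal sketch, assuming a made-up `Acme\Util` namespace with one class and one function:

```php
<?php
namespace Acme\Util;

// Hypothetical symbols, declared only to inspect their stored names.
class Helper {}

function greet(): string
{
    return 'hi';
}

// The class is registered under its fully qualified name.
var_dump(Helper::class === 'Acme\Util\Helper');   // bool(true)

// So is the function.
var_dump(function_exists('Acme\Util\greet'));     // bool(true)

// The short name alone does not exist as a global class
// (second argument false: do not invoke autoloaders).
var_dump(class_exists('Helper', false));          // bool(false)
```

Nothing in that name carries visibility information, which fits the point that privacy modifiers at the namespace level are not available today.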

This would be nice to have … maybe. But namespaces have been around, what, 10-15 years? I think if someone wanted to “fix” it, it would have been fixed by now.

Or not.

Never underestimate the power of inertia for maintaining a less than ideal status-quo, especially when the decision to change has to be approved by a 2/3 vote of a committee. 🤷

-Mike

On Jun 29, 2024 at 6:20 AM, <Rowan Tommins [IMSoP]> wrote:

On 29 June 2024 08:06:57 BST, Mike Schinkel <mike@newclarity.net> wrote:
>The takeaways that I think would be useful for PHP modules are:
>
>1. Imports
>2. Import aliases  
>3. Module-level consts
>4. Module-level init() functions
>5. Module-level vars with initialization
>6. Module-level functions
>7. One directory == one module
>8. No hierarchy for modules
>9. Single word module names in lowercase.
>10. Module syntax being <module><operator><symbol>, e.g. mymodule->MySymbol

This all sounds like an interesting set of ideas for building a new language.

Maybe, if a new language only had a tiny set of the features needed to actually have a useful language.

That list is just package-specific, nothing about syntax, data types, control structures, package management, etc. etc.

 Most of it sounds completely impractical to apply in retrospect to an existing one with millions of users - apart from the bits we actually already have, like points 3 and 6.

You say it is impractical, you claim millions of users, but you don’t address why the specific features are impractical.

They are no more impractical than any other new language features PHP has added in recent years (and I am not being critical of what has been added, to be clear.)

Rather than looking at languages which have done things completely differently,

“Completely” here is a leading word used in that context.

There is nothing “completely” different about JavaScript, or Go for that matter. All three of JS, Go, and PHP are descendants of C.

We are not talking about APL, Whitespace, Befunge, or Intercal, after all.

I think it would be more useful to look for inspiration for ones which are *similar* to PHP's approach, but have extra features.

so there might be good and bad experiences we can learn from there, as well. And I'm sure there are others that are much less alien than JS or Go.

I would argue JS and maybe Go are a lot more similar to PHP than Java or C#. But then the alienness is in the eye of the beholder.

You claimed you don’t know JS or Go, but I don’t know Java or C#, at least not enough to be proficient in them.

That said, I really don’t think gatekeeping based on the genetics of a language is the path to improving it. Instead I think objectively evaluating the specifics of the proposed features is the better path. And to me each of those things I mentioned stands on its own and can be justified, as needed.

-Mike

On 29 June 2024 07:32:58 BST, Michael Morris <tendoaki@gmail.com> wrote:

So why this instead of composer? Well, a native implementation should be faster, but also it might be able to deal with php extensions.

Building a package manager is hard, and getting a package manager adopted requires the network effect of a community using it. Jordi, Nils, et al have done a fantastic job with Composer, and it has a near 100% buy-in from the community, with hundreds of thousands of published packages.

It already supports *requiring* extensions; being able to *install* extensions is a much harder job, but that's being worked on now.

I would need an extremely persuasive argument to pay any attention at all to an incompatible alternative.

Rowan Tommins
[IMSoP]

On 29 June 2024 08:06:57 BST, Mike Schinkel <mike@newclarity.net> wrote:

The takeaways that I think would be useful for PHP modules are:

1. Imports
2. Import aliases
3. Module-level consts
4. Module-level init() functions
5. Module-level vars with initialization
6. Module-level functions
7. One directory == one module
8. No hierarchy for modules
9. Single word module names in lowercase.
10. Module syntax being <module><operator><symbol>, e.g. mymodule->MySymbol

This all sounds like an interesting set of ideas for building a new language. Most of it sounds completely impractical to apply in retrospect to an existing one with millions of users - apart from the bits we actually already have, like points 3 and 6.

Rather than looking at languages which have done things completely differently, I think it would be more useful to look for inspiration for ones which are *similar* to PHP's approach, but have extra features.

For instance, .net has both "assemblies" (multiple files compiled as one redistributable unit) and namespaces (which are hierarchical, like PHP's). "Package private" modifiers work at the assembly level, not the namespace one.

.Net assemblies don't have to be limited to one namespace root, but in practice generally are. For PHP, I think there would be some benefits to making that a fixed rule, with some tricks to "re-open" a namespace, or explicitly add "friends" to it.

I don't know much about modern Java, but it too has hierarchical namespaces, so there might be good and bad experiences we can learn from there, as well. And I'm sure there are others that are much less alien than JS or Go.

Rowan Tommins
[IMSoP]

On Jun 29, 2024, at 7:14 AM, Rob Landers <rob@bottled.codes> wrote:

You say it is impractical, you claim millions of users, but you don’t address why the specific features are impractical.

They are no more impractical than any other new language features PHP has added in recent years (and I am not being critical of what has been added, to be clear.)

So far, nobody has shown how it is practical – that is on the person proposing the RFC. Ideally, this would be it, you show why it is useful, how to use it, etc. But it is also political. You need to show why people would use it, why people would rewrite their entire application to use it (if the RFC calls for it), and so far, nobody has shown that other than “there are packages!”

The problem with your assertion is that “impractical” is not a criticism that can be objectively determined to be true or false. It is just a pejorative used to stifle discussion, which is why I responded to it as I did.

Yes, I agree that it is on proposers to show people why to use it, but it is unfair to proposers to give criticism that can only be classified as opinion.

You need to show why people would use it, why people would rewrite their entire application to use it (if the RFC calls for it), and so far, nobody has shown that other than “there are packages!”

It seems you have not read any of the several other emails I have written to this list in the past several days that do far more than say “there are packages!”

Please read them in full before making such further equivalently dismissive claims.

I cringed at this. There is no direct lineage though they borrow some syntax from C, and if you want to push it, you might as well say they’re descendants of B which borrowed syntax from BCPL which borrowed syntax from CPL which borrowed its syntax from ALGOL… eh, no, these languages are not related to each other. Inspired, maybe.

Aside from your cringing, how does your pedantry here move the discussion forward in a positive manner?

No, PHP and Go are nothing like each other. With a bit of finagling, you can actually port JavaScript line-for-line to PHP, but not the other way around. If anything, JavaScript is more like PHP than PHP is like JavaScript.

Again, you are making a statement that cannot be objectively proven true or false, and frankly I cannot see any way in which your argument here matters to discussion of modules.

I don’t see any gate-keeping here,

Those who are inside the gates never do.

I called out gatekeeping because he argued the genetic fallacy[1] for dismissing the proposed ideas rather than using objective criticism of the features proposed.

just people challenging assumptions and pushing for the feature to be better than it is currently being proposed.

Yet the challenges are premised on opinions and fallacies instead of objectively challenging the proposed features.

I am happy to defend the proposal against arguments that can be objectively evaluated, but having my arguments challenged “because they come from a language I don’t know” means that my assumptions are not actually being challenged. The criticisms are instead based on the challenger’s pre-existing lack of comfort with the assumptions, while being made to appear objective to readers.

And that IMO is no way to improve a language.

-Mike

[1] https://en.wikipedia.org/wiki/Genetic_fallacy

On Sat, Jun 29, 2024, at 12:56, Mike Schinkel wrote:

Most of it sounds completely impractical to apply in retrospect to an existing one with millions of users - apart from the bits we actually already have, like points 3 and 6.

You say it is impractical, you claim millions of users, but you don’t address why the specific features are impractical.

They are no more impractical than any other new language features PHP has added in recent years (and I am not being critical of what has been added, to be clear.)

So far, nobody has shown how it is practical – that is on the person proposing the RFC. Ideally, this would be it, you show why it is useful, how to use it, etc. But it is also political. You need to show why people would use it, why people would rewrite their entire application to use it (if the RFC calls for it), and so far, nobody has shown that other than “there are packages!”

Rather than looking at languages which have done things completely differently,

“Completely” here is a leading word used in that context.

There is nothing “completely” different about JavaScript, or Go for that matter. All three of JS, Go, and PHP are descendants of C.

I cringed at this. There is no direct lineage though they borrow some syntax from C, and if you want to push it, you might as well say they’re descendants of B which borrowed syntax from BCPL which borrowed syntax from CPL which borrowed its syntax from ALGOL… eh, no, these languages are not related to each other. Inspired, maybe.

We are not talking about APL, Whitespace, Befunge, or Intercal, after all.

I think it would be more useful to look for inspiration for ones which are *similar* to PHP's approach, but have extra features.

so there might be good and bad experiences we can learn from there, as well. And I'm sure there are others that are much less alien than JS or Go.

I would argue JS and maybe Go are a lot more similar to PHP than Java or C#. But then the alienness is in the eye of the beholder.

No, PHP and Go are nothing like each other. With a bit of finagling, you can actually port JavaScript line-for-line to PHP, but not the other way around. If anything, JavaScript is more like PHP than PHP is like JavaScript.

You claimed you don’t know JS or Go, but I don’t know Java or C#, at least not enough to be proficient in them.

That said, I really don’t think gatekeeping based on the genetics of a language is the path to improving it. Instead I think objectively evaluating the specifics of the proposed features is the better path. And to me each of those things I mentioned stands on its own and can be justified, as needed.

I don’t see any gate-keeping here, just people challenging assumptions and pushing for the feature to be better than it is currently being proposed.

— Rob

On Sat, Jun 29, 2024, at 11:41, Mike Schinkel wrote:

On Jun 28, 2024, at 7:45 AM, Rob Landers <rob@bottled.codes> wrote:

Fast forward a bit - PHP 5.3 introduced namespaces to deal with the overloaded symbol tables. They are a bit of a hotwire as (if I’m not mistaken, it’s been a couple years since I read the discussion on it) they just quietly prepend the namespace string in front of the name of all new symbols declared in the namespace for use elsewhere. As a result, PHP namespaces don’t do some of the things we see in the namespaces of other languages (looking at Java and C# here). For example, privacy modifiers within a namespace aren’t a thing.

This would be nice to have … maybe. But namespaces have been around, what, 10-15 years? I think if someone wanted to “fix” it, it would have been fixed by now.

Or not.

Never underestimate the power of inertia for maintaining a less than ideal status-quo, especially when the decision to change has to be approved by a 2/3 vote of a committee. 🤷

-Mike

I disagree that it is inertia, as there is quite a bit of flexibility and robustness in the current way things are. If you’ve ever had to get access to “internal” methods/fields in other languages, there are quite a few hoops.

  • Java: as of 11.0 or maybe 8.0 – it’s been awhile since I’ve had to do this, but you have to create a class in the target “namespace” and expose whatever you need.

  • C#: you have to use reflection to gain access to it.

Right now, in PHP, you just “use it” and deal with the consequences of doing so. It’s great when you need to work around a bug in a library and very little friction. As a library author, I want some friction, but I also don’t want to force people to come to me and open a PR just for their use-case (hence why I almost never make classes final as well).

— Rob

On Jun 29, 2024, at 8:27 AM, Rowan Tommins [IMSoP] <imsop.php@rwec.co.uk> wrote:
On 29 June 2024 11:56:43 BST, Mike Schinkel <mike@newclarity.net> wrote:

That list is just package-specific, nothing about syntax, data types, control structures, package management, etc. etc.

It includes fundamental design decisions like "what does a class name look like", and "how are classes identified across boundaries". If names aren't universal, what does ::class return? How does resolution work in a DI container? Etc etc etc.

I'm sure Go has answers to all those questions, but so does PHP, and I've not seen any convincing argument why we should throw it all away and start again.

That comment sounds like you think that I am saying to do what Go does for PHP. That is not what I was saying.

Instead, I am saying "let us look at these aspects of Go for inspiration for features that would be beneficial for PHP."

Anyway, I have started a repo to put thoughts down, so continuing this discussion is probably premature before I have something more to show/discuss.

Rather than looking at languages which have done things completely differently,

There is nothing "completely" different about JavaScript, or Go for that matter. All three of JS, Go, and PHP are descendants of C.

You have misread what I wrote. I didn't say *the languages* are different, I said *the decisions they have made around namespaces and packages* are different.

There is no "genetic fallacy" or "gatekeeping" involved, I'm saying it will be easier to apply a design that shares some characteristics with what we have, than to rewrite the language to fit a design which shares none.

Fair point.

But let us not dismiss ideas that come from a language that you admitted you are not that familiar with, just because they come from that other language, before fully understanding what is being proposed.

The descriptions of the *design of packages* in JS and Go make me think they don't have enough in common with PHP to be easy to apply, so I'm suggesting we look at other designs.

And I am suggesting that maybe those designs will benefit PHP more than thinking inside the box.

That said, I will applaud you bringing specific concepts to the table from any other languages.

-Mike

P.S. What I am working on at the moment — after one tweak of that list of ten things to get inspired about from Go — is a lot more like PHP than you are probably currently envisioning and can possibly be implemented with much less of a production than anyone is likely assuming.

On 29 June 2024 11:56:43 BST, Mike Schinkel <mike@newclarity.net> wrote:

That list is just package-specific, nothing about syntax, data types, control structures, package management, etc. etc.

It includes fundamental design decisions like "what does a class name look like", and "how are classes identified across boundaries". If names aren't universal, what does ::class return? How does resolution work in a DI container? Etc etc etc.

I'm sure Go has answers to all those questions, but so does PHP, and I've not seen any convincing argument why we should throw it all away and start again.
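
To make the `::class` and DI container questions concrete, here is a minimal sketch of how resolution commonly works today. The `App` namespace, `Logger`, `MemoryLogger` and this toy `Container` are all hypothetical and not taken from any real library; the point is only that the container's keys are the universal, fully qualified class-name strings.

```php
<?php
namespace App;

interface Logger
{
    public function log(string $message): string;
}

final class MemoryLogger implements Logger
{
    public function log(string $message): string
    {
        return "[log] $message";
    }
}

// Toy container: ids are plain strings, usually produced by ::class.
final class Container
{
    /** @var array<string, callable> */
    private array $factories = [];

    public function bind(string $id, callable $factory): void
    {
        $this->factories[$id] = $factory;
    }

    public function get(string $id): object
    {
        if (isset($this->factories[$id])) {
            return ($this->factories[$id])();
        }
        // No binding: treat the id as a class name and instantiate it.
        return new $id();
    }
}

$container = new Container();
$container->bind(Logger::class, fn (): Logger => new MemoryLogger());

$logger = $container->get(Logger::class);

echo Logger::class, "\n";             // App\Logger
echo $logger->log('resolved'), "\n";  // [log] resolved
```

If module-scoped names were not globally unique, a design would have to say what string `Logger::class` yields and how `new $id()` finds the class again from a different module.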

Rather than looking at languages which have done things completely differently,

There is nothing "completely" different about JavaScript, or Go for that matter. All three of JS, Go, and PHP are descendants of C.

You have misread what I wrote. I didn't say *the languages* are different, I said *the decisions they have made around namespaces and packages* are different.

There is no "genetic fallacy" or "gatekeeping" involved, I'm saying it will be easier to apply a design that shares some characteristics with what we have, than to rewrite the language to fit a design which shares none.

The descriptions of the *design of packages* in JS and Go make me think they don't have enough in common with PHP to be easy to apply, so I'm suggesting we look at other designs.

Rowan Tommins
[IMSoP]