Re: [PHP-DEV] AW: [Pre-RFC] Pure-code source files via .phpc extension

On 5/27/26 07:12, hmennen90@gmail.com wrote:

Hi internals,

I intend to submit an RFC introducing a new file extension for pure-code PHP
source files (no leading <?php required) and would like to gather feedback
before drafting.

Proposal in brief:

Files ending in .phpc would be parsed starting in ST_IN_SCRIPTING state. No
<?php or ?> tags permitted inside such files. Existing .php files and their
semantics are completely unchanged. This is purely additive and BC-clean.

Motivation:

PHP's mixed-mode default reflects its 1995 templating origins. Since PHP 7, the
language has evolved into a credible general-purpose tool: strict types, enums,
readonly classes, property hooks, JIT compilation. I personally maintain
PHPolygon, a CPU-bound 3D engine written in PHP – a use case where the
templating heritage is pure ceremony. Other modern uses (CLI tooling, queue
workers, code generators) share this pattern. A dedicated pure-code file format
would be a small but meaningful acknowledgment that PHP-as-language is now a
first-class use case alongside PHP-as-template.

Prior art and what's different:

I have read both rfc/source_files_without_opening_tag (Boutell, 2012, abandoned
by author) and rfc/nophptags (Ohgaki, 2014, inactive). My proposal deliberately
avoids what I believe were the two design choices that killed them:

- No new include syntax (Boutell's AS keyword). Extension-based detection only.
- No php.ini-based mode switch (Ohgaki's template_mode). No global config side
effects.
- No security framing. The mode-switch overhead is parse-time only and OPcache/
JIT eliminate it in practice; this proposal is about conceptual clarity and
tooling, not performance or LFI mitigation.

Implementation:

I will write and maintain the implementation patch. Initial scope: extension
registration in zend_compile_file, lexer state initialization, OPcache
awareness, CLI support, and rejection of <?php/?> tokens inside .phpc files. I
will also coordinate with Composer maintainers ahead of RFC submission to
confirm autoload support.

Open questions for the list:

1. Is the .phpc extension acceptable as the disambiguator, or is there appetite
for something else (e.g. shebang line, declare directive – both of which I think
are worse, but I'd hear the case)?
2. Should #! shebang lines and UTF-8 BOM be permitted before the implicit
scripting state begins? My intent is yes for both.
3. Should __halt_compiler() retain its current behavior in .phpc files? My
intent is yes.

I welcome substantive critique. If the concept itself is unwanted, I would
rather know now than discover it during a vote.

Thanks.

Hendrik Mennen
Maintainer, PHPolygon

The problem with assigning meaning to a file extension is that the interpreter (currently) doesn't care what the file extension is. As long as it's text, it'll process any file and execute what comes after any <?php tags.

What you want is something like this:

     php -r "$(cat foo.phpc)"

Of course, this only works for a single file, and the engine won't execute code in any included files unless they contain PHP open tags, so the file needs a way to tell the engine to parse and execute it as PHP source.
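
To make the point above concrete, here is a minimal sketch that runs on current PHP. The temporary file and its contents are made up for illustration. It loads the same tag-less file twice: once with include, which starts in inline-HTML mode, and once through eval(), which starts in scripting mode (the same mode `php -r` uses).

```php
<?php
// Create a throwaway file that contains PHP code but no opening tag.
$path = tempnam(sys_get_temp_dir(), 'pure');
file_put_contents($path, 'echo 6 * 7;');

// include starts in inline-HTML mode: without "<?php" the content is
// treated as template text and copied to the output unchanged.
ob_start();
include $path;
$included = ob_get_clean();

// eval() starts in scripting mode, so the same bytes run as code.
ob_start();
eval(file_get_contents($path));
$evaluated = ob_get_clean();

unlink($path);

var_dump($included);  // the literal source text of the file
var_dump($evaluated); // the output produced by running that source
```

A userland loader built on eval() would only be an emulation: inside eval'd code __FILE__ and __DIR__ no longer point at the original file, and OPcache does not cache it, which is part of why an engine-level signal is being discussed.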

At the risk of bike-shedding, I think it could be easy for others to confuse .phpc files as byte-code files, since it's common to see Python byte-code files with the .pyc extension. Also, a lot of folks in the community use "phpc" as a shorthand and tag to mean "PHP community," though I'm not sure this is reason enough not to use the .phpc extension. I don't have any better alternative recommendations at this time, though.

Cheers,
Ben

On 5/27/26 09:27, Hendrik Mennen wrote:

Hi Ben,

Thanks for the quick response.

> The problem with assigning meaning to a file extension is that the
> interpreter (currently) doesn't care what the file extension is. As long
> as it's text, it'll process any file and execute what comes after any
> <?php tags.

Right, that is the current behavior, and changing it is exactly what the proposal is about. The engine would learn to check the file extension at the entry point (zend_compile_file for SAPI/CLI, and the include/require family for nested loads) and use that to set the initial lexer state. The .php behavior remains untouched.

That said, you point indirectly at something I do need to address: there are entry paths where the engine does not have a filename, or has one it should not trust. Off the top of my head:

- stdin (cat foo.phpc | php). No filename available. Options: require an explicit CLI flag (php --pure), or simply not support this path and document it.
- eval(). Operates on strings, not files, and already starts in scripting mode without <?php. Extension is irrelevant here; eval() stays as it is today.
- Phar archives. Internal entries have filenames, so dispatch by extension should work, but I would need to verify.
- include of a URL stream (rare, often disabled). Same question. Probably handled by extension on the URL path.

I will work these out explicitly in the RFC draft. Thanks for surfacing it.
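
As a rough illustration of the entry paths listed above, the following sketch shows how an extension could be read from a plain path, a Phar entry, and a URL, and where no extension is available at all. The function name `dispatchExtension` and the sample paths are hypothetical; nothing like this exists in the engine, and eval() is left out because it never has a filename.

```php
<?php
// Hypothetical helper for illustration; the engine has no such function.
// Returns the extension a dispatch rule would look at, or null when the
// entry path carries no usable extension (for example stdin).
function dispatchExtension(?string $source): ?string
{
    if ($source === null) {
        // stdin: there is no filename to inspect.
        return null;
    }

    // For stream URLs (phar://, https://) use the path component only,
    // so a query string can neither hide nor fake the extension.
    $path = parse_url($source, PHP_URL_PATH);
    if (!is_string($path)) {
        $path = $source;
    }

    $ext = pathinfo($path, PATHINFO_EXTENSION);
    return $ext === '' ? null : $ext;
}

var_dump(dispatchExtension('lib/render.phpc'));
var_dump(dispatchExtension('phar://app.phar/src/Engine.phpc'));
var_dump(dispatchExtension('https://example.com/lib.php?x=.phpc'));
var_dump(dispatchExtension(null));
```

Whether the comparison should be case-sensitive, and whether a remote URL's extension should be trusted at all, are open design questions that this sketch does not settle.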

> At the risk of bike-shedding, I think it could be easy for others to
> confuse .phpc files as byte-code files, since it's common to see Python
> byte-code files with the .pyc extension.

Fair point, and one I had not weighed heavily enough. The Python .pyc parallel is real and would cause exactly the kind of one-time confusion that adds friction to adoption. Boutell's 2012 RFC used .phpp (Pure PHP) for the same purpose, which avoids the bytecode association. I am open to .phpp or other suggestions; the disambiguator matters less than the mechanism.

> Also, a lot of folks in the community use "phpc" as a shorthand and tag
> to mean "PHP community," though I'm not sure this is reason enough not
> to use the .phpc extension.

Noted. Less critical than the .pyc concern in my view, but it does reinforce that .phpc is not the obviously-best choice. I will list candidate extensions in the RFC and explicitly invite the list to pick one rather than defending a specific letter.

Thanks again for the substantive feedback.

Hendrik Mennen
Maintainer, PHPolygon

On 5/27/26 09:27, Hendrik Mennen wrote:

> On 27.05.2026 at 15:19, Ben Ramsey <ramsey@php.net> wrote:
>
>> The problem with assigning meaning to a file extension is that the
>> interpreter (currently) doesn't care what the file extension is. As long
>> as it's text, it'll process any file and execute what comes after any
>> <?php tags.
>
> Right, that is the current behavior, and changing it is exactly what the proposal is about. The engine would learn to check the file extension at the entry point (zend_compile_file for SAPI/CLI, and the include/require family for nested loads) and use that to set the initial lexer state. The .php behavior remains untouched.

The behavior right now is that the file extension isn't checked. So, whether it's .php, .phtml, .php3, .rb, .py, or .txt doesn't matter. Using .php is a convention, not a requirement. If the proposal places any restrictions on the file extension, that's a major BC break.

So, the logic will need to be something like: if .phpc, then parse assuming there are no <?php tags, otherwise assume there must be <?php tags.

> That said, you point indirectly at something I do need to address: there are entry paths where the engine does not have a filename, or has one it should not trust. Off the top of my head:
>
> - stdin (cat foo.phpc | php). No filename available. Options: require an explicit CLI flag (php --pure), or simply not support this path and document it.
> - eval(). Operates on strings, not files, and already starts in scripting mode without <?php. Extension is irrelevant here; eval() stays as it is today.
> - Phar archives. Internal entries have filenames, so dispatch by extension should work, but I would need to verify.
> - include of a URL stream (rare, often disabled). Same question. Probably handled by extension on the URL path.
>
> I will work these out explicitly in the RFC draft. Thanks for surfacing it.
>
>> At the risk of bike-shedding, I think it could be easy for others to
>> confuse .phpc files as byte-code files, since it's common to see Python
>> byte-code files with the .pyc extension.
>
> Fair point, and one I had not weighed heavily enough. The Python .pyc parallel is real and would cause exactly the kind of one-time confusion that adds friction to adoption. Boutell's 2012 RFC used .phpp (Pure PHP) for the same purpose, which avoids the bytecode association. I am open to .phpp or other suggestions; the disambiguator matters less than the mechanism.

I'm still unsure about assigning any meaning to the extension. Maybe this is something that could be handled at the SAPI configuration level similar to how .phps files are configured? Likewise, maybe the CLI should have a `-p` flag that tells it to process the input without checking for <?php tags.

For what it's worth, `php foo.phps` still executes the file. You need to run `php -s foo.phps` to output HTML syntax-highlighted source. The .phps extension has no meaning to the interpreter.

That said, I'm not sure how you'd handle this with include/require or in Phar files.

Cheers,
Ben

On 28.05.2026 at 04:31, Ben Ramsey <ramsey@php.net> wrote:

Hi Ben,

> I'm still unsure about assigning any meaning to the extension. Maybe
> this is something that could be handled at the SAPI configuration
> level similar to how .phps files are configured?

Thanks, and also thanks for the correction on .phps. I had the wrong mental model there: I assumed the extension carried interpreter meaning, when in fact it is purely an Apache handler convention plus the php -s flag. Useful to have that straight before drafting.

That said, SAPI-level configuration is exactly the path I do not think can carry the full mechanism, and I think your own observation at the end of your mail is the reason:

> That said, I'm not sure how you'd handle this with include/require or
> in Phar files.

Right, and this is the crux. include, require, and Phar entry resolution all happen below the SAPI layer, inside the engine. A web server handler mapping .phpc to a pure-code mode would work for the directly-served entry file, but the moment that file does require __DIR__ . '/lib/something.phpc', the engine has to decide what to do with the included file on its own, with no SAPI in the loop. The same applies to CLI: php script.php works through one path, php -f script.php through another, both bypassing any web-server config. And Phar internals never see SAPI at all.

So I think the dispatch has to live in the engine. The extension is just the most ergonomic signal I can think of for the engine to use, but I am open to other engine-level signals (a magic first line, a declare-like marker, even a per-Phar manifest flag). What I do not think works is delegating the decision to layers above the engine, because those layers do not cover all entry paths.

> Likewise, maybe the CLI should have a -p flag that tells it to process
> the input without checking for <?php tags.

This I would take, as a complement rather than a replacement. The stdin case (cat foo | php) is exactly where there is no filename to inspect, and a -p flag is a clean way to handle it. I will include this in the draft as the explicit answer to the stdin entry path.

Would you find the proposal more palatable if the RFC framed the extension dispatch as one of several engine-level signals (with the extension being the recommended default, but a declare-style marker or a CLI flag covering the cases where extension is unavailable), rather than as the sole mechanism? I am trying to understand whether your objection is to extension-based dispatch specifically, or to the broader idea of the engine making this decision at all.

Hendrik Mennen
Maintainer, PHPolygon

On 5/27/26 23:02, Hendrik Mennen wrote:

> I am trying to understand whether your objection is to extension-based
> dispatch specifically, or to the broader idea of the engine
> making this decision at all.

I don't have a strong objection to the engine making this decision, but
I'd like to hear from others first.

Cheers,
Ben

On 28/05/2026 at 06:32, Ben Ramsey wrote:

> On 5/27/26 23:02, Hendrik Mennen wrote:
>
>> I am trying to understand whether your objection is to extension-based
>> dispatch specifically, or to the broader idea of the engine
>> making this decision at all.
>
> I don't have a strong objection to the engine making this decision, but
> I'd like to hear from others first.
>
> Cheers,
> Ben

I think this can be summarised with this:

  * File extension cannot have a "meaning" for the engine, as the engine
    does not care about it at all in the first place
  * All SAPIs will consider a "normal PHP file" as default input, be it
    CLI, mod_php for Apache, or anything else (even FrankenPHP)

IMO this means that the default entrypoint for "parsing and executing PHP files" for all SAPIs must not change, but /might/ either be complemented with a boolean arg for "pure PHP"/"no <?php ?> tags", or the engine must provide another public entrypoint for this that SAPIs could implement or not.

An environment variable could even be used for that, and be included in the engine by default, so that all existing SAPIs could be updated at lower cost, but the variable name would have to be checked against all existing Composer packages (something like "PHP_INPUT_PURE=0|1" or similar, TBD anyway), and it would only be used for the *first included file*. Any subsequent call to "include/require" would still behave as before, to avoid BC breaks.

And after that, to make use of this feature in userland projects, I would suggest an "include_pure" (and its "require" + "_once" variants) global keyword, to keep full compatibility everywhere.

WDYT?

Hi Alex,

Thanks for jumping in with concrete alternatives. Let me address each, because I think the most important thing for the RFC draft is to be precise about what the actual constraints are versus what is just the current default.

>   * File extension cannot have a "meaning" for the engine, as the engine
>     does not care about it at all in the first place

I would push back on this gently, because I think it is a description of current behavior, not a constraint. The engine does have the filename available in zend_compile_file and in the include/require resolution path. Teaching it to check the extension there is a small, additive change, not a violation of any architectural rule I am aware of. Other languages do this routinely (Python differentiates .py from .pyc at the loader level). So the question is not “can the engine know about extensions” but “should we use that as the signal here.” Happy to be told there is an architectural reason I am missing, but if the objection is just “this is not how PHP does it today,” then the whole RFC is by definition asking to change that.

>   * All SAPIs will consider a "normal PHP file" as default input, be it
>     CLI, mod_php for Apache, or anything else (even FrankenPHP)

Agreed, and nothing in the proposal changes that. Existing .php files behave exactly as today through every SAPI. The proposal is additive: a new file type that the engine handles differently, with zero impact on the existing default path.

> An environment variable could even be used for that, […] and would
> only be used for first included file. Any subsequent call to
> "include/require" would still behave as before, to avoid BC breaks.

This is where I see a real problem with the env-var approach. If only the entry file is pure and everything it includes must be mixed-mode, then pure code cannot be shipped as a library. A library author writing pure code could never be loaded from a mixed-mode application, and vice versa. The adoption story dies on contact with Composer.

For pure-code files to be useful in practice, the parsing context has to attach to the file itself, not to the entry-point execution. Otherwise the feature is limited to a single bootstrap script, which is too narrow to justify the engine change.

> And after that, to make use of this feature in userland projects, I
> would suggest an "include_pure" (and its "require" + "_once" variants)
> global keyword

This is essentially Tom Boutell’s 2012 approach (rfc/source_files_without_opening_tag), which used a modified include syntax. Boutell ultimately abandoned that RFC, citing concerns that I would summarize as: the caller decides how the file is parsed, not the author. That inverts the usual relationship between a library and its consumer. A library author cannot ship pure-code files with the guarantee that they will be parsed as such, because consumers might forget include_pure and load them with regular include. The same physical file would parse differently depending on how it was loaded, which is a discoverability and tooling nightmare.

That said, I do not want to dismiss include_pure entirely. As a complement to extension-based dispatch, it has merit:

  • For stdin and eval-like situations, where no filename exists, an explicit per-call signal is exactly what is needed (Ben suggested php -p for the CLI stdin case, similar idea).
  • For interop scenarios where someone wants to force pure-code parsing on a file that does not have the extension, it provides an explicit escape hatch.

What I think does not work is replacing the extension-based mechanism with include_pure, because then we lose the “file declares its own parsing context” property.

A possible synthesis:

  1. Primary mechanism: file extension (the file itself declares its parsing context, author-controlled).
  2. CLI complement: -p flag for stdin (Ben’s suggestion).
  3. Userland complement: include_pure or similar for explicit override (your suggestion, but as override not primary).

This gives every entry path a clear answer and keeps author intent intact. I will work this into the RFC draft.
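
The precedence implied by this synthesis could be modelled roughly as follows. This is only a sketch: neither a -p flag nor include_pure exists in PHP today, and the function name, its parameters, and the hard-coded 'phpc' extension are placeholders.

```php
<?php
// Hypothetical model of the three signals; not an engine API.
// Returns true when parsing would start in scripting mode (no "<?php").
function startsInScriptingMode(
    ?string $filename,
    bool $cliPureFlag = false,   // stands in for a "php -p" style flag
    bool $includePure = false    // stands in for an include_pure-style call
): bool {
    // 3. Explicit override by the caller wins when it is used.
    if ($includePure) {
        return true;
    }

    // 2. stdin has no filename, so only the CLI flag can opt in.
    if ($filename === null) {
        return $cliPureFlag;
    }

    // 1. Primary mechanism: the file declares its own mode by extension.
    return pathinfo($filename, PATHINFO_EXTENSION) === 'phpc';
}

var_dump(startsInScriptingMode('src/Mesh.phpc'));
var_dump(startsInScriptingMode('index.php'));
var_dump(startsInScriptingMode(null, true));
var_dump(startsInScriptingMode('legacy.inc', false, true));
```

In this model an ordinary include of a .php file is unaffected by any of the three signals, which matches the additive intent of the proposal.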

Does the engine-level dispatch via extension still feel wrong to you once the alternative is framed this way, or is your objection more specific to the .phpc letter than to the mechanism?

Hendrik Mennen
Maintainer, PHPolygon

I see a security concern in introducing a new file extension.

It’s common to configure a web server to pass locations that end with .php to a PHP interpreter.

Nginx example:

location ~ \.php$ {
    include snippets/fastcgi-php.conf;
    fastcgi_pass unix:/run/php/php8.2-fpm.sock;
}

Apache2 example:

<FilesMatch \.php$>
    SetHandler application/x-httpd-php
</FilesMatch>

With a new file extension, users would be forced to change their configs, or a direct request to a .phpX file would expose its source code.

This will come as a surprise to users who don’t know about the pure syntax yet include libraries that use it.

On 28.05.2026 at 11:10, go.al.ni@gmail.com wrote:

I see a security concern in introducing a new file extension.

It's common to configure a web server to pass locations that end with .php to a PHP interpreter.

Nginx example:

location ~ \.php$ {
    include snippets/fastcgi-php.conf;
    fastcgi_pass unix:/run/php/php8.2-fpm.sock;
}

Apache2 example:

<FilesMatch \.php$>
    SetHandler application/x-httpd-php
</FilesMatch>

With a new file extension, users would be forced to change their configs, or a direct request to a .phpX file would expose its source code.

This will come as a surprise to users who don't know about the pure syntax yet include libraries that use it.

Hi,

This is a legitimate concern and one the RFC must address head-on rather than wave away. Let me engage with it directly.

With a new file extension, users would be forced to change their
configs, or a direct request to a .phpX file would expose its source
code. This will come as a surprise to users who don't know about the
pure syntax yet include libraries that use it.

The risk is real, and it is structurally identical to the problem that .inc files have caused for decades and that .module files caused in older Drupal setups. The path of attack is: a Composer library uses pure-code files, the consumer's web server has no handler for the new extension, an attacker requests the file by direct URL, and the server returns the source verbatim. Database credentials, API keys, business logic all exposed.

So I think the RFC has to do at least four things to take this seriously:

1. Mandatory Security section. A dedicated section in the RFC describing the exposure risk, the affected configurations, and the mitigations. Not buried, not phrased as "users should configure properly," but as a first-class concern.

2. Canonical web server configurations as part of the RFC. Apache (FilesMatch with SetHandler, plus a defensive deny rule for any other extension fallthrough), Nginx (location block extending the pattern to cover the new extension), Caddy, and FrankenPHP snippets. These would also go into php.net documentation on acceptance.

3. Composer ecosystem coordination. Before submitting the formal RFC, I plan to talk to the Composer maintainers about both autoload support and about whether the conventional vendor/ layout should be documented as the canonical mitigation (vendor outside web root, or vendor denied at the web server level). This is already a best practice for other sensitive files, but it should be promoted to documented requirement when pure-code libraries become possible.

4. Consideration of a transitional default. One option worth weighing: ship the pure-code extension support disabled by default at the SAPI level for the first PHP release that introduces it, requiring administrators to opt in via configuration. This trades some adoption friction for stronger safety during the transition period. I have not yet decided on this and would value the list's input.

What I do not think works as a mitigation:

- Engine-side refusal to serve pure-code files as text. By the time the web server is returning the file as static content, PHP is not in the request path at all. There is nothing the engine can do.
- Renaming the extension to something less guessable. The same direct-request attack would still find it once libraries adopt a convention.
- Relying on users to figure it out. That is exactly the assumption that has made .inc and .module a recurring incident class.

The one structural argument in favor of the proposal here is that the population at risk is bounded by adoption: users who do not opt in to pure-code files do not have any new exposure, and users who do opt in are by definition aware of the new file type and can be expected (with proper documentation) to update their server config. But that argument only holds if the documentation is unmissable, which is exactly why the Security section has to be primary.

Thanks for surfacing this, it is going straight into the draft.

Hendrik Mennen
Maintainer, PHPolygon

> * All SAPIs will consider a "normal PHP file" as default input, be it
> CLI, mod_php for Apache, or anything else (even FrankenPHP)

Agreed, and nothing in the proposal changes that. Existing .php files behave exactly as today through every SAPI. The proposal is additive: a new file type that the engine handles differently, with zero impact on the existing default path.

This is not entirely true, as the handling of `*.phpc` (or whatever
extension you are going to choose) _will_ change. People could already
be using files with that extension. The BC impact is non-zero, though
problems are arguably unlikely.

On Thu, 28 May 2026, Alex Rock wrote:

On 28/05/2026 at 06:32, Ben Ramsey wrote:
> On 5/27/26 23:02, Hendrik Mennen wrote:
> >
> > I am trying to understand whether your objection is to extension-
> > based dispatch specifically, or to the broader idea of the engine
> > making this decision at all.
> >
>
> I don't have a strong objection to the engine making this decision, but
> I'd like to hear from others first.

I think this can be summarised with this:

* A file extension cannot have a "meaning" for the engine, as the engine
   does not care about it at all in the first place
* All SAPIs will consider a "normal PHP file" as default input, be it
   CLI, mod_php for Apache, or anything else (even FrankenPHP)

IMO this means that the default entrypoint for "parsing and executing PHP
files" for all SAPIs must not change, but /might/ either be complemented with
a boolean arg for "pure PHP"/"no <?php ?> tags", or the engine must provide
another public entrypoint for this that SAPIs could implement or not.

An environment variable could even be used for that, and be included in the
engine by default, so that all existing SAPIs could be updated at lower cost,
but the variable name would have to be checked against all existing Composer
packages (something like "PHP_INPUT_PURE=0|1", or similar, TBD anyway), and
would only be used for the *first included file*. Any subsequent call to
"include/require" would still behave as before, to avoid BC breaks.

And after that, to make use of this feature in userland projects, I would
suggest an "include_pure" (and its "require" + "_once" variants) global
keyword, to keep full compatibility everywhere.

WDYT?

I think this adds a lot of complexity, for dubious benefit: not having
to start PHP code files with "<?php".

cheers,
Derick

On Tue, Jun 2, 2026 at 4:15 AM Derick Rethans <derick@php.net> wrote:

On Thu, 28 May 2026, Alex Rock wrote:

On 28/05/2026 at 06:32, Ben Ramsey wrote:

On 5/27/26 23:02, Hendrik Mennen wrote:

I am trying to understand whether your objection is to extension-
based dispatch specifically, or to the broader idea of the engine
making this decision at all.

I don’t have a strong objection to the engine making this decision, but
I’d like to hear from others first.

I think this can be summarised with this:

  • A file extension cannot have a “meaning” for the engine, as the engine
    does not care about it at all in the first place
  • All SAPIs will consider a “normal PHP file” as default input, be it
    CLI, mod_php for Apache, or anything else (even FrankenPHP)

IMO this means that the default entrypoint for “parsing and executing PHP
files” for all SAPIs must not change, but /might/ either be complemented with
a boolean arg for “pure PHP”/“no <?php ?> tags”, or the engine must provide
another public entrypoint for this that SAPIs could implement or not.

An environment variable could even be used for that, and be included in the
engine by default, so that all existing SAPIs could be updated at lower cost,
but the variable name would have to be checked against all existing Composer
packages (something like “PHP_INPUT_PURE=0|1”, or similar, TBD anyway), and
would only be used for the first included file. Any subsequent call to
“include/require” would still behave as before, to avoid BC breaks.

And after that, to make use of this feature in userland projects, I would
suggest an “include_pure” (and its “require” + “_once” variants) global
keyword, to keep full compatibility everywhere.

WDYT?

I think this adds a lot of complexity, for dubious benefit: not having
to start PHP code files with “<?php”.

cheers,
Derick

Question - what is the performance hit of scanning the file for <?php and, if none are found, restarting the parse process in code mode?

If the hit isn’t significant, this might be a way forward. There is the BC break that files fed through the parser with nothing to parse will start creating errors, but that situation (a php file with no <?php at all) feels like an error state anyway. Thoughts?

On 04/06/2026 3:47 pm, Michael Morris wrote:

Question - what is the performance hit of scanning the file for <?php and, if none are found, restarting the parse process in code mode?

If the hit isn't significant, this might be a way forward. There is the BC break that files fed through the parser with nothing to parse will start creating errors, but that situation (a php file with no <?php at all) feels like an error state anyway. Thoughts?

It is wrong to assume a PHP file always contains PHP code. Many PHP files only contain HTML just because it is totally valid to do so. Then the question is how to handle pure no-code files.

Additionally, you would also need to scan for <?= (the short echo tag).
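
For illustration, a minimal userland sketch of such a check, using the tokenizer extension rather than a plain string search. The function name containsOpenTag is hypothetical, and this only approximates what an engine-level pre-scan would have to do; it is not how the lexer is implemented. Note that whether a bare "<?" counts as an open tag depends on the short_open_tag setting.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: reports whether $source contains any PHP open tag.
 * token_get_all() returns everything outside PHP tags as T_INLINE_HTML,
 * so a file without tags yields no T_OPEN_TAG* tokens at all.
 */
function containsOpenTag(string $source): bool
{
    foreach (token_get_all($source) as $token) {
        // Tokens are arrays [id, text, line]; single characters are strings.
        if (is_array($token)
            && in_array($token[0], [T_OPEN_TAG, T_OPEN_TAG_WITH_ECHO], true)) {
            return true;
        }
    }
    return false;
}
```

A file consisting only of "Hello, World!" makes this return false, which is exactly the case a "restart in code mode" heuristic would misclassify.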

As others have mentioned, a potential RFC would introduce unnecessary security and BC risks, which would most likely only surprise users. So I don't really see the real benefit of it.

--
Regards,

Jordi Kroon

On 4 June 2026 15:34:31 BST, Jordi Kroon <jordikroon@me.com> wrote:

On 04/06/2026 3:47 pm, Michael Morris wrote:

If the hit isn't significant, this might be a way forward. There is the BC break that files fed through the parser with nothing to parse will start creating errors, but that situation (a php file with no <?php at all) feels like an error state anyway. Thoughts?

It is wrong to assume a PHP file always contains PHP code. Many PHP files only contain HTML just because it is totally valid to do so. Then the question is how to handle pure no-code files.

Rasmus (Lerdorf, PHP's creator) likes to point out that PHP has the shortest "hello world application" of any non-esoteric language:

Hello, World!

More realistically, I just checked the codebase I work on with "ack" (a grep replacement):

ack --php -L '<\?'

That found 14 files which will be run as PHP but don't contain the string "<?". Some of those are empty files, but about half are "templates" which happen not to have any dynamic content.
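
For anyone without ack, a rough PHP sketch of the same check. It assumes only files with a .php extension matter (ack's --php type may match more than that), the function name filesWithoutOpenTag is hypothetical, and str_contains() requires PHP 8.0 or later.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: lists *.php files under $root that never contain
 * the two-character sequence "<?" anywhere in their contents.
 *
 * @return string[] sorted list of matching paths
 */
function filesWithoutOpenTag(string $root): array
{
    $found = [];
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $file) {
        // Skip directories and anything that is not a .php file.
        if (!$file->isFile() || $file->getExtension() !== 'php') {
            continue;
        }
        $contents = file_get_contents($file->getPathname());
        if ($contents !== false && !str_contains($contents, '<?')) {
            $found[] = $file->getPathname();
        }
    }
    sort($found);
    return $found;
}
```

Empty files and static templates both show up in the result, matching the two groups described above.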

So yeah, it would be an entirely unnecessary disruption to perfectly good code.

Regards,

Rowan Tommins
[IMSoP]

I think this adds a lot of complexity, for dubious benefit: not having
to start PHP code files with “<?php”.

Question - what is the performance hit of scanning the file for <?php and, if none are found, restarting the parse process in code mode?

If the hit isn’t significant, this might be a way forward. There is the BC break that files fed through the parser with nothing to parse will start creating errors, but that situation (a php file with no <?php at all) feels like an error state anyway.

As most projects use dynamic autoloading you’d have to add a stat call for a second filename, to try both .php and .phpc files. The performance hit for that is much bigger than any minuscule gain.

On Fri, Jun 5, 2026 at 9:07 AM Casper Langemeijer <langemeijer@php.net> wrote:

I think this adds a lot of complexity, for dubious benefit: not having
to start PHP code files with “<?php”.

Question - what is the performance hit of scanning the file for <?php and, if none are found, restarting the parse process in code mode?

If the hit isn’t significant, this might be a way forward. There is the BC break that files fed through the parser with nothing to parse will start creating errors, but that situation (a php file with no <?php at all) feels like an error state anyway.

As most projects use dynamic autoloading you’d have to add a stat call for a second filename, to try both .php and .phpc files. The performance hit for that is much bigger than any minuscule gain.

I would expect that most projects in production are using Composer, and that they are using the optimized autoloader when installing in production.
That builds a class map from each class to its exact file, so no stat call is made.
I would say this is a problem that can be solved at the autoloader level, and I recall writing an implementation similar to what Composer does, 10+ years ago, before I was using Composer.
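
To make the two lookup strategies concrete, here is a minimal sketch. Both function names are hypothetical, the .phpc extension is the one proposed in this thread, and the code only resolves paths: it never includes the file, since current PHP would treat a tagless .phpc file as inline HTML. It is not Composer's actual implementation.

```php
<?php
declare(strict_types=1);

/**
 * Dynamic lookup (hypothetical): maps a class name to a path and probes
 * the filesystem once per candidate extension. A class stored as .phpc
 * costs two is_file() checks, which is the extra stat call discussed above.
 */
function locateByProbing(string $baseDir, string $class): ?string
{
    $relative = str_replace('\\', DIRECTORY_SEPARATOR, $class);
    foreach (['.php', '.phpc'] as $extension) {
        $path = $baseDir . DIRECTORY_SEPARATOR . $relative . $extension;
        if (is_file($path)) {
            return $path;
        }
    }
    return null;
}

/**
 * Class-map lookup (hypothetical): the map is built ahead of time, so
 * resolving a class is a plain array access with no filesystem probing.
 *
 * @param array<string, string> $classMap class name => absolute path
 */
function locateFromClassMap(array $classMap, string $class): ?string
{
    return $classMap[$class] ?? null;
}
```

With a prebuilt map the extension is just part of the stored path, so adding a second extension changes the map generator rather than the per-request lookup.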


Alex

On 05/06/2026 at 09:04, Alexandru Pătrănescu wrote:

On Fri, Jun 5, 2026 at 9:07 AM Casper Langemeijer <langemeijer@php.net> wrote:

        I think this adds a lot of complexity, for dubious benefit:
        not having
        to start PHP code files with "<?php".

    Question - what is the performance hit of scanning the file for
    <?php and, if none are found, restarting the parse process in
    code mode?

    If the hit isn't significant, this might be a way forward. There
    is the BC break that files fed through the parser with nothing to
    parse will start creating errors, but that situation (a php file
    with no <?php at all) feels like an error state anyway.

    As most projects use dynamic autoloading you'd have to add a stat
    call for a second filename, to try both .php and .phpc files. The
    performance hit for that is much bigger than any minuscule gain.

I would expect that most projects in production are using Composer, and that they are using the optimized autoloader when installing in production.
That builds a class map from each class to its exact file, so no stat call is made.
I would say this is a problem that can be solved at the autoloader level, and I recall writing an implementation similar to what Composer does, 10+ years ago, before I was using Composer.

--
Alex

A language-level syntax that is only available when you actually implement it yourself in the autoloader, or that is otherwise provided by Composer, isn't a language-level syntax. That is one of the numerous criticisms of the current partly-erased generics RFC. Having a feature be opt-in is nice, but opt-in via a custom autoload implementation isn't good.

Right now I'm working on a legacy codebase that doesn't use Composer. I won't be able to use this unless I customize the autoloader myself, or copy-paste some code from Composer itself or from some other implementation. That doesn't seem like a language improvement IMO.