Hi
On 2025-11-03 15:51, Gina P. Banyard wrote:
> While the < > syntax to "force" the endianness of a sequence specifier is nice.
> But if this requires rewriting the whole parser as this RFC implies, then you are asking someone to commit to a larger amount of work than they signed up for, which is considered bad RFC etiquette. [1]
I disagree with that claim in the RFC and, to put my money where my mouth is, I spent 15 minutes writing the necessary patch for the pack() function. It is attached to this email and also available as this gist: https://gist.github.com/TimWolla/d8bca56a6507226e684827d2a7b44829. Given the time spent, I've only given it light testing, but it passes all existing `pack()` tests and returns the correct output for:
<?php
var_dump(bin2hex(pack('s<2s>2', 258, -2, 258, -2)));
var_dump(bin2hex(pack('a>', 258)));
Using `perl -e "print pack('s<2s>2', 258, -2, 258, -2)" |xxd` as a comparison. I have not created the patch for `unpack()`, but I believe this is already sufficient demonstration that “rewriting the whole parser” is not necessary at all.
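For readers without the patch applied, the expected bytes can be reproduced on current PHP. This is only a sketch, not the patch itself: it assumes `s<` and `s>` mean 16-bit little-endian and big-endian as in Perl, and it emulates them with the existing unsigned codes `v` and `n` plus masking.

```php
<?php
// Emulate the proposed 's<2s>2' with codes that exist today:
// 'v' = unsigned 16-bit little-endian, 'n' = unsigned 16-bit big-endian.
// Masking with 0xFFFF yields the two's-complement bit pattern of a signed value.
$values = [258, -2];
$masked = array_map(static fn (int $v): int => $v & 0xFFFF, $values);

// Little-endian pair first, then the big-endian pair.
$hex = bin2hex(pack('v2n2', ...$masked, ...$masked));
var_dump($hex); // string(16) "0201feff0102fffe"
```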
Best regards
Tim Düsterhus
Hi
On 2025-10-31 14:27, Alexandre Daubois wrote:
> I reworked the wording a bit and labeled the implementation as
> "proposed PR" instead of "current PR" to reduce potential confusion.
This does not resolve the factual issues with the RFC. The “Why Perl's Approach Is Not The Best Fit For PHP” section still contains the incorrect statements that I pointed out in my email of September 16th (php.internals: "Re: [RFC] Add pack()/unpack() support for signed integers with specific endianness"). With regard to “(2) Parser Architecture Limitations” specifically, please see my previous reply to Gina.
With regard to the “Considered Alternatives”, it is also not clear to me what “complex migration path” there should be. Supporting specific endianness for signed integers is a new feature. There is no migration path.
Best regards
Tim Düsterhus
On Mon, Nov 3, 2025, at 17:09, Tim Düsterhus wrote:
> Hi
> On 2025-11-03 15:51, Gina P. Banyard wrote:
> > While the < > syntax to “force” the endianness of a sequence specifier
> > is nice.
> > But if this requires rewriting the whole parser as this RFC implies,
> > then you are asking someone to commit to a larger amount of work than
> > they signed up for, which is considered bad RFC etiquette. [1]
> I disagree with that claim in the RFC and to put my money where my mouth
> is, I have spent the 15 minutes of writing the necessary patch for the
> pack() function. It is attached to this email and also available as this
> gist: https://gist.github.com/TimWolla/d8bca56a6507226e684827d2a7b44829.
> Given the time spent, I’ve only given it light testing, but it passes
> all existing pack() tests and returns the correct output for:
> <?php
> var_dump(bin2hex(pack('s<2s>2', 258, -2, 258, -2)));
> var_dump(bin2hex(pack('a>', 258)));
> Using `perl -e "print pack('s<2s>2', 258, -2, 258, -2)" |xxd` as a
> comparison. I have not created the patch for `unpack()`, but I believe
> this is already sufficient demonstration that “rewriting the whole
> parser” is not necessary at all.
> Best regards
> Tim Düsterhus
Please don’t do this.
For those of us using pack()/unpack(), I don’t really care how much like or unlike Perl it is, and having to switch strings based on php version because someone wanted it like Perl sounds like a special kind of hell. It’s already tricky enough to get pack/unpack right when dealing with binary data and having to do it twice plus maintain two different versions of the same string… no thank you.
It’s also used subtly in all kinds of unexpected places (TOTP calculations, encryption polyfills, etc.). This kind of change would almost necessitate a major version bump of PHP.
— Rob
On 03/11/2025 18:33, Rob Landers wrote:
> Please don’t do this.
> For those of us using pack()/unpack(), I don’t really care how much like or unlike Perl it is, and having to switch strings based on php version because someone wanted it like Perl sounds like a special kind of hell. It’s already tricky enough to get pack/unpack right when dealing with binary data and having to do it twice plus maintain two different versions of the same string… no thank you.
AFAIU the old way of doing things won't break with Tim's suggestion. So there's no need to switch strings.
It just adds the possibility of using <>.
I agree it's already tricky enough to get things right, which is _exactly_ why Tim's approach is the right one. Instead of adding more arbitrarily chosen letters, we would have a more meaningful way to indicate endianness. Tim's patch also proves that this isn't hard to achieve. While adding some more letters is easier implementation-wise, Tim's patch isn't really difficult either.
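To illustrate the kind of existing code that would keep working unchanged, here is a minimal sketch of how a signed 16-bit big-endian value can be read today with only the current format codes. The helper name `unpackInt16BE()` is hypothetical and not taken from the RFC or the patch.

```php
<?php
// Hypothetical helper: read a signed 16-bit big-endian integer using only
// format codes available in current PHP.
function unpackInt16BE(string $bytes): int
{
    // 'n' decodes an unsigned 16-bit big-endian value (0..65535).
    $unsigned = unpack('n', $bytes)[1];

    // Reinterpret the top bit as the sign (two's complement).
    return $unsigned >= 0x8000 ? $unsigned - 0x10000 : $unsigned;
}

var_dump(unpackInt16BE("\xff\xfe")); // int(-2)
var_dump(unpackInt16BE("\x01\x02")); // int(258)
```

With modifiers, the manual sign handling is the part that would presumably become unnecessary; the format strings above are not affected either way.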
I will vote against the RFC in its current form in favor of Tim's approach.
Kind regards
Niels
Hi,
> It just adds the possibility of using <>.
> I agree it's already tricky enough to get things right, which is _exactly_ why Tim's approach is the right one. Instead of adding more arbitrarily chosen letters we now have a more meaningful way to indicate endianness. It also is proven by Tim's patch that this isn't hard to achieve. While implementation-wise adding some more letters is easier, Tim's patch isn't really difficult anyway.
So, if I understand correctly, you would both prefer an RFC proposing
to add < and > for letters using machine endianness, with no effect on
other letters (like Perl does)? I am trying to think through possible
edge and error cases before forming an opinion on this proposal. If
you have in mind any tricky aspects of implementing modifiers that
would be worth investigating more deeply, please let me know.
— Alexandre Daubois
Hi
On 2025-11-05 09:57, Alexandre Daubois wrote:
> So, if I get it right, you would both prefer a RFC proposing to add <
> and > for letters using machine endianness, with no effect on other
> letters (like Perl does)? I try to think about possible edge and error
Correct. More specifically: The modifiers should emit an error for unsupported letters instead of silently failing. This is what my proof-of-concept patch already implements and it's in line with unknown letters throwing:
php > var_dump(pack('?', 123));
PHP Warning: Uncaught ValueError: Type ?: unknown format code in php shell code:1
Other than that, I can't think of any edge cases worth handling.
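As a rough illustration of the "error instead of silently failing" behaviour, here is a userland sketch. It is not the proof-of-concept patch: the function name, the allow-list of codes, and the error message are all assumptions made for this example.

```php
<?php
// Hypothetical validator, not the actual patch: rejects '<' / '>' on format
// codes outside an assumed allow-list of machine-endian integer codes.
const ENDIAN_AWARE_CODES = 'sSlLqQ'; // assumption for illustration only

function assertEndianModifiersValid(string $format): void
{
    // Each element: a code character, an optional modifier, an optional repeater.
    preg_match_all('/(.)([<>]?)(\d+|\*)?/s', $format, $matches, PREG_SET_ORDER);

    foreach ($matches as [, $code, $modifier]) {
        if ($modifier !== '' && !str_contains(ENDIAN_AWARE_CODES, $code)) {
            // Message wording is invented for this sketch.
            throw new ValueError("Type $code: endianness modifier not supported");
        }
    }
}

assertEndianModifiersValid('s<2s>2'); // accepted

try {
    assertEndianModifiersValid('a>'); // 'a' is a string code, so this throws
} catch (ValueError $e) {
    echo $e->getMessage(), "\n";
}
```

The sketch only checks modifiers; unknown letters such as `?` would still be left to `pack()` itself to reject.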
Best regards
Tim Düsterhus
Hi everyone,
> I’d like to present this new RFC. When discussing the issue, we first thought that the RFC process wasn’t necessary. However, discussions on the PR showed that selecting new letters for pack and unpack is more challenging than we initially thought, thus creating an RFC for this change.
After rereading the threads and spending some time thinking about it
all, I propose a new version of this RFC aimed at adding Perl
modifiers. Indeed, this seems to be a better solution than the one
previously proposed, and several people seem to share this opinion.
The RFC URL is the same and its version has been bumped to 1.1:
PHP: rfc:pack-unpack-endianness-signed-integers-support
Looking forward to reading your feedback on this revision.
— Alexandre Daubois
Hi
On 11/21/25 11:46, Alexandre Daubois wrote:
> After rereading the threads and spending some time thinking about it
> all, I propose a new version of this RFC aimed at adding Perl
> modifiers. Indeed, this seems to be a better solution than the one
> previously proposed, and several people seem to share this opinion.
> The RFC URL is the same and its version has been bumped to 1.1:
> PHP: rfc:pack-unpack-endianness-signed-integers-support
> Looking forward to reading your feedback on this revision.
Thank you.
I only have one comment, on:
> Initially, endianness modifiers will only be supported for signed integer format codes (s, l, q) since unsigned integers already have dedicated endian-specific letters.
While there are already dedicated alternatives, I feel that limiting the new modifiers to the lowercase versions would be unnecessarily restrictive. Since the RFC argues that:
> 2. Intuitive semantics: The < and > symbols visually suggest byte order direction
which I agree with, the same argument applies to the uppercase QLS versions. As a developer I would rather remember l> as "signed long big-endian" and L> as "unsigned long big-endian" than N as "4-byte network-byte order".
Since there is no inherent limitation or ambiguity with supporting modifiers on QLS, I would suggest just allowing it. In fact I think my PoC patch already supported them.
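To make the comparison concrete, here is a small sketch of the correspondence being described. The mapping is an assumption about how the modifier forms would line up with the dedicated letters `pack()` already has; only the existing letters are actually executed.

```php
<?php
// Assumed correspondence between the proposed spelling and the dedicated
// letters that pack() already provides for unsigned integers.
$equivalents = [
    'S<' => 'v', // unsigned 16-bit little-endian
    'S>' => 'n', // unsigned 16-bit big-endian
    'L<' => 'V', // unsigned 32-bit little-endian
    'L>' => 'N', // unsigned 32-bit big-endian
    'Q<' => 'P', // unsigned 64-bit little-endian
    'Q>' => 'J', // unsigned 64-bit big-endian
];

// What 'L>' and 'L<' would be expected to produce, shown via today's 'N' and 'V'.
var_dump(bin2hex(pack($equivalents['L>'], 0x01020304))); // string(8) "01020304"
var_dump(bin2hex(pack($equivalents['L<'], 0x01020304))); // string(8) "04030201"
```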
There's also a formatting issue with the “Rationale” in the “Proposed Solution” section.
Best regards
Tim Düsterhus
Hi Tim,
On Sun, Nov 23, 2025 at 15:45, Tim Düsterhus <tim@bastelstu.be> wrote:
> > Initially, endianness modifiers will only be supported for signed integer format codes (s, l, q) since unsigned integers already have dedicated endian-specific letters.
> While there are already dedicated alternatives, I feel that restricting
> the new modifiers to the lowercase versions would be unnecessarily
> restrictive. Since the RFC argues that:
> > 2. Intuitive semantics: The < and > symbols visually suggest byte order direction
> which I agree with, the same argument applies to the uppercase QLS
> versions. As a developer I would rather remember l> as "signed long
> big-endian" and L> as "unsigned long big-endian" rather than N as
> "4-byte network-byte order".
> Since there is no inherent limitation or ambiguity with supporting
> modifiers on QLS, I would suggest just allowing it. In fact I think my
> PoC patch already supported them.
I agree. I just updated the text and tables to reflect the addition of
big and little endian unsigned integers throughout the document.
> There's also a formatting issue of the “Rationale” in the “Proposed
> Solution” section.
The text has been cleaned and simplified. Thanks!
— Alexandre Daubois
Hello Marc,
On Sun, Nov 23, 2025 at 18:04, Marc B. <marc@mabe.berlin> wrote:
> Quoting the docs from Perl, it's also supported to use <> modifiers on floating point values but I haven't found any note about it in your RFC. In my opinion it makes sense to allow these modifiers on fd as well for the same reasons as QLS.
Thanks for this information! I think that it would make sense. I added
this to the future scope section of the RFC.
While I'm eager to go deeper into the floating-point topic with
pack/unpack, I feel that it deserves a follow-up RFC so that this one
doesn't grow too much. This RFC focuses on integers, as its title and
URL suggest, but its core feature is actually adding support for
modifiers. If it is accepted, we would have plenty of time to create a
follow-up and implement it (especially since modifiers would already
have been accepted).
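For context on what such a follow-up would cover, `pack()` already offers fixed-endian float codes. The sketch below only runs those existing codes; the idea that future `f<`, `f>`, `d<` and `d>` forms would alias them is an assumption, not something the RFC states.

```php
<?php
// pack() already has fixed-endian float codes: 'e'/'g' for float and
// 'E'/'G' for double (little-/big-endian respectively).
// Assumption: a future 'f<', 'f>', 'd<', 'd>' would alias these.
var_dump(bin2hex(pack('g', 1.0))); // string(8) "3f800000"
var_dump(bin2hex(pack('e', 1.0))); // string(8) "0000803f"
var_dump(bin2hex(pack('G', 1.0))); // string(16) "3ff0000000000000"
var_dump(bin2hex(pack('E', 1.0))); // string(16) "000000000000f03f"
```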
— Alexandre Daubois
Hi
On 11/24/25 12:20, Alexandre Daubois wrote:
> I agree. I just updated the text and tables to reflect the addition of
> big and little endian unsigned integers throughout the document.
Thank you. In the “Complete PHP Format Letter Organization:” table you could also add “Unsigned machine-endian” for completeness (i.e. the uppercase QLS without modifier).
Other than that, I don't have further comments. The RFC LGTM.
Best regards
Tim Düsterhus
Hi Tim,
On Tue, Nov 25, 2025 at 23:17, Tim Düsterhus <tim@bastelstu.be> wrote:
> Thank you. In the “Complete PHP Format Letter Organization:” table you
> could also add “Unsigned machine-endian” for completeness (i.e. the
> uppercase QLS without modifier).
RFC updated with the new table row. Thanks!
— Alexandre Daubois
Hi everyone,
> I’d like to present this new RFC. When discussing the issue, we first thought that the RFC process wasn’t necessary. However, discussions on the PR showed that selecting new letters for pack and unpack is more challenging than we initially thought, thus creating an RFC for this change.
> Here is the link to the RFC: PHP: rfc:pack-unpack-endianness-signed-integers-support
This is a friendly reminder of this RFC. It's been 2 weeks since the
last discussion took place. I think the RFC is ready. We're arriving
at the holiday period, which is why I'm not starting the vote soon. I
plan to start the vote in January, after the holiday period.
In the meantime, if you have any feedback on the RFC, please let me know!
— Alexandre Daubois
Hi everyone,
On Wed, Dec 10, 2025 at 09:08, Alexandre Daubois
<alex.daubois+php@gmail.com> wrote:
> This is a friendly reminder of this RFC. It's been 2 weeks since the
> last discussion took place. I think the RFC is ready. We're arriving
> at the holiday period, which is why I'm not starting the vote soon. I
> plan to start the vote in January, after the holiday period.
I plan to open the vote on this RFC on Wednesday, January 14th. Please
let me know in the meantime if you'd like more info or if you have
concerns about this RFC.
Thanks!
— Alexandre Daubois