Hi
On 2025-11-03 15:51, Gina P. Banyard wrote:
> While the < > syntax to "force" the endianness of a sequence specifier is nice, if this requires rewriting the whole parser, as this RFC implies, then you are asking someone to commit to a larger amount of work than they signed up for, which is considered bad RFC etiquette. [1]
I disagree with that claim in the RFC, and to put my money where my mouth is, I have spent the 15 minutes it took to write the necessary patch for the pack() function. It is attached to this email and also available as this gist: https://gist.github.com/TimWolla/d8bca56a6507226e684827d2a7b44829. Given the time spent, I've only given it light testing, but it passes all existing `pack()` tests and returns the correct output for:
<?php
var_dump(bin2hex(pack('s<2s>2', 258, -2, 258, -2)));
var_dump(bin2hex(pack('a>', 258)));
I used `perl -e "print pack('s<2s>2', 258, -2, 258, -2)" | xxd` as a comparison. I have not created the patch for `unpack()`, but I believe this is already a sufficient demonstration that “rewriting the whole parser” is not necessary at all.
Best regards
Tim Düsterhus
(Attachment 0001-pack-Support-endian-specifier.patch is missing)
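For readers without the patch applied, the bytes that `s<2s>2` is expected to produce can be cross-checked with format codes that stock PHP already has. This is a minimal sketch, assuming that `v` and `n` (16-bit little-endian and big-endian) write only the low 16 bits of a negative integer, so that they yield the same two's-complement bytes the proposed `s<` and `s>` would:

```php
<?php
// 'v' = 16-bit little-endian, 'n' = 16-bit big-endian; both exist in stock PHP.
// pack() writes only the low 16 bits of each integer, so -2 becomes 0xFFFE.
$bytes = pack('v2n2', 258, -2, 258, -2);
$hex = bin2hex($bytes);

// 258 = 0x0102 -> "0201" (LE) / "0102" (BE); -2 = 0xFFFE -> "feff" (LE) / "fffe" (BE)
echo $hex, PHP_EOL; // prints "0201feff0102fffe"
```

This says nothing about the patch itself; it only gives a reference value to compare the `xxd` output against.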
Hi
On 2025-10-31 14:27, Alexandre Daubois wrote:
> I reworked the wording a bit and labeled the implementation as
> "proposed PR" instead of "current PR" to reduce potential confusion.
This does not resolve the factual issues with the RFC. The “Why Perl's Approach Is Not The Best Fit For PHP” section still contains the incorrect statements that I previously pointed out in my email on September 16th (php.internals: “Re: [RFC] Add pack()/unpack() support for signed integers with specific endianness”). With regard to "(2) Parser Architecture Limitations" specifically, please see my previous reply to Gina.
With regard to the “Considered Alternatives”, it is also not clear to me what “complex migration path” there should be. Supporting specific endianness for signed integers is a new feature. There is no migration path.
Best regards
Tim Düsterhus
On Mon, Nov 3, 2025, at 17:09, Tim Düsterhus wrote:
> Hi
> On 2025-11-03 15:51, Gina P. Banyard wrote:
> > While the < > syntax to “force” the endianness of a sequence specifier
> > is nice.
> > But if this requires rewriting the whole parser as this RFC implies,
> > then you are asking someone to commit to a larger amount of work than
> > they signed up, which is considered bad RFC etiquette. [1]
> I disagree with that claim in the RFC and to put my money where my mouth
> is, I have spent the 15 minutes of writing the necessary patch for the
> pack() function. It is attached to this email and also available as this
> gist: https://gist.github.com/TimWolla/d8bca56a6507226e684827d2a7b44829.
> Given the time spent, I’ve only given it light testing, but it passes
> all existing pack() tests and returns the correct output for:
> <?php
> var_dump(bin2hex(pack('s<2s>2', 258, -2, 258, -2)));
> var_dump(bin2hex(pack('a>', 258)));
> Using `perl -e "print pack('s<2s>2', 258, -2, 258, -2)" |xxd` as a
> comparison. I have not created the patch for `unpack()`, but I believe
> this is already sufficient demonstration that “rewriting the whole
> parser” is not necessary at all.
> Best regards
> Tim Düsterhus
> **Attachments:**
> - 0001-pack-Support-endian-specifier.patch
Please don’t do this.
For those of us using pack()/unpack(), I don’t really care how much like or unlike Perl it is, and having to switch strings based on PHP version because someone wanted it like Perl sounds like a special kind of hell. It’s already tricky enough to get pack/unpack right when dealing with binary data and having to do it twice plus maintain two different versions of the same string… no thank you.
It’s also used subtly in all kinds of unexpected places (TOTP calculations, encryption polyfills, etc). This kind of change would almost necessitate a major version bump of PHP.
— Rob
On 03/11/2025 18:33, Rob Landers wrote:
> Please don’t do this.
> For those of us using pack()/unpack(), I don’t really care how much like or unlike Perl it is, and having to switch strings based on PHP version because someone wanted it like Perl sounds like a special kind of hell. It’s already tricky enough to get pack/unpack right when dealing with binary data and having to do it twice plus maintain two different versions of the same string… no thank you.
AFAIU the old way of doing things won't break with Tim's suggestion. So there's no need to switch strings.
It just adds the possibility of using <>.
I agree it's already tricky enough to get things right, which is _exactly_ why Tim's approach is the right one. Instead of adding more arbitrarily chosen letters we now have a more meaningful way to indicate endianness. It also is proven by Tim's patch that this isn't hard to achieve. While implementation-wise adding some more letters is easier, Tim's patch isn't really difficult anyway.
I will vote against the RFC in its current form in favor of Tim's approach.
Kind regards
Niels
Hi,
> It just adds the possibility of using <>.
> I agree it's already tricky enough to get things right, which is _exactly_ why Tim's approach is the right one. Instead of adding more arbitrarily chosen letters we now have a more meaningful way to indicate endianness. It also is proven by Tim's patch that this isn't hard to achieve. While implementation-wise adding some more letters is easier, Tim's patch isn't really difficult anyway.
So, if I get it right, you would both prefer an RFC proposing to add <
and > for letters using machine endianness, with no effect on other
letters (like Perl does)? I am trying to think through possible edge and
error cases before settling on what I really think about this proposal.
If you have any tricky cases in mind that are worth investigating more
deeply when implementing modifiers, please let me know.
— Alexandre Daubois
Hi
On 2025-11-05 09:57, Alexandre Daubois wrote:
> So, if I get it right, you would both prefer a RFC proposing to add <
> and > for letters using machine endianness, with no effect on other
> letters (like Perl does)? I try to think about possible edge and error
Correct. More specifically: The modifiers should emit an error for unsupported letters instead of silently failing. This is what my proof-of-concept patch already implements and it's in line with unknown letters throwing:
php > var_dump(pack('?', 123));
PHP Warning: Uncaught ValueError: Type ?: unknown format code in php shell code:1
Other than that, I can't think of any edge cases worth handling.
Best regards
Tim Düsterhus
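The existing behaviour for unknown letters mentioned above can be reproduced on PHP 8 with a short script. This is only a sketch against stock `pack()`; the error raised for a modifier on an unsupported letter exists only in the proof-of-concept patch and is not exercised here.

```php
<?php
// Stock PHP 8: an unknown format letter makes pack() throw a ValueError
// instead of silently producing output.
try {
    pack('?', 123);
    $message = 'no error';
} catch (ValueError $e) {
    $message = $e->getMessage();
}

echo $message, PHP_EOL;
```

Catching `ValueError` rather than checking a return value matches how the shell session above reports the failure.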