[PHP-DEV] [RFC] Add pack()/unpack() support for signed integers with specific endianness

Hello Internals,

I’d like to present this new RFC. When we first discussed the issue, we thought the RFC process wasn’t necessary. However, discussions on the PR showed that selecting new letters for pack and unpack is more challenging than we initially expected, so we are creating an RFC for this change.

Here is the link to the RFC: https://wiki.php.net/rfc/pack-unpack-endianness-signed-integers-support

Best,
Alexandre Daubois

Hi Alexandre,

Thank you for your work on this. Of all the RFCs I’ve seen in a while, this is the one I’m most excited to see, having written a protobuf implementation.

If there is one thing I would be over the moon for, it would be also adding zigzag encoding as a possible signed integer encoding (maybe using Z/z as the letter). It is more efficient than two's complement for signed integers where a variable-length integer is desired. I can understand if this is out of scope, but I thought I’d ask.

— Rob

Hi Rob,

> Thank you for your work on this. Of all the RFCs I've seen in a while, this is the one I'm most excited to see, having written a protobuf implementation.

Happy to see you immediately have a use case in mind while reading this!

> If there is one thing I would be over the moon for, it would be also adding zigzag encoding as a possible signed integer encoding (maybe using Z/z as the letter).

I wasn't aware this encoding existed. I'm happy to learn about it!
Unfortunately, Z is already used by pack/unpack. After some quick
research, I think it would be a nice feature. I also noticed that
languages like Rust and Python seem to rely on external packages for
this. I guess it would be better to have dedicated functions. I don't
think it should be included in this RFC, but it is an interesting
idea nevertheless. I'd be really interested if someone came up with
an RFC for this feature!
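
For readers who have not met zigzag encoding either, here is a minimal userland sketch of the mapping as Protocol Buffers uses it for its sint types: 0, -1, 1, -2 become 0, 1, 2, 3, so small magnitudes stay small once varint-encoded. The function names are hypothetical and not part of pack()/unpack() or of this RFC. The sketch assumes 64-bit PHP and inputs whose magnitude stays below 2^62, so the encoded value remains a non-negative PHP int.

```php
<?php
// Hypothetical helpers for illustration only; not an existing PHP API.
// Assumes 64-bit PHP integers and |$n| < 2**62.

function zigzagEncode(int $n): int
{
    // $n >> 63 is an arithmetic shift: 0 for non-negative, -1 for negative input.
    return ($n << 1) ^ ($n >> 63);
}

function zigzagDecode(int $z): int
{
    // The lowest bit carries the sign; the remaining bits carry the magnitude.
    return ($z >> 1) ^ -($z & 1);
}

var_dump(zigzagEncode(-1)); // int(1)
var_dump(zigzagEncode(1));  // int(2)
var_dump(zigzagDecode(3));  // int(-2)
```

A dedicated function pair like this would sit alongside a varint encoder rather than inside a pack() format letter.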

— Alexandre Daubois

Hi

On 2025-09-16 13:45, Alexandre Daubois wrote:

> Here is the link to the RFC:
> https://wiki.php.net/rfc/pack-unpack-endianness-signed-integers-support

Thank you for the RFC. I'm confused by the “Why Perl's Approach Cannot Be Used in PHP” section.

> 1. Base Letters Already Taken

The point of a modifier is to modify something. That means that there needs to be a “base letter”. The same base letters are also “taken” in Perl and have the same definition.

> 2. Parser Architecture Limitations

That sounds like a simple problem to solve. When reaching one of the “base letters” in question, look at the next character.

> 3. Different Design Philosophy

This is simply false. v/n/V/N identically exist in Perl. J is not clear to me, and P appears to be different (but I don't do enough Perl to say for certain).

Best regards
Tim Düsterhus

Hi

On 2025-09-16 16:10, Alexandre Daubois wrote:

> [...] I guess it would be better to have dedicated functions. I don't
> think it should be included in this RFC, but that is an interesting
> idea nevertheless. I'd be really interested if someone comes up with
> an RFC for this feature!

A better `pack()` with a streamlined format “description” would probably fit with Ignace's proposed new Encoding extension, discussed in the thread “[RFC][DISCUSSION] Add RFC 4648 compliant data encoding API”.

Best regards
Tim Düsterhus

Hi,

> 1. Base Letters Already Taken

> The point of a modifier is to modify something. That means that there
> needs to be a “base letter”. The same base letters are also “taken” in
> Perl and have the same definition.

> 2. Parser Architecture Limitations

> That sounds like a simple problem to solve. When reaching one of the
> “base letters” in question, look at the next character.

Indeed. I'm just not sure it is worth adding modifier support to pack
and unpack, since with this addition most (if not all) cases should
already be covered.

> This is simply false. v/n/V/N identically exist in Perl. J is not clear
> to me, and P appears to be different (but I don't do enough Perl to say
> for certain).

Perl and PHP share common letters, you are right. But looking at the
table for each language, there are many differences. I may reword
this section, though: the RFC currently states that the differences
appear in the endianness-specific letters, and you showed that this is
not true. The many differences are actually in the formats that are
not endian-specific.

— Alexandre Daubois

Hi

On 2025-09-16 16:50, Alexandre Daubois wrote:

> Indeed, it is just that I'm not sure that it is worth adding modifiers
> support to pack and unpack as with this addition, most (if not all)
> cases should be covered then.

I don't think it is worth blocking more letters that are also incompatible with the language that pack() was “borrowed” from.

> This is simply false. v/n/V/N identically exist in Perl. J is not clear
> to me, and P appears to be different (but I don't do enough Perl to say
> for certain).

> Perl and PHP share common letters, you are right. But looking at the

They don't just “share common letters”. The pack() function comes directly from Perl, and that is also documented:

> The idea for this function was taken from Perl and all formatting codes work the same as in Perl. However, there are some formatting codes that are missing such as Perl's "u" format code.

> table of each language, there are many differences. However, I may
> reword it as I realized that the RFC states that differences appear
> with specific endianness letters (and you showed that it's not true).
> There are many differences when we're not talking about endian
> specific formats actually.

I'm not sure if “all formatting codes work the same” is still 100% accurate (due to J and P), but “many differences” is definitely false. I'm seeing the following differences:

- J and P might or might not be different.
- e, E, g, and G don't exist in Perl (it would be d<, d>, f<, and f> respectively; these format specifiers could be deprecated if this RFC ships).

Both w and W are already taken in Perl and would actually be a difference.

Best regards
Tim Düsterhus

Hi,

> I'm not sure if “all formatting codes work the same” is still 100%
> accurate (due to J and P), but “many differences” is definitely false.
> I'm seeing the following differences:
>
> - J and P might or might not be different.
> - e, E, g, and G don't exist in Perl (it would be d<, d>, f<, and f>
>   respectively; these format specifiers could be deprecated if this RFC
>   ships).
>
> Both w and W are already taken in Perl and would actually be a
> difference.

I'm not sure I understand exactly what you mean. Do you propose to
introduce < and > modifiers and deprecate some letters, if I get it
right?

— Alexandre Daubois

Hi

On 2025-09-17 09:09, Alexandre Daubois wrote:

> I'm not sure I understand exactly what you mean. Do you propose to
> introduce < and > modifiers and deprecate some letters, if I get it
> right?

I'm proposing to introduce the `<` and `>` modifiers, following Perl's lead. This would then allow deprecating e, E, g, and G in a future version, and if and when that happens, PHP would be in sync with Perl again. I'm not proposing that the deprecation happen at the same time, because folks should have at least one PHP version in which the new logic is available but the old one is not yet deprecated, so that they can migrate.
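
To make the look-ahead idea concrete, here is a rough userland sketch, not a description of how ext/standard parses formats. The hypothetical function `translatePerlFormat()` rewrites a Perl-style code followed by `<` or `>` into the existing PHP letter with the same meaning. The mapping table is an assumption based on the documented meanings of v/n/V/N/P/J and e/E/g/G. The signed codes (s, l, q) have no existing PHP letter, so the sketch rejects them.

```php
<?php
// Hypothetical sketch: map Perl-style "<" / ">" modified codes to existing PHP letters.
const MODIFIED_CODES = [
    'S<' => 'v', 'S>' => 'n', // unsigned 16-bit, little / big endian
    'L<' => 'V', 'L>' => 'N', // unsigned 32-bit
    'Q<' => 'P', 'Q>' => 'J', // unsigned 64-bit
    'd<' => 'e', 'd>' => 'E', // double
    'f<' => 'g', 'f>' => 'G', // float
];

function translatePerlFormat(string $format): string
{
    $out = '';
    $len = strlen($format);
    for ($i = 0; $i < $len; $i++) {
        $code = $format[$i];
        $next = $format[$i + 1] ?? ''; // look at the next character
        if ($next === '<' || $next === '>') {
            $mapped = MODIFIED_CODES[$code . $next] ?? null;
            if ($mapped === null) {
                throw new InvalidArgumentException("No existing PHP code for '{$code}{$next}'");
            }
            $out .= $mapped;
            $i++; // consume the modifier
        } else {
            $out .= $code; // unmodified codes and repeaters pass through
        }
    }
    return $out;
}

var_dump(translatePerlFormat('S<L>d<')); // string(3) "vNe"
```

The rejected combinations (`s<`, `l>`, and so on) are exactly the signed cases the RFC is about.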

Best regards
Tim Düsterhus

On 16.09.2025 at 13:45, Alexandre Daubois <alex.daubois+php@gmail.com> wrote:

> I’d like to present this new RFC. When discussing the issue, we first thought that the RFC process wasn’t necessary. However, discussions on the PR showed that selecting new letters for pack and unpack is more challenging than we initially thought, thus creating an RFC for this change.
>
> Here is the link to the RFC: https://wiki.php.net/rfc/pack-unpack-endianness-signed-integers-support

Just a little side-note: unpackToSignedInt can be implemented more easily, e.g. like

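function unpackToSignedInt(string $bytes): int
{
        // 'V' reads an unsigned 32-bit little-endian integer; use 'N' for big endian.
        $uint32 = unpack('V', $bytes)[1];
        // Values of 2^31 and above are negative in two's complement (assumes 64-bit PHP).
        return $uint32 < 2 ** 31 ? $uint32 : $uint32 - 2 ** 32;
}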
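
The same two's complement adjustment should carry over to other widths and to either byte order. The following is only a sketch: `toSigned()` is a hypothetical helper name, and it assumes 64-bit PHP and a width below 64 bits so that `1 << $bits` does not overflow.

```php
<?php
// Hypothetical helper: reinterpret an unsigned $bits-wide value as two's complement.
// Assumes 64-bit PHP and $bits < 64.
function toSigned(int $unsigned, int $bits): int
{
    $signBit = 1 << ($bits - 1);
    return $unsigned < $signBit ? $unsigned : $unsigned - (1 << $bits);
}

// 16-bit little endian: bytes FE FF are 0xFFFE unsigned.
$le16 = toSigned(unpack('v', "\xFE\xFF")[1], 16);
// 32-bit big endian: bytes FF FF FF FE are 0xFFFFFFFE unsigned.
$be32 = toSigned(unpack('N', "\xFF\xFF\xFF\xFE")[1], 32);

var_dump($le16, $be32); // int(-2) int(-2)
```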

While I understand the wish to have every size in every endianness, I don't think it is that big of an issue for people dealing with binary data, so I'm +-0 on whether PHP really needs it.

Regards,
- Chris