[PHP-DEV] [RFC] Deprecations for PHP 8.4

Hi

On 7/15/24 15:56, Christoph M. Becker wrote:

I do not, however, agree with the reasoning that a function (like
uniqid()) is often used in an unsafe way (i.e. for purposes it has not
been designed), and therefore should be deprecated/removed. There are
likely a couple of developers who are easily rolling their own
implementation which can be way worse. I've seen "encryption" code

That's effectively what's already being done, with complex, but meaningless, constructions such as `sha1(uniqid(microtime(true), true).rand())`. I probably did so myself in the past, back when I didn't know better and followed tutorials that also didn't know better.

I agree that it helps no one if the cure is worse than the disease.

That's why I strongly believe in misuse-resistant APIs: The secure choice should also be the default choice and also the easiest choice. The 'password_*' API is a great example for that. It has a few functions that do exactly what they say on the tin. You basically cannot use it incorrectly (except by not using it).
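As an illustration of why the 'password_*' API is hard to misuse, here is a minimal sketch of the usual hash, verify and rehash flow. The sample password is made up for the example; the three functions are the ones PHP ships.

```php
<?php
declare(strict_types=1);

// Hash on registration. PASSWORD_DEFAULT lets PHP pick the algorithm;
// a random salt is generated and embedded in the result automatically.
$hash = password_hash('correct horse battery staple', PASSWORD_DEFAULT);

// Verify on login. The algorithm, cost and salt are read from $hash.
$ok  = password_verify('correct horse battery staple', $hash);
$bad = password_verify('wrong guess', $hash);

// After a successful login, check whether the stored hash should be
// regenerated because the default algorithm or cost has changed.
$needsUpgrade = password_needs_rehash($hash, PASSWORD_DEFAULT);
```

There is no salt parameter to get wrong and no algorithm to pick unless you want to, which is the point being made about defaults.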

The old procedural API to use randomness is not such an example [1]. The `uniqid()` function is the most obvious choice to generate a unique string and when using it, the output also looks "random". But in reality it does not guarantee that the output is unique or unguessable - making the function name a lie.

It's almost always the wrong choice and I expect the type of developer that is able to use it safely to be able to write their own domain-specific formatter for the current time. All other users would be better served by choosing something else, such as the `bin2hex(random_bytes(16))` construction that is supported since PHP 7.0 and even longer with ParagonIE's polyfill [2]. Yes, it is not as terse as a call to `uniqid()`, but it is nicely explicit in what the output will look like and what the security properties are.
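A minimal sketch of the `bin2hex(random_bytes(16))` construction wrapped in a helper. The function name `random_id()` and its parameter are hypothetical and only serve the example; they are not part of PHP.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of PHP): returns a random identifier
 * as lowercase hex, two characters per random byte.
 */
function random_id(int $bytes = 16): string
{
    if ($bytes < 1) {
        throw new InvalidArgumentException('At least one byte is required.');
    }

    // random_bytes() draws from the operating system's CSPRNG.
    return bin2hex(random_bytes($bytes));
}

$id = random_id(); // 32 hex characters, i.e. 128 random bits
```

The length of the output and the amount of randomness are both visible at the call site, which is the explicitness referred to above.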

which was basically a Caesar cipher, spiced with some obscure function
calls to make it "even more safe". And I've seen obscure HTML escaping
code with a not-so-obvious back-door, that was once available as user
note on php.net.

That doesn't mean that I'm against the uniqid() deprecation, especially
if the deprecation message is clear on what to use instead.

I will make sure to write useful migration docs, helping users make an educated choice for an alternative. Unfortunately there is no one-size-fits-all solution to the problem of generating a unique string.

Best regards
Tim Düsterhus

[1] This includes uniqid(), rand(), mt_rand(), lcg_value(). random_bytes() and random_int() are fine.
[2] GitHub - paragonie/random_compat: PHP 5.x support for random_bytes() and random_int()

On Mon, Jul 15, 2024, at 23:29, Tim Düsterhus wrote:

Hi

On 7/15/24 16:12, Rob Landers wrote:

This always gets me. “safer” doesn’t have a consistent meaning. For

Yes it does. SHA-256 is safer than MD5. And on modern CPUs with sha_ni extensions, it’s also faster. The following is on an Intel i7-1365U:

$ openssl speed md5 sha1 sha256 sha512
snip
version: 3.0.10
built on: Wed Feb 21 10:45:39 2024 UTC
options: bn(64,64)
compiler: snip
CPUINFO: OPENSSL_ia32cap=0x7ffaf3ffffebffff:0x98c027bc239c27eb
The ‘numbers’ are in 1000s of bytes per second processed.
type         16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
md5        114683.10k   286174.51k   550288.90k   715171.50k   783611.22k   788556.46k
sha1       138578.57k   440607.38k  1082163.29k  1674088.45k  2017296.38k  2047377.41k
sha256     150670.11k   460483.71k  1054829.57k  1553830.57k  1807897.94k  1823981.57k
sha512      41246.76k   181566.07k   341457.66k   645468.50k   781042.81k   804296.02k


example, if you were to want to create a "content addressable address" using a hash and it needs to fit inside a 128 bit number (such as a GUID), you may be tempted to take SHA-X and just truncate it. However, this biases the resulting numbers, which this bias may

This is false. For a hash algorithm to be considered cryptographically secure (which I consider to be a reasonable definition of “safe”), it - among other properties - needs to have the “avalanche effect” property, which means that any change in the input is going to affect each output bit with 50% probability.

From a practical perspective, across hundreds of millions of hashes of unique IDs, I can say that there is a practical and detectable bias when truncating SHA-256 hashes. Enough that we were having to throw out A/B test results… I’m not going to write a paper on it and I’m not going to bother arguing the point that no hash function is perfect, but I will point out that “theory” and “reality” don’t always agree.

This means that for a cryptographic hash algorithm - such as the SHA-2 family - the resulting hash is indistinguishable from uniformly selected random bits. And this property also holds after truncation - you just have fewer bits of course.
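To make the truncation concrete, here is a minimal sketch of a 128-bit content address built from SHA-256. The helper name `content_address()` is hypothetical; the only assumption about the use case is the 128-bit width mentioned in the quoted message.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: a 128-bit content address taken from the first
 * 16 bytes of the SHA-256 digest, returned as 32 hex characters.
 */
function content_address(string $content): string
{
    $digest = hash('sha256', $content, true); // 32 raw bytes

    return bin2hex(substr($digest, 0, 16));
}

$address = content_address('hello');
```

The result is simply the leading half of the full digest. Having fewer bits means collisions become likely sooner (roughly after 2^64 inputs for a 128-bit value, per the birthday bound), but the remaining bits are no less uniform.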

See also: https://security.stackexchange.com/a/34797/21705

be considered unsafe (such as using it in an A/B testing tool). Just because you have a short hash, doesn’t make it “unsafe” as longer hashes can also be considered “unsafe.” What people usually mean by this is in the context of encryption, and in those cases it is unsafe, but in the context of non-encryption, usage of truncated larger hashes is just as unsafe.

I’m afraid I don’t understand what you are attempting to say here.

Best regards
Tim Düsterhus

— Rob

On Tue, Jul 16, 2024, at 01:08, Rob Landers wrote:

I have been corrected. The issue was due to a modulus causing the bias deeper in the code.
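The code that caused the bias is not shown in this thread, so the following is only an illustrative sketch of how a modulus can skew an otherwise uniform value. It reduces every possible value of one uniform byte into buckets and counts the hits; `bucket_counts()` is a hypothetical helper.

```php
<?php
declare(strict_types=1);

/**
 * Counts how often each bucket is hit when every value in 0..$range-1
 * is reduced with "% $buckets". A uniform input gives a uniform output
 * only if $buckets divides $range evenly.
 */
function bucket_counts(int $range, int $buckets): array
{
    $counts = array_fill(0, $buckets, 0);
    for ($value = 0; $value < $range; $value++) {
        $counts[$value % $buckets]++;
    }

    return $counts;
}

$biased = bucket_counts(256, 100); // one uniform byte into 100 buckets
$even   = bucket_counts(256, 64);  // 64 divides 256, so no bias
```

With 100 buckets, buckets 0 to 55 are hit three times and buckets 56 to 99 only twice, even though the input byte is perfectly uniform. The hash is not at fault in such a case; the reduction step is.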

— Rob

Hi Tim!

On 15.07.2024 at 23:50, Tim Düsterhus wrote:

That doesn't mean that I'm against the uniqid() deprecation, especially
if the deprecation message is clear on what to use instead.

I will make sure to write useful migration docs, helping users make an
educated choice for an alternative. Unfortunately there is no
one-size-fits-all solution to the problem of generating a unique string.

See also a respective GH issue regarding deprecation messages:
<Issues · php/php-src · GitHub>; it probably makes sense
to include a URL in the deprecation message, or maybe some code which
users can use to look up more thorough information about the deprecation.

Cheers,
Christoph

Hi

On 7/16/24 13:04, Christoph M. Becker wrote:

See also a respective GH issue regarding deprecation messages:
<Issues · php/php-src · GitHub>; it probably makes sense
to include a URL in the deprecation message, or maybe some code which
users can use to look up more thorough information about the deprecation.

To loop the list back in: I've replied to that issue that as part of the #[\Deprecated] RFC implementation new deprecation messages were added to (almost) all functions that are currently deprecated:

In the case of utf8_decode() / utf8_encode() the message just points towards the documentation, but there is an extensive explanation there discussing the possible alternatives. For the others, the message already points out the alternatives or explains the deprecation in another way (e.g. for the *_free() functions that became obsolete when moving from resources to objects).

For uniqid() there are already various warnings in the documentation and today a PR was merged to adjust the description text to avoid using the word "unique", because it was factually untrue:

If the deprecation is accepted, then the list of possible alternatives that are already mentioned in the RFC can be included in the documentation.

Best regards
Tim Düsterhus

On Tuesday, 25 June 2024 at 15:36, Gina P. Banyard <internals@gpb.moe> wrote:

Hello internals,

It is that time of year again where we propose a list of deprecations to add in PHP 8.4:

PHP: rfc:deprecations_php_8_4

As a reminder, this list has been compiled over the course of the past year by various different people.

And as usual, each deprecation will be voted in isolation.

We still have a bit of time buffer, so if anyone else has any suggestions, they are free to add them to the RFC.

Some should be non-controversial, others a bit more so.
If so, they might warrant their own dedicated RFC, or be dropped from the proposal altogether.

Best regards,

Gina P. Banyard

Hello internals,

It's been a bit over 3 weeks since the discussion started, and I intend to open the vote tomorrow.

Best regards,

Gina P. Banyard