Re: [PHP-DEV] [RFC] On the need of a `is_int_string` ?

On Thu, Aug 15, 2024, at 17:42, Vincent Langlet wrote:

Hi,

When a string is used as an array key, it is sometimes cast to an int.

As explained in https://www.php.net/manual/en/language.types.array.php:

“Strings containing valid decimal ints, unless the number is preceded by a + sign, will be cast to the int type. E.g. the key “8” will actually be stored under 8. On the other hand “08” will not be cast, as it isn’t a valid decimal integer.”
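To make the quoted rule concrete, here is a minimal sketch. The array contents are made up for illustration; only the keys matter.

```php
<?php
// Illustrative keys only; the values are placeholders.
$a = ['8' => 'a', '08' => 'b', '+8' => 'c', '8.4' => 'd'];

$keys = array_keys($a);
foreach ($keys as $key) {
    // gettype() reports "integer" for int keys and "string" otherwise.
    printf("%s => %s\n", var_export($key, true), gettype($key));
}
// 8 => integer
// '08' => string
// '+8' => string
// '8.4' => string
```

Only the canonical decimal form "8" changes type; the other three keep their string type even though is_numeric accepts all of them.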

This behavior causes some issues, especially for static analysis. As an example: https://phpstan.org/r/5a387113-de45-4bef-89af-b6c52adc5f69

vs real life https://3v4l.org/pDkoB

Currently, most static analysis tools rely on one or more native PHP functions to describe types.

PHPStan and Psalm support a numeric-string type thanks to the is_numeric function.

I don’t think there is a native function to know whether a key will be cast to an int. The implementation would be something similar (but certainly better, and in C) to:


    function is_int_string(string $s): bool
    {
        if (!is_numeric($s)) {
            return false;
        }

        $a[$s] = $s;

        return array_keys($a) !== array_values($a);
    }

Which gives:

    is_numeric('08')      => true
    ctype_digit('08')     => true
    is_int_string('08')   => false

    is_numeric('8')       => true
    ctype_digit('8')      => true
    is_int_string('8')    => true

    is_numeric('+8')      => true
    ctype_digit('+8')     => false
    is_int_string('+8')   => false

    is_numeric('8.4')     => true
    ctype_digit('8.4')    => false
    is_int_string('8.4')  => false

Such a function would make it easy to introduce an int-string type in static analysis, along with its opposite, non-int-string (cf. https://github.com/phpstan/phpstan/issues/10239#issuecomment-1837571316).

WDYT about adding an is_int_string function, then?

Thanks

Hello,

At the risk of bikeshedding, it would probably be better to define it in the array_* space, maybe something like array_key_is_string(string $key): bool?

As for your function definition, it can be simplified a bit:

return (($s[0] ?? '') > 0 || (($s[0] ?? '') === '-' && ($s[1] ?? '') > 0)) && is_numeric($s);

I believe that covers all the cases that I could think of:

01, -01, +01, 1, 1.2, -1, -1.2, ~1, ~01
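A quick way to check a list like this is to compare the expression against what the engine does with each key. The following is a rough sketch, not a definitive test: `casts_to_int_key` and `simplified` are names invented here, and '0' is added to the list as an extra case.

```php
<?php
// Ground truth: let the engine normalise the key, then look at its type.
function casts_to_int_key(string $s): bool
{
    $a = [$s => null];

    return is_int(key($a));
}

// The simplified expression from above, wrapped in a function (hypothetical name).
function simplified(string $s): bool
{
    return (($s[0] ?? '') > 0 || (($s[0] ?? '') === '-' && ($s[1] ?? '') > 0)) && is_numeric($s);
}

// The cases listed above, plus a bare zero.
$cases = ['01', '-01', '+01', '1', '1.2', '-1', '-1.2', '~1', '~01', '0'];

$mismatches = [];
foreach ($cases as $s) {
    if (simplified($s) !== casts_to_int_key($s)) {
        $mismatches[] = $s;
    }
}

var_export($mismatches);
```

Working through this by hand, the two seem to disagree on '1.2', '-1.2' and '0', so the expression probably needs something stricter than is_numeric for decimals and a special case for a bare zero.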

— Rob

I later found a simpler implementation, which relies on array_keys:

    function is_int_string(string $s): bool { return \is_int(array_keys([$s => null])[0]); }
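To see how this array_keys approach behaves at the edges, here is a small sketch. The inputs are my own picks, and the expected values follow my reading of the key-casting rule in the manual, so treat them as assumptions to verify.

```php
<?php
function is_int_string(string $s): bool
{
    return \is_int(array_keys([$s => null])[0]);
}

// [input, expected]: kept as pairs so the inputs are not themselves used as array keys.
$cases = [
    ['0', true],                  // a bare zero is a canonical integer
    ['-8', true],                 // a leading minus is allowed
    ['-0', false],                // but "-0" is not canonical
    ['08', false],                // leading zero
    [' 8', false],                // leading whitespace
    ['8 ', false],                // trailing whitespace
    ['1e3', false],               // exponent notation
    ['', false],                  // empty string
    [(string) PHP_INT_MAX, true], // largest value that still fits in an int
    [PHP_INT_MAX . '0', false],   // too large for an int, stays a string
];

foreach ($cases as [$s, $want]) {
    printf("%-24s %s\n", var_export($s, true), var_export(is_int_string($s), true));
}
```

Several of these ('8 ', '1e3') are accepted by is_numeric on recent PHP versions, which is why is_numeric alone cannot describe the int-string type.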

I thought is_int_string fit better alongside is_object, is_array, is_int, is_numeric, ..., but there may be a better name than int_string for this category of strings, since my English is not great (integish? integable? integerable?).
That said, tying this function to the array_* family could indeed make sense.

Anyway, this topic does not seem to interest many developers so far ^^’

On Fri, Aug 16, 2024 at 01:04, Rob Landers rob@bottled.codes wrote:
