[PHP-DEV] [RFC] [Discussion] array_first() and array_last()

Hi internals

I'm opening the discussion for the RFC "array_first() and array_last()".

Kind regards
Niels

On 5 Apr 2025, at 17:51, Niels Dossche <dossche.niels@gmail.com> wrote:

Hi internals

I'm opening the discussion for the RFC "array_first() and array_last()".
PHP: rfc:array_first_last

Kind regards
Niels

Hi Niels,

It is reasonable. I have had a userland implementation of this in my codebase for a long time.

You missed the following point when discussing the behaviour for an empty array:

* Consistent with `array_shift()` and `array_pop()`, which also retrieve the first and the last element of an array, respectively.

—Claude
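
For readers following along, the consistency point can be checked directly. This is a minimal sketch; the variable names are only illustrative. It shows that `array_shift()` and `array_pop()` already return `null` for an empty array, and that a stored `null` produces the same return value.

```php
<?php
// array_shift() and array_pop() return null for an empty array.
$empty = [];
$shifted = array_shift($empty); // null
$popped  = array_pop($empty);   // null

// A stored null cannot be told apart from "empty" by the return value alone.
$withNull = [null];
$fromNull = array_shift($withNull); // null

var_dump($shifted, $popped, $fromNull);
```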

On 05/04/2025 20:00, Claude Pache wrote:

Hi Niels,

It is reasonable. I have had a userland implementation of this in my codebase for a long time.

You missed the following point when discussing the behaviour for an empty array:

* Consistent with `array_shift()` and `array_pop()`, which also retrieve the first and the last element of an array, respectively.

Hi Claude

Good point, I'll add that as well, thanks.
There are probably even more arguments that can be made :)

Niels

On 2025-04-05 18:51, Niels Dossche wrote:

Hi internals

I'm opening the discussion for the RFC "array_first() and array_last()".
PHP: rfc:array_first_last

Kind regards
Niels

Hi,

Do you think it would be hard or wrong to add `array_nth`? I've had more trouble with that as the in-place implementation is usually pretty unreadable, e.g.

     array_slice(array_values($array), $offset, 1)[0] ?? null

I understand this need is less common, especially compared to `array_first`. But I also suspect that an in-engine implementation might have more performance gains compared to a userland one.

BR,
Juris

On 06/04/2025 18:06, Juris Evertovskis wrote:

I also suspect that an in-engine implementation might have more performance gains compared to a userland one.

That applies to literally anything, obviously; one less layer of abstraction is always more efficient. That's therefore not a reason to add something to the engine. I suggest first proving there is a legitimate need.

Cheers,
Bilge

On Mon, Apr 7, 2025 at 2:05 AM Bilge <bilge@scriptfusion.com> wrote:
... [snip] I suggest first proving there is a
legitimate need.

I did a quick GitHub search for a common pattern of accessing an array
value by using the `array_key_first` and `array_key_last` functions:

    $value = $arr[array_key_first($arr)];

- `[array_key_first(`: over 3,700 results[^1]
- `[array_key_last(`: over 4,300 results[^2]

All of these hits can benefit from the proposed `array_first` and `array_last`.
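
To make the pattern concrete, here is a minimal userland sketch of what those call sites are effectively doing, assuming PHP 8.0+ for the `mixed` type. The helper names `first_value()` and `last_value()` are hypothetical and not taken from the RFC; they return `null` for an empty array, which is the behaviour under discussion.

```php
<?php
// Hypothetical userland helpers; the names are illustrative only.
function first_value(array $array): mixed
{
    // array_key_first() returns null only when the array is empty.
    $key = array_key_first($array);
    return $key === null ? null : $array[$key];
}

function last_value(array $array): mixed
{
    // array_key_last() returns null only when the array is empty.
    $key = array_key_last($array);
    return $key === null ? null : $array[$key];
}

$results = ['a' => 10, 'b' => 20, 'c' => 30];
var_dump(first_value($results)); // int(10)
var_dump(last_value($results));  // int(30)
var_dump(first_value([]));       // NULL
```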

On Sun, Apr 6, 2025, at 7:47 PM, Ayesh Karunaratne wrote:

On Mon, Apr 7, 2025 at 2:05 AM Bilge <bilge@scriptfusion.com> wrote:
... [snip] I suggest first proving there is a
legitimate need.

I did a quick GitHub search for a common pattern of accessing an array
value by using the `array_key_first` and `array_key_last` functions:

    $value = $arr[array_key_first($arr)];

- `[array_key_first(`: over 3,700 results[^1]
- `[array_key_last(`: over 4,300 results[^2]

All of these hits can benefit from the proposed `array_first` and `array_last`.

Just registering that I am all for this RFC and look forward to using it.

--Larry Garfield

On 07/04/2025 01:47, Ayesh Karunaratne wrote:

On Mon, Apr 7, 2025 at 2:05 AM Bilge <bilge@scriptfusion.com> wrote:
... [snip] I suggest first proving there is a
legitimate need.

I did a quick GitHub search for a common pattern of accessing an array
value by using the `array_key_first` and `array_key_last` functions:

    $value = $arr[array_key_first($arr)];

  - `[array_key_first(`: over 3,700 results[^1]
  - `[array_key_last(`: over 4,300 results[^2]

All of these hits can benefit from the proposed `array_first` and `array_last`.

To be clear, I wasn't disputing first/last, but nth, but thanks for the insights!

Cheers,
Bilge

On 06/04/2025 19:06, Juris Evertovskis wrote:

Do you think it would be hard or wrong to add `array_nth`? I've had more trouble with that as the in-place implementation is usually pretty unreadable, e.g.

     array_slice(array_values($array), $offset, 1)[0] ?? null

I understand this need is less common, especially compared to `array_first`. But I also suspect that an in-engine implementation might have more performance gains compared to a userland one.

This is not hard to add. For packed arrays this can be done in O(1) time, for non-packed arrays in O(n) worst case.
If added, it's best to add the pair array_nth() & array_nth_key() together IMO.
However, it's not something I ever really needed nor something that I have seen a demand for.
I think this could be left for the future for follow-up work.

Kind regards
Niels
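
For illustration only, a userland sketch of such a pair might look like the following, assuming PHP 8.0+. The names `nth_key()` and `nth_value()` are hypothetical stand-ins (no `array_nth()` exists in PHP), and this version makes no attempt at the O(1) packed-array path an in-engine implementation could take.

```php
<?php
// Hypothetical helpers; neither function exists in PHP itself.
function nth_key(array $array, int $offset): int|string|null
{
    if ($offset < 0) {
        // array_slice() would count a negative offset from the end; reject it here.
        return null;
    }
    // preserve_keys: true keeps integer keys, so the original key is returned.
    // An offset past the end yields an empty slice, and array_key_first([]) is null.
    return array_key_first(array_slice($array, $offset, 1, true));
}

function nth_value(array $array, int $offset): mixed
{
    $key = nth_key($array, $offset);
    return $key === null ? null : $array[$key];
}

$data = [5 => 'a', 'x' => 'b', 9 => 'c'];
var_dump(nth_key($data, 2));   // int(9)
var_dump(nth_value($data, 1)); // string(1) "b"
var_dump(nth_value($data, 3)); // NULL
```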

On 4/5/2025 6:51 PM, Niels Dossche wrote:

Hi internals

I'm opening the discussion for the RFC "array_first() and array_last()".
PHP: rfc:array_first_last

Kind regards
Niels

-1 because returning `null` for empty arrays is still wrong. Whatever similar behavior exists should be corrected to throw `ValueError` in the future. Just my 2c.

On Sat, Apr 5, 2025 at 9:51 AM Niels Dossche <dossche.niels@gmail.com> wrote:

Hi internals

I'm opening the discussion for the RFC "array_first() and array_last()".
PHP: rfc:array_first_last

Kind regards
Niels

I dislike all the functions where `null` is a valid value, which can
also be confused with an error in the operation.

However, this is kind of in the grey area for me, because we do have
plenty of tools to avoid the error condition. Of course there's the
simple if:

    if (!empty($array)) {
        $first = array_first($array);
    }

And if we have a default value that could be used:

    $first = !empty($array) ? array_first($array) : $default;

I would prefer a better design tool than returning null, but being
honest, neither of these is too bad. As long as our documentation for
these functions is helpful but concise about showing how to work
around these edges, then I may be convinced to vote yes despite the
design.

On Tue, Apr 8, 2025, at 10:53, Daikaras wrote:

On 4/5/2025 6:51 PM, Niels Dossche wrote:

Hi internals

I’m opening the discussion for the RFC “array_first() and array_last()”.
https://wiki.php.net/rfc/array_first_last

Kind regards
Niels

-1 because returning null for empty arrays is still wrong. Whatever similar behavior exists should be corrected to throw ValueError in the future. Just my 2c.

I’ve always viewed arrays as an infinite field of nulls. The only time this isn’t true is when you use array_key_exists() or care about the warnings (which make it less useful, IMHO). When you treat it as an infinite field of nulls, you get some interesting mathematical/computational properties you can exploit. Trying to treat it as a map/indexed-array is when you start running into weird problems.

— Rob

On Mon, 7 Apr 2025 at 02:48, Ayesh Karunaratne <ayesh@php.watch> wrote:

> On Mon, Apr 7, 2025 at 2:05 AM Bilge <bilge@scriptfusion.com> wrote:
> ... [snip] I suggest first proving there is a
> legitimate need.

I did a quick GitHub search for a common pattern of accessing an array
value by using the `array_key_first` and `array_key_last` functions:

    $value = $arr[array_key_first($arr)];

- `[array_key_first(`: over 3,700 results[^1]
- `[array_key_last(`: over 4,300 results[^2]

All of these hits can benefit from the proposed `array_first` and `array_last`.

(I used the wrong reply button earlier..)

I suspect this is just the tip of the iceberg. You should look for
reset() and end().
I get 336K when I look for "/(= |return |\(|\[)(reset|end)\(\$/ language:PHP".
I get 8.3K when I look for "/\[array_key_(first|last)\(/ language:PHP".

There are more matches for "/reset\(" and "/end\(", but we only want
matches where the return value is used.

And indeed I want to never have to use reset() and end() again!
I don't recall when I ever intentionally used the internal array
pointer. Only as a workaround to get a first or last element.

-- Andreas
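
The side effect mentioned above is easy to demonstrate. A minimal sketch with illustrative variable names: `end()` returns the last element but also moves the internal array pointer, whereas key-based access leaves it untouched.

```php
<?php
$items = ['a', 'b', 'c'];
next($items);               // internal pointer now at 'b'
$last = end($items);        // returns 'c' and moves the pointer to the last element
$current = current($items); // 'c', no longer 'b'

// Key-based access does not touch the internal pointer.
$other = ['a', 'b', 'c'];
next($other);                                // internal pointer at 'b'
$lastByKey = $other[array_key_last($other)]; // 'c'
$stillCurrent = current($other);             // still 'b'

var_dump($last, $current, $lastByKey, $stillCurrent);
```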

On Tue, 8 Apr 2025 at 18:38, Levi Morrison <levi.morrison@datadoghq.com> wrote:

On Sat, Apr 5, 2025 at 9:51 AM Niels Dossche <dossche.niels@gmail.com> wrote:
>
> Hi internals
>
> I'm opening the discussion for the RFC "array_first() and array_last()".
> PHP: rfc:array_first_last
>
> Kind regards
> Niels

I dislike all the functions where `null` is a valid value, which can
also be confused with an error in the operation.

However, this is kind of in the grey area for me, because we do have
plenty of tools to avoid the error condition. Of course there's the
simple if:

    if (!empty($array)) {
        $first = array_first($array);
    }

And if we have a default value that could be used:

    $first = !empty($array) ? array_first($array) : $default;

I would prefer a better design tool than returning null, but being
honest, neither of these is too bad. As long as our documentation for
these functions is helpful but concise about showing how to work
around these edges, then I may be convinced to vote yes despite the
design.

(I used the wrong reply button earlier)

The problem with the above solutions is that you need to repeat the
array expression.
For a simple variable that is ok, but for a call, not so much.
So it is not good for nested expressions, you need to introduce a
local variable.
(which some believe you should do anyway, but...)

One way to get a custom unique "empty" value would be this:

    $zilch = new \stdClass();
    $first = array_first(some_lengthy_expression() ?: [$zilch]);
    if ($first === $zilch) { ... }

We will know for sure that $zilch cannot be a pre-existing array value.

In a lot of cases we already know that the array does not contain
NULL, then it works fine as a default.

-- Andreas

On 5 Apr 2025, at 17:51, Niels Dossche <dossche.niels@gmail.com> wrote:

Hi internals

I’m opening the discussion for the RFC “array_first() and array_last()”.
PHP: rfc:array_first_last

Kind regards
Niels

Hi,

I think that this argument is not convincing, and even counterproductive:

  • NULL is a rare legitimate value, so the potential for clashing is low

First, it says that it is a “rare” legitimate value, which one can disagree with. (I do disagree.)

Second, the way it is formulated, it implies that, when null is used in an array, there will be “clashing”, which is not necessarily the case. (I consider that there is almost never a clash, because it is rarely useful to distinguish between an explicit null and a missing value, and, when it is useful, you have almost surely already checked for an empty array upfront.)

I suggest replacing that argument with the following two, which don’t treat arrays with NULL as second-class citizens:

  • Semantically, NULL represents a missing value. Returning NULL from an empty array is not semantically incorrect, but it means that the function doesn’t differentiate between an implicit missing value (empty array) and an explicit missing value (array with NULL as its first/last element).

  • In the relatively rare cases you do want to make the difference between an empty array and an array that starts/ends with NULL, you can (and should) just check for empty array upfront.


One more thing. On https://www.php.net/, I read: “Fast, flexible and pragmatic, PHP powers everything from your blog to the most popular websites in the world.” (emphasis added). If we were to design some new perfect language, we might consider making array_first() (or equivalent) choke on empty arrays. But given the current state of affairs, the pragmatic thing to do is to pave the cowpath.

—Claude
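
As a sketch of the "check for an empty array upfront" advice, assuming PHP 8.0+: `first_or_null()` below is a hypothetical stand-in for the proposed `array_first()`, and `describe_first()` is an illustrative caller that separates the two kinds of missing value.

```php
<?php
// Hypothetical stand-in for the proposed array_first(): null on empty input.
function first_or_null(array $array): mixed
{
    $key = array_key_first($array);
    return $key === null ? null : $array[$key];
}

// Illustrative caller that needs to distinguish "empty" from "starts with null".
function describe_first(array $array): string
{
    if ($array === []) {
        return 'empty array'; // implicit missing value
    }
    $first = first_or_null($array);
    return $first === null
        ? 'explicit null' // the array really starts with null
        : 'value: ' . var_export($first, true);
}

echo describe_first([]), "\n";        // empty array
echo describe_first([null, 1]), "\n"; // explicit null
echo describe_first([7]), "\n";       // value: 7
```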

On Sat, Apr 5, 2025, at 10:51 AM, Niels Dossche wrote:

Hi internals

I'm opening the discussion for the RFC "array_first() and array_last()".
PHP: rfc:array_first_last

Kind regards
Niels

To add another argument: the reset() workaround doesn't work with a readonly array property, because it does modify the array's internal pointer.

Yes, I have in fact run into this situation in real code, which led me to use the $a[array_key_first($a)] dance instead. I'd love to replace it with a single call.

--Larry Garfield
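
A minimal sketch of that situation, assuming PHP 8.1+ for readonly properties. The `Basket` class and its method names are hypothetical. Because `reset()` takes its argument by reference, calling it on an initialized readonly property raises an `Error`, while the key-based lookup works.

```php
<?php
// Hypothetical class used only to illustrate the readonly limitation.
final class Basket
{
    public function __construct(public readonly array $items) {}

    public function firstViaReset(): mixed
    {
        // reset() needs a reference to the array, which is not allowed
        // for an initialized readonly property, so this throws an Error.
        return reset($this->items);
    }

    public function firstViaKey(): mixed
    {
        $key = array_key_first($this->items);
        return $key === null ? null : $this->items[$key];
    }
}

$basket = new Basket(['x' => 1, 'y' => 2]);

try {
    $basket->firstViaReset();
    $resetFailed = false;
} catch (\Error $e) {
    $resetFailed = true; // "Cannot modify readonly property ..."
}

$first = $basket->firstViaKey(); // 1
var_dump($resetFailed, $first);
```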

On 4/8/25 10:53 AM, Daikaras wrote:

On 4/5/2025 6:51 PM, Niels Dossche wrote:

Hi internals

I'm opening the discussion for the RFC "array_first() and array_last()".
PHP: rfc:array_first_last

Kind regards
Niels

-1 because returning `null` for empty arrays is still wrong. Whatever similar behavior exists should be corrected to throw `ValueError` in the future. Just my 2c.

I think consistency is very important, hence I will stick to my reasoning why returning NULL is the right thing at the moment.
If the behaviour of array functions on empty arrays is ever revised in the future, then we should keep everything consistent. So e.g. if a _future_ RFC decides to make array access throw instead of yield NULL, then this should affect array_{first,last} too IMO. However, that's not for now, that's a lot of ifs and hypothetical future development. That discussion should happen in the future elsewhere.

On 4/8/25 8:46 PM, Claude Pache wrote:

Hi,

I think that this argument is not convincing, and even counterproductive:

* NULL is a rare legitimate value, so the potential for clashing is low

First, it says that it is a “rare” legitimate value, which one can disagree with. (I do disagree.)

I think it's a matter of phrasing, but I agree that it can be more nuanced.

Second, the way it is formulated, it implies that, when `null` is used in an array, there will be “clashing”, which is not necessarily the case. (I consider that there is almost never a clash, because it is rarely useful to distinguish between an explicit `null` and a missing value, and, when it is useful, you have almost surely already checked for an empty array upfront.)

I suggest replacing that argument with the following two, which don’t treat arrays with NULL as second-class citizens:

* Semantically, NULL represents a missing value. Returning NULL from an empty array is not semantically incorrect, but it means that the function doesn’t differentiate between an implicit missing value (empty array) and an explicit missing value (array with NULL as its first/last element).

* In the relatively rare cases you do want to make the difference between an empty array and an array that starts/ends with NULL, you can (and should) just check for empty array upfront.

That's indeed a bit better worded and more nuanced. I might adapt the RFC text a bit; although in the end it kinda means the same thing anyway.

-------

One more thing. On https://www.php.net/, I read: “Fast, flexible and pragmatic, PHP powers everything from your blog to the most popular websites in the world.” (emphasis added). If we were to design some new perfect language, we might consider making `array_first()` (or equivalent) choke on empty arrays. But given the current state of affairs, the pragmatic thing to do is to pave the cowpath.

I agree. I value consistency; we have too much inconsistency already in PHP anyway, let's not add more (unless we would have a very very good reason to do so; but that's not in this case IMO).

On 05/04/2025 17:51, Niels Dossche wrote:

Hi internals

I'm opening the discussion for the RFC "array_first() and array_last()".
PHP: rfc:array_first_last

Kind regards
Niels

Hi

I'll be putting this to vote on Tuesday 22nd if no one has complaints.

Kind regards
Niels