[PHP-DEV] [RFC] [Discussion] Followup Improvements for ext/uri

On Tue, Dec 16, 2025 at 7:14 PM Juris Evertovskis <juris@glaive.pro> wrote:

On 2025-12-16 09:53, ignace nyamagana butera wrote:

Since we will be dealing with arrays the following rules could be updated when parsing the string using PHP behaviour:

  • “&a” should be converted to ['a' => null]

Hey Ignace,

In practice valueless arguments like ?debug are most often “flags” or “booleans” and their presence implies truthiness.

Do you think it would be wrong or confusing to have it converted to ['debug' => true]?

I’m worried that ['a' => null] would not be that handy since both $params['a'] and isset($params['a']) would return falsy which would likely be opposite to the intended value.
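
To make the concern concrete, here is a minimal sketch of how a null value behaves with the usual presence checks. The $params array is a hand-written stand-in for a parsed "?debug" query, not the output of any proposed API:

```php
<?php
// Hand-written stand-in for a parsed "?debug" query string (assumption).
$params = ['debug' => null];

var_dump(isset($params['debug']));            // bool(false): null counts as "not set"
var_dump((bool) $params['debug']);            // bool(false): null is falsy
var_dump(array_key_exists('debug', $params)); // bool(true): the key itself is present
```

Only array_key_exists() reports the presence of the key here, which is why a null value can be surprising for flag-style parameters.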

BR,
Juris

Hi Juris,

Do you think it would be wrong or confusing to have it converted to ['debug' => true]?

Yes, IMHO it would be wrong, because flag parameters or booleans are converted to ['debug' => 1].
['debug' => null] expresses the presence of the name and the absence of a value associated with it.
Let’s see how it is currently done:

The WHATWG URL living standard does the following:

let url = new URL('https://example.com?debug&foo=bar&debug=');
console.log(
    url.searchParams.toString(), // returns 'debug=&foo=bar&debug='
);

The valueless pair gets converted to ['debug' => '']. The round trip does not preserve the query string as is, but all key/value pairs (tuples) are present.

In PHP you currently have the following behaviour:

example 1

parse_str('debug&foo=bar&debug=', $params);
var_dump($params, http_build_query($params));
// $params is ['debug' => '', 'foo' => 'bar']
// after the round trip you get 'debug=&foo=bar'

example 2

parse_str('debug&foo=bar&debug=1', $params);
var_dump($params, http_build_query($params));
// $params is ['debug' => '1', 'foo' => 'bar']
// after the round trip you get 'debug=1&foo=bar'

So you lose data, and the pairs can end up in a different order:
parse_str converts the first debug into ['debug' => '']
parse_str then overwrites that value with the later one (this may be a security concern if you need to hash/validate your query string)

Since IMHO interoperability and security are important, you should prefer an algorithm that preserves the original query.
The proposed solution is already in use, for instance in League/Uri or in Guzzle:


echo Uri::withQueryValues(Utils::uriFor('https://example.com'), [
    'debug' => null,
    'foo' => 'bar',
    'baz' => '',
]), PHP_EOL;
// https://example.com?debug&foo=bar&baz=

Because Guzzle uses an associative array, the debug variable can only appear once, but there is a difference between using null and the empty string.
This improves interoperability with other languages, and you no longer have data loss or random query re-arrangement.
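
As an illustration of the pair-preserving idea, here is a minimal sketch that keeps every pair, its order, and the difference between a missing value (null) and an empty one (''). The helper names parseQueryPairs() and buildQueryPairs() are hypothetical, and the use of rawurldecode()/rawurlencode() (which do not treat "+" as a space) is an assumption of this sketch, not something the RFC specifies:

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: splits a query string into ordered [name, value] pairs.
 * A part without "=" gets a null value; "name=" gets an empty string.
 */
function parseQueryPairs(string $query): array
{
    $pairs = [];
    foreach (explode('&', $query) as $part) {
        // Pad the result so that a missing "=" yields a null value.
        [$name, $value] = explode('=', $part, 2) + [1 => null];
        $pairs[] = [rawurldecode($name), $value === null ? null : rawurldecode($value)];
    }

    return $pairs;
}

/** Hypothetical helper: rebuilds the query string from the ordered pairs. */
function buildQueryPairs(array $pairs): string
{
    $parts = [];
    foreach ($pairs as [$name, $value]) {
        $encoded = rawurlencode($name);
        $parts[] = $value === null ? $encoded : $encoded . '=' . rawurlencode($value);
    }

    return implode('&', $parts);
}

$pairs = parseQueryPairs('debug&foo=bar&debug=');
var_dump($pairs === [['debug', null], ['foo', 'bar'], ['debug', '']]); // bool(true)
echo buildQueryPairs($pairs), PHP_EOL; // debug&foo=bar&debug=
```

Storing a list of tuples rather than an associative array is what allows the repeated debug name to survive the round trip.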

Last but not least, the Query objects proposed by Máté all expose:

  • a has method, which always tells whether the key is present regardless of its value, equivalent to array_key_exists;
  • a way to have the same parameter appear multiple times in the query string.

So IMHO it is an improvement to also allow the distinction between null and the empty string so we can finally write in PHP

echo (new Uri\Rfc3986\Query())
    ->append('debug', null)
    ->append('foo', 'bar')
    ->append('debug', '')
    ->toRfc3986String();
// debug&foo=bar&debug=

Best regards,
Ignace

Hi Ignace,

Currently, in your proposal you have 2 Query objects. This will give the developer a lot of work to understand where, when and which object to choose and why. Is that complexity really needed? IMHO we may end up with a correct API ... that no one will use.

Just to reiterate what I wrote to Juris a few days ago: I’m open to unifying the two classes, but I’m just hesitant because of security and evolvability reasons (but the main one is security).

With all that in mind I believe a single `Uri\Query` should be used. Its goal should be:

- to be immutable
- to store the query in its decoded form.
- to manipulate and change the data in a consistent way.

So far, I imagined the two QueryParams classes to be mutable because one of their main goals is to be able to build (~ mutate) query param list…
But otherwise an immutable implementation would be useful for sure.

Decoding/encoding should happen at the object boundaries but everything inside the object should
be done on decoded data.

Yes, that’s what I also had to find out based on my experience with implementing the POC, so I completely agree here.

On a bonus side, it would be nice to have a mechanism in PHP that allows the application to switch
from the current `parse_str` usage to the new improved parsing provided by the new class when
populating the `_GET` array. (So that deprecating `parse_str` can be initiated in some distant future.)
This last observation/remark is not mandatory but nice to have.

This is a very interesting remark, and I have not thought about this possibility yet. Generally, I agree with
the idea, but my long-term goal (or wish) is to move away from using $_GET and $_POST to access request
data in favor of using objects… So I most probably won’t deal with trying to implement this idea. However,
I’m willing to add a UriQueryParams::fromCurrentQueryString(), maybe even a UriQueryParams::fromCurrentBody()
or similar factory methods if people like it.

- in respect to `parse_str`, no mangled data should occur on parsing:

Uh, I completely forgot about this behavior of parse_str(), and I definitely agree that mangling shouldn’t happen.

- Only accept scalar values, `null`, and `array`. If an object or a resource is detected a `ValueError` error
should be thrown.

I wasn’t sure what to do with objects, but I’m happy to skip their support, especially if they would cause issues.
The rest of the suggestions align with my initial plans (maybe with the exception of throwing ValueError – I wrote
TypeError in the related section).

- Remove the addition of indices if the `array` is a list.

Yes, this also aligns with my initial plans.
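
For context, a short sketch of what "adding indices" means today, next to one possible index-free serialization. The helper buildWithoutListIndices() is hypothetical and only handles string values and flat lists; array_is_list() requires PHP 8.1 or later:

```php
<?php
declare(strict_types=1);

$data = ['foo' => ['bar', 'baz']];

// Current behaviour: http_build_query() adds numeric indices to list items.
echo http_build_query($data), PHP_EOL; // foo%5B0%5D=bar&foo%5B1%5D=baz

/**
 * Hypothetical helper: repeats the name for list values instead of adding indices.
 * Only string values and flat lists of strings are handled in this sketch.
 */
function buildWithoutListIndices(array $data): string
{
    $parts = [];
    foreach ($data as $name => $value) {
        if (is_array($value) && !array_is_list($value)) {
            throw new InvalidArgumentException('Only lists are handled in this sketch.');
        }
        // A single value is treated as a one-item list.
        foreach ((array) $value as $item) {
            $parts[] = rawurlencode((string) $name) . '=' . rawurlencode((string) $item);
        }
    }

    return implode('&', $parts);
}

echo buildWithoutListIndices($data), PHP_EOL; // foo=bar&foo=baz
```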

Best regards,
Máté

Hi,

The WHATWG URL living standard does the following:

let url = new URL('https://example.com?debug&foo=bar&debug=');
console.log(
    url.searchParams.toString(), // returns 'debug=&foo=bar&debug='
);

The valueless pair gets converted to ['debug' => '']. The round trip does not preserve the query string as is, but all key/value pairs (tuples) are present.

Yes, confirmed. Unfortunately, WHATWG URL only supports string values, so there’s no way to support
query parameters without a value (e.g. ?debug) in the RFC implementation either. :(

On the other hand, the RFC 3986 implementation supports this notion, even uriparser calls this out
in its documentation: https://uriparser.github.io/doc/api/latest/index.html#querystrings.

However, there are a few other problems which came up when I was updating my implementation.

1.) Yesterday, I wrote that name mangling of the query params shouldn’t happen. However, as I realized,
it is still needed for non-list arrays, because the [..] suffix must be added to their name:

$params = new Uri\Rfc3986\UriQueryParams()
    ->append("foo", [2 => "bar", 4 => "baz"]);

echo $params->toRfc3986String(); // foo%5B2%5D=bar&foo%5B4%5D=baz

var_dump($params->getFirst("foo")); // NULL

Even though I appended params with the name “foo”, no items can be returned when calling getFirst(),
because of name mangling.

2.) I’m not really sure how empty arrays should be represented? PHP doesn’t retain them, and they are
simply skipped. But should we do the same thing? I can’t really come up with any other sensible behavior.

$params = new Uri\Rfc3986\UriQueryParams()
    ->append("foo", []);

echo $params->toRfc3986String(); // ???

3.) We wrote earlier that objects shouldn’t be supported when creating the query string from variables. But what about
backed enums? Support for them in http_build_query() was added not long ago: https://github.com/php/php-src/pull/15650
We should support them, right?

Regards,
Máté

On Thu, Dec 18, 2025 at 10:46 PM Máté Kocsis <kocsismate90@gmail.com> wrote:

Hi Máté

1.) Yesterday, I wrote that name mangling of the query params shouldn’t happen. However, as I realized,
it is still needed for non-list arrays, because the [..] suffix must be added to their name:

When I talk about data mangling I am talking about this

parse_str('foo.bar=baz', $params);
var_dump($params); // returns ['foo_bar' => 'baz']

The bracket notation is PHP-specific, and I would not change it now; otherwise you introduce a huge BC break in the ecosystem
for no particular gain IMHO.
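
A small sketch of why this kind of mangling loses data: names that differ only by a dot, a space, or an underscore all collapse into the same PHP key, so later pairs silently overwrite earlier ones. The query string below is made up for the illustration:

```php
<?php
// Three distinct names in the query string collapse into a single PHP key,
// because parse_str() converts dots and spaces in names to underscores.
parse_str('a.b=1&a%20b=2&a_b=3', $params);

var_dump($params); // only one entry remains: ['a_b' => '3']
```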

2.) I’m not really sure how empty arrays should be represented? PHP doesn’t retain them, and they are
simply skipped. But should we do the same thing? I can’t really come up with any other sensible behavior.
$params = new Uri\Rfc3986\UriQueryParams()
    ->append("foo", []);

To me this should yield the same result as ->append('foo', null), as the array construct is only indicative of a repeating parameter name;
if there is no repeat, then it means no data is attached to the name.

3.) We wrote earlier that objects shouldn’t be supported when creating the query string from variables. But what about
backed enums? Support for them in http_build_query() was added not long ago: https://github.com/php/php-src/pull/15650
We should support them, right?

This gave me a WTF moment (sorry for the language). Why was this change added to PHP without a proper discussion on internals
or even an RFC? It sets a lot of precedent and even changes how http_build_query is supposed to work with regard to objects.
If this had landed on the internals list, I would have objected to it on the grounds that it breaks expectations about how the function handles types in PHP.
Do I agree with everything the function does? No, but introducing inconsistencies in the function is not a good thing. Now http_build_query
is aware of specific objects. Stringable or Traversable objects are not detected but Enums are? A pure Enum emits a TypeError but a resource
does not? Backed Enums are not converted to int or to string by PDO, so why would http_build_query do it differently? The same reasoning applies to
why Backed Enums do not implement the Stringable interface.

Yes, the output was “weird” in PHP 8.1 through PHP 8.3, but it was expected. Should something be done for DateInterval too, because its
output with http_build_query is atrocious?

I still stand by my opinion that objects and resources should NEVER be converted. In an ideal world only scalars, null, and their repeated values
encapsulated in an array should be allowed; everything else should be left to the developer to decide. So yes, in your implementation I do think
that Backed Enums should not be treated differently from other objects and should throw.

PS: I would even revert this change or deprecate it for removal in PHP 9 (in a separate RFC).

Best regards,
Ignace

On Fri, Dec 19, 2025 at 8:59 AM ignace nyamagana butera <nyamsprod@gmail.com> wrote:

3.) We wrote earlier that objects shouldn’t be supported when creating the query string from variables. But what about
backed enums? Support for them in http_build_query() was added not long ago: https://github.com/php/php-src/pull/15650
We should support them, right?

Hi Máté,

After further checking and researching, here’s my view on Enum support. It is based on the PHP 8.4 behaviour of Enums with json_encode, since that is the basis used to add their support to http_build_query.

So in your implementation it would mean:

allowed types: null, int, float, string, boolean, and Backed Enum (to mimic json_encode and PHP 8.4+ behaviour);
arrays whose values are of an allowed type, or are themselves such arrays, are also supported to allow complex types.

Any other type (object, resource, pure Enum) is disallowed and should throw a TypeError.
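
The allow-list above could be sketched roughly as follows. The function normalizeQueryValue() and the two example enums are hypothetical, and converting booleans to '1'/'0' is an assumption borrowed from http_build_query(); first-class callable syntax and enums require PHP 8.1 or later:

```php
<?php
declare(strict_types=1);

enum Status: string { case Active = 'active'; } // example backed enum (hypothetical)
enum Flag { case On; }                          // example pure enum (hypothetical)

/** Hypothetical helper applying the allow-list: null, scalars, backed enums, arrays. */
function normalizeQueryValue(mixed $value): string|array|null
{
    return match (true) {
        $value === null => null,
        is_bool($value) => $value ? '1' : '0', // assumption: same as http_build_query()
        is_int($value), is_float($value), is_string($value) => (string) $value,
        $value instanceof BackedEnum => (string) $value->value,
        // Arrays are normalized recursively; keys are preserved.
        is_array($value) => array_map(normalizeQueryValue(...), $value),
        default => throw new TypeError('Unsupported query value: ' . get_debug_type($value)),
    };
}

var_dump(normalizeQueryValue(['a' => 1, 'b' => [true, Status::Active]]));
// ['a' => '1', 'b' => ['1', 'active']]

try {
    normalizeQueryValue(Flag::On); // pure enums are rejected
} catch (TypeError $e) {
    echo $e->getMessage(), PHP_EOL; // Unsupported query value: Flag
}
```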

Maybe in the future scope of this RFC, or in this RFC itself depending on how you scope it, you may introduce an interface which allows serializing objects using a representation that
follows the rules described above, similar to what the JsonSerializable interface is for json_encode.

If such an interface lands, then pure Enum serialization will be allowed via the interface, and the behaviour of BackedEnum would also be affected, just like what happens with json_encode (i.e. the interface takes precedence over the class instance's default behaviour).
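
A rough sketch of how such an opt-in interface might look. QuerySerializable, querySerialize(), the Sort enum and resolveQueryValue() are all hypothetical names invented for this illustration; nothing like them exists in PHP today:

```php
<?php
declare(strict_types=1);

/** Hypothetical interface, modelled on JsonSerializable; not part of PHP. */
interface QuerySerializable
{
    /** Returns a value made only of the allowed types (null, scalar, array). */
    public function querySerialize(): mixed;
}

/** A pure enum opting in to query serialization through the interface. */
enum Sort implements QuerySerializable
{
    case Asc;
    case Desc;

    public function querySerialize(): mixed
    {
        return strtolower($this->name);
    }
}

/** Hypothetical resolver: the interface takes precedence over default behaviour. */
function resolveQueryValue(mixed $value): mixed
{
    if ($value instanceof QuerySerializable) {
        return $value->querySerialize();
    }
    if ($value instanceof BackedEnum) {
        return $value->value;
    }
    if (is_object($value) || is_resource($value)) {
        throw new TypeError('Unsupported query value: ' . get_debug_type($value));
    }

    return $value;
}

var_dump(resolveQueryValue(Sort::Desc)); // string(4) "desc"
```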

Last but not least, all this SHOULD NOT affect how http_build_query works. The function should never have been modified IMHO, so it should be left untouched by all this, unless we allow it
to opt in to the behaviour once the interface is approved and added to PHP.

What do you think ?
Best regards,
Ignace

Hi Ignace,

When I talk about data mangling I am talking about this

parse_str('foo.bar=baz', $params);
var_dump($params); // returns ['foo_bar' => 'baz']

Sure! I just wanted to point out that name mangling will still be present due to arrays, and this will have some disadvantages,
namely that adding params and retrieving them won’t be symmetric:

$params = new Uri\Rfc3986\UriQueryParams()
    ->append("foo", [2 => "bar", 4 => "baz"]);

var_dump($params->getFirst("foo")); // NULL

One cannot be sure if a parameter that was added can really be retrieved later via a get*() method.

Another edge case:

$params = new Uri\Rfc3986\UriQueryParams()
    ->append("foo", ["bar", "baz"]) // Value is a list, so "foo" is added without brackets
    ->append("foo", [2 => "qux", 4 => "quux"]); // Value is an array, so "foo" is added with brackets

var_dump($params->toRfc3986String()); // foo=bar&foo=baz&foo%5B2%5D=qux&foo%5B4%5D=quux

var_dump($params->getLast("foo")); // Should it be "baz" or "quux"?
var_dump($params->getAll("foo")); // Should it only include the params with name "foo", or also "foo"?

And of course this behavior also makes the implementation incompatible with the WHATWG URL specification, although I do think this part of
the specification is way too underspecified and vague, so URLSearchParams doesn’t seem well-usable in practice…

So an idea that I’m now pondering is to keep the append() and set() methods compatible with WHATWG: they would only support
scalar values, so the param name would not be mangled. Extra appendArray() and setArray() methods could be added that would possibly
mangle the param names, and they would only support passing arrays. This solution would hopefully result in slightly less surprising
behavior: one could immediately know whether a previously added parameter is really retrievable (when append() or set() was used), or whether extra
checks may be needed (when using appendArray() or setArray()).

To me this should yield the same result as ->append('foo', null), as the array construct is only indicative of a repeating parameter name;
if there is no repeat, then it means no data is attached to the name.

Alright, that seems the least problematic solution indeed, and http_build_query() also represents empty arrays just like null values (omitting them).
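
For reference, the existing http_build_query() behaviour mentioned above can be checked with a one-liner (the input array is made up for the illustration):

```php
<?php
// null values and empty arrays both produce no output; the empty string is kept.
echo http_build_query(['foo' => [], 'bar' => null, 'baz' => '']), PHP_EOL; // baz=
```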

So in your implementation it would mean:

allowed types: null, int, float, string, boolean, and Backed Enum (to mimic json_encode and PHP 8.4+ behaviour);
arrays whose values are of an allowed type, or are themselves such arrays, are also supported to allow complex types.

Any other type (object, resource, pure Enum) is disallowed and should throw a TypeError.

+1

Maybe in the future scope of this RFC, or in this RFC itself depending on how you scope it, you may introduce an interface which allows serializing objects using a representation that
follows the rules described above, similar to what the JsonSerializable interface is for json_encode.

Hm, good idea! I’m not particularly interested in this feature, but I agree it’s a good way to add support for objects.

Last but not least, all this SHOULD NOT affect how http_build_query works. The function should never have been modified IMHO, so it should be left untouched by all this, unless we allow it
to opt in to the behaviour once the interface is approved and added to PHP.

+1

Regards,
Máté

On Sun, Dec 21, 2025 at 1:13 PM Máté Kocsis <kocsismate90@gmail.com> wrote:

Hi Máté,

Extra appendArray() and setArray() methods could be added that would possibly
mangle the param names, and they would only support passing arrays. This solution would hopefully result in slightly less surprising
behavior: one could immediately know whether a previously added parameter is really retrievable (when append() or set() was used), or whether extra
checks may be needed (when using appendArray() or setArray()).

I believe adding appendArray and setArray is the way forward, as the bracket addition, and thus the mangling, is really PHP-specific behaviour that we MUST keep to avoid a hard BC break.
I would even go a step further and add getArray and hasArray methods, which would lead to the following API:


```
$params = (new Uri\Rfc3986\UriQueryParams())
    ->append("foo", ["bar", "baz"])          // Value is a list, so "foo" is added without brackets
    ->appendArray("foo", ["qux", "quux"]);   // Value is a list, using PHP serialization "foo" is added with brackets

var_dump($params->toRfc3986String());        // foo=bar&foo=baz&foo%5B0%5D=qux&foo%5B1%5D=quux

$params->hasArray('foo'); //returns true
$params->getArray("foo"); //returns ["qux", "quux"]

$params->has('foo');      //returns true
$params->getFirst("foo"); //returns "bar"
$params->getLast("foo");  //returns "baz"
$params->getAll('foo');   //returns ["bar", "baz"]
```
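
To show that this split is implementable, here is a minimal, self-contained sketch. QueryParamsSketch is a hypothetical class written for this illustration, not the proposed Uri\Rfc3986\UriQueryParams; it assumes string values and flat arrays only, and it stores array items under bracketed names so that the plain and the array accessors never see each other's pairs:

```php
<?php
declare(strict_types=1);

/** Hypothetical sketch; handles string values and flat arrays only. */
final class QueryParamsSketch
{
    /** @var list<array{string, ?string}> ordered name/value pairs */
    private array $pairs = [];

    /** Adds one pair per value; the name is never mangled. */
    public function append(string $name, string|array|null $value): static
    {
        foreach (is_array($value) ? $value : [$value] as $item) {
            $this->pairs[] = [$name, $item];
        }
        return $this;
    }

    /** Adds one pair per item under a bracketed name, PHP style. */
    public function appendArray(string $name, array $value): static
    {
        foreach ($value as $key => $item) {
            $this->pairs[] = [$name . '[' . $key . ']', $item];
        }
        return $this;
    }

    /** Returns the values stored under the exact, unmangled name. */
    public function getAll(string $name): array
    {
        $values = [];
        foreach ($this->pairs as [$pairName, $value]) {
            if ($pairName === $name) {
                $values[] = $value;
            }
        }
        return $values;
    }

    /** Rebuilds the array stored under the bracketed forms of the name. */
    public function getArray(string $name): array
    {
        $result = [];
        $pattern = '/^' . preg_quote($name, '/') . '\[([^\]]*)\]$/';
        foreach ($this->pairs as [$pairName, $value]) {
            if (preg_match($pattern, $pairName, $m) === 1) {
                $result[$m[1]] = $value;
            }
        }
        return $result;
    }

    public function toString(): string
    {
        $parts = [];
        foreach ($this->pairs as [$name, $value]) {
            $parts[] = rawurlencode($name) . ($value === null ? '' : '=' . rawurlencode($value));
        }
        return implode('&', $parts);
    }
}

$sketch = (new QueryParamsSketch())
    ->append('foo', ['bar', 'baz'])
    ->appendArray('foo', ['qux', 'quux']);

echo $sketch->toString(), PHP_EOL;  // foo=bar&foo=baz&foo%5B0%5D=qux&foo%5B1%5D=quux
var_dump($sketch->getAll('foo'));   // ["bar", "baz"]
var_dump($sketch->getArray('foo')); // ["qux", "quux"]
```

has(), hasArray(), getFirst() and getLast() are left out for brevity; they could be derived from getAll() and getArray().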

Hope this makes sense.
Regards,
Ignace

On Sun, Dec 21, 2025 at 4:51 PM ignace nyamagana butera <nyamsprod@gmail.com> wrote:

On Sun, Dec 21, 2025 at 1:13 PM Máté Kocsis <kocsismate90@gmail.com> wrote:

Hi Ignace,

When I talk about data mangling I am talking about this

parse_str(‘foo.bar=baz’, $params);
var_dump($params); //returns [‘foo_bar’ => ‘baz’]

Sure! I just wanted to point it out that name mangling will still be present due to arrays, and this will have some disadvantages,
namely that adding params and retrieving them won’t be symmetric:

$params = new Uri\Rfc3986\UriQueryParams()
->append(“foo”, [2 => “bar”, 4 => “baz”]);

var_dump($params->getFirst(“foo”)); // NULL

One cannot be sure if a parameter that was added can really be retrieved later via a get*() method.

Another edge cases:

$params = new Uri\Rfc3986\UriQueryParams()
->append(“foo”, [“bar”, “baz”]) // Value is a list, so “foo” is added without brackets
->append(“foo”, [2 => “qux”, 4 => “quux”]); // Value is an array, so “foo” is added with brackets

var_dump($params->toRfc3986String()); // foo=bar&foo=baz&foo%5B2%5D=qux&foo%5B4%5D=quux

var_dump($params->getLast(“foo”)) // Should it be “baz” or “quux”?
var_dump($params->getAll(“foo”)) // Should it only include the params with name “foo”, or also “foo”?

And of course this behavior also makes the implementation incompatible with the WHATWG URL specification: although I do think this part of
the specification is way too underspecified and vague, so URLSearchParams doesn’t seem well-usable in practice…

So an idea that I’m now pondering about is to keep the append() and set() methods compatible with WHATWG: and they would only support
scalar values, therefore the param name wasn’t mangled. And an extra appendArray() and setArray() method could be added that would possible
mangle the param names, and they would only support passing arrays. This solution would hopefully result in a slightly less surprising
behavior: one could immediately know if a previously added parameter is really retrievable (when append() or set() was used), or extra
checks may be needed (when using appendArray() or setArray()).

This to me should yield the same result as ->append(‘foo’, null); as the array construct is only indicative of a repeating parameter name
if there is no repeat then it means no data is attached to the name.

Alright, that seems the least problematic solution indeed, and http_build_query() also represents empty arrays just like null values (omitting them).

So in your implementation it would mean:

allowed type: null, int, float, string, boolean, and Backed Enum (to minic json_encode and PHP8.4+ behaviour)
arrays with values containing valid allowed type or array. are also supported to allow complex type support.

Any other type (object, resource, Pure Enum) are disallowed they should throw a TypeError

+1

Maybe in the future scope of this RFC or in this RFC depending on how you scope the RFC you may introduce an Interface which will allow serializing objects using a representation that
follows the described rules above. Similar to what the JsonSerializable interface is for json_encode.

Hm, good idea! I’m not particularly interested in this feature, but I agree it’s a good way to add support for objects.

Last but not Last, all this SHOULD not affect how http_buid_query works. The function should never have been modified IMHO so it should be left untouched by all this except if we allow it
to opt-in the behaviour once the interface is approved and added to PHP.

+1

Regards,
Máté

Hi Máté,

And an extra appendArray() and setArray() method could be added that would possible
mangle the param names, and they would only support passing arrays. This solution would hopefully result in a slightly less surprising
behavior: one could immediately know if a previously added parameter is really retrievable (when append() or set() was used), or extra
checks may be needed (when using appendArray() or setArray()).

I believe adding appendArray and setArray is the way forward, as the bracket addition, and thus the mangling, is really a PHP specificity that we MUST keep to avoid a hard BC break.
I would even go a step further and add getArray and hasArray methods, which would lead to the following API:


```
$params = (new Uri\Rfc3986\UriQueryParams())
    ->append("foo", ["bar", "baz"])          // Value is a list, so "foo" is added without brackets
    ->appendArray("foo", ["qux", "quux"]);   // Value is a list, using PHP serialization "foo" is added with brackets

var_dump($params->toRfc3986String());        // foo=bar&foo=baz&foo%5B0%5D=qux&foo%5B1%5D=quux

$params->hasArray('foo'); //returns true
$params->getArray("foo"); //returns ["qux", "quux"]

$params->has('foo');      //returns true
$params->getFirst("foo"); //returns "bar"
$params->getLast("foo");  //returns "baz"
$params->getAll('foo');   //returns ["bar", "baz"]
```
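
For comparison, the two serializations can be reproduced with functions that exist today. This is only a sketch: buildRepeatedPairs() is a hypothetical helper standing in for the proposed append() behaviour, while http_build_query() shows the bracketed form that appendArray() would keep.

```php
<?php

// Hypothetical helper: repeats the parameter name once per value, no brackets.
function buildRepeatedPairs(string $name, array $values): string
{
    $pairs = [];
    foreach ($values as $value) {
        $pairs[] = rawurlencode($name) . '=' . rawurlencode((string) $value);
    }

    return implode('&', $pairs);
}

// Repeated names, as in the append() example above.
echo buildRepeatedPairs('foo', ['bar', 'baz']), "\n";
// foo=bar&foo=baz

// PHP-style serialization: list values get bracketed, indexed names.
echo http_build_query(['foo' => ['qux', 'quux']], '', '&', PHP_QUERY_RFC3986), "\n";
// foo%5B0%5D=qux&foo%5B1%5D=quux
```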

Hope this makes sense 
Regards,
Ignace

Hi Máté,

I have been playing around with your Query Param API and I have a couple of questions:

Question 1) While I am not a proponent of adding getQueryParams to both classes, even though I know the method exists in
the WHATWG URL spec, what I find strange is that the method may return null. To me this makes for an awkward API where the user will always
have to add conditional checks before using the method's return value. Why can’t the following be true?

$url = Uri\Rfc3986\Uri::parse('https://www.example.com/path/to/whatever');
$url->getQueryParams();
// should return an empty UriQueryParams instance

$url = Uri\Rfc3986\Uri::parse('https://www.example.com/path/to/whatever?');
$url->getQueryParams();
// should return UriQueryParams with a pair
// represented like this ['' => null] or like this ['', null]

This IMHO should also be the case for the UrlQueryParams instance

Question 1-bis)
I would prefer having some extra named constructors on UrlQueryParams instead of having a getter on the Uri/Url classes. This fully decouples
Ur(i|l)QueryParams from the Uri/Url classes and lets the user opt in to the new API if needed. In case of errors, bugs, etc., only the QueryParams
container bags would be affected, not the Url/Uri classes.

Question 2)
I see you have
- UriQueryParams::fromArray,
- UriQueryParams::list,
If I read it correctly, these cover 2 array representations of the query?
My question is: shouldn't we have a fromList named constructor and/or a toArray method, so that both distinct forms can be consumed and returned?
Otherwise this might confuse developers, who will have a hard time understanding which form is which, when to use it, and which
one can be used in which context to instantiate a new instance.

Question 3)
I wanted to know how the following code will be processed ?

$query = 'a[]=foo&a[]=bar&a=qux';
parse_str($query, $result);
$result['a']; //returns "qux"

As seen in the example with parse_str, the full array notation is overwritten and cannot be used/accessed.

Will the getArray API still be able to access the array data, or will it act like parse_str and skip the array notation?
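
The data loss can be reproduced with a short script. parseQueryPairs() below is a hypothetical helper written for this comparison; it is not the proposed getArray() and only shows that all three pairs are still recoverable when the query is kept as an ordered list of name/value pairs.

```php
<?php

// Hypothetical helper: splits a query string into ordered [name, value]
// pairs without applying PHP's bracket mangling. A missing "=" yields null.
function parseQueryPairs(string $query): array
{
    $pairs = [];
    foreach (explode('&', $query) as $pair) {
        if ($pair === '') {
            continue;
        }
        [$name, $value] = explode('=', $pair, 2) + [1 => null];
        $pairs[] = [
            rawurldecode($name),
            $value === null ? null : rawurldecode($value),
        ];
    }

    return $pairs;
}

$query = 'a[]=foo&a[]=bar&a=qux';

// parse_str(): the later "a=qux" overwrites the array built from "a[]".
parse_str($query, $result);
var_dump($result['a']); // string(3) "qux"

// Pair-preserving parsing keeps every pair, in order.
$pairs = parseQueryPairs($query);
// [['a[]', 'foo'], ['a[]', 'bar'], ['a', 'qux']]
```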

Best regards,
Ignace

Hi Ignace,

Sorry for the very late reply, I was working on something else in January, and I’ve just recently got back to this topic. I’ll answer your question as soon
as possible, but let me tell you that I’ve just separated the Query Parameter Manipulation part to its own RFC, because it’s that complex.

And I recently started a significant rework (most notably, I’m unifying the two QueryParams implementations to a single class): now, the text is not up-to-date
everywhere, so I’ll have some things to finish, but hopefully I can also announce this officially in a bit.

Regards,
Máté

Hey Ignace et al,

I have updated the RFC in the past few weeks with a lot of extra info, mostly related to path segment handling: I investigated WHATWG URL’s
behavior more thoroughly, and it turned out that path segments are handled very interestingly, so there was a significant difference compared
to RFC 3986 yet again.

Please give the RFC another read, if possible.

Regards,
Máté

Hi Máté,

I just re-read the RFC and I like the updates and precision you’ve brought to it. Here’s my review:
For the builders, I have nothing more to add design-wise; this is already solid. I may nitpick on the *Builder::clear() method name: I would have gone with *Builder::reset(), but I presume other developers would go with clear. Other than that, the public API is spot on.

For the Enum, my only concern is that the cases serve just as flags and their usage is tightly coupled to the Uri classes. I would add 2 static named constructors, fromUrl and tryFromUrl, just for completeness. I believe the maintenance cost is negligible, but the DX is improved and it allows for a broader usage of the Enum.

In regards to the path segments usage and constructor, I see you already integrated my Enum suggestions and you have explained why a fully fledged class is not the right approach. So the current design is already solid.

Last but not least, the percent-encoding feature should IMHO be improved by moving the encode/decode methods from being static methods on the URI classes to becoming public API on the Enum. This would indeed imply renaming the enum from Uri\Rfc3986\UriPercentEncodingMode to Uri\Rfc3986\UriPercentEncoder with two methods, encode and decode. Again, it makes for a more self-contained feature and adds to the DX. Developers will not have to always statically call the URI classes for encoding/decoding strings, as the Enums and their cases already convey the information correctly.

Overall I believe this is going in the right direction.

Regards,
Ignace

On Sun, Mar 1, 2026 at 11:09 PM Máté Kocsis <kocsismate90@gmail.com> wrote:

Hey Ignace et al,

I have updated the RFC in the past few weeks with a lot of extra info, mostly related to path segment handling: I investigated WHATWG URL’s
behavior more thoroughly, and it turned out that path segments are handled very interestingly, so there was a significant difference compared
to RFC 3986 yet again.

Please give the RFC another read, if possible.

Regards,
Máté

Hi Máté,

As always, I tried to implement a polyfill for the Percent-Encoding and Decoding Support RFC. While doing so, I was able to refactor the enum. Of note, the case names are the ones used in the RFC text and NOT those in the Enum example provided, as they differ. I will update the names once you have updated them.
Here’s my alternate proposal for Uri\Rfc3986. Keep in mind that the same reasoning would apply for the Uri\Whatwg counterpart.

namespace Uri\Rfc3986 {
    enum UriComponent
    {
        case UserInfo;
        case Host;
        case Path;
        case PathSegment;
        case AbsolutePathReferenceFirstSegment;
        case RelativePathReferenceFirstSegment;
        case Query;
        case FormQuery;
        case Fragment;
        case AllReservedCharacters;
        case AllButUnreservedCharacters;

        /**
         * @throws InvalidUriException
         */
        public function encode(string $input): string;

        /**
         * @throws InvalidUriException
         */
        public function decode(string $input): string;
    }
}

As previously stated, I added the encode/decode methods to the Enum; this way the feature is fully handled by the Enum and no direct reference to the Uri class via a static method is needed.
The Enum is renamed UriComponent instead of the current UriPercentEncodingMode; the name change highlights the intent of the Enum, encoding and decoding URI components, where each enum case represents a defined component context.
Both methods may throw an exception (I do not know if specific exceptions like UnableToEncodeException and/or UnableToDecodeException should be added but, for now, the generic InvalidUriException is used).
This rewrite also greatly simplifies the Enum usage.

Below you will see your examples from the RFC rewritten

  • Decoding the fragment

$uri = new Uri\Rfc3986\Uri("https://example.com#_%40%2F");
$fragment = $uri->getFragment(); // returns "_%40%2F"
echo Uri\Rfc3986\UriComponent::Fragment->decode($fragment); // returns "_%40/"

  • Decoding the query

// with the query component
$uri = new Uri\Rfc3986\Uri("https://example.com/?q=%3A%29");
$query = $uri->getQuery(); // returns "q=%3A%29"
echo Uri\Rfc3986\UriComponent::Query->decode($query); // returns "q=:)"

  • Usage with the new Uri::withPathSegments method

$uri = new Uri\Rfc3986\Uri("https://example.com");
$uri = $uri->withPathSegments([
    "foo",
    // encode(), not decode(): the "/" must become "%2F" to stay inside one segment
    Uri\Rfc3986\UriComponent::PathSegment->encode("bar/baz")
]);
$uri->toRawString(); // https://example.com/foo/bar%2Fbaz
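
The "methods on the enum" design can also be tried in userland without the extension. The sketch below is an assumption-heavy illustration: the enum name ComponentSketch, the reduced list of cases, and the character sets (RFC 3986 unreserved characters plus the extras returned by allowed()) are my own simplifications, not the RFC's definitions. Context-aware decoding is left out.

```php
<?php

// Userland sketch only; not the proposed Uri\Rfc3986\UriComponent.
enum ComponentSketch
{
    case PathSegment;
    case Query;

    // Characters left untouched in addition to the unreserved set.
    private function allowed(): string
    {
        return match ($this) {
            self::PathSegment => '!$&\'()*+,;=:@',
            self::Query => '!$&\'()*+,;=:@/?',
        };
    }

    // Percent-encodes every byte outside the allowed set for this case.
    public function encode(string $input): string
    {
        $pattern = '/[^A-Za-z0-9\-._~' . preg_quote($this->allowed(), '/') . ']/';

        return preg_replace_callback(
            $pattern,
            static fn (array $m): string => rawurlencode($m[0]),
            $input
        );
    }
}

echo ComponentSketch::PathSegment->encode('bar/baz'), "\n"; // bar%2Fbaz
echo ComponentSketch::Query->encode('q=:) x'), "\n";        // q=:)%20x
```

The call site names only the component context, which is the DX argument made above.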

Let me know what you think,
regards,
Ignace

On Tue, Mar 3, 2026 at 10:24 AM ignace nyamagana butera <nyamsprod@gmail.com> wrote:

Hi Máté,

I just re-read the RFC and I like the updates and precision you’ve brought to it. Here’s my review:
For the builders, I have nothing more to add design-wise; this is already solid. I may nitpick on the *Builder::clear() method name: I would have gone with *Builder::reset(), but I presume other developers would go with clear. Other than that, the public API is spot on.

For the Enum, my only concern is that the cases serve just as flags and their usage is tightly coupled to the Uri classes. I would add 2 static named constructors, fromUrl and tryFromUrl, just for completeness. I believe the maintenance cost is negligible, but the DX is improved and it allows for a broader usage of the Enum.

In regards to the path segments usage and constructor, I see you already integrated my Enum suggestions and you have explained why a fully fledged class is not the right approach. So the current design is already solid.

Last but not least, the percent-encoding feature should IMHO be improved by moving the encode/decode methods from being static methods on the URI classes to becoming public API on the Enum. This would indeed imply renaming the enum from Uri\Rfc3986\UriPercentEncodingMode to Uri\Rfc3986\UriPercentEncoder with two methods, encode and decode. Again, it makes for a more self-contained feature and adds to the DX. Developers will not have to always statically call the URI classes for encoding/decoding strings, as the Enums and their cases already convey the information correctly.

Overall I believe this is going in the right direction.

Regards,
Ignace
