[PHP-DEV] [RFC] [Discussion] Followup Improvements for ext/uri

nyamsprod_the_funky · December 16, 2025, 8:38pm

On Tue, Dec 16, 2025 at 7:14 PM Juris Evertovskis <juris@glaive.pro> wrote:

On 2025-12-16 09:53, ignace nyamagana butera wrote:

Since we will be dealing with arrays the following rules could be updated when parsing the string using PHP behaviour:

"&a" should be converted to ['a' => null]

Hey Ignace,

In practice valueless arguments like ?debug are most often “flags” or “booleans” and their presence implies truthiness.

Do you think it would be wrong or confusing to have it converted to ['debug' => true]?

I’m worried that ['a' => null] would not be that handy since both $params['a'] and isset($params['a']) would return falsy which would likely be opposite to the intended value.

BR,
Juris

Hi Juris,

Do you think it would be wrong or confusing to have it converted to ['debug' => true]?

Yes, IMHO it would be wrong, because flag parameters or booleans are converted to ['debug' => 1].
['debug' => null] expresses the presence of the name in a pair and the absence of a value associated with it.
Let’s see how it is currently done:

The WHATWG URL living standard does the following:

let url = new URL('https://example.com?debug&foo=bar&debug=');
console.log(
    url.searchParams.toString(), // returns 'debug=&foo=bar&debug='
);

The valueless pair gets converted to ['debug' => '']. The round trip does not preserve the query string as is, but all key/value pairs (tuples) are present.

In PHP you have currently the following behaviour:

example 1

parse_str('debug&foo=bar&debug=', $params);
var_dump($params, http_build_query($params));
//$params ['debug' => '', 'foo' => 'bar']
//after roundtrip you get 'debug=&foo=bar'

example 2

parse_str('debug&foo=bar&debug=1', $params);
var_dump($params, http_build_query($params));
//$params ['debug' => '1', 'foo' => 'bar']
//after roundtrip you get 'debug=1&foo=bar'

So you lose data, and the query data can be reordered:
parse_str converts the first debug into ['debug' => '']
parse_str then overwrites that value with the later one (this may be a security concern if you need to hash/validate your query string)

Since IMHO interoperability and security are important, you should prefer an algorithm that preserves the original query.
The proposed solution is already in use, for instance in League/Uri or in Guzzle:

echo Uri::withQueryValues(Utils::uriFor('https://example.com'), [
    'debug' => null,
    'foo' => 'bar',
    'baz' => '',
]), PHP_EOL;
// https://example.com?debug&foo=bar&baz=

Because Guzzle uses an associative array, the debug variable can only appear once, but null and the empty string are serialized differently.
This improves interoperability with other languages, and you no longer have data loss or random query re-arrangement.
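
To illustrate what "preserving the original query" could look like, here is a minimal sketch of a pair-based parser and builder. The function names parseQueryPairs() and buildQueryPairs() are made up for this example and are not part of ext/uri, League/Uri or Guzzle; percent-decoding simply uses rawurldecode().

```php
<?php

/**
 * Hypothetical helper: splits a query string into a list of [name, value] pairs.
 * A pair without "=" gets a null value; a pair ending in "=" gets ''.
 *
 * @return list<array{0: string, 1: ?string}>
 */
function parseQueryPairs(string $query): array
{
    $pairs = [];
    foreach (explode('&', $query) as $part) {
        if (!str_contains($part, '=')) {
            $pairs[] = [rawurldecode($part), null];
            continue;
        }
        [$name, $value] = explode('=', $part, 2);
        $pairs[] = [rawurldecode($name), rawurldecode($value)];
    }

    return $pairs;
}

/** Hypothetical helper: rebuilds the query string, keeping order and duplicates. */
function buildQueryPairs(array $pairs): string
{
    $parts = [];
    foreach ($pairs as [$name, $value]) {
        $parts[] = $value === null
            ? rawurlencode($name)
            : rawurlencode($name) . '=' . rawurlencode($value);
    }

    return implode('&', $parts);
}

$pairs = parseQueryPairs('debug&foo=bar&debug=');
var_dump($pairs === [['debug', null], ['foo', 'bar'], ['debug', '']]); // bool(true)
echo buildQueryPairs($pairs), PHP_EOL; // debug&foo=bar&debug=
```

Storing tuples instead of an associative array is what keeps both debug entries and their order.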

Last but not least, the Query objects proposed by Máté all:

expose a has method which always tells whether the key is present regardless of its value, an equivalent to array_key_exists.
provide a way to have the same parameter appear multiple times in the query string

So IMHO it is an improvement to also allow the distinction between null and the empty string, so we can finally write in PHP:

echo (new Uri\Rfc3986\Query())
    ->append('debug', null)
    ->append('foo', 'bar')
    ->append('debug', '')
    ->toRfc3986String();
// debug&foo=bar&debug=
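
Regarding the concern that ['a' => null] looks falsy: the difference between a value check and a presence check can be shown with a plain array. This is only an illustration; it assumes that the has() method of the proposed Query objects behaves like array_key_exists().

```php
<?php

// A valueless "?debug" parameter represented as null, as suggested in this thread.
$params = ['debug' => null, 'foo' => 'bar', 'baz' => ''];

var_dump(isset($params['debug']));            // bool(false): isset() treats null as "not set"
var_dump(array_key_exists('debug', $params)); // bool(true): the key is present
var_dump(array_key_exists('qux', $params));   // bool(false): the key is absent

// null and '' stay distinguishable with a strict comparison.
var_dump($params['debug'] === null); // bool(true)
var_dump($params['baz'] === '');     // bool(true)
```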

Best regards,
Ignace

Mate_Kocsis · December 16, 2025, 9:43pm

Hi Ignace,

Currently, in your proposal you have 2 Query objects. This will give the developer a lot of work to understand where, when and which object to choose and why. Is that complexity really needed? IMHO we may end up with a correct API ... that no-one will use.

Just to reiterate what I wrote to Juris a few days ago: I’m open to unifying the two classes, but I’m just hesitant because of security and evolvability reasons (but the main one is security).

With all that in mind I believe a single `Uri\Query` should be used. Its goal should be:

- to be immutable
- to store the query in its decoded form.
- to manipulate and change the data in a consistent way.

So far, I imagined the two QueryParams classes to be mutable because one of their main goals is to be able to build (~ mutate) query param list…
But otherwise an immutable implementation would be useful for sure.

Decoding/encoding should happen at the object boundaries but everything inside the object should
be done on decoded data.

Yes, that’s what I also had to find out based on my experience with implementing the POC, so I completely agree here.
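
As a rough sketch of "decode at the boundaries, work on decoded data inside", assuming a made-up immutable class DecodedQuery (the name and methods are illustrative only and are not the API from the RFC or the POC):

```php
<?php

/** Hypothetical immutable query holder; names are illustrative, not the RFC's API. */
final class DecodedQuery
{
    /** @param list<array{0: string, 1: ?string}> $pairs decoded name/value pairs */
    private function __construct(private readonly array $pairs)
    {
    }

    /** Input boundary: percent-decoding happens once, here. */
    public static function fromString(string $query): self
    {
        $pairs = [];
        foreach (explode('&', $query) as $part) {
            $chunks = explode('=', $part, 2);
            $pairs[] = [
                rawurldecode($chunks[0]),
                isset($chunks[1]) ? rawurldecode($chunks[1]) : null,
            ];
        }

        return new self($pairs);
    }

    /** Works on decoded data and returns a new instance instead of mutating. */
    public function append(string $name, ?string $value): self
    {
        return new self([...$this->pairs, [$name, $value]]);
    }

    /** Output boundary: percent-encoding happens once, here. */
    public function toString(): string
    {
        $parts = [];
        foreach ($this->pairs as [$name, $value]) {
            $parts[] = rawurlencode($name) . ($value === null ? '' : '=' . rawurlencode($value));
        }

        return implode('&', $parts);
    }
}

$query   = DecodedQuery::fromString('q=caf%C3%A9');
$changed = $query->append('tag', 'a&b');

echo $query->toString(), PHP_EOL;   // q=caf%C3%A9
echo $changed->toString(), PHP_EOL; // q=caf%C3%A9&tag=a%26b
```

The caller passes 'a&b' undecoded and never has to think about escaping; the original instance is left untouched.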

On a bonus side, it would be nice to have a mechanism in PHP that allows the application to switch
from the current `parse_str` usage to the new improved parsing provided by the new class when
populating the `_GET` array. (So that deprecating `parse_str` can be initiated in some distant future.)
This last observation/remark is not mandatory but nice to have.

This is a very interesting remark, and I have not thought about this possibility yet. Generally, I agree with
the idea, but my long-term goal (or wish) is to move away from using $_GET and $_POST to access request
data in favor of using objects… So I most probably won’t deal with trying to implement this idea. However,
I’m willing to add a UriQueryParams::fromCurrentQueryString(), maybe even a UriQueryParams::fromCurrentBody()
or similar factory methods if people like it.

- in respect to `parse_str`, no mangled data should occur on parsing:

Uh, I completely forgot about this behavior of parse_str(), and I definitely agree that mangling shouldn’t happen.

- Only accept scalar values, `null`, and `array`. If an object or a resource is detected a `ValueError` error
should be thrown.

I wasn’t sure what to do with objects, but I’m happy to skip their support, especially if they would cause issues.
The rest of the suggestions align with my initial plans (maybe with the exception of throwing ValueError – I wrote
TypeError in the related section).

- Remove the addition of indices if the `array` is a list.

Yes, this also aligns with my initial plans.

Best regards,
Máté

Mate_Kocsis · December 18, 2025, 9:45pm

Hi,

The WHATWG URL living standard does the following:
let url = new URL('https://example.com?debug&foo=bar&debug=');
console.log(
    url.searchParams.toString(), // returns 'debug=&foo=bar&debug='
);
The valueless pair gets converted to ['debug' => '']. The round trip does not preserve the query string as is, but all key/value pairs (tuples) are present.

Yes, confirmed. Unfortunately, WHATWG URL only supports string values, so there’s no way to support
query parameters without a value (e.g. ?debug) in the WHATWG implementation of the RFC either.

On the other hand, the RFC 3986 implementation supports this notion, even uriparser calls this out
in its documentation: https://uriparser.github.io/doc/api/latest/index.html#querystrings.

However, there are a few other problems which came up when I was updating my implementation.

1.) Yesterday, I wrote that name mangling of the query params shouldn’t happen. However, as I realized,
it is still needed for non-list arrays, because the [..] suffix must be added to their name:

$params = new Uri\Rfc3986\UriQueryParams()
    ->append("foo", [2 => "bar", 4 => "baz"]);

echo $params->toRfc3986String(); // foo%5B2%5D=bar&foo%5B4%5D=baz

var_dump($params->getFirst("foo")); // NULL

Even though I appended params with the name “foo”, no items can be returned when calling getFirst(),
because of name mangling.
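
The asymmetry can be reproduced with a small sketch. The helper expandParam() below is hypothetical and only mimics the rule discussed here (no indices for lists, bracketed keys otherwise); it is not the actual implementation.

```php
<?php

/**
 * Hypothetical helper: expands one appended value into [name, value] pairs.
 * Lists repeat the plain name; other arrays get a "[key]" suffix.
 *
 * @return list<array{0: string, 1: string}>
 */
function expandParam(string $name, array $values): array
{
    $pairs = [];
    foreach ($values as $key => $value) {
        $pairs[] = [array_is_list($values) ? $name : "{$name}[{$key}]", (string) $value];
    }

    return $pairs;
}

$pairs = expandParam('foo', [2 => 'bar', 4 => 'baz']);

// Serialize the pairs the same way the example above does.
echo implode('&', array_map(
    static fn (array $pair): string => rawurlencode($pair[0]) . '=' . rawurlencode($pair[1]),
    $pairs
)), PHP_EOL; // foo%5B2%5D=bar&foo%5B4%5D=baz

// Looking up the name that was passed in finds nothing: only "foo[2]" and "foo[4]" exist.
$found = array_filter($pairs, static fn (array $pair): bool => $pair[0] === 'foo');
var_dump($found === []); // bool(true)

// A list keeps the plain name, so the lookup would succeed there.
var_dump(expandParam('foo', ['bar', 'baz']) === [['foo', 'bar'], ['foo', 'baz']]); // bool(true)
```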

2.) I’m not really sure how empty arrays should be represented? PHP doesn’t retain them, and they are
simply skipped. But should we do the same thing? I can’t really come up with any other sensible behavior.

$params = new Uri\Rfc3986\UriQueryParams()
    ->append("foo", []);

echo $params->toRfc3986String(); // ???

3.) We wrote earlier that objects shouldn’t be supported when creating the query string from variables. But what about
backed enums? Support for them in http_build_query() was added not long ago: https://github.com/php/php-src/pull/15650
We should support them, right?

Regards,
Máté

nyamsprod_the_funky · December 19, 2025, 7:59am

On Thu, Dec 18, 2025 at 10:46 PM Máté Kocsis <kocsismate90@gmail.com> wrote:

Hi Máté

1.) Yesterday, I wrote that name mangling of the query params shouldn’t happen. However, as I realized,
it is still needed for non-list arrays, because the [..] suffix must be added to their name:

When I talk about data mangling I am talking about this

parse_str('foo.bar=baz', $params);
var_dump($params); // returns ['foo_bar' => 'baz']

The bracket notation is a PHP-specific convention and I would not change it now; otherwise you introduce a huge BC break in the ecosystem
for no particular gain IMHO.
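
A short illustration of the difference between the two behaviours, using only parse_str() as it exists today (the variable names are just for this example):

```php
<?php

// Dots in parameter names are rewritten to underscores by parse_str(),
// so two distinct names collapse into a single key and one value is lost.
parse_str('foo.bar=baz&foo_bar=qux', $mangled);
var_dump($mangled === ['foo_bar' => 'qux']); // bool(true)

// Brackets, by contrast, are the PHP array convention that the ecosystem relies on.
parse_str('foo[2]=bar&foo[4]=baz', $bracketed);
var_dump($bracketed === ['foo' => [2 => 'bar', 4 => 'baz']]); // bool(true)
```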

2.) I’m not really sure how empty arrays should be represented? PHP doesn’t retain them, and they are
simply skipped. But should we do the same thing? I can’t really come up with any other sensible behavior.
$params = new Uri\Rfc3986\UriQueryParams()
    ->append("foo", []);

To me, this should yield the same result as ->append('foo', null), as the array construct is only indicative of a repeating parameter name:
if there is no repeat, then it means no data is attached to the name.

3.) We wrote earlier that objects shouldn’t be supported when creating the query string from variables. But what about
backed enums? Support for them in http_build_query() was added not long ago: https://github.com/php/php-src/pull/15650
We should support them, right?

This gave me a WTF? moment (sorry for the language). Why was this change added to PHP without a proper discussion on the internals list
or even an RFC? It sets a significant precedent and even changes how http_build_query is supposed to work with regard to objects.
If this had landed on the internals list, I would have objected to it on the grounds that it breaks expectations about how the function handles types in PHP.
Do I agree with everything the function does? No, but introducing inconsistencies in the function is not a good thing. Now http_build_query
is aware of specific objects. Stringable or Traversable objects are not detected but Enums are? A Pure Enum emits a TypeError but a resource
does not? Backed Enums are not converted to int or to string by PDO, so why would http_build_query do it differently? The same reasoning applies as to
why Backed Enum does not have a Stringable interface.

Yes, the output was “weird” in PHP 8.1 to PHP 8.3, but it was expected. Should something be done for DateInterval too, because the
output using http_build_query is atrocious?

I still stand by my opinion that objects and resources should NEVER be converted. In an ideal world only scalars + null and their repeated values
encapsulated in an array should be allowed; everything else should be left to the developer to decide. So yes, in your implementation I do think
that Backed Enum should not be treated differently from other objects and should throw.

PS: I would even revert this change or deprecate it for removal in PHP 9 (in a separate RFC)

Best regards,
Ignace

nyamsprod_the_funky · December 20, 2025, 12:00pm

On Fri, Dec 19, 2025 at 8:59 AM ignace nyamagana butera <nyamsprod@gmail.com> wrote:

3.) We wrote earlier that objects shouldn’t be supported when creating the query string from variables. But what about
backed enums? Support for them in http_build_query() was added not long ago: https://github.com/php/php-src/pull/15650
We should support them, right?

Hi Máté,

After further checking and researching, here’s my view on Enum support. It is based on the PHP 8.4 behaviour of Enum with json_encode, since it is the basis used to add Enum support to http_build_query.

So in your implementation it would mean:

allowed types: null, int, float, string, boolean, and Backed Enum (to mimic json_encode and the PHP 8.4+ behaviour)
arrays whose values are an allowed type or another array are also supported, to allow complex type support.

Any other type (object, resource, Pure Enum) is disallowed and should throw a TypeError
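
A minimal sketch of these rules, assuming a hypothetical normalizeQueryValue() function and two made-up example enums (Suit and Flag); it only illustrates the type checks and is not meant as the real implementation:

```php
<?php

enum Suit: string { case Hearts = 'H'; } // backed enum (hypothetical example type)
enum Flag { case On; }                   // pure enum (hypothetical example type)

/**
 * Hypothetical normalizer following the rules listed above.
 * Returns null, a scalar, or a (nested) array of those; throws TypeError otherwise.
 */
function normalizeQueryValue(mixed $value): null|bool|int|float|string|array
{
    if ($value === null || is_scalar($value)) {
        return $value;
    }
    if ($value instanceof BackedEnum) {
        return $value->value; // backed enums collapse to their backing value
    }
    if (is_array($value)) {
        return array_map(normalizeQueryValue(...), $value); // recurse, keys are preserved
    }

    // Pure enums, other objects and resources end up here.
    throw new TypeError('Unsupported query value of type ' . get_debug_type($value));
}

$normalized = normalizeQueryValue(['a' => Suit::Hearts, 'b' => [1, null, true]]);
var_dump($normalized === ['a' => 'H', 'b' => [1, null, true]]); // bool(true)

$rejected = 0;
foreach ([Flag::On, new stdClass()] as $invalid) {
    try {
        normalizeQueryValue($invalid);
    } catch (TypeError $e) {
        $rejected++;
        echo $e->getMessage(), PHP_EOL;
    }
}
```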

Maybe in the future scope of this RFC, or in this RFC depending on how you scope it, you may introduce an interface which allows serializing objects using a representation that
follows the rules described above, similar to what the JsonSerializable interface is for json_encode.

If such an interface lands, then Pure Enum serialization will be allowed via the interface, and the behaviour of BackedEnum would also be affected, just like what happens with json_encode (i.e. the interface takes precedence over the class instance's default behaviour).
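
To make the idea concrete, here is a sketch under the assumption that such an interface were called QuerySerializable with a toQueryValue() method. Both names are invented for this example, as are the Status and Sort enums and the resolveQueryValue() helper.

```php
<?php

/** Hypothetical interface, modeled on JsonSerializable; the name is an assumption. */
interface QuerySerializable
{
    /** Expected to return null, a scalar, or an array of those. */
    public function toQueryValue(): mixed;
}

// Pure enum: only serializable because it implements the interface.
enum Status implements QuerySerializable
{
    case Active;
    case Archived;

    public function toQueryValue(): string
    {
        return strtolower($this->name);
    }
}

// Backed enum: the interface result replaces the backing value 'ASC'.
enum Sort: string implements QuerySerializable
{
    case Asc = 'ASC';

    public function toQueryValue(): string
    {
        return 'ascending';
    }
}

/** Hypothetical resolver: the interface is checked before the BackedEnum default. */
function resolveQueryValue(mixed $value): mixed
{
    return match (true) {
        $value instanceof QuerySerializable => $value->toQueryValue(),
        $value instanceof BackedEnum        => $value->value,
        is_object($value), is_resource($value)
            => throw new TypeError(get_debug_type($value) . ' is not supported'),
        default                             => $value, // null, scalars and arrays pass through
    };
}

var_dump(resolveQueryValue(Status::Archived)); // string(8) "archived"
var_dump(resolveQueryValue(Sort::Asc));        // string(9) "ascending"
```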

Last but not least, all this SHOULD NOT affect how http_build_query works. The function should never have been modified IMHO, so it should be left untouched by all this, unless we allow it
to opt in to the behaviour once the interface is approved and added to PHP.

What do you think ?
Best regards,
Ignace

Mate_Kocsis · December 21, 2025, 12:13pm

Hi Ignace,

When I talk about data mangling I am talking about this

parse_str('foo.bar=baz', $params);
var_dump($params); // returns ['foo_bar' => 'baz']

Sure! I just wanted to point it out that name mangling will still be present due to arrays, and this will have some disadvantages,
namely that adding params and retrieving them won’t be symmetric:

$params = new Uri\Rfc3986\UriQueryParams()
    ->append("foo", [2 => "bar", 4 => "baz"]);

var_dump($params->getFirst("foo")); // NULL

One cannot be sure if a parameter that was added can really be retrieved later via a get*() method.

Another edge case:

$params = new Uri\Rfc3986\UriQueryParams()
    ->append("foo", ["bar", "baz"])             // Value is a list, so "foo" is added without brackets
    ->append("foo", [2 => "qux", 4 => "quux"]); // Value is a non-list array, so "foo" is added with brackets

var_dump($params->toRfc3986String()); // foo=bar&foo=baz&foo%5B2%5D=qux&foo%5B4%5D=quux

var_dump($params->getLast("foo")); // Should it be "baz" or "quux"?
var_dump($params->getAll("foo"));  // Should it only include the params with name "foo", or also "foo[]"?

And of course this behavior also makes the implementation incompatible with the WHATWG URL specification, although I do think this part of
the specification is way too underspecified and vague, so URLSearchParams doesn’t seem well-usable in practice…

So an idea that I’m now pondering is to keep the append() and set() methods compatible with WHATWG: they would only support
scalar values, therefore the param name wouldn’t be mangled. Extra appendArray() and setArray() methods could be added that would possibly
mangle the param names, and they would only support passing arrays. This solution would hopefully result in slightly less surprising
behavior: one could immediately know if a previously added parameter is really retrievable (when append() or set() was used), or whether extra
checks may be needed (when using appendArray() or setArray()).

To me, this should yield the same result as ->append('foo', null), as the array construct is only indicative of a repeating parameter name:
if there is no repeat, then it means no data is attached to the name.

Alright, that seems the least problematic solution indeed, and http_build_query() also represents empty arrays just like null values (omitting them).
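
For reference, the omission can be checked with plain http_build_query(). Note that it drops the name entirely, which is not the same as emitting a bare foo in the query string:

```php
<?php

// http_build_query() skips null values and empty arrays, but keeps empty strings.
$built = http_build_query(['a' => null, 'b' => [], 'c' => '', 'd' => 'x']);
echo $built, PHP_EOL; // c=&d=x
```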

So in your implementation it would mean:

allowed types: null, int, float, string, boolean, and Backed Enum (to mimic json_encode and the PHP 8.4+ behaviour)
arrays whose values are an allowed type or another array are also supported, to allow complex type support.

Any other type (object, resource, Pure Enum) is disallowed and should throw a TypeError

+1

Maybe in the future scope of this RFC, or in this RFC depending on how you scope it, you may introduce an interface which allows serializing objects using a representation that
follows the rules described above, similar to what the JsonSerializable interface is for json_encode.

Hm, good idea! I’m not particularly interested in this feature, but I agree it’s a good way to add support for objects.

Last but not least, all this SHOULD NOT affect how http_build_query works. The function should never have been modified IMHO, so it should be left untouched by all this, unless we allow it
to opt in to the behaviour once the interface is approved and added to PHP.

+1

Regards,
Máté

nyamsprod_the_funky · December 21, 2025, 3:51pm

On Sun, Dec 21, 2025 at 1:13 PM Máté Kocsis <kocsismate90@gmail.com> wrote:

Hi Máté,

And an extra appendArray() and setArray() method could be added that would possibly
mangle the param names, and they would only support passing arrays. This solution would hopefully result in a slightly less surprising
behavior: one could immediately know if a previously added parameter is really retrievable (when append() or set() was used), or extra
checks may be needed (when using appendArray() or setArray()).

I believe adding appendArray and setArray is the way forward, as the bracket addition, and thus the mangling, is really a PHP-specific convention that we MUST keep to avoid a hard BC break.
I would even go a step further and add getArray and hasArray methods, which would lead to the following API:


```
$params = (new Uri\Rfc3986\UriQueryParams())
    ->append("foo", ["bar", "baz"])          // Value is a list, so "foo" is added without brackets
    ->appendArray("foo", ["qux", "quux"]);   // Value is a list, using PHP serialization "foo" is added with brackets

var_dump($params->toRfc3986String());        // foo=bar&foo=baz&foo%5B0%5D=qux&foo%5B1%5D=quux

$params->hasArray('foo'); //returns true
$params->getArray("foo"); //returns ["qux", "quux"]

$params->has('foo');      //returns true
$params->getFirst("foo"); //returns "bar"
$params->getLast("foo");  //returns "baz"
$params->getAll('foo');   //returns ["bar", "baz"]
```
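
A minimal sketch of how such a split API could behave, assuming a made-up class QueryParamsSketch (not the RFC's UriQueryParams). Unlike the example above, it passes only scalars to append(), and appendArray() handles a single level of nesting.

```php
<?php

/** Hypothetical sketch of the split API; not the RFC's implementation. */
final class QueryParamsSketch
{
    /** @var list<array{0: string, 1: ?string}> */
    private array $pairs = [];

    /** Scalar-only: the name is stored exactly as given, so it is always retrievable. */
    public function append(string $name, null|bool|int|float|string $value): static
    {
        $this->pairs[] = [$name, $value === null ? null : (string) $value];

        return $this;
    }

    /** Array-only: names get PHP-style bracket suffixes (one level deep in this sketch). */
    public function appendArray(string $name, array $values): static
    {
        foreach ($values as $key => $value) {
            $this->append("{$name}[{$key}]", $value);
        }

        return $this;
    }

    /** @return list<?string> values stored under the exact, unmangled name */
    public function getAll(string $name): array
    {
        $values = [];
        foreach ($this->pairs as [$pairName, $value]) {
            if ($pairName === $name) {
                $values[] = $value;
            }
        }

        return $values;
    }

    /** Rebuilds the array from the "name[key]" pairs. */
    public function getArray(string $name): array
    {
        $result = [];
        $pattern = '/^' . preg_quote($name, '/') . '\[([^\]]*)\]$/';
        foreach ($this->pairs as [$pairName, $value]) {
            if (preg_match($pattern, $pairName, $m) === 1) {
                $result[$m[1]] = $value;
            }
        }

        return $result;
    }

    public function toString(): string
    {
        return implode('&', array_map(
            static fn (array $p): string => rawurlencode($p[0]) . ($p[1] === null ? '' : '=' . rawurlencode($p[1])),
            $this->pairs
        ));
    }
}

$sketch = (new QueryParamsSketch())
    ->append('foo', 'bar')
    ->append('foo', 'baz')
    ->appendArray('foo', ['qux', 'quux']);

echo $sketch->toString(), PHP_EOL;  // foo=bar&foo=baz&foo%5B0%5D=qux&foo%5B1%5D=quux
var_dump($sketch->getAll('foo'));   // "bar" and "baz"
var_dump($sketch->getArray('foo')); // "qux" and "quux"
```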

Hope this makes sense

Regards,
Ignace

nyamsprod_the_funky · January 3, 2026, 1:05pm

On Sun, Dec 21, 2025 at 4:51 PM ignace nyamagana butera <nyamsprod@gmail.com> wrote:

On Sun, Dec 21, 2025 at 1:13 PM Máté Kocsis <kocsismate90@gmail.com> wrote:

Hi Ignace,

When I talk about data mangling I am talking about this

parse_str(‘foo.bar=baz’, $params);
var_dump($params); //returns [‘foo_bar’ => ‘baz’]

Sure! I just wanted to point it out that name mangling will still be present due to arrays, and this will have some disadvantages,
namely that adding params and retrieving them won’t be symmetric:

$params = new Uri\Rfc3986\UriQueryParams()
->append(“foo”, [2 => “bar”, 4 => “baz”]);

var_dump($params->getFirst(“foo”)); // NULL

One cannot be sure if a parameter that was added can really be retrieved later via a get*() method.

Another edge cases:

$params = new Uri\Rfc3986\UriQueryParams()
->append(“foo”, [“bar”, “baz”]) // Value is a list, so “foo” is added without brackets
->append(“foo”, [2 => “qux”, 4 => “quux”]); // Value is an array, so “foo” is added with brackets

var_dump($params->toRfc3986String()); // foo=bar&foo=baz&foo%5B2%5D=qux&foo%5B4%5D=quux

var_dump($params->getLast(“foo”)) // Should it be “baz” or “quux”?
var_dump($params->getAll(“foo”)) // Should it only include the params with name “foo”, or also “foo”?

And of course this behavior also makes the implementation incompatible with the WHATWG URL specification: although I do think this part of
the specification is way too underspecified and vague, so URLSearchParams doesn’t seem well-usable in practice…

So an idea that I’m now pondering about is to keep the append() and set() methods compatible with WHATWG: and they would only support
scalar values, therefore the param name wasn’t mangled. And an extra appendArray() and setArray() method could be added that would possible
mangle the param names, and they would only support passing arrays. This solution would hopefully result in a slightly less surprising
behavior: one could immediately know if a previously added parameter is really retrievable (when append() or set() was used), or extra
checks may be needed (when using appendArray() or setArray()).

This to me should yield the same result as ->append(‘foo’, null); as the array construct is only indicative of a repeating parameter name
if there is no repeat then it means no data is attached to the name.

Alright, that seems the least problematic solution indeed, and http_build_query() also represents empty arrays just like null values (omitting them).

So in your implementation it would mean:

allowed type: null, int, float, string, boolean, and Backed Enum (to minic json_encode and PHP8.4+ behaviour)
arrays with values containing valid allowed type or array. are also supported to allow complex type support.

Any other type (object, resource, Pure Enum) are disallowed they should throw a TypeError

+1

Maybe in the future scope of this RFC or in this RFC depending on how you scope the RFC you may introduce an Interface which will allow serializing objects using a representation that
follows the described rules above. Similar to what the JsonSerializable interface is for json_encode.

Hm, good idea! I’m not particularly interested in this feature, but I agree it’s a good way to add support for objects.

Last but not Last, all this SHOULD not affect how http_buid_query works. The function should never have been modified IMHO so it should be left untouched by all this except if we allow it
to opt-in the behaviour once the interface is approved and added to PHP.

+1

Regards,
Máté

Hi Máté,

And an extra appendArray() and setArray() method could be added that would possible
mangle the param names, and they would only support passing arrays. This solution would hopefully result in a slightly less surprising
behavior: one could immediately know if a previously added parameter is really retrievable (when append() or set() was used), or extra
checks may be needed (when using appendArray() or setArray()).

I believe adding appendArray and setArray is the way forward, as the bracket addition, and thus the mangling, is really a PHP specificity that we MUST keep to avoid a hard BC break.
I would even go a step further and add getArray and hasArray methods, which would lead to the following API:
```
$params = (new Uri\Rfc3986\UriQueryParams())
    ->append("foo", ["bar", "baz"])          // Value is a list, so "foo" is added without brackets
    ->appendArray("foo", ["qux", "quux"]);   // Value is a list, using PHP serialization "foo" is added with brackets

var_dump($params->toRfc3986String());        // foo=bar&foo=baz&foo%5B0%5D=qux&foo%5B1%5D=quux

$params->hasArray('foo'); //returns true
$params->getArray("foo"); //returns ["qux", "quux"]

$params->has('foo');      //returns true
$params->getFirst("foo"); //returns "bar"
$params->getLast("foo");  //returns "baz"
$params->getAll('foo');   //returns ["bar", "baz"]
```
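
For anyone who wants to experiment with these semantics before an implementation exists, here is a minimal userland sketch. It is an assumption-laden stand-in: the class name `QueryBagSketch` is hypothetical, it stores ordered name/value pairs, and it relies on http_build_query() and parse_str() for the bracket notation, so parse_str()'s usual name mangling still applies inside getArray().

```php
<?php
// Illustrative sketch only: QueryBagSketch is a hypothetical stand-in,
// not the proposed Uri\Rfc3986\UriQueryParams implementation.
final class QueryBagSketch
{
    /** @var list<array{0: string, 1: ?string}> ordered name/value pairs */
    private array $pairs = [];

    /** Adds the name as-is; a list value repeats the name without brackets. */
    public function append(string $name, string|array|null $value): static
    {
        foreach (is_array($value) ? $value : [$value] as $item) {
            $this->pairs[] = [$name, $item];
        }

        return $this;
    }

    /** Adds an array using PHP's bracket notation (foo[0], foo[1], ...). */
    public function appendArray(string $name, array $value): static
    {
        $encoded = http_build_query([$name => $value], '', '&', PHP_QUERY_RFC3986);
        if ($encoded === '') {
            return $this; // an empty array produces no pair at all
        }

        foreach (explode('&', $encoded) as $part) {
            [$key, $item] = explode('=', $part, 2);
            $this->pairs[] = [rawurldecode($key), rawurldecode($item)];
        }

        return $this;
    }

    /** @return list<?string> values stored under the exact name */
    public function getAll(string $name): array
    {
        $values = [];
        foreach ($this->pairs as [$key, $value]) {
            if ($key === $name) {
                $values[] = $value;
            }
        }

        return $values;
    }

    public function has(string $name): bool
    {
        return $this->getAll($name) !== [];
    }

    /** Rebuilds the array stored under "name[...]" pairs, or null if none exist. */
    public function getArray(string $name): ?array
    {
        $bracketed = [];
        foreach ($this->pairs as [$key, $value]) {
            if (str_starts_with($key, $name . '[')) {
                $bracketed[] = rawurlencode($key) . '=' . rawurlencode((string) $value);
            }
        }

        if ($bracketed === []) {
            return null;
        }

        parse_str(implode('&', $bracketed), $parsed);

        return $parsed[$name] ?? null;
    }

    public function hasArray(string $name): bool
    {
        return $this->getArray($name) !== null;
    }

    public function toRfc3986String(): string
    {
        return implode('&', array_map(
            static fn (array $pair): string => rawurlencode($pair[0])
                . ($pair[1] === null ? '' : '=' . rawurlencode($pair[1])),
            $this->pairs
        ));
    }
}

$params = (new QueryBagSketch())
    ->append('foo', ['bar', 'baz'])
    ->appendArray('foo', ['qux', 'quux']);

echo $params->toRfc3986String(), PHP_EOL; // foo=bar&foo=baz&foo%5B0%5D=qux&foo%5B1%5D=quux
```

Keeping the plain pairs and the bracketed pairs in the same ordered list is what lets getAll('foo') and getArray('foo') answer independently of each other.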
Hope this makes sense 
Regards,
Ignace

Hi Máté,

I have been playing around your Query Param API and I have a couple of questions:

Question 1) I am not a proponent of adding getQueryParams to both classes, even though I know the method exists in
the WHATWG URL spec. What I find strange is that the method may return null. To me this makes for an awkward API where the user will always
have to add a conditional check before using the returned value. Why can’t the following be true?

$url = Uri\Rfc3986\Uri::parse('https://www.example.com/path/to/whatever');
$url->getQueryParams();
// should return an empty UriQueryParams instance

$url = Uri\Rfc3986\Uri::parse('https://www.example.com/path/to/whatever?');
$url->getQueryParams();
// should return UriQueryParams with a pair
// represented like this ['' => null] or like this ['', null]

This IMHO should also be the case for the UrlQueryParams instance

Question 1-bis)

I prefer having some extra named constructors on the UrlQueryParams instead of having a getter on the Uri/Url classes. This fully decouples
the Ur(i|l)QueryParams from the Uri/Url classes and lets the user opt in to the new API if needed. In case of errors, bugs, etc., only the QueryParams
container bags would be affected, not the Url/Uri classes.

Question 2)

I see you have

- UriQueryParams::fromArray,
- UriQueryParams::list,

If I read it correctly, this returns 2 array representations of the query?

My question is: shouldn't we have a fromList named constructor and/or a toArray method covering both distinct forms?

This might confuse developers, who will have a hard time understanding which form is which, when to use it, and which
one can be used in which context to instantiate a new instance.

Question 3)

I wanted to know how the following code will be processed ?

$query = 'a[]=foo&a[]=bar&a=qux';
parse_str($query, $result);
$result['a']; //returns "qux"
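
The overwrite can be checked directly, and compared with a parser that keeps every pair. The helper `queryToPairs()` is a hypothetical illustration of the "list of pairs" form, not an API from the RFC; it assumes a pair without "=" carries a null value, as suggested earlier in this thread.

```php
<?php
/**
 * Hypothetical helper: splits a query string into ordered [name, value]
 * pairs without PHP's name mangling; a pair without "=" gets a null value.
 *
 * @return list<array{0: string, 1: ?string}>
 */
function queryToPairs(string $query): array
{
    $pairs = [];
    foreach (explode('&', $query) as $part) {
        // The array union supplies a null value when there is no "=".
        [$name, $value] = explode('=', $part, 2) + [1 => null];
        $pairs[] = [rawurldecode($name), $value === null ? null : rawurldecode($value)];
    }

    return $pairs;
}

$query = 'a[]=foo&a[]=bar&a=qux';

// parse_str() keeps one entry per mangled name, so the last "a" wins.
parse_str($query, $result);
var_dump($result); // ['a' => 'qux']

// The pair list keeps all three tuples, in order.
$pairs = queryToPairs($query);
var_dump($pairs); // [['a[]', 'foo'], ['a[]', 'bar'], ['a', 'qux']]
```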


As seen in the example with parse_str, the full array notation is overwritten and cannot be used/accessed.

Will the getArray API still be able to access the array data, or will it act like parse_str and skip the array notation?

Best regards,
Ignace

Mate_Kocsis · February 13, 2026, 8:13pm

Hi Ignace,

Sorry for the very late reply, I was working on something else in January, and I’ve just recently got back to this topic. I’ll answer your question as soon
as possible, but let me tell you that I’ve just separated the Query Parameter Manipulation part to its own RFC, because it’s that complex.

And I recently started a significant rework (most notably, I’m unifying the two QueryParams implementations to a single class): now, the text is not up-to-date
everywhere, so I’ll have some things to finish, but hopefully I can also announce this officially in a bit.

Regards,
Máté

Mate_Kocsis · March 1, 2026, 10:09pm

Hey Ignace et al,

I have updated the RFC in the past few weeks with a lot of extra info, mostly related to path segment handling: I investigated WHATWG URL’s
behavior more thoroughly, and it turned out that path segments are handled very interestingly, so there was a significant difference compared
to RFC 3986 yet again.

Please give the RFC another read, if possible.

Regards,
Máté

nyamsprod_the_funky · March 3, 2026, 9:24am

Hi Máté,

I just re-read the RFC and I like the updates and precision you’ve brought to it. Here’s my review:
For the builders I have nothing more to add design-wise; this is already solid. I may nitpick on the *Builder::clear() method name: I would have gone with *Builder::reset(), but I presume other developers would go with clear. Other than that the public API is spot on.

For the Enum, my only concern is that they serve just as flags and their usage is tightly coupled to the Uri classes. I would add 2 static named constructors, fromUrl and tryFromUrl, just for completeness. I believe the maintenance cost is negligible, but the DX is improved and it allows for broader usage of the Enum.

Regarding the path segments usage and constructor, I see you have already integrated my Enum suggestions and you have explained why a fully fledged class is not the right approach. So the current design is already solid.

Last but not least, the percent-encoding feature should IMHO be improved by moving the encode/decode methods from being static methods on the URI classes to becoming public API on the Enum. This would indeed imply renaming the enum from Uri\Rfc3986\UriPercentEncodingMode to Uri\Rfc3986\UriPercentEncoder with two methods, encode/decode. Again, it makes for a more self-contained feature and adds to the DX. Developers will not have to always statically call the URI classes for encoding/decoding strings, as the Enums and their cases already convey the information correctly.

Overall I believe this is going in the right direction.

Regards,
Ignace

nyamsprod_the_funky · March 5, 2026, 7:09am

Hi Máté,

As always, I tried to implement a polyfill for the Percent-Encoding and Decoding Support RFC. It turns out that while doing so I was able to refactor the enum. Of note, the case names are the ones used in the text and NOT those in the Enum example provided, as they differ. I will update the names once you have updated them.
Here’s my alternate proposal for Uri\Rfc3986. Keep in mind that the same reasoning would apply for the Uri\Whatwg counterpart.

namespace Uri\Rfc3986 {
    enum UriComponent
    {
        case UserInfo;
        case Host;
        case Path;
        case PathSegment;
        case AbsolutePathReferenceFirstSegment;
        case RelativePathReferenceFirstSegment;
        case Query;
        case FormQuery;
        case Fragment;
        case AllReservedCharacters;
        case AllButUnreservedCharacters;

        /**
         * @throws InvalidUriException
         */
        public function encode(string $input): string;

        /**
         * @throws InvalidUriException
         */
        public function decode(string $input): string;
    }
}

As previously stated, I added the encode/decode methods to the Enum; this way the feature is fully handled by the Enum and no direct reference to the Uri class via a static method is needed.
The Enum is renamed UriComponent instead of the current UriPercentEncodingMode; the name change highlights the intent of the Enum: encoding and decoding URI components, where each enum case represents a defined component context.
Both methods may throw an exception (I do not know if specific exceptions like UnableToEncodeException and/or UnableToDecodeException should be added but, for now, the generic InvalidUriException is used).
This rewrite also greatly simplifies the Enum usage.
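
To give a feel for "instance methods on an enum", here is a reduced userland sketch. It is not the polyfill nor the RFC's behaviour: the enum name `ComponentEncoderSketch` is hypothetical, only three contexts are modelled, the allowed character sets are taken from the RFC 3986 grammar (unreserved, sub-delims, pchar, query), the input is assumed to be raw (so "%" is always encoded), and decode() is a plain rawurldecode() rather than a component-aware decoder.

```php
<?php
// Hypothetical sketch: not the RFC's enum, only three contexts are modelled.
enum ComponentEncoderSketch
{
    case PathSegment;
    case Query;
    case AllButUnreservedCharacters;

    /** Characters left untouched in this context, as a regex character-class body. */
    private function safeCharacters(): string
    {
        $unreserved = 'A-Za-z0-9\-._~';
        $subDelims = '!$&\'()*+,;=';

        return match ($this) {
            self::AllButUnreservedCharacters => $unreserved,
            // pchar = unreserved / sub-delims / ":" / "@"
            self::PathSegment => $unreserved . $subDelims . ':@',
            // query = pchar / "/" / "?"
            self::Query => $unreserved . $subDelims . ':@/?',
        };
    }

    /** Percent-encodes every byte that is not allowed in this context. */
    public function encode(string $input): string
    {
        return (string) preg_replace_callback(
            '#[^' . $this->safeCharacters() . ']#',
            static fn (array $match): string => rawurlencode($match[0]),
            $input
        );
    }

    /** Simplification: decodes every percent-encoded triplet, whatever the context. */
    public function decode(string $input): string
    {
        return rawurldecode($input);
    }
}

echo ComponentEncoderSketch::PathSegment->encode('bar/baz'), PHP_EOL;           // bar%2Fbaz
echo ComponentEncoderSketch::Query->encode('a/b?c d'), PHP_EOL;                 // a/b?c%20d
echo ComponentEncoderSketch::AllButUnreservedCharacters->encode('a b~'), PHP_EOL; // a%20b~
```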

Below you will see your examples from the RFC rewritten

Decoding the fragment


$uri = new Uri\Rfc3986\Uri("https://example.com#_%40%2F");
$fragment = $uri->getFragment(); // returns "_%40%2F"
echo Uri\Rfc3986\UriComponent::Fragment->decode($fragment); // returns "_%40/"

Decoding the query

//with the query component
$uri = new Uri\Rfc3986\Uri("https://example.com/?q=%3A%29");
$query = $uri->getQuery(); // returns "q=%3A%29"
echo Uri\Rfc3986\UriComponent::Query->decode($query); // returns "q=:)"

Usage with the new Uri::withPathSegments method

$uri = new Uri\Rfc3986\Uri("https://example.com");
$uri = $uri->withPathSegments([
    "foo",
    Uri\Rfc3986\UriComponent::PathSegment->decode("bar/baz")
]);

$uri->toRawString(); // https://example.com/foo/bar%2Fbaz

Let me know what you think,
regards,
Ignace

Mate_Kocsis · March 13, 2026, 9:38am

Hi Ignace,

I just re-read the RFC and I like the updates and precision you’ve brought to it here’s my review:
For the builders I have nothing more design wise to add this is already solid. I may nitpick on the *Builder::clear() method name I would have gone with *Builder::reset() but I presume other developers would go with clear. Other than that the public API is spot on.

I also like the reset() method name much more! So I’ll update the RFC accordingly.

For the Enum, my only concern is that they serve just as flags and their usage is tightly coupled to the Uri classes. I would add 2 static named constructors fromUrl and tryFromUrl just for completeness. I believe the maintenance cost is negligible but the developer DX is improved and allows for a broader usage of the Enum.

I don’t really understand why it would make sense to invert the coupling? (decouple the UriHostType/UrlHostType and the
UriType enums from the Uri/Url classes, and at the same time, couple the enums to the Uri/Url classes). IMO it’s far more
ergonomic to retrieve the URI type/host type from the URI class directly, rather than to instantiate an enum each time?

Last but not least, The Percent encoding feature should be IMHO improved by moving the encode/decode methods from being static methods on the URI classes to becoming public API on the Enum. This would indeed imply renaming the enum from Uri\Rfc3986\UriPercentEncodingMode to Uri\Rfc3986\UriPercentEncoder with two methods encode/decode. Again it makes for a more self-contained feature and adds to the DX. Developer will not have to always statically call the URI classes for encoding/decoding strings as the Enums and their cases already convey the information correctly.

Personally - and I may be in the minority - I don’t see an issue with having two static methods on Uri/Url.
Uri::percentEncode() and Uri::percentDecode() as well as Url::percentEncode() and Url::percentDecode()
could indeed be implemented via a dedicated UriPercentEncoder and UrlPercentEncoder class, or even a
shared PercentEncoder one, but:

- Its methods would still be static
- I don’t think it’s worth to add one or two dedicated classes just for this purpose

I also got feedback that these functions could be free-standing in the URI namespace. But I really don’t like free standing functions,
so I won’t go in this direction. IMO even two static methods on the Uri and Url classes are easier to use and find.

So if we reject these ideas, then the next candidate is your suggestion of having a UriComponent/UrlComponent enum
with an encode() and decode() method (https://github.com/thephpleague/uri-src/pull/186#issuecomment-4016602880):

$uri = new Uri\Rfc3986\Uri("https://example.com/?q=%3A%29");
$query = $uri->getQuery(); // returns "q=%3A%29"
echo Uri\Rfc3986\UriComponent::Query->decode($query); // returns "q=:)"

But as I mentioned in my comment on the PR, percent-encoding/decoding is not necessarily tied to URI/URL components,
for example because the proposal currently contains the Uri\Rfc3986\UriPercentEncodingMode::AllReservedCharacters,
Uri\Rfc3986\UriPercentEncodingMode::AllButUnreservedCharacters, or Uri\Rfc3986\UriPercentEncodingMode::PathSegment.

Some of the enum cases which don’t relate to a component could be removed, but at least the AllReservedCharacters case is
important because it provides a direct alternative for rawurlencode() and rawurldecode(). My side-quest is to gradually phase out
*urlencode() and *urldecode() functions because their naming is very confusing, and people usually don’t know when to use which.

And I’ve just noticed that probably yet another enum case would be needed to provide a direct alternative for urlencode() and urldecode(),
because they differ from rawurlencode() and rawurldecode() with regards to how the “~” is handled, besides the " " character.
But at this point, I became unsure if it’s worth to pursue this goal, because this is not RFC 3986 compliant behavior anymore (and TBH
not even Uri\Rfc3986\UriPercentEncodingMode::FormQuery is compliant), so it has nothing to do with the Uri\Rfc3986 namespace.
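
The difference between the two function families mentioned above is easy to reproduce with the existing functions. The input strings are arbitrary examples.

```php
<?php
$input = 'a b~c';

// rawurlencode() leaves "~" alone and encodes the space as %20.
$raw = rawurlencode($input);
echo $raw, PHP_EOL; // a%20b~c

// urlencode() encodes "~" as %7E and the space as "+".
$form = urlencode($input);
echo $form, PHP_EOL; // a+b%7Ec

// The decoders differ on "+" as well: only urldecode() turns it into a space.
$rawDecoded = rawurldecode('a+b');
$formDecoded = urldecode('a+b');
echo $rawDecoded, PHP_EOL;  // a+b
echo $formDecoded, PHP_EOL; // a b
```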

So all in all… As far as I can see, not even a Uri\Rfc3986\UriComponent enum could provide a complete solution for the custom
percent-encoding/decoding part of the proposal. If we used a Uri\Rfc3986\UriEncoding enum name instead, then there would be no issue
with the various kinds of encoding/decoding modes not referring to URI components, but the naming would probably still not be right,
as I wouldn’t expect a class name with “ing” suffix to perform percent-encoding/decoding itself. But I’m happy to be corrected by
native English speakers

As I don’t have any other ideas, I think I still prefer the static method based approach.

Regards,
Máté

Mate_Kocsis · March 30, 2026, 8:40pm

Hi Ignace, Everyone,

I have recently clarified/updated a few things in the RFC:

- How Uri\Rfc3986\Uri::getDecodedPathSegments() and Uri\WhatWg\Url::getDecodedPathSegments() exactly work
- The exact list of percent-encoding modes and their behavior
- Exactly how the percentDecode() methods work

Please have a look at these changes, because I’d like to bring this RFC to a vote soonish, since there hasn’t been much debate
for a while.

Regards,
Máté

Mate_Kocsis · April 7, 2026, 1:30pm

Hi Internals,

As the discussion has been pretty much silent lately, I plan to bring this RFC to a vote next Monday, unless significant issues
are brought up that need additional time to discuss.

Regards,
Máté

Tim_Dusterhus · April 13, 2026, 10:27pm

Hi

On 3/13/26 10:38, Máté Kocsis wrote:

Personally - and I may be in the minority - I don't see an issue with
having two static methods on Uri/Url.

As I had previously mentioned in private, I also (very strongly) believe that making these static methods on the URI class is not the correct design. Classes are not intended to be “pseudo-namespaces” and any (public) functionality on a class should directly relate to it, otherwise you're just polluting the API surface, which makes autocompletion and discoverability worse and documentation more complex. In practice this means that public static methods should be limited to “named constructors”.

Uri::percentEncode() and Uri::percentDecode() as well as
Url::percentEncode() and Url::percentDecode()
could indeed be implemented via a dedicated UriPercentEncoder and
UrlPercentEncoder class, or even a
shared PercentEncoder one, but:

- Its methods would still be static
- I don't think it's worth to add one or two dedicated classes just for
this purpose

I agree here: A dedicated class with static methods would still be a pseudo-namespace. It would be …

I also got feedback that these functions could be free-standing in the URI
namespace. But I really don't like free standing functions,

… equivalent to free-standing functions but worse. Personal preference should not play a role here: PHP supports namespacing functions and within PHP's standard library we should embrace this capability.

A big benefit of free-standing functions is that they would be easy to polyfill, particularly when we need to add more of them in future PHP versions.

So all in all... As far as I can see, not even a Uri\Rfc3986\UriComponent
enum could provide a complete solution for the custom
percent-encoding/decoding part of the proposal. If we used a
Uri\Rfc3986\UriEncoding enum name instead, then there would be no issue
with the various kinds of encoding/decoding modes not referring to URI
components, but the naming would probably still not be right,
as I wouldn't expect a class name with "ing" suffix to perform
percent-encoding/decoding itself. But I'm happy to be corrected by
native English speakers

enum Uri\Rfc3986\PercentEncoder
enum Uri\WhatWg\PercentEncoder

with *non*-static encode() and decode() methods would work for me.

I don't have a strong preference between “instance methods on an enum” and “free-standing functions”, but I have a strong preference against “static methods”.

Best regards
Tim Düsterhus

Crell · April 15, 2026, 8:04pm

On Mon, Mar 30, 2026, at 3:40 PM, Máté Kocsis wrote:

Hi Ignace, Everyone,

I have recently clarified/updated a few things in the RFC:

- How Uri\Rfc3986\Uri::getDecodedPathSegments() and
Uri\WhatWg\Url::getDecodedPathSegments() exactly work
- The exact list of percent-encoding modes and their behavior
- Exactly how the percentDecode() methods work

Please have a look at these changes, because I'd like to bring this RFC
to a vote soonish, since there's not too much debate
going on for a while.

Regards,
Máté

My apologies for taking so long to review this. It's been a hectic few weeks Chez Crell.

From the builder examples:
"echo $url->toAsciiString; "

Is that missing () ?

I'm not sure I agree that one should always reset() a builder before using it. There's plenty of good reasons to make a builder part way, then fill the rest in separately. But I do agree that reset() should exist, so the non-normative advice there is not a blocker for me.

For host type detection, what does a null return signify? UrlHostType includes an Empty case, which I'd assume would be used in that situation. And I'd rather have an Empty case on UriHostType as well rather than null, unless someone can make a good argument for using null instead...

LeadingSlashPolicy::AddForNonEmtpyRelative is... a mouthful. Self-documenting is fine, but as the example demonstrates it quickly creates super long lines. That could be a problem in, say, a match() statement, inside a method, where with that enum value as a case you'd be more than halfway across the screen before you get to the executable code for that case. Is there no way to make that whole thing shorter?

I will also agree with Tim's comments in a separate message: The encode/decode operations should not be static methods. They do not relate to the *type*/*class*, therefore they should not be on the type/class. Functions in namespaces are totally fine.

I'm not sure how I feel about object methods on the enum. That would be something like:

use Uri\Rfc3986\PercentEncoder;

PercentEncoder::Path->encode($someval);

Right? That.. could work, but also feels quite convoluted.

use Uri\Rfc3986\encode;

encode($someval, PercentEncoder::Path);

Is about the same length, and with PFA easily pre-configurable, and closer to what is typically seen in the wild today. I think I'd lean toward "just a function", but I wouldn't vote against the enum method approach just for that.
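
The two call shapes being compared can be put side by side with syntax that already exists today. This is a sketch under assumptions: `ModeSketch` and `encode_sketch()` are hypothetical names, the function is left un-namespaced to keep the example short, and the Path case uses a simplified per-segment rawurlencode() rather than the RFC's rules. Binding the mode is done here with first-class callable syntax and an arrow function, not with PFA.

```php
<?php
// Hypothetical two-case enum with an instance method.
enum ModeSketch
{
    case Path;
    case AllReservedCharacters;

    public function encode(string $input): string
    {
        return match ($this) {
            // Equivalent of rawurlencode() for the whole string.
            self::AllReservedCharacters => rawurlencode($input),
            // Simplification: encode each segment, keep the "/" separators.
            self::Path => implode('/', array_map('rawurlencode', explode('/', $input))),
        };
    }
}

// Hypothetical free-standing function taking the mode as an argument.
function encode_sketch(string $input, ModeSketch $mode): string
{
    return $mode->encode($input);
}

// Enum method shape, with the mode bound through first-class callable syntax.
$viaMethod = ModeSketch::Path->encode(...);

// Function shape, with the mode bound through an arrow function.
$viaFunction = static fn (string $input): string => encode_sketch($input, ModeSketch::Path);

echo $viaMethod('a b/c'), PHP_EOL;   // a%20b/c
echo $viaFunction('a b/c'), PHP_EOL; // a%20b/c
```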

Aside from those (overall minor) points, this all looks great, and I appreciate how much research has clearly gone into it!

--Larry Garfield

Tim_Dusterhus · April 15, 2026, 8:21pm

Hi

Am 2026-04-15 22:04, schrieb Larry Garfield:

For host type detection, what does a null return signify? UrlHostType includes an Empty case, which I'd assume would be used in that situation. And I'd rather have an Empty case on UriHostType as well rather than null, unless someone can make a good argument for using null instead...

An empty host is semantically different from a missing host.

You can check that with the existing functionality in PHP 8.5, where `getHost()` is `?string` for both RFC 3986 and WHATWG URL in accordance with the respective standards. `getHostType()` will return `null` if `getHost()` returns `null`. The cases of the Ur[il]HostType enums in the featured RFC are also pulled straight from the respective specifications.

LeadingSlashPolicy::AddForNonEmtpyRelative is... a mouthful. Self-documenting is fine, but as the example demonstrates it quickly creates super long lines. That could be a problem in, say, a match() statement, inside a method, where with that enum value as a case you'd be more than halfway across the screen before you get to the executable code for that case. Is there no way to make that whole thing shorter?

FYI: I've also had little time for this RFC, but I've started discussing spinning off the “path segment” part of the RFC into a dedicated RFC with Mate off-list a few days ago, similarly to how the query parameters part has been spun off because it has become too complicated for a bulk RFC.

I'm not sure how I feel about object methods on the enum. That would be something like:

use Uri\Rfc3986\PercentEncoder;

PercentEncoder::Path->encode($someval);

Right? […]

Yes.

use Uri\Rfc3986\encode;

encode($someval, PercentEncoder::Path);

Is about the same length, and with PFA easily pre-configurable, and closer to what is typically seen in the wild today.

FWIW: The enum would also be “pre-configurable” with https://wiki.php.net/rfc/partial_function_application_this. Arnaud and I are working on a more general version of that RFC based on Bob's feedback, but will need to figure out the latest edge cases before proposing it.

Best regards
Tim Düsterhus

nyamsprod_the_funky · April 15, 2026, 10:21pm

As far as I am concerned, this brings me back to my earlier suggestion (with a polyfill implementation). The name can be changed or improved, but I did go
with an Enum and two public methods, encode/decode (see https://github.com/thephpleague/uri-src/pull/186), which to me:

- offers a better DX and avoids polluting the Ur(i|l) classes
- allows for an easier transition with an additional polyfill
- properly decouples the encoding feature from the URI parsing ones

So I am +1 on this change, which I have already advocated and which addresses my biggest concern with the current RFC proposal.

Mate_Kocsis · April 15, 2026, 10:27pm

Hi Everyone,

I don’t have a strong preference between “instance methods on an enum”
and “free-standing functions”, but I have a strong preference against
“static methods”.

After all the feedback, I chose the “instance methods on an enum” solution in the end.
I’m not saying that I love it; I was perfectly happy with the static method solution, even
if this functionality was a tiny little bit less related to URIs than the rest of the methods.

Thanks again to Ignace for coming up with this unorthodox, but still genuine solution.

Regards,
Máté