On 1.12.2025 22:36:21, Larry Garfield wrote:
Hi folks. Ilija and I would like to present our latest RFC endeavor, pattern matching:
[https://wiki.php.net/rfc/pattern-matching](https://wiki.php.net/rfc/pattern-matching)
You may note the date on the RFC is from 2020. Yes, we really have had this one in-progress for 5 years. :-) (Though it was inactive for many of those years, in fairness.) Pattern matching was intended as the next follow up to Enums, as it's a stepping stone toward full ADT support. However, we also feel it has enormous benefit on its own for simplifying complex comparisons.
This RFC has been through numerous iterations, including a full implementation rewrite just recently that made a number of features much easier. We have therefore included two patterns that were previously slated for later inclusion but turned out to be trivially easy in the new approach. (Variable pinning and numeric comparison.)
Nonetheless, there are two outstanding questions on which we are looking for feedback.
Naturally given the timing, we will not be calling a vote until at least late January, regardless of how the discussion goes. So, plenty of time to express your support. :-)
Thanks for bringing pattern matching up for discussion again.
I'd like to note that the class-access is very ugly.
// Shorthand
if ($p is Point(:$z, x: 3, :$y)) {
print "x is 3 and y is $y and z is $z.";
}
The RFC gives as reasoning that the colon prefix is needed to support positional parameters in ADTs. Sure. It's fine to anticipate these.
But what's not fine is using an inconsistent syntax for variable bindings across different contexts. In arrays, a binding is just a bare variable. In objects it suddenly needs a colon? What.
Also, a colon is very easy to miss in the future with ADTs: Point::2D($y, $x) vs. Point::2D(:$y, :$x). These mean completely different things, so accidentally adding or omitting the colon is a serious problem.
Can we instead find some solution, which satisfies both and still delivers consistency?
An earlier iteration of the RFC had the following very nice construction:
$p is Point&{ $z, x: 3, $y }
This just worked. It's a Point class, and then it matches the properties of the object. Nice.
This also works for future ADTs: Move::Forward&{ $amount }. Then, if there's a desire to actually match an object positionally, it's logical to use a parenthesized expression, as for a tuple. I.e.:
$move is Move::Forward($a)
Where $a is assigned the first value passed to Move::Forward.
Similarly, destructuring without a class name no longer works:
$json = json_decode($myInput);
if ($json is stdClass(type: 'store', :$value)) {
// why do I need to know/specify that it's a stdClass?! I'm just interested in the properties.
}
vs.
if ($json is { type: 'store', $value }) {
//
}
This satisfies the requirement of keeping the language clear and intuitive:
- Any standalone variable is bound. No weird colon shenanigans. The syntax is consistent.
- Positional binding quite intuitively uses parentheses: you construct the enum with Foo::Bar($var) and you read it back on the right-hand side with Foo::Bar($var).
- It naturally allows destructuring without class name.
- It makes it hard to accidentally write something totally different to what was meant.
(Also, it's likely more intuitive to users of other languages, like Rust, which also uses {} for named stuff and () for positional stuff.)
Further, this particular syntax works nicely with a future scope of object destructuring, akin to array destructuring. As an example:
function addVec(Point $p, Vec $v) {
Point(:$px, :$py) = $p; // I already know this is a Point, why do I need to repeat it? It also looks ugly and quite a bit like a left-hand function call. Like... assigning something to a returned reference?
// Or would you do {$px, $py} for object destructuring? Well, that's now truly inconsistent.
Vec(:$vx, :$vy) = $v;
return new Point($px + $vx, $py + $vy);
}
vs.
function addVec(Point $p, Vec $v) {
{$px, $py} = $p; // Plain and simple. Perfectly straightforward.
{$vx, $vy} = $v;
return new Point($px + $vx, $py + $vy);
}
I've also heard a consideration about "Foo::Bar & { $var }" being ambiguous with respect to "is Foo::Bar now a const or an ADT class". This may be resolved in the VM. I don't consider this a major issue; it is simply something which can be disambiguated at optimizer-time or run-time, depending on what type of symbol it is.
I'm deeply unsatisfied by the handling of object properties:
"Note that matching against a property's value implies reading that property's value", "If the property is uninitialized, an error will be thrown." and "If the property is undefined and none of the above apply, it will evaluate to null and a Warning will be issued."
This is wildly inconsistent with arrays:
"Of particular note, the pattern matching approach automatically handles array_key_exists() checking. That means a missing array element will not trigger a warning, whereas with a traditional if ($foo['bar'] === 'baz') approach missing values must be accounted for by the developer manually."
Sure, a pattern match will read an object's property. Just like it reads an array's entry.
I assume the goal is "let's warn when an object property is typoed". But it just makes for two tiers: arrays get key_exists(), properties do not get property_exists(). I welcome surprises.
From my point of view, pattern matching is an "is" operation. Thus it ought to express isset-like semantics. I.e. the approach for arrays is correct, and should be mirrored for objects.
I definitely think the approach of "let's warn about typos" is laudable, but consistency is important.
It also means that uninitialized properties forcibly throw. It also has subtle ordering implications on the semantics, given that the implementation internally short-circuits. E.g. (assuming something like "class ResponseOrError { string $type; Exception $exception; string $response; }"):
if ($obj is ResponseOrError { type: 'error', exception: $e }) { throw $e; }
does not throw if $exception is uninitialized and $type is not 'error'. But $obj is ResponseOrError { exception: $e, type: 'error' } will certainly throw.
It further means that there needs to be some internal check, and you cannot simply write:
if ($obj is ResponseOrError { exception: $e }) { throw $e; }
This is bad design and takes away a lot of flexibility, just for being typo-safe.
There are better approaches towards typo-safety, e.g. in future (PHP 9) we could change isset() and all other similar checks (coalesce and this proposal) to immediately throw when a property is checked for existence, whose name does not exist on a class which is not marked #[\AllowDynamicProperties].
We should make use of that instead of shoe-horning this into this proposal.
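To make the comparison concrete, here is a minimal sketch of how today's PHP (7.4+ typed properties) already treats these cases outside of pattern matching. The class is the hypothetical ResponseOrError from the example above, and the variable names are mine; this only illustrates existing isset()/property_exists()/array_key_exists() behaviour and says nothing about what the RFC implementation does internally.

```php
<?php
declare(strict_types=1);

// Hypothetical class, mirroring the example in the text above.
final class ResponseOrError
{
    public string $type;
    public Exception $exception;
    public string $response;
}

$obj = new ResponseOrError();
$obj->type = 'response';
$obj->response = 'ok';
// $obj->exception is deliberately left uninitialized.

// isset() stays silent on an uninitialized typed property.
$issetResult = isset($obj->exception);

// property_exists() only looks at the declaration, not at initialization.
$declared = property_exists($obj, 'exception');

// A plain read of the uninitialized typed property throws an Error.
$readThrew = false;
try {
    $e = $obj->exception;
} catch (Error $err) {
    $readThrew = true;
}

// The array counterpart: a missing key is checked without any warning.
$arr = ['type' => 'response'];
$hasKey = array_key_exists('exception', $arr);

var_dump($issetResult, $declared, $readThrew, $hasKey);
// bool(false) bool(true) bool(true) bool(false)
```

With isset-like semantics, the object case would behave like the first check rather than like the plain read.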
Open questions:
- match() "is" placement:
I prefer match() is {} rather than an "is" inside the construct. Simpler to me, but I think either choice is fine.
- Positional array enforcement:
It's relatively simple to intentionally get positional arrays via array_values(). I also don't think it's unexpected. That's just how PHP's arrays work. Enforcing positional arrays, however, will be quite surprising if e.g. an entry was removed:
$a = [1, 2, 3];
unset($a[1]);
if ($a is [1, 3]) {
// huh? It's [1, 2 => 3], not [1, 3].
}
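For reference, a minimal sketch of what today's PHP reports for that array (array_is_list() needs PHP 8.1+). The variable names are mine, and this only shows existing array behaviour, not the proposed "is" operator:

```php
<?php
declare(strict_types=1);

$a = [1, 2, 3];
unset($a[1]);

// unset() leaves a gap: the remaining keys are 0 and 2.
$keys = array_keys($a);

// The array is no longer a list (keys are not 0..n-1).
$isList = array_is_list($a);

// array_values() reindexes, giving a positional array again.
$reindexed = array_values($a);

// Strict comparison also compares keys, so the gap matters.
$sameAsLiteral = ($a === [1, 3]);
$reindexedSame = ($reindexed === [1, 3]);

var_dump($keys === [0, 2], $isList, $sameAsLiteral, $reindexedSame);
// bool(true) bool(false) bool(false) bool(true)
```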
Thanks,
Bob