[PHP-DEV] [Pre-RFC] Scalar object methods (with a working implementation)

Hi all,

Per Seifeddine's suggestion to keep this out of the karma-request thread, I'm opening a pre-RFC discussion for scalar object methods -- calling a small, curated set of methods directly on scalar values, e.g. $str->trim(), (3)->pow(2). There's a complete, tested implementation and a full write-up (links below); I'd like to surface the strongest objections before I write the formal RFC.

Disclosure: I built this with an AI assistant (Claude) as a tool. The design and the decisions are mine, and I've independently verified the engine behaviour, performance, JIT correctness, leak-freedom and the BC scan. Flagging it up front for transparency.

I know "methods on primitives" was proposed and declined before (Nikita's 2014 "Methods on primitive types in PHP"). The reason it stalled was loose typing: $x->trim() would need a runtime type check and would behave differently depending on what $x held. This proposal sidesteps that entirely, by generalizing the resolution Nikita himself suggested in that thread -- requiring an explicit cast where the type isn't already clear.

The idea: dispatch only on receivers the compiler already knows are scalar. The method call is rewritten at compile time to an ordinary call into an internal backing class -- no runtime type dispatch, no new opcode, the object method-call path is untouched. A receiver qualifies only if its type is guaranteed syntactically: a literal, a (string)/(int) cast, a concatenation/interpolation, a non-nullable scalar-typed property, or a call with a declared non-nullable scalar return type. An untyped $x->trim() is left exactly as today (Error). Crucially, dispatch never depends on optimizer-inferred types, so behaviour is identical with and without opcache.

     echo " Hello World "->trim()->upper(); // "HELLO WORLD"
     echo (3)->pow(2); // 9
     echo "hello"->length()->pow(2); // 25 -- length():int chains into the int methods

So the cast Nikita proposed, ((string) $num)->chunk(), is only needed where the type isn't already guaranteed; everywhere else the dispatch is sound by construction, with no runtime check.
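
To make the rewrite concrete, here is a minimal userland sketch of what the desugared calls could look like. StrBacking and IntBacking are hypothetical stand-ins (the real backing classes are internal and cannot be named from userland), and mapping upper() to strtoupper() and length() to strlen() is an assumption made for illustration, not a statement about the implementation.

```php
<?php
// Hypothetical userland stand-ins for the internal backing classes.
final class StrBacking
{
    public static function trim(string $s): string { return trim($s); }
    public static function upper(string $s): string { return strtoupper($s); } // assumed mapping
    public static function length(string $s): int { return strlen($s); }       // assumed mapping
}

final class IntBacking
{
    // int|float mirrors the overflow behaviour described for Int::pow.
    public static function pow(int $n, int $exp): int|float { return $n ** $exp; }
}

// " Hello World "->trim()->upper() would be rewritten to roughly:
$a = StrBacking::upper(StrBacking::trim(" Hello World "));

// "hello"->length()->pow(2) would be rewritten to roughly:
$b = IntBacking::pow(StrBacking::length("hello"), 2);

var_dump($a, $b);
```

Both lines are ordinary static calls, which is why no runtime type dispatch is involved in this model.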

It's intended as one proposal with two independent votes:

1. Scalar methods on guaranteed free receivers (the above). A pure capability -- it adds a way to call scalar operations and changes nothing about untyped code. Proposed initial sets: a small curated Str (trim/upper/lower/length + contains/startsWith/endsWith), Int (abs/pow/clamp), and Float (round/ceil/floor/abs); bool deliberately gets none (its operations are operators, not methods). The sets are governed by explicit criteria and are the easiest thing to tune in discussion.

2. Scalar-typed local variables (int $x = ...;, scalar types only), which additionally make a typed local a guaranteed receiver (string $s = ...; $s->trim()). This is the more contested half -- it also carries the "local type discipline" argument -- so it's a separate vote: a "no" here ships the capability without typed locals.

What I'm deliberately NOT doing, up front so it's not a surprise:

- No method-call-result receivers ($this->getName()->trim()) -- that would rest on return-type covariance under inheritance; not worth the surface.
- Int::abs/pow return int|float (they can overflow, as the global functions do), so they're honest terminals -- they don't chain. (Int::clamp is the one initial int method provably :int for all inputs, so it does chain.)
- No int|false typed locals -- that's a sentinel state, not a committed type; ?T is supported, sentinel-unions are not.
- The backing classes are internal-only (NUL-prefixed name, like anonymous classes): class_exists('Str') is false, no Reflection, userland "class Str {}" can't collide.
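
For readers unfamiliar with the naming precedent mentioned in the last point, this small sketch shows existing engine behaviour only: the generated name of an anonymous class contains a NUL byte, so no class declaration written in source code can spell it. It does not exercise the proposed backing classes, and unlike them, anonymous classes remain visible to Reflection.

```php
<?php
// Existing behaviour, runnable on PHP 8.0+: inspect the generated class name.
$object = new class {};
$name = get_class($object);

// The name starts with a readable prefix...
$startsLikeAnonymous = str_starts_with($name, 'class@anonymous');

// ...and contains a NUL byte, which cannot appear in a declared class name.
$containsNul = str_contains($name, "\0");

var_dump($startsLikeAnonymous, $containsNul);
```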

Implementation status -- this is built and tested, not a sketch:

- Scalar methods add zero new opcodes -- the desugar emits an ordinary static call, and the object method-call path is byte-for-byte unchanged. (Typed locals add dedicated *_TYPED assignment opcodes, but the untyped hot path stays byte-identical.)
- Performance (deterministic callgrind, release build): the untyped hot path is byte-identical; the standard bench.php suite is +0.145% instructions, entirely from predicted-not-taken branches in reference opcodes only, with zero added cache misses or branch mispredictions. A typed-local write benchmarks at ~0.79x the cost of a typed-property write -- a check the language already runs on every typed-property write since 7.4.
- References (the objection that sank prior typed-locals attempts) are enforced through every path -- =&, by-ref params, array/object/static-prop refs, yield, closure capture, $$name, extract, $GLOBALS, global -- via the existing typed-property reference machinery. Leak-checked under stress.
- Correct under JIT in all three modes (interpreter, function, tracing -- differential byte-identical output). opcache SHM + file_cache round-trip verified.
- BC impact, measured: an AST scan of the 1,000 most-downloaded Packagist packages (173k+ files) found zero method-call sites with a guaranteed-scalar receiver -- i.e. zero call sites that change behaviour (every such site is a fatal error today). Userland Str classes (incl. Laravel's Illuminate\Support\Str) coexist with the backing class, verified.
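
As background for the references point, the following sketch shows the existing typed-property behaviour that the typed-locals half says it reuses: a write through a reference is still checked against the property's declared type. The Counter class is a hypothetical example; nothing here uses the proposed typed-local syntax.

```php
<?php
// Hypothetical class with a typed property (PHP 7.4+ behaviour).
final class Counter
{
    public int $value = 0;
}

$c = new Counter();
$ref = &$c->value;   // reference bound to a typed property

$ref = 5;            // valid int write through the reference
$afterValidWrite = $c->value;

$rejected = false;
try {
    $ref = "not a number"; // violates int, even though written via $ref
} catch (TypeError $e) {
    $rejected = true;
}

// The failed write leaves the property untouched.
$afterRejectedWrite = $c->value;

var_dump($afterValidWrite, $rejected, $afterRejectedWrite);
```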

Full write-up (RFC draft, plus the method-set, performance, and BC-impact analyses): https://github.com/kralmichal/php-src/tree/rfc/docs/scalar-object-methods-rfc

Implementation branches (PHP 8.6-dev base):
- Primary (scalar methods): kralmichal/php-src on GitHub, branch rfc/scalar-methods
- Secondary (typed locals, stacked): kralmichal/php-src on GitHub, branch rfc/typed-locals

What I'd value discussing before I write the formal RFC:

1. Does the "compile-time-guaranteed receivers only" framing actually resolve the loose-typing objection, or is there a hole I'm not seeing?
2. The method-set and naming is the most open part -- is a small curated, clean-slate set (distinct from the procedural names) the right direction, or a non-starter? How should it relate to the existing userland efforts in this space (e.g. Psl)?
3. Anything that would sink this before I invest in the full RFC.

Thanks,
Michal Kral

On Mon, 29 Jun 2026 at 23:33, Michal Kral <michal@entrylog.eu> wrote:

> Hi all,
>
> Per Seifeddine's suggestion to keep this out of the karma-request
> thread, I'm opening a pre-RFC discussion for scalar object methods --
> calling a small, curated set of methods directly on scalar values, e.g.
> $str->trim(), (3)->pow(2). There's a complete, tested implementation and
> a full write-up (links below); I'd like to surface the strongest
> objections before I write the formal RFC.

Hi Michal,

Thanks for the detailed write-up. I'll be upfront: I'm against this
feature. I have two concrete objections to the approach and one
broader objection to the idea itself.

> The idea: dispatch only on receivers the compiler already knows are
> scalar. The method call is rewritten at compile time to an ordinary call
> into an internal backing class -- no runtime type dispatch, no new
> opcode, the object method-call path is untouched. A receiver qualifies
> only if its type is guaranteed syntactically: a literal, a
> (string)/(int) cast, a concatenation/interpolation, a non-nullable
> scalar-typed property, or a call with a declared non-nullable scalar
> return type.

This is my main objection, and I think it's a fatal one.

Restricting dispatch to receivers "the compiler already knows are
scalar" sounds safe, but in practice it covers almost no real code.
PHP is compiled one file at a time, with no view into other files. The
compiler does not perform whole-program analysis; when compiling a
given file, it generally has no knowledge of declarations in other
files that haven't been autoloaded yet.

So take your own qualifying rule, "a call with a declared non-nullable
scalar return type":

    class Example {
        public static function getStr(): string { return "x"; }
    }

    $x = Example::getStr();
    $x->length();

A human reads this and knows `$x` is a string. But the PHP *compiler*,
when compiling the file that contains `$x = Example::getStr()`, only
knows `getStr()` returns `string` if `Example` happens to be declared
in the same file. Move `Example` into its own autoloaded file (which
is how essentially all real code is organised) and the compiler has no
idea what `getStr()` returns at compile time. So `$x->length()` would
*not* dispatch, even though the type is completely determined.

The result is that the same expression works or fails depending on
whether a class is in the same file or autoloaded from another one.
That's not a predictable rule a developer can hold in their head.
Real-world values emerge from call chains, conditionals, and
cross-file boundaries, which are exactly the cases the compiler can't
prove syntactically. The feature ends up usable only on literals and
casts and almost nothing else.

This also forces special handling in static-analysis tools because the
compiler and the analyser will disagree about the same line. Given
`$x->length()`, PHPStan/Psalm/Mago will happily infer `$x` is a string
and accept it, while the compiler rejects it.
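
The cross-file point can be checked on a current PHP build. The sketch below is illustrative only: the autoloader uses eval() as a stand-in for requiring a separate file, and it demonstrates today's lazy class loading, not the proposed dispatch.

```php
<?php
// Hypothetical autoloader; eval() stands in for `require 'src/Example.php'`.
spl_autoload_register(function (string $class): void {
    if ($class === 'Example') {
        eval('class Example { public static function getStr(): string { return "x"; } }');
    }
});

// This file is already fully compiled at this point, yet the engine has
// no declaration for Example (second argument false: do not autoload).
$knownBefore = class_exists('Example', false);

// First use triggers the autoloader at runtime.
$x = Example::getStr();

$knownAfter = class_exists('Example', false);

// The declared return type only becomes discoverable after loading.
$returnType = (new ReflectionMethod('Example', 'getStr'))->getReturnType()->getName();

var_dump($knownBefore, $knownAfter, $returnType);
```

The calling code was compiled before `Example` existed, so any compile-time decision based on `getStr()`'s return type would have had nothing to consult.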

> - The backing classes are internal-only (NUL-prefixed name, like
> anonymous classes): class_exists('Str') is false, no Reflection,
> userland "class Str {}" can't collide.

My second objection. If `$s->length()` is sugar for `Str::length($s)`,
then `Str` (or whatever backs it) *must* be visible to userland in
some form. Static analysers (PHPStan, Psalm, Mago, Phan, PhpStorm,
...) need a definition describing which methods exist, to type-check
calls, support "go to definition," report wrong arity, and the
PHP-based ones need it to be reflectable.

The collision concern you're solving with NUL-prefixing isn't worth
that cost. Userland already has thousands of `Str` classes; the clean
fix is to namespace the backing classes (or simply pick non-colliding
names), not to hide them from the entire ecosystem. Solve the naming
problem with naming, not by blinding the tooling.

(Related, and unaddressed in the write-up: what happens with methods
that take arguments, e.g. `$s->indexOf($y)`? Does an arity mismatch
fail at compile time, or at runtime with `ArgumentCountError` like a
normal method call?)

Finally, the broader point. Even if both of the above were fully
resolved, I'd still be against this. PHP has an established way of
doing these operations and scalar methods don't remove that.
`trim($s)`, `mb_trim($s)`, and `$s->trim()` would all coexist, with
the method form available only sometimes. To me that's too disruptive
to the norm for what it buys: it doesn't replace anything, it doesn't
compose cleanly, and it introduces a method-call syntax on values that
carry no object identity. I don't think the language is better for it.

Cheers,
Seifeddine.

On 2026-06-30 10:32, Michal Kral wrote:

> Hi all,
>
> calling a small, curated set of methods directly on scalar values, e.g. $str->trim(), (3)->pow(2). There's a complete, tested implementation and a full write-up (links below); I'd like to surface the strongest objections before I write the formal RFC.

Larry Garfield has noted that when PFA (partial function application) lands, one would be able to write

    $s |> trim(?)

One already has

    $s |> trim(...)

but the ? syntax would allow

    $sep |> explode(?, $str)

or

    $str |> explode($sep, ?)

whichever seems more appropriate at the time.

Probably not helpful for pre-existing functions (just say explode($sep, $string) already), but user functions might be more readable when pronounced this way.
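
For comparison, here is a sketch of what can be written without the ? placeholder, assuming PHP 8.5 or later (where the pipe operator is available). The variable names are illustrative, and the parenthesised arrow function is only a stand-in for what a partial application would express.

```php
<?php
// Assumes PHP 8.5+ for the |> operator; no PFA placeholder is used.
$s = "  a,b,c  ";

// Single-argument functions pipe directly via first-class callable syntax.
$trimmed = $s |> trim(...);

// A multi-argument function needs a wrapper today; the arrow function
// must be parenthesised when used as a pipe stage.
$parts = $s
    |> trim(...)
    |> (fn (string $str): array => explode(',', $str));

var_dump($trimmed, $parts);
```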

On Tue, Jun 30, 2026, at 00:32, Michal Kral wrote:

> Hi all,
>
> Per Seifeddine’s suggestion to keep this out of the karma-request
> thread, I’m opening a pre-RFC discussion for scalar object methods –
> calling a small, curated set of methods directly on scalar values, e.g.
> $str->trim(), (3)->pow(2). There’s a complete, tested implementation and
> a full write-up (links below); I’d like to surface the strongest
> objections before I write the formal RFC.
>
> Disclosure: I built this with an AI assistant (Claude) as a tool. The
> design and the decisions are mine, and I’ve independently verified the
> engine behaviour, performance, JIT correctness, leak-freedom and the BC
> scan. Flagging it up front for transparency.
>
> I know “methods on primitives” was proposed and declined before
> (Nikita’s 2014 “Methods on primitive types in PHP”). The reason it
> stalled was loose typing: $x->trim() would need a runtime type check and
> would behave differently depending on what $x held. This proposal
> sidesteps that entirely, by generalizing the resolution Nikita himself
> suggested in that thread – requiring an explicit cast where the type
> isn’t already clear.
>
> The idea: dispatch only on receivers the compiler already knows are
> scalar. The method call is rewritten at compile time to an ordinary call
> into an internal backing class – no runtime type dispatch, no new
> opcode, the object method-call path is untouched. A receiver qualifies
> only if its type is guaranteed syntactically: a literal, a
> (string)/(int) cast, a concatenation/interpolation, a non-nullable
> scalar-typed property, or a call with a declared non-nullable scalar
> return type. An untyped $x->trim() is left exactly as today (Error).
> Crucially, dispatch never depends on optimizer-inferred types, so
> behaviour is identical with and without opcache.
>
>     echo " Hello World "->trim()->upper(); // "HELLO WORLD"
>     echo (3)->pow(2); // 9
>     echo "hello"->length()->pow(2); // 25 – length():int chains into the int methods
>
> So the cast Nikita proposed, ((string) $num)->chunk(), is only needed
> where the type isn’t already guaranteed; everywhere else the dispatch is
> sound by construction, with no runtime check.
>
> It’s intended as one proposal with two independent votes:
>
>   1. Scalar methods on guaranteed free receivers (the above). A pure
>     capability – it adds a way to call scalar operations and changes
>     nothing about untyped code. Proposed initial sets: a small curated Str
>     (trim/upper/lower/length + contains/startsWith/endsWith), Int
>     (abs/pow/clamp), and Float (round/ceil/floor/abs); bool deliberately
>     gets none (its operations are operators, not methods). The sets are
>     governed by explicit criteria and are the easiest thing to tune in
>     discussion.
>
>   2. Scalar-typed local variables (int $x = …;, scalar types only),
>     which additionally make a typed local a guaranteed receiver (string $s =
>     …; $s->trim()). This is the more contested half – it also carries the
>     “local type discipline” argument – so it’s a separate vote: a “no” here
>     ships the capability without typed locals.
>
> What I’m deliberately NOT doing, up front so it’s not a surprise:
>
>   • No method-call-result receivers ($this->getName()->trim()) – that
>     would rest on return-type covariance under inheritance; not worth the
>     surface.
>   • Int::abs/pow return int|float (they can overflow, as the global
>     functions do), so they’re honest terminals – they don’t chain.
>     (Int::clamp is the one initial int method provably :int for all inputs,
>     so it does chain.)
>   • No int|false typed locals – that’s a sentinel state, not a committed
>     type; ?T is supported, sentinel-unions are not.
>   • The backing classes are internal-only (NUL-prefixed name, like
>     anonymous classes): class_exists('Str') is false, no Reflection,
>     userland “class Str {}” can’t collide.
>
> Implementation status – this is built and tested, not a sketch:
>
>   • Scalar methods add zero new opcodes – the desugar emits an ordinary
>     static call, and the object method-call path is byte-for-byte unchanged.
>     (Typed locals add dedicated *_TYPED assignment opcodes, but the untyped
>     hot path stays byte-identical.)
>   • Performance (deterministic callgrind, release build): the untyped hot
>     path is byte-identical; the standard bench.php suite is +0.145%
>     instructions, entirely from predicted-not-taken branches in reference
>     opcodes only, with zero added cache misses or branch mispredictions. A
>     typed-local write benchmarks at ~0.79x the cost of a typed-property
>     write – a check the language already runs on every typed-property write
>     since 7.4.
>   • References (the objection that sank prior typed-locals attempts) are
>     enforced through every path – =&, by-ref params,
>     array/object/static-prop refs, yield, closure capture, $$name, extract,
>     $GLOBALS, global – via the existing typed-property reference machinery.
>     Leak-checked under stress.
>   • Correct under JIT in all three modes (interpreter, function, tracing
>     – differential byte-identical output). opcache SHM + file_cache
>     round-trip verified.
>   • BC impact, measured: an AST scan of the 1,000 most-downloaded
>     Packagist packages (173k+ files) found zero method-call sites with a
>     guaranteed-scalar receiver – i.e. zero call sites that change behaviour
>     (every such site is a fatal error today). Userland Str classes (incl.
>     Laravel’s Illuminate\Support\Str) coexist with the backing class, verified.
>
> Full write-up (RFC draft, plus the method-set, performance, and
> BC-impact analyses):
> https://github.com/kralmichal/php-src/tree/rfc/docs/scalar-object-methods-rfc
>
> Implementation branches (PHP 8.6-dev base):
>
> What I’d value discussing before I write the formal RFC:
>
>   1. Does the “compile-time-guaranteed receivers only” framing actually
>     resolve the loose-typing objection, or is there a hole I’m not seeing?
>   2. The method-set and naming is the most open part – is a small
>     curated, clean-slate set (distinct from the procedural names) the right
>     direction, or a non-starter? How should it relate to the existing
>     userland efforts in this space (e.g. Psl)?
>   3. Anything that would sink this before I invest in the full RFC.
>
> Thanks,
> Michal Kral

Hi Michal,

This is the part that gets me:

> requiring an explicit cast where the type isn’t already clear.

Explicit casting in PHP is dangerous:

(int) "123password" === 123

https://3v4l.org/IAF8W
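
The same behaviour as a small runnable sketch, alongside one stricter alternative for contrast. Using filter_var() here is only an illustration of a validating conversion; it is not something the proposal specifies.

```php
<?php
$input = "123password";

// An explicit cast silently keeps the leading digits and drops the rest.
$cast = (int) $input;

// FILTER_VALIDATE_INT rejects the same input instead of truncating it.
$validated = filter_var($input, FILTER_VALIDATE_INT);

// A clean numeric string passes validation and comes back as an int.
$validatedClean = filter_var("123", FILTER_VALIDATE_INT);

var_dump($cast, $validated, $validatedClean);
```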

— Rob