Hi all,
Per Seifeddine's suggestion to keep this out of the karma-request thread, I'm opening a pre-RFC discussion for scalar object methods -- calling a small, curated set of methods directly on scalar values, e.g. $str->trim(), (3)->pow(2). There's a complete, tested implementation and a full write-up (links below); I'd like to surface the strongest objections before I write the formal RFC.
Disclosure: I built this with an AI assistant (Claude) as a tool. The design and the decisions are mine, and I've independently verified the engine behaviour, performance, JIT correctness, leak-freedom and the BC scan. Flagging it up front for transparency.
I know "methods on primitives" was proposed and declined before (Nikita's 2014 "Methods on primitive types in PHP"). The reason it stalled was loose typing: $x->trim() would need a runtime type check and would behave differently depending on what $x held. This proposal sidesteps that entirely, by generalizing the resolution Nikita himself suggested in that thread -- requiring an explicit cast where the type isn't already clear.
The idea: dispatch only on receivers the compiler already knows are scalar. The method call is rewritten at compile time to an ordinary call into an internal backing class -- no runtime type dispatch, no new opcode, the object method-call path is untouched. A receiver qualifies only if its type is guaranteed syntactically: a literal, a (string)/(int) cast, a concatenation/interpolation, a non-nullable scalar-typed property, or a call with a declared non-nullable scalar return type. An untyped $x->trim() is left exactly as today (Error). Crucially, dispatch never depends on optimizer-inferred types, so behaviour is identical with and without opcache.
echo " Hello World "->trim()->upper(); // "HELLO WORLD"
echo (3)->pow(2); // 9
echo "hello"->length()->pow(2); // 25 -- length():int chains into the int methods
So the cast Nikita proposed, ((string) $num)->chunk(), is only needed where the type isn't already guaranteed; everywhere else the dispatch is sound by construction, with no runtime check.
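To make the desugaring concrete, here is a rough userland sketch of what the chained calls above could lower to on stock PHP 8. The class names StrSketch and IntSketch are hypothetical stand-ins (the real backing classes are internal-only), and passing the receiver as the first argument, as well as byte-based strlen() for length(), are assumptions for illustration rather than a description of the branch.

```php
<?php
// Hypothetical userland stand-ins for the internal backing classes.
final class StrSketch
{
    public static function trim(string $s): string { return trim($s); }
    public static function upper(string $s): string { return strtoupper($s); }
    // Assumption: byte length via strlen(); the real method may differ.
    public static function length(string $s): int { return strlen($s); }
}

final class IntSketch
{
    // Like the global pow(): int when the result fits, float otherwise.
    public static function pow(int $base, int $exp): int|float { return $base ** $exp; }
}

// " Hello World "->trim()->upper() as nested ordinary static calls:
echo StrSketch::upper(StrSketch::trim(" Hello World ")), "\n"; // HELLO WORLD

// "hello"->length()->pow(2): length() is declared :int, so the chain continues.
echo IntSketch::pow(StrSketch::length("hello"), 2), "\n"; // 25
```

Because the receiver type is fixed by syntax, each step resolves to a known static call at compile time, and nothing has to be decided at runtime.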
It's intended as one proposal with two independent votes:
1. Scalar methods on guaranteed free receivers (the above). A pure capability -- it adds a way to call scalar operations and changes nothing about untyped code. Proposed initial sets: a small curated Str (trim/upper/lower/length + contains/startsWith/endsWith), Int (abs/pow/clamp), and Float (round/ceil/floor/abs); bool deliberately gets none (its operations are operators, not methods). The sets are governed by explicit criteria and are the easiest thing to tune in discussion.
2. Scalar-typed local variables (int $x = ...;, scalar types only), which additionally make a typed local a guaranteed receiver (string $s = ...; $s->trim()). This is the more contested half -- it also carries the "local type discipline" argument -- so it's a separate vote: a "no" here ships the capability without typed locals.
What I'm deliberately NOT doing, up front so it's not a surprise:
- No method-call-result receivers ($this->getName()->trim()) -- that would rest on return-type covariance under inheritance; not worth the surface.
- Int::abs/pow return int|float (they can overflow, as the global functions do), so they're honest terminals -- they don't chain. (Int::clamp is the one initial int method provably :int for all inputs, so it does chain.)
- No int|false typed locals -- that's a sentinel state, not a committed type; ?T is supported, sentinel-unions are not.
- The backing classes are internal-only (NUL-prefixed name, like anonymous classes): class_exists('Str') is false, no Reflection, userland "class Str {}" can't collide.
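The int|float point can be checked against the existing global functions on stock PHP 8. The snippet below shows abs() and pow() spilling into float on overflow, next to a hypothetical clamp_sketch() helper (my name, assuming $min <= $max) whose result is always one of its three int arguments.

```php
<?php
// Existing global functions: an int result that does not fit becomes a float.
var_dump(is_float(abs(PHP_INT_MIN)));    // bool(true): -PHP_INT_MIN is not representable as int
var_dump(is_float(pow(PHP_INT_MAX, 2))); // bool(true)
var_dump(pow(3, 2));                     // int(9)

// Hypothetical helper, assuming $min <= $max. The result is always one of the
// three int inputs, so the declared :int return type holds for every input.
function clamp_sketch(int $value, int $min, int $max): int
{
    return max($min, min($max, $value));
}

var_dump(clamp_sketch(15, 0, 10)); // int(10)
var_dump(clamp_sketch(-5, 0, 10)); // int(0)
```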
Implementation status -- this is built and tested, not a sketch:
- Scalar methods add zero new opcodes -- the desugar emits an ordinary static call, and the object method-call path is byte-for-byte unchanged. (Typed locals add dedicated *_TYPED assignment opcodes, but the untyped hot path stays byte-identical.)
- Performance (deterministic callgrind, release build): the untyped hot path is byte-identical; the standard bench.php suite executes +0.145% instructions, entirely from predicted-not-taken branches in reference opcodes, with zero added cache misses or branch mispredictions. A typed-local write benchmarks at ~0.79x the cost of a typed-property write, and it is the same kind of check the language has run on every typed-property write since 7.4.
- References (the objection that sank prior typed-locals attempts) are enforced through every path -- =&, by-ref params, array/object/static-prop refs, yield, closure capture, $$name, extract, $GLOBALS, global -- via the existing typed-property reference machinery. Leak-checked under stress.
- Correct with and without JIT: the interpreter, function JIT and tracing JIT produce byte-identical output in differential testing. opcache SHM + file_cache round-trip verified.
- BC impact, measured: an AST scan of the 1,000 most-downloaded Packagist packages (173k+ files) found zero method-call sites with a guaranteed-scalar receiver -- i.e. zero call sites that change behaviour (every such site is a fatal error today). Userland Str classes (incl. Laravel's Illuminate\Support\Str) coexist with the backing class, verified.
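For readers less familiar with the typed-property reference machinery mentioned in the references bullet above, this is how it already behaves on stock PHP (7.4+) for properties. The Counter class is only an illustration; the typed-locals branch reuses this mechanism for locals, which the snippet does not show.

```php
<?php
final class Counter
{
    public int $n = 0;
}

$c = new Counter();
$ref = &$c->n;   // the reference now carries the property's int constraint

$ref = 5;        // fine: int assigned through the reference

try {
    $ref = "abc"; // non-numeric string cannot satisfy int
} catch (TypeError $e) {
    echo get_class($e), "\n"; // TypeError
}

var_dump($c->n); // int(5): the rejected write left the value untouched
```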
Full write-up (RFC draft, plus the method-set, performance, and BC-impact analyses): scalar-object-methods-rfc on the rfc/docs branch of kralmichal/php-src (GitHub)
Implementation branches (PHP 8.6-dev base):
- Primary (scalar methods): kralmichal/php-src, branch rfc/scalar-methods (GitHub)
- Secondary (typed locals, stacked): kralmichal/php-src, branch rfc/typed-locals (GitHub)
What I'd value discussing before I write the formal RFC:
1. Does the "compile-time-guaranteed receivers only" framing actually resolve the loose-typing objection, or is there a hole I'm not seeing?
2. The method set and naming are the most open part -- is a small, curated, clean-slate set (distinct from the procedural names) the right direction, or a non-starter? How should it relate to the existing userland efforts in this space (e.g. Psl)?
3. Anything that would sink this before I invest in the full RFC.
Thanks,
Michal Kral