[PHP-DEV] [RFC] Remove \0 from default trim() character mask

LamentXU · March 15, 2026, 6:20am

Dear all,

I am sending this to introduce my new RFC: https://wiki.php.net/RFC/dont_trim_NUL

Quick summary:

Currently, PHP’s trim functions strip the NUL byte (\0) by default, treating it alongside spaces, tabs, and newlines. This creates a highly surprising edge case.

Because \0 is semantically a control character or a vital part of a binary payload rather than a typographical whitespace character, casually using trim() to clean up trailing newlines can silently corrupt binary streams or cryptographic hashes by stripping legitimate NUL bytes. Whitespace characters are intended for typographical spacing and formatting (e.g., spaces, newlines, tabs).

Also, almost every mainstream programming language other than PHP does not trim NUL characters (Python, Go, Rust, JS; even the isspace() function in glibc does not classify NUL as whitespace). It seems reasonable to expect the same here.
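To make the difference in defaults concrete, here is a minimal Python sketch. The mask string is a hand transcription of the default documented for PHP's trim() (space, \n, \r, \t, \v, NUL) and should be treated as an assumption; `php_like_trim` is a hypothetical helper name, not an existing API.

```python
# Default character mask documented for PHP's trim(): space, \n, \r, \t, \v, NUL.
# Transcribed by hand for illustration; check the PHP manual before relying on it.
PHP_DEFAULT_MASK = " \n\r\t\v\x00"


def php_like_trim(s: str, characters: str = PHP_DEFAULT_MASK) -> str:
    """Hypothetical helper that mimics PHP trim() using Python's str.strip()."""
    return s.strip(characters)


sample = "\x00 payload \n\x00"

# Python's own default only removes whitespace, so the outer NUL bytes stop it
# from trimming anything at all.
python_default = sample.strip()

# The PHP-style mask removes the NUL bytes together with the whitespace.
php_default = php_like_trim(sample)

# The mask proposed by the RFC: the same set without NUL.
rfc_proposed = php_like_trim(sample, " \n\r\t\v")

print(repr(python_default))
print(repr(php_default))
print(repr(rfc_proposed))
```

With this input, only the mask that contains NUL reaches the inner whitespace; the other two calls return the string unchanged.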

This RFC proposes removing \0 (ASCII 0) from the default character mask. I recognize this introduces a backward compatibility break, and therefore I would love to hear your thoughts, feedback, and any concerns regarding the BC impact before moving forward.

Cheers,
Weilin Du

LamentXU · March 24, 2026, 5:11pm

Hi all,

I believe the RFC “Don’t trim NUL bytes by default” is ready to move to the voting phase. I intend to open the voting period soon (typically 7 days).

RFC page: https://wiki.php.net/rfc/dont_trim_nul

This RFC proposes to remove \0 (NUL byte) from the default character mask of trim(), ltrim(), and rtrim(), to align with common expectations and avoid unintended trimming of legitimate NUL-containing strings. Please tell me if there are any final comments or concerns. Thanks.

Best regards,
Weilin Du

P.S. I am not sure this email will be sent to the correct thread, so here is the thread link in case it is not: [RFC] Remove \0 from default trim() character mask - Externals

Levi_Morrison · March 24, 2026, 6:29pm

I agree that \0 is a control byte and not whitespace, so it probably
shouldn't be included in any of the trim functions. However, at this
stage in PHP's lifecycle I am not sure if we should fix it.

There hasn't been much discussion, so dear internals: are you simply busy,
unopinionated, or what?

Crell · March 24, 2026, 6:48pm

On Tue, Mar 24, 2026, at 1:29 PM, Levi Morrison wrote:

I agree that \0 is a control byte and not whitespace, so it probably
shouldn't be included in any of the trim functions. However, at this
stage in PHP's lifecycle I am not sure if we should fix it.

There hasn't been much discussion, so dear internals: are simply busy,
un-opinionated, or what?

No strong feeling on the matter, will probably Abstain. I don't think it's something that I've ever run into, since I tend to know very well if my strings are bytes or characters and use them appropriately.

--Larry Garfield

Ilia · March 24, 2026, 7:31pm

That seems a bit dangerous: when a non-stripped \0 is concatenated with other strings, which is quite common in string operations, it can lead to unpredictable behavior and possibly even security issues.

You make a good point about other languages. The concern is that, while not trimming \0 is the expectation there, the different solutions for sanitizing/handling \0 are well known and understood in those languages. In PHP the assumption is that \0 is removed, and changing this assumption breaks a lot of things. The trim functions already accept a second parameter with a character list, so it is already possible to exclude \0 from being trimmed.

Just my 2c.

Ilia Alshanetsky
Technologist, CTO, Entrepreneur
E: ilia@ilia.ws
T: @iliaa
B: http://ilia.ws

Andrew_F · March 24, 2026, 7:35pm

On Mar 24, 2026, at 11:29, Levi Morrison <levi.morrison@datadoghq.com> wrote:

I agree that \0 is a control byte and not whitespace, so it probably
shouldn't be included in any of the trim functions. However, at this
stage in PHP's lifecycle I am not sure if we should fix it.

There hasn't been much discussion, so dear internals: are simply busy,
un-opinionated, or what?

For what little it's worth, I can't imagine any practical situation where this change would be helpful. Using trim() or its variants on binary data is likely to result in that data being corrupted, and this will continue to be the case even if NUL is not trimmed. Changing the current behavior is likely to break userspace code which depends on the current behavior. If, for some reason, users find it useful to trim specific whitespace characters from binary data, they can do so by passing a $characters mask to the function to fit their needs, rather than changing the function for everyone.

-- Andrew F

user6 · March 24, 2026, 8:09pm

Hello.
Using trim() for binary data sounds like a mistake. There’s nothing special about whitespace or any other characters in binary data, so why use trim() on it at all? If someone is using trim() on binary data, it might be a deliberate choice. For example, trimming the zero byte might be the sole reason. That’s why I disagree with the RFC’s “Secondly” point.

Java’s String.trim() treats characters with code points less than or equal to \u0020 as whitespace. So there’s no “surprising case”, at least for Java developers, and that’s why I disagree with the “Thirdly” point.
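The Java rule mentioned above can be sketched in a few lines of Python. `java_like_trim` is a hypothetical helper written for this illustration; it only mirrors the documented "code point <= U+0020" rule and is not Java's implementation.

```python
def java_like_trim(s: str) -> str:
    """Strip leading/trailing characters whose code point is <= U+0020.

    Hypothetical helper mirroring the rule documented for Java's String.trim().
    """
    start, end = 0, len(s)
    while start < end and ord(s[start]) <= 0x20:
        start += 1
    while end > start and ord(s[end - 1]) <= 0x20:
        end -= 1
    return s[start:end]


sample = "\x00\t hello \x1f\x00"

# NUL, tab, space and U+001F are all <= U+0020, so every one of them is removed.
java_result = java_like_trim(sample)

# Python's default strip() stops at the outer NUL bytes and removes nothing.
python_result = sample.strip()

print(repr(java_result))
print(repr(python_result))
```

Under this rule NUL is trimmed as a side effect of the range check, not because it is listed as whitespace.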

However, I agree with the “Firstly” point. But for semantic purists we have the mb_trim() function.

Removing \0 from trim() makes code vulnerable to null byte injection attacks [1]. I have a strong feeling that the zero byte was added to trim() for exactly this reason.

[1] https://owasp.org/www-community/attacks/Embedding_Null_Code

LamentXU · March 25, 2026, 6:15am

I think there are sound opinions on both sides, so I will still let the vote begin and see what the majority thinks. In short:

Reasons for supporting
- semantically, NUL is not whitespace
- the majority of other popular languages don’t trim NUL

Reasons for not supporting
- Java does trim NUL
- Security issues in existing code bases
- mb_trim() and the second parameter already exist for people who want to prevent NUL from being trimmed
- Unnecessary change at this stage of the life-cycle

This is quite a minor change (and that’s why people haven’t talked about this before, since few people run into the case of trimming NUL).

My opinion is, first, that trimming is indeed for whitespace. I know Java does trim NULs, but it doesn’t do so explicitly: it removes every char with a code point <= U+0020 (and I think most people are using strip() instead, which doesn’t remove NUL). Besides, almost every other standard or language doesn’t trim NUL. So for the sake of aligning with popular standards and languages, it makes sense to avoid trimming NUL.

Security and life-cycle concerns are good points. No longer trimming NUL may enable a sort of path hacking, as Ilia mentioned, while PHP trimming \0 is already well known among PHP devs who have run into this case before.

We have a second parameter to alter the trimmed character set, but I do think most people would expect NUL not to be trimmed by default, in line with other languages.

Robert_Humphries · March 25, 2026, 8:23am

Reasons for supporting
- semantically NUL is not whitespaces
- the majority of other popular languages don't trim NUL
Reasons for not supporting
- Java do trim NUL
- Security issues in existing code base
- Already has mb_trim() and the second parameter instead to prevent trimming NUL if people want
- Unnecessary changes in the life-cycle

I think it would be useful if there were some examples of when you
would want to be using `trim` but _not_ trim NULL bytes. The examples
in the RFC currently show the expected change in behaviour; which is
good - but you could also achieve the same effect by not running
`trim` in the first place, as the only character in the examples that
is expected to be removed before or after the change is the NULL byte
(even in the example with a new line followed by null bytes, after the
change then the string would be identical to before the `trim`).

Given that most voters seem to be not strongly against, but also
see no benefit in changing the status quo, some examples of how the
change would be useful in practice might help.

~ Robert

Marco_Pivetta · March 25, 2026, 10:02am

I’d vote against this proposal: \0 being considered one of the stripped characters is now a downstream assumption, and this ends up being a BC break with little to no advantages.

From a semantic perspective, \n, \t and \r are also “control characters” in other contexts (not the C world).

Marco Pivetta

https://mastodon.social/@ocramius

https://ocramius.github.io/

Markus_Podar · March 25, 2026, 2:47pm

Hi,

On Wed, Mar 25, 2026 at 11:04 AM Marco Pivetta <ocramius@gmail.com> wrote:

On Tue, 24 Mar 2026 at 19:29, Levi Morrison <levi.morrison@datadoghq.com> wrote:

There hasn’t been much discussion, so dear internals: are simply busy,
un-opinionated, or what?

I’d vote against this proposal: \0 being considered one of the stripped characters is now a downstream assumption, and this ends up being a BC break with little to no advantages.

From a semantic perspective, \n, \t and \r are also “control characters” in other contexts (not the C world).

Thanks for putting this into clear words: this is exactly my fear too.

Since the very first message of this thread arrived here, my thoughts have been along the lines that this is outright dangerous to do: there are “decades” of downstream assumptions about how this works, and it’s IMPOSSIBLE to properly vet this, as it often operates on the (potentially untrusted) input level.

And, AFAICS, a perfectly valid workaround is possible by just providing a custom 2nd arg.

IM(H)O this should never come to a vote, doing this shouldn’t even be considered.

sincerely,

Markus

Rowan_Tommins_IMSoP · March 25, 2026, 3:38pm

On 25 March 2026 08:23:38 GMT, Robert Humphries <contact@developer-rob.co.uk> wrote:

I think it would be useful if there were some examples of when you
would want to be using `trim` but _not_ trim NULL bytes. The examples
in the RFC currently show the expected change in behaviour; which is
good - but you could also achieve the same effect by not running
`trim` in the first place, as the only character in the examples that
is expected to be removed before or after the change is the NULL byte
(even in the example with a new line followed by null bytes, after the
change then the string would be identical to before the `trim`).

I second this request, and would go further: the examples should show a situation where you *do* want to trim other "unusual" characters like "\v" and "\f".

The RFC talks about corrupting binary data, but wouldn't trimming *any* bytes from that data cause corruption? If you know the data is padded by a *specific* character, then passing that character to trim() explicitly is the *only* safe way to use it.
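A small Python sketch of this point, using a made-up record: the 4-byte payload and the NUL padding are assumptions chosen for illustration, and the masks are hand transcriptions of the PHP default with and without NUL.

```python
# A hypothetical 4-byte record that legitimately starts with 0x20 and ends
# with 0x0A, followed by four NUL bytes of padding.
payload = b"\x20\x01\x02\x0a"
padded = payload + b"\x00" * 4

# Mask without NUL (what the RFC proposes as the default): the padding stays,
# and the leading 0x20 byte of the record is lost.
without_nul = padded.strip(b" \n\r\t\v")

# Mask with NUL (the default documented for PHP today): the padding goes, but
# so do the record's first and last bytes.
with_nul = padded.strip(b" \n\r\t\v\x00")

# Naming the padding byte and the side explicitly keeps the record intact.
# Even this is only safe if the record itself can never end in 0x00.
explicit = padded.rstrip(b"\x00")

print(without_nul)
print(with_nul)
print(explicit)
```

Both default-style masks damage the record, just in different ways; only the call that names the known padding byte recovers the payload.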

PS: Please can everyone remember to start your reply *below* the text you're replying to, and *trim* parts that are not directly relevant.

Rowan Tommins
[IMSoP]

Sara_Golemon · March 25, 2026, 7:20pm

On Sun, Mar 15, 2026 at 1:24 AM LamentXU <lamentxu@163.com> wrote:

I am sending this to introduce my new RFC: https://wiki.php.net/RFC/dont_trim_NUL

I’m not a fan of this for the reasons that several others have already stated. While \0 may not technically be whitespace, you have 30 years of scripts expecting it to be trimmed, and undoing that potentially (though perhaps not actually) opens up new security vulnerabilities for dubious gain. I hope nobody is depending on this to remove all null bytes, because of course it won’t, but I fail to see how a text-oriented function (the notion of whitespace makes this not binary-oriented) should be in favor of preserving nulls under any circumstance. If anything, functions like this which are specifically text-oriented should (perhaps, but for performance and historical raisins probably not) be elevated to throw on discovering null bytes anywhere in the string.

-Sara

Hans_Krentel · March 29, 2026, 7:06am

On Wednesday 25 March 2026 07:15:53 (+01:00), LamentXU wrote:

> I think there are sound opinions in both side so I will still let the vote begin and see what the majority thinks. To be short,
>
> Reasons for supporting
> - semantically NUL is not whitespaces
> - the majority of other popular languages don't trim NUL
> Reasons for not supporting
> - Java do trim NUL
> - Security issues in existing code base
> - Already has mb_trim() and the second parameter instead to prevent trimming NUL if people want
> - Unnecessary changes in the life-cycle
>
> This is a quite minor change (and thats why people don't talk about this before, since little people run into the case of trimming NUL).

This change is not minor and most of all removing NUL from PHP's trim() default cutset is a security issue.

In your first RFC you have concluded that trim() is about trimming whitespace, and as isspace(\f) returns true, it is whitespace and should be added to the default cutset string value (second parameter of trim(), optional).

You underlined that with a comparison across different programming languages to manifest the impression that trim() is about whitespace, and especially for casual use, this is then in the spot of usability / locality of expected behaviour.

While it is technically correct that isspace(\f) returns true, and \f is commonly understood to be in the space character class and often in use of other scripting languages like Python for their cutters or trimmers, this does not change what trim() in PHP actually is, despite what we want it to be. It most importantly does not automatically make such a change small or straight forward or safe. It may make it appear that way, but unfortunately, that view is without precision glasses.

What remains correct IMHO is the case of casually using trim() as a whitespace trimmer. When used that way, the trim() function in PHP requires some extra work: looking up the default value of the second parameter to find out whether it is applicable, or whether the second parameter needs to be given a value of its own for that use.

As the trim() function has two invocations, the user has to pick the right one for the job. That may be conceived as extra-work by those who are not aware that a function can have multiple invocations, e.g. new users or users new to programming. This _is_ a real point.

You have suggested, that if the default value is composed entirely of characters of the C space character class, then the function is easier to use as a whitespace trimmer. Under this pretext (whitespace trimmer), I think this remains correct.

Now for the parts, if you allow me, where this falls apart:

The first misconception, as I understand it, is the classification of the trim() function as a whitespace trimmer. This is wrong; the correct classification of the trim() function in PHP is a string trimmer. This distinction is furthermore important because the trim() function is a binary-safe function and strings in PHP are arrays of bytes.

If we look more closely, we can see that with the default value, both in stable and unstable (master) PHP, it is composed of *both* space and control characters. When we apply the technique with the isspace() function to classify the spaces within the default set, we get a high number, it is either 5 out of 6 (stable) or 6 out of 7 (unstable).

However we can't just pick only one character classifier function. If we use the same technique and use iscntrl() for the counter-check, we get a similar high, if not exactly the same numbers: there are 5 out of 6 (stable) or 6 out of 7 (unstable) control characters in the default cutset.

This confirms that while trim() without the second parameter can be used as a whitespace trimmer, it is *equally* usable as a control character trimmer. Hence the characterization as a whitespace trimmer remains correct, but limited: it is not exclusively a whitespace trimmer.
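The counts above can be reproduced with a short Python sketch. The two classifiers are hand-written stand-ins for isspace() and iscntrl() in the C locale, and the "unstable" cutset assumes, as described in this thread, that master adds form feed to the stable default.

```python
# Stand-ins for the C-locale classifiers, written out by hand.
C_SPACE = set(" \t\n\v\f\r")          # characters for which isspace() is true


def is_space(ch: str) -> bool:
    return ch in C_SPACE


def is_cntrl(ch: str) -> bool:
    # iscntrl() in the C locale: code points 0..31 and 127.
    return ord(ch) < 0x20 or ord(ch) == 0x7F


STABLE_CUTSET = " \n\r\t\v\x00"           # default documented for stable PHP
UNSTABLE_CUTSET = STABLE_CUTSET + "\f"    # assumed master default per this thread


def classify(cutset: str) -> tuple[int, int, int]:
    """Return (space count, control count, total) for a cutset."""
    spaces = sum(1 for ch in cutset if is_space(ch))
    controls = sum(1 for ch in cutset if is_cntrl(ch))
    return spaces, controls, len(cutset)


stable_counts = classify(STABLE_CUTSET)
unstable_counts = classify(UNSTABLE_CUTSET)

print(stable_counts)
print(unstable_counts)
```

In both cutsets, NUL is the only member that is not a space and the space character is the only member that is not a control character, which is why the two counts match.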

The earlier discovery that \f was missing and NUL was superfluous in stable, under the pretext of a space-trimming function, could also have been resolved by correcting the understanding: trim() is not an exclusive whitespace-trimming function at all. An analysis of the character classes done with due diligence would have shown this. It was not done, or those who did it have not shared the outline of their solution here on the list (unless my mail client has eaten some of the messages again).

The second misconception so far in both RFCs lies in the comparison with other programming languages. While this suffices as a first explorative test for comparison purposes, it also was not done extensively. It looked exclusively at default values, without taking into account whether, when different values are provided, the function itself is an exclusive whitespace trimmer or an ordinary string trimmer, and furthermore whether binary safety applies to the function or not.

If we take Python as an example with its cut family of functions, you have correctly analyzed that the default value is entirely composed of all characters of the space character class in the C default locale. However, it is only the default value. There is no problem in using the default value and adding the NUL character to it as an additional character to have it in the cutset.

That Python and PHP have a default value for what might be perceived as the same family of functions, despite the different names, could also have led to the conclusion that different programming languages use a) different names, b) different defaults and c) different implementations, resulting in d) overall different behaviour. This is why a programming language documents its standard library functions, so that users can pick the right function invocation for the job. This is normally taken as a given. However, as the argument is and was to change a default value, not understanding how the function fundamentally works (different invocations) and which checks are required (and not optional) when changing the language itself, e.g. by changing a default value, is a shortcoming in both RFC texts.

Now, Python is not the only other programming language; it is just the one I used here to illustrate the problem of arguing with defaults. We have already shown that the function (the object under discussion) is prone to misclassification during the discussion; now the invocations the function has are being misclassified as well.

Obviously it is easy to fall for that. This is certainly the reason why programming languages for their standard libraries try to have as little ambiguity as possible with their standard functions so that everything can stay, or in case of a correction needed, resolve in clarity.

I'd like to illustrate that with another programming language, Go:

The string trimmer and the space trimmer are two different functions. This is a good resolution of the problem you brought up, because now we can reason with clarity about whether the one or the other has a bug. What we can immediately see in the Go standard library is that the optionality of the second parameter is gone: the string trimmer requires passing the cutset next to the string, while the whitespace trimmer has one argument only.

The ambiguity of the PHP trim() function, which is only a string trimmer (as in Python) whose second argument is optional only because it came later (this was a design decision, the function has two different invocations, of which the second came later - this is important to understand), disappears completely when there are two functions with all their parameters mandatory, as in Golang.
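The split described above can be sketched in Python. The two function names are hypothetical and merely echo Go's strings.Trim(s, cutset) and strings.TrimSpace(s); Python's notion of whitespace is used as an approximation and is not identical to Go's.

```python
def trim(s: str, cutset: str) -> str:
    """String trimmer: the cutset is mandatory, so there is no default to guess."""
    return s.strip(cutset)


def trim_space(s: str) -> str:
    """Whitespace trimmer: takes no cutset and removes whitespace only.

    Relies on Python's str.strip(), which also covers Unicode whitespace.
    """
    return s.strip()


# The whitespace trimmer never touches NUL.
kept_nul = trim_space("\x00 value \x00")

# Removing NUL is an explicit, visible decision at the call site.
dropped_nul = trim("\x00 value \x00", "\x00 ")

# Unicode whitespace such as NO-BREAK SPACE and EM SPACE is handled as well.
unicode_trimmed = trim_space("\u00a0 value \u2003")

print(repr(kept_nul))
print(repr(dropped_nul))
print(repr(unicode_trimmed))
```

With two functions, a question like "should NUL be trimmed?" has an unambiguous answer for each of them.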

When we do the cross language comparison - and despite the limitations such comparisons always have - and work actively with such limitations, we can resolve the request to ease the casual use of the trim() function as a whitespace trimming function also by finding out that a function in PHP is missing and should be added:

trim_space()

however, this has not been mentioned so far in the discussion. IMHO that is a shortcoming of the discussions, especially if any of those who voted yes on the earlier RFC actually bought into one of the two key arguments: whitespace trimming -or- language comparison.

With all that found out, let's explain the security issue we face with the proposal to remove NUL under the new light.

As illustrated, while it is not entirely wrong that trim() is a space trimming function, it is equally technically correct that trim() is a control character trimming function.

While so far the argument has been used to add a control character to the default cutset (form feed, at code point 12, a C0 control character), the nature of that change suggested that practically there was not much need to discuss it. Hence the understanding across the whole group is likely very different, without this causing enough disturbance to endanger the vote. We can also see that in the vote of 100% yes by 25 individuals.

Removing a character from the default cutset has far more severe consequences by nature. As the trim() function is undoubtedly a string trimmer (taking it as a whitespace trimmer is, as we have shown, a mistake or an imprecision at best), the use of PHP trim() is so heavily undermined, if not sabotaged, that leading/trailing NUL characters remain within the string, while the use of trim() in good faith in PHP requires that these be removed.

This is not just an annoyance like removing more control characters than previously expected, which came from adding form feed to the default cutset and already violated the history rule of observable behaviour of the function (and, in the context of a programming language, undermined the rule of good faith in that language). It has severe and dangerous consequences, brought about solely by misunderstanding the nature of a function, due to the incomplete comparison during character class analysis and the very incomplete language comparison that has been done so far.

While it is not too late to prevent bringing this RFC to vote or if it is brought to vote, to reject it per vote, this should also show a problem with the earlier change that is currently in unstable PHP:

Users of the programming language still face an issue, while not as grave as a security issue like NUL byte injection, they remain unable when switching to master to find out about the changes of the default cutset if they intentionally use it. Obviously they use the PHP language's default cutset, however that has changed. The language however is silent about that. This is highly unexpected because the second parameter is optional, and therefore if there would be a useful other default, it would be provided as string, and not by leaving it out.

I therefore suggest to at least provide a new global string constant with the value of the original default characters so that when preparing scripts for the unstable and then later next release version of PHP a more or less simple search and replace operation can be done replacing the use of the trim family of functions in script code in the first invocation with the second invocation using this new constant.

Additionally I'd suggest that this new global constant is backported to PHP 8.5 so that current stable code can be immunized against the change that has been voted for and for which we must assume that it will come in next PHP at the time of writing.
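As a rough illustration of the constant idea, here is a Python sketch. The constant name is hypothetical, and the "new" default assumes, as described in this thread, that master adds form feed to the old set.

```python
# Hypothetical constant name, used only for this sketch.
TRIM_LEGACY_CHARACTERS = " \n\r\t\v\x00"

# Assumed new default on master per this thread: the old set plus form feed.
NEW_DEFAULT_CHARACTERS = TRIM_LEGACY_CHARACTERS + "\f"


def trim(s: str, characters: str = NEW_DEFAULT_CHARACTERS) -> str:
    """Stand-in for a trim() whose default cutset has changed."""
    return s.strip(characters)


text = "\fpage one\n"

# First invocation: silently picks up whatever the current default is.
first_invocation = trim(text)

# Second invocation: pins the original characters, so behaviour stays the same
# across versions and the intent is visible (and searchable) in the code.
second_invocation = trim(text, TRIM_LEGACY_CHARACTERS)

print(repr(first_invocation))
print(repr(second_invocation))
```

A search and replace from the first form to the second is then a mechanical change.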

Because of the issues you, as a new contributor, have raised, specifically with regard to the casual use of the trim family of functions, I'd suggest introducing the "trim_space()" function with a single string argument: a dedicated whitespace trimmer capable of cutting UTF-8 encoded whitespace characters, so that, with the year of the horse, we can trim whitespace universally, and not only limited to the C locale.

My 2 cents,

-- hakre

LamentXU · March 30, 2026, 9:46am

Thank you for ideas! AFAIK here are my thoughts on this

I now realize that the trim functions are not meant exclusively for removing whitespace from strings, which resolves my main concern that NUL is not a ‘space’ character.

On top of that, we still have security concerns (and more) for this change, and therefore I would like to withdraw this RFC.

Moreover, the proposed new function ‘trim_whitespace()’ seems to be more reasonable for people to use to strip whitespaces instead of casually using the trim family, which would obviously be a better solution in this case.

Thank you for your attention and suggestions!

– Weilin Du

user6 · March 30, 2026, 11:54am

> And yes, the security thing is the blocker on NUL, but if trim_space() idea
> clicked, and given your interest in topic and the bit precision you've
> shown, may I ask you if you have some interest in implementing it?

No need to invent trim_space() anew. It already exists:

Hans_Krentel · March 30, 2026, 11:30am

On Monday 30 March 2026 11:46:11 (+02:00), LamentXU wrote:

> Thank you for ideas! AFAIK here are my thoughts on this
>
> Now I realized the point that, those trim functions are not supposed to be used to remove whitespaces in strings, which solves my main concern that NUL is not a 'space' character.
>
> On top of that, we still have security concerns (and more) for this change, **and therefore I would like to withdraw this RFC.**
>
> Moreover, the proposed new function 'trim_whitespace()' seems to be more reasonable for people to use to strip whitespaces instead of casually using the trim family, which would obviously be a better solution in this case.
>
> Thank you for your attention and suggestions!

Thank you!

And yes, the security thing is the blocker on NUL, but if the trim_space() idea clicked, and given your interest in the topic and the bit precision you've shown, may I ask whether you have some interest in implementing it? Take your time for the answer. I am just sharing that I think your spotting of the "missing \f" shows attention to detail, and this is certainly both required and trim-able when there is a need to delve into UTF-8 for the new idea. (I borrowed it from Golang, as you wrote in both of your suggestions that we should do more language comparison, which was also very insightful for me, so thank you for that, too!)

Apart from that, what are your thoughts about/for a constant preserving the original characters so in case the first change poses a problem for language users, they can find it and go on with it? -- That was a question raised shortly with the first RFC, in how far the change will introduce problems (albeit no one wants this), but because \f is that old it is hard to answer that question technically I think (even if we scan tons of code-bases for usages), so probably my thinking was that we should not leave users alone in case it is an issue for them and with a standard constant Tim or so can do some compiler string magic probably even. But that might be a bit my fantasy, we would first need to have a constant(?) before it could be discussed with him/other PHP compiler people. I'm open for other ideas/thoughts as well, the constant was a bit of an after-thought when writing, and it's good when you stay in the driver seat, I can only try to give some explanation when I see it or you ask and offer help with explorative testing. You own it, you make it out!

Thank you again!

-- hakre