[PHP-DEV][RFC][UNDER DISCUSSION] Oniguruma maintenance end and future of mbregex (End of mbregex)

Hi, Internals

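I have decided to deprecate mbregex in 8.6 and drop it in 9.0.
So I would like to move the RFC to the Under Discussion phase.
https://wiki.php.net/rfc/eol-oniguruma
https://github.com/php/php-src/pull/21490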

Feel free to comment.

Regards
Yuya

--
---------------------------
Yuya Hamada (tekimen)
- https://tekitoh-memdhoi.info
- youkidearitai (tekimen) · GitHub
-----------------------------

Hi Yuya,

On 23/03/2026 01:34, youkidearitai wrote:

> I have decided to deprecate mbregex in 8.6 and drop it in 9.0.
> So I would like to move the RFC to the Under Discussion phase.
> https://wiki.php.net/rfc/eol-oniguruma
> https://github.com/php/php-src/pull/21490

Thanks for that. To be honest, I'm a bit confused and I'm not sure I understand what the RFC is proposing.

Also, what exactly is mb_onig?

Cheers
--
Matteo Beccati

As far as I understand, the mbstring extension internally uses a library
named Oniguruma, which provides regex support for mb_ereg* and a few
other functions. Oniguruma reached EOL almost a year ago. This RFC
proposes deprecating and subsequently removing all regex-related
functions from the mbstring extension, and dropping the dependency on
Oniguruma. Instead, Yuya offers a PIE extension named mb_onig/mb_onig,
which is the new home for those functions. To that end, he forked
Oniguruma to provide security fixes and Unicode version updates.

On Mon, 23 Mar 2026 at 19:57, <go.al.ni@gmail.com> wrote:

> As far as I understand, the mbstring extension internally uses a library
> named Oniguruma, which provides regex support for mb_ereg* and a few
> other functions. Oniguruma reached EOL almost a year ago. This RFC
> proposes deprecating and subsequently removing all regex-related
> functions from the mbstring extension, and dropping the dependency on
> Oniguruma. Instead, Yuya offers a PIE extension named mb_onig/mb_onig,
> which is the new home for those functions. To that end, he forked
> Oniguruma to provide security fixes and Unicode version updates.

Hi both,
Thank you for the feedback and the explanation.

I have revised the RFC to make it clearer.

In short, this RFC proposes dropping support for mbregex.
Please read it once more.

Regards
Yuya


On Mon, 23 Mar 2026, youkidearitai wrote:

> I have decided to deprecate mbregex in 8.6 and drop it in 9.0.
> So I would like to move the RFC to the Under Discussion phase.
> https://wiki.php.net/rfc/eol-oniguruma
> https://github.com/php/php-src/pull/21490
>
> Feel free to comment.

I am in favour of this, but I think it would be good to have another
think about the name of the PIE package.

Right now it is mb_onig, which does not include either mbstring or
mbregex, which is what I believe people would be looking for. Do users
need to care that the library is called oniguruma, or that onig is short
for that?

I think it would be useful to name it "mbregex/mbregex" or something
like that. I also think the description
(mb_onig/composer.json at main · youkidearitai/mb_onig · GitHub)
should probably include the main function names of what it
(re-)implements (such as mb_ereg); that would also make discovery
easier.

cheers,
Derick

On 23-3-2026 1:34, youkidearitai wrote:

> Hi, Internals
>
> I have decided to deprecate mbregex in 8.6 and drop it in 9.0.
> So I would like to move the RFC to the Under Discussion phase.
> https://wiki.php.net/rfc/eol-oniguruma
> https://github.com/php/php-src/pull/21490

Thank you for writing this RFC. I don't have a strong opinion either way. I fully understand that maintaining the Oniguruma library, while it was abandoned by the original project, is a huge and unenviable task.

Having said that, I am very curious what Ruby will be using going forward and if PHP could adopt a similar solution.
I also wonder if there are no other "blessed" forks of the Oniguruma library to which PHP could switch.
I believe this should be investigated and the results of this investigation should be added to the RFC to (potentially) strengthen the case for the current proposal, or, depending on the findings, it could be that the current proposal could be adjusted based on what this investigation throws up.

Secondly, I believe the RFC would benefit from a more detailed section about what PHP devs can do to mitigate the deprecation.
For example, if the only expected text encoding is UTF-8, people can use `preg_*()` functions with the `u` modifier instead of the `mb_ereg*()` functions.

I also think it is important to mention that the Symfony Mbstring [1] polyfill package does **NOT** polyfill the MB regex functionality, so it cannot be used as a replacement/alternative.

With this in mind, I also believe the impact analysis in the RFC should be expanded, as the MbString extension is widely used.

To support this, I've created a branch in the PHPCompatibility package [2] specifically for this deprecation and I have run the relevant checks over the Packagist Top 4000 (as of yesterday).

I've posted the ruleset used and the full results as a gist:
https://gist.github.com/jrfnl/bd0f66f1c185930427db4f093babf214

Summary of findings:

PHP CODE SNIFFER VIOLATION SOURCE SUMMARY
--------------------------------------------------------------------------------------
SOURCE                                                                           COUNT
--------------------------------------------------------------------------------------
PHPCompatibility.FunctionUse.RemovedFunctions.mb_splitDeprecated                    30
PHPCompatibility.FunctionUse.RemovedFunctions.mb_regex_encodingDeprecated           25
PHPCompatibility.FunctionUse.RemovedFunctions.mb_eregi_replaceDeprecated            20
PHPCompatibility.FunctionUse.RemovedFunctions.mb_ereg_replaceDeprecated             18
PHPCompatibility.FunctionUse.RemovedFunctions.mb_ereg_matchDeprecated               13
PHPCompatibility.FunctionUse.RemovedFunctions.mb_ereg_search_initDeprecated         10
PHPCompatibility.FunctionUse.RemovedFunctions.mb_ereg_search_regsDeprecated          9
PHPCompatibility.FunctionUse.RemovedFunctions.mb_ereg_replace_callbackDeprecated     6
PHPCompatibility.FunctionUse.RemovedFunctions.mb_ereg_search_getregsDeprecated       5
PHPCompatibility.FunctionUse.RemovedFunctions.mb_eregDeprecated                      4
PHPCompatibility.FunctionUse.RemovedFunctions.mb_ereg_search_setposDeprecated        4
PHPCompatibility.FunctionUse.RemovedFunctions.mb_eregiDeprecated                     2
PHPCompatibility.FunctionUse.RemovedFunctions.mb_ereg_search_posDeprecated           1
--------------------------------------------------------------------------------------
A TOTAL OF 147 SNIFF VIOLATIONS WERE FOUND IN 13 SOURCES
--------------------------------------------------------------------------------------

So, 147 occurrences in the Packagist top 4000 in total.

While this is lower than I would have expected, it should be remembered that most distributed packages will default to/require UTF-8 encoding and that code handling non-UTF-8 encodings, and therefore needing the Mb regex functionality, is mostly found in proprietary packages.

The PIE extension would help those packages.

Another potential alternative for those packages would be to convert all their data and code to a UTF-8 base, which will be a humongous project for most (and that deserves a mention in the RFC).

Hope this helps.

Smile,
Juliette

1: https://symfony.com/packages/polyfill-mbstring
2: https://github.com/PHPCompatibility/PHPCompatibility/commit/47ba8b691f82d13dcfe496549c1110d250e18a8c
3: https://gist.github.com/jrfnl/bd0f66f1c185930427db4f093babf214
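
A minimal sketch of the two mitigations mentioned above: switching to `preg_*()` with the `u` modifier for UTF-8 text, and converting legacy data to UTF-8 first. It is not taken from the RFC. The sample strings and variable names are made up, the file is assumed to be saved as UTF-8, and Shift_JIS only stands in for "some non-UTF-8 encoding".

```php
<?php
declare(strict_types=1);

// Hypothetical sample data; assumes this source file is saved as UTF-8.
$price = '価格は1200円です';

// mb_ereg('([0-9]+)円', $price, $m) could become
// (note that preg_* patterns need delimiters, mb_ereg* patterns do not):
$amount = preg_match('/([0-9]+)円/u', $price, $m) === 1 ? $m[1] : null;

// mb_ereg_replace('[0-9]+', 'N', $price) could become:
$masked = preg_replace('/[0-9]+/u', 'N', $price);

// mb_split('、', '東京、大阪、名古屋') could become:
$cities = preg_split('/、/u', '東京、大阪、名古屋');

// The `u` modifier only understands UTF-8, so data in another encoding
// has to be converted first. mb_convert_encoding() is part of mbstring
// itself, not of mbregex.
$legacy = mb_convert_encoding($price, 'SJIS', 'UTF-8'); // simulate legacy input
$utf8 = mb_convert_encoding($legacy, 'UTF-8', 'SJIS');
$sameAmount = preg_match('/([0-9]+)円/u', $utf8, $m2) === 1 ? $m2[1] : null;

var_dump($amount, $masked, $cities, $sameAmount);
```

This only covers simple patterns. Anything that relies on Oniguruma-specific syntax or options would need a closer look before it can move to PCRE.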

On Tue, 24 Mar 2026 at 19:45, Derick Rethans <derick@php.net> wrote:

> On Mon, 23 Mar 2026, youkidearitai wrote:
>
> > I have decided to deprecate mbregex in 8.6 and drop it in 9.0.
> > So I would like to move the RFC to the Under Discussion phase.
> > https://wiki.php.net/rfc/eol-oniguruma
> > https://github.com/php/php-src/pull/21490
> >
> > Feel free to comment.
>
> I am in favour of this, but I think it would be good to have another
> think about the name of the PIE package.
>
> Right now it is mb_onig, which does not include either mbstring or
> mbregex, which is what I believe people would be looking for. Do users
> need to care that the library is called oniguruma, or that onig is short
> for that?
>
> I think it would be useful to name it "mbregex/mbregex" or something
> like that. I also think the description
> (mb_onig/composer.json at main · youkidearitai/mb_onig · GitHub)
> should probably include the main function names of what it
> (re-)implements (such as mb_ereg); that would also make discovery
> easier.
>
> cheers,
> Derick

Hi, Derick

Thank you for your feedback.
Indeed, that seems like a good name.

For now, I have applied your feedback on the description.

I am still thinking about the right extension name.

Regards
Yuya


Hi, Juliette

Thank you very much for your gist.
I looked at it, and those packages do seem to depend on mbregex (Oniguruma).

On Tue, 24 Mar 2026 at 20:46, Juliette Reinders Folmer
<php-internals_nospam@adviesenzo.nl> wrote:

> Having said that, I am very curious what Ruby will be using going forward and if PHP could adopt a similar solution.
> I also wonder if there are no other "blessed" forks of the Oniguruma library to which PHP could switch.
> I believe this should be investigated and the results of this investigation should be added to the RFC to (potentially) strengthen the case for the current proposal, or, depending on the findings, it could be that the current proposal could be adjusted based on what this investigation throws up.

Indeed, Ruby uses Onigmo
(ruby/regexec.c at master · ruby/ruby · GitHub), which is a fork
of Oniguruma.
There are differences between Onigmo and Oniguruma.

I have added your feedback to the RFC.
I also quoted your gist results there. Please let me know if there is any problem.
Thank you again.

Regards
Yuya


On Thu, 26 Mar 2026 at 1:03, youkidearitai <youkidearitai@gmail.com> wrote:

> Hi, Juliette
>
> Thank you very much for your gist.
> I looked at it, and those packages do seem to depend on mbregex (Oniguruma).
>
> > Having said that, I am very curious what Ruby will be using going forward and if PHP could adopt a similar solution.
> > I also wonder if there are no other "blessed" forks of the Oniguruma library to which PHP could switch.
> > I believe this should be investigated and the results of this investigation should be added to the RFC to (potentially) strengthen the case for the current proposal, or, depending on the findings, it could be that the current proposal could be adjusted based on what this investigation throws up.
>
> Indeed, Ruby uses Onigmo
> (ruby/regexec.c at master · ruby/ruby · GitHub), which is a fork
> of Oniguruma.
> There are differences between Onigmo and Oniguruma.
>
> I have added your feedback to the RFC.
> I also quoted your gist results there. Please let me know if there is any problem.
> Thank you again.
>
> Regards
> Yuya

Hi, Internals

I would like to move to the Voting phase next week if there are no concerns.
I will send a reminder email next week and then start the vote next Friday (2026-04-10).

If you have any comments, feel free to share them.

Regards
Yuya
