Hello PHP Internals!
I believe I’ve found an acceptable solution to the issue that’s plagued SORT_REGULAR since the dawn of time.
Fundamentally, all sorting algorithms require transitive comparisons (if A ≤ B and B ≤ C, then A ≤ C). However, PHP’s SORT_REGULAR comparison function violates transitivity when mixing numeric strings, non-numeric strings, and numbers, so the result of a sort is unpredictable and can depend on the order of the input.
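To make the violation concrete, here is a minimal sketch in C of a toy comparison that follows the PHP 8 rule for two strings (both numeric: compare as numbers; otherwise: compare bytes). It is not the real zendi_smart_strcmp(); the helper names are made up for illustration, and strtod() is only an approximation of PHP's numeric-string detection.

```c
#include <stdlib.h>
#include <string.h>

/* Toy numeric-string check: strtod() must consume the whole string.
 * PHP's real rules differ in details (whitespace, hex, and so on). */
static int is_numeric_str(const char *s, double *out)
{
    char *end;
    *out = strtod(s, &end);
    return end != s && *end == '\0';
}

/* Toy model of a PHP 8-style loose comparison of two strings:
 * numeric vs numeric -> compare values, anything else -> compare bytes. */
static int loose_cmp(const char *a, const char *b)
{
    double x, y;
    if (is_numeric_str(a, &x) && is_numeric_str(b, &y))
        return (x > y) - (x < y);
    int r = strcmp(a, b);
    return (r > 0) - (r < 0);
}

/* Returns 1 when a < b, b < c and c < a all hold, i.e. the three
 * values form a cycle and no consistent sorted order exists. */
static int is_cycle(const char *a, const char *b, const char *c)
{
    return loose_cmp(a, b) < 0 && loose_cmp(b, c) < 0 && loose_cmp(c, a) < 0;
}
```

With this model, is_cycle("10", "1a", "2") returns 1: "10" < "1a" and "1a" < "2" as byte strings, yet "2" < "10" as numbers.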
The Solution:
Add a transitive_compare_mode flag to executor_globals (TLS) that signals zendi_smart_strcmp() to enforce consistent ordering during SORT_REGULAR operations:
- Orders numeric strings consistently relative to non-numeric strings
- Eliminates circular comparisons
- Maintains PHP 8+ semantics (numeric types < numeric strings < non-numeric strings)
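The ordering described above can be sketched as a total order over strings. This is a simplified model in C, not the code from the PR: it assumes every numeric string sorts before every non-numeric string, compares numeric strings by value, and compares everything else byte-wise. The function names and the strtod()-based numeric check are assumptions made for the example.

```c
#include <math.h>
#include <stdlib.h>
#include <string.h>

/* Toy numeric-string check; rejects inf/nan so values always compare cleanly. */
static int parse_numeric(const char *s, double *out)
{
    char *end;
    *out = strtod(s, &end);
    return end != s && *end == '\0' && isfinite(*out);
}

/* qsort() comparator implementing the toy total order:
 * class first (numeric before non-numeric), then value, then bytes. */
static int transitive_cmp(const void *pa, const void *pb)
{
    const char *a = *(const char *const *)pa;
    const char *b = *(const char *const *)pb;
    double x = 0, y = 0;
    int na = parse_numeric(a, &x);
    int nb = parse_numeric(b, &y);

    if (na != nb)
        return nb - na;               /* numeric (1) sorts before non-numeric (0) */
    if (na && x != y)
        return (x > y) - (x < y);     /* both numeric: compare by value */
    return strcmp(a, b);              /* byte-wise, also breaks ties like "1" vs "1.0" */
}

static void sort_strings(const char **v, size_t n)
{
    qsort(v, n, sizeof *v, transitive_cmp);
}
```

Because each string is compared on the same key (class, then value, then bytes), no cycle is possible: sorting "1a", "10", "abc", "2" in any input order gives "2", "10", "1a", "abc".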
Implementation:
- PR: https://github.com/php/php-src/pull/20315
- Uses save/restore pattern for reentrancy safety
- New test coverage included
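The save/restore pattern mentioned above might look roughly like this. It is a sketch with hypothetical names: a plain static variable stands in for the executor_globals field, and an integer sort stands in for the SORT_REGULAR code path.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the flag; in the PR it lives in executor_globals. */
static bool transitive_compare_mode = false;

/* Counts comparisons made while the flag was set (for illustration only). */
static unsigned long strict_comparisons = 0;

static int cmp_ints(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    if (transitive_compare_mode)
        strict_comparisons++;
    return (x > y) - (x < y);
}

static void sort_with_mode(int *v, size_t n)
{
    bool saved = transitive_compare_mode;  /* remember the caller's setting */
    transitive_compare_mode = true;
    qsort(v, n, sizeof *v, cmp_ints);
    transitive_compare_mode = saved;       /* restore it, do not reset to false */
}
```

Restoring the saved value instead of unconditionally clearing the flag is what keeps nesting safe: if a sort is started while an outer sort already has the mode enabled, the inner call leaves the outer call's setting intact when it returns.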
Historical Context:
Raghubansh Kumar documented this issue in 2007, creating tests with the note “(OK to fail as result is unpredectable)”. Nikita Popov’s 2019 RFC improved string-to-number comparison semantics but didn’t eliminate the transitivity violation. This fix completes that work.
Test Results:
Four variation11 tests currently fail, as expected, but their outputs are now deterministic and, in my opinion, more “sane” than in both the 2007 and 2019 versions:
https://gist.github.com/jmarble/957a096cb2bf25b577de47449305723f
ABI Considerations:
This adds a field to _zend_executor_globals, which is technically an ABI break and therefore appropriate for PHP 8.6. However, I think the practical impact is minimal, since most extensions access executor_globals through the EG() macro (which abstracts the struct layout), not via direct struct manipulation.
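One way to reason about the layout question is to check whether existing fields move, since code compiled against older headers has member offsets fixed at compile time. The sketch below uses made-up structs and placeholder field names, not the real _zend_executor_globals, and it does not assume anything about where the PR actually places the new field.

```c
#include <stddef.h>

/* Hypothetical layouts; field names are placeholders, not real php-src members. */
struct globals_old      { void *field_a; int field_b; };
struct globals_appended { void *field_a; int field_b; unsigned char new_flag; };
struct globals_inserted { void *field_a; unsigned char new_flag; int field_b; };

/* Appending a field leaves the offsets of the existing members unchanged. */
static int appended_keeps_offsets(void)
{
    return offsetof(struct globals_appended, field_a) == offsetof(struct globals_old, field_a)
        && offsetof(struct globals_appended, field_b) == offsetof(struct globals_old, field_b);
}

/* Inserting a field before an existing member shifts that member. */
static int inserted_moves_field_b(void)
{
    return offsetof(struct globals_inserted, field_b) != offsetof(struct globals_old, field_b);
}
```

In this toy model, a field added at the end is invisible to code that only reads the older members, while a field added in the middle changes where those members live.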
I’d appreciate review and feedback on this approach. This is a long-standing correctness issue.
Thank you for your time and consideration!
- Jason