On Tue, Sep 3, 2024, at 13:24, Philip Hofstetter wrote:
Hello,
As per my previous email to the list, I have now created the official RFC to deprecate calling json_serialize() on instances of classes marked with ZEND_ACC_NOT_SERIALIZABLE.
https://wiki.php.net/rfc/deprecate-json_encode-nonserializable
I have also created a PR with the implementation here:
https://github.com/php/php-src/pull/15724
I have considered other options, both constraining the implementation to just Generator and/or to add special cases for Generator (and maybe Iterator), but they either continue to keep the asymmetry between serialize() and json_encode() and/or are making things even more inconsistent.
Please tell me what you think, especially, if you agree that blanked-deprecating all of ZEND_ACC_NOT_SERIALIZABLE classes is acceptable BC-wise (after a bit of deliberation over the weekend, I think it is and most json-serializations of such marked classes are probably unintentional).
Thanks in advance for all comments
Philip
Hello Phillip.
I think it would be good to list these non-serializable objects in the RFC. It doesn’t have to be exhaustive, but the list includes throwables, weak-maps, weak-references, closures, fibers, etc. Some people may be relying on this behavior (and I’d be curious to know that use case myself).
I just grepped for @not-serializable in php-src stubs, which appears to be the only use of ZEND_ACC_NOT_SERIALIZABLE.
— Rob
Hello Rob,
I think it would be good to list these non-serializable objects in the RFC. It doesn’t have to be exhaustive, but the list includes throwables, weak-maps, weak-references, closures, fibers, etc. Some people may be relying on this behavior (and I’d be curious to know that use case myself).
Thank you for your note. As I went through to the code-base in order to compile the list for the RFC, it also occurred to me that all anonymous classes are marked as ZEND_ACC_NOT_SERIALIZABLE, but json_encode()-ing such classes is unproblematic and probably very much desired for user land classes.
I have updated the RFC and attached implementation to allow anonymous classes to be json_encode()’d, provided their parent class is not marked as ZEND_ACC_NOT_SERIALIZABLE.
Thank you very much
Philip
Hello,
Sharing my experience, I never use json_encode on objects but on arrays (that are both JSON objects or JSON arrays).
When I see an object implementing JsonSerializable, I think it is the wrong approach, and I am usually able to find a better way.
Maybe if we could go back in time, we would not allow json_encode on an object, except if it implemented JsonSerializable, but that ship sailed long ago.
To your proposal, I think the BC break would be pretty big and I don’t see a way it would pass.
On https://www.php.net/json_encode we can read:
If a value to be serialized is an object, then by default only publicly visible properties will be included.
So that’s the expected behaviour.
Yes, you can say that encoding as JSON is just “another serialization method”, but the default method in PHP, using json_encode()/json_decode(), is not symmetrical when using objects.
And here lies the difference with serialize()/unserialize(), as this one aims to be symmetrical, and where it can’t be, it has a way to stop the serialization.
I am happy with the current way it works, getting an empty JSON object if there are no public properties for a Generator or Closure.
And I don’t think having an error for them would improve the language in a meaningful way, and the BC break is not worth it.
Regards,
Alex
···
As per my previous email to the list, I have now created the official RFC to deprecate calling json_serialize() on instances of classes marked with ZEND_ACC_NOT_SERIALIZABLE.
https://wiki.php.net/rfc/deprecate-json_encode-nonserializable
On Thu, Sep 5, 2024, at 10:55, Alexandru Pătrănescu wrote:
On Tue, Sep 3, 2024 at 2:27 PM Philip Hofstetter <phofstetter@sensational.ch> wrote:
Hello,
As per my previous email to the list, I have now created the official RFC to deprecate calling json_serialize() on instances of classes marked with ZEND_ACC_NOT_SERIALIZABLE.
https://wiki.php.net/rfc/deprecate-json_encode-nonserializable
Sharing my experience, I never use json_encode on objects but on arrays (that are both JSON objects or JSON arrays).
When I see an object implementing JsonSerializable, I think it is the wrong approach, and I am usually able to find a better way.
Maybe if we could go back in time, we would not allow json_encode on an object, except if it implemented JsonSerializable, but that ship sailed long ago.
To your proposal, I think the BC break would be pretty big and I don’t see a way it would pass.
On https://www.php.net/json_encode we can read:
If a value to be serialized is an object, then by default only publicly visible properties will be included.
So that’s the expected behaviour.
Yes, you can say that encoding as JSON is just “another serialization method”, but the default method in PHP, using json_encode()/json_decode(), is not symmetrical when using objects.
And here lies the difference with serialize()/unserialize(), as this one aims to be symmetrical, and where it can’t be, it has a way to stop the serialization.
I am happy with the current way it works, getting an empty JSON object if there are no public properties for a Generator or Closure.
And I don’t think having an error for them would improve the language in a meaningful way, and the BC break is not worth it.
Regards,
Alex
To add to this, we apparently use json_encode at work to serialize custom exceptions, which appears to work. This RFC would break that, I think.
— Rob
Hello Rob,
To add to this, we apparently use json_encode at work to serialize custom exceptions, which appears to work. This RFC would break that, I think.
Exception is not marked as ZEND_ACC_NOT_SERIALIZABLE and would not be affected.
Philip
On Thu, 5 Sep 2024, at 17:03, John Coggeshall wrote:
I would suggest we take a step back from this and look at it with a bit more of a wider lens. It seems to me that this would be a good place to have an attribute (e.g. #[NotSerializable]
) that could be defined for any class (with ZEND_ACC_NOT_SERIALIZABLE
being automatically given this attribute)? It just seems to be a more holistic approach that makes sense, rather than basing it on internal engine stuff and/or limiting it to internal objects.
It’s really the opposite: if we had an attribute, it would be used to toggle the existing internal flag.
ZEND_ACC_NOT_SERIALIZABLE
is just one of a long list of class, property and method flags which the engine uses internally. You can see the full list here: https://heap.space/xref/php-src/Zend/zend_compile.h?r=58aa6fc8#212
You’ll notice that includes things like “final” and “private”, which refer to straight-forward syntax in the language, as well as completely internal details like “preloaded”. It also includes ZEND_ACC_ALLOW_DYNAMIC_PROPERTIES, which is the flag set if you add the #[AllowDynamicProperties] attribute on a class: https://heap.space/xref/php-src/Zend/zend_attributes.c?r=5a18279b#validate_allow_dynamic_properties
So if we had a #[NotSerializable] attribute (which I agree might be a good idea) it would be implemented by setting the ZEND_ACC_NOT_SERIALIZABLE flag on a class definition, and
if this RFC passes, it would automatically apply to json_encode() as well as serialize().
Regards,
–
Rowan Tommins
[IMSoP]
So if we had a #[NotSerializable] attribute (which I agree might be a good idea) it would be implemented by setting the ZEND_ACC_NOT_SERIALIZABLE flag on a class definition, and
if this RFC passes, it would automatically apply to json_encode() as well as serialize().
My mistake, I haven’t familiarized myself with some of the more recent engine changes but this all makes sense. My point was more to ask the question “Would it be reasonable to expand the scope of this RFC to introduce a #[NotSerializable] attribute along with the rest of the proposal?”
Coogle
Hello Philip,
Thank you for your RFC, which is a great effort to improve serialization in PHP. Generators are indeed super useful and working with them should not be confusing.
I regret that I missed your previous emails, but I would like to suggest that you consider the following concerns:
- The original serialization and json_encode are not symmetric nor interchangeable by definition, because JSON doesn’t preserve protected properties and class names. Hence, I don’t see them to be “inconsistent” as they were not designed to be consistent.
- The reason why some objects are not serializable is because they couldn’t be fully functioning after deserialization. But this is not a barrier for JSON encoding, which serializing public properties only. Current JSON encoding has a simple, straight forward logic, not tied to serialization in any way.
You have suggested to cast a non-serializable objects to array as a workaround. Yet, array is the most popular type to JSON in PHP. To me, an ideal move would be make json_encode to iterate any iterators from the input structure. Well, this is a different topic.
Best regards,
Vasilii
···
As per my previous email to the list, I have now created the official RFC to deprecate calling json_serialize() on instances of classes marked with ZEND_ACC_NOT_SERIALIZABLE.
https://wiki.php.net/rfc/deprecate-json_encode-nonserializable
I have also created a PR with the implementation here:
https://github.com/php/php-src/pull/15724
I have considered other options, both constraining the implementation to just Generator and/or to add special cases for Generator (and maybe Iterator), but they either continue to keep the asymmetry between serialize() and json_encode() and/or are making things even more inconsistent.
Please tell me what you think, especially, if you agree that blanked-deprecating all of ZEND_ACC_NOT_SERIALIZABLE classes is acceptable BC-wise (after a bit of deliberation over the weekend, I think it is and most json-serializations of such marked classes are probably unintentional).
Thanks in advance for all comments
Philip
Hi,
An attribute adds very little value. It doesn’t add new capability, because you can achieve the same effect with a serialiser that throws unconditionally; it is just a nicer syntax. People generally don’t bother making their classes unserialisable unless they have a good reason; having an attribute won’t really change that.
The core problem is that objects are json-serialisable by default, although most of them are not supposed to be json-serialised. Taking a second step back, if we really want to solve the issue, one should:
- for internal classes, determine which ones could be json-serialisable (at least, stdClass); for all other classes,
json_encode(...)
shall be disabled (after a deprecation period);
- for user-defined classes, the user shall opt into json-serialisability, either by extending a json-serialisable class, or by using an {attribute / magic method / interface} (chose your bikeshed colour).
—Claude
···
As per my previous email to the list, I have now created the official RFC to deprecate calling json_serialize() on instances of classes marked with ZEND_ACC_NOT_SERIALIZABLE.
I would suggest we take a step back from this and look at it with a bit more of a wider lens. It seems to me that this would be a good place to have an attribute (e.g. #[NotSerializable]
) that could be defined for any class (with ZEND_ACC_NOT_SERIALIZABLE
being automatically given this attribute)? It just seems to be a more holistic approach that makes sense, rather than basing it on internal engine stuff and/or limiting it to internal objects.
Coogle
On Fri, Sep 6, 2024, at 20:07, Claude Pache wrote:
Le 5 sept. 2024 à 18:03, John Coggeshall john@coggeshall.org a écrit :
As per my previous email to the list, I have now created the official RFC to deprecate calling json_serialize() on instances of classes marked with ZEND_ACC_NOT_SERIALIZABLE.
I would suggest we take a step back from this and look at it with a bit more of a wider lens. It seems to me that this would be a good place to have an attribute (e.g. #[NotSerializable]
) that could be defined for any class (with ZEND_ACC_NOT_SERIALIZABLE
being automatically given this attribute)? It just seems to be a more holistic approach that makes sense, rather than basing it on internal engine stuff and/or limiting it to internal objects.
Coogle
Hi,
An attribute adds very little value. It doesn’t add new capability, because you can achieve the same effect with a serialiser that throws unconditionally; it is just a nicer syntax. People generally don’t bother making their classes unserialisable unless they have a good reason; having an attribute won’t really change that.
You also need to ensure that you make it final, as someone could extend your class and make your borked serializer work.
The core problem is that objects are json-serialisable by default, although most of them are not supposed to be json-serialised. Taking a second step back, if we really want to solve the issue, one should:
-
for internal classes, determine which ones could be json-serialisable (at least, stdClass); for all other classes, json_encode(...)
shall be disabled (after a deprecation period);
-
for user-defined classes, the user shall opt into json-serialisability, either by extending a json-serialisable class, or by using an {attribute / magic method / interface} (chose your bikeshed colour).
—Claude
I also agree with this to a point: there is the https://www.php.net/manual/en/class.jsonserializable.php interface, after all.
— Rob
Hi
Am 2024-09-05 18:03, schrieb John Coggeshall:
I would suggest we take a step back from this and look at it with a bit more of a wider lens. It seems to me that this would be a good place to have an attribute (e.g. #[NotSerializable] ) that could be defined for any class (with ZEND_ACC_NOT_SERIALIZABLE being automatically given this attribute)? It just seems to be a more holistic approach that makes sense, rather than basing it on internal engine stuff and/or limiting it to internal objects.
FWIW There already is such an RFC: PHP: rfc:not_serializable
Discussion thread is here: [RFC][Discussion] NotSerializable attribute - Externals
Best regards
Tim Düsterhus