Hi
On 2/8/25 20:52, Eugene Sidelnyk wrote:
> Maybe there could be another good feature to start with - the function to
> format bytes into a human-readable format (for debug purposes) to have
> pretty view of the size (like 1.5GB, or 20MB)
> […]
> This is too little thing for having separate composer library, and perhaps
> having it built-in would be better?
On a surface level, such a function would appear to be very simple, but the “human-readable format” bit alone already raises multiple questions, with the most notable one being: What is human-readable?
There are different languages in the world, and they all use different decimal separators and group digits differently. They possibly also have different rules with regard to whether or not a space is required between the scalar part and the unit. In fact, your example is already *incorrect* English, because English requires a space between the scalar part and the unit. And it should ideally be a non-breaking space.
So in English it would be:
- 1.5 GB
- 20 MB
In German we use the comma as the decimal separator (and also a space before the unit), so it would need to be:
- 1,5 GB
- 20 MB
Then there's also the question of whether to use the binary scale or the decimal scale. In other words: Should 1460 Bytes be rendered as 1.5 kB or as 1.4 KiB?
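To make the difference concrete, here is a minimal sketch of the arithmetic for both scales. It is only an illustration, not a proposed API, and rounding to one decimal digit is an assumption made for this example:

```php
<?php
// Illustration only: the same byte count on both scales.
$bytes = 1460;

// Decimal (SI) scale: 1 kB = 1000 bytes.
$decimal = round($bytes / 1000, 1) . ' kB';

// Binary (IEC) scale: 1 KiB = 1024 bytes.
$binary = round($bytes / 1024, 1) . ' KiB';

echo $decimal, "\n"; // 1.5 kB
echo $binary, "\n";  // 1.4 KiB
```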
Also: How many decimal digits should be printed? Should it even be a fixed number of decimal digits, or are we rather interested in a total number of significant digits? In more explicit terms:
For 2 *decimal* digits:
1234 Bytes -> 1.23 kB
12345 Bytes -> 12.35 kB
123456 Bytes -> 123.46 kB
For 3 *significant* digits:
1234 Bytes -> 1.23 kB
12345 Bytes -> 12.3 kB
123456 Bytes -> 123 kB
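As a rough sketch, the two strategies could look like this in userland. Both helper names are hypothetical and exist only for this example, and the significant-digits variant assumes a value of at least 1:

```php
<?php
// Hypothetical helpers for illustration; neither exists in PHP itself.

// Fixed number of digits after the decimal point.
function formatFixed(float $value, int $decimals): string
{
    return number_format($value, $decimals, '.', '');
}

// Fixed number of significant digits in total (assumes $value >= 1).
function formatSignificant(float $value, int $digits): string
{
    // Number of digits in front of the decimal point.
    $integerDigits = (int) floor(log10($value)) + 1;
    // Whatever is left of the budget goes after the decimal point.
    $decimals = max(0, $digits - $integerDigits);

    return number_format($value, $decimals, '.', '');
}

foreach ([1234, 12345, 123456] as $bytes) {
    $kB = $bytes / 1000;
    printf("%6d Bytes -> %s kB / %s kB\n", $bytes, formatFixed($kB, 2), formatSignificant($kB, 3));
}
//   1234 Bytes -> 1.23 kB / 1.23 kB
//  12345 Bytes -> 12.35 kB / 12.3 kB
// 123456 Bytes -> 123.46 kB / 123 kB
```

Note that even this small sketch has an open edge case: a value such as 999.6 rounds up to "1000" and thus ends up with four digits.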
Then there's the question of when the next unit should be used. Should the cut-off point be “there needs to be at least a 1 in front of the decimal point”? In some cases printing 0.9 GB might be preferable to 900 MB.
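One way to express that choice is a configurable threshold. The following is only a sketch under assumptions: the function name, the `$threshold` parameter, the decimal scale and the rounding to one decimal digit are all made up for this example:

```php
<?php
// Hypothetical helper; $threshold is the smallest scalar that is still
// acceptable in front of a unit before falling back to the next smaller one.
function scaleBytes(int $bytes, float $threshold = 1.0): string
{
    $units = ['B', 'kB', 'MB', 'GB', 'TB'];
    $index = count($units) - 1;

    // Walk down from the largest unit until the scalar reaches the threshold.
    while ($index > 0 && $bytes / (1000 ** $index) < $threshold) {
        $index--;
    }

    return round($bytes / (1000 ** $index), 1) . ' ' . $units[$index];
}

echo scaleBytes(900_000_000), "\n";      // 900 MB
echo scaleBytes(900_000_000, 0.9), "\n"; // 0.9 GB
```

Which threshold is the “right” default is exactly the kind of decision a built-in function would need to make for everyone.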
I could probably go on and find further questions, but I believe this already showcases how the devil is in the details - as with many RFCs that look great on a surface level.
In this specific case of usefully formatting numbers, I can recommend taking a look at the NumberFormatter class of ext/intl (see the NumberFormatter page in the PHP manual). The API is not particularly pretty, but it handles the complex details of “correctly formatting a float according to language rules”. AFAICT you would still need to append the correct unit (and divide the number of bytes) yourself, though.
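A minimal sketch of how that could look, assuming ext/intl is loaded. The decimal scale, the single fraction digit and the hardcoded unit are choices made for this example only:

```php
<?php
// Requires ext/intl. NumberFormatter only formats the scalar part;
// dividing the bytes and appending the unit is done by hand.
$bytes = 1_500_000_000;
$scalar = $bytes / 1_000_000_000; // decimal scale, assumed for this example

foreach (['en_US', 'de_DE'] as $locale) {
    $formatter = new NumberFormatter($locale, NumberFormatter::DECIMAL);
    $formatter->setAttribute(NumberFormatter::MAX_FRACTION_DIGITS, 1);

    // U+00A0 is the non-breaking space between the scalar part and the unit.
    echo $locale, ': ', $formatter->format($scalar), "\u{00A0}GB\n";
}
// en_US: 1.5 GB
// de_DE: 1,5 GB
```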
Best regards
Tim Düsterhus