[PHP-DEV] [RFC] Modern Compression (zstd, brotli)

Hey,

Kévin Dunglas and I would like to present this RFC proposing inclusion of the zstandard and brotli PECL extensions into core, with the main goal of making them more broadly available.

https://wiki.php.net/rfc/modern_compression

Thanks for your consideration, we're looking forward to hearing your feedback!

Best,
Jordi

--
Jordi Boggiano
@seldaek - https://seld.be

Hi

On 2025-02-18 11:19, Jordi Boggiano wrote:

> Thanks for your consideration, we're looking forward to hearing your feedback!

The RFC specifies that:

> The zstd implementation includes a few global functions as well as namespaced ones

My question is: Why? Please also have a look at https://github.com/php/policies/blob/main/coding-standards-and-naming.rst#namespaces. Instead of creating a top-level namespace for both Brotli and Zstd it would probably make sense to create a new “Compression” extension that could also include a new and improved gzip (and bz2) API as a follow-up. The new ext/random could probably serve as an API example.

> function compress_add( resource $context, […]

Please do not add new resources. It would probably also make sense to consider making this a proper OO API instead of resource objects that are processed by free-standing functions.

-----

Both parts combined could then result in something like:

namespace Compression\Zstd;

// Hypothetical API sketch; none of these classes exist yet.
class Compressor implements \Compression\Compressor { }

$file = fopen('file.txt', 'r');
$file2 = fopen('file.txt.zstd', 'w');
$compressor = new Compressor();
while (!feof($file)) {
     // fread() requires an explicit chunk length.
     fwrite($file2, $compressor->push(fread($file, 8192)));
}
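
For comparison, the same push-style loop can be tried today on top of the incremental functions that ext/zlib already ships. The following is only a sketch, assuming PHP 8.0+ with ext/zlib: `deflate_init()` and `deflate_add()` are existing functions, while the class name `GzipStreamCompressor` and its `push()`/`finish()` methods are made up for illustration and are not part of the RFC.

```php
<?php
declare(strict_types=1);

// Hypothetical wrapper class; only deflate_init()/deflate_add() are real ext/zlib functions.
final class GzipStreamCompressor
{
    private \DeflateContext $context;

    public function __construct(int $level = 6)
    {
        $context = deflate_init(ZLIB_ENCODING_GZIP, ['level' => $level]);
        if ($context === false) {
            throw new \RuntimeException('Unable to initialise the deflate context.');
        }
        $this->context = $context;
    }

    // Feed one chunk; returns whatever compressed bytes are ready (possibly none).
    public function push(string $chunk): string
    {
        $out = deflate_add($this->context, $chunk, ZLIB_NO_FLUSH);
        if ($out === false) {
            throw new \RuntimeException('Compression failed.');
        }
        return $out;
    }

    // Flush the remaining buffered data and write the gzip trailer.
    public function finish(): string
    {
        $out = deflate_add($this->context, '', ZLIB_FINISH);
        if ($out === false) {
            throw new \RuntimeException('Finalising the stream failed.');
        }
        return $out;
    }
}

// Usage: in-memory streams stand in for the files of the example above.
$payload = str_repeat("hello compression\n", 1000);
$in = fopen('php://memory', 'w+');
fwrite($in, $payload);
rewind($in);
$out = fopen('php://memory', 'w+');

$compressor = new GzipStreamCompressor();
while (!feof($in)) {
    $chunk = fread($in, 8192);
    if ($chunk === false) {
        break;
    }
    fwrite($out, $compressor->push($chunk));
}
fwrite($out, $compressor->finish());

rewind($out);
$compressed = stream_get_contents($out);
```

One thing the sketch makes visible is that a streaming API needs an explicit final step to flush the trailer, which a single `push()` method cannot express on its own.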

Best regards
Tim Düsterhus

Hey Tim,

On 18.02.2025 12:30, Tim Düsterhus wrote:

> My question is: Why? Please also have a look at https://github.com/php/policies/blob/main/coding-standards-and-naming.rst#namespaces. Instead of creating a top-level namespace for both Brotli and Zstd it would probably make sense to create a new “Compression” extension that could also include a new and improved gzip (and bz2) API as a follow-up. The new ext/random could probably serve as an API example.

The main reason why is that we were mostly trying to keep this simple, so the RFC is simply what is currently in the extensions. But I fully agree with you, this doesn't look like the cleanest API, and it might be a good opportunity to clean things up.

I guess we'll go back to the drawing board and try to come up with an API proposal that fits in better.

Best,
Jordi

--
Jordi Boggiano
@seldaek - https://seld.be

Hi

On 2025-02-18 13:46, Jordi Boggiano wrote:

> > My question is: Why? Please also have a look at https://github.com/php/policies/blob/main/coding-standards-and-naming.rst#namespaces. Instead of creating a top-level namespace for both Brotli and Zstd it would probably make sense to create a new “Compression” extension that could also include a new and improved gzip (and bz2) API as a follow-up. The new ext/random could probably serve as an API example.

> The main reason why is that we were mostly trying to keep this simple, so the RFC is simply what is currently in the extensions. But I fully agree with you, this doesn't look like the cleanest API, and it might be a good opportunity to clean things up.

I assumed as much, but indeed a “blessed” implementation in core should be held to a higher standard and ideally make use of the latest and greatest language features to create an API that is a joy to use. As an example, the Brotli compression mode should likely be

     namespace Compression\Brotli;

     enum Mode {
         case Generic;
         case Text;
         case Font;
     }

or something like that.
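
One way to read this suggestion is as an int-backed enum, so that a value can still be handed to the underlying C library. This is only a sketch, assuming PHP 8.1+: the namespace and enum name come from the example above and do not exist in PHP, and the backing values mirror libbrotli's `BROTLI_MODE_GENERIC`, `BROTLI_MODE_TEXT`, and `BROTLI_MODE_FONT` (0, 1, 2). The `describe()` helper is made up for illustration.

```php
<?php
declare(strict_types=1);

namespace Compression\Brotli;

// Hypothetical enum; the backing values follow libbrotli's BrotliEncoderMode.
enum Mode: int
{
    case Generic = 0;
    case Text = 1;
    case Font = 2;
}

// Illustrative helper: match must cover every case, so a newly added
// mode would raise an error here instead of being silently ignored.
function describe(Mode $mode): string
{
    return match ($mode) {
        Mode::Generic => 'no assumptions about the input',
        Mode::Text => 'UTF-8 text input',
        Mode::Font => 'WOFF 2.0 font data',
    };
}

// Unknown integers are rejected instead of being passed through.
$valid = Mode::tryFrom(1);
$invalid = Mode::tryFrom(7);
```

Compared with plain integer constants, a typed `Mode` parameter cannot receive an out-of-range value such as 7.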

> I guess we'll go back to the drawing board and try to come up with an API proposal that fits in better.

I would also suggest critically questioning whether all the features provided by the existing extensions are necessary. As an example, I'm not sure if the implicit output compression INI settings are actually necessary nowadays. The common webservers / FastCGI gateways will perform dynamic compression of FastCGI responses by default, making the implementation redundant and possibly requiring the user to configure compression in two places. And modern frameworks likely want to have strict control over the exact output as well.

Best regards
Tim Düsterhus

Hey,

On Tue, 18 Feb 2025, 13:04 Tim Düsterhus, <tim@bastelstu.be> wrote:

> Hi
>
> On 2025-02-18 13:46, Jordi Boggiano wrote:
>
> > > My question is: Why? Please also have a look at:
> > > https://github.com/php/policies/blob/main/coding-standards-and-naming.rst#namespaces.
> > > Instead of creating a top-level namespace for both Brotli and Zstd it
> > > would probably make sense to create a new “Compression” extension that
I’ve tried to reply twice, but you guys are too quick! :)

@Jordi, great initiative on modern stuff.

  1. Apart from asset compression, can you think of other practical use cases where Brotli and the like would be useful? That would help in understanding the bigger picture beyond webserver compression of assets and similar.

  2. Introducing this into core: in 2025 there are more junior devs (or simply people relying on abstractions) than ever before. That is the nature of tech evolution.

This isn’t a bad thing, because it means more users, but the negative side effect is that people are generally allergic to installing extensions, because it’s not a simple “turnkey solution”. As a result, people tend to avoid the latest things, or just use another programming language that ships with them.

Thus including this in core is more important and relevant these days than ever before.

@Tim - I really like your thinking of making a combined “standard” compression extension.

It will make maintenance and upgrades more consolidated, and give a more consistent API across existing and upcoming compression libs, for example functional vs OOP and the OOP layout/structure of things.

It also means fewer extensions to install as time goes on and we add new ones to keep PHP up to date, which loops back to my previous point about why it matters to avoid giving end users more extensions to install.

Having spent many years in PHP-FIG, and lots of effort trying to build one “standard” for lots of implementations that differ, I just want to point out that I think we should avoid trying to make a standard design for all the compression drivers and features, and instead agree that things (enums?) will and should differ from driver to driver, and that’s okay.

Maybe this is a bit of a new side topic from Jordi’s Brotli proposal, but I think it’s still relevant to keep in scope for the next steps.

Thanks,
Paul


Hi

On 2025-02-18 14:22, Paul Dragoonis wrote:

> Having spent many years in PHP-FIG and lots of effort trying to build one
> "standard" for lots of implementations that differ. I just want to point
> out that I think we should avoid trying to make a standard design for all
> the Compression drivers and features and instead agree that things (Enums?)
> will and should differ from driver to driver and that's okay.

Given that all compression algorithms work in a “put source data in, get compressed data out” fashion, I believe it is reasonable to have an interface specifying that to allow for proper pluggability in the output pipeline:

     // Simplified: assumes getHeader() yields one coding token per entry.
     $accepted = $request->getHeader('accept-encoding');
     if (in_array('br', $accepted)) { // "br" is the Brotli content coding
         $compressor = new \Compression\Brotli\Compressor(\Compression\Brotli\Mode::Text);
     } elseif (in_array('zstd', $accepted)) {
         $compressor = new \Compression\Zstd\Compressor();
     } elseif (in_array('gzip', $accepted)) {
         $compressor = new \Compression\Gzip\Compressor(level: 6);
     } else {
         $compressor = new NullCompressor();
     }

     echo $compressor->compress($response->getBody());
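
The pluggability argument can be tried out with what PHP already ships. The following is only a sketch, assuming PHP 8.0+ with ext/zlib: the `Compressor` interface, the two classes, and both helper functions are hypothetical names chosen for illustration, and only the gzip branch is implemented because Brotli and Zstandard are not available in core. Note that the `Accept-Encoding` token for Brotli is `br`, and that real headers carry comma-separated tokens with optional `q` weights, which the parser below handles in a simplified way.

```php
<?php
declare(strict_types=1);

// Hypothetical interface: data in, compressed data out.
interface Compressor
{
    public function compress(string $data): string;
}

final class GzipCompressor implements Compressor
{
    public function __construct(private int $level = 6) {}

    public function compress(string $data): string
    {
        $out = gzencode($data, $this->level);
        if ($out === false) {
            throw new \RuntimeException('gzip compression failed.');
        }
        return $out;
    }
}

// Fallback that leaves the payload untouched.
final class NullCompressor implements Compressor
{
    public function compress(string $data): string
    {
        return $data;
    }
}

// Simplified Accept-Encoding parser: returns lower-cased tokens whose
// q-value is above zero. Ordering by weight is deliberately left out.
function acceptedEncodings(string $header): array
{
    $tokens = [];
    foreach (explode(',', $header) as $part) {
        $pieces = array_map('trim', explode(';', $part));
        $token = strtolower($pieces[0]);
        if ($token === '') {
            continue;
        }
        $q = 1.0;
        foreach (array_slice($pieces, 1) as $param) {
            if (str_starts_with(strtolower($param), 'q=')) {
                $q = (float) substr($param, 2);
            }
        }
        if ($q > 0.0) {
            $tokens[] = $token;
        }
    }
    return $tokens;
}

function chooseCompressor(string $acceptEncoding): Compressor
{
    return in_array('gzip', acceptedEncodings($acceptEncoding), true)
        ? new GzipCompressor()
        : new NullCompressor();
}

$body = str_repeat('<p>Hello</p>', 200);
$encoded = chooseCompressor('br, gzip;q=0.8')->compress($body);
```

A caller only depends on `compress()`, so adding a Brotli or Zstandard implementation later would not change the output pipeline.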

I trust Jordi to come up with a reasonable API design.

Best regards
Tim Düsterhus

Hi there,

Here is an alternative proposal we just discussed with Jordi:

First, reduce the scope of our RFC to simply add new stream wrappers for Zstandard and Brotli similar to those already provided for zlib: https://www.php.net/manual/en/wrappers.compression.php

PECL extensions for Brotli and Zstandard already provide these wrappers, and Symfony AssetMapper uses them when available.

To keep things moving quickly, we won’t be adding any new functions or classes (perhaps just constants or enums for context options).

To use these new formats, we’ll need to use the low-level file/stream manipulation functions. It will be possible to provide userland libraries with a more attractive API.
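
As a rough illustration of what “low-level file/stream manipulation functions” means in practice, here is a sketch using the `compress.zlib://` wrapper that PHP already provides (assuming ext/zlib is loaded). The working assumption is that a Zstandard or Brotli wrapper would be used the same way with a different scheme prefix; the exact prefixes are not specified in this thread, so none are shown.

```php
<?php
declare(strict_types=1);

$payload = str_repeat("stream wrappers are low-level but composable\n", 500);
$path = tempnam(sys_get_temp_dir(), 'cmp');

// Writing through the wrapper compresses transparently (gzip format).
$writer = fopen('compress.zlib://' . $path, 'wb');
fwrite($writer, $payload);
fclose($writer);

// Reading the file directly shows the compressed bytes on disk.
$raw = file_get_contents($path);

// Reading through the wrapper decompresses transparently, chunk by chunk.
$reader = fopen('compress.zlib://' . $path, 'rb');
$roundTrip = '';
while (!feof($reader)) {
    $chunk = fread($reader, 8192);
    if ($chunk === false) {
        break;
    }
    $roundTrip .= $chunk;
}
fclose($reader);

unlink($path);
```

Because the wrapper is selected purely by the URL prefix, any library that accepts a stream URL or a stream resource can pick up a new format without code changes.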

As a second step, in a future RFC, we could create a new “Compress” interface (or class) similar to that implemented by Go’s “compress” module (https://pkg.go.dev/compress) or Java’s InputStream/OutputStream abstraction (https://docs.oracle.com/en/java/javase/23/docs/api/java.base/java/io/InputStream.html / https://docs.oracle.com/en/java/javase/23/docs/api/java.base/java/io/OutputStream.html), which would offer a high-level API for compressing or decompressing files, and which would use stream wrappers under the hood.

In this way, we can move ahead and quickly provide support for Brotli and Zstandard, which we urgently need to support natively for Composer, Symfony, and probably many other projects, while taking the time to think about a well-designed high-level API.


Kévin Dunglas

https://dunglas.dev

> First, reduce the scope of our RFC to simply add new stream wrappers for Zstandard and Brotli similar to those already provided for zlib: https://www.php.net/manual/en/wrappers.compression.php

> To keep things moving quickly, we won't be adding any new functions or classes (perhaps just constants or enums for context options).

> To use these new formats, we'll need to use the low-level file/stream manipulation functions. It will be possible to provide userland libraries with a more attractive API.

This will be great, and existing libraries that use stream wrappers
will automatically get brotli and zstd support too.

> ... [snip] which we urgently need to support natively for Composer [snip] ...

Not trying to sound discouraging at all, but I'm merely curious about
the urgency, and how Composer can make use of zstd. I get that
Composer manifest downloads could make use of zstd, but the package
download format ultimately depends on what the hosting VCS supports.
In most cases, it will be GitHub, which seems to only support zip and
tar.gz.

For `content-encoding: zstd` HTTP content, Curl works great.
packagist.org seems to run on Nginx, and perhaps when used with a
zstd module (or by using something like Caddy server with zstd
built-in), we can have zstd manifest downloads even today!

Thank you.

On Feb 18, 2025, at 08:43, Kévin Dunglas <kevin@dunglas.fr> wrote:

> Hi there,
>
> Here is an alternative proposal we just discussed with Jordi:
>
> First, reduce the scope of our RFC to simply add new stream wrappers for Zstandard and Brotli similar to those already provided for zlib: https://www.php.net/manual/en/wrappers.compression.php

This streamlined and narrower-scoped approach gets a big +1 from me.

I also really like the idea of a unified compression OO API, as future scope.

Cheers,
Ben

On 18.02.2025, 11:19:26, Jordi Boggiano <j.boggiano@seld.be> wrote:

> Hey,
>
> Kévin Dunglas and myself would like to present this RFC proposing
> inclusion of the zstandard and brotli pecl extensions into core, with
> the main goal of making them more broadly available.
>
> https://wiki.php.net/rfc/modern_compression
>
> Thanks for your consideration, we’re looking forward to hearing your
> feedback!

I really like the idea to add both to core and the reasoning to add Brotli as well given Safari constraints.

As Tim said, the API could use simplification and improvement. I would propose that the constructor of a compressor takes all the options, and that the interface methods themselves are always single-argument: data in, data out.

For Zstd dictionary support you could just add a second class (or an optional parameter) in the ctor. Rough idea (constant values and variable names and types for ctors are mostly made up):

https://gist.github.com/beberlei/6f3d365f79959e3ded07e6a1f1351a1b
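
The “options in the constructor, single-argument methods” shape can be tried out with ext/zlib, which already accepts a preset dictionary. This is only a sketch, assuming PHP 8.0+ with ext/zlib: the class names are hypothetical and unrelated to the gist above, zlib dictionaries stand in for Zstd dictionaries, and only `deflate_init()`, `deflate_add()`, `inflate_init()`, and `inflate_add()` are existing functions.

```php
<?php
declare(strict_types=1);

// Hypothetical class: every option is fixed at construction time.
final class DeflateCompressor
{
    public function __construct(
        private int $level = 6,
        private ?string $dictionary = null,
    ) {}

    // Data in, compressed data out; no per-call options.
    public function compress(string $data): string
    {
        $options = ['level' => $this->level];
        if ($this->dictionary !== null) {
            $options['dictionary'] = $this->dictionary;
        }
        $context = deflate_init(ZLIB_ENCODING_DEFLATE, $options);
        if ($context === false) {
            throw new \RuntimeException('Unable to initialise the deflate context.');
        }
        $out = deflate_add($context, $data, ZLIB_FINISH);
        if ($out === false) {
            throw new \RuntimeException('Compression failed.');
        }
        return $out;
    }
}

// Hypothetical counterpart; it must be given the same dictionary.
final class DeflateDecompressor
{
    public function __construct(private ?string $dictionary = null) {}

    public function decompress(string $data): string
    {
        $options = [];
        if ($this->dictionary !== null) {
            $options['dictionary'] = $this->dictionary;
        }
        $context = inflate_init(ZLIB_ENCODING_DEFLATE, $options);
        if ($context === false) {
            throw new \RuntimeException('Unable to initialise the inflate context.');
        }
        $out = inflate_add($context, $data);
        if ($out === false) {
            throw new \RuntimeException('Decompression failed.');
        }
        return $out;
    }
}

// Usage: a dictionary of substrings that are expected to recur in payloads.
$dictionary = '{"status":"ok","items":[],"error":null}';
$message = '{"status":"ok","items":[1,2,3],"error":null}';

$compressed = (new DeflateCompressor(level: 9, dictionary: $dictionary))->compress($message);
$restored = (new DeflateDecompressor($dictionary))->decompress($compressed);
```

Keeping the dictionary in the constructor means both sides can be swapped behind a one-method interface, which is the property the proposal above is after.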

greetings
Benjamin

On Tue 18. 2. 2025 at 11:21, Jordi Boggiano <j.boggiano@seld.be> wrote:

> Hey,
>
> Kévin Dunglas and myself would like to present this RFC proposing
> inclusion of the zstandard and brotli pecl extensions into core,

It would be really good to get some thoughts from the author of these extensions (kjdev), because that’s the person who currently maintains them, or it should at least be clear whether there is some commitment in terms of maintenance.

I realise that you now propose just the stream part, but that essentially also creates a conflict with those extensions (if you meant using the same stream name), so feedback from the author is even more important.

> with
> the main goal of making them more broadly available.

I think it should be more about securing maintenance and being covered by security support. I think good availability can still be achieved outside core (e.g. xdebug and some other widely used exts).

Regards

Jakub