On Tue, Dec 31, 2024, 13:42 Christoph M. Becker <cmbecker69@gmx.de> wrote:
On 11.07.2024 at 11:35, Giovanni Giacobbi wrote:
The recent PR #14877 [1] proposes to add the imagecompare gd function that
mimics the gdImageCompare function from libgd. I always thought that a
pixel-by-pixel matching function for two images was a big missing feature
in PHP, as the corresponding userland implementation is really, REALLY slow.
The problem with gdImageCompare is that it checks several aspects of the
two images, and fails on its core purpose. The behavior is better seen in
its source code [2] rather than explained. I see the following problems:
- The gdImageCompare is kind of buggy, as it completely disregards the
alpha channel. As a user, I would expect that two images that have a pixel
rgba(255, 0, 0, 100%) and one rgba(255, 0, 0, 50%) are considered
different, but the current implementation reports them as identical.
- Checking for bitmasks is cumbersome, and it requires 9 new constants in
the global namespace (the IMG_CMP_*), which are nearly useless. Most use
cases will just want to check for this: imagecompare($im1, $im2) &
IMG_CMP_IMAGE.
- The checks in gdImageCompare are also a bit arbitrary, why compare the
interlace flag but not the palette order? Why check the number of colors in
the palette if they are not even used in the image?
- If a user is interested in the other checks, they can all be implemented
in userland with existing functions: imageinterlace, imagecolortransparent,
imageistruecolor, imagesx, imagesy, imagecolorstotal.
I believe PHP should only offer required building blocks and leave to the
libraries the implementation of more structured functionality. My proposal
is a completely different function called imagematch() that solves the
exact problem of the pixel-by-pixel matching, with the added value of
optionally matching portions of the images, using the following signature:
function imagematch(GdImage $image1, GdImage $image2, int $x1 = 0, int $y1
= 0, int $x2 = 0, int $y2 = 0, ?int $width = null, ?int $height = null):
bool {}
I already drafted PR #14914 [3] with the first two parameters. There is not
an equivalent libgd implementation, so the implementation is entirely on
the php side, but as you can see it’s quite trivial. Even with the extended
functionality of matching portions of the images it would be just a few
extra lines of code, so not a big maintenance burden.
If an RFC is required, I’m happy to redact it after the initial feedback
round.
[1] https://github.com/php/php-src/pull/14877
[2]
https://github.com/php/php-src/blob/5586d0c7de00acbfb217e1c6321da918b96ef960/ext/gd/libgd/gd.c#L2834
[3] https://github.com/php/php-src/pull/14914
Difficult. It seems to me we first need to consider the use-case.
Likely useful only for testing purposes, but I may miss some other
use-cases.
And now comes the really hard part: when do we consider images to be
equal? You have mentioned that gdImageCompare() completely ignores the
alpha channel, and that would be a mistake. I tend to agree, but if the
images are stored via imagejpeg(), for instance, the alpha channel would
be ignored as well, so the resulting images would be equal.
For the “test-oriented” use case I agree, but it doesn’t only apply to php’s internal testsuite: i maintain quite a few projects which do image manipulation whose testsuites would benefit from this, and that already places us in userspace use case.
For the “what to consider different” i would apply the KISS principle and just have a pixel-by-pixel yes/no difference, not even considering palette/truecolor (as per my sample PR implementation). If you need to do further checks (i.e. the mentioned palette/truecolor) you can still check in userspace for o(1).
The idea being, can we provide the most expensive operation, that is basic pixel by pixel comparison, as native function? Otherwise we keep discussing bikeshedding and years go by without this being available. This does not stop future more advanced comparison algorithms from being introduced (i.e. imagediff()).
Then there is the question whether two fully transparent pixels (alpha
value 127) are different if their RGB channels have different values.
In theory, sure, but it practice is may not really matter.
Fully transparent black should be considered matching fully transparent white, cause there can be no possible discerning difference in the rendering.
Then we have the transparent pixel. Should equality comparison take
that into account? Again, in theory, it is not the same as a pixel with
alpha channel 127, but in practice it may very well be.
Yes. Overlaying a black 100% opaque pixel is different from overlaying 50% black pixel, on a non-black surface. Again result is “rendering oriented”.
Now you claim that doing pixel-to-pixel comparisons in PHP would be
really slow. I’d argue that with modern PHP the actual looping over
the pixels isn’t that slow. Even getting the pixel colors
(imagecolorat()) and checking them for equality shouldn’t be that slow.
I’m happy to run a quick bechmark and provider results, but my feeling is we are talking orders of magnitude (without jit, i personally don’t consider it production ready). But I will also benchmarm it with JIT enabled.
It only gets slow if you want to do more elaborate comparisons; e.g.
getting the actual values a palette entry refers to, but if you move
that out of the loop, performance might not be that bad. Implementing
that in C is certainly faster, but the difference should be measured,
and I wonder how much could be gained using SIMD instructions (something
libgd still ignores completely).
I’m talking about a naive pixel by pixel comparison, in plain C, without any special cpu optimization. Using simd would be nice but out of scope (that should definitely be a libgd discussion).
I will gladly follow up with numbers in a few days.