Re: [PHP-DEV] Maintain Windows PHP dependency builds in a GH repo

On Sun, 8 Sep 2024, Christoph M. Becker wrote:

All these problems could be solved, or at least mitigated, by setting
up a Github repository for the dependency builds and the series files.

The upload could be as simple as a cron'd `git pull` on the server.

All our websites, including PHP downloads, use the rsync server for
this. That server has a GIT checkout, and then the Web servers rsync
from there. This is superior because it means the *web servers* never
need the GIT checkout, solving duplication and permission issues. It
also means it is easy to set up mirrors (unofficial).

The only potential drawback I see would be the size of the repository.

This is already a major issue for the normal PHP downloads, which
currently sit at 15 GB.

While I believe the new repo would be way smaller than our
distribution repo[1], and we might exclude vc11 and vc15 builds, it
may still be too large for practical handling.

But in this case we can consider using Git LFS[2] which has been
developed to address this issue.

GitHub LFS isn't free though, once you get to large storage sizes (over
1GB) [1]. I also found it really fiddly to use. As it's not free, I
don't think it would qualify as something to use in an Open Source
project.

I don't believe GIT is a good fit for versioning binaries. Not for this,
nor the PHP distributions. I understand the history aspects are useful,
but it's never been designed for keeping binaries. It's not a file
management tool, but a source code tracking solution.

I also don't think Git LFS has been created for this either. It is
useful for *large files* in a repository, not *large repositories*.
Files can be maximum 2GB, which is plenty for DLLs and our releases.

[1] "About billing for Git Large File Storage", GitHub Docs

What do you think? Would this require the RFC process?

I think it needs some good thinking through first. I also don't believe
the RFC system is something we need to use for deciding how to serve
files.

cheers,
Derick

--
https://derickrethans.nl | https://xdebug.org | https://dram.io

Author of Xdebug. Like it? Consider supporting me: Xdebug: Support

mastodon: @derickr@phpc.social @xdebug@phpc.social

On 08.09.2024 at 16:58, Derick Rethans wrote:

On Sun, 8 Sep 2024, Christoph M. Becker wrote:

The upload could be as simple as a cron'd `git pull` on the server.

All our websites, including PHP downloads, use the rsync server for
this. That server has a GIT checkout, and then the Web servers rsync
from there. This is superior because it means the *web servers* never
need the GIT checkout, solving duplication and permission issues. It
also means it is easy to set up mirrors (unofficial).

Thanks for the explanation!

GitHub LFS isn't free though, once you get to large storage sizes (over
1GB) [1]. I also found it really fiddly to use. As it's not free, I
don't think it would qualify as something to use in an Open Source
project.

Oh, I wasn't aware of that.

I don't believe GIT is a good fit for versioning binaries. Not for this,
nor the PHP distributions. I understand the history aspects are useful,
but it's never been designed for keeping binaries. It's not a file
management tool, but a source code tracking solution.

I agree.

I think it needs some good thinking through first. I also don't believe
the RFC system is something we need to use for deciding how to serve
files.

This is not about *how* to serve files, but rather *which* files to
serve; for instance, I'm currently working on updating libpng, where we
still ship v1.6.34 from Sep 29, 2017.

Anyhow, coming back to my list of problems:

(1) only a few people can do these uploads
(2) the process is not transparent
(3) there is no history of the series
(4) the process is prone to error

If we only had the series files in a Github repository, (2) and (3)
would be solved, and (4) at least partially.

The workflow for updating a dependency might then look like:

* someone submits a PR with updates to the series files and a link to
the new dependency build on winlibs/winlib-builder (a PR template might
be useful)
* after some basic CI has been run, a notification is sent to those who
can do the uploads to downloads.php.net (or to the rsync server)
* one of these people can then check the PR, and if okay, upload the
dependency builds
* afterwards the PR is merged, and synced with the server
* archiving no longer needed dependencies could be done on the server (a
simple script should do; and it's not a very important task anyway, and
maybe it shouldn't be done at all, so that older Git revisions of the
series are still useable)

While that would not solve problem (1), it would at least avoid having
to ping some "random" people ("can you please upload?"), and if there is
an appropriate PR template, some further issues with problem (4) could
be resolved (e.g. do the series files refer to existing files?)
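
A minimal sketch of such a check ("do the series files refer to existing
files?"), assuming a series file is a plain list of archive names, one per
line. The function name and the sample file names are hypothetical and only
serve as illustration; a real CI job would obtain the list of available
files from wherever the builds are stored.

```python
def missing_dependencies(series_lines, available_files):
    """Return series entries that have no matching available file."""
    available = set(available_files)
    missing = []
    for line in series_lines:
        name = line.strip()
        if not name:  # ignore blank lines in the series file
            continue
        if name not in available:
            missing.append(name)
    return missing


# Hypothetical sample data, not taken from the real series files.
series = ["libpng-1.6.44-vs17-x64.zip", "openssl-3.0.15-vs17-x64.zip", ""]
uploaded = ["libpng-1.6.44-vs17-x64.zip"]
print(missing_dependencies(series, uploaded))
```

A CI run could fail the PR whenever the returned list is not empty.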

Cheers,
Christoph

On Sun, 8 Sep 2024, Christoph M. Becker wrote:

On 08.09.2024 at 16:58, Derick Rethans wrote:

> I think it needs some good thinking through first. I also don't
> believe the RFC system is something we need to use for deciding how
> to serve files.

This is not about *how* to serve files, but rather *which* files to
serve; for instance, I'm currently working on updating libpng, where
we still ship v1.6.34 from Sep 29, 2017.

Ok, but I still don't see why you need an RFC for this? :-)

Anyhow, coming back to my list of problems:

(1) only a few people can do these uploads
(2) the process is not transparent
(3) there is no history of the series
(4) the process is prone to error

If we only had the series files in a Github repository, (2) and (3)
would be solved, and (4) at least partially.

The workflow for updating a dependency might then look like:

* someone submits a PR with updates to the series files and a link to
the new dependency build on winlibs/winlib-builder (a PR template might
be useful)
* after some basic CI has been run, a notification is sent to those who
can do the uploads to downloads.php.net (or to the rsync server)
* one of these people can then check the PR, and if okay, upload the
dependency builds
* afterwards the PR is merged, and synced with the server
* archiving no longer needed dependencies could be done on the server (a
simple script should do; and it's not a very important task anyway, and
maybe it shouldn't be done at all, so that older Git revisions of the
series are still useable)

While that would not solve problem (1), it would at least avoid having
to ping some "random" people ("can you please upload?"), and if there is
an appropriate PR template, some further issues with problem (4) could
be resolved (e.g. do the series files refer to existing files?)

I know Shivam (php/web-downloads on GitHub, the repository that contains
the downloads.php.net website) has also been working on doing automatic
pulls of PECL builds onto the "downloads" server.

The idea was to trigger a GitHub action that calls this API, which then
downloads the file. Ideally the downloads server *pulls* files, as
uploading *to* it can't work through GHA as we require 2FA through a
jump host.

We'll have to have multiple Git repositories, and perhaps subdomain
names to make this all work.

The downloads.php.net site currently doesn't have any code yet, as I am
waiting for this 404 ErrorHandler to be included in it:

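  <?php
  // Requests that contain "Win32-vc" are redirected permanently to the
  // same URI with the upper-case spelling "Win32-VC".
  if (preg_match('/Win32-vc/', $_SERVER['REQUEST_URI'])) {
          $fixed = str_replace('Win32-vc', 'Win32-VC', $_SERVER['REQUEST_URI']);
          header("Location: $fixed", true, 301);
          exit();
  }

  // Everything else is answered with a 404 status.
  header('Location: /', true, 404);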

Ideally, instead of having downloads.php.net/~windows, we have
downloads.php.net/windows which is a Git repository for the series
files, but it is probably better if it's all in that same web-downloads
repository.

cheers,
Derick


On 11.09.2024 at 15:51, Derick Rethans wrote:

On Sun, 8 Sep 2024, Christoph M. Becker wrote:

On 08.09.2024 at 16:58, Derick Rethans wrote:

Ok, but I still don't see why you need an RFC for this? :-)

Oh, I don't need an RFC for this. Actually, you can read my question as
"does this *really* need an RFC, or can we do without?"

I know Shivam (php/web-downloads on GitHub, the repository that contains
the downloads.php.net website) has also been working on doing automatic
pulls of PECL builds onto the "downloads" server.

The idea was to trigger a GitHub action that calls this API, which then
downloads the file. Ideally the downloads server *pulls* files, as
uploading *to* it can't work through GHA as we require 2FA through a
jump host.

I see.

We'll have to have multiple Git repositories, and perhaps subdomain
names to make this all work.

The downloads.php.net site currently doesn't have any code yet, as I am
waiting for this 404 ErrorHandler to be included in it:

  <?php
  if (preg_match('/Win32-vc/', $_SERVER['REQUEST_URI'])) {
          $fixed = str_replace( 'Win32-vc', 'Win32-VC', $_SERVER['REQUEST_URI'] );
          header("Location: $fixed", true, 301);
          exit();
  }

  header('Location: /', true, 404);

Ideally, instead of having downloads.php.net/~windows, we have
downloads.php.net/windows which is a Git repository for the series
files, but it is probably better if it's all in that same web-downloads
repository.

Oh, there might be a slight misunderstanding. By "series files" I only
refer to what is in
<https://downloads.php.net/~windows/php-sdk/deps/series/>; nothing else.

While the upload/download issue might be solved one way or the other,
having a Git repository for the series files might solve a couple of
issues. I've set up <https://github.com/cmb69/php-windeps-series> as a
demonstration of what I have in mind. For instance, it is hard to keep
an overview of which packages are in which series; packages.csv helps a
bit (and GH displays this nicely[1]). And then I made PR #1, which
shows an x64/x86 mismatch[2] (actually multiple, but ignore the trailing
-1 lines mismatches). Finally, I made PR #2, which uses a locally run
script to push staging to stable. Many more checks and automations can
be done in the future; but you already get the gist.
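
To illustrate the kind of x64/x86 check meant here, a rough sketch,
assuming every series entry ends in "-x64.zip" or "-x86.zip" and that the
two lists should otherwise be identical. The function name and the sample
entries are hypothetical; this is not the check used in the demonstration
repository.

```python
def arch_mismatches(x64_lines, x86_lines):
    """Return (only_in_x64, only_in_x86) after stripping the arch suffix."""

    def normalize(lines, arch):
        suffix = f"-{arch}.zip"
        names = set()
        for line in lines:
            name = line.strip()
            # Entries without the expected suffix are ignored in this sketch.
            if name.endswith(suffix):
                names.add(name[: -len(suffix)])
        return names

    x64 = normalize(x64_lines, "x64")
    x86 = normalize(x86_lines, "x86")
    return sorted(x64 - x86), sorted(x86 - x64)


# Hypothetical sample data: libpng was only updated for x64.
x64 = ["libpng-1.6.44-vs17-x64.zip", "zlib-1.3.1-vs17-x64.zip"]
x86 = ["libpng-1.6.34-vs17-x86.zip", "zlib-1.3.1-vs17-x86.zip"]
only_x64, only_x86 = arch_mismatches(x64, x86)
print(only_x64, only_x86)
```

Both returned lists being empty would mean the two architectures are in
sync.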

[1] <https://github.com/cmb69/php-windeps-series/blob/main/packages.csv>
[2] "Testing an x64/x86 mismatch", cmb69/php-windeps-series@0038ef6 (GitHub)

Christoph

On Wed, 11 Sep 2024, Christoph M. Becker wrote:

On 11.09.2024 at 15:51, Derick Rethans wrote:
>
> Ideally, instead of having downloads.php.net/~windows, we have
> downloads.php.net/windows which is a Git repository for the series
> files, but it is probably better if it's all in that same
> web-downloads repository.

Oh, there might be a slight misunderstanding. By "series files" I only
refer to what is in
<https://downloads.php.net/~windows/php-sdk/deps/series/>; nothing
else.

Yes, I realised that.

While the upload/download issue might be solved one way or the other,
having a Git repository for the series files might solve a couple of
issues. I've set up <https://github.com/cmb69/php-windeps-series> as a
demonstration of what I have in mind. For instance, it is hard to keep
an overview of which packages are in which series; packages.csv helps a
bit (and GH displays this nicely[1]).

Yeah, that's exactly what we had been talking about already, but
generating the series from the CSV file seems like a step forwards, as
maintaining the series files by hand was a little awkward.

I probably would have sorted the versions the other way around though.

And then I made PR #1, which shows an x64/x86 mismatch[2] (actually
multiple, but ignore the trailing -1 lines mismatches). Finally, I
made PR #2, which uses a locally run script to push staging to stable.
Many more checks and automations can be done in the future; but you
already get the gist.

[1] <https://github.com/cmb69/php-windeps-series/blob/main/packages.csv>
[2] "Testing an x64/x86 mismatch", cmb69/php-windeps-series@0038ef6 (GitHub)

Once the actual pulling from GHA to the downloads server has been
added, I can add variants of this to our deployment scripts (aka
update-phpweb-backend in the php/systems repository on GitHub).

I am not sure if you're aware, but some time ago we created a Google doc
to list all the things for moving the Windows downloads to
downloads.php.net:

Moving Windows Download Server (Google Docs)

If you provide an email address, I can give you access (or request it).

cheers,
Derick


On 13.09.2024 at 11:42, Derick Rethans wrote:

On Wed, 11 Sep 2024, Christoph M. Becker wrote:

While the upload/download issue might be solved one way or the other,
having a Git repository for the series files might solve a couple of
issues. I've set up <https://github.com/cmb69/php-windeps-series> as a
demonstration of what I have in mind. For instance, it is hard to keep
an overview of which packages are in which series; packages.csv helps a
bit (and GH displays this nicely[1]).

Yeah, that's exactly what we had been talking about already, but
generating the series from the CSV file seems like a step forwards, as
maintaining the series files by hand was a little awkward.

Well, as it stands, the CSV is generated from the series files (not the other
way round), and while changing this is possible, I don't think it's the
best option. Instead I'd add some simple scripts, e.g. so you could do
something like `update openssl-3.0.15 8.2 8.3 8.4 master`.
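
A sketch of what such an `update` helper might do, assuming series entries
follow a `<name>-<version>-<toolset>-<arch>.zip` pattern and that the
series are held in memory as a mapping from branch to entries. The pattern,
the function signature and the sample data are assumptions for illustration
only; real entries may carry additional parts that this does not handle.

```python
import re

# Assumed entry layout: <name>-<version>-<rest>, version starts with a digit.
ENTRY = re.compile(r"^(?P<name>.+?)-(?P<version>\d[^-]*)-(?P<rest>.+)$")
SPEC = re.compile(r"^(?P<name>.+?)-(?P<version>\d.*)$")


def update(series_by_branch, spec, branches):
    """Return a copy of the series with `spec` applied to the given branches."""
    match = SPEC.match(spec)
    if match is None:
        raise ValueError(f"cannot parse package spec: {spec}")
    name, version = match["name"], match["version"]

    updated = {}
    for branch, entries in series_by_branch.items():
        if branch not in branches:
            updated[branch] = list(entries)  # untouched branches are copied
            continue
        new_entries = []
        for entry in entries:
            parsed = ENTRY.match(entry)
            if parsed and parsed["name"] == name:
                # Swap the version, keep toolset and architecture as they are.
                entry = f"{name}-{version}-{parsed['rest']}"
            new_entries.append(entry)
        updated[branch] = new_entries
    return updated


# Hypothetical sample data.
series = {
    "8.1": ["openssl-1.1.1w-vs16-x64.zip"],
    "8.3": ["openssl-3.0.13-vs16-x64.zip", "zlib-1.3.1-vs16-x64.zip"],
}
result = update(series, "openssl-3.0.15", ["8.3"])
print(result["8.3"])
```

Returning a new mapping instead of editing in place keeps the original
series available for a diff before anything is written back.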

I probably would have sorted the versions the other way around though.

I didn't sort them (only the package names are sorted). Just a quick
script.
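
For illustration, such a quick script could look roughly like the
following sketch. It assumes entries of the form
`<name>-<version>-<toolset>-<arch>.zip` and produces one row per package
(sorted by name) and one column per series (in the order given, unsorted).
The actual layout of packages.csv may differ; the function name and sample
data are hypothetical.

```python
import csv
import io
import re

# Assumed entry layout: <name>-<version>-..., version starts with a digit.
ENTRY = re.compile(r"^(?P<name>.+?)-(?P<version>\d[^-]*)-")


def packages_csv(series_by_branch):
    """Build a CSV overview: package names as rows, series as columns."""
    branches = list(series_by_branch)  # keep the caller's order, no sorting
    table = {}
    for branch, entries in series_by_branch.items():
        for entry in entries:
            parsed = ENTRY.match(entry)
            if parsed:
                table.setdefault(parsed["name"], {})[branch] = parsed["version"]

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["package"] + branches)
    for name in sorted(table):  # only the package names are sorted
        writer.writerow([name] + [table[name].get(b, "") for b in branches])
    return out.getvalue()


# Hypothetical sample data.
overview = packages_csv({
    "8.3": ["zlib-1.3.1-vs16-x64.zip", "openssl-3.0.13-vs16-x64.zip"],
    "8.4": ["openssl-3.0.15-vs17-x64.zip"],
})
print(overview)
```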

Once the actual pulling from GHA to the downloads server has been
added, I can add variants of this to our deployment scripts (aka
update-phpweb-backend in the php/systems repository on GitHub).

I am not sure if you're aware, but some time ago we created a Google doc
to list all the things for moving the Windows downloads to
downloads.php.net:
Moving Windows Download Server (Google Docs)
If you provide an email address, I can give you access (or request it).

I'm aware of this document, and if you like you can take my email
address from the From address of this email.

However, I'm afraid that this will not be finished within the next
couple of weeks, and I really think that some further updates to Windows
dependencies are long overdue (besides those already available in the
winlibs repositories). Would be great to get these uploaded soon, so
there are still a few 8.4.0RCs left to be able to correct potential issues.

Cheers,
Christoph