[Infra] CentOS Stream 8 based jobs are hitting RETRY or RETRY_LIMIT
Hello,
Currently, CentOS Stream 8 based jobs are hitting RETRY or RETRY_LIMIT:
https://zuul.opendev.org/t/openstack/builds?result=RETRY&result=RETRY_LI...
Based on the logs, all of them are failing with:
```
Errors during downloading metadata for repository 'appstream':
2022-01-20 09:34:18.119377 | primary | - Status code: 404 for https://mirror-int.iad.rax.opendev.org/centos/8-stream/AppStream/x86_64/os/r... (IP: 10.208.224.52)
2022-01-20 09:34:18.119443 | primary | - Status code: 404 for https://mirror-int.iad.rax.opendev.org/centos/8-stream/AppStream/x86_64/os/r... (IP: 10.208.224.52)
```
While taking a look at https://opendev.org/opendev/system-config/src/branch/master/playbooks/roles/..., the upstream mirror shows the same failure:
```
404 for http://mirror.dal10.us.leaseweb.net/centos/8-stream/AppStream/x86_64/os/repodata/09255ba7c10e01afeb0d343667190f9c3e42d0a6099f887619abcb92ea0378db-filelists.xml.gz (IP: 209.58.153.1)
```
It seems the CentOS mirror has issues. https://review.opendev.org/c/opendev/system-config/+/825446 switches to the Facebook mirror.
It might fix the issue. Thanks to Alfredo for debugging it.
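For anyone triaging similar failures, a minimal sketch of pulling the failing URLs out of a job log follows. The regular expression is an assumption based only on the two log lines quoted above, and `find_failed_fetches` and `SAMPLE_LOG` are hypothetical names for illustration, not part of Zuul or dnf.

```python
import re

# Assumed shape of the message, taken from the log excerpt in this thread:
#   "Status code: 404 for <url> (IP: <address>)"
STATUS_RE = re.compile(
    r"Status code: (?P<code>\d{3}) for (?P<url>\S+) \(IP: (?P<ip>[\d.]+)\)"
)


def find_failed_fetches(log_text):
    """Return (status_code, url, ip) tuples for each failed fetch in the log."""
    return [
        (int(m.group("code")), m.group("url"), m.group("ip"))
        for m in STATUS_RE.finditer(log_text)
    ]


# Sample built from the log lines quoted later in this thread.
SAMPLE_LOG = """\
Errors during downloading metadata for repository 'appstream':
2022-01-20 09:34:18.119377 | primary | - Status code: 404 for https://mirror-int.iad.rax.opendev.org/centos/8-stream/AppStream/x86_64/os/repodata/09255ba7c10e01afeb0d343667190f9c3e42d0a6099f887619abcb92ea0378db-filelists.xml.gz (IP: 10.208.224.52)
2022-01-20 09:34:18.119443 | primary | - Status code: 404 for https://mirror-int.iad.rax.opendev.org/centos/8-stream/AppStream/x86_64/os/repodata/678056e5b64153ca221196673208730234dd72f03397a3ab2d30fea01392bd87-primary.xml.gz (IP: 10.208.224.52)
"""

failures = find_failed_fetches(SAMPLE_LOG)
for code, url, ip in failures:
    # Print only the file name part of each URL to keep the output short.
    print(code, url.rsplit("/", 1)[-1], ip)
```

Grouping the results by file name makes it easy to see whether every job is missing the same repodata files, which would point at the mirror content rather than at individual nodes.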
Thanks,
Chandan Kumar
On 2022-01-20 15:19:45 +0530 (+0530), Chandan Kumar wrote:
Currently CentOS Stream 8 based jobs are hitting RETRY or RETRY_LIMIT.
https://zuul.opendev.org/t/openstack/builds?result=RETRY&result=RETRY_LI...
Based on the logs, all of them are failing with:
Errors during downloading metadata for repository 'appstream':
2022-01-20 09:34:18.119377 | primary | - Status code: 404 for https://mirror-int.iad.rax.opendev.org/centos/8-stream/AppStream/x86_64/os/repodata/09255ba7c10e01afeb0d343667190f9c3e42d0a6099f887619abcb92ea0378db-filelists.xml.gz (IP: 10.208.224.52)
2022-01-20 09:34:18.119443 | primary | - Status code: 404 for https://mirror-int.iad.rax.opendev.org/centos/8-stream/AppStream/x86_64/os/repodata/678056e5b64153ca221196673208730234dd72f03397a3ab2d30fea01392bd87-primary.xml.gz (IP: 10.208.224.52)
While taking a look at https://opendev.org/opendev/system-config/src/branch/master/playbooks/roles/..., the upstream mirror shows the same failure:
404 for http://mirror.dal10.us.leaseweb.net/centos/8-stream/AppStream/x86_64/os/repodata/09255ba7c10e01afeb0d343667190f9c3e42d0a6099f887619abcb92ea0378db-filelists.xml.gz (IP: 209.58.153.1)
It seems the CentOS mirror has issues. https://review.opendev.org/c/opendev/system-config/+/825446 switches to the Facebook mirror.
It might fix the issue.
[...]
The content of /centos/8-stream/AppStream/x86_64/os/repodata/ on mirror.facebook.net is identical to what we're serving already. I checked some other mirrors, e.g. linuxsoft.cern.ch, and see the same. The repomd.xml indices on them all match too.
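A comparison like the one described above could be scripted roughly as follows. This is only a sketch: `fetch_repomd`, `referenced_files` and `mirrors_agree` are hypothetical helper names, the XML namespace is the one used by repomd.xml files, and `SAMPLE_REPOMD` is a hand-written, heavily reduced stand-in rather than a real index. The network helper is defined but not called here.

```python
import hashlib
import urllib.request
import xml.etree.ElementTree as ET

# Namespace used by repomd.xml documents.
REPO_NS = {"repo": "http://linux.duke.edu/metadata/repo"}


def fetch_repomd(base_url, timeout=30):
    """Download repodata/repomd.xml below base_url (not called in this sketch)."""
    url = base_url.rstrip("/") + "/repodata/repomd.xml"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def referenced_files(repomd_bytes):
    """Map each metadata type (primary, filelists, ...) to its location href."""
    root = ET.fromstring(repomd_bytes)
    files = {}
    for data in root.findall("repo:data", REPO_NS):
        location = data.find("repo:location", REPO_NS)
        if location is not None:
            files[data.get("type")] = location.get("href")
    return files


def mirrors_agree(repomd_by_mirror):
    """True if every mirror served byte-identical repomd.xml content."""
    digests = {hashlib.sha256(body).hexdigest() for body in repomd_by_mirror.values()}
    return len(digests) <= 1


# Hand-written, reduced sample; a real repomd.xml carries checksums, sizes, etc.
SAMPLE_REPOMD = b"""<?xml version="1.0" encoding="UTF-8"?>
<repomd xmlns="http://linux.duke.edu/metadata/repo">
  <data type="primary">
    <location href="repodata/678056e5b64153ca221196673208730234dd72f03397a3ab2d30fea01392bd87-primary.xml.gz"/>
  </data>
  <data type="filelists">
    <location href="repodata/09255ba7c10e01afeb0d343667190f9c3e42d0a6099f887619abcb92ea0378db-filelists.xml.gz"/>
  </data>
</repomd>
"""

# Pretend both mirrors returned the same index.
snapshots = {
    "mirror.facebook.net": SAMPLE_REPOMD,
    "linuxsoft.cern.ch": SAMPLE_REPOMD,
}
print(mirrors_agree(snapshots))
print(referenced_files(SAMPLE_REPOMD))
```

If the indices agree across mirrors, as reported here, the next thing to check is whether each file listed by `referenced_files` can actually be downloaded from the mirror in question, since an index that points at files the mirror no longer holds would produce exactly this kind of 404.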
On 2022-01-20 14:09:09 +0000 (+0000), Jeremy Stanley wrote: [...]
Further investigation of our mirror update logs indicates there was some (likely global) upheaval for CentOS Stream 8 package indices, which we then mirrored with what was probably a several-hour delay, as we're multiple mirror "hops" from their primary. The mirror at LeaseWeb, which we pull from, had an index update around 06:00 UTC, which seems to roughly coincide with when the problems began, and then we saw those indices switch back around 12:00 UTC to what they had been previously.
The timeframe where the suspected problem indices were being served from our mirrors was approximately 06:55-12:57 UTC. We also saw a failure to upload updated centos-8-stream images to a significant proportion of our providers shortly prior to this, so out of an abundance of caution I've issued a delete for that image (falling back to the one built yesterday), and our builders are presently refreshing it from what is hopefully now a sane mirror of the packages and indices.
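To decide whether a given failed job falls inside that window, a small check along these lines may help. It is a sketch under two assumptions: that the timestamps in the job logs are in UTC, and that the approximate window boundaries above are taken at face value. `in_suspect_window` is a hypothetical helper name.

```python
from datetime import datetime, timezone

# Approximate window during which the suspect indices were served (UTC).
WINDOW_START = datetime(2022, 1, 20, 6, 55, tzinfo=timezone.utc)
WINDOW_END = datetime(2022, 1, 20, 12, 57, tzinfo=timezone.utc)


def in_suspect_window(ts):
    """True if the timezone-aware timestamp lies within the suspect window."""
    return WINDOW_START <= ts <= WINDOW_END


# Timestamp taken from the log excerpt earlier in the thread, assumed to be UTC.
failure = datetime.strptime(
    "2022-01-20 09:34:18.119377", "%Y-%m-%d %H:%M:%S.%f"
).replace(tzinfo=timezone.utc)

print(WINDOW_END - WINDOW_START)   # 6:02:00
print(in_suspect_window(failure))  # True
```

Failures outside the window would suggest a different cause, or an image built from the bad indices that was still in use.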
On Thu, Jan 20, 2022 at 8:25 PM Jeremy Stanley <fungi@yuggoth.org> wrote:
[...]
Thank you, Jeremy, for looking into it. It seems the issue came from the CentOS side itself. Maybe the RDO team can help here to avoid these kinds of issues in the future.
Thanks,
Chandan Kumar