We should also be able to prioritize the /latest paths using the sitemap and submit the updated sitemap to google and friends.

Looking at the sitemap (openstack/openstack-manuals/www/static/sitemap.xml), it appears the priorities are incorrect.

Octavia latest admin for example:

  <url>
    <loc>https://docs.openstack.org/octavia/latest/admin/</loc>
    <priority>0.5</priority>
    <changefreq>daily</changefreq>
    <lastmod>2020-06-21T14:55:29+0000</lastmod>
  </url>

Octavia Pike admin:

  <url>
    <loc>https://docs.openstack.org/octavia/pike/admin/</loc>
    <priority>1.0</priority>
    <changefreq>weekly</changefreq>
    <lastmod>2019-10-05T14:32:32+0000</lastmod>
  </url>

So, given the lastmod date, we haven't updated the sitemap since 2020 and we probably submitted it with stable branch docs having a higher priority (relative to other pages on our site) than the /latest paths.

Michael

On Tue, Nov 7, 2023 at 4:01 AM Jake Yip <jake.yip@ardc.edu.au> wrote:
Hi Tony,
Thanks for answering!
On 7/11/2023 10:01 am, Tony Breeds wrote:
This has been discussed before, not for a while though. The last time I recall was the first Denver PTG.
In the past we did remove docs as branches were EOLd, which resulted in lots of 404s: a search engine would index a page and return it, but we had since removed it, so the user got a 404.
I think this is similar to what happened to Crossplane and they had to explicitly get Google to reindex those[1].
There was also a lengthy discussion about not removing docs that people are using. I haven't looked at the user survey results but I'm sure you've seen the long tail of people still using very old releases.
In that case, removing those docs may be a step too far.
We could potentially remove the older releases from the index by doing something similar to what the Crossplane project did. That has the potential to solve most of the problems we've all seen, but it doesn't remove the docs for older releases in general.
I attempted to look into it a bit, but it's not my specialty. Also, eventually, it'll require someone to re-trigger indexing by Google via Search Console[2], which needs domain ownership verification (via TXT record, etc).
[1] https://github.com/crossplane/docs/issues/107#issuecomment-990338800
[2] https://search.google.com/search-console/
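
For what it's worth, the sitemap priority problem described at the top of this thread could be checked and corrected with something like the rough Python sketch below. It is only an illustration, not how openstack-manuals actually generates the sitemap. It assumes URLs follow the https://docs.openstack.org/<project>/<series>/... layout seen in the two Octavia entries, that the file uses the standard sitemaps.org namespace, and that 1.0 for /latest and 0.5 for everything else are the priorities we would want. Those values and the names series_of() and reprioritize() are made up for the example.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Standard sitemap protocol namespace (assumed to be what sitemap.xml declares).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
NS = {"sm": SITEMAP_NS}


def series_of(loc):
    """Return the second path segment of a URL, e.g. 'latest' or 'pike'.

    Assumes the <project>/<series>/... layout; returns None otherwise.
    """
    parts = [p for p in urlparse(loc).path.split("/") if p]
    return parts[1] if len(parts) > 1 else None


def reprioritize(xml_text, latest="1.0", other="0.5"):
    """Set <priority> to `latest` for /latest URLs and `other` for the rest.

    Returns the rewritten XML and the list of URLs whose priority changed.
    The default priority values are illustrative assumptions.
    """
    ET.register_namespace("", SITEMAP_NS)  # keep a default namespace on output
    root = ET.fromstring(xml_text)
    changed = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", namespaces=NS)
        prio = url.find("sm:priority", NS)
        if loc is None or prio is None:
            continue  # leave incomplete entries untouched
        loc = loc.strip()
        wanted = latest if series_of(loc) == "latest" else other
        if (prio.text or "").strip() != wanted:
            prio.text = wanted
            changed.append(loc)
    return ET.tostring(root, encoding="unicode"), changed


# The two Octavia entries quoted in this thread, wrapped in a <urlset>.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://docs.openstack.org/octavia/latest/admin/</loc>
    <priority>0.5</priority>
    <changefreq>daily</changefreq>
    <lastmod>2020-06-21T14:55:29+0000</lastmod>
  </url>
  <url>
    <loc>https://docs.openstack.org/octavia/pike/admin/</loc>
    <priority>1.0</priority>
    <changefreq>weekly</changefreq>
    <lastmod>2019-10-05T14:32:32+0000</lastmod>
  </url>
</urlset>"""

if __name__ == "__main__":
    new_xml, changed = reprioritize(SAMPLE)
    for loc in changed:
        print("priority changed:", loc)
```

This only rewrites the priority values; lastmod and changefreq are left alone, and the updated file would still need to be resubmitted to the search engines by someone with verified ownership of the domain.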