Ensure removed jenkins.io pages aren't accessible and indexed anymore #3360
Comments
At one point we enabled --delete on the publish, so it deleted any file that wasn't in the build. I think the plugins Jenkinsfile does it. |
Amongst several changes proposed by @zbynek this weekend (❤️), here is a PR to add I've created a snapshot of the concerned Azure File Storage.
|
Comparison between the fileshare (what's on Fastly) and a clean build, obtained with
Listings
Only in fileshare: These files would be deleted (need to check the first ones at the root folder)
Only in clean build
Files with different content: I can output all diff if needed.
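For reference, the three listings above can be reproduced with a short script. This is only a sketch, not the command actually used for the comparison: it assumes both the file share and the clean build are available as local directories, and `compare_trees` is a hypothetical helper name.

```python
import filecmp
from pathlib import Path


def compare_trees(fileshare: str, clean_build: str):
    """Return (only_in_fileshare, only_in_build, different) as sorted relative paths.

    Assumption: both trees are reachable as local directories.
    """
    only_left, only_right, different = [], [], []

    def walk(cmp: filecmp.dircmp, prefix: Path) -> None:
        # Entries present on one side only; a directory that exists on one
        # side only is reported once, by name, and not descended into.
        only_left.extend(str(prefix / name) for name in cmp.left_only)
        only_right.extend(str(prefix / name) for name in cmp.right_only)
        # Files present on both sides that filecmp considers different.
        different.extend(str(prefix / name) for name in cmp.diff_files)
        for name, sub in cmp.subdirs.items():
            walk(sub, prefix / name)

    walk(filecmp.dircmp(fileshare, clean_build), Path("."))
    return sorted(only_left), sorted(only_right), sorted(different)
```

Note that `filecmp.dircmp` skips a few names by default (for example `.git` and `__pycache__`), and files whose size and modification time both match are treated as identical without reading their content.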
|
All but one of those can be explained by case-insensitive storage and case-sensitive Git -- renaming in Git doesn't move them in storage. Should not be a problem because both lowercase and uppercase URLs work for them. The last one has a non-ASCII character in its filename but seems to have an identical name in storage and Git, hopefully also OK.
The removed pipeline step docs mostly belong to suspended plugins. The extension docs are probably missing because of #3746 |
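To double-check the case-sensitivity explanation above, the two path listings can be compared after case folding. A minimal sketch, assuming the storage and Git listings are available as plain lists of relative paths; `case_only_mismatches` and the example paths are hypothetical.

```python
def case_only_mismatches(storage_paths, git_paths):
    """Return (storage_path, git_path) pairs that differ only by letter case."""
    git_exact = set(git_paths)
    # Index Git paths by their case-folded form for case-insensitive lookup.
    git_by_folded = {path.casefold(): path for path in git_paths}
    pairs = []
    for path in storage_paths:
        match = git_by_folded.get(path.casefold())
        # Keep only paths that match case-insensitively but not exactly.
        if match is not None and path not in git_exact:
            pairs.append((path, match))
    return sorted(pairs)


# Hypothetical example paths, not taken from the actual listing.
storage = ["Doc/Book/index.html", "blog/a.html", "old.html"]
git = ["doc/book/index.html", "blog/a.html"]
print(case_only_mismatches(storage, git))
```

Anything reported as "different" that does not show up in this list would need another explanation.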
@lemeurherve I don't think this should be blocked by the extension indexer. ATM those documents exist but are not linked from the index, so users are reporting them as missing anyway. |
I agree. The unreferenced extension pages are still included in the Algolia search index (the credentials plugin, for example), but those pages are not always reachable even from the Algolia search. @kmartens27 is doing a detailed review of the pages that will be deleted in order to be confident that there are no surprises "lurking" in that list. I'd like to allow him a little more time to check those items before this is implemented. A day or two should be sufficient. |
They will only show up in the Algolia search if they are still linked from somewhere else on jenkins.io. I used I'm not entirely sure why I did it. I think I thought there would be obvious glaring missing links to do with this, but it's still worth seeing. |
The total number of files in the Azure File Share decreased from 21051 to 15856. |
As the outdated links now all return a 403 error when pointing to (now empty) folders, I've disabled the trusted.ci.jenkins.io publication job to revert the
|
Deleted files list: blog, doc, etc.
Removed authors
Pictures, CSS, JS, miscellaneous
doc/developer/extensions & doc/pipeline/steps
bower
|
Empty folder list
|
404 would be better than 403, thanks for looking into that. Maybe it's enough to show the 404 page for the removed content and let visitors use the search instead of creating redirects, at least for most cases? |
If I remember correctly it's the nginx config throwing an error with try_files on the directory, or Apache not having the index option enabled. Or maybe not. It was a long time ago we tried anything. Notes probably forever lost on IRC |
You remember correctly buddy: it matches our analysis \o/ |
I think we fixed it on stories and/or plugins if you want to grab from there. |
I'll just delete empty folders to get rid of the 403 errors. |
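A minimal sketch of such a cleanup, assuming the file share is mounted as a local directory (`remove_empty_dirs` is a hypothetical helper, not the command that was actually run). It walks the tree bottom-up so that folders containing only empty folders are removed as well.

```python
import os


def remove_empty_dirs(root: str) -> list[str]:
    """Delete every empty directory below root and return their relative paths.

    Assumption: root is a local mount of the file share. The root itself is kept.
    """
    removed = []
    # topdown=False visits children before their parents.
    for dirpath, _dirnames, _filenames in os.walk(root, topdown=False):
        if dirpath == root:
            continue
        # Re-list the directory: its children may have just been removed.
        if not os.listdir(dirpath):
            os.rmdir(dirpath)
            removed.append(os.path.relpath(dirpath, root))
    return sorted(removed)
```

Running it against a snapshot copy first is a cheap way to review the returned list before touching the real share.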
I propose to reactivate the |
LGTM for me on this plan (don't forget to announce it!) |
+1 from me as well |
That makes sense to me, +1 for me! |
Deletion activated again and empty folders deleted. TODO: jenkins.io issue. |
Update: we have 2 cases of "moved pages" caught by users:
=> the "configuration for reverse proxy" pages sound like they will need redirections. |
Opened a pull request to take care of the Remaining content of the "blog, doc, etc." section:
Not sure it's worth adding a redirection for them.
I don't know if something should be done for these.
No longer actively maintained, no redirection needed?
I think these project ideas don't need to be referenced anymore.
|
@kmartens27 @MarkEWaite could you please take a look at my comment above and tell me if a jenkins.io issue should be opened to add more redirections than the ones in jenkins-infra/jenkins.io#6817? If not, I'll close this issue once the PR is merged. |
Closing as the work looks done, thanks y'all ! |
As noted in jenkins-infra/jenkins.io#5940 (comment), when a page is removed from the repository, it's not removed from the source website www.origin.jenkins.io cached by Fastly.
We need to find a way to ensure these pages are removed and not indexed anymore.