[Broken Links] Due to the passage of time, there are some broken links in our old files. #463

Open
1 of 2 tasks
MatthewJeffson opened this issue Jul 28, 2023 · 8 comments
Assignees
Labels
ongoing This assignment is still ongoing Tier 0 This is Tier 0 task

Comments

@MatthewJeffson
Collaborator

MatthewJeffson commented Jul 28, 2023

Overview: There are some broken links in our old files.

  • After the GitHub update, we ran into an invalid URL error related to a URL feature.
  • While checking, we also found some incorrect links in our old files, so the affected pages need to be updated.

Assignment acceptance

  • Anyone is welcome to work with us!
  • Leave your GitHub username below to indicate that you are willing to accept the assignment.
  • If you are taking on the assignment for the first time, you may receive an invitation in your GitHub mail.

Requirements (checklist)

  • Locate all the broken links.
    reference: GitHub Action
  • Update the pages with broken links.
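As a rough illustration of the first requirement, the sketch below pulls http(s) URLs out of a Markdown string so they can be checked later. It is not the GitHub Action referenced above; the function name `extract_links` and the regular expression are assumptions made for this example, and the pattern is deliberately simple (it stops at whitespace, brackets, and quotes, so URLs containing parentheses are cut short).

```python
import re

# Simplified pattern (an assumption for this sketch): an http(s) URL runs
# until whitespace, an angle bracket, a parenthesis, or a quote.
URL_PATTERN = re.compile(r"https?://[^\s<>()\"']+")


def extract_links(markdown_text):
    """Return the unique http(s) URLs found in a Markdown string, in order."""
    links = []
    for url in URL_PATTERN.findall(markdown_text):
        url = url.rstrip(".,;")  # drop trailing sentence punctuation
        if url not in links:
            links.append(url)
    return links


sample = "See [wiki](https://example.com/wiki) and https://example.com/lib."
print(extract_links(sample))
```

Running it over every `.md` file in the repository would give the candidate list to check; de-duplicating up front keeps the number of requests down.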

@MatthewJeffson MatthewJeffson converted this from a draft issue Jul 28, 2023
@MatthewJeffson MatthewJeffson self-assigned this Jul 28, 2023
@MatthewJeffson MatthewJeffson added ongoing This assignment is still ongoing Tier 0 This is Tier 0 task labels Jul 28, 2023
@MatthewJeffson MatthewJeffson changed the title [Broken Links Hunted] Due to the passage of time, there are some broken links in our old files. [Broken Links] Due to the passage of time, there are some broken links in our old files. Jul 28, 2023
@Dangerousfish
Contributor

I assume we're referring to links within libraries, or does this relate to wiki & the seeedStudio website?

i.e.:

Correct URL:

Current URL:

@MatthewJeffson - if you have a list still to hand, I don't mind seeing what time I can allocate to this.

@MatthewJeffson
Collaborator Author

@Dangerousfish Wow! Thank you so much for offering to fix this! By broken links I mean links that return a 404 when clicked, including, yes, all the invalid library links, invalid wiki links, and especially the Bazaar websites. Your example is exactly right!
As for the list, I am running this GitHub Action: markdown url checking, and I can see the list in the log (okay... the previous log has disappeared, so I will run a new one).

That is basically all the information I have for now.

Best Regards,
Matthew

@MatthewJeffson
Collaborator Author

The new workflow run has finished, and I can see the latest broken URLs:

[Image: WeCom screenshot_02e11f62-b30b-42be-96ac-0e0be9397fcd]

I will update the comment as well.

@Dangerousfish
Contributor

Hey @MatthewJeffson ,

Thank you for running the script again.

I have transformed the output of your script into the following spreadsheet, which should assist us with the completion of this task (or at the least, allow it to have the appearance of being less ominous).

Download:
final_broken_links.xlsx

Sorry it's taken me so long to follow up on this one.

I'll get back to this when I next find an opportunity.

Best Regards,

Fish

@Dangerousfish
Contributor

Note:
Approximately 12,000 broken URLs

@MatthewJeffson - A lot of these failures are returning HTTP status 429 (rate-limited).

  • Given the significant number of URLs to review, and the apparent number that were blocked by API rate limits, I would recommend re-running the GitHub Action with a longer delay between API queries, so that we can obtain a more realistic number to remedy.
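To make the "longer delay" suggestion concrete, here is a minimal sketch of a checker that pauses between requests and backs off when it sees a 429. It uses only the Python standard library and is not how the GitHub Action itself works; the names `fetch_status` and `check_links`, the default delay, and the retry count are all assumptions for illustration.

```python
import time
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or None if no response arrived."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # 4xx/5xx responses still carry a status code
    except (OSError, ValueError):
        return None  # DNS failure, timeout, malformed URL, and so on


def check_links(urls, delay=1.0, max_retries=3, fetch=fetch_status, sleep=time.sleep):
    """Check each URL once, pausing `delay` seconds between URLs.

    A 429 response is retried up to `max_retries` times, doubling the wait
    before each retry. `fetch` and `sleep` are injectable for testing.
    """
    results = {}
    for url in urls:
        wait = delay
        retries = 0
        status = fetch(url)
        while status == 429 and retries < max_retries:
            wait *= 2
            sleep(wait)
            status = fetch(url)
            retries += 1
        results[url] = status
        sleep(delay)
    return results
```

Some servers reject `HEAD` requests, so a fuller version might fall back to `GET` before reporting a link as broken. A URL that still returns 429 after the retries is better treated as "not checked" than as broken.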

@Dangerousfish
Contributor

Updated File with Duplicates and errors:
duplicates_and_errors.xlsx

Error         Count
Status: 400   3712
Status: 401   5
Status: 403   587
Status: 404   438
Status: 410   1
Status: 429   603
Status: 500   2
Status: 501   2
Status: 503   4
Status: 520   3
Status: 521   9
Status: 999   3
Unknown       6337
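A tally like the one above can be produced from raw results in a few lines. The sketch below is only an assumption about how such a summary might be built, not the process actually used for the spreadsheet: `summarize` is a hypothetical helper that drops duplicate URLs and counts the remaining entries by status, using the same "Status: N" and "Unknown" labels.

```python
from collections import Counter


def summarize(results):
    """Count unique URLs per status label.

    `results` is an iterable of (url, status) pairs, where status is an int
    or None when no HTTP response was received. If a URL appears more than
    once, the last status seen for it wins.
    """
    unique = dict(results)
    return Counter(
        f"Status: {status}" if status is not None else "Unknown"
        for status in unique.values()
    )


counts = summarize([
    ("https://example.com/a", 404),
    ("https://example.com/a", 404),  # duplicate, counted once
    ("https://example.com/b", 429),
    ("https://example.com/c", None),
])
for label, count in sorted(counts.items()):
    print(label, count)
```

Separating out the 429 and "Unknown" rows before fixing anything would show how many links still need a proper re-check.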

@Dangerousfish
Contributor

Is there a sitemap anywhere that we can reference from, with valid URLs?

I'm happy to run a sniffer against SeeedStudio (with permission).
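If a sitemap does turn out to exist, a comparison along these lines could flag links that are not in it. This is a sketch under assumptions: it expects the standard sitemaps.org XML format, the helper names `sitemap_urls` and `not_in_sitemap` are made up for this example, and the sample XML uses placeholder `example.com` addresses rather than real site URLs.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return the set of <loc> URLs listed in a sitemap XML string."""
    root = ET.fromstring(xml_text)
    return {
        loc.text.strip()
        for loc in root.findall(".//sm:loc", SITEMAP_NS)
        if loc.text
    }


def not_in_sitemap(links, valid_urls):
    """Return the links that do not appear in valid_urls, ignoring a trailing slash."""
    normalized = {url.rstrip("/") for url in valid_urls}
    return [link for link in links if link.rstrip("/") not in normalized]


sample_sitemap = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/products/</loc></url>
  <url><loc>https://example.com/wiki</loc></url>
</urlset>
"""
valid = sitemap_urls(sample_sitemap)
print(not_in_sitemap(["https://example.com/products", "https://example.com/old"], valid))
```

A link missing from the sitemap is not necessarily broken, so this would narrow the list rather than replace the HTTP check.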

@MatthewJeffson
Collaborator Author

Hello @Dangerousfish! Thank you so much for considering this! And yes, I think our website can be sniffed :D I will double-check with the website manager.
Sorry for the late reply; I am on vacation, haha.
Best Regards,
Matthew


2 participants