Commit

New broken link report (github#16412)
* add linkinator npm package

* add new script that uses Linkinator

* reorg the excluded links file and update comments

* replace blc artifacts with linkinator artifacts in .gitignore

* update the scheduled workflow to use the new script

* dismantle BLC scripts

* add workflow_dispatch event so we can test this manually

* npm uninstall broken-link-checker

* use different exit codes depending on whether broken links are found
sarahs authored Nov 10, 2020
1 parent fa649bf commit ce33df1
Showing 9 changed files with 471 additions and 1,007 deletions.
12 changes: 4 additions & 8 deletions .github/workflows/check-all-english-links.yml
@@ -1,6 +1,7 @@
name: Check all English links

on:
workflow_dispatch:
schedule:
- cron: "40 19 * * *" # once a day at 19:40 UTC / 11:40 PST

@@ -10,21 +11,16 @@ jobs:
if: github.repository == 'github/docs-internal'
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@5a4ac9002d0be2fb38bd78e4b4dbde5606d7042f
- name: npm ci
run: npm ci
- name: npm run build
run: npm run build
- name: Run script
run: script/check-external-links en > broken_links.md
run: script/check-english-links > broken_links.md
- name: Check if any broken links
id: check
run: |
if [ "$(grep 'All links are good' broken_links.md)" ]; then
if [ "$(grep '0 broken links found' broken_links.md)" ]; then
echo ::set-output name=continue::no
else
echo "::set-output name=continue::yes"
echo "::set-output name=title::$(grep 'found on help.github.com' broken_links.md)"
echo "::set-output name=title::$(head -1 broken_links.md)"
fi
- if: ${{ steps.check.outputs.continue == 'yes' }}
name: Create issue from file
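
The check step above decides whether to open an issue by grepping the report for `0 broken links found`, and it uses the first line of the report as the issue title. A minimal JavaScript sketch of that decision follows. The report format (a first line such as `3 broken links found`), the helper name `decideFromReport`, and the exit-code convention are assumptions for illustration, since the script's output is not shown in this diff. One thing the sketch makes visible: a plain substring match for `0 broken links found` would also match a line like `10 broken links found`, so the sketch anchors the match to the start of the line.

```javascript
// Hypothetical helper that mirrors the workflow's "Check if any broken links" step.
// Assumption: the report's first line looks like "<N> broken links found ...".
function decideFromReport (report) {
  const title = report.split('\n')[0]
  // Anchored match, so "10 broken links found" is not mistaken for zero.
  const noneBroken = /^0 broken links found/.test(title)
  return {
    continue: noneBroken ? 'no' : 'yes',
    title: noneBroken ? undefined : title,
    // Assumed convention: 0 when nothing is broken, 1 otherwise.
    exitCode: noneBroken ? 0 : 1
  }
}

const line = '10 broken links found'
console.log(line.includes('0 broken links found')) // true: the substring pitfall
console.log(decideFromReport(line).continue) // 'yes'
```

Whether the real report can produce such a line depends on the script's actual output format, which this section does not show.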
8 changes: 3 additions & 5 deletions .gitignore
@@ -4,8 +4,6 @@
node_modules
npm-debug.log
coverage

# blc: broken link checker
blc_output.log
blc_output_internal.log
dist
.linkinator
broken_links.md
dist
30 changes: 11 additions & 19 deletions lib/excluded-links.js
@@ -1,28 +1,20 @@
// Linkinator treats the following as regex.
module.exports = [
// GitHub search links fail with "429 Too Many Requests"
'https://github.com/search?*',
// Skip GitHub search links.
'https://github.com/search?.*',
'https://github.com/github/gitignore/search?',

// LinkedIn links fail due to bug: https://github.com/stevenvachon/broken-link-checker/issues/91
'https://www.linkedin.com/*',

// blc returns "BLC_UNKNOWN" on this link, even though cURL returns "302 Found"
'https://www.ilo.org/dyn/normlex/en/f?p=NORMLEXPUB:12100:0::NO::P12100_ILO_CODE:P029',

// the codercat link works but blc reports a false 404
'https://github.com/Codertocat/hello-world-npm/packages/10696?version=1.0.1',

// this URL started returning 403 to blc and cURL even though it works in a browser; see docs-internal #10124
'https://haveibeenpwned.com/',
'https://haveibeenpwned.com/*',

// this is a private repo customers are given access to when they purchase Insights; see docs-internal #12037
// These links require auth.
'https://github.com/settings/profile',
'https://github.com/github/docs/edit',
'https://github.com/github/insights-releases/releases/latest',

// developer content uses these for examples; they should not be checked
'http://localhost:1234/*',
// Developer content uses these for examples; they should not be checked.
'http://localhost:1234',
'localhost:3000',

// this URL works but blc reports a false 404
// Oneoff links that link checkers think are broken but are not.
'https://haveibeenpwned.com/',
'https://www.ilo.org/dyn/normlex/en/f?p=NORMLEXPUB:12100:0::NO::P12100_ILO_CODE:P029',
'http://www.w3.org/wiki/LinkHeader/'
]
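
The comment at the top of this file notes that Linkinator treats these entries as regular expressions rather than globs. The sketch below shows what that implies, assuming each entry is compiled with `new RegExp(entry)` and tested, unanchored, against the full URL. The helpers `isExcluded` and `escapeRegExp` are hypothetical and are not Linkinator's own code.

```javascript
// Hypothetical helper; assumes each entry is compiled as an unanchored regex.
function isExcluded (url, patterns) {
  return patterns.some(pattern => new RegExp(pattern).test(url))
}

// Hypothetical helper: escape regex metacharacters so a URL matches literally.
function escapeRegExp (text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')
}

const patterns = [
  'https://github.com/search?.*',
  'localhost:3000'
]

console.log(isExcluded('https://github.com/search?q=linkinator', patterns)) // true
console.log(isExcluded('http://localhost:3000/en', patterns)) // true
console.log(isExcluded('https://docs.github.com/en', patterns)) // false

// In a regex, "f?" means an optional "f", so an unescaped "?" is not a literal "?".
const literal = 'https://example.com/f?p=1'
console.log(isExcluded(literal, [literal])) // false
console.log(isExcluded(literal, [escapeRegExp(literal)])) // true
```

Under that assumption, entries that contain a literal `?` or `.` behave as patterns rather than exact URLs, which is worth keeping in mind when adding one-off links to the list.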