textlint rule to make sure every link in a document is available.
The targets of this rule is Markdown documents and plain text documents (See tests).
This rule is mainly adopted from textlint-rule-no-dead-link but much more faster and more features provided.
$ npm install textlint-rule-alive-link
$ npm install textlint textlint-rule-alive-link
$ textlint --rule textlint-rule-alive-link text-to-check.txt
Shows an error if a link is dead (i.e. its server returns one of the "non-ok" responses).
Shows an error if an image link is dead (i.e. its server returns one of the "non-ok" responses).
Shows an error if a link is obsolete or moved to another location (i.e. its server returns one of the "redirect" responses).
This error is fixable and textlint
will automatically replace the obsolete links with their new ones if you run it with --fix
option.
Sometimes your files contain relative URIs, which don't have domain information in an URI string. In this case, we have to somehow resolve the relative URIs and convert them into absolute URIs.
The resolution strategy is as follows:
- If the URI is like
//example.com
, the rule checkhttps://example.com
instead. - If
baseURI
is specified, use that path to resolve relative URIs (See the below section for details). - If not, try to get the path of the file being linted and use its parent folder as the base path.
- If that's not available (e.g., when you are performing linting from API), put an error
Unable to resolve the relative URI
.
URI RegExpression this rule using is:
/(?:https?:)?\/\/(?:www\.)?[-a-z0-9@:%._+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b(?:[-\p{L}0-9()@:%_+.~#?&/=]*)/gu
Please write your configurations in .textlintrc
, .textlintrc.js
is recommended.
The default options are:
{
"rules": {
"alive-link": {
language: 'en', // {string} 'en' outputs info in English, set to 'zh' if you want Chinese
checkRelative: true, // {boolean} `false` disables the checks for relative URIs.
baseURI: null, // {string|function} a base URI to resolve relative URIs. If baseURI is a function, baseURI(uri) will be the URL to be checked, Function is only supported by .textlintrc.js.
ignore: [], // {string|regExp|function[]} URIs to be skipped from availability checks. RegExp and Function are only supported by .textlintrc.js.
ignoreRedirects: false, // {boolean} `false` ignores redirect status codes.
ignoreStringLinks: false, // {boolean} `false` skip URI_REGEXP checking in string.
preferGET: [], // {string[]} origins to prefer GET over HEAD.
retry: 3, // {number} Max retry count
concurrency: 8, // {number} Concurrency count of linting link [Experimental]
interval: 500, // {number} The length of time in milliseconds before the interval count resets. Must be finite. [Experimental]
intervalCap: 8, // {number} The max number of runs in the given interval of time. [Experimental]
userAgent: 'textlint-rule-alive-link/0.1', // {string} a UserAgent,
maxRetryDelay: 10, // {number} The max of waiting seconds for retry. It is related to `retry` option. It does affect to `Retry-After` header.
maxRetryAfterDelay: 10, // {number} The max of waiting seconds for `Retry-After` header.
timeout: 20, // {number} Request timeout, default is 20s.
}
}
}
This rule outputs error info in English by default (language: 'en'
).
You can change to Chinese by passing 'zh'
to this option.
This rule checks the availability of relative URIs by default.
You can turn off the checks by passing false
to this option.
The base URI to be used for resolving relative URIs.
string
type baseURI
.
Though its name, you can pass either an URI starting with http
or https
, or an file path starting with /
.
This option uses URL.resolve(baseURI, relativeURI)
method in node url
module to join URIs.
function
type baseURI
.
This option uses baseURI(url)
result as the URI to be checked.
Notice: Only supported in .textlintrc.js
Examples:
"alive-link": {
"baseURI": "https://example.com/path/"
}
// Markdown content:
// [page1](/sub1/to/page1)
// [page2](sub2/to/page2)
// Parsed link:
// https://example.com/sub1/to/page1
// https://example.com/path/sub2/to/page2
"alive-link": {
"baseURI": "/Users/textlint/path/to/parent/folder/"
}
// Markdown content:
// [file](/path/to/file)
// Parsed link:
// /Users/textlint/path/to/parent/folder/path/to/file
"alive-link": {
"baseURI": url => 'https://example.com/' + url
}
// Markdown content:
// [page](page.html)
// Parsed link:
// https://example.com/page.html
string
type config in ignore
option array.
An array of URIs, globs, to be ignored. These list will be skipped from the availability checks.
The matching method use minimatch.
regExp
type config in ignore
option array.
URI matching RegExpression, matched URI will be ignored.
Notice: Only supported in .textlintrc.js
function
type config in ignore
option array.
URI checking function, URI will be the function param and Boolean should be returned.
Notice: Only supported in .textlintrc.js
Example:
"alive-link": {
"ignore": [
"http://example.com/not-exist/index.html", // URI string
"http://example.com/*" // glob format string
/example\.com/g, // regExp
uri => uri.startsWith('https://example.com/') // function
]
}
An array of origins to lets the rule connect to the origin's URL by GET
instead of default HEAD
request at first time.
Example:
"alive-link": {
"preferGET": [
"http://example.com"
]
}
This rule checks for redirects (3xx status codes) and considers them an error by default.
To ignore redirects during checks, set this value to false
.
This rule extracts links from plain text using by default.
To ignore extracting during checks, set this value to false
.
Example:
// example 1
"alive-link": {
"ignoreStringLinks": true
}
// Won't check this URL: https://example.com/
// example 2
"alive-link": {
"ignoreStringLinks": false
}
// Will check this URL: https://example.com/
This rule checks the URL with retry.
The default max retry count is 3
.
Set smaller to save linting time as needed.
Retry request method:
- by
Response Headers Allow
parameter ifstatusCode
is405
. GET
as alternative.
Customize User-Agent
http header.
The max of waiting seconds for retry. It is related to retry
option.
📝 It does affect to Retry-After
header. If you want to max waiting seconds for Retry-After
header, please use maxRetryAfterDelay
option.
Default: 10
The max of allow waiting time second for Retry-After
header value.
Some website like GitHub returns Retry-After
header value with 429 too many requests
.
This maxRetryAfterDelay
option is for that Retry-After
.
Default: 10
Request timeout (seconds).
Default: 20
npm test