IBX-4175: Add links checker #2130

Draft
wants to merge 27 commits into
base: 5.0
Commits
0bf03af
Add links checker
adriendupuis Sep 5, 2023
f784962
Merge branch 'master' into test-links
adriendupuis Jan 2, 2024
e11f4a5
Sync links.php
adriendupuis Jan 2, 2024
aec9328
Sync links.php
adriendupuis Jan 5, 2024
277f1df
Sync links.php
adriendupuis Jan 5, 2024
a1a0f26
Sync links.php
adriendupuis Jan 5, 2024
825bde6
Comment links.config.php's exclusion of docs/index.md
adriendupuis Jan 5, 2024
fcf3082
links.config.php: Ignore https://www.json.org/ → https://www.JSON.org…
adriendupuis Jan 9, 2024
69570ba
links.config.php: Ignore malicious URL example
adriendupuis Jan 16, 2024
f2be2cb
Merge branch 'master' into test-links
adriendupuis Jan 16, 2024
ec9d1cc
links.config.php: Ignore scheme examples
adriendupuis Jan 16, 2024
0fe6da3
links.config.php: Ignore fragments with badges
adriendupuis Jan 16, 2024
7d77edb
links.php: TAdd a TODO about badges, and fragment search
adriendupuis Jan 17, 2024
c17a6ac
Merge branch 'master' into test-links
adriendupuis Feb 13, 2025
96332d4
links.php: mark params as nullable
adriendupuis Feb 14, 2025
96a01ee
links.php: mark params as nullable
adriendupuis Feb 15, 2025
b116b6a
links.config.php: Add two fake hosts to ignore
adriendupuis Feb 15, 2025
0e767f4
Merge branch 'master' into test-links
adriendupuis Feb 19, 2025
a0f2d7d
PHP CS Fixes
adriendupuis Feb 19, 2025
aefb01c
links.config.php: ignore & exclude few more.
adriendupuis Feb 19, 2025
61a9d47
links.php: Enh doc
adriendupuis Feb 19, 2025
aeba155
links.php: Can use curl in some cases
adriendupuis Feb 20, 2025
bd693ac
links.config.php: AWS doc answers to curl, not PHP
adriendupuis Feb 20, 2025
e4b8629
Merge branch 'master' into test-links
adriendupuis Apr 9, 2025
eb12bbb
links.php: Fix type hinting
adriendupuis Apr 9, 2025
3c19be4
links.php: Fix types
adriendupuis Apr 9, 2025
05dd9ee
links.php: ease UrlTester::isExcludedUrl usage
adriendupuis Apr 9, 2025
156 changes: 156 additions & 0 deletions tools/links/links.config.php
@@ -0,0 +1,156 @@
<?php

$usageFiles = call_user_func(function (array $a): array {
    asort($a);
    return $a;
}, (new Finder('./docs'))
    ->includeName('*.md')
    ->excludeWholeName('./docs/release_notes/*') // TMP
    ->excludeWholeName('./docs/snippets/*') // TMP
    ->find());

$resourceFiles = call_user_func(function (array $a): array {
    asort($a);
    return $a;
},
    // Images
    (new Finder('./docs'))
        ->includeName('*.png')
        ->includeName('*.jpg')
        ->excludeWholeName('./docs/release_notes/img/*') // TMP
        ->find(),
);

$exclusionTests = array_merge_recursive(UrlTester::getDefaultExclusionTests(), [
    'url' => [
        function (string $url, ?string $file = null): bool {
            // docs/index.md content is not Markdown but HTML with server URLs
            return 'docs/index.md' === $file && str_starts_with($url, 'docs/');
        },
        function (string $url, ?string $file = null): bool {
            // ibexa.co APIs needing authentication, namespaces, commercial aliases, etc.
            return str_starts_with($url, 'https://updates.ibexa.co') // 401
                || str_starts_with($url, 'https://flex.ibexa.co') // 404
                || str_starts_with($url, 'https://support.ibexa.co') // 302 → /login
                || str_starts_with($url, 'https://connect.ibexa.co') // 302 → https://ibexa.integromat.celonis.com/
                || str_starts_with($url, 'http://ibexa.co/namespaces/') // 301
                || str_starts_with($url, 'http://ibexa.co/xmlns/') // 301
                || str_starts_with($url, 'http://ez.no/namespaces/') // 301
                || str_starts_with($url, 'https://api.cloud.ibexa.co') // 301, PLATFORMSH_CLI_API_URL
                //|| str_starts_with($url, 'https://admin.perso.ibexa.co/api/') // 400
                || str_starts_with($url, 'https://admin.perso.ibexa.co/') // 404
                || str_starts_with($url, 'https://event.perso.ibexa.co/api/') // 400
Suggested change
- || str_starts_with($url, 'https://event.perso.ibexa.co/api/') // 400
+ || str_starts_with($url, 'https://event.perso.ibexa.co/api/') // 404
+ || str_starts_with($url, 'https://tracker.ibexa.co/api/') // Could not resolve host

@adriendupuis adriendupuis Jan 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

                || str_starts_with($url, 'https://event.perso.ibexa.co/ebl/') // 404
                || str_starts_with($url, 'https://import.perso.ibexa.co/api/') // 400
                || str_starts_with($url, 'https://reco.perso.ibexa.co') // 403
            ;
Suggested change
- ;
+ || str_starts_with($url, 'https://flex.ibexa.co') // Could not resolve host
+ ;

        },
        function (string $url, ?string $file = null): bool {
            // Third-party APIs, namespaces, etc.
            return str_starts_with($url, 'https://api.fastly.com') // 301 or 403
                || str_starts_with($url, 'https://unsplash.com') // 405
                || str_starts_with($url, 'http://docbook.org/ns/') // 301
                || str_starts_with($url, 'http://www.w3.org/1999/xlink') // 301 → https
            ;
        },
        function (string $url, ?string $file = null): bool {
            // Local and placeholder hosts; dots are escaped so they match only literal dots
            return (bool)preg_match('@(https?:)?//([a-z]+\.)?(localhost|127\.0\.0\.1|123\.456\.789\.0)(:[0-9]+)?(/.*|$)@', $url);
        },
        function (string $url, ?string $file = null): bool {
            // Fake, placeholders, local servers, etc.
            return str_contains($url, 'foobar.com')
                || str_contains($url, 'mydomain.com')
                || str_contains($url, '//my_site.com')
                || str_contains($url, '//address.of/')
                || str_contains($url, '//some/file/here')
                || str_contains($url, '//mydoc.pdf')
                || str_contains($url, 'var/ezdemo_site/')
                || str_contains($url, '//server_uri')
                || str_contains($url, '//FRA_server_uri')
                || str_contains($url, '//user:password@host')
                || str_contains($url, '//user:pass@localhost')
                || str_contains($url, '//elasticsearch:9200')
                || str_contains($url, '//varnish:80')
                || str_contains($url, '//my.varnish.server')
            ;
        },
        function (string $url, ?string $file = null): bool {
            return str_starts_with($url, '/assets/')
                || str_contains($url, '{{ asset(');
        },
        function (string $url, ?string $file = null): bool {
            return str_contains($url, 'javascript:');
        },
        function (string $url, ?string $file = null): bool {
            return str_contains($url, '{{ path(')
                //|| str_contains($url, '{{ ez_path(')
                || str_contains($url, '{{ ibexa_path(')
                || str_contains($url, '{{ ibexa_url(')
                //|| str_contains($url, "|e('html_attr')")
                || str_contains($url, '{{ image_uri }}')
                || str_contains($url, '{{ ibexa_checkout_step_path(')
                || str_contains($url, '{{ ibexa_checkout_step_url(')
            ;
        },
        function (string $url, ?string $file = null): bool {
            // $file is nullable: guard before passing it to str_ends_with()
            return null !== $file
                && str_ends_with($file, '/rest_api_authentication.md')
                && str_ends_with($url, 'web+ez:DELETE /content/locations/1/2');
        },
        function (string $url, ?string $file = null): bool {
            return null !== $file
                && str_ends_with($file, '/file_url_handling.md')
                && (str_contains($url, 'http://`') || str_contains($url, 'ftp://`'));
        },
    ],
    'location' => [
        function (string $url, string $location, ?string $file = null): bool {
            return str_starts_with($url, 'https://issues.ibexa.co/') && str_starts_with($location, 'https://issues.ibexa.co/login.jsp');
        },
        function (string $url, string $location, ?string $file = null): bool {
            return $url === $location && str_starts_with($url, 'https://twitter.com/');
        },
        function (string $url, string $location, ?string $file = null): bool {
            // The `?? null` guard avoids an undefined-offset warning when $location has no `?v=` part
            return str_starts_with($url, 'https://youtu.be/') && explode('/', $url)[3] . '&feature=youtu.be' === (explode('?v=', $location)[1] ?? null);
        },
        function (string $url, string $location, ?string $file = null): bool {
            return str_starts_with($url, 'https://www.facebook.com/') && str_starts_with($location, 'https://www.facebook.com/unsupportedbrowser');
        },
        function (string $url, string $location, ?string $file = null): bool {
            return 'https://www.json.org/' === $url && 'https://www.JSON.org/json-en.html' === $location;
        },
        function (string $url, string $location, ?string $file = null): bool {
            // preg_match() returns int|false; cast it to honor the bool return type
            return $url === 'https://console.aws.amazon.com/iam/home#/users'
                && (bool)preg_match('@https://[a-z0-9-]+\.console\.aws\.amazon\.com/iam/home#/users@', $location);
        },
    ],
    'fragment' => [
        /*function (string $url, ?string $file = null): bool {
            return str_ends_with($file, '.md') && (
                // ## Commerce [[% include 'snippets/commerce_badge.md' %]]
                str_ends_with($url, '/permission_use_cases.md#commerce')
                // ### Ensure proper Captcha behavior [[% include 'snippets/experience_badge.md' %]] [[% include 'snippets/commerce_badge.md' %]]
                || str_ends_with($url, '/reverse_proxy.md#ensure-proper-captcha-behavior')
            );
        },*/
        function (string $url, ?string $file = null): bool {
            return str_starts_with($url, 'https://classic.yarnpkg.com/en/docs/')
                || str_starts_with($url, 'https://ddev.readthedocs.io/');
        },
        function (string $url, ?string $file = null): bool {
            return 'https://www.paypal.com/bizsignup/#/singlePageSignup' === $url;
        },
    ],
]);

$curlUsageTests = [
    function (string $url, ?string $file = null): bool {
        return str_starts_with($url, 'https://docs.aws.amazon.com/');
    },
];

$replacements = [ // TODO Get from mkdocs.yml
    '[[= symfony_doc =]]' => 'https://symfony.com/doc/5.4',
    '[[= user_doc =]]' => 'https://doc.ibexa.co/projects/userguide/en/master',
];

$find = './docs/*/';
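
For readers of this config, here is a minimal sketch of how the exclusion lists could be built and consumed. It is an illustration under assumptions, not the code from `links.php` (which this diff does not show): `$defaultTests` is a hypothetical stand-in for `UrlTester::getDefaultExclusionTests()`, `isExcludedByAny()` is a hypothetical helper rather than `UrlTester::isExcludedUrl`, and the custom tests are simplified copies of ones in the config. It assumes PHP 8.0 or later for `str_starts_with()`, `str_ends_with()`, and `str_contains()`.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for UrlTester::getDefaultExclusionTests();
// the real defaults live in links.php, which this diff does not show.
$defaultTests = [
    'url' => [
        function (string $url, ?string $file = null): bool {
            return str_starts_with($url, 'mailto:');
        },
    ],
    'location' => [],
];

// Project-specific tests, simplified from the config above.
$customTests = [
    'url' => [
        function (string $url, ?string $file = null): bool {
            return (bool) preg_match('@(https?:)?//([a-z]+\.)?(localhost|127\.0\.0\.1)(:[0-9]+)?(/.*|$)@', $url);
        },
        function (string $url, ?string $file = null): bool {
            return null !== $file
                && str_ends_with($file, '/file_url_handling.md')
                && str_contains($url, 'http://`');
        },
    ],
    'fragment' => [
        function (string $url, ?string $file = null): bool {
            return str_starts_with($url, 'https://ddev.readthedocs.io/');
        },
    ],
];

// For shared string keys, array_merge_recursive() appends the nested lists
// instead of overwriting them, so default and custom tests coexist.
$exclusionTests = array_merge_recursive($defaultTests, $customTests);

/**
 * Hypothetical helper (not the UrlTester::isExcludedUrl from links.php):
 * returns true as soon as one closure in the list matches.
 *
 * @param array<int, callable> $tests
 */
function isExcludedByAny(array $tests, string $url, ?string $file = null): bool
{
    foreach ($tests as $test) {
        if ($test($url, $file)) {
            return true;
        }
    }

    return false;
}

var_dump(count($exclusionTests['url']));
var_dump(isExcludedByAny($exclusionTests['url'], 'http://localhost:8080/admin'));
var_dump(isExcludedByAny($exclusionTests['url'], 'https://doc.ibexa.co/'));
```

In this sketch the `url` list ends up holding the default test followed by the two custom ones, the localhost URL is excluded, and the doc.ibexa.co URL is not. Keeping each rule as a small closure means a single URL pattern can be added or removed without touching the others.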