@ksylvan ksylvan commented Jun 18, 2025

Handle reference-style links in checkLinks

Summary

This PR enhances the checkLinks functionality in the markdown-tree-parser CLI tool to properly handle reference-style links and avoid duplicate URL checking. The changes ensure that both inline links and reference-style link definitions are checked, while eliminating redundant checks for the same URL.

Files Changed

bin/md-tree.js

Modified the checkLinks method to:

  • Extract both regular links and reference-style link definitions
  • Collect all URLs from both sources
  • Use a Set to ensure only unique URLs are checked
  • Update the console output to reflect the number of unique URLs being checked

package.json

  • Bumped version from 1.5.0 to 1.5.1 for this bug fix release

test/sample.md

  • Simplified the test file to focus on link testing scenarios
  • Reduced from a comprehensive documentation example to a minimal test case

test/test-email-links.md

  • Removed empty test file that was not being used

test/test-reference-links.md

  • Added new test file specifically for testing reference-style links
  • Includes examples of valid references, local file references, and undefined references

test/test.js

  • Added comprehensive test suite for the checkLinks functionality
  • Tests reference-style link parsing and URL extraction
  • Validates that the CLI correctly identifies and checks unique URLs

Code Changes

bin/md-tree.js

const links = this.parser.selectAll(tree, 'link');
+const definitions = this.parser.selectAll(tree, 'definition');
+
+const allUrls = [];
+for (const link of links) {
+  allUrls.push(link.url);
+}
+for (const definition of definitions) {
+  allUrls.push(definition.url);
+}
+
+const uniqueUrls = new Set(allUrls);

console.log(
-  `\n🔗 Checking ${links.length} links in ${path.basename(resolvedPath)}:`
+  `\n🔗 Checking ${uniqueUrls.size} unique URLs in ${path.basename(resolvedPath)}:`
);

-for (const link of links) {
-  const url = link.url;
+for (const url of uniqueUrls) {

The key change is collecting URLs from both link nodes (inline links) and definition nodes (reference-style link definitions), then using a Set to ensure each URL is only checked once.
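As a minimal, self-contained sketch of that idea (not the PR's actual code), the snippet below walks a hand-built mdast-like tree and gathers URLs from link and definition nodes into a Set. The helpers selectAllByType and collectUniqueUrls and the sample tree are hypothetical stand-ins for this.parser.selectAll and a parsed document.

```javascript
// Hypothetical helper standing in for this.parser.selectAll:
// depth-first walk that returns every node whose `type` matches.
function selectAllByType(node, type, found = []) {
  if (node.type === type) found.push(node);
  for (const child of node.children ?? []) {
    selectAllByType(child, type, found);
  }
  return found;
}

// Collect URLs from inline links and reference-style definitions,
// keeping each distinct URL once.
function collectUniqueUrls(tree) {
  const links = selectAllByType(tree, 'link');
  const definitions = selectAllByType(tree, 'definition');
  return new Set([...links, ...definitions].map((node) => node.url));
}

// Hand-built tree for illustration only (assumed shape, not parser output).
const tree = {
  type: 'root',
  children: [
    {
      type: 'paragraph',
      children: [
        {
          type: 'link',
          url: 'https://example.com',
          children: [{ type: 'text', value: 'inline' }],
        },
        {
          type: 'linkReference',
          identifier: 'ref',
          children: [{ type: 'text', value: 'text' }],
        },
      ],
    },
    { type: 'definition', identifier: 'ref', url: 'https://example.com' },
    { type: 'definition', identifier: 'docs', url: './docs/guide.md' },
  ],
};

const uniqueUrls = collectUniqueUrls(tree);
console.log(`Checking ${uniqueUrls.size} unique URLs`);
```

Because a Set preserves insertion order, URLs are visited in the order they were first collected: inline links first, then definitions.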

Reason for Changes

The previous implementation had two issues:

  1. Missing reference-style links: When a document used markdown reference-style links such as [text][ref] with [ref]: https://example.com, the URLs in the link definitions were never checked
  2. Duplicate checking: If the same URL appeared multiple times in a document, it would be checked multiple times, leading to unnecessary network requests and confusing output
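To illustrate the second issue, here is a small sketch that counts how many checks run with and without deduplication. The makeCountingChecker stub is hypothetical and stands in for a real network request; the URL list is made up for the example.

```javascript
// Hypothetical stub in place of a real network request: it only counts calls.
function makeCountingChecker() {
  let calls = 0;
  return {
    check(url) {
      calls += 1;
      return !url.includes('broken');
    },
    get calls() {
      return calls;
    },
  };
}

// Made-up input: the same URL appears three times.
const urls = [
  'https://example.com',
  'https://example.com',
  'https://example.com/broken',
  'https://example.com',
];

// Old behaviour: one check per link occurrence.
const perOccurrence = makeCountingChecker();
for (const url of urls) perOccurrence.check(url);

// New behaviour: one check per distinct URL.
const perUniqueUrl = makeCountingChecker();
for (const url of new Set(urls)) perUniqueUrl.check(url);

console.log(perOccurrence.calls, perUniqueUrl.calls);
```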

Impact of Changes

  • Improved coverage: URLs from both inline links and reference-style link definitions are now checked
  • Better performance: Duplicate URLs are only checked once, reducing unnecessary network requests
  • Clearer output: Users see the actual number of unique URLs being checked rather than the total number of link references
  • No breaking changes: The fix is backward compatible and maintains the same CLI interface

Test Plan

Added comprehensive tests that verify:

  1. Reference-style link definitions are properly extracted
  2. Both inline and reference-style links are collected
  3. Duplicate URLs are deduplicated correctly
  4. The CLI output correctly reports the number of unique URLs
  5. Local file links and external URLs are both handled properly

The tests create temporary files to ensure local link checking works correctly and clean up after themselves.
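A rough sketch of that temporary-file pattern, using only Node's built-in modules, might look like the following. The localLinkExists helper and the file names are assumptions for illustration and are not taken from test/test.js.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical stand-in for the local-link check: resolve the URL
// relative to the markdown file's directory and test for existence.
function localLinkExists(markdownDir, url) {
  return fs.existsSync(path.resolve(markdownDir, url));
}

// Create an isolated scratch directory so the test never touches the repo.
const tempDir = fs.mkdtempSync(path.join(os.tmpdir(), 'md-tree-test-'));
let results;
try {
  fs.writeFileSync(path.join(tempDir, 'target.md'), '# Target\n');
  results = {
    existing: localLinkExists(tempDir, './target.md'),
    missing: localLinkExists(tempDir, './missing.md'),
  };
} finally {
  // Clean up even if an assertion or write fails.
  fs.rmSync(tempDir, { recursive: true, force: true });
}

const cleanedUp = !fs.existsSync(tempDir);
console.log(results, cleanedUp);
```

Putting the cleanup in a finally block means a failing check does not leave stray files behind.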

Additional Notes

  • The fix maintains compatibility with existing functionality while extending support for reference-style links
  • Deduplicating with a Set keeps the work proportional to the number of URLs collected, so it adds little overhead even for large documents
  • The test suite includes edge cases like undefined references to ensure robust error handling

Changes

- Recognize and validate reference-style markdown links.
- Check each unique URL only once for efficiency.
- Update link check summary to report unique URLs.
- Add comprehensive tests for reference link validation.
- Simplify and clean up markdown test fixture files.

@ksylvan ksylvan merged commit 5ca2c62 into main Jun 18, 2025
4 checks passed
@ksylvan ksylvan deleted the 0617-fix-checklinks branch June 18, 2025 01:29