Validate Canonical field URLs match retrieval location #42
Closed
- Created SecurityTxtCanonicalUrlMismatch violation class
- Added validateCanonicalUrls method to SecurityTxtParser
- Modified parseFetchResult to validate canonical URLs against fetched URL
- Added tests for canonical URL validation
- Per RFC 9116 section 2.5.2, if Canonical field is present, the retrieval URL must be listed
Co-authored-by: spaze <1966648+spaze@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Validate the Canonical field in security.txt" to "Validate Canonical field URLs match retrieval location" on Dec 4, 2025
spaze requested changes on Dec 4, 2025
…ral/singular
- Created CanonicalUrlValidator in Validator namespace
- Updated SecurityTxtCanonicalUrlMismatch to use singular/plural correctly
- Removed validateCanonicalUrls from SecurityTxtParser
- Updated all instantiations of SecurityTxtParser to pass the new validator
Co-authored-by: spaze <1966648+spaze@users.noreply.github.com>
Co-authored-by: spaze <1966648+spaze@users.noreply.github.com>
spaze requested changes on Dec 4, 2025
- Added withFetchedUrl() and getFetchedUrl() to SecurityTxt class
- Moved CanonicalUrlValidator to Validator\Validators namespace
- Made it implement FieldValidator interface
- Added to SecurityTxtValidator fieldValidators list
- Removed CanonicalUrlValidator from Parser constructor
- Parser now sets fetched URL on SecurityTxt object
- Updated violation to generate format dynamically with URL placeholders
- Removed all comments from validator as requested
- Changed $howToFixTemplate to $howToFixFormat
- No longer using implode for URLs in violation
Co-authored-by: spaze <1966648+spaze@users.noreply.github.com>
spaze requested changes on Dec 4, 2025
- Removed fetchedUrl property and with/get methods from SecurityTxt
- Updated SecurityTxtValidator::validate() to accept optional SecurityTxtFetchResult parameter
- Updated CanonicalUrlValidator to accept fetch result parameter
- Parser now passes fetch result directly to validator
- Cleaner design: no fetch-specific state in SecurityTxt object
Co-authored-by: spaze <1966648+spaze@users.noreply.github.com>
…FieldValidator
- CanonicalUrlValidator doesn't implement FieldValidator interface
- Stored as separate property in SecurityTxtValidator
- Called separately when fetch result is available
- Cleaner design following SOLID principles
Co-authored-by: spaze <1966648+spaze@users.noreply.github.com>
spaze requested changes on Dec 4, 2025
- Added validateWithFetchResult() method to SecurityTxtValidator
- Removed optional parameter and if checks from validate()
- Parser only recreates result object when validation result changes
- Cleaner separation of concerns
Co-authored-by: spaze <1966648+spaze@users.noreply.github.com>
Compare arrays instead of objects to correctly determine when to recreate result
Co-authored-by: spaze <1966648+spaze@users.noreply.github.com>
spaze requested changes on Dec 5, 2025
- Added optional fetchResult parameter to parseString()
- parseFetchResult() now simply passes fetchResult to parseString()
- Removed validateWithFetchResult() method
- Removed SecurityTxtParseStringResult recreation in parser
- Canonical validator called from validate() when fetchResult available
Co-authored-by: spaze <1966648+spaze@users.noreply.github.com>
spaze requested changes on Dec 5, 2025
…erface
- Renamed CanonicalUrlValidator to CanonicalUrlFieldValidator
- Implements FieldValidator interface
- Added to $fieldValidators array in SecurityTxtValidator
- Removed separate canonicalUrlValidator property and if check
- Added fetchedUrl property to SecurityTxt with with/get methods
- Parser sets fetchedUrl on SecurityTxt before validation
- Validator accesses fetchedUrl from SecurityTxt object
Co-authored-by: spaze <1966648+spaze@users.noreply.github.com>
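The design this last commit settles on (fetch context stored on the parsed object, read by an ordinary field validator) can be sketched roughly as follows. This is an illustrative Python analogue, not the library's PHP code: the snake_case names, the `canonical` tuple, the warning text, and the assumption that the "with" method returns a modified copy rather than mutating in place are all hypothetical.

```python
from dataclasses import dataclass, replace
from typing import List, Optional, Protocol, Tuple


@dataclass(frozen=True)
class SecurityTxt:
    # Hypothetical stand-in for the parsed file; only the parts needed here.
    canonical: Tuple[str, ...] = ()
    fetched_url: Optional[str] = None

    def with_fetched_url(self, url: str) -> "SecurityTxt":
        # Assumption: returns a copy carrying the retrieval URL.
        return replace(self, fetched_url=url)

    def get_fetched_url(self) -> Optional[str]:
        return self.fetched_url


class FieldValidator(Protocol):
    def validate(self, security_txt: SecurityTxt) -> List[str]: ...


class CanonicalUrlFieldValidator:
    def validate(self, security_txt: SecurityTxt) -> List[str]:
        fetched = security_txt.get_fetched_url()
        # Nothing to check without fetch context or without Canonical fields.
        if fetched is None or not security_txt.canonical:
            return []
        if fetched in security_txt.canonical:
            return []
        return [f"{fetched} is not listed in any Canonical field"]


# The canonical check sits in the same list as any other field validator.
FIELD_VALIDATORS: List[FieldValidator] = [CanonicalUrlFieldValidator()]


def validate(security_txt: SecurityTxt) -> List[str]:
    warnings: List[str] = []
    for validator in FIELD_VALIDATORS:
        warnings.extend(validator.validate(security_txt))
    return warnings
```

With this shape the validator needs no special-case property or extra parameter: a parser would call `with_fetched_url()` before validation only when a fetch result exists, and the validator stays silent otherwise.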
Owner
Superseded by #44
Implemented validation for the Canonical field as specified in RFC 9116 section 2.5.2.
Changes Made
- SecurityTxtCanonicalUrlMismatch warns when the fetched URL doesn't match any canonical URLs
- CanonicalUrlFieldValidator in the Validator\Validators namespace validates that the retrieval URL is listed in the Canonical field
- The validator implements the FieldValidator interface and is added to the $fieldValidators array
- SecurityTxt gains withFetchedUrl() and getFetchedUrl() methods to store the retrieval URL context
- parseString() sets the fetched URL on the SecurityTxt object before validation when a fetch result is available
Implementation Details
- CanonicalUrlFieldValidator: Implements FieldValidator interface, validates canonical URLs by checking getFetchedUrl() on SecurityTxt
- SecurityTxt: Added withFetchedUrl() and getFetchedUrl() methods to store fetch context
- SecurityTxtValidator: Includes CanonicalUrlFieldValidator in $fieldValidators array alongside other field validators
- SecurityTxtParser::parseString(): Sets fetched URL on SecurityTxt before validation when fetch result is available
Behavior
When a security.txt file is fetched and contains a Canonical field that doesn't list the retrieval URL, a SecurityTxtCanonicalUrlMismatch warning is reported.
No validation occurs when:
- parseString() is called directly without the fetchResult parameter
Multiple canonical URLs are supported; validation passes if any of them matches the retrieval URL.
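The behavior described above can be sketched as a small standalone check. This is a hedged Python illustration rather than the PR's PHP code: the function name, the message text, and the use of exact string comparison are assumptions, and skipping files without a Canonical field is inferred from the commit note that the rule applies "if Canonical field is present".

```python
from typing import Optional, Sequence


def find_canonical_mismatch(
    fetched_url: Optional[str],
    canonical_urls: Sequence[str],
) -> Optional[str]:
    """Return a warning message when the retrieval URL is not listed, else None."""
    # No fetch context (content parsed from a string): nothing to compare against.
    if fetched_url is None:
        return None
    # Assumption: with no Canonical field present, the rule does not apply.
    if not canonical_urls:
        return None
    # Validation passes if any one canonical URL matches the retrieval URL.
    if fetched_url in canonical_urls:
        return None
    listed = ", ".join(canonical_urls)
    return f"{fetched_url} is not listed in Canonical ({listed})"
```

For example, a file fetched from `https://example.com/.well-known/security.txt` whose only Canonical field is `https://www.example.com/.well-known/security.txt` would produce a warning under this sketch, while adding the first URL as a second Canonical field would make the check pass.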
Testing
Original prompt
Canonical field #40