fix broken urls at source rather than in the indexer #2505
Conversation
josh-collinsworth
left a comment
Looks good at a quick read; confirmed it solves the problem in the browser. I'm a little nervous about edge cases we might be missing, since we're doing a lot of string manipulation here, but maybe the input is more predictable than I think. (Or, maybe we can fix those as we go, since they seem less damaging than the current state of things anyway.)
  <a
-   href="${this.escapeHtml(this.formatUrl(hit.document.url, hit))}"
+   href="${
+     this.escapeHtml(hit.document.url || hit.document.path || "#")
Non-blocking: we should have a better fallback than href="#"
maybe homepage? I wasn't sure what the best option was
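One way to act on the fallback discussion is to move the choice into a small helper, so the template never hard-codes `#`. The sketch below is illustrative only: `resolveHitHref`, the `HitDocument` shape, and the `/` homepage default are assumptions, not code from this PR.

```typescript
// Hypothetical shape: only the two fields the diff above reads.
interface HitDocument {
  url?: string;
  path?: string;
}

// Hypothetical helper: returns the first non-blank candidate, else the fallback.
// The "/" default assumes the site's homepage lives at the root path.
function resolveHitHref(doc: HitDocument, fallback: string = "/"): string {
  for (const candidate of [doc.url, doc.path]) {
    if (typeof candidate === "string" && candidate.trim() !== "") {
      return candidate.trim();
    }
  }
  return fallback;
}
```

Unlike a plain `||` chain, this also treats whitespace-only values as missing. The result would still need to pass through `escapeHtml` before being interpolated into the template.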
- // Remove any trailing or leading hashes, pipes, or other separators
- cleaned = cleaned.replace(/^[#\|\-\s]+|[#\|\-\s]+$/g, "");
+ // Remove any trailing or leading hashes
+ cleaned = cleaned.replace(/^#+|#+$/g, "");
Are we certain hashes are the only characters we have to worry about? I only ask because quite a few others were present in the original version.
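To make the question concrete, the two patterns from the diff can be run side by side. This is a minimal sketch: the function names and the sample titles are made up for illustration and are not taken from the real index.

```typescript
// Old pattern from the diff: trims hashes, pipes, dashes, and whitespace at either end.
const trimSeparators = (s: string): string =>
  s.replace(/^[#\|\-\s]+|[#\|\-\s]+$/g, "");

// New pattern from the diff: trims hashes only.
const trimHashes = (s: string): string => s.replace(/^#+|#+$/g, "");

// Made-up sample titles, chosen to exercise each separator.
const samples: string[] = ["## Title |", "Title#", "- Title -"];
for (const s of samples) {
  console.log(
    JSON.stringify(s),
    "old:", JSON.stringify(trimSeparators(s)),
    "new:", JSON.stringify(trimHashes(s)),
  );
}
```

With the hashes-only pattern, leading or trailing pipes, dashes, and whitespace survive, so whether the simpler version is safe depends on whether the source data ever contains them.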
search.client.ts
Outdated
- .replace(/\s*Jump\s+to\s+heading\s*/gi, "")
- .replace(/\s*#Jump-to-heading\s*/gi, "")
- .replace(/\s*-Jump-to-heading\s*/gi, "")
+ .replace(/Jump\s+to\s+heading#?/gi, "")
+ .replace(/#Jump-to-heading/gi, "")
+ .replace(/-Jump-to-heading/gi, "")
Seems like we could consolidate these. I'm pretty sure this would work, if we're just accounting for any combination of spaces, dashes, and/or hashes before or after the string.
.replace(/[\s#-]*jump to heading[\s#-]*/gi, "")
ooh he's good! Have updated
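A quick check of the consolidated pattern against the three forms handled in the diff. One caveat: the literal spaces in the suggested regex only match the spaced form, so the hyphenated `Jump-to-heading` variants would slip through. The sketch below allows either whitespace or hyphens between the words; `stripJumpToHeading` is a hypothetical name, and the sample strings are made up.

```typescript
// The pattern exactly as suggested in the review comment.
const suggested = /[\s#-]*jump to heading[\s#-]*/gi;

// Hypothetical helper. Differs from the suggestion in one way:
// the words may be separated by whitespace or hyphens.
function stripJumpToHeading(text: string): string {
  return text.replace(/[\s#-]*jump[\s-]+to[\s-]+heading[\s#-]*/gi, "");
}

// Made-up inputs covering the spaced, hashed, and hyphenated forms.
for (const s of ["Install Jump to heading#", "Install#Jump-to-heading", "Install -Jump-to-heading"]) {
  console.log(JSON.stringify(s), "->", JSON.stringify(stripJumpToHeading(s)));
}
```

Because the pattern consumes whitespace on both sides of the marker, it suits a marker at the end of a heading; in the middle of a sentence it would join the neighboring words.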
Yeah, my thinking was to remove as much complication as possible, because I think it might have hidden the issue (though I'm not entirely sure it did, given the weird problem with the index not actually uploading, which required switching to a brand-new datasource).
point site at new datasource in orama, because old one does not seem to be uploading