Skip to content

Improve docs search#1959

Merged
dgageot merged 1 commit intodocker:mainfrom
dgageot:improve-docs-search
Mar 6, 2026
Merged

Improve docs search#1959
dgageot merged 1 commit intodocker:mainfrom
dgageot:improve-docs-search

Conversation

@dgageot
Copy link
Member

@dgageot dgageot commented Mar 6, 2026

Fix visual glitches in the docs search

Signed-off-by: David Gageot <david.gageot@docker.com>
@dgageot dgageot requested a review from a team as a code owner March 6, 2026 09:30
Copy link

@docker-agent docker-agent bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Summary

Assessment: 🔴 CRITICAL

This PR fixes visual glitches in the docs search, but introduces 2 critical XSS vulnerabilities and 1 logic error in the changed code.

Critical Issues (2)

  • XSS via unsanitized section names: Section names are directly concatenated into HTML without escaping
  • XSS via unvalidated URLs: Search result URLs are not validated, allowing javascript: URLs

Notable Issues (1)

  • Duplicate section headers: Section grouping logic can create duplicate headers when results span multiple sections

Recommendations

  1. HTML-escape all user-controlled data (r.section, r.title) before concatenating into HTML strings
  2. Validate or sanitize URLs to prevent javascript: and data: URL schemes
  3. Consider sorting results by section first, or track displayed sections to avoid duplicates

let lastSection = '';
for (const r of results) {
if (r.section && r.section !== lastSection) {
html += `<div class="search-result-group">${r.section}</div>`;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 CRITICAL: XSS vulnerability in search results rendering

The code constructs HTML by concatenating strings that include r.section without sanitization or escaping. If the searchIndex contains data from an external source or user input, an attacker could inject malicious HTML/JavaScript through the section field.

For example, if r.section = '<img src=x onerror=alert(1)>', this would execute arbitrary JavaScript when the search results are rendered.

Impact: Arbitrary JavaScript execution in the context of the documentation site.

Fix: HTML-escape the section name before inserting it:

html += `<div class="search-result-group">${escapeHtml(r.section)}</div>`;

Or use a safe DOM API like textContent instead of innerHTML.

html += `<div class="search-result-group">${r.section}</div>`;
lastSection = r.section;
}
html += `<a class="search-result" href="${r.url}" tabindex="0" role="option">
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 CRITICAL: XSS via unvalidated URL in search results

r.url is directly interpolated into the href attribute without validation or sanitization. If an attacker can control the sidebar links (through CMS, markdown, or configuration), they could inject a javascript: URL or data: URL with JavaScript.

When a user clicks such a search result, arbitrary code would execute.

Example attack: r.url = 'javascript:alert(document.cookie)'

Fix: Validate that URLs are safe before using them:

const safeUrl = r.url.match(/^(https?:\/\/|\/)/) ? r.url : '#';
html += `<a class="search-result" href="${safeUrl}" tabindex="0" role="option">`;

Or sanitize the URL to ensure it doesn't contain dangerous schemes.

// Group results by section to avoid repeating section names
let html = '';
let lastSection = '';
for (const r of results) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 MEDIUM: Section grouping can create duplicate section headers

The grouping logic checks if (r.section && r.section !== lastSection) to insert group headers. However, the results array is sorted by matchType (title vs content) rather than by section.

If search results include multiple items from the same section but with different match types, they will be non-contiguous in the sorted array, causing duplicate section headers.

Example: If "Section A" has both title matches and content matches:

  1. Section A (title match)
  2. Section B (title match)
  3. Section A (content match) ← "Section A" header appears again

Fix: Either sort by section first:

results.sort((a, b) => {
  if (a.section !== b.section) return a.section.localeCompare(b.section);
  return a.matchType === 'title' ? -1 : 1;
});

Or track all displayed sections to avoid duplicates.

@dgageot dgageot merged commit 2101a39 into docker:main Mar 6, 2026
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants