-
Notifications
You must be signed in to change notification settings - Fork 106
feat: Generic git host support (local & remote) #307
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
""" WalkthroughThis change introduces support for connecting to generic Git hosts and local Git repositories, both in backend logic and documentation. It adds new schema definitions, types, and validation for a new connection type Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant WebUI
participant Backend
participant RepoUtils
participant GitHost (generic/local/remote)
User->>WebUI: Configure connection (type: "git", url)
WebUI->>Backend: Submit connection config
Backend->>RepoUtils: compileGenericGitHostConfig(config)
alt url starts with file://
RepoUtils->>RepoUtils: Resolve glob pattern to repo paths
loop For each path
RepoUtils->>GitHost: isPathAValidGitRepoRoot(path)
RepoUtils->>GitHost: getOriginUrl(path)
end
RepoUtils-->>Backend: Return repoData for valid local repos
else url is http(s)://
RepoUtils->>GitHost: isUrlAValidGitRepo(url)
RepoUtils-->>Backend: Return repoData for remote repo
end
Backend-->>WebUI: Connection and repo info available
WebUI-->>User: Display repositories (with metadata)
Poem
Note ⚡️ AI Code Reviews for VS Code, Cursor, WindsurfCodeRabbit now has a plugin for VS Code, Cursor and Windsurf. This brings AI code reviews directly in the code editor. Each commit is reviewed immediately, finding bugs before the PR is raised. Seamless context handoff to your AI code agent ensures that you can easily incorporate review feedback. Note ⚡️ Faster reviews with cachingCodeRabbit now supports caching for code and dependencies, helping speed up reviews. This means quicker feedback, reduced wait times, and a smoother review experience overall. Cached data is encrypted and stored securely. This feature will be automatically enabled for all accounts on May 16th. To opt out, configure ✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 12
🧹 Nitpick comments (18)
docs/docs/search/multi-branch-indexing.mdx (1)
93-93
: Consider consistent capitalization: The table entry uses "Generic git host" with a lowercase "git". For consistency with other platform names (e.g., "GitHub", "GitLab"), consider capitalizing "Git": "Generic Git host".packages/backend/src/repoManager.ts (1)
241-241
: Improved variable declarations using block-scoped constantsChanged from mutable outer-scoped variables to block-scoped constants for timing measurements, which is a good practice that prevents accidental modifications and clarifies variable scope.
Also applies to: 268-269
packages/schemas/src/v3/genericGitHost.type.ts (1)
9-10
: Small typo in URL descriptionThere appears to be a small typo in the URL description - "read-only modified" should likely be "read-only mode".
- * The URL to the git repository. This can either be a remote URL (prefixed with `http://` or `https://`) or a absolute path to a directory on the local machine (prefixed with `file://`). If a local directory is specified, it must point to the root of a git repository. Local directories are treated as read-only modified. Local directories support glob patterns. + * The URL to the git repository. This can either be a remote URL (prefixed with `http://` or `https://`) or a absolute path to a directory on the local machine (prefixed with `file://`). If a local directory is specified, it must point to the root of a git repository. Local directories are treated as read-only mode. Local directories support glob patterns.docs/snippets/schemas/v3/index.schema.mdx (1)
774-774
: Same typo in documentation as in type definitionThe same typo appears here - "read-only modified" should likely be "read-only mode".
- "description": "The URL to the git repository. This can either be a remote URL (prefixed with `http://` or `https://`) or a absolute path to a directory on the local machine (prefixed with `file://`). If a local directory is specified, it must point to the root of a git repository. Local directories are treated as read-only modified. Local directories support glob patterns.", + "description": "The URL to the git repository. This can either be a remote URL (prefixed with `http://` or `https://`) or a absolute path to a directory on the local machine (prefixed with `file://`). If a local directory is specified, it must point to the root of a git repository. Local directories are treated as read-only mode. Local directories support glob patterns.",docs/docs/connections/local-repos.mdx (2)
11-11
: Fix typos in description.There are two issues in this line:
- "meaing" should be "meaning"
- Repetition of "already" (wordiness)
-Sourcebot can sync code from generic git repositories stored in a local directory. This can be helpful in scenarios where you already have a large number of repos already checked out. Local repositories are treated as **read-only**, meaing Sourcebot will **not** `git fetch` new revisions. +Sourcebot can sync code from generic git repositories stored in a local directory. This can be helpful in scenarios where you have a large number of repos checked out. Local repositories are treated as **read-only**, meaning Sourcebot will **not** `git fetch` new revisions.🧰 Tools
🪛 LanguageTool
[style] ~11-~11: To reduce wordiness, try specifying a number or using “many” or “numerous” instead.
Context: ...ful in scenarios where you already have a large number of repos already checked out. Local reposi...(LARGE_NUMBER_OF)
33-33
: Fix typo in volume mounting explanation.There's a spelling error in "seperate".
- We need to mount a docker volume to the `repos` directory so Sourcebot can read it's contents. Sourcebot will **not** write to local repositories, so we can mount a seperate **read-only** volume: + We need to mount a docker volume to the `repos` directory so Sourcebot can read it's contents. Sourcebot will **not** write to local repositories, so we can mount a separate **read-only** volume:packages/schemas/src/v3/index.schema.ts (1)
774-774
: Typo in URL description.There's a small typo in the URL description: "read-only modified" should be "read-only mode".
- "description": "The URL to the git repository. This can either be a remote URL (prefixed with `http://` or `https://`) or a absolute path to a directory on the local machine (prefixed with `file://`). If a local directory is specified, it must point to the root of a git repository. Local directories are treated as read-only modified. Local directories support glob patterns.", + "description": "The URL to the git repository. This can either be a remote URL (prefixed with `http://` or `https://`) or a absolute path to a directory on the local machine (prefixed with `file://`). If a local directory is specified, it must point to the root of a git repository. Local directories are treated as read-only mode. Local directories support glob patterns.",packages/schemas/src/v3/genericGitHost.schema.ts (1)
14-14
: Same typo in URL description as in index.schema.ts.The URL description contains the same typo: "read-only modified" should be "read-only mode".
- "description": "The URL to the git repository. This can either be a remote URL (prefixed with `http://` or `https://`) or a absolute path to a directory on the local machine (prefixed with `file://`). If a local directory is specified, it must point to the root of a git repository. Local directories are treated as read-only modified. Local directories support glob patterns.", + "description": "The URL to the git repository. This can either be a remote URL (prefixed with `http://` or `https://`) or a absolute path to a directory on the local machine (prefixed with `file://`). If a local directory is specified, it must point to the root of a git repository. Local directories are treated as read-only mode. Local directories support glob patterns.",packages/schemas/src/v3/connection.schema.ts (1)
643-643
: Same typo in URL description again.The URL description contains the same typo found in the other files: "read-only modified" should be "read-only mode".
- "description": "The URL to the git repository. This can either be a remote URL (prefixed with `http://` or `https://`) or a absolute path to a directory on the local machine (prefixed with `file://`). If a local directory is specified, it must point to the root of a git repository. Local directories are treated as read-only modified. Local directories support glob patterns.", + "description": "The URL to the git repository. This can either be a remote URL (prefixed with `http://` or `https://`) or a absolute path to a directory on the local machine (prefixed with `file://`). If a local directory is specified, it must point to the root of a git repository. Local directories are treated as read-only mode. Local directories support glob patterns.",docs/snippets/schemas/v3/genericGitHost.schema.mdx (2)
15-15
: Fix grammatical error in description.There's a grammatical error in the description for the
url
property. The phrase "are treated as read-only modified" is unclear and likely contains an error.- "description": "The URL to the git repository. This can either be a remote URL (prefixed with `http://` or `https://`) or a absolute path to a directory on the local machine (prefixed with `file://`). If a local directory is specified, it must point to the root of a git repository. Local directories are treated as read-only modified. Local directories support glob patterns.", + "description": "The URL to the git repository. This can either be a remote URL (prefixed with `http://` or `https://`) or an absolute path to a directory on the local machine (prefixed with `file://`). If a local directory is specified, it must point to the root of a git repository. Local directories are treated as read-only. Local directories support glob patterns.",
25-25
: Clarify maximum revisions limit.The description states "A maximum of 64 revisions can be indexed", but later for both branches and tags it says "A maximum of 64 branches/tags can be indexed". This could lead to confusion about whether the limit is 64 total revisions or 64 of each type.
- "description": "The revisions (branches, tags) that should be included when indexing. The default branch (HEAD) is always indexed. A maximum of 64 revisions can be indexed, with any additional revisions being ignored.", + "description": "The revisions (branches, tags) that should be included when indexing. The default branch (HEAD) is always indexed. A maximum of 64 branches and 64 tags can be indexed, with any additional revisions being ignored.",packages/web/src/app/[domain]/search/page.tsx (1)
208-211
: Object-key type mismatch may bite later
Record<number, RepositoryInfo>
is used, but JS object keys are always strings.
This is harmless at runtime but can confuse TypeScript down the line (e.g.Object.keys(repoInfo)
returnsstring[]
, notnumber[]
).
Consider switching toRecord<string, RepositoryInfo>
orMap<number, RepositoryInfo>
for stronger type safety.packages/web/src/features/search/searchApi.ts (2)
96-124
:getFileWebUrl
is fragile and mutates loop variables
- The regexp only matches templates that start with exactly
{{URLJoinPath …}}
; templates generated by other helpers (e.g.{{printf}}
) will silently returnundefined
, losing links in the UI.- Inside
.map
, the callback mutates the parameterpart
, which is an anti-pattern and can break when TypeScript’sno-param-reassign
rule is enabled.If sticking with string manipulation, clone the value first:
- .map((part) => { - // remove wrapping quotes - if (part.startsWith("\"")) part = part.substring(1); - if (part.endsWith("\"")) part = part.substring(0, part.length - 1); - return part; + .map((raw) => { + const part = raw.replace(/^"|"$/g, ""); + return part; })Long-term, consider parsing the template once with a proper Go‐template parser on the server side and shipping the finished URL instead of the template string.
559-567
: Database hit optimisation
isUrlAValidGitRepo
already validated the URL; you can embed the resulting metadata (remoteUrl.origin
,repoName
) without a secondnew URL()
call in callers, avoiding repeated parsing.Minor, but easy to address while you are in this area.
packages/web/src/lib/utils.ts (1)
167-170
: SVG for generic Git host lacks dark-mode treatmentOther icons (GitHub) specify
dark:invert
to stay visible in dark themes. Consider adding a matching class forgitLogo
to maintain visual consistency.packages/backend/src/repoCompileUtils.ts (1)
509-513
:cloneUrl
for local repos should be URL-encodedSpaces or non-ASCII characters in
repoPath
yield invalid URLs.- cloneUrl: `file://${repoPath}`, + cloneUrl: `file://${encodeURI(repoPath)}`,This also mirrors the encoding used elsewhere (
encodeURI
ingetFileWebUrl
).packages/backend/src/git.ts (2)
79-96
: Error message doesn't match function name.The function implementation looks good, but the error messages use "isPathAGitRepoRoot" while the function is named "isPathAValidGitRepoRoot".
- throw new Error(`isPathAGitRepoRoot failed: ${error.message}`); + throw new Error(`isPathAValidGitRepoRoot failed: ${error.message}`);- throw new Error(`isPathAGitRepoRoot failed: ${error}`); + throw new Error(`isPathAValidGitRepoRoot failed: ${error}`);
98-108
: Silent failure handling differs from other functions.This function returns
false
on error rather than throwing exceptions like other functions in the file. While this may be a deliberate choice for validation purposes, it creates an inconsistency in error handling approaches.Consider documenting this behavior difference in the function's JSDoc comment to make the design choice explicit.
+/** + * Checks if the provided URL points to a valid Git repository. + * + * @param url - The URL to check + * @returns A boolean indicating if the URL is a valid Git repository + * @note Unlike other functions in this file, this function returns false instead + * of throwing an error when the URL is invalid or unreachable. + */ export const isUrlAValidGitRepo = async (url: string) => {
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
packages/web/public/git.svg
is excluded by!**/*.svg
📒 Files selected for processing (43)
docs/docs.json
(2 hunks)docs/docs/connections/generic-git-host.mdx
(1 hunks)docs/docs/connections/local-repos.mdx
(1 hunks)docs/docs/connections/overview.mdx
(1 hunks)docs/docs/search/multi-branch-indexing.mdx
(1 hunks)docs/docs/search/search-contexts.mdx
(1 hunks)docs/docs/search/syntax-reference.mdx
(1 hunks)docs/self-hosting/overview.mdx
(1 hunks)docs/snippets/schemas/v3/connection.schema.mdx
(1 hunks)docs/snippets/schemas/v3/genericGitHost.schema.mdx
(1 hunks)docs/snippets/schemas/v3/index.schema.mdx
(2 hunks)packages/backend/src/connectionManager.ts
(2 hunks)packages/backend/src/git.ts
(4 hunks)packages/backend/src/repoCompileUtils.ts
(2 hunks)packages/backend/src/repoManager.ts
(5 hunks)packages/backend/src/utils.ts
(1 hunks)packages/backend/src/zoekt.ts
(2 hunks)packages/mcp/src/schemas.ts
(2 hunks)packages/schemas/src/v3/connection.schema.ts
(1 hunks)packages/schemas/src/v3/connection.type.ts
(2 hunks)packages/schemas/src/v3/genericGitHost.schema.ts
(1 hunks)packages/schemas/src/v3/genericGitHost.type.ts
(1 hunks)packages/schemas/src/v3/index.schema.ts
(2 hunks)packages/schemas/src/v3/index.type.ts
(2 hunks)packages/web/src/actions.ts
(4 hunks)packages/web/src/app/[domain]/browse/[...path]/page.tsx
(4 hunks)packages/web/src/app/[domain]/components/fileHeader.tsx
(2 hunks)packages/web/src/app/[domain]/components/repositoryCarousel.tsx
(2 hunks)packages/web/src/app/[domain]/search/components/codePreviewPanel/index.tsx
(1 hunks)packages/web/src/app/[domain]/search/components/filterPanel/index.tsx
(6 hunks)packages/web/src/app/[domain]/search/components/searchResultsPanel/fileMatchContainer.tsx
(5 hunks)packages/web/src/app/[domain]/search/components/searchResultsPanel/index.tsx
(4 hunks)packages/web/src/app/[domain]/search/page.tsx
(9 hunks)packages/web/src/features/search/listReposApi.ts
(1 hunks)packages/web/src/features/search/schemas.ts
(3 hunks)packages/web/src/features/search/searchApi.ts
(3 hunks)packages/web/src/features/search/types.ts
(2 hunks)packages/web/src/features/search/zoektSchema.ts
(1 hunks)packages/web/src/lib/utils.ts
(9 hunks)schemas/v3/connection.json
(1 hunks)schemas/v3/genericGitHost.json
(1 hunks)schemas/v3/index.json
(1 hunks)vendor/zoekt
(1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (17)
packages/web/src/features/search/types.ts (1)
packages/web/src/features/search/schemas.ts (2)
fileSourceResponseSchema
(111-114)repositoryInfoSchema
(34-40)
packages/backend/src/zoekt.ts (1)
packages/backend/src/utils.ts (1)
getRepoPath
(97-112)
packages/backend/src/connectionManager.ts (1)
packages/backend/src/repoCompileUtils.ts (1)
compileGenericGitHostConfig
(441-457)
packages/web/src/app/[domain]/components/repositoryCarousel.tsx (1)
packages/web/src/lib/utils.ts (1)
getCodeHostInfoForRepo
(53-141)
packages/schemas/src/v3/index.type.ts (2)
packages/schemas/src/v3/connection.type.ts (2)
GenericGitHostConnectionConfig
(309-319)GitRevisions
(90-99)packages/schemas/src/v3/genericGitHost.type.ts (2)
GenericGitHostConnectionConfig
(3-13)GitRevisions
(17-26)
packages/web/src/app/[domain]/search/components/searchResultsPanel/index.tsx (1)
packages/web/src/features/search/types.ts (1)
RepositoryInfo
(29-29)
packages/web/src/app/[domain]/components/fileHeader.tsx (1)
packages/web/src/lib/utils.ts (1)
getCodeHostInfoForRepo
(53-141)
packages/web/src/app/[domain]/search/components/searchResultsPanel/fileMatchContainer.tsx (1)
packages/web/src/features/search/types.ts (1)
RepositoryInfo
(29-29)
packages/backend/src/utils.ts (1)
packages/backend/src/types.ts (1)
AppContext
(4-16)
packages/backend/src/repoManager.ts (4)
packages/backend/src/utils.ts (2)
getRepoPath
(97-112)measure
(10-18)packages/backend/src/types.ts (1)
repoMetadataSchema
(26-43)packages/web/src/lib/utils.ts (1)
measure
(256-272)packages/backend/src/git.ts (1)
fetchRepository
(31-53)
packages/web/src/app/[domain]/browse/[...path]/page.tsx (4)
packages/web/src/actions.ts (1)
getRepoInfoByName
(447-506)packages/web/src/lib/utils.ts (1)
isServiceError
(190-196)packages/web/src/lib/serviceError.ts (1)
ServiceErrorException
(16-20)packages/web/src/app/[domain]/components/fileHeader.tsx (1)
FileHeader
(24-95)
packages/web/src/app/[domain]/search/page.tsx (1)
packages/web/src/features/search/types.ts (1)
RepositoryInfo
(29-29)
packages/web/src/features/search/schemas.ts (1)
packages/mcp/src/schemas.ts (1)
repositorySchema
(85-90)
packages/web/src/actions.ts (1)
packages/web/src/lib/serviceError.ts (1)
notFound
(91-97)
packages/web/src/features/search/searchApi.ts (3)
packages/web/src/features/search/zoektSchema.ts (1)
zoektSearchResponseSchema
(51-76)packages/web/src/app/error.tsx (1)
Error
(14-54)packages/web/src/features/search/types.ts (2)
SearchResultRange
(17-17)SearchResponse
(16-16)
packages/schemas/src/v3/genericGitHost.type.ts (2)
packages/schemas/src/v3/connection.type.ts (2)
GenericGitHostConnectionConfig
(309-319)GitRevisions
(90-99)packages/schemas/src/v3/index.type.ts (2)
GenericGitHostConnectionConfig
(404-414)GitRevisions
(185-194)
packages/backend/src/repoCompileUtils.ts (3)
packages/schemas/src/v3/connection.type.ts (1)
GenericGitHostConnectionConfig
(309-319)packages/backend/src/git.ts (3)
isPathAValidGitRepoRoot
(82-96)getOriginUrl
(110-123)isUrlAValidGitRepo
(98-108)packages/backend/src/types.ts (1)
RepoMetadata
(45-45)
🪛 LanguageTool
docs/docs/connections/local-repos.mdx
[style] ~11-~11: To reduce wordiness, try specifying a number or using “many” or “numerous” instead.
Context: ...ful in scenarios where you already have a large number of repos already checked out. Local reposi...
(LARGE_NUMBER_OF)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: build
🔇 Additional comments (75)
vendor/zoekt (1)
1-1
:✅ Verification successful
Confirm Zoekt submodule pointer update
The subproject commit for Zoekt has been bumped to12a2f4ad075359a09bd8a91793acb002211217aa
. Ensure this new version supports the updated indexing command (which now takes the repository ID) inpackages/backend/src/zoekt.ts
.Please run:
to confirm the commit is present and review Zoekt’s changelog for any breaking changes.
🏁 Script executed:
#!/bin/bash # Verify the submodule commit exists and update it locally git submodule update --init vendor/zoekt cd vendor/zoekt git show 12a2f4ad075359a09bd8a91793acb002211217aaLength of output: 2251
Zoekt submodule update verified
The submodulevendor/zoekt
is now at commit12a2f4ad075359a09bd8a91793acb002211217aa
, and this commit indeed introduces the--repo_id
flag inzoekt-git-index
(mapping toRepositoryID
) and updates the Go indexer to populatedesc.ID
only whenrepoid
is set. No further action is required.docs/docs/search/search-contexts.mdx (1)
108-108
:✅ Verification successful
Verify updated documentation link path: The link was updated to
/docs/search/syntax-reference
. Ensure this path correctly maps to the relocated syntax reference page in the documentation hierarchy.
🏁 Script executed:
#!/bin/bash # Verify that `syntax-reference.mdx` exists under docs/docs/search fd 'syntax-reference.mdx' docs/docs/searchLength of output: 80
Documentation link verified
The filedocs/docs/search/syntax-reference.mdx
exists and matches the updated link path/docs/search/syntax-reference
. No further changes are needed.packages/web/src/app/[domain]/search/components/codePreviewPanel/index.tsx (1)
49-49
:❓ Verification inconclusive
Ensure
webUrl
existence and type alignment: The code now assignslink: fileMatch.webUrl
. Confirm thatSearchResultFile
guaranteeswebUrl
is always defined and thatCodePreviewFile.link
expects a string. IfwebUrl
can be undefined, consider adding a fallback or refining the type definitions.
🏁 Script executed:
#!/bin/bash # Check the definition of SearchResultFile to ensure `webUrl` is required. rg -R "interface SearchResultFile" -A 5 -g "packages/web/src/features/search/types.ts"Length of output: 491
🏁 Script executed:
#!/bin/bash # Verify that `webUrl` is required on SearchResultFile rg "interface SearchResultFile" -A 5 packages/web/src/features/search/types.tsLength of output: 80
#!/bin/bash # Locate the definition of SearchResultFile echo "🔍 Searching for SearchResultFile references..." rg -n "SearchResultFile" packages/web/src # Search for its type or interface declaration echo "🔍 Searching for type/interface declaration..." rg -A5 -n "type SearchResultFile" -g "packages/web/src/**/*.{ts,tsx}" rg -A5 -n "interface SearchResultFile" -g "packages/web/src/**/*.{ts,tsx}" # Check all usages of webUrl echo "🔍 Verifying webUrl occurrences..." rg -n "webUrl" packages/web/src
Ensure
webUrl
is guaranteed and matches the expectedlink
type
Please verify the following to avoid potentialundefined
runtime errors:
- In your
SearchResultFile
definition (e.g.packages/web/src/features/search/types.ts
), confirm thatwebUrl
is a requiredstring
(notstring | undefined
).- In the
CodePreviewFile
type (wherelink
is defined), ensurelink
is declared asstring
.If
webUrl
can be optional, add a safe fallback—e.g.link: fileMatch.webUrl ?? ''or update your type definitions so that
webUrl
is always defined before assigning it tolink
.docs/self-hosting/overview.mdx (1)
85-86
: Great addition of new code host options!The additions of "Other Git hosts" and "Local Git repos" cards appropriately extend the supported code host options for users, aligning with the PR's objective to introduce support for generic Git repositories.
packages/web/src/features/search/types.ts (3)
11-11
: LGTM: Added import for repository info schemaThe import of
repositoryInfoSchema
is correctly added to support the newRepositoryInfo
type.
27-28
: LGTM: Added type annotation for FileSourceResponseAdding a type annotation for
FileSourceResponse
derived from the schema improves type safety.
29-29
: LGTM: Added RepositoryInfo type for improved repository metadata representationThe new
RepositoryInfo
type derived fromrepositoryInfoSchema
establishes a consistent way to represent repository metadata across different connection types, including the newly added generic Git hosts.packages/backend/src/zoekt.ts (2)
18-18
: LGTM: Updated repository path extraction to support read-only repositoriesThe change to destructure the
path
property fromgetRepoPath
correctly adapts to the updated function signature that now returns both path and read-only status, which is necessary for supporting local Git repositories.
68-68
: LGTM: Added repository ID to indexing commandAdding the
-repo_id
parameter to the zoekt-git-index command ensures proper repository identification during indexing, which is important for consistent repository handling across different connection types.docs/docs/connections/overview.mdx (1)
33-34
: LGTM: Updated documentation with new connection optionsThe additions of "Other Git hosts" and "Local Git repos" cards in the connections overview documentation properly document the new connection capabilities introduced in this PR.
packages/web/src/features/search/listReposApi.ts (1)
36-36
: Good URL handling improvementThis change ensures empty URL strings are properly translated to
undefined
values rather than passing them through as empty strings. This is a good defensive programming practice that will help prevent issues in UI components that might not handle empty strings gracefully.packages/web/src/features/search/zoektSchema.ts (1)
57-57
: LGTM: Repository ID additionAdding the optional
RepositoryID
field is a good enhancement that aligns with the rest of the codebase now using repository IDs as unique identifiers instead of repository names. This provides more reliable identification, especially when dealing with generic Git hosts.packages/web/src/app/[domain]/components/repositoryCarousel.tsx (1)
9-9
: Function update properly supports generic Git hostsGood refactoring to use the updated utility function which now supports generic Git hosts. The explicit parameter passing is cleaner than passing the whole object and makes the contract between components clearer.
Also applies to: 60-65
packages/backend/src/connectionManager.ts (2)
176-178
: LGTM: Generic Git host support added successfullyThe implementation correctly adds a new case to handle the
'git'
connection type and calls the appropriate compilation function. This change enables support for both remote and local Git repositories as mentioned in the PR objectives.
7-7
: Import updated to include the new compilation functionThe import statement has been properly updated to include the
compileGenericGitHostConfig
function from the repoCompileUtils module.packages/mcp/src/schemas.ts (2)
68-68
: LGTM: URL made optional to support generic Git repositoriesMaking the URL field optional in the search response schema is a necessary change to support generic Git repositories, particularly local ones that might not have a conventional URL.
87-87
: LGTM: Repository URL made optionalMaking the repository URL optional in the repository schema aligns with the implementation of generic Git host support, especially for local repositories.
docs/docs/connections/generic-git-host.mdx (1)
1-29
: Documentation looks great overallThe documentation is well-structured with a clear introduction, getting started section, and schema reference. Once the URL support inconsistency is addressed, this will provide excellent guidance for users setting up generic Git host connections.
docs/docs.json (2)
41-42
: LGTM: Navigation structure updated for new connection typesThe navigation has been properly updated to include the new generic Git host and local repos documentation pages under the "Supported platforms" section.
48-55
: LGTM: New search group added to navigationA dedicated "Search" group has been added to the navigation with appropriate pages, improving the organization of the documentation.
packages/web/src/app/[domain]/search/components/searchResultsPanel/index.tsx (3)
3-3
: Updated import to use RepositoryInfo typeThis change correctly updates the import statement to use the
RepositoryInfo
type from the search types module, which is part of the broader refactoring to support generic Git repositories.
15-15
: Repository info prop now uses numeric IDs as keysThe type signature has been updated from using string keys (repository names) to number keys (repository IDs). This change is important for supporting generic git repositories where multiple repositories might have the same name but different origins.
29-29
: Prop naming consistency with repository infoThe prop has been renamed from
repoMetadata
torepoInfo
for consistency, and is correctly passed down to child components. This change aligns with the broader refactoring of repository metadata handling across the application.Also applies to: 154-154
packages/web/src/app/[domain]/components/fileHeader.tsx (3)
1-2
: Updated import for repository code host informationThe updated import uses the new
getCodeHostInfoForRepo
utility, which has explicit support for generic Git repositories.
14-19
: Explicit repository fields instead of Repository typeThe component now uses explicit fields for repository information rather than depending on the full
Repository
type. This provides better type safety and makes the component more maintainable as it clearly defines exactly which repository properties are needed.
31-36
: Updated code host info retrievalThe function call has been updated to use the new
getCodeHostInfoForRepo
utility with explicit repository properties. This change supports the new generic Git host functionality while maintaining backward compatibility with existing code host types.packages/schemas/src/v3/connection.type.ts (2)
8-9
: Added GenericGitHostConnectionConfig to ConnectionConfig unionThis change correctly extends the
ConnectionConfig
union type to include the new generic Git host connection type, which is essential for supporting the new feature.
309-319
: Added GenericGitHostConnectionConfig interfaceThis new interface defines the structure for generic Git host connections, with:
- A fixed type property set to "git"
- A required URL that supports both remote repositories (http/https) and local repositories (file)
- Support for glob patterns in local directory paths, enabling multi-repository syncing
- Optional revisions property for specifying branches and tags
The documentation clearly explains that local directories are treated as read-only, which is an important safety feature.
packages/backend/src/repoManager.ts (4)
224-224
: Destructure path and isReadOnly flag from getRepoPathThis change extracts both the repository path and its read-only status in one call, preparing for the conditional operations that follow.
230-233
: Added read-only checks to prevent modifying local repositoriesThis change adds important safeguards to prevent modifying local repositories that are marked as read-only. It ensures that operations like deleting, fetching, cloning, and updating git config are only performed on repositories where modifications are permitted.
These checks are essential for supporting local Git repositories as described in the PR objectives, where repositories specified with
file://
URLs are treated as read-only.Also applies to: 235-245, 246-279
323-324
: Simplified syncGitRepository function callRemoved passing and returning of timing statistics since they're now captured directly within the function using constants.
475-479
: Added read-only check to garbage collectionThis change ensures that local repositories marked as read-only are not deleted during garbage collection, which is consistent with the read-only treatment throughout the codebase.
packages/backend/src/utils.ts (1)
97-112
: Properly handles local repositories with isReadOnly flagThe updated
getRepoPath
function now correctly handles both remote and local repositories by returning an object containing path and read-only status. For local git repositories (identified by thefile:
protocol andgeneric-git-host
type), it returns the actual local filesystem path and marks it as read-only to prevent write operations.This change is essential for supporting the PR objective of adding local Git repositories while ensuring they're treated appropriately as read-only sources.
packages/schemas/src/v3/index.type.ts (2)
7-13
: Added GenericGitHostConnectionConfig to ConnectionConfig unionThis addition extends the supported connection types to include generic Git hosts, which aligns with the PR objectives.
404-414
: Well-defined interface for GenericGitHostConnectionConfigThe interface definition is clear and properly typed, with the required
type: "git"
property andurl
string that supports both remote URLs and local file paths.The optional
revisions
property allows for specifying branches and tags to index, maintaining consistency with other connection types that support multi-branch indexing.docs/snippets/schemas/v3/connection.schema.mdx (1)
631-659
: Comprehensive schema for Generic Git Host connectionsThe schema properly defines:
- Required properties (
type
andurl
) with appropriate constraints- A well-formed regex pattern to validate both remote HTTP/HTTPS URLs and local file URLs
- Clear descriptive text explaining the local directory requirements and read-only nature
- Helpful examples showing both remote and local repository URL formats including glob patterns
- Reference to the existing revisions schema for branch/tag specification
This schema documentation is thorough and provides developers with clear guidance on how to configure generic Git host connections.
packages/web/src/app/[domain]/browse/[...path]/page.tsx (6)
13-14
: Simplified import with direct repository info fetchingReplacing the previous approach (listing all repositories and filtering) with a direct call to get repository info by name is more efficient.
57-60
: Improved repository info fetching with proper error handlingThe code now uses the new
getRepoInfoByName
function and only throws a ServiceErrorException if the error is not a "not found" error, allowing the UI to handle the not-found case gracefully.
79-95
: Conditional UI rendering based on repository infoThe code properly handles the repository info object, only rendering the file header if valid repository information is available (not a service error).
84-89
: Properly structured repository info objectThe repository info object now contains all necessary fields (name, displayName, webUrl, codeHostType) for the FileHeader component, maintaining consistent repository metadata handling across the application.
97-111
: Enhanced error handling with appropriate UI fallbacksThe code now properly handles repository not found errors by displaying a user-friendly message, while still using the repository info when available to render the code preview.
107-107
: Uses consistent repository namingUsing
repoInfo.name
ensures consistency with the repository metadata structure throughout the application.packages/schemas/src/v3/genericGitHost.type.ts (3)
1-2
: Note about auto-generated fileThe comment indicates this file is auto-generated. Ensure you have the proper tooling/scripts in place to regenerate this file when making changes to the schema.
3-13
: Generic Git host connection configuration looks goodThe interface correctly defines the structure for a generic Git connection with required type and URL fields, plus optional revisions. This aligns with the PR objective of supporting both remote and local Git repositories.
14-26
: GitRevisions interface is well documentedThe interface clearly defines the structure for specifying branches and tags to index, including support for glob patterns and the 64 item limit per type. The comments are thorough and help users understand the behavior.
packages/web/src/actions.ts (4)
31-31
: Import of genericGitHostSchema is appropriateAdding the schema import for the new connection type is necessary and matches the pattern used for other connection types.
447-506
: Well-implemented repository lookup function with thorough documentationThe
getRepoInfoByName
function implements a repo lookup scoped to an organization domain with proper authentication checks. The detailed comments excellently explain the edge case regarding duplicate repo names with generic git connections and suggest a future improvement with a uniqueness constraint.
1245-1246
: Connection type handler for 'git' connectionsThe connection type handler for 'git' connections is properly added to the switch statement, ensuring validation against the genericGitHostSchema.
1297-1302
: Generic Git host connection metadata setupThe implementation correctly sets up the connection metadata for generic Git repositories:
numRepos: 1
- Each connection represents exactly one repositoryhasToken: false
- No authentication token is required for these connectionsThis matches the expected behavior for generic Git repositories described in the PR.
schemas/v3/genericGitHost.json (2)
1-30
: Well-structured JSON schema for Generic Git HostThe schema properly defines:
- Required
type
field with a fixed value of "git"- Required
url
field with proper validation pattern for both remote and local repositories- Optional
revisions
field referencing the shared GitRevisions definition- Helpful examples for different repository types (GitHub, local repo, glob pattern)
Good work on ensuring consistency with the TypeScript interfaces and including the proper pattern validation for URLs.
11-14
: URL pattern validation is accurate and secureThe pattern validation correctly enforces proper formatting for both HTTP/HTTPS URLs and file URLs, which is important for security. The description and examples clearly illustrate the expected formats.
docs/snippets/schemas/v3/index.schema.mdx (2)
122-122
: Updated documentation URL for search contextsThe documentation URL for search contexts has been appropriately updated.
762-790
: Comprehensive schema documentation for Generic Git Host connectionsThe schema documentation for GenericGitHostConnectionConfig is thorough and consistent with the TypeScript interface and JSON schema. It correctly references the shared GitRevisions definition and includes clear examples for both remote and local repositories with glob patterns.
docs/docs/connections/local-repos.mdx (3)
14-17
: The warning provides valuable prerequisite information.The warning clearly communicates important prerequisites that users need to know before using local Git repositories.
43-54
: Good documentation of configuration patterns.The step explaining how to create a connection provides clear examples with both the file URL format and glob pattern support, which is essential for users to understand how to configure local repositories correctly.
60-78
: Examples section is well structured.The accordion-based examples showing both individual repo and glob pattern approaches are helpful and clear. The JSON examples illustrate proper configuration formatting.
packages/schemas/src/v3/index.schema.ts (2)
121-121
: Updated documentation URL for search contexts.The documentation URL for search contexts has been updated to point to the new path, which is a good housekeeping change.
761-790
: Well-structured Generic Git Host schema definition.The new
GenericGitHostConnectionConfig
schema is properly defined with all necessary elements:
- Required type and URL properties
- Clear description of URL formats (HTTP/HTTPS and file://)
- Appropriate regex pattern validation for URLs
- Revisions property referencing existing schema
- Good examples for both remote and local paths
This schema aligns well with the PR objective to support both remote generic Git hosts and local Git repositories.
packages/schemas/src/v3/genericGitHost.schema.ts (1)
1-70
: Well-implemented schema definition for generic Git hosts.This new schema file correctly defines the structure for generic Git host connections, with:
- Proper constant type definition
- Comprehensive URL property with validation pattern
- Examples covering both remote URLs and local paths with glob patterns
- Consistent revisions property structure matching other connection types
- Clear property requirements and additional properties restriction
The schema is well-structured and aligns perfectly with the PR objectives to support both remote generic Git hosts and local Git repositories.
packages/schemas/src/v3/connection.schema.ts (1)
630-659
: Consistent schema definition added to connection types.The
GenericGitHostConnectionConfig
schema has been correctly added to the connection schema's oneOf array, making the generic Git host connection type available in the connection configuration. The schema definition is consistent with the other files and includes:
- Appropriate title and properties
- Clear type constant and description
- URL property with validation pattern
- Reference to existing revisions schema
- Required properties and additional properties restriction
This seamlessly extends the connection configuration system to support generic Git hosts.
packages/web/src/features/search/schemas.ts (2)
34-40
: LGTM: Well-structured repository info schema.The new
repositoryInfoSchema
is well-structured with all the necessary fields for repository identification and display. The use of optional fields fordisplayName
andwebUrl
appropriately handles cases where these might not be available.
94-97
: LGTM: Making URL optional is consistent.Making the
url
field optional in therepositorySchema
is consistent with the other changes in this PR and correctly handles repositories that might not have an associated web URL.packages/web/src/app/[domain]/search/components/searchResultsPanel/fileMatchContainer.tsx (3)
8-8
: LGTM: Correctly importing new repository types.The import statement correctly updates to use the new
RepositoryInfo
type from the search feature types.
108-113
: LGTM: Explicit repository props ensure type safety.Providing explicit repository properties to the
FileHeader
component improves type safety and makes the code more maintainable by clearly showing what fields are being used.
19-19
:✅ Verification successful
Ensure props type compatibility.
The change from
Record<string, Repository>
toRecord<number, RepositoryInfo>
is a significant type change. Ensure all parent components are updated to provide the correct prop type.
🏁 Script executed:
#!/bin/bash # Check parent components that render FileMatchContainer to ensure they pass the correct repo info prop echo "Checking components that render FileMatchContainer:" rg --type-add 'tsx:*.tsx' --type tsx '<FileMatchContainer' -A 10 -B 2Length of output: 1899
🏁 Script executed:
#!/bin/bash # Locate fileMatchContainer.tsx and display its prop definitions file=$(fd fileMatchContainer.tsx -t f) echo "Found file: $file" echo "---- Props definition snippet ----" sed -n '1,200p' "$file" | sed -n 's/^/ /'Length of output: 553
🏁 Script executed:
#!/bin/bash # Check FileMatchContainer props definition and usage of repoInfo echo "Props interface or signature in fileMatchContainer.tsx:" rg --color never "FileMatchContainer" -n -A 5 -B 5 packages/web/src/app/\[domain\]/search/components/searchResultsPanel/fileMatchContainer.tsx echo "" echo "Usage of repoInfo in fileMatchContainer.tsx:" rg --color never "repoInfo" -n packages/web/src/app/\[domain\]/search/components/searchResultsPanel/fileMatchContainer.tsx echo "" echo "Check for repoInfo in parent index.tsx:" rg --color never "repoInfo" -n packages/web/src/app/\[domain\]/search/components/searchResultsPanel/index.tsxLength of output: 1890
Props type compatibility confirmed
Both
FileMatchContainer
and its parent in
packages/web/src/app/[domain]/search/components/searchResultsPanel/index.tsx
now userepoInfo: Record<number, RepositoryInfo>
. No further updates are needed.packages/web/src/app/[domain]/search/components/filterPanel/index.tsx (3)
4-5
: LGTM: Import statements correctly updated.The import statements have been correctly updated to use the new
RepositoryInfo
type and the consolidatedgetCodeHostInfoForRepo
utility function.
178-189
: LGTM: Well-structured aggregation function.The
aggregateMatches
function has been nicely refactored to accept an object withkey
andmatch
parameters, making the code more readable and maintainable.
41-50
:⚠️ Potential issueAdd null check for repository info.
Similar to the issue in
fileMatchContainer.tsx
, there's no robust null check when accessingrepoInfo[match.repositoryId]
. If the repository info is not found for a given ID, this could lead to unexpected behavior.({ key, match }) => { - const repo: RepositoryInfo | undefined = repoInfo[match.repositoryId]; + const repo: RepositoryInfo | undefined = repoInfo[match.repositoryId] || { + id: match.repositoryId, + name: match.repository, + codeHostType: 'unknown' + }; const info = repo ? getCodeHostInfoForRepo({ name: repo.name, codeHostType: repo.codeHostType, displayName: repo.displayName, webUrl: repo.webUrl, }) : undefined;Likely an incorrect or invalid review comment.
packages/backend/src/repoCompileUtils.ts (1)
546-567
: Single database write vs. fan-out
compileGenericGitHostConfig_url
always returns an array of length 1.
Down-stream code may expect batched inserts; consider returning the single object directly or documenting thatrepoData
may be 1-length to avoid unnecessary array handling.packages/backend/src/git.ts (6)
1-1
: Appropriate imports added for new Git functionality.The additional imports from 'simple-git' (
CheckRepoActions
andGitConfigScope
) support the new repository validation and metadata retrieval functions.
3-4
: Good type abstraction for progress callbacks.Creating the
onProgressFn
type alias improves code consistency and readability by standardizing the progress callback signature across multiple functions.
5-5
: Function signature updated with the new type.The
cloneRepository
function now uses theonProgressFn
type for its progress callback parameter, consistent with the new type system.
31-31
: Function signature updated with the new type.The
fetchRepository
function now uses theonProgressFn
type for its progress callback parameter, maintaining consistency with other functions.
61-61
: Function signature updated with the new type.The
upsertGitConfig
function now uses theonProgressFn
type for its progress callback parameter, completing the consistent type usage across the file.
110-123
: Good implementation of origin URL retrieval.The
getOriginUrl
function correctly retrieves the remote origin URL from a local Git repository using the appropriate GitConfigScope. Error handling follows the consistent pattern used throughout the file.
packages/web/src/app/[domain]/search/components/searchResultsPanel/fileMatchContainer.tsx
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
🔭 Outside diff range comments (2)
packages/web/src/features/agents/review-agent/nodes/fetchFileContent.ts (1)
22-24
:⚠️ Potential issuePotential double decoding of file content
getFileSource
returns the file’s raw text (file.content
) – it is not base-64 encoded.
Callingbase64Decode
here will corrupt the content (non-ASCII characters turn into � or the call throws).Either remove the decode step or update
getFileSource
to return abase64
flag and conditionally decode.-const fileContent = base64Decode(fileSourceResponse.source); +const fileContent = fileSourceResponse.source; // already plain textpackages/web/src/app/api/(server)/search/route.ts (1)
9-13
: 🛠️ Refactor suggestionValidate presence of
X-Org-Domain
header
Same header-presence issue as in the repos route; a missing header currently crashes the request.- const domain = request.headers.get("X-Org-Domain")!; + const domain = request.headers.get("X-Org-Domain"); + if (!domain) { + return new Response( + JSON.stringify({ message: "Missing X-Org-Domain header" }), + { status: 400, headers: { "Content-Type": "application/json" } }, + ); + }
🧹 Nitpick comments (6)
Makefile (2)
35-35
: Remove extraneous blank line with trailing whitespace.The blank line at line 35 appears to contain trailing whitespace and isn’t necessary; please remove it to keep the Makefile tidy.
36-37
: Guard destructive commands and ensure tool availability.The additions to
clean
will flush all Redis data and reset Prisma migrations, matchingsoft-reset
. A few points to consider:
- If
redis-cli
or theyarn dev:prisma:migrate:reset
script isn’t installed or on$PATH
, theclean
target will error out. You may want to:
- Prefix these lines with
-
to ignore errors when tools are missing:-redis-cli FLUSHALL -yarn dev:prisma:migrate:reset +- redis-cli FLUSHALL +- yarn dev:prisma:migrate:reset- Or add a prerequisite check:
@command -v redis-cli >/dev/null || { echo "redis-cli not found, skipping FLUSHALL"; }- Document prerequisites in your README/CONTRIBUTING guide so developers know they need Redis and Prisma CLI available.
CHANGELOG.md (1)
10-12
: Improve wording and fix minor duplicationConsider tightening the sentence and following the usual capitalization of “Git” and “URL”.
- Added support for indexing generic git hosts given a remote clone url or local path. + Added support for indexing generic Git hosts using either a remote clone URL or a local path.🧰 Tools
🪛 LanguageTool
[duplication] ~10-~10: Possible typo: you repeated a word.
Context: ...pec/v2.0.0.html). ## [Unreleased] ### Added - Added support for indexing generic git hosts ...(ENGLISH_WORD_REPEAT_RULE)
packages/web/src/features/agents/review-agent/nodes/fetchFileContent.ts (1)
8-16
: Replaceconsole.log
with scoped loggerRaw
console.log
leaks potentially sensitive repo information and is hard to filter in production.
Use your shared logging utility (e.g.,logger.debug
) and guard it behind a verbose flag.Also applies to: 31-32
packages/web/src/features/search/listReposApi.ts (1)
16-20
: Remove redundant variable initialisation
header
is declared and then immediately overwritten. Declaring it once keeps the function shorter and clearer.- let header: Record<string, string> = {}; - header = { + const header: Record<string, string> = { "X-Tenant-ID": orgId.toString() };packages/web/src/features/search/fileSourceApi.ts (1)
11-46
: Consistent unauthenticated single-tenant access flag
You pass/* allowSingleTenantUnauthedAccess = */ true
towithAuth
, which is great. Consider surfacing this intent via a small wrapper helper (e.g.withOptionalAuth
) to avoid repeating the magic boolean across files.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge Base: Disabled due to data retention organization setting
⛔ Files ignored due to path filters (1)
yarn.lock
is excluded by!**/yarn.lock
,!**/*.lock
📒 Files selected for processing (22)
CHANGELOG.md
(1 hunks)Makefile
(1 hunks)docs/docs/search/syntax-reference.mdx
(1 hunks)docs/snippets/schemas/v3/index.schema.mdx
(2 hunks)package.json
(1 hunks)packages/backend/package.json
(1 hunks)packages/backend/src/repoCompileUtils.ts
(2 hunks)packages/mcp/CHANGELOG.md
(1 hunks)packages/mcp/src/index.ts
(2 hunks)packages/mcp/src/schemas.ts
(3 hunks)packages/schemas/src/v3/index.schema.ts
(2 hunks)packages/schemas/src/v3/index.type.ts
(2 hunks)packages/web/src/app/[domain]/browse/[...path]/page.tsx
(5 hunks)packages/web/src/app/api/(server)/repos/route.ts
(1 hunks)packages/web/src/app/api/(server)/search/route.ts
(1 hunks)packages/web/src/app/api/(server)/source/route.ts
(1 hunks)packages/web/src/features/agents/review-agent/nodes/fetchFileContent.ts
(2 hunks)packages/web/src/features/search/fileSourceApi.ts
(1 hunks)packages/web/src/features/search/listReposApi.ts
(1 hunks)packages/web/src/features/search/schemas.ts
(3 hunks)packages/web/src/features/search/searchApi.ts
(2 hunks)schemas/v3/index.json
(1 hunks)
✅ Files skipped from review due to trivial changes (4)
- packages/backend/package.json
- docs/docs/search/syntax-reference.mdx
- packages/mcp/CHANGELOG.md
- packages/mcp/src/index.ts
🚧 Files skipped from review as they are similar to previous changes (9)
- schemas/v3/index.json
- packages/schemas/src/v3/index.type.ts
- packages/web/src/app/[domain]/browse/[...path]/page.tsx
- packages/schemas/src/v3/index.schema.ts
- docs/snippets/schemas/v3/index.schema.mdx
- packages/web/src/features/search/schemas.ts
- packages/mcp/src/schemas.ts
- packages/web/src/features/search/searchApi.ts
- packages/backend/src/repoCompileUtils.ts
🧰 Additional context used
🧬 Code Graph Analysis (4)
packages/web/src/app/api/(server)/repos/route.ts (1)
packages/web/src/features/search/listReposApi.ts (1)
listRepositories
(7-46)
packages/web/src/features/agents/review-agent/nodes/fetchFileContent.ts (1)
packages/web/src/features/search/fileSourceApi.ts (1)
getFileSource
(11-46)
packages/web/src/app/api/(server)/source/route.ts (1)
packages/web/src/features/search/fileSourceApi.ts (1)
getFileSource
(11-46)
packages/web/src/features/search/fileSourceApi.ts (4)
packages/web/src/actions.ts (3)
sew
(43-51)withAuth
(53-77)withOrgMembership
(79-125)packages/web/src/app/api/(client)/client.ts (1)
search
(20-35)packages/web/src/lib/utils.ts (1)
isServiceError
(190-196)packages/web/src/lib/serviceError.ts (1)
fileNotFound
(67-73)
🪛 LanguageTool
CHANGELOG.md
[duplication] ~10-~10: Possible typo: you repeated a word.
Context: ...pec/v2.0.0.html). ## [Unreleased] ### Added - Added support for indexing generic git hosts ...
(ENGLISH_WORD_REPEAT_RULE)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: build
🔇 Additional comments (2)
package.json (1)
4-4
:✅ Verification successful
Consolidate Yarn workspace patterns
Switching to a single"packages/*"
glob reduces maintenance overhead and automatically includes any new packages underpackages/
.Run the following to confirm only the desired workspaces are matched:
🏁 Script executed:
yarn workspaces list --jsonLength of output: 446
Approve consolidated workspace pattern
The single
"packages/*"
glob correctly includes all current workspaces (@sourcebot/backend
,@sourcebot/crypto
,@sourcebot/db
,@sourcebot/error
,@sourcebot/mcp
,@sourcebot/schemas
,@sourcebot/web
) and no unintended paths—verified withyarn workspaces list --json
.packages/web/src/features/agents/review-agent/nodes/fetchFileContent.ts (1)
17-18
:❓ Verification inconclusive
Hard-coded
"~"
domain may bypass org-membership checks
getFileSource
now performswithOrgMembership(session, domain …)
.
Passing the sentinel value"~"
will work only for single-tenant deployments that intentionally allow unauthenticated access.
In multi-tenant mode the request will fail and the review-agent will be unable to read files.If the PR payload already contains an organisation / domain identifier (e.g.,
pr_payload.orgDomain
), forward that instead and fall back to"~"
only when the field is absent.-const response = await getFileSource(fileSourceRequest, "~"); +const domain = pr_payload.orgDomain ?? "~"; // TODO: add orgDomain to payload type +const response = await getFileSource(fileSourceRequest, domain);
🏁 Script executed:
#!/bin/bash set -e # Locate the target file file=$(fd 'fetchFileContent.ts' -t f | head -n1) echo ">> Located file: $file" # Show the beginning of the file for context echo ">> Beginning of $file (lines 1–200):" sed -n '1,200p' "$file" echo # Search for getFileSource usages in the repo echo ">> getFileSource occurrences in repo:" rg "getFileSource" -n || echo "No occurrences found" echo # Inspect withOrgMembership references echo ">> withOrgMembership occurrences in repo:" rg "withOrgMembership" -n || echo "No occurrences found" echo # Check for pr_payload usage in the file echo ">> pr_payload in $file:" grep -n "pr_payload" "$file" || echo "No pr_payload references in this file"Length of output: 8528
#!/bin/bash set -e # Locate and display the type definition for sourcebot_pr_payload types_file=$(fd 'review-agent/types.ts' -t f | head -n1) echo ">> Located types file: $types_file" echo ">> Showing sourcebot_pr_payload definition:" sed -n '1,200p' "$types_file" | sed -n '/export type sourcebot_pr_payload/,/}/!d' echo # Locate and display getFileSource implementation api_file=$(fd 'search/fileSourceApi.ts' -t f | head -n1) echo ">> Located fileSourceApi: $api_file" echo ">> Showing getFileSource signature and withOrgMembership usage:" sed -n '1,200p' "$api_file" | sed -n '/export const getFileSource/,/^}/!d'
🏁 Script executed:
#!/bin/bash set -e # Locate the sourcebot_pr_payload type definition types_file=$(fd 'features/agents/review-agent/types.ts' -t f | head -n1) echo ">> Located types file: $types_file" # Show the definition of sourcebot_pr_payload rg -C3 "export type sourcebot_pr_payload" -n "$types_file"Length of output: 231
#!/bin/bash set -e # 1. Find occurrences of sourcebot_pr_payload to locate its type definition echo ">> Searching for 'sourcebot_pr_payload' occurrences:" rg -n "sourcebot_pr_payload" || true # 2. Display its type definition types_file=$(rg -l "sourcebot_pr_payload" | head -n1) echo ">> Located types file: $types_file" echo ">> Showing definition of sourcebot_pr_payload:" sed -n '1,200p' "$types_file" | sed -n -e '/export type sourcebot_pr_payload/,/}/!d' # 3. Locate fileSourceApi.ts and show getFileSource signature and withOrgMembership usage api_file=$(fd fileSourceApi.ts -t f | head -n1) echo ">> Located fileSourceApi: $api_file" echo ">> Showing getFileSource implementation:" sed -n '1,200p' "$api_file" | sed -n -e '/export const getFileSource/,/^}/!d'
Enforce correct org domain in getFileSource calls
The hard-coded
"~"
bypasseswithOrgMembership
and will break in multi-tenant mode. Forward the actual org domain from the PR payload (falling back to"~"
only if absent):- const response = await getFileSource(fileSourceRequest, "~"); + const domain = pr_payload.orgDomain ?? "~"; // TODO: add orgDomain to payload type + const response = await getFileSource(fileSourceRequest, domain);• Add
orgDomain: string
tosourcebot_pr_payload
in
packages/web/src/features/agents/review-agent/types.ts
.
• Update all callers offetchFileContent
to populate the new field.
This ensureswithOrgMembership
enforces access controls in multi-tenant deployments.
199a3e0
to
35b5284
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (4)
docs/docs/connections/local-repos.mdx (4)
11-11
: Fix typo and improve wording
The sentence reads:Local repositories are treated as read-only, meaing Sourcebot will not
git fetch
new revisions.
- Correct meaing → meaning.
- To reduce wordiness, consider replacing “a large number of repos” with “many repositories”.
Apply this diff:- Sourcebot can sync code from generic git repositories stored in a local directory. This can be helpful in scenarios where you already have a large number of repos already checked out. Local repositories are treated as **read-only**, meaing Sourcebot will **not** `git fetch` new revisions. + Sourcebot can sync code from generic git repositories stored in a local directory. This can be helpful if you already have many repositories checked out. Local repositories are treated as **read-only**, meaning Sourcebot will **not** `git fetch` new revisions.🧰 Tools
🪛 LanguageTool
[style] ~11-~11: To reduce wordiness, try specifying a number or using “many” or “numerous” instead.
Context: ...ful in scenarios where you already have a large number of repos already checked out. Local reposi...(LARGE_NUMBER_OF)
33-34
: Correct grammar and spelling
In the Step “Mount a volume” paragraph:… so Sourcebot can read it's contents. Sourcebot will not write to local repositories, so we can mount a seperate read-only volume.
- Change it's → its (possessive).
- Change seperate → separate.
Apply:- We need to mount a docker volume to the `repos` directory so Sourcebot can read it's contents. Sourcebot will **not** write to local repositories, so we can mount a seperate **read-only** volume: + We need to mount a docker volume to the `repos` directory so Sourcebot can read its contents. Sourcebot will **not** write to local repositories, so we can mount a separate **read-only** volume:
35-35
: Unify code fence language specifier
The Bash snippet starts with “bash”, but earlier you used “
sh”. For consistency, choose one style.
Apply:- ``` bash + ```bash
71-75
: Remove unsupported comment in JSON snippet
JSON does not support comments. The line// Attempt to sync directories contained in `repos/` (non-recursive)
may confuse users. Consider deleting the comment or moving it above the code block as explanatory text. For example:
- ```json - // Attempt to sync directories contained in `repos/` (non-recursive) - { - "type": "git", - "url": "file:///repos/*" - } - ``` + <!-- Sync directories contained in `repos/` (non-recursive) --> + ```json + { + "type": "git", + "url": "file:///repos/*" + } + ```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge Base: Disabled due to data retention organization setting
⛔ Files ignored due to path filters (2)
packages/web/public/git.svg
is excluded by!**/*.svg
yarn.lock
is excluded by!**/yarn.lock
,!**/*.lock
📒 Files selected for processing (55)
CHANGELOG.md
(1 hunks)Makefile
(1 hunks)docs/docs.json
(2 hunks)docs/docs/connections/generic-git-host.mdx
(1 hunks)docs/docs/connections/local-repos.mdx
(1 hunks)docs/docs/connections/overview.mdx
(1 hunks)docs/docs/search/multi-branch-indexing.mdx
(1 hunks)docs/docs/search/search-contexts.mdx
(1 hunks)docs/docs/search/syntax-reference.mdx
(1 hunks)docs/self-hosting/overview.mdx
(1 hunks)docs/snippets/schemas/v3/connection.schema.mdx
(1 hunks)docs/snippets/schemas/v3/genericGitHost.schema.mdx
(1 hunks)docs/snippets/schemas/v3/index.schema.mdx
(2 hunks)package.json
(1 hunks)packages/backend/package.json
(1 hunks)packages/backend/src/connectionManager.ts
(2 hunks)packages/backend/src/git.ts
(4 hunks)packages/backend/src/repoCompileUtils.ts
(2 hunks)packages/backend/src/repoManager.ts
(5 hunks)packages/backend/src/utils.ts
(1 hunks)packages/backend/src/zoekt.ts
(2 hunks)packages/mcp/CHANGELOG.md
(1 hunks)packages/mcp/src/index.ts
(2 hunks)packages/mcp/src/schemas.ts
(3 hunks)packages/schemas/src/v3/connection.schema.ts
(1 hunks)packages/schemas/src/v3/connection.type.ts
(2 hunks)packages/schemas/src/v3/genericGitHost.schema.ts
(1 hunks)packages/schemas/src/v3/genericGitHost.type.ts
(1 hunks)packages/schemas/src/v3/index.schema.ts
(2 hunks)packages/schemas/src/v3/index.type.ts
(2 hunks)packages/web/src/actions.ts
(4 hunks)packages/web/src/app/[domain]/browse/[...path]/page.tsx
(5 hunks)packages/web/src/app/[domain]/components/fileHeader.tsx
(2 hunks)packages/web/src/app/[domain]/components/repositoryCarousel.tsx
(2 hunks)packages/web/src/app/[domain]/search/components/codePreviewPanel/index.tsx
(1 hunks)packages/web/src/app/[domain]/search/components/filterPanel/index.tsx
(6 hunks)packages/web/src/app/[domain]/search/components/searchResultsPanel/fileMatchContainer.tsx
(5 hunks)packages/web/src/app/[domain]/search/components/searchResultsPanel/index.tsx
(4 hunks)packages/web/src/app/[domain]/search/page.tsx
(9 hunks)packages/web/src/app/api/(server)/repos/route.ts
(1 hunks)packages/web/src/app/api/(server)/search/route.ts
(2 hunks)packages/web/src/app/api/(server)/source/route.ts
(2 hunks)packages/web/src/features/agents/review-agent/nodes/fetchFileContent.ts
(2 hunks)packages/web/src/features/search/fileSourceApi.ts
(1 hunks)packages/web/src/features/search/listReposApi.ts
(1 hunks)packages/web/src/features/search/schemas.ts
(3 hunks)packages/web/src/features/search/searchApi.ts
(2 hunks)packages/web/src/features/search/types.ts
(2 hunks)packages/web/src/features/search/zoektSchema.ts
(1 hunks)packages/web/src/lib/errorCodes.ts
(1 hunks)packages/web/src/lib/utils.ts
(9 hunks)schemas/v3/connection.json
(1 hunks)schemas/v3/genericGitHost.json
(1 hunks)schemas/v3/index.json
(1 hunks)vendor/zoekt
(1 hunks)
✅ Files skipped from review due to trivial changes (3)
- docs/docs/search/syntax-reference.mdx
- schemas/v3/index.json
- packages/schemas/src/v3/genericGitHost.type.ts
🚧 Files skipped from review as they are similar to previous changes (50)
- packages/mcp/CHANGELOG.md
- vendor/zoekt
- docs/self-hosting/overview.mdx
- packages/web/src/features/search/types.ts
- docs/docs/search/multi-branch-indexing.mdx
- packages/web/src/app/api/(server)/search/route.ts
- packages/web/src/lib/errorCodes.ts
- package.json
- docs/docs/search/search-contexts.mdx
- packages/backend/src/zoekt.ts
- packages/web/src/app/[domain]/search/components/codePreviewPanel/index.tsx
- docs/snippets/schemas/v3/genericGitHost.schema.mdx
- Makefile
- packages/backend/package.json
- docs/docs/connections/overview.mdx
- docs/docs/connections/generic-git-host.mdx
- schemas/v3/connection.json
- packages/web/src/features/agents/review-agent/nodes/fetchFileContent.ts
- docs/docs.json
- packages/mcp/src/index.ts
- packages/web/src/app/[domain]/components/repositoryCarousel.tsx
- packages/web/src/features/search/zoektSchema.ts
- packages/backend/src/connectionManager.ts
- packages/schemas/src/v3/index.type.ts
- packages/backend/src/repoManager.ts
- packages/web/src/lib/utils.ts
- packages/web/src/features/search/fileSourceApi.ts
- packages/backend/src/utils.ts
- packages/schemas/src/v3/connection.type.ts
- docs/snippets/schemas/v3/index.schema.mdx
- packages/web/src/app/[domain]/search/components/searchResultsPanel/index.tsx
- packages/schemas/src/v3/genericGitHost.schema.ts
- packages/web/src/app/api/(server)/source/route.ts
- packages/web/src/app/api/(server)/repos/route.ts
- packages/schemas/src/v3/connection.schema.ts
- schemas/v3/genericGitHost.json
- packages/web/src/app/[domain]/components/fileHeader.tsx
- packages/web/src/app/[domain]/browse/[...path]/page.tsx
- packages/web/src/app/[domain]/search/components/filterPanel/index.tsx
- packages/web/src/features/search/listReposApi.ts
- packages/web/src/features/search/schemas.ts
- packages/backend/src/git.ts
- packages/web/src/app/[domain]/search/page.tsx
- packages/backend/src/repoCompileUtils.ts
- docs/snippets/schemas/v3/connection.schema.mdx
- packages/schemas/src/v3/index.schema.ts
- packages/web/src/app/[domain]/search/components/searchResultsPanel/fileMatchContainer.tsx
- packages/web/src/actions.ts
- packages/web/src/features/search/searchApi.ts
- packages/mcp/src/schemas.ts
🧰 Additional context used
🪛 LanguageTool
CHANGELOG.md
[duplication] ~10-~10: Possible typo: you repeated a word.
Context: ...pec/v2.0.0.html). ## [Unreleased] ### Added - Added support for indexing generic git hosts ...
(ENGLISH_WORD_REPEAT_RULE)
docs/docs/connections/local-repos.mdx
[style] ~11-~11: To reduce wordiness, try specifying a number or using “many” or “numerous” instead.
Context: ...ful in scenarios where you already have a large number of repos already checked out. Local reposi...
(LARGE_NUMBER_OF)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: build
🔇 Additional comments (1)
CHANGELOG.md (1)
10-12
: Looks good
The “Unreleased” section correctly documents the new support for indexing generic git hosts, referencing PR #307.🧰 Tools
🪛 LanguageTool
[duplication] ~10-~10: Possible typo: you repeated a word.
Context: ...pec/v2.0.0.html). ## [Unreleased] ### Added - Added support for indexing generic git hosts ...(ENGLISH_WORD_REPEAT_RULE)
/review |
@@ -5,9 +5,8 @@ import gitlabLogo from "@/public/gitlab.svg"; | |||
import giteaLogo from "@/public/gitea.svg"; | |||
import gerritLogo from "@/public/gerrit.svg"; | |||
import bitbucketLogo from "@/public/bitbucket.svg"; | |||
import gitLogo from "@/public/git.svg"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The imported 'gitLogo' from '@/public/git.svg' is unused in the current code. Remove this import to keep the code clean and avoid unused imports warnings.
export type CodeHostType = | ||
"github" | | ||
"gitlab" | | ||
"gitea" | | ||
"gerrit" | | ||
"bitbucket-cloud" | | ||
"bitbucket-server" | | ||
"generic-git-host"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new CodeHostType includes 'generic-git-host' but no handling logic or icon is provided for this new type. Please implement a corresponding case in getCodeHostInfoForRepo and getCodeHostIcon to handle this type properly.
|
||
type CodeHostInfo = { | ||
type: CodeHostType; | ||
displayName: string; | ||
codeHostName: string; | ||
repoLink: string; | ||
repoLink?: string; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changing 'repoLink' from required to optional in CodeHostInfo may lead to nullability issues if callers expect it to always be defined. Confirm that all code consuming this property can handle undefined values safely, or provide a default/fallback value.
export const getCodeHostInfoForRepo = (repo: { | ||
codeHostType: string, | ||
name: string, | ||
displayName?: string, | ||
webUrl?: string, | ||
}): CodeHostInfo | undefined => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The input type for getCodeHostInfoForRepo uses 'codeHostType' as string instead of the stricter 'CodeHostType'. This weakens type safety and could allow invalid types. It is advisable to use 'CodeHostType' to enforce valid host types.
switch (codeHostType) { | ||
case 'github': { | ||
const { src, className } = getCodeHostIcon('github')!; | ||
return { | ||
type: "github", | ||
displayName: displayName, | ||
displayName: displayName ?? name, | ||
codeHostName: "GitHub", | ||
repoLink: cloneUrl, | ||
repoLink: webUrl, | ||
icon: src, | ||
iconClassName: className, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The internal switch statement in getCodeHostInfoForRepo only shows handling for 'github'. Other previously supported types (gitlab, gitea, gerrit, bitbucket) and the new 'generic-git-host' are missing. Please restore or implement handlers for these types to maintain feature coverage.
displayName: displayName ?? name, | ||
codeHostName: "GitLab", | ||
repoLink: cloneUrl, | ||
repoLink: webUrl, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The change sets displayName to displayName ?? name on line 77, but variable 'name' is not defined or passed into _getCodeHostInfoInternal. This will cause a ReferenceError at runtime. Ensure 'name' is defined or passed as a parameter.
codeHostName: "GitLab", | ||
repoLink: cloneUrl, | ||
repoLink: webUrl, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed repoLink from cloneUrl to webUrl on line 79, but _getCodeHostInfoInternal only receives three arguments (type, displayName, cloneUrl). The variable 'webUrl' is not available here. You should pass webUrl as a parameter instead of cloneUrl if you want to use it as repoLink.
displayName: displayName ?? name, | ||
codeHostName: "Gitea", | ||
repoLink: cloneUrl, | ||
repoLink: webUrl, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The change from "displayName" to "displayName ?? name" introduces a variable "name" that is not defined or passed into the function scope in _getCodeHostInfoInternal or at the call site for gitea. This will cause a ReferenceError at runtime. Ensure "name" is defined or passed as an argument before using it as a fallback for displayName.
repoLink: webUrl, | ||
icon: src, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The property's value for "repoLink" was changed from "cloneUrl" to "webUrl". This change is inconsistent with all other code hosts and _getCodeHostInfoInternal parameters that use cloneUrl. Confirm that "webUrl" is the intended URL string representing the repository HTML web link and is defined at the call site. Otherwise, this mismatch could lead to incorrect or broken repo links for Gitea repositories.
type: "gerrit", | ||
displayName: displayName, | ||
displayName: displayName ?? name, | ||
codeHostName: "Gerrit", | ||
repoLink: cloneUrl, | ||
repoLink: webUrl, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the updated code for the Gerrit case, the "displayName" field uses a fallback with displayName ?? name
, but the name
identifier is not defined or passed to the function. This will cause a runtime reference error. You should ensure name
is passed as a parameter or otherwise defined before using it as a fallback.
Additionally, the repoLink
value was changed from cloneUrl
to webUrl
. It must be assured that webUrl
is the correct URL to use for Gerrit repositories and that the variable in scope matches this intent. Review how webUrl
is supplied and replace cloneUrl
only if it aligns with semantic expectations.
displayName: displayName ?? name, | ||
codeHostName: "Bitbucket", | ||
repoLink: cloneUrl, | ||
repoLink: webUrl, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the 'bitbucket-server' case, the 'repoLink' field is updated from 'cloneUrl' to 'webUrl', but the parameter is still named 'cloneUrl' in _getCodeHostInfoInternal. This renaming can cause confusion about what URL is being passed. Consider renaming the parameter to a more generic name like 'url' or 'repoUrl' to reflect that it can hold either clone URLs or web URLs. Also, ensure consistency in the use of 'displayName' with fallback to 'name' in other similar cases.
displayName: displayName ?? name, | ||
codeHostName: "Bitbucket", | ||
repoLink: cloneUrl, | ||
repoLink: webUrl, | ||
icon: src, | ||
iconClassName: className, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the 'bitbucket-cloud' case, 'repoLink' is changed from 'cloneUrl' to 'webUrl' in the new code, but the parameter name remains 'cloneUrl'. For clarity and to avoid confusion, consider renaming the parameter to 'url' or splitting parameters to separately accept 'cloneUrl' and 'webUrl'. This change can reduce ambiguity about which URL is used for 'repoLink'.
case "generic-git-host": { | ||
const { src, className } = getCodeHostIcon('generic-git-host')!; | ||
return { | ||
type: "generic-git-host", | ||
displayName: displayName ?? name, | ||
codeHostName: "Generic Git Host", | ||
repoLink: webUrl, | ||
icon: src, | ||
iconClassName: className, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new case 'generic-git-host' returns a CodeHostInfo object, but the function '_getCodeHostInfoInternal' is typed to only return 'CodeHostInfo | undefined' where 'CodeHostInfo' expects 'type' to be a 'CodeHostType'. Since 'generic-git-host' is not included in the 'CodeHostType' union type, this can result in a type mismatch. You should update the 'CodeHostType' type to include 'generic-git-host' or adjust the return type accordingly to avoid TypeScript errors.
return { | ||
src: bitbucketLogo, | ||
} | ||
case "generic-git-host": | ||
return { | ||
src: gitLogo, | ||
} | ||
default: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new 'generic-git-host' case returns an object with 'src: gitLogo', but 'gitLogo' is not imported in this file. Import the appropriate logo image (e.g., gitLogo) at the top of the file or ensure the variable is defined to avoid a ReferenceError.
case "bitbucket-cloud": | ||
case "bitbucket-server": | ||
return true; | ||
case "generic-git-host": | ||
case "gerrit": | ||
return false; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new 'generic-git-host' type is added in the switch statement but is not handled in the subsequent code. This will cause _getCodeHostInfoInternal to return undefined for this type which may be unintended. Please add appropriate handling for the 'generic-git-host' case or explicitly handle unknown types with a default case.
}, | ||
{ | ||
"$ref": "./genericGitHost.json" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing comma after the closing brace at line 19. To correctly add a new object in the array, add a comma at the end of line 19 before starting the new object on line 20.
@@ -78,7 +78,7 @@ | |||
}, | |||
"contexts": { | |||
"type": "object", | |||
"description": "[Sourcebot EE] Defines a collection of search contexts. This is only available in single-tenancy mode. See: https://docs.sourcebot.dev/self-hosting/more/search-contexts", | |||
"description": "[Sourcebot EE] Defines a collection of search contexts. This is only available in single-tenancy mode. See: https://docs.sourcebot.dev/docs/search/search-contexts", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The '+' symbol at the start of line 81 appears to be a diff marker and should be removed for valid JSON syntax.
@@ -5460,6 +5460,7 @@ __metadata: | |||
cross-fetch: "npm:^4.0.0" | |||
dotenv: "npm:^16.4.5" | |||
express: "npm:^4.21.2" | |||
git-url-parse: "npm:^16.1.0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please ensure that adding 'git-url-parse' at version '^16.1.0' does not introduce conflicts or compatibility issues with existing dependencies. Verify that the version constraint aligns with the rest of the project's dependencies and that it is properly listed in the project's package.json.
"@types/parse-path@npm:^7.0.0": | ||
version: 7.0.3 | ||
resolution: "@types/parse-path@npm:7.0.3" | ||
checksum: 10c0/8344b6c7acba4e4e5a8d542f56f53c297685fa92f9b0c085d7532cc7e1b661432cecfc1c75c76cdb0d161c95679b6ecfe0573d9fef7c836962aacf604150a984 | ||
languageName: node | ||
linkType: hard | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new entry for "@types/parse-path@npm:^7.0.0" includes detailed fields such as version, resolution, checksum, languageName, and linkType. Ensure the added checksum is accurate and matches the package content to prevent integrity issues. Otherwise, the addition appears structurally correct.
"git-up@npm:^8.1.0": | ||
version: 8.1.1 | ||
resolution: "git-up@npm:8.1.1" | ||
dependencies: | ||
is-ssh: "npm:^1.4.0" | ||
parse-url: "npm:^9.2.0" | ||
checksum: 10c0/2cc4461d8565a3f7a1ecd3d262a58ddb8df0a67f7f7d4915df2913c460b2e88ae570a6ea810700a6d22fb3b9e4bea8dd10a8eb469900ddc12e35c62208608c03 | ||
languageName: node | ||
linkType: hard | ||
|
||
"git-url-parse@npm:^16.1.0": | ||
version: 16.1.0 | ||
resolution: "git-url-parse@npm:16.1.0" | ||
dependencies: | ||
git-up: "npm:^8.1.0" | ||
checksum: 10c0/b8f5ebcbd5b2baf9f1bb77a217376f0247c47fe1d42811ccaac3015768eebb0759a59051f758e50e70adf5c67ae059d1975bf6b750164f36bfd39138d11b940b | ||
languageName: node | ||
linkType: hard |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please verify the addition of the 'git-up' and 'git-url-parse' packages and their versions in the lockfile to ensure that these updates are intentional and compatible. Confirm that the dependencies 'is-ssh' and 'parse-url' for 'git-up', and 'git-up' for 'git-url-parse' correspond to supported versions and do not introduce conflicts.
"is-ssh@npm:^1.4.0": | ||
version: 1.4.1 | ||
resolution: "is-ssh@npm:1.4.1" | ||
dependencies: | ||
protocols: "npm:^2.0.1" | ||
checksum: 10c0/021a7355cb032625d58db3cc8266ad9aa698cbabf460b71376a0307405577fd7d3aa0826c0bf1951d7809f134c0ee80403306f6d7633db94a5a3600a0106b398 | ||
languageName: node | ||
linkType: hard | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new entry for "is-ssh@npm:^1.4.0" is missing a 'dependenciesMeta' section, which might be necessary for optional dependencies or other metadata. Also, verify that the 'checksum' is generated correctly and matches the actual content hash. Ensure this entry includes all necessary fields for consistent package management.
"parse-path@npm:^7.0.0": | ||
version: 7.1.0 | ||
resolution: "parse-path@npm:7.1.0" | ||
dependencies: | ||
protocols: "npm:^2.0.0" | ||
checksum: 10c0/8c8c8b3019323d686e7b1cd6fd9653bc233404403ad68827836fbfe59dfe26aaef64ed4e0396d0e20c4a7e1469312ec969a679618960e79d5e7c652dc0da5a0f |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The checksum values for the 'parse-path' and 'parse-url' packages appear as long strings that might not be correctly formatted or standard. Verify that these checksums are valid and conform to expected formats to ensure integrity verification works properly.
"@types/parse-path": "npm:^7.0.0" | ||
parse-path: "npm:^7.0.0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the 'parse-url' package dependencies, '@types/parse-path' is listed with a version range 'npm:^7.0.0' but without a corresponding entry for '@types/parse-path' in this snippet. Confirm that '@types/parse-path' is correctly declared and available to avoid resolution issues.
"protocols@npm:^2.0.0, protocols@npm:^2.0.1": | ||
version: 2.0.2 | ||
resolution: "protocols@npm:2.0.2" | ||
checksum: 10c0/b87d78c1fcf038d33691da28447ce94011d5c7f0c7fd25bcb5fb4d975991c99117873200c84f4b6a9d7f8b9092713a064356236960d1473a7d6fcd4228897b60 | ||
languageName: node | ||
linkType: hard | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Entry "protocols@npm:^2.0.0, protocols@npm:^2.0.1" added lacks dependencies
field. Confirm if this package has no dependencies; if it does, specify them explicitly to ensure proper resolution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (4)
docs/snippets/schemas/v3/connection.schema.mdx (1)
825-892
:⚠️ Potential issueURL pattern regex needs improvement for the generic Git host.
The regex pattern for validating URLs in the generic Git host connection contains an unescaped dot character which might cause incorrect validation.
The regex pattern
^(https?:\/\/[^\s/$.?#].[^\s]*|file:\/\/\/[^\s]+)$
contains an unescaped dot character.[^\s]*
which should be\.[^\s]*
to correctly match a literal dot in URLs. This could lead to validation issues with some URLs.Suggest updating the pattern to:
-"pattern": "^(https?:\\/\\/[^\\s/$.?#].[^\\s]*|file:\\/\\/\\/[^\\s]+)$", +"pattern": "^(https?:\\/\\/[^\\s/$.?#]\\.[^\\s]*|file:\\/\\/\\/[^\\s]+)$",packages/schemas/src/v3/connection.schema.ts (1)
825-892
:⚠️ Potential issueURL pattern regex needs improvement for the generic Git host.
The regex pattern for validating URLs in the generic Git host connection contains an unescaped dot character which might cause incorrect validation.
The regex pattern
^(https?:\/\/[^\s/$.?#].[^\s]*|file:\/\/\/[^\s]+)$
contains an unescaped dot character.[^\s]*
which should be\.[^\s]*
to correctly match a literal dot in URLs. This could lead to validation issues with some URLs.Suggest updating the pattern to:
-"pattern": "^(https?:\\/\\/[^\\s/$.?#].[^\\s]*|file:\\/\\/\\/[^\\s]+)$", +"pattern": "^(https?:\\/\\/[^\\s/$.?#]\\.[^\\s]*|file:\\/\\/\\/[^\\s]+)$",docs/snippets/schemas/v3/index.schema.mdx (2)
1067-1067
: URL description could be more concise.The description for the
url
property is quite verbose, making it potentially hard to scan quickly.Consider breaking the description into bullet points or simplifying it:
-"description": "The URL to the git repository. This can either be a remote URL (prefixed with `http://` or `https://`) or a absolute path to a directory on the local machine (prefixed with `file://`). If a local directory is specified, it must point to the root of a git repository. Local directories are treated as read-only modified. Local directories support glob patterns.", +"description": "The URL to the git repository. Supports:\n- Remote URLs (http:// or https://)\n- Local paths (file:///). Must be a valid git repository root. Local paths support glob patterns and are read-only.",
1068-1068
:⚠️ Potential issueURL pattern regex needs improvement for the generic Git host.
The regex pattern for validating URLs in the generic Git host connection contains an unescaped dot character which might cause incorrect validation.
The regex pattern
^(https?:\/\/[^\s/$.?#].[^\s]*|file:\/\/\/[^\s]+)$
contains an unescaped dot character.[^\s]*
which should be\.[^\s]*
to correctly match a literal dot in URLs. This could lead to validation issues with some URLs.Suggest updating the pattern to:
-"pattern": "^(https?:\\/\\/[^\\s/$.?#].[^\\s]*|file:\\/\\/\\/[^\\s]+)$", +"pattern": "^(https?:\\/\\/[^\\s/$.?#]\\.[^\\s]*|file:\\/\\/\\/[^\\s]+)$",
🧹 Nitpick comments (4)
packages/schemas/tools/generate.ts (1)
28-36
: Ensure dereferenced output stays in-sync with the typed schemaSwitching from
$RefParser.bundle
todereference
is a nice improvement because the generated JSON is now fully self-contained.
One caveat:compileFromFile()
still operates on the original file (schemaPath
), which means:
- The TypeScript types continue to be generated from the non-dereferenced version (they still include
$ref
s).- If a later ref is introduced that cannot be resolved at compile time, the
.schema.ts
file will silently succeed (because it is already dereferenced) whilecompileFromFile()
will throw.If the intent is to have both artefacts (
.schema.ts
and.type.ts
) built from the same fully-resolved definition, consider piping the dereferenced object tocompile()
instead of the raw file path, e.g.:-const content = await compileFromFile(schemaPath, { +const tmpSchemaPath = path.join(srcOutDir, `${name}.tmp.deref.json`); +await writeFile(tmpSchemaPath, schema); +const content = await compileFromFile(tmpSchemaPath, {(or use
json-schema-to-typescript
’s programmatic API that accepts an object).This will guarantee that both outputs always reflect identical schema sources.
packages/schemas/src/v3/index.schema.ts (2)
1067-1067
: Consider clarifying the confusing "read-only modified" wordingThe description for the
url
field states that "Local directories are treated as read-only modified." This wording is confusing or potentially contradictory. Consider revising to clearly state that local repositories are treated as read-only (if that's the intent) or clarify what "modified" means in this context.- "description": "The URL to the git repository. This can either be a remote URL (prefixed with `http://` or `https://`) or a absolute path to a directory on the local machine (prefixed with `file://`). If a local directory is specified, it must point to the root of a git repository. Local directories are treated as read-only modified. Local directories support glob patterns.", + "description": "The URL to the git repository. This can either be a remote URL (prefixed with `http://` or `https://`) or a absolute path to a directory on the local machine (prefixed with `file://`). If a local directory is specified, it must point to the root of a git repository. Local repositories are treated as read-only. Local directories support glob patterns.",
1078-1094
: Consider adding maxItems constraint to enforce the 64 item limitThe description states a maximum of 64 branches/tags can be indexed, but this constraint isn't enforced in the schema. Consider adding
"maxItems": 64
to both arrays to ensure consistency between documentation and validation."branches": { "type": "array", "description": "List of branches to include when indexing. For a given repo, only the branches that exist on the repo's remote *and* match at least one of the provided `branches` will be indexed. The default branch (HEAD) is always indexed. Glob patterns are supported. A maximum of 64 branches can be indexed, with any additional branches being ignored.", "items": { "type": "string" }, + "maxItems": 64, "examples": [ [ "main", "release/*" ], [ "**" ] ], "default": [] }, "tags": { "type": "array", "description": "List of tags to include when indexing. For a given repo, only the tags that exist on the repo's remote *and* match at least one of the provided `tags` will be indexed. Glob patterns are supported. A maximum of 64 tags can be indexed, with any additional tags being ignored.", "items": { "type": "string" }, + "maxItems": 64, "examples": [ [ "latest", "v2.*.*" ], [ "**" ] ], "default": [] }The same change should be applied to all other revisions properties for GitHub, GitLab, Gitea, and Bitbucket connections.
Also applies to: 1096-1112
docs/snippets/schemas/v2/index.schema.mdx (1)
1422-1422
: Consider updating the Git URL description to match V3 schemaThe URL description for the Git configuration in the V2 schema doesn't mention support for local file URLs with glob patterns, unlike the V3 schema. If the V2 schema should support local repositories, consider updating the description to match the V3 schema.
- "description": "The URL to the git repository." + "description": "The URL to the git repository. This can either be a remote URL (prefixed with `http://` or `https://`) or a absolute path to a directory on the local machine (prefixed with `file://`). If a local directory is specified, it must point to the root of a git repository. Local repositories are treated as read-only. Local directories support glob patterns."
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge Base: Disabled due to data retention organization setting
📒 Files selected for processing (9)
docs/snippets/schemas/v1/index.schema.mdx
(4 hunks)docs/snippets/schemas/v2/index.schema.mdx
(9 hunks)docs/snippets/schemas/v3/connection.schema.mdx
(7 hunks)docs/snippets/schemas/v3/index.schema.mdx
(8 hunks)packages/schemas/src/v1/index.schema.ts
(4 hunks)packages/schemas/src/v2/index.schema.ts
(9 hunks)packages/schemas/src/v3/connection.schema.ts
(7 hunks)packages/schemas/src/v3/index.schema.ts
(8 hunks)packages/schemas/tools/generate.ts
(2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: build
🔇 Additional comments (17)
packages/schemas/src/v2/index.schema.ts (1)
1-2260
: Auto-generated file – no manual review requiredThis file is entirely generated; the inlined refs improve portability and do not affect runtime logic. No actionable issues detected.
packages/schemas/src/v1/index.schema.ts (1)
1-352
: Auto-generated file – no manual review requiredSame observation as for v2: the expanded definitions are correct and tightly scoped. Nothing to address.
docs/snippets/schemas/v1/index.schema.mdx (3)
22-85
: Schema improvements through detailed inline definitions.The expanded inline definition of GitHub configuration properties enhances documentation clarity and provides comprehensive information for each property. This approach is more maintainable than using references.
87-124
: Schema consistency maintained across connection types.The GitLab configuration schema follows the same pattern of expansion, maintaining consistency with the GitHub configuration structure and providing detailed property information.
239-343
: Root schema properly reflects expanded property definitions.The root schema's
Configs
array now consistently uses an inlineanyOf
schema with the detailed GitHub and GitLab schemas, aligning with the rest of the schema refactoring.docs/snippets/schemas/v3/connection.schema.mdx (4)
235-262
: More flexible token configuration with typed objects.The enhanced token configuration now properly supports both secret references and environment variables, with strict typing, required properties and preventions of additional properties.
375-414
: Consistent revision handling across connection types.The expanded inline revisions property definition now provides detailed documentation for branches and tags across connection types, with consistent structure and constraints.
Also applies to: 534-574, 767-806
827-844
: Well-structured generic Git host connection configuration.The new generic Git host connection type is well defined with clear property constraints and requirements. The support for both remote and local repositories with the same connection type is a great addition.
845-886
: Consistent revisions schema for generic Git host.The revisions object for the generic Git host follows the same structure as other connection types, ensuring consistency across the schema.
packages/schemas/src/v3/connection.schema.ts (2)
234-262
: Improved token configuration with typed objects.The token property now correctly supports both secret references and environment variables consistently across connection types.
887-892
: Required fields properly defined for generic Git host.The schema correctly requires both
type
andurl
properties for the generic Git host connection, ensuring proper validation.docs/snippets/schemas/v3/index.schema.mdx (4)
118-178
: Enhanced settings schema with detailed property documentation.The expanded
settings
property provides comprehensive documentation for all configuration options, making it easier for users to understand available settings and their impacts.
464-492
: Consistent token configuration pattern across connection types.All connection types now follow the same pattern for token configuration, supporting both secret references and environment variables consistently.
Also applies to: 665-694, 897-925
1055-1122
: Generic Git host connection configuration properly integrated.The new generic Git host connection type is well integrated into the overall schema structure, maintaining consistency with other connection types while adding the unique capability to support both remote and local repositories.
181-181
:✅ Verification successful
URL update in documentation reference.
The documentation URL has been updated from
self-hosting/more/search-contexts
todocs/search/search-contexts
.Please verify that the new URL path is correct and accessible:
🌐 Web query:
Is https://docs.sourcebot.dev/docs/search/search-contexts a valid documentation URL for Sourcebot?
💡 Result:
Yes, https://docs.sourcebot.dev/docs/search/search-contexts is a valid documentation URL for Sourcebot. This page provides an overview of search contexts, a feature available in Sourcebot's Enterprise Edition. Search contexts allow you to define specific subsets of your codebase to tailor search results, enhancing the efficiency and relevance of your searches. For more detailed information, you can refer to the official documentation.
Documentation URL Verified
The updated URL in
docs/snippets/schemas/v3/index.schema.mdx:181
(https://docs.sourcebot.dev/docs/search/search-contexts
) is valid and correctly points to the Sourcebot Enterprise Edition Search Contexts documentation. No further changes required.docs/snippets/schemas/v2/index.schema.mdx (2)
84-102
: LGTM! Token schema improvementsThe token schemas now use an
anyOf
structure to support both direct string tokens and environment variable references, providing greater flexibility and security. This is a good improvement.Also applies to: 282-299, 473-490
772-2255
: Impressive schema expansion workThe extensive inlining of schema definitions in the
Repos
section makes the schema more self-contained and explicit, which improves clarity for users and tools. The schema expansions appear to be consistent with their corresponding definitions.
"format": "url", | ||
"description": "The URL to the git repository. This can either be a remote URL (prefixed with `http://` or `https://`) or a absolute path to a directory on the local machine (prefixed with `file://`). If a local directory is specified, it must point to the root of a git repository. Local directories are treated as read-only modified. Local directories support glob patterns.", | ||
"pattern": "^(https?:\\/\\/[^\\s/$.?#].[^\\s]*|file:\\/\\/\\/[^\\s]+)$", | ||
"examples": [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Verification agent
🧩 Analysis chain
Verify the regular expression pattern for file URLs
The regex pattern for URLs needs to be carefully verified to ensure it correctly handles file URLs. The current pattern requires exactly three slashes after 'file:' (as in 'file:///path'). This is generally correct for absolute paths on Unix-like systems, but ensure this matches your implementation expectations.
Run this command to test the pattern with various sample URLs:
🏁 Script executed:
#!/bin/bash
# Test URLs against the pattern in schema
echo "Testing URL pattern..."
PATTERN='^(https?:\/\/[^\s/$.?#].[^\s]*|file:\/\/\/[^\s]+)$'
# Test cases
declare -a URLS=(
"https://github.com/sourcebot-dev/sourcebot"
"http://example.org/repo"
"file:///path/to/repo"
"file:///repos/*"
"file:/path/to/repo" # Should fail - missing slashes
"git://github.com/repo" # Should fail - not http/https/file
)
for url in "${URLS[@]}"; do
if [[ $url =~ $PATTERN ]]; then
echo "✅ $url - VALID"
else
echo "❌ $url - INVALID"
fi
done
Length of output: 1454
Update the URL regex to match valid HTTPS and file URIs
Testing shows that https://github.com/sourcebot-dev/sourcebot
is rejected by the current pattern. Simplify and correct the regex in packages/schemas/src/v3/index.schema.ts
(around the "pattern"
definition at line ≈1068):
• Location:
packages/schemas/src/v3/index.schema.ts (the "pattern"
property for URL strings)
- "pattern": "^(https?:\\/\\/[^\s/$.?#].[^\s]*|file:\\/\\/\\/[^\s]+)$"
+ "pattern": "^(https?:\\/\\/\\S+|file:\\/\\/\\/\\S+)$"
This updated regex now correctly accepts both your HTTPS test case and file:///…
paths.
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In packages/schemas/src/v3/index.schema.ts around line 1068, the current regex
pattern for URL validation incorrectly rejects valid HTTPS URLs like
"https://github.com/sourcebot-dev/sourcebot" and requires exactly three slashes
after "file:". Update the regex pattern to a simplified and corrected version
that properly matches valid HTTP, HTTPS, and file URIs with three slashes,
ensuring it accepts both HTTPS URLs and file:/// paths as intended.
This PR adds support for specifying generic git repositories. "generic" here means any repository that uses Git as its VCS.
Repos may be hosted remotely on a git host (helpful when that host is not officially supported):
Or they can be local**:
**Some caveats with local repositories:
remote.origin.url
set in their git configFixes #301
Fixes #250
Summary by CodeRabbit
Summary by CodeRabbit
New Features
Documentation
Bug Fixes
Refactor
Chores