Search contexts #273

Merged
merged 14 commits into from
Apr 25, 2025
6 changes: 6 additions & 0 deletions CHANGELOG.md
@@ -5,6 +5,12 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]

### Added
- [Sourcebot EE] Added search contexts, user-defined groupings of repositories that help focus searches on specific areas of a codebase. [#273](https://github.com/sourcebot-dev/sourcebot/pull/273)


## [3.0.4] - 2025-04-12

### Fixes
8 changes: 6 additions & 2 deletions LICENSE
@@ -1,6 +1,10 @@
MIT License
Copyright (c) 2025 Taqla Inc.

Copyright (c) Taqla, Inc.
Portions of this software are licensed as follows:

- All content that resides under the "ee/" and "packages/web/src/ee/" directories of this repository, if these directories exist, is licensed under the license defined in "ee/LICENSE".
- All third party components incorporated into the Sourcebot Software are licensed under the original license provided by the owner of the applicable component.
- Content outside of the above mentioned directories or restrictions above is available under the "MIT Expat" license as defined below.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
1 change: 1 addition & 0 deletions Makefile
@@ -33,6 +33,7 @@ clean:
soft-reset:
rm -rf .sourcebot
redis-cli FLUSHALL
yarn dev:prisma:migrate:reset


.PHONY: bin
7 changes: 5 additions & 2 deletions docs/docs.json
@@ -39,6 +39,7 @@
{
"group": "More",
"pages": [
+ "docs/more/syntax-reference",
"docs/more/roles-and-permissions"
]
}
@@ -52,7 +53,8 @@
"group": "Getting Started",
"pages": [
"self-hosting/overview",
- "self-hosting/configuration"
+ "self-hosting/configuration",
+ "self-hosting/license-key"
]
},
{
@@ -61,7 +63,8 @@
"self-hosting/more/authentication",
"self-hosting/more/tenancy",
"self-hosting/more/transactional-emails",
- "self-hosting/more/declarative-config"
+ "self-hosting/more/declarative-config",
+ "self-hosting/more/search-contexts"
]
},
{
35 changes: 35 additions & 0 deletions docs/docs/more/syntax-reference.mdx
@@ -0,0 +1,35 @@
---
title: Writing search queries
---

Sourcebot uses a regex-based query language that enables precise code search within large codebases.


## Syntax reference guide

Queries consist of space-separated regular expressions. Wrapping expressions in `""` combines them. By default, a file must have at least one match for each expression to be included.

| Example | Explanation |
| :--- | :--- |
| `foo` | Match files with regex `/foo/` |
| `foo bar` | Match files with regex `/foo/` **and** `/bar/` |
| `"foo bar"` | Match files with regex `/foo bar/` |

Multiple expressions can be combined with `or`, negated with `-`, or grouped with `()`.

| Example | Explanation |
| :--- | :--- |
| `foo or bar` | Match files with regex `/foo/` **or** `/bar/` |
| `foo -bar` | Match files with regex `/foo/` but **not** `/bar/` |
| `foo (bar or baz)` | Match files with regex `/foo/` **and** either `/bar/` **or** `/baz/` |

Expressions can be prefixed with certain keywords to modify search behavior. Some keywords can be negated using the `-` prefix.

| Prefix | Description | Example |
| :--- | :--- | :--- |
| `file:` | Filter results to filepaths that match the regex. By default all files are searched. | `file:README` - Filter results to filepaths that match regex `/README/`<br/>`file:"my file"` - Filter results to filepaths that match regex `/my file/`<br/>`-file:test\.ts$` - Ignore results from filepaths that match regex `/test\.ts$/` |
| `repo:` | Filter results to repos that match the regex. By default all repos are searched. | `repo:linux` - Filter results to repos that match regex `/linux/`<br/>`-repo:^web/.*` - Ignore results from repos that match regex `/^web\/.*/` |
| `rev:` | Filter results from a specific branch or tag. By default **only** the default branch is searched. | `rev:beta` - Filter results to branches that match regex `/beta/` |
| `lang:` | Filter results by language (as defined by [linguist](https://github.com/github-linguist/linguist/blob/main/lib/linguist/languages.yml)). By default all languages are searched. | `lang:TypeScript` - Filter results to TypeScript files<br/>`-lang:YAML` - Ignore results from YAML files |
| `sym:` | Match symbol definitions created by [universal ctags](https://ctags.io/) at index time. | `sym:\bmain\b` - Filter results to symbols that match regex `/\bmain\b/` |
| `context:` | Filter results to a predefined [search context](/self-hosting/more/search-contexts). | `context:web` - Filter results to the web context<br/>`-context:pipelines` - Ignore results from the pipelines context |
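
As a rough illustration of how the prefixes in the tables above compose, the sketch below assembles a query string from structured filters. The `QueryFilters` type and the `buildQuery` and `quoteIfNeeded` helpers are hypothetical and not part of Sourcebot; they only concatenate the documented syntax.

```typescript
// Hypothetical helper, not part of Sourcebot: builds a query string
// from the prefixes documented above.
interface QueryFilters {
    terms: string[];          // space-separated regular expressions
    repo?: string;            // value for the `repo:` prefix
    lang?: string;            // value for the `lang:` prefix
    context?: string;         // value for the `context:` prefix
    excludeFiles?: string[];  // each entry becomes a negated `-file:` filter
}

// Wrap a value in double quotes when it contains whitespace, mirroring
// the `"foo bar"` and `file:"my file"` examples.
function quoteIfNeeded(value: string): string {
    return /\s/.test(value) ? `"${value}"` : value;
}

function buildQuery(filters: QueryFilters): string {
    const parts: string[] = [];
    if (filters.context) parts.push(`context:${quoteIfNeeded(filters.context)}`);
    if (filters.repo) parts.push(`repo:${quoteIfNeeded(filters.repo)}`);
    if (filters.lang) parts.push(`lang:${quoteIfNeeded(filters.lang)}`);
    for (const file of filters.excludeFiles ?? []) {
        parts.push(`-file:${quoteIfNeeded(file)}`);
    }
    for (const term of filters.terms) {
        parts.push(quoteIfNeeded(term));
    }
    return parts.join(" ");
}

const query = buildQuery({
    terms: ["logger\\.debug"],
    context: "web",
    lang: "TypeScript",
    excludeFiles: ["test\\.ts$"],
});
console.log(query);
```

Keeping the filters structured until the last step makes it harder to forget quoting for values that contain spaces.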
Binary file added docs/images/search_contexts_example.png
22 changes: 22 additions & 0 deletions docs/self-hosting/license-key.mdx
@@ -0,0 +1,22 @@
---
title: License key
sidebarTitle: License key
---

All core Sourcebot features are available in Sourcebot OSS (MIT Licensed). Some additional features require a license key. See the [pricing page](https://www.sourcebot.dev/pricing) for more details.


## Activating a license key

After purchasing a license key, you can activate it by setting the `SOURCEBOT_EE_LICENSE_KEY` environment variable.

```bash
docker run \
-e SOURCEBOT_EE_LICENSE_KEY=<your-license-key> \
/* additional args */ \
ghcr.io/sourcebot-dev/sourcebot:latest
```

## Questions?

If you have any questions regarding licensing, please [contact us](mailto:team@sourcebot.dev).
4 changes: 4 additions & 0 deletions docs/self-hosting/more/declarative-config.mdx
@@ -3,6 +3,10 @@ title: Configuring Sourcebot from a file (declarative config)
sidebarTitle: Declarative config
---

<Warning>
Declaratively defining `connections` is not available when [multi-tenancy](/self-hosting/more/tenancy) is enabled.
</Warning>

Some teams require Sourcebot to be configured via a file (where it can be stored in version control, run through CI/CD pipelines, etc.) instead of a web UI. For more information on configuring connections, see this [overview](/docs/connections/overview).


153 changes: 153 additions & 0 deletions docs/self-hosting/more/search-contexts.mdx
@@ -0,0 +1,153 @@
---
title: Search contexts
sidebarTitle: Search contexts (EE)
---

<Note>
This is only available in the Enterprise Edition. Please add your [license key](/self-hosting/license-key) to activate it.
</Note>

A **search context** is a user-defined grouping of repositories that helps focus searches on specific areas of your codebase, like frontend, backend, or infrastructure code. Some example queries using search contexts:

- `context:data_engineering userId` - search for `userId` across all repos related to Data Engineering.
- `context:k8s ingress` - search for anything related to ingresses in your k8s configs.
- `( context:project1 or context:project2 ) logger\.debug` - search for debug log calls in project1 and project2.


Search contexts are defined in the `contexts` object inside a [declarative config](/self-hosting/more/declarative-config). Repositories can be included in or excluded from a search context by specifying the repo's URL in the `include` or `exclude` array. Glob patterns are supported.

## Example

Let's assume we have a GitLab instance hosted at `https://gitlab.example.com` with three top-level groups, `web`, `backend`, and `shared`:

```sh
web/
├─ admin_panel/
├─ customer_portal/
├─ pipelines/
├─ ...
backend/
├─ billing_server/
├─ auth_server/
├─ db_migrations/
├─ pipelines/
├─ ...
shared/
├─ protobufs/
├─ react/
├─ pipelines/
├─ ...
```

To make searching easier, we can create three search contexts in our [config.json](/self-hosting/more/declarative-config):
- `web`: For all frontend-related code
- `backend`: For backend services and shared APIs
- `pipelines`: For all CI/CD configurations


```json
{
"$schema": "https://raw.githubusercontent.com/sourcebot-dev/sourcebot/main/schemas/v3/index.json",
"contexts": {
"web": {
// To include repositories in a search context,
// you can reference them...
"include": [
// ... individually by specifying the repo URL.
"gitlab.example.com/web/admin_panel/core",


// ... or as groups using glob patterns. This is
// particularly useful for including entire "sub-folders"
// of repositories in one go.
"gitlab.example.com/web/customer_portal/**",
"gitlab.example.com/shared/react/**",
"gitlab.example.com/shared/protobufs/**"
],

// Same with excluding repositories.
"exclude": [
"gitlab.example.com/web/customer_portal/pipelines",
"gitlab.example.com/shared/react/hooks/**"
],

// Optional description of the search context
// that surfaces in the UI.
"description": "Web related repos."
},
"backend": { /* ... specifies backend related repos ... */ },
"pipelines": { /* ... specifies pipeline related repos ... */ }
},
"connections": {
/* ...connection definitions... */
}
}
```
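
One way to reason about how `include` and `exclude` interact in the config above is sketched below: a repo belongs to a context when it matches at least one `include` pattern and no `exclude` pattern. This is an assumption about the semantics, not Sourcebot's actual implementation. `globToRegExp` and `resolveContext` are hypothetical helpers, the glob handling is deliberately simplified (only `*` and `**`), and the `customer_portal/ui` repo name is made up for the example.

```typescript
// Simplified glob support (assumption): `**` matches anything including `/`,
// `*` matches anything except `/`. Everything else is treated literally.
function globToRegExp(glob: string): RegExp {
    const pattern = glob
        .split("**")
        .map((segment) =>
            segment
                .split("*")
                .map((literal) => literal.replace(/[.+?^${}()|[\]\\]/g, "\\$&"))
                .join("[^/]*")
        )
        .join(".*");
    return new RegExp(`^${pattern}$`);
}

interface ContextPatterns {
    include: string[];
    exclude?: string[];
}

// A repo is part of the context if it matches any include pattern
// and none of the exclude patterns.
function resolveContext(config: ContextPatterns, repos: string[]): string[] {
    const included = config.include.map(globToRegExp);
    const excluded = (config.exclude ?? []).map(globToRegExp);
    return repos.filter(
        (repo) =>
            included.some((re) => re.test(repo)) &&
            !excluded.some((re) => re.test(repo))
    );
}

const webPatterns: ContextPatterns = {
    include: [
        "gitlab.example.com/web/admin_panel/core",
        "gitlab.example.com/web/customer_portal/**",
    ],
    exclude: ["gitlab.example.com/web/customer_portal/pipelines"],
};

const webRepos = resolveContext(webPatterns, [
    "gitlab.example.com/web/admin_panel/core",
    "gitlab.example.com/web/customer_portal/ui", // hypothetical repo name
    "gitlab.example.com/web/customer_portal/pipelines",
    "gitlab.example.com/backend/auth_server",
]);
console.log(webRepos);
```

Under these assumptions, only `admin_panel/core` and `customer_portal/ui` end up in the context: the pipelines repo is matched by the glob but removed by `exclude`, and the backend repo never matches an `include` pattern.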

<Accordion title="Repository URL details">
Repo URLs are expected to be formatted without the leading http(s):// prefix. For example:
- `github.com/sourcebot-dev/sourcebot` ([link](https://github.com/sourcebot-dev/sourcebot))
- `gitlab.com/gitlab-org/gitlab` ([link](https://gitlab.com/gitlab-org/gitlab))
- `chromium.googlesource.com/chromium` ([link](https://chromium-review.googlesource.com/admin/repos/chromium,general))
</Accordion>
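
If your repo list starts out as full web URLs, the prefix can be stripped before the entries are added to `include` or `exclude`. A minimal sketch follows; `toContextRepoUrl` is a hypothetical helper, not a Sourcebot API, and it only removes the scheme.

```typescript
// Hypothetical helper: strips a leading http:// or https:// so the value
// matches the format expected by `include` / `exclude`.
function toContextRepoUrl(url: string): string {
    return url.replace(/^https?:\/\//i, "");
}

console.log(toContextRepoUrl("https://github.com/sourcebot-dev/sourcebot"));
// prints "github.com/sourcebot-dev/sourcebot"
```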


Once configured, you can use these contexts in the search bar by prefixing your query with `context:` followed by the context name. For example:
- `context:web login form` searches for login form code in frontend repositories
- `context:backend auth` searches for authentication code in backend services
- `context:pipelines deploy` searches for deployment configurations

![Example](/images/search_contexts_example.png)

Like other prefixes, contexts can be negated using `-` or combined using `or`:
- `-context:web` excludes frontend repositories from results
- `( context:web or context:backend )` searches across both frontend and backend code

See [this doc](/docs/more/syntax-reference) for more details on the search query syntax.

## Schema reference

<Accordion title="Reference">
```json
{
"type": "object",
"description": "Search context",
"properties": {
"include": {
"type": "array",
"description": "List of repositories to include in the search context. Expected to be formatted as a URL without any leading http(s):// prefix (e.g., 'github.com/sourcebot-dev/sourcebot'). Glob patterns are supported.",
"items": {
"type": "string"
},
"examples": [
[
"github.com/sourcebot-dev/**",
"gerrit.example.org/sub/path/**"
]
]
},
"exclude": {
"type": "array",
"description": "List of repositories to exclude from the search context. Expected to be formatted as a URL without any leading http(s):// prefix (e.g., 'github.com/sourcebot-dev/sourcebot'). Glob patterns are supported.",
"items": {
"type": "string"
},
"examples": [
[
"github.com/sourcebot-dev/sourcebot",
"gerrit.example.org/sub/path/**"
]
]
},
"description": {
"type": "string",
"description": "Optional description of the search context that surfaces in the UI."
}
},
"required": [
"include"
],
"additionalProperties": false
}
```
</Accordion>
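
For readers who consume this config from TypeScript, the schema above corresponds roughly to the following shape. The `SearchContextShape` interface and the `isSearchContext` guard are hand-written illustrations, not types exported by Sourcebot.

```typescript
// Hand-written mirror of the JSON schema above (illustrative only).
interface SearchContextShape {
    include: string[];     // required
    exclude?: string[];
    description?: string;
}

const ALLOWED_KEYS = ["include", "exclude", "description"];

function isStringArray(value: unknown): value is string[] {
    return Array.isArray(value) && value.every((item) => typeof item === "string");
}

function isSearchContext(value: unknown): value is SearchContextShape {
    if (typeof value !== "object" || value === null || Array.isArray(value)) return false;
    const record = value as Record<string, unknown>;
    // "additionalProperties": false
    if (!Object.keys(record).every((key) => ALLOWED_KEYS.indexOf(key) !== -1)) return false;
    // "required": ["include"]
    if (!isStringArray(record.include)) return false;
    if (record.exclude !== undefined && !isStringArray(record.exclude)) return false;
    if (record.description !== undefined && typeof record.description !== "string") return false;
    return true;
}

console.log(isSearchContext({ include: ["github.com/sourcebot-dev/**"] })); // true
console.log(isSearchContext({ exclude: [] }));                               // false: `include` is missing
console.log(isSearchContext({ include: [], owner: "web-team" }));            // false: unknown property
```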
27 changes: 27 additions & 0 deletions ee/LICENSE
@@ -0,0 +1,27 @@
Sourcebot Enterprise license (the “Enterprise License” or "EE license")
Copyright (c) 2025 Taqla Inc.

With regard to the Sourcebot Enterprise Software:

This software and associated documentation files (the "Software") may only be used for
internal business purposes if you (and any entity that you represent) are in compliance
with an agreement governing the use of the Software, as agreed by you and Sourcebot, and otherwise
have a valid Sourcebot Enterprise license for the correct number of user seats. Subject to the foregoing
sentence, you are free to modify this Software and publish patches to the Software. You agree that Sourcebot
and/or its licensors (as applicable) retain all right, title and interest in and to all such modifications
and/or patches, and all such modifications and/or patches may only be used, copied, modified, displayed,
distributed, or otherwise exploited with a valid Sourcebot Enterprise license for the correct number of user seats.
Notwithstanding the foregoing, you may copy and modify the Software for non-production evaluation or internal
experimentation purposes, without requiring a subscription. You agree that Sourcebot and/or
its licensors (as applicable) retain all right, title and interest in and to all such modifications.
You are not granted any other rights beyond what is expressly stated herein. Subject to the
foregoing, it is forbidden to copy, merge, publish, distribute, sublicense, and/or sell the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR
PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE
FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

For all third party components incorporated into the Sourcebot Software, those components are
licensed under the original license provided by the owner of the applicable component.
4 changes: 2 additions & 2 deletions package.json
@@ -4,8 +4,8 @@
"packages/*"
],
"scripts": {
- "build": "cross-env SKIP_ENV_VALIDATION=1 yarn workspaces run build",
- "test": "yarn workspaces run test",
+ "build": "cross-env SKIP_ENV_VALIDATION=1 yarn workspaces foreach -A run build",
+ "test": "yarn workspaces foreach -A run test",
"dev": "yarn dev:prisma:migrate:dev && npm-run-all --print-label --parallel dev:zoekt dev:backend dev:web",
"with-env": "cross-env PATH=\"$PWD/bin:$PATH\" dotenv -e .env.development -c --",
"dev:zoekt": "yarn with-env zoekt-webserver -index .sourcebot/index -rpc",
@@ -0,0 +1,32 @@
-- CreateTable
CREATE TABLE "SearchContext" (
"id" SERIAL NOT NULL,
"name" TEXT NOT NULL,
"description" TEXT,
"orgId" INTEGER NOT NULL,

CONSTRAINT "SearchContext_pkey" PRIMARY KEY ("id")
);

-- CreateTable
CREATE TABLE "_RepoToSearchContext" (
"A" INTEGER NOT NULL,
"B" INTEGER NOT NULL,

CONSTRAINT "_RepoToSearchContext_AB_pkey" PRIMARY KEY ("A","B")
);

-- CreateIndex
CREATE UNIQUE INDEX "SearchContext_name_orgId_key" ON "SearchContext"("name", "orgId");

-- CreateIndex
CREATE INDEX "_RepoToSearchContext_B_index" ON "_RepoToSearchContext"("B");

-- AddForeignKey
ALTER TABLE "SearchContext" ADD CONSTRAINT "SearchContext_orgId_fkey" FOREIGN KEY ("orgId") REFERENCES "Org"("id") ON DELETE CASCADE ON UPDATE CASCADE;

-- AddForeignKey
ALTER TABLE "_RepoToSearchContext" ADD CONSTRAINT "_RepoToSearchContext_A_fkey" FOREIGN KEY ("A") REFERENCES "Repo"("id") ON DELETE CASCADE ON UPDATE CASCADE;

-- AddForeignKey
ALTER TABLE "_RepoToSearchContext" ADD CONSTRAINT "_RepoToSearchContext_B_fkey" FOREIGN KEY ("B") REFERENCES "SearchContext"("id") ON DELETE CASCADE ON UPDATE CASCADE;
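
To make the constraints in this migration concrete, here is a small in-memory sketch of the same data model: a context name is unique per org (as in `SearchContext_name_orgId_key`), and repos are linked through join rows shaped like `_RepoToSearchContext` (`A` is the repo id, `B` is the context id). The `ContextStore` class is purely illustrative and is not how Sourcebot accesses the database. Unlike the real primary key, it silently ignores a duplicate link instead of raising an error.

```typescript
interface SearchContextRow {
    id: number;
    name: string;
    description: string | null;
    orgId: number;
}

interface RepoLink {
    repoId: number;    // column "A"
    contextId: number; // column "B"
}

class ContextStore {
    private nextId = 1;
    private contexts: SearchContextRow[] = [];
    private links: RepoLink[] = [];

    createContext(name: string, orgId: number, description: string | null = null): SearchContextRow {
        // Mirrors the unique index on ("name", "orgId").
        if (this.contexts.some((row) => row.name === name && row.orgId === orgId)) {
            throw new Error(`duplicate context "${name}" in org ${orgId}`);
        }
        const row: SearchContextRow = { id: this.nextId++, name, description, orgId };
        this.contexts.push(row);
        return row;
    }

    linkRepo(repoId: number, contextId: number): void {
        if (!this.contexts.some((row) => row.id === contextId)) {
            throw new Error(`unknown context ${contextId}`);
        }
        // Each (A, B) pair is stored at most once; duplicates are ignored here.
        const exists = this.links.some((l) => l.repoId === repoId && l.contextId === contextId);
        if (!exists) this.links.push({ repoId, contextId });
    }

    reposInContext(contextId: number): number[] {
        return this.links.filter((l) => l.contextId === contextId).map((l) => l.repoId);
    }

    // Mirrors ON DELETE CASCADE on the join table's "B" foreign key.
    deleteContext(contextId: number): void {
        this.contexts = this.contexts.filter((row) => row.id !== contextId);
        this.links = this.links.filter((l) => l.contextId !== contextId);
    }
}

const store = new ContextStore();
const webCtx = store.createContext("web", 1);
store.linkRepo(10, webCtx.id);
store.linkRepo(11, webCtx.id);
store.createContext("web", 2); // same name in a different org is allowed
console.log(store.reposInContext(webCtx.id));
```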
17 changes: 17 additions & 0 deletions packages/db/prisma/schema.prisma
@@ -61,9 +61,24 @@ model Repo {
org Org @relation(fields: [orgId], references: [id], onDelete: Cascade)
orgId Int

searchContexts SearchContext[]

@@unique([external_id, external_codeHostUrl, orgId])
}

model SearchContext {
id Int @id @default(autoincrement())

name String
description String?
repos Repo[]

org Org @relation(fields: [orgId], references: [id], onDelete: Cascade)
orgId Int

@@unique([name, orgId])
}

model Connection {
id Int @id @default(autoincrement())
name String
@@ -138,6 +153,8 @@ model Org {

/// List of pending invites to this organization
invites Invite[]

searchContexts SearchContext[]
}

enum OrgRole {