Commit c93180e

Update README to reflect latest schema. Also add some example configs in 'configs/' directory
1 parent 2905187 commit c93180e

10 files changed

+233
-225
lines changed

.vscode/settings.json

Lines changed: 2 additions & 1 deletion
@@ -1,5 +1,6 @@
 {
     "files.associations": {
-        "*config.json": "jsonc"
+        "*.json": "jsonc",
+        "index.json": "json"
     }
 }

README.md

Lines changed: 83 additions & 134 deletions
@@ -70,23 +70,25 @@ Sourcebot supports indexing and searching through public and private repositorie
 cd sourcebot_workspace
 ```
 
-2. Create a new config following the [configuration schema](./schemas/index.json) to specify which repositories Sourcebot should index. For example, to index [llama.cpp](https://github.com/ggerganov/llama.cpp):
+2. Create a new config following the [configuration schema](./schemas/v2/index.json) to specify which repositories Sourcebot should index. For example, let's index llama.cpp:
 
 ```sh
 touch my_config.json
 echo '{
-    "$schema": "https://raw.githubusercontent.com/sourcebot-dev/sourcebot/main/schemas/index.json",
-    "Configs": [
+    "$schema": "https://raw.githubusercontent.com/sourcebot-dev/sourcebot/refs/tags/latest/schemas/v2/index.json",
+    "repos": [
         {
-            "Type": "github",
-            "GitHubUser": "ggerganov",
-            "Name": "^llama\\\\.cpp$"
+            "type": "github",
+            "repos": [
+                "ggerganov/llama.cpp"
+            ]
         }
     ]
 }' > my_config.json
 ```
 
-(For more examples, see [example-config.json](./example-config.json). For additional usage information, see the [configuration schema](./schemas/index.json)).
+>[!NOTE]
+> Sourcebot can also index all repos owned by a organization, user, group, etc., instead of listing them individually. For examples, see the [configs](./configs) directory. For additional usage information, see the [configuration schema](./schemas/v2/index.json).
 
 3. Run Sourcebot and point it to the new config you created with the `-e CONFIG_PATH` flag:
 
@@ -106,31 +108,8 @@ Sourcebot supports indexing and searching through public and private repositorie
 </details>
 <br>
 
-You should see a `.sourcebot` folder in your current directory. This folder stores a cache of the repositories zoekt has indexed. The `HEAD` commit of a repository is re-indexed [every hour](https://github.com/sourcebot-dev/zoekt/blob/11b7713f1fb511073c502c41cea413d616f7761f/cmd/zoekt-indexserver/main.go#L86). Indexing private repos? See [Providing an access token](#providing-an-access-token).
-
->[!WARNING]
-> Depending on the size of your repo(s), SourceBot could take a couple of minutes to finish indexing. SourceBot doesn't currently support displaying indexing progress in real-time, so please be patient while it finishes. You can track the progress manually by investigating the `.sourcebot` cache in your workspace.
-
-<details>
-<summary><img src="https://gitlab.com/favicon.ico" width="16" height="16" /> Using GitLab?</summary>
-
-_tl;dr: A `GITLAB_TOKEN` is required to index GitLab repositories (both private & public). See [Providing an access token](#providing-an-access-token)._
-
-Currently, the GitLab indexer is restricted to only indexing repositories that the associated `GITLAB_TOKEN` has access to. For example, if the token has access to `foo`, `bar`, and `baz` repositories, the following config will index all three:
-
-```sh
-{
-    "$schema": "https://raw.githubusercontent.com/sourcebot-dev/sourcebot/main/schemas/index.json",
-    "Configs": [
-        {
-            "Type": "gitlab"
-        }
-    ]
-}
-```
-
-See [Providing an access token](#providing-an-access-token).
-</details>
+You should see a `.sourcebot` folder in your current directory. This folder stores a cache of the repositories zoekt has indexed. The `HEAD` commit of a repository is re-indexed [every hour](./packages/backend/src/constants.ts). Indexing private repos? See [Providing an access token](#providing-an-access-token).
+
 </br>
 
 ## Providing an access token
@@ -145,31 +124,92 @@ This will depend on the code hosting platform you're using:
 </picture> GitHub
 </summary>
 
-In order to index private repositories, you'll need to generate a GitHub Personal Access Token (PAT) and pass it to Sourcebot. Create a new PAT [here](https://github.com/settings/tokens/new) and make sure you select the `repo` scope:
+In order to index private repositories, you'll need to generate a GitHub Personal Access Token (PAT). Create a new PAT [here](https://github.com/settings/tokens/new) and make sure you select the `repo` scope:
 
 ![GitHub PAT creation](.github/images/github-pat-creation.png)
 
-You'll need to pass this PAT each time you run Sourcebot by setting the `GITHUB_TOKEN` environment variable:
+Next, update your configuration with the `token` field:
+```json
+{
+    "$schema": "https://raw.githubusercontent.com/sourcebot-dev/sourcebot/refs/tags/latest/schemas/v2/index.json",
+    "repos": [
+        {
+            "type": "github",
+            "token": "ghp_mytoken",
+            ...
+        }
+    ]
+}
+```
+
+You can also pass tokens as environment variables:
+```json
+{
+    "$schema": "https://raw.githubusercontent.com/sourcebot-dev/sourcebot/refs/tags/latest/schemas/v2/index.json",
+    "repos": [
+        {
+            "type": "github",
+            "token": {
+                // note: this env var can be named anything. It
+                // doesn't need to be `GITHUB_TOKEN`.
+                "env": "GITHUB_TOKEN"
+            },
+            ...
+        }
+    ]
+}
+```
+
+You'll need to pass this environment variable each time you run Sourcebot:
 
 <pre>
-docker run -p 3000:3000 --rm --name sourcebot -e <b>GITHUB_TOKEN=[your-github-token]</b> -e CONFIG_PATH=/data/my_config.json -v $(pwd):/data ghcr.io/sourcebot-dev/sourcebot:latest
+docker run -e <b>GITHUB_TOKEN=ghp_mytoken</b> /* additional args */ ghcr.io/sourcebot-dev/sourcebot:latest
 </pre>
 </details>
 
 <details>
 <summary><img src="https://gitlab.com/favicon.ico" width="16" height="16" /> GitLab</summary>
 
->[!NOTE]
-> An access token is <b>required</b> to index GitLab repositories (both private & public) since the GitLab indexer needs the token to determine which repositories to index. See [example-config.json](./example-config.json) for example usage.
-
 Generate a GitLab Personal Access Token (PAT) [here](https://gitlab.com/-/user_settings/personal_access_tokens) and make sure you select the `read_api` scope:
 
 ![GitLab PAT creation](.github/images/gitlab-pat-creation.png)
 
-You'll need to pass this PAT each time you run Sourcebot by setting the `GITLAB_TOKEN` environment variable:
+Next, update your configuration with the `token` field:
+```json
+{
+    "$schema": "https://raw.githubusercontent.com/sourcebot-dev/sourcebot/refs/tags/latest/schemas/v2/index.json",
+    "repos": [
+        {
+            "type": "gitlab",
+            "token": "glpat-mytoken",
+            ...
+        }
+    ]
+}
+```
+
+You can also pass tokens as environment variables:
+```json
+{
+    "$schema": "https://raw.githubusercontent.com/sourcebot-dev/sourcebot/refs/tags/latest/schemas/v2/index.json",
+    "repos": [
+        {
+            "type": "gitlab",
+            "token": {
+                // note: this env var can be named anything. It
+                // doesn't need to be `GITLAB_TOKEN`.
+                "env": "GITLAB_TOKEN"
+            },
+            ...
+        }
+    ]
+}
+```
+
+You'll need to pass this environment variable each time you run Sourcebot:
 
 <pre>
-docker run -p 3000:3000 --rm --name sourcebot -e <b>GITLAB_TOKEN=[your-gitlab-token]</b> -e CONFIG_PATH=/data/my_config.json -v $(pwd):/data ghcr.io/sourcebot-dev/sourcebot:latest
+docker run -e <b>GITLAB_TOKEN=glpat-mytoken</b> /* additional args */ ghcr.io/sourcebot-dev/sourcebot:latest
 </pre>
 
 </details>
@@ -178,63 +218,7 @@ docker run -p 3000:3000 --rm --name sourcebot -e <b>GITLAB_TOKEN=[your-gitlab-to
 
 ## Using a self-hosted GitLab / GitHub instance
 
-If you're using a self-hosted GitLab or GitHub instance with a custom domain, there is some additional config required:
-
-<div>
-<details>
-<summary>
-<picture>
-<source media="(prefers-color-scheme: dark)" srcset=".github/images/github-favicon-inverted.png">
-<img src="https://github.com/favicon.ico" width="16" height="16" alt="GitHub icon">
-</picture> GitHub
-</summary>
-
-1. In your config, add the `GitHubURL` field to point to your deployment's URL. For example:
-```json
-{
-    "$schema": "https://raw.githubusercontent.com/sourcebot-dev/sourcebot/main/schemas/index.json",
-    "Configs": [
-        {
-            "Type": "github",
-            "GitHubUrl": "https://github.example.com"
-        }
-    ]
-}
-
-2. Set the `GITHUB_HOSTNAME` environment variable to your deployment's hostname. For example:
-<pre>
-docker run -e <b>GITHUB_HOSTNAME=github.example.com</b> /* additional args */ ghcr.io/sourcebot-dev/sourcebot:latest
-</pre>
-
-
-</details>
-
-<details>
-<summary><img src="https://gitlab.com/favicon.ico" width="16" height="16" /> GitLab</summary>
-
-
-1. In your config, add the `GitLabURL` field to point to your deployment's URL. For example:
-```json
-{
-    "$schema": "https://raw.githubusercontent.com/sourcebot-dev/sourcebot/main/schemas/index.json",
-    "Configs": [
-        {
-            "Type": "gitlab",
-            "GitLabURL": "https://gitlab.example.com"
-        }
-    ]
-}
-```
-
-2. Set the `GITLAB_HOSTNAME` environment variable to your deployment's hostname. For example:
-
-<pre>
-docker run -e <b>GITLAB_HOSTNAME=gitlab.example.com</b> /* additional args */ ghcr.io/sourcebot-dev/sourcebot:latest
-</pre>
-
-</details>
-
-</div>
+If you're using a self-hosted GitLab or GitHub instance with a custom domain, you can specify the domain in your config file. See [configs/self-hosted.json](configs/self-hosted.json) for examples.
 
 ## Build from source
 >[!NOTE]
@@ -266,49 +250,14 @@ If you're using a self-hosted GitLab or GitHub instance with a custom domain, th
 
 5. Create a `config.json` file at the repository root. See [Configuring Sourcebot](#configuring-sourcebot) for more information.
 
-6. (Optional) Depending on your `config.json`, you may need to pass an access token to Sourcebot:
-
-<div>
-<details>
-<summary>
-<picture>
-<source media="(prefers-color-scheme: dark)" srcset=".github/images/github-favicon-inverted.png">
-<img src="https://github.com/favicon.ico" width="16" height="16" alt="GitHub icon">
-</picture>
-GitHub
-</summary>
-
-First, generate a personal access token (PAT). See [Providing an access token](#providing-an-access-token).
-
-Next, Create a text file named `.github-token` **in your home directory** and paste the token in it. The file should look like:
-```sh
-ghp_...
-```
-zoekt will [read this file](https://github.com/sourcebot-dev/zoekt/blob/6a5753692b46e669f851ab23211e756a3677185d/cmd/zoekt-mirror-github/main.go#L60) to authenticate with GitHub.
-</details>
-
-<details>
-<summary>
-<img src="https://gitlab.com/favicon.ico" width="16" height="16" /> GitLab
-</summary>
-First, generate a personal access token (PAT). See [Providing an access token](#providing-an-access-token).
-
-Next, Create a text file named `.gitlab-token` **in your home directory** and paste the token in it. The file should look like:
-```sh
-glpat-...
-```
-zoekt will [read this file](https://github.com/sourcebot-dev/zoekt/blob/11b7713f1fb511073c502c41cea413d616f7761f/cmd/zoekt-mirror-gitlab/main.go#L43) to authenticate with GitLab.
-</details>
-</div>
-
-7. Start Sourcebot with the command:
+6. Start Sourcebot with the command:
 ```sh
 yarn dev
 ```
 
 A `.sourcebot` directory will be created and zoekt will begin to index the repositories found given `config.json`.
 
-8. Start searching at `http://localhost:3000`.
+7. Start searching at `http://localhost:3000`.
 
 ## Telemetry
 
configs/auth.json

Lines changed: 39 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,39 @@
1+
{
2+
"$schema": "../schemas/v2/index.json",
3+
"repos": [
4+
// Authenticate using a token directly in the config.
5+
// Private and public repositories will be included.
6+
{
7+
"type": "github",
8+
"token": "ghp_token1234",
9+
"orgs": [
10+
"my-org"
11+
]
12+
},
13+
{
14+
"type": "gitlab",
15+
"token": "glpat-1234",
16+
"groups": [
17+
"my-group"
18+
]
19+
},
20+
21+
// You can also store the token in a environment variable and then
22+
// references it from the config.
23+
{
24+
"type": "github",
25+
"token": {
26+
"env": "GITHUB_TOKEN_ENV_VAR"
27+
}
28+
},
29+
{
30+
"type": "gitlab",
31+
"token": {
32+
"env": "GITLAB_TOKEN_ENV_VAR"
33+
},
34+
"groups": [
35+
"my-group"
36+
]
37+
}
38+
]
39+
}
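
Note on the two `token` forms in configs/auth.json: the field is either a literal string or an object of the form `{"env": NAME}`. A minimal Python sketch of how a consumer might resolve either form follows. The function name `resolve_token` and the error handling are hypothetical illustrations, not Sourcebot's actual implementation.

```python
import os
from typing import Mapping, Optional, Union

# A token is either a literal string, an {"env": NAME} reference, or absent.
TokenField = Union[str, Mapping[str, str], None]


def resolve_token(
    token: TokenField, environ: Mapping[str, str] = os.environ
) -> Optional[str]:
    """Hypothetical helper: return the token value for either config form."""
    if token is None:
        return None  # no token configured for this entry
    if isinstance(token, str):
        return token  # literal form, e.g. "ghp_token1234"
    name = token["env"]  # reference form, e.g. {"env": "GITHUB_TOKEN_ENV_VAR"}
    value = environ.get(name)
    if value is None:
        raise KeyError(f"environment variable {name!r} is not set")
    return value
```

Passing the environment as a parameter keeps the helper easy to exercise without touching the real process environment.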

configs/basic.json

Lines changed: 42 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,42 @@
1+
{
2+
"$schema": "../schemas/v2/index.json",
3+
// Note: to include private repositories, you must provide an authentication token.
4+
// See: configs/auth.json for a example.
5+
"repos": [
6+
// From GitHub, include:
7+
// - all public repos owned by user `torvalds`
8+
// - all public repos owned by organization `commai`
9+
// - repo `sourcebot-dev/sourcebot`
10+
{
11+
"type": "github",
12+
"token": "my-token",
13+
"users": [
14+
"torvalds"
15+
],
16+
"orgs": [
17+
"commaai"
18+
],
19+
"repos": [
20+
"sourcebot-dev/sourcebot"
21+
]
22+
},
23+
// From GitLab, include:
24+
// - all public projects owned by user `brendan67`
25+
// - all public projects in group `my-group` and sub-group `sub-group`
26+
// - project `my-group/project1`
27+
{
28+
"type": "gitlab",
29+
"token": "my-token",
30+
"users": [
31+
"brendan67"
32+
],
33+
"groups": [
34+
"my-group",
35+
"my-other-group/sub-group"
36+
],
37+
"projects": [
38+
"my-group/project1"
39+
]
40+
}
41+
]
42+
}
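
Note on the `//` comments in the example configs: a strict JSON parser rejects them, which is presumably why the editor settings in this commit treat `*.json` as `jsonc`. The Python sketch below shows one way to load such a file with only the standard library. `strip_line_comments` is a hypothetical helper, not part of Sourcebot; it handles `//` line comments only (no `/* */` blocks or trailing commas) and leaves `//` inside strings, such as URLs, untouched.

```python
import json


def strip_line_comments(text: str) -> str:
    """Remove // comments that occur outside of JSON string literals."""
    out = []
    in_string = False
    i = 0
    while i < len(text):
        ch = text[i]
        if in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < len(text):
                # Copy the escaped character so an escaped quote
                # does not end the string.
                out.append(text[i + 1])
                i += 2
                continue
            if ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
            out.append(ch)
        elif text.startswith("//", i):
            # Skip to the end of the line; the newline itself is kept.
            while i < len(text) and text[i] != "\n":
                i += 1
            continue
        else:
            out.append(ch)
        i += 1
    return "".join(out)


SAMPLE = """{
    "$schema": "https://example.com/schemas/v2/index.json",
    // From GitHub, include one repo.
    "repos": [
        { "type": "github", "repos": ["sourcebot-dev/sourcebot"] }
    ]
}"""

config = json.loads(strip_line_comments(SAMPLE))
```

The `$schema` URL in `SAMPLE` is a placeholder chosen to show that `//` inside a string survives.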
