|
1 |
| -# CIL: Code Indexer Loop |
| 1 | +# Code Indexer Loop |
2 | 2 |
|
3 |
| -[](https://pypi.org/project/cil/) |
4 |
| -[](LICENSE) |
5 |
| -[](https://github.com/definitive-io/cil/network) |
6 |
| -[](https://github.com/definitive-io/cil/stargazers) |
| 3 | +[](https://pypi.org/project/code-indexer-loop/) |
| 4 | +[](LICENSE) |
| 5 | +[](https://github.com/definitive-io/code-indexer-loop/network) |
| 6 | +[](https://github.com/definitive-io/code-indexer-loop/stargazers) |
7 | 7 | [](https://twitter.com/definitiveio)
|
8 | 8 | [](https://discord.gg/CPJJfq87Vx)
|
9 | 9 |
|
10 | 10 |
|
11 |
| -**CIL** is a Python library designed to index and retrieve code snippets. |
| 11 | +**Code Indexer Loop** is a Python library designed to index and retrieve code snippets. |
12 | 12 |
|
13 | 13 | It uses the useful indexing utilities of the **LlamaIndex** library and the multi-language **tree-sitter** library to parse the code from many popular programming languages. **tiktoken** is used to count tokens in a code snippet and **LangChain** to obtain embeddings (defaults to **OpenAI**'s `text-embedding-ada-002`) and store them in an embedded **ChromaDB** vector database. It uses **watchdog** for continuous updating of the index based on file system events.
|
14 | 14 |
|
15 | 15 | ## Installation:
|
16 | 16 | Use `pip` to install Code Indexer Loop from PyPI.
|
17 | 17 | ```
|
18 |
| -pip install cil |
| 18 | +pip install code_indexer_loop |
19 | 19 | ```
|
20 | 20 |
|
21 | 21 | ## Usage:
|
22 | 22 | 1. Import necessary modules:
|
23 | 23 | ```python
|
24 |
| -from cil.api import CodeIndexer |
| 24 | +from code_indexer_loop.api import CodeIndexer |
25 | 25 | ```
|
26 | 26 | 2. Create a CodeIndexer object and have it watch for changes:
|
27 | 27 | ```python
|
@@ -76,4 +76,4 @@ Run the unit tests by invoking `pytest` in the root.
|
76 | 76 | Please see the LICENSE file provided with the source code.
|
77 | 77 |
|
78 | 78 | ## Attribution
|
79 |
| -We'd like to thank the Sweep AI for publishing their ideas about code chunking. Read their blog posts about the topic [here](https://docs.sweep.dev/blogs/chunking-2m-files) and [here](https://docs.sweep.dev/blogs/chunking-improvements). The implementation in `cil` is modified from their original implementation mainly to limit based on tokens instead of characters and to achieve perfect document reconstruction (`"".join(chunks) == original_source_code`). |
| 79 | +We'd like to thank the Sweep AI for publishing their ideas about code chunking. Read their blog posts about the topic [here](https://docs.sweep.dev/blogs/chunking-2m-files) and [here](https://docs.sweep.dev/blogs/chunking-improvements). The implementation in `code_indexer_loop` is modified from their original implementation mainly to limit based on tokens instead of characters and to achieve perfect document reconstruction (`"".join(chunks) == original_source_code`). |
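
The two properties named in the Attribution paragraph (a limit based on tokens rather than characters, and lossless reconstruction) can be illustrated with a minimal sketch. This is not the `code_indexer_loop` implementation: the README says the library parses code with tree-sitter and counts tokens with tiktoken, while this sketch splits on line boundaries and uses a hypothetical regex-based `count_tokens` stand-in so that it runs without third-party packages. The names `count_tokens` and `chunk_by_tokens` are assumptions made for illustration.

```python
import re


def count_tokens(text: str) -> int:
    # Hypothetical stand-in for a real tokenizer (the README names tiktoken):
    # counts runs of word characters and individual punctuation marks.
    return len(re.findall(r"\w+|[^\w\s]", text))


def chunk_by_tokens(source: str, max_tokens: int) -> list[str]:
    """Greedily group whole lines into chunks of at most max_tokens tokens.

    A single line that exceeds max_tokens on its own is kept as one
    oversized chunk rather than being split mid-line.
    """
    chunks: list[str] = []
    current: list[str] = []
    used = 0
    # keepends=True preserves every line terminator, so no character is lost.
    for line in source.splitlines(keepends=True):
        cost = count_tokens(line)
        if current and used + cost > max_tokens:
            chunks.append("".join(current))
            current, used = [], 0
        current.append(line)
        used += cost
    if current:
        chunks.append("".join(current))
    return chunks


source = "def add(a, b):\n    return a + b\n\nprint(add(1, 2))\n"
chunks = chunk_by_tokens(source, max_tokens=8)

# The reconstruction property quoted in the Attribution section.
assert "".join(chunks) == source
```

Because the sketch only ever concatenates unmodified lines, the reconstruction check holds for any input; a syntax-aware chunker has to take the same care with whitespace and comments that fall between parsed nodes.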