
[experimental] Add command line tool for llama_index #1068

Merged (1 commit) Apr 8, 2023

Conversation

@iaalm (Contributor) commented Apr 5, 2023

Based on the discussion in #1054, this adds a command line tool with the following features:

# create a local config file in local dir
python -m experimental.cli init

# add file to index
python -m experimental.cli add ../data/

# query
python -m experimental.cli query "Some question?"

The idea is quite straightforward: two files are placed in the current directory:

  • config.ini stores the embedding/predictor model setup along with its parameters
  • index.json is the index file itself
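For illustration, a config.ini along these lines could hold the model setup. The section and key names below are my assumptions for the sake of example, not the exact schema used by this PR:

```ini
; hypothetical example -- the actual sections/keys in the PR may differ
[llm]
type = openai
model = text-davinci-003

[embed_model]
type = openai

[index]
type = vector
```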

There is still a lot of work to be done, such as supporting more embedding models (maybe even a local LLM?) and removing/updating entries in the current index. However, I'd like to merge this foundational part first so that I can tackle the project step by step.

Open to suggestions/changes.

@jerryjliu (Collaborator)

amazing! will take a look soon

@jerryjliu left a comment

high-level looks good! main comment is about documenting usage

Review comment on experimental/cli/__main__.py (resolved)
@jerryjliu jerryjliu merged commit 3f04971 into run-llama:main Apr 8, 2023
@jerryjliu (Collaborator)

i like it - i can envision this being an easy-to-use CLI tool so that you can ask questions over any directory on your file system

@hophanms commented Apr 9, 2023

Got this error:

Error while finding module specification for 'experimental.cli' (ModuleNotFoundError: No module named 'experimental')

@iaalm (Contributor, Author) commented Apr 9, 2023

Hi @hophanms, "experimental" is not included in the package yet (I think that's why it's called "experimental"). For now, you need to git clone this repo and run these commands from inside it.
Or you can use PYTHONPATH=/path/to/your/llama_index python -m experimental.cli init to run it from any directory.

In the long term, once this part of the code is stable enough, we can move it into src. At that point users will be able to call it directly with something like python -m llama_index.cli init

Hope it helps.

@hophanms commented Apr 9, 2023

I was in the wrong folder when I ran that CLI.

I haven't dug further, but I tried this:
python -m experimental.cli query "Some question?"

And got this:

None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
INFO:llama_index.token_counter.token_counter:> [query] Total LLM token usage: 0 tokens
INFO:llama_index.token_counter.token_counter:> [query] Total embedding token usage: 6 tokens
None

There is no obvious way to debug further.

@iaalm (Contributor, Author) commented Apr 9, 2023

Hi @hophanms, could you provide some more information about your run? There are many reasons that could cause this output.

Let me describe the workflow more clearly (and maybe update the docs too, later).

python -m experimental.cli init
This creates a config file, "config.ini". If the default config is not enough for you, you need to update it manually (for now).
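As a rough sketch of what an init step like this might do, written against nothing but the Python standard library (the config sections and keys here are my assumptions for illustration, not the PR's actual schema):

```python
import configparser
import os


def init(path: str = "config.ini") -> None:
    """Write a default config file, unless one already exists."""
    if os.path.exists(path):
        return  # don't clobber a hand-edited config
    config = configparser.ConfigParser()
    # Illustrative defaults; the real CLI's sections/keys may differ.
    config["llm"] = {"type": "openai", "model": "text-davinci-003"}
    config["embed_model"] = {"type": "openai"}
    with open(path, "w") as f:
        config.write(f)
```

The init subcommand would then be little more than a thin wrapper around a function like this.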

python -m experimental.cli add ../data/
This uses the OpenAI/Azure OpenAI API to analyze the files under "../data/" and stores the result in a file named "index.json" in the current directory.

python -m experimental.cli query "Some question?"
This checks the local "index.json" and sends some more queries to OpenAI/Azure OpenAI to get the answer to your question.

Because we still call the OpenAI API for indexing and question answering, environment variables like "OPENAI_API_KEY" still need to be set.

Your output looks OK: there's no error message, but no result either. I can reproduce the same output without "add"-ing any file; I don't know if that's your case.

@hophanms commented Apr 9, 2023

I found out it was silently failing while generating the .json, which is why the query returned None.
Once I fixed the add step, it works.
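A silent failure like this is easier to catch with an explicit guard when loading the index. A minimal sketch, with a hypothetical function name and error messages that are not the CLI's actual code:

```python
import json
import os
import sys


def load_index(path: str = "index.json") -> dict:
    """Load the index file, failing loudly instead of returning nothing."""
    if not os.path.exists(path):
        sys.exit(f"error: {path} not found -- run 'add' first")
    with open(path) as f:
        data = json.load(f)
    if not data:
        sys.exit(f"error: {path} is empty -- the 'add' step may have failed")
    return data
```

With a guard like this, a query against a missing or empty index exits with an error message rather than printing None.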
