YurtsAI developed a pipeline to evaluate the well-known hallucination problem of large language models. Refer to *Illusions Unraveled: The Magic and Madness of Hallucinations in LLMs — Part 1* to learn more about hallucinations and the evaluation pipeline.
Requirements:
- Python 3.10+
- Poetry
- 🤗 Hugging Face token
First, create a virtual environment and activate it:
```bash
python3.10 -m virtualenv .venv
source .venv/bin/activate
```
To install the required dependencies, assuming you have Poetry installed, run:
```bash
make install
```
or, in dev mode:
```bash
make install-dev
```
The install step also logs you into the 🤗 Hub, which prompts for your 🤗 Hub token.
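If you prefer to authenticate with the 🤗 Hub ahead of time, you can also log in manually with the `huggingface_hub` CLI (this assumes `huggingface_hub` is available in your environment, e.g. after installing the dependencies):
```bash
huggingface-cli login
```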
To evaluate the model on the given TechCrunch dataset, run:
```bash
python -m llm_eval \
    --model_name_or_path tiiuae/falcon-7b-instruct \
    --max_length 512 \
    --data_max_size 100 \
    --num_proc 4 \
    --batch_size 8 \
    --compute_reward
```
For more information, run `llm_eval --help` or `python -m llm_eval --help`.
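The evaluation results are stored as a JSONL file; assuming they land under `res/eval/` as in the pandas example below, you can sanity-check the first record with:
```bash
# Pretty-print the first record of the evaluation output
# (path taken from the pandas example below).
head -n 1 res/eval/falcon-7b-instruct_tech-crunch.jsonl | python -m json.tool
```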
Some models expect different input formatting, e.g., the addition of special tokens or a specific prompt template. To handle this, you can use the `--input_format` flag. For example, to preprocess the input for the OpenAssistant/falcon-7b-sft-mix-2000 model, run:
```bash
python -m llm_eval \
    --model_name_or_path OpenAssistant/falcon-7b-sft-mix-2000 \
    --data_max_size 100 \
    --input_format "<|prompter|>{}<|endoftext|><|assistant|>" \
    --batch_size 8 \
    --shuffle \
    --max_length 512 \
    --compute_reward
```
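The value passed to `--input_format` looks like a standard Python format string with a `{}` placeholder for the prompt. As a purely illustrative sketch (the actual preprocessing is handled inside `llm_eval`, and the prompt below is made up), applying such a template boils down to:
```python
>>> # Hypothetical example: wrap a raw prompt in the OpenAssistant chat template.
>>> input_format = "<|prompter|>{}<|endoftext|><|assistant|>"
>>> prompt = "What did the article say about the startup's latest funding round?"
>>> input_format.format(prompt)
"<|prompter|>What did the article say about the startup's latest funding round?<|endoftext|><|assistant|>"
```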
If you'd like further data exploration, you can use pandas or your favorite data analysis library to visualize the data. If you're not familiar with pandas, you can use the following snippet. Make sure to `pip install pandas` first.
```python
>>> import pandas as pd
>>> # Load the data to a pandas dataframe.
>>> df = pd.read_json('res/eval/falcon-7b-instruct_tech-crunch.jsonl', lines=True)
>>> # Filter Type-1 hallucinations.
>>> good = df[df.reward == 1]
>>> neutral = df[df.reward == 0]
>>> bad = df[df.reward < 0]
>>> # Get the number of good, neutral, and bad responses.
>>> n, n_good, n_neutral, n_bad = len(df), len(good), len(neutral), len(bad)
>>> print(f'Good: {n_good} ({n_good / n:.2%})')
>>> print(f'Neutral: {n_neutral} ({n_neutral / n:.2%})')
>>> print(f'Bad: {n_bad} ({n_bad / n:.2%})')
```
You're welcome to submit a pull request with your visualizations!
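As a starting point for such a visualization, the counts from the snippet above can be plotted directly with pandas. This is only a sketch: the `.plot` helpers require matplotlib (`pip install matplotlib`), and the output filename is arbitrary.
```python
>>> # Continue from the snippet above: n_good, n_neutral, and n_bad are already defined.
>>> counts = pd.Series({'good': n_good, 'neutral': n_neutral, 'bad': n_bad})
>>> # Plot the reward distribution as a bar chart and save it to disk.
>>> ax = counts.plot.bar(rot=0)
>>> ax.set_xlabel('response category')
>>> ax.set_ylabel('number of responses')
>>> ax.get_figure().savefig('falcon-7b-instruct_reward_distribution.png')
```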
You are very welcome to modify and use this code in your own projects.
Please keep a link to the original repository. If you have made a fork with substantial modifications that you feel may be useful, then please open a new issue on GitHub with a link and short description.
This project is released under the MIT License, which allows very broad use for both private and commercial purposes.
A few of the images used for demonstration purposes may be under copyright. These images are included under "fair use".
