
Commit

chore(docs): add shell syntax highlighting and fix typos (promptfoo#953)
* fix highlighting

* fix typos

* revert

* add sh to non-eval commands

* fix

* reintroduce to all non-eval commands

* fix
mldangelo authored and typpo committed Jun 18, 2024
1 parent a8e70a7 commit 806e522
Showing 54 changed files with 119 additions and 119 deletions.
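For context, most of the 119 changed lines swap the language tag on fenced code blocks in the docs (plus a handful of typo fixes). A minimal sketch of the pattern, taken from the `examples/google-vertex/README.md` hunk shown in full below:

````diff
-```
+```sh
 npm i google-auth-library
 ```
````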
6 changes: 3 additions & 3 deletions README.md
@@ -66,7 +66,7 @@ This will create some placeholders in your current directory: `prompts.txt` and

After editing the prompts and variables to your liking, run the eval command to kick off an evaluation:

-```sh
+```
npx promptfoo@latest eval
```

@@ -190,7 +190,7 @@ npx promptfoo view

In [this example](https://github.com/typpo/promptfoo/tree/main/examples/assistant-cli), we evaluate whether adding adjectives to the personality of an assistant bot affects the responses:

-```sh
+```
npx promptfoo eval -p prompts.txt -r openai:gpt-3.5-turbo -t tests.csv
```

@@ -210,7 +210,7 @@ You can also output a nice [spreadsheet](https://docs.google.com/spreadsheets/d/

In the [next example](https://github.com/typpo/promptfoo/tree/main/examples/gpt-3.5-vs-4), we evaluate the difference between GPT 3 and GPT 4 outputs for a given prompt:

-```sh
+```
npx promptfoo eval -p prompts.txt -r openai:gpt-3.5-turbo openai:gpt-4 -o output.html
```

4 changes: 2 additions & 2 deletions examples/google-vertex/README.md
@@ -1,12 +1,12 @@
To call Vertex AI models in Node, you'll need to install Google's official auth client as a peer dependency:

-```
+```sh
npm i google-auth-library
```

Make sure the Vertex AI API is enabled for the relevant project in Google Cloud. Then, ensure that you've selected that project in the gcloud cli:

-```
+```sh
gcloud config set project PROJECT_ID
```

2 changes: 1 addition & 1 deletion examples/langchain-python/README.md
@@ -4,7 +4,7 @@ To run it, first create a virtual env and install the requirements:

Then activate the virtual env.

-```
+```sh
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
2 changes: 1 addition & 1 deletion examples/phi-vs-llama/README.md
@@ -1,6 +1,6 @@
To get started:

-```
+```sh
ollama pull llama3
ollama pull phi3
```
2 changes: 1 addition & 1 deletion examples/tool-use/README.md
@@ -10,7 +10,7 @@ Note that the function and tool syntax differ slightly between the two providers

The configuration for this example is specified in `promptfooconfig.yaml`. To run the example, execute the following command in your terminal:

-```sh
+```
promptfoo eval
```

14 changes: 7 additions & 7 deletions site/README.md
@@ -4,22 +4,22 @@ This website is built using [Docusaurus 2](https://docusaurus.io/), a modern sta

### Installation

-```
-$ yarn
+```sh
+npm install
```

### Local Development

-```
-$ yarn start
+```sh
+npm start
```

This command starts a local development server and opens up a browser window. Most changes are reflected live without having to restart the server.

### Build

-```
-$ yarn build
+```sh
+npm run build
```

-This command generates static content into the `build` directory and can be served using any static contents hosting service.
+This command generates static content into the `build` directory and can be served using any static content hosting service.
8 changes: 4 additions & 4 deletions site/docs/configuration/datasets.md
@@ -38,21 +38,21 @@ Dataset generation uses your prompts and any existing test cases to generate new

Run the command in the same directory as your config:

-```
+```sh
promptfoo generate dataset
```

This will output the `tests` YAML to your terminal.

If you want to write the new dataset to a file:

-```
+```sh
promptfoo generate dataset -o tests.yaml
```

Or if you want to edit the existing config in-place:

-```
+```sh
promptfoo generate dataset -w
```

@@ -71,6 +71,6 @@ You can customize the dataset generation process by providing additional options

For example:

-```bash
+```sh
promptfoo generate dataset --config path_to_config.yaml --output path_to_output.yaml --instructions "Consider edge cases related to international travel"
```
2 changes: 1 addition & 1 deletion site/docs/configuration/expected-outputs/index.md
@@ -426,7 +426,7 @@ And create a list of assertions (`asserts.yaml`):

Then run the eval command:

-```sh
+```
promptfoo eval --assertions asserts.yaml --model-outputs outputs.json
```

8 changes: 4 additions & 4 deletions site/docs/configuration/guide.md
@@ -548,25 +548,25 @@ For detailed information on the config structure, see [Configuration Reference](

If you have multiple sets of tests, it helps to split them into multiple config files. Use the `--config` or `-c` parameter to run each individual config:

-```sh
+```
promptfoo eval -c usecase1.yaml
```

and

-```sh
+```
promptfoo eval -c usecase2.yaml
```
You can run multiple configs at the same time, which will combine them into a single eval. For example:
-```sh
+```
promptfoo eval -c my_configs/*
```
or
-```sh
+```
promptfoo eval -c config1.yaml -c config2.yaml -c config3.yaml
```
4 changes: 2 additions & 2 deletions site/docs/configuration/telemetry.md
@@ -15,7 +15,7 @@ No additional information is collected. The above list is exhaustive.

To disable telemetry, set the following environment variable:

-```bash
+```sh
PROMPTFOO_DISABLE_TELEMETRY=1
```

@@ -25,6 +25,6 @@ The CLI checks NPM's package registry for updates. If there is a newer version a

To disable, set:

-```bash
+```sh
PROMPTFOO_DISABLE_UPDATE=1
```
6 changes: 3 additions & 3 deletions site/docs/getting-started.md
@@ -46,7 +46,7 @@ This will create a `promptfooconfig.yaml` file in your current directory.

- **OpenAI**: if testing with an OpenAI model, you'll need to set the `OPENAI_API_KEY` environment variable (see [OpenAI provider docs](/docs/providers/openai) for more info):

-```bash
+```sh
export OPENAI_API_KEY=sk-abc123
```

@@ -195,7 +195,7 @@ Have a look at the setup and full output [here](https://github.com/promptfoo/pro

You can also output a nice [spreadsheet](https://docs.google.com/spreadsheets/d/1nanoj3_TniWrDl1Sj-qYqIMD6jwm5FBy15xPFdUTsmI/edit?usp=sharing), [JSON](https://github.com/typpo/promptfoo/blob/main/examples/simple-cli/output.json), YAML, or an HTML file:

-```bash
+```
npx promptfoo@latest eval -o output.html
```

@@ -214,7 +214,7 @@ providers: [openai:gpt-3.5-turbo, openai:gpt-4]

A simple `npx promptfoo@latest eval` will run the example. Also note that you can override parameters directly from the command line. For example, this command:

-```bash
+```
npx promptfoo@latest eval -p prompts.txt -r openai:gpt-3.5-turbo openai:gpt-4 -o output.html
```

6 changes: 3 additions & 3 deletions site/docs/guides/azure-vs-openai.md
@@ -28,7 +28,7 @@ Before we get started, you need the following:

Additionally, make sure you have the following environment variables set:

-```bash
+```sh
OPENAI_API_KEY='...'
AZURE_OPENAI_API_KEY='...'
```
@@ -37,7 +37,7 @@ AZURE_OPENAI_API_KEY='...'

Create a new directory for your comparison project and initialize it:

-```bash
+```sh
npx promptfoo@latest init openai-azure-comparison
```

@@ -91,7 +91,7 @@ tests:

Execute the comparison using the `promptfoo eval` command:

-```bash
+```
npx promptfoo@latest eval --no-cache
```

4 changes: 2 additions & 2 deletions site/docs/guides/claude-vs-gpt.md
@@ -123,7 +123,7 @@ The `assert` blocks allow you to automatically check the model outputs for expec

With your configuration complete, you can kick off the evaluation:

-```sh
+```
npx promptfoo@latest eval
```

@@ -141,7 +141,7 @@ This will display a comparison view showing how Claude 3 and GPT-4 performed on

You can also output the raw results data to a file:

-```sh
+```
npx promptfoo@latest eval -o results.json
```

8 changes: 4 additions & 4 deletions site/docs/guides/cohere-command-r-benchmark.md
@@ -23,7 +23,7 @@ The end result is a side-by-side comparison view that looks like this:

Create a new promptfoo project:

-```bash
+```sh
npx promptfoo@latest init cohere-benchmark
cd cohere-benchmark
```
@@ -41,7 +41,7 @@ providers:
Set the API keys:
-```bash
+```sh
export COHERE_API_KEY=your_cohere_key
export OPENAI_API_KEY=your_openai_key
export ANTHROPIC_API_KEY=your_anthropic_key
@@ -109,13 +109,13 @@ tests:
Run the benchmark:
-```bash
+```
npx promptfoo@latest eval
```

And view the results:

-```bash
+```sh
npx promptfoo@latest view
```

4 changes: 2 additions & 2 deletions site/docs/guides/compare-llama2-vs-gpt.md
@@ -22,7 +22,7 @@ This guide assumes that you have promptfoo [installed](/docs/installation). It a

Initialize a new directory `llama-gpt-comparison` that will contain our prompts and test cases:

-```
+```sh
npx promptfoo@latest init llama-gpt-comparison
```

@@ -202,7 +202,7 @@ These settings will apply to all test cases run against these models.

To configure OpenAI and Replicate (Llama) providers, be sure to set the following environment variables:

-```bash
+```sh
OPENAI_API_KEY=sk-abc123
REPLICATE_API_TOKEN=abc123
```
10 changes: 5 additions & 5 deletions site/docs/guides/dbrx-benchmark.md
@@ -24,7 +24,7 @@ The end result will be a custom benchmark that looks similar to this:

Create a new directory for your comparison project and initialize it with `promptfoo init`.

-```bash
+```sh
npx promptfoo@latest init dbrx-benchmark
```

@@ -53,7 +53,7 @@ providers:
Set your API keys as environment variables:
-```bash
+```sh
export OPENROUTER_API_KEY=your_openrouter_api_key
export OPENAI_API_KEY=your_openai_api_key
```
@@ -179,15 +179,15 @@ Many types of assertions are supported, both deterministic and LLM-graded. See [

With everything configured, run the evaluation using the `promptfoo` CLI:

-```bash
+```
npx promptfoo@latest eval
```

This command will execute each test case against each configured model and record the results.

To visualize the results, use the `promptfoo` viewer:

-```bash
+```sh
npx promptfoo@latest view
```

@@ -201,7 +201,7 @@ Clicking into a specific output will show details on the assertions:

You can also output the results to a file in various formats, such as JSON, YAML, or CSV:

-```bash
+```
npx promptfoo@latest eval -o results.csv
```

10 changes: 5 additions & 5 deletions site/docs/guides/evaluate-llm-temperature.md
@@ -24,7 +24,7 @@ By running a temperature eval, you can make data-driven decisions that balance t

Before setting up an evaluation to compare the performance of your LLM at different temperatures, you'll need to initialize a configuration file. Run the following command to create a `promptfooconfig.yaml` file:

-```bash
+```sh
npx promptfoo@latest init
```

@@ -67,7 +67,7 @@ The `tests` section includes our test cases that will be run against both temper

To run the evaluation, use the following command:

-```bash
+```
npx promptfoo@latest eval
```

@@ -118,13 +118,13 @@ It's worth spending a few minutes to set up these automated checks. They help st

After the evaluation is complete, you can use the web viewer to review the outputs and compare the performance at different temperatures:

-```bash
+```sh
npx promptfoo@latest view
```

## Evaluating randomness

-LLMs are inherently nondeterminstic, which means their outputs will vary with each call at nonzero temperatures (and sometimes even at zero temperature). OpenAI introduced the `seed` variable to improve reproducibility of outputs, and other providers will probably follow suit.
+LLMs are inherently nondeterministic, which means their outputs will vary with each call at nonzero temperatures (and sometimes even at zero temperature). OpenAI introduced the `seed` variable to improve reproducibility of outputs, and other providers will probably follow suit.

Set a constant seed in the provider config:

@@ -146,7 +146,7 @@ providers:

The `eval` command also has a parameter, `repeat`, which runs each test multiple times:

-```bash
+```
promptfoo eval --repeat 3
```
