Commit
Merge branch 'main' into emb_op
Dominastorm authored Mar 11, 2024
2 parents f22852c + 50e2fbc commit 2b06de3
Showing 88 changed files with 3,904 additions and 752 deletions.
21 changes: 1 addition & 20 deletions CONTRIBUTING.md
@@ -58,26 +58,7 @@ Thank you for your interest in contributing to UpTrain! We look forward to reviewing…

# Documentation

There are two ways to contribute to our documentation. You can either do it directly from this website or you can clone the repository and run the website locally.

We recommend that you use the first method if you want to make small changes to the documentation. However, if you want to make a bigger change, you can clone the repository and run the website locally.

## Directly from this documentation website

You can create a pull request or an issue for any page, directly from this website. This is the easiest way to contribute to our documentation.

There are two icons on the top right corner of each page. The left one is for opening a pull request and the right one is for opening an issue.


<h4 align="center">
<img src="https://uptrain-demo.s3.us-west-1.amazonaws.com/contributing/edit-tools.png" width="85%" alt="Performance" />
</h4>

This is convenient for small changes where you don't need to clone the repository and run the website locally.

## Locally

However, if you want to make a bigger change, you can clone the repository and run the website locally by following the instructions below.
Follow the steps below to make changes to the documentation:

1. Fork the repository to your own GitHub account. This will create a copy of the repository that you can make changes to.

2 changes: 2 additions & 0 deletions docker-compose.yml
@@ -18,6 +18,8 @@ services:
restart: always
entrypoint: ["npm", "run", "dev"]
working_dir: "/app/"
environment:
- NEXT_TELEMETRY_DISABLED=1
ports:
- 3000:3000
profiles: ["server"]
Binary file added docs/assets/dashboard/dashboard_home.png
Binary file added docs/assets/dashboard/dashboard_project1.png
Binary file added docs/assets/dashboard/eval.png
Binary file added docs/assets/dashboard/eval_logs.png
Binary file added docs/assets/dashboard/eval_select_metrics.png
Binary file added docs/assets/dashboard/prompt.png
Binary file added docs/assets/dashboard/prompt_select.png
69 changes: 69 additions & 0 deletions docs/dashboard/evaluations.mdx
@@ -0,0 +1,69 @@
---
title: Evaluations
---

### What are Evaluations?

Using UpTrain, you can run evaluations on 20+ pre-configured metrics, such as:
1. [Context Relevance](/predefined-evaluations/context-awareness/context-relevance): Evaluates how relevant the retrieved context is to the question specified.

2. [Factual Accuracy](/predefined-evaluations/context-awareness/factual-accuracy): Evaluates whether the response generated is factually correct and grounded by the provided context.

3. [Response Completeness](/predefined-evaluations/response-quality/response-completeness): Evaluates whether the response answers all aspects of the specified question.

You can find the complete list of UpTrain's supported metrics [here](/predefined-evaluations/overview).
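
If you prefer to run the same checks programmatically instead of through the dashboard, a minimal sketch using UpTrain's Python client might look like the following (the API key and data values below are placeholders, not part of this commit):

```python
from uptrain import EvalLLM, Evals

# Placeholder data: each row carries a question, the retrieved context, and the LLM response
data = [{
    "question": "Which city hosted the 2024 Summer Olympics?",
    "context": "The 2024 Summer Olympics were held in Paris, France.",
    "response": "The 2024 Summer Olympics were hosted by Paris, France."
}]

eval_llm = EvalLLM(openai_api_key="sk-...")  # hypothetical key placeholder

# Run the three metrics listed above on the sample row
results = eval_llm.evaluate(
    data=data,
    checks=[Evals.CONTEXT_RELEVANCE, Evals.FACTUAL_ACCURACY, Evals.RESPONSE_COMPLETENESS],
)
print(results)
```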

### How does it work?

<Steps>
<Step title = "Create a new Project">
Click on `Create New Project` from Home
<Frame>
<img src="/assets/dashboard/dashboard_home.png" />
</Frame>
</Step>
<Step title = "Enter Project Information">
<Frame>
<img src="/assets/dashboard/dashboard_project1.png" />
</Frame>
* `Project name:` Create a name for your project
* `Dataset name:` Create a name for your dataset
* `Project Type:` Select project type: `Evaluations`
* `Choose File:` Upload your Dataset
Sample Dataset:
```jsonl
{"question":"","response":"","context":""}
{"question":"","response":"","context":""}
```
* `Evaluation LLM:` Select an LLM to run evaluations
</Step>
<Step title = "Select Evaluations to Run">
<Frame>
<img src="/assets/dashboard/eval_select_metrics.png" />
</Frame>
</Step>
<Step title = "View Evaluations">
You can see all the evaluations run using UpTrain.
<Frame>
<img src="/assets/dashboard/eval.png" />
</Frame>

You can also see individual logs.
<Frame>
<img src="/assets/dashboard/eval_logs.png" />
</Frame>
</Step>
</Steps>

<CardGroup cols={1}>
<Card
title="Have Questions?"
href="https://join.slack.com/t/uptraincommunity/shared_invite/zt-1yih3aojn-CEoR_gAh6PDSknhFmuaJeg"
icon="slack"
color="#808080"
>
Join our community for any questions or requests
</Card>

</CardGroup>

38 changes: 38 additions & 0 deletions docs/dashboard/getting_started.mdx
@@ -0,0 +1,38 @@
---
title: Getting Started
---

### What is UpTrain Dashboard?

The UpTrain dashboard is a web-based interface that allows you to evaluate your LLM applications.

It is a self-hosted dashboard that runs on your local machine. You don't need to write any code to use the dashboard.

You can use the dashboard to evaluate your LLM applications, view the results, manage prompts, run experiments, and perform root cause analysis.

<Note>Before you start, ensure you have Docker installed on your machine. If not, you can install it from [here](https://docs.docker.com/get-docker/).</Note>

### How to install?

The following commands will download the UpTrain dashboard and start it on your local machine:
```bash
# Clone the repository
git clone https://github.com/uptrain-ai/uptrain
cd uptrain

# Run UpTrain
bash run_uptrain.sh
```
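
Based on the `docker-compose.yml` change in this commit, the dashboard frontend appears to map port 3000, so once the containers are up it should be reachable in your browser. A quick, hedged check (service names and the port may differ in your setup):

```bash
# Verify that the UpTrain containers are up
docker compose ps

# Open the dashboard (port 3000 per the docker-compose.yml above)
open http://localhost:3000   # macOS; use xdg-open on Linux, or just visit the URL
```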

<CardGroup cols={1}>
<Card
title="Have Questions?"
href="https://join.slack.com/t/uptraincommunity/shared_invite/zt-1yih3aojn-CEoR_gAh6PDSknhFmuaJeg"
icon="slack"
color="#808080"
>
Join our community for any questions or requests
</Card>

</CardGroup>

52 changes: 52 additions & 0 deletions docs/dashboard/project.mdx
@@ -0,0 +1,52 @@
---
title: Create a Project
---

### What is a Project?

Using the UpTrain Dashboard, you can manage all your projects.

We support two types of projects:
* **[Evaluations](/dashboard/evaluations):** Run evaluations on your queries, documents and LLM responses
* **[Prompts](/dashboard/prompts):** Find the best way to ask questions to your LLM using prompt iteration, experimentation and evaluations

### How does it work?

<Steps>
<Step title = "Create a new Project">
Click on `Create New Project` from Home
<Frame>
<img src="/assets/dashboard/dashboard_home.png" />
</Frame>
</Step>
<Step title = "Enter Project Information">
* `Project name:` Create a name for your project
* `Dataset name:` Create a name for your dataset
* `Project Type:` Select a project type: `Evaluations` or `Prompts`
* `Choose File:` Upload your Dataset
Sample Dataset (a filled-in example follows these steps):
```jsonl
{"question":"", "response":"", "context":""}
{"question":"", "response":"", "context":""}
```
* `Evaluation LLM:` Select an LLM to run evaluations
<Frame>
<img src="/assets/dashboard/dashboard_project1.png" />
</Frame>
</Step>
</Steps>
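
For reference, a filled-in version of the sample dataset above might look like this (the values are purely illustrative):

```jsonl
{"question": "What is UpTrain?", "response": "UpTrain is an open-source tool for evaluating LLM applications.", "context": "UpTrain is an open-source platform for evaluating and improving LLM applications."}
{"question": "Which metrics does UpTrain support?", "response": "It supports metrics such as context relevance and factual accuracy.", "context": "UpTrain ships 20+ pre-configured metrics, including context relevance and factual accuracy."}
```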

Now that you have created a project, you can run evaluations or experiment with prompts.

<CardGroup cols={1}>
<Card
title="Have Questions?"
href="https://join.slack.com/t/uptraincommunity/shared_invite/zt-1yih3aojn-CEoR_gAh6PDSknhFmuaJeg"
icon="slack"
color="#808080"
>
Join our community for any questions or requests
</Card>

</CardGroup>

69 changes: 69 additions & 0 deletions docs/dashboard/prompts.mdx
@@ -0,0 +1,69 @@
---
title: Prompts
---

### What are Prompts?

Using UpTrain, you can manage your prompt iterations and experiment with them across 20+ pre-configured evaluation metrics, such as:
1. [Context Relevance](/predefined-evaluations/context-awareness/context-relevance): Evaluates how relevant the retrieved context is to the question specified.

2. [Factual Accuracy](/predefined-evaluations/context-awareness/factual-accuracy): Evaluates whether the response generated is factually correct and grounded by the provided context.

3. [Response Completeness](/predefined-evaluations/response-quality/response-completeness): Evaluates whether the response answers all aspects of the specified question.

You can find the complete list of UpTrain's supported metrics [here](/predefined-evaluations/overview).
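
The dashboard handles prompt experimentation without writing any code, but for a rough idea of what such a comparison boils down to, here is a hedged sketch using UpTrain's Python client (the question, context, responses, and API key are hypothetical placeholders):

```python
from uptrain import EvalLLM, Evals

eval_llm = EvalLLM(openai_api_key="sk-...")  # hypothetical key placeholder

question = "What does the refund policy cover?"
context = "Refunds are available within 30 days of purchase for unused items."

# Responses produced by two hypothetical prompt variants of your application
responses = {
    "prompt_v1": "Refunds cover unused items returned within 30 days of purchase.",
    "prompt_v2": "You can get a refund.",
}

# Score each variant on the same metric and compare the results
for variant, response in responses.items():
    result = eval_llm.evaluate(
        data=[{"question": question, "context": context, "response": response}],
        checks=[Evals.RESPONSE_COMPLETENESS],
    )
    print(variant, result[0])  # inspect the score fields in the returned dict
```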

### How does it work?

<Steps>
<Step title = "Create a new Project">
Click on `Create New Project` from Home
<Frame>
<img src="/assets/dashboard/dashboard_home.png" />
</Frame>
</Step>
<Step title = "Enter Project Information">
<Frame>
<img src="/assets/dashboard/dashboard_project1.png" />
</Frame>
* `Project name:` Create a name for your project
* `Dataset name:` Create a name for your dataset
* `Project Type:` Select project type: `Prompts`
* `Choose File:` Upload your Dataset
Sample Dataset:
```jsonl
{"question":"","response":"","context":""}
{"question":"","response":"","context":""}
```
* `Evaluation LLM:` Select an LLM to run evaluations
</Step>
<Step title = "Enter your Prompt">
<Frame>
<img src="/assets/dashboard/prompt_select.png" />
</Frame>
</Step>
<Step title = "Select Evaluations to Run">
<Frame>
<img src="/assets/dashboard/eval_select_metrics.png" />
</Frame>
</Step>
<Step title = "View Prompts">
You can see all the evaluations run on your prompts using UpTrain.
<Frame>
<img src="/assets/dashboard/prompt.png" />
</Frame>
</Step>
</Steps>

<CardGroup cols={1}>
<Card
title="Have Questions?"
href="https://join.slack.com/t/uptraincommunity/shared_invite/zt-1yih3aojn-CEoR_gAh6PDSknhFmuaJeg"
icon="slack"
color="#808080"
>
Join our community for any questions or requests
</Card>

</CardGroup>

15 changes: 10 additions & 5 deletions docs/integrations/observation-tools/langfuse.mdx
@@ -1,18 +1,23 @@
---
title: Langfuse
---
[Langfuse](https://langfuse.com/) offers the feature to score your traces and spans. They can be used in multiple ways across Langfuse:
1. Displayed on trace to provide a quick overview
2. Segment all execution traces by scores to e.g. find all traces with a low-quality score
3. Analytics: Detailed score reporting with drill downs into use cases and user segments
[Langfuse](https://langfuse.com/) is an open source LLM engineering platform which helps teams collaboratively debug, analyze and iterate on their LLM applications. Its core features are [observability](https://langfuse.com/docs/tracing/overview) (tracing), [prompt management](https://langfuse.com/docs/prompts/get-started) (versioning), [evaluations](https://langfuse.com/docs/scores/overview) (scores) and [datasets](https://langfuse.com/docs/datasets/overview) (testing).

Langfuse allows users to score individual executions or traces. Users can customize the scores and scales they use. As such, UpTrain's evaluations can easily be integrated into Langfuse.

Scores can be used in a variety of ways in Langfuse:
1. Data: Attach scores to executions and traces and view them in the Langfuse UI
2. Filter: Group executions or traces by scores to e.g. filter for traces with a low-quality score
3. Fine-tuning: Filter and [export](https://langfuse.com/docs/export-and-fine-tuning) by scores as .csv or .jsonl for fine-tuning
4. Analytics: Detailed score reporting and dashboards with drill downs into use cases and user segments

This notebook demonstrates how to use Langfuse to create traces and evaluate them using UpTrain.
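
Before diving into the notebook, here is a minimal sketch of what the integration boils down to: run an UpTrain evaluation and attach the result to a Langfuse trace as a score (the trace name, placeholder data, and API key below are hypothetical):

```python
from langfuse import Langfuse
from uptrain import EvalLLM, Evals

langfuse = Langfuse()  # reads LANGFUSE_PUBLIC_KEY / LANGFUSE_SECRET_KEY from the environment
eval_llm = EvalLLM(openai_api_key="sk-...")  # hypothetical key placeholder

# Create a trace for one LLM interaction
trace = langfuse.trace(name="rag-query")  # hypothetical trace name

# Run an UpTrain evaluation on that interaction's data
result = eval_llm.evaluate(
    data=[{"question": "...", "context": "...", "response": "..."}],
    checks=[Evals.CONTEXT_RELEVANCE],
)[0]

# Attach the UpTrain score to the trace so it shows up in Langfuse
langfuse.score(
    trace_id=trace.id,
    name="context_relevance",
    value=result.get("score_context_relevance", 0.0),
)
```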

## How to integrate?
### Setup
**Enter your Langfuse API keys and OpenAI API key**

You can get your Langfuse API keys [here](https://cloud.langfuse.com/) and OpenAI API key [here](https://platform.openai.com/api-keys)
You need to sign up for Langfuse and fetch your Langfuse API keys from your [project's settings](https://cloud.langfuse.com/). You also need an [OpenAI API key](https://platform.openai.com/api-keys).

```python
%pip install langfuse datasets uptrain litellm openai --upgrade
@@ -23,6 +23,11 @@ ANYSCALE_API_KEY = "esecret_***********************"

settings = Settings(model='anyscale/mistralai/Mistral-7B-Instruct-v0.1', anyscale_api_key=ANYSCALE_API_KEY)
```
<Note>
The model name should start with `anyscale/` for UpTrain to recognize you are using models hosted on Anyscale.

For example, if you are using `mistralai/Mistral-7B-Instruct-v0.1` via Anyscale, the model name should be `anyscale/mistralai/Mistral-7B-Instruct-v0.1`.
</Note>

We have used Mistral-7B-Instruct-v0.1 for this example. You can find a full list of available models [here](https://docs.endpoints.anyscale.com/category/supported-models).

0 comments on commit 2b06de3