
Commit 80d6b28

fix(docs): remove useless code in evaluation.md (#206)
* Update 2.evaluation.md
* Update 4.troubleshooting.md (#207)
1 parent 5a6ff48 commit 80d6b28

2 files changed: +2, −63 lines

docs/content/1.introduction/4.troubleshooting.md

Lines changed: 2 additions & 2 deletions
@@ -28,8 +28,8 @@ navigation:
 - ![Enable permissions](/images/troubleshooting-01.png)
 
 - After a new account is activated, it is missing the ServerlessApplicationRole authorization
-- Currently, you can open any application-creation page (for example, [this page](https://console.volcengine.com/vefaas/region:vefaas+cn-beijing/application/create?templateId=67f7b4678af5a6000850556c)) and click 「一键授权」 (one-click authorization)
-- ![Role authorization](/images/troubleshooting-02.png)
+- Go to the Volcengine Function Service console and open the application-creation page (for example, [here](https://console.volcengine.com/vefaas/region:vefaas+cn-beijing/application/create?templateId=67f7b4678af5a6000850556c)), then click 「一键授权」 (one-click authorization)
+- ![Role authorization](/images/troubleshooting-02.png)
 
 2. **Dependency installation fails, reporting insufficient space for installing dependencies**
 - The maximum dependency-installation size in VeFaaS defaults to 250 MB; if you need more space, contact the VeFaaS product team to increase the quota.

docs/content/8.observation/2.evaluation.md

Lines changed: 0 additions & 61 deletions
@@ -154,64 +154,3 @@ evaluator = DeepevalEvaluator(
     prometheus_config=prometheus_config,
 )
 ```
-
-## Complete Example
-
-The following is a complete example of using the DeepEval evaluator. It defines a [GEval](https://deepeval.com/docs/metrics-llm-evals) metric and a [ToolCorrectnessMetric](https://deepeval.com/docs/metrics-tool-correctness) metric, used for overall output-quality evaluation and for tool-call correctness evaluation respectively, and reports the evaluation results to Volcengine's VMP platform:
-
-```python
-import asyncio
-import os
-from builtin_tools.agent import agent
-
-from deepeval.metrics import GEval, ToolCorrectnessMetric
-from deepeval.test_case import LLMTestCaseParams
-from veadk.config import getenv
-from veadk.evaluation.deepeval_evaluator import DeepevalEvaluator
-from veadk.evaluation.utils.prometheus import PrometheusPushgatewayConfig
-from veadk.prompts.prompt_evaluator import eval_principle_prompt
-
-prometheus_config = PrometheusPushgatewayConfig()
-
-# 1. Rollout, and generate eval set file
-# await agent.run(
-#     prompt,
-#     collect_runtime_data=True,
-#     eval_set_id=f"eval_demo_set_{get_current_time()}",
-# )
-# # get expect output
-# dump_path = agent._dump_path
-# assert dump_path != "", "Dump eval set file failed! Please check runtime logs."
-
-# 2. Evaluate in terms of eval set file
-evaluator = DeepevalEvaluator(
-    agent=agent,
-    judge_model_name=getenv("MODEL_JUDGE_NAME"),
-    judge_model_api_base=getenv("MODEL_JUDGE_API_BASE"),
-    judge_model_api_key=getenv("MODEL_JUDGE_API_KEY"),
-    prometheus_config=prometheus_config,
-)
-
-# 3. Define evaluation metrics
-metrics = [
-    GEval(
-        threshold=0.8,
-        name="Base Evaluation",
-        criteria=eval_principle_prompt,
-        evaluation_params=[
-            LLMTestCaseParams.INPUT,
-            LLMTestCaseParams.ACTUAL_OUTPUT,
-            LLMTestCaseParams.EXPECTED_OUTPUT,
-        ],
-    ),
-    ToolCorrectnessMetric(
-        threshold=0.5
-    ),
-]
-
-# 4. Run evaluation
-eval_set_file_path = os.path.join(
-    os.path.dirname(__file__), "builtin_tools", "evalsetf0aef1.evalset.json"
-)
-await evaluator.eval(eval_set_file_path=eval_set_file_path, metrics=metrics)
-```
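Side note on the removed example: it imports `asyncio` without ever using it, and the final `await evaluator.eval(...)` sits at module top level, which only works inside an already-running event loop (a notebook, for instance). If you still want to run the same flow as a standalone script, here is a minimal sketch that wraps the call in `asyncio.run`; it assumes the `veadk`, `deepeval`, and local `builtin_tools` APIs exactly as they appear in the diff above:

```python
import asyncio
import os

from builtin_tools.agent import agent  # local module from the removed example
from deepeval.metrics import GEval, ToolCorrectnessMetric
from deepeval.test_case import LLMTestCaseParams
from veadk.config import getenv
from veadk.evaluation.deepeval_evaluator import DeepevalEvaluator
from veadk.evaluation.utils.prometheus import PrometheusPushgatewayConfig
from veadk.prompts.prompt_evaluator import eval_principle_prompt


async def main() -> None:
    # Judge-model credentials come from the environment, as in the removed example.
    evaluator = DeepevalEvaluator(
        agent=agent,
        judge_model_name=getenv("MODEL_JUDGE_NAME"),
        judge_model_api_base=getenv("MODEL_JUDGE_API_BASE"),
        judge_model_api_key=getenv("MODEL_JUDGE_API_KEY"),
        prometheus_config=PrometheusPushgatewayConfig(),  # report results to VMP
    )

    metrics = [
        # Overall output quality, judged by an LLM against eval_principle_prompt.
        GEval(
            threshold=0.8,
            name="Base Evaluation",
            criteria=eval_principle_prompt,
            evaluation_params=[
                LLMTestCaseParams.INPUT,
                LLMTestCaseParams.ACTUAL_OUTPUT,
                LLMTestCaseParams.EXPECTED_OUTPUT,
            ],
        ),
        # Whether the tools the agent called match the expected tool calls.
        ToolCorrectnessMetric(threshold=0.5),
    ]

    eval_set_file_path = os.path.join(
        os.path.dirname(__file__), "builtin_tools", "evalsetf0aef1.evalset.json"
    )
    await evaluator.eval(eval_set_file_path=eval_set_file_path, metrics=metrics)


if __name__ == "__main__":
    # asyncio.run supplies the event loop, so the script runs outside a
    # notebook; the removed example's top-level `await` would not.
    asyncio.run(main())
```

The commented-out rollout step (`# 1. Rollout, and generate eval set file`) is left out of the sketch; it only matters when the eval set file still needs to be generated.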
