Model inference parameters

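# Two questions about reproducing the paper

Hello, and thank you to the team for open-sourcing such an excellent project!

While reproducing the experimental results from the paper, I ran into a few problems and would appreciate some clarification: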

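## 1. Confirming the evaluation configuration

During reproduction, I found that some of my results differ from the numbers reported in the paper. Are the dataset evaluation configurations in this project the same as those in the [LegalKit examples](https://github.com/DavidMiao1127/LegalKit/tree/main/example)? If they differ, could you share the specific configuration details?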

## 2. Missing LLM-as-a-Judge evaluation prompts

Section X of the paper states:

> Notably, while the original LexEval benchmark typically relies on ROUGE metrics for Generation tasks, we adopt Qwen3-235B-A22B [53] as an LLM-as-a-Judge to ensure a more robust and reliable evaluation. The specific prompts used in our experiments are accessible on GitHub.

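However, I could not find the evaluation prompts in the current repository. Have they been uploaded yet? If not, could you share them, or let me know whether you plan to add them to the repository soon?

Thank you for your help!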
