SNS-Bench: Defining, Building, and Assessing Capabilities of Large Language Models in Social Networking Services

🤗 Hugging Face | 🤖 ModelScope | 📑 Paper | 🛠️ Code

Quick Start

Installation

For the dependencies required by the evaluation code, see the OpenCompass README.

Evaluation Code

The SNS-Bench dataset loading and evaluation code is located in opencompass/opencompass/datasets/sns_bench, and the configuration files are in opencompass/opencompass/configs/datasets/sns_bench.

Running the Evaluation

Run the following commands:

cd SNS-Bench/opencompass
opencompass examples/eval_sns_bench.py

Metrics

Detailed calculation methods of the metrics are provided in Section 4.2 of the paper.

The relevant code is located in the code folder.

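The reported_score values in the main table of the paper are defined as follows (only the tasks that need special explanation are listed):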

# Note-CHLW
# code/chlw.py (line 101)
# reported_score = success_f1

# Note-QueryCorr [Topic]
# code/query_corr_topic.py (line 84)
# reported_score = success-macro-f1

# Note-MRC [Simple]
# code/mrc_simple.py (line 174-178)
# reported_score = AVG(success-f1 + success-blue + success-rouge-1 + success-rouge-2 + success-rouge-L)

# Note-MRC [Complex]
# code/mrc_complex.py (line 155-157)
# reported_score = AVG(success-total-f1 + success-option-f1 + success-option-em)
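For the two MRC tasks, the AVG(...) notation above appears to mean the arithmetic mean of the listed metrics. The sketch below illustrates that reading. It is an assumption about the notation, not a copy of the code in the code folder: the helper `reported_score` and the metric values are hypothetical and chosen only so the arithmetic is easy to follow.

```python
from statistics import mean


def reported_score(metrics: dict[str, float], keys: list[str]) -> float:
    """Return the arithmetic mean of the selected metric values.

    Hypothetical helper: assumes AVG(a + b + c) in the notes above
    means (a + b + c) / 3.
    """
    return mean(metrics[k] for k in keys)


# Illustrative values only; these are NOT results from the paper.
mrc_complex_metrics = {
    "success-total-f1": 0.5,
    "success-option-f1": 0.25,
    "success-option-em": 0.75,
}

# Metric names taken from the Note-MRC [Complex] entry above.
MRC_COMPLEX_KEYS = [
    "success-total-f1",
    "success-option-f1",
    "success-option-em",
]

score = reported_score(mrc_complex_metrics, MRC_COMPLEX_KEYS)
print(score)
```

The same helper would cover Note-MRC [Simple] by passing its five metric names instead. For Note-CHLW and Note-QueryCorr [Topic] no averaging is involved, since reported_score is a single metric.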

Acknowledgements

Thanks to OpenCompass for its excellent evaluation framework.

Dataset License

The dataset is released under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0). This means that the data and models trained using the dataset can be used for non-commercial purposes as long as proper attribution is provided. Commercial use is strictly prohibited without explicit permission from the authors. If the dataset is remixed, adapted, or built upon, the modified dataset must be licensed under identical terms.

