HC-Guo/SNS-Bench

SNS-Bench: Defining, Building, and Assessing Capabilities of Large Language Models in Social Networking Services

overview.png

🤗 Hugging Face   |   🤖 ModelScope   |    📑 Paper    |    🛠️ Code   

Quick Start

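Installation

For the dependencies required by the evaluation code, see the OpenCompass README.

Evaluation Code

The SNS-Bench dataset loading and evaluation code is located in opencompass/opencompass/datasets/sns_bench, and the configuration files are in opencompass/opencompass/configs/datasets/sns_bench.

Running the Evaluation

Run the following commands: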

cd SNS-Bench/opencompass
opencompass examples/eval_sns_bench.py

Metrics

Detailed calculation methods of the metrics are provided in Section 4.2 of the paper.

The relevant code is located in the code folder.

The scores reported in the main table of the paper (reported_score) are computed as follows (special cases only):

# Note-CHLW
# code/chlw.py (line 101)
# reported_score = success_f1

# Note-QueryCorr [Topic]
# code/query_corr_topic.py (line 84)
# reported_score = success-macro-f1

# Note-MRC [Simple]
# code/mrc_simple.py (line 174-178)
# reported_score = AVG(success-f1 + success-blue + success-rouge-1 + success-rouge-2 + success-rouge-L)

# Note-MRC [Complex]
# code/mrc_complex.py (line 155-157)
# reported_score = AVG(success-total-f1 + success-option-f1 + success-option-em)
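As a minimal sketch of how the averaged scores above could be reproduced from a dictionary of per-metric results, assuming AVG denotes the unweighted mean of the listed metrics: the metric keys are copied from the comments above, while the helper name `reported_score`, the dictionary layout, and the example values are hypothetical and not taken from the scripts in the code folder.

```python
from statistics import mean

# Metric keys copied verbatim from the README comments; whether the scripts
# in code/ emit exactly these names is an assumption.
MRC_SIMPLE_KEYS = [
    "success-f1",
    "success-blue",
    "success-rouge-1",
    "success-rouge-2",
    "success-rouge-L",
]
MRC_COMPLEX_KEYS = [
    "success-total-f1",
    "success-option-f1",
    "success-option-em",
]


def reported_score(metrics, keys):
    """Return the unweighted mean of the selected metrics (hypothetical helper)."""
    missing = [k for k in keys if k not in metrics]
    if missing:
        raise KeyError(f"missing metrics: {missing}")
    return mean(metrics[k] for k in keys)


# Hypothetical values, for illustration only (not results from the paper).
example = {
    "success-total-f1": 0.6,
    "success-option-f1": 0.5,
    "success-option-em": 0.4,
}
print(round(reported_score(example, MRC_COMPLEX_KEYS), 4))
```

Raising on a missing key, rather than silently averaging over fewer metrics, keeps a partially computed result from being mistaken for the reported score.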

Acknowledgements

Thanks to OpenCompass for its excellent evaluation framework.

Dataset License

The dataset is released under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0). This means that the data and models trained using the dataset can be used for non-commercial purposes as long as proper attribution is provided. Commercial use is strictly prohibited without explicit permission from the authors. If the dataset is remixed, adapted, or built upon, the modified dataset must be licensed under identical terms.

