DecryptPrompt

If the sudden arrival of LLMs has left you feeling down, consider reading Choose Your Weapon: Survival Strategies for Depressed AI Academics in the root directory. The following content is continuously updated; star the repo to stay up to date~

  1. Open-source LLMs
  2. Instruction-tuning and RLHF data, plus training frameworks
  3. Prompt and LLM papers organized by sub-topic
  4. AIGC applications
  5. Prompt guides and tutorials
  6. Commentary on ChatGPT and AGI

My blogs & ChatGPT applications

Models and Data

International Models

Model Link  Model Description
Google Bard Google's Bard arrived late but it's here; the waitlist is open for applications
Claude Claude, ChatGPT's biggest competitor, is also open for applications, with unlimited trial use inside Slack
LLaMA Meta's open-source LLM family, ranging from 7B to 65B parameters
MPT MosaicML's open-source pretrained + instruction-tuned models; commercially usable and supports ultra-long inputs of up to 84k tokens
RedPajama After open-sourcing its pretraining data, the RedPajama project also released 3B and 7B pretrained + instruction-tuned models
ChatLLaMA LLaMA fine-tuned with RLHF
Alpaca Stanford's open-source model, fine-tuned from 7B LLaMA on 52k instruction samples
Alpaca-lora LLaMA fine-tuned with LoRA
Dromedary IBM self-aligned model with the LLaMA base
Vicuna Open-sourced by former Alpaca team members and others; LLaMA-13B instruction-tuned on ShareGPT data, and proposed using GPT-4 to evaluate model quality
koala LLaMA fine-tuned on open instruction sets such as Alpaca and HC3 plus ChatGPT data such as ShareGPT; ranks well on leaderboards
ColossalChat HPC-AI Tech's open-source LLaMA + RLHF fine-tune
MiniGPT4 Vicuna + BLIP-2 for combined text and vision
StackLLaMA LLaMA trained on StackExchange data with SFT + RL
Cerebras Cerebras open-sourced seven models from 0.1B to 13B parameters, with everything from pretraining data to weights released
PaLM-E Google's multimodal large model: a 540B PaLM language model combined with a 22B ViT vision model into the 562B PaLM-E, achieving new breakthroughs in robotics applications
Dolly-v2 Commercially usable open-source instruction-tuned model; its v1 predecessor was fine-tuned from GPT-J-6B
OpenChatKit Built by former OpenAI researchers: a fine-tuned GPT-NeoX-20B plus a 6B moderation model for filtering
MetaLM Microsoft's open-source large-scale self-supervised pretrained model
Amazon Titan Amazon's own large models, offered on AWS
OPT-IML Meta's replication of GPT-3, up to 175B, though its quality does not match GPT-3
Bloom From BigScience, up to 176B parameters
BloomZ From BigScience, fine-tuned from Bloom
Galactica Similar to Bloom, but trained with a focus on scientific research
T0 From BigScience, 3B-11B models instruction-tuned on T5

Chinese Models

Model Link  Model Description
ChatGLM Tsinghua's open-source bilingual (Chinese-English) dialogue LM, trained with code data, instruction tuning, and RLHF. A 130B model the same size as GLM below is still in development. Tried it and it exceeded expectations!
Moss Vindication for Fudan! All pretraining and instruction-tuning data and models are open-sourced. Commercially usable
Wombat-7B DAMO Academy's open-source model aligned with RRHF instead of reinforcement learning, built on an Alpaca base
Chinese-LLaMA-Alpaca LLaMA instruction-tuned for Chinese by Harbin Institute of Technology
Luotuo Chinese instruction-tuned LLaMA, plus ChatGLM
文心一言 (ERNIE Bot) Got an invitation code and tried it; although it is noticeably less personable, the quality is far from terrible. Go domestic models! That said, the commercial terms do contain quite a few one-sided clauses
通义千问 (Tongyi Qianwen) Alibaba's LLM, open for applications
星火 (Spark) iFLYTEK's Spark; genuinely strong at math
BiLLa Three-stage training: LLaMA vocabulary expansion with continued pretraining, SFT on a 1:1 mix of pretraining and task data, then SFT on instruction samples
Phoenix CUHK's open-source Phoenix and Chimera LLMs, built on Bloom, supporting 40+ languages
OpenBuddy Multilingual dialogue model fine-tuned from LLaMA
Guanaco LLaMA-7B base, fine-tuned on the Alpaca 52K data plus 534K additional multilingual instruction samples
ziya IDEA Research's continued pretraining on 7B/13B LLaMA, followed by SFT + RM + PPO + HFTT + COHFT + RBRS
Chinese Vicuna LLaMA-7B base, trained on Belle + Guanaco data
Linly LLaMA-7B base, trained on 7 instruction-tuning datasets including BELLE, Guanaco, pCLUE, Firefly, CSL, and News Commentary
Firefly Chinese 2.6B model aimed at improving Chinese writing and classical-Chinese ability; full training code to be released, currently only the model is available
Baize LLaMA fine-tuned on 100k self-chat dialogue samples
BELLE Open-source models optimized for Chinese with ChatGPT-generated data
ChatYuan One of the earliest Chinese open-source dialogue models after ChatGPT's release; T5 architecture, derived from PromptCLUE below
PromptCLUE Multi-task prompt-based language model
PLUG Large model released by Alibaba DAMO Academy; a download link is provided after submitting an application
CPM2.0 CPM 2.0 released by BAAI
GLM Tsinghua's 130B bilingual (Chinese-English) pretrained model

Domain-Specific Models

Model Link  Model Description
MedPalm Built by Google on top of Flan-PaLM via instruction prompt tuning on multiple types of medical QA data; also introduced the MultiMedQA benchmark
ChatDoctor Instruction-tuned on 110K real doctor-patient dialogues plus 5K ChatGPT-generated samples
Huatuo Med-ChatGLM Chinese medical instruction data built from a medical knowledge graph plus ChatGPT, along with multi-turn QA data built from medical literature plus ChatGPT
Chinese-vicuna-med Chinese-Vicuna fine-tuned on the cMedQA2 data
OpenBioMed Tsinghua AIR's open-source lightweight BioMedGPT: a multimodal pretrained model over knowledge graphs and 20+ biological research domains
DoctorGLM GLM fine-tuned on ChatDoctor + MedDialog + CMD multi-turn dialogues plus single-turn instruction samples
MedicalGPT-zh QA generated by ChatGPT from a self-built medical database, plus scenario dialogues self-constructed for 16 settings
PMC-LLaMA LLaMA fine-tuned on medical papers
NHS-LLM Model fine-tuned on ChatGPT-generated medical QA and dialogues
LawGPT-zh 52k single-turn QA obtained by cleaning the CrimeKgAssitant dataset with ChatGPT, plus 9k scenario QA generated by ChatGPT from the core articles of PRC law, plus knowledge QA pairs built by ChatGPT from legal texts
FinChat.io Trained on up-to-date financial data, earnings call transcripts, quarterly and annual reports, investment books, and more
OpenGPT Framework for generating domain-specific LLM instruction samples and fine-tuning on them
乾元BigBang金融2亿模型 200M-parameter finance model: finance-domain pretraining plus task fine-tuning
度小满千亿金融大模型 Du Xiaoman's 100B-scale finance model: finance and Chinese continued pretraining and fine-tuning on top of Bloom-176B

Instruction Tuning & RL Tools

Tool Description  Link
LoRA: low-rank instruction fine-tuning recipe https://github.com/tloen/alpaca-lora
peft: parameter-efficient tuning toolkit https://github.com/huggingface/peft
RL4LMs: AllenAI's RL toolkit https://github.com/allenai/RL4LMs
trl: Transformer-based reinforcement learning training framework https://github.com/lvwerra/trl
trlx: distributed training for trl https://github.com/CarperAI/trlx
PKU's open-source Beaver project: reproducible RLHF, supports most LLMs, and provides RLHF data https://github.com/PKU-Alignment/safe-rlhf
LMFlow: large-model fine-tuning framework open-sourced by an HKUST lab; supports instruction tuning and RLHF for most of the open-source models above https://github.com/OptimalScale/LMFlow
hugNLP: built on Hugging Face, integrating prompt techniques, pretraining, instruction-style inputs, and other approaches https://github.com/wjn1996/HugNLP
DeepSpeed: integrated optimizations for RL training and inference https://github.com/microsoft/DeepSpeed
UER-py: pretraining framework supporting LM, MLM, UniLM, and more https://github.com/dbiir/UER-py
TencentPretrain: refactored version of UER-py with support for LLaMA pretraining https://github.com/Tencent/TencentPretrain/tree/main
langchain: LLM toolkit https://github.com/hwchase17/langchain
BMTools: from Tsinghua, similar to langchain https://github.com/OpenBMB/BMTools
BabyAGI: self-executing LLM agent https://github.com/yoheinakajima/babyagi
AutoGPT: self-executing LLM agent https://github.com/Torantulino/Auto-GPT
Jarvis: framework where a large model orchestrates smaller models; giving small models a future! https://github.com/search?q=jarvis
lamini: library integrating instruction data generation, SFT, and RLHF https://github.com/lamini-ai/lamini/
wenda (闻达): small models combined with search for knowledge injection https://github.com/l15y/wenda
Chain-of-thought-hub: platform for evaluating model reasoning ability https://github.com/FranxYao/chain-of-thought-hub
FlexGen: LLM inference with a CPU-offload compute architecture https://github.com/FMInference/FlexGen
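
A minimal, hedged sketch of how the LoRA and peft entries above are typically combined for instruction tuning; the base checkpoint name, target modules, and hyperparameters are illustrative assumptions, not settings from any linked repo.

```python
# Sketch: LoRA instruction tuning with Hugging Face peft + transformers.
# All names and hyperparameters below are assumptions for illustration.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, TaskType, get_peft_model

base = "decapoda-research/llama-7b-hf"  # assumed base checkpoint
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# Freeze the base model and attach low-rank adapters to the attention projections.
lora_cfg = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                 # rank of the update matrices
    lora_alpha=16,       # scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()  # typically well under 1% of the base parameters

# From here, train with the usual transformers Trainer on an instruction dataset
# (e.g. the Alpaca 52K data listed below); only the adapter weights are updated.
```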

Open-Source Data

Data Type  Data Description  Data Link
Instruction tuning self-instruct: instruction set automatically generated and filtered with GPT-3 https://github.com/yizhongw/self-instruct
Instruction tuning Stanford Alpaca: 52K self-instruct instructions generated with text-davinci-003 https://github.com/tatsu-lab/stanford_alpaca
Instruction tuning GPT4-for-LLM: Chinese + English + comparison instructions https://github.com/Instruction-Tuning-with-GPT-4/GPT-4-LLM
Instruction tuning GPTeacher: more diverse general, role-play, and code instructions https://github.com/teknium1/GPTeacher/tree/main
Instruction tuning Chinese translation of Alpaca plus several other instruction datasets https://github.com/hikariming/alpaca_chinese_dataset https://github.com/carbonz0/alpaca-chinese-dataset
Instruction tuning Alpaca instructions generated with GPT-4; noticeably higher quality and longer responses than the versions above https://github.com/Instruction-Tuning-with-GPT-4/GPT-4-LLM/tree/main
Instruction tuning Guanaco data: Alpaca instructions rewritten and regenerated in multiple languages, 534K in total, with dialogue and non-dialogue types plus supplementary generated QA samples https://huggingface.co/datasets/JosephusCheung/GuanacoDataset
Instruction tuning COIG Chinese instructions, including translated Alpaca + Natural + Unnatural Instructions, multi-turn dialogue, exams, and LeetCode instructions https://github.com/BAAI-Zlab/COIG
Instruction tuning Samples used to train Vicuna: user-ChatGPT conversation histories pulled from ShareGPT via its API, partly curated on Hugging Face by community members https://github.com/domeccleston/sharegpt https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered/tree/main
Instruction tuning HC3 instruction data in Chinese and English, covering finance, open QA, encyclopedias, DBQA, medicine, etc., with human responses included https://huggingface.co/datasets/Hello-SimpleAI/HC3-Chinese/tree/main
Instruction tuning MOSS open-source SFT data, including dialogues that use plugins https://huggingface.co/datasets/Hello-SimpleAI/HC3-Chinese/tree/main
Instruction tuning InstructWild data: ChatGPT instructions crawled from the web used as seeds for self-instruct expansion; bilingual Chinese-English https://github.com/XueFuzhao/InstructionWild/tree/main/data
Instruction tuning BELLE 1M instruction data, generated with ChatGPT following the Alpaca recipe; includes math, multi-turn dialogue, role-play dialogue, and more https://github.com/LianjiaTech/BELLE
Instruction tuning PromptCLUE multi-task prompt dataset: template-built, contains only standard NLP tasks https://github.com/CLUEbenchmark/pCLUE
Instruction tuning Instruction dataset used to fine-tune Tk-Instruct: 1600+ NLP tasks, fully human-annotated https://instructions.apps.allenai.org/
Instruction tuning Instruction dataset used to fine-tune T0 (P3) https://huggingface.co/datasets/bigscience/P3
Instruction tuning 46-language multilingual datasets derived from P3 (xMTF) https://github.com/bigscience-workshop/xmtf
Instruction tuning Unnatural Instructions: 240k samples generated with GPT-3 and then rewritten https://github.com/orhonovich/unnatural-instructions
Instruction tuning Alpaca-CoT: multiple data sources cleaned and unified into one format on Hugging Face, with a focus on manually curated CoT data https://github.com/PhoebusSi/Alpaca-CoT
Instruction tuning Hand-written instruction data covering 23 common Chinese NLP tasks, oriented toward Chinese writing https://github.com/yangjianxin1/Firefly
Instruction tuning Amazon CoT instruction samples, covering various QA, BIG-bench, math, etc. https://github.com/amazon-science/auto-cot
Instruction tuning CSL: metadata of 396,209 Chinese core-journal papers (title, abstract, keywords, discipline, category); usable for pretraining or for building NLP instruction tasks https://github.com/ydli-ai/CSL
Instruction tuning Alpaca code: 20K code instruction samples https://github.com/sahil280114/codealpaca#data-release
Instruction tuning GPT4Tools: 71K GPT-4 instruction samples https://github.com/StevenGrove/GPT4Tools
Instruction tuning GPT-4 instructions + role-play + code instructions https://github.com/teknium1/GPTeacher
Math word problems APE210k: math problems crawled from the web, released by Tencent AI Lab https://github.com/Chenny0808/ape210k
Math word problems Math23K: primary-school word problems open-sourced by Yuanfudao AI Lab https://github.com/SCNU203/Math23k/tree/main
Math word problems Grade school math: OpenAI's grade-school math problems converted into instruction samples with 2-8 step reasoning chains https://huggingface.co/datasets/qwedsacf/grade-school-math-instructions
Math word problems Math QA dataset with reasoning traces and multiple-choice answers https://huggingface.co/datasets/math_qa/viewer/default/test?row=2
Math word problems AMC competition math problems https://huggingface.co/datasets/competition_math
Math word problems Pure math computation problems such as linear algebra https://huggingface.co/datasets/math_dataset
Code Problems collected from open-access coding sites such as Codeforces and Kattis https://opendatalab.org.cn/APPS
Code Python code with embedded SQL: carefully annotated database programs, with both Chinese and English comments https://opendatalab.org.cn/Lyra
Code Questions from StackOverflow, 3k manually annotated, English https://opendatalab.org.cn/CoNaLa/download
Dialogue instructions Manually selected component subset of LAION's curated Open Instruction Generalist dataset; a 40M release is already open-sourced and 100M is on the way https://github.com/LAION-AI/Open-Instruction-Generalist
Dialogue instructions Baize's self-chat data built with ChatGPT https://github.com/project-baize/baize-chatbot/tree/main/data
Dialogue instructions Facebook's open-source BlenderBot training dialogue data, ~6K https://huggingface.co/datasets/blended_skill_talk
Dialogue instructions SODA: 385k high-quality dialogues open-sourced by AllenAI https://realtoxicityprompts.apps.allenai.org/
Dialogue instructions InstructDial: instruction tuning on a single dialogue task type https://github.com/prakharguptaz/Instructdial
Dialogue instructions UltraChat: multi-turn dialogue data generated by having two independent ChatGPT Turbo API instances converse https://github.com/thunlp/UltraChat
Dialogue instructions Awesome Open-domain Dialogue Models: collects multiple open-domain dialogue datasets https://github.com/cingtiye/Awesome-Open-domain-Dialogue-Models#%E4%B8%AD%E6%96%87%E5%BC%80%E6%94%BE%E5%9F%9F%E5%AF%B9%E8%AF%9D%E6%95%B0%E6%8D%AE%E9%9B%86
RLHF PKU Beaver's open-source RLHF dataset, 10K samples; the 1M version requires an application https://huggingface.co/datasets/PKU-Alignment/PKU-SafeRLHF-10K
RLHF Anthropic's hh-rlhf dataset https://huggingface.co/datasets/Anthropic/hh-rlhf
RLHF StackExchange questions, each with multiple answers and a score per answer https://huggingface.co/datasets/HuggingFaceH4/stack-exchange-preferences/tree/main
RLHF Facebook Bot Adversarial Dialogues dataset, 5K https://github.com/facebookresearch/ParlAI
RLHF AllenAI Real Toxicity Prompts https://github.com/facebookresearch/ParlAI
RLHF OpenAssistant Conversations: 160K messages, human-written by around 13,500 volunteers, mostly English https://huggingface.co/datasets/OpenAssistant/oasst1
Evaluation BIG-bench (Beyond the Imitation Game Benchmark) https://github.com/google/BIG-bench
Evaluation Complex QA: instruction set for evaluating ChatGPT https://github.com/tan92hl/Complex-Question-Answering-Evaluation-of-ChatGPT
Evaluation LangChain's open-source evaluation datasets https://huggingface.co/LangChainDatasets
Evaluation Questions from Chinese national college entrance exam (Gaokao) papers, 2010-2022 https://github.com/OpenLMLab/GAOKAO-Bench
Evaluation SuperCLUE: comprehensive benchmark for general-purpose Chinese LLMs https://github.com/CLUEbenchmark/SuperCLUE
Pretraining RedPajama's open-source reproduction of the LLaMA pretraining dataset https://github.com/togethercomputer/RedPajama-Data
Pretraining CLUECorpusSmall + News Commentary (Chinese-English), curated by UER https://github.com/dbiir/UER-py/wiki/%E9%A2%84%E8%AE%AD%E7%BB%83%E6%95%B0%E6%8D%AE
Pretraining WuDao: 200GB of pretraining data open-sourced by BAAI https://github.com/BAAI-WuDao/WuDaoMM
Multi-source aggregation OpenDataLab aggregates multiple pretraining-stage data sources https://opendatalab.org.cn/?industry=9821&source=JUU3JTlGJUE1JUU0JUI5JThF
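
Most of the Hugging Face entries above can be pulled directly with the `datasets` library; a minimal sketch using the Anthropic hh-rlhf entry from the RLHF rows (the field names are as published for that dataset, everything else is illustrative):

```python
# Sketch: loading one of the listed preference datasets with Hugging Face datasets.
from datasets import load_dataset

hh = load_dataset("Anthropic/hh-rlhf", split="train")
print(hh[0]["chosen"][:200])    # human-preferred response
print(hh[0]["rejected"][:200])  # dispreferred response
```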

Resources

Tools & Tutorials

AIGC playground

  • cognosys: the most hyped web-based AutoGPT; how should I put it... I tried it and nearly laughed my jaw off. No spoilers, try it and you'll see
  • godmode: an AutoGPT that requires human confirmation at every step
  • agentgpt: a basic AutoGPT
  • New Bing: must be accessed from outside mainland China or it redirects to Bing China; requires joining the waitlist
  • Perplexity.ai: also requires access from outside China; a remarkable ChatGPT-powered search engine that feels better executed than Bing, adding related recommendations and follow-up questions on top of what Bing offers ⭐
  • BingGPT: open-source desktop client for New Bing; can export chat history
  • DocsGPT: a general recipe for turning ChatGPT's open-domain QA into closed-domain QA; suited to vertical-domain QA scenarios and usable for building customized chatbots
  • langchain-ChatGLM: local knowledge-base QA built on ChatGLM, similar to DocsGPT above but deployable locally ⭐
  • ChatPDF: a Chinese ChatPDF; after uploading a PDF it suggests the top 5 likely questions about the document, then answers and retrieves from the document conversationally; reads 30,000 characters in 10 seconds
  • ChatDoc: an upgraded ChatPDF, adding table parsing and polished index citations with jump-to-source and highlighting of the corresponding passage; ha, I'm planning to build one myself
  • ChatPaper: given keywords, automatically downloads the latest papers from arXiv and summarizes them; can be tried on Hugging Face!
  • OpenRead: aimed at paper writing and reading; helps generate literature reviews and offers a NotionAI-like smart Markdown editor for writing
  • researchgpt: similar to ChatPDF; supports downloading arXiv papers and extracting their key points conversationally once loaded
  • BriefGPT: daily-updated arXiv papers with summaries and keyword extraction, helping researchers keep up with the latest work; nice UI too
  • ChatGPT-academic: yet another gradio-based bundle of paper polishing, summarization, and related features
  • feishu-chatgpt: ChatGPT for Feishu (Lark); like 365 Copilot it integrates many components; quite comprehensive!
  • ChatMind: ChatGPT-generated mind maps; decent for general topics, but for a specific book it just makes things up; combined with retrieval-based reading it could really shine~
  • Shell: ChatGPT-based AI English chat tool and spoken-practice assistant
  • AI Topiah: Lingxin AI character chat; had a quick chat with Luffy, and there's definitely some chuunibyou spirit burning there
  • chatbase: emotional character chat; haven't tried it yet
  • Vana: virtual DNA; create a virtual version of yourself through chat! Very cool concept
  • WriteSonic: AI writing that supports both chat and targeted creation such as ad copy and product descriptions; web retrieval support is the highlight, and Chinese is supported
  • copy.ai: WriteSonic competitor; the highlight is that, like citations in a paper, every sentence links to its source website, and you can copy it with one click into the Markdown editor on the right; super handy!
  • NotionAI: smart Markdown that really wins you over; while writing, invoke the AI with a command to polish, expand, retrieve content, or suggest creative ideas
  • Jasper: same as above, they're all competitors haha
  • copy.down: Chinese marketing copy generation; targeted creation only, supports keyword-to-copy generation
  • ChatExcel: control Excel calculations with instructions; somewhat redundant for Excel power users, somewhat useful for everyone else
  • ChatPPT: build PowerPoint decks with ChatGPT
  • BibiGPT: one-click summarization of Bilibili video content, multimodal documents
  • Microsoft 365 Copilot: Microsoft Office fully integrated with GPT-4 for smart PPT, Excel, and Word; no link yet. Essentially the all-in-one bundle of the open-source ideas above
  • Google Workspace: Google's AI-powered coverage of the full range of office scenarios; no way to use it yet.
  • Copilot: paid, mind you
  • Fauxpilot: local open-source alternative to Copilot
  • CodeGeeX: Chinese alternative; haven't tried it yet
  • Codeium: Copilot alternative with a free tier and plugins for many editors
  • Wolverine: a Python script that lets code debug itself
  • dreamstudio.ai: the pioneer behind Stable Diffusion, with a trial quota
  • midjourney: another pioneer, focused on artistic styles
  • Dall.E: and that completes the big three
  • ControlNet: adds controllability to image generation
  • GFPGAN: photo restoration
  • Visual ChatGPT: Microsoft's image-enabled ChatGPT for conversational image generation, editing, and QA
  • gemo.ai: multimodal chatbot covering text, image, and video generation

Recommended Blogs

Papers

Paper List

Survey

  • A Survey of Large Language Models
  • Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing ⭐
  • Paradigm Shift in Natural Language Processing
  • Pre-Trained Models: Past, Present and Future

LLM Ability Analysis & Probing

  • LARGER LANGUAGE MODELS DO IN-CONTEXT LEARNING DIFFERENTLY
  • Evidence of Meaning in Language Models Trained on Programs
  • Sparks of Artificial General Intelligence: Early experiments with GPT-4
  • How does in-context learning work? A framework for understanding the differences from traditional supervised learning
  • Why Can GPT Learn In-Context? Language Models Secretly Perform Gradient Descent as Meta-Optimizers
  • Emergent Abilities of Large Language Models
  • Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?
  • Can Explanations Be Useful for Calibrating Black Box Models

Tuning-Free Prompt

  • GPT2: Language Models are Unsupervised Multitask Learners
  • GPT3: Language Models are Few-Shot Learners ⭐
  • LAMA: Language Models as Knowledge Bases?
  • AutoPrompt: Eliciting Knowledge from Language Models

Fixed-Prompt LM Tuning

  • T5: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
  • PET-TC(a): Exploiting Cloze Questions for Few Shot Text Classification and Natural Language Inference ⭐
  • PET-TC(b): It's Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners
  • GenPET: Few-Shot Text Generation with Natural Language Instructions
  • LM-BFF: Making Pre-trained Language Models Better Few-shot Learners ⭐
  • ADAPET: Improving and Simplifying Pattern Exploiting Training

Fixed-LM Prompt Tuning

  • Prefix-tuning: Optimizing continuous prompts for generation
  • Prompt-tuning: The power of scale for parameter-efficient prompt tuning ⭐
  • P-tuning: GPT Understands, Too ⭐
  • WARP: Word-level Adversarial ReProgramming

LM + Prompt Tuning

  • P-tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks
  • PTR: Prompt Tuning with Rules for Text Classification
  • PADA: Example-based Prompt Learning for on-the-fly Adaptation to Unseen Domains

Fixed-LM Adapter Tuning

  • LORA: LOW-RANK ADAPTATION OF LARGE LANGUAGE MODELS ⭐
  • LST: Ladder Side-Tuning for Parameter and Memory Efficient Transfer Learning
  • Parameter-Efficient Transfer Learning for NLP
  • INTRINSIC DIMENSIONALITY EXPLAINS THE EFFECTIVENESS OF LANGUAGE MODEL FINE-TUNING
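
A toy numpy sketch of the low-rank update described in the LoRA paper above, h = x(W + (α/r)·BA); the dimensions and scaling are illustrative assumptions, not code from the paper:

```python
# Sketch: LoRA's low-rank weight update on a frozen matrix W.
import numpy as np

d, k, r, alpha = 1024, 1024, 8, 16
rng = np.random.default_rng(0)

W = rng.normal(size=(d, k))          # frozen pretrained weight
A = rng.normal(size=(r, k)) * 0.01   # trainable, rank r
B = np.zeros((d, r))                 # trainable, zero-initialized so the update starts at 0

def lora_forward(x):
    # Base path plus the scaled low-rank correction; only A and B would be trained.
    return x @ W + (alpha / r) * (x @ B @ A)

x = rng.normal(size=(2, d))
print(lora_forward(x).shape)  # (2, 1024)
```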

Instruction Tuning LLMs

  • Flan: FINETUNED LANGUAGE MODELS ARE ZERO-SHOT LEARNERS ⭐
  • Flan-T5: Scaling Instruction-Finetuned Language Models
  • Instruct-GPT: Training language models to follow instructions with human feedback ⭐
  • T0: MULTITASK PROMPTED TRAINING ENABLES ZERO-SHOT TASK GENERALIZATION
  • Natural Instructions: Cross-Task Generalization via Natural Language Crowdsourcing Instructions
  • Tk-INSTRUCT: SUPER-NATURALINSTRUCTIONS: Generalization via Declarative Instructions on 1600+ NLP Tasks
  • Unnatural Instructions: Tuning Language Models with (Almost) No Human Labor

Train for Dialogue

  • LaMDA: Language Models for Dialog Applications
  • Sparrow: Improving alignment of dialogue agents via targeted human judgements ⭐
  • BlenderBot 3: a deployed conversational agent that continually learns to responsibly engage
  • How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation

Chain of Thought

  • [zero-shot-COT] Large Language Models are Zero-Shot Reasoners ⭐
  • [Manual COT] Chain of Thought Prompting Elicits Reasoning in Large Language Models ⭐
  • SELF-CONSISTENCY IMPROVES CHAIN OF THOUGHT REASONING IN LANGUAGE MODELS
  • COMPLEXITY-BASED PROMPTING FOR MULTI-STEP REASONING
  • LEAST-TO-MOST PROMPTING ENABLES COMPLEX REASONING IN LARGE LANGUAGE MODELS
  • Solving Quantitative Reasoning Problems with Language Models
  • Specializing Smaller Language Models towards Multi-Step Reasoning
  • Towards Understanding Chain-of-Thought Prompting: An Empirical Study of What Matters
  • TEXT AND PATTERNS: FOR EFFECTIVE CHAIN OF THOUGHT IT TAKES TWO TO TANGO
  • Decomposed Prompting: A Modular Approach for Solving Complex Tasks
  • Solving math word problems with process- and outcome-based feedback
  • CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning
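
A minimal sketch of the zero-shot CoT trick from the first paper above (appending "Let's think step by step"); the OpenAI chat API is an assumed backend, and the exact client call varies by library version:

```python
# Sketch: zero-shot chain-of-thought prompting.
import openai  # assumes OPENAI_API_KEY is set; legacy (<1.0) client interface

question = ("A juggler has 16 balls. Half are golf balls, and half of the golf "
            "balls are blue. How many blue golf balls are there?")

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user",
               "content": question + "\n\nLet's think step by step."}],
    temperature=0,
)
print(response["choices"][0]["message"]["content"])
```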

RLHF

  • Deepmind
    • Teaching language models to support answers with verified quotes
    • sparrow, Improving alignment of dialogue agents via targeted human judgements ⭐
  • openai
    • PPO: Proximal Policy Optimization Algorithms ⭐
    • Deep Reinforcement Learning from Human Preferences
    • Fine-Tuning Language Models from Human Preferences
    • learning to summarize from human feedback
    • InstructGPT: Training language models to follow instructions with human feedback ⭐
    • Scaling Laws for Reward Model Overoptimization ⭐
  • Anthropic
    • A General Language Assistant as a Laboratory for Alignment
    • Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned
    • Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback ⭐
    • Constitutional AI: Harmlessness from AI Feedback ⭐
    • Pretraining Language Models with Human Preferences
  • AllenAI, RL4LM: IS REINFORCEMENT LEARNING (NOT) FOR NATURAL LANGUAGE PROCESSING: BENCHMARKS
  • RRHF: Rank Responses to Align Language Models with Human Feedback without tears
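
A toy PyTorch sketch of the pairwise reward-model objective shared by the InstructGPT and Anthropic HH papers above: the reward model should score the human-preferred response above the rejected one. This is an illustrative stand-in, not code from any listed paper:

```python
# Sketch: pairwise reward-model loss used in the RM stage of RLHF.
import torch
import torch.nn.functional as F

def reward_model_loss(chosen_rewards, rejected_rewards):
    # -log sigmoid(r_chosen - r_rejected), averaged over the batch.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

chosen = torch.tensor([1.2, 0.3])    # scalar rewards for preferred responses
rejected = torch.tensor([0.1, 0.5])  # scalar rewards for dispreferred responses
print(reward_model_loss(chosen, rejected))  # shrinks as chosen outscores rejected
```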

Agent: Letting the Model Use Tools

  • Toolformer: Language Models Can Teach Themselves to Use Tools
  • MRKL Systems: A modular, neuro-symbolic architecture that combines large language models, external knowledge sources and discrete reasoning ⭐
  • ReAct: SYNERGIZING REASONING AND ACTING IN LANGUAGE MODELS ⭐
  • Self-ask: MEASURING AND NARROWING THE COMPOSITIONALITY GAP IN LANGUAGE MODELS
  • PAL: Program-aided Language Models
  • HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in HuggingFace
  • OpenAGI: When LLM Meets Domain Experts
  • Tool Learning with Foundation Models
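
A minimal ReAct-style loop illustrating the tool-use pattern in the ReAct and Self-ask papers above; the single calculator tool, the prompt format, and the OpenAI backend are assumptions, not any paper's reference implementation:

```python
# Sketch: a Thought/Action/Observation loop with one calculator tool.
import re
import openai  # assumes OPENAI_API_KEY is set; legacy (<1.0) client interface

TOOLS = {"calculator": lambda expr: str(eval(expr, {"__builtins__": {}}))}

SYSTEM = (
    "Answer the question by interleaving Thought, Action and Observation.\n"
    "Use: Action: calculator[<python expression>] when you need arithmetic.\n"
    "Finish with: Final Answer: <answer>."
)

def react(question, max_steps=5):
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        reply = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "system", "content": SYSTEM},
                      {"role": "user", "content": transcript}],
            temperature=0,
            stop=["Observation:"],  # pause so our code can supply the tool result
        )["choices"][0]["message"]["content"]
        transcript += reply + "\n"
        if "Final Answer:" in reply:
            return reply.split("Final Answer:")[-1].strip()
        match = re.search(r"Action:\s*calculator\[(.+?)\]", reply)
        if match:  # run the requested tool and feed the observation back in
            transcript += f"Observation: {TOOLS['calculator'](match.group(1))}\n"
    return transcript

print(react("What is 17 * 24 + 3?"))
```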

Instruction Data Generation

  • APE: LARGE LANGUAGE MODELS ARE HUMAN-LEVEL PROMPT ENGINEERS ⭐
  • SELF-INSTRUCT: Aligning Language Model with Self Generated Instructions ⭐
  • iPrompt: Explaining Data Patterns in Natural Language via Interpretable Autoprompting
  • Flipped Learning: Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners
  • Fairness-guided Few-shot Prompting for Large Language Models
  • Instruction induction: From few examples to natural language task descriptions.
  • Baize: An Open-Source Chat Model with Parameter-Efficient Tuning on Self-Chat Data
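
A toy sketch of the self-instruct bootstrapping loop behind several papers above: sample seed instructions, ask a model to propose a new one, filter near-duplicates, and grow the pool. The OpenAI backend and the SequenceMatcher filter are simplifying assumptions (the original pipeline uses ROUGE-based filtering):

```python
# Sketch: self-instruct style bootstrapping of new instructions.
import random
from difflib import SequenceMatcher
import openai  # assumes OPENAI_API_KEY is set; legacy (<1.0) client interface

seed_pool = [
    "Write a short poem about autumn.",
    "Explain the difference between a list and a tuple in Python.",
    "Summarize the plot of Romeo and Juliet in two sentences.",
]

def generate_new_instruction(pool):
    examples = "\n".join(f"- {t}" for t in random.sample(pool, k=min(3, len(pool))))
    prompt = f"Here are some task instructions:\n{examples}\n- "
    reply = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        temperature=1.0,
        max_tokens=60,
    )["choices"][0]["message"]["content"]
    lines = reply.strip().splitlines()
    return lines[0].lstrip("- ").strip() if lines else ""

def is_novel(candidate, pool, threshold=0.7):
    # Crude stand-in for the ROUGE filter: reject near-duplicates of existing items.
    return all(SequenceMatcher(None, candidate, t).ratio() < threshold for t in pool)

for _ in range(5):
    cand = generate_new_instruction(seed_pool)
    if cand and is_novel(cand, seed_pool):
        seed_pool.append(cand)

print("\n".join(seed_pool))
```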

Domain Models

  • BioGPT: Generative Pre-trained Transformer for Biomedical Text Generation and Mining
  • Galactica: A Large Language Model for Science
  • PubMed GPT: A Domain-Specific Large Language Model for Biomedical Text ⭐
  • BloombergGPT: A Large Language Model for Finance
  • ChatDoctor: Medical Chat Model Fine-tuned on LLaMA Model using Medical Domain Knowledge
  • Med-PaLM: Large Language Models Encode Clinical Knowledge [V1, V2] ⭐
  • Augmented Large Language Models with Parametric Knowledge Guiding
  • XuanYuan 2.0: A Large Chinese Financial Chat Model with Hundreds of Billions Parameters

LLM Long-Context Processing

  • Parallel Context Windows for Large Language Models
  • Structured Prompting: Scaling In-Context Learning to 1,000 Examples
  • Su Jianlin (苏剑林), NBCE: Extending the context length an LLM can handle with Naive Bayes
  • Vcc: Scaling Transformers to 128K Tokens or More by Prioritizing Important Tokens
  • Unlimiformer: Long-Range Transformers with Unlimited Length Input
  • Scaling Transformer to 1M tokens and beyond with RMT
  • RECURRENTGPT: Interactive Generation of (Arbitrarily) Long Text
  • TRAIN SHORT, TEST LONG: ATTENTION WITH LINEAR BIASES ENABLES INPUT LENGTH EXTRAPOLATION ⭐

LLM Tuning Practice/Reports

  • BELLE: Exploring the Impact of Instruction Data Scaling on Large Language Models: An Empirical Study on Real-World Use Cases
  • Baize: An Open-Source Chat Model with Parameter-Efficient Tuning on Self-Chat Data
  • A Comparative Study between Full-Parameter and LoRA-based Fine-Tuning on Chinese Instruction Data for Large LM
  • Exploring ChatGPT’s Ability to Rank Content: A Preliminary Study on Consistency with Human Preferences
  • Towards Better Instruction Following Language Models for Chinese: Investigating the Impact of Training Data and Evaluation
  • LIMA: Less Is More for Alignment ⭐

Other Prompt Engineering

  • Generated Knowledge Prompting for Commonsense Reasoning
  • In-Context Instruction Learning
  • PROMPTING GPT-3 TO BE RELIABLE

Multimodal

  • InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning
  • Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models
  • PaLM-E: An Embodied Multimodal Language Model
