Stars
⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training (EMNLP 2024)
Unofficial implementation of NAT (Neighborhood Attention Transformer). https://arxiv.org/pdf/2204.07143.pdf
Examples of using sparse attention, as in "Generating Long Sequences with Sparse Transformers"
Chinese news headline generation with GPT-2, with extremely detailed code comments.
Multi-task learning using uncertainty to weigh losses for scene geometry and semantics; auxiliary tasks in multi-task learning
DeepIE: Deep Learning for Information Extraction
NLP relation extraction: sequence labeling, cascaded pointer networks, Multi-head Selection, and Deep Biaffine Attention
An implementation of the EDA paper for Chinese corpora: an EDA data augmentation tool for Chinese text, NLP data augmentation, and paper reading notes.
Keras implementation of DGCNN for reading comprehension
Fine-tuning GPT-2 Small for Question Answering
Matplotlib styles for scientific plotting
A PyTorch-based toolkit for natural language processing
ALBERT model pretraining and fine-tuning using TF 2.0
This toolkit was designed for the fast and efficient development of modern machine comprehension models, including both published models and original prototypes.
Solving algorithm problems comes down to patterns; labuladong is all you need! English version supported! Crack LeetCode, not only how, but also why.