Add Support for DeepSeek R1 Distill Llama 8B and DeepSeek Code Instruct 6.7B #3332

RonalddMatias · 2025-02-11T17:30:10Z

Description

This pull request adds support for the DeepSeek R1 Distill Llama 8B and DeepSeek Code Instruct 6.7B models, both of which are open-source and available on Hugging Face. They expand the options available for NLP and code generation tasks.

Main Changes

Added DeepSeek R1 Distill Llama 8B and DeepSeek Code Instruct 6.7B to the list of supported models.
Updated configuration files to accommodate the specific parameters of these new models.
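
As a rough illustration of what registering the two models amounts to, here is a minimal Python sketch. It does not reproduce this repository's configuration schema: the `ModelEntry` class, its fields, the `SUPPORTED_MODELS` list, and the `models_for` helper are all hypothetical, and the Hugging Face repository IDs are assumptions not stated in this PR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelEntry:
    """Hypothetical registry record; not the repository's actual schema."""
    name: str     # display name as used in this PR
    hf_repo: str  # assumed Hugging Face repository ID
    domain: str   # "nlp" or "code"


# Hypothetical list of supported models after this change.
SUPPORTED_MODELS = [
    ModelEntry(
        name="DeepSeek R1 Distill Llama 8B",
        hf_repo="deepseek-ai/DeepSeek-R1-Distill-Llama-8B",  # assumed ID
        domain="nlp",
    ),
    ModelEntry(
        name="DeepSeek Code Instruct 6.7B",
        hf_repo="deepseek-ai/deepseek-coder-6.7b-instruct",  # assumed ID
        domain="code",
    ),
]


def models_for(domain: str) -> list[str]:
    """Return the display names of all registered models for a domain."""
    return [m.name for m in SUPPORTED_MODELS if m.domain == domain]
```

Calling `models_for("code")` on this sketch returns only the code-generation model, which mirrors how the two additions target different task families.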

Benchmarks Executed

The DeepSeek R1 Distill Llama 8B model was evaluated on NLP tasks such as ENEM Challenge and TweetSent, while the DeepSeek Code Instruct 6.7B model was tested on HumanEval and APPS for code generation. These models demonstrated competitive performance within their respective domains.

By adding these models, we enhance flexibility in choosing state-of-the-art solutions for NLP and code generation tasks.

yifanmai

Looks good, thanks!

RonalddMatias added 2 commits February 5, 2025 11:36

add DeepSeekR1-Distill-Llama-8B

d37c0c6

add DeepSeek Coder 7B Instruct

e53b51f

yifanmai approved these changes Feb 11, 2025

yifanmai merged commit 4ce5078 into stanford-crfm:main Feb 11, 2025
8 checks passed
