
Using native chat_template from tokenizer config in chat_template strategy #1689

Closed
chiragjn opened this issue Jun 7, 2024 · 14 comments
Labels
enhancement New feature or request

Comments

@chiragjn
Contributor

chiragjn commented Jun 7, 2024

⚠️ Please check that this feature request hasn't been suggested before.

  • I searched previous Ideas in Discussions and didn't find any similar feature requests.
  • I searched previous Issues and didn't find any similar feature requests.

🔖 Feature description

Currently, when using type: chat_template, we have to choose a chat template from the set hardcoded in the axolotl codebase (defaults to chatml):

https://github.com/OpenAccess-AI-Collective/axolotl/blob/a82a7115224b7aef14301387a11ad4729fd6ca52/src/axolotl/prompt_strategies/chat_template.py#L99-L102

https://github.com/OpenAccess-AI-Collective/axolotl/blob/a82a7115224b7aef14301387a11ad4729fd6ca52/src/axolotl/prompt_strategies/chat_template.py#L115-L126

https://github.com/OpenAccess-AI-Collective/axolotl/blob/a82a7115224b7aef14301387a11ad4729fd6ca52/src/axolotl/utils/chat_templates.py#L21-L29

I would like to use the tokenizer's native tokenizer.chat_template, as that would let the model start fine-tuning from a familiar place.
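
For illustration, a minimal config sketch of the current behavior (dataset path hypothetical; chatml is the default the code falls back to):

    chat_template: chatml              # must be one of the templates hardcoded in chat_templates.py
    datasets:
      - path: my_org/my_chat_dataset   # hypothetical dataset
        type: chat_template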

✔️ Solution

I would like to be able to pass chat_template as None, in which case it should be picked up from the tokenizer; if tokenizer.chat_template is missing, it should raise an error or fall back to chatml.
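
A sketch of the proposed config (illustrative only; nothing here is merged yet):

    # chat_template left unset/None:
    # use tokenizer.chat_template if present,
    # otherwise raise an error or fall back to chatml
    chat_template: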

I can work on this; it seems like a small change.

❓ Alternatives

No response

📝 Additional Context

No response

Acknowledgements

  • My issue title is concise, descriptive, and in title casing.
  • I have searched the existing issues to make sure this feature has not been requested yet.
  • I have provided enough information for the maintainers to understand and evaluate this request.

Fixed by #1970

@chiragjn chiragjn added the enhancement New feature or request label Jun 7, 2024
@chiragjn
Contributor Author

See truefoundry@5ba183d for a rough implementation.
I plan to submit this as a PR soon.

@interstellarninja

Hi, I'm trying to fine-tune Gemma 2 for function calling. Our tool definitions are in the system prompt, but the Gemma chat template doesn't support the system role. How can we update the chat_template to support the system role?

jinja2.exceptions.TemplateError: System role not supported
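
One possible workaround, assuming the custom-template support proposed in the next comment lands: override the template with Jinja that folds the system message into the first user turn. This is a minimal sketch only; the option names are hypothetical and Gemma's exact control tokens should be double-checked against its tokenizer config:

    chat_template: jinja    # hypothetical option, per the proposal in the next comment
    chat_template_jinja: |
      {{ bos_token }}{% set ns = namespace(sys='') %}
      {%- for m in messages -%}
        {%- if m['role'] == 'system' -%}
          {%- set ns.sys = m['content'] -%}
        {%- else -%}
          {%- set role = 'model' if m['role'] == 'assistant' else m['role'] -%}
          {{ '<start_of_turn>' + role + '\n' }}
          {%- if ns.sys and role == 'user' -%}
            {{ ns.sys + '\n\n' }}{% set ns.sys = '' %}
          {%- endif -%}
          {{ m['content'] }}{{ '<end_of_turn>\n' }}
        {%- endif -%}
      {%- endfor -%}
      {%- if add_generation_prompt %}{{ '<start_of_turn>model\n' }}{% endif %}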

@fozziethebeat
Contributor

Thinking on this, I would suggest two ways to generalize the current chat template handling:

  1. An explicit string to use the tokenizer's default; default would probably work best, forcing the code to leave the chat template as is (and throw an error if the tokenizer doesn't have one).
  2. A way to embed a custom chat template that isn't in the codebase. A value such as jinja could select this, paired with a new config option (chat_template_jinja would make sense). Then, with chat_template: jinja, a user could supply a custom Jinja override for the tokenizer, and the code would embed it in the tokenizer so the final saved tokenizer carries the new format. This is pretty much what models like meta-llama/Meta-Llama-Guard-2-8B have done, and it makes calling them much easier. (Both options are sketched below.)
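
For illustration, the two options might look like this in a config (both value names are proposals at this point, not merged behavior; a config would use one or the other):

    # Option 1: keep the template the tokenizer already ships with
    chat_template: default

    # Option 2: override with a custom Jinja template that gets saved into the tokenizer
    chat_template: jinja
    chat_template_jinja: "{% for message in messages %}...{% endfor %}"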

@fozziethebeat
Contributor

The code fix is, I think, as simple as extending this function to take the full config object, check the two fields, and return the final Jinja template. The few primary call paths would then just take the string and naively insert it into the tokenizer.

We could also fold the system prompt change for SFT into the same function to consolidate the logic.
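
Concretely, the extended function might resolve the final template from config fields like these (field names are assumptions based on the proposal above, not confirmed options):

    chat_template: jinja                  # or: default, chatml, ...
    chat_template_jinja: "{{ bos_token }}{% for message in messages %}...{% endfor %}"
    default_system_message: You are a helpful assistant.   # hypothetical field for the SFT system-prompt piece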

@chiragjn
Contributor Author

Sure, I can make this change.

@fozziethebeat
Contributor

I'm actually testing the changes in a branch of mine: https://github.com/fozziethebeat/axolotl/tree/fozziethebeat-flexible-chat-template

@chiragjn
Contributor Author

chiragjn commented Jul 30, 2024

Oh, I didn't realize; I'm only just reading this comment 😅
I ended up making the changes on my open PR too.

@fozziethebeat
Contributor

That looks to be functionally equivalent, great! I'm a bit too busy to tidy mine up for merging, so if you can get yours merged we'll have a great answer.

@dacox

dacox commented Aug 29, 2024

👋 @fozziethebeat @chiragjn Just chiming in as I'm starting to get my hands dirty learning how to use axolotl. I've been playing with meta-llama/Meta-Llama-3-8B-Instruct and reading about chat templates in the transformers docs, and I'm realizing I don't really know what to put in my axolotl config 😕

I'm hoping to have essentially a drop-in replacement for the LoRA fine-tunes we make using OpenAI.

@fozziethebeat
Contributor

The work in this request adds a bit more flexibility, but for Llama 3 you can most likely use this config and just change a few parts of the dataset.

I would suggest looking at it and the dataset I made. You should see the relationship between the dataset config and how the dataset itself actually looks.

I personally always upload things to Hugging Face, so that's one of the easier ways to get rolling, but I know the JSONL reader works well too; it'd just require changing the dataset stanza a tiny bit (see the sketch below).
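
A sketch of a local-JSONL dataset stanza (path and field names illustrative; double-check against the axolotl dataset docs):

    datasets:
      - path: ./data/train.jsonl     # hypothetical local file
        ds_type: json
        type: chat_template
        field_messages: messages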

@dacox

dacox commented Aug 30, 2024

@fozziethebeat thanks! Yeah, I actually found those examples yesterday. I can train on your dataset no problem, but I am having issues with my dataset, which includes a system message.

I added it as a role to the axolotl config, but I believe the problem lies in the hardcoded chat_template in the axolotl config (at least that's my hypothesis!).

I suspect it is different from the chat template on the transformers tokenizer.

@fozziethebeat
Contributor

To include system, you probably need to rewrite

    roles:
      user:
        - user
      assistant:
        - assistant

to be

    roles:
      system:
        - system
      user:
        - user
      assistant:
        - assistant

You can actually likely delete the whole roles section, as the above is the default, and it should support the system, user, and assistant roles.
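
For reference, a record with a system message in such a dataset might look like this (assuming the default messages field; JSONL keeps each record on one line):

    {"messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Hi!"}, {"role": "assistant", "content": "Hello! How can I help?"}]}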

@dkarthicks27

dkarthicks27 commented Oct 17, 2024

I have a dataset that has a system prompt, but it's the same for the entire dataset.
Is there a way to add the system prompt to the config file rather than to every record in the dataset, since it never changes?

@chiragjn
Contributor Author

Solved by #1970
