[Tracing] Update ignored functions list #1599
Conversation
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review. Note: this label is required to run the full testing suite; please add it only once the PR is code complete and local testing has been performed.
Summary of Changes
Hello @kylesayrs, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request addresses a critical tracing issue encountered with Llama models by updating the application's configuration. It specifically enhances the list of function names that are ignored during the tracing process, particularly those involved in the creation and manipulation of causal attention masks. This change ensures compatibility with recent updates in external libraries, thereby enabling successful and accurate tracing of Llama models.
Highlights
- Tracing Configuration Update: Expanded the `tracing_ignore` list within the `DatasetArguments` class in `src/llmcompressor/args/dataset_arguments.py`. This update adds several function names related to causal mask generation (e.g., `create_causal_mask`, `make_causal_mask`, `get_causal_mask`, `mask_function`, `_prepare_4d_causal_attention_mask`, `_prepare_fsmt_decoder_inputs`, `_prepare_4d_causal_attention_mask_with_cache_position`) to the list of functions to be ignored during tracing.
- Llama Model Tracing Fix: The primary purpose of this change is to fix tracing issues for Llama models, specifically addressing a recent change in the `huggingface/transformers` library where the function name for generating causal masks was updated. By ignoring these new function names, Llama models can now be successfully traced.
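The ignore-list extension described above might look roughly like the following sketch. The `tracing_ignore` field name, the `DatasetArguments` class, and the causal-mask function names come from this PR; the non-mask default entries are placeholders, not the project's real list:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DatasetArguments:
    # Illustrative sketch: function names skipped during tracing.
    # The causal-mask entries are the ones this PR adds; the first
    # two defaults are placeholders.
    tracing_ignore: List[str] = field(
        default_factory=lambda: [
            "config",
            "self",
            # causal-mask functions added by this PR
            "create_causal_mask",
            "make_causal_mask",
            "get_causal_mask",
            "mask_function",
            "_prepare_4d_causal_attention_mask",
            "_prepare_fsmt_decoder_inputs",
            "_prepare_4d_causal_attention_mask_with_cache_position",
        ]
    )


args = DatasetArguments()
print("create_causal_mask" in args.tracing_ignore)  # → True
```

Because the field is a mutable list, `default_factory` is required here; a plain list default would be shared across all instances of the dataclass.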
Code Review
The pull request updates the `tracing_ignore` list in `DatasetArguments` to include additional functions related to causal masks. This change aims to fix tracing for Llama models by ignoring specific functions that create causal masks. I suggested using a set instead of a list for `tracing_ignore` to improve performance.
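The reviewer's set-versus-list suggestion comes down to lookup cost: membership tests against a `set` are O(1) on average, while a `list` scans every element. A minimal illustration (function names taken from the PR, variable names illustrative):

```python
# Membership checks during tracing happen once per traced call, so
# lookup cost can add up for long ignore lists.
tracing_ignore_list = [
    "create_causal_mask",
    "make_causal_mask",
    "get_causal_mask",
    "mask_function",
]
tracing_ignore_set = set(tracing_ignore_list)

# Both forms give the same answer; the set scales better.
name = "make_causal_mask"
assert (name in tracing_ignore_list) == (name in tracing_ignore_set)
print(name in tracing_ignore_set)  # → True
```

For a list this short the difference is negligible; the suggestion matters mainly if the ignore list keeps growing or is consulted very frequently.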
c8d7323
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
The transformers tests still seem to be failing; is that expected? Changes look good otherwise.
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
## Prerequisites

* #1599

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Co-authored-by: Dipika Sikka <dipikasikka1@gmail.com>
## Purpose

* Fix tracing for model definitions introduced as part of `transformers==4.53`
* Resolves vllm-project#1603

## Background

In the latest transformers release, a change landed which renamed the function that generates the causal mask: huggingface/transformers#37866

## Changes

* Extend the list of function names to ignore during tracing, specifically targeting functions which create causal masks
* Update the debugger tool to use the ignore list from `DatasetArguments`
* Update the Tracer to skip the masking functions as part of autowrapping any functions which were not caught by the autowrapper

## Testing

* `tests/llmcompressor/transformers/tracing/test_models.py` now passes with the latest `transformers==4.53`

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
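The "skip the masking functions during autowrapping" change can be sketched as a name-based filter applied while collecting functions to auto-wrap for tracing. Everything below is illustrative: `filter_autowrap_candidates` and the candidate names are hypothetical stand-ins, not the real Tracer API.

```python
# Hedged sketch: before auto-wrapping module-level functions for
# tracing, drop any whose name appears in the ignore list so the
# tracer steps over them instead of tracing into them.
IGNORE = {
    "create_causal_mask",
    "make_causal_mask",
    "_prepare_4d_causal_attention_mask_with_cache_position",
}


def filter_autowrap_candidates(candidates):
    """Return only the candidate function names that should still be auto-wrapped."""
    return [name for name in candidates if name not in IGNORE]


candidates = ["create_causal_mask", "rotate_half", "apply_rotary_pos_emb"]
print(filter_autowrap_candidates(candidates))
# → ['rotate_half', 'apply_rotary_pos_emb']
```

The real implementation lives in the project's Tracer and operates on callables rather than bare strings, but the filtering idea is the same.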