track RAM & save measurements during compile and run #109

Open · wants to merge 2 commits into main

Conversation

quic-morteza (Contributor) commented:

To let users keep track of their measurements, this PR adds code that logs the relevant arguments and measurements, along with memory (RAM) usage during compile/runtime, to a CSV file. For example, running the following command twice:

python -m QEfficient.cloud.infer --model_name gpt2 --batch_size 4 --prompt_len 64 --ctx_len 1024 --generation_len 512 --mxfp6 --num_cores 16 --device_group [0] --prompt "My name is|My name is|My name is|My name is" --benchmark

generates gpt2_benchmarking.csv in the working directory, which stores the logged arguments and measurements:
[screenshot of the generated CSV contents]
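For reference, a minimal sketch of what this kind of CSV logging could look like; the helper name, column set, and use of psutil here are illustrative assumptions, not the actual code in this PR:

```python
import csv
import os
import time

import psutil  # assumed here for process-level RAM measurement


def log_benchmark_row(csv_path: str, stage: str, args: dict, elapsed_s: float) -> None:
    """Append one row of CLI arguments, elapsed time, and current RAM usage to a CSV file."""
    ram_mb = psutil.Process().memory_info().rss / (1024 * 1024)
    row = {**args, "stage": stage, "time_s": round(elapsed_s, 3), "ram_mb": round(ram_mb, 1)}
    write_header = not os.path.exists(csv_path)
    with open(csv_path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(row.keys()))
        if write_header:
            writer.writeheader()
        writer.writerow(row)


# Hypothetical usage around a compile step:
start = time.perf_counter()
# ... run model compilation here ...
log_benchmark_row(
    "gpt2_benchmarking.csv",
    stage="compile",
    args={"model_name": "gpt2", "batch_size": 4, "prompt_len": 64, "ctx_len": 1024},
    elapsed_s=time.perf_counter() - start,
)
```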

@ochougul (Contributor) commented:

@quic-rishinr please review

@quic-rishinr (Contributor) left a comment:


Hi @quic-morteza, could you please implement the benchmarking as two separate decorator functions? Each decorator should accept a custom_name: str (to specify stages such as compilation, execution, etc.) and a benchmark: bool flag. One decorator should capture execution time, and the other should capture peak memory usage. This approach will help us scale without adding stat-capturing logic to each module.
Additionally, please move the benchmark logging out of infer; the infer function should not contain any benchmarking-specific logic.
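A minimal sketch of the kind of decorators being requested; the function names, the CSV handling, and the use of tracemalloc (which tracks Python-level allocations only, so a process-level RSS measurement could be swapped in) are illustrative assumptions, not the project's actual API:

```python
import csv
import functools
import os
import time
import tracemalloc


def _append_row(csv_path: str, stage: str, metric: str, value: float) -> None:
    """Append a single (stage, metric, value) row, writing a header on first use."""
    new_file = not os.path.exists(csv_path)
    with open(csv_path, "a", newline="") as f:
        writer = csv.writer(f)
        if new_file:
            writer.writerow(["stage", "metric", "value"])
        writer.writerow([stage, metric, round(value, 3)])


def capture_time(custom_name: str, benchmark: bool = False):
    """Decorator that records wall-clock time of the wrapped stage when benchmarking is enabled."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if not benchmark:
                return func(*args, **kwargs)
            start = time.perf_counter()
            result = func(*args, **kwargs)
            _append_row("benchmarking.csv", custom_name, "time_s", time.perf_counter() - start)
            return result
        return wrapper
    return decorator


def capture_peak_memory(custom_name: str, benchmark: bool = False):
    """Decorator that records peak Python memory allocation of the wrapped stage."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if not benchmark:
                return func(*args, **kwargs)
            tracemalloc.start()
            try:
                result = func(*args, **kwargs)
                _, peak = tracemalloc.get_traced_memory()
            finally:
                tracemalloc.stop()
            _append_row("benchmarking.csv", custom_name, "peak_mem_mb", peak / (1024 * 1024))
            return result
        return wrapper
    return decorator


# Hypothetical usage on a compilation entry point:
@capture_time("compilation", benchmark=True)
@capture_peak_memory("compilation", benchmark=True)
def compile_model():
    ...
```

Keeping the CSV writing inside the decorators means call sites like infer only need the decorator applied, which matches the request to keep benchmarking-specific logic out of the modules themselves.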

@quic-rishinr (Contributor) commented:

@quic-morteza any progress on the above request?
