
[python] Update lmi_dist warmup logic #1367

Merged · 1 commit · Dec 6, 2023

Conversation

@xyang16 (Contributor) commented Dec 5, 2023:

Description

Updates the lmi_dist warmup logic so that the prefill token budget read from max_rolling_batch_prefill_tokens defaults to 4096 instead of the -1 sentinel.

  • If this change is a backward incompatible change, why must this change be made?
  • Interesting edge cases to note here

@xyang16 xyang16 requested review from zachgk, frankfliu and a team as code owners December 5, 2023 22:56
self.properties.get("max_rolling_batch_prefill_tokens", -1))
self.model.warmup(batch_size, max_batch_prefill_tokens)
max_prefill_tokens = int(
self.properties.get("max_rolling_batch_prefill_tokens", 4096))
Contributor:
The purpose of max_rolling_batch_prefill_tokens has changed. Do we still want to keep this concept? And why do we now need to introduce max_prefill_tokens? Can we just set it to some small value like 512 or 1024?

Contributor:

And why are we starting to expose this externally? Can this be done inside lmi-dist?

@xyang16 (Author):

This was external before.

Contributor:

Why do we set this to such a small value as 4096? Does that have an impact on the total number of tokens we can support? On the other hand, if a customer sets this value very high, say 32000, will that impact model loading?
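For intuition on both halves of this question, a rough back-of-envelope sketch; the scaling assumption (warmup memory grows linearly with the prefill budget via the KV cache) and all the model shape numbers below are mine, not from the PR:

```python
# Back-of-envelope KV-cache estimate for a 7B-class model; all shape
# numbers here are illustrative assumptions, not from the PR.
def kv_cache_bytes(prefill_tokens, layers=32, heads=32, head_dim=128,
                   dtype_bytes=2):
    # 2x accounts for separate key and value tensors per layer.
    return 2 * layers * heads * head_dim * dtype_bytes * prefill_tokens

print(kv_cache_bytes(4096) / 2**30)   # 2.0 GiB reserved at warmup
print(kv_cache_bytes(32000) / 2**30)  # ~15.6 GiB, felt at model-load time
```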

@xyang16 xyang16 force-pushed the lmi_dist branch 2 times, most recently from 72a0134 to 70f7204 on December 5, 2023 23:37
```diff
@@ -773,6 +773,7 @@ def build_lmi_dist_model(model):
     )
     options = lmi_dist_model_list[model]
     options["engine"] = "MPI"
+    options["option.mpi_mode"] = "true"
```
Contributor:

Is this not needed, given the engine is MPI?

@xyang16 (Author) commented Dec 6, 2023:

I got an error without adding mpi_mode.
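A plausible explanation, sketched below: these test options are serialized into the model's serving.properties, where option.mpi_mode is read as its own key, so setting engine=MPI alone would leave it unset. The write_properties helper is hypothetical, not part of this PR.

```python
# Hypothetical helper showing how the options above could be written out as
# serving.properties entries; this helper itself is not part of the PR.
def write_properties(options, path="serving.properties"):
    with open(path, "w") as f:
        for key, value in options.items():
            f.write(f"{key}={value}\n")

write_properties({"engine": "MPI", "option.mpi_mode": "true"})
# Produces:
#   engine=MPI
#   option.mpi_mode=true
```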

@xyang16 xyang16 merged commit a39e1d1 into deepjavalibrary:master Dec 6, 2023
8 checks passed