langchain_huggingface: Fix multiple GPU usage bug in from_model_id function #23628

Open. Wants to merge 15 commits into base: master. Changes from 3 commits.
@@ -96,7 +96,14 @@
"Could not import transformers python package. "
"Please install it with `pip install transformers`."
)

if device_map is not None:
if device is not None:
logger.warning(
Member:

Suggested change:
- logger.warning(
+ raise ValueError(

@kenchanLOL (Author) commented on Sep 12, 2024:

If we simply escalate this from a warning to an error, it would always be triggered whenever `device_map` is set, because the default value of `device` is -1 (CPU) in the current implementation. I checked the transformers documentation and realized that the default value of `device` for `pipeline` changed from -1 to None. I tested both values and they behave the same way, loading the model onto the CPU; the only difference is the log message printed to the terminal:

pipe = HuggingFacePipeline.from_model_id(model_id="microsoft/phi-2", task="text-generation", device=None)
# message: Hardware accelerator e.g. GPU is available in the environment, but no `device` argument is passed to the `Pipeline` object. Model will be on CPU.

pipe = HuggingFacePipeline.from_model_id(model_id="microsoft/phi-2", task="text-generation", device=-1)
# message: Device has 1 GPUs available. Provide device={deviceId} to `from_model_id` to use available GPUs for execution. deviceId is -1 (default) for CPU and can be a positive integer associated with CUDA device id.

Although the log message for device=-1 looks better, -1 is a legacy value that could cause conflicts; updating the default to None aligns the behavior with current transformers practice and prevents unexpected errors in the future. So I think we should also change the default value of `device` (line 77) when changing this from a warning to an error.
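
A minimal sketch of how the two changes could fit together, assuming the default moves to `device: Optional[int] = None` (the signature below mirrors `from_model_id` but is illustrative, not the merged code):

from typing import Any, Optional

def from_model_id(
    model_id: str,
    task: str,
    device: Optional[int] = None,  # assumed new default; was -1 (CPU)
    device_map: Optional[str] = None,
    **kwargs: Any,
) -> Any:
    if device_map is not None and device is not None:
        # Safe to raise now: `device` is only non-None when the caller
        # set it explicitly, so the error no longer fires on the default.
        raise ValueError(
            "Both `device` and `device_map` are specified. "
            "Please remove `device` and keep `device_map`."
        )
    ...

With None as the default, a plain from_model_id(..., device_map="auto") call no longer trips the check, which is the point of the comment above.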

"Both `device` and `device_map` are specified. `device` will override `device_map`. You"

Check failure on line 102 in libs/partners/huggingface/langchain_huggingface/llms/huggingface_pipeline.py (GitHub Actions / cd libs/partners/huggingface / make lint #3.12, Ruff E501): langchain_huggingface/llms/huggingface_pipeline.py:102:89: Line too long (108 > 88)
" will most likely encounter unexpected behavior. Please remove `device` and keep `device_map`."

Check failure on line 103 in libs/partners/huggingface/langchain_huggingface/llms/huggingface_pipeline.py (GitHub Actions / cd libs/partners/huggingface / make lint #3.12, Ruff E501): langchain_huggingface/llms/huggingface_pipeline.py:103:89: Line too long (116 > 88)
)
model_kwargs["device_map"] = device_map
device = None
Member:

Suggested change:
- device = None

@kenchanLOL (Author) commented on Sep 12, 2024:

Same reason as in the comment above: this line was only a temporary fix. If the default value of `device` is changed to None, this line is no longer needed.

_model_kwargs = model_kwargs or {}
tokenizer = AutoTokenizer.from_pretrained(model_id, **_model_kwargs)
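
The two Ruff E501 failures above flag the new warning message; one way it could be wrapped to stay under the 88-character limit (a sketch, not the committed fix):

logger.warning(
    "Both `device` and `device_map` are specified. "
    "`device` will override `device_map`. "
    "You will most likely encounter unexpected behavior. "
    "Please remove `device` and keep `device_map`."
)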

@@ -219,7 +226,6 @@
model=model,
tokenizer=tokenizer,
device=device,
Member:

Suggested change:
- device=device,
+ device=device,
+ device_map=device_map,

Would it make sense to keep this passthrough in case transformers uses it in the future?

@kenchanLOL (Author):

Similar reason as above. There is also a parameter validation inside transformers' `pipeline` that checks whether both `device` and `device_map` are set. It would always be triggered, because the current default value of `device` is -1 and `device_map` is also not None (either "auto" or a dictionary). So we can't simply keep the passthrough without also considering the value of the `device` parameter.
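
To make the conflict concrete, a hypothetical reproduction of what `from_model_id` currently forwards (the exact transformers behavior varies by version; `microsoft/phi-2` is just the model from the examples above):

from transformers import pipeline

# The default device=-1 is forwarded together with an explicit
# device_map, which triggers transformers' own "both `device` and
# `device_map`" validation inside pipeline().
try:
    pipe = pipeline(
        "text-generation",
        model="microsoft/phi-2",
        device=-1,
        device_map="auto",
    )
except ValueError as err:
    # Newer transformers releases reject the combination outright;
    # older ones only log a warning, in which case nothing is raised.
    print(err)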

device_map=device_map,
batch_size=batch_size,
model_kwargs=_model_kwargs,
**_pipeline_kwargs,
@@ -262,7 +268,6 @@
text_generations: List[str] = []
pipeline_kwargs = kwargs.get("pipeline_kwargs", {})
skip_prompt = kwargs.get("skip_prompt", False)
kenchanLOL marked this conversation as resolved.

for i in range(0, len(prompts), self.batch_size):
batch_prompts = prompts[i : i + self.batch_size]
