
Initialize tensors with zeros #3660



Open

pbielak wants to merge 1 commit into main

Conversation

pbielak commented Jun 27, 2025

What does this PR do?

When initializing tensors with `torch.empty`, the values are random, often large, and near the dtype range limits. The `initialize_tensors` function creates the tensors on CPU. When moving them to the destination device (using the `send_to_device` function), some devices will throw an error if the dtype is not supported and is implicitly downcast. For example, on Gaudi in lazy mode the int64 dtype is not enabled by default, so if we create an empty int64 tensor and move it to "hpu", we will often get the following error message:

RuntimeError: Error when trying to cast Long to Int, Input values range [9223372036854775807, 9223372036854775807] exceeds Int range [-2147483648, 2147483647]

This commit changes the default initialization value of tensors created by `initialize_tensors` to zero, by replacing `torch.empty` with `torch.zeros`.
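A minimal sketch of the distinction on plain CPU PyTorch (the overflow itself only surfaces on devices that implicitly downcast; here the first tensor's contents are merely arbitrary):

import torch

# torch.empty returns uninitialized memory; for int64 the garbage values
# can land near the dtype limits, far outside the int32 range.
t = torch.empty(2, 3, dtype=torch.int64)

# torch.zeros pays a small initialization cost, but every value survives
# a downcast to int32 unchanged.
z = torch.zeros(2, 3, dtype=torch.int64)
assert z.to(torch.int32).long().equal(z)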

pbielak marked this pull request as ready for review July 1, 2025 09:44
pbielak (Author) commented Jul 4, 2025

Not sure whom to tag here, but maybe @IlyasMoutawwakil can help/review this one

IlyasMoutawwakil (Member) commented Jul 4, 2025

yeah, this is not an initialisation problem; it is HPU-specific, due to the way it supports (and at the same time doesn't support) int64.
`torch.empty` is faster than `torch.zeros`, so it doesn't make sense to penalise all accelerators with `torch.zeros`.
why not fix `torch.empty` in SynapseAI / torch+hpu?
or else something like this, applied only in the case of hpu:

original_torch_empty = torch.empty

def patched_torch_empty(*args, **kwargs):
    # Allocate as usual, then zero-fill so the tensor carries no garbage
    # values that could overflow during an implicit downcast on the device.
    tensor = original_torch_empty(*args, **kwargs)
    tensor.zero_()
    return tensor

torch.empty = patched_torch_empty
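For reference, a sketch of how that patch could be scoped so it only fires for HPU targets (the context manager and its name are illustrative, not an existing accelerate API):

import contextlib
import torch

@contextlib.contextmanager
def zeroed_torch_empty(device):
    # No-op on every accelerator other than HPU.
    if torch.device(device).type != "hpu":
        yield
        return

    original_torch_empty = torch.empty

    def patched_torch_empty(*args, **kwargs):
        tensor = original_torch_empty(*args, **kwargs)
        tensor.zero_()  # deterministic values survive any implicit downcast
        return tensor

    torch.empty = patched_torch_empty
    try:
        yield
    finally:
        torch.empty = original_torch_empty  # always restore the original

`initialize_tensors` could then run inside `with zeroed_torch_empty(device): ...`, and all other devices would keep the faster uninitialized allocation.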

SunMarc (Member) commented Jul 15, 2025

Ilyas' solution seems better; can you update the PR, @pbielak?
