13 changes: 12 additions & 1 deletion thunder/executors/nvfuserex_impl.py
@@ -71,7 +71,7 @@
# NOTE This impl file is here because nvFuser may not be available, so it's imported conditionally
# by nvfuserex.py when nvFuser is available.

-DIRECT_BINDINGS_SUPPORTED_VERSION = LooseVersion("0.2.34")
+DIRECT_BINDINGS_SUPPORTED_VERSION = LooseVersion("0.2.35")
Collaborator

Is this the minimum version which ships with LruFusionCache?

Collaborator Author

Yes

DTENSOR_SUPPORTED_VERSION = LooseVersion("0.2.28")
if nvfuser_version() >= DIRECT_BINDINGS_SUPPORTED_VERSION:
    import nvfuser_direct as nvfuser
@@ -298,6 +298,17 @@ def multidevice_schedule(fd: FusionDefinition, in_dtensors: list[Proxy]) -> None
in_tv.set_allocation_domain(in_tv.get_loop_domain(), new_contiguity=True)


+# This function wraps nvfuser_direct's LruFusionCache with a version check.
+def FusionCacheDecorator(func: callable):
+    # For legacy bindings, the decorator does nothing.
+    if nvfuser_version() < DIRECT_BINDINGS_SUPPORTED_VERSION:
+        return func
+    from nvfuser_direct import LruFusionCache
+
+    return LruFusionCache(max_fusions=16384)(func)
Collaborator

The default value for max_fusions is 16384, so we can remove the argument here.

Collaborator

qq: will there be an option for setting the cache size? Or would that not be very useful?

Collaborator Author

Usually we pick a reasonable number to avoid out-of-memory issues. We never thought to change it at runtime.
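
For readers following this thread, here is a minimal, self-contained sketch of the version-gated decorator pattern under discussion, with the cache size exposed as a parameter. It is not the nvFuser API: `functools.lru_cache` stands in for `LruFusionCache`, and `MIN_CACHE_VERSION`, `parse_version`, `version_gated_cache`, and `max_entries` are hypothetical names used only for illustration. The simplified version parser handles plain dotted numbers only.

```python
import functools

# Hypothetical stand-in for DIRECT_BINDINGS_SUPPORTED_VERSION in the PR.
MIN_CACHE_VERSION = (0, 2, 35)


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '0.2.35' into (0, 2, 35) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def version_gated_cache(installed: str, max_entries: int = 16384):
    """Return a decorator that adds an LRU cache only for new enough versions."""

    def decorator(func):
        if parse_version(installed) < MIN_CACHE_VERSION:
            # Older version: hand back the function unchanged, mirroring
            # the early `return func` in the PR's decorator.
            return func
        # Assumption: functools.lru_cache stands in for LruFusionCache.
        return functools.lru_cache(maxsize=max_entries)(func)

    return decorator


@version_gated_cache("0.2.35", max_entries=2)
def cached_square(x: int) -> int:
    return x * x


@version_gated_cache("0.2.34")
def plain_square(x: int) -> int:
    return x * x
```

With these definitions, `cached_square` exposes `cache_info()` because it was wrapped, while `plain_square` is the original function object. A bounded `max_entries` caps how many results are retained, which is the same memory concern raised above.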



+@FusionCacheDecorator
def create_fd(
    bsyms: list[BoundSymbol],
    input_descriptors: Sequence[type | tuple[tuple[int, ...], tuple[bool, ...], tuple[int, ...]]],