Multi Pin Bumps across PT/AO/tune/ET #1367


Status: Merged (42 commits, Dec 14, 2024)
Commits (42)
bcdfc54
Bump PyTorch pin to 20241111
Jack-Khuu Nov 12, 2024
a976734
bump to 1112
Jack-Khuu Nov 12, 2024
23b4536
Merge branch 'main' into pinbump1111
Jack-Khuu Nov 12, 2024
6328935
Update install_requirements.sh
Jack-Khuu Nov 13, 2024
7aa96d7
Update install_requirements.sh
Jack-Khuu Nov 14, 2024
4a977a5
Merge branch 'main' into pinbump1111
Jack-Khuu Nov 14, 2024
774ebb6
Update checkpoint.py typo
Jack-Khuu Nov 14, 2024
655dc4a
Merge branch 'main' into pinbump1111
Jack-Khuu Nov 15, 2024
a6cb90c
Update install_requirements.sh
Jack-Khuu Nov 18, 2024
8cb415d
Merge branch 'main' into pinbump1111
Jack-Khuu Nov 18, 2024
f9d0a29
Update install_requirements.sh
Jack-Khuu Nov 18, 2024
c3f18c6
Update install_requirements.sh
Jack-Khuu Nov 19, 2024
5b91d46
Merge branch 'main' into pinbump1111
Jack-Khuu Nov 22, 2024
bde427d
Merge branch 'main' into pinbump1111
Jack-Khuu Dec 2, 2024
7647d52
Bump pins, waiting for nvjit fix
Jack-Khuu Dec 2, 2024
bb6ca2a
Update install_requirements.sh
Jack-Khuu Dec 2, 2024
eb00467
bump tune
Jack-Khuu Dec 2, 2024
673f5ab
fix tune major version
Jack-Khuu Dec 2, 2024
da0a26d
Bump AO pin to pick up import fix
Jack-Khuu Dec 2, 2024
2530e71
misc
Jack-Khuu Dec 3, 2024
1ada559
Update linux_job CI to v2
Jack-Khuu Dec 3, 2024
f58c22e
Update install_requirements.sh PT pin to 1202
Jack-Khuu Dec 4, 2024
2ece601
Vision nightly is delayed
Jack-Khuu Dec 4, 2024
565338b
Bump Cuda version; drop PT version to one with vision nightly
Jack-Khuu Dec 5, 2024
7088e79
Bump to 1205 vision nightly
Jack-Khuu Dec 5, 2024
94aa9a8
Vision nightly 1205 needs 1204 torch(?)
Jack-Khuu Dec 5, 2024
6e54cba
Drop PT version to 1126 (friendly vision version), update devtoolset …
Jack-Khuu Dec 6, 2024
a05683d
Test download toolchain instead of binutils
Jack-Khuu Dec 6, 2024
411cf94
Test removing devtoolset
Jack-Khuu Dec 6, 2024
953a42e
Remove dep on devtoolset 11 that doesn't exist on the new machine
Jack-Khuu Dec 6, 2024
6e8bfb1
Bump ET pin
Jack-Khuu Dec 6, 2024
5a80f5f
Merge branch 'main' into pinbump1111
Jack-Khuu Dec 6, 2024
59e00d5
Test nightly with updated vision
Jack-Khuu Dec 6, 2024
d67eb86
Merge branch 'main' into pinbump1111
Jack-Khuu Dec 7, 2024
aae4eb3
Attempt to account for int4wo packing pt#139611
Jack-Khuu Dec 7, 2024
25da485
Naive gguf int4wo attempt
Jack-Khuu Dec 7, 2024
a9fa27e
Update install_requirements.sh to 1210
Jack-Khuu Dec 10, 2024
bdd2356
Merge branch 'main' into pinbump1111
Jack-Khuu Dec 10, 2024
bfe5826
Update install_requirements.sh to 20241213
Jack-Khuu Dec 13, 2024
02dc6a4
Merge branch 'main' into pinbump1111
Jack-Khuu Dec 13, 2024
dbb090f
Update torchvision minor version to 22
Jack-Khuu Dec 13, 2024
9579f18
Merge branch 'main' into pinbump1111
Jack-Khuu Dec 13, 2024
4 changes: 2 additions & 2 deletions install/install_requirements.sh
@@ -60,10 +60,10 @@ echo "Using pip executable: $PIP_EXECUTABLE"
# NOTE: If a newly-fetched version of the executorch repo changes the value of
# PYTORCH_NIGHTLY_VERSION, you should re-run this script to install the necessary
# package versions.
-PYTORCH_NIGHTLY_VERSION=dev20241002
+PYTORCH_NIGHTLY_VERSION=dev20241112

# Nightly version for torchvision
-VISION_NIGHTLY_VERSION=dev20241002
+VISION_NIGHTLY_VERSION=dev20241112

# Nightly version for torchtune
TUNE_NIGHTLY_VERSION=dev20241010
3 changes: 2 additions & 1 deletion torchchat/cli/builder.py
@@ -402,6 +402,7 @@ def _load_checkpoint(builder_args: BuilderArgs):
os.path.join(builder_args.checkpoint_dir, cp_name),
map_location=builder_args.device,
mmap=True,
+ weights_only=False,
)
)
checkpoint = {}
@@ -706,4 +707,4 @@ def tokenizer_setting_to_name(tiktoken: bool, tokenizers: bool) -> str:
return "TikToken"
if tokenizers:
return "Tokenizers"
return "SentencePiece"
return "SentencePiece"
1 change: 1 addition & 0 deletions torchchat/distributed/checkpoint.py
@@ -96,6 +96,7 @@ def _load_checkpoints_from_storage(
checkpoint_path,
map_location=builder_args.device,
mmap=True,
+ weight_only=False,
Reviewer (Contributor):

Why does it need False? All LLMs should be loadable with weights_only, shouldn't they? (Also, there is no such option as weight_only, or so I hope :P)

Suggested change:
- weight_only=False,
+ weights_only=True,

Author (Jack-Khuu):

Good catch on the typo;

As for setting it to False: I'd rather keep the behavior consistent in a pin bump PR; we can flip it in a separate PR.

)
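
For context on the weights_only discussion above, here is a minimal, hypothetical sketch of the difference between the two settings; the checkpoint path and surrounding code are illustrative and not part of this PR. Newer PyTorch nightlies change the torch.load default to weights_only=True, which is why the PR passes the flag explicitly to keep the previous behavior.

import torch

# weights_only=True restricts unpickling to tensors and other allowlisted types.
# It is safer, but rejects checkpoints that contain arbitrary pickled Python objects.
state_dict = torch.load(
    "model.pth",        # hypothetical checkpoint path
    map_location="cpu",
    mmap=True,
    weights_only=True,
)

# weights_only=False keeps the legacy full-pickle behavior; this pin-bump PR opts
# into it explicitly so behavior does not change alongside the version bumps.
legacy_state_dict = torch.load(
    "model.pth",
    map_location="cpu",
    mmap=True,
    weights_only=False,
)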


4 changes: 2 additions & 2 deletions torchchat/export.py
@@ -122,7 +122,7 @@ def export_for_server(
from executorch.exir.tracer import Value

from torch._export import capture_pre_autograd_graph
-from torch.export import export, ExportedProgram
+from torch.export import export_for_training, ExportedProgram

from torchchat.model import apply_rotary_emb, Attention
from torchchat.utils.build_utils import get_precision
@@ -238,7 +238,7 @@ def _to_core_aten(
raise ValueError(
f"Expected passed in model to be an instance of fx.GraphModule, got {type(model)}"
)
-core_aten_ep = export(model, example_inputs, dynamic_shapes=dynamic_shapes)
+core_aten_ep = export_for_training(model, example_inputs, dynamic_shapes=dynamic_shapes)
Reviewer (Contributor):

Not sure what we are doing here, but shouldn't TorchChat be exporting for inference?

Author (Jack-Khuu):

This was picked up from @tugsbayasgalan's PR migrating away from export(), but export_for_inference does sound more in line with what we want

@tugsbayasgalan Can you share info on the new APIs?

Reply (tugsbayasgalan):

Yep, the intended use of the inference IR is that the user exports to a training IR and then calls run_decompositions() to lower it to the inference IR. In this flow, after core_aten_ep there is a to_edge call which lowers to inference. The export team is moving the IR to a non-functional training IR, so export_for_training will exist as an alias for the official export. After we actually migrate the official export, we will replace this call with export.

if verbose:
logging.info(f"Core ATen graph:\n{core_aten_ep.graph}")
return core_aten_ep
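
To make the export flow described in the review thread above concrete, here is a minimal, hypothetical sketch; TinyModel and its example inputs are illustrative and not taken from torchchat.

import torch
from torch.export import export_for_training

class TinyModel(torch.nn.Module):
    def forward(self, x):
        return torch.nn.functional.relu(x)

example_inputs = (torch.randn(2, 8),)

# Step 1: export to the training IR (export_for_training is described above as an
# eventual alias for the official export()).
training_ep = export_for_training(TinyModel(), example_inputs)

# Step 2: lower to the inference IR via run_decompositions(), as the comment
# describes; in the ExecuTorch path a later to_edge() call performs this lowering.
inference_ep = training_ep.run_decompositions()

print(inference_ep.graph)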