Merged (20 commits):
- 5b9adf7 adding mps support to base handler and regression test (udaij12)
- c3f060a fixed method (udaij12)
- 31c093c mps support (udaij12)
- a102288 fix format (udaij12)
- 1eff31a changes to detection (udaij12)
- 827fa6d testing x86 (udaij12)
- 1d9975e adding m1 check (udaij12)
- c357e0b Merge branch 'master' into mps_m1 (agunapal)
- 29b388e adding test cases (udaij12)
- 7c3e876 Merge branch 'mps_m1' of https://github.com/pytorch/serve into mps_m1 (udaij12)
- 5d45c22 adding test workflow (udaij12)
- 09fb201 modifiying tests (udaij12)
- 1096ab7 removing python tests (udaij12)
- 5d2879b remove workflow (udaij12)
- 5d7f39d removing test config file (udaij12)
- 31e7b00 Merge branch 'master' into mps_m1 (udaij12)
- 325688a adding docs (udaij12)
- 7cefd74 Merge branch 'mps_m1' of https://github.com/pytorch/serve into mps_m1 (udaij12)
- 5575f93 fixing spell check (udaij12)
- 1ead54a lint fix (udaij12)
New documentation file:

@@ -0,0 +1,129 @@
# Apple Silicon Support

## What is supported
* TorchServe CI jobs now include M1 hardware to ensure continued support; see the GitHub [documentation](https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners/about-github-hosted-runners#standard-github-hosted-runners-for-public-repositories) on M1 hosted runners.
  - [Regression Tests](https://github.com/pytorch/serve/blob/master/.github/workflows/regression_tests_cpu.yml)
  - [Regression binaries Test](https://github.com/pytorch/serve/blob/master/.github/workflows/regression_tests_cpu_binaries.yml)
* For [Docker](https://docs.docker.com/desktop/install/mac-install/), ensure Docker Desktop for Apple silicon is installed, then follow the [setup steps](https://github.com/pytorch/serve/tree/master/docker); a quick-start sketch follows this list.
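
As a quick sanity check, something like the following should pull and run the published TorchServe image on an Apple silicon host (a minimal sketch; the `latest` tag and port mappings are assumptions, not taken from this PR):

```
# Pull the official TorchServe image and expose the default
# inference (8080) and management (8081) ports.
docker pull pytorch/torchserve:latest
docker run --rm -it -p 8080:8080 -p 8081:8081 pytorch/torchserve:latest
```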

## Experimental Support

* For GPU jobs on Apple Silicon, [MPS](https://pytorch.org/docs/master/notes/mps.html) is now auto-detected and enabled. To prevent TorchServe from using MPS, set `deviceType: "cpu"` in the model's `model-config.yaml`; a sketch follows this list.
  * This is an experimental feature, and NOT ALL models are guaranteed to work.
* `Number of GPUs` now reports the GPUs available on Apple Silicon.
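
For reference, a minimal `model-config.yaml` that opts a model out of MPS might look like the following; the worker and batch values mirror those used in the pytest added by this PR and are illustrative, not required settings:

```
# TorchServe frontend parameters
minWorkers: 1
batchSize: 4
maxWorkers: 4
# Force CPU even when MPS is available on Apple Silicon.
deviceType: "cpu"
```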

### Testing
* [Pytests](https://github.com/pytorch/serve/tree/master/test/pytest/test_device_config.py) that check for MPS on macOS M1 devices; a handler sketch follows this list.
* Models that have been tested and work: Resnet-18, Densenet161, Alexnet
* Models that have been tested and DO NOT work: MNIST
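
For reference, a custom handler can confirm which device TorchServe selected. The sketch below mirrors the assertion used in the pytest handler added by this PR; the class name is illustrative:

```
from ts.torch_handler.base_handler import BaseHandler

class DeviceCheckHandler(BaseHandler):
    def initialize(self, context):
        super().initialize(context)
        # On Apple Silicon with MPS enabled, the base handler is expected
        # to select the "mps" device unless deviceType: "cpu" is set.
        assert self.get_device().type == "mps"
```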

#### Example Resnet-18 Using MPS On Mac M1 Pro
```
serve % torchserve --start --model-store model_store_gen --models resnet-18=resnet-18.mar --ncs

Torchserve version: 0.10.0
Number of GPUs: 16
Number of CPUs: 10
Max heap size: 8192 M
Python executable: /Library/Frameworks/Python.framework/Versions/3.11/bin/python3.11
Config file: N/A
Inference address: http://127.0.0.1:8080
Management address: http://127.0.0.1:8081
Metrics address: http://127.0.0.1:8082
Model Store:
Initial Models: resnet-18=resnet-18.mar
Log dir:
Metrics dir:
Netty threads: 0
Netty client threads: 0
Default workers per model: 16
Blacklist Regex: N/A
Maximum Response Size: 6553500
Maximum Request Size: 6553500
Limit Maximum Image Pixels: true
Prefer direct buffer: false
Allowed Urls: [file://.*|http(s)?://.*]
Custom python dependency for model allowed: false
Enable metrics API: true
Metrics mode: LOG
Disable system metrics: false
Workflow Store:
CPP log config: N/A
Model config: N/A
2024-04-08T14:18:02,380 [INFO ] main org.pytorch.serve.servingsdk.impl.PluginsManager - Loading snapshot serializer plugin...
2024-04-08T14:18:02,391 [INFO ] main org.pytorch.serve.ModelServer - Loading initial models: resnet-18.mar
2024-04-08T14:18:02,699 [DEBUG] main org.pytorch.serve.wlm.ModelVersionedRefs - Adding new version 1.0 for model resnet-18
2024-04-08T14:18:02,699 [INFO ] main org.pytorch.serve.wlm.ModelManager - Model resnet-18 loaded.
2024-04-08T14:18:02,699 [DEBUG] main org.pytorch.serve.wlm.ModelManager - updateModel: resnet-18, count: 16
...
...
serve % curl http://127.0.0.1:8080/predictions/resnet-18 -T ./examples/image_classifier/kitten.jpg
...
{
  "tabby": 0.40966302156448364,
  "tiger_cat": 0.3467046618461609,
  "Egyptian_cat": 0.1300288736820221,
  "lynx": 0.02391958422958851,
  "bucket": 0.011532187461853027
}
...
```
#### Conda Example

```
(myenv) serve % pip list | grep torch
torch                   2.2.1
torchaudio              2.2.1
torchdata               0.7.1
torchtext               0.17.1
torchvision             0.17.1
(myenv3) serve % conda install -c pytorch-nightly torchserve torch-model-archiver torch-workflow-archiver
(myenv3) serve % pip list | grep torch
torch                   2.2.1
torch-model-archiver    0.10.0b20240312
torch-workflow-archiver 0.2.12b20240312
torchaudio              2.2.1
torchdata               0.7.1
torchserve              0.10.0b20240312
torchtext               0.17.1
torchvision             0.17.1
(myenv3) serve % torchserve --start --ncs --models densenet161.mar --model-store ./model_store_gen/
Torchserve version: 0.10.0
Number of GPUs: 0
Number of CPUs: 10
Max heap size: 8192 M
Config file: N/A
Inference address: http://127.0.0.1:8080
Management address: http://127.0.0.1:8081
Metrics address: http://127.0.0.1:8082
Initial Models: densenet161.mar
Netty threads: 0
Netty client threads: 0
Default workers per model: 10
Blacklist Regex: N/A
Maximum Response Size: 6553500
Maximum Request Size: 6553500
Limit Maximum Image Pixels: true
Prefer direct buffer: false
Allowed Urls: [file://.*|http(s)?://.*]
Custom python dependency for model allowed: false
Enable metrics API: true
Metrics mode: LOG
Disable system metrics: false
CPP log config: N/A
Model config: N/A
System metrics command: default
...
2024-03-12T15:58:54,702 [INFO ] main org.pytorch.serve.wlm.ModelManager - Model densenet161 loaded.
2024-03-12T15:58:54,702 [DEBUG] main org.pytorch.serve.wlm.ModelManager - updateModel: densenet161, count: 10
Model server started.
...
(myenv3) serve % curl http://127.0.0.1:8080/predictions/densenet161 -T examples/image_classifier/kitten.jpg
{
  "tabby": 0.46661922335624695,
  "tiger_cat": 0.46449029445648193,
  "Egyptian_cat": 0.0661405548453331,
  "lynx": 0.001292439759708941,
  "plastic_bag": 0.00022909720428287983
}
```
New test file (test/pytest/test_device_config.py):

@@ -0,0 +1,181 @@
import os
import platform
import shutil
import tempfile
from pathlib import Path
from unittest.mock import patch

import pytest
import requests
import test_utils
from model_archiver import ModelArchiverConfig

CURR_FILE_PATH = Path(__file__).parent
REPO_ROOT_DIR = CURR_FILE_PATH.parent.parent
ROOT_DIR = os.path.join(tempfile.gettempdir(), "workspace")
REPO_ROOT = os.path.join(os.path.dirname(os.path.abspath(__file__)), "../../")
data_file_zero = os.path.join(REPO_ROOT, "test/pytest/test_data/0.png")
config_file = os.path.join(REPO_ROOT, "test/resources/config_token.properties")
mnist_scripts_py = os.path.join(REPO_ROOT, "examples/image_classifier/mnist/mnist.py")

# Handler that asserts the base handler auto-selected the MPS device.
HANDLER_PY = """
from ts.torch_handler.base_handler import BaseHandler

class deviceHandler(BaseHandler):

    def initialize(self, context):
        super().initialize(context)
        assert self.get_device().type == "mps"
"""

MODEL_CONFIG_YAML = """
#frontend settings
# TorchServe frontend parameters
minWorkers: 1
batchSize: 4
maxWorkers: 4
"""

MODEL_CONFIG_YAML_GPU = """
#frontend settings
# TorchServe frontend parameters
minWorkers: 1
batchSize: 4
maxWorkers: 4
deviceType: "gpu"
"""

MODEL_CONFIG_YAML_CPU = """
#frontend settings
# TorchServe frontend parameters
minWorkers: 1
batchSize: 4
maxWorkers: 4
deviceType: "cpu"
"""

@pytest.fixture(scope="module")
def model_name():
    yield "mnist"


@pytest.fixture(scope="module")
def work_dir(tmp_path_factory, model_name):
    return Path(tmp_path_factory.mktemp(model_name))


@pytest.fixture(scope="module")
def model_config_name(request):
    # Indirectly parametrized by the tests below ("cpu", "gpu", or "default").
    def get_config(param):
        if param == "cpu":
            return MODEL_CONFIG_YAML_CPU
        elif param == "gpu":
            return MODEL_CONFIG_YAML_GPU
        else:
            return MODEL_CONFIG_YAML

    return get_config(request.param)

@pytest.fixture(scope="module", name="mar_file_path")
def create_mar_file(work_dir, model_archiver, model_name, model_config_name):
    mar_file_path = work_dir.joinpath(model_name + ".mar")

    model_config_yaml_file = work_dir / "model_config.yaml"
    model_config_yaml_file.write_text(model_config_name)

    # Copy the MNIST model definition into the work dir (not just its path).
    model_py_file = work_dir / "model.py"
    model_py_file.write_text(Path(mnist_scripts_py).read_text())

    handler_py_file = work_dir / "handler.py"
    handler_py_file.write_text(HANDLER_PY)

    config = ModelArchiverConfig(
        model_name=model_name,
        version="1.0",
        serialized_file=None,
        model_file=mnist_scripts_py,  # model_py_file.as_posix(),
        handler=handler_py_file.as_posix(),
        extra_files=None,
        export_path=work_dir,
        requirements_file=None,
        runtime="python",
        force=False,
        archive_format="default",
        config_file=model_config_yaml_file.as_posix(),
    )

    with patch("archiver.ArgParser.export_model_args_parser", return_value=config):
        model_archiver.generate_model_archive()

    assert mar_file_path.exists()

    yield mar_file_path.as_posix()

    # Clean up files
    mar_file_path.unlink(missing_ok=True)

@pytest.fixture(scope="module", name="model_name")
def register_model(mar_file_path, model_store, torchserve):
    """
    Register the model in torchserve
    """
    shutil.copy(mar_file_path, model_store)

    file_name = Path(mar_file_path).name
    model_name = Path(file_name).stem

    params = (
        ("model_name", model_name),
        ("url", file_name),
        ("initial_workers", "1"),
        ("synchronous", "true"),
        ("batch_size", "1"),
    )

    test_utils.reg_resp = test_utils.register_model_with_params(params)

    yield model_name

    test_utils.unregister_model(model_name)

@pytest.mark.skipif(platform.machine() != "arm64", reason="Requires Apple Silicon (arm64)")
@pytest.mark.parametrize("model_config_name", ["gpu"], indirect=True)
def test_m1_device(model_name, model_config_name):
    # With deviceType: "gpu", the handler's MPS assertion passes, the model
    # registers, and the describe call succeeds.
    response = requests.get(f"http://localhost:8081/models/{model_name}")

    print("-----TEST-----")
    print(response.content)
    assert response.status_code == 200, "Describe Failed"


@pytest.mark.skipif(platform.machine() != "arm64", reason="Requires Apple Silicon (arm64)")
@pytest.mark.parametrize("model_config_name", ["cpu"], indirect=True)
def test_m1_device_cpu(model_name, model_config_name):
    # With deviceType: "cpu", the handler's MPS assertion fails during
    # initialization, so registration fails and describe returns 404.
    response = requests.get(f"http://localhost:8081/models/{model_name}")

    print("-----TEST-----")
    print(response.content)
    assert response.status_code == 404, "Describe succeeded unexpectedly"


@pytest.mark.skipif(platform.machine() != "arm64", reason="Requires Apple Silicon (arm64)")
@pytest.mark.parametrize("model_config_name", ["default"], indirect=True)
def test_m1_device_default(model_name, model_config_name):
    # With no deviceType set, MPS is auto-detected and registration succeeds.
    response = requests.get(f"http://localhost:8081/models/{model_name}")

    print("-----TEST-----")
    print(response.content)
    assert response.status_code == 200, "Describe Failed"
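
To run these device checks locally on an Apple silicon machine, an invocation along these lines should work (a sketch; it assumes TorchServe's pytest dependencies are installed and is not a command taken from this PR):

```
python -m pytest test/pytest/test_device_config.py -v
```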