forked from kubeflow/pipelines
* initial pytorch support
* modifying go crd files
* modifying go crd files
* test placeholder
* cleaning up test placeholder file
* updating with cifar10 sample
* updating cifar10 instructions
* correcting docker steps
* adding pytorch doeckerfile
* removing generated data files
* correcting the sample pytorch yaml file
* adding the class file and class name parameters
* adding cifar10 input file and dockerfile
* adding gcs location for model file
* addressing review comments
* simplifying PyTorch interface
* making model class name optional
* fix the comment ordering
* removing model file
* adding default behaviour
1 parent f5a749e, commit 61a35a7
Showing 16 changed files with 3,977 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,112 @@ | ||
## Creating your own model and testing the PyTorch server. | ||
|
||
To test the [PyTorch](https://pytorch.org/) server, first we need to generate a simple cifar10 model using PyTorch. | ||
|
||
```shell | ||
python cifar10.py | ||
``` | ||
You should see output similar to this: | ||
|
||
```shell | ||
Downloading https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz to ./data/cifar-10-python.tar.gz | ||
Failed download. Trying https -> http instead. Downloading http://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz to ./data/cifar-10-python.tar.gz | ||
100.0%Files already downloaded and verified | ||
[1, 2000] loss: 2.232 | ||
[1, 4000] loss: 1.913 | ||
[1, 6000] loss: 1.675 | ||
[1, 8000] loss: 1.555 | ||
[1, 10000] loss: 1.492 | ||
[1, 12000] loss: 1.488 | ||
[2, 2000] loss: 1.412 | ||
[2, 4000] loss: 1.358 | ||
[2, 6000] loss: 1.362 | ||
[2, 8000] loss: 1.338 | ||
[2, 10000] loss: 1.315 | ||
[2, 12000] loss: 1.278 | ||
Finished Training | ||
``` | ||
|
||
Then, we can run the PyTorch server using the trained model and test for predictions. Models can be stored on the local filesystem, in S3-compatible object storage, or in Google Cloud Storage. | ||
|
||
Note: KFServing currently supports PyTorch models saved using the [state_dict method](https://pytorch.org/tutorials/beginner/saving_loading_models.html#saving-loading-model-for-inference), PyTorch's recommended way of saving models for inference. The KFServing interface for PyTorch expects users to upload the model_class_file to the same location as the PyTorch model, and accepts an optional model_class_name as a runtime input. If the model class name is not specified, 'PyTorchModel' is used as the default class name. The current interface may change as it evolves to support PyTorch models saved using other methods. | ||
|
||
```shell | ||
python -m pytorchserver --model_dir ./ --model_name pytorchmodel --model_class_name Net | ||
``` | ||
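
The class-name lookup described in the note can be pictured roughly as follows. This is a minimal sketch, not the actual pytorchserver code: `load_model_class` and the stand-in class file are hypothetical, and only the `PyTorchModel` default comes from the note above.

```python
import importlib.util
import tempfile
from pathlib import Path

DEFAULT_CLASS_NAME = "PyTorchModel"  # default class name stated in the note


def load_model_class(model_class_file, model_class_name=None):
    """Import the given Python file and return the requested class object.

    Hypothetical helper for illustration; falls back to DEFAULT_CLASS_NAME
    when no class name is supplied.
    """
    name = model_class_name or DEFAULT_CLASS_NAME
    spec = importlib.util.spec_from_file_location("user_model", str(model_class_file))
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    try:
        return getattr(module, name)
    except AttributeError:
        raise ValueError(f"{model_class_file} does not define class {name!r}") from None


# Demo with a stand-in class file; the lookup itself does not need torch.
with tempfile.TemporaryDirectory() as model_dir:
    class_file = Path(model_dir) / "cifar10.py"
    class_file.write_text("class Net:\n    pass\n\n\nclass PyTorchModel:\n    pass\n")
    explicit = load_model_class(class_file, "Net")  # like --model_class_name Net
    fallback = load_model_class(class_file)         # no class name given
    print(explicit.__name__, fallback.__name__)
```

With a real model class, a server would presumably instantiate the returned class and restore the weights with `model.load_state_dict(torch.load(...))`, which is why the class file has to sit next to the saved `state_dict`.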
|
||
We can also use PyTorch's built-in support for sample datasets to run some simple predictions: | ||
|
||
```python | ||
import requests | ||
import torch | ||
import torchvision | ||
import torchvision.transforms as transforms | ||
# Same preprocessing as in cifar10.py | ||
transform = transforms.Compose([transforms.ToTensor(), | ||
                                transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))]) | ||
testset = torchvision.datasets.CIFAR10(root='./data', train=False, | ||
                                       download=True, transform=transform) | ||
testloader = torch.utils.data.DataLoader(testset, batch_size=4, | ||
                                         shuffle=False, num_workers=2) | ||
dataiter = iter(testloader) | ||
# Use the built-in next(); newer PyTorch releases dropped the .next() method | ||
images, labels = next(dataiter) | ||
# Send only the first image of the batch | ||
formData = { | ||
    'instances': images[0:1].tolist() | ||
} | ||
res = requests.post('http://localhost:8080/models/pytorchmodel:predict', json=formData) | ||
print(res) | ||
print(res.text) | ||
``` | ||
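
The same request body can also be written to a file for use with `curl -d @file`. The sketch below assumes the file uses the same `instances` layout as the request above. `write_input_file` is a hypothetical helper, and the all-zero image is a placeholder rather than real CIFAR10 data.

```python
import json
import tempfile
from pathlib import Path

CHANNELS, HEIGHT, WIDTH = 3, 32, 32  # shape of one CIFAR10 image tensor


def write_input_file(path, images):
    """Write images (nested lists shaped [N][3][32][32]) as a JSON request body."""
    for image in images:
        rows_ok = all(
            len(channel) == HEIGHT and all(len(row) == WIDTH for row in channel)
            for channel in image
        )
        if len(image) != CHANNELS or not rows_ok:
            raise ValueError("each image must be shaped [3][32][32]")
    Path(path).write_text(json.dumps({"instances": images}))


# Placeholder image (all zeros), used only to show the file layout.
blank_image = [[[0.0] * WIDTH for _ in range(HEIGHT)] for _ in range(CHANNELS)]

with tempfile.TemporaryDirectory() as tmp:
    input_path = Path(tmp) / "input.json"
    write_input_file(input_path, [blank_image])
    payload = json.loads(input_path.read_text())
    print(len(payload["instances"]), "instance(s) written")
```

In practice the placeholder would be replaced by `images[0:1].tolist()` from the snippet above, so that the file holds a normalized test image.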
|
||
# Predict on a KFService using PyTorch | ||
|
||
## Setup | ||
1. Your ~/.kube/config should point to a cluster with [KFServing installed](https://github.com/kubeflow/kfserving/blob/master/docs/DEVELOPER_GUIDE.md#deploy-kfserving). | ||
2. Your cluster's Istio Ingress gateway must be network accessible. | ||
3. Your cluster's Istio Egress gateway must [allow Google Cloud Storage](https://knative.dev/docs/serving/outbound-network-access/). | ||
|
||
## Create the KFService | ||
|
||
Apply the CRD | ||
``` | ||
kubectl apply -f pytorch.yaml | ||
``` | ||
|
||
Expected Output | ||
``` | ||
kfservice.serving.kubeflow.org/pytorch-cifar10 created | ||
``` | ||
|
||
## Run a prediction | ||
|
||
``` | ||
MODEL_NAME=pytorch-cifar10 | ||
INPUT_PATH=@./input.json | ||
CLUSTER_IP=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}') | ||
SERVICE_HOSTNAME=$(kubectl get kfservice pytorch-cifar10 -o jsonpath='{.status.url}') | ||
curl -v -H "Host: ${SERVICE_HOSTNAME}" -d $INPUT_PATH http://$CLUSTER_IP/models/$MODEL_NAME:predict | ||
``` | ||
|
||
You should see an output similar to the one below: | ||
|
||
``` | ||
> POST /models/pytorch-cifar10:predict HTTP/1.1 | ||
> Host: pytorch-cifar10.default.svc.cluster.local | ||
> User-Agent: curl/7.54.0 | ||
> Accept: */* | ||
> Content-Length: 110681 | ||
> Content-Type: application/x-www-form-urlencoded | ||
> Expect: 100-continue | ||
> | ||
< HTTP/1.1 100 Continue | ||
* We are completely uploaded and fine | ||
< HTTP/1.1 200 OK | ||
< content-length: 221 | ||
< content-type: application/json; charset=UTF-8 | ||
< date: Fri, 21 Jun 2019 04:05:39 GMT | ||
< server: istio-envoy | ||
< x-envoy-upstream-service-time: 35292 | ||
< | ||
{"predictions": [[-0.8955065011978149, -1.4453213214874268, 0.1515328735113144, 2.638284683227539, -1.00240159034729, 2.270702600479126, 0.22645258903503418, -0.880557119846344, 0.08783778548240662, -1.5551214218139648]]} | ||
``` |
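
The values in `predictions` are the raw outputs of the network's last linear layer (`fc3`), one score per CIFAR10 class. As a minimal sketch of how a client might turn them into a label, the snippet below applies a softmax and picks the highest-scoring class. The class order is taken from `cifar10.py`; the helper names are illustrative only.

```python
import json
import math

# CIFAR10 class names in the order used by cifar10.py.
CLASSES = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')


def softmax(scores):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_class(scores):
    """Return the most likely class name and its probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASSES[best], probs[best]


# Response body from the sample output above.
response_text = (
    '{"predictions": [[-0.8955065011978149, -1.4453213214874268, '
    '0.1515328735113144, 2.638284683227539, -1.00240159034729, '
    '2.270702600479126, 0.22645258903503418, -0.880557119846344, '
    '0.08783778548240662, -1.5551214218139648]]}'
)

for scores in json.loads(response_text)["predictions"]:
    label, prob = top_class(scores)
    print(f"{label}: {prob:.3f}")
```

For the sample response, the largest score is at index 3, which corresponds to `cat`.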
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,79 @@ | ||
import torch | ||
import torchvision | ||
import torchvision.transforms as transforms | ||
import torch.nn as nn | ||
import torch.nn.functional as F | ||
import torch.optim as optim | ||
|
||
|
||
class Net(nn.Module): | ||
    def __init__(self): | ||
        super(Net, self).__init__() | ||
        self.conv1 = nn.Conv2d(3, 6, 5) | ||
        self.pool = nn.MaxPool2d(2, 2) | ||
        self.conv2 = nn.Conv2d(6, 16, 5) | ||
        self.fc1 = nn.Linear(16 * 5 * 5, 120) | ||
        self.fc2 = nn.Linear(120, 84) | ||
        self.fc3 = nn.Linear(84, 10) | ||
|
||
    def forward(self, x): | ||
        x = self.pool(F.relu(self.conv1(x))) | ||
        x = self.pool(F.relu(self.conv2(x))) | ||
        x = x.view(-1, 16 * 5 * 5) | ||
        x = F.relu(self.fc1(x)) | ||
        x = F.relu(self.fc2(x)) | ||
        x = self.fc3(x) | ||
        return x | ||
|
||
|
||
if __name__ == "__main__": | ||
|
||
    transform = transforms.Compose( | ||
        [transforms.ToTensor(), | ||
         transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))]) | ||
|
||
    trainset = torchvision.datasets.CIFAR10(root='./data', train=True, | ||
                                            download=True, transform=transform) | ||
    trainloader = torch.utils.data.DataLoader(trainset, batch_size=4, | ||
                                              shuffle=True, num_workers=2) | ||
|
||
    testset = torchvision.datasets.CIFAR10(root='./data', train=False, | ||
                                           download=True, transform=transform) | ||
    testloader = torch.utils.data.DataLoader(testset, batch_size=4, | ||
                                             shuffle=False, num_workers=2) | ||
|
||
    classes = ('plane', 'car', 'bird', 'cat', | ||
               'deer', 'dog', 'frog', 'horse', 'ship', 'truck') | ||
|
||
    net = Net() | ||
|
||
    criterion = nn.CrossEntropyLoss() | ||
    optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9) | ||
|
||
    for epoch in range(2):  # loop over the dataset multiple times | ||
|
||
        running_loss = 0.0 | ||
        for i, data in enumerate(trainloader, 0): | ||
            # get the inputs; data is a list of [inputs, labels] | ||
            inputs, labels = data | ||
|
||
            # zero the parameter gradients | ||
            optimizer.zero_grad() | ||
|
||
            # forward + backward + optimize | ||
            outputs = net(inputs) | ||
            loss = criterion(outputs, labels) | ||
            loss.backward() | ||
            optimizer.step() | ||
|
||
            # print statistics | ||
            running_loss += loss.item() | ||
            if i % 2000 == 1999:  # print every 2000 mini-batches | ||
                print('[%d, %5d] loss: %.3f' % | ||
                      (epoch + 1, i + 1, running_loss / 2000)) | ||
                running_loss = 0.0 | ||
|
||
    print('Finished Training') | ||
|
||
    # Save model | ||
    torch.save(net.state_dict(), "model.pt") |
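
The `16 * 5 * 5` input size of `fc1` in the script above follows from the layer shapes. As a rough check, the sketch below traces the spatial size of a 32x32 CIFAR10 image through both convolution and pooling stages. It assumes no padding and a convolution stride of 1, which are the `nn.Conv2d` defaults. The helper functions are illustrative and not part of the script.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel, stride):
    """Spatial output size of max pooling along one dimension."""
    return (size - kernel) // stride + 1


size = 32                                 # CIFAR10 images are 32x32
size = pool_out(conv_out(size, 5), 2, 2)  # conv1 (5x5) + pool: 32 -> 28 -> 14
size = pool_out(conv_out(size, 5), 2, 2)  # conv2 (5x5) + pool: 14 -> 10 -> 5
flat_features = 16 * size * size          # conv2 produces 16 channels
print(size, flat_features)                # prints "5 400"
```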