Add support for multi model caching & live reloading #1428

Merged
merged 290 commits into from
Nov 11, 2020
Commits
243d27d
Add some boilerplate to clear the head
RobertLucian Aug 4, 2020
d1181af
Add even more boilerplate
RobertLucian Aug 4, 2020
fd9ec82
WIP crons
RobertLucian Aug 5, 2020
8a72779
WIP for model validation on the serving container
RobertLucian Aug 5, 2020
31d1640
Few comments on lib.model.validation
RobertLucian Aug 5, 2020
6511653
Add json converter for model template
RobertLucian Aug 6, 2020
c53909c
WIP add validation for model template
RobertLucian Aug 6, 2020
4de6270
Add checks/exceptions for model template
RobertLucian Aug 6, 2020
3014205
Fix GenericPlaceholder validation
RobertLucian Aug 6, 2020
3ebf70a
Add recursive model validation and add fixes
RobertLucian Aug 7, 2020
b78fede
Small fix for validate_integer_placeholder
RobertLucian Aug 7, 2020
b5ad9d3
Replace ExclAlternativePlaceholder w/ OneOfAllPlaceholder
RobertLucian Aug 7, 2020
b6098ed
Implement OneOfAll Placeholder for validation
RobertLucian Aug 10, 2020
15a7d97
Fix AnyPlaceholder when the folder is empty
RobertLucian Aug 10, 2020
66730c7
Add validation for PlaceholderGroup
RobertLucian Aug 10, 2020
c1ac84f
Small mods to validation
RobertLucian Aug 10, 2020
6adb9cc
Add docstrings + add TensorFlowNeuronPredictor
RobertLucian Aug 10, 2020
b690ae6
Add validation for models:dir in Python
RobertLucian Aug 11, 2020
69b37bd
Make certain functions "private"
RobertLucian Aug 11, 2020
9d6ecc4
Implement part of the SimpleModelMonitor
RobertLucian Aug 11, 2020
580b946
Add LockedFile class and more
RobertLucian Aug 11, 2020
40f557f
WIP SimpleModelMonitor
RobertLucian Aug 12, 2020
3532dd4
Add mechanism for live-model-reloading
RobertLucian Aug 13, 2020
b864f15
Few forgotten docstrings
RobertLucian Aug 13, 2020
cd34eca
Add structure to hold models in memory
RobertLucian Aug 13, 2020
ef17e85
WIP - live-reloading w/ parts of model caching
RobertLucian Aug 13, 2020
8e2fc2a
Finalize ONNX client for non-models:dir paths
RobertLucian Aug 14, 2020
0d00be4
Add logic for making predictions when models:dir is set for ONNX client
RobertLucian Aug 14, 2020
718b791
Don't encode when writing to locked file
RobertLucian Aug 14, 2020
d5f5e1b
Renaming types & comment
RobertLucian Aug 17, 2020
7a74c34
Add RWLock
RobertLucian Aug 17, 2020
166ee6b
Refactor serving library
RobertLucian Aug 17, 2020
9269263
WIP on LRU, model tree & syncing
RobertLucian Aug 18, 2020
e29563b
Add ModelsTree class & Rm LockedStateAndLRU
RobertLucian Aug 18, 2020
dd13b40
WIP on model locking, global locking for GC, model structs
RobertLucian Aug 18, 2020
7f4a636
Add docstrings to existing methods
RobertLucian Aug 18, 2020
4800d86
Improve docstrings
RobertLucian Aug 19, 2020
26be81f
WIP MMC
RobertLucian Aug 19, 2020
4c5efb2
WIP MMC
RobertLucian Aug 19, 2020
262dcb8
WIP MMC refactor
RobertLucian Aug 19, 2020
fc2bbcc
WIP MMC implement retrieving mechanism for ONNX
RobertLucian Aug 20, 2020
0502a11
Fix ReadWriteLock & add writer-preferring RW lock
RobertLucian Aug 20, 2020
63c54b7
Add preference policy changer to ReadWriteLock
RobertLucian Aug 20, 2020
87d401c
Remove comments & add TODO
RobertLucian Aug 20, 2020
b1cebe7
Implement v. selection when model v. is latest or highest
RobertLucian Aug 21, 2020
413b523
Implement tag counting func. for model holder
RobertLucian Aug 21, 2020
2e781a4
Improve tag counting functionality for models holder
RobertLucian Aug 21, 2020
8930482
Implement download model callback when caching
RobertLucian Aug 24, 2020
45aac6c
Replace global -> model access for models tree
RobertLucian Aug 24, 2020
c1aa9f2
Rename model.lru to model.model
RobertLucian Aug 24, 2020
c642107
Parse timestamp from datetime when downloading model
RobertLucian Aug 24, 2020
1a5479e
Apply locking for ModelsTree.update_models method
RobertLucian Aug 24, 2020
ddf896f
Reduce the locking time for the model tree
RobertLucian Aug 24, 2020
e5a0022
Implement model upstream timestamp update when caching is disabled
RobertLucian Aug 24, 2020
1bc7f7f
Remove models that no longer appear in model_names, if any present at…
RobertLucian Aug 24, 2020
644a97a
Load/remove locally-provided models
RobertLucian Aug 24, 2020
05029c9
Fix missing UTC timestamp when caching is disabled
RobertLucian Aug 24, 2020
2e127be
Support non-version models for all predictor types (not enabled)
RobertLucian Aug 24, 2020
78f823c
Enable non-version model support for all predictor types
RobertLucian Aug 25, 2020
ab17828
Add GC cron for when caching is enabled
RobertLucian Aug 25, 2020
efa1507
Use abstract threading class for all crons
RobertLucian Aug 25, 2020
54e200d
Python package import errors
RobertLucian Aug 25, 2020
0ec565a
Add models tree updater cron
RobertLucian Aug 25, 2020
0550145
Add "latest"/"highest" model tree preloader cron
RobertLucian Aug 26, 2020
2f70930
WIP MMC
RobertLucian Aug 28, 2020
320e651
Local models availability & Predictor impl
RobertLucian Aug 28, 2020
8c7db86
Various fixes, PythonPredictor client, Predictor
RobertLucian Aug 28, 2020
eb6e2fd
Set load method for Python Predictor client
RobertLucian Aug 28, 2020
7dc18c4
WIP TensorFlow API client
RobertLucian Aug 28, 2020
be46404
WIP TensorFlowServing API client
RobertLucian Aug 28, 2020
f3c1358
Implement TFS API for loading/unloading models dynamically
RobertLucian Sep 1, 2020
99d8b46
Add extra arguments to TFS container
RobertLucian Sep 1, 2020
88794cc
Add models to TFS tree even if they failed to load
RobertLucian Sep 1, 2020
1574537
WIP MMC - TF client
RobertLucian Sep 1, 2020
d43ce17
WIP MMC - TF client
RobertLucian Sep 1, 2020
cefd834
Fully implement the TensorFlowServingAPI's class
RobertLucian Sep 2, 2020
3f52919
Add extra keyword-arguments to load model func
RobertLucian Sep 2, 2020
a577031
WIP MMC - TF client
RobertLucian Sep 3, 2020
46e75f4
WIP MMC - add remove callback for TF client
RobertLucian Sep 3, 2020
253bff5
Add model version detector for TF client when num procs > 1
RobertLucian Sep 3, 2020
a25321e
Pass kwargs to load callback for TF client when num procs > 1
RobertLucian Sep 3, 2020
4058861
Add cron to update TFS models independently when num procs > 1
RobertLucian Sep 3, 2020
e57d6f8
Mods to the cron that updates TFS models when num procs > 1
RobertLucian Sep 3, 2020
01067db
Add all crons to the predictor class
RobertLucian Sep 8, 2020
df37358
Fix cron scheduling in Predictor class
RobertLucian Sep 8, 2020
500b534
Support non-versioned TF models in the CLI
RobertLucian Sep 8, 2020
8a58bd0
WIP MMC - python model validation
RobertLucian Sep 9, 2020
e542a6c
WIP MMC - tensorflow model validation
RobertLucian Sep 9, 2020
88570a8
Rename functions more appropriately
RobertLucian Sep 9, 2020
cb80530
WIP MMC - rewrite ONNX validation for the CLI
RobertLucian Sep 10, 2020
55fde7b
WIP MMC - rewrite TF validation (tested)
RobertLucian Sep 11, 2020
4200da0
WIP MMC - rewrite python validation
RobertLucian Sep 14, 2020
781327a
Properly wrap error messages for model validations
RobertLucian Sep 14, 2020
6cc38a5
Don't ignore hidden files/folder when validating models
RobertLucian Sep 14, 2020
5e45b50
Use variable to specify whether path is S3 or not
RobertLucian Sep 14, 2020
9cc481b
Add testing logs to CLI API validation
RobertLucian Sep 14, 2020
4501f1d
Remove model downloading with downloader container
RobertLucian Sep 14, 2020
d4f856a
Add a bunch of number slice to string slice funcs
RobertLucian Sep 14, 2020
fa6364d
Cache only local models when local provider is used
RobertLucian Sep 14, 2020
1af5e53
Refactor & fix module import issues
RobertLucian Sep 16, 2020
57abbdf
Fix files-based locking class
RobertLucian Sep 16, 2020
6d80bca
Further fixes for file-based locking
RobertLucian Sep 16, 2020
9261111
Add missing argument to lock classes
RobertLucian Sep 16, 2020
71ca8ad
Fix thread locking syntax error
RobertLucian Sep 16, 2020
7f5a31c
Fix logic errors in thread locking module
RobertLucian Sep 17, 2020
9dd353f
Rename thread locking classes
RobertLucian Sep 17, 2020
b17585c
Prevent attribute not found error when __del__ is called
RobertLucian Sep 17, 2020
0221d18
Add helper API dumps (temporarily)
RobertLucian Sep 17, 2020
3e62767
Add another spec output helper (tbr)
RobertLucian Sep 17, 2020
ef1b0dc
Fix bugs in CuratedModelResources class
RobertLucian Sep 17, 2020
932241a
Fix bugs with lib.concurrency/model
RobertLucian Sep 17, 2020
7d1ffdc
Fix bugs with ModelTreeUpdater
RobertLucian Sep 18, 2020
9ef1951
Fix find_all_s3_models func when is_dir_used=True
RobertLucian Sep 18, 2020
4c65f1a
Fix multiple-bucket bug with the find_all_s3_models function
RobertLucian Sep 18, 2020
6dcc7f6
Fix model validation bug (critical bug)
RobertLucian Sep 21, 2020
46ffb3d
Fix multiple bugs
RobertLucian Sep 21, 2020
ba457d2
Ensure model path suffix
RobertLucian Sep 21, 2020
b0109bb
ONNX can no longer be specified as a single obj
RobertLucian Sep 21, 2020
17ba5ca
WIP FileBasedModelsTreeUpdater (fixing/testing)
RobertLucian Sep 21, 2020
b7ebd0d
Fix WithBreak exception not getting suppressed
RobertLucian Sep 22, 2020
59d361b
WIP fix find_ondisk_models_with_lock, LockedFile; FileBasedModelsTree…
RobertLucian Sep 22, 2020
2733a5c
Further fixes for FileBasedModelsTreeUpdater
RobertLucian Sep 22, 2020
2055d4e
Fix find_all_s3_models and FileBasedModelsTreeUpdater (fully tested)
RobertLucian Sep 23, 2020
069e85d
Fix find_ondisk_model_info helper function
RobertLucian Sep 24, 2020
706a8cf
Fix helper function find_ondisk_models
RobertLucian Sep 24, 2020
4a3b9cf
Fix bugs with TFSModelLoader
RobertLucian Sep 24, 2020
481f3ea
Fix timestamp bleeding into versions that have the same prefix as tha…
RobertLucian Sep 24, 2020
817502f
Fix TFSModelLoader (with tests)
RobertLucian Sep 24, 2020
40b5b79
Re-enable TFS communication
RobertLucian Sep 24, 2020
cd4d1e4
Fix bugs with TFSModelLoader & TensorFlowServingAPI
RobertLucian Sep 24, 2020
49b8a66
Further fixes for the TFSModelLoader
RobertLucian Sep 25, 2020
6c1b9d9
Fix TFSModelLoader when TFS is crashing due to OOM
RobertLucian Sep 25, 2020
1486d23
Reload TFSModelLoader when TFS server crashes
RobertLucian Sep 25, 2020
90ed102
Bugs and a bit of refactoring
RobertLucian Sep 26, 2020
c7cba58
Fix bug where the wrong versions were updated
RobertLucian Sep 28, 2020
f8b2f8b
Rename TFSModelLoader method
RobertLucian Sep 28, 2020
c022a98
Eliminate race condition when acquiring model lock
RobertLucian Sep 28, 2020
46c8568
WIP Python client
RobertLucian Sep 28, 2020
cd4e30e
Add NumLocalModels func to API struct
RobertLucian Sep 28, 2020
f654440
Allow tfs module to be imported when dependencies are not in
RobertLucian Sep 28, 2020
6ccc18e
Allow ONNX client to be imported without dependencies installed
RobertLucian Sep 28, 2020
60c37a1
Fix issues in the serving startup scripts
RobertLucian Sep 28, 2020
6c0b4a9
Object is not indexable - using parentheses
RobertLucian Sep 28, 2020
4db0181
Fixing a bunch of runtime errors
RobertLucian Oct 1, 2020
9d6c288
Fix logger issue & fix more errors
RobertLucian Oct 1, 2020
85fa362
Better exception message & fix validation bug
RobertLucian Oct 2, 2020
da52d7e
Fix syntax errors for ModelsGC
RobertLucian Oct 2, 2020
f03fb25
Fix model reloading cron for multiple processes (while running)
RobertLucian Oct 2, 2020
2193faa
Ensure "/" suffix for model paths, use daemons, add grpcio to req
RobertLucian Oct 2, 2020
220c97f
Use file-based locking just when caching is disabled
RobertLucian Oct 2, 2020
963d1bf
Fixes for model live-reloading for Python/ONNX
RobertLucian Oct 2, 2020
cf01e59
Finalize fixes for the side-reloading feature for the ONNX and Python…
RobertLucian Oct 5, 2020
9375813
Use shared volume for local TF deployments
RobertLucian Oct 6, 2020
6f6a354
Fix issues with TFSModelLoader cron
RobertLucian Oct 6, 2020
56f2e6f
Fix segmentation fault with the TFS API
RobertLucian Oct 6, 2020
0790b13
Dw models in parallel and sync output logs when side-reloading
RobertLucian Oct 6, 2020
3af8e0e
Fixing bugs with the TFSModelLoader (for edge cases)
RobertLucian Oct 6, 2020
4b805e7
Fixes for the caching mechanism for the TF pred
RobertLucian Oct 6, 2020
42f4f54
Fixes to the caching mechanism for the TF pred
RobertLucian Oct 7, 2020
e3bd863
Raise exception when model is not loaded (caching + TF)
RobertLucian Oct 7, 2020
7ab7cfa
Don't download a model if it has already been downloaded (caching + a…
RobertLucian Oct 7, 2020
8817b3d
Further fixes for the caching mechanism
RobertLucian Oct 7, 2020
39c6ea3
Fix removal process of stale models for the GC
RobertLucian Oct 8, 2020
22723d6
Fix max version meth when caching enabled + others
RobertLucian Oct 8, 2020
1917aba
Fixes for the model caching GC
RobertLucian Oct 8, 2020
c18654d
Reset global preference policy for accessing resources for the GC
RobertLucian Oct 8, 2020
3db7150
Improve logging for all predictor types
RobertLucian Oct 8, 2020
409902f
Return metadata for models when live-reloading
RobertLucian Oct 9, 2020
c124a84
Model metadata retrieving from serving container
RobertLucian Oct 9, 2020
1dff2fc
Add model stats when cortex getting
RobertLucian Oct 9, 2020
eb05980
Show model stats for TF & live reloading when cx getting
RobertLucian Oct 10, 2020
6ce653f
Add examples and refactor all other examples
RobertLucian Oct 12, 2020
b6c4875
Fix merge conflicts for 'master' into feature/multi-model-caching
RobertLucian Oct 13, 2020
ec83408
Code fixes + make lint
RobertLucian Oct 13, 2020
95179c8
Fix FileTree function output
RobertLucian Oct 13, 2020
40aefd9
Merge branch 'master' into feature/multi-model-caching
RobertLucian Oct 13, 2020
1772344
Bunch of fixes
RobertLucian Oct 13, 2020
77466c3
Bunch of fixes
RobertLucian Oct 13, 2020
4456fcb
Core tweaks to the GC for the MMC
RobertLucian Oct 13, 2020
ba56336
Merge branch 'master' into feature/multi-model-caching
RobertLucian Oct 13, 2020
a568f40
Log message correction
RobertLucian Oct 14, 2020
3db0030
Merge branch 'master' into feature/multi-model-caching
RobertLucian Oct 14, 2020
42d5ba5
Add get_model method for the ONNX predictor & fix ONNX example
RobertLucian Oct 14, 2020
be7e03c
Fixes to the CLI & to the MMC for the TF pred
RobertLucian Oct 14, 2020
245f8ef
Handle situations when TFS gets unresponsive when caching is enabled
RobertLucian Oct 14, 2020
45d4b73
Fix local models not working when caching enabled and ONNX/Py predict…
RobertLucian Oct 15, 2020
288e6f4
Fix cron for when local models & the Python/ONNX predictors are used
RobertLucian Oct 15, 2020
21c7e77
Disable caching for BatchAPI kind
RobertLucian Oct 15, 2020
b611871
Merge branch 'master' into feature/multi-model-caching
RobertLucian Oct 15, 2020
1297147
Fix batch issues (with the live reloading feature)
RobertLucian Oct 15, 2020
0dadc4d
Wait for TFS to be responsive when live-reloading
RobertLucian Oct 15, 2020
e88c087
Fix merge conflicts from 'master' into feature/multi-model-caching
RobertLucian Oct 15, 2020
a71d69b
Fix model paths for multi-model-classifier example
RobertLucian Oct 15, 2020
2917159
Multiple procs for TF pred w/ caching disabled
RobertLucian Oct 16, 2020
8a27981
Switch prints with logs
RobertLucian Oct 16, 2020
7748fba
Address PR comments
RobertLucian Oct 16, 2020
e9c0b4d
Merge branch 'master' into feature/multi-model-caching
RobertLucian Oct 16, 2020
bc06d06
Indentation for comment
RobertLucian Oct 16, 2020
dce2dd9
Merge branch 'master' into feature/multi-model-caching
RobertLucian Oct 19, 2020
e24c799
Merge branch 'master' into feature/multi-model-caching
RobertLucian Oct 20, 2020
6fc8fff
Remove CORTEX_MODEL_SOURCE_TYPE, CORTEX_MODEL_CACHE_SIZE, CORTEX_MODE…
RobertLucian Oct 20, 2020
f46b169
Add documentation for live-reloading/model caching
RobertLucian Oct 20, 2020
b897669
Merge branch 'master' into feature/multi-model-caching
RobertLucian Oct 20, 2020
4fa7b05
Fixing the multi-model endpoint guide
RobertLucian Oct 20, 2020
acd30c8
Semantic fixes to models page
RobertLucian Oct 20, 2020
bbecb31
Doc fixes and re-ordering
RobertLucian Oct 20, 2020
33e48e5
Merge branch 'master' into feature/multi-model-caching
RobertLucian Oct 21, 2020
b2200e7
Address 1st round of comments
RobertLucian Nov 3, 2020
fa0bf09
Add 2nd round of requested changes
RobertLucian Nov 3, 2020
2cad972
Make lint
RobertLucian Nov 3, 2020
ac87a4a
Replace highest with latest and remove highest tag
RobertLucian Nov 4, 2020
3e129f5
3rd round of addressing comments
RobertLucian Nov 4, 2020
0ca3a9c
Expose underlying error when validating models
RobertLucian Nov 4, 2020
4cd15d0
4th round of addressing comments
RobertLucian Nov 5, 2020
4d3246a
Address merge conflicts from 'master' into feature/multi-model-caching
RobertLucian Nov 5, 2020
7814ade
A few nits
RobertLucian Nov 6, 2020
6edbf08
Fix merge conflicts from 'master' into feature/multi-model-caching
RobertLucian Nov 6, 2020
39d9996
Merge branch 'master' into feature/multi-model-caching
RobertLucian Nov 6, 2020
f1fe6fa
Fix a few bugs
RobertLucian Nov 6, 2020
d184fcf
Run the python init script as a service
RobertLucian Nov 7, 2020
bb0ee67
"versions" field not set to empty list, but to None
RobertLucian Nov 7, 2020
e190408
Fix a bug with the live-reloading for TFS
RobertLucian Nov 9, 2020
ec5ef24
Don't restart service if exit code = 0; don't kill other processes if…
RobertLucian Nov 9, 2020
8cf7a65
Prevent script.py from exiting when a cron runs; prevent HandleReload…
RobertLucian Nov 10, 2020
d05ac78
Move while loop after the touch of init_script_run.txt
RobertLucian Nov 10, 2020
85ada50
Layout fixes for the model validation in the CLI
RobertLucian Nov 10, 2020
c5b2ab9
Make lint
RobertLucian Nov 10, 2020
0559d54
Fix bug for the MMC Python/ONNX when dir field is used
RobertLucian Nov 10, 2020
035ffb1
Allow additional files for ONNX/TensorFlowNeuron models
RobertLucian Nov 10, 2020
6c70e45
Fix merge conflicts from 'master' into feature/multi-model-caching
RobertLucian Nov 10, 2020
8dd7fcb
Merge branch 'master' into feature/multi-model-caching
RobertLucian Nov 10, 2020
1cb6f73
Fix error formatting + disallow the addition of extra files for ONNX …
RobertLucian Nov 10, 2020
b63a621
Merge branch 'master' into feature/multi-model-caching
RobertLucian Nov 10, 2020
7658d2d
Misc updates
deliahu Nov 10, 2020
7780e64
Update models.md
deliahu Nov 10, 2020
bee8d0a
Fix TensorFlow + Inferentia + 1 process not live reloading
RobertLucian Nov 10, 2020
03b1cee
Prevent the crons from re-downloading the same model over and over again
RobertLucian Nov 10, 2020
3b7004d
Fix models not getting reloaded if the timestamp is older
RobertLucian Nov 10, 2020
101641f
Merge branch 'master' into feature/multi-model-caching
RobertLucian Nov 10, 2020
1fe6b7f
Remove slipped comment
RobertLucian Nov 10, 2020
27aadf2
Fix bug that would make the API think the TensorFlow Neuron predictor…
RobertLucian Nov 10, 2020
8c33b76
Fix bug that would lead to un-loadable non-versioned models when usin…
RobertLucian Nov 11, 2020
076deb2
Start batch after scrip_py service has run; otherwise the API fails
RobertLucian Nov 11, 2020
8fd2d8f
Set log level to INFO
RobertLucian Nov 11, 2020
010199a
Fix model paths in each cortex.yaml example
RobertLucian Nov 11, 2020
d3fc746
Update docs
deliahu Nov 11, 2020
4e25b99
Fix model path for batch example
RobertLucian Nov 11, 2020
7673297
Update models.md
deliahu Nov 11, 2020
552038c
update docstring
deliahu Nov 11, 2020
056f319
Merge branch 'master' of github.com:cortexlabs/cortex into feature/mu…
deliahu Nov 11, 2020
21 changes: 21 additions & 0 deletions cli/cmd/const.go
@@ -0,0 +1,21 @@
/*
Copyright 2020 Cortex Labs, Inc.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/

package cmd

const (
_timeFormat = "02 Jan 06 15:04:05 MST"
)
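Note on the layout string: Go layouts are written in terms of the reference time `Mon Jan 2 15:04:05 MST 2006`, so `06` here yields a two-digit year, whereas the constant removed from `lib_batch_apis.go` below used `2006` (four digits). The following is a minimal sketch of how this layout renders a Unix timestamp. The helper name `formatEditTime` is hypothetical, and forcing UTC is an assumption made only to keep the output deterministic; the CLI code in this diff formats in the local time zone.

```go
package main

import (
	"fmt"
	"time"
)

// timeFormat mirrors the layout string added in cli/cmd/const.go.
// "06" is the two-digit year of Go's reference time.
const timeFormat = "02 Jan 06 15:04:05 MST"

// formatEditTime (hypothetical helper) renders a Unix timestamp in seconds
// using timeFormat. UTC is forced here so the result does not depend on the
// machine's time zone; this is an assumption of the sketch.
func formatEditTime(unixSeconds int64) string {
	return time.Unix(unixSeconds, 0).UTC().Format(timeFormat)
}

func main() {
	// Midnight UTC on the day this pull request was merged.
	fmt.Println(formatEditTime(1605052800))
}
```

Swapping `06` for `2006` in the constant would print the full year instead.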
1 change: 0 additions & 1 deletion cli/cmd/lib_batch_apis.go
@@ -38,7 +38,6 @@ const (
_titleBatchAPI = "batch api"
_titleJobCount = "running jobs"
_titleLatestJobID = "latest job id"
_timeFormat = "02 Jan 2006 15:04:05 MST"
)

func batchAPIsTable(batchAPIs []schema.APIResponse, envNames []string) table.Table {
276 changes: 215 additions & 61 deletions cli/cmd/lib_realtime_apis.go
@@ -22,12 +22,12 @@ import (
"io/ioutil"
"net/http"
"sort"
"strconv"
"strings"
"time"

"github.com/cortexlabs/cortex/cli/types/cliconfig"
"github.com/cortexlabs/cortex/pkg/consts"
"github.com/cortexlabs/cortex/pkg/lib/cast"
"github.com/cortexlabs/cortex/pkg/lib/console"
"github.com/cortexlabs/cortex/pkg/lib/errors"
"github.com/cortexlabs/cortex/pkg/lib/json"
@@ -70,8 +70,8 @@ func realtimeAPITable(realtimeAPI schema.APIResponse, env cliconfig.Environment)

out += fmt.Sprintf("\n%s curl %s -X POST -H \"Content-Type: application/json\" -d @sample.json\n", console.Bold("example curl:"), realtimeAPI.Endpoint)

if realtimeAPI.Spec.Predictor.Type == userconfig.TensorFlowPredictorType || realtimeAPI.Spec.Predictor.Type == userconfig.ONNXPredictorType {
out += "\n" + describeModelInput(realtimeAPI.Status, realtimeAPI.Endpoint)
if !(realtimeAPI.Spec.Predictor.Type == userconfig.PythonPredictorType && realtimeAPI.Spec.Predictor.ModelPath == nil && realtimeAPI.Spec.Predictor.Models == nil) {
out += "\n" + describeModelInput(realtimeAPI.Status, realtimeAPI.Spec.Predictor, realtimeAPI.Endpoint)
}

out += titleStr("configuration") + strings.TrimSpace(realtimeAPI.Spec.UserStr(env.Provider))
@@ -232,67 +232,40 @@ func classificationMetricsStr(metrics *metrics.Metrics) string {
return out
}

func describeModelInput(status *status.Status, apiEndpoint string) string {
func describeModelInput(status *status.Status, predictor *userconfig.Predictor, apiEndpoint string) string {
if status.Updated.Ready+status.Stale.Ready == 0 {
return "the model's input schema will be available when the api is live\n"
return "the models' metadata schema will be available when the api is live\n"
}

apiSummary, err := getAPISummary(apiEndpoint)
if err != nil {
return "error retrieving the model's input schema: " + errors.Message(err) + "\n"
}

numRows := 0
for _, inputSignatures := range apiSummary.ModelSignatures {
numRows += len(inputSignatures)
}

usesDefaultModel := false
rows := make([][]interface{}, numRows)
rowNum := 0
for modelName, inputSignatures := range apiSummary.ModelSignatures {
for inputName, inputSignature := range inputSignatures {
shapeStr := make([]string, len(inputSignature.Shape))
for idx, dim := range inputSignature.Shape {
shapeStr[idx] = s.ObjFlatNoQuotes(dim)
}

shapeRowEntry := ""
if len(shapeStr) == 1 && shapeStr[0] == "scalar" {
shapeRowEntry = "scalar"
} else if len(shapeStr) == 1 && shapeStr[0] == "unknown" {
shapeRowEntry = "unknown"
} else {
shapeRowEntry = "(" + strings.Join(shapeStr, ", ") + ")"
}
rows[rowNum] = []interface{}{
modelName,
inputName,
inputSignature.Type,
shapeRowEntry,
}
rowNum++
cachingEnabled := predictor.Models != nil && predictor.Models.CacheSize != nil && predictor.Models.DiskCacheSize != nil
if predictor.Type == userconfig.TensorFlowPredictorType && !cachingEnabled {
apiTFLiveReloadingSummary, err := getAPITFLiveReloadingSummary(apiEndpoint)
if err != nil {
return "error retrieving the models' metadata schema: " + errors.Message(err) + "\n"
}
if modelName == consts.SingleModelName {
usesDefaultModel = true
t, err := parseAPITFLiveReloadingSummary(apiTFLiveReloadingSummary)
if err != nil {
return "error retrieving the models' metadata schema: " + errors.Message(err) + "\n"
}
return t
}

inputTitle := "input"
if usesDefaultModel {
inputTitle = "model input"
apiModelSummary, err := getAPIModelSummary(apiEndpoint)
if err != nil {
return "error retrieving the models' metadata schema: " + errors.Message(err) + "\n"
}
t := table.Table{
Headers: []table.Header{
{Title: "model name", MaxWidth: 32, Hidden: usesDefaultModel},
{Title: inputTitle, MaxWidth: 32},
{Title: "type", MaxWidth: 10},
{Title: "shape", MaxWidth: 20},
},
Rows: rows,
t, err := parseAPIModelSummary(apiModelSummary)
if err != nil {
return "error retrieving the models' metadata schema: " + errors.Message(err) + "\n"
}
return t
}

return t.MustFormat()
func getModelFromModelID(modelID string) (modelName string, modelVersion int64, err error) {
splitIndex := strings.LastIndex(modelID, "-")
modelName = modelID[:splitIndex]
modelVersion, err = strconv.ParseInt(modelID[splitIndex+1:], 10, 64)
return
}
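Note on `getModelFromModelID`: `strings.LastIndex` returns -1 when the ID contains no hyphen, and slicing with that index panics at runtime. A defensive variant could look like the sketch below. The name `splitModelID` and the error wording are hypothetical and not part of this pull request; the sketch also assumes that an empty model name or an empty version should be rejected.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// splitModelID (hypothetical) splits "<name>-<version>" at the last hyphen.
// Unlike a bare slice on the LastIndex result, it returns an error instead of
// panicking when the hyphen is missing or sits at either end of the ID.
func splitModelID(modelID string) (string, int64, error) {
	idx := strings.LastIndex(modelID, "-")
	if idx <= 0 || idx == len(modelID)-1 {
		return "", 0, fmt.Errorf("model id %q is not of the form <name>-<version>", modelID)
	}
	version, err := strconv.ParseInt(modelID[idx+1:], 10, 64)
	if err != nil {
		return "", 0, fmt.Errorf("model id %q has a non-numeric version: %w", modelID, err)
	}
	return modelID[:idx], version, nil
}

func main() {
	// Model names may themselves contain hyphens; only the last one separates the version.
	name, version, err := splitModelID("text-generator-3")
	fmt.Println(name, version, err)
}
```

Splitting at the last hyphen keeps hyphenated model names intact, which matches what the function in the diff does.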

func makeRequest(request *http.Request) (http.Header, []byte, error) {
@@ -324,7 +297,26 @@ func makeRequest(request *http.Request) (http.Header, []byte, error) {
return response.Header, bodyBytes, nil
}

func getAPISummary(apiEndpoint string) (*schema.APISummary, error) {
func getAPIModelSummary(apiEndpoint string) (*schema.APIModelSummary, error) {
req, err := http.NewRequest("GET", apiEndpoint, nil)
if err != nil {
return nil, errors.Wrap(err, "unable to request api summary")
}
req.Header.Set("Content-Type", "application/json")
_, response, err := makeRequest(req)
if err != nil {
return nil, err
}

var apiModelSummary schema.APIModelSummary
err = json.DecodeWithNumber(response, &apiModelSummary)
if err != nil {
return nil, errors.Wrap(err, "unable to parse api summary response")
}
return &apiModelSummary, nil
}
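Note on `getAPIModelSummary`: the function issues a GET against the API endpoint and decodes the JSON body into a summary struct. The sketch below reproduces that flow with the standard library only. The struct shapes, the JSON field names (`model_metadata`, `versions`, `timestamps`), the model name `iris`, and the fake server are all assumptions for illustration; the real types live in the `schema` package, which is not shown in this part of the diff. `Decoder.UseNumber` is included on the assumption that `json.DecodeWithNumber` does something similar.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// modelMetadata and modelSummary are hypothetical stand-ins for the schema types.
type modelMetadata struct {
	Versions   []string `json:"versions"`
	Timestamps []int64  `json:"timestamps"`
}

type modelSummary struct {
	ModelMetadata map[string]modelMetadata `json:"model_metadata"`
}

// fetchModelSummary requests the endpoint and decodes the JSON response.
func fetchModelSummary(client *http.Client, endpoint string) (*modelSummary, error) {
	req, err := http.NewRequest(http.MethodGet, endpoint, nil)
	if err != nil {
		return nil, fmt.Errorf("unable to build api summary request: %w", err)
	}
	resp, err := client.Do(req)
	if err != nil {
		return nil, fmt.Errorf("unable to request api summary: %w", err)
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status code %d", resp.StatusCode)
	}

	decoder := json.NewDecoder(resp.Body)
	decoder.UseNumber() // only affects numbers decoded into interface{} values
	var summary modelSummary
	if err := decoder.Decode(&summary); err != nil {
		return nil, fmt.Errorf("unable to parse api summary response: %w", err)
	}
	return &summary, nil
}

// newFakeAPI starts a local test server that returns a canned summary.
func newFakeAPI() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		io.WriteString(w, `{"model_metadata": {"iris": {"versions": ["1", "2"], "timestamps": [1604966400, 1605052800]}}}`)
	}))
}

func main() {
	server := newFakeAPI()
	defer server.Close()

	summary, err := fetchModelSummary(server.Client(), server.URL)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(summary.ModelMetadata["iris"].Versions)
}
```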

func getAPITFLiveReloadingSummary(apiEndpoint string) (*schema.APITFLiveReloadingSummary, error) {
req, err := http.NewRequest("GET", apiEndpoint, nil)
if err != nil {
return nil, errors.Wrap(err, "unable to request api summary")
@@ -335,17 +327,179 @@ func getAPISummary(apiEndpoint string) (*schema.APISummary, error) {
return nil, err
}

var apiSummary schema.APISummary
err = json.DecodeWithNumber(response, &apiSummary)
var apiTFLiveReloadingSummary schema.APITFLiveReloadingSummary
err = json.DecodeWithNumber(response, &apiTFLiveReloadingSummary)
if err != nil {
return nil, errors.Wrap(err, "unable to parse api summary response")
}
return &apiTFLiveReloadingSummary, nil
}

for _, inputSignatures := range apiSummary.ModelSignatures {
for _, inputSignature := range inputSignatures {
inputSignature.Shape = cast.JSONNumbers(inputSignature.Shape)
func parseAPIModelSummary(summary *schema.APIModelSummary) (string, error) {
rows := make([][]interface{}, 0)

for modelName, modelMetadata := range summary.ModelMetadata {
latestVersion := int64(0)
for _, version := range modelMetadata.Versions {
v, err := strconv.ParseInt(version, 10, 64)
if err != nil {
return "", err
}
if v > latestVersion {
latestVersion = v
}
}
latestStrVersion := strconv.FormatInt(latestVersion, 10)

for idx, version := range modelMetadata.Versions {
var latestTag string
if latestStrVersion == version {
latestTag = " (latest)"
}

timestamp := modelMetadata.Timestamps[idx]
date := time.Unix(timestamp, 0)

rows = append(rows, []interface{}{
modelName,
version + latestTag,
date.Format(_timeFormat),
})
}
}

_, usesCortexDefaultModelName := summary.ModelMetadata[consts.SingleModelName]

t := table.Table{
Headers: []table.Header{
{
Title: "model name",
MaxWidth: 32,
Hidden: usesCortexDefaultModelName,
},
{
Title: "model version",
MaxWidth: 25,
},
{
Title: "edit time",
MaxWidth: 32,
},
},
Rows: rows,
}

return t.MustFormat(), nil
}
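Note on the `(latest)` tag in `parseAPIModelSummary`: versions arrive as strings, so the code parses them to integers before comparing; a plain string comparison would rank "2" above "10". The sketch below isolates that step. The name `tagLatest` is hypothetical, and unlike the function above it seeds the maximum from the first element rather than from zero.

```go
package main

import (
	"fmt"
	"strconv"
)

// tagLatest (hypothetical) appends " (latest)" to the numerically highest
// version. Versions are compared as integers, not as strings.
func tagLatest(versions []string) ([]string, error) {
	parsed := make([]int64, len(versions))
	var latest int64
	for i, v := range versions {
		n, err := strconv.ParseInt(v, 10, 64)
		if err != nil {
			return nil, fmt.Errorf("version %q is not an integer: %w", v, err)
		}
		parsed[i] = n
		if i == 0 || n > latest {
			latest = n
		}
	}

	tagged := make([]string, len(versions))
	for i, v := range versions {
		tagged[i] = v
		if parsed[i] == latest {
			tagged[i] += " (latest)"
		}
	}
	return tagged, nil
}

func main() {
	tagged, err := tagLatest([]string{"1", "10", "2"})
	fmt.Println(tagged, err)
}
```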

func parseAPITFLiveReloadingSummary(summary *schema.APITFLiveReloadingSummary) (string, error) {
latestVersions := make(map[string]int64)

numRows := 0
models := make(map[string]schema.GenericModelMetadata, 0)
for modelID, modelMetadata := range summary.ModelMetadata {
timestamp := modelMetadata.Timestamp
modelName, modelVersion, err := getModelFromModelID(modelID)
if err != nil {
return "", err
}
if _, ok := models[modelName]; !ok {
models[modelName] = schema.GenericModelMetadata{
Versions: []string{strconv.FormatInt(modelVersion, 10)},
Timestamps: []int64{timestamp},
}
} else {
model := models[modelName]
model.Versions = append(model.Versions, strconv.FormatInt(modelVersion, 10))
model.Timestamps = append(model.Timestamps, timestamp)
models[modelName] = model
}
if _, ok := latestVersions[modelName]; !ok {
latestVersions[modelName] = modelVersion
} else if modelVersion > latestVersions[modelName] {
latestVersions[modelName] = modelVersion
}
numRows += len(modelMetadata.InputSignatures)
}

rows := make([][]interface{}, 0, numRows)
for modelName, model := range models {
latestVersion := latestVersions[modelName]

for _, modelVersion := range model.Versions {
modelID := fmt.Sprintf("%s-%s", modelName, modelVersion)

inputSignatures := summary.ModelMetadata[modelID].InputSignatures
timestamp := summary.ModelMetadata[modelID].Timestamp
versionInt, err := strconv.ParseInt(modelVersion, 10, 64)
if err != nil {
return "", err
}

var applicableTags string
if versionInt == latestVersion {
applicableTags = " (latest)"
}

date := time.Unix(timestamp, 0)

for inputName, inputSignature := range inputSignatures {
shapeStr := make([]string, len(inputSignature.Shape))
for idx, dim := range inputSignature.Shape {
shapeStr[idx] = s.ObjFlatNoQuotes(dim)
}
shapeRowEntry := ""
if len(shapeStr) == 1 && shapeStr[0] == "scalar" {
shapeRowEntry = "scalar"
} else if len(shapeStr) == 1 && shapeStr[0] == "unknown" {
shapeRowEntry = "unknown"
} else {
shapeRowEntry = "(" + strings.Join(shapeStr, ", ") + ")"
}
rows = append(rows, []interface{}{
modelName,
modelVersion + applicableTags,
inputName,
inputSignature.Type,
shapeRowEntry,
date.Format(_timeFormat),
})
}
}
}

_, usesCortexDefaultModelName := summary.ModelMetadata[consts.SingleModelName]

t := table.Table{
Headers: []table.Header{
{
Title: "model name",
MaxWidth: 32,
Hidden: usesCortexDefaultModelName,
},
{
Title: "model version",
MaxWidth: 25,
},
{
Title: "model input",
MaxWidth: 32,
},
{
Title: "type",
MaxWidth: 10,
},
{
Title: "shape",
MaxWidth: 20,
},
{
Title: "edit time",
MaxWidth: 32,
},
},
Rows: rows,
}

return &apiSummary, nil
return t.MustFormat(), nil
}
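Note on the shape column in `parseAPITFLiveReloadingSummary`: a single-element shape of `scalar` or `unknown` is printed bare, and anything else is joined into a parenthesized tuple. A minimal sketch of that rule follows. The name `formatShape` is hypothetical, and `fmt.Sprint` is swapped in for `s.ObjFlatNoQuotes`, whose exact behavior is not shown in this diff.

```go
package main

import (
	"fmt"
	"strings"
)

// formatShape (hypothetical) renders a tensor shape for the table.
// fmt.Sprint stands in for the project's own string helper.
func formatShape(shape []interface{}) string {
	dims := make([]string, len(shape))
	for i, dim := range shape {
		dims[i] = fmt.Sprint(dim)
	}
	// The sentinel values are shown without parentheses.
	if len(dims) == 1 && (dims[0] == "scalar" || dims[0] == "unknown") {
		return dims[0]
	}
	return "(" + strings.Join(dims, ", ") + ")"
}

func main() {
	fmt.Println(formatShape([]interface{}{"scalar"}))
	fmt.Println(formatShape([]interface{}{-1, 224, 224, 3}))
}
```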