Skip to content

Conversation

@ganesh-k13
Copy link
Contributor

@ganesh-k13 ganesh-k13 commented Nov 20, 2022

Changes

Honestly one thing lead to another and here is what I ended up doing:

  1. Fix the actual problem, i.e., catch non-200 responses and handle them:
    1.1. In the case of a 422, raise a TypeError (same as before), but without all the extra processing. Now I'm pretty sure (take this with a pinch of salt :)) the status code check is sufficient, we need not check the json for type error
    1.2. In case of every other error, raise an HTTPError with a meaningful error message
  2. Add test cases for the new if and some more test cases for the TypeError
  3. There was a lot of repeated code, so I added a few extensible fixtures, so will be easy to make changes, but reduces code at the same time

Example:

from vetiver.server import predict
from vetiver.data import mtcars

endpoint = "http://127.0.0.1:8080/predict1"  # Notice incorrect URL

# res = predict(endpoint, data={1:2, 2:3})
res = predict(endpoint, data=mtcars)

print(res)

Before:

Traceback (most recent call last):
  File "/home/ganesh/os/tmp/pred.py", line 7, in <module>
    res = predict(endpoint, data=mtcars)
  File "/home/ganesh/.local/lib/python3.10/site-packages/vetiver/server.py", line 265, in predict
    response_df = pd.DataFrame.from_dict(response.json())
  File "/home/ganesh/.local/lib/python3.10/site-packages/pandas/core/frame.py", line 1762, in from_dict
    return cls(data, index=index, columns=columns, dtype=dtype)
  File "/home/ganesh/.local/lib/python3.10/site-packages/pandas/core/frame.py", line 662, in __init__
    mgr = dict_to_mgr(data, index, columns, dtype=dtype, copy=copy, typ=manager)
  File "/home/ganesh/.local/lib/python3.10/site-packages/pandas/core/internals/construction.py", line 493, in dict_to_mgr
    return arrays_to_mgr(arrays, columns, index, dtype=dtype, typ=typ, consolidate=copy)
  File "/home/ganesh/.local/lib/python3.10/site-packages/pandas/core/internals/construction.py", line 118, in arrays_to_mgr
    index = _extract_index(arrays)
  File "/home/ganesh/.local/lib/python3.10/site-packages/pandas/core/internals/construction.py", line 656, in _extract_index
    raise ValueError("If using all scalar values, you must pass an index")
ValueError: If using all scalar values, you must pass an index

Now:

Traceback (most recent call last):
  File "/home/ganesh/os/mlops/vetiver-python/pred.py", line 7, in <module>
    res = predict(endpoint, data=mtcars)
  File "/home/ganesh/os/mlops/vetiver-python/vetiver/server.py", line 270, in predict
    raise requests.exceptions.HTTPError(
requests.exceptions.HTTPError: Could not obtain data from endpoint with error: 404 Client Error: Not Found for url: http://127.0.0.1:8080/predict1

resolves: #71

PS: Totally fine with getting rid of the extra stuff and just fixing the error, 81818dd does that.

BUG:
 - Non 200 codes gives json's that do not form dataframes
 - Catch for any non-200 and raise descriptive exceptions

MAINT:
 - Removed unwanted TypeError catch
 - Used `root_path` instead of hardcoding the predict endpoint
return client


@pytest.fixture
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels much cleaner! Thank you!

@isabelizimm
Copy link
Contributor

This looks great! I updated the statsmodels and xgboost frameworks to pass tests + match test_predict.

@isabelizimm isabelizimm merged commit ebfb3bd into rstudio:main Nov 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Double check that predict(...) raises for http errors

2 participants