feat: CLI interface for validation of logged features #2718
@@ -30,6 +30,7 @@ import "feast/core/OnDemandFeatureView.proto";
 import "feast/core/RequestFeatureView.proto";
 import "feast/core/DataSource.proto";
 import "feast/core/SavedDataset.proto";
+import "feast/core/ValidationProfile.proto";
 import "google/protobuf/timestamp.proto";

 // Next id: 13
Review comment: Update 13 -> 14.
Reply: Fixed.
@@ -42,6 +43,7 @@ message Registry {
     repeated RequestFeatureView request_feature_views = 9;
     repeated FeatureService feature_services = 7;
     repeated SavedDataset saved_datasets = 11;
+    repeated ValidationReference validation_references = 13;
     Infra infra = 10;

     string registry_schema_version = 3; // to support migrations; incremented when schema is changed
@@ -1,4 +1,5 @@
 import json
+from types import FunctionType
 from typing import Any, Callable, Dict, List

 import dill

@@ -140,9 +141,12 @@ def analyze_dataset(self, df: pd.DataFrame) -> Profile:
         return GEProfile(expectation_suite=self.user_defined_profiler(dataset))

     def to_proto(self):
+        # keep only the code and drop context for now
+        # ToDo (pyalex): include some context, but not all (dill tries to pull too much)
Comment on lines +144 to +145
Review comment: Curious what this means?
Reply: Right now, for easier deserialization, I drop the whole context of the user-defined function and keep only its code. The opposite of that (what dill does by default) is to include all globals and refer to the original module, which won't be available at runtime. But maybe in the future somebody will want to use at least the function's closure, so we might want to support that.
+        udp = FunctionType(self.user_defined_profiler.__code__, {})
         return GEValidationProfilerProto(
             profiler=GEValidationProfilerProto.UserDefinedProfiler(
-                body=dill.dumps(self.user_defined_profiler, recurse=True)
+                body=dill.dumps(udp, recurse=False)
             )
         )
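For readers following the thread on `to_proto`, here is a minimal sketch of what "keep only the code and drop context" does to a function. It uses only the standard library and leaves dill out entirely; the names `spread`, `uses_global`, `strip_context`, and `THRESHOLD` are hypothetical and not part of Feast. The only piece taken from the diff is the `FunctionType(fn.__code__, {})` call.

```python
from types import FunctionType

THRESHOLD = 10  # hypothetical module-level global


def spread(values):
    # Relies only on its argument, so it does not need any globals.
    return values[-1] - values[0]


def uses_global(value):
    # Relies on the module-level THRESHOLD defined above.
    return value > THRESHOLD


def strip_context(fn):
    # Rebuild the function from its code object with an empty globals dict,
    # as the diff does with FunctionType(self.user_defined_profiler.__code__, {}).
    return FunctionType(fn.__code__, {})


bare_spread = strip_context(spread)
bare_uses_global = strip_context(uses_global)

# A self-contained function keeps working after its context is dropped.
result = bare_spread([3, 7, 5])

# A function that reads a module global no longer finds it.
try:
    bare_uses_global(42)
    lost_global = False
except NameError:
    lost_global = True
```

The practical consequence, under these assumptions, is that a user-defined profiler serialized this way should be self-contained: anything it reads from module scope will be missing once the function is rebuilt from its code object alone.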
Review comment: Do we need to reserve #2, or does it not matter since it's never been used by anyone?
Reply: It's never been used.