
Clean up inferencing code #1645

@woop


A few small refactors are necessary for our feature inferencing code.

All inferencing of columns from sources should happen only at apply time, not during the construction of objects. Otherwise these objects execute eagerly on instantiation, which prevents Feast from controlling the execution order and may even lead to duplicate executions.

One example is over here:

or self._infer_event_timestamp_column("TIMESTAMP|DATETIME"),
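To illustrate the intent, here is a minimal sketch of deferring inference to an apply step. `ExampleSource`, `apply_sources`, and the `get_schema` callback are hypothetical names used only for illustration; they are not Feast's actual classes or API.

```python
from typing import Callable, Dict, List, Optional

# Maps column name -> column type name, e.g. {"ts": "TIMESTAMP"}.
Schema = Dict[str, str]


class ExampleSource:
    """Hypothetical stand-in for a data source class (not Feast's actual API)."""

    def __init__(self, table_ref: str, event_timestamp_column: Optional[str] = None):
        # Construction only records arguments: no inference, no I/O.
        self.table_ref = table_ref
        self.event_timestamp_column = event_timestamp_column


def apply_sources(sources: List[ExampleSource], get_schema: Callable[[str], Schema]) -> None:
    """Hypothetical apply step: the single place where inference runs."""
    for source in sources:
        if source.event_timestamp_column is not None:
            continue  # Explicitly configured, nothing to infer.
        schema = get_schema(source.table_ref)
        candidates = [
            name
            for name, type_name in schema.items()
            if type_name in ("TIMESTAMP", "DATETIME")
        ]
        if len(candidates) != 1:
            raise ValueError(
                f"Expected exactly one timestamp column in {source.table_ref}, "
                f"found {len(candidates)}"
            )
        source.event_timestamp_column = candidates[0]
```

With this shape, instantiating a source never touches the backing store, and the caller of the apply step decides when (and how often) the schema lookup runs.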

Also, the inferencing code should be more cleanly separated. Right now we have DataSource being subclassed by implementations, which reference the DataSource base class, which contains hardcoded references to the same child classes and then calls methods on these child classes. The interdependencies are overly complex and cyclic.
f55b51c#diff-fada7a84510bd3d3051cffc414ee51ed92eff4cdd4c2ac53a1b8b7c47286cab4R523

We can maybe just extract the _infer_event_timestamp_column code into a function of its own.
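One possible shape for such a function, as a rough sketch: it takes plain (name, type) pairs and a type pattern like the `"TIMESTAMP|DATETIME"` string above, so it needs no reference to DataSource or its subclasses. The signature and error messages are assumptions, not the existing implementation.

```python
import re
from typing import Iterable, Tuple


def infer_event_timestamp_column(
    columns: Iterable[Tuple[str, str]], type_pattern: str
) -> str:
    """Return the single column whose type name fully matches type_pattern.

    columns: (column_name, type_name) pairs, e.g. [("ts", "TIMESTAMP")].
    type_pattern: a regular expression such as "TIMESTAMP|DATETIME".
    """
    matches = [
        name for name, type_name in columns if re.fullmatch(type_pattern, type_name)
    ]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError(f"No column with a type matching {type_pattern!r} was found")
    raise ValueError(
        f"Multiple columns match {type_pattern!r}: {matches}; "
        "specify the event timestamp column explicitly"
    )
```

Because the function only sees plain data, each source implementation can call it with its own column listing and type pattern, and the base class no longer needs to know about its children.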

Finally, we don't need to query tables in order to get schemas. We can simply use the get_table() method and inspect the schema instead of pulling data to the client: f55b51c#diff-fada7a84510bd3d3051cffc414ee51ed92eff4cdd4c2ac53a1b8b7c47286cab4R756
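A rough sketch of the metadata-only approach, assuming the client here is the BigQuery Python client (`google.cloud.bigquery.Client`), whose `get_table()` returns a table object with a `schema` list of fields exposing `name` and `field_type`. The helper name `get_column_types` is hypothetical.

```python
from typing import Any, List, Tuple


def get_column_types(client: Any, table_ref: str) -> List[Tuple[str, str]]:
    """Read (column_name, type_name) pairs from table metadata only.

    Assumes `client` behaves like google.cloud.bigquery.Client: get_table()
    fetches table metadata, so no query job runs and no rows are downloaded.
    """
    table = client.get_table(table_ref)
    return [(field.name, field.field_type) for field in table.schema]
```

The returned pairs are plain data, so they can be passed straight to whatever standalone inference function ends up replacing `_infer_event_timestamp_column`.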
