Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ZairaChem ONNX predictor #48

Open
JHlozek opened this issue Oct 18, 2024 · 2 comments
Open

ZairaChem ONNX predictor #48

JHlozek opened this issue Oct 18, 2024 · 2 comments
Assignees

Comments

@JHlozek
Copy link
Collaborator

JHlozek commented Oct 18, 2024

I've started working on getting ZairaChem to run ONNX inference if 'zairachem predict -m' is called with an onnx model. However, seeing as this does not require pointing to a standard ZairaChem model, the pipeline doesn't have access to the original parameters.json file and needs some refactoring to cater for this.

My question is how much effort/changes do we want here? I can think of three broad options for ONNX predictions:

  1. 'zairachem predict' as just a basic wrapper that takes a csv, expects a 'SMILES' column, and returns a csv output.
  2. 'zairachem predict' still runs the input standardization step (just a few minimal changes needed here) and produces a reduced output folder with the onnx predictions.
  3. We aim to run as much of ZairaChem as possible, just excluding the estimator/pooling/etc and still producing reports. This could be done in a few ways (e.g. optionally pointing to the original zairachem model to get the parameters).

I've pretty much got option (2) working, but I wanted to check what you envisioned from a design perspective? Keep it simple or invest for more functionality?

@JHlozek JHlozek moved this to In Progress in AI2050 Oct 18, 2024
@GemmaTuron
Copy link
Member

Hi @JHlozek

Thanks for this summary. I think Option 2 makes sense for now, and once we improve the modularity of ZairaChem we can improve on point 3 and generate reports etc. I will assign this issue to me

@GemmaTuron GemmaTuron moved this from In Progress to Todo in AI2050 Nov 5, 2024
@GemmaTuron GemmaTuron self-assigned this Nov 5, 2024
@JHlozek
Copy link
Collaborator Author

JHlozek commented Nov 7, 2024

Hi @GemmaTuron

I've pushed the recent commit to ZairaChem that implements this feature. The ZairaChem predict command first tries to load the model path with onnx and runs the onnx model if appropriate else it runs the usual ZairaChem pipeline.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Todo
Development

No branches or pull requests

2 participants