I recently ported the YOLOv8 object detection model to Axon, and just wanted to share my experiences with it.
https://github.com/hansihe/yolov8_elixir
- The deployment story seems a lot better than in other frameworks from what I have seen so far, good job! It has really just worked most of the time, while with PyTorch and Python things are sometimes very fiddly to get running on a particular machine.
- What is the recommended replacement of `Module` from PyTorch? `Axon.namespace` looks somewhat like it, but there doesn’t seem to be a way of differentiating “this is a module that contains other layers, may depend on other layers outside of the module” vs “this is a subnetwork that is fully independent all the way to inputs”.
- Related to the above, naming and identifying layers within a network for parameter loading is very fiddly sometimes. I wish there was a way to have “modules” that provided nesting in the parameter map. I ended up more or less doing this with dot-separated paths in the parameter names (`04.c2f.m.0.bottle_neck.cv2.conv.conv2d`).
  - If not, would it be possible to introduce an abstraction where a new layer can be constructed as a combination of other layers? This could then be reflected in the parameter map by nesting.
    - Example: yolov8 has a `C2f` layer which contains many `Bottleneck` layers, which in turn contain other layers.
- Maybe there could be a utility layer in `Axon` for destructuring a container? Say one layer returns a `%{"a" => _, "b" => _}` container; it would be nice to have a way to destructure that in another layer without making many different `Axon.layer`s that each just pull out one of the inner values.
  - I might also have missed something obvious here.
- A lot of the time when I get dimension mismatches from building my model, I get no stack trace in the “this layer was defined at” section of the error. It’s just empty. Should I do something special to get a stack trace?
- The documentation of `Axon.build` could be a little bit clearer on what the `init` vs `predict` functions actually do.
  - Is there an expectation that `predict` can modify mutable internal state in `XLA` or other backends?
  - Or is it mainly to initialize/copy parameters to the backend representation?
- It would be useful if `predict` had different stricter modes for stuff like:
  - Explicitly warn if a parameter is missing from the input parameter map and was initialized instead. Silently initializing a parameter is unwanted when loading a model for inference; it would indicate an error.
  - Warn if there is extra unused data in the parameter map. This would make it easier to track down parameter naming issues.
- When working on the model in Livebook and printing my model as a table, the output got so large that it was truncated. I wish there was a way to prevent the truncation; I had to save the text to a file and open it in an editor to read the full table.
- Is there any way the unpickler and torch parameter loading stuff could be moved from Bumblebee to another library? Right now I depend on Bumblebee just for those parts.
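
To make the two parameter-map requests above more concrete (nesting by module, and warning on missing or unused parameters), here is a minimal sketch in plain Elixir. It is not Axon API: `ParamTools`, `nest/1` and `diff/2` are hypothetical names, placeholder atoms stand in for tensors, and `00.conv.conv2d` is a made-up layer name used only for illustration. It also assumes the list of expected layer names is obtained separately, which may depend on the Axon version.

```elixir
defmodule ParamTools do
  # Hypothetical helper module; nothing here is part of Axon.

  # Turns dot-separated names into a nested map:
  #   %{"a.b" => v}  becomes  %{"a" => %{"b" => v}}
  def nest(flat) when is_map(flat) do
    Enum.reduce(flat, %{}, fn {path, value}, acc ->
      put_path(acc, String.split(path, "."), value)
    end)
  end

  defp put_path(map, [key], value), do: Map.put(map, key, value)

  defp put_path(map, [key | rest], value) do
    child = Map.get(map, key, %{})
    Map.put(map, key, put_path(child, rest, value))
  end

  # Compares the layer names a model expects with the keys of a loaded
  # parameter map and reports both kinds of mismatch.
  def diff(expected_names, params) when is_map(params) do
    expected = MapSet.new(expected_names)
    provided = MapSet.new(Map.keys(params))

    %{
      missing: expected |> MapSet.difference(provided) |> Enum.sort(),
      unused: provided |> MapSet.difference(expected) |> Enum.sort()
    }
  end
end

# Placeholder atoms stand in for tensors. Only the first layer name comes
# from the port; "00.conv.conv2d" is made up for illustration.
flat = %{
  "04.c2f.m.0.bottle_neck.cv2.conv.conv2d" => %{"kernel" => :k1},
  "00.conv.conv2d" => %{"kernel" => :k0}
}

nested = ParamTools.nest(flat)

kernel =
  get_in(nested, ["04", "c2f", "m", "0", "bottle_neck", "cv2", "conv", "conv2d", "kernel"])

IO.inspect(kernel, label: "nested lookup")

# A deliberately misspelled name shows up as both missing and unused.
report = ParamTools.diff(Map.keys(flat), %{"00.conv.conv2D" => %{}})
IO.inspect(report)
```

With a report like this, a strict mode could raise when `missing` is non-empty and only warn on `unused`, which would catch most naming mistakes before inference runs on freshly initialized weights.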