
TF decoder model converted to a Core ML model starts returning NaN MLMultiArrays during inference from a certain time step. #2072

Open
seungjun-green opened this issue Nov 27, 2023 · 1 comment
Labels
question — Response providing clarification needed. Will not be assigned to a release. (type)

Comments

@seungjun-green

I'm having the same problem. I created a simple Transformer decoder in TensorFlow, and it works well. But after converting it to a Core ML model, at some point during inference it starts outputting only MLMultiArrays filled with NaN values. The strange thing is that if I reinitialize the model at every time step, Core ML never returns a NaN array during inference:

for i in 0..<30 {
    // Reinitializing the Core ML model at every time step works around the NaN outputs
    decoder = try! iOS_Deocder(configuration: config) // <- Like this!
    let ddd = try! decoder.prediction(input_1: image_feature, input_2: tokens!).Identity

    // some additional code
}

To address this issue, I tried converting the TF model to a Core ML model with compute_precision=coremltools.precision.FLOAT32 and with compute_precision=coremltools.precision.FLOAT16, and I also tried setting let config = MLModelConfiguration(); config.computeUnits = .cpuOnly, but none of them worked.
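For reference, the conversion call looked roughly like this (the input shapes and file name here are placeholders, not my exact values):

import coremltools as ct

# tf_model is the TensorFlow decoder; the input shapes below are placeholders.
mlmodel = ct.convert(
    tf_model,
    convert_to="mlprogram",
    inputs=[
        ct.TensorType(name="input_1", shape=(1, 49, 512)),  # image features (placeholder shape)
        ct.TensorType(name="input_2", shape=(1, 30)),       # token ids (placeholder shape)
    ],
    compute_precision=ct.precision.FLOAT32,  # also tried ct.precision.FLOAT16
)
mlmodel.save("iOS_Deocder.mlpackage")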

The strange thing is that changing the way I define the model in TF slightly improved it.

The final output layer in TF looked like this:

final_output = self.final_layer(seq_layer_output)
final_output = final_output + custom_bias

but removing the last line like this:
final_output = self.final_layer(seq_layer_output)

improved the Core ML model in the following way: previously the Core ML model started generating NaN arrays from the third time step of inference; after this change it starts generating NaN arrays from the 5th or 6th. Removing all the for loops over the decoder layers also improved it.
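To be concrete, by "removing the for loops" I mean calling the decoder layers explicitly instead of iterating over them in Python. A simplified sketch (the Dense layers here just stand in for my real decoder blocks; sizes are placeholders):

import tensorflow as tf

class TinyDecoder(tf.keras.Model):
    # Simplified sketch: layer types and sizes are placeholders, not my real model.
    def __init__(self, d_model=128, vocab_size=5000):
        super().__init__()
        self.dec_layer_1 = tf.keras.layers.Dense(d_model, activation="relu")
        self.dec_layer_2 = tf.keras.layers.Dense(d_model, activation="relu")
        self.final_layer = tf.keras.layers.Dense(vocab_size)

    def call(self, x):
        # Each layer is called explicitly instead of `for layer in self.dec_layers: ...`
        x = self.dec_layer_1(x)
        x = self.dec_layer_2(x)
        # No extra `+ custom_bias` after the final layer
        return self.final_layer(x)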

My guess is that during some inference step some internal state of the Core ML model is being stored, and that's affecting inference at the next time step?

I've spent 4 days on this, but can't figure it out. Can anyone help me with this issue?

@seungjun-green added the question label on Nov 27, 2023
@TobyRoseman
Collaborator

Can you share simple standalone code to reproduce the issue?

This sounds like an issue with the Core ML framework, not the coremltools Python package. If it's a problem with the Core ML framework, you should submit the bug (with code to reproduce) using the Feedback Assistant.
