Description
PR #3119
tidied up llama by removing a library build and op registration.
This seems to have broken the cmake flow for:
`examples/xnnpack/quantization/test_quantize.sh`
which now fails with a duplicate registration of:
`quantized_decomposed::embedding_byte.out`
The buck2 build still works, but the cmake build of the library does not; I assume this is because `kernels/quantized/targets.bzl` was updated while the cmake build was not.
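
For illustration, here is a minimal, self-contained sketch of the failure class. This is not the actual ExecuTorch registration code, and the schema is a simplified stand-in, but defining the same op twice in one namespace trips the same kind of duplicate-registration check the cmake build hits:

```python
from torch.library import Library

# Sketch only: two definitions of the same op in the
# quantized_decomposed namespace, analogous to the kernel library
# and the exir pass both registering embedding_byte.out.
lib = Library("quantized_decomposed", "FRAGMENT")

# Simplified schema for illustration; the real signature lives in
# kernels/quantized/quantized.yaml and _quant_patterns_and_replacements.py.
SCHEMA = (
    "embedding_byte.out(Tensor weight, Tensor weight_scales, "
    "Tensor indices, *, Tensor(a!) out) -> Tensor(a!)"
)

lib.define(SCHEMA)  # first definition succeeds
lib.define(SCHEMA)  # RuntimeError: registered multiple times
```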
Removing the duplicates in `kernels/quantized/quantized.yaml` breaks llama2 quantization runs on xnnpack.
Removing the duplicates in `exir/passes/_quant_patterns_and_replacements.py` instead causes llama testing to fail (the shape of the yaml entry in question is sketched below).
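
For reference, the registration entry at issue has roughly this shape in `kernels/quantized/quantized.yaml`. This is a sketch following the usual ExecuTorch kernel-yaml layout; the exact signature and `kernel_name` here are assumptions, not copied from the file:

```yaml
# Sketch of the entry shape, not the literal file contents.
- func: quantized_decomposed::embedding_byte.out(Tensor weight, Tensor weight_scales, Tensor indices, *, Tensor(a!) out) -> Tensor(a!)
  variants: function
  kernels:
    - arg_meta: null
      kernel_name: torch::executor::quantized_embedding_byte_out
```

A second entry with the same `func` line, or a second library also claiming the op, is what triggers the duplicate registration; the open question is which single place should own it.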
Can we look for a cleaner solution for quantized operator registration that is consistent across users (at least the xnnpack, Arm backend, and llama2 uses)?
I have a patch demonstrating the problem here, as this is blocking a fairly large commit of Arm backend code:
0790d93