Add Esm #2244

Merged: 33 commits into keras-team:master on Aug 11, 2025

Conversation

Contributor

@pass-lin pass-lin commented May 3, 2025

from #2177
Achieved a smaller numerical error relative to the Hugging Face (HF) implementation.

import os

# The backend and the HF mirror must be set before importing keras/transformers.
os.environ["KERAS_BACKEND"] = "torch"
os.environ["HF_ENDPOINT"] = "https://hf-mirror.com"

from keras import ops
from transformers.models.esm.modeling_esm import EsmModel

from keras_hub.src.models.esm.esm_backbone import ESMBackbone

weights_path = "facebook/esm2_t6_8M_UR50D"

# Load the Hugging Face reference model and turn off token dropout
# before comparing outputs.
hf_model = EsmModel.from_pretrained(weights_path)
hf_model.cuda().eval()
hf_model.embeddings.token_dropout = False

# Load the same checkpoint through the KerasHub preset converter.
keras_model = ESMBackbone.from_preset("hf://" + weights_path)
keras_model.summary()

# Run both models on the same token ids (with an all-ones attention mask
# for the HF model) and compare the outputs.
x = ops.array([[1, 2, 3, 4, 5]]) + 1
hf_out = hf_model(x, ops.ones_like(x))[0]
keras_out = keras_model({"token_ids": x})

print(ops.all(ops.isclose(hf_out, keras_out, atol=1e-4)))
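
For readers unfamiliar with the tolerance check on the last line: as far as I know, `ops.isclose` follows the NumPy convention and treats two values as close when `|a - b| <= atol + rtol * |b|`. The sketch below reproduces that rule in plain Python on flat lists and also reports the maximum absolute difference, which is often more informative than a single boolean. The helper names (`is_close`, `max_abs_error`) and the sample numbers are illustrative assumptions, not part of this PR.

```python
def is_close(a, b, rtol=1e-5, atol=1e-8):
    """NumPy-style closeness test: |a - b| <= atol + rtol * |b|."""
    return abs(a - b) <= atol + rtol * abs(b)


def max_abs_error(xs, ys):
    """Largest element-wise absolute difference between two flat sequences."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    return max(abs(x - y) for x, y in zip(xs, ys))


# Hypothetical flattened outputs from two implementations of the same model.
reference = [0.1234, -0.5678, 0.9012]
candidate = [0.12341, -0.56782, 0.90118]

all_close = all(is_close(c, r, atol=1e-4) for c, r in zip(candidate, reference))
error = max_abs_error(candidate, reference)
print(all_close, error)
```

Printing the maximum error alongside the boolean makes it easier to see how much headroom remains under the chosen `atol`.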

ESM Checkpoint Conversion and Numerics Verification Demo (across multiple backends): Notebook Link

Train Demo: Notebook Link

@pass-lin
Contributor Author

pass-lin commented May 3, 2025

ruff.....................................................................Passed
ruff-format..............................................................Passed
Error: Process completed with exit code 1.

Please help me figure out how to solve this problem.

@mattdangerw
Member

Probably an issue with generating the API symbols. Looks like you need to sync with the latest changes on master, then you could try running ./shell/api_gen.sh

@sachinprasadhs
Collaborator

> ruff.....................................................................Passed
> ruff-format..............................................................Passed
> Error: Process completed with exit code 1.
>
> Please help me figure out how to solve this problem.

You can rebase onto the latest master code and then run:
pre-commit run --all-files
pip install -U namex

@pass-lin
Contributor Author

FAILED keras_hub/src/layers/modeling/reversible_embedding_test.py::ReversibleEmbeddingTest::test_quantize_dtype_argument_tie_weights - TypeError: _int8_build() takes 2 positional arguments but 3 were given
FAILED keras_hub/src/layers/modeling/reversible_embedding_test.py::ReversibleEmbeddingTest::test_quantize_dtype_argument_untie_weights - TypeError: _int8_build() takes 2 positional arguments but 3 were given
FAILED keras_hub/src/layers/modeling/reversible_embedding_test.py::ReversibleEmbeddingTest::test_quantize_int8_tie_weights - TypeError: _int8_build() takes 2 positional arguments but 3 were given
FAILED keras_hub/src/layers/modeling/reversible_embedding_test.py::ReversibleEmbeddingTest::test_quantize_int8_untie_weights - TypeError: _int8_build() takes 2 positional arguments but 3 were given
FAILED keras_hub/src/models/albert/albert_backbone_test.py::AlbertBackboneTest::test_backbone_basics - TypeError: _int8_build() takes 2 positional arguments but 3 were given
FAILED keras_hub/src/models/bart/bart_backbone_test.py::BartBackboneTest::test_backbone_basics - TypeError: _int8_build() takes 2 positional arguments but 3 were given
FAILED keras_hub/src/models/bert/bert_backbone_test.py::BertBackboneTest::test_backbone_basics - TypeError: _int8_build() takes 2 positional arguments but 3 were given
FAILED keras_hub/src/models/bloom/bloom_backbone_test.py::BloomBackboneTest::test_backbone_basics - TypeError: _int8_build() takes 2 positional arguments but 3 were given
FAILED keras_hub/src/models/clip/clip_backbone_test.py::CLIPBackboneTest::test_backbone_basics - TypeError: _int8_build() takes 2 positional arguments but 3 were given
FAILED keras_hub/src/models/deberta_v3/deberta_v3_backbone_test.py::DebertaV3BackboneTest::test_backbone_basics - TypeError: _int8_build() takes 2 positional arguments but 3 were given
FAILED keras_hub/src/models/distil_bert/distil_bert_backbone_test.py::DistilBertBackboneTest::test_backbone_basics - TypeError: _int8_build() takes 2 positional arguments but 3 were given
FAILED keras_hub/src/models/electra/electra_backbone_test.py::ElectraBackboneTest::test_backbone_basics - TypeError: _int8_build() takes 2 positional arguments but 3 were given
FAILED keras_hub/src/models/f_net/f_net_backbone_test.py::FNetBackboneTest::test_backbone_basics - TypeError: _int8_build() takes 2 positional arguments but 3 were given
FAILED keras_hub/src/models/falcon/falcon_backbone_test.py::FalconBackboneTest::test_backbone_basics - TypeError: _int8_build() takes 2 positional arguments but 3 were given
FAILED keras_hub/src/models/gemma/gemma_backbone_test.py::GemmaBackboneTest::test_backbone_basics - TypeError: _int8_build() takes 2 positional arguments but 3 were given
FAILED keras_hub/src/models/gemma/gemma_backbone_test.py::Gemma2BackboneTest::test_backbone_basics - TypeError: _int8_build() takes 2 positional arguments but 3 were given
FAILED keras_hub/src/models/gpt2/gpt2_backbone_test.py::GPT2BackboneTest::test_backbone_basics - TypeError: _int8_build() takes 2 positional arguments but 3 were given
FAILED keras_hub/src/models/gpt_neo_x/gpt_neo_x_backbone_test.py::GPTNeoXBackboneTest::test_backbone_basics - TypeError: _int8_build() takes 2 positional arguments but 3 were given
FAILED keras_hub/src/models/llama/llama_backbone_test.py::LlamaTest::test_backbone_basics - TypeError: _int8_build() takes 2 positional arguments but 3 were given
FAILED keras_hub/src/models/mistral/mistral_backbone_test.py::MistralBackboneTest::test_backbone_basics - TypeError: _int8_build() takes 2 positional arguments but 3 were given
FAILED keras_hub/src/models/opt/opt_backbone_test.py::OPTBackboneTest::test_backbone_basics - TypeError: _int8_build() takes 2 positional arguments but 3 were given
FAILED keras_hub/src/models/pali_gemma/pali_gemma_backbone_test.py::PaliGemmaBackboneTest::test_backbone_basics - TypeError: _int8_build() takes 2 positional arguments but 3 were given
FAILED keras_hub/src/models/pali_gemma/pali_gemma_backbone_test.py::PaliGemma2BackboneTest::test_backbone_basics - TypeError: _int8_build() takes 2 positional arguments but 3 were given
FAILED keras_hub/src/models/phi3/phi3_backbone_test.py::Phi3Test::test_backbone_basics - TypeError: _int8_build() takes 2 positional arguments but 3 were given
FAILED keras_hub/src/models/phi3/phi3_backbone_test.py::Phi3Test::test_backbone_basics_with_su_rotary - TypeError: _int8_build() takes 2 positional arguments but 3 were given
FAILED keras_hub/src/models/roberta/roberta_backbone_test.py::RobertaBackboneTest::test_backbone_basics - TypeError: _int8_build() takes 2 positional arguments but 3 were given
FAILED keras_hub/src/models/siglip/siglip_backbone_test.py::SigLIPBackboneTest::test_backbone_basics - TypeError: _int8_build() takes 2 positional arguments but 3 were given
FAILED keras_hub/src/models/siglip/siglip_backbone_test.py::SigLIP2BackboneTest::test_backbone_basics - TypeError: _int8_build() takes 2 positional arguments but 3 were given
FAILED keras_hub/src/models/t5/t5_backbone_test.py::T5BackboneTest::test_backbone_basics - TypeError: _int8_build() takes 2 positional arguments but 3 were given
FAILED keras_hub/src/models/whisper/whisper_backbone_test.py::WhisperBackboneTest::test_backbone_basics - TypeError: _int8_build() takes 2 positional arguments but 3 were given
FAILED keras_hub/src/models/xlm_roberta/xlm_roberta_backbone_test.py

@mattdangerw @sachinprasadhs
Is this a problem with the test environment? Why are there so many failures that are unrelated to my changes?

@sachinprasadhs
Collaborator

It's not related to your code. It looks like an issue with the JAX backend; we will look into it.
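
One way to confirm that a long failure list like the one above shares a single root cause is to group the pytest summary lines by error message. The sketch below is a minimal illustration, assuming the log uses the `FAILED <test id> - <message>` layout shown in this thread; the helper name `count_failure_messages` is hypothetical, and the two sample lines are copied from the log above.

```python
import re
from collections import Counter

# Matches pytest short-summary lines: "FAILED <test id> - <error message>".
FAILED_LINE = re.compile(r"^FAILED\s+(?P<test_id>\S+)\s+-\s+(?P<message>.+)$")


def count_failure_messages(log_text):
    """Count how many FAILED lines share each error message."""
    counts = Counter()
    for line in log_text.splitlines():
        match = FAILED_LINE.match(line.strip())
        if match:
            counts[match.group("message")] += 1
    return counts


sample_log = """\
FAILED keras_hub/src/models/bert/bert_backbone_test.py::BertBackboneTest::test_backbone_basics - TypeError: _int8_build() takes 2 positional arguments but 3 were given
FAILED keras_hub/src/models/t5/t5_backbone_test.py::T5BackboneTest::test_backbone_basics - TypeError: _int8_build() takes 2 positional arguments but 3 were given
"""

# Print each distinct error message with the number of tests that hit it.
for message, count in count_failure_messages(sample_log).most_common():
    print(count, message)
```

If every failing test reports the same message, as here, the cause is more likely the environment or a dependency than any single model's code.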

Collaborator

@sachinprasadhs sachinprasadhs left a comment

Thanks for the PR. I have added my comments. Please also add the checkpoint conversion under: keras-hub/tools/checkpoint_conversion

intermediate_dim: int. The output dimension of the first Dense layer in
a two-layer feedforward network for each transformer.
dropout: float. Dropout probability for the Transformer encoder.
layer_norm_eps:bool.Should we use ln after embedding?
Collaborator

I didn't get the point here. Are you asking for our input, or is this the arg description? If it is the arg description, it needs to be rephrased without question marks, and the argument name is emb_layer_norm_before.

The layer_norm_eps description needs to be updated.

@pass-lin
Contributor Author

pass-lin commented May 17, 2025

@sachinprasadhs @mattdangerw
Can anybody review my code?

@pass-lin
Contributor Author

pass-lin commented Jun 2, 2025

@mattdangerw @sachinprasadhs
Please check my code, thank you.

Collaborator

@sachinprasadhs sachinprasadhs left a comment

Added a few more comments. A few of the previous review comments still need to be addressed.

Disclaimer: Pre-trained models are provided on an "as is" basis, without
warranties or conditions of any kind.

Args:
Collaborator

The descriptions for activation and max_wavelength are still missing!

Disclaimer: Pre-trained models are provided on an "as is" basis, without
warranties or conditions of any kind.

Args:
Collaborator

Add an arg description for pad_token_id as well.

Comment on lines 45 to 46
position_embedding_type:esm1 use abs position embeding,esm2 use rope.
so this parameter is only except for absolute and rotary.
Collaborator

@sachinprasadhs sachinprasadhs Jun 2, 2025

This still needs to be changed to:

position_embedding_type: str. The position embedding type to use. One of "absolute" and
"rotary". Use "absolute" for ESM1. Use "rotary" for ESM2. Defaults to "rotary".
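
A docstring like this is usually paired with an argument check. The sketch below shows one way such validation could look in plain Python; the function name `check_position_embedding_type` and the constant `SUPPORTED_POSITION_EMBEDDINGS` are hypothetical and are not taken from the PR's code.

```python
# Allowed values, as described in the suggested docstring.
SUPPORTED_POSITION_EMBEDDINGS = ("absolute", "rotary")


def check_position_embedding_type(position_embedding_type="rotary"):
    """Return the value unchanged if it is supported, else raise ValueError."""
    if position_embedding_type not in SUPPORTED_POSITION_EMBEDDINGS:
        raise ValueError(
            "`position_embedding_type` must be one of "
            f"{SUPPORTED_POSITION_EMBEDDINGS}. "
            f"Received: position_embedding_type={position_embedding_type!r}"
        )
    return position_embedding_type


print(check_position_embedding_type())            # falls back to the default
print(check_position_embedding_type("absolute"))  # the ESM1-style setting
```

Failing early with the list of accepted values keeps a typo in the argument from surfacing later as a harder-to-read error.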



@keras_hub_export("keras_hub.models.ESMProteinClassifierPreprocessor")
class ESMProteinClassifierPreprocessor(BertTextClassifierPreprocessor):
Collaborator

@sachinprasadhs sachinprasadhs Jun 2, 2025

Pending change: this class should subclass TextClassifierPreprocessor instead of BertTextClassifierPreprocessor.

max_sequence_length=1024,
max_wavelength=10000,
layer_norm_eps=1e-12,
emb_layer_norm_before=False,
Collaborator

@sachinprasadhs sachinprasadhs Jun 2, 2025

Pending change: rename emb_layer_norm_before to use_pre_layer_norm.



@keras_hub_export("keras_hub.models.ESMProteinClassifier")
class ESMProteinClassifier(RobertaTextClassifier):
Collaborator

@sachinprasadhs sachinprasadhs Jun 2, 2025

Pending change.
You can subclass TextClassifier and make the same changes as RobertaTextClassifier, instead of subclassing from another model.

@sachinprasadhs
Collaborator

Once you address all the comments, add an end-to-end working Colab along with the checkpoint conversion under: keras-hub/tools/checkpoint_conversion

@divyashreepathihalli divyashreepathihalli added the kokoro:force-run Runs Tests on GPU label Jul 21, 2025
@kokoro-team kokoro-team removed the kokoro:force-run Runs Tests on GPU label Jul 21, 2025
@pass-lin
Contributor Author

> Hi, just one small comment; everything else looks good. Please replace the head_size argument in both the backbone and classifier examples with the modified argument name, which I believe is num_heads?

Thanks for the reminder, I'll make changes now.

@pass-lin
Contributor Author

> Thank you for the PR @pass-lin. Looks great already. Added a few NIT comments. Thanks for the train demo, do you have a demo that shows output for classifier and PLM with presets loaded?

Thanks for your review, relevant changes have been made.

@pass-lin
Contributor Author

I forgot to switch branches while working on other tasks, which caused this error. It has been fixed now, @sachinprasadhs.

@sachinprasadhs sachinprasadhs added the kokoro:force-run Runs Tests on GPU label Jul 22, 2025
@kokoro-team kokoro-team removed the kokoro:force-run Runs Tests on GPU label Jul 22, 2025
@pass-lin
Contributor Author

@sachinprasadhs This test failure should be unrelated to my changes.

@sachinprasadhs
Collaborator

Yes, you can ignore the GPU Test failures

@pass-lin
Contributor Author

> Yes, you can ignore the GPU Test failures

So, can we merge now?

@divyashreepathihalli divyashreepathihalli added the kokoro:force-run Runs Tests on GPU label Jul 25, 2025
@kokoro-team kokoro-team removed the kokoro:force-run Runs Tests on GPU label Jul 25, 2025
@divyashreepathihalli
Collaborator

The GPU tests are fixed, please rebase with master. Will run the GPU tests again

@divyashreepathihalli divyashreepathihalli added the kokoro:force-run Runs Tests on GPU label Jul 25, 2025
@kokoro-team kokoro-team removed the kokoro:force-run Runs Tests on GPU label Jul 25, 2025
@pass-lin
Contributor Author

> The GPU tests are fixed, please rebase with master. Will run the GPU tests again

It seems that there are no problems with the test

@sachinprasadhs sachinprasadhs added the kokoro:force-run Runs Tests on GPU label Jul 28, 2025
@kokoro-team kokoro-team removed the kokoro:force-run Runs Tests on GPU label Jul 28, 2025
Collaborator

@sachinprasadhs sachinprasadhs left a comment

LGTM

@pass-lin
Contributor Author

pass-lin commented Aug 2, 2025

@divyashreepathihalli Can this PR be merged now?

@divyashreepathihalli divyashreepathihalli merged commit 3f8d3b7 into keras-team:master Aug 11, 2025
10 checks passed
@github-project-automation github-project-automation bot moved this from In Progress to Done in KerasHub Aug 11, 2025
@divyashreepathihalli
Collaborator

Thank you for the contribution!!
