Skip to content

Conversation

@jbarrow
Copy link
Contributor

@jbarrow jbarrow commented Dec 8, 2023

Added and documented the BERT implementation. It currently uses the HF tokenizer, because that handles a lot of nice things (like padding, masking, and token_type_ids).

It's tested against the HuggingFace implementation at: batch sizes >= 1, different models (bert-base-uncased, bert-base-cased, bert-large-uncased, bert-large-cased), etc.

@awni awni self-requested a review December 8, 2023 15:24
Copy link
Member

@awni awni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really nicely done!! I love it. I left some mostly minor comments.

Please address and then we can merge it into the examples.

Also could you add a requirements.txt for dependenies?

@jbarrow
Copy link
Contributor Author

jbarrow commented Dec 9, 2023

Outside of anything that requires a change to MLX core, I think I've made the changes.

I'll put off removing the MultiHeadAttention class until I've made that change, it's been pulled into MLX, and there's been a released version. 😄

@jbarrow
Copy link
Contributor Author

jbarrow commented Dec 9, 2023

Submitted and tested a PR for the bias change in mlx. I can submit a new PR here once it's merged/released.

Copy link
Member

@awni awni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, just two more comments, then we can merge this and follow up after the change to core multiheadedattention.

Copy link
Member

@awni awni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Brilliant, thank you for putting this together!

@awni awni merged commit 46c6bbe into ml-explore:main Dec 9, 2023
Blaizzy pushed a commit to Blaizzy/mlx-examples that referenced this pull request Mar 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants