Differential Binarization model #2095
Conversation
Took a high-level pass and left some comments.
Also, make all the file names follow the same format as the other files, e.g. for db_utils and losses.py.
The inputs to `generate` are `"prompts"`, not `"text"`. Fixes keras-team#1685
* Routine HF sync; code reformat.
* Bump the python group with 2 updates:
  - `torch` from 2.6.0+cu126 to 2.7.0+cu126
  - `torchvision` from 0.21.0+cu126 to 0.22.0+cu126
  Signed-off-by: dependabot[bot] <support@github.com>
* Modify TransformerEncoder masking documentation; add space before parenthesis.
* Fix Mistral conversion script. This commit addresses several issues in the Mistral checkpoint conversion script:
  - Adds `dropout` to the model initialization to match the Hugging Face model.
  - Replaces `requests.get` with `hf_hub_download` for more reliable tokenizer downloads.
  - Adds support for both `tokenizer.model` and `tokenizer.json` to handle different Mistral versions.
  - Fixes a `TypeError` in the `save_to_preset` function call.
  Co-authored-by: laxmareddyp <laxmareddyp@laxma-n2-highmem-256gbram.us-central1-f.c.gtech-rmi-dev.internal>
* Update requirements on [tensorflow-cpu](https://github.com/tensorflow/tensorflow), [tensorflow](https://github.com/tensorflow/tensorflow), [tensorflow-text](https://github.com/tensorflow/text), torch, torchvision and [tensorflow[and-cuda]](https://github.com/tensorflow/tensorflow) to permit the latest versions:
  - `tensorflow-cpu`, `tensorflow`, `tensorflow-text`, and `tensorflow[and-cuda]` to 2.19.0
  - `torch` from 2.7.0+cu126 to 2.7.1+cu126
  - `torchvision` from 0.22.0+cu126 to 0.22.1+cu126
  Signed-off-by: dependabot[bot] <support@github.com>
* Qwen3: init, updates, and bug fixes; add a Qwen causal LM test; fix Qwen3 tests.
* Support flash-attn on the torch backend; resolve merge conflicts; format.
* HGNetV2: add initial project structure and files; fix a small weight-loading bug in the conversion script; add a TIMM preprocessing layer; consolidate stage configurations and improve API consistency; fix an unexpected-argument error in JAX with Keras 3.5; small addition for the upcoming D-FINE (no changes to existing HGNetV2); remove a non-essential conditional for D-FINE JIT compile; address reviews and fix some nits.
* Register Qwen3 presets; fix format.
This reverts commit 80b3c56.
Added my review comments. In the issue description, add the original implementation, the reference paper, the training colab, and an end-to-end working demo of the trained model including post-processing.
if head_kernel_list is None:
    head_kernel_list = [3, 2, 2]
if image_shape is None:
    image_shape = (640, 640, 3)
Why do we need `head_kernel_list`?
If it is common to all the implementations, you can make it a default argument; otherwise it should be part of the config for the specific checkpoint.
I added `head_kernel_list` like this, as the Gemini code review suggested (using mutable default arguments like lists is a common pitfall in Python and can lead to unexpected behavior).
I updated this part to use the config; you'll be able to see the changes in the next commit.
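For reference, the `None`-default pattern suggested by the code review sidesteps Python's mutable-default pitfall. A minimal sketch with a hypothetical `build_head` helper (not the PR's actual function):

```python
def build_head(head_kernel_list=None, image_shape=None):
    # A mutable default (e.g. `head_kernel_list=[3, 2, 2]`) is created
    # once at function definition time and shared across all calls, so
    # any mutation leaks into later calls. `None` plus an in-body
    # assignment gives every call a fresh default instead.
    if head_kernel_list is None:
        head_kernel_list = [3, 2, 2]
    if image_shape is None:
        image_shape = (640, 640, 3)
    return head_kernel_list, image_shape
```

Because the default list is rebuilt on each call, mutating a returned list never changes what a later call sees.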
)(topdown_p2)
featuremap_p4 = layers.UpSampling2D((4, 4), dtype=dtype)(featuremap_p4)
featuremap_p3 = layers.UpSampling2D((2, 2), dtype=dtype)(featuremap_p3)
featuremap_p2 = layers.UpSampling2D((1, 1), dtype=dtype)(featuremap_p2)
Address this comment: the `UpSampling2D((1, 1))` call can be removed since it doesn't do anything.
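For context, `UpSampling2D(size)` repeats rows and columns `size` times, so a `(1, 1)` factor is an identity operation. A quick NumPy check of the equivalent repeat:

```python
import numpy as np

x = np.random.rand(1, 8, 8, 4).astype("float32")
# Nearest-neighbor UpSampling2D repeats each row/column `size` times;
# with size (1, 1) every element is repeated exactly once, i.e. a no-op.
y = np.repeat(np.repeat(x, 1, axis=1), 1, axis=2)
assert y.shape == x.shape
assert np.array_equal(y, x)
```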
run_mixed_precision_check=False,
run_quantization_check=False,
run_data_format_check=False,
Enable these tests
if backend.backend() == "jax":
    pytest.skip(
        "JAX backend does not support this test due to NaN issues."
    )
Maybe you should investigate and make it work on JAX as well.
The probability map output generated by `predict()` can be translated into
polygon representation using `model.postprocess_to_polygons()`.
If `postprocess_to_polygons` is not implemented, you need to update the docstring accordingly.
Applied the changes with the relevant docstring; you'll be able to see the changes in the next commit.
`map_output` now holds an 8x224x224x3 tensor, where the last dimension
corresponds to the model's probability map, threshold map and binary map
outputs. Use `postprocess_to_polygons()` to obtain a polygon
representation:
```python
detector.postprocess_to_polygons(map_output[..., 0])
```
same here.
import keras

def Polygon(coords):
Follow snake casing
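Per PEP 8, CapWords is reserved for classes, so a module-level function named `Polygon` would conventionally use snake_case. A hypothetical rename (the helper name is illustrative, not from the PR):

```python
# snake_case for functions; CapWords like `Polygon` is for classes.
def make_polygon(coords):
    """Hypothetical snake_case stand-in for the test helper."""
    return [tuple(point) for point in coords]
```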
image_size=(640, 640),
annotation_size=(640, 360),
Where are we using these two arguments?
I'll remove the annotation_size argument, which I missed. Earlier I was using a diffbin_utils function with the image-text data preprocessor. You'll be able to see the changes in the next commit.
@keras_hub_export("keras_hub.models.ImageTextDetectorPreprocessor")
class ImageTextDetectorPreprocessor(Preprocessor):
I don't see any difference between ImageClassifierPreprocessor and this; if it is used only for preprocessing images, maybe you can just use ImageClassifierPreprocessor.
This is because earlier I was using diffbin_utils functions (some cv2/shapely functionality reimplemented with keras.ops), but that did not give results as accurate as cv2 and shapely. So I used this ImageClassifierPreprocessor.
Then we can subclass ImageClassifierPreprocessor, like other vision models.
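The subclassing suggestion could look roughly like this; the base class below is a stand-in, since the real `ImageClassifierPreprocessor` lives in keras_hub:

```python
class ImageClassifierPreprocessor:
    """Stand-in for keras_hub's ImageClassifierPreprocessor."""

    def __call__(self, images):
        # The real class runs the resize/rescale image pipeline here.
        return images


class ImageTextDetectorPreprocessor(ImageClassifierPreprocessor):
    # Inherits the whole image pipeline unchanged; only the exported
    # name (and any future detector-specific hooks) would differ.
    pass
```

This keeps a single source of truth for image preprocessing while still giving the detector its own exported symbol.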
Differential Binarization model.
Notebook: https://colab.sandbox.google.com/gist/mehtamansi29/86289c0ef273772f5de309ce20d0292d/diffbin_training_2.ipynb