
Differential Binarization model #2095


Open · wants to merge 92 commits into master

Conversation

@mehtamansi29 (Collaborator) commented Feb 12, 2025

@sachinprasadhs sachinprasadhs added the WIP Pull requests which are work in progress and not ready yet for review. label Apr 11, 2025
@sachinprasadhs (Collaborator) left a comment

Took a high-level pass and left some comments.
Also, make all the file names follow the same format as the other files, e.g. db_utils and losses.py.

hertschuh and others added 25 commits July 22, 2025 12:14
The inputs to `generate` are `"prompts"`, not `"text"`.

Fixes keras-team#1685
* routine HF sync

* code reformat
Bumps the python group with 2 updates: torch and torchvision.


Updates `torch` from 2.6.0+cu126 to 2.7.0+cu126

Updates `torchvision` from 0.21.0+cu126 to 0.22.0+cu126

---
updated-dependencies:
- dependency-name: torch
  dependency-version: 2.7.0+cu126
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: python
- dependency-name: torchvision
  dependency-version: 0.22.0+cu126
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: python
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Modify TransformerEncoder masking documentation

* Added space before parenthesis
* Fix Mistral conversion script

This commit addresses several issues in the Mistral checkpoint conversion script:

- Adds `dropout` to the model initialization to match the Hugging Face model.
- Replaces `requests.get` with `hf_hub_download` for more reliable tokenizer downloads (see the sketch after this commit message).
- Adds support for both `tokenizer.model` and `tokenizer.json` to handle different Mistral versions.
- Fixes a `TypeError` in the `save_to_preset` function call.

* address format issues

* adapted to the latest hub style

* address format issues

---------

Co-authored-by: laxmareddyp <laxmareddyp@laxma-n2-highmem-256gbram.us-central1-f.c.gtech-rmi-dev.internal>
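
A minimal sketch of the tokenizer download approach described in the commit message above, assuming the standard huggingface_hub client; the repo id is illustrative, and the fallback mirrors the tokenizer.model / tokenizer.json handling mentioned in the bullets:

```python
from huggingface_hub import hf_hub_download

# Illustrative repo id; the conversion script derives it from the preset.
repo_id = "mistralai/Mistral-7B-v0.1"
try:
    # Older Mistral releases ship a SentencePiece tokenizer.model.
    tokenizer_path = hf_hub_download(repo_id=repo_id, filename="tokenizer.model")
except Exception:
    # Newer releases ship only tokenizer.json.
    tokenizer_path = hf_hub_download(repo_id=repo_id, filename="tokenizer.json")
print(tokenizer_path)
```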
Updates the requirements on [tensorflow-cpu](https://github.com/tensorflow/tensorflow), [tensorflow](https://github.com/tensorflow/tensorflow), [tensorflow-text](https://github.com/tensorflow/text), torch, torchvision and [tensorflow[and-cuda]](https://github.com/tensorflow/tensorflow) to permit the latest version.

Updates `tensorflow-cpu` to 2.19.0
- [Release notes](https://github.com/tensorflow/tensorflow/releases)
- [Changelog](https://github.com/tensorflow/tensorflow/blob/master/RELEASE.md)
- [Commits](tensorflow/tensorflow@v2.18.1...v2.19.0)

Updates `tensorflow` to 2.19.0
- [Release notes](https://github.com/tensorflow/tensorflow/releases)
- [Changelog](https://github.com/tensorflow/tensorflow/blob/master/RELEASE.md)
- [Commits](tensorflow/tensorflow@v2.18.1...v2.19.0)

Updates `tensorflow-text` to 2.19.0
- [Release notes](https://github.com/tensorflow/text/releases)
- [Commits](tensorflow/text@v2.18.0...v2.19.0)

Updates `torch` from 2.7.0+cu126 to 2.7.1+cu126

Updates `torchvision` from 0.22.0+cu126 to 0.22.1+cu126

Updates `tensorflow[and-cuda]` to 2.19.0
- [Release notes](https://github.com/tensorflow/tensorflow/releases)
- [Changelog](https://github.com/tensorflow/tensorflow/blob/master/RELEASE.md)
- [Commits](tensorflow/tensorflow@v2.18.0...v2.19.0)

---
updated-dependencies:
- dependency-name: tensorflow-cpu
  dependency-version: 2.19.0
  dependency-type: direct:production
  dependency-group: python
- dependency-name: tensorflow
  dependency-version: 2.19.0
  dependency-type: direct:production
  dependency-group: python
- dependency-name: tensorflow-text
  dependency-version: 2.19.0
  dependency-type: direct:production
  dependency-group: python
- dependency-name: torch
  dependency-version: 2.7.1+cu126
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: python
- dependency-name: torchvision
  dependency-version: 0.22.1+cu126
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: python
- dependency-name: tensorflow[and-cuda]
  dependency-version: 2.19.0
  dependency-type: direct:production
  dependency-group: python
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* init

* update

* bug fixes

* add qwen causal lm test

* fix qwen3 tests
* support flash-attn at torch backend

* fix

* fix

* fix

* fix conflict

* fix conflict

* fix conflict

* fix conflict

* fix conflict

* fix conflict

* format
* init: Add initial project structure and files

* bug: Small bug related to weight loading in the conversion script

* finalizing: Add TIMM preprocessing layer

* incorporate reviews: Consolidate stage configurations and improve API consistency

* bug: Unexpected argument error in JAX with Keras 3.5

* small addition for the D-FINE to come: No changes to the existing HGNetV2

* D-FINE JIT compile: Remove non-essential conditional statement

* refactor: Address reviews and fix some nits
* Register qwen3 presets

* fix format
@sachinprasadhs sachinprasadhs removed the WIP Pull requests which are work in progress and not ready yet for review. label Aug 7, 2025
@sachinprasadhs (Collaborator) left a comment

I've added my review comments. In the issue description, please add the original implementation, the reference paper, a training Colab, and an end-to-end working demo of the trained model, including post-processing.

Comment on lines +39 to +42
if head_kernel_list is None:
    head_kernel_list = [3, 2, 2]
if image_shape is None:
    image_shape = (640, 640, 3)
Collaborator

Why do we need this? If head_kernel_list is common to all the implementations, you can make it a default argument; otherwise it can be part of the config for the specific checkpoint.

Collaborator Author

I added head_kernel_list this way because the Gemini code review suggested it (using mutable default arguments such as lists is a common pitfall in Python and can lead to unexpected behavior).
I have updated this part to use the config, as sketched below; the changes will be visible in the next commit.
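
A minimal sketch of the two options discussed here; `build_head` and its arguments are hypothetical stand-ins for the backbone constructor in this PR:

```python
# Option 1: an immutable tuple default sidesteps the mutable-default pitfall.
def build_head(head_kernel_list=(3, 2, 2), image_shape=(640, 640, 3)):
    # Convert to a list only if downstream code needs to mutate it.
    return {"head_kernel_list": list(head_kernel_list), "image_shape": image_shape}


# Option 2: keep the values in the per-checkpoint config instead.
DEFAULT_CONFIG = {"head_kernel_list": [3, 2, 2], "image_shape": (640, 640, 3)}


def build_head_from_config(config=None):
    # Per-checkpoint values override the defaults.
    return {**DEFAULT_CONFIG, **(config or {})}
```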

)(topdown_p2)
featuremap_p4 = layers.UpSampling2D((4, 4), dtype=dtype)(featuremap_p4)
featuremap_p3 = layers.UpSampling2D((2, 2), dtype=dtype)(featuremap_p3)
featuremap_p2 = layers.UpSampling2D((1, 1), dtype=dtype)(featuremap_p2)
Collaborator

Address this comment: the `UpSampling2D((1, 1))` layer can be removed since it doesn't do anything.
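
A sketch of the suggested simplification, reusing the names from the quoted diff above (so not self-contained on its own):

```python
featuremap_p4 = layers.UpSampling2D((4, 4), dtype=dtype)(featuremap_p4)
featuremap_p3 = layers.UpSampling2D((2, 2), dtype=dtype)(featuremap_p3)
# UpSampling2D((1, 1)) is an identity op, so featuremap_p2 is used as-is.
```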

Comment on lines +37 to +39
run_mixed_precision_check=False,
run_quantization_check=False,
run_data_format_check=False,
Collaborator

Enable these tests
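
A sketch under the assumption that these flags are passed to keras_hub's shared `run_backbone_test` helper; dropping the `run_*_check=False` overrides restores the default (enabled) checks. The class name, fixtures, and output shape below are placeholders:

```python
self.run_backbone_test(
    cls=DiffBinBackbone,           # placeholder backbone class name
    init_kwargs=self.init_kwargs,  # placeholder test fixtures
    input_data=self.input_data,
    expected_output_shape=(2, 160, 160, 3),  # placeholder shape
    # run_mixed_precision_check, run_quantization_check and
    # run_data_format_check default to True, so omitting them
    # enables all three checks.
)
```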

Comment on lines +16 to +19
if backend.backend() == "jax":
pytest.skip(
"JAX backend does not support this test due to NaN issues."
)
Collaborator

Maybe you should investigate and make it work on JAX as well.
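
One way to start that investigation, as a sketch: JAX can raise at the first op that produces a NaN instead of letting it propagate, which usually points at the offending layer or loss term.

```python
import jax
import jax.numpy as jnp

# Raise a FloatingPointError as soon as any op produces a NaN.
jax.config.update("jax_debug_nans", True)


def risky_log(x):
    # log of a negative value yields NaN; with jax_debug_nans enabled,
    # the failing op is reported immediately with a traceback.
    return jnp.log(x)


print(risky_log(jnp.array([1.0, 2.0])))  # fine
# print(risky_log(jnp.array([-1.0])))    # would raise under jax_debug_nans
```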

Comment on lines +21 to +22
The probability map output generated by `predict()` can be translated into
polygon representation using `model.postprocess_to_polygons()`.
Collaborator

If this postprocess_to_polygons is not implemented, you need to update the docstring accordingly.

Collaborator Author

Applied the changes with the relevant docstring; the changes will be visible in the next commit.

Comment on lines +46 to +52
`map_output` now holds an 8x224x224x3 tensor, where the last dimension
corresponds to the model's probability map, threshold map and binary map
outputs. Use `postprocess_to_polygons()` to obtain a polygon
representation:
```python
detector.postprocess_to_polygons(map_output[...,0])
```
Collaborator

same here.

import keras


def Polygon(coords):
Collaborator

Follow snake_case naming.
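
For example, a hypothetical rename sketch (the body here is a placeholder; the real implementation is in the PR):

```python
import keras


def polygon_from_coords(coords):
    """Return the polygon coordinates as a tensor (placeholder body)."""
    return keras.ops.convert_to_tensor(coords)
```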

Comment on lines +34 to +35
image_size=(640, 640),
annotation_size=(640, 360),
Collaborator

Where are we using these two arguments?

Collaborator Author

I'll remove the annotation_size argument, which I missed. Earlier I was using the diffbin_utils functions with the image-text data preprocessor. The changes will be visible in the next commit.



@keras_hub_export("keras_hub.models.ImageTextDetectorPreprocessor")
class ImageTextDetectorPreprocessor(Preprocessor):
Collaborator

I don't see any difference between ImageClassifierPreprocessor and this. If it is used only for preprocessing images, maybe you can just use ImageClassifierPreprocessor.

@mehtamansi29 (Collaborator Author) commented Aug 11, 2025

This is because earlier I was using the diffbin_utils functions (some of the cv2- and shapely-based operations were recreated with keras.ops), but that did not give results as accurate as the cv2 and shapely versions. So I used this ImageClassifierPreprocessor.

Collaborator

Then we can subclass ImageClassifierPreprocessor, like the other vision models do.
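
A minimal sketch of that pattern, following how other keras_hub vision models register their preprocessors; the DiffBin module paths and class names below are assumptions, not the ones in this PR:

```python
from keras_hub.src.api_export import keras_hub_export
# These two classes are assumed stand-ins for the ones added in this PR.
from keras_hub.src.models.diffbin.diffbin_backbone import DiffBinBackbone
from keras_hub.src.models.diffbin.diffbin_image_converter import (
    DiffBinImageConverter,
)
from keras_hub.src.models.image_classifier_preprocessor import (
    ImageClassifierPreprocessor,
)


@keras_hub_export("keras_hub.models.DiffBinImageTextDetectorPreprocessor")
class DiffBinImageTextDetectorPreprocessor(ImageClassifierPreprocessor):
    # Wire the preprocessor to its backbone and image converter,
    # mirroring the other vision models in keras_hub.
    backbone_cls = DiffBinBackbone
    image_converter_cls = DiffBinImageConverter
```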

@sachinprasadhs sachinprasadhs added the kokoro:force-run Runs Tests on GPU label Aug 7, 2025
@kokoro-team kokoro-team removed the kokoro:force-run Runs Tests on GPU label Aug 7, 2025