QR Code detector fails for integer messages longer than 3 digits #23

simon-staal · 2023-05-31T17:00:13Z

There appears to be a bug in the QR-code detection which causes it to fail when a numeric value with 4 or more digits is encoded. The QR codes are successfully decoded when using openCV's decoder implementation. This was discovered when benchmarking the performance of QR code detection. I'm assuming this is a problem with BoofCV, but opening an issue here just in case.

Sample code

import pyboof as pb
import segno
from time import time
from random import randint
# uninstall Java JDK after testing
TEST_FILE = 'qr_test.png'
ATTEMPTS = 5000

qr_pyboof = pb.FactoryFiducial(np.uint8).qrcode()
total_pyboof = 0

for i in range(1000, ATTEMPTS): # Only testing values > 999
    x = i
    qrcode = segno.make(x, version=1)
    assert qrcode.error == 'H'
    qrcode.save(TEST_FILE, scale = 5)

    img = cv2.imread(TEST_FILE)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    img = cv2.resize(img, (88, 88))

    start = time()
    image = pb.ndarray_to_boof(img)
    qr_pyboof.detect(image)
    end = time()
    assert len(qr_pyboof.detections) == 0, f"Successfully detected {len(qr_pyboof.detections)} qr codes (input = {x})"
    total_pyboof += (end - start) * 1000

print(f'Took an average of {(total_pyboof / ATTEMPTS):.2f}ms for {ATTEMPTS} attempts')

lessthanoptimal · 2023-05-31T21:27:34Z

Hi Simon, So far I've NOT been able to reproduce this issue and it seems to run just fine. Could you share an image that you have problems with? Here's my dump from venv "pip freeze > requirements.txt" numpy==1.24.3 opencv-python==4.7.0.72 py4j==0.10.9.7 PyBoof==0.41 segno==1.5.2 six==1.16.0 transforms3d==0.4.1 I did have to modify the example above because it was missing imports and wouldn't run: import pyboof as pb import segno import numpy as np << new import cv2 << new from time import time from random import randint FYI on my system I get 7.41 ms Cheers, - Peter

…

On Wed, May 31, 2023 at 10:00 AM Simon Staal ***@***.***> wrote: There appears to be a bug in the QR-code detection which causes it to fail when a numeric value with 4 or more digits is encoded. The QR codes are successfully decoded when using openCV's decoder implementation. This was discovered when benchmarking the performance of QR code detection. I'm assuming this is a problem with BoofCV, but opening an issue here just in case. Sample code import pyboof as pbimport segnofrom time import timefrom random import randint# uninstall Java JDK after testingTEST_FILE = 'qr_test.png'ATTEMPTS = 5000 qr_pyboof = pb.FactoryFiducial(np.uint8).qrcode()total_pyboof = 0 for i in range(1000, ATTEMPTS): # Only testing values > 999 x = i qrcode = segno.make(x, version=1) assert qrcode.error == 'H' qrcode.save(TEST_FILE, scale = 5) img = cv2.imread(TEST_FILE) img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) img = cv2.resize(img, (88, 88)) start = time() image = pb.ndarray_to_boof(img) qr_pyboof.detect(image) end = time() assert len(qr_pyboof.detections) == 0, f"Successfully detected {len(qr_pyboof.detections)} qr codes (input = {x})" total_pyboof += (end - start) * 1000 print(f'Took an average of {(total_pyboof / ATTEMPTS):.2f}ms for {ATTEMPTS} attempts') — Reply to this email directly, view it on GitHub <#23>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAFUOV5FX4K3ZQC6WJQVUCLXI52KRANCNFSM6AAAAAAYVYAQTU> . You are receiving this because you are subscribed to this thread.Message ID: ***@***.***>

-- "Now, now my good man, this is no time for making enemies." — Voltaire (1694-1778), on his deathbed in response to a priest asking that he renounce Satan.

lessthanoptimal · 2023-05-31T22:04:20Z

Realized that the code you posted only runs if the detector fails. Knowing that I ran the QR through a diagnostic application. it's failing because it claims padding bytes are malformed. Now the issue is trying to figure out if the QR is really malformed or not. If it is malformed, should that just be ignored since most [1] other detectors ignore that problem? QR codes are a bit chaotic since the standard is treated as a suggestion that's often ignored. [1] Sample size of 3. I tried a few readers I had on my phone.

…

On Wed, May 31, 2023 at 2:27 PM Peter A ***@***.***> wrote: Hi Simon, So far I've NOT been able to reproduce this issue and it seems to run just fine. Could you share an image that you have problems with? Here's my dump from venv "pip freeze > requirements.txt" numpy==1.24.3 opencv-python==4.7.0.72 py4j==0.10.9.7 PyBoof==0.41 segno==1.5.2 six==1.16.0 transforms3d==0.4.1 I did have to modify the example above because it was missing imports and wouldn't run: import pyboof as pb import segno import numpy as np << new import cv2 << new from time import time from random import randint FYI on my system I get 7.41 ms Cheers, - Peter On Wed, May 31, 2023 at 10:00 AM Simon Staal ***@***.***> wrote: > There appears to be a bug in the QR-code detection which causes it to > fail when a numeric value with 4 or more digits is encoded. The QR codes > are successfully decoded when using openCV's decoder implementation. This > was discovered when benchmarking the performance of QR code detection. I'm > assuming this is a problem with BoofCV, but opening an issue here just in > case. > > Sample code > > import pyboof as pbimport segnofrom time import timefrom random import randint# uninstall Java JDK after testingTEST_FILE = 'qr_test.png'ATTEMPTS = 5000 > qr_pyboof = pb.FactoryFiducial(np.uint8).qrcode()total_pyboof = 0 > for i in range(1000, ATTEMPTS): # Only testing values > 999 > x = i > qrcode = segno.make(x, version=1) > assert qrcode.error == 'H' > qrcode.save(TEST_FILE, scale = 5) > > img = cv2.imread(TEST_FILE) > img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) > img = cv2.resize(img, (88, 88)) > > start = time() > image = pb.ndarray_to_boof(img) > qr_pyboof.detect(image) > end = time() > assert len(qr_pyboof.detections) == 0, f"Successfully detected {len(qr_pyboof.detections)} qr codes (input = {x})" > total_pyboof += (end - start) * 1000 > print(f'Took an average of {(total_pyboof / ATTEMPTS):.2f}ms for {ATTEMPTS} attempts') > > — > Reply to this email directly, view it on GitHub > <#23>, or unsubscribe > <https://github.com/notifications/unsubscribe-auth/AAFUOV5FX4K3ZQC6WJQVUCLXI52KRANCNFSM6AAAAAAYVYAQTU> > . > You are receiving this because you are subscribed to this thread.Message > ID: ***@***.***> > -- "Now, now my good man, this is no time for making enemies." — Voltaire (1694-1778), on his deathbed in response to a priest asking that he renounce Satan.

-- "Now, now my good man, this is no time for making enemies." — Voltaire (1694-1778), on his deathbed in response to a priest asking that he renounce Satan.

simon-staal · 2023-05-31T22:15:17Z

Sorry about the missing imports (and the lack of clarity for the code snippet above). I've tested the QR codes in the [1000-9999] range (generated in the same method as above) with the following libraries. For the most part they seem to work:

OpenCV (cv2.QRCodeDetector()): all work with the exception of 9133 (some larger 5 digit values also fail)
pyzbar (pyzbar.decode()): all work
pyzxing (pyzxing.BarCodeReader()): doesn't work for 2189, 4705, perhaps more (haven't been able to test exhaustively because it's quite slow)

In terms of environment, I've been testing this on Windows 10, WSL2 (Ubuntu 22) and a Raspberry Pi 3B+. Here's my venv dump for windows / WSL2:

joblib==1.2.0
numpy==1.24.3
opencv-python==4.7.0.72
py4j==0.10.9.7
PyBoof==0.41
pyzbar==0.1.9
pyzxing==1.0.2
segno==1.5.2
six==1.16.0
transforms3d==0.4.1

And for the pi:


joblib==1.2.0
numpy==1.24.3
opencv-python==4.6.0.66 // Different version of OpenCV
py4j==0.10.9.7
PyBoof==0.41
pyzbar==0.1.9
pyzxing==1.0.2
segno==1.5.2
six==1.16.0
transforms3d==0.4.1

lessthanoptimal · 2023-05-31T22:28:13Z

As a follow up to my last message. I'm fairly confident that this is a bug in segno. It's incorrectly selecting the first byte to add padding to when the number of bits is a multiple of 8. Since padding bits don't contain the message and bugs are plentiful in encoders most decoders ignore the padding bytes. That's actually the "fix" I'm planning on adding in. How it was encoded: firstPaddingByte = location/8 + 1 How it should have been encoded firstPaddingByte = location/8 + (location%8 == 0 ? 0: 1) In this case the message is 32 bytes, which is encoded in 4 bytes. First index of padding should be byte 4 not byte 5, which is how it's encoded. A bit surprised that it took this long to run into this "problem"

…

On Wed, May 31, 2023 at 3:15 PM Simon Staal ***@***.***> wrote: Sorry about the missing imports (and the lack of clarity for the code snippet above). I've tested the QR codes in the [1000-9999] range (generated in the same method as above) with the following libraries. For the most part they seem to work: - OpenCV (cv2.QRCodeDetector()): all work with the exception of 9133 (some larger 5 digit values also fail) - pyzbar (pyzbar.decode()): all work - pyzxing (pyzxing.BarCodeReader()): doesn't work for 2189, 4705, perhaps more (haven't been able to test exhaustively because it's quite slow) In terms of environment, I've been testing this on Windows 10, WSL2 (Ubuntu 22) and a Raspberry Pi 3B+. Here's my venv dump for windows / WSL2: joblib==1.2.0 numpy==1.24.3 opencv-python==4.7.0.72 py4j==0.10.9.7 PyBoof==0.41 pyzbar==0.1.9 pyzxing==1.0.2 segno==1.5.2 six==1.16.0 transforms3d==0.4.1 And for the pi: joblib==1.2.0 numpy==1.24.3 opencv-python==4.6.0.66 // Different version of OpenCV py4j==0.10.9.7 PyBoof==0.41 pyzbar==0.1.9 pyzxing==1.0.2 segno==1.5.2 six==1.16.0 transforms3d==0.4.1 — Reply to this email directly, view it on GitHub <#23 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAFUOV7PM5IJIWLXWDIOYK3XI67H7ANCNFSM6AAAAAAYVYAQTU> . You are receiving this because you commented.Message ID: ***@***.***>

-- "Now, now my good man, this is no time for making enemies." — Voltaire (1694-1778), on his deathbed in response to a priest asking that he renounce Satan.

simon-staal · 2023-05-31T22:40:38Z

Thanks for looking into it, I'll open an issue on segno regarding this! The following 5-digit messages also didn't work, are these also due to some kind of padding issue?

10110, 10427, 11207, 11303, 11674, 12389, 14244, 14515

lessthanoptimal · 2023-05-31T22:46:55Z

I looked at 10110. That's a BoofCV issue. There's a false positive for the finder pattern (square within a square) that's messing everything up. Good test case though!

…

On Wed, May 31, 2023 at 3:40 PM Simon Staal ***@***.***> wrote: Thanks for looking into it, I'll open an issue on segno regarding this! The following 5-digit messages also didn't work, are these also due to some kind of padding issue? 10110, 10427, 11207, 11303, 11674, 12389, 14244, 14515 — Reply to this email directly, view it on GitHub <#23 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAFUOVZ3ZSKJXVHRDLFGODDXI7CHBANCNFSM6AAAAAAYVYAQTU> . You are receiving this because you commented.Message ID: ***@***.***>

-- "Now, now my good man, this is no time for making enemies." — Voltaire (1694-1778), on his deathbed in response to a priest asking that he renounce Satan.

lessthanoptimal · 2023-05-31T23:06:01Z

Verified that other libraries are completely ignoring padding bytes by removing them from some QR codes and they had no issues reading the mutilated QR. An update will be out soon since there's a NPE that also needs to be fixed.

…

On Wed, May 31, 2023 at 3:46 PM Peter A ***@***.***> wrote: I looked at 10110. That's a BoofCV issue. There's a false positive for the finder pattern (square within a square) that's messing everything up. Good test case though! On Wed, May 31, 2023 at 3:40 PM Simon Staal ***@***.***> wrote: > Thanks for looking into it, I'll open an issue on segno regarding this! > The following 5-digit messages also didn't work, are these also due to some > kind of padding issue? > > 10110, 10427, 11207, 11303, 11674, 12389, 14244, 14515 > > — > Reply to this email directly, view it on GitHub > <#23 (comment)>, > or unsubscribe > <https://github.com/notifications/unsubscribe-auth/AAFUOVZ3ZSKJXVHRDLFGODDXI7CHBANCNFSM6AAAAAAYVYAQTU> > . > You are receiving this because you commented.Message ID: > ***@***.***> > -- "Now, now my good man, this is no time for making enemies." — Voltaire (1694-1778), on his deathbed in response to a priest asking that he renounce Satan.

-- "Now, now my good man, this is no time for making enemies." — Voltaire (1694-1778), on his deathbed in response to a priest asking that he renounce Satan.

lessthanoptimal · 2023-06-01T18:56:30Z

Can you try the new release and see if it fixes the 4-digit issue

simon-staal · 2023-06-01T19:16:10Z

It no longer breaks for all of them, so I'm assuming the 4-digit padding issue is solved, although it still fails in the following cases (exhaustively tested from [1000,9999]):

2474, 4916, 6079, 7243, 7531, 7763, 8138, 8698, 9158, 9372

I'm assuming these are due to similar issues as those found in the 5-digit numbers?

lessthanoptimal · 2023-06-01T22:27:25Z

Yeah those aren't going to be fixed right now. I'm going to create a new ticket referring this conversation. What the image above shows is that there's a false positive that's messing it up. In all the cases I checked it should be very clear that this is a bad decision to make.

I'm planning on adding a similar test to what you did to the regression test to ensure that this gets fixed. Thanks for finding this issue!

lessthanoptimal · 2023-06-01T22:29:01Z

Out of curiously, how does the speed compare to the other libraries? Also are you testing an application where you scan documents and have perfect QR codes?

simon-staal · 2023-06-04T11:24:47Z

Out of curiously, how does the speed compare to the other libraries? Also are you testing an application where you scan documents and have perfect QR codes?

Here are the benchmark results for detecting a single QR code, which looked very similar to the sample code above. At each iteration, a random QR code was generated, in the range [0-99999] (except for BoofCV where the [0-999] range was used due to the now solved padding issue), and then the time taken to detect the QR code was measured. All tests were run on a Raspberry Pi 3B+ using the Python wrapper of these libraries:

Detector	Mean (ms)	Std. Deviation (ms)
OpenCV	219.298	8.581
BoofCV	155.192	50.998
ZBar	4.274	0.270
ZXing	1953.751	57.966

I was surprised to see BoofCV had such a large variance, not sure if you could provide any insight as to why?

As for the intended use case, it's a bit more complicated than scanning documents; I'm developing an intelligent scrabble board, which contains an embedded camera and raspberry pi inside to detect QR codes on the bottom of the tiles. Obviously this benchmark has the limitation that the QR codes used don't have imperfections from being captured with a camera, but much easier to get a rough idea this way - although if you have any recommendations on how to improve this feel free to let me know! Images are always taken directly below the QR codes, and de-noised + binarized before being passed to the QR code detector anyways.

I'm also running a benchmark now which looks at the performance of these libraries in detecting QR codes for the entire board segment, where I'm testing a couple of different methods (segmenting the image into squares and detecting each one individually vs detecting all squares in the image and determining their position via bounding boxes). If you're interested in those results I'll post them when they're ready 😄.

lessthanoptimal · 2023-06-16T17:25:00Z

~~Which detector in OpenCV are you using? I think there are at least 2 now with drastically different behavior.~~

Noticed that you said which one you were using above.

lessthanoptimal · 2023-06-16T17:31:29Z

As for the performance, I'm a bit surprised by the speed. It's very different from what I saw several years ago in this benchmark. I suspect that when you process larger images the profile will change drastically. With the large standard deviation I would need to look into it. For example, years ago Raspberry PI's official distribution included what was effectively a broken JVM that made all things java run at a glacial speed. I kinda suspect that ZBar is doing so well because it's doing a more naive threshold that's advantageous in this specific use case.

simon-staal · 2023-06-16T17:34:06Z

Out of interest did I use the "correct" one for this purpose?
I also found a cpp implementation of ZXing which also had a python wrapper linked here which I tested on the same initial benchmark above, which actually performed better than any other implementations - see updated table below:

Detector	Mean (ms)	Std. Deviation (ms)
OpenCV	219.298	8.581
BoofCV	155.192	50.998
ZBar	4.274	0.270
ZXing	1953.751	57.966
ZXing-cpp	2.323	0.039

The implementation of the cpp version has since diverged substantially from the original ZXing, so the difference in performance could be due to a different algorithm, but possible helps support your broken JVM hypothesis?

Edit: As for the more advanced benchmark, I'll try to upload the document with the results on this issue in the next week or so

Edit2: As for your benchmark, I read through it before when doing research, and I was also surprised with how difference performance was to what you measured. Something I found particularly weird was when running tests on the full board, comparing an iterative approach in which the image is first segmented into tiles, then the detector is run on each tile individually, vs running the detector once on the whole image and identifying the positions of the tiles based on their bounding boxes - essentially corresponding to your "lots" case. Surprisingly, BoofCV performed really badly with this simultaneous approach - I've attached the plot of the results below (with 95% confidence intervals drawn):

lessthanoptimal · 2023-06-16T17:41:40Z

Hmm one thing that could be done is to run the benchmark I linked to above using an updated version of the libraries. That way we can see if there has been some drastic improvement or not. What tended to kill performance for many scenes with complex background. You can see that when you look at the individual categories.

BTW it's possible to "tune" BoofCV to just use a global threshold and you will get large speed gain, but it will be much less robust when you encounter complex lighting situations.

lessthanoptimal · 2023-06-16T17:44:31Z

The opencv + boofcv comparison talks about the broken JVM issue briefly. https://boofcv.org/index.php?title=Performance:OpenCV:BoofCV

lessthanoptimal · 2023-06-16T17:50:03Z

essentially corresponding to your "lots" case. Surprisingly, BoofCV performed really badly with this

Can you post an example image from that test? It does seem like something weird is going on. Almost like it's hitting a memory limit. The other library developers might have seen that benchmark and improved their code in the past 4 years since I ran the test, I think a bunch of these libraries are effectively in maintenance mode with only minor tweaks.

simon-staal · 2023-06-16T18:07:36Z

Sure, here's an example containing images for some of the different tile percentages tested (note the QR codes are still 88 by 88, generated through the same method as above, then stacked together to form the board)

lessthanoptimal · 2023-06-16T19:16:06Z

Got a link to a full resolution image?

simon-staal · 2023-06-16T21:16:30Z

Here's one:

I can also upload the code to generate these boards if that would be helpful?

simon-staal mentioned this issue May 31, 2023

Malformed padding when number of bits is a multiple of 8 heuer/segno#123

Open

lessthanoptimal mentioned this issue May 31, 2023

Cannot decode numeric QR codes containing exactly 4 digits lessthanoptimal/BoofCV#392

Closed

lessthanoptimal mentioned this issue Jun 1, 2023

False positive finder pattern detection inside QR code lessthanoptimal/BoofCV#393

Open

This was referenced Jun 15, 2023

Unable to decode QR Codes with particular payloads opencv/opencv#23810

Open

Unable to detect QR codes with particular data values ChenjieXu/pyzxing#38

Open

Micro QR Codes don't ignore malformed padding lessthanoptimal/BoofCV#397

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

QR Code detector fails for integer messages longer than 3 digits #23

QR Code detector fails for integer messages longer than 3 digits #23

simon-staal commented May 31, 2023

lessthanoptimal commented May 31, 2023 via email

lessthanoptimal commented May 31, 2023 via email

simon-staal commented May 31, 2023

lessthanoptimal commented May 31, 2023 via email

simon-staal commented May 31, 2023

lessthanoptimal commented May 31, 2023 via email

lessthanoptimal commented May 31, 2023 via email

lessthanoptimal commented Jun 1, 2023

simon-staal commented Jun 1, 2023

lessthanoptimal commented Jun 1, 2023

lessthanoptimal commented Jun 1, 2023

simon-staal commented Jun 4, 2023 •

edited

Loading

lessthanoptimal commented Jun 16, 2023 •

edited

Loading

lessthanoptimal commented Jun 16, 2023

simon-staal commented Jun 16, 2023 •

edited

Loading

lessthanoptimal commented Jun 16, 2023

lessthanoptimal commented Jun 16, 2023

lessthanoptimal commented Jun 16, 2023

simon-staal commented Jun 16, 2023

lessthanoptimal commented Jun 16, 2023

simon-staal commented Jun 16, 2023

QR Code detector fails for integer messages longer than 3 digits #23

QR Code detector fails for integer messages longer than 3 digits #23

Comments

simon-staal commented May 31, 2023

lessthanoptimal commented May 31, 2023 via email

lessthanoptimal commented May 31, 2023 via email

simon-staal commented May 31, 2023

lessthanoptimal commented May 31, 2023 via email

simon-staal commented May 31, 2023

lessthanoptimal commented May 31, 2023 via email

lessthanoptimal commented May 31, 2023 via email

lessthanoptimal commented Jun 1, 2023

simon-staal commented Jun 1, 2023

lessthanoptimal commented Jun 1, 2023

lessthanoptimal commented Jun 1, 2023

simon-staal commented Jun 4, 2023 • edited Loading

lessthanoptimal commented Jun 16, 2023 • edited Loading

lessthanoptimal commented Jun 16, 2023

simon-staal commented Jun 16, 2023 • edited Loading

lessthanoptimal commented Jun 16, 2023

lessthanoptimal commented Jun 16, 2023

lessthanoptimal commented Jun 16, 2023

simon-staal commented Jun 16, 2023

lessthanoptimal commented Jun 16, 2023

simon-staal commented Jun 16, 2023

simon-staal commented Jun 4, 2023 •

edited

Loading

lessthanoptimal commented Jun 16, 2023 •

edited

Loading

simon-staal commented Jun 16, 2023 •

edited

Loading