Skip to content

Luhn: Update testcase to avoid wrong solution #1480

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Mar 27, 2019
Merged

Luhn: Update testcase to avoid wrong solution #1480

merged 5 commits into from
Mar 27, 2019

Conversation

tqa236
Copy link
Contributor

@tqa236 tqa236 commented Mar 16, 2019

The change might seem trivia at first. It isn't. Here's a solution that passes all the test in Java

import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class LuhnValidator {
  public boolean isValid(String candidate) {
    String number = candidate.replaceAll("\\s+", "");
    if ("0".equals(number)) return false;
    List<Integer> digits = number.chars().map(d -> d - 48).boxed().collect(Collectors.toList());
    return IntStream.range(0, digits.size())
                .map(
                    n ->
                        (n % 2 == digits.size() % 2)
                            ? (2 * digits.get(n) - 9 * (digits.get(n) >= 5 ? 1 : 0))
                            : digits.get(n))
                .sum()
            % 10
        == 0;
  }
}

In this solution, after I remove all the whitespaces, I convert all characters to the ascii number and subtract them to 48 (the ascii code of 0) to get the digits. Then I check with the list I receive and it passes all test cases. I didn't remove invalid characters at all.

The problem is that the ascii code of the invalid characters automatically make the valid number becomes invalid, so no need to remove them.

With the new test case, the sum is 90, and it should still be false because if the invalid character.

Copy link
Member

@coriolinus coriolinus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not having convenient access to a Java dev environment, I've rewritten your example in Python, as directly as possible:

>>> def isValid(candidate):
...     number = candidate.replace(' ', '')
...     if number == '0':
...             return False
...     digits = [ord(c) - 48 for c in number]
...     return sum(2*digits[i] - (9*(1 if digits[i] >= 5 else 0)) if i % 2 == 0 else digits[i] for i in range(len(digits))) % 10 == 0
...

I've confirmed that where the old test case correctly returned False by accident, the new test case will incorrectly return True, failing the test case.

>>> isValid("055a 444 285")
False
>>> isValid("055b 444 285")
True

It's not a perfectly general solution, but it's an elegant way to stop this particular wrong algorithm from succeeding. I like it.

@petertseng
Copy link
Member

I also don't have access to running Java so I'll have to rely on others' help for the answer to the question: What does it do to :9 from #1246 ? : has value 58, which becomes 10. Doubled and becomes 20, subtract 9 to get 11, add 9 to get 20, so it would do the same. Any differences in behaviour of the supplied code such that it already rejects :9 ?

@coriolinus
Copy link
Member

Using the previous Python def of isValid:

>>> isValid(':9')
True

Given that the expected value of this case is False, that case should already exclude this solution, assuming always that the python implementation produces results identical to the Java one.

@tqa236
Copy link
Contributor Author

tqa236 commented Mar 16, 2019

It seems that this solution is already be ruled out by that test case. My mistake.
My original solution that passes all the test is a little different, it uses integer division. I forgot to recheck it.

(2 * digits.get(n) - 9 * (digits.get(n) / 5))
 // ? (2 * digits.get(n) - 9 * (digits.get(n) >= 5 ? 1 : 0))

Given that the range of 2*digits is from 0 to 18, these 2 formulas are both correct.
I agree that we will need a stronger test suite to completely rules out this type of solution (like randomly generate 1000 tests for example). I just propose a quick fix for now.

@petertseng
Copy link
Member

That means the interesting thing about these two cases :9 and 055b 444 285 is whether the non-digit character occupies a doubled position (yes in the former, no in the latter). That implies it would be valuable to state this fact in both of their respective test descriptions. And perhaps bring them next to each other. Thus, I recommend this.

I won't stand in the way if the decision is instead to ignore this recommendation and merge this PR as it currently stands.

@tqa236
Copy link
Contributor Author

tqa236 commented Mar 16, 2019

I agree that we should update the description to better describe the test, do you have any suggestions? By the way, is it possible to add a comment in a test about the intent of that test? It is not easy to recognize just by looking at the file. Just in case there's a wrong solution with a similar approach that passes, we know we should update the test suite to a more rigorous version.

@petertseng
Copy link
Member

I agree that we should update the description to better describe the test, do you have any suggestions?

I have a hard time finding a short description for the cases. Even though it is not a requirement that it be short, #1198 seems to imply there are benefits.

I'll try:

  • Doubled non-digit isn't treated as its ascii value
  • Non-doubled non-digit isn't treated as its ascii value

Open question, do we need to mention "ascii value offset from the value of 0" or is "ascii value" sufficient

is it possible to add a comment in a test about the intent of that test

it is possible. see an example in

"comments": [
"There may be some confusion about whether this should be 1 or error.",
"The reasoning for it being 1 is this:",
"There is one 0-character string contained in the empty string.",
"That's the empty string itself.",
"The empty product is 1 (the identity for multiplication).",
"Therefore LSP('', 0) is 1.",
"It's NOT the case that LSP('', 0) takes max of an empty list.",
"So there is no error.",
"Compare against LSP('123', 4):",
"There are zero 4-character strings in '123'.",
"So LSP('123', 4) really DOES take the max of an empty list.",
"So LSP('123', 4) errors and LSP('', 0) does NOT."
],

see the formal spec in

, "comments" : { "$ref": "#/definitions/comments" }
and
"comments":
{ "description": "An array of strings to fake multi-line comments"
, "type" : "array"
, "items" : { "type": "string" }
, "minItems" : 1
},

@tqa236
Copy link
Contributor Author

tqa236 commented Mar 19, 2019

I updated the descriptions based on your suggestion. I also grouped them together to reflect the fact that they address the same problem.

"Convert non-digits to their ascii value and then offset them by 48 sometimes accidently return the correct result.",
"This test is designed to avoid that solution."
],
"description": "using ascii value for doubled non-digit isn't allow",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think for grammar, allow should be allowed here, and in the other test case as well

@@ -123,7 +115,23 @@
"expected": true
},
{
"description": "strings with non-digits is invalid",
"comments": [
"Convert non-digits to their ascii value and then offset them by 48 sometimes accidently return the correct result.",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

spelling accidently -> accidentally

@@ -123,7 +115,23 @@
"expected": true
},
{
"description": "strings with non-digits is invalid",
"comments": [
"Convert non-digits to their ascii value and then offset them by 48 sometimes accidently return the correct result.",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"accidentally return the correct result" might be easy to misinterpret, or require careful thinking before a reader can determine whether the sentence is true. Can I suggest "accidentally declare an invalid string to be valid" so that requires less brainpower? my feeble mind was having a hard time figuring things out

@tqa236
Copy link
Contributor Author

tqa236 commented Mar 22, 2019

Thank you for the review. I made a new commit based on your suggestion.

Copy link
Member

@petertseng petertseng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I have one last comment, that these two have been swapped.

I plan to approve after the two descriptions are swapped, or someone tells me I got mixed up. I do not plan to wait long after that to merge it. So, if other reviewers have some things to say, you should say them soon.

"Convert non-digits to their ascii values and then offset them by 48 sometimes accidentally declare an invalid string to be valid.",
"This test is designed to avoid that solution."
],
"description": "using ascii value for doubled non-digit isn't allowed",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you double-check me on this one? I think in this case, the b is in a non-doubled position, and in the latter case the : is in a doubled position. So these two test case descriptions should be switched, right?

Or did I get them mixed up

@tqa236
Copy link
Contributor Author

tqa236 commented Mar 26, 2019

You're correct. I fix that in the new commit.

Copy link
Member

@petertseng petertseng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this PR makes the purposes of these test cases clearer, a good improvement.

@petertseng petertseng merged commit 6aa07c9 into exercism:master Mar 27, 2019
coriolinus pushed a commit to exercism/rust that referenced this pull request Nov 15, 2019
* luhn: Updated the exercise to the version 1.6.1

Relevant PRs:
- exercism/problem-specifications#1420
- exercism/problem-specifications#1480
- exercism/problem-specifications#1500
- exercism/problem-specifications#1523

* luhn: Fixed the 'exercise' util generated comments in the test suite
sshine pushed a commit to exercism/haskell that referenced this pull request Mar 25, 2020
Changes included:

1. exercism/problem-specifications#1480: 1.4.0 to 1.5.0:
   Change "055a 444 285" test into "055b 444 285" for subtle reasons.
   New test name: using ascii value for doubled non-digit isn't allowed

2. exercism/problem-specifications#1500: 1.5.0 to 1.6.0:
   Add test case: valid number with an odd number of spaces

3. exercism/problem-specifications#1635: 1.6.0 to 1.7.0:
   Add test case: invalid long number with an even remainder
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants