Skip to content

Conversation

@StevenMaude
Copy link
Contributor

Run tests on pull requests, and on push to main branch.

@StevenMaude
Copy link
Contributor Author

NB: a couple of tests fail right now for me running on GitHub.

...................................F.F............../home/runner/work/string_grouper/string_grouper/string_grouper_utils/string_grouper_utils.py:152: DeprecationWarning: parsing timezone aware datetimes is deprecated; this will raise an error in the future
  return timestamps.transform(lambda x: np.datetime64(x))
..../home/runner/work/string_grouper/string_grouper/string_grouper_utils/string_grouper_utils.py:146: DeprecationWarning: parsing timezone aware datetimes is deprecated; this will raise an error in the future
  return timestamps.transform(lambda x: np.datetime64(parse(x, parserinfo, **kwargs)))
......
======================================================================
FAIL: test_match_list_diagonal (string_grouper.test.test_string_grouper.StringGrouperTest)
test fails whenever _matches_list's number of self-joins is not equal to the number of strings
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/runner/work/string_grouper/string_grouper/string_grouper/test/test_string_grouper.py", line 204, in test_match_list_diagonal
    self.assertEqual(num_self_joins, num_strings)
AssertionError: 5 != 6

======================================================================
FAIL: test_match_list_symmetry_without_symmetrize_function (string_grouper.test.test_string_grouper.StringGrouperTest)
mocks StringGrouper._symmetrize_matches_list so that this test fails whenever _matches_list is
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/opt/hostedtoolcache/Python/3.7.10/x64/lib/python3.7/unittest/mock.py", line 1256, in patched
    return func(*args, **keywargs)
  File "/home/runner/work/string_grouper/string_grouper/string_grouper/test/test_string_grouper.py", line 175, in test_match_list_symmetry_without_symmetrize_function
    self.assertTrue(intersection.empty or len(upper) == len(upper_prime) == len(intersection))
AssertionError: False is not true

----------------------------------------------------------------------
Ran 62 tests in 0.830s

FAILED (failures=2)

@ParticularMiner
Copy link
Contributor

Hi @StevenMaude ,

These are intentional (that is, those tests are expected to fail). So nothing to worry about. ;)

NB: a couple of tests fail right now for me running on GitHub.

...................................F.F............../home/runner/work/string_grouper/string_grouper/string_grouper_utils/string_grouper_utils.py:152: DeprecationWarning: parsing timezone aware datetimes is deprecated; this will raise an error in the future
  return timestamps.transform(lambda x: np.datetime64(x))
..../home/runner/work/string_grouper/string_grouper/string_grouper_utils/string_grouper_utils.py:146: DeprecationWarning: parsing timezone aware datetimes is deprecated; this will raise an error in the future
  return timestamps.transform(lambda x: np.datetime64(parse(x, parserinfo, **kwargs)))
......
======================================================================
FAIL: test_match_list_diagonal (string_grouper.test.test_string_grouper.StringGrouperTest)
test fails whenever _matches_list's number of self-joins is not equal to the number of strings
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/runner/work/string_grouper/string_grouper/string_grouper/test/test_string_grouper.py", line 204, in test_match_list_diagonal
    self.assertEqual(num_self_joins, num_strings)
AssertionError: 5 != 6

======================================================================
FAIL: test_match_list_symmetry_without_symmetrize_function (string_grouper.test.test_string_grouper.StringGrouperTest)
mocks StringGrouper._symmetrize_matches_list so that this test fails whenever _matches_list is
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/opt/hostedtoolcache/Python/3.7.10/x64/lib/python3.7/unittest/mock.py", line 1256, in patched
    return func(*args, **keywargs)
  File "/home/runner/work/string_grouper/string_grouper/string_grouper/test/test_string_grouper.py", line 175, in test_match_list_symmetry_without_symmetrize_function
    self.assertTrue(intersection.empty or len(upper) == len(upper_prime) == len(intersection))
AssertionError: False is not true

----------------------------------------------------------------------
Ran 62 tests in 0.830s

FAILED (failures=2)

@Bergvca
Copy link
Owner

Bergvca commented Apr 11, 2021

Hi @ParticularMiner - Sorry i guess I missed this when merging the PR's - we should not have failing tests. That kind of defeats the purpose of unittests as a way to quickly identify new issues. These two are fixed easily by changing assertEqual to assertNotEqual and assertTrue to assertFalse. I can do that now and update the pypi version as well.

@StevenMaude thanks for this! We are using codecov as well, but it is currently not automated yet. Is that something you can add as well?

@ParticularMiner
Copy link
Contributor

Hi @Bergvca,

I completely agree that all tests should pass. Actually, it feels better that way to me too. But I'm still getting used to the philosophy of unit-tests so I wasn't sure, otherwise I would have checked with you again about them myself.

@ParticularMiner
Copy link
Contributor

Hi @StevenMaude

Sorry, I too should have thanked you before for this. Indeed it would certainly be great if the code coverage report and badge could be automated through a GitHub test workflow.. Very handy and professional! I'm eager to see how that is done.

By the way, could you also try automating the test on a Windows 10 operating system, because on my Windows 10 laptop two more test-units (not written by me) fail due to an integer datatype mismatch (a bug which seems to be well-known in the python community). See the following link for a description: https://stackoverflow.com/questions/57803232/interpreting-numpy-int64-datatype-as-native-int-datatype-in-python-on-windows-x6

@ParticularMiner
Copy link
Contributor

Hi @Bergvca @StevenMaude

I hereby provide the message-logs for the two additional failing tests:

============================= ERRORS =============================
Traceback (most recent call last):
  File "C:\Users\username\eclipse-workspace\string_grouper\string_grouper\test\test_string_grouper.py", line 362, in test_case_insensitive_build_matches_list
    pd.testing.assert_frame_equal(expected_df, sg._matches_list)
  File "C:\Users\username\.virtualenvs\string_grouper\lib\site-packages\pandas\_testing.py", line 1704, in assert_frame_equal
    assert_series_equal(
  File "C:\Users\username\.virtualenvs\string_grouper\lib\site-packages\pandas\_testing.py", line 1402, in assert_series_equal
    assert_attr_equal("dtype", left, right, obj=f"Attributes of {obj}")
  File "C:\Users\username\.virtualenvs\string_grouper\lib\site-packages\pandas\_testing.py", line 917, in assert_attr_equal
    raise_assert_detail(obj, msg, left_attr, right_attr)
  File "C:\Users\username\.virtualenvs\string_grouper\lib\site-packages\pandas\_testing.py", line 1073, in raise_assert_detail
    raise AssertionError(msg)
AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="master_side") are different

Attribute "dtype" are different
[left]:  int64
[right]: int32

 
Traceback (most recent call last):
  File "C:\Users\username\eclipse-workspace\string_grouper\string_grouper\test\test_string_grouper.py", line 350, in test_build_matches_list
    pd.testing.assert_frame_equal(expected_df, sg._matches_list)
  File "C:\Users\username\.virtualenvs\string_grouper\lib\site-packages\pandas\_testing.py", line 1704, in assert_frame_equal
    assert_series_equal(
  File "C:\Users\username\.virtualenvs\string_grouper\lib\site-packages\pandas\_testing.py", line 1402, in assert_series_equal
    assert_attr_equal("dtype", left, right, obj=f"Attributes of {obj}")
  File "C:\Users\username\.virtualenvs\string_grouper\lib\site-packages\pandas\_testing.py", line 917, in assert_attr_equal
    raise_assert_detail(obj, msg, left_attr, right_attr)
  File "C:\Users\username\.virtualenvs\string_grouper\lib\site-packages\pandas\_testing.py", line 1073, in raise_assert_detail
    raise AssertionError(msg)
AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="master_side") are different

Attribute "dtype" are different
[left]:  int64
[right]: int32

@StevenMaude
Copy link
Contributor Author

StevenMaude commented Apr 11, 2021

@Bergvca @ParticularMiner Thanks for looking at the PR 👀

And, of course, for working on the library, which I'd been using and thought it was worth maybe making a small contribution back 👍

Good questions which are slightly on the edge of my knowledge 🙂

  • Codecov shouldn't be too complicated, from a quick look around. Personally I haven't used codecov before. But there's an Action from Codecov for this purpose. My intuition is that this could be a separate workflow from the test workflow to run only when a commit is pushed to your main branch, e.g. on merging a PR. Let me know if you have something else in mind though.
  • Testing on Windows also should be doable, though I haven't done this before on GitHub either. Hopefully, it doesn't need much more configuration, because the commands run on Linux and Windows should be the same, I think. For example, look at the test workflow for Black.

I'll try take a look at these over the next week, though it may be next weekend when I can spend a bit more time checking up on these.

@ParticularMiner
Copy link
Contributor

Many thanks @StevenMaude for the very useful links. I might try them out myself.

@ParticularMiner
Copy link
Contributor

I agree with your intuition for separate actions. It appears Github allows multiple actions, right? Anyway, a modular design structure for pretty much anything is often a good idea.

@StevenMaude
Copy link
Contributor Author

I rebased this branch on current master which fixes the tests on Ubuntu.

The tests now run on Windows too, but they fail as per the comment above.

@ParticularMiner
Copy link
Contributor

Excellent! Many thanks @StevenMaude !

How do I get your GitHub actions into my own fork of this repository? Do you need to create a PR for it on my branch too?
I would like to run these tests myself on my fork and see if I can fix the Windows 10 error without breaking the Ubuntu tests.

@ParticularMiner
Copy link
Contributor

How do I get your GitHub actions into my own fork of this repository? Do you need to create a PR for it on my branch too?
I would like to run these tests myself on my fork and see if I can fix the Windows 10 error without breaking the Ubuntu tests.

I figured it out. I seem to have fixed the Windows 10 problem too (for now).

@StevenMaude
Copy link
Contributor Author

@ParticularMiner:

How do I get your GitHub actions into my own fork of this repository? Do you need to create a PR for it on my branch too?

Responding to your question seeing as I was already writing the response. At the command-line, the way I would do this is probably something like:

  1. Add the fork as a git remote to your local copy of your repository: git remote add sm git@github.com:StevenMaude/string_grouper.git
  2. Fetch the fork's commits: git fetch sm
  3. Checkout the fork's branch of interest: git checkout sm/add-test-workflow.
  4. Checkout a new local branch based on that branch's commit: git checkout -b add-test-workflow.
  5. Then push the new local branch to your own remote fork on GitHub: git push origin HEAD

If you're working in GitHub's web UI, I'm not sure there is an easy way beyond copy-paste. Anyway, you got something working well enough 🙂

@StevenMaude
Copy link
Contributor Author

While I'm here, I'll also add that a codecov workflow could be a separate PR, as another feature.

@ParticularMiner
Copy link
Contributor

Thanks @StevenMaude

I ended up using the GitHub web GUI but the git method is also useful to know.

@ParticularMiner:

How do I get your GitHub actions into my own fork of this repository? Do you need to create a PR for it on my branch too?

Responding to your question seeing as I was already writing the response. At the command-line, the way I would do this is probably something like:

  1. Add the fork as a git remote to your local copy of your repository: git remote add sm git@github.com:StevenMaude/string_grouper.git
  2. Fetch the fork's commits: git fetch sm
  3. Checkout the fork's branch of interest: git checkout sm/add-test-workflow.
  4. Checkout a new local branch based on that branch's commit: git checkout -b add-test-workflow.
  5. Then push the new local branch to your own remote fork on GitHub: git push origin HEAD

If you're working in GitHub's web UI, I'm not sure there is an easy way beyond copy-paste. Anyway, you got something working well enough 🙂

@ParticularMiner
Copy link
Contributor

ParticularMiner commented Apr 12, 2021

@StevenMaude:

While I'm here, I'll also add that a codecov workflow could be a separate PR, as another feature.

For your information, clicking on the codecov badge on README.md will eventually lead you to an old code coverage report at https://codecov.io/gh/Bergvca/string_grouper

@Bergvca
Copy link
Owner

Bergvca commented Apr 25, 2021

Thanks guys! I will merge this now - agreed that codecov should be a seperate PR

@Bergvca Bergvca merged commit d17eeab into Bergvca:master Apr 25, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants