Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make regex patterns raw strings #605

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

benjaminhwilliams
Copy link
Contributor

@benjaminhwilliams benjaminhwilliams commented Apr 9, 2021

Python 3 interprets string literals as unicode strings, so we need to explicitly define regex patterns as raw strings.

Silences warnings like DeprecationWarning: invalid escape sequence \S.

Search pattern used here: \Wre\.\w+\(['"] \Wre\.\w+\(\s*['"]

Python 3 interprets string literals as unicode strings, so we need to
explicitly define regex patterns as raw strings.

Silences warnings like 'DeprecationWarning: invalid escape sequence \S'.

Search pattern: [\W]re\.[^\.(]+\(['"]
@benjaminhwilliams
Copy link
Contributor Author

I don't know enough about the team structure in CCTBX to know who to ask for a review. @bkpoon, can I assign this to you for triage? Since the tests pass and there is no new functionality introduced in these changes, I guess it ought to be quite an easy review.

@benjaminhwilliams
Copy link
Contributor Author

Looks like I missed a handful of cases because I didn't anticipate white space characters.

@bkpoon bkpoon requested review from Oeffner and bkpoon April 13, 2021 20:50
Copy link
Member

@bkpoon bkpoon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the tests pass and there is no new functionality introduced in these changes, I guess it ought to be quite an easy review.

The tests that are run on Azure Pipelines is a subset of tests that are run regularly by other developers. The problem is that we have many integration tests that are not run because they take too long or require additional components for the integration test to pass. Moreover, we run Phenix on the entirety of the Protein Data Bank on a regular basis for more extensive testing. So the tests that pass here are necessary, but not sufficient.

It looks like these changes are due to the DeprecationWarning from the DIALS testing? Something like

2021-04-13T01:18:25.9002977Z ../modules/cctbx_project/iotbx/cif/builders.py:773
2021-04-13T01:18:25.9003505Z   /home/vsts/work/1/modules/cctbx_project/iotbx/cif/builders.py:773: DeprecationWarning: invalid escape sequence \S
2021-04-13T01:18:25.9004061Z     allmatches = re.findall("(\S*(HL(\S*)[abcdABCD](\S*)))", alllabels )

from https://dev.azure.com/azure-dials/8f9a0b14-1c65-4a82-9c06-1ec7513061ae/_apis/build/builds/3036/logs/46.

It'll be easier to approve this pull request if you limit your changes to the code that can be demonstrated to cause the DeprecationWarning you are encountering. That way, the behavior before and after your changes can be verified.

I have added @Oeffner since he switched to using regular expressions for cctbx_project/iotbx/cif/builders.py. He'll want to incorporate your changes into his working branch and he is working on updating some tests.

Thanks for finding this! There are some lingering invalid escape sequence warnings in the Azure Pipelines tests, but I do not think they are related to regular expressions, and I can fix them later.

@Oeffner Oeffner removed the request for review from Anthchirp April 13, 2021 21:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants