Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Keep original line endings when reading expectation files #25

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

xymaxim
Copy link
Collaborator

@xymaxim xymaxim commented Mar 5, 2024

Changes in this PR

Since Python by default opens files in the universal newlines mode (newline=None), line ending characters are translated. Which is not good for us. So, let's preserve original line endings while writing and reading expectation files.

@zaufi
Copy link
Owner

zaufi commented Mar 5, 2024

Could you please elaborate and gimme some more details and/or an example? I just can't understand what problem you're trying to solve %)

@xymaxim
Copy link
Collaborator Author

xymaxim commented Mar 5, 2024

Could you please elaborate and gimme some more details and/or an example? I just can't understand what problem you're trying to solve %)

While I was working on #24, I noticed that regular tests (not from the previous issue) with \r\n would fail:

	def test_sample_out(capfd, expected_out):
    	print('Hello Africa!\r\n', end='')
    	stdout, stderr = capfd.readouterr()
>   	assert expected_out == stdout
E   	AssertionError: assert
E     	The test output doesn't equal to the expected
E     	(from `/tmp/pytest-of-ms/pytest-109/crlf_test0/crlf_test/test_sample_out.out`):
E     	---[BEGIN actual output]---
E     	Hello Africa!↵
E     	---[END actual output]---
E     	---[BEGIN expected output]---
E     	Hello Africa!↵
E     	---[END expected output]---

Here’s a quick demo of what’s happening:

# test: 00000000: 410d 0a42                            	A..B
#                   \r \n
with open("test", "w") as f:
    f.write("A\r\nB")

with open("test", "r") as f:
    print(repr(f.read()))  # “A\nB”

While this works as desired:

with open("test", "r", newline=””) as f:
    print(repr(f.read()))  # “A\r\nB”

@xymaxim xymaxim changed the title Keep original line endings while writing and reading expectation Keep original line endings when reading expectation files Mar 5, 2024
@@ -101,7 +101,8 @@ def _maybe_store_pattern(self, text: str) -> None:
self._pattern_filename.parent.mkdir(parents=True)

# Store!
self._pattern_filename.write_text(text)
with self._pattern_filename.open('w', newline='') as f:
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, this one is unnecessary and needs to be removed: the conversion of newlines doesn't happen during writing (see https://peps.python.org/pep-0278/#specification).

@zaufi
Copy link
Owner

zaufi commented Mar 5, 2024

What makes me worried is that this will ruin (almost) everything! Indeed, for example, I have a trivial output matching test print('Hello Africa!') with the obviously trivial expectation. Whatever OS I use, thanks to Git and the auto CRLF option everything works fine on all platforms! -- cuz during checkout git will replace EOL(s) in the expectations file to the native format and Python's TextIO will handle \n properly...

If I needed smth system-specific, I add a f'-{platform.system()}' suffix to the expectation filename and told Git via .gitattributes to set whatever CRLF style I needed over system-dependent expectations.

Obviously, this patch will make this way broken...

I've taken a closed look at the test added by the #24 ... it looks incorrect to me %) makepatternfile() of the fixture behaves similarly to the pytester.makefile and does not write EOL at the last (and in this case the only) text line! So, strictly speaking, that test never checks for EOL styles...

@xymaxim
Copy link
Collaborator Author

xymaxim commented Mar 6, 2024

What makes me worried is that this will ruin (almost) everything! Indeed, for example, I have a trivial output matching test print('Hello Africa!') with the obviously trivial expectation. Whatever OS I use, thanks to Git and the auto CRLF option everything works fine on all platforms! -- cuz during checkout git will replace EOL(s) in the expectations file to the native format and Python's TextIO will handle \n properly...

Hmm, indeed, that’s something I hadn’t considered. My initial thoughts were that expectation texts should be treated as immutable, without any changes, and files are just an intermediate state of them, but, yes, Git couldn't be excluded from the story.

@xymaxim
Copy link
Collaborator Author

xymaxim commented Mar 6, 2024

If I needed smth system-specific, I add a f'-{platform.system()}' suffix to the expectation filename and told Git via .gitattributes to set whatever CRLF style I needed over system-dependent expectations.

I recently came across a binary file with an ASCII text header that needed to be parsed, and the header contains Windows-style line endings. The imaginary test case (just for an example) for some header extract function would be to output the header as is. This is not a platform-specific case.

As I understand, on Unix, it’s not possible to have a pattern with CRLF symbols right now (even if we add tests/data/expected/crlf_test.out -text to .gitattributes) because of how Python converts newlines during reading (CRLF -> LF).

So, the only way is to read it in a binary mode?

@xymaxim
Copy link
Collaborator Author

xymaxim commented Mar 6, 2024

I've taken a closed look at the test added by the #24 ... it looks incorrect to me %) makepatternfile() of the fixture behaves similarly to the pytester.makefile and does not write EOL at the last (and in this case the only) text line! So, strictly speaking, that test never checks for EOL styles...

Let's discuss it in #24, I'll answer there.

@zaufi
Copy link
Owner

zaufi commented Mar 9, 2024

As I understand, on Unix, it’s not possible to have a pattern with CRLF symbols right now (even if we add tests/data/expected/crlf_test.out -text to .gitattributes) because of how Python converts newlines during reading (CRLF -> LF).

I didn't get it... why not? The other question is if you want to make this test run on all platforms (w/ different native EOLs) and a test's input data file is "static" (in terms of EOL style in its beginning (I'm thinking about some "self-extract" archive w/ a shell script at the beginning %) obviously u need to preserve EOLs in the pattern file added to VCS (git I guess. so .gitattributes could help) and during the test tell to Python to use specific EOL style on read from file...

What could go wrong here? %)

@zaufi zaufi force-pushed the master branch 2 times, most recently from 0196775 to 68dc622 Compare March 11, 2024 03:36
@xymaxim
Copy link
Collaborator Author

xymaxim commented Mar 12, 2024

The problem is that the test from the PR doesn't pass without the proposed changes because of the missing CR symbol. This may look unexpected and confusing for users, since an expectation file actually contains the symbol, but not the fixture.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants