-
Notifications
You must be signed in to change notification settings - Fork 224
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Exception not raise properly when number of codes>2 in calc_info #4990
Conversation
Codecov Report
@@ Coverage Diff @@
## develop #4990 +/- ##
===========================================
+ Coverage 80.22% 80.24% +0.03%
===========================================
Files 515 515
Lines 36757 36753 -4
===========================================
+ Hits 29484 29490 +6
+ Misses 7273 7263 -10
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @unkcpz . Shouldn't we just define a default maybe instead of raising? Maybe setting SERIAL
as a default makes sense and people can switch to PARALLEL
if they need. Also we really should add a test for this. Should be straightforward enough, just mock the process and call presubmit
abcc02b
to
8ec3846
Compare
Hi @sphuber , I have added a test and also set the default for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @unkcpz thanks for the contribution! I have some minor comments on it, could you take a look? Let me know if there is anything that is not clear or you don't agree with.
.gitignore
Outdated
@@ -17,6 +17,7 @@ | |||
.eggs | |||
.vscode | |||
.tox | |||
submit_test/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mmm this seems to be a specific folder you created in your repo but I don't think it makes sense to add this as a generic thing to the gitignore
, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes but no. Since this folder will be generated in current directory when you run :
pytest tests/engine/test_calc_job.py
The folder was create when I run test locally. It's not a big deal but just tedious to delete this folder when making commits.
I believe the better way would be to delete the folder using a fixture. What do you think? I am not sure I can quickly figure out how to do that. So I will remove this line from .gitignore
at the moment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh I see your point. Yes, I think your solution sounds reasonable. I'm surprised that this hasn't come up before; @sphuber is this the only test which is executing a dry_run
and comparing the resulting script?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are other tests that run a dry_run
, and so create this folder, but they don't compare the input script.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I see, but the question is what do you do with all these newly created test folder/files so that they are not shown on git? I actually don't remember these appearing for me after running tests, are they deleted automatically somehow?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @sphuber , sorry for the insistence, I just want to make sure there is no already standardized way in which we are dealing with this issue before I propose my own.
To recap the problem: the test needs to run a dry_run
; this generates a folder with files; this files are left polluting the environment (in this case, the modification aims at removing it from files detected by git). What do other tests that use dry_run
do (I haven't noticed any submit_test
folders appearing after running the testsuite)? They delete the files manually? Is there a fixture
that takes care of deleting this folder at the end of the tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are two test files that run CalcJob
dry runs: tests/engine/test_launch.py
and tests/engine/test_calc_job.py
. The former cleans up after itself, which is why you never saw the directories before. The TestLaunchersDryRun
class deletes the submit_test
folder in its tearDownClass
method:
aiida-core/tests/engine/test_launch.py
Lines 156 to 169 in a468519
def tearDown(self): | |
import os | |
import shutil | |
from aiida.common.folders import CALC_JOB_DRY_RUN_BASE_PATH | |
super().tearDown() | |
self.assertIsNone(Process.current()) | |
# Make sure to clean the test directory that will be generated by the dry-run | |
filepath = os.path.join(os.getcwd(), CALC_JOB_DRY_RUN_BASE_PATH) | |
try: | |
shutil.rmtree(filepath) | |
except Exception: # pylint: disable=broad-except | |
pass |
For now, @unkcpz can maybe use this in the test_calc_job.py
file as well. At some point we can find a cleaner fixture to do this that makes it easy to reuse, but since these test files are still using old-style, maybe we leave that for another time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah! I see your point! I can do add the fixture here and remove the line from .gitignore
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @ramirezfranciscof, thanks for the review. I accept most of your advice and leave some notes for your comment. Please have another look at your convenience. Cheers!
.gitignore
Outdated
@@ -17,6 +17,7 @@ | |||
.eggs | |||
.vscode | |||
.tox | |||
submit_test/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes but no. Since this folder will be generated in current directory when you run :
pytest tests/engine/test_calc_job.py
The folder was create when I run test locally. It's not a big deal but just tedious to delete this folder when making commits.
I believe the better way would be to delete the folder using a fixture. What do you think? I am not sure I can quickly figure out how to do that. So I will remove this line from .gitignore
at the moment.
f0684e7
to
c483929
Compare
Thanks @ramirezfranciscof ! I open the issue #5034 for documentation of the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @unkcpz its looking great! Thanks for the work, there are just two final details. One is the .gitignore
modification I'm still trying to figure out and the other is a modif on aiida/schedulers/scheduler.py
that I don't quite understand (maybe it was done for testing and left unintentionally?). After this is dealt with I would consider this ready to merge.
#!/bin/bash | ||
exec > _scheduler-stdout.txt | ||
exec 2> _scheduler-stderr.txt | ||
|
||
|
||
'/bin/bash' | ||
|
||
'mpirun' '-np' '1' '/bin/bash' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mmm this is strange no? I expected content = content.rstrip() + '\n'
to add a final newline in the file but apparently it didn't, it just ended the current line. More surprisingly so: the linter didn't complain that there was no newline end?
I mean, if it is working we can leave it like this, it is just...unexpected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is expected though. The rstrip
strips all whitespace, so also the newline character, which you then re-append. However, having a terminating newline character, does not show up visually as an empty line. It just means the line is terminated and the carriage return goes to the start of the next line. If you want to see a visual empty line, you would need another new line character. Note that Github visualizes this difference (in diffs where this changes) with a symbol at the end of the file, saying "no newline character" or something to that effect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, yeah, for the rstrip()
I did some tests in the shell and printing so I didn't know exactly how it would behave with write, there my "I expected..." meant "I intended that...". The unexpected part is more about the linter not complaining.
aiida/schedulers/scheduler.py
Outdated
@@ -216,7 +216,7 @@ def _get_run_line(self, codes_info, codes_run_mode): | |||
|
|||
output_string = f'{command_to_exec} {stdin_str} {stdout_str} {stderr_str}' | |||
|
|||
list_of_runlines.append(output_string) | |||
list_of_runlines.append(output_string.rstrip()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @unkcpz why did you change this here? The rstrip()
on the test when you are doing the file_regression
should suffice for the test to work, no? I wouldn't change this in the actual code just to cater for a test or comply with some linter criteria in a script that is produced during usage and not part of the codebase. I mean, this is changing some intermediate file produce by the code that could be checked or parsed by some users, and its modification may create some problems (however unlikely it might be in this case).
I mean, unless you have a strong reason to propose we change to having our scripts be trimmed at the end, I would say to revert this change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is intended to pass the linter. As you can see from the above line, that the output_string
will end will spaces if stdin_str
, stdout_str
or stderr_str
is not set. In that case, the command line of submit script will end with spaces wiped out by the linter and fails the pre-commit.
I mean, this is changing some intermediate file produce by the code that could be checked or parsed by some users, and its modification may create some problems (however unlikely it might be in this case
I agree, but for this part of the code, I see no downside to strip the line. In other words, I would much more like the bash script ironed without extra spaces in each line ;-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is intended to pass the linter. As you can see from the above line, that the
output_string
will end will spaces ifstdin_str
,stdout_str
orstderr_str
is not set. In that case, the command line of submit script will end with spaces wiped out by the linter and fails the pre-commit.
Right, but I mean, you are not committing any script that is being generated by this, right? The reference file is being created by file_regression.check(content, extension='.sh')
and you are already using rstrip
to clean up content
before putting it there. That you originally take content
from the printed script does not matter if you clean it between reading it and writing to reference with the file_regression
. Am I missing something here? Which script directly generated by this is being rejected by the linter?
I agree, but for this part of the code, I see no downside to strip the line. In other words, I would much more like the bash script ironed without extra spaces in each line ;-)
I also agree that it is very unlikely that this will break someone's code, but it is still possible. If we have had a team discussion on how to do a cleaner script and agreed in some guidelines that we were going to respect going forward, I would totally bite the bullet on the chance of causing some friction. But we are not even sure that we won't have to change this further (maybe to prevent contiguous empty lines due to non existing append/prepend texts) or even go back on it (maybe it turns out there are good reasons to have one final empty line at the end of the file). So yeah, the risk may be minimal but on the other hand there is no objective gain from this change (other than you and me liking it more aesthetically, but that is not very objective =P).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for not explain it more clear.
I have to add this to pass the linter. Because the content.rstrip()
in test function, only removes the final lines of the submit script content. output_string
is the executable command line not necessary the final line. For clarity, please see the following content generated by in test and after linter pre-commit clean up (I use '␣' to represent space character):
The script generated:
#!/bin/bash
exec > _scheduler-stdout.txt
exec 2> _scheduler-stderr.txt
'/bin/bash'␣␣␣
'mpirun' '-np' '1' '/bin/bash'
After pre-commit:
#!/bin/bash
exec > _scheduler-stdout.txt
exec 2> _scheduler-stderr.txt
'/bin/bash'
'mpirun' '-np' '1' '/bin/bash'
If we agree not to add exclude files in the linter configuration, I can not figure out a simple way to get rid of this and pass the pre-commit check (Can split()
the content string with rstrip()
of every line, though it seems complex for this case. But I guess that is the middle point where we can compromise).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are absolutely right @unkcpz the formatter strips all whitespace, also terminating spaces in lines in the middle of the file. Apologies for not having thought of this, I was focusing on the terminating new line characters.
OK, so here is my thought process. We have added the whitespace trimming pre-commit action such that trailing whitespace is automatically cleaned up in source code, as it can make things look messy and more difficult to read than necessary. When I added it, I did not specify what files it should run for, and so let it run for everything, even comparison files that are used in the unit tests, even though those are not the primary target. We don't need to read those often, so trailing newspace is not a real problem. When you consider the motivation for the whitespace trimming, it seems clear that the correct solution for this problem is to exclude the test comparison file from the pre-commit. So I now agree that @unkcpz original solution of updating the exclude rules in the .pre-commit-config.yaml
file is actually the correct one.
I propose therefore that we revert the changes we requested (really sorry for the back and forth @unkcpz ), removing the manual explicit whitespace stripping in the code and the test and simply update the include rules in the config. Note that instead of excluding explicitly this file, maybe it would be better to simply include all .py
files, because that is in the end what we are really interested in. We want to clean the source code files, not any other arbitrary text files.
What we could do is use the following regex I think
exclude: >- >
(?x)^(
tests/.*(?<!\.py)$
)$
i.e. exclude all files in the tests
folder that do not finish with the .py
extension. We still want to clean the test source files of course. Would you agree with this suggestion @ramirezfranciscof ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@unkcpz Ahhhh I see now! Thanks for the explanation, haha, indeed I can see the excess with the newlines but it is more difficult to notice the trailing spaces 😅. Nice idea using the strange underscore symbol to illustrate it.
@sphuber mmm I'm trying to think if this could backfire. I mean, the test section does have other kinds of file (the test/static
folders have some .cif
, .UPF
, .aiida
exports, .json
) but perhaps we don't want to be lintering those anyways? I also had the idea that there were some notebooks being used to do some tests, but I couldn't find anything there now. I'm wondering if at some point it won't be convenient to use some bash scripts as aux to some tests and we may want to linter those too. It is also worth mentioning that the .sh
extension was set explicitly by file_regression.check(content, extension='.sh')
, so we could also decide on a specific extension for reference files and exclude those (.pyreg
? .pytest_regression_uniquely_long_aiida_extension
?). I mean, I agree it would be nice to set some configuration to deal now and for all with all pytest regression files, I'm just not sure just applying the linter to .py
is the clear way to do this.
There is also one last possible chance that we might be able to deal with this in a simple way inside the test (which is, actually, more self-documenting that the initial one I proposed...):
content = '# pylint: skip-file\n' + content
In any case, as the proposed alternative would be a modification internal to the tests so we are more free to set it back if we change our minds, I'm ok with you going the way you prefer. It still pains me to have to add the exception to the .pre-commit-config.yaml
, but since you already tried a suggestion I made that underestimated the problem I won't insist on this if you think the pylint config is the best way to go.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is also one last possible chance that we might be able to deal with this in a simple way inside the test (which is, actually, more self-documenting that the initial one I proposed...):
content = '# pylint: skip-file\n' + content
It is not pylint
that does the trimming though, so that won't work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @ramirezfranciscof , rather than delete the folder, I add the following @pytest.fixture(scope='function')
def dry_run_in_tmp(request):
"""change to the tmp directory to do the dry run, then change back to the calling directory to avoid side-effects"""
os.chdir('/tmp')
yield
os.chdir(request.config.invocation_dir) |
This looks fine, but I wouldn't hardcode the |
@sphuber thanks! Changed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot @unkcpz and especially for your patience. I realize we have been going back and forth quite a bit on various things. I have now fully reviewed the code (which I hadn't done yet, just replied to question and comments) and I have two comments remaining. Once those are addressed this is good to be merged for me.
this_withmpi = self.node.get_option('withmpi') # set code mpi, default set to as calcjob one | ||
if code_info.withmpi: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand correctly, there are two ways to set withmpi
: on the process itself through the metadata.options.withmpi
input and through the CodeInfo
that is created by the plugin itself, where the latter should be leading if both are defined.
If that analysis is correct, than I think this code is not correct. By checking just if code_info.withmpi
the actual value will not be overridden even if it is set to False
. However, this seems like a good use-case from a plugin perspective. If the plugin needs to run two codes, for which one always needs to be run without mpi, i.e. code_info.withmpi == False
, then that should override the value taken from self.node.get_option('withmpi')
. That is to say, False
is also an actual value that should not be ignored. So should the conditional not be if code_info.withmpi is not None
?
To make this code a lot clearer, I would propose writing it as follows, with comment:
this_withmpi = self.node.get_option('withmpi') # set code mpi, default set to as calcjob one | |
if code_info.withmpi: | |
# To determine whether this code should be run with MPI enabled, we get the value that was set in the inputs | |
# of the entire process, which can then be overwritten by the value from the `CodeInfo`. This allows plugins | |
# to force certain codes to run without MPI, even if the user wants to run all codes with MPI whenever | |
# possible. This use case is typically useful for `CalcJob`s that consist of multiple codes where one or | |
# multiple codes always have to be executed without MPI. | |
withmpi = self.node.get_option('withmpi') | |
# Override the value of `withmpi` with that of the `CodeInfo` if and only if it is set | |
if code_info.withmpi is not None: | |
withmpi = code_info.withmpi |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Uhm, I totally missed this on my pass, my bad 😕.
Follow up question: we want code_info.withmpi
to always override the calcjob setting? Or if the user has explicitly set withmpi
to be False
then all codes executed should actually respect that? (in which case setting code_info.withmpi = True
or leaving code_info.withmpi = None
would have the same effect) Or the idea is that if a plugin sets code_info.withmpi = True
it means that the code can only be ran with mpi?
Maybe we can make a table of expected results and intent and check these in a test?
[Calcjob] True | [Calcjob] False | |
---|---|---|
[Code] True (the plugin/code can only be run with mpi?) | Run with MPI | Incompatible? Raise? |
[Code] False (this code does not use parallelism at all) | Run without MPI | Run without MPI |
[Code] None (this code can run with or without MPI) | Run with MPI | Run without MPI |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or if the user has explicitly set withmpi to be False then all codes executed should actually respect that?
In the example that I sketched, this would be problematic. Imagine a CalcJob
that executes two codes: a small pre-processing script that has to be executed serially or it will crash, and then a normal executable (like pw.x
) that can be executed serially or with MPI depending on how it is compiled. In this case, the metadata.options.withmpi
is the tool for the user to toggle the behavior for the latter, but it should not override the setting for the pre-processing script because it will fail.
I am not sure if this solution covers all use cases or whether there are exceptions where the user definitely needs to be able to override. However, if the plugin explicitly sets CodeInfo.withmpi
than there must be a good reason. If it can be run with either setting, maybe the plugin should not set one value either way. Or we should change metadata.options.withmpi
to be able to take a list, but that may become even more problematic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ramirezfranciscof @sphuber thanks for making this more clear. This is an important catch! It is the right code that I intended to implement.
Only if plugin developer sets CodeInfo.withmpi
explicitly the metadata.option.withmpi
will not override it through running CalcJob
. The ultimate behaviour will be the case shown in the table above. Indeed that does not include the scenario that the plugin user what to override what was set in CodeInfo. I have no strong opinion on whether we should take this into account.
But I want to mention that maybe we can consider this problem together with setting default calcjob when configuring code #4991. Plugin user can only set one default CalcJob plugin for the code and this is the code that is implicitly called with self.inputs.code
in CalcJob
. I would assume the plugin user only what to control the MPI behaviour of this particular code. And the other codes need to be explicitly passed as an InputPort in defining the CalcJob, which give plugin user an option to use the correct compiled code(MPI one or serial one) if the withmpi
of this codeinfo is fixed by the plugin developer.
Therefore, I prefer to have a detailed comment as shown by @sphuber and add information in documentation to warn the plugin developer to carefully set this codeinfo.withmpi
in their plugin when there is more than one code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the example that I sketched, this would be problematic. Imagine a
CalcJob
that executes two codes: a small pre-processing script that has to be executed serially or it will crash, and then a normal executable (likepw.x
) that can be executed serially or with MPI depending on how it is compiled. In this case, themetadata.options.withmpi
is the tool for the user to toggle the behavior for the latter, but it should not override the setting for the pre-processing script because it will fail.
Yes, so I think you are describing what should happen in the table I included when [Calcjob] True
and [Code] False
. Here it makes sense that the CodeInfo.withmpi
overrides the metadata.options.withmpi
because of the reasons you already explained: even if the user is indicating it has MPI available for the main code called by the calculation, other auxiliary calls may not benefit from or not even be able to run in parallel.
My question was more what should happen in the reciprocal case, when [Calcjob] False
and [Code] True
. Here the user is specifically saying to NOT use MPI at all, is it still justified to override this and anyways use MPI on the CodeInfo.withmpi = True
code? Or does this suggest a more fundamental incompatibility and thus should raise?
(@unkcpz you mention you agreed with the behavior in the table, does this mean you agree with raising in this second case?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My apologies @ramirezfranciscof, I gave it a glimpse and take for granted with my own mind.
I reconsider your advice to raise here and start to like it. Maybe it can be a raising warning rather than an exception? But I still assume the developer of the plugin should be careful and know what they do when setting codeinfo.withmpi
.
Do you have a strong opinion on raising an exception or maybe a warning?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My preference is typically on the versatility on the end user and clarity when we are overriding his intentions. So here I would say:
- I would issue a warning if then we let
metadata.option.withmpi
overridecodeinfo.withmpi
(we let the user know the plugin designer set this to only be run with MPI for some reason but we allow them to proceed as they see fit) - In the other case, i.e. if after the warning
codeinfo.withmpi
would overridemetadata.option.withmpi
, then I would say it is preferable to raise. This setting is not being used for anything else as far as I know, if the user is setting it, it is specifically for this and we are overriding it. They wouldn't be able to do what they want anyways, so just raise saying "you are not allowed to set this".
Now as to which of those two I prefer, I'm not sure. I'm still a bit puzzled as to in what situation the plugin designer would set something to ONLY be run with MPI, it is a bit bizarre to me. I'll let you decide what is best for this (or @sphuber if he has another opinion).
NOTE: I am now re-reading the previous logic to check all the cases of the table. The previous behavior was that it would first check the individual codeinfo.withmpi
and only if it was None
and if it was a single code, then it would check metadata.option.withmpi
. So if there were multiple codes being called, then the plugin designer had to set whether they were run with MPI or not, and the user had no say in this. So before there was actually no way for the plugin designer to leave how to run some of the codes to "the criteria" of the user.
This gets more bizarre the more we look into it...🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All the bizarre things come from there is only one binary flag that can be set from metadata.option.withmpi
by plugin user when actually a list of flags need for multiple codes case, which is hard to settle at this PR.
I would like to suggest that we stop here and leave all complicity to the documentation. For plugins that involve more than one code, if the user wants to get full control in MPI mode of how to run codes, they are suggested to split the calcjob into parts with each one running with one code only.
aiida/schedulers/scheduler.py
Outdated
@@ -216,7 +216,7 @@ def _get_run_line(self, codes_info, codes_run_mode): | |||
|
|||
output_string = f'{command_to_exec} {stdin_str} {stdout_str} {stderr_str}' | |||
|
|||
list_of_runlines.append(output_string) | |||
list_of_runlines.append(output_string.rstrip()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are absolutely right @unkcpz the formatter strips all whitespace, also terminating spaces in lines in the middle of the file. Apologies for not having thought of this, I was focusing on the terminating new line characters.
OK, so here is my thought process. We have added the whitespace trimming pre-commit action such that trailing whitespace is automatically cleaned up in source code, as it can make things look messy and more difficult to read than necessary. When I added it, I did not specify what files it should run for, and so let it run for everything, even comparison files that are used in the unit tests, even though those are not the primary target. We don't need to read those often, so trailing newspace is not a real problem. When you consider the motivation for the whitespace trimming, it seems clear that the correct solution for this problem is to exclude the test comparison file from the pre-commit. So I now agree that @unkcpz original solution of updating the exclude rules in the .pre-commit-config.yaml
file is actually the correct one.
I propose therefore that we revert the changes we requested (really sorry for the back and forth @unkcpz ), removing the manual explicit whitespace stripping in the code and the test and simply update the include rules in the config. Note that instead of excluding explicitly this file, maybe it would be better to simply include all .py
files, because that is in the end what we are really interested in. We want to clean the source code files, not any other arbitrary text files.
What we could do is use the following regex I think
exclude: >- >
(?x)^(
tests/.*(?<!\.py)$
)$
i.e. exclude all files in the tests
folder that do not finish with the .py
extension. We still want to clean the test source files of course. Would you agree with this suggestion @ramirezfranciscof ?
@sphuber no worries about back and forth, I was only trying to get rid of the pre-commit error in the previous commit. Now we know what causes the pre-commit failure. I think we need to add it in top-level, correct? Otherwise exclude files in two hooks seems nasty. exclude: >-
(?x)^(
tests/.*(?<!\.py)$
)$
repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v2.5.0
hooks:
- id: double-quote-string-fixer
- id: end-of-file-fixer
- id: fix-encoding-pragma
- id: mixed-line-ending
- id: trailing-whitespace |
I am pretty sure you can apply the exclude for all pre-commit hooks and not apply them to all repos, like so: repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v2.5.0
hooks:
- id: double-quote-string-fixer
- id: end-of-file-fixer
- id: fix-encoding-pragma
- id: mixed-line-ending
- id: trailing-whitespace
exclude: >-
(?x)^(
tests/.*(?<!\.py)$
)$ |
Hmmm.... no, it is not true. The exclude key only take effect for the hook. With your configuration above, I got the following files modification:
Have to configure pre-commit like: repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v2.5.0
hooks:
- id: double-quote-string-fixer
- id: end-of-file-fixer
exclude: >-
(?x)^(
tests/.*(?<!\.py)$
)$
- id: fix-encoding-pragma
- id: mixed-line-ending
- id: trailing-whitespace
exclude: >-
(?x)^(
tests/.*(?<!\.py)$
)$ |
Ah sure, we need it to apply to multiple hooks, but you can define the list once and use a reference like so repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v2.5.0
hooks:
- id: double-quote-string-fixer
- id: end-of-file-fixer
exclude: &exclude_files >
(?x)^(
tests/.*(?<!\.py)$
)$
- id: fix-encoding-pragma
- id: mixed-line-ending
- id: trailing-whitespace
exclude: *exclude_files |
thanks @sphuber ! That makes sense. However, the |
Maybe just name it after the pre-commit step, i.e., |
Implementing `CalcJob` plugin with more than one code got incorrect `codes_run_mode` option check and raise. And the `metadata.option.withmpi` was not able to be read and set for the codes when more than one code involved in the `CalcJob` plugin. In this commit, the `codes_run_mode` has the default value set to `SERIAL` mode. As for `withmpi` option, we get the value that was set in the inputs of the entire process (by `metadata.option.withmpi`), which can then be overwritten by the value from the `CodeInfo`. This allows plugins to force certain codes to run without MPI, even if the user wants to run all codes with MPI whenever possible.
Hi @ramirezfranciscof, I add a test to check the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot @unkcpz . I am approving this to be merged. I agree that any further complications are indeed due to mismatch in the process inputs and the CodeInfo
definition. I think we should simply open an issue that aims to improve the documentation on CodeInfo
which includes a discussion on the withmpi
option.
Yes, I'm ok with merging, we have had substantial discussion as it is and I don't want to drag the PR much longer (I hope it wasn't too tedious for you =P). However I do want to make a final comment that I'm not so sure this is just a matter of documentation: I feel like the current behavior may be limiting in the sense that it may be putting control on how to execute code on the wrong actor (i.e. if MPI is an option with which codes can or not be compiled, it is the user who knows how the code is compiled and I'm not sure the plugin designer can make the decision to force that at code-design time). |
If
calc_info.codes_run_mode
is not set when using more than one code in a calc_info, thePluginInternalError
is not properly raised.calc_info.codes_run_mode
will be aNone
, and the exception never raised here.