-
Notifications
You must be signed in to change notification settings - Fork 18
Add functionality for multiple morphs in one command #82
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## main #82 +/- ##
==========================================
+ Coverage 75.01% 84.33% +9.32%
==========================================
Files 36 38 +2
Lines 1437 1890 +453
==========================================
+ Hits 1078 1594 +516
+ Misses 359 296 -63
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
Warnings from test come from use of depreciated pkg_resources in diffpy.utils. Removed diffpy.pdfmorph's use of pkg_resources in its version.py module. Errors in checks from importlib_metadata, requires fix. |
Need to add plotting of Rw vs temperature (if temperature given). Otherwise, a histogram will be displayed of Rw for each file in directory as target. |
Covers the first four check boxes in issue #84. Still need some refactoring to cover the final three check boxes, but those can be done in a single commit. See comments for major edits. |
# ensure all entries in target_labels are distinct for plotting | ||
unique_labels = set() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
diffpy/pdfmorph/pdfplot.py
Outdated
if numerical: | ||
# Plot Rw vs Temperature | ||
plt.plot(target_labels, rws, linestyle='-', marker='o') | ||
plt.ylabel(r"$R_w$") | ||
plt.xlabel(rf"{field}") | ||
plt.minorticks_on() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
angle = numpy.arccos(bar_size / max_len) | ||
angle *= 180 / numpy.pi # Convert to degrees | ||
plt.xticks(rotation=angle) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A slight tilt to make the graph less tall
self.testfiles.append(testsequence_dir.joinpath(filename)) | ||
return | ||
|
||
def test_morphsequence(self, setup_default): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test morphsequence is same as performing multiple morphs individually
s_sequence_results = multiple_morphs(self.parser, opts, pargs, stdout_flag=False) | ||
assert s_sequence_results == sequence_results | ||
|
||
def test_morph_outputs(self, setup_default, tmp_path): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check if files are saved properly against files in the new testsaving
directory
nickel_PDF = testdata_dir.joinpath("nickel_ss0.01.cgr") | ||
serial_JSON = testdata_dir.joinpath("testsequence_serialfile.json") | ||
|
||
class TestExceptions(unittest.TestCase): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will change to pytest; additionally, the three files (1) test_pdfmorphapp_exceptions.py
, (2) test_morphsequence.py
and (3) test_parser.py
will be moved to a new file test_pdfmorphapp.py
to follow convention. This will be done either in the change_libraries branch or another commit.
def setUp(self): | ||
self.parser = create_option_parser() | ||
|
||
def test_length_exceptions(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ensure all the following throw an exception
diffpy/pdfmorph/tests/test_tools.py
Outdated
self.assertAlmostEqual(tools.nn_value(value, name=None), abs(value)) | ||
self.assertAlmostEqual(tools.nn_value(-value, name=None), abs(-value)) | ||
|
||
def test_field_sort(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check new field_sort function (edited from temperature_sort)
if val < 0: | ||
negative_value_warning = f"\n# Negative value for {name} given. Using absolute value instead." | ||
print(negative_value_warning) | ||
return -val | ||
return val | ||
|
||
|
||
def deserialize(serial_file): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function exists to keep the diffpy.utils
import contained in the tools.py
module
return parsers.deserialize_data(serial_file) | ||
|
||
|
||
def field_sort(filepaths: list, field, reverse=False, serfile=None, get_field_values=False): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sort a list of files by a field. This field can either be in the header or in a metadata/serial file. Includes a reverse option.
diffpy/pdfmorph/pdfmorphapp.py
Outdated
metavar="FILE", | ||
dest="savefile", | ||
help="Save manipulated PDF from FILE1 to FILE. Use \'-\' for stdout.", | ||
metavar="SLOC", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New options begin here. Note metavar FILE in -s
option changed to SLOC since it can now also be a directory.
diffpy/pdfmorph/pdfmorphapp.py
Outdated
print(header_verbose, file=outfile) | ||
|
||
# Print table with label | ||
print("\n# Labels: [r] [gr]", file=outfile) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added label before table is printed
# Parse paths | ||
morph_file = Path(pargs[0]) | ||
if not morph_file.is_file(): | ||
parser.custom_error(f"{morph_file} is not a file. Go to --help for usage.") | ||
target_directory = Path(pargs[1]) | ||
if not target_directory.is_dir(): | ||
parser.custom_error(f"{target_directory} is not a directory. Go to --help for usage.") | ||
|
||
# Do not morph morph_file against itself if it is in the same directory | ||
target_list = list(target_directory.iterdir()) | ||
if morph_file in target_list: | ||
target_list.remove(morph_file) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Get target files to morph against
# Format field name for printing and plotting | ||
field = None | ||
if opts.field is not None: | ||
field_words = opts.field.split() | ||
field = "" | ||
for word in field_words: | ||
field += f"{word[0].upper()}{word[1:].lower()}" | ||
field_list = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Get field if we want to sort by it
diffpy/pdfmorph/pdfmorphapp.py
Outdated
# Make directory to save files in if it does not already exist | ||
Path(save_opt).mkdir(parents=True, exist_ok=True) | ||
|
||
# Morphs will be saved in the subdirectory "Morphs" | ||
save_morphs_here = Path(save_opt).joinpath("Morphs") | ||
save_morphs_here.mkdir(exist_ok=True) | ||
|
||
# Get names for the saved morphs | ||
if opts.snfile is not None: | ||
# Names should be stored properly in opts.snfile | ||
save_names = tools.deserialize(opts.snfile) | ||
# Default naming scheme | ||
else: | ||
for target_file in target_list: | ||
save_names.update({target_file.name: {"save_morph_as": | ||
f"Morph_with_Target_{target_file.stem}.cgr"}}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The -s
option for multiple morphs saves all the morphs to a user-specified directory. Default names for the names of morphs are "Morph_with_Target_{target_PDF_name}.cgr"
diffpy/pdfmorph/pdfmorphapp.py
Outdated
# Table labels | ||
labels = "\n# Labels: [Target]" | ||
if field is not None: | ||
labels += f" [{field}]" | ||
for param in [["Scale", scales], ["Stretch", stretches], ["Smear", smears]]: | ||
if len(param[1]) > 0: | ||
labels += f" [{param[0]}]" | ||
labels += " [Pearson] [Rw]\n" | ||
|
||
# Corresponding table | ||
table = "" | ||
for idx in range(results_length): | ||
row = f"{file_names[idx]}" | ||
if field is not None: | ||
row += f" {field_list[idx]}" | ||
for param in [scales, stretches, smears]: | ||
if len(param) > idx: | ||
row += f" {param[idx]:0.6f}" | ||
row += f" {pearsons[idx]:0.6f} {rws[idx]:0.6f}" | ||
table += f"{row}\n" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this looks really great!
Most recent two commits cover the last three check boxes in issue #84 (list of edits to be made based on comments on PR #81). Only issue left to be resolved is making |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please work on the help messages for the cli parser. I am a professional PDF person and am having a hard time figuring out what you want there.... if a chemist is the target audience, try and make it science oriented. They won't know what SLOC is but they will know they have a directory folder containing a set of PDFs from their sample.
The App looks really great. I am happy to merge before refactoring the io into separate modules (or we could just put that as an issue) but I think it would be good to do that if we think any of it is reusable.
Thanks for refactoring the tests. I think that pytest can run unittests so I wasn't really needing this (just new tests in pytest syntax) but it is good to have it done! :)
.github/workflows/main.yml
Outdated
@@ -46,11 +46,22 @@ jobs: | |||
conda install --file requirements/test.txt | |||
pip install . | |||
|
|||
# redownload diffpy.utils pre 3.2.0 release, use https since public repo |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please make sure there is an issue to remove this when diffpy.utils is released on conda-forge. I would rather not do this here tbh, but if it is needed for now it is ok, but let's make sure we remove it later.
What is left on diffpy.utils to get it released? Can we just do that and not have this here? it is just making more work in the future.....
diffpy/pdfmorph/pdfmorphapp.py
Outdated
help="Save manipulated PDF from FILE1 to FILE. Use \'-\' for stdout.", | ||
metavar="SLOC", | ||
dest="saveloc", | ||
help="""Save the manipulated PDF to a file SLOC. Use \'-\' for stdout. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will a chemist be able to understand this? I am not sure the target audience (presume a chemist) but please write the help with the target audience in mind.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it, changed SLOC to NAME.
diffpy/pdfmorph/pdfmorphapp.py
Outdated
'--save-names-file', | ||
metavar="SNFILE", | ||
dest="snfile", | ||
help="""Used only when -s and --multiple are enabled. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
be clear if this is OR or AND, e.g., "Used only if both -s and -multiple are enabled."
Again, read it with a beginning grad in chemistry in mind....will it be clear to them what to input here and when and how? Examples are sometimes better than text filled with jargon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed the names and descriptions. Also added an example for ``--save-names-file` in tutorial.
diffpy/pdfmorph/pdfmorphapp.py
Outdated
if len(pargs) < 2: | ||
parser.error("You must supply FILE1 and FILE2.") | ||
elif len(pargs) > 2: | ||
parser.error("Too many arguments.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
with users in mind, when things fail it is always more helpful, as well as telling them why it failed, telling them what they should do next to make it work properly. Much less frustrating for the user.
diffpy/pdfmorph/pdfmorphapp.py
Outdated
with open(opts.savefile, 'w') as outfile: | ||
try: | ||
if opts.saveloc != "-": | ||
save_file_name = Path(opts.saveloc).resolve().as_posix() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need the as_posix()? OK if yes, but just checking.
parser.custom_error("Too many arguments. Go to --help for usage.") | ||
|
||
# Parse paths | ||
morph_file = Path(pargs[0]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this looks like a lot of io handling which would be better in some kind of io function somewhere rather than in the app I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have factored some io handling in an io module, but both functions take in a lot of inputs to make work.
diffpy/pdfmorph/pdfmorphapp.py
Outdated
opts.plot = False | ||
|
||
# Manage saving | ||
save_opt = opts.saveloc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as above. Do we want io in some kind of io.py module?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have this part yet to factor out
diffpy/pdfmorph/pdfmorphapp.py
Outdated
}) | ||
|
||
# Parse rws | ||
file_names = [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
put all on one line? e.g.,
file_names, rws, pearsons = [], [], []
diffpy/pdfmorph/pdfmorphapp.py
Outdated
if "smear" in results[key]: | ||
smears.append(results[key]["smear"]) | ||
|
||
# Input parameters used for every morph |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a header-handling function may also be reusable and can go into another module
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Put into tools with the name get_values_from_dictionary_collection()
, though I am not particularly satisfied with that name.
diffpy/pdfmorph/pdfmorphapp.py
Outdated
# Table labels | ||
labels = "\n# Labels: [Target]" | ||
if field is not None: | ||
labels += f" [{field}]" | ||
for param in [["Scale", scales], ["Stretch", stretches], ["Smear", smears]]: | ||
if len(param[1]) > 0: | ||
labels += f" [{param[0]}]" | ||
labels += " [Pearson] [Rw]\n" | ||
|
||
# Corresponding table | ||
table = "" | ||
for idx in range(results_length): | ||
row = f"{file_names[idx]}" | ||
if field is not None: | ||
row += f" {field_list[idx]}" | ||
for param in [scales, stretches, smears]: | ||
if len(param) > idx: | ||
row += f" {param[idx]:0.6f}" | ||
row += f" {pearsons[idx]:0.6f} {rws[idx]:0.6f}" | ||
table += f"{row}\n" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this looks really great!
diffpy/pdfmorph/pdfmorphapp.py
Outdated
metavar="NAME", | ||
dest="saveloc", | ||
help="""Save the manipulated PDF to a file SLOC. Use \'-\' for stdout. | ||
When --multiple is enabled, each manipulated PDF is saved to a directory SLOC. | ||
Names for each manipulated PDF can be specified using --snf.""", | ||
help="""Save the manipulated PDF to a file named NAME. Use \'-\' for stdout. | ||
When --multiple is enabled, save each manipulated PDF as a file in a directory named NAME; | ||
you can specify names for each saved PDF file using --save-names-file.""", | ||
) | ||
parser.add_option( | ||
'--snf', | ||
'--save-names-file', | ||
metavar="SNFILE", | ||
metavar="NAMESFILE", | ||
dest="snfile", | ||
help="""Used only when -s and --multiple are enabled. | ||
help="""Used only when both -s and --multiple are enabled. | ||
Specify names for each manipulated PDF when saving (see -s) using a serial file | ||
SNFILE. Each target PDF should be an entry in SFILE with a key save_morph_as | ||
whose value specifies the name to save the manipulated PDF as.""", | ||
NAMESFILE. The format of NAMESFILE should be as follows: each target PDF | ||
is an entry in NAMESFILE. For each entry, there should be a key 'save_morph_as' | ||
whose value specifies the name to save the manipulated PDF as. | ||
(See sample names files in the pdfmorph tutorial).""", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hope these changes are more understandable. I will add more files to tutorial and quickstart tutorial if needed.
.github/workflows/main.yml
Outdated
@@ -46,18 +46,6 @@ jobs: | |||
conda install --file requirements/test.txt | |||
pip install . | |||
|
|||
# redownload diffpy.utils pre 3.2.0 release, use https since public repo |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No more downloading from github: use conda-forge release
diffpy/pdfmorph/pdfmorphapp.py
Outdated
group.add_option( | ||
'--plot-parameter', | ||
metavar='PLOTPARAM', | ||
dest='plotparam', | ||
help="""Used when both plotting and --multiple are enabled. | ||
Choose a PLOTPARAM to plot for each morph (i.e. adding --pp=Pearson means the program | ||
will display a plot of the Pearson correlation coefficient for each morph-target pair). | ||
PLOTPARAM is not case sensitive, so both Pearson and pearson indicate the same parameter. | ||
When PLOTPARAM is not specified, Rw values for each morph-target pair will be plotted. | ||
PLOTPARAM will be displayed as the vertical axis label for the plot.""", | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only one new option added this commit. All others were just moved (put into separate options group)
pdfplot.plot_rws(target_file_names, plot_results["Rw"]) | ||
try: | ||
if field_list is not None: | ||
pdfplot.plot_param(field_list, param_list, param_name, field) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No tests written for plotting, but have some to check failure conditions
Fixes issue #84.
Added --sequence option for someone to morph a FILE multiple times against all files in a DIRECTORY as target. Plots the Rw values by default, and can save a table. Default is to sort files in DIRECTORY in alphabetical order.




A user can enable the --temperature option if all files in DIRECTORY end in "###K.gr" or "###K.cgr". This will sort files by ###, which should be the temperature (in Kelvin). Plots and tables look like the following if this option is enabled.
Fixes #77: default behavior is to set plot labels to file names. Making default just "morph" and "target" means multiple morphs in a morph sequence will have the same label.
Also fixes #47: checked the saving code enough to know it works.
New warnings appear during tests; see issue in diffpy.utils.