-
Notifications
You must be signed in to change notification settings - Fork 18
Writing Tutorials #81
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 6 commits
4a5acea
0e77508
78d6ad6
a16d57c
051f368
7663bfa
597df51
91fc072
69c1544
72f735e
72fcb24
6068c0a
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -16,12 +16,10 @@ | |
from __future__ import print_function | ||
|
||
import sys | ||
import os | ||
import os.path | ||
from pathlib import Path | ||
|
||
import numpy | ||
from diffpy.pdfmorph import __version__ | ||
from diffpy.pdfmorph.version import __version__ | ||
import diffpy.pdfmorph.tools as tools | ||
import diffpy.pdfmorph.pdfplot as pdfplot | ||
import diffpy.pdfmorph.morphs as morphs | ||
|
@@ -31,6 +29,7 @@ | |
|
||
def create_option_parser(): | ||
import optparse | ||
prog_short = Path(sys.argv[0]).name # Program name, compatible w/ all OS paths | ||
|
||
class CustomParser(optparse.OptionParser): | ||
def __init__(self, *args, **kwargs): | ||
|
@@ -62,9 +61,27 @@ def custom_error(self, msg): | |
parser.add_option( | ||
'-s', | ||
'--save', | ||
metavar="FILE", | ||
metavar="SFILE", | ||
dest="savefile", | ||
help="Save manipulated PDF from FILE1 to FILE. Use \'-\' for stdout.", | ||
help="Save manipulated PDF from FILE1 to SFILE. Use \'-\' for stdout.", | ||
) | ||
parser.add_option( | ||
'--sequence', | ||
dest="sequence", | ||
action="store_true", | ||
help=f"""Changes usage to \'{prog_short} [options] FILE DIRECTORY\'. FILE | ||
will be morphed with each file in DIRECTORY as target. | ||
Files in DIRECTORY are sorted by alphabetical order unless | ||
--temperature is enabled. Plotting and saving options are for | ||
a Rw plot and table respectively.""", | ||
) | ||
parser.add_option( | ||
'--temperature', | ||
dest="temp", | ||
action="store_true", | ||
help="""Used with --sequence to sort files in DIRECTORY by temperature. | ||
File names in DIRECTORY should end in _#K.gr or _#K.cgr (where # is a float) | ||
to use this option.""", | ||
) | ||
parser.add_option( | ||
'--rmin', type="float", help="Minimum r-value to use for PDF comparisons." | ||
|
@@ -179,12 +196,6 @@ def custom_error(self, msg): | |
dest="plot", | ||
help="Do not show the plot.", | ||
) | ||
group.add_option( | ||
'--usefilenames', | ||
action="store_true", | ||
dest="usefilenames", | ||
help="Use the file names as labels on plot." | ||
) | ||
group.add_option( | ||
'--mlabel', | ||
metavar="MLABEL", | ||
|
@@ -214,6 +225,7 @@ def custom_error(self, msg): | |
) | ||
|
||
# Defaults | ||
parser.set_defaults(sequence=False) | ||
parser.set_defaults(plot=True) | ||
parser.set_defaults(refine=True) | ||
parser.set_defaults(pearson=False) | ||
|
@@ -227,9 +239,130 @@ def custom_error(self, msg): | |
def main(): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. let's put main at the bottom right next to |
||
parser = create_option_parser() | ||
(opts, pargs) = parser.parse_args() | ||
if opts.sequence: | ||
multiple_morphs(parser, opts, pargs, stdout_flag=True) | ||
else: | ||
single_morph(parser, opts, pargs, stdout_flag=True) | ||
|
||
|
||
def multiple_morphs(parser, opts, pargs, stdout_flag): | ||
# Custom error messages since usage is distinct when --sequence tag is applied | ||
if len(pargs) < 2: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This seems a bit brittle and unmaintainable to me since we may add or remove parser options in the future (see my comments above). Can we find a more robust way to do this? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Since most parser options are shared by single_morph and multiple_morphs, changing most parser options would only require changes in the single_morph function. Options that are not shared between them (plotting and saving since different items are plotted/saved) are the only ones kept separate right now. |
||
parser.custom_error("You must supply FILE and DIRECTORY. Go to --help for usage.") | ||
elif len(pargs) > 2: | ||
parser.custom_error("Too many arguments. Go to --help for usage.") | ||
|
||
# Parse paths | ||
morph_file = Path(pargs[0]) | ||
if not morph_file.is_file(): | ||
parser.custom_error(f"{morph_file} is not a file. Go to --help for usage.") | ||
target_directory = Path(pargs[1]) | ||
if not target_directory.is_dir(): | ||
parser.custom_error(f"{target_directory} is not a directory. Go to --help for usage.") | ||
|
||
# Sort files in directory | ||
target_list = list(target_directory.iterdir()) | ||
if opts.temp: | ||
# Sort by temperature | ||
try: | ||
target_list = tools.temperature_sort(target_list) | ||
except ValueError: | ||
parser.custom_error("All file names in directory must end in _#K.gr or _#K.cgr (where # is a float) to use " | ||
"the --temperature option.") | ||
else: | ||
# Default is alphabetical sort | ||
target_list.sort() | ||
|
||
# Disable single morph plotting and saving | ||
plot_opt = opts.plot | ||
opts.plot = False | ||
save_opt = opts.savefile | ||
opts.savefile = None | ||
|
||
# Do not morph morph_file against itself if it is in the same directory | ||
if morph_file in target_list: | ||
target_list.remove(morph_file) | ||
|
||
# Morph morph_file against all other files in target_directory | ||
results = [] | ||
for target_file in target_list: | ||
if target_file.is_file: | ||
results.append([ | ||
target_file.name, | ||
single_morph(parser, opts, [morph_file, target_file], stdout_flag=False), | ||
]) | ||
|
||
# Input parameters used for every morph | ||
inputs = [None, None] | ||
inputs[0] = f"# Morphed file: {morph_file.name}" | ||
inputs[1] = "\n# Input morphing parameters:" | ||
inputs[1] += f"\n# scale = {opts.scale}" | ||
inputs[1] += f"\n# stretch = {opts.stretch}" | ||
inputs[1] += f"\n# smear = {opts.smear}" | ||
|
||
# If print enabled | ||
if stdout_flag: | ||
# Separated for saving | ||
print(f"\n{inputs[0]}{inputs[1]}") | ||
|
||
# Results from each morph | ||
for entry in results: | ||
outputs = f"\n# Target: {entry[0]}\n" | ||
outputs += "# Optimized morphing parameters:\n" | ||
outputs += "\n".join(f"# {i[0]} = {i[1]:.6f}" for i in entry[1]) | ||
print(outputs) | ||
|
||
rws = [] | ||
target_labels = [] | ||
results_length = len(results) | ||
for entry in results: | ||
if opts.temp: | ||
name = entry[0] | ||
target_labels.append(tools.extract_temperatures([name])[0]) | ||
else: | ||
target_labels.append(entry[0]) | ||
for item in entry[1]: | ||
if item[0] == "Rw": | ||
rws.append(item[1]) | ||
|
||
if len(pargs) != 2: | ||
parser.error("You must supply FILE1 and FILE2") | ||
if save_opt is not None: | ||
# Save table of Rw values | ||
try: | ||
with open(save_opt, 'w') as outfile: | ||
# Header | ||
print(f"{inputs[0]}\n{inputs[1]}", file=outfile) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this looks a bit too chatty to me? Maybe enclose it in a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should I also change this for single morphs as well? (Line 526 of this file and onward.) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. yes, everywhere...... |
||
if opts.temp: | ||
print(f"\n# L T(K) Rw", file=outfile) | ||
else: | ||
print(f"\n# L Target Rw", file=outfile) | ||
|
||
# Table | ||
for idx in range(results_length): | ||
print(f"{target_labels[idx]} {rws[idx]}", file=outfile) | ||
outfile.close() | ||
sbillinge marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
# Output to stdout | ||
path_name = Path(outfile.name).absolute() | ||
save_message = f"\n# Rw table saved to {path_name}" | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. maybe don't prepend a blank line? It rarely looks good. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Here is a sample output. I wanted to keep it spaced out from the other outputs. However, after adding the
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If I were a user I may prefer it in a csv format for easier porting to other programs? this will also be better on PDFitc. No need to print anything to screen, though since you have it you may as well leave it but nest it in It makes sense to me to print to screen when it is a single pair of files (in that case no way I want to open up a file from disc" |
||
print(save_message) | ||
|
||
# Save failed | ||
except FileNotFoundError as e: | ||
save_fail_message = "\nUnable to save to designated location" | ||
print(save_fail_message) | ||
parser.custom_error(str(e)) | ||
|
||
if plot_opt: | ||
pdfplot.plot_rws(target_labels, rws, opts.temp) | ||
|
||
return results | ||
|
||
|
||
def single_morph(parser, opts, pargs, stdout_flag): | ||
if len(pargs) < 2: | ||
parser.error("You must supply FILE1 and FILE2.") | ||
elif len(pargs) > 2: | ||
parser.error("Too many arguments.") | ||
|
||
# Get the PDFs | ||
x_morph, y_morph = getPDFFromFile(pargs[0]) | ||
|
@@ -355,15 +488,23 @@ def main(): | |
morphs_in += f"\n# scale = {scale_in}" | ||
morphs_in += f"\n# stretch = {stretch_in}" | ||
morphs_in += f"\n# smear = {smear_in}" | ||
print(morphs_in) | ||
|
||
items = list(config.items()) | ||
items.sort() | ||
# Output morph parameters | ||
morph_results = list(config.items()) | ||
morph_results.sort() | ||
|
||
# Ensure Rw, Pearson last two outputs | ||
morph_results.append(("Rw", rw)) | ||
morph_results.append(("Pearson", pcc)) | ||
|
||
output = "\n# Optimized morphing parameters:\n" | ||
output += "\n".join(f"# {i[0]} = {i[1]:.6f}" for i in items) | ||
output += f"\n# Rw = {rw:.6f}" | ||
output += f"\n# Pearson = {pcc:.6f}" | ||
print(output) | ||
output += "\n".join(f"# {i[0]} = {i[1]:.6f}" for i in morph_results) | ||
|
||
# No stdout output when running morph sequence | ||
if stdout_flag: | ||
print(morphs_in) | ||
print(output) | ||
|
||
if opts.savefile is not None: | ||
path_name = Path(pargs[0]).absolute() | ||
header = "# PDF created by pdfmorph\n" | ||
|
@@ -397,14 +538,14 @@ def main(): | |
|
||
if opts.plot: | ||
pairlist = [chain.xy_morph_out, chain.xy_target_out] | ||
labels = ["morph", "target"] # Default label names | ||
if opts.usefilenames: | ||
labels = [pargs[0], pargs[1]] | ||
else: | ||
if opts.mlabel is not None: | ||
labels[0] = opts.mlabel | ||
if opts.tlabel is not None: | ||
labels[1] = opts.tlabel | ||
labels = [pargs[0], pargs[1]] # Default is to use file names | ||
|
||
# If user chooses labels | ||
if opts.mlabel is not None: | ||
labels[0] = opts.mlabel | ||
if opts.tlabel is not None: | ||
labels[1] = opts.tlabel | ||
|
||
# Plot extent defaults to calculation extent | ||
pmin = opts.pmin if opts.pmin is not None else opts.rmin | ||
pmax = opts.pmax if opts.pmax is not None else opts.rmax | ||
|
@@ -415,7 +556,7 @@ def main(): | |
pairlist, labels, rmin=pmin, rmax=pmax, maglim=maglim, mag=mag, rw=rw, l_width=l_width | ||
) | ||
|
||
return | ||
return morph_results | ||
|
||
|
||
def getPDFFromFile(fn): | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,42 @@ | ||
#!/usr/bin/env python | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. usually we use the pattern that the test module that that tests modeul There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'll move these to a file called test_pdfmorphapp |
||
|
||
|
||
import unittest | ||
from pathlib import Path | ||
|
||
from diffpy.pdfmorph.pdfmorphapp import create_option_parser, single_morph, multiple_morphs | ||
|
||
thisfile = locals().get('__file__', 'file.py') | ||
tests_dir = Path(thisfile).parent.resolve() | ||
testsequence_dir = tests_dir.joinpath("testdata").joinpath("testsequence") | ||
|
||
|
||
class TestMorphSequence(unittest.TestCase): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. per my comment above, we are moving to using pytest because the syntax is easier and personally I never learned unittest syntax so I find it harder to maintain. As per above, we won't go back and change old unittest code to pytest, but can we use pytest syntax in all new tests and older tests that we revisit for some reason? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'll figure out how to move setUp() or reconstruct it. Should not be too difficult to change |
||
def setUp(self): | ||
self.parser = create_option_parser() | ||
filenames = ["g_174K.gr", "f_180K.gr", "e_186K.gr", "d_192K.gr", "c_198K.gr", "b_204K.gr", "a_210K.gr"] | ||
(self.opts, pargs) = self.parser.parse_args(["--scale", "1", "--stretch", "0", "-n", "--temperature"]) | ||
self.testfiles = [] | ||
for filename in filenames: | ||
self.testfiles.append(testsequence_dir.joinpath(filename)) | ||
return | ||
|
||
def test_morphsequence(self): | ||
# Run multiple single morphs | ||
single_results = [] | ||
morph_file = self.testfiles[-1] | ||
for target_file in self.testfiles[:-1]: | ||
pargs = [morph_file, target_file] | ||
single_results.append(single_morph(self.parser, self.opts, pargs, stdout_flag=False)) | ||
pargs = [morph_file, testsequence_dir] | ||
|
||
# Run a morph sequence | ||
sequence_results = multiple_morphs(self.parser, self.opts, pargs, stdout_flag=False) | ||
|
||
# Compare results | ||
for idx in range(len(self.testfiles[:-1])): | ||
assert sequence_results[idx][1] == single_results[idx] | ||
|
||
|
||
if __name__ == '__main__': | ||
unittest.main() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is still really sorting by filename here so I think this is not needed.
Another option would be to really store by temperature where temperature is piece of metadata. Then we would need to know where the metadata came from. The gr files allow metadata in the header region so we could parse the file for a "temperature" field and then use the value of that item in alpha-numeric order. If we offer that, we could allow sorting by any metadata field, so allow the option for a user to specify a string that matches any field in the header and then sort alphanumercically by the value of those items. This seems better to me. What do you think?
I guess default behavior is generally to make the higher-T PDF the target and the lower-T the morph (it is easier to broaden than sharpen data with morphs). This may frustrate a user who has some fancy sorting that is not temperature, like a battery discharging or something and it is becoming less ordered in time or sthg, or for some other reason, so we could offer a
--reverse
option so the sorting is done with the higher value morphing onto a lower value target.Many users might have a table in their notebooks or in a csv file that does a mapping of "temperature" onto the filename (or any other item-key onto filename). Another approach would be to have tools that can parse such a file and collect the metadata that way. Moving forwards we would like to move in the direction of databases, so being able to parse the same info from json serialized data also would be useful. In this world, we would allow the user to specify a file (support csv or json or yaml formats?) that maps metadata fields to a filename. Then this could be the target we use for our sorting. If we had tools that could take this information and inject it into .gr files as an option, or vice-versa, parse out the gr file headers into metadata files, it would be good. This would then be very helpful in general.
With this in mind, I would suggest that we have a
--sort-by
that takes a string which is the field we want to sort by, a--reverse
which reverses the sort sense when true. Your--temperature
option could be omitted, or could be there but is a helper meaning "sort-by temperature". By default it could look formedatada..<ext>
as the metadata filename where could be["csv", "json", "yaml", "yml"]
or sthg, but also have a--metadata-filename
option that overrides this. What do you think?These issues are way broader than pdfmorph, so we could add a bunch of capabilities in
diffpy.utils
that do the file handling, metadata serialization and deserialization in different formats so these things are available across all of diffpy (and beyond). That could be a separate little mini project.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is a way more robust solution. I can start with reading metadata from headers. Could I have an example of a separate metadata mapping file before I work on that part? I can include both as standalone functions in tools so we can move them to diffpy.utils later.