Skip to content

Writing Tutorials #81

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 12 commits into from
Closed

Conversation

Sparks29032
Copy link
Collaborator

Doing some reformatting and rewriting of docscripts for automodule formatting. Initial draft of an auto-reformatter included as DocReformat.java.

Sample documentation after initial reformatting below.

Sample 1

image

Sample 2

image

@codecov
Copy link

codecov bot commented Jun 22, 2023

Codecov Report

Merging #81 (7663bfa) into main (6fdf12b) will decrease coverage by 0.11%.
The diff coverage is 55.55%.

@@            Coverage Diff             @@
##             main      #81      +/-   ##
==========================================
- Coverage   75.01%   74.91%   -0.11%     
==========================================
  Files          36       36              
  Lines        1437     1443       +6     
==========================================
+ Hits         1078     1081       +3     
- Misses        359      362       +3     
Impacted Files Coverage Δ
diffpy/pdfmorph/log.py 0.00% <ø> (ø)
diffpy/pdfmorph/pdfmorph_api.py 90.24% <ø> (ø)
diffpy/pdfmorph/pdfmorphapp.py 26.90% <0.00%> (-0.37%) ⬇️
diffpy/pdfmorph/pdfplot.py 8.10% <ø> (ø)
diffpy/pdfmorph/version.py 0.00% <0.00%> (ø)
diffpy/pdfmorph/morph_helpers/transformpdftordf.py 100.00% <100.00%> (ø)
diffpy/pdfmorph/morph_helpers/transformrdftopdf.py 91.30% <100.00%> (+1.30%) ⬆️
diffpy/pdfmorph/refine.py 76.00% <100.00%> (ø)
diffpy/pdfmorph/tools.py 72.54% <100.00%> (ø)

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@sbillinge
Copy link
Contributor

@Sparks29032 this looks really great.

@sbillinge
Copy link
Contributor

everything looking really great!

@Sparks29032
Copy link
Collaborator Author

Adding "morph sequence" (tentative name for performing multiple morphs in one command) functionality in morph_sequence branch. Once morph_sequence is merged, will add tutorial text in this branch and compress/zip the additionalData folder. Then this branch should be ready to merge.

@Sparks29032 Sparks29032 changed the title Initial reformatting Writing Tutorials Jul 5, 2023
Copy link
Contributor

@sbillinge sbillinge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great work. please see my comments

--temperature is enabled. Plotting and saving options are for
a Rw plot and table respectively.""",
)
parser.add_option(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is still really sorting by filename here so I think this is not needed.

Another option would be to really store by temperature where temperature is piece of metadata. Then we would need to know where the metadata came from. The gr files allow metadata in the header region so we could parse the file for a "temperature" field and then use the value of that item in alpha-numeric order. If we offer that, we could allow sorting by any metadata field, so allow the option for a user to specify a string that matches any field in the header and then sort alphanumercically by the value of those items. This seems better to me. What do you think?

I guess default behavior is generally to make the higher-T PDF the target and the lower-T the morph (it is easier to broaden than sharpen data with morphs). This may frustrate a user who has some fancy sorting that is not temperature, like a battery discharging or something and it is becoming less ordered in time or sthg, or for some other reason, so we could offer a --reverse option so the sorting is done with the higher value morphing onto a lower value target.

Many users might have a table in their notebooks or in a csv file that does a mapping of "temperature" onto the filename (or any other item-key onto filename). Another approach would be to have tools that can parse such a file and collect the metadata that way. Moving forwards we would like to move in the direction of databases, so being able to parse the same info from json serialized data also would be useful. In this world, we would allow the user to specify a file (support csv or json or yaml formats?) that maps metadata fields to a filename. Then this could be the target we use for our sorting. If we had tools that could take this information and inject it into .gr files as an option, or vice-versa, parse out the gr file headers into metadata files, it would be good. This would then be very helpful in general.

With this in mind, I would suggest that we have a --sort-by that takes a string which is the field we want to sort by, a --reverse which reverses the sort sense when true. Your --temperature option could be omitted, or could be there but is a helper meaning "sort-by temperature". By default it could look for medatada..<ext> as the metadata filename where could be ["csv", "json", "yaml", "yml"] or sthg, but also have a --metadata-filename option that overrides this. What do you think?

These issues are way broader than pdfmorph, so we could add a bunch of capabilities in diffpy.utils that do the file handling, metadata serialization and deserialization in different formats so these things are available across all of diffpy (and beyond). That could be a separate little mini project.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is a way more robust solution. I can start with reading metadata from headers. Could I have an example of a separate metadata mapping file before I work on that part? I can include both as standalone functions in tools so we can move them to diffpy.utils later.

@@ -227,9 +239,130 @@ def custom_error(self, msg):
def main():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's put main at the bottom right next to if __name__ ==....


def multiple_morphs(parser, opts, pargs, stdout_flag):
# Custom error messages since usage is distinct when --sequence tag is applied
if len(pargs) < 2:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems a bit brittle and unmaintainable to me since we may add or remove parser options in the future (see my comments above). Can we find a more robust way to do this?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since most parser options are shared by single_morph and multiple_morphs, changing most parser options would only require changes in the single_morph function. Options that are not shared between them (plotting and saving since different items are plotted/saved) are the only ones kept separate right now.


# Output to stdout
path_name = Path(outfile.name).absolute()
save_message = f"\n# Rw table saved to {path_name}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe don't prepend a blank line? It rarely looks good.

Copy link
Collaborator Author

@Sparks29032 Sparks29032 Jul 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is a sample output. I wanted to keep it spaced out from the other outputs. However, after adding the --verbose option, this may change.

# Input morphing parameters:
# scale = None
# stretch = None
# smear = None

# Target: SrFe2As2_174K.gr
# Optimized morphing parameters:
# rmax = 100.010000
# rmin = 0.000000
# rstep = 0.010000
# Rw = 0.072189
# Pearson = 0.997625

# Target: SrFe2As2_180K.gr
# Optimized morphing parameters:
# rmax = 100.010000
# rmin = 0.000000
# rstep = 0.010000
# Rw = 0.085544
# Pearson = 0.996714

# Target: SrFe2As2_186K.gr
# Optimized morphing parameters:
# rmax = 100.010000
# rmin = 0.000000
# rstep = 0.010000
# Rw = 0.098680
# Pearson = 0.995607

# Target: SrFe2As2_192K.gr
# Optimized morphing parameters:
# rmax = 100.010000
# rmin = 0.000000
# rstep = 0.010000
# Rw = 0.117966
# Pearson = 0.993501

# Target: SrFe2As2_198K.gr
# Optimized morphing parameters:
# rmax = 100.010000
# rmin = 0.000000
# rstep = 0.010000
# Rw = 0.150522
# Pearson = 0.989150

# Target: SrFe2As2_204K.gr
# Optimized morphing parameters:
# rmax = 100.010000
# rmin = 0.000000
# rstep = 0.010000
# Rw = 0.163814
# Pearson = 0.987273

# Target: SrFe2As2_210K.gr
# Optimized morphing parameters:
# rmax = 100.010000
# rmin = 0.000000
# rstep = 0.010000
# Rw = 0.175156
# rmax = 100.010000
# rmin = 0.000000
# rstep = 0.010000
# Rw = 0.224387
# Pearson = 0.978108

# Rw table saved to C:\Users\Spark\desktop\savefile.txt

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I were a user I may prefer it in a csv format for easier porting to other programs? this will also be better on PDFitc. No need to print anything to screen, though since you have it you may as well leave it but nest it in if verbose. A summary that would be useful on screen would just be filename and Rw so I can quickly find the jump (i.e., the phase transition) without having to open a file. What do you think?

It makes sense to me to print to screen when it is a single pair of files (in that case no way I want to open up a file from disc"

try:
with open(save_opt, 'w') as outfile:
# Header
print(f"{inputs[0]}\n{inputs[1]}", file=outfile)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks a bit too chatty to me? Maybe enclose it in a if verbose: (collect verbose as -v, --verbose). In general I prefer my code to be less "chatty" because then the outputs just start looking like noise and important messages get lost and overlooked. using verbose allows debugging when things are going wrong and the user can't figure out why.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I also change this for single morphs as well? (Line 526 of this file and onward.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, everywhere......

@@ -0,0 +1,42 @@
#!/usr/bin/env python
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

usually we use the pattern that the test module that that tests modeul <filename> is called test_<filename>. This pattern may not be being used in pdfmorph which whose provenance is older, but when we edit code, we like the edits to follow the new standards if possible. I am not sure if it makes sense here, but if it does can we use this new pattern?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll move these to a file called test_pdfmorphapp

testsequence_dir = tests_dir.joinpath("testdata").joinpath("testsequence")


class TestMorphSequence(unittest.TestCase):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

per my comment above, we are moving to using pytest because the syntax is easier and personally I never learned unittest syntax so I find it harder to maintain. As per above, we won't go back and change old unittest code to pytest, but can we use pytest syntax in all new tests and older tests that we revisit for some reason?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll figure out how to move setUp() or reconstruct it. Should not be too difficult to change

@sbillinge
Copy link
Contributor

@Sparks29032 please create a new clean branch off a merged main and move over just the things we want (rst files presumably)

@Sparks29032 Sparks29032 deleted the reformat_documentation branch May 30, 2024 21:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants