[REVIEW]: Pakman: a modular, efficient and portable tool for approximate Bayesian inference #1716
Comments
Hello human, I'm @whedon, a robot that can help you with some common editorial tasks. @jmlarson1, @gonsie it looks like you're currently assigned to review this paper 🎉. ⭐ Important ⭐ If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this (https://github.com/openjournals/joss-reviews) repository. As a reviewer, you're probably currently watching this repository, which means that with GitHub's default behaviour you will receive notifications (emails) for all reviews 😿 To fix this, do the following two things:
For a list of things I can do to help you, just type:
For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:
@jmlarson1, @gonsie 👋 Welcome and thanks for agreeing to review! The comments from @whedon above outline the review process, which takes place in this thread (possibly with issues filed in the Pakman repository and mentioning the URL of this issue). I'll be watching this thread if you have any questions.
Some things of note:
Hi @jmlarson1, thanks for agreeing to review this project and for your comments. Please find my responses to points 1, 2, and 3 below. I will get back to you on the other points as soon as possible.
Hi, do I understand Pakman correctly:
It would be helpful to have some more comprehensive documentation around the Pakman options. That is, it would be nice for the … Do you have any examples / performance numbers of this working on an HPC cluster, possibly with a batch scheduler (e.g., Slurm)? Does Pakman performance scale as the node count increases?
Hi @gonsie, You have summarised it very well! The only things I would add are that the stdin of simulators is also used to pass on parameters, and that the sweep functionality is not an ABC method; it simply performs a parameter sweep. I agree that documentation on the Pakman options would be helpful. I will add a wiki page listing the Pakman options similarly to the … I have used Pakman on an HPC cluster, but I have not yet run a performance benchmark for HPC clusters. I will do so and get back to you when I have the results.
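To illustrate the simulator model described above, here is a minimal sketch of the kind of user-supplied simulator Pakman drives. The toy model, the observed value, and the distance function are all hypothetical; in an actual Pakman simulator, the parameter would arrive on stdin and the result would be written to stdout, following the protocol documented in the Pakman wiki.

```python
import random

def simulate(theta, n=100, seed=0):
    # Hypothetical toy stochastic model: count the successes
    # among n Bernoulli(theta) trials.
    rng = random.Random(seed)
    return sum(rng.random() < theta for _ in range(n))

def distance(theta, observed, n=100, seed=0):
    # Distance between the simulated and observed summary statistic;
    # ABC accepts theta when this falls within the current tolerance.
    return abs(simulate(theta, n=n, seed=seed) - observed)
```

In a real Pakman run, a thin wrapper around functions like these would read `theta` from stdin and print the distance (or an accept/reject decision) to stdout.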
Hi @jmlarson1, @gonsie, and @ThomasPak — how are things going? What are the next steps for this review?
Hi @kthyng, the hold-up is on my end: I have yet to implement the changes that the reviewers have asked for. I was very busy for the past month because I had to write a progress report for my course and be examined on it. Fortunately, that is now behind me, so I can focus on this submission again.
Hi @jmlarson1, Apologies for the delay, but I have now addressed points 4 through 6 of your comments. I have improved and expanded the documentation of the code. Also, I have added a paragraph to the paper to clearly state the problem my software is intended to solve and the intended target audience, and another paragraph to better summarise the state of the field. |
Hi @gonsie, As per your suggestion, I have added a wiki page documenting the command-line interface of Pakman here: https://github.com/ThomasPak/pakman/wiki/Command-line-interface I have also performed a benchmark on my university's HPC cluster. The cluster uses Slurm for job scheduling, and I performed the test on 16 nodes with 16 CPU cores each. The CPUs were all Intel Xeon E5-2650 processors. Briefly, I sent a job to the cluster with Pakman to run 4096 simulations of …
Moreover, I have plotted the speedup and efficiency (defined with respect to computation times under ideal linear scaling) using … The computation times scale roughly linearly, although there is a drop in efficiency at higher core counts. Part of this can be attributed to the overhead involved with message passing, while part of it may be due to inefficiencies in the code. Since the average duration of a single simulation in this example is roughly three seconds, the parallel overhead would have a smaller impact for longer-running simulations. Overall, I believe that these results demonstrate that Pakman scales fairly well with increased computational resources.
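The speedup and efficiency figures above follow the standard definitions, which can be sketched as follows; the timings in the usage example are made up purely for illustration.

```python
def speedup(t_serial, t_parallel):
    # Speedup S(n) = T(1) / T(n): how many times faster the
    # parallel run completes compared with the serial run.
    return t_serial / t_parallel

def efficiency(t_serial, t_parallel, n_cores):
    # Efficiency E(n) = S(n) / n; 1.0 corresponds to ideal
    # linear scaling, lower values indicate parallel overhead.
    return speedup(t_serial, t_parallel) / n_cores

# Illustrative (made-up) timings: 100 s serially, 25 s on 8 cores.
print(speedup(100.0, 25.0))        # 4.0x speedup
print(efficiency(100.0, 25.0, 8))  # 0.5, i.e. 50% efficiency
```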
Thanks @ThomasPak. Maybe you want to add the performance graphs/details to the article? Or maybe to a wiki page? Otherwise, I think it looks good.
I believe the checklist requirements have been satisfied. |
@gonsie I have put the results of this performance test on the Pakman wiki because I may add more tests in the future. Please have a look at https://github.com/ThomasPak/pakman/wiki/Benchmark%3A-HPC-cluster |
Hi @gonsie and @jmlarson1, what are the next steps to complete the review process?
I see the checklists are complete; thanks. @gonsie @jmlarson1 Please let me know if you have any further concerns. |
Hi @gonsie @jmlarson1, have you had time to look into this? |
I have no further concerns. |
Hi @gonsie, do you have any further concerns? |
yep, it looks good. |
@whedon generate pdf |
@whedon check references |
@whedon generate pdf |
@whedon generate pdf |
@jedbrown No worries!
Yes. The computational results differ from the analytical results because the tolerance series used ends with a tolerance of 35. Whenever the target tolerance in ABC SMC is nonzero, the computed posterior is only an approximation of the true posterior; if the target tolerance were zero, the computed posterior would reproduce the analytical result. Thanks for asking this question; it led me to discover an error in the caption of Figure 2. In particular, I wrote that the reasons for the discrepancy were insufficient summary statistics and a nonzero tolerance. However, no summary statistics were used, so the only reason for the discrepancy is the nonzero tolerance.
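The role of the tolerance can be seen in a bare-bones ABC rejection sampler (a sketch, not Pakman's ABC SMC implementation): every accepted draw lies within the tolerance of the observed summary, so the approximation tightens as the tolerance shrinks, and only a tolerance of zero recovers the exact posterior. The prior and model in the example are hypothetical.

```python
import random

def abc_rejection(observed, sample_prior, simulate, tolerance, n_samples, rng):
    # Keep prior draws whose simulated summary lies within `tolerance`
    # of the observed summary; the kept draws approximate the posterior.
    accepted = []
    while len(accepted) < n_samples:
        theta = sample_prior(rng)
        if abs(simulate(theta, rng) - observed) <= tolerance:
            accepted.append(theta)
    return accepted

# Hypothetical example: uniform prior on [0, 1), identity "model".
rng = random.Random(0)
samples = abc_rejection(
    observed=0.5,
    sample_prior=lambda r: r.random(),
    simulate=lambda theta, r: theta,
    tolerance=0.1,
    n_samples=20,
    rng=rng,
)
```

With the identity model, every accepted sample lies in [0.4, 0.6]; a smaller tolerance would concentrate the samples further around the observed value.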
Pakman currently does not have this functionality because there is no easy and straightforward way to check the quality of a computed posterior. The most general method is to use posterior predictive checks, which involves generating 'fake' data from the fitted model and comparing it to the real data. We did not include this in the scope of Pakman, although we do not rule out incorporating this functionality in the future.
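As a sketch of what such a posterior predictive check might look like (this is not part of Pakman; the function and the toy model are hypothetical):

```python
def posterior_predictive_check(posterior_samples, simulate, observed):
    # Generate replicate summaries from posterior draws and report a
    # simple Bayesian p-value: the fraction of replicates at least as
    # large as the observed summary. Values near 0 or 1 suggest that
    # the fitted model reproduces the data poorly.
    replicated = [simulate(theta) for theta in posterior_samples]
    p_value = sum(rep >= observed for rep in replicated) / len(replicated)
    return replicated, p_value

# Hypothetical example with a deterministic toy model.
reps, p = posterior_predictive_check(
    posterior_samples=[0.4, 0.5, 0.6, 0.7],
    simulate=lambda theta: theta,
    observed=0.5,
)
```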
We used angle brackets to stress that the order of elements is important. If we used braces, it could be mistaken for a set instead of an ordered list.
I have made these changes. Thanks for your comments, and please let me know if you are happy to proceed with archiving.
Thanks for your explanation and updates. At this point, here are the next steps:
I can then move forward with accepting the submission.
Thanks for the quick response! I followed the steps that you outlined, and the DOI is 10.5281/zenodo.3697312. |
@whedon set v1.1.0 as version |
OK. v1.1.0 is the version. |
@whedon set 10.5281/zenodo.3697312 as archive |
OK. 10.5281/zenodo.3697312 is the archive. |
@whedon accept |
👋 @openjournals/joss-eics, this paper is ready to be accepted and published. Check the final proof 👉 openjournals/joss-papers#1352 If the paper PDF and Crossref deposit XML look good in openjournals/joss-papers#1352, then you can now move forward with accepting the submission by compiling again with the flag
The paper PDF and Crossref deposit XML look good to me 👍. |
Updated comment: the following is no longer needed. @ThomasPak I'm helping with the final processing for your paper. Can you check the point below? The following sentence needs rephrasing; please remove the word "embarrassingly":
@Kevin-Mattheus-Moerman Other wordings are possible, but it is a widely used technical term, and the sentence wouldn't have the same meaning if you simply cut "embarrassingly".
@ThomasPak @jedbrown okay. In that case ignore my comment. |
@whedon accept deposit=true |
🐦🐦🐦 👉 Tweet for this paper 👈 🐦🐦🐦 |
🚨🚨🚨 THIS IS NOT A DRILL, YOU HAVE JUST ACCEPTED A PAPER INTO JOSS! 🚨🚨🚨 Here's what you must now do:
Any issues? Notify your editorial technical team...
🎉🎉🎉 Congratulations on your paper acceptance! 🎉🎉🎉 If you would like to include a link to your paper from your README use the following code snippets:
This is how it will look in your documentation: We need your help! Journal of Open Source Software is a community-run journal and relies upon volunteer effort. If you'd like to support us, please consider doing one (or both) of the following:
Thanks to the editor and reviewers for making this possible! 🎉 🎉 🎉 |
Submitting author: @ThomasPak (Thomas Pak)
Repository: https://github.com/ThomasPak/pakman
Version: v1.1.0
Editor: @jedbrown
Reviewer: @jmlarson1, @gonsie
Archive: 10.5281/zenodo.3697312
Status
Status badge code:
Reviewers and authors:
Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)
Reviewer instructions & questions
@jmlarson1 & @gonsie, please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:
The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @jedbrown know.
✨ Please try and complete your review in the next two weeks ✨
Review checklist for @jmlarson1
Conflict of interest
Code of Conduct
General checks
Functionality
Documentation
Software paper
Review checklist for @gonsie
Conflict of interest
Code of Conduct
General checks
Functionality
Documentation
Software paper