
[REVIEW]: seesus: a social, environmental, and economic sustainability classifier for Python #6244

Closed
editorialbot opened this issue Jan 19, 2024 · 61 comments
Assignees
Labels
accepted, Jupyter Notebook, published (Papers published in JOSS), Python, recommend-accept (Papers recommended for acceptance in JOSS), review, TeX, Track: 4 (SBCS) Social, Behavioral, and Cognitive Sciences

Comments

@editorialbot
Collaborator

editorialbot commented Jan 19, 2024

Submitting author: @caimeng2 (Meng Cai)
Repository: https://github.com/caimeng2/seesus
Branch with paper.md (empty if default branch):
Version: v1.2.1
Editor: @oliviaguest
Reviewers: @varsha2509, @luyuhao0326
Archive: 10.5281/zenodo.10854083

Status

status

Status badge code:

HTML: <a href="https://joss.theoj.org/papers/6bfbe71ac4a3f4799c6cbbfb15a07ff6"><img src="https://joss.theoj.org/papers/6bfbe71ac4a3f4799c6cbbfb15a07ff6/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/6bfbe71ac4a3f4799c6cbbfb15a07ff6/status.svg)](https://joss.theoj.org/papers/6bfbe71ac4a3f4799c6cbbfb15a07ff6)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@varsha2509, your review will be checklist based. Each of you will have a separate checklist that you should update when carrying out your review.
First of all you need to run this command in a separate comment to create the checklist:

@editorialbot generate my checklist

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @oliviaguest know.

Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest.

Checklists

📝 Checklist for @luyuhao0326

📝 Checklist for @varsha2509

@editorialbot
Collaborator Author

Hello humans, I'm @editorialbot, a robot that can help you with some common editorial tasks.

For a list of things I can do to help you, just type:

@editorialbot commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@editorialbot generate pdf

@editorialbot
Collaborator Author

Software report:

github.com/AlDanial/cloc v 1.88  T=0.07 s (291.7 files/s, 69714.4 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
HTML                             7             95              0           2244
Python                           6             60            153           1056
TeX                              1             12              0            143
Markdown                         2             33              0             97
Jupyter Notebook                 1              0            563             29
TOML                             1              2              0             26
YAML                             1              1              9             18
-------------------------------------------------------------------------------
SUM:                            19            203            725           3613
-------------------------------------------------------------------------------


gitinspector failed to run statistical information for the repository

@editorialbot
Collaborator Author

Wordcount for paper.md is 938

@editorialbot
Collaborator Author

Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1002/bse.2195 is OK
- 10.21105/joss.05124 is OK
- 10.1016/j.enpol.2008.02.039 is OK
- 10.1007/s10668-016-9801-z is OK
- 10.3390/ECP2023-14728 is OK
- 10.5040/9781509934058.0025 is OK
- 10.1007/978-981-10-3521-0_31 is OK
- 10.3390/su14053095 is OK

MISSING DOIs

- None

INVALID DOIs

- None

@editorialbot
Collaborator Author

👉📄 Download article proof 📄 View article proof on GitHub 📄 👈

@oliviaguest
Member

@editorialbot add @luyuhao0326 to reviewers

@editorialbot
Collaborator Author

@luyuhao0326 added to the reviewers list!

@oliviaguest
Member

👋 Hi @varsha2509, @luyuhao0326, thank you so much for helping out at JOSS. If you need any pointers, please feel free to look at previous reviews (which can be found by looking at published papers) and the documentation. If you need to comment on the code itself, opening an issue at the repo and then linking to it from here (to help me/others keep track) is the way to go. For comments on the paper, you can also open issues or PRs (say for typos), but those can be directly posted as replies in this issue. Thanks, and feel free to reach out if you need me. ☺️

@luyuhao0326

luyuhao0326 commented Jan 22, 2024

Review checklist for @luyuhao0326

Conflict of interest

  • I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

  • I confirm that I read and will adhere to the JOSS code of conduct.

General checks

  • Repository: Is the source code for this software available at https://github.com/caimeng2/seesus?
  • License: Does the repository contain a plain-text LICENSE or COPYING file with the contents of an OSI approved software license?
  • Contribution and authorship: Has the submitting author (@caimeng2) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?
  • Substantial scholarly effort: Does this submission meet the scope eligibility described in the JOSS guidelines?
  • Data sharing: If the paper contains original data, data are accessible to the reviewers. If the paper contains no original data, please check this item.
  • Reproducibility: If the paper contains original results, results are entirely reproducible by reviewers. If the paper contains no original results, please check this item.
  • Human and animal research: If the paper contains original data or research on human subjects or animals, does it comply with JOSS's human participants research policy and/or animal research policy? If the paper contains no such data, please check this item.

Functionality

  • Installation: Does installation proceed as outlined in the documentation?
  • Functionality: Have the functional claims of the software been confirmed?
  • Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
  • Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems)?
  • Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
  • Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
  • Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

  • Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
  • A statement of need: Does the paper have a section titled 'Statement of need' that clearly states what problems the software is designed to solve, who the target audience is, and its relation to other work?
  • State of the field: Do the authors describe how this software compares to other commonly-used packages?
  • Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
  • References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

@luyuhao0326

Thanks for the invitation. Below are my comments on the installation and the software paper (most of my comments relate to the paper, since I am not a proficient Python user).

Installation

  • It would be great to include more detailed installation instructions for users who are not so familiar with Python (e.g., me and many others who could potentially benefit from this work) and/or GitHub.
  • That being said, I am currently unable to install and run this package on my machine. I would be happy to do so if one of the authors can help me install it.

Software paper

  • The idea is novel and I can see this work being useful in many domains. I have one question regarding the example use cases listed in the paper: the paper claims that seesus can be used to "label academic publications" and for "large-scale scans of planning documents". However, the example in README.md only shows how seesus evaluates individual sentences, which can cause potential misinterpretation and biased results, as the context of "academic publications" and "planning documents" will likely be missing when they are evaluated sentence by sentence. Example 3 provided here, for instance, is not really a paragraph.

  • The statement of need is clear but a bit thin. Although I appreciate that JOSS is a more software-focused journal, it would still be great to provide some context on the current status of text mining/classification on the UN SDGs and why it is important to, for example, "quantify which dimension of sustainability receives the most attention".

  • An accuracy of 75.5% is decent but not particularly high. Considering the evaluation method is from a different package, it would be great if the authors could provide a statement (or, even better, specific development plans) on how to improve the accuracy and/or usability of future text mining on the SDGs.

@caimeng2

caimeng2 commented Jan 24, 2024

Hi @luyuhao0326,

Thank you very much for taking the time to review seesus. We appreciate your helpful feedback. Please find our point-by-point responses below.

Installation

It would be great to include more detailed installation instructions for users who are not so familiar with Python (e.g., me and many others who could potentially benefit from this work) and/or GitHub.

Thank you for your suggestion. seesus is indeed Python-based software that requires basic knowledge of Python programming. To simplify the installation process, we chose to publish seesus on PyPI so that it can be installed with pip. In this way, users can install the package with a single command, without needing to manually manage dependencies or configure the package. We have made the installation instructions clearer as suggested (6f02afd).

That being said, I am currently unable to install and run this package on my machine. I would be happy to do so if one of the authors can help me install it.

I am more than happy to help. Do you already have Python, pip, and Jupyter (for running the example.ipynb) installed? If yes, typing pip install seesus in your terminal should do the job. If not, I would recommend installing Anaconda first. Please go to Anaconda's website and install it for your specific operating system (instructions). Then you should be able to install seesus by inputting pip install seesus in your terminal. Please let me know if you encounter any problems.
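
Once installed, a quick check that everything works is to run a few lines like the following. This is just a sketch using the same SeeSus attributes shown in example.ipynb; the test sentence is arbitrary and the exact wording of the returned labels may differ.

from seesus import SeeSus

# classify a single sentence and inspect the main result attributes
result = SeeSus("We aim to reduce carbon emissions and expand access to clean energy.")
print("Is the sentence related to achieving sustainability?", result.sus)
print("Which SDGs?", result.sdg)
print("Which SDG targets specifically?", result.target)
print("Which dimensions of sustainability?", result.see)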

Software paper

The idea is novel and I can see this work being useful in many domains. I have one question regarding the example use cases listed in the paper: the paper claims that seesus can be used to "label academic publications" and for "large-scale scans of planning documents". However, the example in README.md only shows how seesus evaluates individual sentences, which can cause potential misinterpretation and biased results, as the context of "academic publications" and "planning documents" will likely be missing when they are evaluated sentence by sentence. Example 3 provided here, for instance, is not really a paragraph.

Glad to hear that you find our package novel and potentially useful in many domains. To achieve the best results, we recommend splitting a paragraph or a whole document into individual sentences (i.e., using individual sentences as the basic unit for seesus to analyze). This was the reason why we only showed how seesus evaluates individual sentences in README.md at the beginning. Thank you for pointing out that this might cause misinterpretation. To address this concern, first, we have copied the paragraph example (example 3) in example.ipynb to README.md (66862f8). Here you can tell this example is a paragraph (i.e., a chunk of text with several sentences) by scrolling to the right. The display of a Jupyter Notebook in GitHub is a bit confusing because the text is truncated. We’ve added a print statement to prevent this confusion (c77e33e). Second, we have added another example in example.ipynb to demonstrate the package’s usage in the context of academic publications (c77e33e). For both the examples of an academic publication and a planning document, we split the paragraphs into sentences and printed out the results for each sentence. Users can organize the results according to their needs.

The statement of need is clear but a bit thin. Although I appreciate that JOSS is a more software-focused journal, it would still be great to provide some context on the current status of text mining/classification on the UN SDGs and why it is important to, for example, "quantify which dimension of sustainability receives the most attention".

Thank you for your suggestion. We have incorporated additional context on text mining on SDGs in our paper as suggested (c8162fe). Given that JOSS requires papers to be between 250 and 1000 words (source), we hope the edits are sufficient to provide the necessary improvement to our statement of need.

An accuracy of 75.5% is decent but not particularly high. Considering the evaluation method is from a different package, it would be great if the authors could provide a statement (or, even better, specific development plans) on how to improve the accuracy and/or usability of future text mining on the SDGs.

This is a great idea. We’ve added a statement on maintenance in README.md to address this (1eacaaf). Following the best practices of open-source software, we welcome and encourage users to report issues if they find that a matching syntax is not accurate or can be improved.

Thanks again for your time and suggestions!

@caimeng2

@editorialbot generate pdf

@editorialbot
Collaborator Author

👉📄 Download article proof 📄 View article proof on GitHub 📄 👈

@luyuhao0326

@caimeng2 Thanks for your response. I will get back to you within a week or so. I will also try to install the package and give feedback if I have any.

Yuhao

@luyuhao0326

I am more than happy to help. Do you already have Python, pip, and Jupyter (for running the example.ipynb) installed? If yes, typing pip install seesus in your terminal should do the job. If not, I would recommend installing Anaconda first. Please go to Anaconda's website and install it for your specific operating system (instructions). Then you should be able to install seesus by inputting pip install seesus in your terminal. Please let me know if you encounter any problems.

Hello, I managed to install the package and while testing one of the provided examples, I encountered a LookupError. Please see the code here.

@caimeng2

Hello, I managed to install the package and while testing one of the provided examples, I encountered a LookupError. Please see the code here.

Hi @luyuhao0326, I'm so glad that you got the installation working 🎉 The link to the LookupError is pointing to your localhost so I can't see the traceback. But I suspect it's a bug with nltk (see this). Feel free to try some of the solutions proposed there. An alternative is to use re instead of nltk. Please see if the following code works.

from seesus import SeeSus
import re

text2 = "By working with communities in the floodplain and facilitating flood-resistant building design, DCP is reducing the city’s risks to sea level rise and coastal flooding. Hurricane Sandy was a stark reminder of these risks. The City, led by the Mayor’s Office of Recovery and Resiliency (ORR), has developed a multifaceted plan for recovering from Sandy and improving the city’s resiliency–the ability of its neighborhoods, buildings and infrastructure to withstand and recover quickly from flooding and climate events. As part of this effort, DCP has initiated a series of projects to identify and implement land use and zoning changes as well as other actions needed to support the short-term recovery and long-term vitality of communities affected by Hurricane Sandy and other areas at risk of coastal flooding."

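# split the paragraph into sentences at sentence-ending punctuation, then classify each one with SeeSus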
for sent in re.split(r'(?<!\w\.\w.)(?<![A-Z][a-z]\.)(?<=\.|\?)\s', text2):
    result = SeeSus(sent)
    print('"', sent, '"', sep = "")
    print("Is the sentence related to achieving sustainability?", result.sus)
    print("Which SDGs?", result.sdg)
    print("Which SDG targets specifically?", result.target)
    print("which dimensions of sustainability?", result.see)
    print("----------------")

Thank you for letting me know about this issue. I'll update the examples.

@luyuhao0326

An alternative is to use re instead of nltk. Please see if the following code works.

Indeed, this is the bug. It is now working with re.

@luyuhao0326

The authors have addressed all my comments and made appropriate revisions. I recommend that this submission be accepted by JOSS.

@caimeng2

caimeng2 commented Feb 1, 2024

The authors have addressed all my comments and made appropriate revisions. I recommend that this submission be accepted by JOSS.

Thanks again for your suggestions, which helped to make our paper and software better!

@oliviaguest
Member

@varsha2509 is everything going OK with your review? 😊

@varsha2509

varsha2509 commented Feb 19, 2024

Review checklist for @varsha2509

Conflict of interest

  • I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

  • I confirm that I read and will adhere to the JOSS code of conduct.

General checks

  • Repository: Is the source code for this software available at https://github.com/caimeng2/seesus?
  • License: Does the repository contain a plain-text LICENSE or COPYING file with the contents of an OSI approved software license?
  • Contribution and authorship: Has the submitting author (@caimeng2) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?
  • Substantial scholarly effort: Does this submission meet the scope eligibility described in the JOSS guidelines?
  • Data sharing: If the paper contains original data, data are accessible to the reviewers. If the paper contains no original data, please check this item.
  • Reproducibility: If the paper contains original results, results are entirely reproducible by reviewers. If the paper contains no original results, please check this item.
  • Human and animal research: If the paper contains original data or research on human subjects or animals, does it comply with JOSS's human participants research policy and/or animal research policy? If the paper contains no such data, please check this item.

Functionality

  • Installation: Does installation proceed as outlined in the documentation?
  • Functionality: Have the functional claims of the software been confirmed?
  • Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
  • Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems)?
  • Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
  • Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
  • Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

  • Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
  • A statement of need: Does the paper have a section titled 'Statement of need' that clearly states what problems the software is designed to solve, who the target audience is, and its relation to other work?
  • State of the field: Do the authors describe how this software compares to other commonly-used packages?
  • Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
  • References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

@varsha2509

varsha2509 commented Feb 19, 2024

Hello. Thank you for giving me the opportunity to review this work. The authors have done a great job documenting the software, the installation instructions, and the functionality. Below are my comments based on the review checklist, as well as additional notes to help improve the readability and adoption of this work:

  1. Functionality - while the functional claims of the software have been verified, could the authors provide more details on how the indirect search keys in SDG_keys.py, specifically from this line onwards, were determined? Can the authors confirm that all targets were included in the keywords?
  2. Automated tests - while the existing tests cover the explained functionality, the authors should consider including more examples in the tests, especially relevant sentences with a negative connotation, to clarify the performance of this tool.
  3. Statement of need -
  • The existing statement of need isn't particularly strong. It's not very clear to me what the benefits of seesus are over existing tools, other than the functionality that classifies an expression as environmental, social, or economic sustainability. Making the statement of need stronger will help improve adoption of this tool.
  • Could the authors provide an example of what they mean by "also the attainment of SDGs" as specified in the statement of need?
  4. State of the field -
  • OSDG (https://arxiv.org/abs/2211.11252, https://github.com/osdg-ai/osdg-tool) is another open-source tool for text-based classification of SDG goals, and it uses NLP/ML-based methods. It may be worth highlighting as one of the other classifiers in the statement of need. Along with this, could the authors also explain why users would choose seesus over existing open-source tools?

Other notes:

  • Running through the code and scripts as examples, the current tool is not able to capture negative expressions, as regex lacks semantic understanding of text. For instance, the input sentence "One should not resolve climate change for environmental sustainability." is classified as relating to achieving SDG13 and SDG15, but the output should be 'None' or 'Does not match SDG goals' (a minimal reproduction is sketched at the end of this comment). This seems to be a limitation of the tool, and it would be worth highlighting it in a separate section and including some ideas on how the authors plan to address these limitations in future releases. This will help users be fully aware of the benefits and limitations of this software.
  • Related to the above, could the authors briefly discuss the limitations of regex-based pattern matching compared with existing semantic text search/language models?

  • The authors mention that seesus achieves an accuracy rate of 75.5%, as determined by alignment with manual coding. Can the authors comment on how they plan to improve the performance of this tool in future releases, as 75% accuracy currently seems low for usability?
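
For reference, the negation issue mentioned above can be reproduced with a couple of lines; the values noted in the comments are what I observed in my run, and the exact label strings may differ.

from seesus import SeeSus

# A negated sentence: ideally this should not be matched to any SDG.
result = SeeSus("One should not resolve climate change for environmental sustainability.")
print(result.sus)  # observed: flagged as sustainability-related
print(result.sdg)  # observed: matched to SDG13 and SDG15 instead of returning no match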

@caimeng2

Hi @varsha2509,

Thank you for taking the time to review our software and for your valuable feedback. Please find our point-by-point responses below.

Functionality - while the functional claims of the software have been verified, could the authors provide more details on how the indirect search keys in SDG_keys.py, specifically from this line onwards, were determined? Can the authors confirm that all targets were included in the keywords?

Yes, we can confirm that all targets are included in the keywords. We created the search keys at the level of both the 17 SDGs and the 169 SDG targets. The indirect keys were first based on Thesaurus, and we (four researchers specializing in SDGs) manually assessed and improved the accuracy of the matching syntax using thousands of randomly selected statements from corporate reports. We conducted three rounds of fine-tuning before finalizing these keys.

Automated tests - while the existing tests cover the explained functionality, the authors should consider including more examples in the tests, especially relevant sentences with a negative connotation, to clarify the performance of this tool.

Thank you for pointing this out. Indeed, matching sentences with a negative connotation is a limitation of seesus. seesus can identify terms related to the SDGs but cannot distinguish between achieving the SDGs and failing to do so. This limitation stems from regular expressions' limited logical capability and lack of context awareness. We have added another test of direct matching (4968c35) and edited the paper, deleting expressions regarding "attainment of SDGs," to make it clear that seesus is designed to classify based on relevance.
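
For illustration, a direct-matching check can be written along the lines below. This is only a sketch, not the exact test added in 4968c35, and the sample sentence and assertion may need adjusting to seesus's actual label output.

from seesus import SeeSus

def test_direct_match_is_flagged():
    # A sentence that directly names an SDG topic should be flagged as sustainability-related.
    result = SeeSus("The city invests in affordable and clean energy for all residents.")
    assert result.sus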

Statement of need -
The existing statement of need isn't particularly strong. It's not very clear to me what the benefits of seesus are over existing tools, other than the functionality that classifies an expression as environmental, social, or economic sustainability. Making the statement of need stronger will help improve adoption of this tool.

The biggest benefit of seesus is its finer scale: it captures not only the SDGs but also the 169 SDG targets. To the best of our knowledge, no other Python tool does this. In addition, compared to tools based on machine learning, seesus allows users to examine and modify the matching syntax, so users can always understand and have control over the results. We’ve edited the statement of need to make it stronger as suggested (3f35864).

Could the authors provide an example of what they mean by "also the attainment of SDGs" as specified in the statement of need?

What we meant was that seesus specifically looks for terms related to achieving the SDGs, not just SDG-related topics themselves. For example, it does not look for words solely related to emissions (e.g., "emissions", "carbon") but for phrases such as "lowering emissions" and "reducing carbon." However, we realized that this sentence was rather confusing, since seesus cannot identify negative expressions, so we have deleted it to avoid further confusion.
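
To illustrate the distinction in plain re terms (these patterns are purely illustrative and are not the actual seesus matching syntax):

import re

# Purely illustrative: a topic-only pattern versus an action-oriented pattern.
topic_only = re.compile(r"\b(emissions?|carbon)\b", re.IGNORECASE)
action_oriented = re.compile(r"\b(lower(ing)?|reduc(e|ing))\s+(\w+\s+)?(emissions?|carbon)\b", re.IGNORECASE)

statements = [
    "Carbon emissions rose sharply last year.",        # mentions the topic only
    "We are committed to reducing carbon emissions.",  # describes action towards the goal
]
for s in statements:
    print(bool(topic_only.search(s)), bool(action_oriented.search(s)), "-", s)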

State of the field -
OSDG (https://arxiv.org/abs/2211.11252, https://github.com/osdg-ai/osdg-tool) is another open-source tool for text-based classification of SDG goals, and it uses NLP/ML-based methods. It may be worth highlighting as one of the other classifiers in the statement of need. Along with this, could the authors also explain why users would choose seesus over existing open-source tools?

Thank you for this reference. We have added it to the existing classifiers. We tested OSDG and noticed that it is not able to capture negative expressions either, and its results cover only the 17 SDGs, not the targets. We have revised our paper to highlight seesus's benefits.

Other notes: Running through the code and scripts as examples, the current tool is not able to capture negative expressions, as regex lacks semantic understanding of text. For instance, the input sentence "One should not resolve climate change for environmental sustainability." is classified as relating to achieving SDG13 and SDG15, but the output should be 'None' or 'Does not match SDG goals'. This seems to be a limitation of the tool, and it would be worth highlighting it in a separate section and including some ideas on how the authors plan to address these limitations in future releases. This will help users be fully aware of the benefits and limitations of this software.
Related to the above, could the authors briefly discuss the limitations of regex-based pattern matching compared with existing semantic text search/language models?

Thank you for your suggestion. This is a very good point. Compared to language models, regex lacks the ability to understand the semantic meaning or context of text, as it operates on character patterns. As suggested, we have added a paragraph at the end of the paper to make the limitations of seesus clearer.

The authors mention that seesus achieves an accuracy rate of 75.5%, as determined by alignment with manual coding. Can the authors comment on how they plan to improve the performance of this tool in future releases, as 75% accuracy currently seems low for usability?

Yes, we have included it in the revision of the paper. We devoted hundreds of hours to fine-tuning the matching syntax. An accuracy of 75.5% may seem low, but it is quite reasonable for traditional qualitative analysis; the human intercoder agreement on the same text was only 83%.

Thanks again for all your comments and suggestions! We feel that our paper is much clearer and stronger than the previous version. Thank you!!! Please let us know if there's anything else.

@oliviaguest
Member

@caimeng2 why is it Version: v1.2.0 above?

@oliviaguest
Member

@caimeng2 see: caimeng2/seesus#3 ☺️

@caimeng2

@caimeng2 why is it Version: v1.2.0 above?

Ah, my bad. That was the version number for PyPI, which I totally forgot. I should have made them consistent.

@caimeng2

Hi @oliviaguest, I made a new release and redid the tasks above. Sorry about the inconvenience.

Double check authors and affiliations (including ORCIDs)

Checked

Make a release of the software with the latest changes from the review and post the version number here. This is the version that will be used in the JOSS paper.

v1.2.1

Archive the release on Zenodo/figshare/etc and post the DOI here.

DOI

Make sure that the title and author list (including ORCIDs) in the archive match those in the JOSS paper.

Checked

Make sure that the license listed for the archive is the same as the software license.

Checked

@oliviaguest
Member

@caimeng2 thank you!

@oliviaguest
Member

@editorialbot generate pdf

@editorialbot
Collaborator Author

👉📄 Download article proof 📄 View article proof on GitHub 📄 👈

@oliviaguest
Member

@editorialbot set v1.2.1 as version

@editorialbot
Collaborator Author

Done! version is now v1.2.1

openjournals deleted a comment from editorialbot Mar 30, 2024
@oliviaguest
Member

@caimeng2 is that the right version?

@caimeng2

@caimeng2 is that the right version?

Yes!

@oliviaguest
Member

@editorialbot recommend-accept

@editorialbot
Collaborator Author

Attempting dry run of processing paper acceptance...

@editorialbot
Collaborator Author

Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1002/bse.2195 is OK
- 10.21105/joss.05124 is OK
- 10.1016/j.enpol.2008.02.039 is OK
- 10.48550/arXiv.2211.11252 is OK
- 10.1007/s10668-016-9801-z is OK
- 10.3390/ECP2023-14728 is OK
- 10.5040/9781509934058.0025 is OK
- 10.1007/978-981-10-3521-0_31 is OK
- 10.3390/su14053095 is OK

MISSING DOIs

- No DOI given, and none found for title: SDG Auto Labeller
- No DOI given, and none found for title: EUR-SDG-Mapper
- No DOI given, and none found for title: UN-SDG-Classifier
- No DOI given, and none found for title: SDG-Classifier

INVALID DOIs

- None

@editorialbot
Collaborator Author

👋 @openjournals/sbcs-eics, this paper is ready to be accepted and published.

Check final proof 👉📄 Download article

If the paper PDF and the deposit XML files look good in openjournals/joss-papers#5224, then you can now move forward with accepting the submission by compiling again with the command @editorialbot accept

editorialbot added the recommend-accept (Papers recommended for acceptance in JOSS) label Apr 7, 2024
@oliviaguest
Member

@caimeng2 can you check the final draft please before it's accepted? 🥳

@caimeng2

caimeng2 commented Apr 8, 2024

@caimeng2 can you check the final draft please before it's accepted? 🥳

We did, and everything looks good. Thank you for the smooth review process. I enjoyed it.

@oliviaguest
Member

@editorialbot accept

@editorialbot
Collaborator Author

Doing it live! Attempting automated processing of paper acceptance...

@editorialbot
Collaborator Author

Ensure proper citation by uploading a plain text CITATION.cff file to the default branch of your repository.

If using GitHub, a Cite this repository menu will appear in the About section, containing both APA and BibTeX formats. When exported to Zotero using a browser plugin, Zotero will automatically create an entry using the information contained in the .cff file.

You can copy the contents for your CITATION.cff file here:

CITATION.cff

cff-version: "1.2.0"
authors:
- family-names: Cai
  given-names: Meng
  orcid: "https://orcid.org/0000-0002-8318-572X"
- family-names: Li
  given-names: Yingjie
  orcid: "https://orcid.org/0000-0002-8401-0649"
- family-names: Colbry
  given-names: Dirk
  orcid: "https://orcid.org/0000-0003-0666-9883"
- family-names: Frans
  given-names: Veronica F.
  orcid: "https://orcid.org/0000-0002-5634-3956"
- family-names: Zhang
  given-names: Yuqian
  orcid: "https://orcid.org/0000-0001-7576-2526"
contact:
- family-names: Cai
  given-names: Meng
  orcid: "https://orcid.org/0000-0002-8318-572X"
doi: 10.5281/zenodo.10854083
message: If you use this software, please cite our article in the
  Journal of Open Source Software.
preferred-citation:
  authors:
  - family-names: Cai
    given-names: Meng
    orcid: "https://orcid.org/0000-0002-8318-572X"
  - family-names: Li
    given-names: Yingjie
    orcid: "https://orcid.org/0000-0002-8401-0649"
  - family-names: Colbry
    given-names: Dirk
    orcid: "https://orcid.org/0000-0003-0666-9883"
  - family-names: Frans
    given-names: Veronica F.
    orcid: "https://orcid.org/0000-0002-5634-3956"
  - family-names: Zhang
    given-names: Yuqian
    orcid: "https://orcid.org/0000-0001-7576-2526"
  date-published: 2024-04-08
  doi: 10.21105/joss.06244
  issn: 2475-9066
  issue: 96
  journal: Journal of Open Source Software
  publisher:
    name: Open Journals
  start: 6244
  title: "seesus: a social, environmental, and economic sustainability
    classifier for Python"
  type: article
  url: "https://joss.theoj.org/papers/10.21105/joss.06244"
  volume: 9
title: "seesus: a social, environmental, and economic sustainability
  classifier for Python"

If the repository is not hosted on GitHub, a .cff file can still be uploaded to set your preferred citation. Users will be able to manually copy and paste the citation.

Find more information on .cff files here and here.

@editorialbot
Collaborator Author

🐘🐘🐘 👉 Toot for this paper 👈 🐘🐘🐘

@editorialbot
Collaborator Author

🚨🚨🚨 THIS IS NOT A DRILL, YOU HAVE JUST ACCEPTED A PAPER INTO JOSS! 🚨🚨🚨

Here's what you must now do:

  1. Check final PDF and Crossref metadata that was deposited 👉 Creating pull request for 10.21105.joss.06244 joss-papers#5229
  2. Wait five minutes, then verify that the paper DOI resolves https://doi.org/10.21105/joss.06244
  3. If everything looks good, then close this review issue.
  4. Party like you just published a paper! 🎉🌈🦄💃👻🤘

Any issues? Notify your editorial technical team...

editorialbot added the accepted and published (Papers published in JOSS) labels Apr 8, 2024
@oliviaguest
Member

Huge thanks to @varsha2509, @luyuhao0326! ✨ JOSS appreciates your work and effort. ✨ Also, big congratulations to the author: @caimeng2! 🥳 🍾

@editorialbot
Collaborator Author

🎉🎉🎉 Congratulations on your paper acceptance! 🎉🎉🎉

If you would like to include a link to your paper from your README use the following code snippets:

Markdown:
[![DOI](https://joss.theoj.org/papers/10.21105/joss.06244/status.svg)](https://doi.org/10.21105/joss.06244)

HTML:
<a style="border-width:0" href="https://doi.org/10.21105/joss.06244">
  <img src="https://joss.theoj.org/papers/10.21105/joss.06244/status.svg" alt="DOI badge" >
</a>

reStructuredText:
.. image:: https://joss.theoj.org/papers/10.21105/joss.06244/status.svg
   :target: https://doi.org/10.21105/joss.06244

This is how it will look in your documentation:

DOI

We need your help!

The Journal of Open Source Software is a community-run journal and relies upon volunteer effort. If you'd like to support us, please consider doing either one (or both) of the following:
